SentenceTransformer based on BAAI/bge-small-en-v1.5

This is a sentence-transformers model fine-tuned from BAAI/bge-small-en-v1.5. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-small-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
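
The module stack above can be read as three steps: the BERT encoder produces one 384-dimensional vector per token, the pooling layer keeps only the first ([CLS]) token's vector (pooling_mode_cls_token is True), and Normalize() scales that vector to unit length. Below is a minimal pure-Python sketch of the last two steps. It uses made-up 4-dimensional token vectors instead of real encoder output, and the function names are illustrative, not part of the library.

```python
import math

def cls_pool(token_embeddings):
    """Return the vector of the first token ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]

# Toy stand-in for encoder output: 3 tokens, 4 dimensions each (the real model uses 384).
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],   # [CLS]
    [1.0, 2.0, 0.5, 0.1],   # a word piece
    [0.2, 0.1, 0.9, 0.7],   # [SEP]
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)
```

Only the first row influences the result, which is the practical difference from mean pooling, where every token vector would contribute.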

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("himanshu23099/bge_embedding_finetune1")
# Run inference
sentences = [
    'How long does it typically take to enter or exit the parking area during peak times?',
    'The time to enter or exit the parking area during peak times can vary based on crowd density, time of day, and traffic management. Generally, it takes about 2 to 10 minutes.',
    'In a remote village, the annual kite festival attracts many visitors who come to see the vibrant displays. The event showcases dozens of kites soaring high, each crafted with unique designs. Local artisans prepare for months, selecting colors and materials to make the best creations. Everyone enjoys the lively atmosphere filled with music and laughter.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
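
Because the model ends with Normalize(), the cosine similarity of two embeddings equals their dot product. The sketch below ranks candidate passages against a query this way. The vectors are invented 3-dimensional stand-ins for rows of the model.encode(...) output, and rank_by_cosine is a hypothetical helper, not a library function.

```python
def dot(u, v):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(a * b for a, b in zip(u, v))

def rank_by_cosine(query, candidates):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, dot(query, c)) for i, c in enumerate(candidates)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Invented unit-length vectors standing in for real embeddings.
query = [1.0, 0.0, 0.0]
candidates = [
    [0.0, 1.0, 0.0],   # unrelated passage
    [0.8, 0.6, 0.0],   # related passage
    [0.6, 0.0, 0.8],   # somewhat related passage
]

ranking = rank_by_cosine(query, candidates)
print(ranking)
```

With real embeddings the same ordering can be read off one row of the matrix returned by model.similarity.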

Evaluation

Metrics

Information Retrieval

Metric Value
cosine_accuracy@1 0.3444
cosine_accuracy@5 0.7229
cosine_accuracy@10 0.8039
cosine_precision@1 0.3444
cosine_precision@5 0.1446
cosine_precision@10 0.0804
cosine_recall@1 0.3444
cosine_recall@5 0.7229
cosine_recall@10 0.8039
cosine_ndcg@5 0.5504
cosine_ndcg@10 0.5766
cosine_ndcg@100 0.6142
cosine_mrr@5 0.4926
cosine_mrr@10 0.5034
cosine_mrr@100 0.5113
cosine_map@100 0.5113
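
In the table above, accuracy@k and recall@k are identical, precision@k is accuracy@k divided by k (for example 0.7229 / 5 ≈ 0.1446), and MRR@100 equals MAP@100. That pattern is what one would expect if every evaluation query has exactly one relevant passage, which is an assumption here rather than something the card states. The sketch below computes the metrics under that assumption from the 1-based rank at which each query's relevant passage was retrieved. The ranks are invented for illustration.

```python
import math

def ir_metrics(ranks, k):
    """Metrics at cutoff k, assuming exactly one relevant passage per query.

    ranks holds the 1-based position of the relevant passage for each query,
    or None when it was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    return {
        "accuracy": accuracy,
        # One relevant item: recall is hit-or-miss, so it equals accuracy.
        "recall": accuracy,
        # At most one of the k retrieved items is relevant.
        "precision": accuracy / k,
        # Reciprocal rank of the relevant item, 0 if outside the top k.
        "mrr": sum(1.0 / r for r, h in zip(ranks, hits) if h) / n,
        # Ideal DCG is 1 (relevant item at rank 1), so NDCG = 1 / log2(rank + 1).
        "ndcg": sum(1.0 / math.log2(r + 1) for r, h in zip(ranks, hits) if h) / n,
    }

# Invented ranks for four queries; None means the passage was never retrieved.
example_ranks = [1, 3, 7, None]
metrics_at_5 = ir_metrics(example_ranks, k=5)
print(metrics_at_5)
```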

Training Details

Training Dataset

Unnamed Dataset

  • Size: 3,507 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
              anchor          positive        negative
    type      string          string          string
    min       5 tokens        3 tokens        15 tokens
    mean      12.02 tokens    117.69 tokens   119.62 tokens
    max       32 tokens       504 tokens      422 tokens
  • Samples:
    Sample 1
    anchor:
      Tour departs how city
    positive:
      What is the itinerary for 1-day Maihar tour?
      Maihar tour departs from Hotel Ilawart, Prayagraj at 7:00 AM and includes visit to Maa Sharda Devi Temple located atop Trikoota Hill. For more details and booking, click here: https://bit.ly/3YBcbI6
      List of Aliases: [['Allahabad', 'PYG', 'Prayagraj']]
    negative:
      What one-day outstation tours are available from Prayagraj?
      The one-day outstation tours from Prayagraj include destinations such as Ayodhya, Varanasi, Maihar, and Chitrakoot. These tours offer a quick yet enriching journey to some of the most significant spiritual and cultural sites near Prayagraj.

      For more details, visit : https://bit.ly/4eWFRoH

    Sample 2
    anchor:
      How train for Prayag reach
    positive:
      Which airlines operate flights to Prayagraj?
      Several airlines operate flights to Prayagraj, India. However, availability may depend on your location and the time of travel. Some of the airlines that typically operate flights to Prayagraj include:

      1. Air India
      2. IndiGo
      3. SpiceJet

      For the most accurate and up-to-date information on train timings to Prayagraj, please visit the IRCTC website https://www.irctc.co.in/nget/
      List of Aliases: [['Allahabad', 'PYG', 'Prayagraj']]
    negative:
      What is the best train route to Prayagraj from Ayodhya?
      To travel by train from Ayodhya to Prayagraj, you can use the Indian Railways' services. Here is a general guide for the route:

      1. Ayodhya Cantt (AY) to Prayagraj Junction (PRYJ) via Train No. 14203: This is one of the direct trains to Prayagraj from Ayodhya. It generally runs on Tuesday and Friday.

      2. Ayodhya Cantt (AY) to Prayagraj Rambag (PRRB) via Train No. 14205: This train runs regularly and is another direct route to Prayagraj.

      For the most accurate and up-to-date information on train timings to Prayagraj, please visit the IRCTC website https://www.irctc.co.in/nget/

    Sample 3
    anchor:
      Why should one do the Prayagraj Panchkoshi Parikrama?
    positive:
      The Prayagraj Panchkoshi Parikrama is a deeply revered spiritual journey that offers multiple benefits to devotees. It is believed to grant blessings equivalent to visiting all sacred pilgrimage sites in India, providing divine grace and spiritual merit. The Parikrama route covers significant temples like the Dwadash Madhav temples, Akshayavat, and Mankameshwar, which are steeped in Hindu mythology and history, allowing pilgrims to connect with the spiritual and cultural heritage of Prayagraj. This circumambulation around sacred sites is also seen as a way to cleanse one's sins and progress towards Moksha (liberation from the cycle of birth and rebirth), making it a path of introspection and spiritual growth. The pilgrimage fosters unity among people from diverse backgrounds, offering a unique cultural exchange and shared spiritual experience. By participating, devotees also help revive an ancient tradition integral to the Kumbh Mela for centuries, reconnecting with age-old practices t...
    negative:
      Elevators are remarkable inventions that revolutionized how we navigate tall buildings. They provide a swift, efficient means of transportation between floors, making urban life more accessible. These mechanical wonders operate on a system of pulleys and counterweights, enabling them to carry heavy loads effortlessly. Safety features like emergency brakes and backup power systems ensure that passengers remain secure during their journey. Various designs and styles can be seen in buildings around the world, from sleek modern glass models to vintage models that evoke nostalgia. Elevators also highlight the advancement of engineering and technology over time, evolving from rudimentary designs to sophisticated machines with smart technology. They are essential in various settings, including residential, commercial, and industrial spaces, offering convenience and practicality.
      Their presence also allows for the efficient use of vertical space, fostering creativity in architectural designs a...
  • Loss: GISTEmbedLoss with these parameters:
    {'guide': SentenceTransformer(
      (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
      (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
      (2): Normalize()
    ), 'temperature': 0.01}
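
GISTEmbedLoss uses the frozen guide model to decide which in-batch candidates are safe to treat as negatives: a candidate that the guide scores as more similar to the anchor than the anchor's own positive is masked out as a likely false negative, and the remaining scores are divided by the temperature (0.01 here) before a softmax cross-entropy. The following is a simplified pure-Python sketch of that idea for anchor-positive scores only. It is not the library's implementation, gist_loss is a hypothetical name, and the similarity matrices are invented.

```python
import math

def gist_loss(model_sim, guide_sim, temperature=0.01):
    """Mean cross-entropy over anchors with guide-based false-negative masking.

    model_sim[i][j] and guide_sim[i][j] are cosine similarities between
    anchor i and positive j from the trained model and the guide model.
    The true pair for anchor i sits on the diagonal (j == i).
    """
    total = 0.0
    for i, (row, guide_row) in enumerate(zip(model_sim, guide_sim)):
        threshold = guide_row[i]  # guide's score for the true pair
        # Keep the true pair plus every candidate the guide does not rate
        # above the true pair; the rest are treated as false negatives.
        logits = [
            s / temperature
            for j, (s, g) in enumerate(zip(row, guide_row))
            if j == i or g <= threshold
        ]
        target = row[i] / temperature
        # Numerically stable log-sum-exp over the kept candidates.
        peak = max(logits)
        log_sum = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_sum - target
    return total / len(model_sim)

# Invented 2x2 example: for anchor 0 the guide rates positive 1 (0.95)
# above the true pair (0.90), so that candidate is masked out.
model_sim = [[0.70, 0.80], [0.10, 0.90]]
guide_sim = [[0.90, 0.95], [0.20, 0.85]]
guided = gist_loss(model_sim, guide_sim)

# An identity-like guide masks nothing, so anchor 0 is penalized for
# scoring positive 1 above its own positive.
unguided = gist_loss(model_sim, [[1.0, 0.0], [0.0, 1.0]])
print(guided, unguided)
```

The low temperature sharpens the softmax considerably, so in this toy case a 0.1 gap in cosine similarity becomes a gap of 10 in the logits.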
    

Evaluation Dataset

Unnamed Dataset

  • Size: 877 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 877 samples:
              anchor          positive        negative
    type      string          string          string
    min       4 tokens        3 tokens        8 tokens
    mean      12.13 tokens    117.82 tokens   117.68 tokens
    max       32 tokens       504 tokens      422 tokens
  • Samples:
    Sample 1
    anchor:
      Akhara means what
    positive:
      Is the word Akhara related to Akhand?
      Many scholars believe that the word 'Akhara' originated from the word 'Akhand.' Initially, a group of armed ascetics was referred to as 'Akhand.' Over time, when these 'Akhand' groups evolved into centers for training in weaponry and martial arts, they came to be known as 'Akhara.'
      List of Aliases: [['Akhand', 'Akhara', 'Kalpwasi Camp', 'Naga', 'Nagas', 'Sadhu', 'sadhus']]
    negative:
      Why did Adi Shankaracharya organize the Akharas?
      According to the evidence available in the Akharas and the descriptions mentioned in their history, centuries ago, Adi Shankaracharya established these Akharas with the purpose of protecting Hindu temples and monasteries from foreign and non-believer invaders, as well as safeguarding the followers of Hinduism.

      Adi Shankaracharya believed that young saints should not only be proficient in scriptures (Shastra) but also in the art of weaponry (Shastra), so they could fulfill the duty of protecting the monasteries, temples, and their followers when necessary.

    Sample 2
    anchor:
      Why do so many people gather for this?
    positive:
      Millions gather for the Kumbh Mela due to its profound spiritual, cultural, and social significance. Rooted in ancient Hindu mythology, the Mela is believed to be an auspicious time when bathing in the sacred rivers—Ganga, Yamuna, and Saraswati—can cleanse sins and lead to spiritual liberation (Moksha). The event, occurring during rare celestial alignments, amplifies these spiritual benefits. It is a unique confluence of faith, where people from diverse backgrounds come together, creating a “mini-India” that fosters unity in diversity. \n The Mela also offers opportunities for spiritual learning through discourses by saints, religious rituals like Kalpvas, Deep Daan, and cultural performances. Moreover, the Kumbh Mela is a rare platform for connecting with spiritual leaders, experiencing religious tolerance, and participating in one of the world's largest peaceful gatherings, making it a must-attend event for millions seeking spiritual growth, community, and divine blessings.
    negative:
      In the bustling world of urban development, architects and city planners often seek innovative solutions to optimize living spaces. The integration of green spaces within urban environments not only enhances aesthetic appeal but also significantly improves residents' quality of life. Vertical gardens, rooftops, and community parks play a crucial role in providing habitats for local wildlife while promoting biodiversity in densely populated areas.

      Furthermore, advancements in sustainable technology, such as solar panels and rainwater harvesting systems, are being incorporated into these designs, offering environmentally friendly alternatives that reduce utility costs for residents. Public art installations also contribute to community identity, fostering a sense of belonging among citizens.

      Collaborative efforts between various stakeholders—governments, private sectors, and local communities—are essential to ensure these projects reflect the needs and desires of the people. The succ...

    Sample 3
    anchor:
      Do parking charges vary between different parking zones or proximity to the Mela grounds?
    positive:
      No, the parking charges are standardized and remain the same throughout, regardless of the parking zone or proximity to the Mela grounds. Charges are fixed at ₹5 for cycles, ₹15 for two-wheelers, ₹65 for 3-4 wheelers, and ₹260 for buses and heavy vehicles for 24 hours.
    negative:
      The ancient art of pottery involves molding clay into various shapes before firing it in a kiln. Traditionally, artisans use hand tools and techniques passed down through generations. Each region often has its own distinctive styles, resulting in a rich diversity of forms, glazes, and colors. Pottery can serve practical purposes, such as in cooking and storage, while also being a medium for artistic expression and cultural storytelling.
  • Loss: GISTEmbedLoss with these parameters:
    {'guide': SentenceTransformer(
      (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
      (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
      (2): Normalize()
    ), 'temperature': 0.01}
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • gradient_accumulation_steps: 2
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 30
  • warmup_ratio: 0.1
  • load_best_model_at_end: True
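
The step counts in the training logs can be cross-checked against these settings. Assuming a single training device and that the last, partial batch is kept (dataloader_drop_last is False), 3,507 samples at batch size 16 give 220 batches per epoch. With 2 gradient-accumulation steps that is 110 optimizer steps per epoch and an effective batch size of 32, which matches the logs reaching epoch 1.0 at step 110. The sketch below reproduces the arithmetic and a linear warmup-then-decay learning rate. The helper name is illustrative, and the schedule formula is the usual linear one, stated here as an assumption.

```python
import math

num_samples = 3507
per_device_train_batch_size = 16
gradient_accumulation_steps = 2
num_train_epochs = 30
warmup_ratio = 0.1
peak_learning_rate = 1e-05

# Assumes one device and that the final partial batch is kept.
batches_per_epoch = math.ceil(num_samples / per_device_train_batch_size)
steps_per_epoch = batches_per_epoch // gradient_accumulation_steps
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

def linear_lr(step):
    """Linear warmup to the peak rate, then linear decay to zero."""
    if step < warmup_steps:
        return peak_learning_rate * step / warmup_steps
    remaining = max(0, total_steps - step)
    return peak_learning_rate * remaining / (total_steps - warmup_steps)

print(batches_per_epoch, steps_per_epoch, total_steps, warmup_steps)
# 220 110 3300 330
print(linear_lr(110), linear_lr(330), linear_lr(1710))
```

Under these assumptions the warmup covers the first three epochs, and a full 30-epoch run would end at step 3300.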

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 2
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 30
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss val_evaluator_cosine_ndcg@100
0.0909 10 1.9717 1.2192 0.4285
0.1818 20 1.8228 1.1896 0.4307
0.2727 30 1.9999 1.1429 0.4310
0.3636 40 1.6463 1.0845 0.4311
0.4545 50 1.9207 1.0205 0.4334
0.5455 60 1.5777 0.9509 0.4338
0.6364 70 1.4277 0.8810 0.4376
0.7273 80 1.408 0.8130 0.4432
0.8182 90 1.3565 0.7535 0.4436
0.9091 100 1.3322 0.6935 0.4495
1.0 110 0.8344 0.6420 0.4518
1.0909 120 1.1696 0.5956 0.4515
1.1818 130 0.9622 0.5524 0.4565
1.2727 140 0.9005 0.5173 0.4616
1.3636 150 0.962 0.4802 0.4662
1.4545 160 0.7924 0.4497 0.4693
1.5455 170 0.8955 0.4262 0.4711
1.6364 180 0.7652 0.4031 0.4736
1.7273 190 0.7517 0.3804 0.4773
1.8182 200 0.5669 0.3636 0.4784
1.9091 210 0.6641 0.3469 0.4813
2.0 220 0.5227 0.3267 0.4820
2.0909 230 0.6146 0.3075 0.4843
2.1818 240 0.4709 0.2908 0.4882
2.2727 250 0.5963 0.2780 0.4955
2.3636 260 0.5103 0.2668 0.4977
2.4545 270 0.4833 0.2566 0.5027
2.5455 280 0.4389 0.2431 0.5045
2.6364 290 0.4653 0.2317 0.5059
2.7273 300 0.3559 0.2263 0.5086
2.8182 310 0.4623 0.2197 0.5127
2.9091 320 0.3889 0.2103 0.5183
3.0 330 0.4014 0.2037 0.5206
3.0909 340 0.2977 0.1999 0.5228
3.1818 350 0.4656 0.1956 0.5266
3.2727 360 0.436 0.1873 0.5288
3.3636 370 0.3111 0.1803 0.5311
3.4545 380 0.333 0.1759 0.5325
3.5455 390 0.2899 0.1717 0.5381
3.6364 400 0.4245 0.1663 0.5419
3.7273 410 0.4247 0.1658 0.5421
3.8182 420 0.2251 0.1646 0.5442
3.9091 430 0.2784 0.1635 0.5448
4.0 440 0.2503 0.1613 0.5490
4.0909 450 0.2342 0.1588 0.5501
4.1818 460 0.3139 0.1584 0.5527
4.2727 470 0.2356 0.1552 0.5498
4.3636 480 0.3147 0.1496 0.5518
4.4545 490 0.2691 0.1469 0.5508
4.5455 500 0.2639 0.1466 0.5561
4.6364 510 0.1581 0.1432 0.5625
4.7273 520 0.1922 0.1406 0.5663
4.8182 530 0.2453 0.1406 0.5688
4.9091 540 0.2631 0.1399 0.5705
5.0 550 0.3324 0.1402 0.5681
5.0909 560 0.1801 0.1389 0.5715
5.1818 570 0.2096 0.1371 0.5736
5.2727 580 0.2167 0.1344 0.5743
5.3636 590 0.1553 0.1297 0.5791
5.4545 600 0.1903 0.1263 0.5790
5.5455 610 0.1388 0.1241 0.5816
5.6364 620 0.2642 0.1231 0.5809
5.7273 630 0.2119 0.1238 0.5792
5.8182 640 0.1767 0.1216 0.5809
5.9091 650 0.2167 0.1218 0.5810
6.0 660 0.26 0.1232 0.5793
6.0909 670 0.1603 0.1222 0.5807
6.1818 680 0.1534 0.1209 0.5794
6.2727 690 0.1742 0.1165 0.5821
6.3636 700 0.1133 0.1120 0.5824
6.4545 710 0.1198 0.1106 0.5817
6.5455 720 0.2019 0.1114 0.5832
6.6364 730 0.2268 0.1116 0.5823
6.7273 740 0.1779 0.1077 0.5887
6.8182 750 0.1586 0.1048 0.5892
6.9091 760 0.2074 0.1057 0.5872
7.0 770 0.1625 0.1091 0.5881
7.0909 780 0.2266 0.1079 0.5900
7.1818 790 0.148 0.1054 0.5895
7.2727 800 0.1248 0.1048 0.5916
7.3636 810 0.1753 0.1047 0.5956
7.4545 820 0.109 0.1045 0.5981
7.5455 830 0.1369 0.1056 0.5953
7.6364 840 0.1209 0.1068 0.5946
7.7273 850 0.182 0.1079 0.5952
7.8182 860 0.1116 0.1083 0.5978
7.9091 870 0.1813 0.1033 0.5985
8.0 880 0.1559 0.1010 0.6027
8.0909 890 0.1384 0.1019 0.6017
8.1818 900 0.1057 0.1034 0.6004
8.2727 910 0.1359 0.1033 0.5994
8.3636 920 0.0909 0.1008 0.6011
8.4545 930 0.0995 0.0986 0.6030
8.5455 940 0.1261 0.0973 0.6046
8.6364 950 0.1031 0.0955 0.6013
8.7273 960 0.1163 0.0949 0.6018
8.8182 970 0.1493 0.0963 0.6041
8.9091 980 0.13 0.0967 0.6044
9.0 990 0.1059 0.0937 0.6044
9.0909 1000 0.1287 0.0923 0.6045
9.1818 1010 0.1019 0.0924 0.6086
9.2727 1020 0.1645 0.0921 0.6086
9.3636 1030 0.1395 0.0931 0.6075
9.4545 1040 0.1067 0.0935 0.6051
9.5455 1050 0.1334 0.0930 0.6058
9.6364 1060 0.136 0.0919 0.6069
9.7273 1070 0.0968 0.0930 0.6052
9.8182 1080 0.1447 0.0946 0.6077
9.9091 1090 0.1288 0.0967 0.6049
10.0 1100 0.1001 0.0960 0.6034
10.0909 1110 0.1642 0.0952 0.6000
10.1818 1120 0.1737 0.0926 0.6028
10.2727 1130 0.1283 0.0906 0.6023
10.3636 1140 0.0959 0.0906 0.6073
10.4545 1150 0.0875 0.0927 0.6065
10.5455 1160 0.1284 0.0934 0.6058
10.6364 1170 0.1482 0.0937 0.6049
10.7273 1180 0.1089 0.0925 0.6018
10.8182 1190 0.0876 0.0896 0.6068
10.9091 1200 0.0849 0.0897 0.6062
11.0 1210 0.1041 0.0897 0.6073
11.0909 1220 0.107 0.0889 0.6043
11.1818 1230 0.1018 0.0868 0.6059
11.2727 1240 0.0835 0.0846 0.6106
11.3636 1250 0.1455 0.0831 0.6069
11.4545 1260 0.1071 0.0832 0.6051
11.5455 1270 0.0777 0.0839 0.6054
11.6364 1280 0.1218 0.0855 0.6051
11.7273 1290 0.0702 0.0862 0.6048
11.8182 1300 0.1017 0.0865 0.6068
11.9091 1310 0.1452 0.0860 0.6074
12.0 1320 0.1563 0.0855 0.6073
12.0909 1330 0.1026 0.0858 0.6102
12.1818 1340 0.108 0.0861 0.6062
12.2727 1350 0.078 0.0854 0.6055
12.3636 1360 0.0655 0.0847 0.6082
12.4545 1370 0.1075 0.0836 0.6085
12.5455 1380 0.0875 0.0846 0.6049
12.6364 1390 0.1082 0.0828 0.6096
12.7273 1400 0.1133 0.0816 0.6077
12.8182 1410 0.0931 0.0814 0.6106
12.9091 1420 0.0728 0.0818 0.6085
13.0 1430 0.1338 0.0827 0.6082
13.0909 1440 0.1232 0.0813 0.6076
13.1818 1450 0.093 0.0796 0.6110
13.2727 1460 0.0994 0.0793 0.6090
13.3636 1470 0.0424 0.0806 0.6109
13.4545 1480 0.0598 0.0833 0.6086
13.5455 1490 0.0813 0.0841 0.6093
13.6364 1500 0.0913 0.0817 0.6125
13.7273 1510 0.1048 0.0801 0.6133
13.8182 1520 0.0503 0.0800 0.6110
13.9091 1530 0.0954 0.0800 0.6111
14.0 1540 0.067 0.0791 0.6099
14.0909 1550 0.0808 0.0779 0.6111
14.1818 1560 0.1047 0.0783 0.6110
14.2727 1570 0.0685 0.0791 0.6125
14.3636 1580 0.1215 0.0793 0.6120
14.4545 1590 0.0761 0.0794 0.6157
14.5455 1600 0.0705 0.0790 0.6136
14.6364 1610 0.0722 0.0785 0.6098
14.7273 1620 0.0881 0.0785 0.6120
14.8182 1630 0.0668 0.0791 0.6122
14.9091 1640 0.1261 0.0787 0.6152
15.0 1650 0.0601 0.0784 0.6148
15.0909 1660 0.0701 0.0799 0.6167
15.1818 1670 0.1244 0.0794 0.6160
15.2727 1680 0.0531 0.0788 0.6174
15.3636 1690 0.0518 0.0780 0.6154
15.4545 1700 0.0961 0.0784 0.6142
15.5455 1710 0.1041 - -
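
For further analysis, the log rows can be parsed as whitespace-separated columns; the last row has no evaluation values, shown as dashes. Below is a small sketch that uses a short excerpt of the rows above and a hypothetical parse_log_rows helper.

```python
def parse_log_rows(text):
    """Parse 'epoch step train_loss val_loss ndcg@100' rows into dicts.

    A dash in an evaluation column is stored as None.
    """
    def to_float(cell):
        return None if cell == "-" else float(cell)

    rows = []
    for line in text.strip().splitlines():
        epoch, step, train_loss, val_loss, ndcg = line.split()
        rows.append({
            "epoch": float(epoch),
            "step": int(step),
            "train_loss": float(train_loss),
            "val_loss": to_float(val_loss),
            "ndcg@100": to_float(ndcg),
        })
    return rows

# Excerpt copied from the table above.
excerpt = """
15.2727 1680 0.0531 0.0788 0.6174
15.3636 1690 0.0518 0.0780 0.6154
15.4545 1700 0.0961 0.0784 0.6142
15.5455 1710 0.1041 - -
"""

rows = parse_log_rows(excerpt)
evaluated = [r for r in rows if r["ndcg@100"] is not None]
best = max(evaluated, key=lambda r: r["ndcg@100"])
print(best["step"], best["ndcg@100"])
# 1680 0.6174
```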

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.0
  • Transformers: 4.46.2
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.1.1
  • Datasets: 3.1.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

GISTEmbedLoss

@misc{solatorio2024gistembed,
    title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
    author={Aivin V. Solatorio},
    year={2024},
    eprint={2402.16829},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Model size: 33.4M params (Safetensors, tensor type F32)