SentenceTransformer based on BAAI/bge-small-en-v1.5

This is a sentence-transformers model fine-tuned from BAAI/bge-small-en-v1.5. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: BAAI/bge-small-en-v1.5
Maximum Sequence Length: 512 tokens
Output Dimensionality: 384 dimensions
Similarity Function: Cosine Similarity

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
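
The pooling and normalization stages listed above can be illustrated with a minimal, dependency-free sketch. It assumes the Transformer stage has already produced one vector per token; the function name `cls_pool_and_normalize` and the toy 4-dimensional vectors are hypothetical and only stand in for the real 384-dimensional outputs.

```python
import math

def cls_pool_and_normalize(token_embeddings):
    """Keep the first ([CLS]) token vector and scale it to unit L2 norm."""
    # Corresponds to pooling_mode_cls_token=True: only the first token is used.
    cls_vector = token_embeddings[0]
    norm = math.sqrt(sum(x * x for x in cls_vector))
    if norm == 0.0:
        return list(cls_vector)  # avoid dividing by zero for an all-zero vector
    # Corresponds to the Normalize() module: output has length 1.
    return [x / norm for x in cls_vector]

# Toy input: 3 tokens with 4 dimensions each (hypothetical values).
tokens = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.2, 0.1, 0.9],
]
sentence_embedding = cls_pool_and_normalize(tokens)
print(sentence_embedding)
```

Because the output is unit-length, the cosine similarity between two embeddings reduces to their dot product.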

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("himanshu23099/bge_embedding_finetune1")
# Run inference
sentences = [
    'How long does it typically take to enter or exit the parking area during peak times?',
    'The time to enter or exit the parking area during peak times can vary based on crowd density, time of day, and traffic management. Generally, it takes about 2 to 10 minutes.',
    'In a remote village, the annual kite festival attracts many visitors who come to see the vibrant displays. The event showcases dozens of kites soaring high, each crafted with unique designs. Local artisans prepare for months, selecting colors and materials to make the best creations. Everyone enjoys the lively atmosphere filled with music and laughter.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
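
For semantic search, a common next step is to rank candidate passages by their similarity to a query. The sketch below shows that ranking step in plain Python on toy 2-dimensional vectors, assuming the vectors are already unit-length (as the Normalize() module would produce). The helper `rank_by_cosine` and the example vectors are hypothetical; with the real model, the inputs would be rows of the `model.encode(...)` output.

```python
def rank_by_cosine(query_vec, doc_vecs):
    """Return (doc_index, score) pairs sorted by descending similarity.

    Assumes every vector is unit-length, so the dot product equals
    the cosine similarity.
    """
    scores = [sum(q * d for q, d in zip(query_vec, doc)) for doc in doc_vecs]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)

# Hypothetical unit vectors standing in for real embeddings.
query = [0.6, 0.8]
docs = [
    [1.0, 0.0],
    [0.6, 0.8],  # identical direction to the query
    [0.0, 1.0],
]
ranking = rank_by_cosine(query, docs)
for doc_index, score in ranking:
    print(doc_index, round(score, 4))
```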

Evaluation

Metrics

Information Retrieval

Dataset: val_evaluator
Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.3444
cosine_accuracy@5	0.7229
cosine_accuracy@10	0.8039
cosine_precision@1	0.3444
cosine_precision@5	0.1446
cosine_precision@10	0.0804
cosine_recall@1	0.3444
cosine_recall@5	0.7229
cosine_recall@10	0.8039
cosine_ndcg@5	0.5504
cosine_ndcg@10	0.5766
cosine_ndcg@100	0.6142
cosine_mrr@5	0.4926
cosine_mrr@10	0.5034
cosine_mrr@100	0.5113
cosine_map@100	0.5113
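
In the table above, accuracy@k equals recall@k, and precision@k equals accuracy@k divided by k (for example, 0.7229 / 5 ≈ 0.1446). That pattern is what one would expect if each query has exactly one relevant passage. Under that assumption, which the card does not state explicitly, the metrics can be sketched as follows. The function `ir_metrics_at_k` is a hypothetical illustration, not the code of InformationRetrievalEvaluator.

```python
import math

def ir_metrics_at_k(ranks, k):
    """Compute retrieval metrics when each query has one relevant document.

    ranks: 1-based rank of the relevant document for each query,
           or None if it was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    return {
        "accuracy": accuracy,
        "precision": accuracy / k,  # at most 1 relevant result among k
        "recall": accuracy,         # the single relevant doc is found or not
        "mrr": sum(1.0 / r for r, h in zip(ranks, hits) if h) / n,
        # With one relevant document the ideal DCG is 1, so NDCG = DCG.
        "ndcg": sum(1.0 / math.log2(r + 1) for r, h in zip(ranks, hits) if h) / n,
    }

# Hypothetical ranks for four queries.
metrics = ir_metrics_at_k([1, 3, None, 7], k=5)
print(metrics)
```

With a single relevant document per query, average precision equals the reciprocal rank, which is consistent with cosine_mrr@100 and cosine_map@100 both being 0.5113.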

Training Details

Training Dataset

Unnamed Dataset

Size: 3,507 training samples
Columns: anchor, positive, and negative

Approximate statistics based on the first 1000 samples:

	anchor	positive	negative
type	string	string	string
details	min: 5 tokens, mean: 12.02 tokens, max: 32 tokens	min: 3 tokens, mean: 117.69 tokens, max: 504 tokens	min: 15 tokens, mean: 119.62 tokens, max: 422 tokens

Samples:

anchor	positive	negative
`Tour departs how city`	`What is the itinerary for 1-day Maihar tour? Maihar tour departs from Hotel Ilawart, Prayagraj at 7:00 AM and includes visit to Maa Sharda Devi Temple located atop Trikoota Hill. For more details and booking, click here: https://bit.ly/3YBcbI6 List of Aliases: [['Allahabad', 'PYG', 'Prayagraj']]`	`What one-day outstation tours are available from Prayagraj? The one-day outstation tours from Prayagraj include destinations such as Ayodhya, Varanasi, Maihar, and Chitrakoot. These tours offer a quick yet enriching journey to some of the most significant spiritual and cultural sites near Prayagraj. For more details, visit : https://bit.ly/4eWFRoH`
`How train for Prayag reach`	`Which airlines operate flights to Prayagraj? Several airlines operate flights to Prayagraj, India. However, availability may depend on your location and the time of travel. Some of the airlines that typically operate flights to Prayagraj include: 1. Air India 2. IndiGo 3. SpiceJet For the most accurate and up-to-date information on train timings to Prayagraj, please visit the IRCTC website https://www.irctc.co.in/nget/ List of Aliases: [['Allahabad', 'PYG', 'Prayagraj']]`	What is the best train route to Prayagraj from Ayodhya? To travel by train from Ayodhya to Prayagraj, you can use the Indian Railways' services. Here is a general guide for the route: 1. Ayodhya Cantt (AY) to Prayagraj Junction (PRYJ) via Train No. 14203: This is one of the direct trains to Prayagraj from Ayodhya. It generally runs on Tuesday and Friday. 2. Ayodhya Cantt (AY) to Prayagraj Rambag (PRRB) via Train No. 14205: This train runs regularly and is another direct route to Prayagraj. For the most accurate and up-to-date information on train timings to Prayagraj, please visit the IRCTC website https://www.irctc.co.in/nget/
`Why should one do the Prayagraj Panchkoshi Parikrama?`	The Prayagraj Panchkoshi Parikrama is a deeply revered spiritual journey that offers multiple benefits to devotees. It is believed to grant blessings equivalent to visiting all sacred pilgrimage sites in India, providing divine grace and spiritual merit. The Parikrama route covers significant temples like the Dwadash Madhav temples, Akshayavat, and Mankameshwar, which are steeped in Hindu mythology and history, allowing pilgrims to connect with the spiritual and cultural heritage of Prayagraj. This circumambulation around sacred sites is also seen as a way to cleanse one's sins and progress towards Moksha (liberation from the cycle of birth and rebirth), making it a path of introspection and spiritual growth. The pilgrimage fosters unity among people from diverse backgrounds, offering a unique cultural exchange and shared spiritual experience. By participating, devotees also help revive an ancient tradition integral to the Kumbh Mela for centuries, reconnecting with age-old practices t...	Elevators are remarkable inventions that revolutionized how we navigate tall buildings. They provide a swift, efficient means of transportation between floors, making urban life more accessible. These mechanical wonders operate on a system of pulleys and counterweights, enabling them to carry heavy loads effortlessly. Safety features like emergency brakes and backup power systems ensure that passengers remain secure during their journey. Various designs and styles can be seen in buildings around the world, from sleek modern glass models to vintage models that evoke nostalgia. Elevators also highlight the advancement of engineering and technology over time, evolving from rudimentary designs to sophisticated machines with smart technology. They are essential in various settings, including residential, commercial, and industrial spaces, offering convenience and practicality. Their presence also allows for the efficient use of vertical space, fostering creativity in architectural designs a...

Loss: GISTEmbedLoss with these parameters:

{'guide': SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
), 'temperature': 0.01}
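
The idea behind guided in-sample negative selection can be sketched in plain Python. This is a simplified illustration and not the sentence-transformers implementation: it only considers anchor-to-positive similarities, and it takes precomputed similarity matrices as input instead of running a trained model and a guide model. The function name `gist_style_loss` and the toy matrices are hypothetical. For each anchor, an in-batch candidate is dropped when the guide rates it as more similar to the anchor than the anchor's own positive, since it is then likely a false negative.

```python
import math

def gist_style_loss(student_sim, guide_sim, temperature=0.01):
    """Contrastive loss with guide-based masking of likely false negatives.

    student_sim[i][j]: similarity of anchor i and positive j (trained model).
    guide_sim[i][j]:   the same similarity according to the guide model.
    """
    n = len(student_sim)
    total = 0.0
    for i in range(n):
        threshold = guide_sim[i][i]  # guide's score for the true pair
        logits = []
        for j in range(n):
            if j != i and guide_sim[i][j] > threshold:
                continue  # masked: guide thinks this "negative" is too similar
            logits.append(student_sim[i][j] / temperature)
        # Numerically stable log-sum-exp over the remaining candidates.
        max_logit = max(logits)
        log_sum = max_logit + math.log(sum(math.exp(v - max_logit) for v in logits))
        total += log_sum - student_sim[i][i] / temperature  # cross-entropy
    return total / n

# Hypothetical 2x2 similarity matrices.
student = [[0.90, 0.95], [0.10, 0.80]]
guide_masking = [[0.7, 0.9], [0.2, 0.8]]   # masks candidate 1 for anchor 0
guide_neutral = [[1.0, 0.0], [0.0, 1.0]]   # masks nothing
loss_masked = gist_style_loss(student, guide_masking)
loss_unmasked = gist_style_loss(student, guide_neutral)
print(loss_masked, loss_unmasked)
```

A low temperature such as 0.01 sharpens the softmax, so even small similarity gaps between the positive and the remaining negatives change the loss substantially.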

Evaluation Dataset

Unnamed Dataset

Size: 877 evaluation samples
Columns: anchor, positive, and negative

Approximate statistics based on the first 877 samples:

	anchor	positive	negative
type	string	string	string
details	min: 4 tokens, mean: 12.13 tokens, max: 32 tokens	min: 3 tokens, mean: 117.82 tokens, max: 504 tokens	min: 8 tokens, mean: 117.68 tokens, max: 422 tokens

Samples:

anchor	positive	negative
`Akhara means what`	`Is the word Akhara related to Akhand? Many scholars believe that the word 'Akhara' originated from the word 'Akhand.' Initially, a group of armed ascetics was referred to as 'Akhand.' Over time, when these 'Akhand' groups evolved into centers for training in weaponry and martial arts, they came to be known as 'Akhara.' List of Aliases: [['Akhand', 'Akhara', 'Kalpwasi Camp', 'Naga', 'Nagas', 'Sadhu', 'sadhus']]`	Why did Adi Shankaracharya organize the Akharas? According to the evidence available in the Akharas and the descriptions mentioned in their history, centuries ago, Adi Shankaracharya established these Akharas with the purpose of protecting Hindu temples and monasteries from foreign and non-believer invaders, as well as safeguarding the followers of Hinduism. Adi Shankaracharya believed that young saints should not only be proficient in scriptures (Shastra) but also in the art of weaponry (Shastra), so they could fulfill the duty of protecting the monasteries, temples, and their followers when necessary.
`Why do so many people gather for this?`	Millions gather for the Kumbh Mela due to its profound spiritual, cultural, and social significance. Rooted in ancient Hindu mythology, the Mela is believed to be an auspicious time when bathing in the sacred rivers—Ganga, Yamuna, and Saraswati—can cleanse sins and lead to spiritual liberation (Moksha). The event, occurring during rare celestial alignments, amplifies these spiritual benefits. It is a unique confluence of faith, where people from diverse backgrounds come together, creating a “mini-India” that fosters unity in diversity. \n The Mela also offers opportunities for spiritual learning through discourses by saints, religious rituals like Kalpvas, Deep Daan, and cultural performances. Moreover, the Kumbh Mela is a rare platform for connecting with spiritual leaders, experiencing religious tolerance, and participating in one of the world's largest peaceful gatherings, making it a must-attend event for millions seeking spiritual growth, community, and divine blessings.	In the bustling world of urban development, architects and city planners often seek innovative solutions to optimize living spaces. The integration of green spaces within urban environments not only enhances aesthetic appeal but also significantly improves residents' quality of life. Vertical gardens, rooftops, and community parks play a crucial role in providing habitats for local wildlife while promoting biodiversity in densely populated areas. Furthermore, advancements in sustainable technology, such as solar panels and rainwater harvesting systems, are being incorporated into these designs, offering environmentally friendly alternatives that reduce utility costs for residents. Public art installations also contribute to community identity, fostering a sense of belonging among citizens. Collaborative efforts between various stakeholders—governments, private sectors, and local communities—are essential to ensure these projects reflect the needs and desires of the people. The succ...
`Do parking charges vary between different parking zones or proximity to the Mela grounds?`	`No, the parking charges are standardized and remain the same throughout, regardless of the parking zone or proximity to the Mela grounds. Charges are fixed at ₹5 for cycles, ₹15 for two-wheelers, ₹65 for 3-4 wheelers, and ₹260 for buses and heavy vehicles for 24 hours.`	`The ancient art of pottery involves molding clay into various shapes before firing it in a kiln. Traditionally, artisans use hand tools and techniques passed down through generations. Each region often has its own distinctive styles, resulting in a rich diversity of forms, glazes, and colors. Pottery can serve practical purposes, such as in cooking and storage, while also being a medium for artistic expression and cultural storytelling.`

Loss: GISTEmbedLoss with these parameters:

{'guide': SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
), 'temperature': 0.01}

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: steps
per_device_train_batch_size: 16
gradient_accumulation_steps: 2
learning_rate: 1e-05
weight_decay: 0.01
num_train_epochs: 30
warmup_ratio: 0.1
load_best_model_at_end: True
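
The step counts implied by these settings can be worked out with a little arithmetic. The sketch below assumes training on a single device and a run that covers all 30 epochs; neither is stated in the card, and the variable names are illustrative. The result of 110 optimizer steps per epoch agrees with the training log, where step 110 corresponds to epoch 1.0.

```python
import math

# Values taken from the card.
train_samples = 3507
per_device_train_batch_size = 16
gradient_accumulation_steps = 2
num_train_epochs = 30
warmup_ratio = 0.1

# Assumption: a single device was used.
num_devices = 1

# Samples contributing to one optimizer update.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)
# Batches per epoch; the last partial batch is kept (dataloader_drop_last: False).
batches_per_epoch = math.ceil(
    train_samples / (per_device_train_batch_size * num_devices)
)
steps_per_epoch = batches_per_epoch // gradient_accumulation_steps
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(effective_batch_size, steps_per_epoch, total_steps, warmup_steps)
```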

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 16
per_device_eval_batch_size: 8
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 2
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 1e-05
weight_decay: 0.01
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 30
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: True
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: False
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: batch_sampler
multi_dataset_batch_sampler: proportional

Training Logs

Epoch	Step	Training Loss	Validation Loss	val_evaluator_cosine_ndcg@100
0.0909	10	1.9717	1.2192	0.4285
0.1818	20	1.8228	1.1896	0.4307
0.2727	30	1.9999	1.1429	0.4310
0.3636	40	1.6463	1.0845	0.4311
0.4545	50	1.9207	1.0205	0.4334
0.5455	60	1.5777	0.9509	0.4338
0.6364	70	1.4277	0.8810	0.4376
0.7273	80	1.408	0.8130	0.4432
0.8182	90	1.3565	0.7535	0.4436
0.9091	100	1.3322	0.6935	0.4495
1.0	110	0.8344	0.6420	0.4518
1.0909	120	1.1696	0.5956	0.4515
1.1818	130	0.9622	0.5524	0.4565
1.2727	140	0.9005	0.5173	0.4616
1.3636	150	0.962	0.4802	0.4662
1.4545	160	0.7924	0.4497	0.4693
1.5455	170	0.8955	0.4262	0.4711
1.6364	180	0.7652	0.4031	0.4736
1.7273	190	0.7517	0.3804	0.4773
1.8182	200	0.5669	0.3636	0.4784
1.9091	210	0.6641	0.3469	0.4813
2.0	220	0.5227	0.3267	0.4820
2.0909	230	0.6146	0.3075	0.4843
2.1818	240	0.4709	0.2908	0.4882
2.2727	250	0.5963	0.2780	0.4955
2.3636	260	0.5103	0.2668	0.4977
2.4545	270	0.4833	0.2566	0.5027
2.5455	280	0.4389	0.2431	0.5045
2.6364	290	0.4653	0.2317	0.5059
2.7273	300	0.3559	0.2263	0.5086
2.8182	310	0.4623	0.2197	0.5127
2.9091	320	0.3889	0.2103	0.5183
3.0	330	0.4014	0.2037	0.5206
3.0909	340	0.2977	0.1999	0.5228
3.1818	350	0.4656	0.1956	0.5266
3.2727	360	0.436	0.1873	0.5288
3.3636	370	0.3111	0.1803	0.5311
3.4545	380	0.333	0.1759	0.5325
3.5455	390	0.2899	0.1717	0.5381
3.6364	400	0.4245	0.1663	0.5419
3.7273	410	0.4247	0.1658	0.5421
3.8182	420	0.2251	0.1646	0.5442
3.9091	430	0.2784	0.1635	0.5448
4.0	440	0.2503	0.1613	0.5490
4.0909	450	0.2342	0.1588	0.5501
4.1818	460	0.3139	0.1584	0.5527
4.2727	470	0.2356	0.1552	0.5498
4.3636	480	0.3147	0.1496	0.5518
4.4545	490	0.2691	0.1469	0.5508
4.5455	500	0.2639	0.1466	0.5561
4.6364	510	0.1581	0.1432	0.5625
4.7273	520	0.1922	0.1406	0.5663
4.8182	530	0.2453	0.1406	0.5688
4.9091	540	0.2631	0.1399	0.5705
5.0	550	0.3324	0.1402	0.5681
5.0909	560	0.1801	0.1389	0.5715
5.1818	570	0.2096	0.1371	0.5736
5.2727	580	0.2167	0.1344	0.5743
5.3636	590	0.1553	0.1297	0.5791
5.4545	600	0.1903	0.1263	0.5790
5.5455	610	0.1388	0.1241	0.5816
5.6364	620	0.2642	0.1231	0.5809
5.7273	630	0.2119	0.1238	0.5792
5.8182	640	0.1767	0.1216	0.5809
5.9091	650	0.2167	0.1218	0.5810
6.0	660	0.26	0.1232	0.5793
6.0909	670	0.1603	0.1222	0.5807
6.1818	680	0.1534	0.1209	0.5794
6.2727	690	0.1742	0.1165	0.5821
6.3636	700	0.1133	0.1120	0.5824
6.4545	710	0.1198	0.1106	0.5817
6.5455	720	0.2019	0.1114	0.5832
6.6364	730	0.2268	0.1116	0.5823
6.7273	740	0.1779	0.1077	0.5887
6.8182	750	0.1586	0.1048	0.5892
6.9091	760	0.2074	0.1057	0.5872
7.0	770	0.1625	0.1091	0.5881
7.0909	780	0.2266	0.1079	0.5900
7.1818	790	0.148	0.1054	0.5895
7.2727	800	0.1248	0.1048	0.5916
7.3636	810	0.1753	0.1047	0.5956
7.4545	820	0.109	0.1045	0.5981
7.5455	830	0.1369	0.1056	0.5953
7.6364	840	0.1209	0.1068	0.5946
7.7273	850	0.182	0.1079	0.5952
7.8182	860	0.1116	0.1083	0.5978
7.9091	870	0.1813	0.1033	0.5985
8.0	880	0.1559	0.1010	0.6027
8.0909	890	0.1384	0.1019	0.6017
8.1818	900	0.1057	0.1034	0.6004
8.2727	910	0.1359	0.1033	0.5994
8.3636	920	0.0909	0.1008	0.6011
8.4545	930	0.0995	0.0986	0.6030
8.5455	940	0.1261	0.0973	0.6046
8.6364	950	0.1031	0.0955	0.6013
8.7273	960	0.1163	0.0949	0.6018
8.8182	970	0.1493	0.0963	0.6041
8.9091	980	0.13	0.0967	0.6044
9.0	990	0.1059	0.0937	0.6044
9.0909	1000	0.1287	0.0923	0.6045
9.1818	1010	0.1019	0.0924	0.6086
9.2727	1020	0.1645	0.0921	0.6086
9.3636	1030	0.1395	0.0931	0.6075
9.4545	1040	0.1067	0.0935	0.6051
9.5455	1050	0.1334	0.0930	0.6058
9.6364	1060	0.136	0.0919	0.6069
9.7273	1070	0.0968	0.0930	0.6052
9.8182	1080	0.1447	0.0946	0.6077
9.9091	1090	0.1288	0.0967	0.6049
10.0	1100	0.1001	0.0960	0.6034
10.0909	1110	0.1642	0.0952	0.6000
10.1818	1120	0.1737	0.0926	0.6028
10.2727	1130	0.1283	0.0906	0.6023
10.3636	1140	0.0959	0.0906	0.6073
10.4545	1150	0.0875	0.0927	0.6065
10.5455	1160	0.1284	0.0934	0.6058
10.6364	1170	0.1482	0.0937	0.6049
10.7273	1180	0.1089	0.0925	0.6018
10.8182	1190	0.0876	0.0896	0.6068
10.9091	1200	0.0849	0.0897	0.6062
11.0	1210	0.1041	0.0897	0.6073
11.0909	1220	0.107	0.0889	0.6043
11.1818	1230	0.1018	0.0868	0.6059
11.2727	1240	0.0835	0.0846	0.6106
11.3636	1250	0.1455	0.0831	0.6069
11.4545	1260	0.1071	0.0832	0.6051
11.5455	1270	0.0777	0.0839	0.6054
11.6364	1280	0.1218	0.0855	0.6051
11.7273	1290	0.0702	0.0862	0.6048
11.8182	1300	0.1017	0.0865	0.6068
11.9091	1310	0.1452	0.0860	0.6074
12.0	1320	0.1563	0.0855	0.6073
12.0909	1330	0.1026	0.0858	0.6102
12.1818	1340	0.108	0.0861	0.6062
12.2727	1350	0.078	0.0854	0.6055
12.3636	1360	0.0655	0.0847	0.6082
12.4545	1370	0.1075	0.0836	0.6085
12.5455	1380	0.0875	0.0846	0.6049
12.6364	1390	0.1082	0.0828	0.6096
12.7273	1400	0.1133	0.0816	0.6077
12.8182	1410	0.0931	0.0814	0.6106
12.9091	1420	0.0728	0.0818	0.6085
13.0	1430	0.1338	0.0827	0.6082
13.0909	1440	0.1232	0.0813	0.6076
13.1818	1450	0.093	0.0796	0.6110
13.2727	1460	0.0994	0.0793	0.6090
13.3636	1470	0.0424	0.0806	0.6109
13.4545	1480	0.0598	0.0833	0.6086
13.5455	1490	0.0813	0.0841	0.6093
13.6364	1500	0.0913	0.0817	0.6125
13.7273	1510	0.1048	0.0801	0.6133
13.8182	1520	0.0503	0.0800	0.6110
13.9091	1530	0.0954	0.0800	0.6111
14.0	1540	0.067	0.0791	0.6099
14.0909	1550	0.0808	0.0779	0.6111
14.1818	1560	0.1047	0.0783	0.6110
14.2727	1570	0.0685	0.0791	0.6125
14.3636	1580	0.1215	0.0793	0.6120
14.4545	1590	0.0761	0.0794	0.6157
14.5455	1600	0.0705	0.0790	0.6136
14.6364	1610	0.0722	0.0785	0.6098
14.7273	1620	0.0881	0.0785	0.6120
14.8182	1630	0.0668	0.0791	0.6122
14.9091	1640	0.1261	0.0787	0.6152
15.0	1650	0.0601	0.0784	0.6148
15.0909	1660	0.0701	0.0799	0.6167
15.1818	1670	0.1244	0.0794	0.6160
15.2727	1680	0.0531	0.0788	0.6174
15.3636	1690	0.0518	0.0780	0.6154
15.4545	1700	0.0961	0.0784	0.6142
15.5455	1710	0.1041	-	-

Framework Versions

Python: 3.10.12
Sentence Transformers: 3.3.0
Transformers: 4.46.2
PyTorch: 2.5.1+cu121
Accelerate: 1.1.1
Datasets: 3.1.0
Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

GISTEmbedLoss

@misc{solatorio2024gistembed,
    title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
    author={Aivin V. Solatorio},
    year={2024},
    eprint={2402.16829},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
