SentenceTransformer based on sentence-transformers/stsb-distilbert-base
This is a sentence-transformers model finetuned from sentence-transformers/stsb-distilbert-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/stsb-distilbert-base
- Maximum Sequence Length: 128 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation (https://sbert.net)
- Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
- Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: DistilBertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("alpha-brain/stsb-distilbert-base-mnrl")
# Run inference
sentences = [
'Do correlations between plasma-neuropeptides and temperament dimensions differ between suicidal patients and healthy controls?',
'Decreased plasma levels of plasma-neuropeptide Y (NPY) and plasma-corticotropin releasing hormone (CRH), and increased levels of plasma delta-sleep inducing peptide (DSIP) in suicide attempters with mood disorders have previously been observed. This study was performed in order to further understand the clinical relevance of these findings.',
"Seven hundred fifty patients entered the study. One hundred sixty-eight patients (22.4%) presented with a total of 193 extracutaneous manifestations, as follows: articular (47.2%), neurologic (17.1%), vascular (9.3%), ocular (8.3%), gastrointestinal (6.2%), respiratory (2.6%), cardiac (1%), and renal (1%). Other autoimmune conditions were present in 7.3% of patients. Neurologic involvement consisted of epilepsy, central nervous system vasculitis, peripheral neuropathy, vascular malformations, headache, and neuroimaging abnormalities. Ocular manifestations were episcleritis, uveitis, xerophthalmia, glaucoma, and papilledema. In more than one-fourth of these children, articular, neurologic, and ocular involvements were unrelated to the site of skin lesions. Raynaud's phenomenon was reported in 16 patients. Respiratory involvement consisted essentially of restrictive lung disease. Gastrointestinal involvement was reported in 12 patients and consisted exclusively of gastroesophageal reflux. Thirty patients (4%) had multiple extracutaneous features, but systemic sclerosis (SSc) developed in only 1 patient. In patients with extracutaneous involvement, the prevalence of antinuclear antibodies and rheumatoid factor was significantly higher than that among patients with only skin involvement. However, Scl-70 and anticentromere, markers of SSc, were not significantly increased.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
Evaluation
Metrics
Information Retrieval
- Dataset: med-eval-dev
- Evaluated with InformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.9825 |
cosine_accuracy@3 | 0.998 |
cosine_accuracy@5 | 0.9985 |
cosine_accuracy@10 | 0.9985 |
cosine_precision@1 | 0.9825 |
cosine_precision@3 | 0.8438 |
cosine_precision@5 | 0.5588 |
cosine_precision@10 | 0.2931 |
cosine_recall@1 | 0.3413 |
cosine_recall@3 | 0.8454 |
cosine_recall@5 | 0.9192 |
cosine_recall@10 | 0.9578 |
cosine_ndcg@10 | 0.9462 |
cosine_mrr@10 | 0.99 |
cosine_map@100 | 0.9169 |
dot_accuracy@1 | 0.9705 |
dot_accuracy@3 | 0.9955 |
dot_accuracy@5 | 0.9985 |
dot_accuracy@10 | 0.999 |
dot_precision@1 | 0.9705 |
dot_precision@3 | 0.8142 |
dot_precision@5 | 0.546 |
dot_precision@10 | 0.2899 |
dot_recall@1 | 0.3366 |
dot_recall@3 | 0.8156 |
dot_recall@5 | 0.8994 |
dot_recall@10 | 0.9481 |
dot_ndcg@10 | 0.9297 |
dot_mrr@10 | 0.9828 |
dot_map@100 | 0.8927 |
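The metrics above come from Sentence Transformers' InformationRetrievalEvaluator. As a rough sketch, a comparable evaluation could be assembled as follows; the queries, corpus, and relevance mapping here are hypothetical placeholders, since the actual med-eval-dev split is not distributed with this card.

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("alpha-brain/stsb-distilbert-base-mnrl")

# Hypothetical evaluation data; ids and texts are placeholders, not the real med-eval-dev split
queries = {"q1": "Do plasma neuropeptide levels differ between suicidal patients and controls?"}
corpus = {
    "d1": "Decreased plasma levels of neuropeptide Y were observed in suicide attempters.",
    "d2": "Arthroscopic subacromial decompression was evaluated with the WORC index.",
}
relevant_docs = {"q1": {"d1"}}  # query id -> set of relevant corpus ids

evaluator = InformationRetrievalEvaluator(queries, corpus, relevant_docs, name="med-eval-dev")
metrics = evaluator(model)  # dict of accuracy@k, precision@k, recall@k, NDCG@10, MRR@10, MAP@100
print(metrics)
```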
Training Details
Training Dataset
Unnamed Dataset
- Size: 622,302 training samples
- Columns: question and contexts
- Approximate statistics based on the first 1000 samples:

| | question | contexts |
|---|---|---|
| type | string | string |
| details | min: 9 tokens, mean: 27.35 tokens, max: 60 tokens | min: 5 tokens, mean: 88.52 tokens, max: 128 tokens |
- Samples:

| question | contexts |
|---|---|
| Does low-level human equivalent gestational lead exposure produce sex-specific motor and coordination abnormalities and late-onset obesity in year-old mice? | Low-level developmental lead exposure is linked to cognitive and neurological disorders in children. However, the long-term effects of gestational lead exposure (GLE) have received little attention. |
| Does insulin in combination with selenium inhibit HG/Pal-induced cardiomyocyte apoptosis by Cbl-b regulating p38MAPK/CBP/Ku70 pathway? | In this study, we investigated whether insulin and selenium in combination (In/Se) suppresses cardiomyocyte apoptosis and whether this protection is mediated by Cbl-b regulating p38MAPK/CBP/Ku70 pathway. |
| Does arthroscopic subacromial decompression result in normal shoulder function after two years in less than 50 % of patients? | The aim of this study was to evaluate the outcome two years after arthroscopic subacromial decompression using the Western Ontario Rotator-Cuff (WORC) index and a diagram-based questionnaire to self-assess active shoulder range of motion (ROM). |
- Loss: MultipleNegativesRankingLoss with these parameters:
  { "scale": 20.0, "similarity_fct": "cos_sim" }
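MultipleNegativesRankingLoss treats each question's paired context as the positive and every other context in the same batch as a negative, which is why a relatively large batch size (64 here) is beneficial. A minimal sketch of instantiating the loss with the parameters listed above:

```python
from sentence_transformers import SentenceTransformer, util
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("sentence-transformers/stsb-distilbert-base")

# scale=20.0 and cosine similarity match the parameters reported above
loss = MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=util.cos_sim)
```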
Evaluation Dataset
Unnamed Dataset
- Size: 32,753 evaluation samples
- Columns: question and contexts
- Approximate statistics based on the first 1000 samples:

| | question | contexts |
|---|---|---|
| type | string | string |
| details | min: 11 tokens, mean: 27.52 tokens, max: 56 tokens | min: 3 tokens, mean: 88.59 tokens, max: 128 tokens |
- Samples:

| question | contexts |
|---|---|
| Does [ Chemical components from essential oil of Pandanus amaryllifolius leave ]? | The essential oil of Pandanus amaryllifolius leaves was analyzed by gas chromatography-mass spectrum, and the relative content of each component was determined by area normalization method. |
| Is elevated C-reactive protein associated with the tumor depth of invasion but not with disease recurrence in stage II and III colorectal cancer? | We previously demonstrated that elevated serum C-reactive protein (CRP) level is associated with depth of tumor invasion in operable colorectal cancer. There is also increasing evidence to show that raised CRP concentration is associated with poor survival in patients with colorectal cancer. The purpose of this study was to investigate the correlation between preoperative CRP concentrations and short-term disease recurrence in cases with stage II and III colorectal cancer. |
| Do neuropeptide Y and peptide YY protect from weight loss caused by Bacille Calmette-Guérin in mice? | Deletion of PYY and NPY aggravated the BCG-induced loss of body weight, which was most pronounced in NPY-/-;PYY-/- mice (maximum loss: 15%). The weight loss in NPY-/-;PYY-/- mice did not normalize during the 2 week observation period. BCG suppressed the circadian pattern of locomotion, exploration and food intake. However, these changes took a different time course than the prolonged weight loss caused by BCG in NPY-/-;PYY-/- mice. The effect of BCG to increase circulating IL-6 (measured 16 days post-treatment) remained unaltered by knockout of PYY, NPY or NPY plus PYY. |
- Loss: MultipleNegativesRankingLoss with these parameters:
  { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 64
- num_train_epochs: 1
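For illustration, these non-default values roughly correspond to a setup along the following lines. The output directory and toy dataset are assumptions for the sketch, the eval_steps value is inferred from the 250-step evaluation cadence in the training logs below, and all other arguments keep the defaults listed under "All Hyperparameters".

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("sentence-transformers/stsb-distilbert-base")
loss = MultipleNegativesRankingLoss(model)

# Toy stand-ins for the (question, contexts) pairs described in the dataset sections
train_dataset = Dataset.from_dict({
    "question": ["Do plasma neuropeptide levels differ between suicidal patients and controls?"],
    "contexts": ["Decreased plasma levels of neuropeptide Y were observed in suicide attempters."],
})
eval_dataset = train_dataset

args = SentenceTransformerTrainingArguments(
    output_dir="stsb-distilbert-base-mnrl",  # hypothetical output path
    num_train_epochs=1,
    per_device_train_batch_size=64,
    eval_strategy="steps",
    eval_steps=250,  # assumption based on the evaluation cadence in the logs below
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()
```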
All Hyperparameters
Click to expand
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
Training Logs
Click to expand
Epoch | Step | Training Loss | Validation Loss | med-eval-dev_cosine_map@100 |
---|---|---|---|---|
0 | 0 | - | - | 0.3328 |
0.0103 | 100 | 0.7953 | - | - |
0.0206 | 200 | 0.5536 | - | - |
0.0257 | 250 | - | 0.1041 | 0.7474 |
0.0309 | 300 | 0.4755 | - | - |
0.0411 | 400 | 0.4464 | - | - |
0.0514 | 500 | 0.3986 | 0.0761 | 0.7786 |
0.0617 | 600 | 0.357 | - | - |
0.0720 | 700 | 0.3519 | - | - |
0.0771 | 750 | - | 0.0685 | 0.8029 |
0.0823 | 800 | 0.3197 | - | - |
0.0926 | 900 | 0.3247 | - | - |
0.1028 | 1000 | 0.3048 | 0.0549 | 0.8108 |
0.1131 | 1100 | 0.2904 | - | - |
0.1234 | 1200 | 0.281 | - | - |
0.1285 | 1250 | - | 0.0503 | 0.8181 |
0.1337 | 1300 | 0.2673 | - | - |
0.1440 | 1400 | 0.2645 | - | - |
0.1543 | 1500 | 0.2511 | 0.0457 | 0.8332 |
0.1645 | 1600 | 0.2541 | - | - |
0.1748 | 1700 | 0.2614 | - | - |
0.1800 | 1750 | - | 0.0401 | 0.8380 |
0.1851 | 1800 | 0.2263 | - | - |
0.1954 | 1900 | 0.2466 | - | - |
0.2057 | 2000 | 0.2297 | 0.0365 | 0.8421 |
0.2160 | 2100 | 0.2225 | - | - |
0.2262 | 2200 | 0.212 | - | - |
0.2314 | 2250 | - | 0.0344 | 0.8563 |
0.2365 | 2300 | 0.2257 | - | - |
0.2468 | 2400 | 0.1953 | - | - |
0.2571 | 2500 | 0.1961 | 0.0348 | 0.8578 |
0.2674 | 2600 | 0.1888 | - | - |
0.2777 | 2700 | 0.2039 | - | - |
0.2828 | 2750 | - | 0.0319 | 0.8610 |
0.2879 | 2800 | 0.1939 | - | - |
0.2982 | 2900 | 0.202 | - | - |
0.3085 | 3000 | 0.1915 | 0.0292 | 0.8678 |
0.3188 | 3100 | 0.1987 | - | - |
0.3291 | 3200 | 0.1877 | - | - |
0.3342 | 3250 | - | 0.0275 | 0.8701 |
0.3394 | 3300 | 0.1874 | - | - |
0.3497 | 3400 | 0.1689 | - | - |
0.3599 | 3500 | 0.169 | 0.0281 | 0.8789 |
0.3702 | 3600 | 0.1631 | - | - |
0.3805 | 3700 | 0.1611 | - | - |
0.3856 | 3750 | - | 0.0263 | 0.8814 |
0.3908 | 3800 | 0.1764 | - | - |
0.4011 | 3900 | 0.1796 | - | - |
0.4114 | 4000 | 0.1729 | 0.0249 | 0.8805 |
0.4216 | 4100 | 0.1551 | - | - |
0.4319 | 4200 | 0.1543 | - | - |
0.4371 | 4250 | - | 0.0241 | 0.8867 |
0.4422 | 4300 | 0.1549 | - | - |
0.4525 | 4400 | 0.1432 | - | - |
0.4628 | 4500 | 0.1592 | 0.0219 | 0.8835 |
0.4731 | 4600 | 0.1517 | - | - |
0.4833 | 4700 | 0.1463 | - | - |
0.4885 | 4750 | - | 0.0228 | 0.8928 |
0.4936 | 4800 | 0.1525 | - | - |
0.5039 | 4900 | 0.1426 | - | - |
0.5142 | 5000 | 0.1524 | 0.0209 | 0.8903 |
0.5245 | 5100 | 0.1443 | - | - |
0.5348 | 5200 | 0.1468 | - | - |
0.5399 | 5250 | - | 0.0212 | 0.8948 |
0.5450 | 5300 | 0.151 | - | - |
0.5553 | 5400 | 0.1443 | - | - |
0.5656 | 5500 | 0.1438 | 0.0212 | 0.8982 |
0.5759 | 5600 | 0.1409 | - | - |
0.5862 | 5700 | 0.1346 | - | - |
0.5913 | 5750 | - | 0.0207 | 0.8983 |
0.5965 | 5800 | 0.1315 | - | - |
0.6067 | 5900 | 0.1425 | - | - |
0.6170 | 6000 | 0.136 | 0.0188 | 0.8970 |
0.6273 | 6100 | 0.1426 | - | - |
0.6376 | 6200 | 0.1353 | - | - |
0.6427 | 6250 | - | 0.0185 | 0.8969 |
0.6479 | 6300 | 0.1269 | - | - |
0.6582 | 6400 | 0.1159 | - | - |
0.6684 | 6500 | 0.1311 | 0.0184 | 0.9028 |
0.6787 | 6600 | 0.1179 | - | - |
0.6890 | 6700 | 0.115 | - | - |
0.6942 | 6750 | - | 0.0184 | 0.9046 |
0.6993 | 6800 | 0.1254 | - | - |
0.7096 | 6900 | 0.1233 | - | - |
0.7199 | 7000 | 0.122 | 0.0174 | 0.9042 |
0.7302 | 7100 | 0.1238 | - | - |
0.7404 | 7200 | 0.1257 | - | - |
0.7456 | 7250 | - | 0.0175 | 0.9074 |
0.7507 | 7300 | 0.1222 | - | - |
0.7610 | 7400 | 0.1194 | - | - |
0.7713 | 7500 | 0.1284 | 0.0166 | 0.9080 |
0.7816 | 7600 | 0.1147 | - | - |
0.7919 | 7700 | 0.1182 | - | - |
0.7970 | 7750 | - | 0.0170 | 0.9116 |
0.8021 | 7800 | 0.1157 | - | - |
0.8124 | 7900 | 0.1299 | - | - |
0.8227 | 8000 | 0.114 | 0.0163 | 0.9105 |
0.8330 | 8100 | 0.1141 | - | - |
0.8433 | 8200 | 0.1195 | - | - |
0.8484 | 8250 | - | 0.0160 | 0.9112 |
0.8536 | 8300 | 0.1073 | - | - |
0.8638 | 8400 | 0.1044 | - | - |
0.8741 | 8500 | 0.1083 | 0.0160 | 0.9153 |
0.8844 | 8600 | 0.1103 | - | - |
0.8947 | 8700 | 0.1145 | - | - |
0.8998 | 8750 | - | 0.0154 | 0.9133 |
0.9050 | 8800 | 0.1083 | - | - |
0.9153 | 8900 | 0.1205 | - | - |
0.9255 | 9000 | 0.1124 | 0.0153 | 0.9162 |
0.9358 | 9100 | 0.1067 | - | - |
0.9461 | 9200 | 0.116 | - | - |
0.9513 | 9250 | - | 0.0152 | 0.9171 |
0.9564 | 9300 | 0.1126 | - | - |
0.9667 | 9400 | 0.1075 | - | - |
0.9770 | 9500 | 0.1128 | 0.0149 | 0.9169 |
0.9872 | 9600 | 0.1143 | - | - |
0.9975 | 9700 | 0.1175 | - | - |
Framework Versions
- Python: 3.10.14
- Sentence Transformers: 3.1.1
- Transformers: 4.44.2
- PyTorch: 2.4.0
- Accelerate: 0.34.2
- Datasets: 3.0.0
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}