Paper: Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks (arXiv:1908.10084)
How to use alpha-brain/stsb-distilbert-base-mnrl with sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("alpha-brain/stsb-distilbert-base-mnrl")
sentences = [
"Does fTO Genotype interact with Improvement in Aerobic Fitness on Body Weight Loss During Lifestyle Intervention?",
"The study population count 46 550 male workers, 1670 (3.6%) of whom incurred at least one work-related injury requiring admission to hospital within a period of 5 years following hearing tests conducted between 1987 and 2005. The noise exposure and hearing loss-related data were gathered during occupational noise-induced hearing loss (NIHL) screening. The hospital data were used to identify all members of the study population who were admitted, and the reason for admission. Finally, access to the death-related data made it possible to identify participants who died during the course of the study. Cox proportional hazards model taking into account hearing status, noise levels, age and cumulative duration of noise exposure at the time of the hearing test established the risk of work-related injuries leading to admission to hospital.",
"Carriers of a hereditary mutation in BRCA are at high risk for breast and ovarian cancer. The first person from a family known to carry the mutation, the index person, has to share genetic information with relatives. This study is aimed at determining the number of relatives tested for a BRCA mutation, and the exploration of facilitating and debilitating factors in the transmission of genetic information from index patient to relatives.",
"Not every participant responds with a comparable body weight loss to lifestyle intervention, despite the same compliance. Genetic factors may explain parts of this difference. Variation in fat mass and obesity-associated gene (FTO) is the strongest common genetic determinant of body weight. The aim of the present study was to evaluate the impact of FTO genotype differences in the link between improvement of fitness and reduction of body weight during a lifestyle intervention."
]
embeddings = model.encode(sentences)
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [4, 4]

This is a sentence-transformers model finetuned from sentence-transformers/stsb-distilbert-base. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
SentenceTransformer(
(0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: DistilBertModel
(1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
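The Pooling module above has only `pooling_mode_mean_tokens` enabled, so the sentence vector is the average of the token vectors. The following is a minimal sketch of that idea in plain Python, not the library's implementation; the function name `mean_pool` and the toy inputs are hypothetical, and it assumes padding tokens are excluded through an attention mask.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask entry is 1; padding (mask 0) is ignored."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vector):
                total[i] += value
    if count == 0:
        raise ValueError("attention_mask selects no tokens")
    return [value / count for value in total]


# Toy input: three tokens of dimension 2; the last token is padding.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
pooled = mean_pool(tokens, mask)
print(pooled)  # [2.0, 3.0]
```

In the real model each token vector has 768 dimensions and at most 128 tokens are kept per input (`max_seq_length`), so longer texts are truncated before pooling.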
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("alpha-brain/stsb-distilbert-base-mnrl")
# Run inference
sentences = [
'Do correlations between plasma-neuropeptides and temperament dimensions differ between suicidal patients and healthy controls?',
'Decreased plasma levels of plasma-neuropeptide Y (NPY) and plasma-corticotropin releasing hormone (CRH), and increased levels of plasma delta-sleep inducing peptide (DSIP) in suicide attempters with mood disorders have previously been observed. This study was performed in order to further understand the clinical relevance of these findings.',
"Seven hundred fifty patients entered the study. One hundred sixty-eight patients (22.4%) presented with a total of 193 extracutaneous manifestations, as follows: articular (47.2%), neurologic (17.1%), vascular (9.3%), ocular (8.3%), gastrointestinal (6.2%), respiratory (2.6%), cardiac (1%), and renal (1%). Other autoimmune conditions were present in 7.3% of patients. Neurologic involvement consisted of epilepsy, central nervous system vasculitis, peripheral neuropathy, vascular malformations, headache, and neuroimaging abnormalities. Ocular manifestations were episcleritis, uveitis, xerophthalmia, glaucoma, and papilledema. In more than one-fourth of these children, articular, neurologic, and ocular involvements were unrelated to the site of skin lesions. Raynaud's phenomenon was reported in 16 patients. Respiratory involvement consisted essentially of restrictive lung disease. Gastrointestinal involvement was reported in 12 patients and consisted exclusively of gastroesophageal reflux. Thirty patients (4%) had multiple extracutaneous features, but systemic sclerosis (SSc) developed in only 1 patient. In patients with extracutaneous involvement, the prevalence of antinuclear antibodies and rheumatoid factor was significantly higher than that among patients with only skin involvement. However, Scl-70 and anticentromere, markers of SSc, were not significantly increased.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
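To make the shape of the similarity matrix concrete, here is a small stand-alone sketch of a pairwise cosine similarity computation. It assumes the model's similarity function is cosine similarity, which matches the `cos_sim` setting used for training on this card but is not confirmed here for `model.similarity`; the helper name and the toy vectors are hypothetical.

```python
import math
from typing import List


def cosine_similarity_matrix(vectors: List[List[float]]) -> List[List[float]]:
    """Return an n x n matrix whose entry (i, j) is the cosine of vectors i and j."""
    norms = [math.sqrt(sum(x * x for x in v)) for v in vectors]
    return [
        [sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
         for v, norm_v in zip(vectors, norms)]
        for u, norm_u in zip(vectors, norms)
    ]


# Three toy 2-dimensional "embeddings" stand in for the 768-dimensional ones.
toy = [[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
sims = cosine_similarity_matrix(toy)
for row in sims:
    print([round(value, 4) for value in row])
```

Row `i` holds the similarity of input `i` to every input, so the diagonal is 1 and the matrix is symmetric. For retrieval, the largest off-diagonal entry in a question's row points at its best-matching context.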
Dataset: med-eval-dev, evaluated with InformationRetrievalEvaluator

| Metric | Value |
|---|---|
| cosine_accuracy@1 | 0.9825 |
| cosine_accuracy@3 | 0.998 |
| cosine_accuracy@5 | 0.9985 |
| cosine_accuracy@10 | 0.9985 |
| cosine_precision@1 | 0.9825 |
| cosine_precision@3 | 0.8438 |
| cosine_precision@5 | 0.5588 |
| cosine_precision@10 | 0.2931 |
| cosine_recall@1 | 0.3413 |
| cosine_recall@3 | 0.8454 |
| cosine_recall@5 | 0.9192 |
| cosine_recall@10 | 0.9578 |
| cosine_ndcg@10 | 0.9462 |
| cosine_mrr@10 | 0.99 |
| cosine_map@100 | 0.9169 |
| dot_accuracy@1 | 0.9705 |
| dot_accuracy@3 | 0.9955 |
| dot_accuracy@5 | 0.9985 |
| dot_accuracy@10 | 0.999 |
| dot_precision@1 | 0.9705 |
| dot_precision@3 | 0.8142 |
| dot_precision@5 | 0.546 |
| dot_precision@10 | 0.2899 |
| dot_recall@1 | 0.3366 |
| dot_recall@3 | 0.8156 |
| dot_recall@5 | 0.8994 |
| dot_recall@10 | 0.9481 |
| dot_ndcg@10 | 0.9297 |
| dot_mrr@10 | 0.9828 |
| dot_map@100 | 0.8927 |
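The table mixes metrics that behave differently as k grows: precision@10 is low while recall@10 is high, which is what one would expect when each question has only a few relevant contexts. The sketch below computes the per-query quantities for a toy ranking. The definitions follow the usual information-retrieval conventions and are an assumption about what the evaluator reports, not a copy of its code; the function name and document ids are hypothetical.

```python
from typing import Dict, List, Set


def retrieval_metrics(ranked_ids: List[str], relevant_ids: Set[str], k: int) -> Dict[str, float]:
    """Per-query accuracy@k, precision@k, recall@k and reciprocal rank within the top k."""
    top_k = ranked_ids[:k]
    hits = [doc_id for doc_id in top_k if doc_id in relevant_ids]
    reciprocal_rank = 0.0
    for rank, doc_id in enumerate(top_k, start=1):
        if doc_id in relevant_ids:
            reciprocal_rank = 1.0 / rank  # only the first relevant hit counts
            break
    return {
        "accuracy": 1.0 if hits else 0.0,        # at least one relevant document in the top k
        "precision": len(hits) / k,              # share of the top k that is relevant
        "recall": len(hits) / len(relevant_ids), # share of relevant documents that were found
        "reciprocal_rank": reciprocal_rank,
    }


# Toy query with three relevant documents, one of which is never retrieved.
ranking = ["d3", "d1", "d7", "d2", "d9"]
relevant = {"d1", "d2", "d4"}
at_3 = retrieval_metrics(ranking, relevant, k=3)
at_5 = retrieval_metrics(ranking, relevant, k=5)
print(at_3)
print(at_5)
```

The reported values are averages of such per-query numbers over all evaluation queries. With only a handful of relevant contexts per question, precision@10 cannot approach 1 even for a perfect ranking, so recall@k and ndcg@10 are the more informative columns here.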
Columns: question and contexts

| | question | contexts |
|---|---|---|
| type | string | string |

Samples:

| question | contexts |
|---|---|
| Does low-level human equivalent gestational lead exposure produce sex-specific motor and coordination abnormalities and late-onset obesity in year-old mice? | Low-level developmental lead exposure is linked to cognitive and neurological disorders in children. However, the long-term effects of gestational lead exposure (GLE) have received little attention. |
| Does insulin in combination with selenium inhibit HG/Pal-induced cardiomyocyte apoptosis by Cbl-b regulating p38MAPK/CBP/Ku70 pathway? | In this study, we investigated whether insulin and selenium in combination (In/Se) suppresses cardiomyocyte apoptosis and whether this protection is mediated by Cbl-b regulating p38MAPK/CBP/Ku70 pathway. |
| Does arthroscopic subacromial decompression result in normal shoulder function after two years in less than 50 % of patients? | The aim of this study was to evaluate the outcome two years after arthroscopic subacromial decompression using the Western Ontario Rotator-Cuff (WORC) index and a diagram-based questionnaire to self-assess active shoulder range of motion (ROM). |
Loss: MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
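The two parameters above can be read as follows: every question is scored against all contexts in its batch with cosine similarity, the scores are multiplied by `scale`, and cross-entropy pushes the question's own context above the other in-batch contexts. The block below is a minimal sketch of that in-batch-negatives objective under those assumptions, written in plain Python for illustration; it is not the library's code, and the function names and toy vectors are hypothetical.

```python
import math
from typing import List


def cos_sim(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def in_batch_ranking_loss(anchors: List[List[float]], positives: List[List[float]],
                          scale: float = 20.0) -> float:
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, positive) for positive in positives]
        max_logit = max(logits)  # subtract the maximum for numerical stability
        log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


questions = [[1.0, 0.0], [0.0, 1.0]]
well_aligned = in_batch_ranking_loss(questions, [[1.0, 0.0], [0.0, 1.0]])
mismatched = in_batch_ranking_loss(questions, [[0.0, 1.0], [1.0, 0.0]])
print(well_aligned, mismatched)
```

Because the negatives come from the rest of the batch, the batch size of 64 used for training means each question is contrasted with 63 other contexts per step.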
Columns: question and contexts

| | question | contexts |
|---|---|---|
| type | string | string |

Samples:

| question | contexts |
|---|---|
| Does [ Chemical components from essential oil of Pandanus amaryllifolius leave ]? | The essential oil of Pandanus amaryllifolius leaves was analyzed by gas chromatography-mass spectrum, and the relative content of each component was determined by area normalization method. |
| Is elevated C-reactive protein associated with the tumor depth of invasion but not with disease recurrence in stage II and III colorectal cancer? | We previously demonstrated that elevated serum C-reactive protein (CRP) level is associated with depth of tumor invasion in operable colorectal cancer. There is also increasing evidence to show that raised CRP concentration is associated with poor survival in patients with colorectal cancer. The purpose of this study was to investigate the correlation between preoperative CRP concentrations and short-term disease recurrence in cases with stage II and III colorectal cancer. |
| Do neuropeptide Y and peptide YY protect from weight loss caused by Bacille Calmette-Guérin in mice? | Deletion of PYY and NPY aggravated the BCG-induced loss of body weight, which was most pronounced in NPY-/-;PYY-/- mice (maximum loss: 15%). The weight loss in NPY-/-;PYY-/- mice did not normalize during the 2 week observation period. BCG suppressed the circadian pattern of locomotion, exploration and food intake. However, these changes took a different time course than the prolonged weight loss caused by BCG in NPY-/-;PYY-/- mice. The effect of BCG to increase circulating IL-6 (measured 16 days post-treatment) remained unaltered by knockout of PYY, NPY or NPY plus PYY. |
Loss: MultipleNegativesRankingLoss with these parameters:
{
    "scale": 20.0,
    "similarity_fct": "cos_sim"
}
Non-default hyperparameters:

- eval_strategy: steps
- per_device_train_batch_size: 64
- num_train_epochs: 1

All hyperparameters:

- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 64
- per_device_eval_batch_size: 8
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- eval_use_gather_object: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional

Training logs:

| Epoch | Step | Training Loss | Validation Loss | med-eval-dev_cosine_map@100 |
|---|---|---|---|---|
| 0 | 0 | - | - | 0.3328 |
| 0.0103 | 100 | 0.7953 | - | - |
| 0.0206 | 200 | 0.5536 | - | - |
| 0.0257 | 250 | - | 0.1041 | 0.7474 |
| 0.0309 | 300 | 0.4755 | - | - |
| 0.0411 | 400 | 0.4464 | - | - |
| 0.0514 | 500 | 0.3986 | 0.0761 | 0.7786 |
| 0.0617 | 600 | 0.357 | - | - |
| 0.0720 | 700 | 0.3519 | - | - |
| 0.0771 | 750 | - | 0.0685 | 0.8029 |
| 0.0823 | 800 | 0.3197 | - | - |
| 0.0926 | 900 | 0.3247 | - | - |
| 0.1028 | 1000 | 0.3048 | 0.0549 | 0.8108 |
| 0.1131 | 1100 | 0.2904 | - | - |
| 0.1234 | 1200 | 0.281 | - | - |
| 0.1285 | 1250 | - | 0.0503 | 0.8181 |
| 0.1337 | 1300 | 0.2673 | - | - |
| 0.1440 | 1400 | 0.2645 | - | - |
| 0.1543 | 1500 | 0.2511 | 0.0457 | 0.8332 |
| 0.1645 | 1600 | 0.2541 | - | - |
| 0.1748 | 1700 | 0.2614 | - | - |
| 0.1800 | 1750 | - | 0.0401 | 0.8380 |
| 0.1851 | 1800 | 0.2263 | - | - |
| 0.1954 | 1900 | 0.2466 | - | - |
| 0.2057 | 2000 | 0.2297 | 0.0365 | 0.8421 |
| 0.2160 | 2100 | 0.2225 | - | - |
| 0.2262 | 2200 | 0.212 | - | - |
| 0.2314 | 2250 | - | 0.0344 | 0.8563 |
| 0.2365 | 2300 | 0.2257 | - | - |
| 0.2468 | 2400 | 0.1953 | - | - |
| 0.2571 | 2500 | 0.1961 | 0.0348 | 0.8578 |
| 0.2674 | 2600 | 0.1888 | - | - |
| 0.2777 | 2700 | 0.2039 | - | - |
| 0.2828 | 2750 | - | 0.0319 | 0.8610 |
| 0.2879 | 2800 | 0.1939 | - | - |
| 0.2982 | 2900 | 0.202 | - | - |
| 0.3085 | 3000 | 0.1915 | 0.0292 | 0.8678 |
| 0.3188 | 3100 | 0.1987 | - | - |
| 0.3291 | 3200 | 0.1877 | - | - |
| 0.3342 | 3250 | - | 0.0275 | 0.8701 |
| 0.3394 | 3300 | 0.1874 | - | - |
| 0.3497 | 3400 | 0.1689 | - | - |
| 0.3599 | 3500 | 0.169 | 0.0281 | 0.8789 |
| 0.3702 | 3600 | 0.1631 | - | - |
| 0.3805 | 3700 | 0.1611 | - | - |
| 0.3856 | 3750 | - | 0.0263 | 0.8814 |
| 0.3908 | 3800 | 0.1764 | - | - |
| 0.4011 | 3900 | 0.1796 | - | - |
| 0.4114 | 4000 | 0.1729 | 0.0249 | 0.8805 |
| 0.4216 | 4100 | 0.1551 | - | - |
| 0.4319 | 4200 | 0.1543 | - | - |
| 0.4371 | 4250 | - | 0.0241 | 0.8867 |
| 0.4422 | 4300 | 0.1549 | - | - |
| 0.4525 | 4400 | 0.1432 | - | - |
| 0.4628 | 4500 | 0.1592 | 0.0219 | 0.8835 |
| 0.4731 | 4600 | 0.1517 | - | - |
| 0.4833 | 4700 | 0.1463 | - | - |
| 0.4885 | 4750 | - | 0.0228 | 0.8928 |
| 0.4936 | 4800 | 0.1525 | - | - |
| 0.5039 | 4900 | 0.1426 | - | - |
| 0.5142 | 5000 | 0.1524 | 0.0209 | 0.8903 |
| 0.5245 | 5100 | 0.1443 | - | - |
| 0.5348 | 5200 | 0.1468 | - | - |
| 0.5399 | 5250 | - | 0.0212 | 0.8948 |
| 0.5450 | 5300 | 0.151 | - | - |
| 0.5553 | 5400 | 0.1443 | - | - |
| 0.5656 | 5500 | 0.1438 | 0.0212 | 0.8982 |
| 0.5759 | 5600 | 0.1409 | - | - |
| 0.5862 | 5700 | 0.1346 | - | - |
| 0.5913 | 5750 | - | 0.0207 | 0.8983 |
| 0.5965 | 5800 | 0.1315 | - | - |
| 0.6067 | 5900 | 0.1425 | - | - |
| 0.6170 | 6000 | 0.136 | 0.0188 | 0.8970 |
| 0.6273 | 6100 | 0.1426 | - | - |
| 0.6376 | 6200 | 0.1353 | - | - |
| 0.6427 | 6250 | - | 0.0185 | 0.8969 |
| 0.6479 | 6300 | 0.1269 | - | - |
| 0.6582 | 6400 | 0.1159 | - | - |
| 0.6684 | 6500 | 0.1311 | 0.0184 | 0.9028 |
| 0.6787 | 6600 | 0.1179 | - | - |
| 0.6890 | 6700 | 0.115 | - | - |
| 0.6942 | 6750 | - | 0.0184 | 0.9046 |
| 0.6993 | 6800 | 0.1254 | - | - |
| 0.7096 | 6900 | 0.1233 | - | - |
| 0.7199 | 7000 | 0.122 | 0.0174 | 0.9042 |
| 0.7302 | 7100 | 0.1238 | - | - |
| 0.7404 | 7200 | 0.1257 | - | - |
| 0.7456 | 7250 | - | 0.0175 | 0.9074 |
| 0.7507 | 7300 | 0.1222 | - | - |
| 0.7610 | 7400 | 0.1194 | - | - |
| 0.7713 | 7500 | 0.1284 | 0.0166 | 0.9080 |
| 0.7816 | 7600 | 0.1147 | - | - |
| 0.7919 | 7700 | 0.1182 | - | - |
| 0.7970 | 7750 | - | 0.0170 | 0.9116 |
| 0.8021 | 7800 | 0.1157 | - | - |
| 0.8124 | 7900 | 0.1299 | - | - |
| 0.8227 | 8000 | 0.114 | 0.0163 | 0.9105 |
| 0.8330 | 8100 | 0.1141 | - | - |
| 0.8433 | 8200 | 0.1195 | - | - |
| 0.8484 | 8250 | - | 0.0160 | 0.9112 |
| 0.8536 | 8300 | 0.1073 | - | - |
| 0.8638 | 8400 | 0.1044 | - | - |
| 0.8741 | 8500 | 0.1083 | 0.0160 | 0.9153 |
| 0.8844 | 8600 | 0.1103 | - | - |
| 0.8947 | 8700 | 0.1145 | - | - |
| 0.8998 | 8750 | - | 0.0154 | 0.9133 |
| 0.9050 | 8800 | 0.1083 | - | - |
| 0.9153 | 8900 | 0.1205 | - | - |
| 0.9255 | 9000 | 0.1124 | 0.0153 | 0.9162 |
| 0.9358 | 9100 | 0.1067 | - | - |
| 0.9461 | 9200 | 0.116 | - | - |
| 0.9513 | 9250 | - | 0.0152 | 0.9171 |
| 0.9564 | 9300 | 0.1126 | - | - |
| 0.9667 | 9400 | 0.1075 | - | - |
| 0.9770 | 9500 | 0.1128 | 0.0149 | 0.9169 |
| 0.9872 | 9600 | 0.1143 | - | - |
| 0.9975 | 9700 | 0.1175 | - | - |
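The epoch and step columns together give a rough idea of the training set size, which the card does not state directly. The arithmetic below is only an estimate: it assumes a single training device, relies on `gradient_accumulation_steps: 1` and `per_device_train_batch_size: 64` from the hyperparameter list, and inherits the rounding of the logged epoch value.

```python
# Values read from the last row of the training log.
logged_step = 9700
logged_epoch = 0.9975

# per_device_train_batch_size from the hyperparameters (single device assumed).
batch_size = 64

# One epoch corresponds to logged_step / logged_epoch optimizer steps.
steps_per_epoch = round(logged_step / logged_epoch)

# Each step consumes one batch of (question, contexts) pairs.
approx_training_pairs = steps_per_epoch * batch_size

print(steps_per_epoch)        # 9724
print(approx_training_pairs)  # 622336
```

So one epoch is roughly 9,700 steps, or on the order of 620,000 question and context pairs, with validation run every 250 steps according to the log.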
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
Base model
sentence-transformers/stsb-distilbert-base