MPNet base trained on AllNLI triplets

This is a sentence-transformers model finetuned from intfloat/e5-base-v2. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: intfloat/e5-base-v2
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
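
The Pooling (mean over tokens) and Normalize modules listed above can be illustrated with a minimal NumPy sketch. It assumes the token embeddings and attention mask have already been produced by the Transformer module; the function name `mean_pool_and_normalize` and the toy inputs are illustrative only and are not part of the library.

```python
import numpy as np

def mean_pool_and_normalize(token_embeddings, attention_mask):
    """Average the non-padding token vectors, then scale each result to unit L2 norm.

    token_embeddings: array of shape (batch, seq_len, dim)
    attention_mask:   array of shape (batch, seq_len) with 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, :, None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # guard against all-padding rows
    pooled = summed / counts
    norms = np.linalg.norm(pooled, axis=1, keepdims=True)
    return pooled / np.clip(norms, 1e-12, None)

# Toy input (illustrative): 1 sentence, 3 token slots (the last is padding), 2 dimensions.
tokens = np.array([[[3.0, 0.0], [0.0, 4.0], [9.0, 9.0]]])
mask = np.array([[1, 1, 0]])
sentence_embedding = mean_pool_and_normalize(tokens, mask)
```

Because the output is unit-length, the cosine similarity between two embeddings equals their dot product.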

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub (replace the placeholder below with this model's Hub ID)
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Is the Danish National Hospital Register a valuable study base for epidemiologic research in febrile seizures?',
    'The Danish National Hospital Register is a valuable tool for epidemiologic research in febrile seizures.',
    'Ans. is c i.e., Presence of depression Good prognostic factors Acute onset late onset onset after 35 years of age Presence of precipitating stressor Good premorbid adjustment catatonic best prognosis Paranoid 2nd best sho duration 6 months Married Positive symptoms Presence of depression family history of mood disorder first episode pyknic fat physique female sex good treatment compliance good response to treatment good social suppo presence of confusion or perplexity normal brain CT Scan outpatient treatment.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
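
For semantic search, the same cosine scores can be used to rank candidate passages against a query. The sketch below is a minimal NumPy version that uses small stand-in vectors rather than real model output; `rank_by_cosine` is a hypothetical helper, and in practice the arrays would come from `model.encode`.

```python
import numpy as np

def rank_by_cosine(query_embedding, corpus_embeddings):
    """Return corpus indices ordered from most to least similar, plus the raw scores."""
    q = query_embedding / np.linalg.norm(query_embedding)
    c = corpus_embeddings / np.linalg.norm(corpus_embeddings, axis=1, keepdims=True)
    scores = c @ q                # cosine similarity of each corpus row with the query
    order = np.argsort(-scores)   # descending by score
    return order, scores

# Stand-in vectors (illustrative only, not real embeddings).
corpus = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
query = np.array([2.0, 0.1])
order, scores = rank_by_cosine(query, corpus)
```

The first index in `order` points at the best match in the corpus.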

Evaluation

Metrics

Triplet

Metric            eval-dataset    test-dataset
cosine_accuracy   1.0             0.97
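
As a rough guide to reading this metric, a triplet cosine accuracy is commonly defined as the fraction of (anchor, positive, negative) triplets in which the anchor is more similar to the positive than to the negative. The sketch below follows that definition with toy vectors; it is an illustration under that assumption, not the library's evaluator code, and `cosine_accuracy` here is a hypothetical helper.

```python
import numpy as np

def cosine_accuracy(anchors, positives, negatives):
    """Fraction of triplets where cos(anchor, positive) > cos(anchor, negative)."""
    def cos(a, b):
        a = a / np.linalg.norm(a, axis=1, keepdims=True)
        b = b / np.linalg.norm(b, axis=1, keepdims=True)
        return (a * b).sum(axis=1)
    return float(np.mean(cos(anchors, positives) > cos(anchors, negatives)))

# Two toy triplets: the first is ranked correctly, the second is not.
anchors = np.array([[1.0, 0.0], [0.0, 1.0]])
positives = np.array([[0.9, 0.1], [1.0, 0.0]])
negatives = np.array([[0.0, 1.0], [0.1, 0.9]])
accuracy = cosine_accuracy(anchors, positives, negatives)
```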

Training Details

Training Dataset

Unnamed Dataset

  • Size: 800 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 800 samples:
              sentence1      sentence2      label
    type      string         string         float
    min       5 tokens       4 tokens       1.0
    mean      22.88 tokens   81.77 tokens   1.0
    max       205 tokens     512 tokens     1.0
  • Samples:
    1. sentence1: Triad of biotin deficiency is
       sentence2: Dermatitis, glossitis, Alopecia 407 H 314 Basic pathology 8th Biotin deficiency clinical features Adult Mental changes depression, hallucination , paresthesia, anorexia, nausea, A scaling, seborrheic and erythematous rash may occur around the eye, nose, mouth, as well as extremities 407 H Infant hypotonia, lethargy, apathy, alopecia and a characteristic rash that includes the ears.Symptoms of biotin deficiency includes Anaemia, loss of apepite dermatitis, glossitis 150 U. Satyanarayan Symptoms of biotin deficiency Dermatitis spectacle eyed appearance due to circumocular alopecia, pallor of skin membrane, depression, Lassitude, somnolence, anemia and hypercholesterolaemia 173 Rana Shinde 6th
       label: 1.0
    2. sentence1: Drug responsible for the below condition
       sentence2: Thalidomide given to pregnant lady can lead to hypoplasia of limbs called as Phocomelia .
       label: 1.0
    3. sentence1: Is benefit from procarbazine , lomustine , and vincristine in oligodendroglial tumors associated with mutation of IDH?
       sentence2: IDH mutational status identified patients with oligodendroglial tumors who did and did not benefit from alkylating agent chemotherapy with RT. Although patients with codeleted tumors lived longest, patients with noncodeleted IDH mutated tumors also lived longer after CRT.
       label: 1.0
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
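
To make the two parameters concrete, here is a minimal NumPy sketch of a multiple-negatives ranking objective: cosine similarities between every anchor and every positive in the batch are multiplied by `scale`, and a cross-entropy loss treats the matching positive as the correct class while the other positives in the batch act as negatives. The function `mnr_loss` and the toy inputs are illustrative assumptions, not the library's implementation.

```python
import numpy as np

def mnr_loss(anchor_emb, positive_emb, scale=20.0):
    """Cross-entropy over scaled cosine similarities; row i's correct column is i."""
    a = anchor_emb / np.linalg.norm(anchor_emb, axis=1, keepdims=True)
    p = positive_emb / np.linalg.norm(positive_emb, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                            # (batch, batch) similarity matrix
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

# Toy batch: perfectly aligned pairs give a near-zero loss, swapped pairs a large one.
anchors = np.array([[1.0, 0.0], [0.0, 1.0]])
aligned = mnr_loss(anchors, anchors)
swapped = mnr_loss(anchors, anchors[::-1])
```

A larger `scale` sharpens the softmax, so ranking mistakes are penalised more heavily.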
    

Evaluation Dataset

Unnamed Dataset

  • Size: 100 evaluation samples
  • Columns: question, answer, and hard_negative
  • Approximate statistics based on the first 100 samples:
              question       answer         hard_negative
    type      string         string         NoneType
    min       5 tokens       10 tokens      -
    mean      22.52 tokens   83.51 tokens   -
    max       103 tokens     403 tokens     -
  • Samples:
    1. question: Hutchinsons secondaries In skull are due to tumors in
       answer: Adrenal neuroblastomas are malig8nant neoplasms arising from sympathetic neuroblsts in Medulla of adrenal gland Neuroblastoma is a cancer that develops from immature nerve cells found in several areas of the body.Neuroblastoma most commonly arises in and around the adrenalglands, which have similar origins to nerve cells and sit atop the kidneys.
       hard_negative: None
    2. question: Proliferative glomerular deposits in the kidney are found in
       answer: IgA nephropathy or Berger s disease immune complex mediated glomerulonephritis defined by the presence of diffuse mesangial IgA deposits often associated with mesangial hypercellularity. Male preponderance, peak incidence in the second and third decades of life.Clinical and laboratory findings Two most common presentations recurrent episodes of macroscopic hematuria during or immediately following an upper respiratory infection often accompanied by proteinuria or persistent asymptomatic microscopic hematuriaIgA deposited in the mesangium is typically polymeric and of the IgA1 subclass. IgM, IgG, C3, or immunoglobulin light chains may be codistributed with IgAPresence of elevated serum IgA levels in 20 50 of patients, IgA deposition in skin biopsies in 15 55 of patients, elevated levels of secretory IgA and IgA fibronectin complexesIgA nephropathy is a benign disease mostly, 5 30 of patients go into a complete remission, with others having hematuria but well preserved renal functionAbou...
       hard_negative: None
    3. question: Does meconium aspiration induce oxidative injury in the hippocampus of newborn piglets?
       answer: Our data thus suggest that oxidative injury associated with pulmonary, but not systemic, hemodynamic disturbances may contribute to hippocampal damage after meconium aspiration in newborns.
       hard_negative: None
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • do_predict: True
  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
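
To show what warmup_ratio: 0.1 means in practice, the sketch below computes a linear warmup followed by linear decay, using the learning_rate (5e-05) and lr_scheduler_type (linear) values from the full hyperparameter list and the 25 total steps reported in the Training Logs. Rounding the warmup length up with `ceil` is an assumption about the trainer's behaviour, and `linear_schedule_lr` is a hypothetical helper.

```python
import math

def linear_schedule_lr(step, total_steps, base_lr=5e-5, warmup_ratio=0.1):
    """Learning rate after `step` optimizer updates: linear warmup, then linear decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)  # assumed rounding rule
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

total_steps = 25  # the Training Logs report step 25 at epoch 1.0
schedule = [linear_schedule_lr(s, total_steps) for s in range(total_steps + 1)]
```

Under these assumptions the rate climbs for the first 3 steps, peaks at 5e-05, and then falls linearly to zero at step 25.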

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: True
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
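
The batch_sampler: no_duplicates setting matters for a loss that uses in-batch negatives: if the same text appeared twice in a batch, a correct answer would be treated as a negative for another pair. The following is a simplified greedy sketch of that idea, not the library's sampler; `no_duplicate_batches` and the toy pairs are hypothetical.

```python
def no_duplicate_batches(pairs, batch_size):
    """Greedily build batches in which no text appears twice; skipped pairs roll into later batches."""
    remaining = list(pairs)
    batches = []
    while remaining:
        batch, seen, deferred = [], set(), []
        for anchor, positive in remaining:
            if len(batch) < batch_size and anchor not in seen and positive not in seen:
                batch.append((anchor, positive))
                seen.update((anchor, positive))
            else:
                deferred.append((anchor, positive))  # retry in a later batch
        batches.append(batch)
        remaining = deferred
    return batches

# Toy pairs: "a1" is the answer to two different questions.
pairs = [("q1", "a1"), ("q2", "a1"), ("q3", "a3"), ("q4", "a4")]
batches = no_duplicate_batches(pairs, batch_size=2)
```

Here the two pairs that share "a1" end up in different batches, so neither serves as a false negative for the other.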

Training Logs

Epoch   Step   eval-dataset_cosine_accuracy   test-dataset_cosine_accuracy
0       0      1.0                            -
1.0     25     -                              0.97

Framework Versions

  • Python: 3.11.10
  • Sentence Transformers: 3.3.0
  • Transformers: 4.46.2
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.1.1
  • Datasets: 3.1.0
  • Tokenizers: 0.20.3

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}