---
base_model: allenai/specter2_base
library_name: sentence-transformers
pipeline_tag: sentence-similarity
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:9988
  - loss:MultipleNegativesRankingLoss
widget:
  - source_sentence: Splenomegaly in Malta fever
    sentences:
      - 'TROPICAL SPLENOMEGALY. '
      - >-
        [Voluminous migrating spleen in the course of Malta fever: effects of
        splenectomy]. 
      - '[Adenoma of appendix]. '
  - source_sentence: sRNA regulation
    sentences:
      - 'SR proteins control a complex network of RNA-processing events. '
      - >-
        Convergence of submodality-specific input onto neurons in primary
        somatosensory cortex. 
      - 'Dynamic features of gene expression control by small regulatory RNAs. '
  - source_sentence: Foley catheter hysterosalpingography
    sentences:
      - 'Hysterosalpingography using a Foley catheter. '
      - >-
        [Long-term follow-up of adult patients with isolated congenital AV
        block]. 
      - 'Hysterosalpingography. '
  - source_sentence: Anti-endoglin monoclonal antibodies
    sentences:
      - >-
        Cortisol response to general anaesthesia for medical imaging in
        children. 
      - >-
        Anti-endoglin monoclonal antibodies are effective for suppressing
        metastasis and the primary tumors by targeting tumor vasculature. 
      - 'Endoglin: Beyond the Endothelium. '
  - source_sentence: Alternariol Methyl Ether Quantitation
    sentences:
      - >-
        Stable isotope dilution assays of alternariol and alternariol monomethyl
        ether in beverages. 
      - >-
        The roles of eotaxin and the STAT6 signalling pathway in eosinophil
        recruitment and host resistance to the nematodes Nippostrongylus
        brasiliensis and Heligmosomoides bakeri. 
      - >-
        Mechanisms of Action and Toxicity of the Mycotoxin Alternariol: A
        Review. 
---

SentenceTransformer based on allenai/specter2_base

This is a sentence-transformers model fine-tuned from allenai/specter2_base on the json dataset. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: allenai/specter2_base
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json

Model Sources

  • Documentation: Sentence Transformers Documentation (https://sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
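
For readers who want to see how this two-module stack maps onto the library's building blocks, the following sketch constructs an equivalent (untrained) pipeline by hand, matching the configuration printed above:

from sentence_transformers import SentenceTransformer, models

# Module (0): the SPECTER2 BERT backbone, truncating inputs at 512 tokens
transformer = models.Transformer("allenai/specter2_base", max_seq_length=512)

# Module (1): mean pooling over token embeddings -> one 768-d vector per text
pooling = models.Pooling(
    transformer.get_word_embedding_dimension(),  # 768
    pooling_mode_mean_tokens=True,
)

model = SentenceTransformer(modules=[transformer, pooling])

Mean pooling (rather than the CLS token) is what turns the per-token embeddings into the single 768-dimensional sentence vector.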

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("sentence_transformers_model_id")
# Run inference
sentences = [
    'Alternariol Methyl Ether Quantitation',
    'Stable isotope dilution assays of alternariol and alternariol monomethyl ether in beverages. ',
    'Mechanisms of Action and Toxicity of the Mycotoxin Alternariol: A Review. ',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
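
Because the model was trained on short query → paper-title triplets, a natural use is ranking candidate titles against a query. Here is a minimal sketch with util.semantic_search, reusing model from the snippet above (the query and titles are taken from the widget examples):

from sentence_transformers import util

# Embed a short query and a few candidate paper titles
query_embedding = model.encode("sRNA regulation")
corpus_embeddings = model.encode([
    "Dynamic features of gene expression control by small regulatory RNAs. ",
    "SR proteins control a complex network of RNA-processing events. ",
])

# Returns one ranked hit list per query: [{'corpus_id': ..., 'score': ...}, ...]
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)
print(hits[0])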

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 9,988 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    |         | anchor                                           | positive                                          | negative                                          |
    | type    | string                                           | string                                            | string                                            |
    | details | min: 4 tokens, mean: 7.66 tokens, max: 34 tokens | min: 6 tokens, mean: 19.05 tokens, max: 42 tokens | min: 4 tokens, mean: 11.84 tokens, max: 48 tokens |
  • Samples:
    | anchor | positive | negative |
    | mechanotransduction pathways | Signalling cascades in mechanotransduction: cell-matrix interactions and mechanical loading. | Mechanotransduction: May the force be with you. |
    | FSR-tunable comb filter | Multiwavelength Raman fiber laser with a continuously-tunable spacing. | Tunable multiwavelength fiber laser using a comb filter based on erbium-ytterbium co-doped polarization maintaining fiber loop mirror. |
    | Radiation pneumonitis enhancement | Induction and concurrent taxanes enhance both the pulmonary metabolic radiation response and the radiation pneumonitis response in patients with esophagus cancer. | Imaging of Hypersensitivity Pneumonitis. |
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
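
Putting the pieces above together, a minimal training sketch under these settings might look as follows; the single illustrative triplet is one of the widget examples, not a sample from the actual training file:

from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("allenai/specter2_base")

# Triplet columns must be ordered (anchor, positive, negative); with
# MultipleNegativesRankingLoss, the other positives in a batch also serve
# as additional in-batch negatives for each anchor.
train_dataset = Dataset.from_dict({
    "anchor": ["Foley catheter hysterosalpingography"],
    "positive": ["Hysterosalpingography using a Foley catheter. "],
    "negative": ["Hysterosalpingography. "],
})

loss = MultipleNegativesRankingLoss(model, scale=20.0)  # matches the parameters above

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()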
    

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • lr_scheduler_type: cosine_with_restarts
  • warmup_ratio: 0.1
  • bf16: True
  • batch_sampler: no_duplicates
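
These values map one-to-one onto SentenceTransformerTrainingArguments; a minimal sketch (output_dir is a placeholder, not from the original run):

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder, not from the original run
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    learning_rate=2e-5,
    num_train_epochs=1,
    lr_scheduler_type="cosine_with_restarts",
    warmup_ratio=0.1,
    bf16=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)

The no_duplicates sampler is the natural companion to MultipleNegativesRankingLoss: it keeps repeated texts out of a batch so that in-batch negatives are never accidental duplicates of a positive. The args object would then be passed to the trainer via args=args.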

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 32
  • per_device_eval_batch_size: 32
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: cosine_with_restarts
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
0.0095 1 2.9432
0.0190 2 3.0121
0.0286 3 2.9051
0.0381 4 2.7906
0.0476 5 2.6592
0.0571 6 2.2835
0.0667 7 2.1373
0.0762 8 1.7872
0.0857 9 1.6329
0.0952 10 1.5184
0.1048 11 1.234
0.1143 12 1.0315
0.1238 13 0.9664
0.1333 14 0.9369
0.1429 15 0.6871
0.1524 16 0.5633
0.1619 17 0.5141
0.1714 18 0.5259
0.1810 19 0.4295
0.1905 20 0.4585
0.2 21 0.2799
0.2095 22 0.4226
0.2190 23 0.2524
0.2286 24 0.2135
0.2381 25 0.1958
0.2476 26 0.1823
0.2571 27 0.393
0.2667 28 0.3186
0.2762 29 0.1414
0.2857 30 0.1927
0.2952 31 0.2597
0.3048 32 0.1291
0.3143 33 0.1488
0.3238 34 0.1203
0.3333 35 0.2001
0.3429 36 0.1877
0.3524 37 0.0713
0.3619 38 0.1778
0.3714 39 0.1179
0.3810 40 0.147
0.3905 41 0.1158
0.4 42 0.1003
0.4095 43 0.158
0.4190 44 0.159
0.4286 45 0.063
0.4381 46 0.1309
0.4476 47 0.0327
0.4571 48 0.1665
0.4667 49 0.1064
0.4762 50 0.0699
0.4857 51 0.0674
0.4952 52 0.0508
0.5048 53 0.0493
0.5143 54 0.0565
0.5238 55 0.0366
0.5333 56 0.0606
0.5429 57 0.0727
0.5524 58 0.092
0.5619 59 0.0628
0.5714 60 0.0369
0.5810 61 0.0889
0.5905 62 0.0409
0.6 63 0.0545
0.6095 64 0.0856
0.6190 65 0.0478
0.6286 66 0.0584
0.6381 67 0.0757
0.6476 68 0.0609
0.6571 69 0.0381
0.6667 70 0.069
0.6762 71 0.0243
0.6857 72 0.0517
0.6952 73 0.0332
0.7048 74 0.0662
0.7143 75 0.0753
0.7238 76 0.0914
0.7333 77 0.1094
0.7429 78 0.0557
0.7524 79 0.0436
0.7619 80 0.0137
0.7714 81 0.0399
0.7810 82 0.0278
0.7905 83 0.0438
0.8 84 0.1392
0.8095 85 0.0299
0.8190 86 0.0667
0.8286 87 0.0404
0.8381 88 0.0166
0.8476 89 0.1679
0.8571 90 0.0282
0.8667 91 0.0628
0.8762 92 0.0618
0.8857 93 0.0167
0.8952 94 0.2108
0.9048 95 0.0749
0.9143 96 0.0997
0.9238 97 0.0675
0.9333 98 0.0409
0.9429 99 0.0355
0.9524 100 0.1391
0.9619 101 0.0938
0.9714 102 0.0526
0.9810 103 0.0035
0.9905 104 0.0022
1.0 105 0.0016

Framework Versions

  • Python: 3.9.19
  • Sentence Transformers: 3.1.1
  • Transformers: 4.45.2
  • PyTorch: 2.5.0
  • Accelerate: 1.0.1
  • Datasets: 2.19.0
  • Tokenizers: 0.20.3
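
To approximate this environment, the libraries above can be pinned at install time (a suggested command, not part of the original card; pick the torch build that matches your CUDA setup):

pip install sentence-transformers==3.1.1 transformers==4.45.2 torch==2.5.0 accelerate==1.0.1 datasets==2.19.0 tokenizers==0.20.3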

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}