SentenceTransformer based on pankajrajdeo/UMLS-Pubmed-ST-TCE-Epoch-1

This is a sentence-transformers model fine-tuned from pankajrajdeo/UMLS-Pubmed-ST-TCE-Epoch-1. It maps sentences and paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: pankajrajdeo/UMLS-Pubmed-ST-TCE-Epoch-1
  • Maximum Sequence Length: 1024 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 1024, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
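
The Pooling module above has only pooling_mode_mean_tokens enabled, so the sentence vector is the average of the token vectors. The following is a minimal pure-Python sketch of that idea, not the library's implementation. The function name mean_pool and the toy 4-dimensional vectors are illustrative assumptions; the real model produces 384-dimensional token vectors.

```python
from typing import List


def mean_pool(token_embeddings: List[List[float]], attention_mask: List[int]) -> List[float]:
    """Average the token vectors whose mask is 1; padding tokens (mask 0) are ignored."""
    if len(token_embeddings) != len(attention_mask):
        raise ValueError("one mask value is required per token")
    kept = [vec for vec, keep in zip(token_embeddings, attention_mask) if keep]
    if not kept:
        raise ValueError("attention mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


# Toy input (hypothetical values): three tokens with 4-dimensional vectors,
# where the last token is padding and must not affect the result.
tokens = [[1.0, 2.0, 3.0, 4.0], [3.0, 2.0, 1.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)
print(sentence_vector)  # [2.0, 2.0, 2.0, 2.0]
```

Because padding is masked out, sentences of different lengths in the same batch are averaged over their real tokens only.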

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("pankajrajdeo/UMLS-Pubmed-ST-TCE-Epoch-1-QA_100K-BioASQ-Epoch_3")
# Run inference
sentences = [
    'How do the structural properties of nanoporous materials influence their efficiency in catalytic reactions?',
    "A highly robust cluster-based indium(III)-organic framework with efficient catalytic activity in cycloaddition of CO2 and Knoevenagel condensation. The efficient catalytic performance displayed by MOFs is decided by an appropriate charge/radius ratio of defect metal sites, large enough solvent-accessible channels and Lewis base sites capable of polarizing substrate molecules. Herein, the solvothermal self-assembly led to a highly robust nanochannel-based framework of {·2DMF·5H2O}n (NUC-66) with a 56.8% void volume, which is a combination of a tetranuclear cluster (abbreviated as {In4}) and a conjugated tetracyclic pentacarboxylic acid ligand of 4,4'-(4-(4-carboxyphenyl)pyridine-2,6-diyl)diisophthalic acid (H5CPDD). To the best of our knowledge, NUC-66 is a rarely reported {In4}-based 3D framework with embedded hierarchical triangular-microporous (2.9 Å) and hexagonal-nanoporous (12.0 Å) channels, which are shaped by six rows of {In4} clusters. After solvent exchange and vacuum drying, the surface of nanochannels in desolvated NUC-66a is modified by unsaturated In3+ ions, Npyridine atoms and μ3-OH groups, all of which display polarization ability towards polar molecules due to their Lewis acidity or basicity. The catalytic experiments performed showed that NUC-66a had high catalytic activity in the cycloaddition reactions of epoxides with CO2 under mild conditions, which should be ascribed to its structural advantages including nanoscale channels, rich bifunctional active sites, large surface areas and chemical stability. Moreover, NUC-66a, as a heterogeneous catalyst, could greatly accelerate the Knoevenagel condensation reactions of aldehydes and malononitrile. Hence, this work confirms that the development of rigid nanoporous cluster-based MOFs built on metal ions with a high charge and large radius ratio will be more likely to realize practical applications, such as catalysis, adsorption and separation of gas, etc.",
    'Absolute quantification of dehydroacetic acid in processed foods using quantitative 1H NMR. An absolute quantification method for the determination of dehydroacetic acid in processed foods using quantitative (1)H NMR was developed and validated. The level of dehydroacetic acid was determined using the proton signals of dehydroacetic acid referenced to 1,4-bis (trimethylsilyl) benzene-d4 after simple solvent extraction from processed foods. All the recoveries from three processed foods spiked at two different concentrations were larger than 85%. The proposed method also proved to be precise, with inter-day precision and excellent linearity. The limit of quantification was confirmed as 0.13g/kg in processed foods, which is sufficiently low for the purposes of monitoring dehydroacetic acid. Furthermore, the method is rapid and easy to apply, and provides International System of Units traceability without the need for authentic analyte reference materials. Therefore, the proposed method is a useful and practical tool for determining the level of dehydroacetic acid in processed foods.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
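
For semantic search, the similarity scores are typically used to rank candidate passages against a query. The sketch below shows that ranking step with cosine similarity in plain Python so it runs without downloading the model. The helper names cosine and rank_documents and the toy 3-dimensional vectors are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_documents(query_vec: Sequence[float],
                   doc_vecs: Sequence[Sequence[float]]) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine(query_vec, d)) for i, d in enumerate(doc_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors (hypothetical); real embeddings from this model have 384 dimensions.
query = [1.0, 0.0, 1.0]
docs = [[0.0, 1.0, 0.0], [2.0, 0.0, 2.0], [1.0, 1.0, 0.0]]
ranking = rank_documents(query, docs)
print([index for index, _ in ranking])  # [1, 2, 0]
```

With the real model, the rows returned by model.encode could be passed in the same way, for example the first row as the query and the remaining rows as documents.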

Training Details

Training Dataset

Unnamed Dataset

  • Size: 137,221 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    column    type    min        mean           max
    anchor    string  6 tokens   23.82 tokens   47 tokens
    positive  string  27 tokens  281.43 tokens  915 tokens
  • Samples:
    anchor positive
    How do menstrual-related factors, such as pain and cycle irregularity, impact the mental health and well-being of young women in educational settings? Determinants of premenstrual dysphoric disorder and associated factors among regular undergraduate students at Hawassa University Southern, Ethiopia, 2023: institution-based cross-sectional study. BACKGROUND: Premenstrual dysphoric disorder (PMDD) is a condition causing severe emotional, physical, and behavioral symptoms before menstruation. It greatly hinders daily activities, affecting academic and interpersonal relationships. Attention is not given to premenstrual disorders among female students in higher education. As a result, students are susceptible to stress, and their academic success is influenced by various factors, including their menstrual cycle, and the long-term outcomes and consequences are poorly researched. Even though PMDD has a significant negative impact on student's academic achievement and success limited research has been conducted in low- and middle-income countries including Ethiopia, especially in the study setting. Therefore, a study is needed to assess premenstrual dysphoric disorder and associated factors among regular undergraduate students at Hawassa University. METHODS: An institutional-based cross-sectional study was conducted among 374 regular undergraduate female students at Hawassa University, College of Medicine and Health Sciences. A self-administered structured premenstrual symptoms screening tool for adolescents was used to assess premenstrual dysphoric disorder. The collected data were loaded into a statistical package for the social science version 25 and analyzed using it. Both bivariate and multivariate logistic regression were used to identify factors associated with premenstrual dysphoric disorder. 
Each independent variable was entered separately into bivariate analysis, and a variable with a p-value less than 0.25 were included in the multivariate analysis to adjust the possible confounders. Statistically significant was declared at a 95% confidence interval when variable with a p-value less than 0.05 in the multivariate analysis with premenstrual dysphoric disorder. RESULTS: The magnitude of premenstrual dysphoric disorder in this study was 62.6% (95% CI 57.4-67.5). Having severe premenstrual pain (AOR = 6.44;95%CI 1.02-40.73), having irregular menstrual cycle (AOR = 2.21; 95% CI 1.32-3.70), students who had poor social support (AOR = 5.10;95%CI, (2.76-12.92) and moderate social support (AOR = 4.93;95%CI (2.18-11.18), and students who used contraception (AOR = 3.76;95%CI, 2.21-6,40) were statistically significant factors with the outcome variable. CONCLUSION: The prevalence of premenstrual dysphoric disorder was high as compared to other studies. There was a strong link between irregular menstrual cycle, severe menstrual pain (severe dysmenorrhea), poor social support, and contraception use with premenstrual dysphoric disorder. This needs early screening and intervention to prevent the complications and worsening of the symptoms that affect students' academic performance by the institution.
    How do sleep patterns influence cognitive function and learning in humans, and what are the broader implications for understanding neurological disorders? Neurochemical mechanisms for memory processing during sleep: basic findings in humans and neuropsychiatric implications. Sleep is essential for memory formation. Active systems consolidation maintains that memory traces that are initially stored in a transient store such as the hippocampus are gradually redistributed towards more permanent storage sites such as the cortex during sleep replay. The complementary synaptic homeostasis theory posits that weak memory traces are erased during sleep through a competitive down-selection mechanism, ensuring the brain's capability to learn new information. We discuss evidence from neuropharmacological experiments in humans to show how major neurotransmitters and neuromodulators are implicated in these memory processes. As to the major excitatory neurotransmitter glutamate that plays a prominent role in inducing synaptic consolidation, we show that these processes, while strengthening cortical memory traces during sleep, are insufficient to explain the consolidation of hippocampus-dependent declarative memories. In the inhibitory GABAergic system, we will offer insights how drugs may alter the intricate interplay of sleep oscillations that have been identified to be crucial for strengthening memories during sleep. Regarding the dopaminergic reward system, we will show how it is engaged during sleep replay, but that dopaminergic neuromodulation likely plays a side role for enhancing relevant memories during sleep. Also, we briefly go into basic evidence on acetylcholine and cortisol whose low tone during slow wave sleep (SWS) is crucial in supporting hippocampal-to-neocortical memory transmission. 
Finally, we will outline how these insights can be used to improve treatment of neuropsychiatric disorders focusing mainly on anxiety disorders, depression, and addiction that are strongly related to memory processing.
    What are the underlying physiological mechanisms by which elevated brain natriuretic peptide levels interact with heart rate variability to increase the likelihood of cardiovascular events? The Combination of Non-dipper Heart Rate and High Brain Natriuretic Peptide Predicts Cardiovascular Events: The Japan Morning Surge-Home Blood Pressure (J-HOP) Study. BACKGROUND: We hypothesized that the association between the dipping heart rate (HR) pattern and cardiovascular (CV) events differs according to the brain natriuretic peptide (BNP) level. METHODS: We examined a subgroup of 1,369 patients from the Japan Morning Surge Home Blood Pressure study; these were patients who had CV risk factors and had undergone ambulatory blood pressure (BP) monitoring. HR non-dipping status was defined as (awake HR - sleep HR)/awake HR <0.1, and high BNP was defined as ≥35 pg/ml. We divided the patients into four groups according to their HR dipper status (dipping or non-dipping) and BNP level (normal or high). RESULTS: The mean follow-up period was 60 ± 30 months. The primary endpoints were fatal/nonfatal CV events (myocardial infarction, angina pectoris, stroke, hospitalization for heart failure, and aortic dissection). During the follow-up period, 23 patients (2.8%) in the dipper HR with normal BNP group, 8 patients (4.4%) in the non-dipper HR with normal BNP group, 24 patients (9.5%) in the dipper HR with high-BNP group, and 25 patients (21.0%) in the non-dipper HR with high-BNP group suffered primary endpoints (log rank 78.8, P < 0.001). Non-dipper HR was revealed as an independent predictor of CV events (hazard ratio, 2.13; 95% confidence interval, 1.35-3.36; P = 0.001) after adjusting for age, gender and smoking, dyslipidemia, diabetes mellitus, chronic kidney disease, BNP, non-dipper BP, 24-h HR, and 24-h systolic blood pressure. CONCLUSIONS: The combination of non-dipper HR and higher BNP was associated with a higher incidence of CV events.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
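
To make the loss parameters concrete, here is a hedged pure-Python sketch of the idea behind MultipleNegativesRankingLoss with scale 20.0 and cosine similarity: for each anchor, its own positive should score higher than every other positive in the batch, and the scaled similarities are treated as logits for a cross-entropy loss. The function names cos_sim and mnr_loss and the toy 2-dimensional vectors are illustrative assumptions; the library computes this on batched tensors.

```python
import math
from typing import Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def mnr_loss(anchors: Sequence[Sequence[float]],
             positives: Sequence[Sequence[float]],
             scale: float = 20.0) -> float:
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch acts as an in-batch negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        top = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_sum_exp = top + math.log(sum(math.exp(z - top) for z in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch (hypothetical): each anchor points the same way as its own positive.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = mnr_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped = mnr_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned < swapped)  # True
```

Since every other positive in the batch serves as a negative, a larger batch gives each anchor more negatives to contrast against.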
    

Evaluation Dataset

Unnamed Dataset

  • Size: 15,247 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    column    type    min        mean          max
    anchor    string  6 tokens   24.38 tokens  46 tokens
    positive  string  25 tokens  280.0 tokens  866 tokens
  • Samples:
    anchor positive
    What are the underlying mechanisms by which electroporation enhances the immunogenicity of low-dose DNA vaccines, and what implications does this have for vaccine design and efficacy? Immunotherapeutic Effects of Different Doses of Mycobacterium tuberculosis ag85a/b DNA Vaccine Delivered by Electroporation. Background: Tuberculosis (TB) is a major global public health problem. New treatment methods on TB are urgently demanded. Methods: Ninety-six female BALB/c mice were challenged with 2×104 colony-forming units (CFUs) of MTB H37Rv through tail vein injection, then was treated with 10μg, 50μg, 100μg, and 200μg of Mycobacterium tuberculosis (MTB) ag85a/b chimeric DNA vaccine delivered by intramuscular injection (IM) and electroporation (EP), respectively. The immunotherapeutic effects were evaluated immunologically, bacteriologically, and pathologically. Results: Compared with the phosphate-buffered saline (PBS) group, the CD4+IFN-γ+ T cells% in whole blood from 200 μg DNA IM group and four DNA EP groups increased significantly (P<0.05), CD8+IFN-γ+ T cells% (in 200 μg DNA EP group), CD4+IL-4+ T cells% (50 μg DNA IM group) and CD8+IL-4+ T cells% (50 μg and 100 μg DNA IM group, 100 μg and 200 μg DNA EP group) increased significantly only in a few DNA groups (P< 0.05). The CD4+CD25+ Treg cells% decreased significantly in all DNA vaccine groups (P<0.01). Except for the 10 μg DNA IM group, the lung and spleen colony-forming units (CFUs) of the other seven DNA immunization groups decreased significantly (P<0.001, P<0.01), especially the 100 μg DNA IM group and 50 μg DNA EP group significantly reduced the pulmonary bacterial loads and lung lesions than the other DNA groups. Conclusions: An MTB ag85a/b chimeric DNA vaccine could induce Th1-type cellular immune reactions. 
DNA immunization by EP could improve the immunogenicity of the low-dose DNA vaccine, reduce DNA dose, and produce good immunotherapeutic effects on the mouse TB model, to provide the basis for the future human clinical trial of MTB ag85a/b chimeric DNA vaccine.
    What is known about prostate cancer screening in the UK Supporting informed decision making online in 20 minutes: an observational web-log study of a PSA test decision aid. BACKGROUND: Web-based decision aids are known to have an effect on knowledge, attitude, and behavior; important components of informed decision making. We know what decision aids achieve in randomized controlled trials (RCTs), but we still know very little about how they are used and how this relates to the informed decision making outcome measures. OBJECTIVE: To examine men's use of an online decision aid for prostate cancer screening using website transaction log files (web-logs), and to examine associations between usage and components of informed decision making. METHODS: We conducted an observational web-log analysis of users of an online decision aid, Prosdex. Men between 50 and 75 years of age were recruited for an associated RCT from 26 general practices across South Wales, United Kingdom. Men allocated to one arm of the RCT were included in the current study. Time and usage data were derived from website log files. Components of informed decision making were measured by an online questionnaire. RESULTS: Available for analysis were 82 web-logs. Overall, there was large variation in the use of Prosdex. The mean total time spent on the site was 20 minutes. The mean number of pages accessed was 32 (SD 21) out of a possible 60 pages. Significant associations were found between increased usage and increased knowledge (Spearman rank correlation [rho] = 0.69, P < .01), between increased usage and less favorable attitude towards PSA testing (rho = -0.52, P < .01), and between increased usage and reduced intention to undergo PSA testing (rho = -0.44, P < .01). A bimodal distribution identified two types of user: low access and high access users. CONCLUSIONS: Increased usage of Prosdex leads to more informed decision making, the key aim of the UK Prostate Cancer Risk Management Programme. 
However, developers realistically have roughly 20 minutes to provide useful information that will support informed decision making when the patient uses a web-based interface. Future decision aids need to be developed with this limitation in mind. We recommend that web-log analysis should be an integral part of online decision aid development and analysis. TRIAL REGISTRATION: ISRCTN48473735; http://www.controlled-trials.com/ISRCTN48473735 (Archived by WebCite at http://www.webcitation.org/5pqeF89tS).
    How does early life adiposity influence long-term cardiovascular health, and what are the implications for prevention and intervention strategies? Adiposity is associated with endothelial activation in healthy 2-3 year-old children. Adiposity is associated with C-reactive protein level in healthy 2-3 year-old children and with other markers of endothelial activation in adults, but data are lacking in very young children. Data from 491 healthy Hispanic children were analyzed. Mean age was 2.7 years (SD 0.5, range 2-3 years); mean body mass index (BMI) was 17.2 kg/m2 (SD 1.9) among boys and 17.1 kg/m2 (SD 2.1) among girls. E-selectin level was associated with BMI (R = 0.11; p < 0.02), ponderal index (p < 0.02), waist circumference (p = 0.02), fasting insulin (p < 0.02), and insulin resistance (p < or = 0.05); these associations remained significant after adjustment for age, sex and fasting glucose. sVCAM was also associated with BMI (R = 0.12; p < 0.05). These observations indicate that adiposity is associated with inflammation and endothelial activation in very early childhood.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • num_train_epochs: 5
  • warmup_ratio: 0.1
  • fp16: True
  • load_best_model_at_end: True
  • resume_from_checkpoint: True
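
As a rough illustration of how learning_rate, warmup_ratio, and the linear scheduler interact, the sketch below computes the learning rate at a given optimizer step. It assumes a plan of 5 epochs at 1073 steps per epoch (the step count at epoch 1.0 in the training logs) and ignores any effect of resuming from a checkpoint. The function name linear_warmup_lr and the constant TOTAL_STEPS are hypothetical.

```python
import math


def linear_warmup_lr(step: int, total_steps: int,
                     base_lr: float = 2e-05, warmup_ratio: float = 0.1) -> float:
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: 5 planned epochs x 1073 steps per epoch.
TOTAL_STEPS = 5 * 1073
for step in (0, 537, 3219, TOTAL_STEPS):
    print(step, linear_warmup_lr(step, TOTAL_STEPS))
```

Under these assumptions the peak of 2e-05 is reached at step 537, about halfway through the first epoch, and the rate then falls linearly toward zero at the final planned step.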

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 128
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 5
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: True
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch   Step  Training Loss  Validation Loss
0.0932  100   0.3536         -
0.1864  200   0.227          -
0.2796  300   0.1599         -
0.3728  400   0.1448         -
0.4660  500   0.1276         -
0.5592  600   0.1187         -
0.6524  700   0.1191         -
0.7456  800   0.1082         -
0.8388  900   0.1026         -
0.9320  1000  0.0991         -
1.0     1073  -              0.0138
1.0252  1100  0.089          -
1.1184  1200  0.0759         -
1.2116  1300  0.0726         -
1.3048  1400  0.075          -
1.3979  1500  0.0732         -
1.4911  1600  0.07           -
1.5843  1700  0.0706         -
1.6775  1800  0.0708         -
1.7707  1900  0.0691         -
1.8639  2000  0.0713         -
1.9571  2100  0.0626         -
2.0     2146  -              0.0115
2.0503  2200  0.0564         -
2.1435  2300  0.0547         -
2.2367  2400  0.052          -
2.3299  2500  0.0491         -
2.4231  2600  0.0542         -
2.5163  2700  0.0506         -
2.6095  2800  0.0508         -
2.7027  2900  0.0493         -
2.7959  3000  0.0537         -
2.8891  3100  0.0499         -
2.9823  3200  0.0488         -
3.0     3219  -              0.0101
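
The Epoch and Step columns can be cross-checked against the dataset size and batch size reported earlier. This small sketch assumes training on a single device; gradient_accumulation_steps of 1 and dataloader_drop_last of False are taken from the hyperparameter list. The helper name epoch_at is hypothetical.

```python
import math

TRAIN_SAMPLES = 137_221  # training set size reported above
BATCH_SIZE = 128         # per_device_train_batch_size, assuming one device

# The last, partially filled batch still counts as a step (drop_last is False).
steps_per_epoch = math.ceil(TRAIN_SAMPLES / BATCH_SIZE)
print(steps_per_epoch)  # 1073


def epoch_at(step: int) -> float:
    """Fractional epoch reached after a given number of optimizer steps."""
    return round(step / steps_per_epoch, 4)


print(epoch_at(100))   # 0.0932
print(epoch_at(1100))  # 1.0252
print(epoch_at(3219))  # 3.0
```

These values match the logged rows, which is consistent with validation loss being reported once per epoch at steps 1073, 2146, and 3219.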

Framework Versions

  • Python: 3.12.2
  • Sentence Transformers: 3.2.1
  • Transformers: 4.44.2
  • PyTorch: 2.5.0
  • Accelerate: 1.0.1
  • Datasets: 3.0.2
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 41.5M params (Safetensors, tensor type F32)