---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sparse-encoder
- sparse
- splade
- generated_from_trainer
- dataset_size:100000
- loss:SpladeLoss
- loss:SparseMultipleNegativesRankingLoss
- loss:FlopsLoss
base_model: distilbert/distilbert-base-uncased
widget:
- text: Citicoline promotes changes in brain dopamine (DA) and acetylcholine (ACh)
    receptors and modulates their release. In aging mice, citicoline administration
    led to a partial recovery of receptor function and density, with a dose-dependent
    increase in DA and ACh receptor densities. This is significant because aging is
    associated with a decrease in the number of DA and ACh receptors. Additionally,
    citicoline may reduce dopaminergic cell loss and stimulate acetylcholine synthesis,
    indicating its potential role in modulating neurotransmitter metabolism.
- text: After surgery, all patients were transferred to the intensive care unit and
    received mechanical ventilation support. Invasive arterial pressure and electrocardiogram
    (ECG) monitorization were performed, and daily ECGs were recorded. Patients with
    acceptable levels of bleeding started anticoagulation treatment with enoxaparin
    and warfarin. Some patients also received an oral beta-blocker (metoprolol) for
    rhythm control. ECG recordings were made every six hours in the early postoperative
    period.
- text: 'How does the prognosis for hip fractures differ between Japanese patients
    and Caucasian populations?

    '
- text: The diagnosis of fibromyalgia is currently assessed using the 2016 revised
    FM criteria, which is based on the Fibromyalgia Research Criteria. However, due
    to the level of subjectivity in the diagnostic rubric, objective key causal factors/mechanisms
    and measures to confirm the diagnosis have not been identified.
- text: Individuals with sports-related concussions may experience a range of symptoms
    that can affect their physical, cognitive, behavioral, and emotional health. These
    symptoms can include dizziness, headache, poor sleep, and emotional problems.
    While 90% of people with a sports concussion recover within 7 to 10 days, at least
    10% may experience prolonged symptoms. It is important to evaluate these symptoms
    as they can provide valuable information for estimating prognosis and predicting
    the time course and extent of expected recovery.
datasets:
- tomaarsen/miriad-4.4M-split
pipeline_tag: feature-extraction
library_name: sentence-transformers
metrics:
- dot_accuracy@1
- dot_accuracy@3
- dot_accuracy@5
- dot_accuracy@10
- dot_precision@1
- dot_precision@3
- dot_precision@5
- dot_precision@10
- dot_recall@1
- dot_recall@3
- dot_recall@5
- dot_recall@10
- dot_ndcg@10
- dot_mrr@10
- dot_map@100
- query_active_dims
- query_sparsity_ratio
- corpus_active_dims
- corpus_sparsity_ratio
co2_eq_emissions:
  emissions: 46.07984513287871
  energy_consumed: 0.11854800112394254
  source: codecarbon
  training_type: fine-tuning
  on_cloud: false
  cpu_model: 13th Gen Intel(R) Core(TM) i7-13700K
  ram_total_size: 31.777088165283203
  hours_used: 0.375
  hardware_used: 1 x NVIDIA GeForce RTX 3090
model-index:
- name: DistilBERT base trained on MIRIAD question-answer tuples
  results:
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: miriad eval
      type: miriad_eval
    metrics:
    - type: dot_accuracy@1
      value: 0.9747
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.9919
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.9945
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.9964
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.9747
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.3306333333333333
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.19890000000000005
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.09964000000000003
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.9747
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.9919
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.9945
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.9964
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.9867362731234464
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.9835060714285719
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.9836646878971993
      name: Dot Map@100
    - type: query_active_dims
      value: 28.703100204467773
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.999059593073702
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 64.08699798583984
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9979003014879155
      name: Corpus Sparsity Ratio
  - task:
      type: sparse-information-retrieval
      name: Sparse Information Retrieval
    dataset:
      name: miriad test
      type: miriad_test
    metrics:
    - type: dot_accuracy@1
      value: 0.9765
      name: Dot Accuracy@1
    - type: dot_accuracy@3
      value: 0.9931
      name: Dot Accuracy@3
    - type: dot_accuracy@5
      value: 0.996
      name: Dot Accuracy@5
    - type: dot_accuracy@10
      value: 0.998
      name: Dot Accuracy@10
    - type: dot_precision@1
      value: 0.9765
      name: Dot Precision@1
    - type: dot_precision@3
      value: 0.3310333333333333
      name: Dot Precision@3
    - type: dot_precision@5
      value: 0.19920000000000004
      name: Dot Precision@5
    - type: dot_precision@10
      value: 0.09980000000000003
      name: Dot Precision@10
    - type: dot_recall@1
      value: 0.9765
      name: Dot Recall@1
    - type: dot_recall@3
      value: 0.9931
      name: Dot Recall@3
    - type: dot_recall@5
      value: 0.996
      name: Dot Recall@5
    - type: dot_recall@10
      value: 0.998
      name: Dot Recall@10
    - type: dot_ndcg@10
      value: 0.9883079218290853
      name: Dot Ndcg@10
    - type: dot_mrr@10
      value: 0.985087023809524
      name: Dot Mrr@10
    - type: dot_map@100
      value: 0.9851723804723872
      name: Dot Map@100
    - type: query_active_dims
      value: 28.688600540161133
      name: Query Active Dims
    - type: query_sparsity_ratio
      value: 0.9990600681298683
      name: Query Sparsity Ratio
    - type: corpus_active_dims
      value: 64.32160186767578
      name: Corpus Active Dims
    - type: corpus_sparsity_ratio
      value: 0.9978926151016422
      name: Corpus Sparsity Ratio
---

# DistilBERT base trained on MIRIAD question-answer tuples

This is a [SPLADE Sparse Encoder](https://www.sbert.net/docs/sparse_encoder/usage/usage.html) model finetuned from [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) on the [miriad-4.4M-split](https://huggingface.co/datasets/tomaarsen/miriad-4.4M-split) dataset using the [sentence-transformers](https://www.SBERT.net) library. It maps sentences and paragraphs to a 30522-dimensional sparse vector space and can be used for semantic search and sparse retrieval.

## Model Details

### Model Description
- **Model Type:** SPLADE Sparse Encoder
- **Base model:** [distilbert/distilbert-base-uncased](https://huggingface.co/distilbert/distilbert-base-uncased) <!-- at revision 12040accade4e8a0f71eabdb258fecc2e7e948be -->
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 30522 dimensions
- **Similarity Function:** Dot Product
- **Training Dataset:**
    - [miriad-4.4M-split](https://huggingface.co/datasets/tomaarsen/miriad-4.4M-split)
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Documentation:** [Sparse Encoder Documentation](https://www.sbert.net/docs/sparse_encoder/usage/usage.html)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sparse Encoders on Hugging Face](https://huggingface.co/models?library=sentence-transformers&other=sparse-encoder)

### Full Model Architecture

```
SparseEncoder(
  (0): MLMTransformer({'max_seq_length': 512, 'do_lower_case': False}) with MLMTransformer model: DistilBertForMaskedLM 
  (1): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
)
```
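
The two modules above can be read as follows: the masked-language-model head produces one logit per token per vocabulary entry, and `SpladePooling` applies an activation and takes the maximum over tokens. The sketch below illustrates this in plain Python on a toy vocabulary of 5 entries instead of 30522. It assumes the usual SPLADE formulation, `log(1 + relu(logit))`; the log saturation is an assumption, since the configuration above only lists `relu` and `max`. The function name `splade_pool` is illustrative and not part of the library.

```python
import math

def splade_pool(token_logits, attention_mask):
    """Collapse per-token vocabulary logits into one sparse vector.

    token_logits: list of rows, one per token, each of length vocab_size.
    attention_mask: list of 0/1 flags, one per token (0 = padding).
    """
    vocab_size = len(token_logits[0])
    pooled = [0.0] * vocab_size
    for row, keep in zip(token_logits, attention_mask):
        if not keep:
            continue  # padding tokens contribute nothing
        for j, logit in enumerate(row):
            # relu, then log saturation (assumed, see the note above)
            weight = math.log1p(max(logit, 0.0))
            # max pooling over the sequence dimension
            pooled[j] = max(pooled[j], weight)
    return pooled

# Toy example: 3 tokens (the last one is padding), vocabulary of 5 entries.
logits = [
    [2.0, -1.0, 0.0, 0.5, -3.0],
    [0.1, -0.2, 4.0, -0.5, -1.0],
    [9.0, 9.0, 9.0, 9.0, 9.0],  # padding row, masked out
]
vector = splade_pool(logits, attention_mask=[1, 1, 0])
active_dims = sum(1 for w in vector if w > 0)
print(vector, active_dims)
```

Only dimensions with a positive logit for at least one real token end up non-zero, which is what keeps the output vectors sparse.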

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SparseEncoder

# Download from the 🤗 Hub
model = SparseEncoder("tomaarsen/splade-distilbert-base-uncased-miriad")
# Run inference
queries = [
    "What are the common symptoms experienced by individuals with sports-related concussions and how do they impact their overall health?\n",
]
documents = [
    'Individuals with sports-related concussions may experience a range of symptoms that can affect their physical, cognitive, behavioral, and emotional health. These symptoms can include dizziness, headache, poor sleep, and emotional problems. While 90% of people with a sports concussion recover within 7 to 10 days, at least 10% may experience prolonged symptoms. It is important to evaluate these symptoms as they can provide valuable information for estimating prognosis and predicting the time course and extent of expected recovery.',
    "The physical parameters used to evaluate the tablets included color and appearance, weight variation, hardness, friability, thickness, and disintegration time. These parameters are important indicators of the tablet's quality, stability, and suitability for human use.",
    "The risk factors for developing depression in Alzheimer's Disease (AD) include a family history of depressive symptoms, a personal history of depression, gender, and a young onset of AD. Sleep disturbances, which are common in AD, are also a key predictor of depressive symptoms.",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 30522] [3, 30522]

# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[43.7983,  0.0000,  9.6222]])
```
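
The similarity scores above are dot products between sparse vectors, so only dimensions that are active in both the query and the document contribute. This is why an unrelated document can score exactly 0. The sketch below shows the idea with vectors stored as `{dimension: weight}` dictionaries; the token ids and weights are made up for illustration, and `sparse_dot` is a hypothetical helper, not a library function.

```python
def sparse_dot(query, document):
    """Dot product of two sparse vectors stored as {dimension: weight}."""
    # Iterate over the smaller vector and look up matches in the larger one.
    if len(query) <= len(document):
        small, large = query, document
    else:
        small, large = document, query
    return sum(weight * large[dim] for dim, weight in small.items() if dim in large)

# Hypothetical token ids and weights, for illustration only.
query = {101: 1.5, 2054: 0.5, 7000: 2.0}
documents = [
    {101: 2.0, 7000: 1.0, 9999: 0.3},  # shares two dimensions with the query
    {42: 1.0, 43: 2.0},                # shares none, so it scores 0
    {2054: 1.0, 5555: 0.7},            # shares one dimension
]
scores = [sparse_dot(query, doc) for doc in documents]
# Document indices ordered from most to least similar.
ranking = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
print(scores, ranking)
```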

## Evaluation

### Metrics

#### Sparse Information Retrieval

* Datasets: `miriad_eval` and `miriad_test`
* Evaluated with [<code>SparseInformationRetrievalEvaluator</code>](https://sbert.net/docs/package_reference/sparse_encoder/evaluation.html#sentence_transformers.sparse_encoder.evaluation.SparseInformationRetrievalEvaluator)

| Metric                | miriad_eval | miriad_test |
|:----------------------|:------------|:------------|
| dot_accuracy@1        | 0.9747      | 0.9765      |
| dot_accuracy@3        | 0.9919      | 0.9931      |
| dot_accuracy@5        | 0.9945      | 0.996       |
| dot_accuracy@10       | 0.9964      | 0.998       |
| dot_precision@1       | 0.9747      | 0.9765      |
| dot_precision@3       | 0.3306      | 0.331       |
| dot_precision@5       | 0.1989      | 0.1992      |
| dot_precision@10      | 0.0996      | 0.0998      |
| dot_recall@1          | 0.9747      | 0.9765      |
| dot_recall@3          | 0.9919      | 0.9931      |
| dot_recall@5          | 0.9945      | 0.996       |
| dot_recall@10         | 0.9964      | 0.998       |
| **dot_ndcg@10**       | **0.9867**  | **0.9883**  |
| dot_mrr@10            | 0.9835      | 0.9851      |
| dot_map@100           | 0.9837      | 0.9852      |
| query_active_dims     | 28.7031     | 28.6886     |
| query_sparsity_ratio  | 0.9991      | 0.9991      |
| corpus_active_dims    | 64.087      | 64.3216     |
| corpus_sparsity_ratio | 0.9979      | 0.9979      |
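
Several rows of this table are related by simple arithmetic. Accuracy@k and recall@k are identical, which suggests that each question has a single relevant passage; under that assumption precision@k is recall@k divided by k. The sparsity ratio is one minus the average number of active dimensions divided by the 30522 output dimensions. The sketch below checks these relations against the `miriad_eval` column and shows how accuracy@k and MRR@k could be computed from the rank of the relevant passage. The helper names and the example ranks are hypothetical.

```python
VOCAB_SIZE = 30522  # output dimensionality reported in the model description

def precision_at_k(recall_at_k, k):
    """With a single relevant passage per query, precision@k = recall@k / k."""
    return recall_at_k / k

def sparsity_ratio(active_dims, vocab_size=VOCAB_SIZE):
    """Average fraction of dimensions that are zero."""
    return 1.0 - active_dims / vocab_size

def accuracy_at_k(ranks, k):
    """Share of queries whose relevant passage is ranked within the top k."""
    return sum(1 for r in ranks if r is not None and r <= k) / len(ranks)

def mrr_at_k(ranks, k):
    """Mean reciprocal rank, counting 0 when the passage is outside the top k."""
    return sum(1.0 / r for r in ranks if r is not None and r <= k) / len(ranks)

# Values taken from the miriad_eval column of the table above.
p3 = precision_at_k(0.9919, 3)
query_sparsity = sparsity_ratio(28.7031)
corpus_sparsity = sparsity_ratio(64.087)

# Hypothetical 1-based ranks of the relevant passage for four queries
# (None means the passage was not retrieved at all).
ranks = [1, 1, 3, None]
acc1 = accuracy_at_k(ranks, 1)
mrr10 = mrr_at_k(ranks, 10)
print(round(p3, 4), round(query_sparsity, 4), round(corpus_sparsity, 4), acc1, mrr10)
```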

## Training Details

### Training Dataset

#### miriad-4.4M-split

* Dataset: [miriad-4.4M-split](https://huggingface.co/datasets/tomaarsen/miriad-4.4M-split) at [596b9ab](https://huggingface.co/datasets/tomaarsen/miriad-4.4M-split/tree/596b9ab305d52cb73644ed5b5004957c7bfaae40)
* Size: 100,000 training samples
* Columns: <code>question</code> and <code>answer</code>
* Approximate statistics based on the first 1000 samples:
  |         | question                                                                          | answer                                                                               |
  |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
  | type    | string                                                                            | string                                                                               |
  | details | <ul><li>min: 9 tokens</li><li>mean: 23.38 tokens</li><li>max: 71 tokens</li></ul> | <ul><li>min: 24 tokens</li><li>mean: 103.31 tokens</li><li>max: 315 tokens</li></ul> |
* Samples:
  | question                                                                                                                                                                                              | answer                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
  |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
  | <code>What factors may contribute to increased pulmonary conduit durability in patients who undergo the Ross operation compared to those with right ventricular outflow tract obstruction?<br></code> | <code>Several factors may contribute to increased pulmonary conduit durability in patients who undergo the Ross operation compared to those with right ventricular outflow tract obstruction. These factors include later age at operation (allowing for larger homografts), more normal pulmonary artery architecture, absence of severe right ventricular hypertrophy, and more natural positioning of the homograft. However, further systematic studies are needed to confirm these associations.</code>                                                                                                                                                                                                                               |
  | <code>How does MCAM expression in hMSC affect the growth and maintenance of hematopoietic progenitors?</code>                                                                                         | <code>MCAM expression in hMSC has been shown to support the growth of hematopoietic progenitors. It enhances the adhesion and migration of HSPC, potentially through direct cell-cell interactions. However, the putative interaction partner of MCAM on HSPC remains unknown. Additionally, MCAM expression in hMSC does not seem to regulate the expression or secretion of SDF-1, a key factor in HSPC homing and maintenance.</code>                                                                                                                                                                                                                                                                                                   |
  | <code>What is the relationship between Fanconi anemia and breast and ovarian cancer susceptibility genes?<br></code>                                                                                  | <code>Fanconi anemia is a rare, autosomal recessive syndrome characterized by chromosomal instability, cancer susceptibility, and hypersensitivity to DNA cross-linking agents. It has been found that all known Fanconi anemia proteins cooperate with breast and/or ovarian cancer susceptibility gene products (BRCA1 and BRCA2) in a pathway required for cellular resistance to DNA cross-linking agents. This pathway, known as the "Fanconi anemia-BRCA pathway," is a DNA damage-activated signaling pathway that controls DNA repair. Methylation of one of the Fanconi anemia genes, FANCF, can lead to the inactivation of this pathway in breast and ovarian cancer, suggesting its importance in human carcinogenesis.</code> |
* Loss: [<code>SpladeLoss</code>](https://sbert.net/docs/package_reference/sparse_encoder/losses.html#spladeloss) with these parameters:
  ```json
  {
      "loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score')",
      "lambda_corpus": 3e-05,
      "lambda_query": 5e-05
  }
  ```
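
The parameters above combine a ranking loss with two sparsity regularizers. The following is a minimal sketch of that combination, assuming the usual SPLADE formulation: a cross-entropy over in-batch dot-product scores, plus a FLOPS penalty (the squared mean activation per dimension, summed over dimensions) weighted by `lambda_query` and `lambda_corpus`. The function names are illustrative, and the library's actual implementation may differ in details.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ranking_loss(queries, docs, scale=1.0):
    """In-batch negatives: query i should score highest against document i."""
    total = 0.0
    for i, q in enumerate(queries):
        scores = [scale * dot(q, d) for d in docs]
        # Cross-entropy with the matching document as the target class
        # (log-sum-exp computed stably by subtracting the max score).
        top = max(scores)
        log_sum_exp = top + math.log(sum(math.exp(s - top) for s in scores))
        total += log_sum_exp - scores[i]
    return total / len(queries)

def flops_penalty(batch):
    """Sum over dimensions of the squared mean activation across the batch."""
    dims = len(batch[0])
    means = [sum(vec[j] for vec in batch) / len(batch) for j in range(dims)]
    return sum(m * m for m in means)

def splade_loss(queries, docs, lambda_query=5e-05, lambda_corpus=3e-05):
    return (
        ranking_loss(queries, docs)
        + lambda_query * flops_penalty(queries)
        + lambda_corpus * flops_penalty(docs)
    )

# Toy batch of two (query, document) pairs in a 4-dimensional space.
queries = [[1.0, 0.0, 0.0, 0.0], [0.0, 2.0, 0.0, 0.0]]
docs = [[3.0, 0.0, 1.0, 0.0], [0.0, 2.0, 0.0, 1.0]]
loss = splade_loss(queries, docs)
print(loss)
```

The penalty grows when the same dimensions are active across many inputs in a batch, which pushes the model toward fewer active dimensions per vector.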

### Evaluation Dataset

#### miriad-4.4M-split

* Dataset: [miriad-4.4M-split](https://huggingface.co/datasets/tomaarsen/miriad-4.4M-split) at [596b9ab](https://huggingface.co/datasets/tomaarsen/miriad-4.4M-split/tree/596b9ab305d52cb73644ed5b5004957c7bfaae40)
* Size: 1,000 evaluation samples
* Columns: <code>question</code> and <code>answer</code>
* Approximate statistics based on the first 1000 samples:
  |         | question                                                                          | answer                                                                               |
  |:--------|:----------------------------------------------------------------------------------|:-------------------------------------------------------------------------------------|
  | type    | string                                                                            | string                                                                               |
  | details | <ul><li>min: 8 tokens</li><li>mean: 23.55 tokens</li><li>max: 74 tokens</li></ul> | <ul><li>min: 26 tokens</li><li>mean: 103.03 tokens</li><li>max: 262 tokens</li></ul> |
* Samples:
  | question                                                                                                                                                                                  | answer                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
  |:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
  | <code>What are some hereditary cancer syndromes that can result in various forms of cancer?<br></code>                                                                                    | <code>Hereditary cancer syndromes, such as Hereditary Breast and Ovarian Cancer (HBOC) and Lynch Syndrome (LS), can result in various forms of cancer due to germline mutations in cancer predisposition genes. These syndromes are associated with an increased risk of developing specific types of cancer.</code>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
  | <code>How do MAK-4 and MAK-5 exert their antioxidant properties?<br></code>                                                                                                               | <code>MAK-4 and MAK-5 have been shown to have antioxidant properties both in vitro and in vivo. These preparations contain multiple antioxidants such as alpha-tocopherol, beta-carotene, ascorbate, bioflavonoid, catechin, polyphenols, riboflavin, and tannic acid. These antioxidants are known to scavenge free radicals and reactive oxygen species (ROS) such as superoxide, hydroxyl, and peroxyl radicals, as well as hydrogen peroxide. In the present study, the antioxidant properties of MAK-4 and MAK-5 were confirmed in mice, with higher oxygen radical absorbance capacity (ORAC) values observed in mice fed the MAK-supplemented diet. Additionally, the activity of liver enzymes GPX, GST, and QR, which are involved in detoxification processes, were upregulated in the MAK-fed mice. This suggests that MAK-4 and MAK-5 may protect against carcinogenesis by reducing oxidative stress and enhancing detoxification processes.</code> |
  | <code>What are the primary indications for a decompressive craniectomy, and what role does neurocritical care play in determining the suitability of a patient for this procedure?</code> | <code>The primary indications for a decompressive craniectomy include refractory intracranial pressure (ICP) and progressive neurological deterioration due to mass effect from conditions like head trauma, or ischemic or hemorrhagic cerebrovascular disease. Neurocritical care and ICP monitoring are essential in identifying suitable candidates for the procedure, as it is considered a rescue surgical technique. These measures help to assess the patient's condition and determine the need for decompressive craniectomy in cases of elevated ICP.</code>                                                                                                                                                                                                                                                                                                                                                                                          |
* Loss: [<code>SpladeLoss</code>](https://sbert.net/docs/package_reference/sparse_encoder/losses.html#spladeloss) with these parameters:
  ```json
  {
      "loss": "SparseMultipleNegativesRankingLoss(scale=1.0, similarity_fct='dot_score')",
      "lambda_corpus": 3e-05,
      "lambda_query": 5e-05
  }
  ```

### Training Hyperparameters
#### Non-Default Hyperparameters

- `eval_strategy`: steps
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 1
- `warmup_ratio`: 0.1
- `fp16`: True
- `batch_sampler`: no_duplicates
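
With 100,000 training samples, a batch size of 16, and one epoch, these settings imply 6,250 optimizer steps, of which the first 10% are warmup. The sketch below works through that arithmetic and the resulting learning-rate curve. It assumes a single device, no gradient accumulation, and a standard linear warmup followed by linear decay to zero; the helper names are illustrative.

```python
import math

NUM_SAMPLES = 100_000
BATCH_SIZE = 16
EPOCHS = 1
BASE_LR = 2e-05
WARMUP_RATIO = 0.1

total_steps = math.ceil(NUM_SAMPLES / BATCH_SIZE) * EPOCHS
warmup_steps = math.ceil(total_steps * WARMUP_RATIO)

def learning_rate(step):
    """Linear warmup to BASE_LR, then linear decay to zero."""
    if step < warmup_steps:
        return BASE_LR * step / warmup_steps
    remaining = max(0, total_steps - step)
    return BASE_LR * remaining / (total_steps - warmup_steps)

def epoch_at(step):
    """Fraction of training completed, in epochs, after `step` optimizer steps."""
    return step / total_steps * EPOCHS

for step in (200, 625, 1000, 6200):
    print(step, round(epoch_at(step), 3), learning_rate(step))
```

Under these assumptions step 200 corresponds to epoch 0.032, which agrees with the first row of the training logs.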

#### All Hyperparameters
<details><summary>Click to expand</summary>

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: steps
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 16
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `torch_empty_cache_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 1
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: None
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `include_for_metrics`: []
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`: 
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `eval_on_start`: False
- `use_liger_kernel`: False
- `eval_use_gather_object`: False
- `average_tokens_across_devices`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional
- `router_mapping`: {}
- `learning_rate_mapping`: {}

</details>

### Training Logs
| Epoch | Step | Training Loss | Validation Loss | miriad_eval_dot_ndcg@10 | miriad_test_dot_ndcg@10 |
|:-----:|:----:|:-------------:|:---------------:|:-----------------------:|:-----------------------:|
| 0.032 | 200  | 287.5421      | -               | -                       | -                       |
| 0.064 | 400  | 0.1454        | -               | -                       | -                       |
| 0.096 | 600  | 0.0469        | -               | -                       | -                       |
| 0.128 | 800  | 0.0105        | -               | -                       | -                       |
| 0.16  | 1000 | 0.0016        | 0.0016          | 0.9759                  | -                       |
| 0.192 | 1200 | 0.0084        | -               | -                       | -                       |
| 0.224 | 1400 | 0.0069        | -               | -                       | -                       |
| 0.256 | 1600 | 0.0031        | -               | -                       | -                       |
| 0.288 | 1800 | 0.0061        | -               | -                       | -                       |
| 0.32  | 2000 | 0.0061        | 0.0006          | 0.9817                  | -                       |
| 0.352 | 2200 | 0.0012        | -               | -                       | -                       |
| 0.384 | 2400 | 0.0034        | -               | -                       | -                       |
| 0.416 | 2600 | 0.0057        | -               | -                       | -                       |
| 0.448 | 2800 | 0.0023        | -               | -                       | -                       |
| 0.48  | 3000 | 0.0034        | 0.0005          | 0.9829                  | -                       |
| 0.512 | 3200 | 0.0006        | -               | -                       | -                       |
| 0.544 | 3400 | 0.002         | -               | -                       | -                       |
| 0.576 | 3600 | 0.0025        | -               | -                       | -                       |
| 0.608 | 3800 | 0.0008        | -               | -                       | -                       |
| 0.64  | 4000 | 0.0019        | 0.0006          | 0.9834                  | -                       |
| 0.672 | 4200 | 0.0106        | -               | -                       | -                       |
| 0.704 | 4400 | 0.0084        | -               | -                       | -                       |
| 0.736 | 4600 | 0.0035        | -               | -                       | -                       |
| 0.768 | 4800 | 0.0016        | -               | -                       | -                       |
| 0.8   | 5000 | 0.0037        | 0.0004          | 0.9860                  | -                       |
| 0.832 | 5200 | 0.0044        | -               | -                       | -                       |
| 0.864 | 5400 | 0.004         | -               | -                       | -                       |
| 0.896 | 5600 | 0.0005        | -               | -                       | -                       |
| 0.928 | 5800 | 0.0013        | -               | -                       | -                       |
| 0.96  | 6000 | 0.0012        | 0.0005          | 0.9868                  | -                       |
| 0.992 | 6200 | 0.0009        | -               | -                       | -                       |
| -1    | -1   | -             | -               | 0.9867                  | 0.9883                  |
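
The Epoch and Step columns above imply how many optimizer steps make up one epoch. The following is a minimal sketch of that arithmetic. It assumes that the `dataset_size:100000` tag in this card's metadata refers to the training set and that every step consumes one full batch, so the derived samples-per-step figure is an estimate and not a logged value.

```python
# Values read from the first row of the training log above.
logged_step = 200
logged_epoch = 0.032

# Assumption: the dataset_size:100000 metadata tag is the training set size.
train_samples = 100_000

# Steps needed to complete one full epoch, rounded to absorb float error.
steps_per_epoch = round(logged_step / logged_epoch)

# Estimated samples consumed per optimizer step under the assumption above.
implied_samples_per_step = train_samples / steps_per_epoch

print(f"steps per epoch: {steps_per_epoch}")
print(f"implied samples per step: {implied_samples_per_step:.0f}")
```

Under these assumptions, the last logged step (6200 at epoch 0.992) sits just short of the end of a single epoch.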


### Environmental Impact
Carbon emissions were measured using [CodeCarbon](https://github.com/mlco2/codecarbon).
- **Energy Consumed**: 0.119 kWh
- **Carbon Emitted**: 0.046 kg of CO2
- **Hours Used**: 0.375 hours
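
The three figures above can be combined into two derived quantities: the average power draw during training and the carbon intensity implied by the measurement. The sketch below only performs that arithmetic on the reported values. The variable names are illustrative, and the derived numbers were not logged by CodeCarbon.

```python
# Reported values from the Environmental Impact list above.
energy_kwh = 0.119   # Energy Consumed
carbon_kg = 0.046    # Carbon Emitted (kg of CO2)
hours = 0.375        # Hours Used

# Average power draw over the run, in kilowatts.
avg_power_kw = energy_kwh / hours

# Carbon intensity implied by the measurement, in kg of CO2 per kWh.
carbon_intensity = carbon_kg / energy_kwh

# Training duration expressed in minutes.
minutes = hours * 60

print(f"average power: {avg_power_kw * 1000:.0f} W")
print(f"implied carbon intensity: {carbon_intensity:.3f} kg CO2/kWh")
print(f"duration: {minutes:.1f} min")
```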

### Training Hardware
- **On Cloud**: No
- **GPU Model**: 1 x NVIDIA GeForce RTX 3090
- **CPU Model**: 13th Gen Intel(R) Core(TM) i7-13700K
- **RAM Size**: 31.78 GB

### Framework Versions
- Python: 3.11.6
- Sentence Transformers: 4.2.0.dev0
- Transformers: 4.52.4
- PyTorch: 2.6.0+cu124
- Accelerate: 1.5.1
- Datasets: 2.21.0
- Tokenizers: 0.21.1

## Citation

### BibTeX

#### Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### SpladeLoss
```bibtex
@misc{formal2022distillationhardnegativesampling,
      title={From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective},
      author={Thibault Formal and Carlos Lassance and Benjamin Piwowarski and Stéphane Clinchant},
      year={2022},
      eprint={2205.04733},
      archivePrefix={arXiv},
      primaryClass={cs.IR},
      url={https://arxiv.org/abs/2205.04733},
}
```

#### SparseMultipleNegativesRankingLoss
```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```

#### FlopsLoss
```bibtex
@article{paria2020minimizing,
    title={Minimizing flops to learn efficient sparse representations},
    author={Paria, Biswajit and Yeh, Chih-Kuan and Yen, Ian EH and Xu, Ning and Ravikumar, Pradeep and P{\'o}czos, Barnab{\'a}s},
    journal={arXiv preprint arXiv:2004.05665},
    year={2020}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->