CrossEncoder based on bansalaman18/bert-uncased_L-10_H-128_A-2

This is a Cross Encoder model finetuned from bansalaman18/bert-uncased_L-10_H-128_A-2 on the ms_marco dataset using the sentence-transformers library. It computes scores for pairs of texts, which can be used for text reranking and semantic search.

Model Details

Model Description

Model Sources

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("rahulseetharaman/reranker-msmarco-v1.1-bert-uncased_L-10_H-128_A-2-listnet")
# Get scores for pairs of texts
pairs = [
    ['what is the appropriate amount of carbs per day', 'Percent of Calories. According to the Institute of Medicine, children and adults should consume 45 to 65 percent of their calorie intake as carbohydrates, and at least 130 grams of carbs per day.'],
    ['what is the appropriate amount of carbs per day', 'This means women following 1,200-calorie weight loss diets need about 135 to 195 grams of carbs each day, women consuming 1,600-calorie diets need 180 to 260 grams, women following 2,000-calorie diets need 225 to 325 grams and women consuming 2,400 calories per day require 270 to 390 grams of carbohydrates each day.'],
    ['what is the appropriate amount of carbs per day', 'Determine the number of grams of carbs you need each day by calculating 45 to 65 percent of your total calorie intake, and dividing by 4. For example, if you eat a 2,000-calorie diet, shoot for 225 to 325 grams of carbs per day; and if you eat 2,500 calories a day, aim for 281 to 406 grams of carbs.'],
    ['what is the appropriate amount of carbs per day', 'The Dietary Guidelines for Americans recommends that carbohydrates make up 45 to 65 percent of your total daily calories. So, if you get 2,000 calories a day, between 900 and 1,300 calories should be from carbohydrates. That translates to between 225 and 325 grams of carbohydrates a day. You can find the carbohydrate content of packaged foods on the Nutrition Facts label.'],
    ['what is the appropriate amount of carbs per day', '1 All of those 1155 calories will come from carbs. 2  And, since 1 gram of carbs contains 4 calories, all our example person would need to do now is divide 1155 by 4 and get 288. 3  Which means, this example person would need to eat about 288 grams of carbs per day.'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'what is the appropriate amount of carbs per day',
    [
        'Percent of Calories. According to the Institute of Medicine, children and adults should consume 45 to 65 percent of their calorie intake as carbohydrates, and at least 130 grams of carbs per day.',
        'This means women following 1,200-calorie weight loss diets need about 135 to 195 grams of carbs each day, women consuming 1,600-calorie diets need 180 to 260 grams, women following 2,000-calorie diets need 225 to 325 grams and women consuming 2,400 calories per day require 270 to 390 grams of carbohydrates each day.',
        'Determine the number of grams of carbs you need each day by calculating 45 to 65 percent of your total calorie intake, and dividing by 4. For example, if you eat a 2,000-calorie diet, shoot for 225 to 325 grams of carbs per day; and if you eat 2,500 calories a day, aim for 281 to 406 grams of carbs.',
        'The Dietary Guidelines for Americans recommends that carbohydrates make up 45 to 65 percent of your total daily calories. So, if you get 2,000 calories a day, between 900 and 1,300 calories should be from carbohydrates. That translates to between 225 and 325 grams of carbohydrates a day. You can find the carbohydrate content of packaged foods on the Nutrition Facts label.',
        '1 All of those 1155 calories will come from carbs. 2  And, since 1 gram of carbs contains 4 calories, all our example person would need to do now is divide 1155 by 4 and get 288. 3  Which means, this example person would need to eat about 288 grams of carbs per day.',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
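
The scores returned by predict can also be turned into a ranking by hand. The sketch below assumes nothing about the model itself: rank_from_scores is a hypothetical helper, and the example scores are made up for illustration rather than produced by this model. It only mirrors the list-of-dicts shape shown in the comment above.

```python
def rank_from_scores(scores, top_k=None):
    """Sort passage indices by score, highest first.

    Returns a list shaped like [{'corpus_id': ..., 'score': ...}, ...].
    This is a hypothetical helper, not part of sentence-transformers.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if top_k is not None:
        order = order[:top_k]
    return [{"corpus_id": i, "score": float(scores[i])} for i in order]


# Made-up scores for five passages (NOT real output from this model).
example_scores = [0.12, -0.40, 0.55, 0.31, -0.05]

ranking = rank_from_scores(example_scores, top_k=3)
print([hit["corpus_id"] for hit in ranking])
```

Here corpus_id is the position of the passage in the input list, so the original text can be recovered by indexing back into that list.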

Evaluation

Metrics

Cross Encoder Reranking

  • Datasets: NanoMSMARCO_R100, NanoNFCorpus_R100 and NanoNQ_R100
  • Evaluated with CrossEncoderRerankingEvaluator with these parameters:
    {
        "at_k": 10,
        "always_rerank_positives": true
    }
    
| Metric  | NanoMSMARCO_R100 | NanoNFCorpus_R100 | NanoNQ_R100      |
|---------|------------------|-------------------|------------------|
| map     | 0.0722 (-0.4174) | 0.2861 (+0.0251)  | 0.0400 (-0.3796) |
| mrr@10  | 0.0520 (-0.4255) | 0.3993 (-0.1005)  | 0.0149 (-0.4118) |
| ndcg@10 | 0.0630 (-0.4775) | 0.2827 (-0.0424)  | 0.0251 (-0.4756) |
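
For readers unfamiliar with the three metrics, the following is a minimal sketch of how average precision, MRR@10 and NDCG@10 can be computed for one query from the binary relevance labels of its reranked passages. It uses the textbook definitions; the function names and the example label list are illustrative assumptions, and CrossEncoderRerankingEvaluator may differ in details such as how positives outside the reranked set are handled.

```python
import math


def average_precision(labels):
    """Mean of precision@rank over the ranks that hold a relevant passage."""
    hits, total = 0, 0.0
    for rank, label in enumerate(labels, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def mrr_at_k(labels, k=10):
    """Reciprocal rank of the first relevant passage within the top k."""
    for rank, label in enumerate(labels[:k], start=1):
        if label:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(labels, k=10):
    """DCG of the top k divided by the DCG of an ideal ordering."""
    def dcg(ls):
        return sum(l / math.log2(rank + 1) for rank, l in enumerate(ls, start=1))

    ideal = dcg(sorted(labels, reverse=True)[:k])
    return dcg(labels[:k]) / ideal if ideal else 0.0


# Illustrative labels in reranked order: relevant passages at ranks 2 and 5.
reranked_labels = [0, 1, 0, 0, 1, 0]

print(average_precision(reranked_labels))
print(mrr_at_k(reranked_labels))
print(ndcg_at_k(reranked_labels))
```

Averaging average_precision over all queries gives map; the other two metrics are averaged over queries in the same way.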

Cross Encoder Nano BEIR

  • Dataset: NanoBEIR_R100_mean
  • Evaluated with CrossEncoderNanoBEIREvaluator with these parameters:
    {
        "dataset_names": [
            "msmarco",
            "nfcorpus",
            "nq"
        ],
        "rerank_k": 100,
        "at_k": 10,
        "always_rerank_positives": true
    }
    
| Metric  | Value            |
|---------|------------------|
| map     | 0.1328 (-0.2573) |
| mrr@10  | 0.1554 (-0.3126) |
| ndcg@10 | 0.1236 (-0.3318) |
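
The NanoBEIR_R100_mean values are consistent with an unweighted mean of the three per-dataset results reported under Cross Encoder Reranking. A small check of that arithmetic, using only the rounded numbers from this card (the variable names are illustrative):

```python
# Per-dataset values in the order NanoMSMARCO, NanoNFCorpus, NanoNQ.
per_dataset = {
    "map": [0.0722, 0.2861, 0.0400],
    "mrr@10": [0.0520, 0.3993, 0.0149],
    "ndcg@10": [0.0630, 0.2827, 0.0251],
}

# Unweighted mean over the three datasets, rounded to four decimals.
means = {
    name: round(sum(values) / len(values), 4)
    for name, values in per_dataset.items()
}
print(means)
```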

Training Details

Training Dataset

ms_marco

  • Dataset: ms_marco at a47ee7a
  • Size: 78,704 training samples
  • Columns: query, docs, and labels
  • Approximate statistics based on the first 1000 samples:
    |      | query           | docs          | labels        |
    |------|-----------------|---------------|---------------|
    | type | string          | list          | list          |
    | min  | 8 characters    | 3 elements    | 3 elements    |
    | mean | 33.9 characters | 6.50 elements | 6.50 elements |
    | max  | 96 characters   | 10 elements   | 10 elements   |
  • Samples:
    query docs labels
    cost to replace compressor in air conditioner ['Parts: $560 - $792. The average cost for an ac compressor replacement is between $764 to $1051. Labor costs are estimated between $204 to $259 while parts are priced between $560 to $792. Get a personalized estimate based on your location and specific car. Estimate does not include taxes and fees. ', "The cost of an AC compressor depends on the size of the unit, where you live, and who is going to install it. In general, a 2.5 ton compressor will cost between $1000 and $180 … 0 depending on the manufacturer and installer. One user said: You can find a 5 ton compressor for under $1000 online. If it's for a car, 200.00 & up/re manufactured.", "Depends how old the one getting replaced is... if you need to change your inside airhandler coil as well as the outdoor condensor (some of the new outdoor units won't get mate to old coils) it would be more. I got quotes for 3000-4000 here in phoenix, AZ for a 13 seer 2.5 ton AC unit. Compressors are the most expensive component in an A/c, but ev... [1, 0, 0, 0, 0, ...]
    how much more is a cpa cost for personal taxes ["Private firms and individual CPA's generally charge one of two ways either a flat rate, or an hourly rate. Hourly rates vary by city. The rates can be as low as $25 per hour, or as high as $300 per hour. Flat rate means you will pay one rate for the preparation of your taxes. For personal returns, which are more standardized, you can expect a total fee between $400 and $500 for your IRS 1040 return and associated forms. For business returns, some individual CPA's charge a sliding scale, based on receipts. For example, if your business has annual receipts of $500,000, they can charge a fee of $400. However, for a business with annual receipts of $1 million, they may charge $850 to file the return", "1 For a simple start-up, expect a minimum of 0.5-1.5 hours of consultation ($75-$600) to go over your business structure and basic tax issues. 1 You'll pay lower rates for routine work done by a less-experienced associate or lesser-trained employee, such as $30-$50 for bookkeeping services... [1, 0, 0, 0, 0, ...]
    what is the production of guinness ["Guinness (/ˈɡɪnɨs/) is an Irish dry stout that originated in the brewery of Arthur Guinness (1725–1803) at St. James's Gate, Dublin. Guinness is one of the most successful beer brands worldwide. It is brewed in almost 60 countries and is available in over 120. For a short time in the late 1990s, Guinness produced the St James's Gate range of craft-style beers, available in a small number of Dublin pubs. The beers were: Pilsner Gold, Wicked Red Ale, Wildcat Wheat Beer and Dark Angel Lager.", "The draught beer 's thick, creamy head comes from mixing the beer with nitrogen when poured. It is popular with the Irish both in Ireland and abroad, and, in spite of a decline in consumption since 2001, is still the best-selling alcoholic drink in Ireland where Guinness & Co. makes almost €2 billion annually. For a short time in the late 1990s, Guinness produced the St James's Gate range of craft-style beers, available in a small number of Dublin pubs. The beers were: Pilsner Gold, Wicked Red Al... [1, 0, 0, 0, 0, ...]
  • Loss: ListNetLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "mini_batch_size": 16
    }
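
As a rough illustration of what ListNetLoss optimizes, the sketch below implements the top-one ListNet objective from Cao et al. (2007) for a single query: the cross-entropy between the softmax of the labels and the softmax of the predicted scores. It is a plain-Python approximation written for this card, not the sentence-transformers implementation; the function names are assumptions, mini_batch_size is not modelled, and the Identity activation is represented by using the raw scores unchanged.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_loss(scores, labels):
    """Cross-entropy between softmax(labels) and softmax(scores) for one query."""
    target = softmax([float(l) for l in labels])
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# One query with five passages; only the first is labelled relevant.
labels = [1, 0, 0, 0, 0]

good = listnet_loss([3.0, 0.0, 0.0, 0.0, 0.0], labels)  # relevant passage scored highest
bad = listnet_loss([0.0, 0.0, 0.0, 0.0, 3.0], labels)   # irrelevant passage scored highest
print(good < bad)
```

Because the loss compares whole score distributions over a query's passages, each training sample is a query with its full list of docs and labels rather than a single pair.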
    

Evaluation Dataset

ms_marco

  • Dataset: ms_marco at a47ee7a
  • Size: 1,000 evaluation samples
  • Columns: query, docs, and labels
  • Approximate statistics based on the first 1000 samples:
    |      | query            | docs          | labels        |
    |------|------------------|---------------|---------------|
    | type | string           | list          | list          |
    | min  | 10 characters    | 3 elements    | 3 elements    |
    | mean | 33.74 characters | 7.11 elements | 7.11 elements |
    | max  | 93 characters    | 12 elements   | 12 elements   |
  • Samples:
    query docs labels
    what is the appropriate amount of carbs per day ['Percent of Calories. According to the Institute of Medicine, children and adults should consume 45 to 65 percent of their calorie intake as carbohydrates, and at least 130 grams of carbs per day.', 'This means women following 1,200-calorie weight loss diets need about 135 to 195 grams of carbs each day, women consuming 1,600-calorie diets need 180 to 260 grams, women following 2,000-calorie diets need 225 to 325 grams and women consuming 2,400 calories per day require 270 to 390 grams of carbohydrates each day.', 'Determine the number of grams of carbs you need each day by calculating 45 to 65 percent of your total calorie intake, and dividing by 4. For example, if you eat a 2,000-calorie diet, shoot for 225 to 325 grams of carbs per day; and if you eat 2,500 calories a day, aim for 281 to 406 grams of carbs.', 'The Dietary Guidelines for Americans recommends that carbohydrates make up 45 to 65 percent of your total daily calories. So, if you get 2,000 calories a day, between 900 and... [1, 0, 0, 0, 0, ...]
    can you cook strawberries ['Yup I used to make it myself, you take white cake mix make it like the box says and just add some freshly diced or sliced strawberries before pouring into the baking pans. If you want a strawberry flavored cake with fresh strawberries in it; then you mix the white cake mix with a box a regular strawberry jello mix. Mix it well till your dry cake mix is pink then go by the directions on the box for the cake mix and add the fresh strawberries last. Ive done both and both come out great. You actually can use white cake mix and any flavor jello for any flavor cake. innosint_lil_angel · 9 years ago.', 'To avoid this, cook and mash the strawberries, then strain them through a sieve. Then you can thicken this freshly strained strawberry juice with a bit of cornstarch and add sugar and the remaining ingredients called for in your recipe. You can make strawberry pie with a single or double crust -- as an open-faced pie or a covered pie. There is a super easy option here -- pre-made pie crust ... [1, 0, 0, 0, 0, ...]
    is coronary artery calcification ischemia ['Conditions that can cause myocardial ischemia include: 1 Coronary artery disease (atherosclerosis). 2 Plaques made up mostly of cholesterol build up on your artery walls and restrict blood flow. 3 Atherosclerosis is the most common cause of myocardial ischemia.', 'Note the large amount of calcium in the left anterior descending (LAD) and left circumflex arteries. Coronary artery calcification-CT. Section caudal to that in the previous image shows calcium in the left anterior descending (LAD) artery as it courses down the front of the heart.', 'No calcium (pink) is present in the LAD or diagonal branch. Coronary artery calcification-CT. Image obtained in a patient with a large amount of calcium in the left anterior descending (LAD) artery. Note that other hyperattenuating structures (eg, bone, calcified lymph nodes) are pink.', 'The measurement of coronary artery calcification (CAC) by computed tomography (CT) has received considerable attention for diagnosis and risk stratificatio... [1, 0, 0, 0, 0, ...]
  • Loss: ListNetLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "mini_batch_size": 16
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • seed: 12
  • bf16: True
  • load_best_model_at_end: True
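
To make the learning_rate and warmup_ratio settings concrete, here is a sketch of the resulting learning-rate curve. It assumes the linear scheduler listed under All Hyperparameters, a single device and no gradient accumulation, so that 78,704 samples at batch size 16 give 4,919 optimizer steps. The function linear_schedule_lr is a hypothetical re-implementation for illustration, not a call into the Transformers library.

```python
import math


def linear_schedule_lr(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Linear warmup from 0 to base_lr, then linear decay back to 0."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = (total_steps - step) / max(1, total_steps - warmup_steps)
    return base_lr * max(0.0, remaining)


# Assumption: one device, gradient_accumulation_steps = 1, one epoch.
total_steps = math.ceil(78_704 / 16)

for step in (0, 250, 492, 2500, total_steps):
    print(step, linear_schedule_lr(step, total_steps))
```

Under these assumptions the peak learning rate of 2e-05 is reached after roughly the first tenth of training and decays to zero at the final step.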

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 12
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional
  • router_mapping: {}
  • learning_rate_mapping: {}

Training Logs

| Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_R100_ndcg@10 | NanoNFCorpus_R100_ndcg@10 | NanoNQ_R100_ndcg@10 | NanoBEIR_R100_mean_ndcg@10 |
|-------|------|---------------|-----------------|--------------------------|---------------------------|---------------------|----------------------------|
| -1 | -1 | - | - | 0.0818 (-0.4587) | 0.2761 (-0.0490) | 0.0186 (-0.4820) | 0.1255 (-0.3299) |
| 0.0002 | 1 | 1.9804 | - | - | - | - | - |
| 0.0508 | 250 | 2.0893 | - | - | - | - | - |
| 0.1016 | 500 | 2.0953 | 2.0773 | 0.0387 (-0.5018) | 0.2372 (-0.0878) | 0.0759 (-0.4247) | 0.1173 (-0.3381) |
| 0.1525 | 750 | 2.0888 | - | - | - | - | - |
| **0.2033** | **1000** | **2.091** | **2.077** | **0.0630 (-0.4775)** | **0.2827 (-0.0424)** | **0.0251 (-0.4756)** | **0.1236 (-0.3318)** |
| 0.2541 | 1250 | 2.085 | - | - | - | - | - |
| 0.3049 | 1500 | 2.0964 | 2.0769 | 0.0361 (-0.5043) | 0.2754 (-0.0497) | 0.0216 (-0.4791) | 0.1110 (-0.3444) |
| 0.3558 | 1750 | 2.0892 | - | - | - | - | - |
| 0.4066 | 2000 | 2.0886 | 2.0766 | 0.0426 (-0.4978) | 0.2560 (-0.0690) | 0.0308 (-0.4698) | 0.1098 (-0.3455) |
| 0.4574 | 2250 | 2.0848 | - | - | - | - | - |
| 0.5082 | 2500 | 2.093 | 2.0762 | 0.0577 (-0.4827) | 0.2551 (-0.0699) | 0.0270 (-0.4737) | 0.1133 (-0.3421) |
| 0.5591 | 2750 | 2.0817 | - | - | - | - | - |
| 0.6099 | 3000 | 2.0862 | 2.0760 | 0.0382 (-0.5022) | 0.2470 (-0.0781) | 0.0292 (-0.4715) | 0.1048 (-0.3506) |
| 0.6607 | 3250 | 2.0865 | - | - | - | - | - |
| 0.7115 | 3500 | 2.0858 | 2.0758 | 0.0456 (-0.4948) | 0.2354 (-0.0896) | 0.0237 (-0.4769) | 0.1016 (-0.3538) |
| 0.7624 | 3750 | 2.0845 | - | - | - | - | - |
| 0.8132 | 4000 | 2.0841 | 2.0756 | 0.0576 (-0.4828) | 0.2247 (-0.1003) | 0.0229 (-0.4778) | 0.1017 (-0.3536) |
| 0.8640 | 4250 | 2.0884 | - | - | - | - | - |
| 0.9148 | 4500 | 2.0893 | 2.0756 | 0.0440 (-0.4964) | 0.2211 (-0.1040) | 0.0351 (-0.4655) | 0.1001 (-0.3553) |
| 0.9656 | 4750 | 2.0845 | - | - | - | - | - |
| -1 | -1 | - | - | 0.0630 (-0.4775) | 0.2827 (-0.0424) | 0.0251 (-0.4756) | 0.1236 (-0.3318) |
  • The bold row denotes the saved checkpoint.
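
The final row of the log repeats the step 1000 metrics, which fits load_best_model_at_end: True restoring that checkpoint. The card does not state which metric was used for the selection, so the sketch below assumes it was the highest NanoBEIR_R100_mean_ndcg@10 among the in-training evaluations; the dictionary simply copies those values from the log.

```python
# NanoBEIR_R100_mean_ndcg@10 at each in-training evaluation step (from the log).
eval_log = {
    500: 0.1173,
    1000: 0.1236,
    1500: 0.1110,
    2000: 0.1098,
    2500: 0.1133,
    3000: 0.1048,
    3500: 0.1016,
    4000: 0.1017,
    4500: 0.1001,
}

# Assumed selection rule: keep the step with the highest mean NDCG@10.
best_step = max(eval_log, key=eval_log.get)
print(best_step, eval_log[best_step])
```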

Framework Versions

  • Python: 3.10.18
  • Sentence Transformers: 5.0.0
  • Transformers: 4.56.0.dev0
  • PyTorch: 2.7.1+cu126
  • Accelerate: 1.9.0
  • Datasets: 4.0.0
  • Tokenizers: 0.21.4

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

ListNetLoss

@inproceedings{cao2007learning,
    title={Learning to Rank: From Pairwise Approach to Listwise Approach},
    author={Cao, Zhe and Qin, Tao and Liu, Tie-Yan and Tsai, Ming-Feng and Li, Hang},
    booktitle={Proceedings of the 24th international conference on Machine learning},
    pages={129--136},
    year={2007}
}

Model size: 5.97M params (Safetensors, tensor type F32)