CrossEncoder based on bansalaman18/bert-uncased_L-12_H-512_A-8
This is a Cross Encoder model fine-tuned from bansalaman18/bert-uncased_L-12_H-512_A-8 on the ms_marco dataset using the sentence-transformers library. It computes a relevance score for each pair of texts, which can be used for text reranking and semantic search.
Model Details
Model Description
- Model Type: Cross Encoder
- Base model: bansalaman18/bert-uncased_L-12_H-512_A-8
- Maximum Sequence Length: 512 tokens
- Number of Output Labels: 1 label
- Training Dataset: ms_marco
- Language: en
Model Sources
- Documentation: Sentence Transformers Documentation
- Documentation: Cross Encoder Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Cross Encoders on Hugging Face
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import CrossEncoder
# Download from the 🤗 Hub
model = CrossEncoder("rahulseetharaman/reranker-msmarco-v1.1-bert-uncased_L-12_H-512_A-8-listnet")
# Get scores for pairs of texts
pairs = [
['what is lactate dehydrogenase', 'Lactate dehydrogenase (LDH) is an enzyme that helps facilitate the process of turning sugar into energy for your cells to use. LDH is present in many kinds of organs and tissues throughout the body, including the liver, heart, pancreas, kidneys, skeletal muscles, brain, and blood cells. When illness or injury damages your cells, LDH may be released into the bloodstream, causing the level of LDH in your blood to rise.'],
['what is lactate dehydrogenase', 'A lactate dehydrogenase (LDH or LD) is an enzyme found in nearly all living cells (animals, plants, and prokaryotes). LDH catalyzes the conversion of pyruvate to lactate and back, as it converts NADH to NAD + and back. A dehydrogenase is an enzyme that transfers a hydride from one molecule to another. LDH exist in four distinct enzyme classes. This article is about the common NAD(P)-dependent L-lactate dehydrogenase. Tissue breakdown releases LDH, and therefore LDH can be measured as a surrogate for tissue breakdown, e.g. hemolysis. LDH is measured by the lactate dehydrogenase (LDH) test (also known as the LDH test or Lactic acid dehydrogenase test).'],
['what is lactate dehydrogenase', 'Lactic Acid Dehydrogenase (LDH). Guide. Lactic acid dehydrogenase (LDH) is an enzyme that helps produce energy. It is present in almost all of the tissues in the body and its levels rise in response to cell damage. LDH levels are measured from a sample of blood taken from a vein. '],
['what is lactate dehydrogenase', 'Lactate dehydrogenase deficiency is a condition that affects how the body breaks down sugar to use as energy in cells, primarily muscle cells. There are two types of this condition: lactate dehydrogenase-A deficiency (sometimes called glycogen storage disease XI) and lactate dehydrogenase-B deficiency. In some people with lactate dehydrogenase-A deficiency, high-intensity exercise or other strenuous activity leads to the breakdown of muscle tissue (rhabdomyolysis). The destruction of muscle tissue releases a protein called myoglobin, which is processed by the kidneys and released in the urine (myoglobinuria).'],
['what is lactate dehydrogenase', 'Summary. The protein encoded by this gene catalyzes the conversion of L-lactate and NAD to pyruvate and NADH in the final step of anaerobic glycolysis. The protein is found predominantly in muscle tissue and belongs to the lactate dehydrogenase family. Mutations in this gene have been linked to exertional myoglobinuria. '],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)
# Or rank different texts based on similarity to a single text
ranks = model.rank(
'what is lactate dehydrogenase',
[
'Lactate dehydrogenase (LDH) is an enzyme that helps facilitate the process of turning sugar into energy for your cells to use. LDH is present in many kinds of organs and tissues throughout the body, including the liver, heart, pancreas, kidneys, skeletal muscles, brain, and blood cells. When illness or injury damages your cells, LDH may be released into the bloodstream, causing the level of LDH in your blood to rise.',
'A lactate dehydrogenase (LDH or LD) is an enzyme found in nearly all living cells (animals, plants, and prokaryotes). LDH catalyzes the conversion of pyruvate to lactate and back, as it converts NADH to NAD + and back. A dehydrogenase is an enzyme that transfers a hydride from one molecule to another. LDH exist in four distinct enzyme classes. This article is about the common NAD(P)-dependent L-lactate dehydrogenase. Tissue breakdown releases LDH, and therefore LDH can be measured as a surrogate for tissue breakdown, e.g. hemolysis. LDH is measured by the lactate dehydrogenase (LDH) test (also known as the LDH test or Lactic acid dehydrogenase test).',
'Lactic Acid Dehydrogenase (LDH). Guide. Lactic acid dehydrogenase (LDH) is an enzyme that helps produce energy. It is present in almost all of the tissues in the body and its levels rise in response to cell damage. LDH levels are measured from a sample of blood taken from a vein. ',
'Lactate dehydrogenase deficiency is a condition that affects how the body breaks down sugar to use as energy in cells, primarily muscle cells. There are two types of this condition: lactate dehydrogenase-A deficiency (sometimes called glycogen storage disease XI) and lactate dehydrogenase-B deficiency. In some people with lactate dehydrogenase-A deficiency, high-intensity exercise or other strenuous activity leads to the breakdown of muscle tissue (rhabdomyolysis). The destruction of muscle tissue releases a protein called myoglobin, which is processed by the kidneys and released in the urine (myoglobinuria).',
'Summary. The protein encoded by this gene catalyzes the conversion of L-lactate and NAD to pyruvate and NADH in the final step of anaerobic glycolysis. The protein is found predominantly in muscle tissue and belongs to the lactate dehydrogenase family. Mutations in this gene have been linked to exertional myoglobinuria. ',
]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
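Each entry returned by `model.rank` refers to a document by its position in the input list (`corpus_id`). The following is a minimal sketch of mapping those entries back to text and keeping the best few. `top_k_documents` is a hypothetical helper, and the `ranks` values shown are made up for illustration, not produced by this model.

```python
from typing import Dict, List, Tuple, Union

RankEntry = Dict[str, Union[int, float]]


def top_k_documents(
    ranks: List[RankEntry], documents: List[str], k: int = 3
) -> List[Tuple[str, float]]:
    """Return the k highest-scoring (document, score) pairs."""
    # Sort defensively by score, highest first, in case the input is unordered.
    ordered = sorted(ranks, key=lambda entry: entry["score"], reverse=True)
    return [
        (documents[int(entry["corpus_id"])], float(entry["score"]))
        for entry in ordered[:k]
    ]


# Illustrative data only: scores are invented for this sketch.
documents = ["doc a", "doc b", "doc c"]
ranks = [
    {"corpus_id": 2, "score": 0.91},
    {"corpus_id": 0, "score": 0.40},
    {"corpus_id": 1, "score": 0.05},
]
top = top_k_documents(ranks, documents, k=2)
print(top)
```

The library's `rank` method also accepts `top_k` and `return_documents` arguments, which may make a helper like this unnecessary in practice.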
Evaluation
Metrics
Cross Encoder Reranking
- Datasets: NanoMSMARCO_R100, NanoNFCorpus_R100 and NanoNQ_R100
- Evaluated with CrossEncoderRerankingEvaluator with these parameters: { "at_k": 10, "always_rerank_positives": true }
Metric | NanoMSMARCO_R100 | NanoNFCorpus_R100 | NanoNQ_R100 |
---|---|---|---|
map | 0.0687 (-0.4208) | 0.2704 (+0.0094) | 0.0560 (-0.3636) |
mrr@10 | 0.0446 (-0.4329) | 0.4075 (-0.0923) | 0.0367 (-0.3900) |
ndcg@10 | 0.0620 (-0.4785) | 0.2659 (-0.0591) | 0.0681 (-0.4325) |
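For readers unfamiliar with the three metrics in the table, here is a rough sketch of how each can be computed for a single query, given the relevance labels of the documents in reranked order. The function names are hypothetical, and the evaluator's own implementation may differ in details such as tie handling. The reported figures are averages over all queries in a dataset.

```python
import math
from typing import Sequence


def reciprocal_rank_at_k(ranked_labels: Sequence[int], k: int = 10) -> float:
    """1 / position of the first relevant document within the top k, else 0."""
    for position, label in enumerate(ranked_labels[:k], start=1):
        if label > 0:
            return 1.0 / position
    return 0.0


def average_precision(ranked_labels: Sequence[int]) -> float:
    """Mean of precision values taken at each relevant document's position."""
    hits = 0
    total = 0.0
    for position, label in enumerate(ranked_labels, start=1):
        if label > 0:
            hits += 1
            total += hits / position
    return total / hits if hits else 0.0


def ndcg_at_k(ranked_labels: Sequence[int], k: int = 10) -> float:
    """Discounted cumulative gain of the ranking divided by the ideal DCG."""

    def dcg(labels: Sequence[int]) -> float:
        return sum(
            label / math.log2(position + 1)
            for position, label in enumerate(labels[:k], start=1)
        )

    ideal = dcg(sorted(ranked_labels, reverse=True))
    return dcg(ranked_labels) / ideal if ideal > 0 else 0.0


# Illustrative ranking: relevant documents at positions 2 and 4.
ranked_labels = [0, 1, 0, 1, 0]
print(reciprocal_rank_at_k(ranked_labels))  # 0.5
print(average_precision(ranked_labels))     # 0.5
print(ndcg_at_k(ranked_labels))
```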
Cross Encoder Nano BEIR
- Dataset: NanoBEIR_R100_mean
- Evaluated with CrossEncoderNanoBEIREvaluator with these parameters: { "dataset_names": [ "msmarco", "nfcorpus", "nq" ], "rerank_k": 100, "at_k": 10, "always_rerank_positives": true }
Metric | Value |
---|---|
map | 0.1317 (-0.2583) |
mrr@10 | 0.1629 (-0.3051) |
ndcg@10 | 0.1320 (-0.3234) |
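The NanoBEIR_R100_mean values appear to be the plain arithmetic mean of the three per-dataset scores reported under Cross Encoder Reranking. A short check using the numbers from that table:

```python
# Per-dataset scores in the order NanoMSMARCO, NanoNFCorpus, NanoNQ,
# copied from the Cross Encoder Reranking table.
per_dataset = {
    "map": [0.0687, 0.2704, 0.0560],
    "mrr@10": [0.0446, 0.4075, 0.0367],
    "ndcg@10": [0.0620, 0.2659, 0.0681],
}

# Unweighted mean across datasets, rounded to four decimals like the table.
means = {
    metric: round(sum(values) / len(values), 4)
    for metric, values in per_dataset.items()
}

for metric, value in means.items():
    print(f"{metric}: {value:.4f}")
```

The results match the table values 0.1317, 0.1629 and 0.1320 to four decimals.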
Training Details
Training Dataset
ms_marco
- Dataset: ms_marco at a47ee7a
- Size: 78,704 training samples
- Columns: query, docs, and labels
- Approximate statistics based on the first 1000 samples:
Column | Type | Min | Mean | Max |
---|---|---|---|---|
query | string | 11 characters | 34.03 characters | 103 characters |
docs | list | 4 elements | 7.00 elements | 10 elements |
labels | list | 4 elements | 7.00 elements | 10 elements |
- Samples (each listed as query, docs, labels):
define wear
['Wear is related to interactions between surfaces and specifically the removal and deformation of material on a surface as a result of mechanical action of the opposite surface.', 'n a loss of tooth substance in contact areas through functional wear and friction, resulting in broadening and flattening of the contacts and a decrease in the mesiodistal dimension of the teeth and the dentition as a whole. wear, occlusal, n attritional loss of substance on opposing occlusal units or surfaces.', 'Wear is defined as to have on the body or to reduce the quality of the appearance by constant use. 1 An example of wear is to have on a pair of sunglasses. 2 An example of wear is to wear a hole in the elbow of a jacket.', 'Street Wear. Street wear is defined as west coast skateboarding styles. A lot of street wear companies are based out of the west coast and focus on the styles a classic skateboarder would wear. This includes fitted pants, normally classic vans, screen printed large tees, and ...
[1, 0, 0, 0, 0, ...]
eschooltoday stem cells
['In genetic terms, stem cells are cells in the embryo that are not specialized. After fertilization, there are two types of cells in the embryo. Specialized cells: These are the cells modified with clearly defined instructions or tasks. They are the cells that go on to define set things like taste, hearing, sex and the like. As they divide and grow, they do NOT change into any kind of cell. These are cells in the embryo (just after fertilization), usually obtained from human embryos that are a few days old and are left over from human fertility treatments. These are somewhat ‘generic cells’ and can grow into any of the about 250 cell types in the human body. This type is called Stem Cell.', "Photosynthesis is a chemical process through which plants, some bacteria and algae, produce glucose and oxygen from carbon dioxide and water, using only light as a source of energy. This process is extremely important for life on earth as it provides the oxygen that all other life depend on. Just ...
[1, 0, 0, 0, 0, ...]
does a presidential candidate have to be born in the u s
["Republican U.S. Sen. Ted Cruz, a Tea Party favorite who is widely seen as a potential presidential candidate in the 2016 election, was born in Calgary, Canada. Because his mother was a citizen of the United States, Cruz has maintained he also is a natural born citizen of the United States. You don't have to be born in the United States to be eligible to serve as president of the United States as long as one of more of your parents were American citizens at the time of birth, it is commonly held. The Congressional Research Service concluded in 2011 :", 'His mother was born in Delaware. The family returned to the United States when Cruz was 4. The Constitution gives three eligibility requirements to be president: one must be 35 years of age, a resident within the United States for 14 years, and a natural born Citizen, a term not defined in the Constitution.', "Conventional wisdom holds that candidates for president must be born on U.S. soil to serve in the highest office in the land. T...
[1, 0, 0, 0, 0, ...]
- Loss: ListNetLoss with these parameters: { "activation_fn": "torch.nn.modules.linear.Identity", "mini_batch_size": 16 }
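As background on the loss, here is a minimal pure-Python sketch of the top-one ListNet objective described by Cao et al. (2007): the cross entropy between the softmax of the labels and the softmax of the predicted scores for one query's document list. This is an illustration of the idea under that assumption, not the library's `ListNetLoss` implementation, which operates on batched tensors and may differ in normalization details. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of numbers."""
    largest = max(values)
    exps = [math.exp(v - largest) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def listnet_top1_loss(scores: Sequence[float], labels: Sequence[float]) -> float:
    """Cross entropy between softmax(labels) and softmax(scores) for one list."""
    target = softmax(labels)
    predicted = softmax(scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


# One query with four candidate documents; only the first is relevant.
labels = [1, 0, 0, 0]
loss_good = listnet_top1_loss([3.0, 0.0, 0.0, 0.0], labels)     # relevant doc on top
loss_bad = listnet_top1_loss([0.0, 0.0, 0.0, 3.0], labels)      # irrelevant doc on top
loss_uniform = listnet_top1_loss([0.0, 0.0, 0.0, 0.0], labels)  # no preference
print(loss_good, loss_bad, loss_uniform)
```

In this formulation, scoring every document identically yields a loss of log(n) for a list of n documents, and the loss is lowest when the score distribution matches the label distribution.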
Evaluation Dataset
ms_marco
- Dataset: ms_marco at a47ee7a
- Size: 1,000 evaluation samples
- Columns: query, docs, and labels
- Approximate statistics based on the first 1000 samples:
Column | Type | Min | Mean | Max |
---|---|---|---|---|
query | string | 11 characters | 33.88 characters | 105 characters |
docs | list | 2 elements | 6.00 elements | 10 elements |
labels | list | 2 elements | 6.00 elements | 10 elements |
- Samples (each listed as query, docs, labels):
what is lactate dehydrogenase
['Lactate dehydrogenase (LDH) is an enzyme that helps facilitate the process of turning sugar into energy for your cells to use. LDH is present in many kinds of organs and tissues throughout the body, including the liver, heart, pancreas, kidneys, skeletal muscles, brain, and blood cells. When illness or injury damages your cells, LDH may be released into the bloodstream, causing the level of LDH in your blood to rise.', 'A lactate dehydrogenase (LDH or LD) is an enzyme found in nearly all living cells (animals, plants, and prokaryotes). LDH catalyzes the conversion of pyruvate to lactate and back, as it converts NADH to NAD + and back. A dehydrogenase is an enzyme that transfers a hydride from one molecule to another. LDH exist in four distinct enzyme classes. This article is about the common NAD(P)-dependent L-lactate dehydrogenase. Tissue breakdown releases LDH, and therefore LDH can be measured as a surrogate for tissue breakdown, e.g. hemolysis. LDH is measured by the lactate dehy...
[1, 0, 0, 0, 0, ...]
how is platinum produced
['Platinum is found uncombined in alluvial deposits. Most commercially produced platinum comes from South Africa, from the mineral cooperite (platinum sulfide). Some platinum is prepared as a by-product of copper and nickel refining. Platinum is used in the chemicals industry as a catalyst for the production of nitric acid, silicone and benzene. It is also used as a catalyst to improve the efficiency of fuel cells. The electronics industry uses platinum for computer hard disks and thermocouples.', 'Platinum is a chemical element with symbol Pt and atomic number 78. It is a dense, malleable, ductile, highly unreactive, precious, gray-white transition metal. Platinum has six naturally occurring isotopes: 190 Pt, 192 Pt, 194 Pt, 195 Pt, 196 Pt, and 198 Pt. The most abundant of these is 195 Pt, comprising 33.83% of all platinum.', 'Welcome to Platinum Produce. Platinum Produce Company is a hydroponic greenhouse located in Blenheim, Ontario. Platinum Produce was started in 2003 and has grow...
[1, 0, 0, 0, 0, ...]
accounting process definition
['(The Accounting Cycle). The accounting process is a series of activities that begins with a transaction and ends with the closing of the books. Because this process is repeated each reporting period, it is referred to as the accounting cycle and includes these major steps: Identify the transaction or other recognizable event. ', 'Closing Process. The accounting closing process, also called closing the books, is the steps required to prepare accounts for financial statement preparation and the start of the next accounting period. The closing process consists of steps to transfer temporary account balances to permanent accounts and make the general ledger ready for the next accounting period. The closing process consists of three main steps: 1 Identify temporary accounts that need to be close', 'The steps required for individual transactions in the accounting process are: 1 Identify the transaction. 2 First, determine what kind of transaction it may be. 3 Examples are buying goods ...
[1, 0, 0, 0, 0, ...]
- Loss: ListNetLoss with these parameters: { "activation_fn": "torch.nn.modules.linear.Identity", "mini_batch_size": 16 }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- learning_rate: 2e-05
- num_train_epochs: 1
- warmup_ratio: 0.1
- seed: 12
- bf16: True
- load_best_model_at_end: True
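These settings, together with the training set size, determine the length of the run. The following back-of-the-envelope sketch assumes a single device (the device count is not stated in this card) and uses `gradient_accumulation_steps = 1` from the full hyperparameter list. The warmup calculation assumes the step count is the ratio times the total steps, rounded up.

```python
import math

# Values taken from this model card.
train_samples = 78_704
per_device_train_batch_size = 16
gradient_accumulation_steps = 1
num_train_epochs = 1
warmup_ratio = 0.1

# Assumption: training ran on one device.
num_devices = 1

effective_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)
steps_per_epoch = math.ceil(train_samples / effective_batch_size)
total_steps = steps_per_epoch * num_train_epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)

print(steps_per_epoch, total_steps, warmup_steps)  # 4919 4919 492
```

A figure of 4,919 steps per epoch is consistent with the Training Logs table, where step 250 corresponds to epoch 0.0508 (250 / 4919 ≈ 0.0508).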
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 1
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 12
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: proportional
- router_mapping: {}
- learning_rate_mapping: {}
Training Logs
Epoch | Step | Training Loss | Validation Loss | NanoMSMARCO_R100_ndcg@10 | NanoNFCorpus_R100_ndcg@10 | NanoNQ_R100_ndcg@10 | NanoBEIR_R100_mean_ndcg@10 |
---|---|---|---|---|---|---|---|
-1 | -1 | - | - | 0.0359 (-0.5045) | 0.2933 (-0.0318) | 0.0371 (-0.4635) | 0.1221 (-0.3333) |
0.0002 | 1 | 2.173 | - | - | - | - | - |
0.0508 | 250 | 2.089 | - | - | - | - | - |
0.1016 | 500 | 2.0896 | 2.0896 | 0.0416 (-0.4988) | 0.2979 (-0.0271) | 0.0164 (-0.4842) | 0.1186 (-0.3367) |
0.1525 | 750 | 2.0909 | - | - | - | - | - |
0.2033 | 1000 | 2.095 | 2.0888 | 0.0521 (-0.4883) | 0.2484 (-0.0767) | 0.0736 (-0.4270) | 0.1247 (-0.3307) |
0.2541 | 1250 | 2.0841 | - | - | - | - | - |
0.3049 | 1500 | 2.0862 | 2.0881 | 0.0527 (-0.4877) | 0.2643 (-0.0607) | 0.0590 (-0.4416) | 0.1253 (-0.3300) |
0.3558 | 1750 | 2.0871 | - | - | - | - | - |
0.4066 | 2000 | 2.0885 | 2.0878 | 0.0547 (-0.4857) | 0.2587 (-0.0663) | 0.0693 (-0.4314) | 0.1276 (-0.3278) |
0.4574 | 2250 | 2.085 | - | - | - | - | - |
0.5082 | 2500 | 2.0898 | 2.0878 | 0.0459 (-0.4945) | 0.2493 (-0.0757) | 0.0521 (-0.4485) | 0.1158 (-0.3396) |
0.5591 | 2750 | 2.0835 | - | - | - | - | - |
0.6099 | 3000 | 2.0882 | 2.0884 | 0.0648 (-0.4756) | 0.2549 (-0.0701) | 0.0567 (-0.4440) | 0.1255 (-0.3299) |
0.6607 | 3250 | 2.0868 | - | - | - | - | - |
0.7115 | 3500 | 2.0845 | 2.0872 | 0.0679 (-0.4725) | 0.2479 (-0.0772) | 0.0692 (-0.4314) | 0.1283 (-0.3270) |
0.7624 | 3750 | 2.0886 | - | - | - | - | - |
0.8132 | 4000 | 2.0827 | 2.0873 | 0.0635 (-0.4769) | 0.2589 (-0.0661) | 0.0699 (-0.4308) | 0.1308 (-0.3246) |
0.8640 | 4250 | 2.0852 | - | - | - | - | - |
**0.9148** | **4500** | **2.0838** | **2.0871** | **0.0620 (-0.4785)** | **0.2659 (-0.0591)** | **0.0681 (-0.4325)** | **0.1320 (-0.3234)** |
0.9656 | 4750 | 2.0831 | - | - | - | - | - |
-1 | -1 | - | - | 0.0620 (-0.4785) | 0.2659 (-0.0591) | 0.0681 (-0.4325) | 0.1320 (-0.3234) |
- The bold row denotes the saved checkpoint.
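With `load_best_model_at_end` enabled, the saved checkpoint is the one that scored best on the selection metric. The card does not list `metric_for_best_model`, so the sketch below assumes selection by lowest validation loss. Under that assumption, the step it picks is the one whose metrics match the final evaluation row of the table.

```python
# (step, validation loss) pairs copied from the Training Logs table.
validation_log = [
    (500, 2.0896),
    (1000, 2.0888),
    (1500, 2.0881),
    (2000, 2.0878),
    (2500, 2.0878),
    (3000, 2.0884),
    (3500, 2.0872),
    (4000, 2.0873),
    (4500, 2.0871),
]

# Assumption: the best checkpoint is the one with the lowest validation loss.
best_step, best_loss = min(validation_log, key=lambda entry: entry[1])
print(best_step, best_loss)  # 4500 2.0871
```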
Framework Versions
- Python: 3.10.18
- Sentence Transformers: 5.0.0
- Transformers: 4.56.0.dev0
- PyTorch: 2.7.1+cu126
- Accelerate: 1.9.0
- Datasets: 4.0.0
- Tokenizers: 0.21.4
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
ListNetLoss
@inproceedings{cao2007learning,
title={Learning to Rank: From Pairwise Approach to Listwise Approach},
author={Cao, Zhe and Qin, Tao and Liu, Tie-Yan and Tsai, Ming-Feng and Li, Hang},
booktitle={Proceedings of the 24th international conference on Machine learning},
pages={129--136},
year={2007}
}