metadata
base_model: answerdotai/ModernBERT-base
datasets:
  - lightonai/ms-marco-en-bge
language:
  - en
library_name: PyLate
pipeline_tag: sentence-similarity
tags:
  - ColBERT
  - PyLate
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:808728
  - loss:Distillation

PyLate model based on answerdotai/ModernBERT-base

This is a PyLate model finetuned from answerdotai/ModernBERT-base on the train dataset. It maps sentences & paragraphs to sequences of 128-dimensional dense vectors and can be used for semantic textual similarity using the MaxSim operator.
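
For reference, the MaxSim operator scores a query against a document by summing, over the query's token embeddings, each token's maximum similarity to any document token embedding. A minimal sketch of that scoring function in plain PyTorch (illustrative only, not the PyLate internals):

import torch

def maxsim_score(query_embeddings: torch.Tensor, document_embeddings: torch.Tensor) -> torch.Tensor:
    """Late-interaction (MaxSim) score between one query and one document.

    query_embeddings:    (num_query_tokens, 128) token vectors
    document_embeddings: (num_doc_tokens, 128) token vectors
    """
    # Token-level similarity matrix of shape (num_query_tokens, num_doc_tokens)
    similarities = query_embeddings @ document_embeddings.T
    # For each query token, keep its best-matching document token, then sum over query tokens
    return similarities.max(dim=1).values.sum()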

Model Details

Model Description

  • Model Type: PyLate model
  • Base model: answerdotai/ModernBERT-base
  • Document Length: 180 tokens
  • Query Length: 32 tokens
  • Output Dimensionality: 128 dimensions per token
  • Similarity Function: MaxSim
  • Training Dataset: train (lightonai/ms-marco-en-bge)
  • Language: en

Model Sources

  • Repository: PyLate on GitHub (https://github.com/lightonai/pylate)

Full Model Architecture

ColBERT(
  (0): Transformer({'max_seq_length': 179, 'do_lower_case': False}) with Transformer model: ModernBertModel 
  (1): Dense({'in_features': 768, 'out_features': 128, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
)
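
Concretely, the Transformer produces one 768-dimensional hidden state per token and the Dense layer projects each of them down to 128 dimensions, so encoding returns one 128-dimensional vector per token rather than a single pooled vector. A quick shape check, assuming encode returns one token-level embedding array per input as in the usage examples below (pylate_model_id is the same placeholder used there):

from pylate import models

model = models.ColBERT(model_name_or_path=pylate_model_id)

embeddings = model.encode(["a short example sentence"], is_query=False)
# One entry per input text; each entry holds one 128-dimensional vector per token
print(embeddings[0].shape)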

Usage

First install the PyLate library:

pip install -U pylate

Retrieval

PyLate provides a streamlined interface to index and retrieve documents using ColBERT models. The index leverages the Voyager HNSW index to efficiently handle document embeddings and enable fast retrieval.

Indexing documents

First, load the ColBERT model and initialize the Voyager index, then encode and index your documents:

from pylate import indexes, models, retrieve

# Step 1: Load the ColBERT model
model = models.ColBERT(
    model_name_or_path=pylate_model_id,
)

# Step 2: Initialize the Voyager index
index = indexes.Voyager(
    index_folder="pylate-index",
    index_name="index",
    override=True,  # This overwrites the existing index if any
)

# Step 3: Encode the documents
documents_ids = ["1", "2", "3"]
documents = ["document 1 text", "document 2 text", "document 3 text"]

documents_embeddings = model.encode(
    documents,
    batch_size=32,
    is_query=False,  # Ensure that it is set to False to indicate that these are documents, not queries
    show_progress_bar=True,
)

# Step 4: Add document embeddings to the index by providing embeddings and corresponding ids
index.add_documents(
    documents_ids=documents_ids,
    documents_embeddings=documents_embeddings,
)

Note that you do not have to recreate the index and encode the documents every time. Once you have created an index and added the documents, you can re-use the index later by loading it:

# To load an index, simply instantiate it with the correct folder/name and without overriding it
index = indexes.Voyager(
    index_folder="pylate-index",
    index_name="index",
)

Retrieving top-k documents for queries

Once the documents are indexed, you can retrieve the top-k most relevant documents for a given set of queries. To do so, initialize the ColBERT retriever with the index you want to search in, encode the queries, and then retrieve the top-k documents to get the ids and relevance scores of the best matches:

# Step 1: Initialize the ColBERT retriever
retriever = retrieve.ColBERT(index=index)

# Step 2: Encode the queries
queries_embeddings = model.encode(
    ["query for document 3", "query for document 1"],
    batch_size=32,
    is_query=True,  # Ensure that it is set to True to indicate that these are queries, not documents
    show_progress_bar=True,
)

# Step 3: Retrieve top-k documents
scores = retriever.retrieve(
    queries_embeddings=queries_embeddings, 
    k=10,  # Retrieve the top 10 matches for each query
)
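
The returned object holds, for each query, a ranked list of matches. A hedged sketch of how the results might be inspected, assuming each match exposes a document id and a relevance score as in the PyLate examples:

# scores[i] holds the ranked matches for the i-th query
for query_idx, matches in enumerate(scores):
    print(f"Query {query_idx}:")
    for match in matches:
        # Each match is expected to carry the document id and its MaxSim relevance score
        print(f"  id={match['id']}  score={match['score']:.4f}")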

Reranking

If you only want to use the ColBERT model to rerank the results of a first-stage retrieval pipeline without building an index, you can simply use the rank function and pass the queries and documents to rerank:

from pylate import rank, models

queries = [
    "query A",
    "query B",
]

documents = [
    ["document A", "document B"],
    ["document 1", "document C", "document B"],
]

documents_ids = [
    [1, 2],
    [1, 3, 2],
]

model = models.ColBERT(
    model_name_or_path=pylate_model_id,
)

queries_embeddings = model.encode(
    queries,
    is_query=True,
)

documents_embeddings = model.encode(
    documents,
    is_query=False,
)

reranked_documents = rank.rerank(
    documents_ids=documents_ids,
    queries_embeddings=queries_embeddings,
    documents_embeddings=documents_embeddings,
)
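
As with retrieval, the output keeps one list per query, with the candidate documents reordered by their MaxSim scores. A small sketch of how the result might be inspected, assuming each entry carries the original document id and its score:

for query, ranking in zip(queries, reranked_documents):
    print(f"Reranked results for: {query}")
    for entry in ranking:
        print(f"  id={entry['id']}  score={entry['score']:.4f}")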

Training Details

Training Dataset

train

  • Dataset: train at 11e6ffa
  • Size: 808,728 training samples
  • Columns: query_id, document_ids, and scores
  • Approximate statistics based on the first 1000 samples:
    • query_id: string (min: 5 tokens, mean: 5.59 tokens, max: 6 tokens)
    • document_ids: list (size: 32 elements)
    • scores: list (size: 32 elements)
  • Samples:
    query_id document_ids scores
    121352 ['2259784', '4923159', '40211', '1545154', '8527175', ...] [0.2343463897705078, 0.639204204082489, 0.3806908428668976, 0.5623092651367188, 0.8051995635032654, ...]
    634306 ['7723525', '1874779', '379307', '2738583', '7599583', ...] [0.7124203443527222, 0.7379189729690552, 0.5786551237106323, 0.6142299175262451, 0.6755089163780212, ...]
    920825 ['5976297', '2866112', '3560294', '3285659', '4706740', ...] [0.6462352871894836, 0.7880821228027344, 0.791019856929779, 0.7709633111953735, 0.8284491300582886, ...]
  • Loss: pylate.losses.distillation.Distillation
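
Each training sample pairs a query with 32 candidate documents and a teacher relevance score per candidate; the distillation loss pushes the student's MaxSim scores toward the teacher's score distribution. A conceptual sketch of such an objective using a KL divergence between softmaxed teacher and student scores (this illustrates the idea, not PyLate's exact implementation):

import torch
import torch.nn.functional as F

def distillation_loss(student_scores: torch.Tensor, teacher_scores: torch.Tensor) -> torch.Tensor:
    """student_scores, teacher_scores: (batch_size, num_candidates), e.g. (4, 32).

    Conceptual sketch: align the student's score distribution over the candidate
    documents with the teacher's distribution via KL divergence.
    """
    student_log_probs = F.log_softmax(student_scores, dim=-1)
    teacher_probs = F.softmax(teacher_scores, dim=-1)
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean")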

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 4
  • gradient_accumulation_steps: 4
  • learning_rate: 8e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.05
  • bf16: True
  • tf32: True
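
With per_device_train_batch_size=4 and gradient_accumulation_steps=4, the effective batch size is 16 queries per optimizer step, each paired with 32 scored candidate documents. As a sketch, these non-default values map onto the standard Sentence Transformers training arguments roughly as follows (output_dir is a placeholder; the rest of the trainer setup is not shown in this card):

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output/modernbert-colbert",  # placeholder path
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # effective batch size: 4 * 4 = 16
    learning_rate=8e-5,
    num_train_epochs=1,
    warmup_ratio=0.05,
    bf16=True,
    tf32=True,
)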

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 4
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 8e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.05
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: True
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
0.0020 100 0.0524
0.0040 200 0.0482
0.0059 300 0.0464
0.0079 400 0.043
0.0099 500 0.0387
0.0119 600 0.0383
0.0138 700 0.0345
0.0158 800 0.0307
0.0178 900 0.0294
0.0198 1000 0.0275
0.0218 1100 0.0271
0.0237 1200 0.0264
0.0257 1300 0.0258
0.0277 1400 0.0246
0.0297 1500 0.0239
0.0317 1600 0.023
0.0336 1700 0.0216
0.0356 1800 0.0282
0.0376 1900 0.0211
0.0396 2000 0.0205
0.0415 2100 0.0197
0.0435 2200 0.0187
0.0455 2300 0.0184
0.0475 2400 0.0177
0.0495 2500 0.0179
0.0514 2600 0.0173
0.0534 2700 0.0169
0.0554 2800 0.0163
0.0574 2900 0.016
0.0594 3000 0.016
0.0613 3100 0.0147
0.0633 3200 0.0148
0.0653 3300 0.0155
0.0673 3400 0.0149
0.0692 3500 0.0149
0.0712 3600 0.0141
0.0732 3700 0.0145
0.0752 3800 0.0142
0.0772 3900 0.0143
0.0791 4000 0.0137
0.0811 4100 0.0134
0.0831 4200 0.0129
0.0851 4300 0.0133
0.0871 4400 0.0135
0.0890 4500 0.0128
0.0910 4600 0.0126
0.0930 4700 0.0126
0.0950 4800 0.0129
0.0969 4900 0.0127
0.0989 5000 0.0127
0.1009 5100 0.0125
0.1029 5200 0.0119
0.1049 5300 0.0124
0.1068 5400 0.012
0.1088 5500 0.013
0.1108 5600 0.0119
0.1128 5700 0.0118
0.1147 5800 0.0121
0.1167 5900 0.0119
0.1187 6000 0.0116
0.1207 6100 0.0112
0.1227 6200 0.0116
0.1246 6300 0.0115
0.1266 6400 0.0119
0.1286 6500 0.0115
0.1306 6600 0.0109
0.1326 6700 0.0114
0.1345 6800 0.0114
0.1365 6900 0.0109
0.1385 7000 0.011
0.1405 7100 0.0111
0.1424 7200 0.0109
0.1444 7300 0.0108
0.1464 7400 0.0112
0.1484 7500 0.0106
0.1504 7600 0.011
0.1523 7700 0.0106
0.1543 7800 0.0107
0.1563 7900 0.0108
0.1583 8000 0.0106
0.1603 8100 0.0107
0.1622 8200 0.0108
0.1642 8300 0.0103
0.1662 8400 0.0107
0.1682 8500 0.0104
0.1701 8600 0.011
0.1721 8700 0.0105
0.1741 8800 0.0105
0.1761 8900 0.01
0.1781 9000 0.0106
0.1800 9100 0.0105
0.1820 9200 0.0104
0.1840 9300 0.0104
0.1860 9400 0.0107
0.1879 9500 0.0102
0.1899 9600 0.0103
0.1919 9700 0.0105
0.1939 9800 0.01
0.1959 9900 0.0098
0.1978 10000 0.0099
0.1998 10100 0.0099
0.2018 10200 0.0099
0.2038 10300 0.0098
0.2058 10400 0.01
0.2077 10500 0.0101
0.2097 10600 0.0098
0.2117 10700 0.0101
0.2137 10800 0.0098
0.2156 10900 0.0101
0.2176 11000 0.01
0.2196 11100 0.01
0.2216 11200 0.0096
0.2236 11300 0.0096
0.2255 11400 0.0096
0.2275 11500 0.0098
0.2295 11600 0.0099
0.2315 11700 0.0094
0.2335 11800 0.0096
0.2354 11900 0.0094
0.2374 12000 0.0098
0.2394 12100 0.0095
0.2414 12200 0.0095
0.2433 12300 0.0098
0.2453 12400 0.0097
0.2473 12500 0.0094
0.2493 12600 0.0093
0.2513 12700 0.0093
0.2532 12800 0.0092
0.2552 12900 0.0094
0.2572 13000 0.0095
0.2592 13100 0.0093
0.2612 13200 0.009
0.2631 13300 0.0087
0.2651 13400 0.0089
0.2671 13500 0.009
0.2691 13600 0.0091
0.2710 13700 0.0092
0.2730 13800 0.0089
0.2750 13900 0.0091
0.2770 14000 0.0092
0.2790 14100 0.0088
0.2809 14200 0.009
0.2829 14300 0.0091
0.2849 14400 0.0086
0.2869 14500 0.009
0.2888 14600 0.0088
0.2908 14700 0.0092
0.2928 14800 0.009
0.2948 14900 0.0088
0.2968 15000 0.0087
0.2987 15100 0.0085
0.3007 15200 0.009
0.3027 15300 0.0088
0.3047 15400 0.0086
0.3067 15500 0.0087
0.3086 15600 0.0088
0.3106 15700 0.0085
0.3126 15800 0.0088
0.3146 15900 0.0085
0.3165 16000 0.0086
0.3185 16100 0.0086
0.3205 16200 0.0087
0.3225 16300 0.0088
0.3245 16400 0.0087
0.3264 16500 0.0087
0.3284 16600 0.0086
0.3304 16700 0.0087
0.3324 16800 0.0092
0.3344 16900 0.0085
0.3363 17000 0.0088
0.3383 17100 0.0084
0.3403 17200 0.0088
0.3423 17300 0.0083
0.3442 17400 0.0085
0.3462 17500 0.0083
0.3482 17600 0.0084
0.3502 17700 0.0084
0.3522 17800 0.0083
0.3541 17900 0.0087
0.3561 18000 0.0083
0.3581 18100 0.0085
0.3601 18200 0.0082
0.3621 18300 0.0079
0.3640 18400 0.0085
0.3660 18500 0.0084
0.3680 18600 0.0082
0.3700 18700 0.0083
0.3719 18800 0.0082
0.3739 18900 0.0082
0.3759 19000 0.0083
0.3779 19100 0.0081
0.3799 19200 0.0083
0.3818 19300 0.0079
0.3838 19400 0.0083
0.3858 19500 0.0082
0.3878 19600 0.0084
0.3897 19700 0.0084
0.3917 19800 0.008
0.3937 19900 0.0081
0.3957 20000 0.0083
0.3977 20100 0.0082
0.3996 20200 0.0078
0.4016 20300 0.0079
0.4036 20400 0.0081
0.4056 20500 0.0085
0.4076 20600 0.0082
0.4095 20700 0.008
0.4115 20800 0.0079
0.4135 20900 0.0081
0.4155 21000 0.008
0.4174 21100 0.0079
0.4194 21200 0.0077
0.4214 21300 0.0078
0.4234 21400 0.0082
0.4254 21500 0.008
0.4273 21600 0.0076
0.4293 21700 0.0075
0.4313 21800 0.0078
0.4333 21900 0.0081
0.4353 22000 0.0077
0.4372 22100 0.0079
0.4392 22200 0.0078
0.4412 22300 0.0078
0.4432 22400 0.0077
0.4451 22500 0.0078
0.4471 22600 0.0079
0.4491 22700 0.0078
0.4511 22800 0.0079
0.4531 22900 0.0075
0.4550 23000 0.0077
0.4570 23100 0.0076
0.4590 23200 0.0078
0.4610 23300 0.0075
0.4629 23400 0.0075
0.4649 23500 0.0078
0.4669 23600 0.0075
0.4689 23700 0.0076
0.4709 23800 0.0075
0.4728 23900 0.0075
0.4748 24000 0.0075
0.4768 24100 0.0076
0.4788 24200 0.0079
0.4808 24300 0.0076
0.4827 24400 0.0077
0.4847 24500 0.0077
0.4867 24600 0.0073
0.4887 24700 0.0077
0.4906 24800 0.0076
0.4926 24900 0.0075
0.4946 25000 0.0076
0.4966 25100 0.0078
0.4986 25200 0.0077
0.5005 25300 0.0076
0.5025 25400 0.0076
0.5045 25500 0.0076
0.5065 25600 0.0073
0.5085 25700 0.0075
0.5104 25800 0.0072
0.5124 25900 0.0074
0.5144 26000 0.0075
0.5164 26100 0.0075
0.5183 26200 0.0072
0.5203 26300 0.0073
0.5223 26400 0.0073
0.5243 26500 0.0073
0.5263 26600 0.0076
0.5282 26700 0.0075
0.5302 26800 0.0075
0.5322 26900 0.0071
0.5342 27000 0.0074
0.5362 27100 0.0073
0.5381 27200 0.0072
0.5401 27300 0.0071
0.5421 27400 0.0073
0.5441 27500 0.0072
0.5460 27600 0.0076
0.5480 27700 0.0072
0.5500 27800 0.0074
0.5520 27900 0.0072
0.5540 28000 0.0072
0.5559 28100 0.0071
0.5579 28200 0.0069
0.5599 28300 0.0071
0.5619 28400 0.0075
0.5638 28500 0.0074
0.5658 28600 0.0072
0.5678 28700 0.0074
0.5698 28800 0.0072
0.5718 28900 0.0072
0.5737 29000 0.0073
0.5757 29100 0.0072
0.5777 29200 0.0069
0.5797 29300 0.0069
0.5817 29400 0.007
0.5836 29500 0.0071
0.5856 29600 0.007
0.5876 29700 0.0069
0.5896 29800 0.0072
0.5915 29900 0.007
0.5935 30000 0.007
0.5955 30100 0.007
0.5975 30200 0.0069
0.5995 30300 0.0068
0.6014 30400 0.0071
0.6034 30500 0.007
0.6054 30600 0.0071
0.6074 30700 0.007
0.6094 30800 0.0069
0.6113 30900 0.007
0.6133 31000 0.0071
0.6153 31100 0.0069
0.6173 31200 0.007
0.6192 31300 0.0068
0.6212 31400 0.0069
0.6232 31500 0.0068
0.6252 31600 0.0068
0.6272 31700 0.007
0.6291 31800 0.0068
0.6311 31900 0.0069
0.6331 32000 0.0068
0.6351 32100 0.0069
0.6370 32200 0.0066
0.6390 32300 0.0068
0.6410 32400 0.0067
0.6430 32500 0.0068
0.6450 32600 0.0069
0.6469 32700 0.0068
0.6489 32800 0.0065
0.6509 32900 0.0068
0.6529 33000 0.0067
0.6549 33100 0.0066
0.6568 33200 0.0069
0.6588 33300 0.0067
0.6608 33400 0.0067
0.6628 33500 0.0068
0.6647 33600 0.0066
0.6667 33700 0.0069
0.6687 33800 0.0069
0.6707 33900 0.0064
0.6727 34000 0.0065
0.6746 34100 0.0067
0.6766 34200 0.0063
0.6786 34300 0.0067
0.6806 34400 0.0066
0.6826 34500 0.0065
0.6845 34600 0.0064
0.6865 34700 0.0066
0.6885 34800 0.0065
0.6905 34900 0.0064
0.6924 35000 0.0066
0.6944 35100 0.0064
0.6964 35200 0.0064
0.6984 35300 0.0066
0.7004 35400 0.0065
0.7023 35500 0.0067
0.7043 35600 0.0065
0.7063 35700 0.0064
0.7083 35800 0.0066
0.7103 35900 0.0065
0.7122 36000 0.0067
0.7142 36100 0.0069
0.7162 36200 0.0065
0.7182 36300 0.0064
0.7201 36400 0.0064
0.7221 36500 0.0066
0.7241 36600 0.0065
0.7261 36700 0.0062
0.7281 36800 0.0068
0.7300 36900 0.0064
0.7320 37000 0.0067
0.7340 37100 0.0063
0.7360 37200 0.0063
0.7379 37300 0.0064
0.7399 37400 0.0066
0.7419 37500 0.0065
0.7439 37600 0.0064
0.7459 37700 0.0065
0.7478 37800 0.0064
0.7498 37900 0.0063
0.7518 38000 0.0062
0.7538 38100 0.0064
0.7558 38200 0.0062
0.7577 38300 0.0064
0.7597 38400 0.0063
0.7617 38500 0.0063
0.7637 38600 0.0065
0.7656 38700 0.0063
0.7676 38800 0.0064
0.7696 38900 0.0062
0.7716 39000 0.0062
0.7736 39100 0.0062
0.7755 39200 0.0063
0.7775 39300 0.0065
0.7795 39400 0.0061
0.7815 39500 0.0062
0.7835 39600 0.0063
0.7854 39700 0.0062
0.7874 39800 0.0062
0.7894 39900 0.0063
0.7914 40000 0.0059
0.7933 40100 0.0063
0.7953 40200 0.0064
0.7973 40300 0.006
0.7993 40400 0.0063
0.8013 40500 0.0061
0.8032 40600 0.0061
0.8052 40700 0.0062
0.8072 40800 0.0062
0.8092 40900 0.006
0.8112 41000 0.0061
0.8131 41100 0.0063
0.8151 41200 0.0059
0.8171 41300 0.0062
0.8191 41400 0.0062
0.8210 41500 0.0062
0.8230 41600 0.0062
0.8250 41700 0.0061
0.8270 41800 0.0061
0.8290 41900 0.0061
0.8309 42000 0.0063
0.8329 42100 0.0064
0.8349 42200 0.0063
0.8369 42300 0.0063
0.8388 42400 0.0061
0.8408 42500 0.0062
0.8428 42600 0.0062
0.8448 42700 0.0061
0.8468 42800 0.0059
0.8487 42900 0.006
0.8507 43000 0.0061
0.8527 43100 0.0062
0.8547 43200 0.0058
0.8567 43300 0.0065
0.8586 43400 0.0064
0.8606 43500 0.006
0.8626 43600 0.0061
0.8646 43700 0.0059
0.8665 43800 0.0063
0.8685 43900 0.0061
0.8705 44000 0.006
0.8725 44100 0.0061
0.8745 44200 0.0061
0.8764 44300 0.0059
0.8784 44400 0.006
0.8804 44500 0.006
0.8824 44600 0.0059
0.8844 44700 0.0062
0.8863 44800 0.006
0.8883 44900 0.006
0.8903 45000 0.0058
0.8923 45100 0.006
0.8942 45200 0.0061
0.8962 45300 0.006
0.8982 45400 0.0059
0.9002 45500 0.0059
0.9022 45600 0.006
0.9041 45700 0.0062
0.9061 45800 0.0056
0.9081 45900 0.0057
0.9101 46000 0.006
0.9120 46100 0.0059
0.9140 46200 0.006
0.9160 46300 0.0059
0.9180 46400 0.0062
0.9200 46500 0.0059
0.9219 46600 0.0059
0.9239 46700 0.006
0.9259 46800 0.0059
0.9279 46900 0.0058
0.9299 47000 0.0057
0.9318 47100 0.0058
0.9338 47200 0.0058
0.9358 47300 0.0059
0.9378 47400 0.0059
0.9397 47500 0.0058
0.9417 47600 0.006
0.9437 47700 0.0058
0.9457 47800 0.006
0.9477 47900 0.0059
0.9496 48000 0.0058
0.9516 48100 0.0057
0.9536 48200 0.006
0.9556 48300 0.0057
0.9576 48400 0.006
0.9595 48500 0.0058
0.9615 48600 0.0058
0.9635 48700 0.0058
0.9655 48800 0.0057
0.9674 48900 0.0058
0.9694 49000 0.006
0.9714 49100 0.0055
0.9734 49200 0.0058
0.9754 49300 0.0059
0.9773 49400 0.0057
0.9793 49500 0.0055
0.9813 49600 0.0059
0.9833 49700 0.0058
0.9853 49800 0.0059
0.9872 49900 0.0058
0.9892 50000 0.0056
0.9912 50100 0.0058
0.9932 50200 0.0058
0.9951 50300 0.0059
0.9971 50400 0.0059
0.9991 50500 0.006

Framework Versions

  • Python: 3.11.9
  • Sentence Transformers: 3.3.0
  • PyLate: 1.1.4
  • Transformers: 4.48.0.dev0
  • PyTorch: 2.4.0
  • Accelerate: 1.2.1
  • Datasets: 2.21.0
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084"
}

PyLate

@misc{PyLate,
    title={PyLate: Flexible Training and Retrieval for Late Interaction Models},
    author={Chaffin, Antoine and Sourty, Raphaël},
    url={https://github.com/lightonai/pylate},
    year={2024}
}