strickvl/finetuned-all-MiniLM-L6-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: sentence-transformers/all-MiniLM-L6-v2
Maximum Sequence Length: 256 tokens
Output Dimensionality: 384 dimensions
Similarity Function: Cosine Similarity
Language: en
License: apache-2.0

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
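
The Pooling (mean over tokens) and Normalize modules listed above can be illustrated with a small pure-Python sketch. This is not the library's implementation: the function names `mean_pool` and `l2_normalize` and the toy 4-dimensional token vectors are assumptions made for illustration (the real model works with 384-dimensional token embeddings). Padding tokens are skipped through the attention mask.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is skipped)."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def l2_normalize(vector):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]

# Toy input (hypothetical values): three real tokens and one padding token.
tokens = [
    [1.0, 0.0, 2.0, 0.0],
    [3.0, 0.0, 0.0, 0.0],
    [2.0, 0.0, 1.0, 0.0],
    [9.0, 9.0, 9.0, 9.0],  # padding, ignored because its mask entry is 0
]
mask = [1, 1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_embedding = l2_normalize(pooled)
print(pooled)
print(sentence_embedding)
```

Because the final vector has unit length, the cosine similarity between two sentence embeddings reduces to their dot product.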

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("strickvl/finetuned-all-MiniLM-L6-v2")
# Run inference
sentences = [
    'Can you explain how the `query_similar_docs` function handles document reranking?',
    'ry_similar_docs(\n\nquestion: str,\n\nurl_ending: str,use_reranking: bool = False,\n\nreturned_sample_size: int = 5,\n\n) -> Tuple[str, str, List[str]]:\n\n"""Query similar documents for a given question and URL ending."""\n\nembedded_question = get_embeddings(question)\n\ndb_conn = get_db_conn()\n\nnum_docs = 20 if use_reranking else returned_sample_size\n\n# get (content, url) tuples for the top n similar documents\n\ntop_similar_docs = get_topn_similar_docs(\n\nembedded_question, db_conn, n=num_docs, include_metadata=True\n\nif use_reranking:\n\nreranked_docs_and_urls = rerank_documents(question, top_similar_docs)[\n\n:returned_sample_size\n\nurls = [doc[1] for doc in reranked_docs_and_urls]\n\nelse:\n\nurls = [doc[1] for doc in top_similar_docs]  # Unpacking URLs\n\nreturn (question, url_ending, urls)\n\nWe get the embeddings for the question being passed into the function and connect to our PostgreSQL database. If we\'re using reranking, we get the top 20 documents similar to our query and rerank them using the rerank_documents helper function. We then extract the URLs from the reranked documents and return them. Note that we only return 5 URLs, but in the case of reranking we get a larger number of documents and URLs back from the database to pass to our reranker, but in the end we always choose the top five reranked documents to return.\n\nNow that we\'ve added reranking to our pipeline, we can evaluate the performance of our reranker and see how it affects the quality of the retrieved documents.\n\nCode Example\n\nTo explore the full code, visit the Complete Guide repository and for this section, particularly the eval_retrieval.py file.\n\nPreviousUnderstanding reranking\n\nNextEvaluating reranking performance\n\nLast updated 15 days ago',
    " use for the database connection.\ndatabase_ssl_ca:# The path to the client SSL certificate to use for the database connection.\ndatabase_ssl_cert:\n\n# The path to the client SSL key to use for the database connection.\ndatabase_ssl_key:\n\n# Whether to verify the database server SSL certificate.\ndatabase_ssl_verify_server_cert:\n\nRun the deploy command and pass the config file above to it.Copyzenml deploy --config=/PATH/TO/FILENote To be able to run the deploy command, you should have your cloud provider's CLI configured locally with permissions to create resources like MySQL databases and networks.\n\nConfiguration file templates\n\nBase configuration file\n\nBelow is the general structure of a config file. Use this as a base and then add any cloud-specific parameters from the sections below.\n\n# Name of the server deployment.\n\nname:\n\n# The server provider type, one of aws, gcp or azure.\n\nprovider:\n\n# The path to the kubectl config file to use for deployment.\n\nkubectl_config_path:\n\n# The Kubernetes namespace to deploy the ZenML server to.\n\nnamespace: zenmlserver\n\n# The path to the ZenML server helm chart to use for deployment.\n\nhelm_chart:\n\n# The repository and tag to use for the ZenML server Docker image.\n\nzenmlserver_image_repo: zenmldocker/zenml\n\nzenmlserver_image_tag: latest\n\n# Whether to deploy an nginx ingress controller as part of the deployment.\n\ncreate_ingress_controller: true\n\n# Whether to use TLS for the ingress.\n\ningress_tls: true\n\n# Whether to generate self-signed TLS certificates for the ingress.\n\ningress_tls_generate_certs: true\n\n# The name of the Kubernetes secret to use for the ingress.\n\ningress_tls_secret_name: zenml-tls-certs\n\n# The ingress controller's IP address. The ZenML server will be exposed on a subdomain of this IP. For AWS, if you have a hostname instead, use the following command to get the IP address: `dig +short <hostname>`.\n\ningress_controller_ip:\n\n# Whether to create a SQL database service as part of the recipe.\n\ndeploy_db: true\n\n# The username and password for the database.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
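
For semantic search, the similarity scores are typically used to rank documents against a query. The sketch below shows that ranking step in pure Python under the assumption that all vectors are already unit length (which the Normalize module provides), so the dot product equals cosine similarity. The helper name `rank_by_cosine` and the 2-dimensional toy vectors are hypothetical and not part of the Sentence Transformers API.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def rank_by_cosine(query_vec, doc_vecs, top_k=3):
    """Return (doc_index, score) pairs sorted by descending similarity.

    Assumes every vector is already L2-normalized.
    """
    scored = [(i, dot(query_vec, d)) for i, d in enumerate(doc_vecs)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy unit vectors standing in for real embeddings (hypothetical values).
query = [1.0, 0.0]
docs = [[0.0, 1.0], [0.6, 0.8], [1.0, 0.0]]

ranking = rank_by_cosine(query, docs, top_k=2)
print(ranking)
```

With the snippet above, `embeddings[0]` (the question) could serve as the query and `embeddings[1:]` as the candidate documents.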

Evaluation

Metrics

Information Retrieval

Dataset: dim_384
Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.3012
cosine_accuracy@3	0.5422
cosine_accuracy@5	0.6747
cosine_accuracy@10	0.741
cosine_precision@1	0.3012
cosine_precision@3	0.1807
cosine_precision@5	0.1349
cosine_precision@10	0.0741
cosine_recall@1	0.3012
cosine_recall@3	0.5422
cosine_recall@5	0.6747
cosine_recall@10	0.741
cosine_ndcg@10	0.5192
cosine_mrr@10	0.4479
cosine_map@100	0.4579
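
In the table above, accuracy@k equals recall@k and precision@k equals recall@k divided by k. That pattern is what one would expect when every query has exactly one relevant document; this is an inference from the numbers, not something the card states. The sketch below computes the per-query quantities under that reading. The function name `retrieval_metrics` and the document IDs are hypothetical, and the code is not taken from InformationRetrievalEvaluator.

```python
def retrieval_metrics(ranked_ids, relevant_ids, k):
    """Per-query accuracy@k, precision@k, recall@k and reciprocal rank@k."""
    top_k = ranked_ids[:k]
    hits = [doc_id for doc_id in top_k if doc_id in relevant_ids]
    reciprocal_rank = 0.0
    for rank, doc_id in enumerate(top_k, start=1):
        if doc_id in relevant_ids:
            reciprocal_rank = 1.0 / rank  # rank of the first relevant hit
            break
    return {
        "accuracy": 1.0 if hits else 0.0,
        "precision": len(hits) / k,
        "recall": len(hits) / len(relevant_ids),
        "reciprocal_rank": reciprocal_rank,
    }

# Toy query with a single relevant document that is retrieved at rank 3.
ranked = ["d7", "d2", "d9", "d4", "d1"]
relevant = {"d9"}

m1 = retrieval_metrics(ranked, relevant, k=1)
m3 = retrieval_metrics(ranked, relevant, k=3)
print(m1)
print(m3)

# Consistency check against the reported dim_384 values: precision@k = recall@k / k.
derived_precision = [round(0.5422 / 3, 4), round(0.6747 / 5, 4), round(0.741 / 10, 4)]
print(derived_precision)
```

Averaging these per-query values over all queries gives the dataset-level numbers; MRR@10 is the mean of the reciprocal ranks at k=10.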

Information Retrieval

Dataset: dim_256
Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.2952
cosine_accuracy@3	0.5301
cosine_accuracy@5	0.6325
cosine_accuracy@10	0.7349
cosine_precision@1	0.2952
cosine_precision@3	0.1767
cosine_precision@5	0.1265
cosine_precision@10	0.0735
cosine_recall@1	0.2952
cosine_recall@3	0.5301
cosine_recall@5	0.6325
cosine_recall@10	0.7349
cosine_ndcg@10	0.5119
cosine_mrr@10	0.441
cosine_map@100	0.4503

Information Retrieval

Dataset: dim_128
Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.2711
cosine_accuracy@3	0.512
cosine_accuracy@5	0.6145
cosine_accuracy@10	0.6988
cosine_precision@1	0.2711
cosine_precision@3	0.1707
cosine_precision@5	0.1229
cosine_precision@10	0.0699
cosine_recall@1	0.2711
cosine_recall@3	0.512
cosine_recall@5	0.6145
cosine_recall@10	0.6988
cosine_ndcg@10	0.4884
cosine_mrr@10	0.4208
cosine_map@100	0.4308

Information Retrieval

Dataset: dim_64
Evaluated with InformationRetrievalEvaluator

Metric	Value
cosine_accuracy@1	0.253
cosine_accuracy@3	0.4578
cosine_accuracy@5	0.5542
cosine_accuracy@10	0.6566
cosine_precision@1	0.253
cosine_precision@3	0.1526
cosine_precision@5	0.1108
cosine_precision@10	0.0657
cosine_recall@1	0.253
cosine_recall@3	0.4578
cosine_recall@5	0.5542
cosine_recall@10	0.6566
cosine_ndcg@10	0.4466
cosine_mrr@10	0.3805
cosine_map@100	0.3906

Training Details

Training Dataset

Unnamed Dataset

Size: 1,490 training samples
Columns: positive and anchor
Approximate statistics based on the first 1000 samples:
	positive	anchor
type	string	string
details	min: 9 tokens, mean: 21.12 tokens, max: 49 tokens	min: 21 tokens, mean: 240.72 tokens, max: 256 tokens

Samples:

positive	anchor
`Can you provide the details for the Azure service principal with the ID 273d2812-2643-4446-82e6-6098b8ccdaa4?`	┃┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ ┃ ID │ 273d2812-2643-4446-82e6-6098b8ccdaa4 ┃ ┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ ┃ NAME │ azure-service-principal ┃ ┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ ┃ TYPE │ 🇦 azure ┃ ┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ ┃ AUTH METHOD │ service-principal ┃ ┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ ┃ RESOURCE TYPES │ 🇦 azure-generic, 📦 blob-container, 🌀 kubernetes-cluster, 🐳 docker-registry ┃ ┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ ┃ RESOURCE NAME │ ┃ ┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ ┃ SECRET ID │ 50d9f230-c4ea-400e-b2d7-6b52ba2a6f90 ┃ ┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ ┃ SESSION DURATION │ N/A ┃ ┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ ┃ EXPIRES IN │ N/A ┃ ┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨
`What are the new features introduced in ZenML 0.20.0 regarding the Metadata Store?`	ed to update the way they are registered in ZenML.the updated ZenML server provides a new and improved collaborative experience. When connected to a ZenML server, you can now share your ZenML Stacks and Stack Components with other users. If you were previously using the ZenML Profiles or the ZenML server to share your ZenML Stacks, you should switch to the new ZenML server and Dashboard and update your existing workflows to reflect the new features. ZenML takes over the Metadata Store role ZenML can now run as a server that can be accessed via a REST API and also comes with a visual user interface (called the ZenML Dashboard). This server can be deployed in arbitrary environments (local, on-prem, via Docker, on AWS, GCP, Azure etc.) and supports user management, workspace scoping, and more. The release introduces a series of commands to facilitate managing the lifecycle of the ZenML server and to access the pipeline and pipeline run information: zenml connect / disconnect / down / up / logs / status can be used to configure your client to connect to a ZenML server, to start a local ZenML Dashboard or to deploy a ZenML server to a cloud environment. For more information on how to use these commands, see the ZenML deployment documentation. zenml pipeline list / runs / delete can be used to display information and about and manage your pipelines and pipeline runs. In ZenML 0.13.2 and earlier versions, information about pipelines and pipeline runs used to be stored in a separate stack component called the Metadata Store. Starting with 0.20.0, the role of the Metadata Store is now taken over by ZenML itself. This means that the Metadata Store is no longer a separate component in the ZenML architecture, but rather a part of the ZenML core, located wherever ZenML is deployed: locally on your machine or running remotely as a server.
`Which environment variables should I set to use the Azure Service Connector authentication method in ZenML?`	-client-id","client_secret": "my-client-secret"}).Note: The remaining configuration options are deprecated and may be removed in a future release. Instead, you should set the ZENML_SECRETS_STORE_AUTH_METHOD and ZENML_SECRETS_STORE_AUTH_CONFIG variables to use the Azure Service Connector authentication method. ZENML_SECRETS_STORE_AZURE_CLIENT_ID: The Azure application service principal client ID to use to authenticate with the Azure Key Vault API. If you are running the ZenML server hosted in Azure and are using a managed identity to access the Azure Key Vault service, you can omit this variable. ZENML_SECRETS_STORE_AZURE_CLIENT_SECRET: The Azure application service principal client secret to use to authenticate with the Azure Key Vault API. If you are running the ZenML server hosted in Azure and are using a managed identity to access the Azure Key Vault service, you can omit this variable. ZENML_SECRETS_STORE_AZURE_TENANT_ID: The Azure application service principal tenant ID to use to authenticate with the Azure Key Vault API. If you are running the ZenML server hosted in Azure and are using a managed identity to access the Azure Key Vault service, you can omit this variable. These configuration options are only relevant if you're using Hashicorp Vault as the secrets store backend. ZENML_SECRETS_STORE_TYPE: Set this to hashicorp in order to set this type of secret store. ZENML_SECRETS_STORE_VAULT_ADDR: The URL of the HashiCorp Vault server to connect to. NOTE: this is the same as setting the VAULT_ADDR environment variable. ZENML_SECRETS_STORE_VAULT_TOKEN: The token to use to authenticate with the HashiCorp Vault server. NOTE: this is the same as setting the VAULT_TOKEN environment variable. ZENML_SECRETS_STORE_VAULT_NAMESPACE: The Vault Enterprise namespace. Not required for Vault OSS. NOTE: this is the same as setting the VAULT_NAMESPACE environment variable.

Loss: MatryoshkaLoss with these parameters:

{
    "loss": "MultipleNegativesRankingLoss",
    "matryoshka_dims": [
        384,
        256,
        128,
        64
    ],
    "matryoshka_weights": [
        1,
        1,
        1,
        1
    ],
    "n_dims_per_step": -1
}
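
The `matryoshka_dims` above mean that the leading 384, 256, 128 and 64 components of each embedding were all trained to be usable as embeddings in their own right. A minimal sketch of using a shorter prefix follows: keep the first `dim` components, then re-normalize so cosine similarity still reduces to a dot product. The helper `truncate_and_normalize` and the stand-in vector are assumptions for illustration; recent versions of Sentence Transformers also accept a `truncate_dim` argument when loading a model, which may be more convenient in practice.

```python
import math

MATRYOSHKA_DIMS = [384, 256, 128, 64]  # taken from the loss configuration

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale the result to unit length."""
    if dim > len(embedding):
        raise ValueError("dim exceeds the embedding size")
    prefix = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in prefix))
    if norm == 0.0:
        return prefix
    return [x / norm for x in prefix]

# Stand-in for a real 384-dimensional embedding (hypothetical values).
full = [math.sin(i + 1) for i in range(384)]

truncated = {dim: truncate_and_normalize(full, dim) for dim in MATRYOSHKA_DIMS}
for dim, vec in truncated.items():
    print(dim, len(vec), round(sum(x * x for x in vec), 6))
```

Smaller prefixes reduce storage and search cost at some loss in retrieval quality, as the dim_256, dim_128 and dim_64 evaluation tables show.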

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: epoch
per_device_train_batch_size: 32
per_device_eval_batch_size: 16
gradient_accumulation_steps: 16
learning_rate: 2e-05
num_train_epochs: 4
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: True
tf32: True
load_best_model_at_end: True
optim: adamw_torch_fused
batch_sampler: no_duplicates

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: epoch
prediction_loss_only: True
per_device_train_batch_size: 32
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 16
eval_accumulation_steps: None
learning_rate: 2e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 4
max_steps: -1
lr_scheduler_type: cosine
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: True
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: True
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: True
remove_unused_columns: True
label_names: None
load_best_model_at_end: True
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: False
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
dispatch_batches: None
split_batches: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional

Training Logs

Epoch	Step	dim_128_cosine_map@100	dim_256_cosine_map@100	dim_384_cosine_map@100	dim_64_cosine_map@100
0.6667	1	0.3800	0.3986	0.4149	0.3471
2.0	3	0.4194	0.4473	0.4557	0.3762
2.6667	4	0.4308	0.4503	0.4579	0.3906

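The final row (epoch 2.6667, step 4) denotes the saved checkpoint; its values match the cosine_map@100 metrics reported in the Evaluation section.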

Framework Versions

Python: 3.10.14
Sentence Transformers: 3.0.1
Transformers: 4.41.2
PyTorch: 2.3.1+cu121
Accelerate: 0.31.0
Datasets: 2.19.1
Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning}, 
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply}, 
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Safetensors

Model size: 22.7M params
Tensor type: F32
