strickvl/finetuned-all-MiniLM-L6-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
- Language: en
- License: apache-2.0
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
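The Pooling and Normalize stages above can be illustrated with a minimal, dependency-free sketch. It assumes mean pooling over non-padding tokens followed by L2 normalization, as the configuration indicates; the helper names and the toy 4-dimensional vectors are hypothetical and are not part of the library.

```python
import math

def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors, skipping padding positions (mask == 0)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                summed[i] += value
    count = max(count, 1)  # guard against an all-padding input
    return [s / count for s in summed]

def l2_normalize(vec):
    """Scale a vector to unit length (what a Normalize stage is assumed to do)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

# Toy token embeddings (hypothetical); the last position is padding.
tokens = [[1.0, 0.0, 2.0, 0.0], [3.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

pooled = mean_pool(tokens, mask)
sentence_vec = l2_normalize(pooled)
```

Because the output is unit length, the dot product of two such vectors equals their cosine similarity, which is consistent with the Cosine Similarity setting listed in the model description.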
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
model = SentenceTransformer("strickvl/finetuned-all-MiniLM-L6-v2")
sentences = [
'Can you explain how the `query_similar_docs` function handles document reranking?',
'ry_similar_docs(\n\nquestion: str,\n\nurl_ending: str,use_reranking: bool = False,\n\nreturned_sample_size: int = 5,\n\n) -> Tuple[str, str, List[str]]:\n\n"""Query similar documents for a given question and URL ending."""\n\nembedded_question = get_embeddings(question)\n\ndb_conn = get_db_conn()\n\nnum_docs = 20 if use_reranking else returned_sample_size\n\n# get (content, url) tuples for the top n similar documents\n\ntop_similar_docs = get_topn_similar_docs(\n\nembedded_question, db_conn, n=num_docs, include_metadata=True\n\nif use_reranking:\n\nreranked_docs_and_urls = rerank_documents(question, top_similar_docs)[\n\n:returned_sample_size\n\nurls = [doc[1] for doc in reranked_docs_and_urls]\n\nelse:\n\nurls = [doc[1] for doc in top_similar_docs] # Unpacking URLs\n\nreturn (question, url_ending, urls)\n\nWe get the embeddings for the question being passed into the function and connect to our PostgreSQL database. If we\'re using reranking, we get the top 20 documents similar to our query and rerank them using the rerank_documents helper function. We then extract the URLs from the reranked documents and return them. Note that we only return 5 URLs, but in the case of reranking we get a larger number of documents and URLs back from the database to pass to our reranker, but in the end we always choose the top five reranked documents to return.\n\nNow that we\'ve added reranking to our pipeline, we can evaluate the performance of our reranker and see how it affects the quality of the retrieved documents.\n\nCode Example\n\nTo explore the full code, visit the Complete Guide repository and for this section, particularly the eval_retrieval.py file.\n\nPreviousUnderstanding reranking\n\nNextEvaluating reranking performance\n\nLast updated 15 days ago',
" use for the database connection.\ndatabase_ssl_ca:# The path to the client SSL certificate to use for the database connection.\ndatabase_ssl_cert:\n\n# The path to the client SSL key to use for the database connection.\ndatabase_ssl_key:\n\n# Whether to verify the database server SSL certificate.\ndatabase_ssl_verify_server_cert:\n\nRun the deploy command and pass the config file above to it.Copyzenml deploy --config=/PATH/TO/FILENote To be able to run the deploy command, you should have your cloud provider's CLI configured locally with permissions to create resources like MySQL databases and networks.\n\nConfiguration file templates\n\nBase configuration file\n\nBelow is the general structure of a config file. Use this as a base and then add any cloud-specific parameters from the sections below.\n\n# Name of the server deployment.\n\nname:\n\n# The server provider type, one of aws, gcp or azure.\n\nprovider:\n\n# The path to the kubectl config file to use for deployment.\n\nkubectl_config_path:\n\n# The Kubernetes namespace to deploy the ZenML server to.\n\nnamespace: zenmlserver\n\n# The path to the ZenML server helm chart to use for deployment.\n\nhelm_chart:\n\n# The repository and tag to use for the ZenML server Docker image.\n\nzenmlserver_image_repo: zenmldocker/zenml\n\nzenmlserver_image_tag: latest\n\n# Whether to deploy an nginx ingress controller as part of the deployment.\n\ncreate_ingress_controller: true\n\n# Whether to use TLS for the ingress.\n\ningress_tls: true\n\n# Whether to generate self-signed TLS certificates for the ingress.\n\ningress_tls_generate_certs: true\n\n# The name of the Kubernetes secret to use for the ingress.\n\ningress_tls_secret_name: zenml-tls-certs\n\n# The ingress controller's IP address. The ZenML server will be exposed on a subdomain of this IP. 
For AWS, if you have a hostname instead, use the following command to get the IP address: `dig +short <hostname>`.\n\ningress_controller_ip:\n\n# Whether to create a SQL database service as part of the recipe.\n\ndeploy_db: true\n\n# The username and password for the database.",
]
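# Encode the sentences into 384-dimensional embeddings
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 384)
# Compute pairwise cosine similarities between all embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])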
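Since the model was trained with MatryoshkaLoss over dimensions 384, 256, 128, and 64, the leading components of each embedding are intended to remain useful on their own. The following is a minimal sketch of truncating an embedding and re-normalizing it before comparing; the helper names and the toy 8-dimensional vectors are hypothetical. Sentence Transformers also accepts a truncate_dim argument when loading a model, which is expected to truncate for you, though you may still need to re-normalize afterwards.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(v * v for v in head))
    return [v / norm for v in head] if norm > 0 else head

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 8-dimensional embeddings standing in for real 384-dimensional ones.
query = [0.5, 0.4, 0.1, 0.3, 0.2, 0.1, 0.0, 0.1]
doc = [0.4, 0.5, 0.2, 0.2, 0.1, 0.0, 0.1, 0.2]

short_query = truncate_and_normalize(query, 4)
short_doc = truncate_and_normalize(doc, 4)
similarity_full = cosine(query, doc)
similarity_short = cosine(short_query, short_doc)
```

Smaller vectors reduce storage and search cost at some loss of retrieval quality; the evaluation tables in this card report how much quality is kept at each dimension.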
Evaluation
Metrics
Information Retrieval (dim_384)
| Metric | Value |
| --- | --- |
| cosine_accuracy@1 | 0.3012 |
| cosine_accuracy@3 | 0.5422 |
| cosine_accuracy@5 | 0.6747 |
| cosine_accuracy@10 | 0.741 |
| cosine_precision@1 | 0.3012 |
| cosine_precision@3 | 0.1807 |
| cosine_precision@5 | 0.1349 |
| cosine_precision@10 | 0.0741 |
| cosine_recall@1 | 0.3012 |
| cosine_recall@3 | 0.5422 |
| cosine_recall@5 | 0.6747 |
| cosine_recall@10 | 0.741 |
| cosine_ndcg@10 | 0.5192 |
| cosine_mrr@10 | 0.4479 |
| cosine_map@100 | 0.4579 |
Information Retrieval (dim_256)
| Metric | Value |
| --- | --- |
| cosine_accuracy@1 | 0.2952 |
| cosine_accuracy@3 | 0.5301 |
| cosine_accuracy@5 | 0.6325 |
| cosine_accuracy@10 | 0.7349 |
| cosine_precision@1 | 0.2952 |
| cosine_precision@3 | 0.1767 |
| cosine_precision@5 | 0.1265 |
| cosine_precision@10 | 0.0735 |
| cosine_recall@1 | 0.2952 |
| cosine_recall@3 | 0.5301 |
| cosine_recall@5 | 0.6325 |
| cosine_recall@10 | 0.7349 |
| cosine_ndcg@10 | 0.5119 |
| cosine_mrr@10 | 0.441 |
| cosine_map@100 | 0.4503 |
Information Retrieval (dim_128)
| Metric | Value |
| --- | --- |
| cosine_accuracy@1 | 0.2711 |
| cosine_accuracy@3 | 0.512 |
| cosine_accuracy@5 | 0.6145 |
| cosine_accuracy@10 | 0.6988 |
| cosine_precision@1 | 0.2711 |
| cosine_precision@3 | 0.1707 |
| cosine_precision@5 | 0.1229 |
| cosine_precision@10 | 0.0699 |
| cosine_recall@1 | 0.2711 |
| cosine_recall@3 | 0.512 |
| cosine_recall@5 | 0.6145 |
| cosine_recall@10 | 0.6988 |
| cosine_ndcg@10 | 0.4884 |
| cosine_mrr@10 | 0.4208 |
| cosine_map@100 | 0.4308 |
Information Retrieval (dim_64)
| Metric | Value |
| --- | --- |
| cosine_accuracy@1 | 0.253 |
| cosine_accuracy@3 | 0.4578 |
| cosine_accuracy@5 | 0.5542 |
| cosine_accuracy@10 | 0.6566 |
| cosine_precision@1 | 0.253 |
| cosine_precision@3 | 0.1526 |
| cosine_precision@5 | 0.1108 |
| cosine_precision@10 | 0.0657 |
| cosine_recall@1 | 0.253 |
| cosine_recall@3 | 0.4578 |
| cosine_recall@5 | 0.5542 |
| cosine_recall@10 | 0.6566 |
| cosine_ndcg@10 | 0.4466 |
| cosine_mrr@10 | 0.3805 |
| cosine_map@100 | 0.3906 |
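In the figures above, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k. That pattern is what one would expect if every query has exactly one relevant document; this is an assumption, though it is consistent with the positive/anchor pair format of the training data. The sketch below computes the metrics under that assumption from hypothetical ranks; it is not the evaluator used to produce the tables.

```python
import math

def ir_metrics(ranks, k):
    """Compute retrieval metrics at cutoff k.

    ranks: 1-based rank of the single relevant document for each query,
           or None when it was not retrieved at all.
    """
    n = len(ranks)
    hits = [r is not None and r <= k for r in ranks]
    accuracy = sum(hits) / n
    return {
        "accuracy": accuracy,
        # At most one of the k returned documents can be relevant.
        "precision": accuracy / k,
        # The only relevant document is either found or missed.
        "recall": accuracy,
        # Reciprocal rank, counted as 0 when the document is outside the top k.
        "mrr": sum(1.0 / r for r, h in zip(ranks, hits) if h) / n,
        # With one relevant document the ideal DCG is 1, so NDCG equals DCG.
        "ndcg": sum(1.0 / math.log2(r + 1) for r, h in zip(ranks, hits) if h) / n,
    }

# Hypothetical ranks for five queries; the third query misses entirely.
ranks = [1, 3, None, 2, 7]
metrics = ir_metrics(ranks, k=3)
```

As a check against the reported values for the 384-dimensional embeddings, 0.5422 / 3 rounds to 0.1807 and 0.6747 / 5 rounds to 0.1349, matching cosine_precision@3 and cosine_precision@5.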
Training Details
Training Dataset
Unnamed Dataset
- Size: 1,490 training samples
- Columns: positive and anchor
- Approximate statistics based on the first 1000 samples:

| | positive | anchor |
| --- | --- | --- |
| type | string | string |
| details | min: 9 tokens; mean: 21.12 tokens; max: 49 tokens | min: 21 tokens; mean: 240.72 tokens; max: 256 tokens |
- Samples:
positive |
anchor |
Can you provide the details for the Azure service principal with the ID 273d2812-2643-4446-82e6-6098b8ccdaa4? |
┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨
┃ ID               │ 273d2812-2643-4446-82e6-6098b8ccdaa4                                           ┃
┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨
┃ NAME             │ azure-service-principal                                                        ┃
┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨
┃ TYPE             │ 🇦 azure                                                                       ┃
┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨
┃ AUTH METHOD      │ service-principal                                                              ┃
┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨
┃ RESOURCE TYPES   │ 🇦 azure-generic, 📦 blob-container, 🌀 kubernetes-cluster, 🐳 docker-registry ┃
┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨
┃ RESOURCE NAME    │                                                                                ┃
┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨
┃ SECRET ID        │ 50d9f230-c4ea-400e-b2d7-6b52ba2a6f90                                           ┃
┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨
┃ SESSION DURATION │ N/A                                                                            ┃
┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨
┃ EXPIRES IN       │ N/A                                                                            ┃
┠──────────────────┼────────────────────────────────────────────────────────────────────────────────┨ |
What are the new features introduced in ZenML 0.20.0 regarding the Metadata Store? |
ed to update the way they are registered in ZenML.the updated ZenML server provides a new and improved collaborative experience. When connected to a ZenML server, you can now share your ZenML Stacks and Stack Components with other users. If you were previously using the ZenML Profiles or the ZenML server to share your ZenML Stacks, you should switch to the new ZenML server and Dashboard and update your existing workflows to reflect the new features.
ZenML takes over the Metadata Store role
ZenML can now run as a server that can be accessed via a REST API and also comes with a visual user interface (called the ZenML Dashboard). This server can be deployed in arbitrary environments (local, on-prem, via Docker, on AWS, GCP, Azure etc.) and supports user management, workspace scoping, and more.
The release introduces a series of commands to facilitate managing the lifecycle of the ZenML server and to access the pipeline and pipeline run information:
zenml connect / disconnect / down / up / logs / status can be used to configure your client to connect to a ZenML server, to start a local ZenML Dashboard or to deploy a ZenML server to a cloud environment. For more information on how to use these commands, see the ZenML deployment documentation.
zenml pipeline list / runs / delete can be used to display information and about and manage your pipelines and pipeline runs.
In ZenML 0.13.2 and earlier versions, information about pipelines and pipeline runs used to be stored in a separate stack component called the Metadata Store. Starting with 0.20.0, the role of the Metadata Store is now taken over by ZenML itself. This means that the Metadata Store is no longer a separate component in the ZenML architecture, but rather a part of the ZenML core, located wherever ZenML is deployed: locally on your machine or running remotely as a server. |
Which environment variables should I set to use the Azure Service Connector authentication method in ZenML? |
-client-id","client_secret": "my-client-secret"}).Note: The remaining configuration options are deprecated and may be removed in a future release. Instead, you should set the ZENML_SECRETS_STORE_AUTH_METHOD and ZENML_SECRETS_STORE_AUTH_CONFIG variables to use the Azure Service Connector authentication method.
ZENML_SECRETS_STORE_AZURE_CLIENT_ID: The Azure application service principal client ID to use to authenticate with the Azure Key Vault API. If you are running the ZenML server hosted in Azure and are using a managed identity to access the Azure Key Vault service, you can omit this variable.
ZENML_SECRETS_STORE_AZURE_CLIENT_SECRET: The Azure application service principal client secret to use to authenticate with the Azure Key Vault API. If you are running the ZenML server hosted in Azure and are using a managed identity to access the Azure Key Vault service, you can omit this variable.
ZENML_SECRETS_STORE_AZURE_TENANT_ID: The Azure application service principal tenant ID to use to authenticate with the Azure Key Vault API. If you are running the ZenML server hosted in Azure and are using a managed identity to access the Azure Key Vault service, you can omit this variable.
These configuration options are only relevant if you're using Hashicorp Vault as the secrets store backend.
ZENML_SECRETS_STORE_TYPE: Set this to hashicorp in order to set this type of secret store.
ZENML_SECRETS_STORE_VAULT_ADDR: The URL of the HashiCorp Vault server to connect to. NOTE: this is the same as setting the VAULT_ADDR environment variable.
ZENML_SECRETS_STORE_VAULT_TOKEN: The token to use to authenticate with the HashiCorp Vault server. NOTE: this is the same as setting the VAULT_TOKEN environment variable.
ZENML_SECRETS_STORE_VAULT_NAMESPACE: The Vault Enterprise namespace. Not required for Vault OSS. NOTE: this is the same as setting the VAULT_NAMESPACE environment variable. |
- Loss: MatryoshkaLoss with these parameters:
{
"loss": "MultipleNegativesRankingLoss",
"matryoshka_dims": [
384,
256,
128,
64
],
"matryoshka_weights": [
1,
1,
1,
1
],
"n_dims_per_step": -1
}
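The configuration above wraps MultipleNegativesRankingLoss in MatryoshkaLoss with four dimensions and equal weights. With n_dims_per_step set to -1, every listed dimension contributes at each step. The following is a rough, dependency-free sketch of that idea, not the library's implementation: the function names are hypothetical, the similarity scale of 20 is an assumption, and the toy 4-dimensional vectors stand in for real embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mnr_loss(anchors, positives, scale=20.0):
    """In-batch negatives: for row i, positives[i] is the target and every
    other positive in the batch acts as a negative (cross-entropy over rows)."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_sum_exp = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)

def matryoshka_loss(anchors, positives, dims, weights):
    """Weighted sum of the base loss over truncated copies of the embeddings."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        truncated_anchors = [a[:dim] for a in anchors]
        truncated_positives = [p[:dim] for p in positives]
        total += weight * mnr_loss(truncated_anchors, truncated_positives)
    return total

# Toy batch of two pairs (hypothetical values).
anchors = [[1.0, 0.2, 0.0, 0.1], [0.1, 1.0, 0.3, 0.0]]
positives = [[0.9, 0.1, 0.1, 0.0], [0.0, 0.8, 0.2, 0.1]]

loss_matched = matryoshka_loss(anchors, positives, dims=[4, 2], weights=[1, 1])
loss_swapped = matryoshka_loss(anchors, positives[::-1], dims=[4, 2], weights=[1, 1])
```

The loss is lower when each row is paired with its own counterpart than when the pairs are swapped, at both the full and the truncated dimension, which is the property the training objective pushes the embeddings toward.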
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: epoch
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- gradient_accumulation_steps: 16
- learning_rate: 2e-05
- num_train_epochs: 4
- lr_scheduler_type: cosine
- warmup_ratio: 0.1
- bf16: True
- tf32: True
- load_best_model_at_end: True
- optim: adamw_torch_fused
- batch_sampler: no_duplicates
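To make the learning-rate settings above concrete, here is a small sketch of a linear warmup followed by cosine decay, using the listed learning_rate of 2e-05 and warmup_ratio of 0.1. The function name and the total of 100 optimizer steps are hypothetical and chosen only for illustration; the actual run used far fewer steps. The sketch also computes the per-device effective batch size implied by the batch size and gradient accumulation settings.

```python
import math

def lr_at_step(step, total_steps, base_lr=2e-5, warmup_ratio=0.1):
    """Linear warmup to base_lr, then cosine decay to zero (half a cosine cycle)."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Samples consumed per optimizer step on one device:
# per_device_train_batch_size * gradient_accumulation_steps
effective_batch_size = 32 * 16

# Hypothetical 100-step run, for illustration only.
schedule = [lr_at_step(s, total_steps=100) for s in range(101)]
```

In this sketch the rate climbs from zero to 2e-05 over the first 10 steps, then falls smoothly back to zero by step 100.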
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: epoch
- prediction_loss_only: True
- per_device_train_batch_size: 32
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 16
- eval_accumulation_steps: None
- learning_rate: 2e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: cosine
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.1
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: True
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: True
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: True
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: True
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch_fused
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- batch_sampler: no_duplicates
- multi_dataset_batch_sampler: proportional
Training Logs
| Epoch | Step | dim_128_cosine_map@100 | dim_256_cosine_map@100 | dim_384_cosine_map@100 | dim_64_cosine_map@100 |
| --- | --- | --- | --- | --- | --- |
| 0.6667 | 1 | 0.3800 | 0.3986 | 0.4149 | 0.3471 |
| 2.0 | 3 | 0.4194 | 0.4473 | 0.4557 | 0.3762 |
| **2.6667** | **4** | **0.4308** | **0.4503** | **0.4579** | **0.3906** |
- The bold row denotes the saved checkpoint.
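One way to read the final logged row is as the share of full-dimension quality that each truncated size keeps. The short calculation below uses only the cosine_map@100 values from the last row of the table; the variable names are illustrative.

```python
# cosine_map@100 at the last logged step (epoch 2.6667, step 4), by dimension.
final_map_at_100 = {384: 0.4579, 256: 0.4503, 128: 0.4308, 64: 0.3906}

full_score = final_map_at_100[384]

# Fraction of the 384-dimensional score retained at each dimension.
retention = {dim: score / full_score for dim, score in final_map_at_100.items()}

for dim, share in sorted(retention.items(), reverse=True):
    print(f"{dim:>3} dims: {share:.1%} of the 384-dim cosine_map@100")
```

On these numbers, 256 dimensions keep about 98% of the full score, 128 dimensions about 94%, and 64 dimensions about 85%, while the vector is up to six times smaller.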
Framework Versions
- Python: 3.10.14
- Sentence Transformers: 3.0.1
- Transformers: 4.41.2
- PyTorch: 2.3.1+cu121
- Accelerate: 0.31.0
- Datasets: 2.19.1
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MatryoshkaLoss
@misc{kusupati2024matryoshka,
title={Matryoshka Representation Learning},
author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
year={2024},
eprint={2205.13147},
archivePrefix={arXiv},
primaryClass={cs.LG}
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}