SentenceTransformer based on sentence-transformers/all-MiniLM-L6-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-MiniLM-L6-v2. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-MiniLM-L6-v2
- Maximum Sequence Length: 256 tokens
- Output Dimensionality: 384 dimensions
- Similarity Function: Cosine Similarity
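These figures can be verified directly on the loaded model; a minimal sketch:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Hgkang00/FT-label-consent-10")
print(model.max_seq_length)                      # 256
print(model.get_sentence_embedding_dimension())  # 384
```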
Model Sources
- Documentation: [Sentence Transformers Documentation](https://sbert.net)
- Repository: [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- Hugging Face: [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)
Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
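Because the final `Normalize()` module projects every embedding onto the unit sphere, dot product and cosine similarity coincide for this model, which is why the `pearson_dot` and `pearson_cosine` metrics below match. A quick sanity check:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Hgkang00/FT-label-consent-10")
emb = model.encode(["a short test sentence"])

# Normalize() makes embeddings unit-length, so <a, b> == cos(a, b)
print(np.linalg.norm(emb[0]))  # ~1.0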
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Hgkang00/FT-label-consent-10")
# Run inference
sentences = [
    'I engage in risky behaviors like reckless driving or reckless sexual encounters.',
    'Symptoms during a manic episode include inflated self-esteem or grandiosity, increased goal-directed activity, or excessive involvement in risky activities.',
    'Marked decrease in functioning in areas like work, interpersonal relations, or self-care since the onset of the disturbance.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```
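Continuing the example, the similarity matrix can be used to rank the other sentences against the first one, retrieval-style; a small sketch using the `similarities` tensor computed above:

```python
# Rank all sentences against the first by cosine similarity
scores = similarities[0]
for idx in scores.argsort(descending=True):
    print(f"{scores[idx]:.4f}  {sentences[idx]}")
```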
Evaluation
Metrics
Semantic Similarity
- Dataset: `FT_label`
- Evaluated with `EmbeddingSimilarityEvaluator`
Metric | Value |
---|---|
pearson_cosine | 0.4057 |
spearman_cosine | 0.4158 |
pearson_manhattan | 0.4294 |
spearman_manhattan | 0.4164 |
pearson_euclidean | 0.4293 |
spearman_euclidean | 0.4158 |
pearson_dot | 0.4057 |
spearman_dot | 0.4158 |
pearson_max | 0.4294 |
spearman_max | 0.4164 |
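A sketch of how such an evaluation can be reproduced with `EmbeddingSimilarityEvaluator`, assuming hypothetical lists `eval_sentences1`, `eval_sentences2`, and `eval_scores` holding the FT_label pairs and their gold scores:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("Hgkang00/FT-label-consent-10")

# eval_sentences1 / eval_sentences2 / eval_scores are hypothetical
# lists of paired sentences and gold similarity scores in [0, 1].
evaluator = EmbeddingSimilarityEvaluator(
    sentences1=eval_sentences1,
    sentences2=eval_sentences2,
    scores=eval_scores,
    name="FT_label",
)
print(evaluator(model))  # dict of Pearson/Spearman correlations
```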
Training Details
Training Dataset
Unnamed Dataset
- Size: 33,800 training samples
- Columns: `sentence1`, `sentence2`, and `score`
- Approximate statistics based on the first 1000 samples:

| | sentence1 | sentence2 | score |
|---|---|---|---|
| type | string | string | float |
| details | min: 29 tokens, mean: 29.0 tokens, max: 29 tokens | min: 14 tokens, mean: 25.15 tokens, max: 43 tokens | min: 0.0, mean: 0.06, max: 1.0 |
- Samples:

| sentence1 | sentence2 | score |
|---|---|---|
| Presence of delusions, hallucinations or disorganized speech, for a significant portion of time within a 1-month period | I often hear voices telling me things that are not real, even when I'm alone in my room. | 1.0 |
| Presence of delusions, hallucinations or disorganized speech, for a significant portion of time within a 1-month period | I have strong beliefs that people are plotting against me and trying to harm me, which makes it hard for me to trust anyone. | 1.0 |
| Presence of delusions, hallucinations or disorganized speech, for a significant portion of time within a 1-month period | Sometimes, I see things that others around me don't see, like strange figures or objects. | 1.0 |
- Loss: `CoSENTLoss` with these parameters: `{ "scale": 20.0, "similarity_fct": "pairwise_cos_sim" }`
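In the Sentence Transformers library, this loss can be constructed as below; `scale=20.0` with pairwise cosine similarity mirrors the parameters above (and matches the library defaults). A minimal sketch:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import CoSENTLoss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
# scale=20.0 with pairwise cosine similarity, matching the parameters above
loss = CoSENTLoss(model, scale=20.0)
```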
Evaluation Dataset
Unnamed Dataset
- Size: 4,225 evaluation samples
- Columns: `sentence1`, `sentence2`, and `score`
- Approximate statistics based on the first 1000 samples:

| | sentence1 | sentence2 | score |
|---|---|---|---|
| type | string | string | float |
| details | min: 18 tokens, mean: 31.8 tokens, max: 60 tokens | min: 15 tokens, mean: 24.59 tokens, max: 41 tokens | min: 0.0, mean: 0.06, max: 1.0 |
- Samples:

| sentence1 | sentence2 | score |
|---|---|---|
| Presence of delusions, hallucinations or disorganized speech, for a significant portion of time within a 1-month period | People around me have noticed that my behavior is becoming more erratic and unpredictable. | 1.0 |
| Presence of delusions, hallucinations or disorganized speech, for a significant portion of time within a 1-month period | There are times when I repeat certain actions or words without any clear purpose, almost like being stuck in a loop. | 0.0 |
| Presence of delusions, hallucinations or disorganized speech, for a significant portion of time within a 1-month period | I feel detached from reality at times and have trouble distinguishing between what is real and what is not. | 0.0 |
- Loss: `CoSENTLoss` with these parameters: `{ "scale": 20.0, "similarity_fct": "pairwise_cos_sim" }`
Training Hyperparameters
Non-Default Hyperparameters
- `eval_strategy`: epoch
- `per_device_train_batch_size`: 256
- `per_device_eval_batch_size`: 128
- `num_train_epochs`: 10
- `warmup_ratio`: 0.1
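A sketch of how these values map onto `SentenceTransformerTrainingArguments` (the `output_dir` path is hypothetical):

```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="output/FT-label-consent-10",  # hypothetical path
    eval_strategy="epoch",
    per_device_train_batch_size=256,
    per_device_eval_batch_size=128,
    num_train_epochs=10,
    warmup_ratio=0.1,
)
```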
All Hyperparameters
- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 256
- `per_device_eval_batch_size`: 128
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 1
- `eval_accumulation_steps`: None
- `learning_rate`: 5e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 10
- `max_steps`: -1
- `lr_scheduler_type`: linear
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: False
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: False
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `batch_sampler`: batch_sampler
- `multi_dataset_batch_sampler`: proportional
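Putting the pieces together, a hedged sketch of the full training loop under these settings, assuming `train_dataset` and `eval_dataset` are 🤗 Datasets with the `sentence1`, `sentence2`, and `score` columns described above, and `args` is the arguments object sketched earlier:

```python
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import CoSENTLoss

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
loss = CoSENTLoss(model)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,                    # SentenceTransformerTrainingArguments from above
    train_dataset=train_dataset,  # assumed datasets.Dataset objects with the
    eval_dataset=eval_dataset,    # sentence1 / sentence2 / score columns
    loss=loss,
)
trainer.train()
```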
Training Logs
Epoch | Step | Training Loss | Validation Loss | FT_label_spearman_cosine |
---|---|---|---|---|
0.0377 | 10 | 11.8816 | - | - |
0.0755 | 20 | 12.0633 | - | - |
0.1132 | 30 | 11.2972 | - | - |
0.1509 | 40 | 11.4435 | - | - |
0.1887 | 50 | 10.9872 | - | - |
0.2264 | 60 | 10.3121 | - | - |
0.2642 | 70 | 10.0711 | - | - |
0.3019 | 80 | 9.6888 | - | - |
0.3396 | 90 | 9.2037 | - | - |
0.3774 | 100 | 8.6158 | - | - |
0.4151 | 110 | 8.4605 | - | - |
0.4528 | 120 | 8.202 | - | - |
0.4906 | 130 | 7.9642 | - | - |
0.5283 | 140 | 7.8384 | - | - |
0.5660 | 150 | 7.8803 | - | - |
0.6038 | 160 | 7.419 | - | - |
1.0 | 133 | 8.435 | 8.1138 | 0.3813 |
2.0 | 266 | 7.7886 | 8.2494 | 0.4003 |
3.0 | 399 | 7.164 | 8.7060 | 0.4048 |
4.0 | 532 | 6.5921 | 9.5854 | 0.3882 |
5.0 | 665 | 6.2349 | 10.5716 | 0.4042 |
6.0 | 798 | 5.7831 | 10.9500 | 0.4147 |
7.0 | 931 | 5.4894 | 11.6387 | 0.4120 |
8.0 | 1064 | 5.2348 | 12.2129 | 0.4113 |
9.0 | 1197 | 5.0118 | 12.4632 | 0.4099 |
10.0 | 1330 | 4.8566 | 12.7203 | 0.4158 |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.0
- Transformers: 4.41.1
- PyTorch: 2.3.0+cu121
- Accelerate: 0.30.1
- Datasets: 2.19.1
- Tokenizers: 0.19.1
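To approximate this environment, the listed versions can be pinned explicitly (a sketch; the PyTorch CUDA build depends on your platform):

```bash
pip install sentence-transformers==3.0.0 transformers==4.41.1 accelerate==0.30.1 datasets==2.19.1
```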
Citation
BibTeX
Sentence Transformers
```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```
CoSENTLoss
```bibtex
@online{kexuefm-8847,
    title={CoSENT: A more efficient sentence vector scheme than Sentence-BERT},
    author={Su Jianlin},
    year={2022},
    month={Jan},
    url={https://kexue.fm/archives/8847},
}
```