SentenceTransformer based on sentence-transformers/distilbert-base-nli-mean-tokens
This is a sentence-transformers model finetuned from sentence-transformers/distilbert-base-nli-mean-tokens. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/distilbert-base-nli-mean-tokens
- Maximum Sequence Length: 128 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
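As an illustration of the similarity function listed above, here is a minimal sketch of cosine similarity in NumPy. The random vectors are placeholders standing in for real 768-dimensional embeddings, and the helper name `cosine_similarity` is chosen for this example rather than taken from the library.

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine of the angle between two vectors: dot(u, v) / (|u| * |v|)."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Placeholder 768-dimensional vectors (assumption: not real model output).
rng = np.random.default_rng(0)
a = rng.normal(size=768)
b = rng.normal(size=768)

print(cosine_similarity(a, b))      # some value in [-1, 1]
print(cosine_similarity(a, 2 * a))  # scaling a vector leaves the score at ~1.0
```

Because the score depends only on direction, embeddings do not need to be length-normalized before comparison.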
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: DistilBertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
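The Pooling module above has only `pooling_mode_mean_tokens` enabled, so the sentence embedding is the average of the token embeddings. The sketch below shows the idea in NumPy under the usual assumption that padding positions are excluded through an attention mask. It uses a toy hidden size of 2 instead of 768, and the function name `mean_pool` is hypothetical.

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sentence, ignoring padding positions.

    token_embeddings: (batch, seq_len, hidden)
    attention_mask:   (batch, seq_len), 1 for real tokens, 0 for padding
    """
    mask = attention_mask[:, :, None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    counts = np.clip(mask.sum(axis=1), 1e-9, None)  # guard against division by zero
    return summed / counts

# One sentence with three positions; the last position is padding.
tokens = np.array([[[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]])
mask = np.array([[1, 1, 0]])
print(mean_pool(tokens, mask))  # [[2. 3.]]
```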
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("DivyaMereddy007/RecipeBert_v5original_epoc50_Copy_of_TrainSetenceTransforme-Finetuning_v5_DistilledBert")
# Run inference
sentences = [
'Rhubarb Coffee Cake ["1 1/2 c. sugar", "1/2 c. butter", "1 egg", "1 c. buttermilk", "2 c. flour", "1/2 tsp. salt", "1 tsp. soda", "1 c. buttermilk", "2 c. rhubarb, finely cut", "1 tsp. vanilla"] ["Cream sugar and butter.", "Add egg and beat well.", "To creamed butter, sugar and egg, add alternately buttermilk with mixture of flour, salt and soda.", "Mix well.", "Add rhubarb and vanilla.", "Pour into greased 9 x 13-inch pan and add Topping."]',
'Prize-Winning Meat Loaf ["1 1/2 lb. ground beef", "1 c. tomato juice", "3/4 c. oats (uncooked)", "1 egg, beaten", "1/4 c. chopped onion", "1/4 tsp. pepper", "1 1/2 tsp. salt"] ["Mix well.", "Press firmly into an 8 1/2 x 4 1/2 x 2 1/2-inch loaf pan.", "Bake in preheated moderate oven.", "Bake at 350\\u00b0 for 1 hour.", "Let stand 5 minutes before slicing.", "Makes 8 servings."]',
'Angel Biscuits ["5 c. flour", "3 Tbsp. sugar", "4 tsp. baking powder", "1 1/2 pkg. dry yeast", "2 c. buttermilk", "1 tsp. soda", "1 1/2 sticks margarine", "1/2 c. warm water"] ["Mix flour, sugar, baking powder, soda and salt together.", "Cut in margarine, dissolve yeast in warm water.", "Stir into buttermilk and add to dry mixture.", "Cover and chill."]',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
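One way to use the similarity matrix is to look up, for a given recipe, the most similar other recipe. The sketch below assumes the matrix has been converted to a NumPy array (for example with `similarities.numpy()`); the toy values are made up for illustration and are not real model output, and `most_similar` is a hypothetical helper.

```python
import numpy as np

def most_similar(similarities: np.ndarray, query_index: int) -> int:
    """Return the index of the highest-scoring item other than the query itself."""
    scores = similarities[query_index].astype(float).copy()
    scores[query_index] = -np.inf  # exclude the trivial self-match
    return int(np.argmax(scores))

# Made-up 3 x 3 similarity matrix (assumption: illustrative values only).
toy = np.array([
    [1.0, 0.2, 0.7],
    [0.2, 1.0, 0.3],
    [0.7, 0.3, 1.0],
])
print(most_similar(toy, 0))  # 2
```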
Training Details
Training Dataset
Unnamed Dataset
- Size: 1,746 training samples
- Columns: sentence_0, sentence_1, and label
- Approximate statistics based on the first 1000 samples:
field | sentence_0 | sentence_1 | label |
---|---|---|---|
type | string | string | float |
details | min: 63 tokens; mean: 119.05 tokens; max: 128 tokens | min: 63 tokens; mean: 118.49 tokens; max: 128 tokens | min: 0.0; mean: 0.19; max: 1.0 |
- Samples:
sentence_0 | sentence_1 | label |
---|---|---|
Strawberry Whatever ["1 lb. frozen strawberries in juice", "1 small can crushed pineapple", "3 ripe bananas", "1 c. chopped pecans", "1 large pkg. strawberry Jell-O", "1 1/2 c. boiling water", "1 pt. sour cream"] ["Mix Jell-O in boiling water.", "Add strawberries, pineapple, crushed bananas and nuts.", "Spread 1/2 mixture in 13 x 6 1/2-inch pan.", "Allow to gel in freezer 30 minutes.", "Add layer of sour cream, then remaining mixture on top.", "Gel and serve."] | One Hour Rolls ["1 c. milk", "2 Tbsp. sugar", "1 pkg. dry yeast", "1 Tbsp. salt", "3 Tbsp. Crisco oil", "2 c. plain flour"] ["Put flour into a large mixing bowl.", "Combine sugar, milk, salt and oil in a saucepan and heat to boiling; remove from heat and let cool to lukewarm.", "Add yeast and mix well.", "Pour into flour and stir.", "Batter will be sticky.", "Roll out batter on a floured board and cut with biscuit cutter.", "Lightly brush tops with melted oleo and fold over.", "Place rolls on a cookie sheet, put in a warm place and let rise for 1 hour.", "Bake at 350\u00b0 for about 20 minutes. Yield: 2 1/2 dozen."] | 0.1 |
Broccoli Dip For Crackers ["16 oz. sour cream", "1 pkg. dry vegetable soup mix", "10 oz. pkg. frozen chopped broccoli, thawed and drained", "4 to 6 oz. Cheddar cheese, grated"] ["Mix together sour cream, soup mix, broccoli and half of cheese.", "Sprinkle remaining cheese on top.", "Bake at 350\u00b0 for 30 minutes, uncovered.", "Serve hot with vegetable crackers."] | Vegetable-Burger Soup ["1/2 lb. ground beef", "2 c. water", "1 tsp. sugar", "1 pkg. Cup-a-Soup onion soup mix (dry)", "1 lb. can stewed tomatoes", "1 (8 oz.) can tomato sauce", "1 (10 oz.) pkg. frozen mixed vegetables"] ["Lightly brown beef in soup pot.", "Drain off excess fat.", "Stir in tomatoes, tomato sauce, water, frozen vegetables, soup mix and sugar.", "Bring to a boil.", "Reduce heat and simmer for 20 minutes. Serve."] | 0.4 |
Summer Spaghetti ["1 lb. very thin spaghetti", "1/2 bottle McCormick Salad Supreme (seasoning)", "1 bottle Zesty Italian dressing"] ["Prepare spaghetti per package.", "Drain.", "Melt a little butter through it.", "Marinate overnight in Salad Supreme and Zesty Italian dressing.", "Just before serving, add cucumbers, tomatoes, green peppers, mushrooms, olives or whatever your taste may want."] | Chicken Funny ["1 large whole chicken", "2 (10 1/2 oz.) cans chicken gravy", "1 (10 1/2 oz.) can cream of mushroom soup", "1 (6 oz.) box Stove Top stuffing", "4 oz. shredded cheese"] ["Boil and debone chicken.", "Put bite size pieces in average size square casserole dish.", "Pour gravy and cream of mushroom soup over chicken; level.", "Make stuffing according to instructions on box (do not make too moist).", "Put stuffing on top of chicken and gravy; level.", "Sprinkle shredded cheese on top and bake at 350\u00b0 for approximately 20 minutes or until golden and bubbly."] | 0.3 |
- Loss: CosineSimilarityLoss with these parameters: { "loss_fct": "torch.nn.modules.loss.MSELoss" }
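To make the objective concrete: with an MSE loss function, CosineSimilarityLoss compares the cosine similarity of each embedding pair against its float label and penalizes the squared difference. The following is a NumPy sketch of that computation on toy 2-dimensional vectors, not the library's implementation; the function name `cosine_similarity_mse` is hypothetical.

```python
import numpy as np

def cosine_similarity_mse(u: np.ndarray, v: np.ndarray, labels: np.ndarray) -> float:
    """Mean squared error between per-pair cosine similarity and the target label."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    labels = np.asarray(labels, dtype=float)
    cos = (u * v).sum(axis=1) / (np.linalg.norm(u, axis=1) * np.linalg.norm(v, axis=1))
    return float(np.mean((cos - labels) ** 2))

# Two toy pairs: the first is identical (cosine 1), the second orthogonal (cosine 0).
u = np.array([[1.0, 0.0], [1.0, 0.0]])
v = np.array([[1.0, 0.0], [0.0, 1.0]])

print(cosine_similarity_mse(u, v, np.array([1.0, 0.0])))  # 0.0
print(cosine_similarity_mse(u, v, np.array([0.5, 0.5])))  # 0.25
```

During training, the labels play the role of the `label` column described above, whose values range from 0.0 to 1.0.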
Training Hyperparameters
Non-Default Hyperparameters
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 50
- multi_dataset_batch_sampler: round_robin
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: no
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 50
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: False
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
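The combination of `lr_scheduler_type: linear`, `learning_rate: 5e-05`, and `warmup_steps: 0` implies a learning rate that decays linearly from its initial value to zero over the run. The sketch below illustrates that shape. The total of 5500 steps is taken from the final row of the training logs, and the function `linear_lr` is a hypothetical helper, not the scheduler code used in training.

```python
def linear_lr(step: int, base_lr: float = 5e-05, total_steps: int = 5500,
              warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * (step / max(1, warmup_steps))
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))

print(linear_lr(0))     # 5e-05
print(linear_lr(2750))  # 2.5e-05
print(linear_lr(5500))  # 0.0
```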
Training Logs
Epoch | Step | Training Loss |
---|---|---|
4.5455 | 500 | 0.0594 |
9.0909 | 1000 | 0.0099 |
13.6364 | 1500 | 0.0085 |
18.1818 | 2000 | 0.0077 |
22.7273 | 2500 | 0.0074 |
27.2727 | 3000 | 0.0071 |
31.8182 | 3500 | 0.0068 |
36.3636 | 4000 | 0.0066 |
40.9091 | 4500 | 0.0063 |
45.4545 | 5000 | 0.006 |
50.0 | 5500 | 0.0057 |
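The Epoch and Step columns are consistent with the dataset size and batch size reported above. The short calculation below reproduces them, assuming a single device, `gradient_accumulation_steps: 1`, and a final partial batch that is kept (`dataloader_drop_last: False`).

```python
import math

train_samples = 1746   # training set size
batch_size = 16        # per_device_train_batch_size
epochs = 50            # num_train_epochs

# The last, smaller batch still counts as a step, hence the ceiling.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs

print(steps_per_epoch)                  # 110
print(total_steps)                      # 5500
print(round(500 / steps_per_epoch, 4))  # 4.5455, the epoch logged at step 500
```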
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.0.1
- Transformers: 4.41.2
- PyTorch: 2.3.0+cu121
- Accelerate: 0.31.0
- Datasets: 2.19.2
- Tokenizers: 0.19.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}