SetFit with sentence-transformers/paraphrase-mpnet-base-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-mpnet-base-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model was trained with an efficient few-shot learning technique that involves two steps:

1. Fine-tuning a Sentence Transformer with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.
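
The contrastive step works on pairs of training sentences rather than on single sentences. As a minimal sketch of that idea (not the SetFit library's actual sampler), the hypothetical helper `make_contrastive_pairs` below labels a pair 1.0 when both sentences share a class and 0.0 otherwise. The example queries come from the label table in this card.

```python
from itertools import combinations


def make_contrastive_pairs(texts, labels):
    """Build (text_a, text_b, target) triples from labeled sentences.

    target is 1.0 when both sentences share a label and 0.0 otherwise.
    Hypothetical helper for illustration; SetFit's own sampler differs.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# Four example queries from two of the five classes.
texts = [
    "What is the price of the organic honey?",
    "Variety of cookie boxes",
    "Do you offer any bulk discounts on organic honey?",
    "Are the big plum cake boxes available in packs of 30?",
]
labels = [
    "product discoverability",
    "product discoverability",
    "product faq",
    "product faq",
]

pairs = make_contrastive_pairs(texts, labels)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Even a handful of labeled sentences yields many pairs, which is why this approach suits few-shot settings.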

Model Details

Model Description

Model Type: SetFit
Sentence Transformer body: sentence-transformers/paraphrase-mpnet-base-v2
Classification head: a LogisticRegression instance
Maximum Sequence Length: 512 tokens
Number of Classes: 5

Model Sources

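Repository: SetFit on GitHub (https://github.com/huggingface/setfit)
Paper: Efficient Few-Shot Learning Without Prompts (https://arxiv.org/abs/2209.11055)
Blogpost: SetFit: Efficient Few-Shot Learning Without Prompts (https://huggingface.co/blog/setfit)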

Model Labels

Label	Examples
order tracking	'What is the delivery status for my order placed using phone number 123456789?' 'I ordered the Cake Decorating Kit 4 days ago, can you provide the tracking information?' 'I ordered the Cake Stands 2 days ago with order no 54321 how long will it take to deliver?'
general faq	'How do the traditional hand-woven Banarasi sarees from HKV Benaras differ from those made by machine-driven industries?' 'What are the key factors to consider when developing a personalized diet plan for weight loss?' "Are there any scientific studies that support Green Tea's role in preventing Alzheimer's and Parkinson's diseases?"
product policy	'How do you use the information collected through tracking tools like Google Analytics and cookies?' 'How does bakeyy handle returns for items that were purchased with a thank you discount?' 'What is the procedure for returning a product that was part of a special occasion promotion?'
product discoverability	'What is the price of the organic honey?' 'Variety of cookie boxes' 'what apparells do you have from Drew House'
product faq	'What is the price of the bestseller honey?' 'Do you offer any bulk discounts on organic honey?' 'Are the big plum cake boxes available in packs of 30?'

Evaluation

Metrics

Label	Accuracy
all	0.84
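
For reference, accuracy is the fraction of evaluation examples whose predicted label equals the true label. The sketch below shows the computation on made-up predictions. The label lists are hypothetical and are not the evaluation set behind the 0.84 figure.

```python
def accuracy(predicted, expected):
    """Return the fraction of positions where predicted equals expected."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    if not expected:
        raise ValueError("cannot compute accuracy on an empty set")
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


# Hypothetical labels for illustration only.
expected = ["order tracking", "product faq", "product policy", "general faq", "product faq"]
predicted = ["order tracking", "product faq", "product policy", "general faq", "product discoverability"]

score = accuracy(predicted, expected)
print(f"{score:.2f}")
```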

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Shankhdhar/classifier_woog_firstbud_updated")
# Run inference
preds = model("cookie boxes with dividers")
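
To classify several queries at once, one option is to pass a list to the model's `predict` method and group the queries by predicted label. The sketch below assumes only that the model object has a `predict` method that accepts a list of strings and returns one label per string, as `SetFitModel` does in SetFit 1.0. `StubModel` and the second query are hypothetical stand-ins so the example runs without downloading the model.

```python
from collections import defaultdict


def group_by_predicted_label(model, queries):
    """Classify all queries in one call and group them by predicted label."""
    predictions = model.predict(queries)
    grouped = defaultdict(list)
    for query, label in zip(queries, predictions):
        # str() covers label types that are not plain Python strings.
        grouped[str(label)].append(query)
    return dict(grouped)


class StubModel:
    """Hypothetical stand-in for a loaded SetFitModel; not part of SetFit."""

    def predict(self, queries):
        return [
            "order tracking" if "order" in q.lower() else "product discoverability"
            for q in queries
        ]


grouped = group_by_predicted_label(
    StubModel(),
    ["cookie boxes with dividers", "Where is my order 54321?"],
)
print(grouped)
```

With the real model, replace `StubModel()` with the object returned by `SetFitModel.from_pretrained`.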

Training Details

Training Set Metrics

Training set	Min	Median	Max
Word count	4	11.9760	28

Label	Training Sample Count
general faq	24
order tracking	34
product discoverability	50
product faq	50
product policy	50

Training Hyperparameters

batch_size: (16, 16)
num_epochs: (2, 2)
max_steps: -1
sampling_strategy: oversampling
body_learning_rate: (2e-05, 1e-05)
head_learning_rate: 0.01
loss: CosineSimilarityLoss
distance_metric: cosine_distance
margin: 0.25
end_to_end: False
use_amp: False
warmup_proportion: 0.1
seed: 42
eval_max_steps: -1
load_best_model_at_end: True
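
The `CosineSimilarityLoss` listed above compares the cosine similarity of two sentence embeddings with a target value and penalizes the squared difference. The following is a minimal pure-Python sketch of that computation on toy 2-dimensional vectors. The vectors and targets are made up for illustration, and the real loss operates on batched tensors inside Sentence Transformers.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between cosine similarity and target.

    pairs is a list of (embedding_a, embedding_b, target) triples.
    """
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


toy_pairs = [
    ([1.0, 0.0], [1.0, 0.0], 1.0),  # identical vectors, same class: error 0
    ([1.0, 0.0], [0.0, 1.0], 0.0),  # orthogonal vectors, different class: error 0
    ([1.0, 0.0], [1.0, 0.0], 0.0),  # identical vectors, different class: error 1
]
loss = cosine_similarity_loss(toy_pairs)
print(round(loss, 4))
```

Minimizing this loss pulls same-class sentences together in embedding space and pushes different-class sentences apart.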

Training Results

Epoch	Step	Training Loss	Validation Loss
0.0005	1	0.2048	-
0.0235	50	0.2874	-
0.0470	100	0.126	-
0.0705	150	0.0388	-
0.0940	200	0.0786	-
0.1175	250	0.0049	-
0.1410	300	0.0048	-
0.1646	350	0.0018	-
0.1881	400	0.0011	-
0.2116	450	0.0004	-
0.2351	500	0.0006	-
0.2586	550	0.0005	-
0.2821	600	0.0012	-
0.3056	650	0.0004	-
0.3291	700	0.0003	-
0.3526	750	0.0002	-
0.3761	800	0.0002	-
0.3996	850	0.0002	-
0.4231	900	0.0002	-
0.4466	950	0.0008	-
0.4701	1000	0.0002	-
0.4937	1050	0.0003	-
0.5172	1100	0.0001	-
0.5407	1150	0.0002	-
0.5642	1200	0.0001	-
0.5877	1250	0.0001	-
0.6112	1300	0.0001	-
0.6347	1350	0.0004	-
0.6582	1400	0.0002	-
0.6817	1450	0.0001	-
0.7052	1500	0.0002	-
0.7287	1550	0.0001	-
0.7522	1600	0.0001	-
0.7757	1650	0.0001	-
0.7992	1700	0.0001	-
0.8228	1750	0.0001	-
0.8463	1800	0.0001	-
0.8698	1850	0.0001	-
0.8933	1900	0.0001	-
0.9168	1950	0.0001	-
0.9403	2000	0.0001	-
0.9638	2050	0.0001	-
0.9873	2100	0.0002	-
1.0108	2150	0.0001	-
1.0343	2200	0.0001	-
1.0578	2250	0.0001	-
1.0813	2300	0.0001	-
1.1048	2350	0.0001	-
1.1283	2400	0.0	-
1.1519	2450	0.0001	-
1.1754	2500	0.0	-
1.1989	2550	0.0001	-
1.2224	2600	0.0007	-
1.2459	2650	0.0001	-
1.2694	2700	0.0001	-
1.2929	2750	0.0001	-
1.3164	2800	0.0001	-
1.3399	2850	0.0001	-
1.3634	2900	0.0001	-
1.3869	2950	0.0001	-
1.4104	3000	0.0001	-
1.4339	3050	0.0001	-
1.4575	3100	0.0001	-
1.4810	3150	0.0001	-
1.5045	3200	0.0001	-
1.5280	3250	0.0001	-
1.5515	3300	0.0001	-
1.5750	3350	0.0001	-
1.5985	3400	0.0001	-
1.6220	3450	0.0001	-
1.6455	3500	0.0001	-
1.6690	3550	0.0001	-
1.6925	3600	0.0001	-
1.7160	3650	0.0	-
1.7395	3700	0.0001	-
1.7630	3750	0.0001	-
1.7866	3800	0.0	-
1.8101	3850	0.0001	-
1.8336	3900	0.0001	-
1.8571	3950	0.0	-
1.8806	4000	0.0001	-
1.9041	4050	0.0001	-
1.9276	4100	0.0001	-
1.9511	4150	0.0001	-
1.9746	4200	0.0001	-
1.9981	4250	0.0001	-
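
The relation between the Epoch and Step columns can be checked against the training sample counts and batch size reported in this card. The sketch below rests on an assumption about the pair sampler that the card does not state: every unordered pair of training samples is counted, including each sample paired with itself, and oversampling repeats the smaller of the positive and negative pair sets until it matches the larger. Under that assumption the arithmetic agrees with the logged values.

```python
import math

# Training sample counts and hyperparameters as reported in this card.
label_counts = {
    "general faq": 24,
    "order tracking": 34,
    "product discoverability": 50,
    "product faq": 50,
    "product policy": 50,
}
batch_size = 16
num_epochs = 2


def unordered_pairs_with_self(n):
    """Number of unordered pairs among n items, counting (x, x) pairs."""
    return n * (n + 1) // 2


total_samples = sum(label_counts.values())
# Assumption: positives are same-label pairs, negatives are all other pairs.
positive_pairs = sum(unordered_pairs_with_self(n) for n in label_counts.values())
negative_pairs = unordered_pairs_with_self(total_samples) - positive_pairs
# Assumption: oversampling balances both sets at the size of the larger one.
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)

steps_per_epoch = math.ceil(pairs_per_epoch / batch_size)
total_steps = steps_per_epoch * num_epochs

print(steps_per_epoch, total_steps)
print(round(50 / steps_per_epoch, 4), round(4250 / steps_per_epoch, 4))
```

The two rounded ratios correspond to the Epoch values logged at steps 50 and 4250.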

Framework Versions

Python: 3.10.13
SetFit: 1.0.3
Sentence Transformers: 3.0.1
Transformers: 4.39.0
PyTorch: 2.2.2+cu121
Datasets: 2.19.2
Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
