SetFit with Qwen/Qwen3-Embedding-0.6B
This is a SetFit model that can be used for Text Classification. It uses Qwen/Qwen3-Embedding-0.6B as the Sentence Transformer embedding model, with a LogisticRegression instance as the classification head.
The model has been trained using an efficient few-shot learning technique that involves:
- Fine-tuning a Sentence Transformer with contrastive learning.
- Training a classification head with features from the fine-tuned Sentence Transformer.
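Both stages can be reproduced with SetFit's Trainer. The sketch below is only an illustration: the tiny inline dataset is a hypothetical stand-in for the actual few-shot training split, which is not included in this card.

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Hypothetical few-shot split (two examples per label) standing in for the real training data.
train_dataset = Dataset.from_dict({
    "text": [
        "So it will be possible for you to monitise your expertize on an sport market.",
        "i claim that it brings you, your family and friends closer.",
        "It develops a logical thinking and concentration.",
        "There is an opinion that watching sports is time consuming and is not an efficient way to spend one's free time.",
    ],
    "label": ["L", "L", "H", "H"],
})

# Stage 1: contrastively fine-tune the Qwen/Qwen3-Embedding-0.6B body.
# Stage 2: fit the LogisticRegression head on embeddings from the fine-tuned body.
model = SetFitModel.from_pretrained("Qwen/Qwen3-Embedding-0.6B")
args = TrainingArguments(batch_size=16, num_epochs=1)
trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```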
Model Details
Model Description
Model Sources
Model Labels
| Label | Examples |
|:------|:---------|
| L | 'So it will be possible for you to monitise your expertize on an sport market.'<br>'Moreover, observing such occasions is also an excellent wat to liven up your holidays and to get new feelings and knowledge about the body.'<br>'i claim that it brings you, your family and friends closer.' |
| H | "There is an opinion that watching sports is time consuming and is not an efficient way to spend one's free time."<br>'It develops a logical thinking and concentration.'<br>'But in my opinion, watching sports competition can be a good and useful enough way of relax for people who enjoy it.' |
Evaluation
Metrics
| Label | Accuracy |
|:------|:---------|
| all   | 0.7959   |
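The accuracy reported above can be recomputed on any labeled evaluation split. As a rough sketch with hypothetical sentences and gold labels (the actual evaluation data is not part of this card):

```python
from setfit import SetFitModel

# Hypothetical held-out sentences and gold labels, for illustration only.
eval_texts = [
    "It develops a logical thinking and concentration.",
    "i claim that it brings you, your family and friends closer.",
]
eval_labels = ["H", "L"]

model = SetFitModel.from_pretrained("Zlovoblachko/dim2_Qwen_setfit_model")
preds = model.predict(eval_texts)
accuracy = sum(str(p) == y for p, y in zip(preds, eval_labels)) / len(eval_labels)
print(f"accuracy: {accuracy:.4f}")
```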
Uses
Direct Use for Inference
First install the SetFit library:
```bash
pip install setfit
```
Then you can load this model and run inference.
```python
from setfit import SetFitModel

# Download the model from the Hugging Face Hub and run inference; the prediction is one of the labels L or H.
model = SetFitModel.from_pretrained("Zlovoblachko/dim2_Qwen_setfit_model")
preds = model("Watching sports helps people to develop their social life.")
```
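If class probabilities are needed instead of hard labels, SetFit models with a scikit-learn head also expose predict_proba; a minimal sketch:

```python
from setfit import SetFitModel

model = SetFitModel.from_pretrained("Zlovoblachko/dim2_Qwen_setfit_model")
# Probability distribution over the labels (L, H) for each input sentence.
probs = model.predict_proba(["Watching sports helps people to develop their social life."])
print(probs)
```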
Training Details
Training Set Metrics
| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 4   | 18.0633 | 48  |

| Label | Training Sample Count |
|:------|:----------------------|
| L     | 150                   |
| H     | 150                   |
Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- body_learning_rate: (2e-05, 1e-05)
- head_learning_rate: 0.01
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False
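Most of these values correspond to fields on SetFit's TrainingArguments. The sketch below shows one plausible way to express them; it assumes the remaining fields (including the loss and distance metric, whose defaults match the values listed) were left at their defaults.

```python
from setfit import TrainingArguments

args = TrainingArguments(
    batch_size=(16, 16),                # (embedding fine-tuning phase, classifier head phase)
    num_epochs=(1, 1),
    max_steps=-1,
    sampling_strategy="oversampling",
    body_learning_rate=(2e-05, 1e-05),
    head_learning_rate=0.01,
    margin=0.25,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
)
```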
Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|:-------|:-----|:--------------|:----------------|
| 0.0004 | 1    | 0.2694        | -               |
| 0.0177 | 50   | 0.2589        | -               |
| 0.0353 | 100  | 0.2489        | -               |
| 0.0530 | 150  | 0.1486        | -               |
| 0.0706 | 200  | 0.0375        | -               |
| 0.0883 | 250  | 0.0014        | -               |
| 0.1059 | 300  | 0.0           | -               |
| 0.1236 | 350  | 0.0           | -               |
| 0.1412 | 400  | 0.0           | -               |
| 0.1589 | 450  | 0.0           | -               |
| 0.1766 | 500  | 0.0           | -               |
| 0.1942 | 550  | 0.0           | -               |
| 0.2119 | 600  | 0.0           | -               |
| 0.2295 | 650  | 0.0           | -               |
| 0.2472 | 700  | 0.0           | -               |
| 0.2648 | 750  | 0.0           | -               |
| 0.2825 | 800  | 0.0           | -               |
| 0.3001 | 850  | 0.0           | -               |
| 0.3178 | 900  | 0.0           | -               |
| 0.3355 | 950  | 0.0           | -               |
| 0.3531 | 1000 | 0.0           | -               |
| 0.3708 | 1050 | 0.0           | -               |
| 0.3884 | 1100 | 0.0           | -               |
| 0.4061 | 1150 | 0.0           | -               |
| 0.4237 | 1200 | 0.0           | -               |
| 0.4414 | 1250 | 0.0           | -               |
| 0.4590 | 1300 | 0.0           | -               |
| 0.4767 | 1350 | 0.0           | -               |
| 0.4944 | 1400 | 0.0           | -               |
| 0.5120 | 1450 | 0.0           | -               |
| 0.5297 | 1500 | 0.0           | -               |
| 0.5473 | 1550 | 0.0           | -               |
| 0.5650 | 1600 | 0.0           | -               |
| 0.5826 | 1650 | 0.0           | -               |
| 0.6003 | 1700 | 0.0           | -               |
| 0.6179 | 1750 | 0.0           | -               |
| 0.6356 | 1800 | 0.0           | -               |
| 0.6532 | 1850 | 0.0           | -               |
| 0.6709 | 1900 | 0.0           | -               |
| 0.6886 | 1950 | 0.0           | -               |
| 0.7062 | 2000 | 0.0           | -               |
| 0.7239 | 2050 | 0.0           | -               |
| 0.7415 | 2100 | 0.0           | -               |
| 0.7592 | 2150 | 0.0           | -               |
| 0.7768 | 2200 | 0.0           | -               |
| 0.7945 | 2250 | 0.0           | -               |
| 0.8121 | 2300 | 0.0           | -               |
| 0.8298 | 2350 | 0.0           | -               |
| 0.8475 | 2400 | 0.0           | -               |
| 0.8651 | 2450 | 0.0           | -               |
| 0.8828 | 2500 | 0.0           | -               |
| 0.9004 | 2550 | 0.0           | -               |
| 0.9181 | 2600 | 0.0           | -               |
| 0.9357 | 2650 | 0.0           | -               |
| 0.9534 | 2700 | 0.0           | -               |
| 0.9710 | 2750 | 0.0           | -               |
| 0.9887 | 2800 | 0.0           | -               |
Framework Versions
- Python: 3.11.13
- SetFit: 1.1.3
- Sentence Transformers: 5.0.0
- Transformers: 4.55.0
- PyTorch: 2.6.0+cu124
- Datasets: 4.0.0
- Tokenizers: 0.21.4
Citation
BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```