
SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a SetFit model for text classification. It uses sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 as the Sentence Transformer embedding model and a LogisticRegression instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer.
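Step 1 relies on turning a handful of labeled texts into many training pairs. The following is a minimal sketch of that idea, not the SetFit implementation: the function name make_contrastive_pairs is hypothetical, and the sample texts are a few of the label examples listed in this card. Pairs that share a label get a target of 1.0, pairs with different labels get 0.0.

```python
from itertools import combinations


def make_contrastive_pairs(samples):
    """Build (text_a, text_b, target) triples from (text, label) samples.

    Pairs sharing a label get target 1.0; pairs with different labels get 0.0.
    """
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(samples, 2):
        target = 1.0 if label_a == label_b else 0.0
        pairs.append((text_a, text_b, target))
    return pairs


# A few examples taken from this card's label examples
samples = [
    ("Hello buddy", "greet-hi"),
    ("Salut", "greet-hi"),
    ("See you later", "greet-good_bye"),
    ("Real Madrid fixtures", "matches-team_next_match"),
]

pairs = make_contrastive_pairs(samples)
positives = [p for p in pairs if p[2] == 1.0]
negatives = [p for p in pairs if p[2] == 0.0]
print(len(pairs), len(positives), len(negatives))
```

Four samples already yield six pairs, which is why contrastive fine-tuning can work with very small labeled sets. In step 2, the fine-tuned embeddings of the original texts (not the pairs) are the features for the classification head.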

Model Details

Model Description

Model Sources

Model Labels

Label Examples
matches-match_time
  • 'Norwich City vs Newcastle United'
  • 'will Manchester United play with chelsea'
  • 'est-ce que Manchester United jouera avec chelsea'
matches-match_result
  • 'Liverpool and West Ham result'
  • 'what is the score of Wolverhampton match'
  • 'who won in Liverpool vs Newcastle United match'
greet-who_are_you
  • 'how can you help me'
  • "pourquoi j'ai besoin de toi"
  • 'je ne te comprends pas'
matches-team_next_match
  • 'Real Madrid fixtures'
  • 'quels sont les prochains matchs de Borussia Dortmund'
  • 'próximos partidos de Atletico Madrid'
greet-good_bye
  • 'See you later'
  • 'A plus tard'
  • 'stop'
greet-hi
  • 'Hello buddy'
  • 'Salut'
  • 'Hey'
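
The label names appear to follow a group-intent pattern (for example, matches-match_time). Assuming the first hyphen separates the group from the intent, which this card does not state explicitly, a prediction can be split into its two parts as sketched below. The helper split_label is hypothetical and not part of SetFit.

```python
def split_label(label):
    """Split a 'group-intent' label such as 'matches-match_time' at the first hyphen."""
    group, _, intent = label.partition("-")
    return group, intent


# The six labels listed in this card
labels = [
    "matches-match_time",
    "matches-match_result",
    "matches-team_next_match",
    "greet-who_are_you",
    "greet-good_bye",
    "greet-hi",
]

# Group the intents under their (assumed) top-level group
by_group = {}
for label in labels:
    group, intent = split_label(label)
    by_group.setdefault(group, []).append(intent)

print(by_group)
```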

Evaluation

Metrics

Label Accuracy
all 1.0

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

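# Download from the 🤗 Hub ("setfit_model_id" is a placeholder for this model's repository id)
model = SetFitModel.from_pretrained("setfit_model_id")
# Run inference on a single text; the call returns the prediction for that text
preds = model("au revoir")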

Training Details

Training Set Metrics

Training set   Min   Median   Max
Word count     1     5.2      10

Label                     Training Sample Count
greet-hi                  5
greet-who_are_you         7
greet-good_bye            5
matches-team_next_match   21
matches-match_time        12
matches-match_result      15
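
The sample counts above can be related to the number of training steps reported under Training Results. The sketch below assumes that every unordered pair of training samples is a candidate pair, and that the oversampling strategy repeats the smaller of the positive and negative sets until it matches the larger one. Both are assumptions about the pair sampler, not statements from this card. The batch size of 4 comes from the hyperparameters.

```python
from math import comb

# Training sample counts per label, copied from the table above
label_counts = {
    "greet-hi": 5,
    "greet-who_are_you": 7,
    "greet-good_bye": 5,
    "matches-team_next_match": 21,
    "matches-match_time": 12,
    "matches-match_result": 15,
}
batch_size = 4  # embedding-phase batch size from the hyperparameters

total_samples = sum(label_counts.values())
all_pairs = comb(total_samples, 2)
positive_pairs = sum(comb(n, 2) for n in label_counts.values())  # same-label pairs
negative_pairs = all_pairs - positive_pairs                       # different-label pairs

# Assumption: oversampling brings the smaller set up to the size of the larger one
pairs_per_epoch = 2 * max(positive_pairs, negative_pairs)
steps_per_epoch = pairs_per_epoch // batch_size

print(total_samples, positive_pairs, negative_pairs, steps_per_epoch)
```

Under these assumptions the 65 samples give 422 positive and 1658 negative pairs, hence 3316 pairs and 829 steps per epoch. That agrees with the step logged at epoch 1.0 in the Training Results table, and four epochs give the final logged step of 3316.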

Training Hyperparameters

  • batch_size: (4, 4)
  • num_epochs: (4, 4)
  • max_steps: -1
  • sampling_strategy: oversampling
  • body_learning_rate: (2e-05, 1e-05)
  • head_learning_rate: 0.01
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: True
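
For the loss listed above, the sentence-transformers CosineSimilarityLoss compares the cosine similarity of two embeddings with a target value using mean squared error. The following is a plain-Python sketch of that computation on made-up 2-D vectors. It illustrates the formula only and is not the library code, which operates on batched tensors. As far as the SetFit documentation describes it, the distance_metric and margin entries apply to other loss choices and are not used with this loss.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def cosine_similarity_loss(pairs):
    """Mean squared error between each pair's cosine similarity and its target."""
    errors = [(cosine_similarity(u, v) - target) ** 2 for u, v, target in pairs]
    return sum(errors) / len(errors)


# Toy embeddings, made up for illustration
same_class = ([1.0, 0.0], [1.0, 0.0], 1.0)  # same direction, target 1: no error
diff_class = ([1.0, 0.0], [0.0, 1.0], 0.0)  # orthogonal, target 0: no error
confused = ([1.0, 0.0], [1.0, 0.0], 0.0)    # same direction, target 0: squared error 1

loss = cosine_similarity_loss([same_class, diff_class, confused])
print(loss)
```

Minimizing this loss pulls same-label texts toward a similarity of 1 and pushes different-label texts toward 0.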

Training Results

Epoch Step Training Loss Validation Loss
0.0012 1 0.1544 -
0.0121 10 0.0658 -
0.0241 20 0.1235 -
0.0362 30 0.2422 -
0.0483 40 0.2876 -
0.0603 50 0.1208 -
0.0724 60 0.1358 -
0.0844 70 0.1494 -
0.0965 80 0.1284 -
0.1086 90 0.1107 -
0.1206 100 0.2395 -
0.1327 110 0.0661 -
0.1448 120 0.1554 -
0.1568 130 0.0258 -
0.1689 140 0.0279 -
0.1809 150 0.1162 -
0.1930 160 0.0244 -
0.2051 170 0.0221 -
0.2171 180 0.0813 -
0.2292 190 0.0188 -
0.2413 200 0.03 -
0.2533 210 0.0019 -
0.2654 220 0.0076 -
0.2774 230 0.01 -
0.2895 240 0.0025 -
0.3016 250 0.0705 -
0.3136 260 0.0044 -
0.3257 270 0.0038 -
0.3378 280 0.006 -
0.3498 290 0.0018 -
0.3619 300 0.0003 -
0.3739 310 0.0007 -
0.3860 320 0.0128 -
0.3981 330 0.0022 -
0.4101 340 0.0008 -
0.4222 350 0.004 -
0.4343 360 0.0006 -
0.4463 370 0.0007 -
0.4584 380 0.0005 -
0.4704 390 0.0057 -
0.4825 400 0.0007 -
0.4946 410 0.0022 -
0.5066 420 0.0012 -
0.5187 430 0.0009 -
0.5308 440 0.0004 -
0.5428 450 0.0032 -
0.5549 460 0.0007 -
0.5669 470 0.0008 -
0.5790 480 0.0005 -
0.5911 490 0.0005 -
0.6031 500 0.0008 -
0.6152 510 0.0008 -
0.6273 520 0.0004 -
0.6393 530 0.0015 -
0.6514 540 0.0002 -
0.6634 550 0.0006 -
0.6755 560 0.0015 -
0.6876 570 0.0024 -
0.6996 580 0.0004 -
0.7117 590 0.0005 -
0.7238 600 0.0011 -
0.7358 610 0.0008 -
0.7479 620 0.0002 -
0.7600 630 0.0006 -
0.7720 640 0.0003 -
0.7841 650 0.0002 -
0.7961 660 0.0007 -
0.8082 670 0.0009 -
0.8203 680 0.0002 -
0.8323 690 0.0006 -
0.8444 700 0.0015 -
0.8565 710 0.0003 -
0.8685 720 0.0003 -
0.8806 730 0.0003 -
0.8926 740 0.0015 -
0.9047 750 0.0003 -
0.9168 760 0.0005 -
0.9288 770 0.0002 -
0.9409 780 0.0003 -
0.9530 790 0.0002 -
0.9650 800 0.0004 -
0.9771 810 0.0003 -
0.9891 820 0.001 -
1.0 829 - 0.0216
1.0012 830 0.0003 -
1.0133 840 0.0007 -
1.0253 850 0.0004 -
1.0374 860 0.0001 -
1.0495 870 0.0008 -
1.0615 880 0.0003 -
1.0736 890 0.0006 -
1.0856 900 0.0001 -
1.0977 910 0.0018 -
1.1098 920 0.0 -
1.1218 930 0.0001 -
1.1339 940 0.0007 -
1.1460 950 0.0009 -
1.1580 960 0.0004 -
1.1701 970 0.0003 -
1.1821 980 0.0015 -
1.1942 990 0.0002 -
1.2063 1000 0.0005 -
1.2183 1010 0.0002 -
1.2304 1020 0.0003 -
1.2425 1030 0.0001 -
1.2545 1040 0.0002 -
1.2666 1050 0.0004 -
1.2786 1060 0.0001 -
1.2907 1070 0.0002 -
1.3028 1080 0.0001 -
1.3148 1090 0.0002 -
1.3269 1100 0.0001 -
1.3390 1110 0.0002 -
1.3510 1120 0.0003 -
1.3631 1130 0.0001 -
1.3752 1140 0.0001 -
1.3872 1150 0.0001 -
1.3993 1160 0.0002 -
1.4113 1170 0.0001 -
1.4234 1180 0.0005 -
1.4355 1190 0.0002 -
1.4475 1200 0.0002 -
1.4596 1210 0.0002 -
1.4717 1220 0.0001 -
1.4837 1230 0.0001 -
1.4958 1240 0.0001 -
1.5078 1250 0.0001 -
1.5199 1260 0.001 -
1.5320 1270 0.0001 -
1.5440 1280 0.0003 -
1.5561 1290 0.0001 -
1.5682 1300 0.0002 -
1.5802 1310 0.0005 -
1.5923 1320 0.0002 -
1.6043 1330 0.0001 -
1.6164 1340 0.0004 -
1.6285 1350 0.0002 -
1.6405 1360 0.0001 -
1.6526 1370 0.0004 -
1.6647 1380 0.0003 -
1.6767 1390 0.0002 -
1.6888 1400 0.0001 -
1.7008 1410 0.0008 -
1.7129 1420 0.0003 -
1.7250 1430 0.0005 -
1.7370 1440 0.0001 -
1.7491 1450 0.0001 -
1.7612 1460 0.0001 -
1.7732 1470 0.0007 -
1.7853 1480 0.0001 -
1.7973 1490 0.0002 -
1.8094 1500 0.0001 -
1.8215 1510 0.001 -
1.8335 1520 0.0002 -
1.8456 1530 0.0003 -
1.8577 1540 0.0004 -
1.8697 1550 0.0005 -
1.8818 1560 0.0001 -
1.8938 1570 0.0006 -
1.9059 1580 0.0005 -
1.9180 1590 0.0002 -
1.9300 1600 0.0002 -
1.9421 1610 0.0001 -
1.9542 1620 0.0003 -
1.9662 1630 0.0005 -
1.9783 1640 0.0007 -
1.9903 1650 0.0001 -
2.0 1658 - 0.0186
2.0024 1660 0.0 -
2.0145 1670 0.0001 -
2.0265 1680 0.0002 -
2.0386 1690 0.0001 -
2.0507 1700 0.0002 -
2.0627 1710 0.0001 -
2.0748 1720 0.0001 -
2.0869 1730 0.0002 -
2.0989 1740 0.0001 -
2.1110 1750 0.0002 -
2.1230 1760 0.0001 -
2.1351 1770 0.0003 -
2.1472 1780 0.0006 -
2.1592 1790 0.0001 -
2.1713 1800 0.0002 -
2.1834 1810 0.0002 -
2.1954 1820 0.0001 -
2.2075 1830 0.0 -
2.2195 1840 0.0001 -
2.2316 1850 0.0002 -
2.2437 1860 0.0004 -
2.2557 1870 0.0003 -
2.2678 1880 0.0002 -
2.2799 1890 0.0002 -
2.2919 1900 0.0004 -
2.3040 1910 0.0002 -
2.3160 1920 0.0001 -
2.3281 1930 0.0 -
2.3402 1940 0.0002 -
2.3522 1950 0.0001 -
2.3643 1960 0.0 -
2.3764 1970 0.0003 -
2.3884 1980 0.0002 -
2.4005 1990 0.0001 -
2.4125 2000 0.0003 -
2.4246 2010 0.0003 -
2.4367 2020 0.0002 -
2.4487 2030 0.0002 -
2.4608 2040 0.0002 -
2.4729 2050 0.0001 -
2.4849 2060 0.0001 -
2.4970 2070 0.0002 -
2.5090 2080 0.0 -
2.5211 2090 0.0002 -
2.5332 2100 0.0004 -
2.5452 2110 0.0005 -
2.5573 2120 0.0003 -
2.5694 2130 0.0001 -
2.5814 2140 0.0002 -
2.5935 2150 0.0008 -
2.6055 2160 0.0002 -
2.6176 2170 0.0003 -
2.6297 2180 0.0001 -
2.6417 2190 0.0002 -
2.6538 2200 0.0001 -
2.6659 2210 0.0001 -
2.6779 2220 0.0 -
2.6900 2230 0.0002 -
2.7021 2240 0.0 -
2.7141 2250 0.0001 -
2.7262 2260 0.0001 -
2.7382 2270 0.0003 -
2.7503 2280 0.0001 -
2.7624 2290 0.0003 -
2.7744 2300 0.0001 -
2.7865 2310 0.0002 -
2.7986 2320 0.0001 -
2.8106 2330 0.0001 -
2.8227 2340 0.0001 -
2.8347 2350 0.0001 -
2.8468 2360 0.0002 -
2.8589 2370 0.0001 -
2.8709 2380 0.0001 -
2.8830 2390 0.0 -
2.8951 2400 0.0 -
2.9071 2410 0.0 -
2.9192 2420 0.0001 -
2.9312 2430 0.0002 -
2.9433 2440 0.0001 -
2.9554 2450 0.0001 -
2.9674 2460 0.0001 -
2.9795 2470 0.0003 -
2.9916 2480 0.0001 -
3.0 2487 - 0.0176
3.0036 2490 0.0001 -
3.0157 2500 0.0 -
3.0277 2510 0.0002 -
3.0398 2520 0.0 -
3.0519 2530 0.0002 -
3.0639 2540 0.0002 -
3.0760 2550 0.0 -
3.0881 2560 0.0001 -
3.1001 2570 0.0001 -
3.1122 2580 0.0003 -
3.1242 2590 0.0003 -
3.1363 2600 0.0001 -
3.1484 2610 0.0 -
3.1604 2620 0.0002 -
3.1725 2630 0.0001 -
3.1846 2640 0.0001 -
3.1966 2650 0.0001 -
3.2087 2660 0.0003 -
3.2207 2670 0.0001 -
3.2328 2680 0.0001 -
3.2449 2690 0.0001 -
3.2569 2700 0.0001 -
3.2690 2710 0.0002 -
3.2811 2720 0.0001 -
3.2931 2730 0.0005 -
3.3052 2740 0.0 -
3.3172 2750 0.0001 -
3.3293 2760 0.0002 -
3.3414 2770 0.0003 -
3.3534 2780 0.0001 -
3.3655 2790 0.0001 -
3.3776 2800 0.0001 -
3.3896 2810 0.0004 -
3.4017 2820 0.0001 -
3.4138 2830 0.0002 -
3.4258 2840 0.0001 -
3.4379 2850 0.0003 -
3.4499 2860 0.0001 -
3.4620 2870 0.0002 -
3.4741 2880 0.0001 -
3.4861 2890 0.0003 -
3.4982 2900 0.0003 -
3.5103 2910 0.0001 -
3.5223 2920 0.0 -
3.5344 2930 0.0 -
3.5464 2940 0.0001 -
3.5585 2950 0.0002 -
3.5706 2960 0.0002 -
3.5826 2970 0.0001 -
3.5947 2980 0.0 -
3.6068 2990 0.0001 -
3.6188 3000 0.0003 -
3.6309 3010 0.0001 -
3.6429 3020 0.0 -
3.6550 3030 0.0002 -
3.6671 3040 0.0003 -
3.6791 3050 0.0005 -
3.6912 3060 0.0001 -
3.7033 3070 0.0 -
3.7153 3080 0.0001 -
3.7274 3090 0.0002 -
3.7394 3100 0.0001 -
3.7515 3110 0.0001 -
3.7636 3120 0.0002 -
3.7756 3130 0.0001 -
3.7877 3140 0.0 -
3.7998 3150 0.0001 -
3.8118 3160 0.0001 -
3.8239 3170 0.0001 -
3.8359 3180 0.0001 -
3.8480 3190 0.0005 -
3.8601 3200 0.0 -
3.8721 3210 0.0001 -
3.8842 3220 0.0001 -
3.8963 3230 0.0001 -
3.9083 3240 0.0001 -
3.9204 3250 0.0001 -
3.9324 3260 0.0 -
3.9445 3270 0.0001 -
3.9566 3280 0.0001 -
3.9686 3290 0.0002 -
3.9807 3300 0.0002 -
3.9928 3310 0.0001 -
4.0 3316 - 0.0187
  • The bold row denotes the saved checkpoint.
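
The validation rows in the table can be compared directly. The sketch below assumes that, with load_best_model_at_end set to True, the saved checkpoint is the one with the lowest validation loss; this card does not state the selection criterion. It also shows how the Epoch column follows from the step number, given 829 steps per epoch.

```python
# (step, validation loss) rows copied from the Training Results table
validation_rows = [(829, 0.0216), (1658, 0.0186), (2487, 0.0176), (3316, 0.0187)]
steps_per_epoch = 829  # step logged at epoch 1.0


def epoch_of(step):
    """Fractional epoch for a step, rounded to 4 decimals as in the Epoch column."""
    return round(step / steps_per_epoch, 4)


# Assumption: the best checkpoint is the one with the lowest validation loss
best_step, best_loss = min(validation_rows, key=lambda row: row[1])

print(best_step, best_loss, epoch_of(best_step))
print(epoch_of(10), epoch_of(830))
```

Under that assumption the selected checkpoint would be step 2487 (epoch 3.0, validation loss 0.0176).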

Framework Versions

  • Python: 3.10.12
  • SetFit: 1.0.3
  • Sentence Transformers: 3.1.0
  • Transformers: 4.39.0
  • PyTorch: 2.4.0+cu121
  • Datasets: 3.0.0
  • Tokenizers: 0.15.2

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}

Safetensors

  • Model size: 118M params
  • Tensor type: F32

Model tree for alexandremn/botpress_football_sft_model
