bhaskars113 committed
Commit 2035f05
1 Parent(s): 90e25bf

Add SetFit model
1_Pooling/config.json ADDED
@@ -0,0 +1,9 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false
+ }
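This config enables mean pooling only: the sentence embedding is the average of the token embeddings, with CLS, max, and the other modes disabled. For readers unfamiliar with these flags, here is a minimal sketch of what masked mean pooling computes, written against the plain `transformers` API; the `mean_pool` helper is illustrative and not part of this repo:

```python
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "sentence-transformers/paraphrase-mpnet-base-v2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
encoder = AutoModel.from_pretrained(model_id)

def mean_pool(texts):
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        token_embeddings = encoder(**batch).last_hidden_state  # (batch, seq, 768)
    # Zero out padding positions, then average over the sequence dimension.
    mask = batch["attention_mask"].unsqueeze(-1).float()
    return (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

print(mean_pool(["two cheap beers"]).shape)  # torch.Size([1, 768])
```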
README.md ADDED
@@ -0,0 +1,197 @@
+ ---
+ library_name: setfit
+ tags:
+ - setfit
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ metrics:
+ - accuracy
+ widget:
+ - text: I really enjoy beer logos and branding. I like Budweiser design and also the
+   modern Modelo can as well.
+ - text: Drinking for me is a big trigger for dissociation. I can do maybe two beers
+   before I start to slide. At that point, I don't feel the effects so I overdrink.
+   I don't drink wine or hard alcohol at all anymore. ... Personally I agree alcohol
+   isn't good for physical or mental health and it's a supremely negative drug. I
+   have respect for those that can avoid it altogether. Also limit it to no more
+   than 2 full beers across 72 hours and a week break after. I got to this point
+   after 6 mos sober & then joining a SMART recovery program because full abstinence
+   was too much for me to be successful. As a foodie, HSP, wine- & hop-head, it's
+   a lot for me to cut out entirely.
+ - text: 'Big time, because my ADHD is one of the 2 bigger drivers for my anxiety and
+   depression. So when I would drink especially when I had taken anti depressants
+   that day, I would get to the point where i had suicidal thoughts over the smallest
+   things, which mind you the last time I drank, just a month and change before I
+   went to a mental hospital because of it. And this was me like 2 beers in. My doctor
+   at the hospital told me something that I have hugged onto: suicide can happen
+   at any time. I had thought about it but I stopped because of my wife and family.
+   When I was drinking, I didn’t think about any of them.'
+ - text: That and going out is expensive. I’d much rather knock back a couple of beers
+   and play Switch. Cheaper that way, plus I don’t end up smelling like an ashtray.
+ - text: By my house pizza is pretty inexpensive. I might be able to get two cheap
+   beers too!
+ pipeline_tag: text-classification
+ inference: true
+ base_model: sentence-transformers/paraphrase-mpnet-base-v2
+ ---
+
+ # SetFit with sentence-transformers/paraphrase-mpnet-base-v2
+
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.
+
+ The model has been trained using an efficient few-shot learning technique that involves:
+
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [sentence-transformers/paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2)
+ - **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
+ - **Maximum Sequence Length:** 512 tokens
+ - **Number of Classes:** 3 classes
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ### Model Labels
+ | Label | Examples |
+ |:------|:---------|
+ | 1 | <ul><li>'I was spending too much money on beer and it wasn’t helping my life in any capacity, so I cut it out. I have enough other expensive hobbies I don’t need liver damage to be one of them.'</li><li>"And I forgot the worst: eating out is expensive, and beer is crazy expensive. That's really annoying."</li><li>'Young me also didn’t realize a few ballpark beers could have you reevaluating your monthly budget'</li></ul> |
+ | 2 | <ul><li>'Mental health problems and obesity often go hand in hand. In particular depression can be countered through endorphines released through simple Workouts including (!) normal paced walking outside. I do factor these things in. But if you eat unhealthy, only sit at home in the shadow, smoke tobacco or even worse weed with the occasional beer, you do not give yourself a fighting chance. There are exceptions. Yes.'</li><li>'It\'s also essential for vitamin d Alcohol isn\'t essential and has no positive health outcomes. Even when you consider "getting together with the boys" as a positive mental health aspect, it\'s negated by all the other effects. I still have a few beer a week, but I\'m aware of its consequences'</li><li>"I drink on SSRI but I know two things. If I drink a lot the other day my anxiety is hell and I have to double the dose of my anti anxiety meds so I do it only if I don't have to do anything important the other day and veeeery occasionally. If I occasionally drink one or two beers yes it hits me more hard, I used to be that kind of person who needed a lot of alcohol to feel the high and now with one glass of wine I feel it, but I don't have any problem the other day. Be careful because the first time I found out the first thing I hated myself, I had to sleep all day to get through the hangxiety"</li></ul> |
+ | 0 | <ul><li>"I'm not sure if that's actually true (maybe I'm wrong) cause with the exception of the occasional craft brew, I always found the alcohol level to be the same on both sides of the border. Budweiser down there and Molson up here are both 5%."</li><li>'?? angolbryggeri - Hazy Crazy\n\n✴️ IPA\n\n?? Sweden ????\n\n??Abv 6.5%\n\n⭐️ 3.60 / 5.0 ~ avg 3.67\n\n?? systembolaget\n\n#beer #bier #birra #öl #cerveza #øl #craftbeer #ipa #dipa #tipa #sour #gose #berlinerweisse #paleale #pilsner #lager #stout #beeroftheday #beerphotografy #hantverksöl #untappd #beergeek #beerlover #ilovebeer #cheers #beerstagram #instabeer #beerporn #ängöl #sweden'</li><li>'Lately some popular breweries around me have catered to lighter beers away from mostly pales les. Hefeweizens, Pilsner’s, blondes, and it’s been really nice. My local had 3 awesome pilsners that taste straight out of Europe.'</li></ul> |
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference.
+
+ ```python
+ from setfit import SetFitModel
+
+ # Download from the 🤗 Hub
+ model = SetFitModel.from_pretrained("bhaskars113/beer-budget-health-model")
+ # Run inference
+ preds = model("By my house pizza is pretty inexpensive. I might be able to get two cheap beers too!")
+ ```
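The call above returns hard label predictions. If you need class probabilities instead, `SetFitModel` also exposes `predict_proba`; a small sketch (the two texts here are made up):

```python
probs = model.predict_proba([
    "Craft beer prices keep climbing at my local taproom.",
    "I quit drinking because it wrecked my sleep and mood.",
])
print(probs.shape)  # torch.Size([2, 3]): one row per text, one column per label
```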
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median  | Max |
+ |:-------------|:----|:--------|:----|
+ | Word count   | 12  | 50.7391 | 177 |
+
+ | Label | Training Sample Count |
+ |:------|:----------------------|
+ | 0     | 16                    |
+ | 1     | 15                    |
+ | 2     | 15                    |
+
+ ### Training Hyperparameters
+ - batch_size: (16, 16)
+ - num_epochs: (1, 1)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - num_iterations: 20
+ - body_learning_rate: (2e-05, 2e-05)
+ - head_learning_rate: 2e-05
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - seed: 42
+ - eval_max_steps: -1
+ - load_best_model_at_end: False
+
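These values map one-to-one onto setfit's `TrainingArguments`, where tuples give the embedding-phase and classifier-phase values. A minimal sketch of how such a run could be reproduced, assuming setfit 1.x; the three-example dataset below is an illustrative stand-in, since the actual training data is not published:

```python
from datasets import Dataset
from setfit import SetFitModel, Trainer, TrainingArguments

# Illustrative stand-in for the (unpublished) training data.
train_dataset = Dataset.from_dict({
    "text": [
        "Beer is wrecking my monthly budget.",
        "Two beers in and my anxiety spikes the next day.",
        "Love a crisp pilsner on a hot afternoon.",
    ],
    "label": [1, 2, 0],
})

# Default head is scikit-learn LogisticRegression, matching this model card.
model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

args = TrainingArguments(
    batch_size=(16, 16),            # (embedding phase, classifier phase)
    num_epochs=(1, 1),
    num_iterations=20,              # contrastive pairs generated per sample
    sampling_strategy="oversampling",
    body_learning_rate=(2e-5, 2e-5),
    head_learning_rate=2e-5,
    warmup_proportion=0.1,
    seed=42,
)

trainer = Trainer(model=model, args=args, train_dataset=train_dataset)
trainer.train()
```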
+ ### Training Results
+ | Epoch  | Step | Training Loss | Validation Loss |
+ |:------:|:----:|:-------------:|:---------------:|
+ | 0.0087 | 1    | 0.203         | -               |
+ | 0.4348 | 50   | 0.003         | -               |
+ | 0.8696 | 100  | 0.0007        | -               |
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - SetFit: 1.0.3
+ - Sentence Transformers: 2.3.1
+ - Transformers: 4.35.2
+ - PyTorch: 2.1.0+cu121
+ - Datasets: 2.17.1
+ - Tokenizers: 0.15.2
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+   doi = {10.48550/ARXIV.2209.11055},
+   url = {https://arxiv.org/abs/2209.11055},
+   author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+   keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+   title = {Efficient Few-Shot Learning Without Prompts},
+   publisher = {arXiv},
+   year = {2022},
+   copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "sentence-transformers/paraphrase-mpnet-base-v2",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.35.2",
+   "vocab_size": 30527
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,7 @@
+ {
+   "__version__": {
+     "sentence_transformers": "2.0.0",
+     "transformers": "4.7.0",
+     "pytorch": "1.9.0+cu102"
+   }
+ }
config_setfit.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "labels": null,
+   "normalize_embeddings": false
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:24388a5995c08315bb1188dc50e615ea7bc31b4a3332b06176f7efab1fd80a3f
+ size 437967672
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1bc4a26b4fc792407ad7e27973072817b35d55aae056ca273073deb09d6a8e78
+ size 19327
modules.json ADDED
@@ -0,0 +1,14 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   }
+ ]
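This modules.json wires the repo into a two-stage sentence-transformers pipeline: the MPNet transformer at the repo root, followed by the mean-pooling module in 1_Pooling/. A consequence worth noting is that the embedding body can be loaded on its own, independent of the SetFit classification head; a short sketch, assuming the sentence-transformers library is installed:

```python
from sentence_transformers import SentenceTransformer

# Loads module 0 (Transformer) and module 1 (Pooling) as declared in modules.json.
body = SentenceTransformer("bhaskars113/beer-budget-health-model")
embeddings = body.encode(["two cheap beers"])
print(embeddings.shape)  # (1, 768), matching word_embedding_dimension
```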
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 512,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": true,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,59 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "do_basic_tokenize": true,
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "model_max_length": 512,
+   "never_split": null,
+   "pad_token": "<pad>",
+   "sep_token": "</s>",
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff