Netta1994 committed on
Commit
090cea6
1 Parent(s): 16dac45

Add SetFit model

1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
{
  "word_embedding_dimension": 768,
  "pooling_mode_cls_token": true,
  "pooling_mode_mean_tokens": false,
  "pooling_mode_max_tokens": false,
  "pooling_mode_mean_sqrt_len_tokens": false,
  "pooling_mode_weightedmean_tokens": false,
  "pooling_mode_lasttoken": false,
  "include_prompt": true
}
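This pooling config enables only `pooling_mode_cls_token`: the sentence embedding is the first ([CLS]) token's vector rather than an average over all tokens. A minimal sketch of the difference, using toy 2-dimensional token embeddings (the real model uses 768 dimensions):

```python
# Toy token embeddings for one sentence: one row per token.
token_embeddings = [
    [1.0, 2.0],   # [CLS] token
    [3.0, 4.0],
    [5.0, 6.0],
]

# pooling_mode_cls_token: take the first token's vector as-is.
cls_pooled = token_embeddings[0]

# pooling_mode_mean_tokens (disabled in this config): average each
# dimension over all tokens instead.
mean_pooled = [sum(col) / len(token_embeddings) for col in zip(*token_embeddings)]

print(cls_pooled)   # [1.0, 2.0]
print(mean_pooled)  # [3.0, 4.0]
```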
README.md ADDED
@@ -0,0 +1,261 @@
---
base_model: BAAI/bge-base-en-v1.5
library_name: setfit
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- setfit
- sentence-transformers
- text-classification
- generated_from_setfit_trainer
widget:
- text: 'Reasoning:

    The answer provides a solid overview of identifying a funnel spider, including its dark brown or black body, shiny carapace, and large fangs. These points align well with the details in the provided document. However, while the answer includes the key features described in the document, it misses a few additional characteristics such as the spinnerets, size variations, and geographical habitat that are valuable in identifying funnel spiders more comprehensively. Nonetheless, the answer remains relevant and concise based on the essential points covered.

    Evaluation:'
- text: "Reasoning:\nThe answer provides a comprehensive and accurate description of how to write a paper in MLA format. It mentions key points such as setting up 1-inch margins, using 12-point font, and double-spacing the text, which are directly aligned with the instructions in the provided document. It also addresses creating a running header, typing the heading information in the upper left corner, and centering the paper's title, all of which are specified in the document. The answer is well-supported by the document, relevant to the question asked, and presented concisely.\n\nFinal Evaluation: \nEvaluation:"
- text: 'Reasoning:

    The provided answer offers relevant and practical advice for getting into medical school, including focusing on core science subjects, gaining clinical experience, and preparing for the MCAT. These points align well with the suggestions in the document. However, the answer overlooks several important details such as engaging in extracurricular activities, seeking leadership opportunities, and preparing a comprehensive application, which the document emphasizes. Including these aspects would have made the response more comprehensive and better aligned with the provided document.


    Evaluation:'
- text: "Reasoning:\nThe provided answer offers several strategic tips for becoming adept at hide and seek. The suggestions include creative hiding strategies like staying in the room where the seeker started, using camouflage, and hiding in plain sight, which align well with the document's advice. The document recommends looking for long edges to hide behind, using dense curtains, hiding in laundry baskets, and looking for multi-colored areas, all of which are echoed in the answer. The advice given is concise, relevant, and matches the tips and guidelines from the document.\n\nFinal Evaluation: \nEvaluation:"
- text: 'Reasoning:

    1. **Context Grounding**: The answer refers to making a saline solution for treating a baby''s cough, using a method that significantly deviates from the details provided in the document. The document provides specific instructions for a saline solution and its administration, but the quantities and steps in the answer do not align with the document''s instructions. Additionally, the document does not suggest using 2 cups of water and the method of inserting the suction bulb into the baby''s nostril about an inch is incorrect and potentially harmful, deviating significantly from the recommended depth.

    2. **Relevance**: While the answer aims to address how to treat a baby''s cough, it includes incorrect measurements and method details, which makes it less reliable and potentially unsafe.

    3. **Conciseness**: The answer is fairly concise but includes inaccuracies that make it untrustworthy.


    Due to these points, the answer does not meet the necessary criteria of being well-grounded in the document, relevant, or correctly concise.


    Final Evaluation:'
inference: true
model-index:
- name: SetFit with BAAI/bge-base-en-v1.5
  results:
  - task:
      type: text-classification
      name: Text Classification
    dataset:
      name: Unknown
      type: unknown
      split: test
    metrics:
    - type: accuracy
      value: 0.84
      name: Accuracy
---

# SetFit with BAAI/bge-base-en-v1.5

This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5) as the Sentence Transformer embedding model. A [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance is used for classification.

The model has been trained using an efficient few-shot learning technique that involves:

1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
2. Training a classification head with features from the fine-tuned Sentence Transformer.

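Step 1 depends on turning a handful of labeled texts into many contrastive training pairs: same-label texts become positive pairs, different-label texts become negative pairs. A toy sketch of that idea (an illustration of the principle only, not SetFit's actual pair sampler):

```python
from itertools import combinations

# Hypothetical few-shot dataset: (text, label) tuples.
examples = [
    ("well grounded in the document", 1),
    ("relevant and concise", 1),
    ("off topic", 0),
    ("contradicts the document", 0),
]

# Same-label pairs train the embedder to pull texts together;
# different-label pairs push them apart.
positive_pairs = [(a, b) for (a, la), (b, lb) in combinations(examples, 2) if la == lb]
negative_pairs = [(a, b) for (a, la), (b, lb) in combinations(examples, 2) if la != lb]

print(len(positive_pairs))  # 2
print(len(negative_pairs))  # 4
```

Because pairs grow roughly quadratically with the number of examples, even a few dozen labeled samples yield thousands of contrastive training signals, which is what makes the few-shot setup work.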
## Model Details

### Model Description
- **Model Type:** SetFit
- **Sentence Transformer body:** [BAAI/bge-base-en-v1.5](https://huggingface.co/BAAI/bge-base-en-v1.5)
- **Classification head:** a [LogisticRegression](https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html) instance
- **Maximum Sequence Length:** 512 tokens
- **Number of Classes:** 2 classes
<!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
<!-- - **Language:** Unknown -->
<!-- - **License:** Unknown -->

### Model Sources

- **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
- **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
- **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)

### Model Labels
| Label | Examples |
|:------|:---------|
| 0 | <ul><li>"Reasoning:\nThe answer provided closely aligns with the specific instructions given in the document about petting a bearded dragon. It correctly mentions using 1 or 2 fingers to gently stroke the dragon's head, lowering your hand slowly to avoid startling it, and washing hands before and after petting to reduce the risk of bacteria transfer. However, the part about using a specific perfume or scent to help the dragon recognize you is not supported by the text and is, in fact, incorrect.\n\nFinal Evaluation: \nResult:"</li><li>'Reasoning:\nThe answer provided addresses the physical characteristics of a funnel spider but includes several inaccuracies and deviations from the information in the provided document. Key errors include describing the funnel spider as light brown or gray with a soft, dull carapace, which contradicts the document's description of a dark brown or black body and a hard, shiny carapace. Additionally, the claim that funnel spiders have 3 non-poisonous fangs pointing sideways is incorrect based on the document, which states that the funnel spider has two large, downward-pointing fangs that are poisonous. The document provides clear and detailed descriptions that should form the basis for an accurate answer.\n\nFinal Evaluation:'</li><li>'Reasoning:\nThe answer provided, "Luis Figo left Barcelona to join Real Madrid," while factually correct according to the provided document, is entirely unrelated to the question "How to Calculate Real Estate Commissions." The document and the answer focus on a historical event in soccer rather than providing any information or calculations related to real estate commissions. \n\nFinal Evaluation:'</li></ul> |
| 1 | <ul><li>'Reasoning:\nThe answer is well-supported by the document and directly relates to the question of how to hold a note while singing. It addresses key aspects such as breathing techniques, posture, and controlled release of air, all of which are mentioned in the provided document. The answer stays concise and clear, without deviating into unrelated topics, effectively summarizing the necessary steps for holding a note.\n\nFinal result:'</li><li>'Reasoning:\nThe answer is well-founded in the provided document and directly relates to the question of how to stop feeling empty. It suggests practical actions like keeping a journal, trying new activities, and making new friends, all of which are discussed in the document. The recommendations in the answer are summarized clearly and are appropriate responses to the question without providing extraneous information.\n\nFinal Evaluation:'</li><li>'Reasoning:\nThe answer aligns well with the instructions provided in the document and effectively addresses the question of how to dry curly hair. It begins by recommending gently squeezing out excess water, followed by the application of a leave-in conditioner and the use of a wide-tooth comb for detangling, which are all steps mentioned in the document. The answer then advises adding styling products and parting the hair to lift the roots, which helps expedite the air-drying process. The key points from the document are reflected in the answer, ensuring it is contextually grounded and relevant.\n\nEvaluation:'</li></ul> |

## Evaluation

### Metrics
| Label   | Accuracy |
|:--------|:---------|
| **all** | 0.84     |

## Uses

### Direct Use for Inference

First install the SetFit library:

```bash
pip install setfit
```

Then you can load this model and run inference.

```python
from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("Netta1994/setfit_baai_wikisum_gpt-4o_improved-cot-instructions_chat_few_shot_generated_remove_f")
# Run inference (the input spans multiple lines, so a triple-quoted string is used)
preds = model("""Reasoning:
The answer provides a solid overview of identifying a funnel spider, including its dark brown or black body, shiny carapace, and large fangs. These points align well with the details in the provided document. However, while the answer includes the key features described in the document, it misses a few additional characteristics such as the spinnerets, size variations, and geographical habitat that are valuable in identifying funnel spiders more comprehensively. Nonetheless, the answer remains relevant and concise based on the essential points covered.
Evaluation:""")
```

<!--
### Downstream Use

*List how someone could finetune this model on their own dataset.*
-->

<!--
### Out-of-Scope Use

*List how the model may foreseeably be misused and address what users ought not to do with the model.*
-->

<!--
## Bias, Risks and Limitations

*What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
-->

<!--
### Recommendations

*What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
-->

## Training Details

### Training Set Metrics
| Training set | Min | Median  | Max |
|:-------------|:----|:--------|:----|
| Word count   | 57  | 92.5070 | 176 |

| Label | Training Sample Count |
|:------|:----------------------|
| 0     | 34                    |
| 1     | 37                    |

### Training Hyperparameters
- batch_size: (16, 16)
- num_epochs: (1, 1)
- max_steps: -1
- sampling_strategy: oversampling
- num_iterations: 20
- body_learning_rate: (2e-05, 2e-05)
- head_learning_rate: 2e-05
- loss: CosineSimilarityLoss
- distance_metric: cosine_distance
- margin: 0.25
- end_to_end: False
- use_amp: False
- warmup_proportion: 0.1
- l2_weight: 0.01
- seed: 42
- eval_max_steps: -1
- load_best_model_at_end: False

### Training Results
| Epoch  | Step | Training Loss | Validation Loss |
|:------:|:----:|:-------------:|:---------------:|
| 0.0056 | 1    | 0.2159        | -               |
| 0.2809 | 50   | 0.2444        | -               |
| 0.5618 | 100  | 0.0815        | -               |
| 0.8427 | 150  | 0.0041        | -               |

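The fractional epoch values above are consistent with the hyperparameters: if `num_iterations: 20` generates about 2 × 20 contrastive pairs per training sample (an assumption about SetFit's pair sampling, not something stated in this card), the 71 training samples (34 + 37) yield 2840 pairs, or 178 steps per epoch at batch size 16, which places step 150 at epoch 0.8427. A quick check of that arithmetic:

```python
import math

samples = 34 + 37          # training sample counts from the table above
num_iterations = 20        # assumed to generate 2 * 20 pairs per sample
batch_size = 16

pairs = samples * 2 * num_iterations           # 2840 sentence pairs
steps_per_epoch = math.ceil(pairs / batch_size)
print(steps_per_epoch)                         # 178
print(round(150 / steps_per_epoch, 4))         # 0.8427, matching the last row
print(round(50 / steps_per_epoch, 4))          # 0.2809, matching the second row
```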
### Framework Versions
- Python: 3.10.14
- SetFit: 1.1.0
- Sentence Transformers: 3.1.0
- Transformers: 4.44.0
- PyTorch: 2.4.1+cu121
- Datasets: 2.19.2
- Tokenizers: 0.19.1

## Citation

### BibTeX
```bibtex
@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}
```

<!--
## Glossary

*Clearly define terms in order to be accessible across audiences.*
-->

<!--
## Model Card Authors

*Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
-->

<!--
## Model Card Contact

*Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
-->
config.json ADDED
@@ -0,0 +1,32 @@
{
  "_name_or_path": "BAAI/bge-base-en-v1.5",
  "architectures": [
    "BertModel"
  ],
  "attention_probs_dropout_prob": 0.1,
  "classifier_dropout": null,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "id2label": {
    "0": "LABEL_0"
  },
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "label2id": {
    "LABEL_0": 0
  },
  "layer_norm_eps": 1e-12,
  "max_position_embeddings": 512,
  "model_type": "bert",
  "num_attention_heads": 12,
  "num_hidden_layers": 12,
  "pad_token_id": 0,
  "position_embedding_type": "absolute",
  "torch_dtype": "float32",
  "transformers_version": "4.44.0",
  "type_vocab_size": 2,
  "use_cache": true,
  "vocab_size": 30522
}
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
{
  "__version__": {
    "sentence_transformers": "3.1.0",
    "transformers": "4.44.0",
    "pytorch": "2.4.1+cu121"
  },
  "prompts": {},
  "default_prompt_name": null,
  "similarity_fn_name": null
}
config_setfit.json ADDED
@@ -0,0 +1,4 @@
{
  "normalize_embeddings": false,
  "labels": null
}
model.safetensors ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:a18f1d2fb54ceed55f839c474d0efd34b14b8a78fd1e8e16ac9d1b6f5aec7249
size 437951328
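The checkpoint size lines up with the architecture in config.json: a float32 checkpoint stores 4 bytes per parameter, so 437,951,328 bytes corresponds to about 109.5M parameters, the expected scale of a BERT-base body. A quick sanity check:

```python
size_bytes = 437_951_328   # model.safetensors size from above
bytes_per_param = 4        # torch_dtype in config.json is float32

params = size_bytes // bytes_per_param
print(params)              # 109487832, i.e. ~109.5M parameters (BERT-base scale)
```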
model_head.pkl ADDED
@@ -0,0 +1,3 @@
version https://git-lfs.github.com/spec/v1
oid sha256:39305b88397d03e24852c28ef118211d100db284b49c17876de857ac3b0edec4
size 7007
modules.json ADDED
@@ -0,0 +1,20 @@
[
  {
    "idx": 0,
    "name": "0",
    "path": "",
    "type": "sentence_transformers.models.Transformer"
  },
  {
    "idx": 1,
    "name": "1",
    "path": "1_Pooling",
    "type": "sentence_transformers.models.Pooling"
  },
  {
    "idx": 2,
    "name": "2",
    "path": "2_Normalize",
    "type": "sentence_transformers.models.Normalize"
  }
]
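modules.json chains three stages: the Transformer produces token embeddings, Pooling reduces them to a single vector, and Normalize rescales that vector to unit length so cosine similarity reduces to a dot product. A toy sketch of the final normalization step, using a 2-dimensional stand-in for the 768-dimensional embedding:

```python
import math

# Toy sentence embedding (the real model produces 768 dimensions).
embedding = [3.0, 4.0]

# The 2_Normalize module divides by the L2 norm, yielding a unit vector.
norm = math.sqrt(sum(x * x for x in embedding))   # 5.0
normalized = [x / norm for x in embedding]

print(normalized)                                 # [0.6, 0.8]
print(round(sum(x * x for x in normalized), 12))  # 1.0 (unit length)
```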
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
{
  "max_seq_length": 512,
  "do_lower_case": true
}
special_tokens_map.json ADDED
@@ -0,0 +1,37 @@
{
  "cls_token": {
    "content": "[CLS]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "mask_token": {
    "content": "[MASK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": {
    "content": "[PAD]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "sep_token": {
    "content": "[SEP]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "unk_token": {
    "content": "[UNK]",
    "lstrip": false,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  }
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
{
  "added_tokens_decoder": {
    "0": {
      "content": "[PAD]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "100": {
      "content": "[UNK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "101": {
      "content": "[CLS]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "102": {
      "content": "[SEP]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "103": {
      "content": "[MASK]",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "clean_up_tokenization_spaces": true,
  "cls_token": "[CLS]",
  "do_basic_tokenize": true,
  "do_lower_case": true,
  "mask_token": "[MASK]",
  "model_max_length": 512,
  "never_split": null,
  "pad_token": "[PAD]",
  "sep_token": "[SEP]",
  "strip_accents": null,
  "tokenize_chinese_chars": true,
  "tokenizer_class": "BertTokenizer",
  "unk_token": "[UNK]"
}
vocab.txt ADDED
The diff for this file is too large to render. See raw diff