MoritzLaurer (HF staff) committed on
Commit 6887196 · verified · 1 Parent(s): 9a82c7d

Update README.md

Files changed (1)
  1. README.md +21 -15
README.md CHANGED
@@ -20,6 +20,8 @@ The model can do one universal classification task: determine whether a hypothes
 (`entailment` vs. `not_entailment`).
 This task format is based on the Natural Language Inference task (NLI).
 The task is so universal that any classification task can be reformulated into this task.
+Note that compared to other NLI models, this model predicts two classes (`entailment` vs. `not_entailment`)
+as opposed to three classes (entailment/neutral/contradiction).
 
 
 ## Training data
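To make the reformulation above concrete, here is a minimal sketch (the example text, labels, and template are illustrative assumptions, not taken from the model card): every candidate label is verbalized into a hypothesis, and the model scores each (text, hypothesis) pair as `entailment` vs. `not_entailment`.

```python
# Minimal sketch: reformulating an ordinary classification task as NLI pairs.
# The text, labels and hypothesis template below are illustrative assumptions.
text = "The new update made the app much slower and it crashes constantly."
labels = ["positive", "negative", "neutral"]
hypothesis_template = "This text expresses a {} sentiment."

# One (premise, hypothesis) pair per candidate label; the label whose hypothesis
# gets the highest entailment score becomes the predicted class.
pairs = [(text, hypothesis_template.format(label)) for label in labels]
for premise, hypothesis in pairs:
    print(f"premise:    {premise}\nhypothesis: {hypothesis}\n")
```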
@@ -29,12 +31,9 @@ I first created a list of 500+ diverse text classification tasks for 25 professi
 I then used this as seed data to generate several hundred thousand texts for the different tasks with Mixtral-8x7B-Instruct-v0.1.
 The final dataset used is available in the [synthetic_zeroshot_mixtral_v0.1](https://huggingface.co/datasets/MoritzLaurer/synthetic_zeroshot_mixtral_v0.1) dataset
 in the subset `mixtral_written_text_for_tasks_v4`. Data curation was done in multiple iterations and I will release more information on this process soon.
-2. Two commercially-friendly NLI datasets: ([MNLI](https://huggingface.co/datasets/nyu-mll/multi_nli), [FEVER-NLI](https://huggingface.co/datasets/fever).
+2. Two commercially-friendly NLI datasets: ([MNLI](https://huggingface.co/datasets/nyu-mll/multi_nli), [FEVER-NLI](https://huggingface.co/datasets/fever)).
 These datasets were added to increase generalization. Datasets like ANLI were excluded due to their non-commercial license.
 
-Note that compared to other NLI models, this model predicts two classes (`entailment` vs. `not_entailment`)
-as opposed to three classes (entailment/neutral/contradiction)
-
 The model was only trained on English data. I will release a multilingual version of this model soon.
 For __multilingual use-cases__,
 I alternatively recommend machine translating texts to English with libraries like [EasyNMT](https://github.com/UKPLab/EasyNMT).
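For the translate-then-classify route mentioned above, a minimal sketch assuming the EasyNMT library with its `opus-mt` models (the example sentence is made up):

```python
#!pip install easynmt
from easynmt import EasyNMT

# Machine-translate non-English inputs to English before running the classifier.
translator = EasyNMT("opus-mt")
texts = ["Dieses Produkt ist leider nach einer Woche kaputt gegangen."]
texts_en = translator.translate(texts, target_lang="en")
print(texts_en)  # feed the English texts to the zero-shot pipeline afterwards
```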
@@ -43,8 +42,7 @@ validation with English data can be easier if you don't speak all languages in y
 
 
 
-### How to use the model
-#### Simple zero-shot classification pipeline
+## How to use the model
 ```python
 #!pip install transformers[sentencepiece]
 from transformers import pipeline
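For context, a minimal sketch of how this pipeline snippet is typically completed; the checkpoint name and example inputs are assumptions, and any `...zeroshot-v2.0` checkpoint should work the same way:

```python
#!pip install transformers[sentencepiece]
from transformers import pipeline

# Illustrative checkpoint and inputs; swap in any ...zeroshot-v2.0 model.
classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0",
)
text = "Angela Merkel is a politician in Germany and leader of the CDU"
classes_verbalized = ["politics", "economy", "entertainment", "environment"]
output = classifier(text, classes_verbalized, multi_label=False)
print(output)  # dict with 'labels' and 'scores', best label first
```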
@@ -58,10 +56,6 @@ print(output)
 
 `multi_label=False` forces the model to decide on only one class. `multi_label=True` enables the model to choose multiple classes.
 
-### Details on data and training
-
-Reproduction code is available here, in the `v2_synthetic_data` directory: https://github.com/MoritzLaurer/zeroshot-classifier/tree/main
-
 
 ## Metrics
 
@@ -69,10 +63,6 @@ The model was evaluated on 28 different text classification tasks with the [bala
 The main reference point is `facebook/bart-large-mnli`, which is, at the time of writing (27.03.24), the most used commercially-friendly 0-shot classifier.
 The different `...zeroshot-v2.0` models were all trained with the same data; the only difference is the underlying foundation model.
 
-Note that my `...zeroshot-v1.1` models (e.g. [deberta-v3-base-zeroshot-v1.1-all-33](https://huggingface.co/MoritzLaurer/deberta-v3-base-zeroshot-v1.1-all-33))
-perform better on these 28 datasets, but they are trained on several datasets with non-commercial licenses.
-For commercial users, I therefore recommend using the v2.0 model and non-commercial users might get better performance with the v1.1 models.
-
 ![results_aggreg_v2.0](https://raw.githubusercontent.com/MoritzLaurer/zeroshot-classifier/e859471dd183ad44b705c047130433301386aab8/v2_synthetic_data/results/zeroshot-v2.0-aggreg.png)
 
 | | facebook/bart-large-mnli | roberta-base-zeroshot-v2.0 | roberta-large-zeroshot-v2.0 | deberta-v3-base-zeroshot-v2.0 | deberta-v3-large-zeroshot-v2.0 |
@@ -109,6 +99,21 @@ For commercial users, I therefore recommend using the v2.0 model and non-commerc
 
 
 
+## When to use which model
+
+- deberta-v3 vs. roberta: deberta-v3 performs clearly better than roberta, but it is slower.
+roberta is directly compatible with Hugging Face's production inference TEI containers and flash attention.
+These containers are a good choice for production use-cases. tl;dr: For accuracy, use a deberta-v3 model.
+If production inference speed is a concern, you can consider a roberta model (e.g. in a TEI container and [HF Inference Endpoints](https://ui.endpoints.huggingface.co/catalog)).
+- `zeroshot-v1.1` vs. `zeroshot-v2.0` models: My `zeroshot-v1.1` models (see [Zeroshot Classifier Collection](https://huggingface.co/collections/MoritzLaurer/zeroshot-classifiers-6548b4ff407bb19ff5c3ad6f))
+perform better on these 28 datasets, but they are trained on several datasets with non-commercial licenses.
+For commercial users, I therefore recommend using a v2.0 model, while non-commercial users might get better performance with a v1.1 model.
+
+## Reproduction
+
+Reproduction code is available here, in the `v2_synthetic_data` directory: https://github.com/MoritzLaurer/zeroshot-classifier/tree/main
+
+
 
 ## Limitations and bias
 The model can only do text classification tasks.
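If you serve one of these models behind a hosted endpoint as suggested above, a minimal sketch of querying it from Python; the client helper, model id, and inputs are assumptions about your setup:

```python
# Sketch: calling a hosted deployment instead of running the model locally.
# Assumes the huggingface_hub InferenceClient and a served ...zeroshot-v2.0 model.
from huggingface_hub import InferenceClient

client = InferenceClient(model="MoritzLaurer/roberta-base-zeroshot-v2.0")
result = client.zero_shot_classification(
    "Angela Merkel is a politician in Germany and leader of the CDU",
    labels=["politics", "economy", "entertainment", "environment"],
)
print(result)  # label/score pairs, best label first
```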
@@ -149,7 +154,8 @@ If you have questions or ideas for cooperation, contact me at moritz{at}huggingf
 
 
 ### Flexible usage and "prompting"
-You can formulate your own hypotheses by changing the `hypothesis_template` of the zeroshot pipeline. For example:
+You can formulate your own hypotheses by changing the `hypothesis_template` of the zeroshot pipeline.
+Similar to "prompt engineering" for LLMs, you can test different formulations of your `hypothesis_template` and verbalized classes to improve performance.
 
 ```python
 from transformers import pipeline
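A minimal sketch of the custom-template idea behind this last hunk; the template wording, labels, and checkpoint are assumptions:

```python
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="MoritzLaurer/deberta-v3-large-zeroshot-v2.0",  # illustrative checkpoint
)
# The template is wrapped around each candidate label to form the hypothesis.
hypothesis_template = "The topic of this text is {}"
classes_verbalized = ["politics", "economy", "entertainment", "environment"]
text = "Angela Merkel is a politician in Germany and leader of the CDU"
output = classifier(
    text, classes_verbalized, hypothesis_template=hypothesis_template, multi_label=False
)
print(output)
```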
 