Zero-Shot Classification
Transformers
PyTorch
Safetensors
English
deberta-v2
text-classification
deberta-v3-large
nli
natural-language-inference
multitask
multi-task
pipeline
extreme-multi-task
extreme-mtl
tasksource
zero-shot
rlhf
Instructions to use sileod/deberta-v3-large-tasksource-nli with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use sileod/deberta-v3-large-tasksource-nli with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("zero-shot-classification", model="sileod/deberta-v3-large-tasksource-nli")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("sileod/deberta-v3-large-tasksource-nli")
model = AutoModelForSequenceClassification.from_pretrained("sileod/deberta-v3-large-tasksource-nli")
- Notebooks
- Google Colab
- Kaggle
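For readers unfamiliar with how an NLI model is used for zero-shot classification, here is a minimal sketch of the scoring step. It assumes the usual single-label recipe: each candidate label is turned into a hypothesis sentence, the model scores how strongly the input entails each hypothesis, and a softmax over those entailment logits gives the label scores. The logits below are made-up illustrative numbers, not outputs of this model, and `build_hypotheses` and `zero_shot_scores` are hypothetical helper names.

```python
import math


def build_hypotheses(labels, template="This example is {}."):
    """Turn each candidate label into an NLI hypothesis sentence."""
    return {label: template.format(label) for label in labels}


def zero_shot_scores(entailment_logits):
    """Softmax over per-label entailment logits (single-label setting)."""
    top = max(entailment_logits.values())  # subtract max for numerical stability
    exps = {label: math.exp(v - top) for label, v in entailment_logits.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


labels = ["sports", "politics", "cooking"]
hypotheses = build_hypotheses(labels)

# Hypothetical entailment logits, one per (input text, hypothesis) pair.
# In practice these would come from the NLI classification head.
illustrative_logits = {"sports": 3.2, "politics": -1.0, "cooking": 0.4}

scores = zero_shot_scores(illustrative_logits)
best = max(scores, key=scores.get)
print(hypotheses[best], round(scores[best], 3))
```

The scores sum to 1 across the candidate labels, so they rank the labels against each other rather than give independent probabilities.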
Update README.md
README.md
CHANGED
@@ -230,7 +230,7 @@ The untuned model CLS embedding also has strong linear probing performance (90%

 This is the shared model with the MNLI classifier on top. Its encoder was trained on many datasets including bigbench, Anthropic rlhf, anli... alongside many NLI and classification tasks with a SequenceClassification heads while using only one shared encoder.
 Each task had a specific CLS embedding, which is dropped 10% of the time to facilitate model use without it. All multiple-choice model used the same classification layers. For classification tasks, models shared weights if their labels matched.
-The number of examples per task was capped to 64k. The model was trained for
+The number of examples per task was capped to 64k. The model was trained for 30k steps with a batch size of 384, and a peak learning rate of 2e-5.


 tasksource training code: https://colab.research.google.com/drive/1iB4Oxl9_B5W3ZDzXoWJN-olUbqLBxgQS?usp=sharing
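To put the updated training figures in perspective, the following back-of-the-envelope sketch computes how many examples the stated schedule processes. It assumes "30k" means 30,000 steps, "64k" means 64,000 examples, and that the batch size of 384 is the effective batch per optimizer step. None of these readings is confirmed by the README, and the variable names are illustrative.

```python
# Figures quoted in the README change (interpretations are assumptions).
steps = 30_000          # "30k steps"
batch_size = 384        # assumed to be the effective batch per step
cap_per_task = 64_000   # "capped to 64k" examples per task

# Total examples processed over the whole run.
examples_seen = steps * batch_size

# How many fully capped tasks that budget would cover in a single pass.
cap_equivalents = examples_seen / cap_per_task

print(f"examples seen: {examples_seen:,}")
print(f"equivalent capped tasks (one pass each): {cap_equivalents:.0f}")
```

Under these assumptions the run processes about 11.5 million examples, the same volume as a single pass over 180 tasks that each hit the cap. The diff does not state the actual number of tasks, so this is only a sense of scale.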