Update README.md
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
 datasets:
 - argilla/distilabel-capybara-dpo-7k-binarized
 model-index:
-- name: zephyr-orpo-141b-
+- name: zephyr-orpo-141b-A35b-v0.1
   results: []
 inference:
   parameters:
@@ -31,7 +31,7 @@ Zephyr is a series of language models that are trained to act as helpful assista
 
 <!-- Provide a longer summary of what this model is. -->
 
-- **Model type:** A Mixture of Experts (MoE) model with 141B total parameters and 39B active parameters. Fine-tuned on a mix of publicly available, synthetic datasets.
+- **Model type:** A Mixture of Experts (MoE) model with 141B total parameters and 39B active parameters. (We initially made a small error in calculating the number of active parameters for the model ID. The model card states the correct number.) Fine-tuned on a mix of publicly available, synthetic datasets.
 - **Language(s) (NLP):** Primarily English.
 - **License:** Apache 2.0
 - **Finetuned from model:** [mistral-community/Mixtral-8x22B-v0.1](https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1)
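Given that the card describes an MoE chat model fine-tuned from Mixtral-8x22B-v0.1, a minimal usage sketch with the transformers pipeline might look like the following. The repository ID is an assumption derived from the model name in the metadata above, not something stated in this diff, and a model of this size needs multiple GPUs.

```python
# Minimal sketch, not from the card itself. The repo id below is assumed from the
# model name in the frontmatter; adjust it to wherever the checkpoint is hosted.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1",  # assumed repo id
    torch_dtype=torch.bfloat16,  # 141B total / 39B active parameters: bf16 + sharding required
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain a Mixture of Experts model in one sentence."},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=128, do_sample=False)
print(outputs[0]["generated_text"])
```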
@@ -115,9 +115,6 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_steps: 100
 - num_epochs: 3
 
-### Training results
-
-
 
 ### Framework versions
 
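The hyperparameters listed above map directly onto the standard transformers training configuration; a hedged sketch is shown below. Only the two values visible in this hunk come from the card, and the output directory is a placeholder.

```python
# Sketch only: maps the two hyperparameters visible in this hunk onto
# transformers TrainingArguments. output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="zephyr-orpo-finetune",  # placeholder, not from the card
    warmup_steps=100,       # lr_scheduler_warmup_steps: 100
    num_train_epochs=3,     # num_epochs: 3
)
```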