Update README.md
README.md CHANGED
@@ -8,7 +8,7 @@ tags:
 datasets:
 - argilla/distilabel-capybara-dpo-7k-binarized
 model-index:
-- name: zephyr-orpo-141b-
+- name: zephyr-orpo-141b-A35b-v0.1
   results: []
 inference:
   parameters:
@@ -31,7 +31,7 @@ Zephyr is a series of language models that are trained to act as helpful assista
 
 <!-- Provide a longer summary of what this model is. -->
 
-- **Model type:** A Mixture of Experts (MoE) model with 141B total parameters and 39B active parameters. Fine-tuned on a mix of publicly available, synthetic datasets.
+- **Model type:** A Mixture of Experts (MoE) model with 141B total parameters and 39B active parameters. (We initially made a small error in calculating the number of active parameters for the model ID. The model card states the correct number.) Fine-tuned on a mix of publicly available, synthetic datasets.
 - **Language(s) (NLP):** Primarily English.
 - **License:** Apache 2.0
 - **Finetuned from model:** [mistral-community/Mixtral-8x22B-v0.1](https://huggingface.co/mistral-community/Mixtral-8x22B-v0.1)
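Given that the card describes an MoE chat model fine-tuned from Mixtral-8x22B-v0.1, a minimal usage sketch with the transformers pipeline might look like the following. The repository ID is an assumption derived from the model name in the metadata above, not something stated in this diff, and a model of this size needs multiple GPUs.

```python
# Minimal sketch, not from the card itself. The repo id below is assumed from the
# model name in the frontmatter; adjust it to wherever the checkpoint is hosted.
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1",  # assumed repo id
    torch_dtype=torch.bfloat16,  # 141B total / 39B active parameters: bf16 + sharding required
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain a Mixture of Experts model in one sentence."},
]
prompt = pipe.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipe(prompt, max_new_tokens=128, do_sample=False)
print(outputs[0]["generated_text"])
```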
@@ -115,9 +115,6 @@ The following hyperparameters were used during training:
 - lr_scheduler_warmup_steps: 100
 - num_epochs: 3
 
-### Training results
-
-
 
 ### Framework versions
 
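The hyperparameters listed above map directly onto the standard transformers training configuration; a hedged sketch is shown below. Only the two values visible in this hunk come from the card, and the output directory is a placeholder.

```python
# Sketch only: maps the two hyperparameters visible in this hunk onto
# transformers TrainingArguments. output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="zephyr-orpo-finetune",  # placeholder, not from the card
    warmup_steps=100,       # lr_scheduler_warmup_steps: 100
    num_train_epochs=3,     # num_epochs: 3
)
```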