Zyphra/Zamba2-1.2B-instruct

add training recipe

#1

by qanthony - opened Oct 1

base: refs/heads/main ← from: refs/pr/1

README.md +4 -1

@@ -5,7 +5,10 @@ license: apache-2.0
 # Model Card for Zamba2-1.2B
-Zamba2-1.2B-instruct is obtained from Zamba2-1.2B by fine-tuning on instruction-following and chat datasets.
+Zamba2-1.2B-instruct is obtained from Zamba2-1.2B by fine-tuning on instruction-following and chat datasets. Specifically:
+1. SFT of the base [Zamba2-1.2B](https://huggingface.co/Zyphra/Zamba2-1.2B) model on [ultrachat_200k](HuggingFaceH4/ultrachat_200k) and [Infinity-Instruct](https://huggingface.co/datasets/BAAI/Infinity-Instruct)
+2. DPO of the SFT checkpoint on [ultrafeedback_binarized](https://huggingface.co/datasets/HuggingFaceH4/ultrafeedback_binarized), [orca_dpo_pairs](https://huggingface.co/datasets/Intel/orca_dpo_pairs), and [OpenHermesPreferences](https://huggingface.co/datasets/argilla/OpenHermesPreferences)
 Zamba2-1.2B-Instruct is a hybrid model composed of state-space ([Mamba2](https://github.com/state-spaces/mamba)) and transformer blocks. It is based on the [Zamba2-1.2B](https://huggingface.co/Zyphra/Zamba2-1.2B) architecture.
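
For readers unfamiliar with the DPO step named in the recipe, here is a minimal sketch of the standard DPO objective for a single preference pair. This PR does not include the training code, so the sketch is not Zyphra's implementation. The function name `dpo_loss`, the `beta` value, and the log-probabilities in the usage lines are illustrative assumptions.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) response pair.

    Inputs are summed token log-probabilities of each response under the
    policy being trained and under the frozen reference model (in a recipe
    like the one above, the SFT checkpoint would typically play that role).
    beta=0.1 is an assumed value, not one taken from this PR.
    """
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp
    margin = beta * (chosen_logratio - rejected_logratio)
    # loss = -log(sigmoid(margin)), computed in a numerically stable way
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# Policy identical to the reference: the loss equals log(2).
baseline = dpo_loss(-10.0, -10.0, -10.0, -10.0)
# Policy favors the chosen response more than the reference does: lower loss.
improved = dpo_loss(-5.0, -10.0, -8.0, -8.0)
```

The loss falls as the policy raises the chosen response's likelihood relative to the reference and lowers the rejected one's, which is the signal preference-pair datasets provide.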