snorkelai/Snorkel-Mistral-PairRM-DPO


viethoangtranduong committed on Jan 23, 2024

Commit 7326eae · verified · 1 Parent(s): 95d62cc

Update README.md

Files changed (1)

README.md +2 -1


@@ -19,7 +19,8 @@ We utilize ONLY the prompts from [UltraFeedback](https://huggingface.co/datasets
 This overview provides a high-level summary of our approach.
 We plan to release more detailed results and findings in the coming weeks on the [Snorkel blog](https://snorkel.ai/blog/).
-**Training recipe**: This data is formatted to be compatible with the Hugging Face's [Zephyr recipe](https://github.com/huggingface/alignment-handbook/tree/main/recipes/zephyr-7b-beta).
+### Training recipe:
+- This data is formatted to be compatible with the Hugging Face's [Zephyr recipe](https://github.com/huggingface/alignment-handbook/tree/main/recipes/zephyr-7b-beta).
 We executed the n_th DPO iteration using the "train/test_iteration_{n}".
 ### Key Premises:
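
Reviewer note: the README line about "train/test_iteration_{n}" is terse, so here is a minimal sketch of one way to read it. It assumes the shorthand expands to a pair of split names, `train_iteration_{n}` and `test_iteration_{n}`, and that iterations are numbered from 1. The helper `iteration_splits` is hypothetical, and neither assumption is confirmed by this diff.

```python
def iteration_splits(n: int) -> tuple[str, str]:
    """Return the assumed (train, test) split names for DPO iteration n.

    Assumption: "train/test_iteration_{n}" is shorthand for the two names
    "train_iteration_{n}" and "test_iteration_{n}", with n starting at 1.
    """
    if n < 1:
        raise ValueError("iteration index must be >= 1")
    return f"train_iteration_{n}", f"test_iteration_{n}"


if __name__ == "__main__":
    # The number of iterations here is arbitrary and only for illustration.
    for n in range(1, 4):
        train_split, test_split = iteration_splits(n)
        print(f"iteration {n}: train={train_split}, test={test_split}")
```

If the splits are named this way, each returned pair could be passed as the train and evaluation split names when configuring the n-th DPO run.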