library_name: transformers
pipeline_tag: text-generation
---

# AlphaMonarch-dora

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/64fc6d81d75293f417fee1d1/7xlnpalOC4qtu-VABsib4.jpeg)

<!-- Provide a quick summary of what the model is/does. -->
AlphaMonarch-dora is a DPO fine-tune of [mlabonne/NeuralMonarch-7B](https://huggingface.co/mlabonne/NeuralMonarch-7B/) on the [argilla/OpenHermes2.5-dpo-binarized-alpha](https://huggingface.co/datasets/argilla/OpenHermes2.5-dpo-binarized-alpha) preference dataset, using DoRA. It scores slightly lower on the Nous and Open LLM leaderboards than the base [AlphaMonarch](https://huggingface.co/mlabonne/AlphaMonarch-7B) and [AlphaMonarch-laser](https://huggingface.co/abideen/AlphaMonarch-laser). I trained this model for 1080 steps, keeping all hyperparameters consistent across these experiments.
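
For reference, the sketch below shows what this kind of run can look like with `peft` and `trl`: DoRA is an ordinary LoRA adapter with `use_dora=True`, trained against a DPO objective on the preference pairs. The rank, alpha, target modules, and beta are illustrative placeholders rather than the exact hyperparameters used for this model, and the dataset is assumed to already expose `prompt`/`chosen`/`rejected` columns.

```python
# Illustrative sketch only: DPO fine-tuning with a DoRA adapter via peft + trl.
# Hyperparameter values are placeholders, not the ones used for this model.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "mlabonne/NeuralMonarch-7B"
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

# DoRA in peft is a standard LoRA config with use_dora=True.
peft_config = LoraConfig(
    r=16,                       # placeholder rank
    lora_alpha=16,              # placeholder scaling
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    use_dora=True,
    task_type="CAUSAL_LM",
)

# Assumes prompt/chosen/rejected columns; the raw dataset may need mapping first.
dataset = load_dataset("argilla/OpenHermes2.5-dpo-binarized-alpha", split="train")

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(output_dir="alphamonarch-dora", max_steps=1080, beta=0.1),
    train_dataset=dataset,
    processing_class=tokenizer,  # `tokenizer=` on older trl releases
    peft_config=peft_config,
)
trainer.train()
```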
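
To try the model, a minimal `transformers` snippet along these lines should work. The repo id `abideen/AlphaMonarch-dora` is assumed from the model name, and the generation settings are arbitrary defaults.

```python
# Minimal inference sketch with the transformers text-generation pipeline.
# The repo id below is assumed from the model name; adjust if needed.
import torch
from transformers import AutoTokenizer, pipeline

model_id = "abideen/AlphaMonarch-dora"
tokenizer = AutoTokenizer.from_pretrained(model_id)
generator = pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Format the prompt with the tokenizer's chat template.
messages = [{"role": "user", "content": "What is a preference dataset?"}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
out = generator(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_p=0.95)
print(out[0]["generated_text"])
```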