Update README.md
README.md
CHANGED
@@ -11,4 +11,7 @@ datasets:
This model is Yi-34B-200K fine-tuned with DPO on the rawrr_v1 dataset using QLoRA at ctx 200. I then merged the adapter with the base model.
This model is akin to raw LLaMa 65B: it's not meant to follow instructions, but it should be useful as a base for further fine-tuning.
+
+The rawrr_v1 dataset makes this model issue fewer refusals, especially on benign topics, and makes it more completion-focused than instruct-focused.
+Base Yi-34B-200K suffers from contamination from instruct and refusal datasets; I am attempting to fix that by training base models with DPO on the rawrr dataset, making them more raw.
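The "merged the adapter with base model" step amounts to folding the LoRA low-rank update into each base weight matrix. A minimal numpy sketch of that arithmetic, with illustrative dimensions and scaling (a real merge on Yi-34B-200K would use peft's `merge_and_unload` on the trained adapter):

```python
import numpy as np

def merge_lora(W, A, B, alpha, r):
    # W: (d_out, d_in) base weight; A: (r, d_in) and B: (d_out, r) are the
    # LoRA factors. Merging folds the low-rank update into W, scaled by alpha / r,
    # so the adapter is no longer needed at inference time.
    return W + (alpha / r) * (B @ A)

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 8, 6, 2, 16  # toy sizes, not the model's real dims
W = rng.normal(size=(d_out, d_in))
A = rng.normal(size=(r, d_in))
B = rng.normal(size=(d_out, r))
W_merged = merge_lora(W, A, B, alpha, r)

# The merged layer gives the same output as base + scaled adapter applied separately.
x = rng.normal(size=(d_in,))
assert np.allclose(W_merged @ x, W @ x + (alpha / r) * (B @ (A @ x)))
```

Because the update is folded in, the merged checkpoint loads and runs like any plain base model, which is why the result can be used directly as a base for further fine-tuning.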