|
--- |
|
license: other |
|
license_name: yi-license |
|
license_link: LICENSE |
|
tags: |
|
- finetune |
|
- fine-tune |
|
datasets: |
|
- adamo1139/rawrr_v1 |
|
--- |
|
|
|
|
|
This model is Yi-34B-200K fine-tuned with DPO on the rawrr_v1 dataset using QLoRA at ctx 200. I then merged the adapter with the base model.
|
This model is akin to raw LLaMa 65B: it's not meant to follow instructions, but should instead be useful as a base for further fine-tuning.
|
|
|
Training on the rawrr_v1 dataset makes this model issue fewer refusals, especially on benign topics, and makes it completion-focused rather than instruction-focused.
|
Base Yi-34B-200K suffers from contamination by instruct and refusal datasets; I am attempting to fix that by training base models with DPO on the rawrr dataset, making them more raw.
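For reference, the per-pair objective behind DPO can be sketched as follows. This is an illustrative sketch, not the actual training code (which used QLoRA with standard libraries); the function name and the example log-probability values are placeholders.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) preference pair.

    Inputs are the summed log-probabilities of each completion under the
    policy (the model being trained) and the frozen reference model.
    """
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    # -log(sigmoid(beta * margin)): the loss shrinks as the policy prefers
    # the chosen completion more strongly than the reference model does.
    margin = chosen_ratio - rejected_ratio
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# When policy and reference agree exactly, margin = 0 and loss = log(2).
print(round(dpo_loss(-10.0, -12.0, -10.0, -12.0), 4))  # → 0.6931
```

In this setup the chosen completions come from rawrr_v1's raw/completion-style responses and the rejected completions from refusal-style responses, pushing the model away from the contaminated instruct behavior.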
|
|
|
License: |
|
yi-license + non-commercial use only |