---
language:
- en
license: apache-2.0
---
dpo-qlora-Qwen1.5-0.5B-Chat-xtuner is a DPO model fine-tuned from Qwen/Qwen1.5-0.5B-Chat. Direct preference optimization (DPO) with QLoRA was applied using the HuggingFaceH4/ultrafeedback_binarized preference dataset.
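For intuition, the DPO objective can be sketched in a few lines. This is an illustrative, minimal implementation of the standard DPO loss on per-response log-probabilities, not the exact xtuner training code; the function name and toy values are hypothetical.

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair, given summed log-probs of the
    chosen and rejected responses under the policy and a frozen reference."""
    # Implicit reward of each response: how much more the policy
    # likes it compared with the reference model.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # Negative log-sigmoid: the loss shrinks as the policy prefers the
    # chosen response more strongly (relative to the reference) than
    # it prefers the rejected one.
    return -math.log(1.0 / (1.0 + math.exp(-logits)))

# When the policy matches the reference, the margin is zero and the
# loss equals log 2; shifting probability toward the chosen response
# lowers it.
neutral = dpo_loss(-10.0, -12.0, -10.0, -12.0)
improved = dpo_loss(-9.0, -13.0, -10.0, -12.0)
```

During real training this scalar loss is averaged over a batch and only the QLoRA adapter weights are updated, while the reference model stays frozen.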
## Limitations of `dpo-qlora-Qwen1.5-0.5B-Chat-xtuner`
* Inaccurate code and facts: The model may produce incorrect code snippets and factual statements. Treat its outputs as suggestions or starting points rather than definitive, accurate solutions.
* Unreliable instruction following: The model has not undergone additional instruction fine-tuning, so it may struggle with, or fail to follow, intricate or nuanced user instructions.