Amu/dpo-qlora-Qwen1.5-0.5B-Chat-xtuner

	---
	language:
	- en
	license: apache-2.0
	---

	dpo-qlora-Qwen1.5-0.5B-Chat-xtuner is a DPO model fine-tuned from Qwen/Qwen1.5-0.5B-Chat. It was trained with direct preference optimization (DPO) on the HuggingFaceH4/ultrafeedback_binarized dataset.
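For readers unfamiliar with DPO, the per-example objective can be sketched in a few lines. This is a minimal illustration of the standard DPO loss, not the training code used for this model. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions chosen for the example. The inputs are summed log-probabilities of the chosen and rejected responses under the policy being trained and under the frozen reference model.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-example DPO loss: -log(sigmoid(beta * margin)).

    All names and the default beta are illustrative assumptions,
    not values taken from this model's training configuration.
    """
    # How much more the policy likes each response than the reference does.
    chosen_logratio = policy_chosen_logp - ref_chosen_logp
    rejected_logratio = policy_rejected_logp - ref_rejected_logp

    # Scaled preference margin between chosen and rejected responses.
    margin = beta * (chosen_logratio - rejected_logratio)

    # -log(sigmoid(margin)) == log(1 + exp(-margin)), computed stably.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy and the reference agree, the margin is zero and the loss equals log 2. The loss falls as the policy raises the likelihood of the chosen response relative to the rejected one, measured against the reference.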

	## Limitations of `dpo-qlora-Qwen1.5-0.5B-Chat-xtuner`

	* May Generate Inaccurate Code and Facts: The model may produce incorrect code snippets and factual statements. Users should treat these outputs as suggestions or starting points, not as definitive or accurate solutions.

	* Unreliable Responses to Instructions: The model has not undergone instruction fine-tuning. As a result, it may struggle or fail to follow intricate or nuanced instructions.