license: apache-2.0
language:
- en
# 4season/model_eval_test
# **Introduction**
This model is a test version of an alignment-tuned model.
We use state-of-the-art instruction fine-tuning methods, including direct preference optimization (DPO).
After DPO training, we linearly merged models to boost performance.
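
This card does not state which checkpoints were merged or with what weights. As a rough illustration only, linear merging generally means taking a weighted average of matching parameters across models that share an architecture. The sketch below assumes each model is represented as a mapping from parameter name to a flat list of floats; the function name `linear_merge` and the example weights are hypothetical, and a real checkpoint would hold tensors rather than Python lists.

```python
from typing import Dict, List, Sequence

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of values.
StateDict = Dict[str, List[float]]


def linear_merge(state_dicts: Sequence[StateDict], weights: Sequence[float]) -> StateDict:
    """Return the weighted average of several parameter mappings.

    Weights are normalized to sum to 1, so [1, 1] and [0.5, 0.5] are equivalent.
    """
    if not state_dicts:
        raise ValueError("at least one state dict is required")
    if len(state_dicts) != len(weights):
        raise ValueError("need exactly one weight per state dict")

    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    normalized = [w / total for w in weights]

    # All models must expose the same parameter names to be averaged.
    names = state_dicts[0].keys()
    for sd in state_dicts[1:]:
        if sd.keys() != names:
            raise ValueError("state dicts have different parameter names")

    merged: StateDict = {}
    for name in names:
        length = len(state_dicts[0][name])
        if any(len(sd[name]) != length for sd in state_dicts):
            raise ValueError(f"parameter {name!r} has mismatched sizes")
        # Element-wise weighted sum across models.
        merged[name] = [
            sum(w * sd[name][i] for w, sd in zip(normalized, state_dicts))
            for i in range(length)
        ]
    return merged


# Toy example with two made-up "models" and equal weights.
model_a = {"layer.weight": [1.0, 3.0]}
model_b = {"layer.weight": [3.0, 5.0]}
merged = linear_merge([model_a, model_b], weights=[1.0, 1.0])
```

With equal weights this reduces to a plain average of the two parameter sets; unequal weights shift the result toward one of the models.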