May I ask how you merged the adapters into the base model?
Sorry to bother you.
This question may not be directly related to your model, but I've been looking around and have yet to find a solution.
I've fine-tuned a model using QLoRA, and I can't merge the adapters (checkpoint) back into the base model.
I've tried the script provided by TheBloke, but I got errors saying the layer sizes don't match.
Your model seems to work fine, so I'm wondering how you merged it.
Thank you.
Also, your model is really impressive!
Hi,
I tried two ways to do the fusion (see the generic sketch after this list):
- since I trained the model with https://github.com/hiyouga/LLaMA-Efficient-Tuning, I directly used https://github.com/hiyouga/LLaMA-Efficient-Tuning/blob/main/src/export_model.py to do the merge;
- alternatively, you can use https://github.com/jondurbin/qlora/blob/main/qmerge.py.
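
If it helps to see the general idea, here is a minimal sketch of the standard PEFT merge flow, assuming a PEFT-format adapter checkpoint; the model name and paths below are placeholders, not my exact setup:

```python
# Minimal sketch: fold LoRA/QLoRA adapters into the base model with PEFT.
# "base-model-name" and the paths are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_name = "base-model-name"          # placeholder: your base model
adapter_path = "path/to/adapter-checkpoint"  # placeholder: your QLoRA checkpoint

# Load the base model in fp16 (not 4-bit) so the merged weights
# can be saved as a normal standalone checkpoint.
base_model = AutoModelForCausalLM.from_pretrained(
    base_model_name,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Attach the LoRA adapters, then merge them into the base weights.
model = PeftModel.from_pretrained(base_model, adapter_path)
model = model.merge_and_unload()

# Save the merged model and tokenizer as a regular checkpoint.
tokenizer = AutoTokenizer.from_pretrained(base_model_name)
model.save_pretrained("merged-model")
tokenizer.save_pretrained("merged-model")
```

I can't say for sure without seeing the error, but size-mismatch errors during merging sometimes come from merging into a quantized (4-bit) copy of the base model or from a resized tokenizer/embedding, so it may be worth double-checking both.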
Comparing the two methods, the second generally gets better ARC (+0.15) and TruthfulQA (+0.3) scores, but the other two, MMLU (-0.2) and HellaSwag (-0.2), seem to degrade.
The version on the leaderboard was generated with the first fusion method.
Thank you so much, I'll check out the two methods you mentioned.