---
license: llama2
---
|
|
|
# LM-cocktail 10.7B v1 |
|
|
|
|
|
This is a 50%-50% merge of the SOLAR model and meow:
|
|
|
https://huggingface.co/upstage/SOLAR-10.7B-Instruct-v1.0 |
|
|
|
https://huggingface.co/rishiraj/meow |
|
|
|
|
|
which ranked #1 and #2 among models <13B on the https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard as of 2023/12/20.
|
|
|
|
|
# Code |
|
|
|
LM-Cocktail is a novel technique for merging multiple models: https://arxiv.org/abs/2311.13534
|
|
|
The merging code comes from this repo: https://github.com/FlagOpen/FlagEmbedding.git
|
|
|
Merging scripts are available under the [./scripts](./scripts) folder.
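
For reference, here is a minimal sketch of what a 50%-50% merge looks like with the `LM_Cocktail` package from the FlagEmbedding repo. The argument names follow that repo's documented `mix_models` API, but they may differ across versions, and the output path is just an illustrative placeholder; see the scripts in [./scripts](./scripts) for the exact commands used for this model.

```python
# Sketch of a 50%-50% model cocktail using the LM_Cocktail package
# (pip install -U LM_Cocktail). Argument names follow the FlagEmbedding
# repo's mix_models API; check the repo for the version you install.
from LM_Cocktail import mix_models

model = mix_models(
    model_names_or_paths=[
        "upstage/SOLAR-10.7B-Instruct-v1.0",
        "rishiraj/meow",
    ],
    model_type="decoder",               # both are decoder-only causal LMs
    weights=[0.5, 0.5],                 # equal 50%-50% mix
    output_path="./LMCocktail-10.7B-v1" # placeholder output directory
)
```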
|
|
|
|
|
# Result |
|
|
|
In my tests, the SOLAR model is the first model <30B that can answer this question:
|
|
|
``` |
|
What will AI be like in the year 1010 A.D? |
|
``` |
|
|
|
without hallucinating that 1010 A.D. is in the future (as other llama2-based models do).
|
|
|
Larger models, such as Yi-34B, can also answer this paradoxical question correctly, since they are large enough.
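
If you want to reproduce the test, a standard Hugging Face `transformers` generation call is enough; the local path below is a placeholder for wherever the merged weights live, and the generation settings are just reasonable defaults, not the exact ones used for the screenshots.

```python
# Quick sanity check of the merged model on the test prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

path = "./LMCocktail-10.7B-v1"  # placeholder: path or repo id of the merged model
tokenizer = AutoTokenizer.from_pretrained(path)
model = AutoModelForCausalLM.from_pretrained(
    path, torch_dtype=torch.float16, device_map="auto"
)

prompt = "What will AI be like in the year 1010 A.D?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```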
|
|
|
### SOLAR 10.7B output |
|
|
|
![img](./assets/SOLAR.png) |
|
|
|
### LMCocktail 10.7B output1 |
|
|
|
![img](./assets/SOLAR_mixed.png) |
|
|
|
### LMCocktail 10.7B output2 |
|
|
|
![img](./assets/SOLAR_mixed2.png) |