---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- mistralai/Mistral-7B-Instruct-v0.2
- beowolx/CodeNinja-1.0-OpenChat-7B
base_model:
- mistralai/Mistral-7B-Instruct-v0.2
- beowolx/CodeNinja-1.0-OpenChat-7B
model-index:
- name: Hugo-7B-slerp
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 64.51
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=paulilioaica/Hugo-7B-slerp
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 84.77
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=paulilioaica/Hugo-7B-slerp
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 62.54
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=paulilioaica/Hugo-7B-slerp
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 57.13
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=paulilioaica/Hugo-7B-slerp
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 80.03
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=paulilioaica/Hugo-7B-slerp
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 53.45
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=paulilioaica/Hugo-7B-slerp
      name: Open LLM Leaderboard
---

# Hugo-7B-slerp
Hugo-7B-slerp is a slerp merge of the following models, built with mergekit:
* [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2)
* [beowolx/CodeNinja-1.0-OpenChat-7B](https://huggingface.co/beowolx/CodeNinja-1.0-OpenChat-7B)

## 🧩 Configuration

```yaml
slices:
  - sources:
      - model: mistralai/Mistral-7B-Instruct-v0.2
        layer_range: [0, 32]
      - model: beowolx/CodeNinja-1.0-OpenChat-7B
        layer_range: [0, 32]
merge_method: slerp
base_model: mistralai/Mistral-7B-Instruct-v0.2
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16
```

The interpolation factor `t` controls how far each tensor moves from the base model (`t = 0`) toward CodeNinja (`t = 1`): self-attention and MLP tensors follow opposite schedules across the 32 layers, and every other tensor uses a constant `t = 0.5`.

## 📈 Performance

| Model | Average | ARC | HellaSwag | MMLU | TruthfulQA | Winogrande | GSM8K |
| --- | --- | --- | --- | --- | --- | --- | --- |
| [paulilioaica/Hugo-7B-slerp](#) | **67.07** | **64.51** | 84.77 | **62.54** | 57.13 | **80.03** | 53.45 |
| [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2) | 65.71 | 63.14 | 84.88 | 60.78 | 68.26 | 77.19 | 40.03 |
| [beowolx/CodeNinja-1.0-OpenChat-7B](https://huggingface.co/beowolx/CodeNinja-1.0-OpenChat-7B) | 67.4 | 63.48 | 83.65 | 63.77 | 47.16 | 79.79 | 66.57 |

Bold values mark the benchmarks where this merge outperforms the base model, Mistral-7B-Instruct-v0.2.

## 💻 Usage

```python
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "paulilioaica/Hugo-7B-slerp"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Format the chat with the model's template, then generate a reply.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```

## 🛈 More on mergekit

For background on model merging with [mergekit](https://huggingface.co/blog/mlabonne/merge-models), see the linked blog post.

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)

Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_paulilioaica__Hugo-7B-slerp).

| Metric                          |Value|
|---------------------------------|----:|
|Avg.                             |67.07|
|AI2 Reasoning Challenge (25-Shot)|64.51|
|HellaSwag (10-Shot)              |84.77|
|MMLU (5-Shot)                    |62.54|
|TruthfulQA (0-shot)              |57.13|
|Winogrande (5-shot)              |80.03|
|GSM8k (5-shot)                   |53.45|
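
## 🔬 Slerp, illustrated

For intuition about the `slerp` merge method in the Configuration section, here is a minimal, self-contained sketch of spherical linear interpolation between two weight tensors. It illustrates the idea only and is not mergekit's actual implementation; the tensors and shapes are made up for the example.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherically interpolate from tensor a (t=0) toward tensor b (t=1)."""
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    # Measure the angle between the two weight vectors via their unit directions.
    a_dir = a_flat / (a_flat.norm() + eps)
    b_dir = b_flat / (b_flat.norm() + eps)
    dot = torch.clamp(a_dir @ b_dir, -1.0, 1.0)
    omega = torch.arccos(dot)
    # Nearly colinear vectors: fall back to plain linear interpolation.
    if omega < 1e-4:
        return ((1 - t) * a_flat + t * b_flat).reshape(a.shape).to(a.dtype)
    sin_omega = torch.sin(omega)
    # Weight each endpoint so the result sweeps along the arc between them.
    out = (torch.sin((1 - t) * omega) / sin_omega) * a_flat \
        + (torch.sin(t * omega) / sin_omega) * b_flat
    return out.reshape(a.shape).to(a.dtype)

# Toy example: blend two random "weight matrices" halfway, like the config's default t = 0.5.
w_base, w_other = torch.randn(4, 4), torch.randn(4, 4)
merged = slerp(0.5, w_base, w_other)
print(merged)
```

Unlike plain linear averaging, slerp follows the arc between the two weight vectors, which tends to preserve their geometric structure even when the two models' weights point in quite different directions.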
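
## 🔁 Reproducing the merge

To rebuild a merge like this one, the blog post linked above drives mergekit from its command line. Below is a minimal sketch, assuming the YAML from the Configuration section is saved as `config.yaml` and that the CLI flags still match those used in that post:

```python
# Install mergekit, then run the merge described by config.yaml into ./Hugo-7B-slerp.
# Flags follow the linked blog post; check `mergekit-yaml --help` for the current set.
!pip install -qU git+https://github.com/arcee-ai/mergekit.git
!mergekit-yaml config.yaml ./Hugo-7B-slerp --copy-tokenizer --lazy-unpickle
```

The output directory then contains a standard Hugging Face checkpoint that the Usage snippet above can load directly.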