---
license: other
tags:
- merge
- mergekit
- lazymergekit
base_model:
- NousResearch/Meta-Llama-3-8B-Instruct
- mlabonne/OrpoLlama-3-8B
- cognitivecomputations/dolphin-2.9-llama3-8b
- Danielbrdz/Barcenas-Llama3-8b-ORPO
- VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct
- vicgalle/Configurable-Llama-3-8B-v0.3
- MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
model-index:
- name: ChimeraLlama-3-8B-v3
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 44.08
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=mlabonne/ChimeraLlama-3-8B-v3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 27.65
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=mlabonne/ChimeraLlama-3-8B-v3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 7.85
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=mlabonne/ChimeraLlama-3-8B-v3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 5.59
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=mlabonne/ChimeraLlama-3-8B-v3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 8.38
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=mlabonne/ChimeraLlama-3-8B-v3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 29.65
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=mlabonne/ChimeraLlama-3-8B-v3
      name: Open LLM Leaderboard
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/ChimeraLlama-3-8B-v3-GGUF
This is a quantized version of [mlabonne/ChimeraLlama-3-8B-v3](https://huggingface.co/mlabonne/ChimeraLlama-3-8B-v3) created using llama.cpp.
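
The resulting GGUF files run on any llama.cpp-compatible runtime. Below is a minimal inference sketch using the `llama-cpp-python` bindings; the quant filename is hypothetical, so substitute the exact file listed in this repository.

```python
# Minimal sketch: chat inference over a downloaded GGUF quant with
# llama-cpp-python (pip install llama-cpp-python). The filename below
# is hypothetical -- use one of the quants actually listed in this repo.
from llama_cpp import Llama

llm = Llama(
    model_path="ChimeraLlama-3-8B-v3.Q4_K_M.gguf",  # hypothetical filename
    n_ctx=8192,       # context window
    n_gpu_layers=-1,  # offload all layers to GPU if one is available
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is a large language model?"}],
    max_tokens=256,
    temperature=0.7,
)
print(response["choices"][0]["message"]["content"])
```
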
# Original Model Card

# ChimeraLlama-3-8B-v3

ChimeraLlama-3-8B-v3 is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
* [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
* [mlabonne/OrpoLlama-3-8B](https://huggingface.co/mlabonne/OrpoLlama-3-8B)
* [cognitivecomputations/dolphin-2.9-llama3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b)
* [Danielbrdz/Barcenas-Llama3-8b-ORPO](https://huggingface.co/Danielbrdz/Barcenas-Llama3-8b-ORPO)
* [VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct](https://huggingface.co/VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct)
* [vicgalle/Configurable-Llama-3-8B-v0.3](https://huggingface.co/vicgalle/Configurable-Llama-3-8B-v0.3)
* [MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3](https://huggingface.co/MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3)

## 🧩 Configuration

```yaml
models:
  - model: NousResearch/Meta-Llama-3-8B
    # No parameters necessary for base model
  - model: NousResearch/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.6
      weight: 0.5
  - model: mlabonne/OrpoLlama-3-8B
    parameters:
      density: 0.55
      weight: 0.05
  - model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.55
      weight: 0.05
  - model: Danielbrdz/Barcenas-Llama3-8b-ORPO
    parameters:
      density: 0.55
      weight: 0.2
  - model: VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct
    parameters:
      density: 0.55
      weight: 0.1
  - model: vicgalle/Configurable-Llama-3-8B-v0.3
    parameters:
      density: 0.55
      weight: 0.05
  - model: MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
    parameters:
      density: 0.55
      weight: 0.05
merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
parameters:
  int8_mask: true
dtype: float16
```
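
For intuition: `dare_ties` works on each model's *delta* from the base (fine-tuned weights minus base weights). Each delta entry is kept with probability `density` and rescaled by `1/density` so its expected contribution is unchanged, signs are resolved across models TIES-style, and the weighted sum is added back onto the base. A rough numpy illustration of the drop-and-rescale step (a simplification, not mergekit's actual implementation):

```python
import numpy as np

def dare_drop_and_rescale(delta, density, rng):
    """DARE step: keep each entry with probability `density`, then divide
    by `density` so the expected value of the delta is preserved.
    Simplified -- real dare_ties also elects signs across models (TIES)."""
    mask = rng.random(delta.shape) < density
    return (delta * mask) / density

# Toy merge of two deltas onto a base tensor, loosely mirroring the config.
rng = np.random.default_rng(0)
base = np.zeros(4)
deltas = [np.array([0.4, -0.2, 0.1, 0.0]), np.array([0.1, 0.3, -0.2, 0.5])]
weights = [0.5, 0.2]  # per-model `weight` values
merged = base + sum(w * dare_drop_and_rescale(d, density=0.55, rng=rng)
                    for w, d in zip(weights, deltas))
print(merged)
```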
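
To run the merge itself, this configuration can be fed to mergekit's command-line tool. A minimal sketch, assuming mergekit is installed and the YAML above is saved as `config.yaml` (LazyMergekit automates the same steps in Colab):

```python
# Sketch: invoke mergekit's CLI on the config above (notebook-style shell).
# Flags mirror common LazyMergekit defaults; adjust paths as needed.
!pip install -qU mergekit

!mergekit-yaml config.yaml merge --copy-tokenizer --lazy-unpickle --out-shard-size 1B
```
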
## 💻 Usage

```python
!pip install -qU transformers accelerate

from transformers import AutoTokenizer
import transformers
import torch

model = "mlabonne/ChimeraLlama-3-8B-v3"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Build the prompt with the model's chat template, then generate with a
# float16 text-generation pipeline spread across available devices.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
```
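
On GPUs with limited memory, the same checkpoint can be loaded in 4-bit instead of float16. A sketch assuming the `bitsandbytes` library is installed (expect a small quality drop versus float16):

```python
# Sketch: 4-bit NF4 loading with bitsandbytes to reduce VRAM roughly 4x
# versus float16 (assumes: pip install -qU bitsandbytes).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "mlabonne/ChimeraLlama-3-8B-v3"
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=bnb_config, device_map="auto"
)

input_ids = tokenizer.apply_chat_template(
    [{"role": "user", "content": "What is a large language model?"}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
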
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_mlabonne__ChimeraLlama-3-8B-v3).

| Metric              | Value |
|---------------------|------:|
| Avg.                | 20.53 |
| IFEval (0-Shot)     | 44.08 |
| BBH (3-Shot)        | 27.65 |
| MATH Lvl 5 (4-Shot) |  7.85 |
| GPQA (0-shot)       |  5.59 |
| MuSR (0-shot)       |  8.38 |
| MMLU-PRO (5-shot)   | 29.65 |
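
These scores can be approximated locally with EleutherAI's lm-evaluation-harness. A sketch for one task; the `leaderboard_ifeval` task name follows the harness's Open LLM Leaderboard v2 grouping and is an assumption here, so exact numbers may differ from the leaderboard's pinned setup:

```python
# Sketch: re-run the IFEval leaderboard task locally (notebook-style shell).
# Task name `leaderboard_ifeval` is an assumption from lm-eval's v2 tasks.
!pip install -qU lm_eval

!lm_eval --model hf \
    --model_args pretrained=mlabonne/ChimeraLlama-3-8B-v3,dtype=float16 \
    --tasks leaderboard_ifeval \
    --batch_size auto
```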