munish0838 committed on
Commit
f16ca6e
1 Parent(s): c12fb30

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +351 -0
README.md ADDED
@@ -0,0 +1,351 @@
---
language:
- en
license: mit
library_name: transformers
tags:
- axolotl
- finetune
- dpo
- microsoft
- phi
- pytorch
- phi-3
- nlp
- code
- chatml
base_model: microsoft/Phi-3-mini-4k-instruct
pipeline_tag: text-generation
inference: false
model_creator: MaziyarPanahi
quantized_by: MaziyarPanahi
model-index:
- name: calme-2.3-phi3-4b
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: AI2 Reasoning Challenge (25-Shot)
      type: ai2_arc
      config: ARC-Challenge
      split: test
      args:
        num_few_shot: 25
    metrics:
    - type: acc_norm
      value: 63.48
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/calme-2.3-phi3-4b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: HellaSwag (10-Shot)
      type: hellaswag
      split: validation
      args:
        num_few_shot: 10
    metrics:
    - type: acc_norm
      value: 80.86
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/calme-2.3-phi3-4b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU (5-Shot)
      type: cais/mmlu
      config: all
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 69.24
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/calme-2.3-phi3-4b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: TruthfulQA (0-shot)
      type: truthful_qa
      config: multiple_choice
      split: validation
      args:
        num_few_shot: 0
    metrics:
    - type: mc2
      value: 60.66
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/calme-2.3-phi3-4b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: Winogrande (5-shot)
      type: winogrande
      config: winogrande_xl
      split: validation
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 72.77
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/calme-2.3-phi3-4b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GSM8k (5-shot)
      type: gsm8k
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 74.53
      name: accuracy
    source:
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/calme-2.3-phi3-4b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 49.26
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-2.3-phi3-4b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 37.66
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-2.3-phi3-4b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 2.95
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-2.3-phi3-4b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 9.06
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-2.3-phi3-4b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 7.75
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-2.3-phi3-4b
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 31.42
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=MaziyarPanahi/calme-2.3-phi3-4b
      name: Open LLM Leaderboard
---

![](https://cdn.discordapp.com/attachments/791342238541152306/1264099835221381251/image.png?ex=669ca436&is=669b52b6&hm=129f56187c31e1ed22cbd1bcdbc677a2baeea5090761d2f1a458c8b1ec7cca4b&)

# QuantFactory/calme-2.3-phi3-4b-GGUF
This is a quantized version of [MaziyarPanahi/calme-2.3-phi3-4b](https://huggingface.co/MaziyarPanahi/calme-2.3-phi3-4b), created using llama.cpp.

# Original Model Card

<img src="./phi-3-instruct.webp" alt="Phi-3 Logo" width="500" style="margin-left:'auto' margin-right:'auto' display:'block'"/>

# MaziyarPanahi/calme-2.3-phi3-4b

This model is a DPO fine-tune of the `microsoft/Phi-3-mini-4k-instruct` model.

# ⚡ Quantized GGUF

All GGUF models are available here: [MaziyarPanahi/calme-2.3-phi3-4b-GGUF](https://huggingface.co/MaziyarPanahi/calme-2.3-phi3-4b-GGUF)

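The GGUF files work with any llama.cpp-based runtime. Below is a minimal sketch using `llama-cpp-python` together with `huggingface_hub`; the quant filename is an assumption, so substitute whichever quantization level you actually download.

```python
# Minimal sketch (not from the original card): run a GGUF quant with llama-cpp-python.
# The filename below is an assumed example -- replace it with the quant you downloaded.
from huggingface_hub import hf_hub_download
from llama_cpp import Llama

gguf_path = hf_hub_download(
    repo_id="QuantFactory/calme-2.3-phi3-4b-GGUF",
    filename="calme-2.3-phi3-4b.Q4_K_M.gguf",  # assumed quant name
)

llm = Llama(
    model_path=gguf_path,
    n_ctx=4096,       # Phi-3-mini-4k context window
    n_gpu_layers=-1,  # offload all layers to GPU if one is available
)

# llama-cpp-python applies the chat template stored in the GGUF metadata
# (expected to be ChatML for this model).
out = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who are you?"},
    ],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```
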
# 🏆 [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__calme-2.3-phi3-4b)

**Leaderboard 2**

| Metric              | Value |
|---------------------|------:|
| Avg.                | 23.38 |
| IFEval (0-Shot)     | 49.26 |
| BBH (3-Shot)        | 37.66 |
| MATH Lvl 5 (4-Shot) |  2.95 |
| GPQA (0-shot)       |  9.06 |
| MuSR (0-shot)       |  7.75 |
| MMLU-PRO (5-shot)   | 31.42 |

**Leaderboard 1**

| Metric                            | Value |
|-----------------------------------|------:|
| Avg.                              | 70.26 |
| AI2 Reasoning Challenge (25-Shot) | 63.48 |
| HellaSwag (10-Shot)               | 80.86 |
| MMLU (5-Shot)                     | 69.24 |
| TruthfulQA (0-shot)               | 60.66 |
| Winogrande (5-shot)               | 72.77 |
| GSM8k (5-shot)                    | 74.53 |

As of 03/06/2024, `MaziyarPanahi/calme-2.3-phi3-4b` is the best-performing Phi-3-mini-4k model on the Open LLM Leaderboard.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/5fd5e18a90b6dc4633f6d292/tKhQ55r7znR4X8GofwYj1.png)

# Prompt Template

This model uses the `ChatML` prompt template:

```
<|im_start|>system
{System}
<|im_end|>
<|im_start|>user
{User}
<|im_end|>
<|im_start|>assistant
{Assistant}
```

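For reference, the tokenizer's built-in chat template should render exactly this layout from a list of messages. A minimal sketch (the messages are placeholders):

```python
# Minimal sketch (not from the original card): render the ChatML prompt
# from a message list using tokenizer.apply_chat_template.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "MaziyarPanahi/calme-2.3-phi3-4b",
    trust_remote_code=True,
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who are you?"},
]

# add_generation_prompt=True appends the opening <|im_start|>assistant turn
# so the model knows it should answer next.
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)
```
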
# How to use

You can use this model by passing `MaziyarPanahi/calme-2.3-phi3-4b` as the model name to Hugging Face's transformers library.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline

model_id = "MaziyarPanahi/calme-2.3-phi3-4b"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
    # attn_implementation="flash_attention_2"
)

tokenizer = AutoTokenizer.from_pretrained(
    model_id,
    trust_remote_code=True
)

streamer = TextStreamer(tokenizer)

messages = [
    {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
    {"role": "user", "content": "Who are you?"},
]

# Tokens at which generation should stop
terminators = [
    tokenizer.eos_token_id,                           # this should be <|im_end|>
    tokenizer.convert_tokens_to_ids("<|assistant|>"), # the model sometimes stops at <|assistant|>
    tokenizer.convert_tokens_to_ids("<|end|>")        # the model sometimes stops at <|end|>
]

pipe = pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
)

generation_args = {
    "max_new_tokens": 500,
    "return_full_text": False,
    "temperature": 0.0,
    "do_sample": False,
    "streamer": streamer,
    "eos_token_id": terminators,
}

output = pipe(messages, **generation_args)
print(output[0]['generated_text'])
```