MoreWrong committed on
Commit
e15943b
1 Parent(s): 5a3c4b5

Upload README.md

Files changed (1)
  1. README.md +234 -0
README.md ADDED
@@ -0,0 +1,234 @@
+ ---
+ language:
+ - en
+ license: llama3
+ library_name: transformers
+ tags:
+ - axolotl
+ - finetune
+ - dpo
+ - facebook
+ - meta
+ - pytorch
+ - llama
+ - llama-3
+ base_model: meta-llama/Meta-Llama-3-8B-Instruct
+ datasets:
+ - Intel/orca_dpo_pairs
+ model_name: Llama-3-8B-Instruct-DPO-v0.3
+ pipeline_tag: text-generation
+ license_name: llama3
+ license_link: LICENSE
+ inference: false
+ model_creator: MaziyarPanahi
+ quantized_by: MaziyarPanahi
+ model-index:
+ - name: Llama-3-8B-Instruct-DPO-v0.3
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: AI2 Reasoning Challenge (25-Shot)
+       type: ai2_arc
+       config: ARC-Challenge
+       split: test
+       args:
+         num_few_shot: 25
+     metrics:
+     - type: acc_norm
+       value: 62.63
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: HellaSwag (10-Shot)
+       type: hellaswag
+       split: validation
+       args:
+         num_few_shot: 10
+     metrics:
+     - type: acc_norm
+       value: 79.2
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU (5-Shot)
+       type: cais/mmlu
+       config: all
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 68.33
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: TruthfulQA (0-shot)
+       type: truthful_qa
+       config: multiple_choice
+       split: validation
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: mc2
+       value: 53.29
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: Winogrande (5-shot)
+       type: winogrande
+       config: winogrande_xl
+       split: validation
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 75.37
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GSM8k (5-shot)
+       type: gsm8k
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 70.58
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
+       name: Open LLM Leaderboard
+ ---
+
+ <img src="./llama-3-merges.webp" alt="Llama-3 DPO Logo" width="500" style="margin-left:auto; margin-right:auto; display:block"/>
+
+
+ # Llama-3-8B-Instruct-DPO-v0.3 (32k)
+
+ This model is a DPO fine-tune of `meta-llama/Meta-Llama-3-8B-Instruct`. I used `rope_theta` to extend the context length up to 32K safely.
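
As a rough illustration of why `rope_theta` matters for context length, the sketch below computes the wavelength of each rotary (RoPE) frequency pair for two base values. It assumes a head dimension of 128 and a stock base of 500000 for Llama 3; the larger base of 2000000 is purely hypothetical and is not the value used for this model. The helper name `rope_wavelengths` is also made up for this example.

```python
import math


def rope_wavelengths(rope_theta: float, head_dim: int = 128) -> list[float]:
    """Return the wavelength, in tokens, of each RoPE frequency pair."""
    # Standard RoPE: inv_freq_i = rope_theta ** (-2i / head_dim), i = 0 .. head_dim/2 - 1
    inv_freq = [rope_theta ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]
    return [2 * math.pi / f for f in inv_freq]


base = rope_wavelengths(500_000.0)        # assumed stock Llama 3 base
extended = rope_wavelengths(2_000_000.0)  # hypothetical larger base, illustration only

print(f"longest wavelength, assumed stock base:  {base[-1]:,.0f} tokens")
print(f"longest wavelength, hypothetical base:   {extended[-1]:,.0f} tokens")
```

A larger base stretches the slow-rotating dimensions, so positions far apart remain distinguishable over a longer window. Whether a given value works well in practice still depends on how the model was trained.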
+
+ # Quantized GGUF
+
+ All GGUF models come with a context length of `32000`: [Llama-3-8B-Instruct-DPO-v0.3-32k-GGUF](https://huggingface.co/MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3-32k-GGUF)
+
+ # Prompt Template
+
+ This model uses the `ChatML` prompt template:
+
+ ```
+ <|im_start|>system
+ {System}
+ <|im_end|>
+ <|im_start|>user
+ {User}
+ <|im_end|>
+ <|im_start|>assistant
+ {Assistant}
+ ```
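
For cases where the tokenizer is not at hand, the template above can be assembled by hand. The following is a minimal sketch that mirrors the layout exactly as printed in this README; `format_chatml` is a hypothetical helper, and the exact whitespace produced by the model's own chat template may differ.

```python
def format_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts in the ChatML layout shown above."""
    # Each turn: opening tag with the role, the content, then the closing tag.
    prompt = "".join(
        f"<|im_start|>{m['role']}\n{m['content']}\n<|im_end|>\n" for m in messages
    )
    if add_generation_prompt:
        # Leave the assistant turn open so the model writes the reply.
        prompt += "<|im_start|>assistant\n"
    return prompt


example = format_chatml(
    [
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "Hi"},
    ]
)
print(example)
```

When the tokenizer is available, `tokenizer.apply_chat_template` should be preferred, since it uses the template shipped with the model.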
+
+ # How to use
+
+ To use this model, pass `MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3` as the model name in Hugging Face's
+ transformers library.
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer, pipeline
+
+ model_id = "MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3"
+
+ # Load the weights in bfloat16 and spread them across the available devices.
+ model = AutoModelForCausalLM.from_pretrained(
+     model_id,
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+     trust_remote_code=True,
+     # attn_implementation="flash_attention_2"
+ )
+
+ tokenizer = AutoTokenizer.from_pretrained(
+     model_id,
+     trust_remote_code=True
+ )
+
+ # Print tokens to stdout as they are generated.
+ streamer = TextStreamer(tokenizer)
+
+ # Named `pipe` so it does not shadow the imported `pipeline` function.
+ pipe = pipeline(
+     "text-generation",
+     model=model,
+     tokenizer=tokenizer,
+     streamer=streamer
+ )
+
+ # Then you can use the pipeline to generate text.
+
+ messages = [
+     {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
+     {"role": "user", "content": "Who are you?"},
+ ]
+
+ # Render the conversation with the model's chat template, leaving the assistant turn open.
+ prompt = tokenizer.apply_chat_template(
+     messages,
+     tokenize=False,
+     add_generation_prompt=True
+ )
+
+ # Stop at either the tokenizer's EOS token or the ChatML end-of-turn token.
+ terminators = [
+     tokenizer.eos_token_id,
+     tokenizer.convert_tokens_to_ids("<|im_end|>")
+ ]
+
+ outputs = pipe(
+     prompt,
+     max_new_tokens=8192,
+     eos_token_id=terminators,
+     do_sample=True,
+     temperature=0.6,
+     top_p=0.95,
+ )
+ # The pipeline returns prompt + completion; slice off the prompt.
+ print(outputs[0]["generated_text"][len(prompt):])
+ ```
+
+
+
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__Llama-3-8B-Instruct-DPO-v0.3)
+
+ |             Metric              |Value|
+ |---------------------------------|----:|
+ |Avg.                             |68.23|
+ |AI2 Reasoning Challenge (25-Shot)|62.63|
+ |HellaSwag (10-Shot)              |79.20|
+ |MMLU (5-Shot)                    |68.33|
+ |TruthfulQA (0-shot)              |53.29|
+ |Winogrande (5-shot)              |75.37|
+ |GSM8k (5-shot)                   |70.58|
+
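
The `Avg.` row appears to be the unweighted mean of the six benchmark scores. A quick check, using only the values from the table above (the variable names are illustrative):

```python
# Scores copied from the leaderboard table above.
scores = {
    "AI2 Reasoning Challenge (25-Shot)": 62.63,
    "HellaSwag (10-Shot)": 79.20,
    "MMLU (5-Shot)": 68.33,
    "TruthfulQA (0-shot)": 53.29,
    "Winogrande (5-shot)": 75.37,
    "GSM8k (5-shot)": 70.58,
}

# Unweighted mean over the six benchmarks.
average = sum(scores.values()) / len(scores)
print(f"{average:.2f}")  # 68.23
```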