MoreWrong committed on
Commit
b6d4154
1 Parent(s): e15943b

Update README.md

Files changed (1)
  1. README.md +4 -209
README.md CHANGED
@@ -13,222 +13,17 @@ tags:
  - llama
  - llama-3
  base_model: meta-llama/Meta-Llama-3-8B-Instruct
- datasets:
- - Intel/orca_dpo_pairs
  model_name: Llama-3-8B-Instruct-DPO-v0.3
  pipeline_tag: text-generation
  license_name: llama3
  license_link: LICENSE
- inference: false
- model_creator: MaziyarPanahi
- quantized_by: MaziyarPanahi
- model-index:
- - name: Llama-3-8B-Instruct-DPO-v0.3
- results:
- - task:
- type: text-generation
- name: Text Generation
- dataset:
- name: AI2 Reasoning Challenge (25-Shot)
- type: ai2_arc
- config: ARC-Challenge
- split: test
- args:
- num_few_shot: 25
- metrics:
- - type: acc_norm
- value: 62.63
- name: normalized accuracy
- source:
- url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
- name: Open LLM Leaderboard
- - task:
- type: text-generation
- name: Text Generation
- dataset:
- name: HellaSwag (10-Shot)
- type: hellaswag
- split: validation
- args:
- num_few_shot: 10
- metrics:
- - type: acc_norm
- value: 79.2
- name: normalized accuracy
- source:
- url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
- name: Open LLM Leaderboard
- - task:
- type: text-generation
- name: Text Generation
- dataset:
- name: MMLU (5-Shot)
- type: cais/mmlu
- config: all
- split: test
- args:
- num_few_shot: 5
- metrics:
- - type: acc
- value: 68.33
- name: accuracy
- source:
- url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
- name: Open LLM Leaderboard
- - task:
- type: text-generation
- name: Text Generation
- dataset:
- name: TruthfulQA (0-shot)
- type: truthful_qa
- config: multiple_choice
- split: validation
- args:
- num_few_shot: 0
- metrics:
- - type: mc2
- value: 53.29
- source:
- url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
- name: Open LLM Leaderboard
- - task:
- type: text-generation
- name: Text Generation
- dataset:
- name: Winogrande (5-shot)
- type: winogrande
- config: winogrande_xl
- split: validation
- args:
- num_few_shot: 5
- metrics:
- - type: acc
- value: 75.37
- name: accuracy
- source:
- url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
- name: Open LLM Leaderboard
- - task:
- type: text-generation
- name: Text Generation
- dataset:
- name: GSM8k (5-shot)
- type: gsm8k
- config: main
- split: test
- args:
- num_few_shot: 5
- metrics:
- - type: acc
- value: 70.58
- name: accuracy
- source:
- url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard?query=MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
- name: Open LLM Leaderboard
  ---

  <img src="./llama-3-merges.webp" alt="Llama-3 DPO Logo" width="500" style="margin-left:'auto' margin-right:'auto' display:'block'"/>


- # Llama-3-8B-Instruct-DPO-v0.3 (32k)
-
- This model is a fine-tune (DPO) of `meta-llama/Meta-Llama-3-8B-Instruct` model. I have used `rope_theta` to extend the context length up to 32K safely.
-
- # Quantized GGUF
-
- All GGUF models come with context length of `32000`: [Llama-3-8B-Instruct-DPO-v0.3-32k-GGUF](https://huggingface.co/MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3-32k-GGUF)
-
- # Prompt Template
-
- This model uses `ChatML` prompt template:
-
- ```
- <|im_start|>system
- {System}
- <|im_end|>
- <|im_start|>user
- {User}
- <|im_end|>
- <|im_start|>assistant
- {Assistant}
- ````
-
- # How to use
-
- You can use this model by using `MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3` as the model name in Hugging Face's
- transformers library.
-
- ```python
- from transformers import AutoModelForCausalLM, AutoTokenizer, TextStreamer
- from transformers import pipeline
- import torch
-
- model_id = "MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3"
-
- model = AutoModelForCausalLM.from_pretrained(
- model_id,
- torch_dtype=torch.bfloat16,
- device_map="auto",
- trust_remote_code=True,
- # attn_implementation="flash_attention_2"
- )
-
- tokenizer = AutoTokenizer.from_pretrained(
- model_id,
- trust_remote_code=True
- )
-
- streamer = TextStreamer(tokenizer)
-
- pipeline = pipeline(
- "text-generation",
- model=model,
- tokenizer=tokenizer,
- model_kwargs={"torch_dtype": torch.bfloat16},
- streamer=streamer
- )
-
- # Then you can use the pipeline to generate text.
-
- messages = [
- {"role": "system", "content": "You are a pirate chatbot who always responds in pirate speak!"},
- {"role": "user", "content": "Who are you?"},
- ]
-
- prompt = tokenizer.apply_chat_template(
- messages,
- tokenize=False,
- add_generation_prompt=True
- )
-
- terminators = [
- tokenizer.eos_token_id,
- tokenizer.convert_tokens_to_ids("<|im_end|>")
- ]
-
- outputs = pipeline(
- prompt,
- max_new_tokens=8192,
- eos_token_id=terminators,
- do_sample=True,
- temperature=0.6,
- top_p=0.95,
- )
- print(outputs[0]["generated_text"][len(prompt):])
- ```
-
-
-
- # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
- Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_MaziyarPanahi__Llama-3-8B-Instruct-DPO-v0.3)
-
- | Metric |Value|
- |---------------------------------|----:|
- |Avg. |68.23|
- |AI2 Reasoning Challenge (25-Shot)|62.63|
- |HellaSwag (10-Shot) |79.20|
- |MMLU (5-Shot) |68.33|
- |TruthfulQA (0-shot) |53.29|
- |Winogrande (5-shot) |75.37|
- |GSM8k (5-shot) |70.58|
 
 
  - llama
  - llama-3
  base_model: meta-llama/Meta-Llama-3-8B-Instruct
  model_name: Llama-3-8B-Instruct-DPO-v0.3
  pipeline_tag: text-generation
  license_name: llama3
  license_link: LICENSE
+ inference: true
+
  ---

  <img src="./llama-3-merges.webp" alt="Llama-3 DPO Logo" width="500" style="margin-left:'auto' margin-right:'auto' display:'block'"/>


+ #StudyBuddy!

+ This model is a fine-tune (DPO) of `meta-llama/Meta-Llama-3-8B-Instruct` model.