---
license: other
language:
- en
library_name: transformers
tags:
- RLHF
- Nexusflow
- Athene
- Chat Model
---
# Athene-V2-Chat-72B

We introduce Athene-V2-Chat-72B, an open-weights LLM that rivals GPT-4o across benchmarks. It is trained with RLHF on top of Qwen-2.5-72B.
Athene-V2-Chat-72B excels at chat, math, and coding. Its sister model, [Athene-V2-Agent-72B](https://huggingface.co/Nexusflow/Athene-V2-Agent), surpasses GPT-4o in complex function calling and agent applications.

- **Developed by:** The Nexusflow Team
- **Model type:** Chat Model
- **Finetuned from model:** [Qwen 2.5 72B](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
- **License:** [Nexusflow Research License](https://huggingface.co/Nexusflow/Athene-V2-Chat/blob/main/Nexusflow_Research_License.pdf)
- **Blog:** https://nexusflow.ai/blogs/athene-V2

## Usage
Athene-V2-Chat uses the same chat template as Qwen 2.5 72B. Below is a simple usage example with the Transformers library.
```python
import transformers
import torch

model_id = "Nexusflow/Athene-V2-Chat"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are Athene Noctura; you can only speak with owl sounds. Whoooo whooo."},
    {"role": "user", "content": "Whooo are you?"},
]

# Qwen 2.5's chat template closes each turn with <|im_end|>, so stop on it
# in addition to the tokenizer's EOS token.
terminators = [
    pipeline.tokenizer.eos_token_id,
    pipeline.tokenizer.convert_tokens_to_ids("<|im_end|>"),
]

outputs = pipeline(
    messages,
    max_new_tokens=256,
    eos_token_id=terminators,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
)
print(outputs[0]["generated_text"][-1])
```
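
Since the model shares Qwen 2.5's chat template, the prompt string the pipeline builds from `messages` can be sketched by hand. This is a minimal illustration assuming the standard Qwen `<|im_start|>`/`<|im_end|>` turn format; `tokenizer.apply_chat_template` remains the authoritative source, and `format_qwen_chat` below is a hypothetical helper, not part of any library.

```python
def format_qwen_chat(messages, add_generation_prompt=True):
    """Sketch of the Qwen-style chat format: each turn is wrapped in
    <|im_start|>{role}\n ... <|im_end|>, and a generation prompt opens
    an assistant turn for the model to complete."""
    parts = []
    for m in messages:
        parts.append(f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n")
    if add_generation_prompt:
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)

prompt = format_qwen_chat([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Whooo are you?"},
])
print(prompt)
```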

We found that adding a system prompt that instructs the model to think step by step further improves its performance on math and on problems such as counting the `r`s in "strawberry".
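
Such a step-by-step system prompt might look like the sketch below. The exact wording is illustrative (an assumption, not an official recommendation from this card); the message list is passed to the pipeline exactly like `messages` in the usage example.

```python
# Illustrative only: the system prompt wording is an assumption,
# not an official recommendation.
cot_messages = [
    {
        "role": "system",
        "content": "Reason through the problem step by step before stating your final answer.",
    },
    {
        "role": "user",
        "content": "How many times does the letter 'r' appear in 'strawberry'?",
    },
]

# For reference, the answer the model should reach:
print("strawberry".count("r"))  # → 3
```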

## Acknowledgment
We would like to thank the [LMSYS Organization](https://lmsys.org/) for their support in testing the model. We would also like to thank Meta AI and the open-source community for their efforts in providing the datasets and base models.