mlabonne committed on
Commit df5eefc
1 Parent(s): ef797d7

Update README.md

Files changed (1)
  1. README.md +41 -12
README.md CHANGED
@@ -1,27 +1,31 @@
  ---
- base_model:
- - Qwen/Qwen2.5-72B-Instruct
+ license: other
+ license_name: tongyi-qianwen
+ license_link: https://huggingface.co/Qwen/Qwen2-72B-Instruct/blob/main/LICENSE
+ language:
+ - en
+ pipeline_tag: text-generation
  library_name: transformers
  tags:
  - mergekit
  - merge
-
+ - lazymergekit
+ base_model:
+ - Qwen/Qwen2.5-72B-Instruct
  ---
- # merge
+ # BigQwen2.5-120B-Instruct

- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+ BigQwen2.5-120B-Instruct is a [Qwen/Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct) self-merge made with [MergeKit](https://github.com/arcee-ai/mergekit/tree/main).

- ## Merge Details
- ### Merge Method
+ It applies the [mlabonne/Meta-Llama-3-120B-Instruct](https://huggingface.co/mlabonne/Meta-Llama-3-120B-Instruct/) recipe.

- This model was merged using the passthrough merge method.
+ I made it due to popular demand, but I haven't tested it, so use it at your own risk. ¯\\\_(ツ)_/¯

- ### Models Merged
+ ## 🔍 Applications

- The following models were included in the merge:
- * [Qwen/Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
+ It might be good for creative writing tasks. I recommend a context length of 32k, but you can go up to 131,072 tokens in theory.

- ### Configuration
+ ## 🧩 Configuration

  The following YAML configuration was used to produce this model:

@@ -52,3 +56,28 @@ merge_method: passthrough
  dtype: bfloat16

  ```
+
+ ## 💻 Usage
+
+ ```python
+ # Install dependencies first: pip install -qU transformers accelerate
+
+ from transformers import AutoTokenizer
+ import transformers
+ import torch
+
+ model = "mlabonne/BigQwen2.5-120B-Instruct"
+ messages = [{"role": "user", "content": "What is a large language model?"}]
+
+ tokenizer = AutoTokenizer.from_pretrained(model)
+ prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
+ pipeline = transformers.pipeline(
+     "text-generation",
+     model=model,
+     torch_dtype=torch.float16,
+     device_map="auto",
+ )
+
+ outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
+ print(outputs[0]["generated_text"])
+ ```
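
The body of the 🧩 Configuration block is elided from this diff: only `merge_method: passthrough` (visible in the second hunk header) and `dtype: bfloat16` survive as context lines. For orientation, a passthrough self-merge in the style of the Meta-Llama-3-120B-Instruct recipe stacks overlapping slices of the base model's layers. The sketch below is hypothetical: the slice boundaries are illustrative assumptions, not the actual configuration of BigQwen2.5-120B-Instruct.

```yaml
# Hypothetical sketch only; the real slice boundaries are not shown in this diff.
# Assumes the 80-layer Qwen2.5-72B-Instruct base, duplicated in overlapping
# 20-layer windows with a stride of 10, as in the Meta-Llama-3-120B recipe.
slices:
- sources:
  - model: Qwen/Qwen2.5-72B-Instruct
    layer_range: [0, 20]
- sources:
  - model: Qwen/Qwen2.5-72B-Instruct
    layer_range: [10, 30]
- sources:
  - model: Qwen/Qwen2.5-72B-Instruct
    layer_range: [20, 40]
- sources:
  - model: Qwen/Qwen2.5-72B-Instruct
    layer_range: [30, 50]
- sources:
  - model: Qwen/Qwen2.5-72B-Instruct
    layer_range: [40, 60]
- sources:
  - model: Qwen/Qwen2.5-72B-Instruct
    layer_range: [50, 70]
- sources:
  - model: Qwen/Qwen2.5-72B-Instruct
    layer_range: [60, 80]
merge_method: passthrough
dtype: bfloat16
```

A config like this is run with MergeKit's `mergekit-yaml` command (e.g. `mergekit-yaml config.yaml ./BigQwen2.5-120B-Instruct`). Because passthrough copies each listed slice verbatim rather than averaging weights, the merged model ends up with more layers, and therefore more parameters, than the 72B base.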