- **Architecture**: Transformer with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
- **Layers**: 36
- **Attention Heads (GQA)**: 24 for Q, 4 for KV
- **Context Length**: Supports a full context of 131,072 tokens and generation of up to 8,192 tokens
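These values can also be read directly from the model config; a minimal sketch (purely optional, nothing here is needed to run the model):

```
from transformers import AutoConfig

config = AutoConfig.from_pretrained('FractalGPT/RuQwen2.5-3B-Instruct-AWQ')
print(config.num_hidden_layers)        # number of transformer layers
print(config.num_attention_heads)      # query heads
print(config.num_key_value_heads)      # KV heads (GQA)
print(config.max_position_embeddings)  # maximum context length
```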

### Requirements

The code for Qwen2.5 is included in the latest Hugging Face `transformers`, and we advise you to use up-to-date versions of `transformers`, `torch`, and `autoawq`:

```
pip install autoawq -q
pip install --upgrade torch -q
pip install --upgrade transformers -q
```

With `transformers<4.37.0`, you will encounter the following error:

```
KeyError: 'qwen2'
```

With `pytorch<2.4.0`, you will encounter the following error:

```
AttributeError: module 'torch.library' has no attribute 'register_fake'
```

Also check out our [AWQ documentation](https://qwen.readthedocs.io/en/latest/quantization/awq.html) for further usage guidance.
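If you hit either error, it may help to confirm the installed versions first:

```
import torch
import transformers

print(transformers.__version__)  # should be >= 4.37.0
print(torch.__version__)         # should be >= 2.4.0
```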

### Quickstart

We use a custom `RuQwen2ForCausalLM` class (a thin wrapper around `Qwen2ForCausalLM` that adds a bias term to `lm_head`) to work with this model:
```
from transformers import Qwen2ForCausalLM, AutoConfig, AutoTokenizer
import torch


class RuQwen2ForCausalLM(Qwen2ForCausalLM):
    def __init__(self, config):
        super().__init__(config)

        if hasattr(self, "lm_head") and isinstance(self.lm_head, torch.nn.Linear):
            if self.lm_head.bias is None:
                self.config.add_bias_to_lm_head = True
                self._add_bias_to_lm_head()

    def _add_bias_to_lm_head(self):
        """Adds a bias term to lm_head if it does not already have one."""
        old_lm_head = self.lm_head
        # Rebuild lm_head with bias enabled, keeping the original weights.
        self.lm_head = torch.nn.Linear(
            old_lm_head.in_features,
            old_lm_head.out_features,
            dtype=self.model.dtype,
            bias=True,
        )
        with torch.no_grad():
            self.lm_head.weight = old_lm_head.weight
            torch.nn.init.zeros_(self.lm_head.bias)

    @classmethod
    def from_pretrained(cls, model_name, *args, **kwargs):
        # Load the model together with its configuration.
        model = super().from_pretrained(model_name, *args, **kwargs)

        if hasattr(model.config, "add_bias_to_lm_head") and not model.config.add_bias_to_lm_head:
            model._add_bias_to_lm_head()

        return model

    def save_pretrained(self, save_directory, *args, **kwargs):
        # Record whether lm_head has a bias so it can be restored on load.
        self.config.add_bias_to_lm_head = self.lm_head.bias is not None
        super().save_pretrained(save_directory, *args, **kwargs)
```
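
Loading through this class guarantees that `lm_head` carries a bias term: `__init__` adds one whenever the underlying `Qwen2ForCausalLM` head has none and records the fact in the config. A quick sanity check (using the same model id as the quickstart snippet below):

```
model_name = 'FractalGPT/RuQwen2.5-3B-Instruct-AWQ'
model = RuQwen2ForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
print(model.lm_head.bias is not None)  # True: the wrapper ensures lm_head has a bias
```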

Below is a code snippet using `apply_chat_template` that shows how to load the tokenizer and the model and how to generate content:

```
def generate(messages):
    input_ids = tokenizer.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids,
                            max_new_tokens=1024,
                            do_sample=False,
                            temperature=None,
                            top_k=None,
                            top_p=None)
    # Decode the full sequence, including the chat-template special tokens.
    generated_text = tokenizer.decode(output[0], skip_special_tokens=False)  # .split('<|im_start|>assistant')[1]
    return generated_text


model_name = 'FractalGPT/RuQwen2.5-3B-Instruct-AWQ'
model = RuQwen2ForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "Классификация медицинских терминов"  # "Classification of medical terms"
messages = [
    {"role": "system", "content": "You are RuQwen, created by FractalGPT. You are a helpful assistant."},
    {"role": "user", "content": prompt}
]

print(generate(messages))
```
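
The commented-out `.split('<|im_start|>assistant')` hints at extracting only the assistant reply. An alternative sketch (not part of the original snippet) is to decode just the newly generated tokens:

```
def generate_reply(messages):
    input_ids = tokenizer.apply_chat_template(
        messages, tokenize=True, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=1024, do_sample=False)
    # Keep only the tokens produced after the prompt.
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

print(generate_reply(messages))
```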