marattt committed
Commit 0c22e64
1 Parent(s): 24882be

Update README.md

Files changed (1)
  1. README.md +88 -1
README.md CHANGED
@@ -28,4 +28,91 @@ library_name: transformers
  - **Architecture**: Transformer with RoPE, SwiGLU, RMSNorm, and Attention QKV bias
  - **Layers**: 36
  - **Attention Heads (GQA)**: 24 for Q, 4 for KV
- - **Context Length**: Supports a full context of 131,072 tokens and generation of up to 8,192 tokens
+ - **Context Length**: Supports a full context of 131,072 tokens and generation of up to 8,192 tokens
+
+ ### Requirements
+ The code of Qwen2.5 is included in the latest Hugging Face transformers, and we advise you to use the latest versions of transformers, torch, and autoawq:
+ ```
+ pip install autoawq -q
+ pip install --upgrade torch -q
+ pip install --upgrade transformers -q
+ ```
+
+ With transformers<4.37.0, you will encounter the following error:
+ ```
+ KeyError: 'qwen2'
+ ```
+ With torch<2.4.0, you will encounter the following error:
+ ```
+ AttributeError: module 'torch.library' has no attribute 'register_fake'
+ ```
+ Also check out our [AWQ documentation](https://qwen.readthedocs.io/en/latest/quantization/awq.html) for a more detailed usage guide.
+
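+ If you are unsure which versions are installed, a quick check such as the following (a minimal sketch; the thresholds are the ones quoted above) fails fast before you load the model:
+ ```
+ # Minimal sketch: verify the minimum versions mentioned above before loading the model.
+ from packaging import version
+ import torch, transformers
+
+ assert version.parse(transformers.__version__) >= version.parse("4.37.0"), "please upgrade transformers"
+ assert version.parse(torch.__version__) >= version.parse("2.4.0"), "please upgrade torch"
+ ```
+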
+ ### Quickstart
+ We use a special RuQwen2ForCausalLM class, a subclass of Qwen2ForCausalLM that adds a bias term to lm_head, to work with this model:
+ ```
+ from transformers import Qwen2ForCausalLM, AutoConfig, AutoTokenizer
+ import torch
+
+ class RuQwen2ForCausalLM(Qwen2ForCausalLM):
+     def __init__(self, config):
+         super().__init__(config)
+
+         if hasattr(self, "lm_head") and isinstance(self.lm_head, torch.nn.Linear):
+             if self.lm_head.bias is None:
+                 self.config.add_bias_to_lm_head = True
+                 self._add_bias_to_lm_head()
+
+     def _add_bias_to_lm_head(self):
+         """Adds a bias to lm_head if it does not already have one."""
+         old_lm_head = self.lm_head
+         # Recreate lm_head as an identical Linear layer that has a bias
+         self.lm_head = torch.nn.Linear(
+             old_lm_head.in_features,
+             old_lm_head.out_features,
+             dtype=self.model.dtype,
+             bias=True,
+         )
+         with torch.no_grad():
+             self.lm_head.weight = old_lm_head.weight
+             torch.nn.init.zeros_(self.lm_head.bias)
+
+     @classmethod
+     def from_pretrained(cls, model_name, *args, **kwargs):
+         # Load the model together with its configuration
+         model = super().from_pretrained(model_name, *args, **kwargs)
+
+         if hasattr(model.config, "add_bias_to_lm_head") and not model.config.add_bias_to_lm_head:
+             model._add_bias_to_lm_head()
+
+         return model
+
+     def save_pretrained(self, save_directory, *args, **kwargs):
+         # Record whether lm_head has a bias so it can be restored on load
+         self.config.add_bias_to_lm_head = self.lm_head.bias is not None
+         super().save_pretrained(save_directory, *args, **kwargs)
+ ```
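+ Note that the added bias is zero-initialized, so wrapping an existing checkpoint does not change its logits; the toy sketch below (illustrative only, using small dimensions rather than the real model) shows why:
+ ```
+ # Toy sketch (illustrative only): a zero-initialized bias leaves the head's outputs unchanged.
+ import torch
+ head = torch.nn.Linear(8, 16, bias=False)
+ head_with_bias = torch.nn.Linear(8, 16, bias=True)
+ with torch.no_grad():
+     head_with_bias.weight.copy_(head.weight)
+     torch.nn.init.zeros_(head_with_bias.bias)
+ x = torch.randn(2, 8)
+ assert torch.allclose(head(x), head_with_bias(x))
+ ```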
+ The following code snippet shows how to load the tokenizer and model and how to generate content with apply_chat_template:
+ ```
+ def generate(messages):
+     input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
+     output = model.generate(input_ids,
+                             max_new_tokens=1024,
+                             do_sample=False,
+                             temperature=None,
+                             top_k=None,
+                             top_p=None)
+     generated_text = tokenizer.decode(output[0], skip_special_tokens=False)  # .split('<|im_start|>assistant')[1]
+     return generated_text
+
+ model_name = 'FractalGPT/RuQwen2.5-3B-Instruct-AWQ'
+ model = RuQwen2ForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
+ tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+ prompt = "Классификация медицинских терминов"  # "Classification of medical terms"
+ messages = [
+     {"role": "system", "content": "You are RuQwen, created by FractalGPT. You are a helpful assistant."},
+     {"role": "user", "content": prompt}
+ ]
+
+ print(generate(messages))
+ ```
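+ The decode call above keeps special tokens; if you only want the assistant's reply, one option, in the spirit of the commented-out split in generate() (and assuming the standard Qwen2.5 <|im_start|>/<|im_end|> chat markers), is:
+ ```
+ # Sketch: keep only the assistant's reply, following the commented-out split above.
+ full_text = generate(messages)
+ reply = full_text.split('<|im_start|>assistant')[-1].split('<|im_end|>')[0].strip()
+ print(reply)
+ ```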