adalbertojunior committed • Commit b7d5dc0 • Parent(s): 5e1147e
Update README.md

README.md

This model draws inspiration from [SOLAR](https://huggingface.co/upstage/SOLAR-10.7B-Instruct-v1.0), but introduces a novel approach to increasing the model's depth without the traditional method of duplicating layers.

By rearranging the order of layers during inference, it maintains the advantages of depth upscaling while preserving the original parameter count.
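
The card does not spell out how the layers are re-ordered, so the following is only a minimal sketch of the general idea: point an existing decoder stack at its own layers in a longer, partially repeated order, so the forward pass gets deeper while the parameter count stays unchanged. The tiny random config and the `schedule` below are illustrative assumptions, not the actual DUSMistral recipe.

```python
# Minimal sketch of inference-time layer re-ordering (depth up-scaling without
# duplicating weights). The config and schedule are illustrative assumptions,
# not the DUSMistral implementation.
import torch
from transformers import MistralConfig, MistralForCausalLM

# A small randomly initialised Mistral so the sketch runs quickly on CPU.
cfg = MistralConfig(
    hidden_size=64, intermediate_size=128, num_hidden_layers=8,
    num_attention_heads=4, num_key_value_heads=2, vocab_size=1000,
)
model = MistralForCausalLM(cfg)
layers = model.model.layers  # the 8 unique decoder layers

# Hypothetical schedule: visit layers 0-5, then revisit 2-7. That gives 12
# layer applications from 8 sets of weights, in the spirit of SOLAR's depth
# up-scaling but without copying any parameters.
schedule = [0, 1, 2, 3, 4, 5, 2, 3, 4, 5, 6, 7]
model.model.layers = torch.nn.ModuleList(layers[i] for i in schedule)
model.config.num_hidden_layers = len(schedule)  # keep the forward loop in sync

# Run the re-ordered stack. use_cache=False because a faithful implementation
# would also have to remap each repeated layer's KV-cache slot, which this
# sketch ignores.
input_ids = torch.randint(0, cfg.vocab_size, (1, 16))
outputs = model(input_ids, use_cache=False)
print(f"{len(layers)} unique layers, {len(model.model.layers)} applied per forward pass")
```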

Furthermore, it undergoes additional fine-tuning using the Dolphin dataset. The foundational architecture for this experiment is based on [Dolphin](https://huggingface.co/cognitivecomputations/dolphin-2.6-mistral-7b-dpo-laser).

**Use**

```python
# pip install transformers
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "adalbertojunior/DUSMistral"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# Format the message with the ChatML chat template
messages = [{"role": "user", "content": "Hello, how are you?"}]
input_ids = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt")

gen_tokens = model.generate(
    input_ids,
    max_new_tokens=100,
    do_sample=True,
    temperature=0.3,
)

gen_text = tokenizer.decode(gen_tokens[0])
print(gen_text)
```
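
Note that `trust_remote_code=True` is passed when loading; presumably the repository ships custom modeling code (for example, the inference-time layer re-ordering), and that code should be reviewed before enabling the flag.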