dahara1 committed
Commit 67d20c5
1 Parent(s): a462821

Create README.md

---
tags:
- npu
- amd
- llama2
- RyzenAI
---

This model is a fine-tuned version of [webbigdata/ALMA-7B-Ja-V2](https://huggingface.co/webbigdata/ALMA-7B-Ja-V2), AWQ-quantized and converted to run on an [NPU-equipped Ryzen AI PC](https://github.com/amd/RyzenAI-SW/issues/18), for example one with a Ryzen 9 7940HS processor.

To set up Ryzen AI for LLMs on Windows 11, see [Running LLM on AMD NPU Hardware](https://www.hackster.io/gharada2013/running-llm-on-amd-npu-hardware-19322f).

The following sample assumes that the setup on the above page has been completed.

This model has only been tested with Ryzen AI on Windows 11. It does not work in Linux environments such as WSL.

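If you want the test script to fail fast outside that environment, a minimal guard (a sketch, not part of the original sample) can be placed at the top of the script:

```
import platform

# The model has only been tested with Ryzen AI on Windows 11; stop early elsewhere.
if platform.system() != "Windows":
    raise SystemExit("This sample requires Windows with AMD Ryzen AI; Linux/WSL is not supported.")
```
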
2024/07/30
- [Ryzen AI Software 1.2](https://ryzenai.docs.amd.com/en/latest/) has been released. Please note that this model is based on [Ryzen AI Software 1.1](https://ryzenai.docs.amd.com/en/1.1/index.html) and has not been verified to work with 1.2.
- [amd/RyzenAI-SW 1.2](https://github.com/amd/RyzenAI-SW) was announced on July 29, 2024. This sample is for [amd/RyzenAI-SW 1.1](https://github.com/amd/RyzenAI-SW/tree/1.1). Please note that the folder layout and script contents changed completely between the two versions.

### Setup
In a cmd window:
```
conda activate ryzenai-transformers
<your_install_path>\RyzenAI-SW\example\transformers\setup.bat

pip install transformers==4.34.0
# Note: downgrading the Transformers library to 4.34.0 breaks the Llama 3 sample.
# If you want to run Llama 3 or later, install transformers==4.43.3 instead.
pip install -U "huggingface_hub[cli]"

huggingface-cli download dahara1/ALMA-Ja-V3-amd-npu --revision main --local-dir ALMA-Ja-V3-amd-npu

copy <your_ryzen_ai-sw_install_path>\RyzenAI-SW\example\transformers\models\llama2\modeling_llama_amd.py .

# Set up the runtime. See https://ryzenai.docs.amd.com/en/latest/runtime_setup.html
set XLNX_VART_FIRMWARE=<your_firmware_install_path>\voe-4.0-win_amd64\1x4.xclbin
set NUM_OF_DPU_RUNNERS=1

# Save the sample script below as UTF-8 under the name ALMA-Ja-V3-amd-npu-test.py, then run it:
python ALMA-Ja-V3-amd-npu-test.py
```
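
Before running the sample, it can help to verify that the runtime variables and files from the steps above are in place. The sketch below is optional and not part of the original instructions; the variable and file names are taken from the setup commands, and everything else is standard library:

```
import os
from pathlib import Path

import transformers

# Environment variables set in the setup steps above.
for var in ("XLNX_VART_FIRMWARE", "NUM_OF_DPU_RUNNERS"):
    print(f"{var} = {os.environ.get(var, 'NOT SET')}")

# Check that XLNX_VART_FIRMWARE points at an existing .xclbin file.
firmware = os.environ.get("XLNX_VART_FIRMWARE", "")
print("firmware file exists:", bool(firmware) and Path(firmware).is_file())

# Files expected in the working directory after the download and copy steps.
for name in ("ALMA-Ja-V3-amd-npu", "modeling_llama_amd.py"):
    print(f"{name}: {'found' if Path(name).exists() else 'missing'}")

# The Transformers version pinned above.
print("transformers", transformers.__version__)  # expected: 4.34.0
```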

### Sample Script

```
import torch
import psutil
from transformers import AutoTokenizer, set_seed
import qlinear


def translation(instruction, input_text):
    system = """You are a highly skilled professional Japanese-English and English-Japanese translator. Translate the given text accurately, taking into account the context and specific instructions provided. Steps may include hints enclosed in square brackets [] with the key and value separated by a colon:. Only when the subject is specified in the Japanese sentence, the subject will be added when translating into English. If no additional instructions or context are provided, use your expertise to consider what the most appropriate context is and provide a natural translation that aligns with that context. When translating, strive to faithfully reflect the meaning and tone of the original text, pay attention to cultural nuances and differences in language usage, and ensure that the translation is grammatically correct and easy to read. After completing the translation, review it once more to check for errors or unnatural expressions. For technical terms and proper nouns, either leave them in the original language or use appropriate translations as necessary. Take a deep breath, calm down, and start translating."""
    prompt = f"""{system}

### Instruction:
{instruction}

### Input:
{input_text}

### Response:
"""

    tokenized_input = tokenizer(prompt, return_tensors="pt",
                                padding=True, max_length=1600, truncation=True)

    terminators = [
        tokenizer.eos_token_id,
    ]

    outputs = model.generate(tokenized_input['input_ids'],
                             max_new_tokens=600,
                             eos_token_id=terminators,
                             attention_mask=tokenized_input['attention_mask'],
                             do_sample=True,
                             temperature=0.3,
                             top_p=0.5)
    # Decode only the newly generated tokens, not the prompt.
    response = outputs[0][tokenized_input['input_ids'].shape[-1]:]
    return tokenizer.decode(response, skip_special_tokens=True)


if __name__ == "__main__":
    set_seed(123)
    # Pin the process to four CPU cores; the quantized layers run on the NPU (AIE).
    p = psutil.Process()
    p.cpu_affinity([0, 1, 2, 3])
    torch.set_num_threads(4)

    tokenizer = AutoTokenizer.from_pretrained("ALMA-Ja-V3-amd-npu")
    ckpt = "alma_w_bit_4_awq_fa_amd.pt"

    model = torch.load(ckpt)
    model.eval()
    model = model.to(torch.bfloat16)

    # Move each AWQ-quantized linear layer to the NPU and prepare its weights.
    for n, m in model.named_modules():
        if isinstance(m, qlinear.QLinearPerGrp):
            print(f"Preparing weights of layer : {n}")
            m.device = "aie"
            m.quantize_weights()

    print(translation("Translate Japanese to English.",
                      "面白きこともなき世を面白く住みなすものは心なりけり"))
    print(translation("Translate English to Japanese.",
                      "If today were the last day of your life, would you want to do what you are about to do today."))
```
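
The two `print(translation(...))` calls at the end are a simple smoke test. As a usage example, they could be replaced with a small interactive loop (a hypothetical extension, not part of the original script):

```
# Hypothetical replacement for the two smoke-test calls above:
# read an instruction and a text from standard input until an empty line.
while True:
    instruction = input("Instruction (empty line to quit): ")
    if not instruction:
        break
    text = input("Text: ")
    print(translation(instruction, text))
```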

![chat_image](alma-v3.png)

## Acknowledgements
- [amd/RyzenAI-SW](https://github.com/amd/RyzenAI-SW)
  Sample code and drivers.
- [mit-han-lab/llm-awq](https://github.com/mit-han-lab/llm-awq)
  Thanks for the AWQ quantization method.
- [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct)
  [Built with Meta Llama 3](https://llama.meta.com/llama3/license/)