Update README.md
If you want to run it on your own local computer, you will need approximately 8…

必要なライブラリのインストール
Installation of required libraries
```
# First, install PyTorch. See the official documentation:
# https://pytorch.org/get-started/locally/#start-locally

# Example for Linux users:
# pip3 install torch torchvision torchaudio

# Example for Windows users (CUDA 12.1):
# pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121

pip install transformers==4.38.2
pip install peft==0.9.0
pip install bitsandbytes==0.42.0
```
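A quick way to confirm that the pinned libraries resolved correctly (a minimal check, assuming a CUDA build of PyTorch):
```
# sanity_check.py -- verify installed versions and GPU visibility
import torch, transformers, peft, bitsandbytes

print("torch:", torch.__version__)
print("transformers:", transformers.__version__)  # expect 4.38.2
print("peft:", peft.__version__)                  # expect 0.9.0
print("bitsandbytes:", bitsandbytes.__version__)  # expect 0.42.0
print("CUDA available:", torch.cuda.is_available())
```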

サンプルスクリプト
Sample script (excerpt)
```
def trans(my_str):
    # ... (model/tokenizer setup and prompt tokenization are omitted in this excerpt)

    # Translation: num_beams=1 keeps plain greedy decoding, and
    # prompt_lookup_num_tokens=10 turns on prompt-lookup (n-gram) assisted decoding
    generated_ids = model.generate(input_ids=input_ids,
                                   num_beams=1, max_new_tokens=800,
                                   use_cache=True,
                                   prompt_lookup_num_tokens=10)
    full_outputs = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)
    # The decoded text echoes the prompt; keep only what follows "### Answer:"
    return full_outputs[0].split("### Answer:\n")[-1].strip()

ret = trans("""
### Instructions:
Translate Japanese to English.
...
```
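For reference, a complete call presumably looks like the following. The "### Input:" block and the empty trailing "### Answer:" line are assumptions inferred from the split on "### Answer:\n" above, not confirmed by this diff:
```
# Hypothetical end-to-end usage of the trans() helper defined above
ret = trans("""
### Instructions:
Translate Japanese to English.

### Input:
今日は良い天気ですね。

### Answer:
""")
print(ret)
```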
There are two types of instructions: "Translate Japanese to English." and "Translate English to Japanese."

実験的な試みとして、インフォーマルな場面を想定した翻訳を行う際にsubculture文脈指定ができるようになっています。
As an experiment, a subculture context can be specified when translating for informal situations.

e.g.:
Translate English to Japanese within the context of subculture.

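Presumably only the instruction line changes for the subculture variant; a hypothetical sketch reusing the trans() helper and the assumed prompt layout from above:
```
ret = trans("""
### Instructions:
Translate English to Japanese within the context of subculture.

### Input:
This anime is so bad it's good.

### Answer:
""")
print(ret)
```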
## 留意事項 Attention

このアダプターをモデルとマージして保存すると性能が下がってしまう不具合が存在するため、**ベースモデル(gemma-7b-bnb-4bit)とアダプターをマージして保存しないでください**
**Do not save this adapter merged with the base model (gemma-7b-bnb-4bit)**: there is a known bug that degrades performance when the merged model is saved.

どうしてもマージしたい場合は必ずPerplexityではなく、翻訳ベンチマークで性能を確認してから使うようにしてください
If you must merge, be sure to check the performance with a translation benchmark, not perplexity, before using the result!

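The safe pattern, then, is to load the 4-bit base model and attach the adapter at runtime rather than merging and saving. A minimal loading sketch, assuming the pinned library versions above and the google/gemma-7b base listed in the Acknowledgment (the quantization settings are illustrative):
```
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "google/gemma-7b"
adapter_id = "webbigdata/C3TR-Adapter"

# Quantize the base model to 4-bit at load time (QLoRA-style inference)
bnb_config = BitsAndBytesConfig(load_in_4bit=True,
                                bnb_4bit_compute_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id,
                                             quantization_config=bnb_config,
                                             device_map="auto")  # needs accelerate

# Attach the adapter in memory; do NOT call merge_and_unload() and save the result
model = PeftModel.from_pretrained(model, adapter_id)
```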
### 利用規約 Terms of Use

基本的にはgemmaと同じライセンスです
Basically the same license as gemma.

そのため、使用した後は[Googleフォームに感想や今後期待する方向性、気が付いた誤訳の例などを記入](https://forms.gle/Ycr9nWumvGamiNma9)してください。
So, after you use it, please [fill out the Google form with your impressions, future directions you would like us to take, and examples of mistranslations you have noticed](https://forms.gle/Ycr9nWumvGamiNma9).

個人情報やメールアドレスは収集しないので、気軽にご記入をお願いします
We do not collect personal information or email addresses, so please feel free to fill out the form!

どんなご意見でも感謝します!
Any feedback would be appreciated!

### 謝辞 Acknowledgment

Original Base Model
google/gemma-7b
https://huggingface.co/google/gemma-7b

QLoRA Adapter
webbigdata/C3TR-Adapter
https://huggingface.co/webbigdata/C3TR-Adapter

This adapter was trained with Unsloth.
https://github.com/unslothai/unsloth