---
inference: false
language:
- ja
- en
- de
- is
- zh
- cs
---
# webbigdata/ALMA-7B-Ja-V2

ALMA-7B-Ja-V2は日本語から英語、英語から日本語の翻訳が可能な機械翻訳モデルです。
ALMA-7B-Ja-V2 is a machine translation model capable of translating from Japanese to English and from English to Japanese.

ALMA-7B-Ja-V2は以前のモデル(ALMA-7B-Ja)に更に学習を追加し、性能を向上しています。
ALMA-7B-Ja-V2 adds further training on top of the previous model (ALMA-7B-Ja) and improves its performance.

日本語と英語間に加えて、このモデルは以下の言語間の翻訳能力も持っています。
In addition to Japanese and English, the model can also translate between the following language pairs:

- ドイツ語 German(de) and 英語 English(en)
- 中国語 Chinese(zh) and 英語 English(en)
- アイスランド語 Icelandic(is) and 英語 English(en)
- チェコ語 Czech(cs) and 英語 English(en)

# ベンチマーク結果 Benchmark Results

Meta社の200言語以上の翻訳に対応した超多言語対応機械翻訳モデルNLLB-200シリーズと比較したベンチマーク結果は以下です。
Below are benchmark results compared with Meta's NLLB-200 series, massively multilingual machine translation models supporting over 200 languages.

| Model Name | File Size | E->J chrf++/F2 | E->J COMET | J->E chrf++/F2 | J->E COMET |
|------------------------------|-----------|----------------|------------|----------------|------------|
| NLLB-200-Distilled           | 2.46GB    | 23.6/-         | -          | 50.2/-         | -          |
| NLLB-200-Distilled           | 5.48GB    | 25.4/-         | -          | 54.2/-         | -          |
| NLLB-200                     | 5.48GB    | 24.2/-         | -          | 53.6/-         | -          |
| NLLB-200                     | 17.58GB   | 25.2/-         | -          | 55.1/-         | -          |
| NLLB-200                     | 220.18GB  | 27.9/33.2      | 0.8908     | 55.8/59.8      | 0.8792     |

Our previous model (ALMA-7B-Ja):
| Model Name | File Size | E->J chrf++/F2 | E->J COMET | J->E chrf++/F2 | J->E COMET |
|------------------------------|-----------|----------------|------------|----------------|------------|
| webbigdata-ALMA-7B-Ja-q4_K_S | 3.6GB     | -/24.2         | 0.8210     | -/54.2         | 0.8559     |
| ALMA-7B-Ja-GPTQ-Ja-En        | 3.9GB     | -/30.8         | 0.8743     | -/60.9         | 0.8743     |
| ALMA-Ja (Ours)               | 13.48GB   | -/31.8         | 0.8811     | -/61.6         | 0.8773     |

ALMA-7B-Ja-V2:
| Model Name | File Size | E->J chrf++/F2 | E->J COMET | J->E chrf++/F2 | J->E COMET |
|------------------------------|-----------|----------------|------------|----------------|------------|
| ALMA-7B-Ja-V2-GPTQ-Ja-En     | 3.9GB     | -/33.0         | 0.8818     | -/62.0         | 0.8774     |
| ALMA-Ja-V2 (Ours)            | 13.48GB   | -/33.9         | 0.8820     | -/63.1         | 0.8873     |
| ALMA-Ja-V2-Lora (Ours)       | 13.48GB   | -/33.7         | 0.8843     | -/61.1         | 0.8775     |

様々なジャンルの文章を実際のアプリケーションと比較した結果は以下です。
Below are the results of comparing translations of texts from various genres against existing translation applications.

政府の公式文章 Government Official Announcements
| Model | e->j chrF2++ | e->j BLEU | e->j COMET | j->e chrF2++ | j->e BLEU | j->e COMET |
|--------------------------|------------|---------|----------|------------|---------|----------|
| ALMA-7B-Ja-V2-GPTQ-Ja-En | 25.3 | 15.00 | 0.8848 | 60.3 | 26.82 | 0.6189 |
| ALMA-Ja-V2 | 27.2 | 15.60 | 0.8868 | 58.5 | 29.27 | 0.6155 |
| ALMA-7B-Ja-V2-Lora | 24.5 | 13.58 | 0.8670 | 50.7 | 21.85 | 0.6196 |
| gpt-3.5 | 34.6 | 28.33 | 0.8895 | 74.5 | 49.20 | 0.6382 |
| gpt-4.0 | 36.5 | 28.07 | 0.9255 | 62.5 | 33.63 | 0.6320 |
| google-translate | 43.5 | 35.37 | 0.9181 | 62.7 | 29.22 | 0.6446 |
| deepl | 43.5 | 35.74 | 0.9301 | 60.1 | 27.40 | 0.6389 |

二次創作 Fanfiction
| Model | e->j chrF2++ | e->j BLEU | e->j COMET | j->e chrF2++ | j->e BLEU | j->e COMET |
|--------------------------|------------|---------|----------|------------|---------|----------|
| ALMA-7B-Ja-V2-GPTQ-Ja-En | 27.6 | 18.28 | 0.8643 | 52.1 | 24.58 | 0.6106 |
| ALMA-Ja-V2 | 20.4 | 8.45 | 0.7870 | 48.7 | 23.06 | 0.6050 |
| ALMA-7B-Ja-V2-Lora | 23.9 | 18.55 | 0.8634 | 55.6 | 29.91 | 0.6093 |
| gpt-3.5 | 31.2 | 23.37 | 0.9001 | - | - | 0.5948 |
| gpt-4.0 | 30.7 | 24.31 | 0.8848 | 53.9 | 24.89 | 0.6163 |
| google-translate | 32.4 | 25.36 | 0.8968 | 58.5 | 29.88 | 0.6022 |
| deepl | 33.5 | 28.38 | 0.9094 | 60.0 | 31.14 | 0.6124 |
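
As a rough illustration of how scores like the ones above can be produced, here is a minimal evaluation sketch using `sacrebleu` (BLEU and chrF2++) and Unbabel's COMET library. The exact tools and COMET checkpoint behind the numbers above are not stated in this card; the checkpoint `Unbabel/wmt22-comet-da` and the sample sentences below are assumptions for illustration only.

```python
# Hypothetical evaluation sketch: scoring system outputs against references.
# Assumes `pip install sacrebleu unbabel-comet`; the COMET checkpoint is an assumption.
import sacrebleu
from comet import download_model, load_from_checkpoint

sources    = ["明日は雨が降りそうです。"]                  # source sentences
hypotheses = ["It looks like it will rain tomorrow."]       # system translations
references = ["It looks like rain tomorrow."]               # reference translations

# BLEU and chrF2++ (word_order=2 enables the "++" word n-gram component).
bleu = sacrebleu.corpus_bleu(hypotheses, [references])
chrf = sacrebleu.corpus_chrf(hypotheses, [references], word_order=2)
print(f"BLEU: {bleu.score:.2f}  chrF2++: {chrf.score:.2f}")

# COMET: neural metric that also takes the source sentence into account.
model = load_from_checkpoint(download_model("Unbabel/wmt22-comet-da"))
data = [{"src": s, "mt": h, "ref": r} for s, h, r in zip(sources, hypotheses, references)]
print("COMET:", model.predict(data, batch_size=8, gpus=0).system_score)
```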

[Sample Code For Free Colab](https://github.com/webbigdata-jp/python_sample/blob/main/ALMA_7B_Ja_Free_Colab_sample.ipynb)
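
For reference, below is a minimal sketch of running the full-precision model with Hugging Face Transformers outside of Colab. The prompt template follows the ALMA project's prompt style, and the generation settings are illustrative assumptions rather than tuned values.

```python
# Minimal usage sketch (assumes a GPU with enough VRAM for the fp16 model).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "webbigdata/ALMA-7B-Ja-V2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,   # load in half precision to fit on a single GPU
    device_map="auto",
)

# ALMA-style translation prompt: state the direction, give the source, ask for the target.
prompt = "Translate this from Japanese to English:\nJapanese: 明日は雨が降りそうです。\nEnglish:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=128,      # illustrative; increase for longer sentences
        do_sample=False,         # greedy decoding for deterministic output
    )

# Print only the newly generated tokens (the translation).
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```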


## Other Versions

### ALMA-7B-Ja-V2-GPTQ-Ja-En
GPTQ is a quantization method that reduces model size, and ALMA-7B-Ja-V2-GPTQ-Ja-En is a GPTQ-quantized version of this model with a smaller file size (3.9GB) and lower memory usage.
However, its performance is somewhat lower, and its translation ability for languages other than Japanese and English has deteriorated significantly.

See [webbigdata/ALMA-7B-Ja-V2-GPTQ-Ja-En](https://huggingface.co/webbigdata/ALMA-7B-Ja-V2-GPTQ-Ja-En) for the model and sample code for free Colab.

If you want to translate an entire file at once, try the Colab below (a simplified sketch follows the link).
[ALMA_7B_Ja_GPTQ_Ja_En_batch_translation_sample](https://github.com/webbigdata-jp/python_sample/blob/main/ALMA_7B_Ja_GPTQ_Ja_En_batch_translation_sample.ipynb)
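
The snippet below is a rough sketch of what such file-level batch translation can look like with the GPTQ model; the Colab notebook above is the reference implementation. Loading GPTQ weights via `from_pretrained` assumes the `auto-gptq` and `optimum` packages are installed, and the file names, prompt format, and generation settings are illustrative assumptions.

```python
# Hypothetical batch-translation sketch: translate input.txt (one Japanese sentence per line)
# into output.txt. Assumes `pip install auto-gptq optimum` so Transformers can load GPTQ weights.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "webbigdata/ALMA-7B-Ja-V2-GPTQ-Ja-En"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

def translate(text: str) -> str:
    # Same ALMA-style prompt as in the sample above.
    prompt = f"Translate this from Japanese to English:\nJapanese: {text}\nEnglish:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True).strip()

with open("input.txt", encoding="utf-8") as fin, open("output.txt", "w", encoding="utf-8") as fout:
    for line in fin:
        line = line.strip()
        fout.write((translate(line) if line else "") + "\n")
```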


**ALMA** (**A**dvanced **L**anguage **M**odel-based tr**A**nslator) is an LLM-based translation model, which adopts a new translation model paradigm: it begins with fine-tuning on monolingual data and is further optimized using high-quality parallel data. This two-step fine-tuning process ensures strong translation performance.
Please find more details in their [paper](https://arxiv.org/abs/2309.11674).
```
@misc{xu2023paradigm,
      title={A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models},
      author={Haoran Xu and Young Jin Kim and Amr Sharaf and Hany Hassan Awadalla},
      year={2023},
      eprint={2309.11674},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```

Original Model: [ALMA-7B](https://huggingface.co/haoranxu/ALMA-7B) (26.95GB)
Previous Model: [ALMA-7B-Ja](https://huggingface.co/webbigdata/ALMA-7B-Ja) (13.3GB)

## About this work
- **This work was done by:** [webbigdata](https://webbigdata.jp/).