Update README.md
README.md CHANGED
@@ -31,6 +31,13 @@ We also support [Huggingface](hflink)


## 模型细节/Model details
+| Model | License | Commercial use? | GPU | Model link |
+| :--- | :--- | :--- | :--- | :--- |
+| Aquila-7B | Apache 2.0 | ✅ | Nvidia-A100 | https://model.baai.ac.cn/model-detail/100098 |
+| AquilaCode-7B-nv | Apache 2.0 | ✅ | Nvidia-A100 | https://model.baai.ac.cn/model-detail/100102 |
+| AquilaCode-7B-ts | Apache 2.0 | ✅ | Tianshu-BI-V100 | https://model.baai.ac.cn/model-detail/100099 |
+| AquilaChat-7B | Apache 2.0 | ✅ | Nvidia-A100 | https://model.baai.ac.cn/model-detail/100101 |
+

We used a series of more efficient low-level operators to support model training, including approaches that draw on [flash-attention](https://github.com/HazyResearch/flash-attention) and replace some intermediate computations, as well as RMSNorm. On top of this, we applied [BMtrain](https://github.com/OpenBMB/BMTrain) for lightweight parallel training, which optimizes the training process with data parallelism, ZeRO (Zero Redundancy Optimizer), optimizer offloading, checkpointing and operation fusion, and communication-computation overlap.

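The paragraph above mentions RMSNorm among the training-efficiency changes. As a rough illustration only, not the implementation used in this repository, a minimal PyTorch-style RMSNorm layer could look like the sketch below (the `hidden_size` and `eps` parameters are illustrative):

```
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Minimal RMSNorm sketch: scales activations by their root-mean-square
    instead of subtracting the mean as LayerNorm does."""

    def __init__(self, hidden_size: int, eps: float = 1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(hidden_size))  # learned scale
        self.eps = eps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # mean of squares over the last (hidden) dimension
        variance = x.pow(2).mean(-1, keepdim=True)
        x = x * torch.rsqrt(variance + self.eps)
        return self.weight * x
```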
@@ -51,20 +58,12 @@ We used different tokenizers to extract ten thousand data samples from English,
| llama | 32000 | sp(bpe) | 1805 | 1257 | 1970 |
| gpt2_new_100k | 100000 | bpe | 1575 | 477 | 1679 |

-
-
-模型在一台8卡Nvidia A100上训练8小时,总共对15万条数据训练了3个epoch。
-
-The model was trained on 8 Nvidia A100 GPUs for 8 hours, for a total of 3 epochs over 150,000 data samples.
-
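The table above compares how many tokens different tokenizers produce on the same samples. As a hedged sketch of how such a comparison could be reproduced with Hugging Face tokenizers, the checkpoint names (`huggyllama/llama-7b`, `gpt2`) and the short sample strings below are placeholders, not the tokenizers or the ten-thousand-sample corpora actually used for the table:

```
from transformers import AutoTokenizer

# Placeholder checkpoints and texts; the real tokenizers and corpora
# behind the table above are not reproduced here.
tokenizers = {
    "llama": "huggyllama/llama-7b",
    "gpt2": "gpt2",
}
samples = {
    "en": "An example English sentence.",
    "zh": "一个中文例句。",
    "code": "def add(a, b):\n    return a + b",
}

for name, checkpoint in tokenizers.items():
    tok = AutoTokenizer.from_pretrained(checkpoint)
    # number of tokens each tokenizer produces per sample
    counts = {lang: len(tok.encode(text)) for lang, text in samples.items()}
    print(name, tok.vocab_size, counts)
```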
## 训练数据集/Training data

我们采用了一系列高质量中英文数据集来训练和微调我们的对话语言模型,并且在不断更新迭代。

We used a series of high-quality Chinese and English datasets to train and fine-tune our conversational language model, and we are continuously updating and iterating on them.

-![Screenshot](../img/data.jpg)
-

## 使用方式/How to use

@@ -204,7 +203,7 @@ Create a new directory named `aquila-7b` inside `./checkpoints_in`. Place the fi

#### Step 3: 启动可监督微调/Start SFT
```
-bash dist_trigger_docker.sh hostfile aquila-sft.yaml
+bash dist_trigger_docker.sh hostfile aquila-sft.yaml aquilachat-7b [实验名]
```
接下来会输出下列信息,注意`NODES_NUM`应该与节点数相等,`LOGFILE`是模型运行的日志文件;The following information will be output. Note that `NODES_NUM` should be equal to the number of nodes, and `LOGFILE` is the log file for the model run.

@@ -217,7 +216,7 @@ bash dist_trigger_docker.sh hostfile aquila-sft.yaml aquila-7b [实验名]

## 证书/License

-Aquila-7B开源模型使用 [智源Aquila系列模型许可协议](
+Aquila-7B开源模型使用 [智源Aquila系列模型许可协议](https://huggingface.co/BAAI/AquilaCode-7B-NV/resolve/main/BAAI%20Aquila%20Model%20License%20Agreement.pdf),原始代码基于[Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0)。


-Aquila-7B open-source model is licensed under [ BAAI Aquila Model Licence Agreement](
+The Aquila-7B open-source model is licensed under the [BAAI Aquila Model License Agreement](https://huggingface.co/BAAI/AquilaCode-7B-NV/resolve/main/BAAI%20Aquila%20Model%20License%20Agreement.pdf). The source code is released under [Apache License 2.0](https://www.apache.org/licenses/LICENSE-2.0).