Update README.md
README.md
CHANGED
@@ -12,7 +12,7 @@ widget:
 
 ## Model description
 
-This is the set of Chinese T5 Version 1.1 models pre-trained by [UER-py](https://arxiv.org/abs/1909.05658).
+This is the set of Chinese T5 Version 1.1 models pre-trained by [UER-py](https://github.com/dbiir/UER-py/), which is introduced in [this paper](https://arxiv.org/abs/1909.05658).
 
 **Version 1.1**
 
@@ -22,6 +22,8 @@ Chinese T5 Version 1.1 includes the following improvements compared to our Chine
 - Dropout was turned off in pre-training
 - no parameter sharing between embedding and classifier layer
 
+You can download the set of Chinese T5 Version 1.1 models either from the [UER-py Modelzoo page](https://github.com/dbiir/UER-py/wiki/Modelzoo), or via HuggingFace from the links below:
+
 | | Link |
 | ----------------- | :----------------------------: |
 | **T5-v1_1-Small** | [**L=8/H=512 (Small)**][small] |
@@ -74,7 +76,6 @@ python3 pretrain.py --dataset_path cluecorpussmall_t5-v1_1_seq128_dataset.pt \
 --embedding word --relative_position_embedding --remove_embedding_layernorm --tgt_embedding word \
 --encoder transformer --mask fully_visible --layernorm_positioning pre \
 --feed_forward gated --decoder transformer --target t5
-
 ```
 
 Stage2: