IDEA-CCNL/Erlangshen-MegatronBert-1.3B


Joelzhang committed on Sep 16, 2022

Commit 6379eb7 (1 parent: aea31a7)

Update README.md

Files changed (1)

README.md +1 -1


@@ -30,7 +30,7 @@ A bidirectional language model based on the Encoder structure, focusing on solvi
 We follow [Megatron-LM](https://github.com/NVIDIA/Megatron-LM), using 32 A100s and spending 14 days training a billion-level BERT on WuDao Corpora (180 GB version). Given Chinese grammar and the difficulty of large-scale training, we use four pre-training procedures to improve BERT: 1) Whole Word Masking (WWM), 2) Knowledge-based Dynamic Masking (KDM), 3) Sentence Order Prediction (SOP), 4) Pre-layer Normalization (Pre-LN).
 ## 成就 Achievement
-1.2021年11月10日，二郎神在FewCLUE上取得第一。其中，它在CHIDF(成语填空)和TNEWS(新闻分类)子任务中的表现优于人类表现。此外，它在CHIDF(成语填空), CSLDCP(学科文献分类), OCNLI(自然语言推理)任 务中均名列前茅。
+1.2021年11月10日，二郎神在FewCLUE上取得第一。其中，它在CHIDF(成语填空)和TNEWS(新闻分类)子任务中的表现优于人类表现。此外，它在CHIDF(成语填空), CSLDCP(学科文献分类), OCNLI(自然语言推理)任务中均名列前茅。
 2.2022年1月24日，二郎神在CLUE基准测试中的ZeroCLUE中取得第一。具体到子任务，我们在CSLDCP(主题文献分类), TNEWS(新闻分类), IFLYTEK(应用描述分类), CSL(抽象关键字识别)和CLUEWSC(参考消歧)任务中取得第一。
 3.在2022年7月10日，我们在CLUE基准的语义匹配任务中取得第一。