---
language:
- zh
license: apache-2.0
widget:
- text: "生活的真谛是[MASK]。"
---
# Mengzi-BERT base model (Chinese)
Pretrained model on a 300 GB Chinese corpus. Masked language modeling (MLM), part-of-speech (POS) tagging, and sentence order prediction (SOP) are used as training tasks.
Mengzi: A lightweight yet Powerful Chinese Pre-trained Language Model
## Usage
```python
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("Langboat/mengzi-bert-base")
model = BertModel.from_pretrained("Langboat/mengzi-bert-base")
```
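The loaded `BertModel` returns contextual token representations; for masked-token prediction you can instead use the `fill-mask` pipeline, which loads the checkpoint's MLM head. The snippet below is a minimal sketch continuing from the code above; the input sentences are only illustrative.

```python
import torch
from transformers import pipeline

# Encode a sentence and take the contextual token representations.
inputs = tokenizer("生活的真谛是快乐。", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch_size, sequence_length, hidden_size)

# Predict the [MASK] token with the pretrained MLM head.
fill_mask = pipeline("fill-mask", model="Langboat/mengzi-bert-base")
print(fill_mask("生活的真谛是[MASK]。"))
```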
## Scores on nine Chinese tasks (without any data augmentation)
Model | AFQMC | TNEWS | IFLYTEK | CMNLI | WSC | CSL | CMRC | C3 | CHID |
---|---|---|---|---|---|---|---|---|---|
RoBERTa-wwm-ext | 74.04 | 56.94 | 60.31 | 80.51 | 67.80 | 81.00 | 75.20 | 66.50 | 83.62 |
Mengzi-BERT-base | 74.58 | 57.97 | 60.68 | 82.12 | 87.50 | 85.40 | 78.54 | 71.70 | 84.16 |
RoBERTa-wwm-ext scores are from the CLUE baseline.
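The scores above come from fine-tuning the checkpoint on each task. As a rough sketch (not the exact CLUE fine-tuning recipe), a classification task such as TNEWS can be approached by loading the checkpoint with a sequence-classification head; the label count and example sentence below are assumptions for illustration only.

```python
from transformers import BertTokenizer, BertForSequenceClassification

# Hypothetical setup: 15 labels assumed for a TNEWS-style news classifier;
# this is not the exact CLUE evaluation configuration.
tokenizer = BertTokenizer.from_pretrained("Langboat/mengzi-bert-base")
model = BertForSequenceClassification.from_pretrained(
    "Langboat/mengzi-bert-base", num_labels=15
)

# Tokenize one example and compute logits over the label set (untrained head,
# so the outputs are meaningless until fine-tuning).
inputs = tokenizer("小米发布了新款折叠屏手机", return_tensors="pt")
logits = model(**inputs).logits
print(logits.shape)  # (1, 15)
```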
## Citation
If you find the technical report or this resource useful, please cite the technical report in your paper.
example