# CirBERTa

### Apply the Circular to the Pretraining Model

| Pretrained model      | Learning rate | Batch size | Hardware               | Corpus | Training time | Optimizer |
| --------------------- | ------------- | ---------- | ---------------------- | ------ | ------------- | --------- |
| CirBERTa-Chinese-Base | 1e-5          | 256        | 10× RTX 3090 + 3× A100 | 200 GB | 2 months      | AdamW     |
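
For illustration, the optimizer column corresponds to a standard AdamW setup. The sketch below shows one way to wire it up in PyTorch; only the optimizer family and learning rate come from the table, while everything else (weight decay, scheduler) is an unstated assumption.

```python
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("WENGSYX/CirBERTa-Chinese-Base")

# Learning rate taken from the table above; weight decay is an assumed
# default and is not specified in this README.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
```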
|
|
|
|
|
Unsupervised pretraining was performed on a general-domain corpus (WuDao, 200 GB).
|
|
|
On a range of Chinese language understanding tasks, CirBERTa-Base outperforms MacBERT-Chinese-Large and RoBERTa-Chinese-Large.
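
As a hedged sketch of how the checkpoint could be applied to such a task, the snippet below attaches a sequence-classification head via the standard Transformers API. The two-label setup and the example sentence are placeholders; this is not the authors' evaluation script.

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# num_labels is task-dependent (2 here as a placeholder for a binary task)
tokenizer = AutoTokenizer.from_pretrained("WENGSYX/CirBERTa-Chinese-Base")
model = AutoModelForSequenceClassification.from_pretrained(
    "WENGSYX/CirBERTa-Chinese-Base", num_labels=2
)

inputs = tokenizer("这部电影非常好看", return_tensors="pt")
logits = model(**inputs).logits  # shape: (1, num_labels)
```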
|
|
|
|
|
### Loading and Usage
|
|
|
Loading is handled through huggingface-transformers:
|
|
|
```python
from transformers import AutoTokenizer, AutoModel

# Download the tokenizer and pretrained weights from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained("WENGSYX/CirBERTa-Chinese-Base")
model = AutoModel.from_pretrained("WENGSYX/CirBERTa-Chinese-Base")
```
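
Continuing from the snippet above, a minimal feature-extraction example (the input sentence is arbitrary):

```python
import torch

# Tokenize a sentence and run it through the encoder
inputs = tokenizer("今天天气真好", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Final hidden states: (batch_size, sequence_length, hidden_size)
print(outputs.last_hidden_state.shape)
```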
|
|
|
|
|
### Citation

(Please cite the entry below for now; the paper is in preparation...)
|
```
@misc{CirBERTa,
  title={CirBERTa: Apply the Circular to the Pretraining Model},
  author={Yixuan Weng},
  howpublished={\url{https://github.com/WENGSYX/CirBERTa}},
  year={2022}
}
```
|
|