idea-teacher committed on commit ed8254e (1 parent: 7ca6486)

Update README.md

README.md CHANGED
@@ -1,3 +1,61 @@
---
language:
- zh

license: apache-2.0

tags:
- classification

inference: false
---

# IDEA-CCNL/Erlangshen-TCBert-110M-Sentence-Embedding-Chinese

- GitHub: [Fengshenbang-LM](https://github.com/IDEA-CCNL/Fengshenbang-LM)
- Docs: [Fengshenbang-Docs](https://fengshenbang-doc.readthedocs.io/)

## Brief Introduction

TCBert (Topic Classification BERT) is a 110M-parameter model pre-trained for Chinese topic classification tasks, though its use is not limited to them.

## Model Taxonomy

| Demand | Task | Series | Model | Parameter | Extra |
| :----: | :----: | :----: | :----: | :----: | :----: |
| General | Sentence Representation | Erlangshen | TCBert | 110M | Chinese |

## Model Information

To improve the model's sentence representations for topic classification, we collected a large amount of topic classification data and used it for prompt-based contrastive pre-training.
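
The card does not spell out the contrastive objective or the prompt templates that were used. As a rough illustration only, the sketch below computes a generic InfoNCE-style loss for one anchor sentence vector against one positive and several negatives. The function names, the toy vectors, and the `temperature` default of 0.05 are assumptions made for illustration, not details of TCBert's training setup.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce_loss(anchor, positive, negatives, temperature=0.05):
    """InfoNCE-style loss for one anchor: negative log-softmax of the positive pair."""
    # Temperature-scaled similarities; index 0 holds the positive pair.
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, neg) / temperature for neg in negatives]
    # Numerically stable log-sum-exp over all candidates.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_sum_exp - logits[0]

# Toy vectors (hypothetical): the positive points the same way as the anchor.
anchor = [1.0, 0.0]
easy = info_nce_loss(anchor, [1.0, 0.0], [[0.0, 1.0]])
hard = info_nce_loss(anchor, [0.0, 1.0], [[1.0, 0.0]])
print(easy < hard)
```

The loss is small when the anchor is closer to its positive than to any negative, and grows when a negative is the closer one, which is the behavior a contrastive objective of this kind rewards.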

### Performance

Stay tuned.

## Usage

```python
from transformers import BertForMaskedLM, BertTokenizer
import torch

# Model ID taken from this card's title.
model_id = "IDEA-CCNL/Erlangshen-TCBert-110M-Sentence-Embedding-Chinese"
tokenizer = BertTokenizer.from_pretrained(model_id)
model = BertForMaskedLM.from_pretrained(model_id)
```
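
The snippet above only loads the tokenizer and the model; the card does not yet say how a sentence vector should be derived from the model outputs. One common choice, assumed here and not confirmed by the authors, is to average the final-layer token vectors over non-padding positions and compare sentences by cosine similarity. With `transformers`, the per-token hidden states can typically be requested by passing `output_hidden_states=True` to the model call. In the sketch below, plain Python lists stand in for those tensors so that it runs without downloading anything; `mean_pool` and `cosine_similarity` are hypothetical helper names.

```python
import math

def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1, skipping padding."""
    kept = [vec for vec, keep in zip(token_vectors, attention_mask) if keep]
    dim = len(token_vectors[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy stand-ins for per-token hidden states; the last position is padding.
sentence_a = mean_pool([[1.0, 0.0], [3.0, 2.0], [9.0, 9.0]], [1, 1, 0])
sentence_b = mean_pool([[4.0, 2.0], [0.0, 0.0]], [1, 0])
similarity = cosine_similarity(sentence_a, sentence_b)
print(sentence_a, similarity)
```

Other pooling choices, such as taking the `[CLS]` vector, are equally plausible until the authors document the intended method.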
Stay tuned for more details.

If you use our model in your work, you can cite our [website](https://github.com/IDEA-CCNL/Fengshenbang-LM/):

```text
@misc{Fengshenbang-LM,
  title={Fengshenbang-LM},
  author={IDEA-CCNL},
  year={2021},
  howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}
```