language:
  - zh
license: apache-2.0
tags:
  - classification
inference: false

IDEA-CCNL/Erlangshen-TCBert-330M-Sentence-Embedding-Chinese

Brief Introduction

A Topic Classification BERT (TCBert) with 330M parameters, pre-trained for sentence representation in Chinese topic classification tasks.

Model Taxonomy

| Demand | Task | Series | Model | Parameter | Extra |
| --- | --- | --- | --- | --- | --- |
| General | Sentence representation | Erlangshen | TCBert (sentence representation) | 330M | Chinese |

Model Information

To improve sentence representations for topic classification, we collected a large number of topic classification datasets and ran contrastive pre-training on them based on general prompts.
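This card does not say which contrastive objective was used. Purely as an illustration of the general idea, the sketch below shows an in-batch InfoNCE loss, a common choice for contrastive sentence-representation training. The function name `info_nce`, the temperature value, and the use of in-batch negatives are all assumptions and are not taken from the TCBert training setup.

```python
import torch
import torch.nn.functional as F


def info_nce(anchors: torch.Tensor, positives: torch.Tensor,
             temperature: float = 0.05) -> torch.Tensor:
    """Illustrative in-batch InfoNCE loss (assumed, not the authors' objective).

    anchors, positives: (batch, hidden). Row i of `positives` is the positive
    example for row i of `anchors`; every other row acts as a negative.
    """
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    # Cosine similarity between every anchor and every candidate.
    logits = a @ p.T / temperature
    # The matching candidate for anchor i sits on the diagonal.
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)
```

The loss is small when each anchor is closest to its own positive and large when it is closer to another example in the batch.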

Performance

Stay tuned.

Usage

from transformers import BertForMaskedLM, BertTokenizer
import torch

tokenizer = BertTokenizer.from_pretrained("IDEA-CCNL/Erlangshen-TCBert-330M-Sentence-Embedding-Chinese")
model = BertForMaskedLM.from_pretrained("IDEA-CCNL/Erlangshen-TCBert-330M-Sentence-Embedding-Chinese")

Stay tuned for more details on usage for sentence representation.
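Until the official instructions are published, one possible way to get a sentence vector from the loaded model is to mean-pool the last hidden layer over non-padding tokens. The sketch below is an assumption, not the documented procedure for this checkpoint: the pooling strategy, the helper names `mean_pool`, `embed`, and `demo`, and the example sentences are all hypothetical, and no prompt template is applied.

```python
import torch
import torch.nn.functional as F


def mean_pool(hidden_states: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
    """Average token vectors over non-padding positions.

    hidden_states: (batch, seq_len, hidden); attention_mask: (batch, seq_len).
    """
    mask = attention_mask.unsqueeze(-1).to(hidden_states.dtype)
    summed = (hidden_states * mask).sum(dim=1)
    counts = mask.sum(dim=1).clamp(min=1.0)  # guard against empty rows
    return summed / counts


def embed(sentences, tokenizer, model) -> torch.Tensor:
    """Hypothetical helper: encode sentences and mean-pool the last hidden layer."""
    batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**batch, output_hidden_states=True)
    # hidden_states[-1] is the output of the final transformer layer.
    return mean_pool(outputs.hidden_states[-1], batch["attention_mask"])


def demo() -> None:
    """Downloads the checkpoint and prints the cosine similarity of two sentences."""
    from transformers import BertForMaskedLM, BertTokenizer

    name = "IDEA-CCNL/Erlangshen-TCBert-330M-Sentence-Embedding-Chinese"
    tokenizer = BertTokenizer.from_pretrained(name)
    model = BertForMaskedLM.from_pretrained(name).eval()
    vectors = embed(["今天天气很好", "股市今日大幅上涨"], tokenizer, model)
    print(F.cosine_similarity(vectors[0:1], vectors[1:2]).item())
```

Calling `demo()` downloads the weights on first use. The resulting vectors can be compared with cosine similarity or passed to a downstream classifier, but whether mean pooling is the best choice for this model has not been confirmed here.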

If you use our model in your work, you can cite our website:

@misc{Fengshenbang-LM,
  title={Fengshenbang-LM},
  author={IDEA-CCNL},
  year={2021},
  howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
}