Update README.md
README.md
CHANGED
@@ -5,8 +5,8 @@ language:
 ---

 # Kconvo-roberta: Korean conversation RoBERTa ([github](https://github.com/HeoTaksung/Domain-Robust-Retraining-of-Pretrained-Language-Model))
-- There are many PLMs (Pretrained Language Models) for Korean, but most of them
-- Here, we introduce a retrained PLM for prediction of Korean conversation data.
+- There are many PLMs (Pretrained Language Models) for Korean, but most of them are trained on written language.
+- Here, we introduce a PLM retrained on spoken-language data for prediction on Korean conversation data.

 ## Usage
 ```python
@@ -20,7 +20,7 @@ model_roberta = RobertaModel.from_pretrained("yeongjoon/Kconvo-roberta")
 -----------------
 ## Domain Robust Retraining of Pretrained Language Model

-- Kconvo-roberta uses [klue/roberta-base](https://huggingface.co/klue/roberta-base) as the
+- Kconvo-roberta uses [klue/roberta-base](https://huggingface.co/klue/roberta-base) as the base model and is additionally retrained on the conversation dataset.
 - The dataset used for retraining was collected through the [National Institute of the Korean Language](https://corpus.korean.go.kr/request/corpusRegist.do) and [AI-Hub](https://www.aihub.or.kr/aihubdata/data/list.do?pageIndex=1&currMenu=115&topMenu=100&dataSetSn=&srchdataClCode=DATACL001&srchOrder=&SrchdataClCode=DATACL002&searchKeyword=&srchDataRealmCode=REALM002&srchDataTy=DATA003); its composition is as follows.

 ```
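The README says the base model is retrained on conversation data but does not state the training objective. As a rough illustration only, the sketch below assumes the retraining reuses the masked-language-modeling objective that RoBERTa models are pretrained with: about 15% of positions are selected, and of those 80% become a mask symbol, 10% a random token, and 10% stay unchanged. The function `mask_tokens`, the `MASK_TOKEN` string, and the toy vocabulary are hypothetical names for this example and are not taken from the repository.

```python
import random

MASK_TOKEN = "[MASK]"  # hypothetical placeholder; a real tokenizer defines its own mask token


def mask_tokens(tokens, vocab, mask_prob=0.15, rng=None):
    """Build one masked-language-modeling example.

    Returns (masked_tokens, labels). labels[i] is the original token at
    positions the model must predict and None everywhere else.
    """
    if rng is None:
        rng = random.Random()
    masked, labels = [], []
    for token in tokens:
        if rng.random() >= mask_prob:
            # Position not selected: keep the token, no prediction target.
            masked.append(token)
            labels.append(None)
            continue
        labels.append(token)
        roll = rng.random()
        if roll < 0.8:
            masked.append(MASK_TOKEN)         # 80%: replace with the mask symbol
        elif roll < 0.9:
            masked.append(rng.choice(vocab))  # 10%: replace with a random vocabulary token
        else:
            masked.append(token)              # 10%: leave the token unchanged
    return masked, labels


# Toy, whitespace-split conversational sentence (not from the training corpus).
tokens = "오늘 저녁에 뭐 먹을까".split()
vocab = tokens + ["내일", "점심에"]
masked, labels = mask_tokens(tokens, vocab, rng=random.Random(0))
```

Drawing a fresh mask each time a sentence is sampled, rather than fixing it once during preprocessing, corresponds to the dynamic masking used in RoBERTa pretraining. In practice a library data collator would normally handle this step on token IDs rather than strings.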