yeongjoon committed on
Commit
3840cc7
1 Parent(s): 36fa30a

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -5,8 +5,8 @@ language:
  ---
 
  # Kconvo-roberta: Korean conversation RoBERTa ([github](https://github.com/HeoTaksung/Domain-Robust-Retraining-of-Pretrained-Language-Model))
- - There are many PLMs (Pretrained Language Models) for Korean, but most of them exist for written language.
- - Here, we introduce a retrained PLM for prediction of Korean conversation data.
+ - There are many PLMs (Pretrained Language Models) for Korean, but most of them are trained with written language.
+ - Here, we introduce a retrained PLM for prediction of Korean conversation data where we use verbal data for training.
 
  ## Usage
  ```python
@@ -20,7 +20,7 @@ model_roberta = RobertaModel.from_pretrained("yeongjoon/Kconvo-roberta")
  -----------------
  ## Domain Robust Retraining of Pretrained Language Model
 
- - Kconvo-roberta uses [klue/roberta-base](https://huggingface.co/klue/roberta-base) as the basic model and additionally retrains the conversation dataset.
+ - Kconvo-roberta uses [klue/roberta-base](https://huggingface.co/klue/roberta-base) as the base model and retrained additionaly with the conversation dataset.
  - The retrained dataset was collected through the [National Institute of the Korean Language](https://corpus.korean.go.kr/request/corpusRegist.do) and [AI-Hub](https://www.aihub.or.kr/aihubdata/data/list.do?pageIndex=1&currMenu=115&topMenu=100&dataSetSn=&srchdataClCode=DATACL001&srchOrder=&SrchdataClCode=DATACL002&searchKeyword=&srchDataRealmCode=REALM002&srchDataTy=DATA003), and the collected dataset is as follows.
 
  ```