---
license: cc-by-4.0
language:
- ko
tags:
- generation
---
## Model Details
* Model Description: Speech style converter model based on gogamza/kobart-base-v2
* Developed by: Juhwan, Lee and Jisu, Kim
* Model Type: Text-generation
* Language: Korean
* License: CC-BY-4.0
## Dataset
* [Korean SmileStyle Dataset](https://github.com/smilegate-ai/korean_smile_style_dataset)
* Randomly split into training and validation sets (9:1)
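The random 9:1 split can be sketched as follows. This is a minimal illustration, not the authors' preprocessing script: the helper name `split_train_valid`, the seed, and the toy rows are assumptions made for the example.

```python
import random


def split_train_valid(rows, valid_ratio=0.1, seed=42):
    """Shuffle rows and return (train, valid) lists.

    Hypothetical helper: the seed and the rounding rule are assumptions,
    not values taken from the original training code.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_valid = round(len(shuffled) * valid_ratio)
    return shuffled[n_valid:], shuffled[:n_valid]


# Toy stand-in for the dataset rows.
rows = [f"sentence {i}" for i in range(100)]
train, valid = split_train_valid(rows)
print(len(train), len(valid))  # 90 10
```

Fixing the seed keeps the split reproducible across runs.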
## BLEU Score
* 25.35
## Uses
This model converts Korean text into one of the following speech styles:
* formal: 문어체
* informal: 구어체
* android: 안드로이드
* azae: 아재
* chat: 채팅
* choding: 초등학생
* emoticon: 이모티콘
* enfp: enfp
* gentle: 신사
* halbae: 할아버지
* halmae: 할머니
* joongding: 중학생
* king: 왕
* naruto: 나루토
* seonbi: 선비
* sosim: 소심한
* translator: 번역기
```python
from transformers import AutoTokenizer, pipeline

model = "KoJLabs/bart-speech-style-converter"
tokenizer = AutoTokenizer.from_pretrained(model)
nlg_pipeline = pipeline('text2text-generation', model=model, tokenizer=tokenizer)

# Korean labels for the supported styles, in the same order as the list above.
styles = ["문어체", "구어체", "안드로이드", "아재", "채팅", "초등학생", "이모티콘", "enfp", "신사", "할아버지", "할머니", "중학생", "왕", "나루토", "선비", "소심한", "번역기"]

for style in styles:
    # Prompt format: "<style> 형식으로 변환:<source text>" ("convert to <style> format")
    text = f"{style} 형식으로 변환:오늘은 닭볶음탕을 먹었다. 맛있었다."
    out = nlg_pipeline(text, max_length=100)
    print(style, out[0]['generated_text'])
```
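The prompt format used in the example above can be wrapped in a small helper that takes the English style key instead of the Korean label. This is only a sketch: `STYLE_LABELS` and `build_prompt` are hypothetical names introduced here, and the mapping simply restates the style list from this section.

```python
# English style keys mapped to the Korean labels the model expects.
STYLE_LABELS = {
    "formal": "문어체",
    "informal": "구어체",
    "android": "안드로이드",
    "azae": "아재",
    "chat": "채팅",
    "choding": "초등학생",
    "emoticon": "이모티콘",
    "enfp": "enfp",
    "gentle": "신사",
    "halbae": "할아버지",
    "halmae": "할머니",
    "joongding": "중학생",
    "king": "왕",
    "naruto": "나루토",
    "seonbi": "선비",
    "sosim": "소심한",
    "translator": "번역기",
}


def build_prompt(style_key, text):
    """Return the model input string for a style key such as "king"."""
    label = STYLE_LABELS[style_key]  # raises KeyError for an unknown style
    return f"{label} 형식으로 변환:{text}"


print(build_prompt("king", "오늘은 닭볶음탕을 먹었다. 맛있었다."))
```

The returned string can be passed to `nlg_pipeline` in place of the hand-built `text` variable.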
## Model Source
https://github.com/KoJLabs/speech-style/tree/main
## Speech style conversion package
You can try Korean speech style conversion with the Python package [KoTAN](https://github.com/KoJLabs/KoTAN).