---
license: cc-by-4.0
language:
- ko
tags:
- generation
---
## Model Details
* Model Description: Speech style converter model based on gogamza/kobart-base-v2
* Developed by: Juhwan, Lee and Jisu, Kim
* Model Type: Text-generation
* Language: Korean
* License: CC-BY-4.0
## Dataset
* [Korean SmileStyle Dataset](https://github.com/smilegate-ai/korean_smile_style_dataset)
* Randomly split into train and validation sets (9:1)
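A 9:1 random split like the one described above could be sketched as follows. This is only an illustration: the helper name `split_train_valid`, the seed, and the placeholder rows are assumptions, and the authors' actual splitting code is not given here.

```python
import random


def split_train_valid(rows, valid_ratio=0.1, seed=42):
    """Shuffle rows and return (train, valid) lists.

    Hypothetical helper: the seed and ratio handling are assumptions,
    not the procedure used to train this model.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # deterministic shuffle for a fixed seed
    n_valid = round(len(shuffled) * valid_ratio)
    return shuffled[n_valid:], shuffled[:n_valid]


# Placeholder rows standing in for dataset examples
rows = [f"sentence {i}" for i in range(100)]
train, valid = split_train_valid(rows)
print(len(train), len(valid))
```

Fixing the seed keeps the validation set stable across runs, which makes scores comparable between experiments.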
## BLEU Score
* 25.35
## Uses
This model converts Korean text into one of the following speech styles:
* formal: 문어체
* informal: 구어체
* android: 안드로이드
* azae: 아재
* chat: 채팅
* choding: 초등학생
* emoticon: 이모티콘
* enfp: enfp
* gentle: 신사
* halbae: 할아버지
* halmae: 할머니
* joongding: 중학생
* king: 왕
* naruto: 나루토
* seonbi: 선비
* sosim: 소심한
* translator: 번역기
```python
from transformers import AutoTokenizer, pipeline

model = "KoJLabs/bart-speech-style-converter"
tokenizer = AutoTokenizer.from_pretrained(model)
nlg_pipeline = pipeline('text2text-generation', model=model, tokenizer=tokenizer)

# Korean style labels, as they must appear in the prompt
styles = ["문어체", "구어체", "안드로이드", "아재", "채팅", "초등학생", "이모티콘", "enfp", "신사", "할아버지", "할머니", "중학생", "왕", "나루토", "선비", "소심한", "번역기"]

for style in styles:
    # Prompt: "Convert to <style> format: Today I ate dakbokkeumtang. It was delicious."
    text = f"{style} 형식으로 변환:오늘은 닭볶음탕을 먹었다. 맛있었다."
    out = nlg_pipeline(text, max_length=100)
    print(style, out[0]['generated_text'])
```
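To select a style by its English key rather than typing the Korean label, a small prompt builder may help. The sketch below assumes the prompt format used in the example above; `STYLES` and `build_prompt` are hypothetical names introduced here for illustration and are not part of the model or of any package.

```python
# Hypothetical mapping from the English keys listed under "Uses"
# to the Korean labels the model expects in the prompt.
STYLES = {
    "formal": "문어체", "informal": "구어체", "android": "안드로이드",
    "azae": "아재", "chat": "채팅", "choding": "초등학생",
    "emoticon": "이모티콘", "enfp": "enfp", "gentle": "신사",
    "halbae": "할아버지", "halmae": "할머니", "joongding": "중학생",
    "king": "왕", "naruto": "나루토", "seonbi": "선비",
    "sosim": "소심한", "translator": "번역기",
}


def build_prompt(style_key, sentence):
    """Return the model prompt for a style key; reject unknown keys."""
    if style_key not in STYLES:
        raise ValueError(f"unknown style: {style_key!r}")
    # Assumed format, copied from the usage example: "<label> 형식으로 변환:<sentence>"
    return f"{STYLES[style_key]} 형식으로 변환:{sentence}"


print(build_prompt("king", "오늘은 닭볶음탕을 먹었다. 맛있었다."))
```

The returned string can be passed to the pipeline call in place of the hand-written `text`.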
## Model Source
https://github.com/KoJLabs/speech-style/tree/main
## Speech style conversion package
You can try Korean speech style conversion with the Python package [KoTAN](https://github.com/KoJLabs/KoTAN).