---
language:
- ko
tags:
- generated_from_keras_callback
model-index:
- name: t5-large-korean-P2G
  results: []
---

# t5-large-korean-P2G

This model fine-tunes lcw99/t5-large-korean-text-summary on 500,000 sentences from the National Institute of Korean Language 2021 newspaper corpus that were G2P-converted with g2pK; it restores the G2P-converted (phonemically spelled) text back to the original spelling.

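For context on the data, the G2P side of each training pair can be produced with g2pK. The sketch below shows how (phonemic input, original target) pairs could be built; the `g2pk` package usage and the sample sentence are illustrative assumptions, not the exact preprocessing script used for this model.

```python
from g2pk import G2p  # pip install g2pk

g2p = G2p()

# Original grapheme sentences are the restoration targets;
# their g2pK conversions serve as the model inputs.
originals = ["๋๋ ์ค๋ ํ๊ต์ ๊ฐ๋ค."]
pairs = [(g2p(sentence), sentence) for sentence in originals]
print(pairs)
```
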
## Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import nltk

nltk.download('punkt')

model_dir = "t5-large-korean-P2G"

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)

# Phonemically spelled (G2P) input to be restored to standard orthography
text = "ํ์๊ธด๊ฐ ์๊น ๊น๋์ ๊ฑ์ฌ๊ผฌ๋ฐฑ ๋ฝ ์ ์์ค์ง ๋๊ถ ์ถ๊ฐ"

inputs = tokenizer(text, max_length=256, truncation=True, return_tensors="pt")
output = model.generate(**inputs, num_beams=8, do_sample=True, min_length=10, max_length=100)
decoded_output = tokenizer.batch_decode(output, skip_special_tokens=True)[0]

# Keep only the first sentence of the decoded output
restored_text = nltk.sent_tokenize(decoded_output.strip())[0]
print(restored_text)
```
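
The same checkpoint can also be called through the high-level `pipeline` API. A minimal sketch, assuming the checkpoint resolves from the same `t5-large-korean-P2G` identifier as above:

```python
from transformers import pipeline

# text2text-generation bundles tokenization, generation, and decoding in one call
p2g = pipeline("text2text-generation", model="t5-large-korean-P2G")
result = p2g("ํ์๊ธด๊ฐ ์๊น ๊น๋์ ๊ฑ์ฌ๊ผฌ๋ฐฑ ๋ฝ ์ ์์ค์ง ๋๊ถ ์ถ๊ฐ",
             num_beams=8, do_sample=True, min_length=10, max_length=100)
print(result[0]["generated_text"])
```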

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: None
- training_precision: float16
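
The `generated_from_keras_callback` tag and the recorded precision suggest Keras training under a float16 mixed-precision policy. A minimal sketch of enabling it (an assumption; the actual training script is not included in this card):

```python
import tensorflow as tf

# Assumed setup: "mixed_float16" (float16 compute, float32 variables)
# is the usual policy behind a recorded training_precision of float16.
tf.keras.mixed_precision.set_global_policy("mixed_float16")
```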

### Training results

### Framework versions

- Transformers 4.22.1
- TensorFlow 2.10.0
- Datasets 2.5.1
- Tokenizers 0.12.1