---
language:
- ko
tags:
- generated_from_keras_callback
model-index:
- name: t5-large-korean-P2G
  results: []
---
# t5-large-korean-P2G
This model fine-tunes lcw99/t5-large-korean-text-summary on 500,000 sentences from the National Institute of Korean Language newspaper corpus (2021), converted with g2pK, so that it restores G2P-converted (phonetically spelled) Korean text back to its original written form.
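For reference, training pairs of this kind can be reproduced with g2pK: the original sentence is the target, and its G2P conversion is the model input. A minimal sketch, assuming the `g2pk` package and an illustrative sentence not taken from this card:

```python
from g2pk import G2p  # pip install g2pk

g2p = G2p()
original = "학교에 갑니다"   # target: original written form
phonetic = g2p(original)    # input: G2P-converted form, e.g. "학꾜에 감니다"
print(phonetic, "->", original)
```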
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import nltk

nltk.download('punkt')  # sentence tokenizer used to post-process the output

model_dir = "t5-large-korean-P2G"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)

# G2P-converted (phonetically spelled) Korean input; the model restores the
# original spelling. Illustrative sample: G2P form of "학교에 갑니다".
text = "학꾜에 감니다"

inputs = tokenizer(text, max_length=256, truncation=True, return_tensors="pt")
output = model.generate(**inputs, num_beams=8, do_sample=True, min_length=10, max_length=100)
decoded_output = tokenizer.batch_decode(output, skip_special_tokens=True)[0]
predicted_title = nltk.sent_tokenize(decoded_output.strip())[0]  # keep the first sentence
print(predicted_title)
```
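The same checkpoint can also be driven through a `text2text-generation` pipeline; a minimal sketch, reusing the local `t5-large-korean-P2G` directory and the illustrative input from above:

```python
from transformers import pipeline

# The pipeline wraps tokenization, generation, and decoding in one call.
p2g = pipeline("text2text-generation", model="t5-large-korean-P2G")
print(p2g("학꾜에 감니다", num_beams=8, max_length=100)[0]["generated_text"])
```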
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: None
- training_precision: float16
### Training results
### Framework versions
- Transformers 4.22.1
- TensorFlow 2.10.0
- Datasets 2.5.1
- Tokenizers 0.12.1