t5-large-korean-P2G / Readme.md
---
language:
- ko
tags:
- generated_from_keras_callback
model-index:
- name: t5-large-korean-P2G
results: []
---
# t5-large-korean-P2G
This model restores G2P-converted (pronunciation-spelled) Korean text to its original written form. It was trained by fine-tuning lcw99/t5-large-korean-text-summary on 500,000 sentences from the National Institute of Korean Language's 2021 newspaper corpus, which were converted to pronunciation spelling with g2pK to build the training pairs.
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import nltk

nltk.download("punkt")  # sentence tokenizer, used to trim the decoded output

model_dir = "t5-large-korean-P2G"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)

# Pronunciation-spelled (G2P) input to be restored to standard orthography
text = "회새긴간 작까 김동시 걱심꼬백 뜽 새 소설집 뚜권 출간"

inputs = tokenizer(text, max_length=256, truncation=True, return_tensors="pt")
output = model.generate(
    **inputs, num_beams=8, do_sample=True, min_length=10, max_length=100
)
decoded_output = tokenizer.batch_decode(output, skip_special_tokens=True)[0]

# Keep only the first sentence of the generated text
predicted_title = nltk.sent_tokenize(decoded_output.strip())[0]
print(predicted_title)
```
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: None
- training_precision: float16
### Training results
### Framework versions
- Transformers 4.22.1
- TensorFlow 2.10.0
- Datasets 2.5.1
- Tokenizers 0.12.1