---
language:
- ko
tags:
- generated_from_keras_callback
model-index:
- name: t5-large-korean-P2G
results: []
---
# t5-large-korean-P2G
This model fine-tunes lcw99/t5-large-korean-text-summary on 500,000 sentences from the National Institute of Korean Language newspaper corpus (2021) that were converted with g2pK, so that it restores G2P-transcribed (pronunciation-spelled) text back to the original orthography.
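For reference, training pairs for a P2G model like this can be built by running g2pK over original sentences and using the phonetic output as the model input. The sketch below illustrates that pairing; the example sentence is hypothetical and this is not the actual data-preparation script.
```python
from g2pk import G2p  # pip install g2pk

g2p = G2p()

# Hypothetical example; the real pairs came from the NIKL newspaper corpus.
sentence = "어제는 날씨가 맑았는데, 오늘은 흐리다."
phonetic = g2p(sentence)  # grapheme-to-phoneme conversion

# P2G training pair: the model learns phonetic spelling -> original orthography
source, target = phonetic, sentence
print(source, "->", target)
```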
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import nltk

nltk.download('punkt')  # sentence tokenizer data used below

model_dir = "t5-large-korean-P2G"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)

# G2P (pronunciation-spelled) input to be restored to standard orthography
text = "회새긴간 작까 김동시 걍심꼬백 뜽 새 소설집 뚜권 출간"

inputs = tokenizer(text, max_length=256, truncation=True, return_tensors="pt")
output = model.generate(**inputs, num_beams=8, do_sample=True, min_length=10, max_length=100)
decoded_output = tokenizer.batch_decode(output, skip_special_tokens=True)[0]

# Keep only the first sentence of the generated text
predicted_title = nltk.sent_tokenize(decoded_output.strip())[0]
print(predicted_title)
```
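Equivalently, the model can be wrapped in a `text2text-generation` pipeline. This is a sketch assuming the same model path as above; the generation settings simply mirror the snippet and are not prescribed by the card.
```python
from transformers import pipeline

p2g = pipeline("text2text-generation", model="t5-large-korean-P2G")
result = p2g(
    "회새긴간 작까 김동시 걍심꼬백 뜽 새 소설집 뚜권 출간",
    num_beams=8, do_sample=True, min_length=10, max_length=100,
)
print(result[0]["generated_text"])
```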
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: None
- training_precision: float16 (see the sketch after this list)
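Given the `generated_from_keras_callback` tag and TensorFlow 2.10, the recorded precision presumably corresponds to a Keras global precision policy. A minimal sketch of that setting (an assumption, not taken from the actual training script):
```python
import tensorflow as tf

# Assumption: reproduce the recorded training_precision of float16
# by setting the global Keras precision policy.
tf.keras.mixed_precision.set_global_policy("float16")
print(tf.keras.mixed_precision.global_policy().name)  # float16
```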
### Training results
### Framework versions
- Transformers 4.22.1
- TensorFlow 2.10.0
- Datasets 2.5.1
- Tokenizers 0.12.1