---
language:
- ko
tags:
- generated_from_keras_callback
model-index:
- name: t5-large-korean-news-title-klue-ynat
results: []
---
# t5-large-korean-news-title-klue-ynat
This model was built by fine-tuning lcw99/t5-large-korean-text-summary on klue-ynat.

Input = ['IT과학', '경제', '사회', '생활문화', '세계', '스포츠', '정치'] (the KLUE-YNAT topic labels)

Output = a news article title generated to match the given label
## Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model_dir = "kfkas/t5-large-korean-news-title-klue-ynat"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)
model.to(device)

label_list = ['IT과학', '경제', '사회', '생활문화', '세계', '스포츠', '정치']
text = "경제"  # pick one of the labels above

input_ids = tokenizer.encode(text, max_length=256, truncation=True, return_tensors="pt").to(device)
with torch.no_grad():
    output = model.generate(
        input_ids,
        do_sample=True,  # use sampling instead of greedy decoding
        max_length=128,  # maximum decoding length
        top_k=50,        # sample only from the 50 most likely tokens
        top_p=0.95,      # restrict sampling to the smallest set with cumulative probability 0.95
    )
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
print(decoded_output)  # prints a generated news title for the given label, e.g. an economy headline
```
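Because `do_sample=True`, the script can produce a different headline on each run for the same input; pass a different entry from `label_list` as `text` to generate titles for other topics.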
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
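The exact preprocessing used for training is not documented. As a rough illustration only (a minimal sketch assuming the Hugging Face `datasets` loader for `klue`/`ynat`; the `to_seq2seq` function and column handling below are assumptions, not the author's script), the (label → title) pairs implied by the model description could be built like this:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Assumption: each training pair maps a YNAT topic label (source) to the news title (target).
dataset = load_dataset("klue", "ynat")                    # KLUE YNAT: news titles with topic labels
label_names = dataset["train"].features["label"].names    # ['IT과학', '경제', '사회', ...]
tokenizer = AutoTokenizer.from_pretrained("lcw99/t5-large-korean-text-summary")

def to_seq2seq(example):
    # Reverse of the usual classification task: the label text is the input,
    # the human-written headline is the generation target.
    model_inputs = tokenizer(label_names[example["label"]], max_length=256, truncation=True)
    targets = tokenizer(text_target=example["title"], max_length=128, truncation=True)
    model_inputs["labels"] = targets["input_ids"]
    return model_inputs

tokenized = dataset.map(to_seq2seq, remove_columns=dataset["train"].column_names)
```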
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: None
- training_precision: float16
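The optimizer was not recorded in the card. For reference, `training_precision: float16` usually corresponds to a Keras mixed-precision policy; the following is a minimal sketch of that setup only, with a placeholder optimizer and learning rate rather than values from the actual run:

```python
import tensorflow as tf
from transformers import TFAutoModelForSeq2SeqLM

# "mixed_float16": compute in float16 while keeping float32 master weights.
tf.keras.mixed_precision.set_global_policy("mixed_float16")

# from_pt=True converts the PyTorch checkpoint of the base model to TensorFlow.
model = TFAutoModelForSeq2SeqLM.from_pretrained("lcw99/t5-large-korean-text-summary", from_pt=True)

# Placeholder optimizer/learning rate; the card lists the optimizer as None (unrecorded).
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5))
```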
### Training results
### Framework versions
- Transformers 4.22.1
- TensorFlow 2.10.0
- Datasets 2.5.1
- Tokenizers 0.12.1