---
language:
- ko
tags:
- generated_from_keras_callback
model-index:
- name: t5-base-korean-news-title-klue-ynat
  results: []
---
|
|
|
# t5-base-korean-news-title-klue-ynat
|
|
|
|
|
This model was created by fine-tuning lcw99/t5-base-korean-text-summary on the KLUE-YNAT dataset.

Input = one of the labels ['IT과학', '경제', '사회', '생활문화', '세계', '스포츠', '정치']

Output = a generated news headline matching that label.
|
|
|
## Usage
|
```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_dir = "kfkas/t5-base-korean-news-title-klue-ynat"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)
model.to(device)

label_list = ['IT과학', '경제', '사회', '생활문화', '세계', '스포츠', '정치']
text = "IT과학"

input_ids = tokenizer.encode(
    text, max_length=256, truncation=True, return_tensors="pt"
).to(device)
with torch.no_grad():
    output = model.generate(
        input_ids,
        do_sample=True,  # use sampling instead of greedy decoding
        max_length=128,  # maximum decoding length
        top_k=50,        # sample only from the 50 highest-probability tokens
        top_p=0.95,      # keep the smallest token set whose cumulative probability reaches 0.95
    )
predicted_title = tokenizer.decode(output[0], skip_special_tokens=True)
print(predicted_title)
```
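The `top_k` and `top_p` arguments above jointly restrict which tokens sampling may choose at each decoding step. A minimal sketch of that filtering logic on a toy distribution (illustrative only; `filter_top_k_top_p` is a hypothetical helper, not the transformers implementation):

```python
def filter_top_k_top_p(probs, top_k=50, top_p=0.95):
    """Keep at most top_k tokens, then keep the smallest high-probability
    prefix whose cumulative probability reaches top_p; renormalize."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, cum = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cum += p
        if cum >= top_p:
            break
    total = sum(p for _, p in kept)
    return {tok: p / total for tok, p in kept}

# "d" is cut by top_k=3; the remaining prefix already covers top_p=0.9
dist = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
print(filter_top_k_top_p(dist, top_k=3, top_p=0.9))
```

Combining the two filters caps both the candidate count (`top_k`) and the tail mass (`top_p`), which is why low-probability tokens such as `"d"` can never be sampled here.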
|
|
|
|
|
## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: None
- training_precision: float16

### Training results

### Framework versions

- Transformers 4.22.1
- TensorFlow 2.10.0
- Datasets 2.5.1
- Tokenizers 0.12.1