---
language:
- ko
tags:
- generated_from_keras_callback
model-index:
- name: t5-base-korean-news-title-klue-ynat
  results: []
---
# t5-base-korean-news-title-klue-ynat
์ด ๋ชจ๋ธ์€ lcw99 / t5-base-korean-text-summary์„ klue-ynat์œผ๋กœ ํ›ˆ๋ จ์‹œ์ผœ ๋งŒ๋“  ๋ชจ๋ธ์ž…๋‹ˆ๋‹ค.
Input = ['IT๊ณผํ•™','๊ฒฝ์ œ','์‚ฌํšŒ','์ƒํ™œ๋ฌธํ™”','์„ธ๊ณ„','์Šคํฌ์ธ ','์ •์น˜']
OUTPUT = ๊ฐ label์— ๋งž๋Š” ๋‰ด์Šค ๊ธฐ์‚ฌ ์ œ๋ชฉ์„ ์ƒ์„ฑํ•ฉ๋‹ˆ๋‹ค.
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model_dir = "kfkas/t5-base-korean-news-title-klue-ynat"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)
model.to(device)
label_list = ['IT๊ณผํ•™','๊ฒฝ์ œ','์‚ฌํšŒ','์ƒํ™œ๋ฌธํ™”','์„ธ๊ณ„','์Šคํฌ์ธ ','์ •์น˜']
text = "IT๊ณผํ•™"
inputs = tokenizer.encode(text, max_length=256, truncation=True, return_tensors="pt")
with torch.no_grad():
output = model.generate(
input_ids,
do_sample=True, #์ƒ˜ํ”Œ๋ง ์ „๋žต ์‚ฌ์šฉ
max_length=128, # ์ตœ๋Œ€ ๋””์ฝ”๋”ฉ ๊ธธ์ด๋Š” 50
top_k=50, # ํ™•๋ฅ  ์ˆœ์œ„๊ฐ€ 50์œ„ ๋ฐ–์ธ ํ† ํฐ์€ ์ƒ˜ํ”Œ๋ง์—์„œ ์ œ์™ธ
top_p=0.95, # ๋ˆ„์  ํ™•๋ฅ ์ด 95%์ธ ํ›„๋ณด์ง‘ํ•ฉ์—์„œ๋งŒ ์ƒ์„ฑ
)
decoded_output = tokenizer.decode(output, skip_special_tokens=True)[0]
print(predicted_title)
```
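The `top_k` and `top_p` arguments in the snippet above restrict which tokens can be sampled at each decoding step. The following is a minimal pure-Python sketch of that idea on a toy distribution. It is a simplified illustration, not the Transformers implementation: the function names `top_k_top_p_filter` and `sample_token` and the probabilities in `toy_probs` are made up for this example.

```python
import random


def top_k_top_p_filter(probs, top_k=50, top_p=0.95):
    """Return {token: probability} after top-k and then top-p filtering.

    Illustrative sketch only; `probs` maps tokens to probabilities.
    """
    # Top-k: rank tokens by probability and keep at most `top_k` of them.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]

    # Renormalize the survivors so they sum to 1 before applying top-p.
    total = sum(p for _, p in ranked)
    ranked = [(token, p / total) for token, p in ranked]

    # Top-p: keep the smallest prefix whose cumulative probability reaches `top_p`.
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break

    # Renormalize once more so the final distribution sums to 1.
    total = sum(p for _, p in kept)
    return {token: p / total for token, p in kept}


def sample_token(probs, top_k=50, top_p=0.95, rng=random):
    """Draw one token from the filtered distribution."""
    filtered = top_k_top_p_filter(probs, top_k, top_p)
    return rng.choices(list(filtered), weights=list(filtered.values()), k=1)[0]


# Toy next-token distribution (hypothetical values, not model output).
toy_probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.04, "e": 0.01}

filtered = top_k_top_p_filter(toy_probs, top_k=4, top_p=0.9)
print(sorted(filtered))  # tokens that remain eligible for sampling
print(sample_token(toy_probs, top_k=4, top_p=0.9))
```

Lower `top_k` or `top_p` values narrow the candidate set and make the generated titles less varied. Because `do_sample=True` is set, repeated calls with the same label can return different titles.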
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- optimizer: None
- training_precision: float16
### Training results
### Framework versions
- Transformers 4.22.1
- TensorFlow 2.10.0
- Datasets 2.5.1
- Tokenizers 0.12.1