---
language: ko
tags:
- bart
license: MIT
---

# Korean News Summarization Model

## How to use

```python
import torch
from transformers import BartForConditionalGeneration, PreTrainedTokenizerFast

# Load the tokenizer and model from the Hugging Face Hub.
tokenizer = PreTrainedTokenizerFast.from_pretrained(
    'gogamza/kobart-summarization',
    bos_token='<s>', eos_token='</s>', unk_token='<unk>',
    pad_token='<pad>', mask_token='<mask>')
model = BartForConditionalGeneration.from_pretrained('gogamza/kobart-summarization')

# Korean source text to summarize (truncated sample).
text = "과거를 떠올려보자. 방송을 보던 우리의 모습을..."

# Wrap the token ids in BOS/EOS manually.
raw_input_ids = tokenizer.encode(text)
input_ids = [tokenizer.bos_token_id] + raw_input_ids + [tokenizer.eos_token_id]

# Generate the summary with beam search.
summary_ids = model.generate(
    torch.tensor([input_ids]),
    max_length=150,
    early_stopping=False,
    num_beams=5,
    repetition_penalty=1.0,
    eos_token_id=tokenizer.eos_token_id)
summ_text = tokenizer.batch_decode(summary_ids.tolist(), skip_special_tokens=True)[0]
print(summ_text)
```
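
News articles can be longer than the model accepts, so the manual BOS/EOS wrapping shown above may need a truncation step. The following is a minimal sketch, not part of the original model card: `build_input_ids` is a hypothetical helper, and the default `max_len=1024` is a placeholder assumption (check `model.config.max_position_embeddings` for the actual limit).

```python
def build_input_ids(token_ids, bos_id, eos_id, max_len=1024):
    """Wrap token ids in BOS/EOS, truncating the body so the total fits max_len.

    Hypothetical helper; max_len=1024 is an assumed placeholder, not a value
    taken from the model card.
    """
    if max_len < 2:
        raise ValueError("max_len must leave room for BOS and EOS")
    # Reserve two positions for the special tokens.
    body = list(token_ids)[: max_len - 2]
    return [bos_id] + body + [eos_id]
```

The result can replace the manually built `input_ids`, for example `build_input_ids(tokenizer.encode(text), tokenizer.bos_token_id, tokenizer.eos_token_id)`, before it is passed to `model.generate`. Truncating from the end is only one option; it assumes the most important content of a news article appears early.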

## Demo

- [Summarization demo](http://52.231.69.211:8081/)

![](summ.png)