---
language:
- ko
tags:
- generated_from_keras_callback
model-index:
- name: t5-large-korean-P2G
  results: []
---

# t5-large-korean-P2G

This model fine-tunes lcw99/t5-large-korean-text-summary for Korean phoneme-to-grapheme (P2G) conversion: it restores text that is written phonetically (the output of G2P) back to its original spelling. The training data was built from 500,000 sentences in the National Institute of Korean Language's 2021 newspaper corpus, each converted to its phonetic form with g2pK and paired with the original sentence.
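
The exact preprocessing script is not published in this card, but the description implies each training pair was built by running an original sentence through g2pK and using the phonetic output as the model input. Below is a minimal sketch of that pairing step, assuming the standard `g2pk` API; the corpus sentence is a made-up placeholder, not a sentence from the actual training data.

```python
# Sketch only: reconstructs the (G2P input, original target) pairing described above.
from g2pk import G2p  # pip install g2pk

g2p = G2p()

# Hypothetical stand-in for a sentence from the newspaper corpus.
originals = ["오늘은 날씨가 맑았는데 내일은 흐리겠다."]

# Each pair maps the phonetic spelling (model input) to the original text (target).
pairs = [(g2p(sentence), sentence) for sentence in originals]
for phonetic, original in pairs:
    print(phonetic, "->", original)
```
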
## Usage
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import nltk

nltk.download('punkt')  # sentence tokenizer used to trim the output below

model_dir = "t5-large-korean-P2G"  # path to this checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)

# A phonetically spelled (G2P) news headline; the model restores the original orthography.
text = "회새긴간 작까 김동시 걍심꼬백 뜽 새 소설집 뚜권 출간"
inputs = tokenizer(text, max_length=256, truncation=True, return_tensors="pt")
output = model.generate(**inputs, num_beams=8, do_sample=True, min_length=10, max_length=100)
decoded_output = tokenizer.batch_decode(output, skip_special_tokens=True)[0]
predicted_title = nltk.sent_tokenize(decoded_output.strip())[0]  # keep only the first sentence
print(predicted_title)
```
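
Note that `do_sample=True` combined with `num_beams=8` makes `generate` use beam-search multinomial sampling, so the restored text can vary between runs; setting `do_sample=False` gives deterministic beam search, which may be preferable for spelling restoration.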
## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: None
- training_precision: float16
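
The auto-generated card records no optimizer, so the block below is only a sketch of how a float16 Keras fine-tuning run of the base model could be set up with the framework versions listed below; the optimizer, learning rate, and dataset handle are assumptions, not reported values.

```python
# Sketch only: a plausible float16 Keras fine-tuning setup, not the author's script.
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

tf.keras.mixed_precision.set_global_policy("mixed_float16")  # training_precision: float16

base = "lcw99/t5-large-korean-text-summary"  # base model named in the description
tokenizer = AutoTokenizer.from_pretrained(base)
model = TFAutoModelForSeq2SeqLM.from_pretrained(base)

# Optimizer and learning rate are assumptions; the card records "optimizer: None".
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=5e-5))
# model.fit(train_dataset, epochs=...)  # train_dataset: tokenized (G2P, original) pairs
```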

### Training results

### Framework versions

- Transformers 4.22.1
- TensorFlow 2.10.0
- Datasets 2.5.1
- Tokenizers 0.12.1