---
language:
- ko
tags:
- generated_from_keras_callback
model-index:
- name: t5-large-korean-P2G
  results: []
---

# t5-large-korean-P2G

This model fine-tunes lcw99/t5-large-korean-text-summary on 500,000 sentences from the 2021 National Institute of Korean Language newspaper corpus that were converted with g2pK. It takes G2P-transcribed (phonetic) Korean text as input and restores the original orthographic text (see the data-preparation sketch at the end of this card).

## Usage

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import nltk

nltk.download('punkt')

model_dir = "t5-large-korean-P2G"
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSeq2SeqLM.from_pretrained(model_dir)

# G2P-transcribed (phonetic) input sentence
text = "회새긴간 작까 김동시 걍심꼬백 뜽 새 소설집 뚜권 출간"

inputs = tokenizer(text, max_length=256, truncation=True, return_tensors="pt")
output = model.generate(**inputs, num_beams=8, do_sample=True, min_length=10, max_length=100)
decoded_output = tokenizer.batch_decode(output, skip_special_tokens=True)[0]

# Keep only the first sentence of the restored text
predicted_title = nltk.sent_tokenize(decoded_output.strip())[0]
print(predicted_title)
```

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: None
- training_precision: float16

### Training results

### Framework versions

- Transformers 4.22.1
- TensorFlow 2.10.0
- Datasets 2.5.1
- Tokenizers 0.12.1
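
## Data preparation sketch

The card above describes training on sentence pairs where the input is a g2pK (grapheme-to-phoneme) conversion of each original sentence. The following is a minimal, hypothetical sketch of how such pairs could be produced; it is not the released pipeline, and the file names and tab-separated layout are illustrative assumptions. Only the g2pK call itself follows that library's documented API.

```python
# Hypothetical sketch (not the released pipeline): build (phonetic input -> original target)
# pairs from a plain-text corpus using g2pK. File names are illustrative assumptions.
from g2pk import G2p

g2p = G2p()

with open("corpus_sentences.txt", encoding="utf-8") as src, \
        open("p2g_pairs.tsv", "w", encoding="utf-8") as out:
    for line in src:
        original = line.strip()
        if not original:
            continue
        phonetic = g2p(original)  # grapheme-to-phoneme conversion with g2pK
        # The model is trained in the reverse direction: phonetic -> original
        out.write(f"{phonetic}\t{original}\n")
```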