Edit model card

KOEN-T5-SMALL-V0

This model is for Korean to English Machine Translator, which is based on T5-small architecture, but trained from scratch.

Code

The training code is from my lecture(LLM์„ ์œ„ํ•œ ๊น€๊ธฐํ˜„์˜ NLP EXPRESS), which is published on FastCampus. You can check the training code in this github repo.

Dataset

The training dataset for this model is mainly from AI-Hub. The dataset consists of 11M parallel samples.

Tokenizer

I use Byte-level BPE tokenizer for both source and target language. Since it covers both languages, tokenizer vocab size is 60k.

Architecture

The model architecture is based on T5-small, which is popular encoder-decoder model architecture. Please, note that this model is trained from-scratch, not fine-tuned.

Evaluation

I conducted the evaluation with 5 different test sets. Following figure shows BLEU scores on each test set.

DEEPCL model is private version of this model, which is trained on much more data.

Contact

Kim Ki Hyun (nlp.with.deep.learning@gmail.com)

Downloads last month
7
Safetensors
Model size
74.8M params
Tensor type
F32
ยท
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.