Bert2gpt (Encoder-Decoder) on Liputan6 100k dataset

Dataset source: https://huggingface.co/datasets/fajrikoto/id_liputan6
Base model fine-tuned as encoder:
https://huggingface.co/cahya/bert-base-indonesian-1.5G
Base model fine-tuned as decoder:
https://huggingface.co/cahya/gpt2-small-indonesian-522M

Trained on 1x RTX 3090 for 8 epochs (with an EarlyStopping callback)

Train logs, metrics, and params: https://wandb.ai/willy030125/huggingface/runs/9nt3z9dh
https://www.comet.com/willy030125/huggingface/404a9a8abbd84ed5931ff944746df83c
Eval results and perplexity: see eval_results.json

Usage:

from transformers import BertTokenizer, GPT2Tokenizer, EncoderDecoderModel
encoder_tokenizer = BertTokenizer.from_pretrained("cahya/bert-base-indonesian-1.5G")
decoder_tokenizer = GPT2Tokenizer.from_pretrained("cahya/gpt2-small-indonesian-522M")
model = EncoderDecoderModel.from_pretrained("Willy030125/Bert2gpt_Liputan6_100k_8epoch")
Model size: 263M params (F32, Safetensors format)