Model Card for GPT-NeoX-2.7B-Vietnamese-finetune
This model is pretrained and fine-tuned on Vietnamese-language data. It is based on GPT-NeoX, a large language model developed by EleutherAI.
Model Details
Training Data
- Pre-training: CulturaX Vietnamese Dataset (450GB) + AI-Hub Vietnamese Dataset (1.3GB) + Crawled Vietnamese Wikipedia Dataset (630MB) + viwik18 Dataset (1.27GB)
- Fine-tuning:
  - 12MB Vietnamese Question & Answer dataset
  - Vietnamese Alpaca (16,412 rows) + Vietnamese QA Dataset based on viwik18 (14,293 rows)
Training Hardware
Trained on A100 40GB GPU and 48-core CPU. Training took 18 hours to reach 10 epochs.
Hyperparameters
| Hyperparameter | Value |
|---|---|
| num_train_epochs | 2670182400 |
| train_batch_size | 2 |
| learning_rate | 0.0001 |
| warmup_steps | 1000 |
| weight_decay | 0 |
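The warmup and batch-size settings above can be made concrete with a little arithmetic. The following is a minimal sketch, not the author's training code: it assumes a linear warmup to the peak learning rate followed by a constant rate (the card does not state which scheduler was used), and it assumes `train_batch_size` is the effective batch size with no gradient accumulation. The function names `lr_at_step` and `steps_per_epoch` are hypothetical.

```python
import math

# Values copied from the hyperparameter table.
LEARNING_RATE = 0.0001
WARMUP_STEPS = 1000
TRAIN_BATCH_SIZE = 2


def lr_at_step(step, base_lr=LEARNING_RATE, warmup_steps=WARMUP_STEPS):
    """Learning rate at a given optimizer step.

    Assumption: linear warmup from 0 to base_lr, then constant.
    The actual schedule used for this model is not documented.
    """
    step = max(step, 0)
    if warmup_steps <= 0 or step >= warmup_steps:
        return base_lr
    return base_lr * (step / warmup_steps)


def steps_per_epoch(num_rows, batch_size=TRAIN_BATCH_SIZE):
    """Optimizer steps needed to see every row once.

    Assumption: one batch per step, no gradient accumulation.
    """
    return math.ceil(num_rows / batch_size)
```

Under these assumptions, the 30,705 fine-tuning rows listed under Training Data (16,412 + 14,293) would give 15,353 steps per epoch, so the 1,000 warmup steps would cover roughly the first 6.5% of one epoch.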
How to use
The model can be loaded using the AutoModelForCausalLM class:
from transformers import AutoTokenizer, AutoModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-finetune")
model = AutoModelForCausalLM.from_pretrained("eunyounglee/GPT-NeoX-2.7B-Vietnamese-finetune")
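Once loaded, the model can be used for text generation. The sketch below is one possible way to do this with the standard `generate` API from Transformers. The helper name `generate_text`, the sampling settings (`top_p`, `temperature`, `max_new_tokens`), and the use of a plain-text prompt are assumptions for illustration; the card does not document a prompt template or recommended decoding parameters.

```python
MODEL_ID = "eunyounglee/GPT-NeoX-2.7B-Vietnamese-finetune"


def generate_text(prompt, max_new_tokens=64):
    """Generate a continuation for `prompt` (hypothetical helper).

    The first call downloads the model weights from the Hugging Face Hub.
    """
    # Heavy dependencies are imported here so that defining the helper
    # does not itself load PyTorch or Transformers.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            max_new_tokens=max_new_tokens,
            do_sample=True,      # sampling settings are illustrative only
            top_p=0.9,
            temperature=0.7,
            # Reuse EOS as the padding token in case none is defined.
            pad_token_id=tokenizer.eos_token_id,
        )
    # Decode the single returned sequence, including the prompt tokens.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the model on first use):
# print(generate_text("Việt Nam là"))
```

Because sampling is enabled, repeated calls with the same prompt will generally produce different text; setting `do_sample=False` switches to greedy decoding for reproducible output.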