Base Model: Llama 2 7B Chat HF
- Extended the vocabulary to 44,800 tokens for better Vietnamese coverage
- Continual pre-training on >2B Vietnamese tokens
- Training profile: LoRA (rank=32, alpha=128, fp16), 1 epoch, block size = 512; took ~300 GPU hours on an RTX 4090 24GB (a configuration sketch follows this list)
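
Below is a minimal sketch of the stated training profile using the Transformers + PEFT stack. The base repo id, dtype handling, and everything beyond the listed hyperparameters (rank=32, alpha=128, fp16, vocab size 44,800) are assumptions for illustration, not the exact training script.

```python
# Sketch of the LoRA continual pre-training setup; hyperparameters beyond
# those listed in the card (rank, alpha, fp16, vocab size) are assumptions.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "meta-llama/Llama-2-7b-chat-hf"  # base checkpoint

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype=torch.float16,  # fp16 training as listed in the profile
)

# Resize embeddings after extending the vocabulary to 44,800 tokens.
model.resize_token_embeddings(44800)

# LoRA settings matching the stated profile: rank=32, alpha=128.
lora_config = LoraConfig(
    r=32,
    lora_alpha=128,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```

Pre-training then proceeds over the Vietnamese corpus packed into 512-token blocks, as noted above.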
Best used for:
- Further training / fine-tuning for Vietnamese tasks (see the usage sketch below)
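
A minimal loading/inference sketch, assuming the checkpoint is published on the Hugging Face Hub; the repo id below is a placeholder, not the actual model name.

```python
# Load the continually pre-trained model for inference or further fine-tuning.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/llama2-7b-chat-vietnamese"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

prompt = "Xin chào, bạn có thể giới thiệu về Việt Nam không?"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```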