Model Information
Model Details
Model Description
Llama3-ViettelSolutions-8B is a variant of the Meta Llama-3-8B model. It underwent continued pre-training on a curated Vietnamese dataset, followed by supervised fine-tuning on 5 million samples of Vietnamese instruction data.
- Developed by: Viettel Solutions
- Funded by: NVIDIA
- Model type: Autoregressive transformer model
- Language(s) (NLP): Vietnamese, English
- License: Llama 3 Community License
- Finetuned from model: meta-llama/Meta-Llama-3-8B
Uses
Example snippet for usage with Transformers:
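import torch
import transformers

model_id = "VTSNLP/Llama3-ViettelSolutions-8B"

# Load the weights in bfloat16 and place them automatically on the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# The pipeline returns a list of dicts, each with a "generated_text" field.
print(pipeline("Xin chào!"))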
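The snippet above relies on the pipeline's default generation settings. The following is a minimal sketch of a wrapper that sets them explicitly and returns only the newly generated text. The function name `generate`, the constant `MODEL_ID`, and the chosen values (128 new tokens, greedy decoding) are illustrative assumptions and not settings recommended by the model authors.

```python
MODEL_ID = "VTSNLP/Llama3-ViettelSolutions-8B"


def generate(prompt, max_new_tokens=128):
    """Return the model's continuation of `prompt` (hypothetical helper)."""
    # Imported lazily so that defining this helper does not load heavy libraries.
    import torch
    import transformers

    pipe = transformers.pipeline(
        "text-generation",
        model=MODEL_ID,
        model_kwargs={"torch_dtype": torch.bfloat16},
        device_map="auto",
    )
    outputs = pipe(
        prompt,
        max_new_tokens=max_new_tokens,  # assumed cap on the response length
        do_sample=False,                # greedy decoding for repeatable output
        return_full_text=False,         # drop the prompt from the returned text
    )
    return outputs[0]["generated_text"]
```

Calling `generate("Xin chào!")` downloads the weights on first use and needs enough accelerator memory to hold an 8B-parameter model in bfloat16.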
Training Details
Training Data
Dataset for continued pre-training: Vietnamese curated dataset
Dataset for supervised fine-tuning: Instruct general dataset
Training Procedure
Preprocessing
[More Information Needed]
Training Hyperparameters
- Training regime: bf16 mixed precision
- Data sequence length: 8192
- Tensor model parallel size: 4
- Pipeline model parallel size: 1
- Context parallel size: 1
- Micro batch size: 1
- Global batch size: 512
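The hyperparameters above imply a few derived quantities. The sketch below computes them using the usual Megatron-style relations: data-parallel size equals the GPU count divided by the product of the model-parallel sizes, and the number of gradient-accumulation steps equals the global batch size divided by micro batch size times data-parallel size. It assumes the run used exactly the 4 GPUs listed under Technical Specifications. The function name `derive_batch_settings` is hypothetical and is not part of the NeMo Framework.

```python
def derive_batch_settings(num_gpus, tensor_parallel, pipeline_parallel,
                          context_parallel, micro_batch_size,
                          global_batch_size, seq_length):
    """Derive data-parallel size, accumulation steps, and tokens per step."""
    model_parallel = tensor_parallel * pipeline_parallel * context_parallel
    if num_gpus % model_parallel != 0:
        raise ValueError("GPU count must be divisible by TP * PP * CP")
    data_parallel = num_gpus // model_parallel

    samples_per_pass = micro_batch_size * data_parallel
    if global_batch_size % samples_per_pass != 0:
        raise ValueError("global batch must be divisible by micro batch * DP")

    return {
        "data_parallel_size": data_parallel,
        "grad_accumulation_steps": global_batch_size // samples_per_pass,
        "tokens_per_step": global_batch_size * seq_length,
    }


# Values taken from this model card; num_gpus=4 is an assumption based on
# the hardware listed under Technical Specifications.
settings = derive_batch_settings(
    num_gpus=4,
    tensor_parallel=4,
    pipeline_parallel=1,
    context_parallel=1,
    micro_batch_size=1,
    global_batch_size=512,
    seq_length=8192,
)
print(settings)
```

Under these assumptions the data-parallel size is 1, each optimizer step accumulates 512 micro-batches, and each step processes 4,194,304 tokens.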
Evaluation
Testing Data, Factors & Metrics
Testing Data
[More Information Needed]
Factors
[More Information Needed]
Metrics
[More Information Needed]
Results
[More Information Needed]
Summary
[More Information Needed]
Technical Specifications
Compute Infrastructure: NVIDIA DGX
Hardware: 4 x A100 80GB
Software: NeMo Framework
Citation
BibTeX:
[More Information Needed]
APA:
[More Information Needed]
More Information
[More Information Needed]
Model Card Authors
[More Information Needed]
Model Card Contact
[More Information Needed]