Model Information
Model Details
Model Description
Llama3-ViettelSolutions-8B is a variant of the Meta Llama-3-8B model, continually pre-trained on a curated Vietnamese dataset and supervised fine-tuned on 5 million Vietnamese instruction samples.
- Developed by: Viettel Solutions
- Funded by: NVIDIA
- Model type: Autoregressive transformer model
- Language(s) (NLP): Vietnamese, English
- License: Llama 3 Community License
- Finetuned from model: meta-llama/Meta-Llama-3-8B
Uses
Example snippet for usage with Transformers:

```python
import transformers
import torch

model_id = "VTSNLP/Llama3-ViettelSolutions-8B"

# Load the model in bfloat16 and spread it across available devices
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

pipeline("Xin chào!")
```
Training Details
Training Data
- Continued pre-training dataset: curated Vietnamese dataset
- Supervised fine-tuning dataset: general instruction dataset
Training Procedure
Preprocessing
[More Information Needed]
Training Hyperparameters
- Training regime: bf16 mixed precision
- Data sequence length: 8192
- Tensor model parallel size: 4
- Pipeline model parallel size: 1
- Context parallel size: 1
- Micro batch size: 1
- Global batch size: 512
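In a NeMo/Megatron-style setup, these settings are linked: the world size factors into tensor, pipeline, context, and data parallelism, and the global batch size is the micro batch size times the data-parallel size times the gradient accumulation steps. A minimal sketch of that arithmetic, assuming the 4 x A100 setup listed under Technical Specifications (the gradient accumulation value is derived here, not stated in the card):

```python
# world_size = tensor_parallel * pipeline_parallel * context_parallel * data_parallel
# global_batch = micro_batch * data_parallel * grad_accum_steps

world_size = 4         # 4 x A100 80GB (see Technical Specifications)
tensor_parallel = 4    # Tensor model parallel size
pipeline_parallel = 1  # Pipeline model parallel size
context_parallel = 1   # Context parallel size
micro_batch = 1        # Micro batch size
global_batch = 512     # Global batch size

# Data-parallel replicas left over after model parallelism
data_parallel = world_size // (tensor_parallel * pipeline_parallel * context_parallel)

# Micro-batches accumulated per optimizer step to reach the global batch
grad_accum_steps = global_batch // (micro_batch * data_parallel)

print(data_parallel)     # 1
print(grad_accum_steps)  # 512
```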
Evaluation
Testing Data, Factors & Metrics
Testing Data
[More Information Needed]
Factors
[More Information Needed]
Metrics
[More Information Needed]
Results
[More Information Needed]
Summary
[More Information Needed]
Technical Specifications
- Compute infrastructure: NVIDIA DGX
- Hardware: 4 x A100 80GB
- Software: NeMo Framework
Citation
BibTeX:
[More Information Needed]
APA:
[More Information Needed]
More Information
[More Information Needed]
Model Card Authors
[More Information Needed]
Model Card Contact
[More Information Needed]