---
library_name: transformers
license: llama3
datasets:
- VTSNLP/vietnamese_curated_dataset
language:
- vi
- en
base_model:
- meta-llama/Meta-Llama-3-8B
pipeline_tag: text-generation
---
# Model Information

## Model Details

### Model Description
Llama3-ViettelSolutions-8B is a variant of Meta's Llama-3-8B model, continually pre-trained on the Vietnamese curated dataset and then supervised fine-tuned on 5 million Vietnamese instruction samples.
- Developed by: Viettel Solutions
- Funded by: NVIDIA
- Model type: Autoregressive transformer model
- Language(s) (NLP): Vietnamese, English
- License: Llama 3 Community License
- Finetuned from model: meta-llama/Meta-Llama-3-8B
## Uses

Example snippet for usage with Transformers:
```python
import transformers
import torch

model_id = "VTSNLP/Llama3-ViettelSolutions-8B"

pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)
print(pipeline("Xin chào!"))
```
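The model card does not state which chat template the supervised fine-tuning stage used. If it follows the Llama 3 instruct convention (an assumption, not confirmed by this card), a chat-style prompt string could be assembled like this before passing it to the pipeline:

```python
def build_llama3_prompt(system: str, user: str) -> str:
    # Llama 3 instruct-style prompt layout (assumed; the card does not
    # specify the chat format used during fine-tuning).
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_llama3_prompt(
    "Bạn là trợ lý AI hữu ích.",  # "You are a helpful AI assistant."
    "Xin chào!",                  # "Hello!"
)
print(prompt)
```

If the repository ships a `tokenizer_config.json` with a `chat_template`, prefer `tokenizer.apply_chat_template(...)` over hand-built strings.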
## Training Details

### Training Data

- Dataset for continued pre-training: Vietnamese curated dataset
- Dataset for supervised fine-tuning: general instruction-following dataset
### Training Procedure

#### Preprocessing

[More Information Needed]

#### Training Hyperparameters
- Training regime: bf16 mixed precision
- Data sequence length: 8192
- Tensor model parallel size: 4
- Pipeline model parallel size: 1
- Context parallel size: 1
- Micro batch size: 1
- Global batch size: 512
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

[More Information Needed]

#### Factors

[More Information Needed]

#### Metrics

[More Information Needed]

### Results

[More Information Needed]

#### Summary

[More Information Needed]
## Technical Specifications

- Compute Infrastructure: NVIDIA DGX
- Hardware: 4 x A100 80GB
- Software: NeMo Framework
## Citation

**BibTeX:**

[More Information Needed]

**APA:**

[More Information Needed]

## More Information

[More Information Needed]

## Model Card Authors

[More Information Needed]

## Model Card Contact

[More Information Needed]