metadata
library_name: transformers
license: llama3
datasets:
  - VTSNLP/vietnamese_curated_dataset
language:
  - vi
  - en
base_model:
  - meta-llama/Meta-Llama-3-8B
pipeline_tag: text-generation

Model Information

Model Details

Model Description

Llama3-ViettelSolutions-8B is a variant of the Meta Llama-3-8B model. It was further pre-trained (continued pre-training) on the Vietnamese curated dataset and then supervised fine-tuned on 5 million samples of Vietnamese instruction data.

  • Developed by: Viettel Solutions
  • Funded by: NVIDIA
  • Model type: Autoregressive transformer model
  • Language(s) (NLP): Vietnamese, English
  • License: Llama 3 Community License
  • Finetuned from model: meta-llama/Meta-Llama-3-8B

Uses

Example snippet for usage with Transformers:

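import torch
import transformers

model_id = "VTSNLP/Llama3-ViettelSolutions-8B"

# Build a text-generation pipeline; bfloat16 halves memory relative to float32,
# and device_map="auto" places the weights on the available device(s).
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Generate a continuation of a Vietnamese greeting ("Hello!") and print it.
outputs = pipeline("Xin chào!", max_new_tokens=64)
print(outputs[0]["generated_text"])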
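Before loading the model, it can help to estimate how much memory the weights alone will need in bfloat16. The sketch below is a back-of-the-envelope calculation, not a measurement: the parameter count of roughly 8.03 billion is an assumption based on the base model's nominal size, and the estimate ignores activations, the KV cache, and framework overhead.

```python
# Rough estimate of weight memory for different dtypes.
# ASSUMPTION: ~8.03e9 parameters (nominal size of an 8B Llama 3 model);
# this figure is not stated in the model card.
NUM_PARAMS = 8.03e9

BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the memory needed to hold the weights, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 1024**3


bf16_gib = weight_memory_gib(NUM_PARAMS, "bfloat16")
fp32_gib = weight_memory_gib(NUM_PARAMS, "float32")

print(f"bfloat16 weights: ~{bf16_gib:.1f} GiB")
print(f"float32 weights:  ~{fp32_gib:.1f} GiB")
```

Under these assumptions the bfloat16 weights need about half the memory of float32, which is why the snippet above passes torch_dtype=torch.bfloat16.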

Training Details

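Training Data

Continued pre-training used the VTSNLP/vietnamese_curated_dataset. Supervised fine-tuning used 5 million samples of Vietnamese instruction data.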

Training Procedure

Training Hyperparameters

  • Training regime: bf16 mixed precision
  • Data sequence length: 8192
  • Tensor model parallel size: 4
  • Pipeline model parallel size: 1
  • Context parallel size: 1
  • Micro batch size: 1
  • Global batch size: 512
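The parallelism and batch settings above determine how many gradient-accumulation steps make up one optimizer step. The following sketch works through that arithmetic using the usual Megatron-style relationship (data-parallel size = GPUs / (tensor × pipeline × context parallel size)). It assumes the whole job ran on the 4 GPUs listed under Technical Specifications and that every sample fills the full 8192-token sequence; neither is confirmed by the card, so treat the token count as an upper bound.

```python
# Values quoted from the hyperparameter list above.
SEQ_LENGTH = 8192
TENSOR_PARALLEL = 4
PIPELINE_PARALLEL = 1
CONTEXT_PARALLEL = 1
MICRO_BATCH_SIZE = 1
GLOBAL_BATCH_SIZE = 512

# ASSUMPTION: training used exactly the 4 GPUs named in the card.
NUM_GPUS = 4

# GPUs that cooperate on a single model replica.
model_parallel = TENSOR_PARALLEL * PIPELINE_PARALLEL * CONTEXT_PARALLEL
assert NUM_GPUS % model_parallel == 0, "GPU count must be divisible by model-parallel size"

# Number of independent model replicas processing different data.
data_parallel = NUM_GPUS // model_parallel

# Samples processed per forward/backward pass across all replicas.
samples_per_pass = MICRO_BATCH_SIZE * data_parallel
assert GLOBAL_BATCH_SIZE % samples_per_pass == 0

# Micro-batches accumulated before each optimizer step.
grad_accum_steps = GLOBAL_BATCH_SIZE // samples_per_pass

# Upper bound on tokens per optimizer step (assumes full-length sequences).
tokens_per_step = GLOBAL_BATCH_SIZE * SEQ_LENGTH

print(f"data-parallel size:        {data_parallel}")
print(f"gradient-accumulation:     {grad_accum_steps}")
print(f"tokens per optimizer step: {tokens_per_step:,}")
```

With these numbers there is a single model replica sharded across all four GPUs, so the global batch of 512 is reached entirely through gradient accumulation.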

Technical Specifications

  • Compute Infrastructure: NVIDIA DGX

  • Hardware: 4 x A100 80GB

  • Software: NeMo Framework
