Model Information

Model Details

Model Description

Llama3-ViettelSolutions-8B is a variant of the Meta Llama-3-8B model that underwent continued pre-training on a curated Vietnamese dataset, followed by supervised fine-tuning on 5 million samples of Vietnamese instruction data.

  • Developed by: Viettel Solutions
  • Funded by: NVIDIA
  • Model type: Autoregressive transformer model
  • Language(s) (NLP): Vietnamese, English
  • License: Llama 3 Community License
  • Finetuned from model: meta-llama/Meta-Llama-3-8B

Uses

Example snippet for usage with Transformers:

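import torch
import transformers

model_id = "VTSNLP/Llama3-ViettelSolutions-8B"

# Build a text-generation pipeline with the weights loaded in bfloat16.
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",  # requires the accelerate package
)

# "Xin chào!" is Vietnamese for "Hello!".
outputs = pipeline("Xin chào!")
print(outputs[0]["generated_text"])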
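
The snippet above requests `torch.bfloat16`, while the published checkpoint is listed as F32 with 8.03B parameters. As a rough guide to what that choice means for memory, the sketch below estimates the size of the weights alone at each precision. The helper name `weight_memory_gb` is hypothetical, and the estimate deliberately ignores activations, the KV cache, and framework overhead, so actual usage will be higher.

```python
# Rough weight-only memory estimate; not a measurement of real GPU usage.
# Bytes per parameter for common floating-point storage types.
BYTES_PER_PARAM = {
    "float32": 4,
    "bfloat16": 2,
    "float16": 2,
}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Return the approximate size of the weights in decimal gigabytes (1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NUM_PARAMS = 8.03e9  # parameter count reported for this checkpoint

for dtype in ("float32", "bfloat16"):
    print(f"{dtype}: ~{weight_memory_gb(NUM_PARAMS, dtype):.2f} GB")
```

Under these assumptions, loading in bfloat16 needs about half the weight memory of float32, which is why the usage snippet passes `torch_dtype` explicitly.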

Training Details

Training Data

Training Procedure

Preprocessing

[More Information Needed]

Training Hyperparameters

  • Training regime: bf16 mixed precision
  • Data sequence length: 8192
  • Tensor model parallel size: 4
  • Pipeline model parallel size: 1
  • Context parallel size: 1
  • Micro batch size: 1
  • Global batch size: 512
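
The hyperparameters above imply a few derived quantities that the card does not state. The sketch below works them out using the usual Megatron-style relationship (global batch = micro batch × data-parallel size × gradient-accumulation steps). It assumes that the 4 A100 GPUs listed under Technical Specifications made up the entire training job; if more nodes were used, the data-parallel size and accumulation steps would differ. The class and function names are illustrative and are not part of NeMo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParallelConfig:
    num_gpus: int            # assumption: total GPUs in the job
    tensor_parallel: int
    pipeline_parallel: int
    context_parallel: int
    micro_batch_size: int
    global_batch_size: int
    seq_length: int


def derive_schedule(cfg: ParallelConfig) -> dict:
    """Derive data-parallel size, accumulation steps, and tokens per optimizer step."""
    model_parallel = cfg.tensor_parallel * cfg.pipeline_parallel * cfg.context_parallel
    if cfg.num_gpus % model_parallel != 0:
        raise ValueError("num_gpus must be divisible by TP * PP * CP")
    data_parallel = cfg.num_gpus // model_parallel

    samples_per_micro_step = cfg.micro_batch_size * data_parallel
    if cfg.global_batch_size % samples_per_micro_step != 0:
        raise ValueError("global batch must be divisible by micro batch * DP")

    return {
        "data_parallel": data_parallel,
        "grad_accum_steps": cfg.global_batch_size // samples_per_micro_step,
        "tokens_per_step": cfg.global_batch_size * cfg.seq_length,
    }


# Values taken from the list above; num_gpus=4 is the stated assumption.
cfg = ParallelConfig(
    num_gpus=4,
    tensor_parallel=4,
    pipeline_parallel=1,
    context_parallel=1,
    micro_batch_size=1,
    global_batch_size=512,
    seq_length=8192,
)
print(derive_schedule(cfg))
```

With these inputs, all four GPUs hold one tensor-parallel replica, so each optimizer step would accumulate 512 micro-batches and cover up to 512 × 8192 = 4,194,304 tokens.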

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

[More Information Needed]

Technical Specifications

  • Compute Infrastructure: NVIDIA DGX

  • Hardware: 4 x A100 80GB

  • Software: NeMo Framework

Citation

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

More Information

[More Information Needed]

Model Card Authors

[More Information Needed]

Model Card Contact

[More Information Needed]

Weights: Safetensors, 8.03B params, tensor type F32