vilm
/

VyLinh-Lite-preview / README.md
qnguyen3's picture
Update README.md
f1aaf57 verified
metadata
license: cc-by-nc-nd-3.0

VyLinh-Lite: Vietnamese 3B Reasoning Language Model

Model Details

  • Language(s): Vietnamese
  • Base Model: Qwen2.5-3B
  • Model Size: 3 billion parameters

Intended Use

  • Primary intended uses: Vietnamese language understanding, reasoning, and generation
  • Primary intended users: Researchers, developers, and practitioners working with Vietnamese language AI
  • Out-of-scope use cases: Production deployments without additional safety measures

Training Details

Training Data

The model underwent a sophisticated training process involving multiple stages of distillation and adaptation:

  1. Initial knowledge distillation from Llama 3.1 405B
  2. Architecture adaptation using mergekit-tokensurgeon
  3. Secondary distillation to Qwen architecture
  4. Parallel distillation from Qwen2-72B
  5. Final fusion and fine-tuning using EvolKit dataset

Training Procedure

Distillation Process

  1. Logit Distillation

    • Source: Llama 3.1 405B
    • Method: Offline distillation
    • Storage: Top-K logits preservation
  2. Cross-Architecture Adaptation

    • Tool: mergekit-tokensurgeon
    • Process: Vocabulary alignment with Llama 3.1 405B
  3. Architecture Transformation

    • Target: 3B parameter configuration
    • Method: Progressive knowledge transfer

Fine-tuning

  • Final Stage: EvolKit dataset utilization
  • Optimization: Focus on coherence and reasoning capabilities
  • Vocabulary: Qwen-native vocabulary restoration

Performance and Limitations

Benchmarks

Will be updated throughout the day

Limitations

  • Model size constraints may impact certain complex reasoning tasks
  • Performance may vary on domain-specific Vietnamese content
  • Limited context window compared to larger models

Ethical Considerations

  • Data Bias: May reflect biases present in training data
  • Environmental Impact: Reduced compared to larger models due to efficient distillation
  • Societal Impact: Potential influence on Vietnamese language technology landscape

Technical Specifications

  • Parameter Count: 3 billion
  • Context Window: 32K