---
license: cc-by-nc-nd-3.0
---
# VyLinh-Lite: Vietnamese 3B Reasoning Language Model
## Model Details
- **Language(s)**: Vietnamese
- **Base Model**: Qwen2.5-3B
- **Model Size**: 3 billion parameters
## Intended Use
- **Primary intended uses**: Vietnamese language understanding, reasoning, and generation
- **Primary intended users**: Researchers, developers, and practitioners working with Vietnamese language AI
- **Out-of-scope use cases**: Production deployments without additional safety measures
## Training Details
### Training Data
The model was trained in several stages of distillation and adaptation:
1. Initial knowledge distillation from Llama 3.1 405B
2. Architecture adaptation using mergekit-tokensurgeon
3. Secondary distillation to Qwen architecture
4. Parallel distillation from Qwen2-72B
5. Final fusion and fine-tuning using EvolKit dataset
### Training Procedure
#### Distillation Process
1. **Logit Distillation**
   - Source: Llama 3.1 405B
   - Method: Offline distillation
   - Storage: Top-K logits preservation
2. **Cross-Architecture Adaptation**
   - Tool: mergekit-tokensurgeon
   - Process: Vocabulary alignment with Llama 3.1 405B
3. **Architecture Transformation**
   - Target: 3B parameter configuration
   - Method: Progressive knowledge transfer
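
The offline top-K logit distillation described in step 1 can be illustrated with a minimal, dependency-free sketch. It is not the training code used for this model. The function names (`top_k`, `top_k_distill_loss`), the `temperature` parameter, and the choice to renormalize both distributions over the stored token ids are assumptions made for illustration. Scoring the student on the teacher's stored token ids also assumes that both models index the same vocabulary, which is presumably what the vocabulary-alignment step is for.

```python
import math


def top_k(logits, k):
    """Return (indices, values) of the k largest logits, largest first."""
    idx = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    return idx, [logits[i] for i in idx]


def log_softmax(values):
    """Numerically stable log-softmax over a list of floats."""
    m = max(values)
    log_z = m + math.log(sum(math.exp(v - m) for v in values))
    return [v - log_z for v in values]


def top_k_distill_loss(teacher_idx, teacher_vals, student_logits, temperature=1.0):
    """KL(teacher || student) restricted to the stored teacher top-K token ids.

    Assumption: both distributions are renormalized over the K stored ids,
    and teacher and student share the same token ids.
    """
    t_logp = log_softmax([v / temperature for v in teacher_vals])
    s_logp = log_softmax([student_logits[i] / temperature for i in teacher_idx])
    return sum(math.exp(t) * (t - s) for t, s in zip(t_logp, s_logp))


# Offline phase: run the teacher once and keep only K logits per position.
teacher_logits = [2.0, 0.5, -1.0, 3.0, 0.0]  # toy 5-token vocabulary
stored_idx, stored_vals = top_k(teacher_logits, k=2)

# Training phase: compare the student against the stored entries only.
loss_same = top_k_distill_loss(stored_idx, stored_vals, teacher_logits)
loss_uniform = top_k_distill_loss(stored_idx, stored_vals, [0.0] * 5)
print(stored_idx, loss_same, loss_uniform)
```

Keeping only K entries per token position is what makes offline distillation from a very large teacher practical to store: the teacher is run once, and the student trains against the saved values without the teacher being loaded.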
#### Fine-tuning
- **Final Stage**: Fine-tuning on the EvolKit dataset
- **Optimization**: Focus on coherence and reasoning capabilities
- **Vocabulary**: Restoration of the Qwen-native vocabulary
## Performance and Limitations
### Benchmarks
Benchmark results are pending and will be added to this section.
### Limitations
- Model size constraints may impact certain complex reasoning tasks
- Performance may vary on domain-specific Vietnamese content
- Context window (32K tokens) is limited compared to some larger models
## Ethical Considerations
- **Data Bias**: May reflect biases present in training data
- **Environmental Impact**: Reduced compared to larger models due to efficient distillation
- **Societal Impact**: Potential influence on Vietnamese language technology landscape
## Technical Specifications
- **Parameter Count**: 3 billion
- **Context Window**: 32K
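
As a rough guide to hardware requirements, the parameter count above translates into weight memory as sketched below. The figures are back-of-the-envelope estimates only. They use the nominal 3 billion parameters stated on this card (the exact count may differ) and cover the weights alone, not the KV cache or activations, which grow with the context length. The helper name `weight_memory_gib` and the list of precisions are illustrative assumptions.

```python
# Nominal parameter count from this model card; the exact count may differ.
PARAMS = 3_000_000_000

# Bytes needed to store one parameter at common precisions.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gib(num_params, dtype):
    """Estimate memory for the weights alone, in GiB (2**30 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in BYTES_PER_PARAM:
    print(f"{dtype:>8}: {weight_memory_gib(PARAMS, dtype):.1f} GiB")
```

Under these assumptions the weights need about 5.6 GiB in bfloat16, so the actual footprint at inference time will be higher once the cache for long prompts is included.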