---
datasets:
- mxz/CValues_DPO
language:
- zh
- en
metrics:
- perplexity
pipeline_tag: text-generation
tags:
- DPO
- finetune
- alignment
- LoRA
- Llama-3
---
# About mxz-llama-3-8B-sft
This model was trained with SFT and PPO. It supports coding, reasoning, and Chinese question answering (QA).
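Below is a minimal usage sketch with 🤗 Transformers. The repo id `mxz/mxz-llama-3-8B-sft` is assumed for illustration only; replace it with the actual model id on the Hub.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id for illustration; replace with the actual model id on the Hub.
model_id = "mxz/mxz-llama-3-8B-sft"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build a prompt with the Llama-3 chat template and generate a reply.
messages = [{"role": "user", "content": "用中文简要介绍一下大语言模型的对齐方法。"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```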
## Evaluation

Results:
| Model | MMLU | C-EVAL | C-MMLU |
|---|---|---|---|
| Llama-3-8B | 55.5 | 47.0 | 48.0 |
| Llama-3-8B-Instruct | 60.1 | 49.7 | 49.3 |
| Llama-3-8B-dpo | 62.2 | 49.9 | 49.4 |
- Llama-3-8B evaluation results are taken from ymcui/Chinese-LLaMA-Alpaca-3.
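The metadata above lists perplexity as a metric but does not include a script. The following is a minimal sketch of how perplexity on a short text could be measured with Transformers; the repo id is the same assumption as in the usage example.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id for illustration; replace with the actual model id on the Hub.
model_id = "mxz/mxz-llama-3-8B-sft"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

text = "大语言模型可以通过 DPO 等方法与人类偏好对齐。"
enc = tokenizer(text, return_tensors="pt").to(model.device)

with torch.no_grad():
    # The model shifts labels internally, so `loss` is the mean
    # next-token negative log-likelihood over the sequence.
    loss = model(**enc, labels=enc["input_ids"]).loss

print(f"perplexity: {torch.exp(loss).item():.2f}")
```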