---
license: apache-2.0
datasets:
- opencsg/smoltalk-chinese
language:
- zh
base_model:
- opencsg/csg-wukong-ablation-chinese-fineweb-edu
---
* Using ``opencsg/csg-wukong-2b-chinese-fineweb-edu`` as the base model, we fine-tune it on ``smoltalk-chinese`` for 2 epochs.
* Hyperparameters: learning rate = 3e-4; global batch size = 32; LR scheduler = cosine.
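
The hyperparameters above can be illustrated with a minimal sketch of how a cosine schedule might evolve over the run. This is not the training code used for this model: the absence of warmup, the decay floor of 0, the rounding of partial batches, and the dataset size in the demo are all assumptions made for illustration. Only the peak learning rate, global batch size, and epoch count come from the card.

```python
import math

PEAK_LR = 3e-4          # from the model card
GLOBAL_BATCH_SIZE = 32  # from the model card
NUM_EPOCHS = 2          # from the model card


def cosine_lr(step: int, total_steps: int, peak_lr: float = PEAK_LR) -> float:
    """Cosine decay from peak_lr down to 0.

    Assumptions (not stated in the card): no warmup phase, minimum LR of 0.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


def total_optimizer_steps(
    num_examples: int,
    epochs: int = NUM_EPOCHS,
    global_batch_size: int = GLOBAL_BATCH_SIZE,
) -> int:
    """Number of optimizer steps, assuming the last partial batch is kept."""
    return math.ceil(num_examples / global_batch_size) * epochs


if __name__ == "__main__":
    # Hypothetical dataset size, used only to make the demo concrete.
    steps = total_optimizer_steps(num_examples=100_000)
    for s in (0, steps // 2, steps):
        print(f"step {s:>5}: lr = {cosine_lr(s, steps):.2e}")
```

The global batch size is the product of the per-device batch size, the number of devices, and the gradient accumulation steps, so a value of 32 can be reached with several different hardware setups.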