---
license: apache-2.0
datasets:
- opencsg/smoltalk-chinese
language:
- zh
base_model:
- opencsg/csg-wukong-ablation-chinese-fineweb-edu
---

* Using ``opencsg/csg-wukong-2b-chinese-fineweb-edu`` as the base model, we fine-tune it on ``smoltalk-chinese`` for 2 epochs.
* Learning rate = 3e-4; global batch size = 32; LR scheduler = cosine.
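
To make the hyperparameters above concrete, here is a minimal sketch of how a cosine schedule with peak learning rate 3e-4 behaves over a 2-epoch run at global batch size 32. It is not the training code used for this model. The function names, the absence of warmup, the decay floor of 0, and the dataset size in the usage lines are all assumptions made for illustration.

```python
import math

PEAK_LR = 3e-4          # from the model card
GLOBAL_BATCH_SIZE = 32  # from the model card
NUM_EPOCHS = 2          # from the model card


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = PEAK_LR, min_lr: float = 0.0) -> float:
    """Cosine decay from peak_lr to min_lr over total_steps.

    Assumes no warmup and a floor of min_lr; the actual run may differ.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Clamp progress to [0, 1] so out-of-range steps stay well defined.
    progress = min(max(step / total_steps, 0.0), 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


def total_optimizer_steps(num_examples: int,
                          global_batch_size: int = GLOBAL_BATCH_SIZE,
                          num_epochs: int = NUM_EPOCHS) -> int:
    """Optimizer steps for the run, counting a partial final batch per epoch."""
    steps_per_epoch = math.ceil(num_examples / global_batch_size)
    return steps_per_epoch * num_epochs


if __name__ == "__main__":
    # Hypothetical dataset size, chosen only for illustration.
    n_steps = total_optimizer_steps(num_examples=64_000)
    for s in (0, n_steps // 2, n_steps):
        print(s, cosine_lr(s, n_steps))
```

The global batch size is the product of per-device batch size, number of devices, and gradient accumulation steps, so the same value of 32 can be reached with different hardware setups without changing the schedule.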