---
library_name: transformers
tags:
- trl
- sft
base_model:
- meta-llama/Llama-3.2-1B-Instruct
datasets:
- ngxson/MiniThinky-dataset
---
# MiniThinky 1B
My first attempt at fine-tuning a small model to add reasoning capability.

The chat template is the same as Llama 3, but the response will be structured as follows:
```
<|thinking|>{thinking_process}
<|answer|>
{real_answer}
```
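
As a rough usage sketch (the repo id and prompt below are placeholders, and whether `<|thinking|>` / `<|answer|>` are registered as special tokens is an assumption), the thinking trace and the final answer can be split on the `<|answer|>` marker:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ngxson/MiniThinky-1B"  # placeholder: replace with this repo's actual id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Standard Llama 3 chat template
messages = [{"role": "user", "content": "How many r's are in the word 'strawberry'?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

output_ids = model.generate(input_ids, max_new_tokens=512)
# Keep special tokens so the <|thinking|> / <|answer|> markers survive decoding
response = tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=False)

# Split the response into the thinking trace and the real answer
thinking, _, answer = response.partition("<|answer|>")
print("Thinking:", thinking.replace("<|thinking|>", "").strip())
print("Answer:", answer.replace("<|eot_id|>", "").strip())
```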
TODO: include more info here + maybe run some benchmarks? (Please open a discussion if you're interested)