What is the training procedure?
#5, opened by Markgazol
Hello,
Amazing work on the hybrid model! I’m curious about the training process:
- What datasets were used for each of the four training stages?
- For each stage, which parameters were frozen and which were fine-tuned?
- Are you planning to release the training code?
Looking forward to your insights. Thanks for sharing this!
Found the paper: https://arxiv.org/pdf/2412.19048v1 :)