Base
Base model please
Instruct is not useful, I can't tune this
@ehartford I was actually going to make bases for all 3 sizes (11b, 13b, and 16b) and start fine-tuning them.
Are you going to fine-tune as well? (once I finish my other fine-tunes I'll make the base ones)
I will not tune them directly
My intent is to use it to initialize the expert weights of a MoE,
Then pretrain and fine-tune on top of that to produce a dolphin MoE
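Roughly, the idea is to copy the dense FFN weights into each expert of a MoE layer as a starting point before continued pretraining. A minimal sketch, assuming a simple top-1-routed MoE layer in PyTorch (the layer structure and names here are hypothetical, not the actual dolphin MoE code):

```python
import torch
import torch.nn as nn

class MoELayer(nn.Module):
    """Toy MoE layer: a router plus n identical-shaped FFN experts."""

    def __init__(self, d_model: int, d_ff: int, n_experts: int):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_ff),
                nn.GELU(),
                nn.Linear(d_ff, d_model),
            )
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Top-1 routing for simplicity: each token goes to its best expert.
        weights = torch.softmax(self.router(x), dim=-1)
        idx = weights.argmax(dim=-1)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            mask = idx == e
            if mask.any():
                out[mask] = expert(x[mask]) * weights[mask, e].unsqueeze(-1)
        return out


def init_experts_from_dense(moe_layer: MoELayer, dense_ffn_state: dict) -> None:
    """Seed every expert with the FFN weights of one dense (merged) model."""
    for expert in moe_layer.experts:
        expert.load_state_dict(dense_ffn_state)


# Usage: pretend this Sequential is the FFN block of a merged dense checkpoint.
dense_ffn = nn.Sequential(nn.Linear(16, 64), nn.GELU(), nn.Linear(64, 16))
moe = MoELayer(d_model=16, d_ff=64, n_experts=4)
init_experts_from_dense(moe, dense_ffn.state_dict())
```

After this initialization all experts start identical; the subsequent pretraining and fine-tuning is what lets them specialize.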
I think proving your process by doing it with an Instruct model first is a great strategy to show that the output is coherent and the method is sound
Thank you!
@ehartford sounds really interesting! I'd love to see how they work as experts. Here are the merges based on the base Llama-3-8B: