mlfoundations-dev/hp_ablations_qwen_scheduler_constant_dcftv1.2 • Text Generation • Updated Dec 5, 2024 • 114
mlfoundations-dev/hp_ablations_qwen_scheduler_cosine_warmup0.05_minlr1e-6_dcftv1.2 • Text Generation • Updated Dec 5, 2024 • 19
mlfoundations-dev/hp_ablations_qwen_scheduler_inverse_sqrt_dcftv1.2 • Text Generation • Updated Dec 5, 2024 • 115
mlfoundations-dev/hp_ablations_qwen_scheduler_cosine_warmup0.05_minlr5e-7_dcftv1.2 • Text Generation • Updated Dec 5, 2024 • 19
mlfoundations-dev/hp_ablations_qwen_scheduler_cosine_warmup0.05_dcftv1.2 • Text Generation • Updated Dec 5, 2024 • 148
mlfoundations-dev/hp_ablations_qwen_scheduler_cosine_warmup0.15_dcftv1.2 • Text Generation • Updated Dec 5, 2024 • 140
mlfoundations-dev/hp_ablations_qwen_scheduler_cosine_warmup0.05_minlr1e-7_dcftv1.2 • Text Generation • Updated Dec 5, 2024 • 143
mlfoundations-dev/hp_ablations_qwen_scheduler_cosine_warmup0.10_minlr1e-6_dcftv1.2 • Text Generation • Updated Dec 5, 2024 • 140
mlfoundations-dev/hp_ablations_qwen_scheduler_cosine_warmup0.10_minlr1e-7_dcftv1.2 • Text Generation • Updated Dec 5, 2024 • 114
mlfoundations-dev/hp_ablations_qwen_scheduler_cosine_warmup0.10_minlr5e-7_dcftv1.2 • Text Generation • Updated Dec 5, 2024 • 146
mlfoundations-dev/hp_ablations_qwen_scheduler_cosine_warmup0.10_dcftv1.2 • Text Generation • Updated Dec 5, 2024 • 114
mlfoundations-dev/hp_ablations_qwen_scheduler_linear_warmup0.10_dcftv1.2 • Text Generation • Updated Dec 5, 2024 • 18
mlfoundations-dev/hp_ablations_qwen_scheduler_linear_warmup0.05_dcftv1.2 • Text Generation • Updated Dec 5, 2024 • 145