Insights from the Inverse: Reconstructing LLM Training Goals Through Inverse RL Paper • 2410.12491 • Published Oct 16, 2024 • 4
jaredjoss/pythia-410m-roberta-lr_8e7-kl_01-steps_12000-rlhf-model Text Generation • Updated Aug 6, 2024 • 15
jaredjoss/pythia-410m-roberta-lr_8e7-kl_005-steps_2000-rlhf-model Text Generation • Updated Apr 16, 2024 • 10
jaredjoss/pythia-160m-roberta-lr_1e6-kl_0035-steps_1000-rlhf-model Text Generation • Updated Apr 16, 2024 • 11
jaredjoss/pythia-70m-roberta-lr_3e6-kl_0035-steps_600-rlhf-model Text Generation • Updated Apr 16, 2024 • 12