TRL

https://github.com/huggingface/trl

Recent Activity

qgallouedec updated a dataset 13 days ago: trl-lib/documentation-images

qgallouedec updated a dataset 19 days ago: trl-lib/ultrafeedback-prompt

qgallouedec updated a model 22 days ago: trl-lib/Qwen2-0.5B-Reward-Math-Sheperd

Organization Card

This is the organization grouping all the models and datasets used in the TRL library.

spaces 2

TextEnvironments ⚒️ • Running

StackLLaMa 🦙 • Runtime error • 213

datasets 19

trl-lib/documentation-images • Updated 13 days ago • 1 • 5.47k

trl-lib/ultrafeedback-prompt • Updated 19 days ago • 39.8k • 871 • 3

trl-lib/math_shepherd • Updated Nov 28, 2024 • 445k • 372 • 1

trl-lib/alpaca-cleaned • Updated Nov 28, 2024 • 51.8k • 51

trl-lib/hh-rlhf-helpful-base • Updated Nov 21, 2024 • 46.2k • 111

trl-lib/prm800k • Updated Nov 20, 2024 • 41.2k • 64 • 1

trl-lib/rlaif-v • Updated Sep 27, 2024 • 83.1k • 148 • 3

trl-lib/Capybara-Preferences • Updated Sep 19, 2024 • 15.4k • 78

trl-lib/Capybara • Updated Sep 19, 2024 • 16k • 1.08k • 1

trl-lib/tldr • Updated Sep 12, 2024 • 130k • 348

Collections 2

teknium/OpenHermes-2.5-Mistral-7B

Intel/orca_dpo_pairs

trl-lib/OpenHermes-2-Mistral-7B-ipo-beta-0.1-steps-200

trl-lib/OpenHermes-2-Mistral-7B-ipo-beta-0.2-steps-200

trl-lib/pythia-1b-deduped-tldr-online-dpo

trl-lib/pythia-1b-deduped-tldr-sft

trl-lib/pythia-6.9b-deduped-tldr-online-dpo

trl-lib/pythia-2.8b-deduped-tldr-sft

models 81

trl-lib/Qwen2-0.5B-Reward-Math-Sheperd

trl-lib/Qwen2-0.5B-XPO

trl-lib/Qwen2-0.5B-OnlineDPO

trl-lib/Qwen2-0.5B-KTO

trl-lib/Qwen2-0.5B-ORPO

trl-lib/Qwen2-0.5B-DPO

trl-lib/Qwen2-0.5B-Reward

trl-lib/pythia-1b-deduped-tldr-rm

trl-lib/pythia-2.8b-deduped-tldr-online-dpo

trl-lib/pythia-6.9b-deduped-tldr-offline-dpo

Team members 8
