Collections
Collections including paper arxiv:2402.01306
- KTO: Model Alignment as Prospect Theoretic Optimization
  Paper • 2402.01306 • Published • 16
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
  Paper • 2305.18290 • Published • 50
- SimPO: Simple Preference Optimization with a Reference-Free Reward
  Paper • 2405.14734 • Published • 11
- Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment
  Paper • 2408.06266 • Published • 10