Collections
Collections including paper arxiv:2406.14491
- Bootstrapping Language Models with DPO Implicit Rewards
  Paper • 2406.09760 • Published • 39
- DeepSeek-Coder-V2: Breaking the Barrier of Closed-Source Models in Code Intelligence
  Paper • 2406.11931 • Published • 59
- Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs
  Paper • 2406.14544 • Published • 35
- Instruction Pre-Training: Language Models are Supervised Multitask Learners
  Paper • 2406.14491 • Published • 87

- Instruction Pre-Training: Language Models are Supervised Multitask Learners
  Paper • 2406.14491 • Published • 87
- Better & Faster Large Language Models via Multi-token Prediction
  Paper • 2404.19737 • Published • 74
- RAFT: Adapting Language Model to Domain Specific RAG
  Paper • 2403.10131 • Published • 69
- The Prompt Report: A Systematic Survey of Prompting Techniques
  Paper • 2406.06608 • Published • 58

- Iterative Reasoning Preference Optimization
  Paper • 2404.19733 • Published • 48
- Better & Faster Large Language Models via Multi-token Prediction
  Paper • 2404.19737 • Published • 74
- ORPO: Monolithic Preference Optimization without Reference Model
  Paper • 2403.07691 • Published • 64
- KAN: Kolmogorov-Arnold Networks
  Paper • 2404.19756 • Published • 109

- sDPO: Don't Use Your Data All at Once
  Paper • 2403.19270 • Published • 41
- Advancing LLM Reasoning Generalists with Preference Trees
  Paper • 2404.02078 • Published • 44
- Learn Your Reference Model for Real Good Alignment
  Paper • 2404.09656 • Published • 83
- mDPO: Conditional Preference Optimization for Multimodal Large Language Models
  Paper • 2406.11839 • Published • 38