Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2408.15240

LLM Reasoning Papers

improve reasoning capabilities of LLMs

Let's Verify Step by Step

Paper • 2305.20050 • Published May 31, 2023 • 10
LLM Critics Help Catch LLM Bugs

Paper • 2407.00215 • Published Jun 28, 2024
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling

Paper • 2407.21787 • Published Jul 31, 2024 • 12
Generative Verifiers: Reward Modeling as Next-Token Prediction

Paper • 2408.15240 • Published Aug 27, 2024 • 13

STaR: Bootstrapping Reasoning With Reasoning

Paper • 2203.14465 • Published Mar 28, 2022 • 8
Let's Verify Step by Step

Paper • 2305.20050 • Published May 31, 2023 • 10
Training Large Language Models to Reason in a Continuous Latent Space

Paper • 2412.06769 • Published 28 days ago • 66
Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

Paper • 2411.14405 • Published Nov 21, 2024 • 58

Reinforcement learning (RL)

Proximal Policy Optimization Algorithms

Paper • 1707.06347 • Published Jul 20, 2017 • 4
Fine-Grained Human Feedback Gives Better Rewards for Language Model Training

Paper • 2306.01693 • Published Jun 2, 2023 • 3
Generative Verifiers: Reward Modeling as Next-Token Prediction

Paper • 2408.15240 • Published Aug 27, 2024 • 13
Diffusion Policy Policy Optimization

Paper • 2409.00588 • Published Sep 1, 2024 • 20

Natural Language (LLM, NLP etc)

Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing

Paper • 2404.12253 • Published Apr 18, 2024 • 54
FlowMind: Automatic Workflow Generation with LLMs

Paper • 2404.13050 • Published Mar 17, 2024 • 33
How Far Can We Go with Practical Function-Level Program Repair?

Paper • 2404.12833 • Published Apr 19, 2024 • 6
Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models

Paper • 2404.18796 • Published Apr 29, 2024 • 68

Evaluation of LLM agents paper

MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries

Paper • 2401.15391 • Published Jan 27, 2024 • 6
Long-form factuality in large language models

Paper • 2403.18802 • Published Mar 27, 2024 • 24
JudgeLM: Fine-tuned Large Language Models are Scalable Judges

Paper • 2310.17631 • Published Oct 26, 2023 • 33
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Paper • 2310.08491 • Published Oct 12, 2023 • 53

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

Paper • 2312.08578 • Published Dec 14, 2023 • 16
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks

Paper • 2312.08583 • Published Dec 14, 2023 • 9
Vision-Language Models as a Source of Rewards

Paper • 2312.09187 • Published Dec 14, 2023 • 11
StemGen: A music generation model that listens

Paper • 2312.08723 • Published Dec 14, 2023 • 47

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs