Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2410.10934

Curated resources that support the use of LLMs to serve as automatic evaluators of other LLM outputs.

JudgeLM: Fine-tuned Large Language Models are Scalable Judges

Paper • 2310.17631 • Published Oct 26, 2023 • 34
Prometheus: Inducing Fine-grained Evaluation Capability in Language Models

Paper • 2310.08491 • Published Oct 12, 2023 • 53
Generative Judge for Evaluating Alignment

Paper • 2310.05470 • Published Oct 9, 2023 • 1
Calibrating LLM-Based Evaluator

Paper • 2309.13308 • Published Sep 23, 2023 • 11

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs