BigCode

Enterprise

non-profit

https://www.bigcode-project.org/

bigcode-project

Recent Activity

bigcode's activity

Jiayi-Pan

authored 5 papers about 13 hours ago

World-to-Words: Grounded Open Vocabulary Acquisition through Fast Mapping in Vision-Language Models

Paper • 2306.08685 • Published Jun 14, 2023 • 1

Inversion-Free Image Editing with Natural Language

Paper • 2312.04965 • Published Dec 7, 2023 • 2

ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL

Paper • 2402.19446 • Published Feb 29, 2024

DANLI: Deliberative Agent for Following Natural Language Instructions

Paper • 2210.12485 • Published Oct 22, 2022

Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning

Paper • 2405.10292 • Published May 16, 2024 • 1

terryyz

in bigcode/bigcodebench-leaderboard 8 days ago

Fairness?

#9 opened 8 days ago by

Human baseline

#8 opened 9 days ago by

terryyz

updated 10 datasets 9 days ago

bigcode/bigcodebench-hard-results

Viewer • Updated 9 days ago • 160 • 61

bigcode/bigcodebench-hard-elo

Viewer • Updated 9 days ago • 264 • 58

bigcode/bigcodebench-results

Viewer • Updated 9 days ago • 160 • 71 • 1

bigcode/bigcodebench-elo

Viewer • Updated 9 days ago • 218 • 73

bigcode/bigcodebench-hard-solve-rate

Viewer • Updated 9 days ago • 296 • 59

bigcode/bigcodebench-hard-domain

Viewer • Updated 9 days ago • 291 • 53

bigcode/bigcodebench-hard-perf

Viewer • Updated 9 days ago • 291 • 55

bigcode/bigcodebench-solve-rate

Viewer • Updated 9 days ago • 2.28k • 63

bigcode/bigcodebench-domain

Viewer • Updated 9 days ago • 245 • 55

bigcode/bigcodebench-perf

Viewer • Updated 9 days ago • 245 • 51

terryyz

updated a Space 9 days ago

BigCodeBench Leaderboard

terryyz

in bigcode/bigcodebench-evaluator 15 days ago

Is it possible to evaluate a subset?

#2 opened 19 days ago by

mayank-mishra

authored a paper 15 days ago

Selective Self-Rehearsal: A Fine-Tuning Approach to Improve Generalization in Large Language Models

Paper • 2409.04787 • Published Sep 7, 2024