LeMaterial: an open source initiative to accelerate materials discovery and research 22 days ago • 31
BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks Jun 18, 2024 • 43
StarCoder2-Instruct: Fully Transparent and Permissive Self-Alignment for Code Generation Apr 29, 2024 • 76
Building and better understanding vision-language models: insights and future directions Paper • 2408.12637 • Published Aug 22, 2024 • 124
view article Article A failed experiment: Infini-Attention, and why we should keep trying? Aug 14, 2024 • 54
view article Article Llama 3.1 - 405B, 70B & 8B with multilinguality and long context Jul 23, 2024 • 225
view article Article Docmatix - a huge dataset for Document Visual Question Answering Jul 18, 2024 • 72
Agentless: Demystifying LLM-based Software Engineering Agents Paper • 2407.01489 • Published Jul 1, 2024 • 42
The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale Paper • 2406.17557 • Published Jun 25, 2024 • 87
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions Paper • 2406.15877 • Published Jun 22, 2024 • 45
view article Article Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models Jun 24, 2024 • 180
Scaling Laws and Compute-Optimal Training Beyond Fixed Training Durations Paper • 2405.18392 • Published May 28, 2024 • 12
Leaderboards and benchmarks ✨ Collection Cool leaderboard spaces collection for models across modalities! Text, vision, audio, ... • 83 items • Updated 15 days ago • 93