DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models Paper • 2402.03300 • Published Feb 5, 2024 • 75
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models Paper • 2309.12284 • Published Sep 21, 2023 • 19
DeepSeek-Prover: Advancing Theorem Proving in LLMs through Large-Scale Synthetic Data Paper • 2405.14333 • Published May 23, 2024 • 37
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding Paper • 2412.10302 • Published 21 days ago • 11
Granite 3.1 Language Models Collection A series of language models with 128K context length, trained by IBM and licensed under the Apache 2.0 license. • 8 items • Updated 17 days ago • 45
AI PC: Text Generation Collection Text generation LLMs that have been validated to run on the AI PC Intel® Core™ Ultra CPU and iGPU. • 186 items • Updated Aug 28, 2024 • 4
LLM2CLIP Collection LLM2CLIP makes SOTA pretrained CLIP models more SOTA than ever. • 10 items • Updated 24 days ago • 50
Article Selective fine-tuning of Language Models with Spectrum By anakin87 • Sep 3, 2024 • 30
Large Language Monkeys: Scaling Inference Compute with Repeated Sampling Paper • 2407.21787 • Published Jul 31, 2024 • 12
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling Paper • 2412.05271 • Published 28 days ago • 123