Article • 🐺🐦‍⬛ LLM Comparison/Test: 25 SOTA LLMs (including QwQ) through 59 MMLU-Pro CS benchmark runs • By wolfram • Dec 4, 2024 • 75
PaliGemma 2: A Family of Versatile VLMs for Transfer Paper • 2412.03555 • Published Dec 4, 2024 • 121
ColPali: Efficient Document Retrieval with Vision Language Models Paper • 2407.01449 • Published Jun 27, 2024 • 42
Transformer Explainer: Interactive Learning of Text-Generative Models Paper • 2408.04619 • Published Aug 8, 2024 • 156
Article • Assisted Generation: a new direction toward low-latency text generation • May 11, 2023 • 38
Llama 3.1 GPTQ, AWQ, and BNB Quants Collection Optimised Quants for high-throughput deployments! Compatible with Transformers, TGI & VLLM 🤗 • 9 items • Updated Sep 26, 2024 • 56
Llama 3.1 Collection This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated about 1 month ago • 638
Article • Llama 3.1 - 405B, 70B & 8B with multilinguality and long context • Jul 23, 2024 • 225
Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models Paper • 2407.12327 • Published Jul 17, 2024 • 77
NuminaMath Collection Datasets and models for training SOTA math LLMs. See our GitHub for training & inference code: https://github.com/project-numina/aimo-progress-prize • 6 items • Updated Jul 21, 2024 • 69
🪐 SmolLM Collection A series of smol LLMs: 135M, 360M and 1.7B. We release base and Instruct models as well as the training corpus and some WebGPU demos • 12 items • Updated 14 days ago • 206
Article • PaliGemma – Google's Cutting-Edge Open Vision Language Model • May 14, 2024 • 231
Tree of Thoughts: Deliberate Problem Solving with Large Language Models Paper • 2305.10601 • Published May 17, 2023 • 11
FunAudioLLM: Voice Understanding and Generation Foundation Models for Natural Interaction Between Humans and LLMs Paper • 2407.04051 • Published Jul 4, 2024 • 35
AgentInstruct: Toward Generative Teaching with Agentic Flows Paper • 2407.03502 • Published Jul 3, 2024 • 51
Article • BigCodeBench: Benchmarking Large Language Models on Solving Practical and Challenging Programming Tasks • Jun 18, 2024 • 43