Models
Datasets
Spaces
Posts
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2402.17759

Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

Paper • 2401.10774 • Published Jan 19, 2024 • 54
APAR: LLMs Can Do Auto-Parallel Auto-Regressive Decoding

Paper • 2401.06761 • Published Jan 12, 2024 • 1
Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache

Paper • 2401.02669 • Published Jan 5, 2024 • 14
MambaByte: Token-free Selective State Space Model

Paper • 2401.13660 • Published Jan 24, 2024 • 53

Self-Refine: Iterative Refinement with Self-Feedback

Paper • 2303.17651 • Published Mar 30, 2023 • 2
Gorilla: Large Language Model Connected with Massive APIs

Paper • 2305.15334 • Published May 24, 2023 • 5
PERL: Parameter Efficient Reinforcement Learning from Human Feedback

Paper • 2403.10704 • Published Mar 15, 2024 • 57
Towards Optimal Learning of Language Models

Paper • 2402.17759 • Published Feb 27, 2024 • 16

Papers I find interesting

Scaling Instruction-Finetuned Language Models

Paper • 2210.11416 • Published Oct 20, 2022 • 7
Mamba: Linear-Time Sequence Modeling with Selective State Spaces

Paper • 2312.00752 • Published Dec 1, 2023 • 138
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Paper • 2403.05530 • Published Mar 8, 2024 • 61
Yi: Open Foundation Models by 01.AI

Paper • 2403.04652 • Published Mar 7, 2024 • 62

ibm/AttaQ

Viewer • Updated Jan 26, 2024 • 1.4k • 1.28k • 13
snorkelai/snorkel-curated-instruction-tuning

Preview • Updated Mar 11, 2024 • 34 • 8
corbyrosset/researchy_questions

Viewer • Updated Feb 29, 2024 • 96.4k • 47 • 25
argilla/ultrafeedback-binarized-preferences

Viewer • Updated Nov 30, 2023 • 63.6k • 287 • 70

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29, 2024 • 136
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

Paper • 2402.19427 • Published Feb 29, 2024 • 52
Simple linear attention language models balance the recall-throughput tradeoff

Paper • 2402.18668 • Published Feb 28, 2024 • 18
Priority Sampling of Large Language Models for Compilers

Paper • 2402.18734 • Published Feb 28, 2024 • 16

daily_paper_coll

Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

Paper • 2402.19427 • Published Feb 29, 2024 • 52
Beyond Language Models: Byte Models are Digital World Simulators

Paper • 2402.19155 • Published Feb 29, 2024 • 49
StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29, 2024 • 136
Simple linear attention language models balance the recall-throughput tradeoff

Paper • 2402.18668 • Published Feb 28, 2024 • 18

Visual In-Context Prompting

Paper • 2311.13601 • Published Nov 22, 2023 • 16
Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 143
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework

Paper • 2308.08155 • Published Aug 16, 2023 • 3
LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models

Paper • 2303.02927 • Published Mar 6, 2023 • 3

Papers - Learning and Compression

Towards Optimal Learning of Language Models

Paper • 2402.17759 • Published Feb 27, 2024 • 16

Towards Optimal Learning of Language Models

Paper • 2402.17759 • Published Feb 27, 2024 • 16

Efficient Training

Rethinking Optimization and Architecture for Tiny Language Models

Paper • 2402.02791 • Published Feb 5, 2024 • 12
Specialized Language Models with Cheap Inference from Limited Domain Data

Paper • 2402.01093 • Published Feb 2, 2024 • 45
Scavenging Hyena: Distilling Transformers into Long Convolution Models

Paper • 2401.17574 • Published Jan 31, 2024 • 15
Understanding LLMs: A Comprehensive Overview from Training to Inference

Paper • 2401.02038 • Published Jan 4, 2024 • 62

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs