- TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
  Paper • 2312.16862 • Published • 30
- Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action
  Paper • 2312.17172 • Published • 27
- Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers
  Paper • 2401.01974 • Published • 5
- From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations
  Paper • 2401.01885 • Published • 27
Collections
Collections including paper arxiv:2401.00368
- Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
  Paper • 2401.00448 • Published • 28
- Improving Text Embeddings with Large Language Models
  Paper • 2401.00368 • Published • 79
- E^2-LLM: Efficient and Extreme Length Extension of Large Language Models
  Paper • 2401.06951 • Published • 25
- The Unreasonable Ineffectiveness of the Deeper Layers
  Paper • 2403.17887 • Published • 78

- Improving Text Embeddings with Large Language Models
  Paper • 2401.00368 • Published • 79
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
  Paper • 2401.01055 • Published • 54
- DocLLM: A layout-aware generative language model for multimodal document understanding
  Paper • 2401.00908 • Published • 181
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 257

- Dense X Retrieval: What Retrieval Granularity Should We Use?
  Paper • 2312.06648 • Published • 1
- Improving Text Embeddings with Large Language Models
  Paper • 2401.00368 • Published • 79
- Text Embeddings Reveal (Almost) As Much As Text
  Paper • 2310.06816 • Published • 1
- RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture
  Paper • 2401.08406 • Published • 37

- A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions
  Paper • 2312.08578 • Published • 16
- ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks
  Paper • 2312.08583 • Published • 9
- Vision-Language Models as a Source of Rewards
  Paper • 2312.09187 • Published • 11
- StemGen: A music generation model that listens
  Paper • 2312.08723 • Published • 47

- GAIA: a benchmark for General AI Assistants
  Paper • 2311.12983 • Published • 187
- Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models
  Paper • 2312.02969 • Published • 12
- Axiomatic Preference Modeling for Longform Question Answering
  Paper • 2312.02206 • Published • 7
- Alignment for Honesty
  Paper • 2312.07000 • Published • 11