- DocLLM: A layout-aware generative language model for multimodal document understanding
Paper • 2401.00908 • Published • 181
- Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models
Paper • 2401.04658 • Published • 25
- Weaver: Foundation Models for Creative Writing
Paper • 2401.17268 • Published • 43
- Efficient Tool Use with Chain-of-Abstraction Reasoning
Paper • 2401.17464 • Published • 17

Collections including paper arxiv:2402.19173

- Attention Is All You Need
Paper • 1706.03762 • Published • 50
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
Paper • 1810.04805 • Published • 16
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
Paper • 1907.11692 • Published • 7
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
Paper • 1910.01108 • Published • 14

- LLM-Assisted Code Cleaning For Training Accurate Code Generators
Paper • 2311.14904 • Published • 4
- The Program Testing Ability of Large Language Models for Code
Paper • 2310.05727 • Published • 1
- Neural Rankers for Code Generation via Inter-Cluster Modeling
Paper • 2311.03366 • Published • 1
- Magicoder: Source Code Is All You Need
Paper • 2312.02120 • Published • 80

- Magicoder: Source Code Is All You Need
Paper • 2312.02120 • Published • 80
- StarCoder 2 and The Stack v2: The Next Generation
Paper • 2402.19173 • Published • 136
- Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation
Paper • 2305.01210 • Published • 4
- NeuRI: Diversifying DNN Generation via Inductive Rule Inference
Paper • 2302.02261 • Published • 3

- Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models
Paper • 2311.06783 • Published • 26
- To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning
Paper • 2311.07574 • Published • 14
- Let's Go Shopping (LGS) -- Web-Scale Image-Text Dataset for Visual Concept Understanding
Paper • 2401.04575 • Published • 14
- Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
Paper • 2402.00159 • Published • 61

- System 2 Attention (is something you might need too)
Paper • 2311.11829 • Published • 39
- TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems
Paper • 2311.11315 • Published • 6
- Alignment for Honesty
Paper • 2312.07000 • Published • 11
- Steering Llama 2 via Contrastive Activation Addition
Paper • 2312.06681 • Published • 11

- Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation
Paper • 2310.18628 • Published • 7
- ChatCoder: Chat-based Refine Requirement Improves LLMs' Code Generation
Paper • 2311.00272 • Published • 9
- Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
Paper • 2312.04474 • Published • 30
- OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
Paper • 2402.14658 • Published • 82

- CodePlan: Repository-level Coding using LLMs and Planning
Paper • 2309.12499 • Published • 74
- SCREWS: A Modular Framework for Reasoning with Revisions
Paper • 2309.13075 • Published • 15
- MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning
Paper • 2310.03731 • Published • 29
- Lemur: Harmonizing Natural Language and Code for Language Agents
Paper • 2310.06830 • Published • 31