Collections including paper arxiv:2310.17680

- Creative Robot Tool Use with Large Language Models
  Paper • 2310.13065 • Published • 8
- CodeCoT and Beyond: Learning to Program and Test like a Developer
  Paper • 2308.08784 • Published • 5
- Lemur: Harmonizing Natural Language and Code for Language Agents
  Paper • 2310.06830 • Published • 31
- CodePlan: Repository-level Coding using LLMs and Planning
  Paper • 2309.12499 • Published • 74

- SSD-LM: Semi-autoregressive Simplex-based Diffusion Language Model for Text Generation and Modular Control
  Paper • 2210.17432 • Published • 1
- TESS: Text-to-Text Self-Conditioned Simplex Diffusion
  Paper • 2305.08379 • Published • 1
- Diffusion Language Models Can Perform Many Tasks with Scaling and Instruction-Finetuning
  Paper • 2308.12219 • Published • 1
- CodeFusion: A Pre-trained Diffusion Model for Code Generation
  Paper • 2310.17680 • Published • 70

- ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment
  Paper • 2403.05135 • Published • 42
- Understanding Diffusion Objectives as the ELBO with Simple Data Augmentation
  Paper • 2303.00848 • Published
- Scalable Diffusion Models with Transformers
  Paper • 2212.09748 • Published • 17
- High-Resolution Image Synthesis with Latent Diffusion Models
  Paper • 2112.10752 • Published • 12

- Design2Code: How Far Are We From Automating Front-End Engineering?
  Paper • 2403.03163 • Published • 93
- Wukong: Towards a Scaling Law for Large-Scale Recommendation
  Paper • 2403.02545 • Published • 15
- StarCoder: may the source be with you!
  Paper • 2305.06161 • Published • 29
- Exploring Parameter-Efficient Fine-Tuning Techniques for Code Generation with Large Language Models
  Paper • 2308.10462 • Published • 2

- AtP*: An efficient and scalable method for localizing LLM behaviour to components
  Paper • 2403.00745 • Published • 12
- The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
  Paper • 2402.17764 • Published • 605
- MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT
  Paper • 2402.16840 • Published • 23
- LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
  Paper • 2402.13753 • Published • 114

- CodeBERT: A Pre-Trained Model for Programming and Natural Languages
  Paper • 2002.08155 • Published • 2
- OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
  Paper • 2402.14658 • Published • 82
- CodeFusion: A Pre-trained Diffusion Model for Code Generation
  Paper • 2310.17680 • Published • 70
- CodePlan: Repository-level Coding using LLMs and Planning
  Paper • 2309.12499 • Published • 74

- Chain-of-Thought Reasoning Without Prompting
  Paper • 2402.10200 • Published • 104
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 40
- BitDelta: Your Fine-Tune May Only Be Worth One Bit
  Paper • 2402.10193 • Published • 19
- A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
  Paper • 2402.09727 • Published • 36

- StarCoder: may the source be with you!
  Paper • 2305.06161 • Published • 29
- WizardCoder: Empowering Code Large Language Models with Evol-Instruct
  Paper • 2306.08568 • Published • 28
- SantaCoder: don't reach for the stars!
  Paper • 2301.03988 • Published • 7
- DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence
  Paper • 2401.14196 • Published • 48

- Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models
  Paper • 2312.09608 • Published • 13
- CodeFusion: A Pre-trained Diffusion Model for Code Generation
  Paper • 2310.17680 • Published • 70
- ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Real Image
  Paper • 2310.17994 • Published • 8
- Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss
  Paper • 2401.02677 • Published • 22

- Attention Is All You Need
  Paper • 1706.03762 • Published • 50
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 16
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
  Paper • 1907.11692 • Published • 7
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
  Paper • 1910.01108 • Published • 14
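
The groups above are Hugging Face Hub collections filtered to those containing paper 2310.17680. As a minimal sketch of how such a listing could be reproduced programmatically, assuming huggingface_hub >= 0.19 and that the paper item filter takes the "papers/<arxiv-id>" form (both assumptions, not verified against the live API):

```python
from huggingface_hub import HfApi

api = HfApi()

# Assumed filter format: "papers/<arxiv-id>" selects collections
# that contain the given paper (here, CodeFusion, arXiv 2310.17680).
for preview in api.list_collections(item="papers/2310.17680", limit=10):
    # list_collections returns lightweight previews; fetch the full
    # item list for each collection by its slug.
    collection = api.get_collection(preview.slug)
    print(collection.title)
    for item in collection.items:
        if item.item_type == "paper":
            print(f"  Paper • {item.item_id}")  # item_id is the arXiv id
```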