-
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Paper • 2309.12307 • Published • 88 -
Qwen Technical Report
Paper • 2309.16609 • Published • 35 -
RLHF Workflow: From Reward Modeling to Online RLHF
Paper • 2405.07863 • Published • 67 -
Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free
Paper • 2410.10814 • Published • 49
Collections
Discover the best community collections!
Collections including paper arxiv:2309.16609
-
Chain-of-Verification Reduces Hallucination in Large Language Models
Paper • 2309.11495 • Published • 37 -
Adapting Large Language Models via Reading Comprehension
Paper • 2309.09530 • Published • 77 -
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • 2309.09400 • Published • 84 -
Language Modeling Is Compression
Paper • 2309.10668 • Published • 83
-
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • 2309.09400 • Published • 84 -
PDFTriage: Question Answering over Long, Structured Documents
Paper • 2309.08872 • Published • 53 -
Chain-of-Verification Reduces Hallucination in Large Language Models
Paper • 2309.11495 • Published • 37 -
LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models
Paper • 2309.12307 • Published • 88
-
An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models
Paper • 2309.09958 • Published • 18 -
TextBind: Multi-turn Interleaved Multimodal Instruction-following
Paper • 2309.08637 • Published • 7 -
AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model
Paper • 2309.16058 • Published • 55 -
Qwen Technical Report
Paper • 2309.16609 • Published • 35
-
OmnimatteRF: Robust Omnimatte with 3D Background Modeling
Paper • 2309.07749 • Published • 7 -
AudioSR: Versatile Audio Super-resolution at Scale
Paper • 2309.07314 • Published • 26 -
Generative Image Dynamics
Paper • 2309.07906 • Published • 53 -
MagiCapture: High-Resolution Multi-Concept Portrait Customization
Paper • 2309.06895 • Published • 27
-
MADLAD-400: A Multilingual And Document-Level Large Audited Dataset
Paper • 2309.04662 • Published • 22 -
Neurons in Large Language Models: Dead, N-gram, Positional
Paper • 2309.04827 • Published • 16 -
Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs
Paper • 2309.05516 • Published • 9 -
DrugChat: Towards Enabling ChatGPT-Like Capabilities on Drug Molecule Graphs
Paper • 2309.03907 • Published • 10