- Can Large Language Models Understand Context?
  Paper • 2402.00858 • Published • 22
- OLMo: Accelerating the Science of Language Models
  Paper • 2402.00838 • Published • 82
- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 145
- SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity
  Paper • 2401.17072 • Published • 25
Collections
Collections including paper arxiv:2402.09668
- PDFTriage: Question Answering over Long, Structured Documents
  Paper • 2309.08872 • Published • 53
- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 77
- Table-GPT: Table-tuned GPT for Diverse Table Tasks
  Paper • 2310.09263 • Published • 39
- Context-Aware Meta-Learning
  Paper • 2310.10971 • Published • 16
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 40
- SliceGPT: Compress Large Language Models by Deleting Rows and Columns
  Paper • 2401.15024 • Published • 69
- SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning
  Paper • 2407.07523 • Published • 4
- Spectra: A Comprehensive Study of Ternary, Quantized, and FP16 Language Models
  Paper • 2407.12327 • Published • 77
- Pre-training Small Base LMs with Fewer Tokens
  Paper • 2404.08634 • Published • 35
- Ziya2: Data-centric Learning is All LLMs Need
  Paper • 2311.03301 • Published • 16
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 40
- MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies
  Paper • 2404.06395 • Published • 22
- Adapting Large Language Models via Reading Comprehension
  Paper • 2309.09530 • Published • 77
- TinyGSM: achieving >80% on GSM8k with small language models
  Paper • 2312.09241 • Published • 37
- How to Train Data-Efficient LLMs
  Paper • 2402.09668 • Published • 40
- Let GPT be a Math Tutor: Teaching Math Word Problem Solvers with Customized Exercise Generation
  Paper • 2305.14386 • Published
- OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework
  Paper • 2404.14619 • Published • 126
- Scaling Laws for Downstream Task Performance of Large Language Models
  Paper • 2402.04177 • Published • 17
- Orca 2: Teaching Small Language Models How to Reason
  Paper • 2311.11045 • Published • 71
- Orca-Math: Unlocking the potential of SLMs in Grade School Math
  Paper • 2402.14830 • Published • 24
- Watermarking Makes Language Models Radioactive
  Paper • 2402.14904 • Published • 23
- ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
  Paper • 2402.15220 • Published • 19
- GPTVQ: The Blessing of Dimensionality for LLM Quantization
  Paper • 2402.15319 • Published • 19
- DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation
  Paper • 2402.11929 • Published • 10