zucco's Collections
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Paper • 2402.13753 • Published • 115
Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking
Paper • 2403.09629 • Published • 76
Larimar: Large Language Models with Episodic Memory Control
Paper • 2403.11901 • Published • 33
Evolutionary Optimization of Model Merging Recipes
Paper • 2403.13187 • Published • 51
InternLM2 Technical Report
Paper • 2403.17297 • Published • 30
Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models
Paper • 2404.12387 • Published • 39
XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference
Paper • 2404.15420 • Published • 8
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report
Paper • 2405.00732 • Published • 120
TPI-LLM: Serving 70B-scale LLMs Efficiently on Low-resource Edge Devices
Paper • 2410.00531 • Published • 30