Ryukijano's Collections
LLM in a flash: Efficient Large Language Model Inference with Limited Memory
Paper
•
2312.11514
•
Published
•
258
3D-LFM: Lifting Foundation Model
Paper
•
2312.11894
•
Published
•
14
SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
Paper
•
2312.15166
•
Published
•
57
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
Paper
•
2312.16862
•
Published
•
31
LARP: Language-Agent Role Play for Open-World Games
Paper
•
2312.17653
•
Published
•
32
LLaMA Beyond English: An Empirical Study on Language Capability Transfer
Paper
•
2401.01055
•
Published
•
54
tiiuae/falcon-180B
Text Generation
•
Updated
•
1.16k
•
1.13k
meta-llama/Llama-2-70b-hf
Text Generation
•
Updated
•
104k
•
843
TinyLlama: An Open-Source Small Language Model
Paper
•
2401.02385
•
Published
•
91
microsoft/phi-2
Text Generation
•
Updated
•
182k
•
3.26k
🏢
LLaMA Pro 8B Instruct Chat
MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts
Paper
•
2401.04081
•
Published
•
70
Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM
Paper
•
2401.02994
•
Published
•
49
Secrets of RLHF in Large Language Models Part II: Reward Modeling
Paper
•
2401.06080
•
Published
•
27
EmbeddedLLM/Mistral-7B-Merge-14-v0.1
Text Generation
•
Updated
•
226
•
24
Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding
Paper
•
2401.12954
•
Published
•
30
MM-LLMs: Recent Advances in MultiModal Large Language Models
Paper
•
2401.13601
•
Published
•
46
🏆🤖
Chatbot Arena Leaderboard
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
Paper
•
2401.15947
•
Published
•
50
RWKV/v5-Eagle-7B-pth
Updated
•
199
Zyphra/BlackMamba-2.8B
Updated
•
7
•
30
abacusai/Smaug-72B-v0.1
Text Generation
•
Updated
•
2.8k
•
468
CohereForAI/aya-101
Text2Text Generation
•
Updated
•
3.76k
•
629
🚀
Pivot Prompt Demo
SubGen: Token Generation in Sublinear Time and Memory
Paper
•
2402.06082
•
Published
•
11
InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning
Paper
•
2402.06332
•
Published
•
19
MPIrigen: MPI Code Generation through Domain-Specific Language Models
Paper
•
2402.09126
•
Published
•
13
BioMistral/BioMistral-7B
Text Generation
•
Updated
•
12.9k
•
414
SaulLM-7B: A pioneering Large Language Model for Law
Paper
•
2403.03883
•
Published
•
78
PERL: Parameter Efficient Reinforcement Learning from Human Feedback
Paper
•
2403.10704
•
Published
•
58
LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models
Paper
•
2403.13372
•
Published
•
63
raincandy-u/Llama-3-Aplite-Instruct-4x8B-MoE
Text Generation
•
Updated
•
171
•
38
nvidia/Llama3-70B-SteerLM-RM
Updated
•
12
•
42
meta-llama/Llama-3.1-405B-FP8
Text Generation
•
Updated
•
4k
•
108
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF
Text Generation
•
Updated
•
354k
•
2k
nvidia/Hymba-1.5B-Base
Text Generation
•
Updated
•
7.87k
•
136
nvidia/Hymba-1.5B-Instruct
Text Generation
•
Updated
•
4.61k
•
221
Qwen/QwQ-32B-Preview
Text Generation
•
Updated
•
168k
•
1.58k
meta-llama/Llama-3.3-70B-Instruct
Text Generation
•
Updated
•
552k
•
1.71k
NovaSky-AI/Sky-T1-32B-Preview
Text Generation
•
Updated
•
9.27k
•
498