- MinMo: A Multimodal Large Language Model for Seamless Voice Interaction • arXiv:2501.06282 • Published Jan 2025
- NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks • arXiv:2410.20650 • Published Oct 28, 2024
- COAT: Compressing Optimizer States and Activation for Memory-Efficient FP8 Training • arXiv:2410.19313 • Published Oct 25, 2024
- SemiEvol: Semi-supervised Fine-tuning for LLM Adaptation • arXiv:2410.14745 • Published Oct 17, 2024
- Why Does the Effective Context Length of LLMs Fall Short? • arXiv:2410.18745 • Published Oct 24, 2024
- MiniPLM: Knowledge Distillation for Pre-Training Language Models • arXiv:2410.17215 • Published Oct 22, 2024
- Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities • arXiv:2408.07666 • Published Aug 14, 2024
- Memory-Efficient LLM Training with Online Subspace Descent • arXiv:2408.12857 • Published Aug 23, 2024
- Qwen2-VL: Enhancing Vision-Language Model's Perception of the World at Any Resolution • arXiv:2409.12191 • Published Sep 18, 2024
- Scaling Smart: Accelerating Large Language Model Pre-training with Small Model Initialization • arXiv:2409.12903 • Published Sep 19, 2024
- SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration • arXiv:2410.02367 • Published Oct 3, 2024
- MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning • arXiv:2409.20566 • Published Sep 30, 2024
- MaskLLM: Learnable Semi-Structured Sparsity for Large Language Models • arXiv:2409.17481 • Published Sep 26, 2024
- VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models • arXiv:2409.17066 • Published Sep 25, 2024