Patrick Haller PRO

PatrickHaller · HallerPatrick

AI & ML interests

NLP, Language Models, Autoregressive Models

Recent Activity

updated a model 19 days ago

PatrickHaller/hgrn2_pile_10M_distill_babylm

updated a model 20 days ago

PatrickHaller/hgrn2_pile_100m_distill_babylm

PatrickHaller's activity

upvoted a collection 14 days ago

fuck quadratic attention

11 items • Updated Apr 24, 2024 • 22

upvoted a collection about 1 month ago

Common Models

The first generation of models pretrained on Common Corpus. • 5 items • Updated Dec 5, 2024 • 28

upvoted 3 papers 6 months ago

Grass: Compute Efficient Low-Memory LLM Training with Structured Sparse Gradients

Paper • 2406.17660 • Published Jun 25, 2024 • 5

Efficient Continual Pre-training by Mitigating the Stability Gap

Paper • 2406.14833 • Published Jun 21, 2024 • 19

Scaling Laws for Linear Complexity Language Models

Paper • 2406.16690 • Published Jun 24, 2024 • 22

upvoted 2 papers 9 months ago

SemScore: Automated Evaluation of Instruction-Tuned LLMs based on Semantic Textual Similarity

Paper • 2401.17072 • Published Jan 30, 2024 • 25

Fabricator: An Open Source Toolkit for Generating Labeled Training Data with Teacher LLMs

Paper • 2309.09582 • Published Sep 18, 2023 • 4