Dmitry Ryumin's picture

Dmitry Ryumin

DmitryRyumin

·

https://dmitryryumin.github.io

DmitryRyumin

AI & ML interests

Machine Learning and Applications, Multi-Modal Understanding

Recent Activity

updated a Space 7 days ago

DmitryRyumin/NewEraAI-Papers

updated a collection 13 days ago

liked a Space 16 days ago

prs-eth/marigold-dc

View all activity

Organizations

DmitryRyumin's activity

upvoted a collection about 1 month ago

KaLM-embedding

5 items • Updated 3 days ago • 5

upvoted 5 papers 3 months ago

FAN: Fourier Analysis Networks

Paper • 2410.02675 • Published Oct 3, 2024 • 25

Differential Transformer

Paper • 2410.05258 • Published Oct 7, 2024 • 169

MOSEL: 950,000 Hours of Speech Data for Open-Source Speech Foundation Model Training on EU Languages

Paper • 2410.01036 • Published Oct 1, 2024 • 14

HeadGAP: Few-shot 3D Head Avatar via Generalizable Gaussian Priors

Paper • 2408.06019 • Published Aug 12, 2024 • 13

Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction

Paper • 2409.18124 • Published Sep 26, 2024 • 32

upvoted a collection 3 months ago

Llama 3.2

Meta's new Llama 3.2 vision and text models including 1B, 3B, 11B and 90B. Includes GGUF, 4-bit bnb and original versions. • 23 items • Updated 12 days ago • 46

upvoted 3 articles 3 months ago

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Sep 18, 2024

• 215

Article

Exploring the Daily Papers Page on Hugging Face

Sep 23, 2024

• 47

Article

XetHub is joining Hugging Face!

Aug 8, 2024

• 81

upvoted a collection 4 months ago

Qwen2.5

Qwen2.5 language models, including pretrained and instruction-tuned models of 7 sizes, including 0.5B, 1.5B, 3B, 7B, 14B, 32B, and 72B. • 45 items • Updated Nov 28, 2024 • 453

upvoted 4 papers 4 months ago

OLMoE: Open Mixture-of-Experts Language Models

Paper • 2409.02060 • Published Sep 3, 2024 • 78

ReMamba: Equip Mamba with Effective Long-Sequence Modeling

Paper • 2408.15496 • Published Aug 28, 2024 • 10

The Mamba in the Llama: Distilling and Accelerating Hybrid Models

Paper • 2408.15237 • Published Aug 27, 2024 • 38

The Russian-focused embedders' exploration: ruMTEB benchmark and Russian embedding model design

Paper • 2408.12503 • Published Aug 22, 2024 • 23

upvoted a paper 5 months ago

Controllable Text Generation for Large Language Models: A Survey

Paper • 2408.12599 • Published Aug 22, 2024 • 64

upvoted a collection 5 months ago

Jamba-1.5

The AI21 Jamba family of models are state-of-the-art, hybrid SSM-Transformer instruction following foundation models • 2 items • Updated Aug 22, 2024 • 83

upvoted an article 5 months ago

Article

Llama-3.1-Storm-8B: Improved SLM with Self-Curation + Model Merging

By

•

Aug 19, 2024

• 75

upvoted a paper 5 months ago

Transformer Language Models without Positional Encodings Still Learn Positional Information

Paper • 2203.16634 • Published Mar 30, 2022 • 5

upvoted a collection 5 months ago

Qwen2-Audio

Audio-language model series based on Qwen2 • 4 items • Updated Nov 28, 2024 • 49