Andres Marafioti

andito

AI & ML interests

Multimodal models, VLM and TTS

Recent Activity

liked a dataset about 8 hours ago

HuggingFaceM4/WebSight

updated a dataset about 11 hours ago

andito/math-writing-dataset-google

updated a model about 11 hours ago

andito/math-writing-dataset-google

Articles

SmolVLM - small yet mighty Vision Language Model

Deploying Speech-to-Speech on Hugging Face

FineVideo: behind the scenes

LAVE: Zero-shot VQA Evaluation on Docmatix with LLMs - Do We Still Need Fine-Tuning?

Docmatix - a huge dataset for Document Visual Question Answering

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

andito's activity

upvoted 2 papers 19 days ago

DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding

Paper • 2412.10302 • Published 21 days ago • 11

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published 21 days ago • 136

upvoted a collection about 1 month ago

Nov 29 Releases 🌲🌲

25 items • Updated Dec 2, 2024 • 10

upvoted 2 articles 2 months ago

Article

Llama 3.2 in Keras

Oct 21, 2024

• 11

Article

Welcome, Gradio 5

Oct 9, 2024

• 95

upvoted 2 articles 3 months ago

Article

Tool Use, Unified

Aug 12, 2024

• 70

Article

FineVideo: behind the scenes

Sep 23, 2024

• 27

upvoted 2 articles 4 months ago

Article

Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints

May 1, 2024

• 69

Article

Fine-tuning LLMs to 1.58bit: extreme quantization made easy

Sep 18, 2024

• 215

upvoted a collection 4 months ago

RWKV v6

5 items • Updated Sep 3, 2024 • 9

upvoted 2 articles 4 months ago

Article

A failed experiment: Infini-Attention, and why we should keep trying?

Aug 14, 2024

• 56

Article

Accelerate 1.0.0

Sep 13, 2024

• 51

upvoted 2 collections 4 months ago

🤖 Agents

21 items • Updated 4 days ago • 70

Image to Video

4 items • Updated Dec 10, 2023 • 10

upvoted 5 articles 4 months ago

Article

Announcing New Dataset Search Features

Jul 8, 2024

• 22

Article

DEMO: French Spoken Language Understanding with the new speech resources from NAVER LABS Europe

Aug 28, 2024

• 9

Article

Scaling robotics datasets with video encoding

Aug 27, 2024

• 34

Article

Deep Learning over the Internet: Training Language Models Collaboratively

Jul 15, 2021

• 4

Article

MicroJAX

Aug 25, 2024

• 17

upvoted a paper 4 months ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 124