Vaibhav Srivastav PRO

reach-vb

https://vaibhavs10.github.io

AI & ML interests

TTS + LM performance prediction

Recent Activity

liked a model 3 days ago

nvidia/stt_en_fastconformer_tdt_large

liked a model 3 days ago

microsoft/VidTok

liked a model 3 days ago

openfree/claude-monet

Articles

Faster Text Generation with Self-Speculative Decoding

Llama can now see and run on your device - welcome Llama 3.2

Google releases Gemma 2 2B, ShieldGemma and Gemma Scope

Llama 3.1 - 405B, 70B & 8B with multilinguality and long context

WWDC 24: Running Mistral 7B with Core ML

Welcome Gemma 2 - Google's new open LLM

Powerful ASR + diarization + speculative decoding with Hugging Face Inference Endpoints

CodeGemma - an official Google release for code LLMs

TTS Arena: Benchmarking Text-to-Speech Models in the Wild

AI Watermarking 101: Tools and Techniques

Deploy MusicGen in no time with Inference Endpoints

Jupyter X Hugging Face

Swift Diffusers: Fast Stable Diffusion for Mac

Organizations

Posts 12

Post

3287

VLMs are going through quite an open revolution, AND in on-device friendly sizes:

1. Google DeepMind w/ PaliGemma2 - 3B, 10B & 28B: google/paligemma-2-release-67500e1e1dbfdd4dee27ba48

2. OpenGVLab w/ InternVL 2.5 - 1B, 2B, 4B, 8B, 26B, 38B & 78B: https://huggingface.co/collections/OpenGVLab/internvl-25-673e1019b66e2218f68d7c1c

3. Qwen w/ Qwen 2 VL - 2B, 7B & 72B: Qwen/qwen2-vl-66cee7455501d7126940800d

4. Microsoft w/ FlorenceVL - 3B & 8B: https://huggingface.co/jiuhai

5. Moondream2 w/ 0.5B: https://huggingface.co/vikhyatk/

What a time to be alive! 🔥

Post

3205

Massive week for Open AI/ML:

Mistral Pixtral & Instruct Large - ~123B, 128K context, multilingual, json + function calling & open weights
mistralai/Pixtral-Large-Instruct-2411
mistralai/Mistral-Large-Instruct-2411

Allen AI Tülu 70B & 8B - competitive with Claude 3.5 Haiku, beats all major open models like Llama 3.1 70B, Qwen 2.5 and Nemotron
allenai/tulu-3-models-673b8e0dc3512e30e7dc54f5
allenai/tulu-3-datasets-673b8df14442393f7213f372

Llava o1 - VLM capable of spontaneous, systematic reasoning, similar to GPT-o1; the 11B model outperforms gemini-1.5-pro, gpt-4o-mini, and llama-3.2-90B-vision
Xkev/Llama-3.2V-11B-cot

Black Forest Labs Flux.1 tools - four new state-of-the-art model checkpoints & 2 adapters for fill, depth, canny & redux, open weights
reach-vb/black-forest-labs-flux1-6743847bde9997dd26609817

Jina AI Jina CLIP v2 - general purpose multilingual and multimodal (text & image) embedding model, 900M params, 512 x 512 resolution, Matryoshka representations (1024 to 64)
jinaai/jina-clip-v2

Apple AIM v2 & CoreML MobileCLIP - large scale vision encoders outperform CLIP and SigLIP. CoreML optimised MobileCLIP models
apple/aimv2-6720fe1558d94c7805f7688c
apple/coreml-mobileclip

A lot more got released, like OpenScholar (OpenScholar/openscholar-v1-67376a89f6a80f448da411a6), smoltalk (HuggingFaceTB/smoltalk), Hymba (nvidia/hymba-673c35516c12c4b98b5e845f), Open ASR Leaderboard (hf-audio/open_asr_leaderboard) and much more.

Can't wait for the next week! 🤗

Collections 4

Papers 3

arxiv:2410.23320

arxiv:2409.09506

arxiv:2202.10408

Spaces 31

Github Issue Generator

SmolLM2 WebGPU Structured Generation 🔥

MLX My Repo

Webllm Simple Chat

My Heatmap

Tgi Tests

Models 111

reach-vb/Qwen2.5-0.5B-Instruct-Q3-mlx

Text Generation • Updated 5 days ago • 7

reach-vb/stable-diffusion-v1-4.dduf

Updated 16 days ago

reach-vb/one-diffusion.dduf

Updated 16 days ago

reach-vb/test-gating-f

Text Classification • Updated Nov 21

reach-vb/test-attt-tag

Audio-Text-to-Text • Updated Nov 19

reach-vb/TinyLlama-1.1B-Chat-v1.0-Q4-mlx

Updated Oct 18 • 3 • 1

reach-vb/qwen2.5-0.5b-instruct-q8_0

Updated Oct 11 • 2

reach-vb/qwen2.5-0.5b-instruct-q4_0

reach-vb/qwen2.5-0.5b-base-q4_0

reach-vb/qwen2.5-0.5b-base-q3_K_M

Datasets 37

reach-vb/test-happiness

Viewer • Updated 10 days ago • 158 • 58

reach-vb/world-happiness-clone

Preview • Updated 10 days ago • 95

reach-vb/gguf-stats

Viewer • Updated 27 days ago • 60.5k • 525 • 16

reach-vb/gguf-name-agg-true

Viewer • Updated Oct 16 • 37k • 6

reach-vb/gguf-agg-counts

Viewer • Updated Oct 16 • 33.6k • 10

reach-vb/gguf-filenames

Viewer • Updated Oct 15 • 572k • 12

reach-vb/random-wheels

Updated Jun 27 • 9

reach-vb/mls-eng-10k-text-tags-v3-v1

Viewer • Updated Jun 3 • 2.43M • 46

reach-vb/random-audios

Viewer • Updated May 24 • 5 • 1.24k • 2

reach-vb/expresso-tagged-w-speech-mistral-v3

Viewer • Updated May 10 • 11.6k • 39 • 1