Kyle Wascher

kylewascher

AI & ML interests

None yet


Organizations

None yet

kylewascher's activity

reacted to merve's post with 🔥 6 days ago

Post

4480

Oof, what a week! 🥵 So many things have happened, let's recap! merve/jan-24-releases-6793d610774073328eac67a9

Multimodal 💬
- We have released SmolVLM -- tiniest VLMs that come in 256M and 500M, with its retrieval models ColSmol for multimodal RAG 💗
- UI-TARS are new models by ByteDance to unlock agentic GUI control 🤯 in 2B, 7B and 72B
- Alibaba DAMO lab released VideoLlama3, new video LMs that come in 2B and 7B
- MiniMaxAI released Minimax-VL-01, whose decoder is based on the MiniMax-Text-01 456B MoE model with long context
- Dataset: Yale released a new benchmark called MMVU
- Dataset: CAIS released Humanity's Last Exam (HLE), a new challenging MM benchmark

LLMs 📖
- DeepSeek-R1 & DeepSeek-R1-Zero: gigantic 660B reasoning models by DeepSeek, and six distilled dense models, on par with o1 with MIT license! 🤯
- Qwen2.5-Math-PRM: new math models by Qwen in 7B and 72B
- NVIDIA released AceMath and AceInstruct, new family of models and their datasets (SFT and reward ones too!)

Audio 🗣️
- Llasa is a new speech synthesis model based on Llama that comes in 1B, 3B, and 8B
- TangoFlux is a new audio generation model trained from scratch and aligned with CRPO

Image/Video/3D Generation ⏯️
- Flex.1-alpha is a new 8B pre-trained diffusion model by ostris similar to Flux
- Tencent released Hunyuan3D-2, new 3D asset generation from images

7 replies

reacted to sequelbox's post with 👍 10 days ago

Post

2311

A general FYI that Valiant Labs no longer has an X account. This is a business decision. Many other businesses seem to be making the same decision right now.

You can follow my account on Bluesky for updates on Shining Valiant 3, other Valiant Labs models, my open-source datasets, etc: https://bsky.app/profile/sequelbox.bsky.social

back to building :)

liked a model 10 days ago

deepseek-ai/DeepSeek-R1

Text Generation • Updated 6 days ago • 674k • 5.6k

liked a model 5 months ago

vidore/colpali

Updated Sep 27, 2024 • 37.3k • 412

liked a Space 5 months ago

Running

🏆

The timm Leaderboard

liked 3 models 5 months ago

liked a model 6 months ago

hustvl/yolos-tiny

Object Detection • Updated Apr 10, 2024 • 119k • 261

liked a Space 6 months ago

Running

🚀

DataCentricVisualAIChallenge

reacted to rwightman's post with 👀 6 months ago

Post

1994

I can't resist an opportunity to update an old baseline. Read a new article on my latest look at improving MobileNet-V1 and EfficientNet-B0 baselines.

https://huggingface.co/blog/rwightman/mobilenet-baselines
timm/mobilenetv1_100.ra4_e3600_r224_in1k
timm/efficientnet_b0.ra4_e3600_r224_in1k

upvoted an article 6 months ago

Article

MobileNet Baselines

Jul 26, 2024 • 23

replied to rwightman's post 6 months ago

Much thanks! I got a model working on-device via ONNX with React Native fairly easily!

reacted to rwightman's post with 🔥 6 months ago

Post

2458

MobileNetV4 weights are now in timm! So far these are the only weights for these models as the official TensorFlow impl remains weightless.

Guided by paper hparams with a few tweaks, I've managed to match or beat the paper results training the medium models. I'm still working on large and improving the small result. They appear to be solid models for on-device use.

timm/mobilenetv4-pretrained-weights-6669c22cda4db4244def9637

MobileNetV4 -- Universal Models for the Mobile Ecosystem (2404.10518)

1 reply

updated a collection about 1 year ago

cv

Collection

1 item • Updated Jan 6, 2024

upvoted a paper over 1 year ago

Vision Transformers Need Registers

Paper • 2309.16588 • Published Sep 28, 2023 • 78

liked a model over 1 year ago

impira/layoutlm-document-qa

Document Question Answering • Updated Mar 18, 2023 • 46.8k • 1.06k