VILARIN
vilarin
AI & ML interests
Pantheon
Recent Activity
new activity
about 17 hours ago
vilarin/lumiere: Cannot load "lumiere_flux_alpha-fp8.safetensors" with "FluxTransformer2DModel.from_single_file"
reacted
to
nroggendorff's
post
12 days ago
im so tired
liked
a model
20 days ago
franciszzj/Leffa
Organizations
vilarin's activity
reacted to
nroggendorff's
post
12 days ago
reacted to
merve's
post
about 1 month ago
Post
3902
Small yet mighty! 💫
We are releasing SmolVLM: a new 2B small vision language model made for on-device use, fine-tunable on a consumer GPU, immensely memory efficient 🤗
We release three checkpoints under Apache 2.0: SmolVLM-Instruct, SmolVLM-Synthetic and SmolVLM-Base HuggingFaceTB/smolvlm-6740bd584b2dcbf51ecb1f39
Learn more from our blog here: huggingface.co/blog/smolvlm
This release comes with a demo, fine-tuning code, MLX integration and TRL integration for DPO
Try the demo: HuggingFaceTB/SmolVLM
Fine-tuning Recipe: https://github.com/huggingface/smollm/blob/main/finetuning/Smol_VLM_FT.ipynb
Also TRL integration for DPO
reacted to
davanstrien's
post with ❤️
about 1 month ago
Post
2482
First dataset for the new Hugging Face Bluesky community organisation:
bluesky-community/one-million-bluesky-posts 🦋
1M public posts from Bluesky's firehose API
Includes text, metadata, and language predictions
Perfect for experimenting with ML on Bluesky 🤗
Excited to see people build more open tools for a more open social media platform!
posted
an
update
about 1 month ago
Post
1401
A few days ago, Black Forest Labs released FLUX.1 Tools, which has surprised everyone with its quality and effects. Now that diffusers supports these features, you can easily deploy and build your own tools.
Combined with the powerful Gradio and ZeroGPU, you can experience the Tools immediately, which is truly wonderful.
I was impressed by FLUX.1 Fill dev, so I've built a demo for it that makes inpainting and outpainting images easy.
Model: black-forest-labs/FLUX.1-Fill-dev
📦Demo: vilarin/Flux.1-Fill-dev
diffusers: https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines/flux
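For anyone who wants to try this outside the Space, here is a minimal sketch of how an inpainting call could look with diffusers' FluxFillPipeline. The mask helper below is hypothetical convenience code, and the commented pipeline section assumes a CUDA GPU with enough memory:

```python
from PIL import Image, ImageDraw

def make_fill_mask(size, box):
    """Build an inpainting mask: white (255) marks the region to repaint,
    black (0) keeps the corresponding pixels of the source image."""
    mask = Image.new("L", size, 0)
    ImageDraw.Draw(mask).rectangle(box, fill=255)
    return mask

# Repaint the central square of a 1024x1024 image.
mask = make_fill_mask((1024, 1024), (256, 256, 768, 768))

# The actual Fill call needs a GPU; sketched here with the repo id from the post:
# import torch
# from diffusers import FluxFillPipeline
# pipe = FluxFillPipeline.from_pretrained(
#     "black-forest-labs/FLUX.1-Fill-dev", torch_dtype=torch.bfloat16
# ).to("cuda")
# result = pipe(
#     prompt="a red vintage armchair",
#     image=Image.open("room.png").convert("RGB"),
#     mask_image=mask,
#     height=1024, width=1024,
# ).images[0]
```

For outpainting the idea is the same: paste the source onto a larger canvas and mark the new border area white in the mask.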
posted
an
update
about 2 months ago
Post
1123
While browsing new models, I stumbled upon Lumiere from aixonlab. After testing it, I feel it has considerable potential. Keep up the good work!
Lumiere Alpha is a model focused on improving realism without compromising prompt coherency or completely changing the composition of the original FLUX.1-dev model.
📦 Model: aixonlab/flux.1-lumiere-alpha
📦 Demo: vilarin/lumiere
reacted to
merve's
post
2 months ago
Post
1666
Tencent released a new depth model that generates temporally consistent depth maps over videos ⏯️
Model: tencent/DepthCrafter
Demo: tencent/DepthCrafter
Paper: DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos (2409.02095)
You don't need to input anything other than the video itself: no optical flow or camera poses needed! 🤩
reacted to
merve's
post with 🔥
4 months ago
Post
5571
I have put together a notebook on Multimodal RAG, where we do not process the documents with hefty pipelines but natively use:
- vidore/colpali for retrieval: it doesn't need indexing with image-text pairs, just images!
- Qwen/Qwen2-VL-2B-Instruct for generation 💬 directly feed images as they are to a vision language model, with no conversion to text!
I used the ColPali implementation from the new Byaldi library by @bclavie 🤗
https://github.com/answerdotai/byaldi
Link to notebook: https://github.com/merveenoyan/smol-vision/blob/main/ColPali_%2B_Qwen2_VL.ipynb
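As a rough sketch of the glue between the two models: the retrieval calls below follow Byaldi's RAGMultiModalModel API as I understand it (commented out, since they need a GPU and indexed documents), and build_vlm_messages is a hypothetical helper that packs the retrieved pages into the chat format Qwen2-VL's processor expects:

```python
def build_vlm_messages(question, image_paths):
    """One user turn for a vision language model: retrieved page images
    first, the text question last."""
    content = [{"type": "image", "image": str(p)} for p in image_paths]
    content.append({"type": "text", "text": question})
    return [{"role": "user", "content": content}]

msgs = build_vlm_messages("What was Q3 revenue?", ["page_3.png"])

# Retrieval side, sketched with Byaldi:
# from byaldi import RAGMultiModalModel
# rag = RAGMultiModalModel.from_pretrained("vidore/colpali")
# rag.index(input_path="docs/", index_name="reports", overwrite=True)
# hits = rag.search("What was Q3 revenue?", k=3)  # page matches with scores
# The matched page images then go into build_vlm_messages(...) and on to
# Qwen/Qwen2-VL-2B-Instruct for the answer.
```

The point of the setup is exactly what the post says: the index stores page images directly, so there is no OCR or text-chunking pipeline anywhere.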
reacted to
clem's
post with 🔥
4 months ago
Post
1761
"LLM inference at scale with TGI". Cool blogpost: https://www.adyen.com/knowledge-hub/llm-inference-at-scale-with-tgi
Well done
@martinigoyanes @rafa-hernandez @Vidusharma @frisokingma @hannahwright @jeanmarcs @antonioramos & the whole https://huggingface.co/adyen team. Could be useful to cross-post here: https://huggingface.co/blog/community
posted
an
update
4 months ago
Post
1627
📣Ai2 Releasing OLMoE!
OLMoE-1B-7B-Instruct is a Mixture-of-Experts LLM with 1B active and 7B total parameters, and OLMoE is 100% open source: model, code base, and datasets!
📦Paper: https://arxiv.org/abs/2409.02060
🤗Model: allenai/OLMoE-1B-7B-0924-Instruct
💾Datasets: allenai/OLMoE-mix-0924
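The "1B active out of 7B total" comes from sparse routing: each token is dispatched to only a few experts (top-8 of 64 in OLMoE, if I recall the config correctly). A toy numpy version of that top-k routing step, purely illustrative and not Ai2's code:

```python
import numpy as np

def topk_route(router_logits, k):
    """Pick the k highest-scoring experts for one token and softmax their
    logits into mixing weights; all other experts stay inactive."""
    idx = np.argsort(router_logits)[::-1][:k]
    shifted = router_logits[idx] - router_logits[idx].max()  # numeric stability
    weights = np.exp(shifted) / np.exp(shifted).sum()
    return idx, weights

logits = np.array([0.1, 2.0, -1.0, 3.0])    # router scores for 4 toy experts
experts, weights = topk_route(logits, k=2)  # only 2 of 4 experts fire
```

Because only k experts run per token, most expert parameters sit idle on any given forward pass, which is how the active parameter count stays near 1B while the total is 7B.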
posted
an
update
4 months ago
Post
6052
🤩 Amazing day. AWPortrait-FL finally here!
📦 AWPortrait-FL is finetuned on FLUX.1-dev using the training set of AWPortrait-XL plus nearly 2,000 fashion photographs of extremely high aesthetic quality.
🤗Model: Shakker-Labs/AWPortrait-FL
Demo: vilarin/flux-labs
posted
an
update
4 months ago
Post
2456
Shakker-Labs brings an amazing LoRA trained on FLUX.1-dev for blended realistic illustration by Muertu: the front character is in illustration style, while the background is realistic. 🤩
🤗Model: https://huggingface.co/Shakker-Labs/FLUX.1-dev-LoRA-blended-realistic-illustration
My space for demo: vilarin/flux-lab-light
posted
an
update
5 months ago
Post
4195
Black Forest Labs, BASED!
FLUX.1 is more delightful, with good instruction following.
FLUX.1 dev (black-forest-labs/FLUX.1-dev) is a 12B-parameter distilled model, second only to Black Forest Labs' state-of-the-art model FLUX.1 pro.
Update 🤗 Official demo:
black-forest-labs/FLUX.1-dev
Thank you :) I updated the demo to support file uploads.
reacted to
merve's
post with ❤️
7 months ago
Post
2737
THUDM has released GLM-4V-9B and it's... chatty!
I asked it to describe my favorite Howl's Moving Castle scene and here's how it went 👇🏻
Joke aside, it seems to outperform the previous VLMs. However, the license isn't open source.
model repo: THUDM/glm-4v-9b
a community member has built a demo: vilarin/VL-Chatbox