leditsplusplus

community

AI & ML interests

None defined yet.

Recent Activity

leditsplusplus's activity

multimodalartΒ 
posted an update 7 months ago
felfriΒ 
updated a Space 7 months ago
felfriΒ 
posted an update 8 months ago
view post
Post
2300
πŸš€ Excited to announce the release of our new research paper, "LLAVAGUARD: VLM-based Safeguards for Vision Dataset Curation and Safety Assessment"!
In this work, we introduce LLAVAGUARD, a family of cutting-edge Vision-Language Model (VLM) judges designed to enhance the safety and integrity of vision datasets and generative models. Our approach leverages flexible policies for assessing safety in diverse settings. This context awareness ensures robust data curation and model safeguarding alongside comprehensive safety assessments, setting a new standard for vision datasets and models. We provide three versions (7B, 13B, and 34B) and our data, see below. This achievement wouldn't have been possible without the incredible teamwork and dedication of my great colleagues @LukasHug , @PSaiml , @mbrack . πŸ™ Together, we've pushed the boundaries of what’s possible at the intersection of large generative models and safety.
πŸ” Dive into our paper to explore:
Innovative methodologies for dataset curation and model safeguarding.
State-of-the-art safety assessments.
Practical implications for AI development and deployment.
Find more at AIML-TUDA/llavaguard-665b42e89803408ee8ec1086 and https://ml-research.github.io/human-centered-genai/projects/llavaguard/index.html
  • 2 replies
Β·
radamesΒ 
posted an update 8 months ago
view post
Post
6093
Thanks to @OzzyGT for pushing the new Anyline preprocessor to https://github.com/huggingface/controlnet_aux. Now you can use the TheMistoAI/MistoLine ControlNet with Diffusers completely.

Here's a demo for you: radames/MistoLine-ControlNet-demo
Super resolution version: radames/Enhance-This-HiDiffusion-SDXL

from controlnet_aux import AnylineDetector

anyline = AnylineDetector.from_pretrained(
    "TheMistoAI/MistoLine", filename="MTEED.pth", subfolder="Anyline"
).to("cuda")

source = Image.open("source.png")
result = anyline(source, detect_resolution=1280)
radamesΒ 
posted an update 9 months ago
view post
Post
6854
At Google I/O 2024, we're collaborating with the Google Visual Blocks team (https://visualblocks.withgoogle.com) to release custom Hugging Face nodes. Visual Blocks for ML is a browser-based tool that allows users to create machine learning pipelines using a visual interface. We're launching nodes with Transformers.js, running models on the browser, as well as server-side nodes running Transformers pipeline tasks and LLMs using our hosted inference. With @Xenova @JasonMayes

You can learn more about it here https://huggingface.co/blog/radames/hugging-face-google-visual-blocks

Source-code for the custom nodes:
https://github.com/huggingface/visual-blocks-custom-components
radamesΒ 
posted an update 9 months ago
view post
Post
2024
AI-town now runs on Hugging Face Spaces with our API for LLMs and embeddings, including the open-source Convex backend, all in one container. Easy to duplicate and config on your own

Demo: radames/ai-town
Instructions: https://github.com/radames/ai-town-huggingface
Β·
multimodalartΒ 
posted an update 9 months ago
view post
Post
27751
The first open Stable Diffusion 3-like architecture model is JUST out πŸ’£ - but it is not SD3! πŸ€”

It is Tencent-Hunyuan/HunyuanDiT by Tencent, a 1.5B parameter DiT (diffusion transformer) text-to-image model πŸ–ΌοΈβœ¨, trained with multi-lingual CLIP + multi-lingual T5 text-encoders for english 🀝 chinese understanding

Try it out by yourself here ▢️ https://huggingface.co/spaces/multimodalart/HunyuanDiT
(a bit too slow as the model is chunky and the research code isn't super optimized for inference speed yet)

In the paper they claim to be SOTA open source based on human preference evaluation!
radamesΒ 
posted an update 9 months ago
view post
Post
2542
HiDiffusion SDXL now supports Image-to-Image, so I've created an "Enhance This" version using the latest ControlNet Line Art model called MistoLine. It's faster than DemoFusion

Demo: radames/Enhance-This-HiDiffusion-SDXL

Older version based on DemoFusion radames/Enhance-This-DemoFusion-SDXL

New Controlnet SDXL Controls Every Line TheMistoAI/MistoLine

HiDiffusion is compatible with diffusers and support many SD models - https://github.com/megvii-research/HiDiffusion
  • 1 reply
Β·
radamesΒ 
posted an update 9 months ago
view post
Post
2461
I've built a custom component that integrates Rerun web viewer with Gradio, making it easier to share your demos as Gradio apps.

Basic snippet
# pip install gradio_rerun gradio
import gradio as gr
from gradio_rerun import Rerun

gr.Interface(
    inputs=gr.File(file_count="multiple", type="filepath"),
    outputs=Rerun(height=900),
    fn=lambda file_path: file_path,
).launch()

More details here radames/gradio_rerun
Source https://github.com/radames/gradio-rerun-viewer

Follow Rerun here https://huggingface.co/rerun
radamesΒ 
posted an update 9 months ago
radamesΒ 
posted an update 9 months ago
radamesΒ 
posted an update 10 months ago
radamesΒ 
posted an update 10 months ago
view post
Post
2769
Following up on @vikhyatk 's Moondream2 update and @santiagomed 's implementation on Candle, I quickly put togheter the WASM module so that you could try running the ~1.5GB quantized model in the browser. Perhaps the next step is to rewrite it using https://github.com/huggingface/ratchet and run it even faster with WebGPU, @FL33TW00D-HF .

radames/Candle-Moondream-2

ps: I have a collection of all Candle WASM demos here radames/candle-wasm-examples-650898dee13ff96230ce3e1f