mjbuehler's picture
Update README.md
cc29489 verified
metadata
base_model: stabilityai/stable-diffusion-3-medium-diffusers
library_name: diffusers
license: openrail++
tags:
  - text-to-image
  - diffusers-training
  - diffusers
  - lora
  - template:sd-lora
  - stable-diffusion-3
  - stable-diffusion-3-diffusers
instance_prompt: <leaf microstructure>
widget: []

Stable Diffusion 3 Medium Fine-tuned with Leaf Images

Model description

These are LoRA adaptation weights for stabilityai/stable-diffusion-3-medium-diffusers.

Trigger keywords

The following images were used for fine-tuning with the keyword <leaf microstructure>:

image/png

You should use <leaf microstructure> in your prompt to trigger the image generation.
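
Since the fine-tuned behavior is tied to the trigger keyword, it can help to guard against prompts that accidentally omit it. The snippet below is a minimal sketch: the helper name `with_trigger` and the choice to append the keyword at the end of the prompt are illustrative assumptions, not part of the original model card.

```python
TRIGGER = "<leaf microstructure>"  # trigger keyword from this model card

def with_trigger(prompt: str, trigger: str = TRIGGER) -> str:
    """Return the prompt unchanged if it already contains the trigger, else append it.

    Hypothetical helper, for illustration only.
    """
    if trigger in prompt:
        return prompt
    return f"{prompt.rstrip()}, {trigger}"

print(with_trigger("a cube in the shape of a <leaf microstructure>"))
print(with_trigger("a suspension bridge"))
```

Where the keyword sits in the prompt may affect the result, so treat the appended form as a starting point rather than a recommendation.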

How to use

First, define some helper functions:

from diffusers import DiffusionPipeline
import torch
import os
from datetime import datetime
from PIL import Image

def generate_filename(base_name, extension=".png"):
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    return f"{base_name}_{timestamp}{extension}"

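def save_image(image, directory, base_name="image_grid"):
    """Save a PIL image to `directory` under a timestamped file name."""
    # Create the target directory if needed so the function also works on its own
    os.makedirs(directory, exist_ok=True)
    filename = generate_filename(base_name)
    file_path = os.path.join(directory, filename)
    image.save(file_path)
    print(f"Image saved as {file_path}")

def image_grid(imgs, rows, cols, save=True, save_dir='generated_images', base_name="image_grid",
               save_individual_files=False):
    """Paste `imgs` into a rows x cols grid (row-major order) and optionally save it."""
    if len(imgs) != rows * cols:
        raise ValueError(f"Expected {rows * cols} images, got {len(imgs)}")

    if save_dir:
        os.makedirs(save_dir, exist_ok=True)

    # All tiles are assumed to have the same size as the first image
    w, h = imgs[0].size
    grid = Image.new('RGB', size=(cols * w, rows * h))

    for i, img in enumerate(imgs):
        # Column index is i % cols, row index is i // cols
        grid.paste(img, box=(i % cols * w, i // cols * h))
        if save_individual_files and save_dir:
            save_image(img, save_dir, base_name=base_name+f'_{i}-of-{len(imgs)}_')

    if save and save_dir:
        save_image(grid, save_dir, base_name)

    return grid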
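
To make the grid layout concrete, the sketch below computes only the paste positions, using the same row-major formula as `image_grid`. The helper name `grid_offsets` and the tile sizes are assumptions chosen for illustration; no images are created.

```python
def grid_offsets(n_images, cols, w, h):
    """Top-left paste position of each tile, in the same order image_grid uses."""
    return [(i % cols * w, i // cols * h) for i in range(n_images)]

# Six tiles of 10x20 pixels arranged in 2 rows and 3 columns
for i, (x, y) in enumerate(grid_offsets(6, cols=3, w=10, h=20)):
    print(i, (x, y))
```

Images fill the first row from left to right before moving to the next row, so the order of the input list determines where each image ends up.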

Text-to-image

Model loading and generation pipeline:


repo_id_load = 'lamm-mit/stable-diffusion-3-medium-leaf-inspired'

pipeline = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
)

# Load the LoRA weights on top of the base model
pipeline.load_lora_weights(repo_id_load)
pipeline = pipeline.to('cuda')

prompt = "a cube in the shape of a <leaf microstructure>"
negative_prompt = ""

num_samples = 3  # images per pipeline call (grid columns)
num_rows = 3     # number of pipeline calls (grid rows)
n_steps = 75
guidance_scale = 15
all_images = []

for _ in range(num_rows):
    # Each call returns num_samples images, which fill one row of the grid
    images = pipeline(prompt, num_inference_steps=n_steps, num_images_per_prompt=num_samples,
                      guidance_scale=guidance_scale, negative_prompt=negative_prompt).images
    all_images.extend(images)
grid = image_grid(all_images, num_rows, num_samples,  
                  save_individual_files=True, 
                  save_dir='generated_images', 
                  base_name="image_grid",
                 )
grid

image/png

Image-to-image

We start with this image generated earlier:

image/png

from diffusers import StableDiffusion3Img2ImgPipeline
from diffusers.utils import load_image

pipeline = StableDiffusion3Img2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    torch_dtype=torch.float16,
)
pipeline = pipeline.to('cuda')

init_image = load_image("https://huggingface.co/lamm-mit/stable-diffusion-3-medium-leaf-inspired/resolve/main/image_20240721_212111.png")

prompt = "Turn this image into a spider web."
negative_prompt = ""

n_steps = 20
guidance_scale = 25

image = pipeline(prompt, num_inference_steps=n_steps,
                 guidance_scale=guidance_scale,
                 negative_prompt=negative_prompt,
                 image=init_image,
                ).images[0]
save_image(image, directory='generated_images', base_name="image_grid")
image

image/png
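
The image-to-image call above does not pass `strength`, so the pipeline's default applies. In diffusers image-to-image pipelines, `strength` controls how much noise is added to the input image and therefore how many of the requested steps are actually run. The sketch below reproduces that arithmetic as I understand it from the library; the function name `effective_steps` is hypothetical, and the exact truncation rule should be treated as an assumption to check against your installed diffusers version.

```python
def effective_steps(num_inference_steps: int, strength: float) -> int:
    """Denoising steps actually executed for a given strength in [0, 1].

    Assumption: mirrors the schedule truncation used by diffusers img2img pipelines.
    """
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be between 0 and 1")
    init_timestep = min(num_inference_steps * strength, num_inference_steps)
    t_start = int(max(num_inference_steps - init_timestep, 0))  # steps skipped at the start
    return num_inference_steps - t_start

# With n_steps = 20 as in the example above
for s in (0.25, 0.5, 1.0):
    print(s, effective_steps(20, s))
```

Lower values keep the result closer to the input image and run fewer steps, while a value of 1.0 runs the full schedule and largely ignores the input.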

More examples

image/png