Softedge ControlNet
EcomXL contains a series of text-to-image diffusion models optimized for e-commerce scenarios, developed based on Stable Diffusion XL.
The ControlNet weights are fine-tuned from stable-diffusion-xl-base-1.0.
The model works well with SDXL as well as with community models based on SDXL.
It is trained on both general data and Taobao e-commerce data, and performs well in both general and e-commerce scenarios.
Examples
These examples were generated with AUTOMATIC1111/stable-diffusion-webui.
Usage with Diffusers
from diffusers import (
    ControlNetModel,
    StableDiffusionXLControlNetPipeline,
    DPMSolverMultistepScheduler,
    AutoencoderKL,
)
from diffusers.utils import load_image
from controlnet_aux import PidiNetDetector, HEDdetector
import torch
from PIL import Image
# Load the softedge ControlNet and an fp16-safe SDXL VAE.
controlnet = ControlNetModel.from_pretrained(
    "alimama-creative/EcomXL_controlnet_softedge",
    torch_dtype=torch.float16,
    use_safetensors=True,
)
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16)
pipe = StableDiffusionXLControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    controlnet=controlnet,
    vae=vae,
    torch_dtype=torch.float16,
)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
# pipe.enable_xformers_memory_efficient_attention()
pipe.to(device="cuda", dtype=torch.float16)
pipe.enable_vae_slicing()
image = load_image(
    "https://huggingface.co/alimama-creative/EcomXL_controlnet_softedge/resolve/main/images/1_1.png"
)
edge_processor = PidiNetDetector.from_pretrained('lllyasviel/Annotators')
edge_image = edge_processor(image, safe=False) # set True to use pidisafe
prompt = "a bottle on the Twilight Grassland, Sitting on the ground, a couple of tall grass sitting in a field of tall grass, sunset,"
negative_prompt = "low quality, bad quality, sketches"
output = pipe(
    prompt,
    negative_prompt=negative_prompt,
    image=edge_image,
    num_inference_steps=25,
    controlnet_conditioning_scale=0.6,
    guidance_scale=7,
    width=1024,
    height=1024,
).images[0]
output.save("test_edge.png")
The model performs well when the ControlNet weight (controlnet_conditioning_scale) is in the range 0.6 to 0.8.
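To find a good value within the recommended range, it can help to render the same prompt at several weights and compare the results. The following is a minimal sketch, not part of the original model card: `sweep_conditioning_scale` is a hypothetical helper name, and it assumes `pipe` behaves like the pipeline built above (callable, accepting `image` and `controlnet_conditioning_scale`, returning an object with an `images` list).

```python
def sweep_conditioning_scale(pipe, prompt, edge_image, scales=(0.6, 0.7, 0.8), **kwargs):
    """Run `pipe` once per ControlNet weight and return {scale: image}.

    Hypothetical helper: extra keyword arguments (negative_prompt,
    num_inference_steps, ...) are forwarded to the pipeline unchanged.
    """
    results = {}
    for scale in scales:
        results[scale] = pipe(
            prompt,
            image=edge_image,
            controlnet_conditioning_scale=scale,
            **kwargs,
        ).images[0]
    return results
```

With the objects from the snippet above, a call such as `sweep_conditioning_scale(pipe, prompt, edge_image, negative_prompt=negative_prompt, num_inference_steps=25)` would return one image per weight, which can then be saved and compared side by side.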
Training details
Mixed precision: FP16
Learning rate: 1e-5
Batch size: 1024
Noise offset: 0.05
The model is trained for 37k steps.
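The noise offset setting can be illustrated with the commonly used formulation, in which each (sample, channel) pair receives one extra Gaussian shift that is broadcast over the spatial dimensions. This is a sketch under that assumption: the model card does not state how the offset was implemented, NumPy stands in for the training framework's tensors, and `offset_noise` is a hypothetical name.

```python
import numpy as np

def offset_noise(shape, noise_offset=0.05, rng=None):
    """Sample Gaussian noise of `shape` (B, C, H, W) with a per-channel offset.

    Assumed formulation: noise + noise_offset * N(0, 1) drawn once per
    (sample, channel) and broadcast across height and width.
    """
    rng = np.random.default_rng() if rng is None else rng
    batch, channels = shape[0], shape[1]
    noise = rng.standard_normal(shape)
    # One extra sample per (sample, channel), shared by all pixels in that channel.
    shift = rng.standard_normal((batch, channels, 1, 1))
    return noise + noise_offset * shift
```

Setting `noise_offset=0.0` recovers plain Gaussian noise, so the parameter only controls how strongly each channel's mean is perturbed.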
The training data includes 12M images from LAION-2B and internal sources with an aesthetic score of 6 or higher, as well as 3M Taobao e-commerce images. During training, the softedge preprocessor is randomly selected from pidinet, hed, pidisafe and hedsafe, all of which are officially supported by AUTOMATIC1111 and Mikubill.
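The random choice of preprocessor can be sketched as follows. This is an illustration rather than the training code: uniform sampling is an assumption (the text only says "randomly selected"), `random_softedge` and `SOFTEDGE_VARIANTS` are hypothetical names, and treating hedsafe as the HED detector called with `safe=True` is assumed by analogy with the pidisafe comment in the usage snippet.

```python
import random

# Variant name -> (detector family, value passed as the `safe` flag).
SOFTEDGE_VARIANTS = {
    "pidinet": ("pidinet", False),
    "pidisafe": ("pidinet", True),
    "hed": ("hed", False),
    "hedsafe": ("hed", True),
}

def random_softedge(image, detectors, rng=random):
    """Apply one randomly chosen softedge variant to `image`.

    `detectors` maps "pidinet" and "hed" to callables that accept
    (image, safe=...), for example PidiNetDetector and HEDdetector
    instances from controlnet_aux. Returns (variant_name, edge_map).
    """
    name = rng.choice(sorted(SOFTEDGE_VARIANTS))
    family, safe = SOFTEDGE_VARIANTS[name]
    return name, detectors[family](image, safe=safe)
```

Because the model saw all four variants during training, any of them should be a reasonable choice at inference time.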