How to load the model?

#7
by schuler - opened

Hello,
How should the model 'AnimateLCM-SVD-xt-1.1.safetensors' be loaded?

Kind regards,
JP.

After googling a bit, I found at https://github.com/G-U-N/AnimateLCM how to do text-to-video (but not image-to-video):

import torch
from diffusers import AnimateDiffPipeline, LCMScheduler, MotionAdapter
from diffusers.utils import export_to_gif

adapter = MotionAdapter.from_pretrained("wangfuyun/AnimateLCM", torch_dtype=torch.float16)
pipe = AnimateDiffPipeline.from_pretrained("emilianJR/epiCRealism", motion_adapter=adapter, torch_dtype=torch.float16)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config, beta_schedule="linear")

pipe.load_lora_weights("wangfuyun/AnimateLCM", weight_name="AnimateLCM_sd15_t2v_lora.safetensors", adapter_name="lcm-lora")
pipe.set_adapters(["lcm-lora"], [0.8])

pipe.enable_vae_slicing()
pipe.enable_model_cpu_offload()

output = pipe(
    prompt="A space rocket with trails of smoke behind it launching into space from the desert, 4k, high resolution",
    negative_prompt="bad quality, worse quality, low resolution",
    num_frames=16,
    guidance_scale=2.0,
    num_inference_steps=6,
    generator=torch.Generator("cpu").manual_seed(0),
)
frames = output.frames[0]
export_to_gif(frames, "animatelcm.gif")

I got an error related to "peft". To solve it, I took inspiration from https://huggingface.co/wangfuyun/AnimateLCM/discussions/6, ran the following, and restarted:

!pip install diffusers accelerate peft
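To confirm that the install actually took effect after the restart, a small check like the one below can help. It is only a sketch using the standard library; the helper name `installed_versions` is made up for illustration, and it reports whatever versions happen to be installed rather than any required minimum.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    # The three packages installed by the pip command above.
    for name, ver in installed_versions(["diffusers", "accelerate", "peft"]).items():
        print(f"{name}: {ver if ver is not None else 'NOT INSTALLED'}")
```

If "peft" still shows as not installed, the pip command probably ran in a different environment from the one the notebook kernel uses.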

Hi, @schuler. Thanks for your interest! Here is a Space demo built with Gradio. You can see how the model is loaded there, or simply clone the Space. See https://huggingface.co/spaces/wangfuyun/AnimateLCM-SVD/tree/main.

@wangfuyun ,
Thank you so much! I can see some tricks at https://huggingface.co/spaces/wangfuyun/AnimateLCM-SVD/blob/main/app.py :

from pipeline import StableVideoDiffusionPipeline
...
noise_scheduler = AnimateLCMSVDStochasticIterativeScheduler(
    num_train_timesteps=40,
    sigma_min=0.002,
    sigma_max=700.0,
    sigma_data=1.0,
    s_noise=1.0,
    rho=7,
    clip_denoised=False,
)
pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    scheduler=noise_scheduler,
    torch_dtype=torch.float16,
    variant="fp16",
)
pipe.to("cuda")
pipe.enable_model_cpu_offload()  # for smaller cost
model_select("AnimateLCM-SVD-xt-1.1.safetensors")
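The last line calls model_select, a helper defined in the Space's app.py and not reproduced here. As a rough sketch of what such a helper might do: it assumes the .safetensors file holds a plain UNet state dict whose keys match the SVD UNet, which is not verified in this thread. The function name `load_unet_weights` and the local path are hypothetical.

```python
def load_unet_weights(pipe, checkpoint_path):
    """Overwrite pipe.unet's weights with those stored in a .safetensors file.

    Assumption: the file contains a UNet state dict compatible with pipe.unet.
    """
    # Imported here so the function can be defined even before the
    # safetensors package is installed.
    from safetensors.torch import load_file

    # Read all tensors onto the CPU first; the pipeline decides device placement.
    state_dict = load_file(checkpoint_path, device="cpu")

    # strict=True raises if any key is missing or unexpected, which makes a
    # mismatched checkpoint fail loudly instead of loading partially.
    pipe.unet.load_state_dict(state_dict, strict=True)
    return pipe
```

With the pipeline built as above, the call would look like `load_unet_weights(pipe, "AnimateLCM-SVD-xt-1.1.safetensors")`, where the path points at a locally downloaded copy of the file.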

It now makes a lot more sense.

Thank you again.

Kind regards,
JP.

I tried to use the code above and got the error below:
File "D:\AI\diffusers\venv\lib\site-packages\diffusers\pipelines\stable_video_diffusion\pipeline_stable_video_diffusion.py", line 573, in __call__
latent_model_input = torch.cat([latent_model_input, image_latents], dim=2)
RuntimeError: Sizes of tensors must match except in dimension 2. Expected size 2 but got size 1 for tensor number 1 in the list.
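For anyone reading the message: torch.cat requires every dimension other than the concatenation one to agree across its inputs. Below is a minimal pure-Python sketch of that rule. The shapes are hypothetical, chosen only for illustration; the traceback does not say which dimension disagrees, and nothing here is taken from the actual pipeline run.

```python
def cat_shape(shapes, dim):
    """Return the shape a concatenation along `dim` would produce.

    Raises ValueError if any other dimension differs between the inputs.
    """
    first = shapes[0]
    result = list(first)
    for index, shape in enumerate(shapes[1:], start=1):
        if len(shape) != len(first):
            raise ValueError(
                f"tensor number {index} has {len(shape)} dims, expected {len(first)}"
            )
        for axis, (expected, got) in enumerate(zip(first, shape)):
            if axis != dim and expected != got:
                raise ValueError(
                    f"axis {axis}: expected size {expected} but got size {got} "
                    f"for tensor number {index}"
                )
        result[dim] += shape[dim]
    return tuple(result)


# Hypothetical shapes laid out as (batch, frames, channels, height, width).
latents = (2, 25, 4, 72, 128)
image_latents_ok = (2, 25, 4, 72, 128)
image_latents_bad = (1, 25, 4, 72, 128)  # leading dimension is 1 instead of 2

print(cat_shape([latents, image_latents_ok], dim=2))
try:
    cat_shape([latents, image_latents_bad], dim=2)
except ValueError as error:
    print("mismatch:", error)
```

Printing the .shape of latent_model_input and image_latents just before the failing line would show which dimension actually differs in the real run.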
