Diffusers documentation

DPM Stochastic Scheduler inspired by Karras et al. paper

You are viewing v0.18.2 version. A newer version v0.32.1 is available.

Overview

Inspired by the stochastic sampler from Karras et al. The scheduler was ported from @crowsonkb’s k-diffusion library: https://github.com/crowsonkb/k-diffusion

All credit for making this scheduler work goes to Katherine Crowson.

DPMSolverSDEScheduler

class diffusers.DPMSolverSDEScheduler

( num_train_timesteps: int = 1000 beta_start: float = 0.00085 beta_end: float = 0.012 beta_schedule: str = 'linear' trained_betas: typing.Union[numpy.ndarray, typing.List[float], NoneType] = None prediction_type: str = 'epsilon' use_karras_sigmas: typing.Optional[bool] = False noise_sampler_seed: typing.Optional[int] = None timestep_spacing: str = 'linspace' steps_offset: int = 0 )

Parameters

  • num_train_timesteps (int) — number of diffusion steps used to train the model.
  • beta_start (float) — the starting beta value of inference.
  • beta_end (float) — the final beta value.
  • beta_schedule (str) — the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from linear or scaled_linear.
  • trained_betas (np.ndarray, optional) — option to pass an array of betas directly to the constructor to bypass beta_start, beta_end etc.
  • prediction_type (str, optional, defaults to epsilon) — prediction type of the scheduler function, one of epsilon (predicting the noise of the diffusion process), sample (directly predicting the noisy sample) or v_prediction (see section 2.4 of https://imagen.research.google/video/paper.pdf).
  • use_karras_sigmas (bool, optional, defaults to False) — Whether to use Karras sigmas (the Karras et al. (2022) scheme) for the step sizes of the noise schedule during sampling. If True, the sigmas are determined from the sequence of noise levels {σi} defined in Equation (5) of the paper https://arxiv.org/pdf/2206.00364.pdf.
  • noise_sampler_seed (int, optional, defaults to None) — The random seed to use for the noise sampler. If None, a random seed will be generated.
  • timestep_spacing (str, default "linspace") — The way the timesteps should be scaled. Refer to Table 2 of Common Diffusion Noise Schedules and Sample Steps are Flawed for more information.
  • steps_offset (int, default 0) — an offset added to the inference steps. You can use a combination of offset=1 and set_alpha_to_one=False to make the last step use step 0 for the previous alpha product, as done in Stable Diffusion.
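
The use_karras_sigmas option refers to Equation (5) of Karras et al. (2022). The following is a minimal sketch of that formula in plain Python, not the library's own code. The function name karras_sigmas, the example values for sigma_min and sigma_max, and the choice rho=7 (the value suggested in the paper) are assumptions made for illustration.

```python
def karras_sigmas(sigma_min, sigma_max, num_steps, rho=7.0):
    """Noise levels following Eq. (5) of Karras et al. (2022), largest first.

    Hypothetical helper for illustration; not part of the diffusers API.
    """
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    min_inv_rho = sigma_min ** (1.0 / rho)
    max_inv_rho = sigma_max ** (1.0 / rho)
    sigmas = []
    for i in range(num_steps):
        ramp = i / (num_steps - 1)  # runs from 0.0 to 1.0
        # Interpolate linearly in sigma^(1/rho) space, then map back.
        sigmas.append((max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho)
    return sigmas


# Illustrative noise range; a real scheduler derives it from its beta schedule.
sigmas = karras_sigmas(sigma_min=0.03, sigma_max=14.6, num_steps=10)
```

With rho greater than 1, the steps near sigma_min are shorter than the steps near sigma_max, so more of the sampling budget is spent at low noise levels.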

Implements Stochastic Sampler (Algorithm 2) from Karras et al. (2022). Based on the original k-diffusion implementation by Katherine Crowson: https://github.com/crowsonkb/k-diffusion/blob/41b4cb6df0506694a7776af31349acf082bf6091/k_diffusion/sampling.py#L543

ConfigMixin takes care of storing all config attributes that are passed in the scheduler’s __init__ function, such as num_train_timesteps. They can be accessed via scheduler.config.num_train_timesteps. SchedulerMixin provides general loading and saving functionality via the SchedulerMixin.save_pretrained() and from_pretrained() functions.

scale_model_input

( sample: FloatTensor timestep: typing.Union[float, torch.FloatTensor] ) torch.FloatTensor

Parameters

  • sample (torch.FloatTensor) — input sample.
  • timestep (float or torch.FloatTensor) — current timestep.

Ensures interchangeability with schedulers that need to scale the denoising model input depending on the current timestep.

Returns

torch.FloatTensor

scaled input sample
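
As a rough illustration of what scaling the model input can look like, the sketch below assumes the convention common to sigma-parameterized (k-diffusion style) samplers, where the sample is divided by sqrt(sigma^2 + 1) for the noise level sigma of the current timestep. The function name scale_by_sigma is hypothetical, plain Python lists stand in for tensors, and this page does not confirm the exact formula used by this scheduler.

```python
import math


def scale_by_sigma(sample, sigma):
    """Divide every element of `sample` by sqrt(sigma^2 + 1).

    Assumed convention for sigma-parameterized samplers; illustration only.
    """
    factor = math.sqrt(sigma ** 2 + 1.0)
    return [x / factor for x in sample]


# At sigma = 0 the input is unchanged; larger sigma shrinks the input more.
unchanged = scale_by_sigma([5.0], sigma=0.0)
scaled = scale_by_sigma([2.5, -5.0], sigma=0.75)
```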

set_timesteps

( num_inference_steps: int device: typing.Union[str, torch.device] = None num_train_timesteps: typing.Optional[int] = None )

Parameters

  • num_inference_steps (int) — the number of diffusion steps used when generating samples with a pre-trained model.
  • device (str or torch.device, optional) — the device to which the timesteps should be moved. If None, the timesteps are not moved.

Sets the timesteps used for the diffusion chain. Supporting function to be run before inference.
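
To make the idea of choosing inference timesteps concrete, here is a minimal plain-Python sketch. It assumes "linspace" spacing over the training range and the usual relation sigma = sqrt((1 - alpha_cumprod) / alpha_cumprod) between the beta schedule and the noise levels. The helper names are hypothetical, and any further processing the scheduler applies to its timesteps is not shown.

```python
import math


def training_sigmas(num_train_timesteps=1000, beta_start=0.00085,
                    beta_end=0.012, beta_schedule="linear"):
    """Noise level for each training timestep (assumed relation, see lead-in)."""
    n = num_train_timesteps
    if beta_schedule == "linear":
        betas = [beta_start + (beta_end - beta_start) * i / (n - 1) for i in range(n)]
    elif beta_schedule == "scaled_linear":
        lo, hi = math.sqrt(beta_start), math.sqrt(beta_end)
        betas = [(lo + (hi - lo) * i / (n - 1)) ** 2 for i in range(n)]
    else:
        raise ValueError(f"unsupported beta_schedule: {beta_schedule}")
    sigmas, alpha_cumprod = [], 1.0
    for beta in betas:
        alpha_cumprod *= 1.0 - beta
        sigmas.append(math.sqrt((1.0 - alpha_cumprod) / alpha_cumprod))
    return sigmas


def linspace_timesteps(num_inference_steps, num_train_timesteps=1000):
    """Evenly spaced timesteps from the last training step down to 0."""
    if num_inference_steps < 2:
        raise ValueError("num_inference_steps must be at least 2")
    step = (num_train_timesteps - 1) / (num_inference_steps - 1)
    return [step * i for i in reversed(range(num_inference_steps))]


def sigma_at(sigmas, t):
    """Linearly interpolate the training sigmas at a fractional timestep t."""
    lo = int(math.floor(t))
    hi = min(lo + 1, len(sigmas) - 1)
    frac = t - lo
    return (1.0 - frac) * sigmas[lo] + frac * sigmas[hi]


train_sigmas = training_sigmas()
timesteps = linspace_timesteps(num_inference_steps=4)
inference_sigmas = [sigma_at(train_sigmas, t) for t in timesteps]
```

The timesteps run from high noise to low noise, so the interpolated sigmas decrease along the inference schedule.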

step

( model_output: typing.Union[torch.FloatTensor, numpy.ndarray] timestep: typing.Union[float, torch.FloatTensor] sample: typing.Union[torch.FloatTensor, numpy.ndarray] return_dict: bool = True s_noise: float = 1.0 ) SchedulerOutput or tuple

Parameters

  • model_output (Union[torch.FloatTensor, np.ndarray]) — Direct output from learned diffusion model.
  • timestep (Union[float, torch.FloatTensor]) — Current discrete timestep in the diffusion chain.
  • sample (Union[torch.FloatTensor, np.ndarray]) — Current instance of sample being created by diffusion process.
  • return_dict (bool, optional) — Whether to return a SchedulerOutput rather than a tuple. Defaults to True.
  • s_noise (float, optional) — Scaling factor for the noise added to the sample. Defaults to 1.0.

Predicts the sample at the previous timestep by reversing the SDE. This is the core function that propagates the diffusion process from the learned model outputs (most often the predicted noise).

Returns

SchedulerOutput or tuple

SchedulerOutput if return_dict is True, otherwise a tuple. When returning a tuple, the first element is the sample tensor.
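
A step of this kind typically begins by turning the model output into an estimate of the clean sample. The sketch below shows that conversion for the epsilon and v_prediction types in the sigma parameterization, where a noisy sample is x0 + sigma * noise. It is an illustration under that assumption, using plain Python lists instead of tensors and a hypothetical function name. It does not cover the SDE solver update itself or the noise scaled by s_noise.

```python
import math


def predict_original_sample(sample, model_output, sigma, prediction_type="epsilon"):
    """Estimate the clean sample x0 from the model output (illustrative helper)."""
    if prediction_type == "epsilon":
        # The model predicts the noise: x0 = x - sigma * eps.
        return [x - sigma * eps for x, eps in zip(sample, model_output)]
    if prediction_type == "v_prediction":
        # The model predicts v; assumes the model saw x / sqrt(sigma^2 + 1).
        scale = sigma ** 2 + 1.0
        return [x / scale - v * sigma / math.sqrt(scale)
                for x, v in zip(sample, model_output)]
    raise NotImplementedError(f"prediction_type not handled here: {prediction_type}")


# Build a noisy sample from a known clean sample and known noise.
x0 = [0.5, -1.0]
noise = [0.2, -0.3]
sigma = 2.0
noisy = [a + sigma * e for a, e in zip(x0, noise)]

from_eps = predict_original_sample(noisy, noise, sigma, "epsilon")

# The matching v target is alpha * noise - sigma_vp * x0, with
# alpha = 1 / sqrt(sigma^2 + 1) and sigma_vp = sigma * alpha.
alpha = 1.0 / math.sqrt(sigma ** 2 + 1.0)
v = [alpha * e - sigma * alpha * a for a, e in zip(x0, noise)]
from_v = predict_original_sample(noisy, v, sigma, "v_prediction")
```

Both calls recover the original x0 up to floating-point error, which is a quick way to check that the two parameterizations are consistent with each other.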