UniPC
Overview
UniPC is a training-free framework designed for the fast sampling of diffusion models, which consists of a corrector (UniC) and a predictor (UniP) that share a unified analytical form and support arbitrary orders.
For more details about the method, please refer to the [paper] and the [code].
Fast Sampling of Diffusion Models with Exponential Integrator.
UniPCMultistepScheduler
class diffusers.UniPCMultistepScheduler
( num_train_timesteps: int = 1000, beta_start: float = 0.0001, beta_end: float = 0.02, beta_schedule: str = 'linear', trained_betas: typing.Union[numpy.ndarray, typing.List[float], NoneType] = None, solver_order: int = 2, prediction_type: str = 'epsilon', thresholding: bool = False, dynamic_thresholding_ratio: float = 0.995, sample_max_value: float = 1.0, predict_x0: bool = True, solver_type: str = 'bh2', lower_order_final: bool = True, disable_corrector: typing.List[int] = [], solver_p: SchedulerMixin = None )
Parameters
- num_train_timesteps (int) — number of diffusion steps used to train the model.
- beta_start (float) — the starting beta value of inference.
- beta_end (float) — the final beta value.
- beta_schedule (str) — the beta schedule, a mapping from a beta range to a sequence of betas for stepping the model. Choose from linear, scaled_linear, or squaredcos_cap_v2.
- trained_betas (np.ndarray, optional) — an array of betas passed directly to the constructor, bypassing beta_start, beta_end, etc.
- solver_order (int, default 2) — the order of UniPC, also the p in UniPC-p; can be any positive integer. Note that the effective order of accuracy is solver_order + 1 due to the UniC. We recommend using solver_order=2 for guided sampling and solver_order=3 for unconditional sampling.
- prediction_type (str, default epsilon, optional) — prediction type of the scheduler function, one of epsilon (predicting the noise of the diffusion process), sample (directly predicting the clean sample x0), or v_prediction (see section 2.4 of https://imagen.research.google/video/paper.pdf).
- thresholding (bool, default False) — whether to use the “dynamic thresholding” method (introduced by Imagen, https://arxiv.org/abs/2205.11487). For pixel-space diffusion models, you can set both predict_x0=True and thresholding=True to use dynamic thresholding. Note that the thresholding method is unsuitable for latent-space diffusion models (such as stable-diffusion).
- dynamic_thresholding_ratio (float, default 0.995) — the ratio for the dynamic thresholding method. The default of 0.995 is the same as in Imagen (https://arxiv.org/abs/2205.11487).
- sample_max_value (float, default 1.0) — the threshold value for dynamic thresholding. Valid only when thresholding=True and predict_x0=True.
- predict_x0 (bool, default True) — whether to use the updating algorithm on the predicted x0. See https://arxiv.org/abs/2211.01095 for details.
- solver_type (str, default bh2) — the solver type of UniPC. We recommend using bh1 for unconditional sampling when steps < 10, and bh2 otherwise.
- lower_order_final (bool, default True) — whether to use lower-order solvers in the final steps. Only valid for < 15 inference steps. We empirically find this trick can stabilize the sampling of DPM-Solver for steps < 15, especially for steps <= 10.
- disable_corrector (list, default []) — the steps at which the corrector is disabled. For a large guidance scale, the misalignment between epsilon_theta(x_t, c) and epsilon_theta(x_t^c, c) might influence the convergence. This can be mitigated by disabling the corrector at the first few steps (e.g., disable_corrector=[0]).
- solver_p (SchedulerMixin, default None) — can be any other scheduler. If specified, the algorithm becomes solver_p + UniC.
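As a rough illustration of how beta_start, beta_end, and beta_schedule interact, the sketch below builds the linear and scaled_linear schedules in plain Python and accumulates the running product of (1 - beta). The helper names (linspace, make_betas, cumulative_alphas) are hypothetical and written only for this example; the scheduler itself operates on tensors, and squaredcos_cap_v2 is omitted here.

```python
def linspace(start, stop, num):
    """Evenly spaced values from start to stop, inclusive (plain-Python helper)."""
    if num == 1:
        return [start]
    step = (stop - start) / (num - 1)
    return [start + i * step for i in range(num)]


def make_betas(beta_start=0.0001, beta_end=0.02,
               num_train_timesteps=1000, beta_schedule="linear"):
    """Hypothetical helper: build a beta sequence for two of the schedule names."""
    if beta_schedule == "linear":
        # Betas grow linearly from beta_start to beta_end.
        return linspace(beta_start, beta_end, num_train_timesteps)
    if beta_schedule == "scaled_linear":
        # Linear in sqrt(beta), then squared; this grows more slowly at first.
        roots = linspace(beta_start ** 0.5, beta_end ** 0.5, num_train_timesteps)
        return [r * r for r in roots]
    raise NotImplementedError(f"{beta_schedule} is not covered by this sketch")


def cumulative_alphas(betas):
    """Running product of (1 - beta), the signal level kept at each train step."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


linear_betas = make_betas()
scaled_betas = make_betas(beta_schedule="scaled_linear")
alphas_cumprod = cumulative_alphas(linear_betas)
```

Both schedules share the same endpoints; they differ only in how quickly the noise level rises in between.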
UniPC is model-agnostic by design, supporting pixel-space and latent-space DPMs for both unconditional and conditional sampling. It can be applied to both noise prediction models and data prediction models. The corrector UniC can also be applied after any off-the-shelf solver to increase the order of accuracy.
For more details, see the original paper: https://arxiv.org/abs/2302.04867
Currently, we support the multistep UniPC for both noise prediction models and data prediction models. We recommend using solver_order=2 for guided sampling and solver_order=3 for unconditional sampling.
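Because the solver is multistep, it cannot run at full order until enough previous model outputs are available, and lower_order_final reduces the order again near the end. The sketch below shows one way such a per-step order could be chosen. The function order_schedule and its bookkeeping are assumptions made for illustration, not the scheduler's confirmed logic; in particular, this sketch applies the final-step rule for any step count, whereas the parameter description restricts it to runs with fewer than 15 steps.

```python
def order_schedule(num_inference_steps, solver_order=2, lower_order_final=True):
    """Hypothetical helper: effective solver order used at each inference step."""
    orders = []
    history = 0  # number of earlier model outputs available to the multistep solver
    for step_index in range(num_inference_steps):
        order = solver_order
        if lower_order_final:
            # Do not use a higher order than the number of steps that remain.
            order = min(order, num_inference_steps - step_index)
        # Warm-up: with k stored outputs, at most order k + 1 is possible.
        order = min(order, history + 1)
        orders.append(order)
        history = min(history + 1, solver_order)
    return orders


with_final = order_schedule(6, solver_order=3)
without_final = order_schedule(6, solver_order=3, lower_order_final=False)
```

Under these assumptions, a six-step run with solver_order=3 ramps up over the first steps and, with lower_order_final enabled, ramps back down over the last ones.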
We also support the “dynamic thresholding” method in Imagen (https://arxiv.org/abs/2205.11487). For pixel-space diffusion models, you can set both predict_x0=True and thresholding=True to use dynamic thresholding. Note that the thresholding method is unsuitable for latent-space diffusion models (such as stable-diffusion).
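To make the thresholding parameters more concrete, here is a minimal plain-Python sketch of the dynamic thresholding idea from Imagen: take a high quantile of the absolute predicted x0 values, clamp it, clip to that bound, and rescale. The functions quantile and dynamic_threshold are hypothetical helpers operating on a flat list; the clamping range [1, sample_max_value] is an assumption about how the two parameters combine, and the real scheduler works per sample on tensors.

```python
def quantile(values, q):
    """Linear-interpolation quantile of a non-empty list, with q in [0, 1]."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def dynamic_threshold(x0, ratio=0.995, sample_max_value=1.0):
    """Hypothetical helper: clip a predicted x0 to [-s, s] and divide by s."""
    # s is the chosen quantile of |x0| ...
    s = quantile([abs(v) for v in x0], ratio)
    # ... kept at least 1 and at most sample_max_value (assumed clamping rule).
    s = min(max(s, 1.0), sample_max_value)
    return [min(max(v, -s), s) / s for v in x0]


x0_pred = [-3.0, -0.5, 0.0, 0.5, 2.0]
clipped_default = dynamic_threshold(x0_pred)  # s is forced to 1.0
rescaled = dynamic_threshold(x0_pred, ratio=1.0, sample_max_value=2.0)  # s = 2.0
```

Under this clamping assumption, the default sample_max_value=1.0 reduces the method to plain clipping to [-1, 1]; larger values let saturated predictions be rescaled instead of only clipped.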
ConfigMixin takes care of storing all config attributes that are passed in the scheduler’s __init__ function, such as num_train_timesteps. They can be accessed via scheduler.config.num_train_timesteps.
SchedulerMixin provides general loading and saving functionality via the SchedulerMixin.save_pretrained() and from_pretrained() functions.
convert_model_output
( model_output: FloatTensor, timestep: int, sample: FloatTensor ) → torch.FloatTensor
Parameters
- model_output (torch.FloatTensor) — direct output from the learned diffusion model.
- timestep (int) — current discrete timestep in the diffusion chain.
- sample (torch.FloatTensor) — current instance of the sample being created by the diffusion process.
Returns
torch.FloatTensor: the converted model output.
Convert the model output to the corresponding type that the UniPC algorithm needs.
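The conversion can be sketched for the data-prediction case (predict_x0=True) using the standard relation x_t = alpha_t * x0 + sigma_t * epsilon, with alpha_t = sqrt(alphas_cumprod[t]) and sigma_t = sqrt(1 - alphas_cumprod[t]). The function to_x0 below is a hypothetical scalar helper, not the scheduler's method; it only shows how each prediction_type maps to an x0 estimate under that parameterization.

```python
import math


def to_x0(model_output, sample, alpha_cumprod_t, prediction_type="epsilon"):
    """Hypothetical helper: turn a model output into an x0 estimate (scalars)."""
    alpha_t = math.sqrt(alpha_cumprod_t)
    sigma_t = math.sqrt(1.0 - alpha_cumprod_t)
    if prediction_type == "epsilon":
        # Solve x_t = alpha_t * x0 + sigma_t * eps for x0.
        return (sample - sigma_t * model_output) / alpha_t
    if prediction_type == "sample":
        # The model already predicts x0.
        return model_output
    if prediction_type == "v_prediction":
        # v = alpha_t * eps - sigma_t * x0, hence x0 = alpha_t * x_t - sigma_t * v.
        return alpha_t * sample - sigma_t * model_output
    raise ValueError(f"unknown prediction_type: {prediction_type}")


# Round trip with made-up numbers: build x_t and v from a known x0 and eps.
x0_true, eps, acp = 0.3, -1.2, 0.64
alpha, sigma = math.sqrt(acp), math.sqrt(1.0 - acp)
x_t = alpha * x0_true + sigma * eps
v = alpha * eps - sigma * x0_true
x0_from_eps = to_x0(eps, x_t, acp, "epsilon")
x0_from_v = to_x0(v, x_t, acp, "v_prediction")
```

Both recovered values agree with x0_true up to floating-point error, which is a quick sanity check that the three branches describe the same quantity.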
multistep_uni_c_bh_update
( this_model_output: FloatTensor, this_timestep: int, last_sample: FloatTensor, this_sample: FloatTensor, order: int ) → torch.FloatTensor
Parameters
- this_model_output (torch.FloatTensor) — the model outputs at x_t.
- this_timestep (int) — the current timestep t.
- last_sample (torch.FloatTensor) — the generated sample before the last predictor: x_{t-1}.
- this_sample (torch.FloatTensor) — the generated sample after the last predictor: x_{t}.
- order (int) — the p of UniC-p at this step. Note that the effective order of accuracy should be order + 1.
Returns
torch.FloatTensor: the corrected sample tensor at the current timestep.
One step for the UniC (B(h) version).
multistep_uni_p_bh_update
( model_output: FloatTensor, prev_timestep: int, sample: FloatTensor, order: int ) → torch.FloatTensor
Parameters
- model_output (torch.FloatTensor) — direct outputs from the learned diffusion model at the current timestep.
- prev_timestep (int) — previous discrete timestep in the diffusion chain.
- sample (torch.FloatTensor) — current instance of the sample being created by the diffusion process.
- order (int) — the order of UniP at this step, also the p in UniPC-p.
Returns
torch.FloatTensor: the sample tensor at the previous timestep.
One step for the UniP (B(h) version). Alternatively, self.solver_p is used if it is specified.
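The interplay between a predictor and a corrector can be illustrated on a toy ODE. The sketch below is a generic predict-evaluate-correct scheme (explicit Euler as predictor, trapezoidal rule as corrector), not the B(h) update formulas used by UniP and UniC. It is meant only to show the structural idea: the function value computed at the predicted point feeds the corrector and is then reused for the next predictor step, so the corrector adds no extra function evaluations. The name pc_solve and the test problem are assumptions for this example.

```python
import math


def pc_solve(f, y0, t0, t1, num_steps, use_corrector=True):
    """Toy predictor-corrector integrator for dy/dt = f(t, y).

    Returns the final value and the number of calls made to f.
    """
    h = (t1 - t0) / num_steps
    t, y = t0, y0
    slope = f(t, y)  # the only evaluation made outside the loop
    evals = 1
    for _ in range(num_steps):
        y_pred = y + h * slope         # predictor: explicit Euler
        slope_next = f(t + h, y_pred)  # single new evaluation for this step
        evals += 1
        if use_corrector:
            # Corrector: trapezoidal rule using the slope at the predicted point.
            y = y + 0.5 * h * (slope + slope_next)
        else:
            y = y_pred
        # Reuse the slope from the predicted point for the next step, even
        # though y may have been corrected since it was computed.
        slope = slope_next
        t += h
    return y, evals


def decay(t, y):
    return -y  # exact solution is y(t) = exp(-t)


y_pc, evals_pc = pc_solve(decay, 1.0, 0.0, 1.0, 10, use_corrector=True)
y_p, evals_p = pc_solve(decay, 1.0, 0.0, 1.0, 10, use_corrector=False)
err_pc = abs(y_pc - math.exp(-1.0))
err_p = abs(y_p - math.exp(-1.0))
```

Both runs call f the same number of times, yet the corrected run ends closer to exp(-1). The reused slope is evaluated at the predicted rather than the corrected state, which is the kind of misalignment the disable_corrector option is described as mitigating.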
scale_model_input
( sample: FloatTensor, *args, **kwargs ) → torch.FloatTensor
Ensures interchangeability with schedulers that need to scale the denoising model input depending on the current timestep.
set_timesteps
( num_inference_steps: int, device: typing.Union[str, torch.device] = None )
Sets the timesteps used for the diffusion chain. Supporting function to be run before inference.
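For intuition, the sketch below shows one plausible way to pick num_inference_steps discrete timesteps out of num_train_timesteps: space them evenly over the training range, round, and visit them from noisiest to cleanest. The function make_timesteps is hypothetical, and the exact spacing rule used by a given library version may differ, so treat the resulting values as illustrative only.

```python
def make_timesteps(num_inference_steps, num_train_timesteps=1000):
    """Hypothetical helper: evenly spaced, descending inference timesteps."""
    last = num_train_timesteps - 1
    # num_inference_steps + 1 evenly spaced grid points from 0 to last, rounded.
    grid = [round(i * last / num_inference_steps)
            for i in range(num_inference_steps + 1)]
    # Reverse so sampling starts at the noisiest step, then drop timestep 0.
    return grid[::-1][:-1]


timesteps = make_timesteps(10)
```

With these assumptions, a 10-step run over 1000 training steps visits timesteps starting at 999 and ending at 100.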
step
( model_output: FloatTensor, timestep: int, sample: FloatTensor, return_dict: bool = True ) → scheduling_utils.SchedulerOutput or tuple
Parameters
- model_output (torch.FloatTensor) — direct output from the learned diffusion model.
- timestep (int) — current discrete timestep in the diffusion chain.
- sample (torch.FloatTensor) — current instance of the sample being created by the diffusion process.
- return_dict (bool) — whether to return a SchedulerOutput instead of a plain tuple.
Returns
scheduling_utils.SchedulerOutput or tuple: a scheduling_utils.SchedulerOutput if return_dict is True, otherwise a tuple. When returning a tuple, the first element is the sample tensor.
Step function propagating the sample with the multistep UniPC.
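The methods documented here are typically called in a fixed order: set_timesteps once, then scale_model_input and step for every timestep. The sketch below captures that calling pattern as a function that accepts any object exposing these methods. StubScheduler and the lambda model are hypothetical stand-ins used only so the example runs without extra dependencies; they do not implement UniPC.

```python
from types import SimpleNamespace


def run_sampling_loop(scheduler, model, sample, num_inference_steps):
    """Drive a scheduler through one sampling run and return the final sample."""
    scheduler.set_timesteps(num_inference_steps)
    for t in scheduler.timesteps:
        # A no-op for schedulers that do not rescale their inputs.
        model_input = scheduler.scale_model_input(sample, t)
        model_output = model(model_input, t)
        # With return_dict=True the result exposes the updated sample as prev_sample.
        sample = scheduler.step(model_output, t, sample, return_dict=True).prev_sample
    return sample


class StubScheduler:
    """Hypothetical stand-in with the same method names; not a real solver."""

    def set_timesteps(self, num_inference_steps):
        self.timesteps = list(range(num_inference_steps - 1, -1, -1))

    def scale_model_input(self, sample, *args, **kwargs):
        return sample

    def step(self, model_output, timestep, sample, return_dict=True):
        prev_sample = sample - model_output  # toy update rule for the demo
        if not return_dict:
            return (prev_sample,)
        return SimpleNamespace(prev_sample=prev_sample)


# The toy "model" removes half of the current value at every step: 8 -> 4 -> 2 -> 1.
final_sample = run_sampling_loop(StubScheduler(), lambda x, t: 0.5 * x, 8.0, 3)
```

In real use, the stub would be replaced by a configured scheduler instance and the lambda by the denoising network; the loop body stays the same.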