Diffusers documentation

Overview

You are viewing version v0.20.0. A newer version, v0.32.1, is available.

Generating high-quality outputs is computationally intensive, especially during the iterative steps that take you from a noisy output to a less noisy one. One of 🧨 Diffusers' goals is to make this technology widely accessible, which includes enabling fast inference on consumer and specialized hardware.

This section covers tips and tricks, such as half-precision weights and sliced attention, for optimizing inference speed and reducing memory consumption. You can also learn how to speed up your PyTorch code with torch.compile or ONNX Runtime, and how to enable memory-efficient attention with xFormers. There are also guides for running inference on specific hardware, such as Apple Silicon and Intel or Habana processors.
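To give a rough sense of why half-precision weights and sliced attention reduce memory use, here is a minimal back-of-the-envelope sketch in plain Python. It does not call Diffusers. The parameter count, attention shape, and helper names (`weight_bytes`, `attention_score_bytes`) are hypothetical and chosen only for illustration, and the model is deliberately simplified: it counts only the weights and the attention score matrices, ignoring activations and framework overhead.

```python
import struct

# Bytes per element, taken from the standard struct format codes:
# "f" is a 32-bit float, "e" is a 16-bit (half-precision) float.
BYTES_FP32 = struct.calcsize("f")
BYTES_FP16 = struct.calcsize("e")


def weight_bytes(num_params, bytes_per_param):
    """Memory needed to hold the model weights alone."""
    return num_params * bytes_per_param


def attention_score_bytes(batch_heads, seq_len, bytes_per_elem, slice_size=None):
    """Memory of the attention score matrices that are live at one time.

    Unsliced attention materializes one (seq_len x seq_len) matrix per
    batch*head entry at once; slicing processes `slice_size` entries at a time.
    """
    live = batch_heads if slice_size is None else min(slice_size, batch_heads)
    return live * seq_len * seq_len * bytes_per_elem


# Hypothetical model size and attention shape (assumptions, not measurements).
NUM_PARAMS = 1_000_000_000
BATCH_HEADS = 16
SEQ_LEN = 4096

fp32_weights = weight_bytes(NUM_PARAMS, BYTES_FP32)
fp16_weights = weight_bytes(NUM_PARAMS, BYTES_FP16)
full_scores = attention_score_bytes(BATCH_HEADS, SEQ_LEN, BYTES_FP16)
sliced_scores = attention_score_bytes(BATCH_HEADS, SEQ_LEN, BYTES_FP16, slice_size=1)

GIB = 1024 ** 3
print(f"weights  fp32: {fp32_weights / GIB:.2f} GiB  fp16: {fp16_weights / GIB:.2f} GiB")
print(f"scores   full: {full_scores / GIB:.2f} GiB  sliced: {sliced_scores / GIB:.2f} GiB")
```

In this simplified picture, halving the bytes per parameter halves the weight memory, and slicing shrinks the peak size of the attention scores in proportion to the slice size, typically at some cost in speed. In Diffusers itself, these ideas correspond to loading a pipeline with `torch_dtype=torch.float16` and calling `enable_attention_slicing()` on it; actual savings depend on the model and hardware.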