---
license: creativeml-openrail-m
language:
- en
library_name: diffusers
pipeline_tag: text-to-image
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
---

# ddpo-alignment

This model was finetuned from [Stable Diffusion v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5) using [DDPO](https://arxiv.org/abs/2305.13301) and a reward function that uses [LLaVA](https://llava-vl.github.io/) to measure prompt-image alignment. See [the project website](https://rl-diffusion.github.io/) for more details.

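Since the card lists `diffusers` as its library, loading the checkpoint might look like the sketch below. It assumes the weights are published in the standard diffusers pipeline layout, which this card does not confirm. The helper names `dtype_name_for`, `load_pipeline`, and `generate` are hypothetical, as are the half-precision default and the fixed seed.

```python
def dtype_name_for(device: str) -> str:
    """Pick half precision on GPU and full precision elsewhere (a common default, not a requirement)."""
    return "float16" if device.startswith("cuda") else "float32"


def load_pipeline(model_id: str, device: str = "cuda"):
    """Load a Stable Diffusion checkpoint with diffusers and move it to `device`.

    Assumes `model_id` points at a repository in the standard diffusers layout.
    """
    # Imported lazily so the helpers can be defined without torch/diffusers installed.
    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=getattr(torch, dtype_name_for(device))
    )
    return pipe.to(device)


def generate(pipe, prompt: str, seed: int = 0):
    """Generate one image for `prompt`, seeding the sampler for reproducibility."""
    import torch

    generator = torch.Generator(device="cpu").manual_seed(seed)
    return pipe(prompt, generator=generator).images[0]
```

A typical call would be `pipe = load_pipeline("<namespace>/ddpo-alignment")` followed by `generate(pipe, "a dog riding a bike").save("out.png")`. The repository id here is a placeholder and should be replaced with the actual one.
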
The model was finetuned for 120 iterations with a batch size of 256 samples per iteration. During finetuning, we used prompts of the form: "_a(n) \<animal\> \<activity\>_". We selected the animal and activity from the following lists, so try those for the best results. We also observed some generalization to other prompts, but it was limited.

Activities:
- washing dishes
- playing chess
- riding a bike

Animals:
- cat
- dog
- horse
- monkey
- rabbit
- zebra
- spider
- bird
- sheep
- deer
- cow
- goat
- lion
- tiger
- bear
- raccoon
- fox
- wolf
- lizard
- beetle
- ant
- butterfly
- fish
- shark
- whale
- dolphin
- squirrel
- mouse
- rat
- snake
- turtle
- frog
- chicken
- duck
- goose
- bee
- pig
- turkey
- fly
- llama
- camel
- bat
- gorilla
- hedgehog
- kangaroo
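
To make the prompt template concrete, the sketch below builds every "a(n) \<animal\> \<activity\>" prompt from the two lists above. The helper names `article_for`, `make_prompt`, and `all_prompts` are hypothetical, and the vowel-based choice between "a" and "an" is an assumption about how the template was filled in, not a detail confirmed by the training setup.

```python
ACTIVITIES = ["washing dishes", "playing chess", "riding a bike"]

ANIMALS = [
    "cat", "dog", "horse", "monkey", "rabbit", "zebra", "spider", "bird",
    "sheep", "deer", "cow", "goat", "lion", "tiger", "bear", "raccoon",
    "fox", "wolf", "lizard", "beetle", "ant", "butterfly", "fish", "shark",
    "whale", "dolphin", "squirrel", "mouse", "rat", "snake", "turtle",
    "frog", "chicken", "duck", "goose", "bee", "pig", "turkey", "fly",
    "llama", "camel", "bat", "gorilla", "hedgehog", "kangaroo",
]


def article_for(noun: str) -> str:
    """Return "an" before a vowel letter, otherwise "a" (simple heuristic, assumed)."""
    return "an" if noun[0].lower() in "aeiou" else "a"


def make_prompt(animal: str, activity: str) -> str:
    """Fill the template "a(n) <animal> <activity>"."""
    return f"{article_for(animal)} {animal} {activity}"


def all_prompts() -> list[str]:
    """Enumerate every animal/activity combination from the lists above."""
    return [make_prompt(animal, activity) for animal in ANIMALS for activity in ACTIVITIES]
```

For example, `make_prompt("ant", "playing chess")` gives `"an ant playing chess"`, and `all_prompts()` yields 135 prompts (45 animals times 3 activities).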