---
license: creativeml-openrail-m
language:
- en
library_name: diffusers
pipeline_tag: text-to-image
tags:
- stable-diffusion
- stable-diffusion-diffusers
- text-to-image
inference:
  parameters:
    num_inference_steps: 50
    guidance_scale: 5.0
    eta: 1.0
widget:
- text: "a horse playing chess"
  example_title: horse + chess
- text: "a lion washing dishes"
  example_title: lion + dishes
- text: "a goat riding a bike"
  example_title: goat + bike
---
# ddpo-alignment

This model was finetuned from [Stable Diffusion v1-4](https://huggingface.co/CompVis/stable-diffusion-v1-4) using [DDPO](https://arxiv.org/abs/2305.13301) and a reward function that uses [LLaVA](https://llava-vl.github.io/) to measure prompt-image alignment. See [the project website](https://rl-diffusion.github.io/) for more details.

The model was finetuned for 200 iterations with a batch size of 256 samples per iteration. During finetuning, we used prompts of the form: "_a(n) \<animal\> \<activity\>_". The animal and activity were drawn from the following lists, so prompts built from them give the best results. However, we also observed limited generalization to other prompts.

Activities:
- washing dishes
- playing chess
- riding a bike

Animals:
- cat
- dog
- horse
- monkey
- rabbit
- zebra
- spider
- bird
- sheep
- deer
- cow
- goat
- lion
- tiger
- bear
- raccoon
- fox
- wolf
- lizard
- beetle
- ant
- butterfly
- fish
- shark
- whale
- dolphin
- squirrel
- mouse
- rat
- snake
- turtle
- frog
- chicken
- duck
- goose
- bee
- pig
- turkey
- fly
- llama
- camel
- bat
- gorilla
- hedgehog
- kangaroo
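
The prompt format and inference parameters above can be sketched in Python as follows. This is a minimal illustration, not the authors' code: `build_prompt`, `all_prompts`, and `generate` are hypothetical helper names, the animal list is a small subset of the one above, and `model_id` is left as a parameter because this card does not state the Hub repository id. Swapping in `DDIMScheduler` is an assumption made so that `eta` has an effect; other schedulers may ignore it.

```python
import itertools

# Activities and a subset of the animals listed in this card.
ACTIVITIES = ["washing dishes", "playing chess", "riding a bike"]
ANIMALS = ["cat", "dog", "horse", "lion", "goat", "ant"]


def build_prompt(animal: str, activity: str) -> str:
    """Format a prompt as 'a(n) <animal> <activity>'."""
    # Simple vowel check for the article; enough for the animals in this card.
    article = "an" if animal[0].lower() in "aeiou" else "a"
    return f"{article} {animal} {activity}"


def all_prompts(animals=ANIMALS, activities=ACTIVITIES):
    """Return every animal/activity combination as a prompt string."""
    return [build_prompt(a, act) for a, act in itertools.product(animals, activities)]


def generate(prompt: str, model_id: str, device: str = "cuda"):
    """Generate one image with the inference parameters from this card.

    `model_id` is the Hub id or local path of this model (not stated here).
    """
    # Imported lazily so the prompt helpers work without diffusers installed.
    import torch
    from diffusers import DDIMScheduler, StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
    # Assumption: use DDIM so that the `eta` argument is honored.
    pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
    pipe = pipe.to(device)
    result = pipe(prompt, num_inference_steps=50, guidance_scale=5.0, eta=1.0)
    return result.images[0]
```

For example, `build_prompt("horse", "playing chess")` yields the first widget prompt in the metadata, and `generate(build_prompt("goat", "riding a bike"), model_id=...)` would return a PIL image once the model id is supplied and a GPU is available.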