Disty0 committed on
Commit
f7af3cb
·
verified ·
1 Parent(s): 8144037

Upload 2 files

Files changed (2)
  1. README.md +105 -0
  2. model_index.json +1 -1
README.md ADDED
@@ -0,0 +1,105 @@
+ ---
+ pipeline_tag: text-to-image
+ license: other
+ license_name: stable-cascade-nc-community
+ license_link: LICENSE
+ ---
+
+ # SoteDiffusion Cascade
+
+ Anime finetune of the Stable Cascade Decoder.
+ No commercial use is allowed, per StabilityAI's non-commercial community license.
+
+ ## Code Example
+
+ ```shell
+ pip install diffusers transformers accelerate
+ ```
+
+ ```python
+ import torch
+ from diffusers import StableCascadeDecoderPipeline, StableCascadePriorPipeline
+
+ prompt = "(extremely aesthetic, best quality, newest), 1girl, solo, cat ears, looking at viewer, blush, light smile, upper body,"
+ negative_prompt = "very displeasing, worst quality, monochrome, sketch, blurry, fat, child,"
+
+ # Load the prior and this decoder in half precision.
+ prior = StableCascadePriorPipeline.from_pretrained("Disty0/SoteDiffusion-Cascade_pre-alpha0", torch_dtype=torch.float16)
+ decoder = StableCascadeDecoderPipeline.from_pretrained("Disty0/SoteDiffusion-Cascade_Decoder", torch_dtype=torch.float16)
+
+ # Stage 1: the prior turns the prompt into image embeddings.
+ prior.enable_model_cpu_offload()
+ prior_output = prior(
+     prompt=prompt,
+     height=1024,
+     width=1024,
+     negative_prompt=negative_prompt,
+     guidance_scale=6.0,
+     num_images_per_prompt=1,
+     num_inference_steps=30
+ )
+
+ # Stage 2: the decoder turns the image embeddings into the final image.
+ decoder.enable_model_cpu_offload()
+ decoder_output = decoder(
+     image_embeddings=prior_output.image_embeddings.to(torch.float16),
+     prompt=prompt,
+     negative_prompt=negative_prompt,
+     guidance_scale=1.0,
+     output_type="pil",
+     num_inference_steps=10
+ ).images[0]
+ decoder_output.save("cascade.png")
+ ```
+
+ ## Dataset
+
+ Used the same dataset as SoteDiffusion-Cascade_pre-alpha0.
+ Selected images from the newest dataset that scored above 0.98 on both the aesthetic and quality taggers.
+ Trained on ~98K images.
+
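The selection rule in the Dataset section can be sketched as a simple filter. This is a hypothetical illustration, not the script used to build the dataset: the record layout, the field names `aesthetic_score` and `quality_score`, and the sample scores are all assumptions made up for the example.

```python
# Hypothetical sketch of the "above 0.98 on both taggers" selection rule.
# Field names and sample values are illustrative assumptions.
SCORE_THRESHOLD = 0.98


def is_selected(record: dict, threshold: float = SCORE_THRESHOLD) -> bool:
    """Keep an image only if both tagger scores exceed the threshold."""
    return (
        record["aesthetic_score"] > threshold
        and record["quality_score"] > threshold
    )


def select_images(records: list[dict]) -> list[dict]:
    """Return the records that pass both score checks."""
    return [r for r in records if is_selected(r)]


sample_records = [
    {"path": "a.png", "aesthetic_score": 0.99, "quality_score": 0.995},
    {"path": "b.png", "aesthetic_score": 0.99, "quality_score": 0.90},
    {"path": "c.png", "aesthetic_score": 0.98, "quality_score": 0.99},
]
selected = select_images(sample_records)
print([r["path"] for r in selected])
```

Requiring both scores to pass means an image with one high score and one mediocre score is dropped, which favors a smaller but cleaner training set.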
+ ## Training:
+
+ **GPU used for training**: 1x AMD RX 7900 XTX 24GB
+
+ **Software used**: https://github.com/2kpr/StableCascade
+
+ ### Config:
+ ```
+ experiment_id: sotediffusion-sc-b_3b
+ model_version: 3B
+ dtype: bfloat16
+ use_fsdp: False
+
+ batch_size: 64
+ grad_accum_steps: 64
+ updates: 3000
+ backup_every: 128
+ save_every: 32
+ warmup_updates: 100
+
+ lr: 4.0e-6
+ optimizer_type: Adafactor
+ adaptive_loss_weight: True
+ stochastic_rounding: True
+
+ image_size: 768
+ multi_aspect_ratio: [1/1, 1/2, 1/3, 2/3, 3/4, 1/5, 2/5, 3/5, 4/5, 1/6, 5/6, 9/16]
+ shift: 4
+
+ checkpoint_path: /mnt/DataSSD/AI/SoteDiffusion/StableCascade/
+ output_path: /mnt/DataSSD/AI/SoteDiffusion/StableCascade/
+ webdataset_path: file:/mnt/DataSSD/AI/anime_image_dataset/best/newest_best-{0000..0001}.tar
+
+ effnet_checkpoint_path: /mnt/DataSSD/AI/models/sd-cascade/effnet_encoder.safetensors
+ stage_a_checkpoint_path: /mnt/DataSSD/AI/models/sd-cascade/stage_a.safetensors
+ generator_checkpoint_path: /mnt/DataSSD/AI/SoteDiffusion/StableCascade/stage_b-generator-049152.safetensors
+ ```
+
+
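The config does not say how `image_size` and `multi_aspect_ratio` are combined into training resolutions. The following is a minimal sketch of one common bucketing scheme, assuming each bucket keeps roughly `image_size ** 2` pixels and rounds each side down to a multiple of 32. Both of those rules, and the `bucket_size` helper itself, are assumptions for illustration; the actual StableCascade training code may compute its buckets differently.

```python
import math
from fractions import Fraction

# Ratios copied from the config above (width / height).
RATIOS = ["1/1", "1/2", "1/3", "2/3", "3/4", "1/5",
          "2/5", "3/5", "4/5", "1/6", "5/6", "9/16"]


def bucket_size(ratio: Fraction, image_size: int = 768, factor: int = 32) -> tuple[int, int]:
    """Return (width, height) with at most image_size**2 pixels.

    Assumption: constant pixel area, each side floored to a multiple of `factor`.
    """
    area = image_size * image_size
    width = int(math.sqrt(area * ratio) // factor) * factor
    height = int(math.sqrt(area / ratio) // factor) * factor
    return width, height


buckets = {r: bucket_size(Fraction(r)) for r in RATIOS}
for ratio, (width, height) in buckets.items():
    print(f"{ratio:>5}: {width}x{height}")
```

Flooring to a multiple of 32 keeps every bucket at or below the pixel budget of a square 768x768 image, so memory use stays bounded across aspect ratios.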
+ ## Limitations and Bias
+
+ ### Bias
+
+ - This model is intended for anime illustrations.
+ Realistic capabilities have not been tested at all.
+
+ ### Limitations
+ - Eyes in far shots come out poorly because of the heavy latent compression.
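To put the latent compression in numbers, here is a rough sketch. It assumes the spatial compression factor of about 42.67 that diffusers uses as the default `resolution_multiple` of the Stable Cascade prior pipeline; the helper names and the 32-pixel eye size are hypothetical and chosen only for illustration.

```python
import math

# Assumed spatial compression factor of the Stable Cascade prior
# (the default `resolution_multiple` in diffusers).
RESOLUTION_MULTIPLE = 42.67


def latent_side(pixels: int, multiple: float = RESOLUTION_MULTIPLE) -> int:
    """Side length of the prior latent for an image side of `pixels`."""
    return math.ceil(pixels / multiple)


def latent_cells_covered(feature_px: int, multiple: float = RESOLUTION_MULTIPLE) -> float:
    """How many latent cells (per axis) a feature of `feature_px` pixels spans."""
    return feature_px / multiple


print(latent_side(1024))             # latent side for a 1024x1024 image
print(latent_cells_covered(32) < 1)  # a 32-pixel eye spans less than one cell
```

Under this assumption a 1024x1024 image is represented by a 24x24 latent, so a small eye in a far shot occupies less than one latent cell and its detail has to be reconstructed by the decoder.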
model_index.json CHANGED
@@ -1,7 +1,7 @@
  {
  "_class_name": "StableCascadeDecoderPipeline",
  "_diffusers_version": "0.27.0",
- "_name_or_path": "stabilityai/stable-cascade",
+ "_name_or_path": "Disty0/SoteDiffusion-Cascade_Decoder",
  "decoder": [
  "diffusers",
  "StableCascadeUNet"