ai-forever committed (verified)
Commit c01c634 · Parent: 279d199

Update README.md

Files changed (1): README.md (+29 −0)
@@ -220,6 +220,35 @@ image = inp_pipe( "A cute corgi lives in a house made out of sushi.", image, mas
 + Denis Dimitrov: [Github](https://github.com/denndimitrov), [Blog](https://t.me/dendi_math_ai)
 
 ## Citation
+```
+@inproceedings{vladimir-etal-2024-kandinsky,
+    title = "Kandinsky 3: Text-to-Image Synthesis for Multifunctional Generative Framework",
+    author = "Vladimir, Arkhipkin and
+      Vasilev, Viacheslav and
+      Filatov, Andrei and
+      Pavlov, Igor and
+      Agafonova, Julia and
+      Gerasimenko, Nikolai and
+      Averchenkova, Anna and
+      Mironova, Evelina and
+      Anton, Bukashkin and
+      Kulikov, Konstantin and
+      Kuznetsov, Andrey and
+      Dimitrov, Denis",
+    editor = "Hernandez Farias, Delia Irazu and
+      Hope, Tom and
+      Li, Manling",
+    booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations",
+    month = nov,
+    year = "2024",
+    address = "Miami, Florida, USA",
+    publisher = "Association for Computational Linguistics",
+    url = "https://aclanthology.org/2024.emnlp-demo.48",
+    pages = "475--485",
+    abstract = "Text-to-image (T2I) diffusion models are popular for introducing image manipulation methods, such as editing, image fusion, inpainting, etc. At the same time, image-to-video (I2V) and text-to-video (T2V) models are also built on top of T2I models. We present Kandinsky 3, a novel T2I model based on latent diffusion, achieving a high level of quality and photorealism. The key feature of the new architecture is the simplicity and efficiency of its adaptation for many types of generation tasks. We extend the base T2I model for various applications and create a multifunctional generation system that includes text-guided inpainting/outpainting, image fusion, text-image fusion, image variations generation, I2V and T2V generation. We also present a distilled version of the T2I model, evaluating inference in 4 steps of the reverse process without reducing image quality and 3 times faster than the base model. We deployed a user-friendly demo system in which all the features can be tested in the public domain. Additionally, we released the source code and checkpoints for the Kandinsky 3 and extended models. Human evaluations show that Kandinsky 3 demonstrates one of the highest quality scores among open source generation systems.",
+}
+```
+
 ```
 @misc{arkhipkin2023kandinsky,
 title={Kandinsky 3.0 Technical Report},