Commit 6fff176 (verified) · committed by nielsr (HF Staff) · 1 parent: 80e848e

Enhance model card for Kandinsky 5.0 Image Lite with metadata, links, and usage

This PR expands the model card for `kandinskylab/Kandinsky-5.0-T2I-Lite` by adding Hub metadata and a full description of the model.

Key changes include:
* Adding `pipeline_tag: text-to-image` to ensure discoverability for text-to-image models on the Hugging Face Hub.
* Specifying `library_name: diffusers` as the compatible library, which enables the automated "how to use" widget on the model page.
* Populating the content section with an overview of the model, clearly linked to the official paper, the project page, and the GitHub repository.
* Including a "How to use" section with a Python code snippet for text-to-image generation, directly sourced from the project's GitHub README.
* Adding an "Examples" section with images and a "Citation" section with the BibTeX entry.

With these changes, the model is easier to find on the Hub and the card documents how to run it.
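
The metadata part of this change can be sanity-checked locally. The following is a minimal sketch, assuming the card's front matter contains only flat `key: value` pairs, as it does in this PR. `read_front_matter` and `README_HEAD` are hypothetical names introduced for illustration and are not part of any Hub library; a card with nested YAML would need a real YAML parser.

```python
def read_front_matter(readme_text: str) -> dict[str, str]:
    """Parse flat `key: value` pairs from a leading `---` block.

    Hypothetical helper for illustration only; it does not handle
    nested YAML, lists, or quoted values.
    """
    lines = readme_text.splitlines()
    # Front matter must start on the very first line.
    if not lines or lines[0].strip() != "---":
        return {}
    metadata = {}
    for line in lines[1:]:
        if line.strip() == "---":
            # Closing delimiter reached: the block is complete.
            return metadata
        key, sep, value = line.partition(":")
        if sep:
            metadata[key.strip()] = value.strip()
    # No closing delimiter: treat the file as having no front matter.
    return {}


# The head of README.md as it stands after this PR.
README_HEAD = """---
license: mit
pipeline_tag: text-to-image
library_name: diffusers
---

# Kandinsky 5.0 Image Lite
"""

metadata = read_front_matter(README_HEAD)
print(metadata)
```

Checking `metadata["pipeline_tag"]` and `metadata["library_name"]` confirms that the two fields this PR adds are present alongside the existing `license` field.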

Files changed (1)
  1. README.md +93 -1
README.md CHANGED
@@ -1,3 +1,95 @@
  ---
  license: mit
- ---
+ pipeline_tag: text-to-image
+ library_name: diffusers
+ ---
+
+ # Kandinsky 5.0 Image Lite
+
+ This repository hosts the `Kandinsky 5.0 Image Lite` model, part of the Kandinsky 5.0 family of state-of-the-art foundation models for high-resolution image and 10-second video synthesis. It is described in the paper [Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation](https://huggingface.co/papers/2511.14993).
+
+ Kandinsky 5.0 Image Lite is a line-up of 6B image generation models with the following capabilities:
+ * 1K resolution (1280x768, 1024x1024 and others).
+ * High visual quality
+ * Strong text-writing
+ * Russian concepts understanding
+
+ * **Project Page**: https://kandinskylab.ai/
+ * **GitHub Repository**: https://github.com/kandinskylab/kandinsky-5
+
+ ## How to use
+
+ You can use the `kandinsky` library, which integrates with `diffusers`, for text-to-image inference as shown in the example below:
+
+ ```python
+ import torch
+ from kandinsky import get_T2I_pipeline
+
+ device_map = {
+     "dit": torch.device('cuda:0'),
+     "vae": torch.device('cuda:0'),
+     "text_embedder": torch.device('cuda:0')
+ }
+
+ pipe = get_T2I_pipeline(device_map, conf_path="configs/k5_lite_t2i_sft_hd.yaml")
+
+ images = pipe(
+     seed=42,
+     save_path='./test.png',
+     text="A cat in a red hat with a label 'HELLO'"
+ )
+ ```
+
+ ## Examples
+
+ <table border="0" style="width: 200; text-align: left; margin-top: 20px;">
+ <tr>
+ <td>
+ <image src="https://github.com/user-attachments/assets/f46e6866-15ce-445d-bb81-9843a341e2a9" width=200 ></image>
+ </td>
+ <td>
+ <image src="https://github.com/user-attachments/assets/74f3af1f-b11e-4174-9f36-e956b871a6e6" width=200 ></image>
+ </td>
+ <td>
+ <image src="https://github.com/user-attachments/assets/7e469d09-8b96-4691-b929-dd809827adf9" width=200 ></image>
+ </td>
+ <tr>
+ </table>
+ <table border="0" style="width: 200; text-align: left; margin-top: 10px;">
+ <td>
+ <image src="https://github.com/user-attachments/assets/8054b25b-5d71-4547-8822-b07d71d137f4" width=200 ></image>
+ </td>
+ <td>
+ <image src="https://github.com/user-attachments/assets/f4825237-640b-4b2d-86e6-fd08fe95039f" width=200 ></image>
+ </td>
+ <td>
+ <image src="https://github.com/user-attachments/assets/73fbbc2a-3249-4b70-8931-2893ab0107a5" width=200 ></image>
+ </td>
+
+ </table>
+ <table border="0" style="width: 200; text-align: left; margin-top: 10px;">
+ <td>
+ <image src="https://github.com/user-attachments/assets/c309650b-8d8b-4e44-bb63-48287e22ff44" width=200 ></image>
+ </td>
+ <td>
+ <image src="https://github.com/user-attachments/assets/d5c0fcca-69b7-4d77-9c36-cd2fb87f2615" width=200 ></image>
+ </td>
+ <td>
+ <image src="https://github.com/user-attachments/assets/7895c3e8-2e72-40b8-8bf7-dcac859a6b29" width=200 ></image>
+ </td>
+
+ </table>
+
+ ## Citation
+
+ ```bibtex
+ @misc{arkhipkin2025kandinsky50familyfoundation,
+       title={Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation},
+       author={Vladimir Arkhipkin and Vladimir Korviakov and Nikolai Gerasimenko and Denis Parkhomenko and Viacheslav Vasilev and Alexey Letunovskiy and Nikolai Vaulin and Maria Kovaleva and Ivan Kirillov and Lev Novitskiy and Denis Koposov and Nikita Kiselev and Alexander Varlamov and Dmitrii Mikhailov and Vladimir Polovnikov and Andrey Shutkin and Julia Agafonova and Ilya Vasiliev and Anastasiia Kargapoltseva and Anna Dmitrienko and Anastasia Maltseva and Anna Averchenkova and Olga Kim and Tatiana Nikulina and Denis Dimitrov},
+       year={2025},
+       eprint={2511.14993},
+       archivePrefix={arXiv},
+       primaryClass={cs.CV},
+       url={https://arxiv.org/abs/2511.14993},
+ }
+ ```