Enhance model card for Kandinsky 5.0 Image Lite with metadata, links, and usage
This PR expands the model card for `kandinskylab/Kandinsky-5.0-T2I-Lite` with Hub metadata and descriptive content.
Key changes include:
* Adding `pipeline_tag: text-to-image` to ensure discoverability for text-to-image models on the Hugging Face Hub.
* Specifying `library_name: diffusers` as the compatible library, which enables the automated "how to use" widget on the model page.
* Populating the content section with an overview of the model, clearly linked to the official paper, the project page, and the GitHub repository.
* Including a "How to use" section with a Python code snippet for text-to-image generation, directly sourced from the project's GitHub README.
* Adding an "Examples" section with images and a "Citation" section with the BibTeX entry.
This update makes the model more accessible and informative for users.
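The metadata change can be checked mechanically. The following is a minimal sketch, assuming the card's front matter contains only flat `key: value` entries, as it does in this PR; the helper names `parse_front_matter` and `missing_metadata` are hypothetical and not part of any Hub tooling. A card with nested or list-valued metadata would need a real YAML parser instead.

```python
def parse_front_matter(card_text):
    """Return the flat `key: value` pairs found between the leading `---` lines."""
    lines = card_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # the card has no front matter at all
    metadata = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing delimiter ends the metadata block
            return metadata
        key, sep, value = line.partition(":")
        if sep and key.strip():
            metadata[key.strip()] = value.strip()
    return {}  # no closing delimiter: treat the front matter as malformed


def missing_metadata(card_text, required=("license", "pipeline_tag", "library_name")):
    """List the required keys that are absent or empty in the card's front matter."""
    metadata = parse_front_matter(card_text)
    return [key for key in required if not metadata.get(key)]


# Front matter before and after this PR, copied from the diff.
CARD_BEFORE = "---\nlicense: mit\n---\n"
CARD_AFTER = (
    "---\n"
    "license: mit\n"
    "pipeline_tag: text-to-image\n"
    "library_name: diffusers\n"
    "---\n"
    "\n"
    "# Kandinsky 5.0 Image Lite\n"
)

print(missing_metadata(CARD_BEFORE))  # ['pipeline_tag', 'library_name']
print(missing_metadata(CARD_AFTER))   # []
```

The choice of required keys here simply mirrors the two fields this PR adds plus the existing `license` entry.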
@@ -1,3 +1,95 @@
 ---
 license: mit
----
+pipeline_tag: text-to-image
+library_name: diffusers
+---
+
+# Kandinsky 5.0 Image Lite
+
+This repository hosts the `Kandinsky 5.0 Image Lite` model, part of the Kandinsky 5.0 family of state-of-the-art foundation models for high-resolution image and 10-second video synthesis. It is described in the paper [Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation](https://huggingface.co/papers/2511.14993).
+
+Kandinsky 5.0 Image Lite is a line-up of 6B-parameter image generation models with the following capabilities:
+* 1K resolution (1280x768, 1024x1024 and others)
+* High visual quality
+* Strong text-writing
+* Understanding of Russian concepts
+
+* **Project Page**: https://kandinskylab.ai/
+* **GitHub Repository**: https://github.com/kandinskylab/kandinsky-5
+
+## How to use
+
+You can use the `kandinsky` library, which integrates with `diffusers`, for text-to-image inference as shown in the example below:
+
+```python
+import torch
+from kandinsky import get_T2I_pipeline
+
+device_map = {  # place every pipeline component on the first GPU
+    "dit": torch.device('cuda:0'),  # diffusion transformer
+    "vae": torch.device('cuda:0'),  # image autoencoder
+    "text_embedder": torch.device('cuda:0')  # text encoder
+}
+
+pipe = get_T2I_pipeline(device_map, conf_path="configs/k5_lite_t2i_sft_hd.yaml")
+
+images = pipe(
+    seed=42,
+    save_path='./test.png',
+    text="A cat in a red hat with a label 'HELLO'"
+)
+```
+
+## Examples
+
+<table border="0" style="width: 200; text-align: left; margin-top: 20px;">
+  <tr>
+      <td>
+          <image src="https://github.com/user-attachments/assets/f46e6866-15ce-445d-bb81-9843a341e2a9" width=200 ></image>
+      </td>
+      <td>
+          <image src="https://github.com/user-attachments/assets/74f3af1f-b11e-4174-9f36-e956b871a6e6" width=200 ></image>
+      </td>
+      <td>
+          <image src="https://github.com/user-attachments/assets/7e469d09-8b96-4691-b929-dd809827adf9" width=200 ></image>
+      </td>
+  </tr>
+</table>
+<table border="0" style="width: 200; text-align: left; margin-top: 10px;">
+      <td>
+          <image src="https://github.com/user-attachments/assets/8054b25b-5d71-4547-8822-b07d71d137f4" width=200 ></image>
+      </td>
+      <td>
+          <image src="https://github.com/user-attachments/assets/f4825237-640b-4b2d-86e6-fd08fe95039f" width=200 ></image>
+      </td>
+      <td>
+          <image src="https://github.com/user-attachments/assets/73fbbc2a-3249-4b70-8931-2893ab0107a5" width=200 ></image>
+      </td>
+
+</table>
+<table border="0" style="width: 200; text-align: left; margin-top: 10px;">
+      <td>
+          <image src="https://github.com/user-attachments/assets/c309650b-8d8b-4e44-bb63-48287e22ff44" width=200 ></image>
+      </td>
+      <td>
+          <image src="https://github.com/user-attachments/assets/d5c0fcca-69b7-4d77-9c36-cd2fb87f2615" width=200 ></image>
+      </td>
+      <td>
+          <image src="https://github.com/user-attachments/assets/7895c3e8-2e72-40b8-8bf7-dcac859a6b29" width=200 ></image>
+      </td>
+
+</table>
+
+## Citation
+
+```bibtex
+@misc{arkhipkin2025kandinsky50familyfoundation,
+      title={Kandinsky 5.0: A Family of Foundation Models for Image and Video Generation},
+      author={Vladimir Arkhipkin and Vladimir Korviakov and Nikolai Gerasimenko and Denis Parkhomenko and Viacheslav Vasilev and Alexey Letunovskiy and Nikolai Vaulin and Maria Kovaleva and Ivan Kirillov and Lev Novitskiy and Denis Koposov and Nikita Kiselev and Alexander Varlamov and Dmitrii Mikhailov and Vladimir Polovnikov and Andrey Shutkin and Julia Agafonova and Ilya Vasiliev and Anastasiia Kargapoltseva and Anna Dmitrienko and Anastasia Maltseva and Anna Averchenkova and Olga Kim and Tatiana Nikulina and Denis Dimitrov},
+      year={2025},
+      eprint={2511.14993},
+      archivePrefix={arXiv},
+      primaryClass={cs.CV},
+      url={https://arxiv.org/abs/2511.14993},
+}
+```