Update README.md
README.md
CHANGED
|
@@ -42,17 +42,10 @@ Source code is available at https://github.com/xiaomabufei/lumos.
|
|
| 42 |
- **Developed by:** Lumos
|
| 43 |
- **Model type:** Diffusion-Transformer-based generative model
|
| 44 |
- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md)
|
| 45 |
-
- **Model Description:** Lumos-I2I is a model that can be used to generate and modify images based on image prompt.
|
| 46 |
It is a [Transformer Latent Diffusion Model](https://arxiv.org/abs/2310.00426) that uses one fixed, pretrained vision encoders ([DINO](
|
| 47 |
-
https://dl.fbaipublicfiles.com/dino/dino_vitbase16_pretrain/dino_vitbase16_pretrain.pth))
|
| 48 |
-
and one latent feature encoder ([VAE](https://arxiv.org/abs/2112.10752)).
|
| 49 |
-
Lumos-T2I is a model that can be used to generate and modify images based on image prompt.
|
| 50 |
-
It is a [Transformer Latent Diffusion Model](https://arxiv.org/abs/2310.00426) that uses one fixed, pretrained vision encoders ([DINO](
|
| 51 |
-
https://dl.fbaipublicfiles.com/dino/dino_vitbase16_pretrain/dino_vitbase16_pretrain.pth))
|
| 52 |
-
and one latent feature encoder ([VAE](https://arxiv.org/abs/2112.10752)).
|
| 53 |
-
Lumos-T2I is a model that can be used to generate and modify images based on text prompts.
|
| 54 |
It is a [Transformer Latent Diffusion Model](https://arxiv.org/abs/2310.00426) that uses one fixed, pretrained text encoders ([T5](
|
| 55 |
-
https://huggingface.co/DeepFloyd/t5-v1_1-xxl))
|
| 56 |
-
and one latent feature encoder ([VAE](https://arxiv.org/abs/2112.10752)).
|
| 57 |
- **Resources for more information:** Check out our [GitHub Repository](https://github.com/xiaomabufei/lumos) and the [Lumos report on arXiv](https://arxiv.org/pdf/2412.07767).
|
| 58 |
|
|
|
|
| 42 |
- **Developed by:** Lumos
|
| 43 |
- **Model type:** Diffusion-Transformer-based generative model
|
| 44 |
- **License:** [CreativeML Open RAIL++-M License](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/LICENSE.md)
|
| 45 |
+
- **Model Description:** **Lumos-I2I** is a model that can be used to generate and modify images based on the image prompt.
|
| 46 |
It is a [Transformer Latent Diffusion Model](https://arxiv.org/abs/2310.00426) that uses one fixed, pretrained vision encoders ([DINO](
|
| 47 |
+
https://dl.fbaipublicfiles.com/dino/dino_vitbase16_pretrain/dino_vitbase16_pretrain.pth)). **Lumos-T2I** is a model that can be used to generate and modify images based on text prompts.
|
| 48 |
It is a [Transformer Latent Diffusion Model](https://arxiv.org/abs/2310.00426) that uses one fixed, pretrained text encoders ([T5](
|
| 49 |
+
https://huggingface.co/DeepFloyd/t5-v1_1-xxl)).
|
| 50 |
- **Resources for more information:** Check out our [GitHub Repository](https://github.com/xiaomabufei/lumos) and the [Lumos report on arXiv](https://arxiv.org/pdf/2412.07767).
|
| 51 |
|