---
library_name: diffusers
license: mit
pipeline_tag: image-to-image
tags:
- computed-tomography
- ct-reconstruction
- diffusion-model
- latent-diffusion
- inverse-problems
- dm4ct
- sparse-view-ct
---

# Latent Diffusion Model – LoDoInd (DM4CT)

This repository contains the pretrained **latent-space diffusion model** used in the
**DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction (ICLR 2026)** benchmark.

- **Paper:** [DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction](https://huggingface.co/papers/2602.18589)
- **Project Page:** [https://dm4ct.github.io/DM4CT/](https://dm4ct.github.io/DM4CT/)
- **Codebase:** [https://github.com/DM4CT/DM4CT](https://github.com/DM4CT/DM4CT)

---

## Model Overview

This model learns a **prior over reconstructed CT images in a compressed latent space** using a denoising diffusion probabilistic model (DDPM).
Unlike pixel-space diffusion models, it runs the diffusion process in the latent space of a pretrained autoencoder (VQ-VAE).

- **Architecture**:
  - VQ-VAE (image encoder/decoder)
  - 2D UNet operating in latent space
- **Input resolution (image space)**: 512 × 512
- **Channels**: 1 (grayscale CT slice)
- **Training objective**: ε-prediction (standard DDPM formulation)
- **Noise schedule**: Linear beta schedule
- **Training dataset**: Industrial CT dataset (LoDoInd)
- **Intensity normalization**: Rescaled to (-1, 1)

This model is intended to be combined with data-consistency correction for CT reconstruction tasks.
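
The ε-prediction objective and linear beta schedule listed above can be illustrated with a small NumPy sketch. The schedule endpoints (1e-4 to 0.02), the 1000 training timesteps, and the 64 × 64 latent shape are common DDPM defaults assumed here for illustration; they are not read from this model's configuration, and the helper names are hypothetical.

```python
import numpy as np

# Assumed values (common DDPM defaults), not read from this model's scheduler config.
NUM_TRAIN_TIMESTEPS = 1000
BETA_START, BETA_END = 1e-4, 0.02

# Linear beta schedule and cumulative signal level alpha_bar_t = prod_{s<=t} (1 - beta_s).
betas = np.linspace(BETA_START, BETA_END, NUM_TRAIN_TIMESTEPS)
alphas_cumprod = np.cumprod(1.0 - betas)


def add_noise(z0, eps, t):
    """Forward process: z_t = sqrt(alpha_bar_t) * z0 + sqrt(1 - alpha_bar_t) * eps."""
    a_bar = alphas_cumprod[t]
    return np.sqrt(a_bar) * z0 + np.sqrt(1.0 - a_bar) * eps


def epsilon_loss(eps_pred, eps):
    """Epsilon-prediction objective: mean squared error between predicted and true noise."""
    return float(np.mean((eps_pred - eps) ** 2))


rng = np.random.default_rng(0)
z0 = rng.standard_normal((1, 64, 64))   # hypothetical latent of one CT slice
eps = rng.standard_normal(z0.shape)     # noise the UNet would be trained to predict
z_t = add_noise(z0, eps, t=500)

# A perfect noise predictor reaches zero loss; always predicting zeros gives roughly 1.
print(epsilon_loss(eps, eps), epsilon_loss(np.zeros_like(eps), eps))
```

During training, the UNet would receive `z_t` and `t` and be optimized so that its output matches `eps`; here the network is left out so the schedule and loss can be inspected on their own.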

---

## Dataset: LoDoInd

The model was trained on the industrial CT dataset [LoDoInd](https://zenodo.org/records/10391412).

- Reconstructed slices were rescaled to the range (-1, 1).
- The model learns an unconditional latent prior over CT slices; no specific geometry information is embedded in the weights.
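
The rescaling step can be sketched as a linear min-max normalization. Whether the slices were normalized per slice, per volume, or with fixed intensity bounds is not stated here, so the per-slice default below and the helper name `rescale_to_unit_range` are assumptions made for illustration.

```python
import numpy as np


def rescale_to_unit_range(slice_2d, lo=None, hi=None):
    """Linearly map intensities from [lo, hi] to [-1, 1].

    If lo/hi are omitted, the slice's own min/max are used (an assumption;
    fixed dataset-wide bounds are an equally plausible choice).
    """
    x = np.asarray(slice_2d, dtype=np.float32)
    lo = float(x.min()) if lo is None else lo
    hi = float(x.max()) if hi is None else hi
    if hi <= lo:
        # Constant image: map everything to the midpoint instead of dividing by zero.
        return np.zeros_like(x)
    return 2.0 * (x - lo) / (hi - lo) - 1.0


# Toy 2 x 2 "slice" with arbitrary intensity values.
toy = np.array([[0.0, 5.0], [10.0, 20.0]])
scaled = rescale_to_unit_range(toy)
print(scaled.min(), scaled.max())
```

Input slices passed to a reconstruction method built on this prior would need the same normalization that was used during training, which is why the bounds are exposed as arguments.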

---

## Training Details

- **Optimizer**: AdamW
- **Learning rate**: 1e-4
- **Hardware**: NVIDIA A100 GPU
- **Training scripts**: Available in the [DM4CT GitHub repository](https://github.com/DM4CT/DM4CT/blob/main/train_latent.py).

---

## Usage

You can load and use this model with the `diffusers` library:

```python
import torch
from diffusers import LDMPipeline

# Load the pretrained VQ-VAE, UNet, and scheduler from the Hub.
pipeline = LDMPipeline.from_pretrained(
    "jiayangshi/lodoind_latent_diffusion"
)
# Use a GPU when one is available; fall back to CPU otherwise.
pipeline.to("cuda" if torch.cuda.is_available() else "cpu")

# Generate a sample (unconditional prior)
image = pipeline().images[0]
image.save("generated_ct_slice.png")
```

Note: For actual CT reconstruction, this prior is typically used with data-consistency guidance as described in the paper.
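
To give a rough idea of what a data-consistency correction does, the sketch below applies plain gradient steps on the least-squares measurement residual. It is a generic illustration, not one of the guidance schemes benchmarked in the paper: the random matrix `A` stands in for a CT forward projector, the zero vector stands in for a sample from the prior, and the function name and toy sizes are hypothetical.

```python
import numpy as np


def data_consistency_step(x, A, y, step_size):
    """One gradient step on 0.5 * ||A x - y||^2, pulling x toward the measurements."""
    residual = A @ x - y
    return x - step_size * (A.T @ residual)


rng = np.random.default_rng(0)
n_pixels, n_measurements = 16, 8                      # toy sizes; underdetermined like sparse-view CT
A = rng.standard_normal((n_measurements, n_pixels))   # stand-in for a CT projector
x_true = rng.standard_normal(n_pixels)
y = A @ x_true                                        # noiseless toy measurements

x = np.zeros(n_pixels)                                # stand-in for a sample from the prior
step_size = 1.0 / np.linalg.norm(A, 2) ** 2           # 1 / L, with L the largest eigenvalue of A^T A
before = np.linalg.norm(A @ x - y)
for _ in range(200):
    x = data_consistency_step(x, A, y, step_size)
after = np.linalg.norm(A @ x - y)
print(before, after)
```

In a diffusion-based reconstruction, a correction of this kind would typically be interleaved with the denoising steps, so the prior keeps the estimate plausible while the measurements keep it consistent with the scan.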

---

## Citation

```bibtex
@inproceedings{
shi2026dmct,
title={{DM}4{CT}: Benchmarking Diffusion Models for Computed Tomography Reconstruction},
author={Shi, Jiayang and Pelt, Dani{\"e}l M and Batenburg, K Joost},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=YE5scJekg5}
}
```