	---
	library_name: diffusers
	license: mit
	pipeline_tag: image-to-image
	tags:
	- computed-tomography
	- ct-reconstruction
	- diffusion-model
	- latent-diffusion
	- inverse-problems
	- dm4ct
	- sparse-view-ct
	---

	# Latent Diffusion Model – LoDoInd (DM4CT)

	This repository contains the pretrained latent-space diffusion model used in the
	DM4CT benchmark, introduced in *DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction* (ICLR 2026).

	- Paper: [DM4CT: Benchmarking Diffusion Models for Computed Tomography Reconstruction](https://huggingface.co/papers/2602.18589)
	- Project Page: [https://dm4ct.github.io/DM4CT/](https://dm4ct.github.io/DM4CT/)
	- Codebase: [https://github.com/DM4CT/DM4CT](https://github.com/DM4CT/DM4CT)

	---

	## 🔬 Model Overview

	This model learns a prior over reconstructed CT images in a compressed latent space using a denoising diffusion probabilistic model (DDPM).

	Unlike pixel-space diffusion models, it performs diffusion in the latent space of a pretrained autoencoder (VQ-VAE).

	- Architecture:
	  - VQ-VAE (image encoder/decoder)
	  - 2D UNet operating in latent space
	- Input resolution (image space): 512 × 512
	- Channels: 1 (grayscale CT slice)
	- Training objective: ε-prediction (standard DDPM formulation)
	- Noise schedule: Linear beta schedule
	- Training dataset: Industrial CT dataset (LoDoInd)
	- Intensity normalization: Rescaled to (-1, 1)

	This model is intended to be combined with data-consistency correction for CT reconstruction tasks.
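To make the training objective and noise schedule listed above concrete, here is a minimal pure-Python sketch of a linear beta schedule, the DDPM forward (noising) step, and the ε-prediction loss. The step count (1000) and the beta range (1e-4 to 0.02) are common DDPM defaults and are assumptions, not values taken from this model card; the function names are hypothetical, and a short list of floats stands in for a flattened latent.

```python
import math
import random

def linear_beta_schedule(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Betas spaced linearly from beta_start to beta_end (assumed defaults)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars

def add_noise(x0, t, alpha_bars, rng):
    """Forward process: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a = math.sqrt(alpha_bars[t])
    b = math.sqrt(1.0 - alpha_bars[t])
    return [a * x + b * e for x, e in zip(x0, eps)], eps

def epsilon_loss(eps_true, eps_pred):
    """Mean squared error between the true and the predicted noise."""
    return sum((p - q) ** 2 for p, q in zip(eps_true, eps_pred)) / len(eps_true)

betas = linear_beta_schedule()
alpha_bars = cumulative_alphas(betas)
rng = random.Random(0)
latent = [0.5, -0.25, 0.0, 1.0]  # toy stand-in for a flattened latent
noisy, eps = add_noise(latent, t=500, alpha_bars=alpha_bars, rng=rng)
```

During training, a network (here, the latent UNet) would receive `noisy` and `t` and be optimized so that its output minimizes `epsilon_loss` against `eps`.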

	---

	## 📊 Dataset: LoDoInd

	The model was trained on the industrial CT dataset [LoDoInd](https://zenodo.org/records/10391412).

	- Reconstructed slices were rescaled to the range (-1, 1).
	- The model learns an unconditional latent prior over CT slices; no specific geometry information is embedded in the weights.
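The rescaling mentioned above can be illustrated with a simple linear mapping. This is only a sketch: the card does not say how the bounds were chosen (per slice, per volume, or globally), so `lo` and `hi` are assumed inputs, and both helper names are hypothetical.

```python
def to_model_range(values, lo, hi):
    """Map intensities in [lo, hi] linearly onto [-1, 1]."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    scale = 2.0 / (hi - lo)
    return [(v - lo) * scale - 1.0 for v in values]

def from_model_range(values, lo, hi):
    """Inverse mapping from [-1, 1] back to [lo, hi]."""
    return [(v + 1.0) * 0.5 * (hi - lo) + lo for v in values]

slice_row = [0.0, 0.5, 2.0]  # toy intensities, not real LoDoInd data
scaled = to_model_range(slice_row, lo=0.0, hi=2.0)
restored = from_model_range(scaled, lo=0.0, hi=2.0)
```

Samples drawn from the model live in the rescaled range, so the inverse mapping is needed before comparing them with measurements in physical units.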

	---

	## 🧠 Training Details

	- Optimizer: AdamW
	- Learning rate: 1e-4
	- Hardware: NVIDIA A100 GPU
	- Training scripts: Available in the [DM4CT GitHub repository](https://github.com/DM4CT/DM4CT/blob/main/train_latent.py).

	---

	## 🚀 Usage

	You can load and use this model with the `diffusers` library:

	```python
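	from diffusers import LDMPipeline
	import torch

	# Load the pipeline components from the Hub
	pipeline = LDMPipeline.from_pretrained(
	    "jiayangshi/lodoind_latent_diffusion"
	)
	# Use the GPU when available; fall back to CPU otherwise
	device = "cuda" if torch.cuda.is_available() else "cpu"
	pipeline.to(device)

	# Generate a sample (unconditional prior)
	image = pipeline().images[0]
	image.save("generated_ct_slice.png")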
	```

	Note: For actual CT reconstruction, this prior is typically used with data-consistency guidance as described in the paper.
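As a rough illustration of what a data-consistency correction does, the sketch below applies plain gradient steps on the least-squares term 0.5 * ||A x - y||^2 for a toy linear system. It does not reproduce any specific method from the paper: the 2 × 3 matrix `A` is a hypothetical stand-in for a CT forward projector, all function names are illustrative, and in a guided sampler such a step would be interleaved with the reverse diffusion updates rather than run on its own.

```python
def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def rmatvec(A, r):
    """Compute A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]

def data_consistency_step(x, A, y, step_size):
    """One gradient step on 0.5 * ||A x - y||^2, whose gradient is A^T (A x - y)."""
    residual = [p - q for p, q in zip(matvec(A, x), y)]
    grad = rmatvec(A, residual)
    return [xi - step_size * g for xi, g in zip(x, grad)]

def residual_norm(x, A, y):
    """Euclidean norm of the measurement residual A x - y."""
    return sum((p - q) ** 2 for p, q in zip(matvec(A, x), y)) ** 0.5

A = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]  # toy operator: 2 measurements of 3 pixels
x_true = [1.0, 2.0, 3.0]
y = matvec(A, x_true)                    # simulated noise-free measurements
x = [0.0, 0.0, 0.0]                      # stand-in for a decoded prior sample
before = residual_norm(x, A, y)
for _ in range(50):
    x = data_consistency_step(x, A, y, step_size=0.2)
after = residual_norm(x, A, y)
```

Because the toy system is underdetermined, the iterate matches the measurements without necessarily recovering `x_true`; resolving that ambiguity is the role of the diffusion prior.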

	---

	## Citation

	```bibtex
	@inproceedings{shi2026dmct,
	  title={{DM}4{CT}: Benchmarking Diffusion Models for Computed Tomography Reconstruction},
	  author={Shi, Jiayang and Pelt, Dani{\"e}l M and Batenburg, K Joost},
	  booktitle={The Fourteenth International Conference on Learning Representations},
	  year={2026},
	  url={https://openreview.net/forum?id=YE5scJekg5}
	}
	```