
NeuroVFM: Health system learning achieves generalist neuroimaging models

Preprint / Interactive Demo / GitHub / MLiNS Lab

This is the model card for NeuroVFM, a generalist visual foundation model for clinical neuroimaging, pretrained on 5.24 million uncurated MRI and CT volumes from a large academic health system using Volumetric Joint-Embedding Predictive Architecture (Vol-JEPA).

Why use NeuroVFM? NeuroVFM is the first 3D visual backbone self-supervised at the scale of an entire health system (5.24M volumes), bypassing the limitations of public datasets and of transfer learning from internet-scale natural images. Unlike standard 2D models that discard volumetric context, or medical models trained on small, curated cohorts, NeuroVFM learns modality-invariant representations of neuroanatomy and pathology directly from the raw, uncurated clinical stream. This enables zero-shot transfer across MRI and CT and strong robustness to scanner variation, providing a plug-and-play foundation for diagnosis, triage, and report generation that outperforms frontier models such as GPT-5 and DINOv3.

NOTE: This is the model card for the visual feature backbone. For diagnostic classification heads, see here (MRI) and here (CT). For the radiological findings generation model, see here.

Model Details

  • Architecture: 3D Vision Transformer (ViT-Base/4x16x16px)
  • Training Data: UM-NeuroImages (5.24 million 3D volumes)
    • Diversity: 566,915 unique studies (CT & MRI) acquired over 20 years
  • Training Objective: Volumetric Joint-Embedding Predictive Architecture (Vol-JEPA)
  • Compute Hardware: Trained on 8x NVIDIA L40S GPUs (48GB VRAM)
  • Training Efficiency: <1,000 GPU-hours total pretraining time (Automatic Mixed Precision with PyTorch DDP)
  • Optimization: AdamW, LR of 3.75e-4 with Cosine Decay (10% warmup)
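The ViT-Base/4x16x16px architecture tokenizes a volume into patches spanning 4 slices by a 16x16 in-plane region. As a quick sanity check of what that implies for sequence length (the 64x256x256 input shape below is a hypothetical illustration, not a documented NeuroVFM input size):

```python
# Token count for a 3D ViT with 4x16x16 patches (depth x height x width).
# The 64x256x256 volume shape is a made-up example for illustration only.
def num_patches(shape, patch=(4, 16, 16)):
    d, h, w = shape
    pd, ph, pw = patch
    assert d % pd == 0 and h % ph == 0 and w % pw == 0, "shape must divide evenly into patches"
    return (d // pd) * (h // ph) * (w // pw)

print(num_patches((64, 256, 256)))  # 16 * 16 * 16 = 4096 tokens
```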

Quick Start

The easiest way to use NeuroVFM is through our Python package:

from neurovfm import load_encoder, load_diagnostic_head

encoder, preproc = load_encoder("mlinslab/neurovfm-encoder")
dx_head = load_diagnostic_head("mlinslab/neurovfm-dx-ct")

vols = preproc.load_study("/path/to/study/")             # study directory with 1+ DICOM/NIfTI files
embs = encoder.embed(vols)                               # series-wise embeddings

dx = dx_head.predict_proba(embs, top_k=3)
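Since the encoder returns series-wise embeddings, a study-level vector can be formed by pooling across series. The sketch below assumes one fixed-dimensional vector per series (the shape, the 768-dim ViT-Base width, and the pooling choice are illustrative assumptions, not documented NeuroVFM API behavior):

```python
import numpy as np

# Hypothetical stand-in for encoder.embed() output: 5 series from one
# study, each a 768-dim vector (ViT-Base hidden width; shape assumed).
series_embs = np.random.randn(5, 768).astype(np.float32)

# Mean-pooling is one simple way to get a single study-level embedding.
study_emb = series_embs.mean(axis=0)
print(study_emb.shape)  # (768,)
```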

Intended Use

NeuroVFM is designed to be a frozen feature extractor for downstream clinical tasks. It is not a diagnostic device itself.
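A frozen feature extractor is typically paired with a lightweight head trained on task labels. The sketch below fits a logistic-regression probe on synthetic embeddings; all data, dimensions, and the gradient-descent training loop are illustrative assumptions, not the method used for the released diagnostic heads:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for frozen backbone embeddings: two classes with
# shifted means in a 768-dim space (data and dimensions are made up).
X = np.concatenate([rng.normal(0.0, 1.0, (100, 768)),
                    rng.normal(0.5, 1.0, (100, 768))])
y = np.concatenate([np.zeros(100), np.ones(100)])

# Logistic-regression probe trained by plain gradient descent; the
# backbone stays frozen, only this linear head is fit.
w, b = np.zeros(768), 0.0
for _ in range(200):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid predictions
    grad = p - y                             # dLoss/dlogits
    w -= 0.1 * (X.T @ grad) / len(y)
    b -= 0.1 * grad.mean()

acc = (((X @ w + b) > 0) == (y == 1)).mean()
print(f"training accuracy: {acc:.2f}")
```

Swapping the synthetic `X` for real NeuroVFM embeddings is the usual linear-probe evaluation recipe for a frozen backbone.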

Limitations & Safety

This model is a research tool. It has not been approved by the FDA or any regulatory body for clinical use. While trained on a diverse health system population, the model may carry biases intrinsic to the University of Michigan patient cohort. When used for generation (with an LLM), the system may still hallucinate findings, though at a lower rate than pure language models. Outputs must be verified by a clinician.

License
