SDXL Detector - Vision Transformer

Model Description

This model is a specialized binary classifier trained to detect images generated by Stable Diffusion XL (SDXL). It achieves 99.60% accuracy on held-out test data.

Key Features

🎯 Specialist Detector: Optimized specifically for SDXL-generated images
🚀 High Accuracy: 99.60% test accuracy
⚡ Fast Inference: ~10ms per image on GPU
🛡️ Robust: Trained with six separate overfitting-prevention measures
📊 Well-Validated: Separate train/val/test splits with no overlap
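
The ~10ms figure depends on hardware, batch size, and whether warm-up runs are excluded. Below is a minimal sketch of how per-image latency might be measured. `median_latency_ms` is a hypothetical helper, not part of this repository, and the warm-up and run counts are arbitrary choices. On GPU, the timed callable should call `torch.cuda.synchronize()` before returning, because CUDA kernels launch asynchronously.

```python
import statistics
import time


def median_latency_ms(fn, warmup=5, runs=50):
    """Return the median wall-clock time of fn() in milliseconds.

    Hypothetical helper for illustration; not part of this repository.
    """
    # Warm-up calls are discarded so one-time costs (lazy init, caching)
    # do not skew the measurement.
    for _ in range(warmup):
        fn()

    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(timings)
```

With a loaded model and preprocessed inputs, this could be called as `median_latency_ms(lambda: model(**inputs))` inside a `torch.no_grad()` block. The median is used instead of the mean so that occasional slow runs do not dominate the result.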

Performance

Test Accuracy:  0.9960
Precision:      0.9930
Recall:         0.9990
F1 Score:       0.9960
AUC-ROC:        0.9999

False Positive Rate: 0.0070
False Negative Rate: 0.0010
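
These figures are mutually consistent if the test set is balanced. Below is a short worked check, assuming a hypothetical test set of 1,000 real and 1,000 SDXL images. The actual test-set size and counts are not stated here (see confusion_matrix.png), and AUC-ROC cannot be derived from a single confusion matrix.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / (precision + recall),
        "fpr": fp / (fp + tn),
        "fnr": fn / (fn + tp),
    }


# Hypothetical counts chosen to match the reported rates; not the real matrix.
# Positive class = SDXL-generated.
metrics = binary_metrics(tp=999, fp=7, tn=993, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Rounded to four decimals, these counts reproduce the accuracy, precision, recall, F1, and error rates listed above.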

Quick Start

import torch
from PIL import Image
from transformers import ViTForImageClassification, ViTImageProcessor

# Load model and processor
model = ViTForImageClassification.from_pretrained(
    "ash12321/sdxl-detector-vit"
)
processor = ViTImageProcessor.from_pretrained(
    "google/vit-base-patch16-224"
)

# Load image; convert to RGB so grayscale or RGBA files match the 3-channel input
image = Image.open("test.jpg").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

# Get prediction (index 0 = real, index 1 = SDXL-generated)
model.eval()
with torch.no_grad():
    outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)

    if probs[0][1] > 0.5:
        print(f"SDXL-Generated ({probs[0][1]:.2%} confident)")
    else:
        print(f"Real Image ({probs[0][0]:.2%} confident)")
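
The 0.5 cut-off above is one choice of operating point. Where false positives are costlier than misses (or the reverse), the threshold can be moved. Below is a minimal sketch in plain Python, assuming as in the snippet above that index 1 is the SDXL class. `classify_logits` and the 0.9 threshold are illustrative and not part of this repository.

```python
import math


def classify_logits(logits, threshold=0.5):
    """Turn a [real_logit, sdxl_logit] pair into a label and SDXL probability.

    Hypothetical helper; assumes index 1 is the SDXL-generated class.
    """
    # Numerically stable two-class softmax: subtract the max before exp.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    p_sdxl = exps[1] / sum(exps)
    label = "SDXL-Generated" if p_sdxl > threshold else "Real Image"
    return label, p_sdxl


# With the Quick Start objects: classify_logits(outputs.logits[0].tolist(), 0.9)
print(classify_logits([0.2, 1.3]))        # default 0.5 threshold
print(classify_logits([0.2, 1.3], 0.9))   # stricter threshold
```

Any threshold other than 0.5 should be tuned on validation data, since the reported false positive and false negative rates apply only to the default cut-off.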

Using the model.py Helper

from model import detect_image

result = detect_image("test.jpg", model_path="ash12321/sdxl-detector-vit")
print(f"Is Fake: {result['is_fake']}")
print(f"Confidence: {result['confidence']:.2%}")
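
To screen a folder of images, the helper can be called in a loop. The sketch below assumes that `detect_image` returns a dict with `is_fake` and `confidence` keys, as the example above suggests. `scan_folder` and the extension list are hypothetical and not part of model.py. The detector is passed in as a callable so the loop can be exercised without loading the model.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}  # illustrative, not exhaustive


def scan_folder(folder, detect):
    """Run detect(path) on each image in folder; return {filename: result}.

    Hypothetical helper. `detect` is any callable taking a path string and
    returning a dict with 'is_fake' and 'confidence' keys.
    """
    results = {}
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and path.suffix.lower() in IMAGE_SUFFIXES:
            results[path.name] = detect(str(path))
    return results


# Possible use with the repository helper (assumed signature, as shown above):
#   from model import detect_image
#   results = scan_folder(
#       "images/",
#       lambda p: detect_image(p, model_path="ash12321/sdxl-detector-vit"),
#   )
```

Depending on how model.py is written, `detect_image` may reload the weights on every call; if so, loading the model once as in Quick Start will be faster for large folders.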

Files in this Repository

pytorch_model.bin - Model weights
config.json - Model configuration
model.py - Model architecture and helper functions
README.md - This documentation
training_results.json - Detailed training metrics
training_curves.png - Training visualization
confusion_matrix.png - Test set confusion matrix

Citation

@misc{sdxl-detector-vit,
  author = {ash12321},
  title = {SDXL Detector - Vision Transformer},
  year = {2025},
  publisher = {HuggingFace},
  howpublished = {\url{https://huggingface.co/ash12321/sdxl-detector-vit}},
}

License: Apache 2.0
Created: 2025-12-31

Model size: 85.8M params
Tensor type: F32 (Safetensors)
