# Model Card: DenseNet121 for Cervix Type Image Classification
This model classifies cervical images into Type_1, Type_2, Type_3, and an Out-of-Distribution (OOD) category. It uses a DenseNet121 backbone pretrained on ImageNet and fine-tuned on cervical images, including OOD examples from Caltech101.
## Model Details
- Base model: `torchvision.models.densenet121` pretrained on ImageNet
- Input: RGB images (224x224)
- Output: 4 classes: `['Type_1', 'Type_2', 'Type_3', 'OOD']`
- License: MIT
- Training dataset sources:
- Cervical images: Intel MobileODT competition dataset
- OOD images: Caltech101 dataset
- Preprocessing & Augmentation:
  - Resize to 224x224
  - Normalization (ImageNet mean & std)
  - Data augmentation: random rotation, color jitter (brightness/contrast); see the transform sketch below
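A minimal sketch of a training-time transform consistent with the list above; the rotation range and jitter strengths are illustrative assumptions, not the exact values used for this model:

```python
from torchvision import transforms

# Training pipeline: resize, augment, convert, normalize with ImageNet stats.
# The rotation range (15 degrees) and jitter strengths (0.2) are assumptions.
train_transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomRotation(degrees=15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
```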
## Dataset Distribution
| Split | Type_1 | Type_2 | Type_3 | OOD | Total |
|---|---|---|---|---|---|
| Train | 557 | 532 | 547 | 424 | 2060 |
| Validation | 151 | 161 | 154 | 122 | 588 |
| Test | 73 | 88 | 80 | 54 | 295 |
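To assemble the four-class training set, Caltech101 images can be relabeled to the OOD index and concatenated with the cervical data. This is a hedged sketch, not the card's actual data pipeline: the directory layouts (`cervix_train/`, `caltech101_train/`) and the OOD label index are assumptions.

```python
from torch.utils.data import ConcatDataset, Dataset
from torchvision import datasets

OOD_LABEL = 3  # assumed index of the 'OOD' class

class RelabelAsOOD(Dataset):
    """Wrap any dataset and force every target to the OOD class index."""
    def __init__(self, base):
        self.base = base
    def __len__(self):
        return len(self.base)
    def __getitem__(self, idx):
        image, _ = self.base[idx]  # discard the original label
        return image, OOD_LABEL

# Hypothetical layouts: cervix_train/ holds Type_1/Type_2/Type_3 subfolders,
# caltech101_train/ holds the extracted Caltech101 images.
# train_transform comes from the transform sketch above.
cervix = datasets.ImageFolder("cervix_train/", transform=train_transform)
ood = RelabelAsOOD(datasets.ImageFolder("caltech101_train/",
                                        transform=train_transform))
train_set = ConcatDataset([cervix, ood])
```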
## Training Details
- Optimizer: Adam
- Loss: CrossEntropyLoss
- Batch size: 8
- Learning rate: 1e-5
- Epochs: 30
- Device: GPU (Tesla T4, 14GB)
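A condensed training loop matching these settings (Adam, CrossEntropyLoss, batch size 8, learning rate 1e-5, 30 epochs). The backbone and 4-way head follow the rest of this card; `train_set` comes from the dataset sketch above, and everything else is an illustrative assumption:

```python
import torch
from torch import nn, optim
from torch.utils.data import DataLoader
from torchvision import models

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# DenseNet121 backbone pretrained on ImageNet, with a 4-class head.
model = models.densenet121(weights="IMAGENET1K_V1")
model.classifier = nn.Linear(model.classifier.in_features, 4)
model = model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-5)
loader = DataLoader(train_set, batch_size=8, shuffle=True)

for epoch in range(30):
    model.train()
    running_loss = 0.0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f"epoch {epoch + 1}: loss {running_loss / len(loader):.4f}")
```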
## Evaluation
### Evaluation Metrics
| Class | Precision | Recall | F1-score | Sensitivity | Specificity |
|---|---|---|---|---|---|
| OOD | 1.00 | 1.00 | 1.00 | 1.0000 | 1.0000 |
| Type_1 | 0.74 | 0.93 | 0.82 | 0.9333 | 0.9074 |
| Type_2 | 0.85 | 0.51 | 0.64 | 0.5114 | 0.9574 |
| Type_3 | 0.73 | 0.92 | 0.81 | 0.9189 | 0.8762 |
Overall accuracy: 0.81
### Confusion Matrix
Rows are actual classes; columns are predicted classes.

| | OOD | Type_1 | Type_2 | Type_3 |
|---|---|---|---|---|
| OOD | 54 | 0 | 0 | 0 |
| Type_1 | 0 | 56 | 3 | 1 |
| Type_2 | 0 | 19 | 45 | 24 |
| Type_3 | 0 | 1 | 5 | 68 |
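The sensitivity and specificity columns in the metrics table can be reproduced from this matrix: sensitivity is the per-class recall, TP / (TP + FN), and specificity is TN / (TN + FP). A short check:

```python
import numpy as np

# Confusion matrix from above; rows = actual, columns = predicted,
# in the order [OOD, Type_1, Type_2, Type_3].
cm = np.array([[54,  0,  0,  0],
               [ 0, 56,  3,  1],
               [ 0, 19, 45, 24],
               [ 0,  1,  5, 68]])

for i, name in enumerate(["OOD", "Type_1", "Type_2", "Type_3"]):
    tp = cm[i, i]
    fn = cm[i].sum() - tp          # actual class i, predicted elsewhere
    fp = cm[:, i].sum() - tp       # other classes predicted as i
    tn = cm.sum() - tp - fn - fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    print(f"{name}: sensitivity {sensitivity:.4f}, specificity {specificity:.4f}")
```

Running this reproduces the table values, e.g. Type_2 sensitivity 0.5114 and specificity 0.9574.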
### Classification Report
| | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| OOD | 1.00 | 1.00 | 1.00 | 54 |
| Type_1 | 0.74 | 0.93 | 0.82 | 60 |
| Type_2 | 0.85 | 0.51 | 0.64 | 88 |
| Type_3 | 0.73 | 0.92 | 0.81 | 74 |
| Accuracy | | | 0.81 | 276 |
| Macro avg | 0.83 | 0.84 | 0.82 | 276 |
| Weighted avg | 0.82 | 0.81 | 0.80 | 276 |
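A sketch of producing this report with `sklearn.metrics.classification_report`; `test_loader` is a hypothetical DataLoader over the test split, `model` and `device` come from the training sketch above, and the label-index order is assumed to match the class list in Model Details:

```python
import torch
from sklearn.metrics import classification_report, confusion_matrix

model.eval()
all_preds, all_labels = [], []
with torch.no_grad():
    for images, labels in test_loader:  # hypothetical test-split DataLoader
        outputs = model(images.to(device))
        all_preds.extend(outputs.argmax(dim=1).cpu().tolist())
        all_labels.extend(labels.tolist())

# Names must follow the label indices used at training time (assumed here).
names = ["Type_1", "Type_2", "Type_3", "OOD"]
print(classification_report(all_labels, all_preds, target_names=names))
print(confusion_matrix(all_labels, all_preds))
```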
## How to Get Started

```python
import torch
from torchvision import transforms, models
from PIL import Image
# Load model
model = models.densenet121(weights=None)  # 'pretrained' is deprecated; weights are loaded from the checkpoint below
model.classifier = torch.nn.Linear(model.classifier.in_features, 4)
model.load_state_dict(torch.load("Dense_net_121.pth", map_location="cpu"))
model.eval()
# Transform
transform = transforms.Compose([
transforms.Resize((224, 224)),
transforms.ToTensor(),
transforms.Normalize(mean=[0.485,0.456,0.406], std=[0.229,0.224,0.225])
])
# Load image
image = Image.open("example.jpg").convert("RGB")
image = transform(image).unsqueeze(0)
# Predict (inference only, so disable gradient tracking)
with torch.no_grad():
    outputs = model(image)
    probabilities = torch.softmax(outputs, dim=1)
predicted_class = torch.argmax(probabilities, dim=1).item()
confidence = probabilities[0, predicted_class].item()
class_names = ["Type_1", "Type_2", "Type_3", "OOD"]
print(f"Predicted class: {class_names[predicted_class]}, confidence: {confidence:.2f}")
```
---
## Technical Specifications
### Model Architecture
* **Backbone:** DenseNet121 pretrained on ImageNet
* **Output Layer:** Fully connected layer with 4 outputs (`Type_1`, `Type_2`, `Type_3`, `OOD`)
* **Activation:** Softmax for multi-class classification
* **Training Framework:** PyTorch
* **Loss Function:** CrossEntropyLoss
* **Data Handling:** Includes OOD images from Caltech101 along with in-distribution cervical images
* **Preprocessing & Augmentation:** Resize to 224x224, normalization (ImageNet mean/std), random rotation, color jitter
### Compute Infrastructure
* **Hardware:** Tesla T4 GPU (14GB)
* **Software:** PyTorch, torchvision, CUDA
---