# T2.1: Compressed Crop Disease Classifier (INT8 ONNX, 4.34 MB)
A MobileNetV3-Small classifier compressed to 4.34 MB INT8 ONNX, trained for the AIMS KTT Hackathon Tier 2 brief (T2.1). It takes a 224×224 JPEG leaf image and returns one of five labels:

- `bean_spot`: bean angular leaf spot
- `cassava_mosaic`: Cassava Mosaic Disease (CMD)
- `healthy`: healthy maize leaf
- `maize_blight`: maize Northern Leaf Blight
- `maize_rust`: maize common rust
Intended for low-bandwidth, edge-device deployment in rural agricultural contexts. It ships with a FastAPI/ONNX Runtime service and a USSD/SMS fallback pathway for farmers on feature phones.
GitHub: DrUkachi/ktt-crop-disease-classifier
## Evaluation
| Split | Macro-F1 | Notes |
|---|---|---|
| Clean test (150 imgs) | 1.0000 | balanced 30 per class |
| Field-noisy test (150 imgs) | 0.9867 | same images, blur σ ∈ [0, 1.5] + JPEG q ∈ [50, 85] + brightness jitter |
| Δ clean → field | 1.33 pp | brief budget: < 12 pp ✓ |
| INT8 vs FP32 delta | 0.00 pp | MatMul/Gemm-only INT8 is lossless on this backbone |
Per-class confusion matrices and Grad-CAM overlays are in `notebooks/01_train_eval.ipynb`.
Honest caveat on clean F1 = 1.00. PlantVillage (the source for the three maize classes) is a studio-lit dataset with consistent per-class backgrounds, and the five labels span three plant species with very different leaf morphology. ImageNet-pretrained features separate those distributions trivially. The more meaningful number is the 1.33 pp drop on the field-noisy set, which measures generalisation under blur, JPEG re-compression, and brightness jitter.
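For concreteness, a minimal sketch of that field-noise recipe, assuming PIL. The blur and JPEG bounds come from the table above; the ±20% brightness jitter range is an assumption, as is using PIL's blur `radius` for σ.

```python
# Sketch of the field-noise recipe: Gaussian blur with sigma in [0, 1.5],
# JPEG re-compression at quality in [50, 85], and brightness jitter.
# The +/-20% jitter range is an assumption; blur/JPEG bounds are from the table.
import io
import random

from PIL import Image, ImageEnhance, ImageFilter

def field_noise(img: Image.Image) -> Image.Image:
    img = img.filter(ImageFilter.GaussianBlur(radius=random.uniform(0.0, 1.5)))
    img = ImageEnhance.Brightness(img).enhance(random.uniform(0.8, 1.2))
    # Round-trip through an in-memory JPEG to simulate re-compression
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=random.randint(50, 85))
    buf.seek(0)
    return Image.open(buf).convert("RGB")
```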
## Model details
- Architecture: MobileNetV3-Small, ImageNet pretrained, classifier head replaced with a `Linear(576 → 1024 → 5)` stack
- Input: 224 × 224 × 3 RGB, ImageNet mean/std normalization
- Output: 5 logits in this fixed class ordering: `bean_spot`, `cassava_mosaic`, `healthy`, `maize_blight`, `maize_rust`
- Quantization: ONNX Runtime dynamic INT8 on MatMul/Gemm nodes only (the classifier head), preceded by `quant_pre_process` (BN fusion, shape inference). The convolutional backbone stays FP32. See the sketch after this list.
- Why not full-graph INT8: MobileNetV3's Hardswish activations and Squeeze-and-Excitation blocks regress badly under ORT static INT8 (clean F1 ≈ 0.73) and collapse entirely under full-graph dynamic INT8 (clean F1 ≈ 0.07, always-one-class). QAT would likely fix this, but it was out of scope for the 4-hour brief cap. Full empirical details are in `process_log.md`.
- Inference: CPU-only via ONNX Runtime (`CPUExecutionProvider`), with observed latency of ~3–5 ms per image
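A minimal sketch of that MatMul/Gemm-only dynamic INT8 step, using the standard ONNX Runtime quantization API. The file names are illustrative; the repo's script is the authoritative version.

```python
# Sketch of the MatMul/Gemm-only dynamic INT8 quantization described above.
# File names are illustrative; see the repo for the authoritative script.
from onnxruntime.quantization import QuantType, quantize_dynamic
from onnxruntime.quantization.shape_inference import quant_pre_process

# BN fusion and shape inference before quantization
quant_pre_process("model_fp32.onnx", "model_prep.onnx")

# Quantize only the MatMul/Gemm nodes (the classifier head);
# the convolutional backbone stays FP32
quantize_dynamic(
    "model_prep.onnx",
    "model.onnx",
    weight_type=QuantType.QInt8,
    op_types_to_quantize=["MatMul", "Gemm"],
)
```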
## Training
- Hardware: NVIDIA L4 (23 GB)
- Run time: full 15-epoch training took 40.2 seconds
- Optimiser: AdamW, LR 5e-4, weight decay 1e-4, cosine annealing over 15 epochs
- Loss: class-weighted cross-entropy
- Batch size: 64
- Train-time augmentation: horizontal flip, ±10° rotation, mild colour jitter (brightness/contrast 0.2, saturation 0.1)
- Best epoch: 2
Blur and JPEG re-compression were deliberately excluded from training so the clean → field gap remains an honest robustness check.
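A minimal sketch of the setup described above, assuming torch/torchvision. The exact head composition, transform order, and class-weight computation are assumptions; the hyperparameters are from the list above.

```python
# Sketch of the training setup (head swap, AdamW + cosine, augmentation).
# Head composition and transform order are assumptions.
import torch
from torch import nn
from torchvision import models, transforms

model = models.mobilenet_v3_small(weights="IMAGENET1K_V1")
# Replace the final classifier layer so the head is Linear(576 -> 1024 -> 5)
model.classifier[3] = nn.Linear(1024, 5)

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4, weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=15)

# class_weights would come from the training-set label counts (placeholder here)
class_weights = torch.ones(5)
criterion = nn.CrossEntropyLoss(weight=class_weights)

train_tfms = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(10),  # ±10°
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.1),
    transforms.ToTensor(),
    transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
])
```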
## Training data

Assembled by `generate_dataset.py` from three public Hugging Face dataset mirrors:

| Class | HF dataset | Label |
|---|---|---|
| `bean_spot` | `AI-Lab-Makerere/beans` | idx 0 `angular_leaf_spot` |
| `cassava_mosaic` | `dpdl-benchmark/cassava` | idx 3 `CMD` |
| `healthy` | `BrandonFors/Plant-Diseases-PlantVillage-Dataset` | idx 10 `Corn_(maize)___healthy` |
| `maize_blight` | same | idx 9 `Corn_(maize)___Northern_Leaf_Blight` |
| `maize_rust` | same | idx 8 `Corn_(maize)___Common_rust_` |
There are 300 images per class, with an 80/10/10 train/val/test split using seed 1337. Full provenance (per-image source IDs) is recorded in `data/manifest.json` after the generator runs.
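As an illustration of the assembly step, here is roughly how one class is pulled from its mirror, assuming the `datasets` library. The `labels` column name is an assumption; `generate_dataset.py` is the authoritative implementation.

```python
# Illustrative pull of the bean_spot class (label idx 0) from its HF mirror.
# The "labels" column name is an assumption; generate_dataset.py is authoritative.
from datasets import load_dataset

beans = load_dataset("AI-Lab-Makerere/beans", split="train")
bean_spot = beans.filter(lambda ex: ex["labels"] == 0)  # angular_leaf_spot
print(f"{len(bean_spot)} candidate bean_spot images")
```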
## Usage

### With ONNX Runtime directly
```python
import numpy as np
import onnxruntime as ort
from PIL import Image

CLASSES = ["bean_spot", "cassava_mosaic", "healthy", "maize_blight", "maize_rust"]
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

# Load, resize, and normalise exactly as in training (ImageNet mean/std)
img = Image.open("maize_rust.jpg").convert("RGB").resize((224, 224))
arr = (np.asarray(img, dtype=np.float32) / 255.0 - MEAN) / STD

# HWC -> NCHW with a batch dimension of 1
arr = arr.transpose(2, 0, 1)[None, ...].astype(np.float32)

logits = sess.run(None, {sess.get_inputs()[0].name: arr})[0][0]
print(CLASSES[int(logits.argmax())])
```
### As a FastAPI service
```bash
git clone https://github.com/DrUkachi/ktt-crop-disease-classifier.git
cd ktt-crop-disease-classifier
pip install -r service/requirements.txt
uvicorn service.app:app --host 0.0.0.0 --port 8000

curl -X POST -F 'image=@samples/maize_rust_1.jpg' http://localhost:8000/predict
```
The service returns `{ label, confidence, top3, latency_ms, rationale }` and adds `escalation: "second_photo_different_angle"` when confidence < 0.6.
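For example, a minimal Python client against that endpoint; the field names follow the response shape above, and using `requests` (rather than the `curl` call shown) is just one option.

```python
# Minimal client for POST /predict; response fields follow the schema above.
import requests

with open("samples/maize_rust_1.jpg", "rb") as f:
    body = requests.post(
        "http://localhost:8000/predict", files={"image": f}
    ).json()

print(body["label"], body["confidence"], body["latency_ms"])
if body["confidence"] < 0.6:
    # Low-confidence responses also carry an escalation hint
    print("escalate:", body.get("escalation"))
```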
## Limitations and intended use
- Trained on ~1,200 studio-lit and smartphone-quality images
- Performance on microscope, UV, or non-leaf substrate images is not characterised
- The five classes do not cover all realistic field scenarios
- The service exposes `top3` and an `escalation` field so the consuming PWA can route low-confidence cases to a human extension officer
- Training data provenance is inherited from the upstream Hugging Face mirrors
- The model card does not evaluate fairness across cultivars, soil types, or geographies
## License
MIT, matching the GitHub repo.
## Citation

```bibtex
@misc{osisiogu2026ktt,
  author       = {Osisiogu, Ukachi},
  title        = {Compressed Crop Disease Classifier (AIMS KTT T2.1)},
  year         = {2026},
  howpublished = {\url{https://github.com/DrUkachi/ktt-crop-disease-classifier}},
}
```
Upstream dataset credits: PlantVillage (Mohanty et al. 2016), Cassava Leaf Disease (Mwebaze et al. 2019, Kaggle 2020), and iBeans (Makerere AI Lab 2020).