T2.1 β€” Compressed Crop Disease Classifier (INT8 ONNX, 4.34 MB)

A MobileNetV3-Small classifier compressed to 4.34 MB INT8 ONNX, trained for the AIMS KTT Hackathon Tier 2 brief (T2.1). It takes a 224Γ—224 JPEG leaf image and returns one of five labels:

  • bean_spot β€” bean angular leaf spot
  • cassava_mosaic β€” Cassava Mosaic Disease (CMD)
  • healthy β€” healthy maize leaf
  • maize_blight β€” maize Northern Leaf Blight
  • maize_rust β€” maize common rust

Intended for low-bandwidth, edge-device deployment in rural agricultural contexts. It ships with a FastAPI/ONNX Runtime service and a USSD/SMS fallback pathway for farmers on feature phones.

GitHub: DrUkachi/ktt-crop-disease-classifier


Evaluation

| Split | Macro-F1 | Notes |
|---|---|---|
| Clean test (150 imgs) | 1.0000 | balanced, 30 per class |
| Field-noisy test (150 imgs) | 0.9867 | same images, blur σ ∈ [0, 1.5] + JPEG q ∈ [50, 85] + brightness jitter |
| Δ clean → field | 1.33 pp | brief budget: < 12 pp ✅ |
| INT8 vs FP32 delta | 0.00 pp | MatMul/Gemm-only INT8 is lossless on this backbone |

Per-class confusion matrices and Grad-CAM overlays are in notebooks/01_train_eval.ipynb.

Honest caveat on clean F1 = 1.00. PlantVillage (the source for the three maize classes) is a studio-lit dataset with consistent per-class backgrounds, and the five labels span three plant species with very different leaf morphology. ImageNet-pretrained features separate those distributions trivially. The more meaningful number is the 1.33 pp drop on the field-noisy set, which measures generalisation under blur, JPEG re-compression, and brightness jitter.

Model details

  • Architecture: MobileNetV3-Small, ImageNet pretrained, classifier head replaced with a Linear(576 β†’ 1024 β†’ 5) stack
  • Input: 224 Γ— 224 Γ— 3 RGB, ImageNet mean/std normalization
  • Output: 5 logits in this fixed class ordering: bean_spot, cassava_mosaic, healthy, maize_blight, maize_rust
  • Quantization: ONNX Runtime dynamic INT8 on MatMul/Gemm nodes only (the classifier head), preceded by quant_pre_process (BN fusion, shape inference). The convolutional backbone stays FP32.
  • Why not full-graph INT8: MobileNetV3’s Hardswish activations and Squeeze-and-Excitation blocks regress badly under ORT static INT8 (clean F1 β†’ 0.73) and collapse entirely under full-graph dynamic INT8 (clean F1 β†’ 0.07, always-one-class). QAT would likely fix this, but it was out of scope for the 4-hour brief cap. Full empirical details are in process_log.md.
  • Inference: CPU-only via ONNX Runtime (CPUExecutionProvider), with observed latency of ~3–5 ms per image

Training

  • Hardware: NVIDIA L4 (23 GB)
  • Run time: full 15-epoch training took 40.2 seconds
  • Optimiser: AdamW, LR 5e-4, weight decay 1e-4, cosine annealing over 15 epochs
  • Loss: class-weighted cross-entropy
  • Batch size: 64
  • Train-time augmentation: horizontal flip, Β±10Β° rotation, mild colour jitter (brightness/contrast 0.2, saturation 0.1)
  • Best epoch: 2

Blur and JPEG re-compression were deliberately excluded from training so the clean β†’ field gap remains an honest robustness check.
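The field-noise recipe from the evaluation table (blur σ ∈ [0, 1.5], JPEG q ∈ [50, 85], brightness jitter) can be sketched with Pillow. The ±20% brightness range below is an assumption, since the exact jitter amplitude isn't stated here:

```python
import io
import random

import numpy as np
from PIL import Image, ImageEnhance, ImageFilter


def field_noise(img: Image.Image, rng: random.Random) -> Image.Image:
    """Apply the field-noise recipe: blur, brightness jitter, JPEG re-compression."""
    # Gaussian blur with sigma drawn from [0, 1.5]
    img = img.filter(ImageFilter.GaussianBlur(radius=rng.uniform(0.0, 1.5)))
    # Brightness jitter (the ±20% range here is assumed)
    img = ImageEnhance.Brightness(img).enhance(rng.uniform(0.8, 1.2))
    # JPEG re-compression at quality drawn from [50, 85]
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=rng.randint(50, 85))
    buf.seek(0)
    return Image.open(buf).convert("RGB")


rng = random.Random(0)
noisy = field_noise(Image.fromarray(np.zeros((224, 224, 3), dtype=np.uint8)), rng)
print(noisy.size, noisy.mode)
```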

Training data

Assembled by generate_dataset.py from three public Hugging Face dataset mirrors:

| Class | HF dataset | Label idx | Upstream label |
|---|---|---|---|
| bean_spot | AI-Lab-Makerere/beans | 0 | angular_leaf_spot |
| cassava_mosaic | dpdl-benchmark/cassava | 3 | CMD |
| healthy | BrandonFors/Plant-Diseases-PlantVillage-Dataset | 10 | Corn_(maize)___healthy |
| maize_blight | same | 9 | Corn_(maize)___Northern_Leaf_Blight |
| maize_rust | same | 8 | Corn_(maize)___Common_rust_ |

There are 300 images per class, with an 80/10/10 train/val/test split using seed 1337. Full provenance (per-image source IDs) is recorded in data/manifest.json after the generator runs.
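A seeded 80/10/10 split like the one above can be sketched in a few lines; this is an illustration only, and the actual generate_dataset.py may differ (e.g. it may stratify per class or shuffle differently):

```python
import random


def split_ids(ids, seed=1337):
    """Deterministic 80/10/10 train/val/test split of a list of image IDs."""
    ids = sorted(ids)                     # fix ordering before seeding
    random.Random(seed).shuffle(ids)      # same seed -> same split every run
    n = len(ids)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    return ids[:n_train], ids[n_train:n_train + n_val], ids[n_train + n_val:]


# 300 images per class -> 240 / 30 / 30
train, val, test = split_ids(range(300))
print(len(train), len(val), len(test))  # 240 30 30
```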

Usage

With ONNX Runtime directly

```python
import numpy as np
import onnxruntime as ort
from PIL import Image

CLASSES = ["bean_spot", "cassava_mosaic", "healthy", "maize_blight", "maize_rust"]
MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)

sess = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
img = Image.open("maize_rust.jpg").convert("RGB").resize((224, 224))
arr = (np.asarray(img, dtype=np.float32) / 255.0 - MEAN) / STD
arr = arr.transpose(2, 0, 1)[None, ...].astype(np.float32)
logits = sess.run(None, {sess.get_inputs()[0].name: arr})[0][0]
print(CLASSES[int(logits.argmax())])
```

As a FastAPI service

```shell
git clone https://github.com/DrUkachi/ktt-crop-disease-classifier.git
cd ktt-crop-disease-classifier
pip install -r service/requirements.txt
uvicorn service.app:app --host 0.0.0.0 --port 8000

curl -X POST -F 'image=@samples/maize_rust_1.jpg' http://localhost:8000/predict
```

The service returns { label, confidence, top3, latency_ms, rationale } and adds escalation: "second_photo_different_angle" when confidence < 0.6.
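The confidence/escalation logic can be sketched as a softmax over the logits plus a threshold check. This mirrors the documented response shape (label, confidence, top3, escalation); it is an illustration, not the service's actual implementation:

```python
import numpy as np

CLASSES = ["bean_spot", "cassava_mosaic", "healthy", "maize_blight", "maize_rust"]


def postprocess(logits, threshold=0.6):
    """Turn raw logits into a response dict with the documented escalation rule."""
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    order = np.argsort(probs)[::-1]         # classes sorted by probability
    resp = {
        "label": CLASSES[order[0]],
        "confidence": float(probs[order[0]]),
        "top3": [[CLASSES[i], float(probs[i])] for i in order[:3]],
    }
    if resp["confidence"] < threshold:      # low confidence -> ask for a retake
        resp["escalation"] = "second_photo_different_angle"
    return resp


confident = postprocess(np.array([5.0, 0.0, 0.0, 0.0, 0.0]))
uncertain = postprocess(np.zeros(5))        # uniform logits -> confidence 0.2
print(confident["label"], "escalation" in confident, "escalation" in uncertain)
```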


Limitations and intended use

  • Trained on ~1,200 studio-lit and smartphone-quality images
  • Performance on microscope, UV, or non-leaf substrate images is not characterised
  • The five classes do not cover all realistic field scenarios
  • The service exposes top3 and an escalation field so the consuming PWA can route low-confidence cases to a human extension officer
  • Training data provenance is inherited from the upstream Hugging Face mirrors
  • The model card does not evaluate fairness across cultivars, soil types, or geographies

License

MIT, matching the GitHub repo.

Citation

```bibtex
@misc{osisiogu2026ktt,
  author = {Osisiogu, Ukachi},
  title = {Compressed Crop Disease Classifier (AIMS KTT T2.1)},
  year = {2026},
  howpublished = {\url{https://github.com/DrUkachi/ktt-crop-disease-classifier}},
}
```

Upstream dataset credits: PlantVillage (Mohanty et al. 2016), Cassava Leaf Disease (Mwebaze et al. 2019, Kaggle 2020), and iBeans (Makerere AI Lab 2020).
