FloraScan
AI-powered papaya crop diagnostic tool using a two-stage ML pipeline: EfficientNetB0 for disease image classification, followed by a Hugging Face LLM generating human-readable treatment recommendations.
Tech Stack
Stakeholders
Academic Supervisor
Research guidance, model evaluation criteria, and thesis assessment
Smallholder Farmers (End Users)
Target users, who assessed usability and diagnostic trust during field testing
UTM Department of Computing
Project scope approval and academic evaluation panel
Zafran (ML Developer + Backend)
Dataset curation, model training, FastAPI backend, Docker deployment, React Native frontend
The Problem
Malaysian smallholder papaya farmers lack affordable, real-time access to crop disease diagnostics. Agronomist consultations are expensive and slow — by the time a diagnosis is made, crop damage has spread significantly.
The Solution
Trained a custom EfficientNetB0 model on a curated papaya disease dataset for multi-class image classification. A second stage pipes the classification result into a Hugging Face LLM that generates plain-language treatment recommendations in Bahasa Malaysia and English.
Architecture
Two-stage inference pipeline exposed via a FastAPI REST API. Stage 1: A fine-tuned EfficientNetB0 model classifies the uploaded leaf image into one of 7 categories (healthy plus 6 disease types). Stage 2: The label and confidence score are passed as a structured prompt to the Hugging Face Inference API (Mistral-7B-Instruct), which generates a treatment recommendation paragraph. The mobile app sends images to the API and renders the results.
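As an illustration of Stage 2, the sketch below shows one way the label and confidence score could be turned into a structured prompt. The Classification type, the build_prompt function, the prompt wording, the language codes, and the example label are all hypothetical; the project's actual prompt is not shown here.

```python
from dataclasses import dataclass

# Hypothetical container for the Stage 1 output.
@dataclass(frozen=True)
class Classification:
    label: str         # predicted class name, e.g. "anthracnose" (example only)
    confidence: float  # probability in the range 0.0 to 1.0

# Assumed language codes for the two supported output languages.
LANGUAGE_NAMES = {"en": "English", "ms": "Bahasa Malaysia"}

def build_prompt(result: Classification, language: str = "en") -> str:
    """Format a classification result as a structured prompt for the LLM."""
    if language not in LANGUAGE_NAMES:
        raise ValueError(f"unsupported language: {language!r}")
    return (
        "You are an agricultural advisor for smallholder papaya farmers.\n"
        f"Diagnosis: {result.label}\n"
        f"Model confidence: {result.confidence:.0%}\n"
        f"Respond in {LANGUAGE_NAMES[language]} with a short, "
        "actionable treatment recommendation."
    )

prompt = build_prompt(Classification("anthracnose", 0.87), language="ms")
```

Keeping the label and confidence on separate, fixed lines makes the prompt easy to log and to compare across iterations of prompt tuning.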
- 01
React Native Frontend
Camera capture and gallery picker for leaf image input. Displays classification result with confidence bar and LLM-generated treatment text. Supports both English and Bahasa Malaysia output.
- 02
FastAPI Backend
Single /predict endpoint accepts multipart image upload. Preprocesses image (resize, normalize) before passing to the TF SavedModel. Constructs prompt from classification output and calls HF Inference API. Returns structured JSON.
- 03
EfficientNetB0 Model
Fine-tuned on a curated dataset of 2,800 papaya leaf images across 7 classes (healthy + 6 disease types). Transfer learning from ImageNet weights. Saved as TensorFlow SavedModel for efficient serving.
- 04
Hugging Face LLM Layer
Mistral-7B-Instruct via HF Inference API. Prompt engineering to produce structured, actionable recommendations. Fallback to a static recommendation template if API call fails or exceeds timeout.
- 05
Docker Deployment
FastAPI + TF model containerised in a single image. Model weights are baked into the image at build time, so the container needs no separate weight download at startup. Deployed on a self-hosted Ubuntu server.
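The fallback behaviour described for the LLM layer could look roughly like the following. This is a minimal sketch using only the standard library: recommend_with_fallback, FALLBACK_TEMPLATE, and the 5-second default timeout are assumptions, and generate stands in for whatever function actually calls the HF Inference API.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# Hypothetical static template used when the LLM call is unavailable.
FALLBACK_TEMPLATE = (
    "Detected: {label}. A detailed recommendation is unavailable right now; "
    "please try again later."
)

def recommend_with_fallback(
    generate: Callable[[str], str],
    label: str,
    timeout_s: float = 5.0,  # assumed value, not taken from the project
) -> str:
    """Return the LLM text, or the static template on error, timeout, or empty output."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(generate, label)
    try:
        # Raises on timeout and re-raises any exception thrown by generate().
        text = future.result(timeout=timeout_s).strip()
    except Exception:
        text = ""
    finally:
        # Do not block the request on a slow call; the worker finishes in the background.
        pool.shutdown(wait=False)
    return text or FALLBACK_TEMPLATE.format(label=label)
```

In a real service the timeout would usually also be set on the HTTP client itself, so that the abandoned request is cancelled rather than left running.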
Dev Setup
Prerequisites
- Python 3.11+
- Docker
- Hugging Face API key
- Expo CLI (for mobile)
- Set HF_API_KEY and MODEL_PATH in the environment
- Optionally, download pre-trained weights from releases
- The API runs on localhost:8000
Challenges
- 01
Small dataset with high class imbalance
The papaya disease dataset had severe imbalance — healthy samples outnumbered some disease classes 8:1. Addressed with weighted loss functions, aggressive data augmentation (random crop, flip, brightness jitter, mixup), and class-weighted sampling during training. Validation F1 improved from 0.74 to 0.91 after these interventions.
- 02
LLM latency in a mobile context
The HF Inference API adds 1.5–2.5s of latency to each request, which felt unacceptable on slow Malaysian mobile networks. Implemented an optimistic UI: the app shows the classification result immediately while the LLM recommendation streams in separately, so the user sees useful output within 1 second.
- 03
Prompt engineering for agricultural domain
General-purpose LLM prompts produced verbose, academic-sounding recommendations. Iteratively refined the system prompt with few-shot examples of good recommendations (concise, actionable, locally relevant). Also added a post-processing step to strip disclaimers and out-of-domain phrases such as 'consult a doctor'.
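For the class imbalance in challenge 01, a common way to derive loss weights is inverse class frequency: n_samples / (n_classes * class_count). The sketch below assumes that heuristic; the project does not state which weighting scheme was actually used, and the function name and toy labels are illustrative only.

```python
from collections import Counter
from typing import Dict, Hashable, Sequence

def balanced_class_weights(labels: Sequence[Hashable]) -> Dict[Hashable, float]:
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Toy data with the 8:1 imbalance mentioned above (not the real dataset).
labels = ["healthy"] * 8 + ["diseased"] * 1
weights = balanced_class_weights(labels)
# weights["healthy"] is 9 / (2 * 8) = 0.5625 and weights["diseased"] is 9 / (2 * 1) = 4.5
```

With an 8:1 split, the minority class is weighted 8 times higher than the majority class. Keras accepts a mapping like this through the class_weight argument of model.fit, keyed by integer class index rather than by class name.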
What I Learned
- 01
Transfer learning with EfficientNet is remarkably effective even on small, domain-specific datasets — you don't need millions of images.
- 02
Dataset quality beats quantity: cleaning mislabelled images had more impact than adding more raw data.
- 03
Optimistic UI patterns significantly improve perceived performance on slow networks — show what you have immediately.
- 04
Prompt engineering is an iterative craft — budget time for it like you would for model tuning.