Academic · Final Year Project · ML Developer · 2024

FloraScan

AI-powered papaya crop diagnostic tool using a two-stage ML pipeline: EfficientNetB0 for disease image classification, followed by a Hugging Face LLM generating human-readable treatment recommendations.

92% · Disease classification accuracy
2-stage · Vision + language pipeline
< 3s · End-to-end inference time

Tech Stack

Python · FastAPI · TensorFlow · EfficientNetB0 · Hugging Face · Docker · React Native

Stakeholders

Academic Supervisor

Research guidance, model evaluation criteria, and thesis assessment

Smallholder Farmers (End Users)

Target users who evaluated usability and diagnostic trust during field testing

UTM Department of Computing

Project scope approval and academic evaluation panel

Zafran (ML Developer + Backend)

Dataset curation, model training, FastAPI backend, Docker deployment, React Native frontend

The Problem

Malaysian smallholder papaya farmers lack affordable, real-time access to crop disease diagnostics. Agronomist consultations are expensive and slow: by the time a diagnosis is made, crop damage has spread significantly.

The Solution

Trained a custom EfficientNetB0 model on a curated papaya disease dataset for multi-class image classification. A second stage pipes the classification result into a Hugging Face LLM that generates plain-language treatment recommendations in Bahasa Malaysia and English.

Architecture

Two-stage inference pipeline exposed via a FastAPI REST API. Stage 1: a fine-tuned EfficientNetB0 model classifies the uploaded leaf image into one of 7 classes (healthy plus 6 disease types). Stage 2: the label and confidence score are passed as a structured prompt to the Hugging Face Inference API (Mistral-7B-Instruct), which generates a treatment recommendation paragraph. The mobile app sends images to the API and renders the results.
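
The hand-off between the two stages can be sketched as below. This is a minimal illustration, not the project's code: the names `Prediction`, `top_prediction`, `build_prompt`, and `build_response`, the prompt wording, and the JSON field names are all assumptions, and the class names in the usage note are placeholders because the source does not list the disease classes.

```python
from dataclasses import dataclass, asdict


@dataclass
class Prediction:
    label: str
    confidence: float  # softmax probability in [0, 1]


def top_prediction(probabilities, class_names):
    """Stage 1 output: pick the most likely class from a probability vector."""
    if len(probabilities) != len(class_names):
        raise ValueError("one probability is required per class name")
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return Prediction(class_names[best], float(probabilities[best]))


def build_prompt(pred, language="en"):
    """Stage 2 input: turn the label and confidence into a structured prompt."""
    # Assumed language codes: "en" for English, "ms" for Bahasa Malaysia.
    language_name = {"en": "English", "ms": "Bahasa Malaysia"}[language]
    return (
        f"Diagnosis: {pred.label}\n"
        f"Confidence: {pred.confidence:.0%}\n"
        "Write a short, actionable treatment recommendation "
        f"for a papaya farmer in {language_name}."
    )


def build_response(pred, recommendation):
    """Shape of the structured JSON returned to the mobile app (assumed fields)."""
    return {"prediction": asdict(pred), "recommendation": recommendation}
```

For example, `top_prediction([0.1, 0.7, 0.2], ["healthy", "disease_a", "disease_b"])` selects `disease_a`, and `build_prompt` on that result includes the line `Confidence: 70%`.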

  1. React Native Frontend

    Camera capture and gallery picker for leaf image input. Displays classification result with confidence bar and LLM-generated treatment text. Supports both English and Bahasa Malaysia output.

  2. FastAPI Backend

    Single /predict endpoint accepts multipart image upload. Preprocesses image (resize, normalize) before passing to the TF SavedModel. Constructs prompt from classification output and calls HF Inference API. Returns structured JSON.

  3. EfficientNetB0 Model

    Fine-tuned on a curated dataset of 2,800 papaya leaf images across 7 classes (healthy + 6 disease types). Transfer learning from ImageNet weights. Saved as TensorFlow SavedModel for efficient serving.

  4. Hugging Face LLM Layer

    Mistral-7B-Instruct via HF Inference API. Prompt engineering to produce structured, actionable recommendations. Fallback to a static recommendation template if API call fails or exceeds timeout.

  5. Docker Deployment

    FastAPI + TF model containerised in a single image. Model weights are baked into the image at build time, so the container does not download weights at startup. Deployed on a self-hosted Ubuntu server.
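
The fallback behaviour of the LLM layer (item 4) can be sketched as follows. This is an illustration under stated assumptions, not the project's implementation: `recommend_with_fallback`, `FALLBACK_TEMPLATE`, the 5-second default timeout, and the template wording are all hypothetical, and `generate` stands in for whatever function actually calls the HF Inference API.

```python
from concurrent.futures import ThreadPoolExecutor

# Placeholder wording; the project's real static template is not shown in the source.
FALLBACK_TEMPLATE = (
    "Detected: {label}. A detailed recommendation is temporarily "
    "unavailable. Please try again later."
)


def recommend_with_fallback(generate, prompt, label, timeout_s=5.0):
    """Return generate(prompt), or the static template on failure or timeout.

    `generate` is any callable that takes the prompt and returns text.
    """
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(generate, prompt)
    try:
        # Raises a timeout error if no result arrives within timeout_s, and
        # re-raises any exception thrown inside generate().
        return future.result(timeout=timeout_s)
    except Exception:
        return FALLBACK_TEMPLATE.format(label=label)
    finally:
        # Do not block on a slow call; the worker thread is not killed and
        # finishes in the background.
        pool.shutdown(wait=False)
```

Passing the API call in as a parameter keeps the timeout logic testable without network access; a real deployment would likely also set a timeout on the HTTP client itself.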

Dev Setup

Prerequisites

  • Python 3.11+
  • Docker
  • Hugging Face API key
  • Expo CLI (for mobile)

bash (setup)
$ git clone https://github.com/zafransakowi/florascan && cd florascan
$ python -m venv venv && source venv/bin/activate
$ pip install -r requirements.txt
$ cp .env.example .env
# Set HF_API_KEY and MODEL_PATH in .env
# To train the model: python train.py --epochs 30 --data ./dataset
# Or download pre-trained weights from releases
$ uvicorn app.main:app --reload
# API on localhost:8000
# Docker: docker build -t florascan . && docker run -p 8000:8000 florascan
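
The two settings named in the setup steps, `HF_API_KEY` and `MODEL_PATH`, could be read and validated at startup along these lines. This is a sketch only: the `Settings` class and `load_settings` function are hypothetical names, and the source does not show how the backend actually loads its configuration.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    hf_api_key: str
    model_path: str


def load_settings(env=None):
    """Read required settings from the environment and fail fast if any is missing."""
    env = os.environ if env is None else env
    required = ("HF_API_KEY", "MODEL_PATH")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required settings: " + ", ".join(missing))
    return Settings(hf_api_key=env["HF_API_KEY"], model_path=env["MODEL_PATH"])
```

Failing at startup gives a clear error message instead of a failed request later, when the first image is uploaded.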

Challenges

  1. Small dataset with high class imbalance

    The papaya disease dataset was severely imbalanced: healthy samples outnumbered some disease classes 8:1. Addressed this with weighted loss functions, aggressive data augmentation (random crop, flip, brightness jitter, mixup), and class-weighted sampling during training. Validation F1 improved from 0.74 to 0.91 after these interventions.

  2. LLM latency in a mobile context

    The HF Inference API adds 1.5–2.5s of latency to each request, which felt unacceptable on slow Malaysian mobile networks. Implemented an optimistic UI: the app shows the classification result immediately while the LLM recommendation streams in separately, so the user sees useful output within 1 second.

  3. Prompt engineering for agricultural domain

    General-purpose LLM prompts produced verbose, academic-sounding recommendations. Iteratively refined the system prompt with few-shot examples of good recommendations (concise, actionable, locally-relevant). Also added a post-processing step to strip disclaimers and references to 'consult a doctor'.
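
For the class imbalance in the first challenge, one common way to derive loss weights is inverse-frequency weighting. The sketch below is an assumption, not the project's training code: the function name `balanced_class_weights` is illustrative, and the source does not say which weighting formula was used.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count).

    Rare classes get weights above 1 and frequent classes get weights
    below 1, so every class contributes equally to the total loss.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}
```

With an 8:1 imbalance, for example 800 healthy labels and 100 labels of one disease class, the disease class receives a weight 8 times that of the healthy class. With integer class indices as keys, a dictionary like this can be passed to Keras via `model.fit(..., class_weight=weights)`.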

What I Learned

  • Transfer learning with EfficientNet is remarkably effective even on small, domain-specific datasets; you don't need millions of images.

  • Dataset quality beats quantity: cleaning mislabelled images had more impact than adding more raw data.

  • Optimistic UI patterns significantly improve perceived performance on slow networks: show what you have immediately.

  • Prompt engineering is an iterative craft; budget time for it like you would for model tuning.