title: Deep Transfer Learning Models for Tomato Disease Detection
authors: Ouhami, Maryam; Es-Saady, Youssef; Hajji, Mohamed El; Hafiane, Adel; Canals, Raphael; Yassa, Mostafa El
date: 2020-06-05
journal: Image and Signal Processing
DOI: 10.1007/978-3-030-51935-3_7

Vegetable crops in Morocco, and especially in the Sous-Massa region, are exposed to parasitic diseases and pest attacks that affect the quantity and the quality of agricultural production. Precision farming has been introduced as one of the biggest revolutions in agriculture: it is committed to improving crop protection by identifying, analyzing and managing variability, delivering effective treatment in the right place, at the right time and at the right rate. The main purpose of this study is to find the most suitable machine learning model for detecting tomato crop diseases in standard RGB images. To deal with this problem, we consider the deep learning models DenseNet (161- and 121-layer variants) and VGG16, with transfer learning. Our study is based on images of infected plant leaves divided into six types of infections, covering pest attacks and plant diseases. The results are promising, with accuracies of up to 95.65% for DenseNet161, 94.93% for DenseNet121 and 90.58% for VGG16.

With a surface area of nearly 8.7 million hectares, according to the Moroccan department of agriculture, the agricultural sector produces a very wide range of products and generates 13% of the gross domestic product (GDP) [1]. The sector's contribution to GDP has evolved significantly thanks to the exploitation of fertilization and plant protection systems [1]. Despite these efforts, it still faces important challenges, such as diseases. A pathogen is the factor that causes disease in a plant; disease is induced either by physical factors, such as sudden climate changes, or by chemical/biological factors, such as viruses and fungi [2]. Market-garden crops, and especially tomato crops in the Sous-Massa region, are among the crops exposed to several risks that reduce the quantity and quality of agricultural products. The most important damage is caused by pest attacks (leafminer flies, Tuta absoluta and thrips) and by cryptogamic pathogen infections (early blight, late blight and powdery mildew). Since diagnosis can be performed on plant leaves, our study is framed as a task of classifying symptoms and damage on those leaves. Ground imaging with an RGB camera is an interesting way to perform this diagnosis; however, robust algorithms are required to deal with varying acquisition conditions: light changes, color calibration, etc.

For several years, great efforts have been devoted to the study of plant disease detection. Both feature-engineering models [3-6] and Convolutional Neural Networks (CNN) [7-10] have been applied to this task. In [6], the study is based on a database of 120 images of infected rice leaves divided into three classes: bacterial leaf blight, brown spot and leaf smut (40 images per class). The authors converted the RGB images to the HSV color space to identify lesions, with a segmentation accuracy of up to 96.71% using k-means. Experiments were then carried out to classify the images based on multiple combinations of extracted characteristics (texture, color and shape) using a Support Vector Machine (SVM). The weakness of this method is its moderate classification accuracy of 73.33%; a minimal sketch of this kind of feature-engineering pipeline is given below.
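As a concrete illustration of such a classic pipeline, the following sketch combines HSV conversion, k-means lesion segmentation and an SVM over simple region statistics. It is a minimal sketch in the spirit of [6], not the cited authors' code: the cluster count, the lesion-selection heuristic and the feature set are our own illustrative assumptions.

```python
# Minimal sketch of an HSV + k-means + SVM pipeline in the spirit of [6].
# The cluster count, lesion-selection heuristic and features are illustrative
# assumptions, not the settings of the cited study.
import cv2
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def segment_lesions(bgr_image, n_clusters=3):
    """Cluster HSV pixels with k-means and return a binary mask of the
    cluster with the lowest mean brightness, assumed here to be lesions."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    pixels = hsv.reshape(-1, 3).astype(np.float32)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(pixels)
    # Pick the cluster with the lowest mean V (brightness) channel as "lesion".
    means = [pixels[labels == k, 2].mean() for k in range(n_clusters)]
    return (labels == int(np.argmin(means))).reshape(hsv.shape[:2])

def extract_features(bgr_image, mask):
    """Mean/std color statistics over the segmented region, standing in for
    the texture/color/shape descriptors used in the cited work."""
    hsv = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2HSV)
    region = hsv[mask]
    if region.size == 0:
        return np.zeros(6)
    return np.concatenate([region.mean(axis=0), region.std(axis=0)])

# With a (hypothetical) list of labelled images `images`, labels `y`:
# X = np.stack([extract_features(img, segment_lesions(img)) for img in images])
# clf = SVC(kernel="rbf").fit(X, y)
```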
One likely reason for the low classification accuracy is that the image quality was degraded during the segmentation phase, which generated holes within the diseased portions. In addition, leaf smut was misclassified, with an accuracy of only 40%, which calls for other types of features to improve the results. In the same context, the authors of [4] proposed an approach for disease recognition on plant leaves based on combining multiple SVM classifiers (sequential, parallel and hybrid) using color, texture and shape characteristics. Several preprocessing steps were performed, including normalization, noise reduction and segmentation by k-means clustering. Their database of infected plant leaves contains six classes: three types of insect pest damage and three forms of pathogen symptoms. The hybrid approach outperformed the other approaches, achieving a rate of 93.90%. The analysis of the confusion matrices of these three methods highlighted the causes of misclassification, which are essentially due to the complexity of certain diseases whose symptoms are difficult to differentiate across the different stages of development, with a high degree of confusion between the powdery mildew and thrips classes in all the combination approaches.

In another study, the authors used a maize database acquired by a drone flying at a height of 6 m [11]. They selected patches of 500 by 500 pixels from each original 4000 by 6000 image and labelled each patch according to whether its central 224 by 224 area contained a lesion. For the classification between healthy and diseased, a 500 by 500 sliding window over the image was fed to the ResNet convolutional neural network (CNN) model [8]; a minimal sketch of this sliding-window scheme is given at the end of this overview. With a test precision of 97.85%, the method remains non-generalizable, since the chosen disease has pronounced and distinctive symptoms compared to other diseases, and the acquisitions were made on a single field. For that reason, it is not clear how the model would perform when classifying different diseases with similar symptoms.

Another work used aerial images to detect disease symptoms in grape leaves [9]. The authors applied a CNN to a relevant combination of image features and color spaces: after acquiring RGB images with a UAV at 15 m height, they converted the images into different colorimetric spaces to separate the intensity information from the chrominance. The color spaces used in this study were HSV, LAB and YUV, in addition to extracted vegetation indices (Excess Green (ExG), Excess Red (ExR), Excess Green minus Excess Red (ExGR), Green-Red Vegetation Index (GRVI), Normalized Difference Index (NDI) and Red-Green Index (RGI)). For classification, they used the CNN model Net-5 with four output classes: soil, healthy, infected and susceptible to infection. The model was tested on multiple combinations of input data and three patch sizes; the best result, an accuracy of 95.86%, was obtained by combining ExG, ExR and GRVI on 64 × 64 patches.

In [10], the authors tested several existing state-of-the-art CNN architectures for plant disease classification on the public PlantVillage database [12], which consists of 55,038 images of 14 plant types divided into 39 classes of healthy and infected leaves, including a background class. The best results were obtained with the transfer-learning ResNet34 model, achieving an accuracy of 99.67%.
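To make the sliding-window scheme of the maize study [11] concrete, the sketch below crops 500 × 500 patches, center-crops each to the 224 × 224 area used for labelling, and scores it with a CNN. The ResNet18 backbone, the untrained two-class head, the stride and the scoring rule are our own illustrative assumptions, not the cited authors' implementation.

```python
# Hedged sketch of 500x500 sliding-window patch classification as described
# for the maize UAV study [11]. Backbone, head and stride are assumptions.
import torch
import torch.nn as nn
import torchvision.transforms.functional as TF
from torchvision import models

model = models.resnet18(pretrained=True)          # stand-in backbone
model.fc = nn.Linear(model.fc.in_features, 2)     # hypothetical healthy/diseased head (untrained here)
model.eval()

def classify_patches(image, patch=500, stride=500):
    """image: float tensor (3, H, W) in [0, 1]. Returns (row, col, p_diseased)
    for each window, scored on its central 224x224 area as in [11]."""
    _, H, W = image.shape
    results = []
    with torch.no_grad():
        for top in range(0, H - patch + 1, stride):
            for left in range(0, W - patch + 1, stride):
                crop = image[:, top:top + patch, left:left + patch]
                # Score only the central 224x224 area, matching the labelling rule.
                center = TF.center_crop(crop, [224, 224]).unsqueeze(0)
                p = model(center).softmax(dim=1)[0, 1].item()
                results.append((top, left, p))
    return results
```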
Several works have thus been carried out to address plant disease detection using images provided by various acquisition devices (smartphones, drones, etc.). Among them, CNNs have demonstrated higher performance on this problem than models based on classic feature-extraction methods. In the present study, we take advantage of deep learning and transfer learning to address the most important damage caused by pest attacks and cryptogamic pathogen infections in tomato crops. The rest of the paper is organized as follows. Section 2 presents a comparative study and discusses our preliminary results. The conclusion and perspectives are presented in Sect. 3.

The study was conducted on a database of images of infected leaves, developed and used in [3-5]. The images were taken with a Canon 600D digital camera in several farms in the Sous-Massa area, Morocco. Additional images were collected from the Internet in order to increase the size of the database. The dataset is composed of six classes: three of damage caused by insect pests (leafminer flies, thrips and Tuta absoluta) and three of cryptogamic pathogen symptoms (early blight, late blight and powdery mildew). The dataset was validated with the help of agricultural experts. Figure 1 depicts the types of symptoms on tomato leaves, and Table 1 presents the composition of the database and the symptoms of each class. The images were resized so as to center the leaves.

The motivation behind using deep learning for computer vision is the direct exploitation of images without any hand-crafted features. In the field of plant disease detection, many researchers have chosen the deep DenseNet and VGG models for their high performance on standard computer vision tasks. The idea behind the DenseNet architecture is to create short paths from the early layers to the later layers and to ensure maximum information flow between the layers of the network. To this end, DenseNet connects all its layers (with matching feature map sizes) directly to each other: each layer obtains additional inputs from all preceding layers and passes its own feature maps to all subsequent layers [13]. Indeed, according to [13], DenseNets require substantially fewer parameters and less computation to achieve state-of-the-art performance. Figure 2(a) gives an example of a convolutional model with five dense layers. In this study, DenseNet was used with 121 layers and 161 layers.

Very deep convolutional networks (VGG) ranked second in the ILSVRC-2014 challenge [14]. The model is widely used for image recognition tasks, especially in the crop field [15, 16]. Consequently, we used VGG with 16 weight layers (13 convolutional and 3 fully connected).

Fine-tuning is a transfer learning method that takes advantage of models trained on another computer vision task for which a large number of labelled images is available. Moreover, it reduces the need for a large dataset and for the computational power required to train a model from scratch [16]. Fine-tuned learning experiments are also much faster and more accurate than models trained from scratch [10, 15, 17, 18]. Hence, we fine-tuned the three models based on features learned on the ImageNet dataset [19]: the idea is to take the weights of VGG16, DenseNet121 and DenseNet161 pre-trained on ImageNet as the starting point of the learning process, keep the weights of every convolutional layer frozen in all iterations, and update only the weights of the linear layers. A minimal sketch of this setup follows.
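The sketch below illustrates this setup with the torchvision model zoo: the pre-trained convolutional weights are frozen and only a new 6-class linear head is trained. The learning rate and class count follow the paper's description; the momentum value and the exact head replacement are our own assumptions.

```python
# Hedged sketch of the transfer-learning setup described above: load ImageNet
# pre-trained models, freeze all weights and re-train only a new linear head.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 6  # leafminer fly, thrips, Tuta absoluta, early/late blight, powdery mildew

def build_finetune_model(name="densenet161"):
    if name == "densenet161":
        model = models.densenet161(pretrained=True)
        model.classifier = nn.Linear(model.classifier.in_features, NUM_CLASSES)
        head = model.classifier
    elif name == "densenet121":
        model = models.densenet121(pretrained=True)
        model.classifier = nn.Linear(model.classifier.in_features, NUM_CLASSES)
        head = model.classifier
    else:  # vgg16: replace the last linear layer of the classifier block
        model = models.vgg16(pretrained=True)
        model.classifier[6] = nn.Linear(model.classifier[6].in_features, NUM_CLASSES)
        head = model.classifier[6]
    # Freeze everything, then unfreeze only the new linear head.
    for p in model.parameters():
        p.requires_grad = False
    for p in head.parameters():
        p.requires_grad = True
    return model

model = build_finetune_model("densenet161")
# SGD on the trainable (linear) parameters only; lr = 1e-3 as in the paper,
# momentum = 0.9 is an assumption.
optimizer = torch.optim.SGD(
    [p for p in model.parameters() if p.requires_grad], lr=1e-3, momentum=0.9
)
```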
Experimental Setup

Experiments were run on a Google Compute Engine instance, Google Colaboratory (Colab) [20], as well as on a local machine, a Lenovo Y560 with 16 GB of RAM. Colab notebooks are based on Jupyter and behave like Google Docs documents; they come pre-configured with the essential machine learning and artificial intelligence libraries, such as TensorFlow, Matplotlib and Keras. Colab runs 64-bit Ubuntu 17.10 on an Intel Xeon processor with 13 GB of RAM and is equipped with an NVIDIA Tesla K80 GPU (GK210 chipset) with 12 GB of RAM and 2,496 CUDA cores. We implemented and executed the experiments in Python using the PyTorch library [21], which performs automatic differentiation over dynamic computation graphs, together with the PyTorch model zoo, which contains various models pre-trained on the ImageNet dataset [19]. The models were trained with the stochastic gradient descent (SGD) optimizer, with a learning rate of 1e-3, for a total of 20 epochs. The dataset was divided into 80% for training and 20% for evaluation.

The evolution of the loss during the training phase is illustrated in Fig. 3(a). The graph shows that the training loss converged for all models: the largest reduction occurred within the first 5 epochs, and after 20 epochs all models were optimized with low losses, reaching 0.12 for DenseNet161, 0.14 for DenseNet121 and 0.15 for VGG16. After the 14th epoch, the training loss converged, as did the training accuracy. Even after testing with a higher learning rate and a larger number of epochs, the best training scores were still achieved by the DenseNet models, which means that they performed better with fewer parameters. DenseNet161 in turn performed better than DenseNet121 thanks to its deeper architecture.

Table 2 shows that the DenseNets also outperformed the VGG model at test time, even though the losses of all models reached about 0.14 during training. Note that the DenseNet with more layers obtained the better test score: in the test phase, DenseNet161 outperformed DenseNet121 and VGG16, with accuracies of 95.65%, 94.93% and 90.58% respectively. Table 2 further shows that DenseNet161 outperformed the other models in classifying leafminer fly, thrips and powdery mildew, with accuracies of 100%, 95.65% and 100% respectively, while DenseNet121 had the best classification rates for early blight, late blight and Tuta absoluta, with accuracies of 100%, 95.65% and 95.65% respectively.

To compare the two models with the best accuracies, we computed the confusion matrix on the testing dataset for each of them; a minimal sketch of this evaluation step is given below. Figure 4 shows the confusion matrices of the DenseNet classification models with 161 and 121 layers. More images were misclassified by DenseNet121 than by DenseNet161. The classes most confused by DenseNet121 are leafminer fly and thrips, with two thrips images classified as leafminer fly and one leafminer image classified as thrips. For DenseNet161, the confusion occurs mostly between early blight and late blight, with one early blight image classified as late blight and one late blight image classified as early blight.
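The following minimal sketch shows how such a confusion matrix can be computed with scikit-learn; `model` and `test_loader` are assumed to come from the fine-tuning setup above and a standard PyTorch DataLoader over the 20% test split.

```python
# Hedged sketch of the confusion-matrix evaluation used to compare the models.
import torch
from sklearn.metrics import confusion_matrix

def evaluate_confusion(model, test_loader, device="cpu"):
    """Run the model on the test set and return the 6x6 confusion matrix."""
    model.to(device).eval()
    y_true, y_pred = [], []
    with torch.no_grad():
        for images, labels in test_loader:
            logits = model(images.to(device))
            y_pred.extend(logits.argmax(dim=1).cpu().tolist())
            y_true.extend(labels.tolist())
    return confusion_matrix(y_true, y_pred)

# Example: cm = evaluate_confusion(model, test_loader)
# Off-diagonal entries reveal confusions such as thrips vs. leafminer fly.
```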
Thrips and early blight are the most misclassified classes for both models, which is due to the similarity between their symptoms, making these classes difficult to differentiate.

Table 3 summarizes the studies cited in Sect. 1 alongside our model results. Each approach uses a different dataset; nevertheless, according to the reported accuracies, the approaches based on deep learning models outperformed the approaches based on feature engineering. The results of our model are promising: starting from a dataset of 666 images (see Sect. 2.1), we achieved an accuracy of 95.65% with the DenseNet161 model.

In this paper, we studied three deep learning models to address the problem of plant disease detection. The best test accuracy was achieved by DenseNet161 with 20 training epochs, outperforming the other tested architectures. From the study that has been conducted, it is possible to conclude that DenseNet is a suitable architecture for the task of plant disease detection in crop images. Moreover, we observed that DenseNets require fewer parameters to achieve better performance. These preliminary results are promising; in future work we will try to improve the results, increase the dataset size and address more challenging disease detection problems.

References
[1] MAPM du développement rural et des eaux et forêts: L'agriculture en chiffres, Plan Maroc Vert
[2] Agriculture de précision
[3] Automatic recognition of plant leaves diseases based on serial combination of two SVM classifiers
[4] A hybrid combination of multiple SVM classifiers for automatic recognition of the damages and symptoms on plant leaves
[5] Automatic recognition of the damages and symptoms on plant leaves using parallel combination of two classifiers
[6] Detection and classification of rice plant diseases
[7] Automatic recognition of vegetable crops diseases based on neural network classifier
[8] Autonomous detection of plant disease symptoms directly from aerial imagery. Plant Phenom.
[9] Deep learning approach with colorimetric spaces and vegetation indices for vine diseases detection in UAV images
[10] Deep learning for plant diseases: detection and saliency map visualisation
[11] Image set for deep learning: field images of maize annotated with disease symptoms
[12] An open access repository of images on plant health to enable the development of mobile disease diagnostics
[13] Densely connected convolutional networks
[14] Very deep convolutional networks for large-scale image recognition
[15] A comparative study of fine-tuning deep learning models for plant disease identification
[16] VGG16 for plant image classification with transfer learning and data augmentation
[17] Deep learning with unsupervised data labeling for weed detection in line crops in UAV images
[18] Using deep learning for image-based plant disease detection
[19] ImageNet: a large-scale hierarchical image database
[20] Performance analysis of Google Colaboratory as a tool for accelerating deep learning applications
[21] Deep Learning with Python

Acknowledgment. This study was supported by Campus France and the Cultural Action and Cooperation Department of the French Embassy in Morocco, under the 2019 call for proposals "Appel à Projet Recherche et Universitaire".