key: cord-0013488-nzftj3wf
authors: Saleem, Muhammad Hammad; Potgieter, Johan; Arif, Khalid Mahmood
title: Plant Disease Classification: A Comparative Evaluation of Convolutional Neural Networks and Deep Learning Optimizers
date: 2020-10-06
journal: Plants (Basel)
DOI: 10.3390/plants9101319
sha: badaba6047681788d166980cdcae0212fccc0087
doc_id: 13488
cord_uid: nzftj3wf

Recently, plant disease classification has been performed by various state-of-the-art deep learning (DL) architectures on publicly available or author-generated datasets. This research presents a DL-based comparative evaluation for the classification of plant disease in two steps. First, the best convolutional neural network (CNN) was identified through a comparative analysis of well-known CNN architectures along with modified and cascaded/hybrid versions of some DL models proposed in recent research. Second, an attempt was made to improve the performance of the best model by training it with various deep learning optimizers. The CNNs were compared using performance metrics such as validation accuracy/loss, F1-score, and the required number of epochs. All the selected DL architectures were trained on the PlantVillage dataset, which contains 26 different diseases across 14 plant species. Keras with the TensorFlow backend was used to train the architectures. The Xception architecture trained with the Adam optimizer attained the highest validation accuracy and F1-score, 99.81% and 0.9978 respectively, which is better than previous approaches and demonstrates the novelty of the work. Therefore, the method proposed in this research can be applied to other agricultural applications for transparent detection and classification purposes.

To meet the demand for food, agricultural problems should be addressed with advanced techniques. In this regard, the agricultural industries are focusing on artificial intelligence methods. Several traditional machine learning (ML) algorithms have been used to perform various agricultural operations. In addition, deep learning (DL) has produced significant developments in agricultural research, owing to the automatic feature extraction capability of DL algorithms. Among several agricultural problems, the successful classification of plant diseases is vital to improve the quality/quantity of agricultural products and to reduce the undesirable application of chemical sprays such as fungicides/herbicides. It is therefore an emerging research topic in agricultural automation. The task is complicated by the similar appearance of different plant diseases. Several studies have been conducted to improve the classification of plant disease. In this regard, this article presents a comprehensive comparative analysis to perform plant disease classification in two steps. In the first step, the performance of 18 convolutional neural networks was evaluated: 10 well-known DL architectures that were previously used for several image recognition tasks, six recently published modified versions derived from well-known DL models, and two cascaded/hybrid versions developed from two efficient DL algorithms. The second step aimed to improve the performance of the best model by training it with various deep learning optimizers, including RMSProp, Adam, Adadelta, Adamax, and Adagrad. For a comprehensive evaluation, validation accuracy/loss, F1-score, and the number of epochs (required for the training and validation plots to converge) were compared.
The PlantVillage dataset, which contains diseases in 14 different plant species, was selected for this research. The successful classification results obtained across a large variety of classes indicate that the method presented in this article can also be applied to other datasets related to plant disease. Furthermore, the results obtained by this research will be useful for future studies on the real-time classification and detection of plant disease in a single framework. Moreover, the proposed methodology could also be adapted to other agricultural applications.

The rest of the paper is organized as follows: Section 2 presents the details of the dataset, hardware/software specifications, DL architectures, DL optimizers, and specifications required to train the DL models. Section 3 presents the results to indicate the performance of all the well-known, modified, and cascaded/hybrid versions of DL models along with the improvement in the performance of best-obtained models by using various deep learning optimizers, and finally, Section 4 describes the concluding remarks along with some future recommendations.

Convolutional Neural Networks (CNNs) are mostly used for image classification tasks. Therefore, in this research, the performance of many state-of-the-art CNN architectures was evaluated for the classification of plant diseases. Modified and cascaded versions of DL architectures, recently published in prominent research articles on plant disease classification, were also considered. Figure 1 shows all 18 DL architectures considered in this research. These models were divided into three categories: well-known, modified/improved, and cascaded/hybrid versions. The overall methodology of this research is presented in Figure 2. First, the Stochastic Gradient Descent (SGD) with momentum optimizer was selected to train the CNN models because of its fast convergence [24]. Then, the 18 CNN architectures were trained on the PlantVillage dataset, and their convergence to the final training/validation values was observed in order to update the hyperparameters. Next, the CNN models were compared in terms of training and validation accuracy/loss, and F1-score. The DL optimization algorithms were then applied to further improve the performance of the CNN architectures that achieved the highest F1-score in their category. The novelty of the work lies in identifying the most suitable combination of CNN model and DL optimizer, which provided considerably better results than previous research.

All the DL models were trained on a publicly available dataset called PlantVillage [29], which contains a total of 54,306 images in 38 classes of healthy/diseased leaves from 14 plant species (some of the plant diseases are shown in Figure 3). The images were resized to 224 × 224 × 3 and normalized by dividing the pixel values by 255 to make them suitable for the initial values of the models. To avoid overfitting, the dataset was divided into training, validation, and testing sets in proportions of 70%, 20%, and 10%, respectively [22].
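The normalization and split described above can be sketched as follows. This is a minimal illustration in plain Python rather than the code used in the study: the function names, the fixed random seed, and the shuffled index list are assumptions, and only the division by 255, the total of 54,306 images, and the 70/20/10 proportions come from the text. Resizing to 224 × 224 × 3 is omitted because it would normally be delegated to an imaging library.

```python
import random

def normalize_pixels(pixels):
    """Scale 8-bit pixel values (0-255) to the range [0, 1]."""
    return [value / 255 for value in pixels]

def split_indices(n_samples, train=0.7, val=0.2, seed=0):
    """Shuffle sample indices and split them into train/val/test parts.

    The test part receives whatever remains after the training and
    validation parts are taken (10% with the default fractions).
    The seed is a hypothetical choice made for reproducibility.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = int(n_samples * train)
    n_val = int(n_samples * val)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])

# 54,306 is the PlantVillage image count reported in the text.
train_idx, val_idx, test_idx = split_indices(54306)
print(len(train_idx), len(val_idx), len(test_idx))
```

Because the three index lists are disjoint, no image used for validation or testing is seen during training, which is what allows the validation curves to reveal overfitting.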

The DL architectures were programmed in Python because of the availability of useful libraries and DL frameworks. Keras with the TensorFlow backend was used to build the architectures. The cuDNN library was installed as it speeds up training and works with TensorFlow. All the experiments were carried out on a graphics processing unit (NVIDIA Quadro K2200) with the following specifications: 4 GB memory, 640 CUDA cores, 1045 MHz core clock, and 80 GB/s memory bandwidth.

After the development of the AlexNet architecture, a revolutionary period of state-of-the-art CNN architectures began for many image classification tasks. Therefore, in this article, we considered very popular and successful CNN models such as AlexNet [30], OverFeat [31], VGG-16 [32], ZFNet [33], ResNet-50 [34], Inception ResNet-v2 [35], Inception-v4 [35], MobileNet [36], DenseNet-121 [37], and Xception [38].

Some researchers have proposed improved/modified versions of state-of-the-art DL architectures to achieve better results when classifying the diseases of plant species. Among them, we considered improved GoogLeNet [20], inspired by the well-known GoogLeNet model [39]; Cifar-10 [20]; LeafNet [23]; a multilayer convolutional neural network (MLCNN) [17], derived from the AlexNet model [30]; and modified and reduced MobileNet [22], inspired by the MobileNet model [36]. Some cascaded/hybrid versions of DL architectures were also considered in this article: a cascaded form of the well-known AlexNet and GoogLeNet models as described in [18], and a hybrid DL architecture of the AlexNet and VGG models (AgroAVNET) as proposed in [40].

Stochastic Gradient Descent (SGD) was used to train all the DL models during the first step of the proposed method. After the best DL architecture was identified, an improvement in the classification of plant disease was also attempted. In this regard, we used five state-of-the-art deep learning optimizers to train the DL models that attained the highest validation accuracy and F1-score in the first step of the analysis. A few characteristics of these optimizers are given below:

• SGD: This is one of the simplest deep learning optimizers. It uses a single static learning rate for all parameters throughout training, and it has a fast convergence ability [41].

• Adagrad: This optimizer uses a different learning rate for every parameter in the model. It updates the learning rate according to how frequently each parameter is updated [42].

• RMSProp: RMSProp was proposed to reduce the training time observed with Adagrad; it scales the learning rate by an exponentially decaying average of squared gradients [43].

• Adadelta: This is an extension of the Adagrad optimizer that accumulates previous gradients over a fixed time window, which ensures that learning continues even after many iterations. Adadelta uses a Hessian approximation to keep the update direction along the negative gradient and eliminates the learning rate from the update rule [44].

• Adam: The Adaptive moment estimation method (Adam) computes adaptive learning rates for the parameters from the first and second moments of the gradients [45]. It combines the advantages of two extensions of the SGD method, Adagrad and RMSProp. Like RMSProp, it keeps a decaying average of the second moment of the gradient, and it additionally uses an average of previous gradients to speed up learning [45].

• Adamax: A variant of Adam, also proposed in [45], that is based on the infinity norm and could be useful for sparse parameter updates such as word embeddings.
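To make the differences between these update rules concrete, the sketch below applies four of them (SGD with momentum, Adagrad, RMSProp, and Adam) to a one-parameter toy loss, f(w) = (w − 3)². It is only an illustration of the textbook update rules: the toy loss, the function names, and all hyperparameter values are assumptions chosen for this example and are not the settings of Table 1. Adadelta and Adamax are omitted for brevity.

```python
import math

def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

def sgd_momentum(w, steps, lr=0.1, momentum=0.9):
    velocity = 0.0  # accumulated update direction
    for _ in range(steps):
        velocity = momentum * velocity - lr * grad(w)
        w += velocity
    return w

def adagrad(w, steps, lr=0.5, eps=1e-8):
    accum = 0.0  # running sum of all squared gradients
    for _ in range(steps):
        g = grad(w)
        accum += g * g
        w -= lr * g / (math.sqrt(accum) + eps)
    return w

def rmsprop(w, steps, lr=0.01, rho=0.9, eps=1e-8):
    avg_sq = 0.0  # exponentially decaying average of squared gradients
    for _ in range(steps):
        g = grad(w)
        avg_sq = rho * avg_sq + (1 - rho) * g * g
        w -= lr * g / (math.sqrt(avg_sq) + eps)
    return w

def adam(w, steps, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    m = v = 0.0  # first and second moment estimates
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias correction
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

for name, optimizer in [("SGD+momentum", sgd_momentum), ("Adagrad", adagrad),
                        ("RMSProp", rmsprop), ("Adam", adam)]:
    print(name, round(optimizer(0.0, 1000), 4))
```

All four functions start from w = 0 and move towards the minimum at w = 3. With a constant learning rate, RMSProp takes steps of roughly fixed size and therefore settles close to the minimum rather than exactly on it.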

All the DL models were trained from scratch on the PlantVillage dataset. The hyperparameters were tuned by the random search method [46]. The internal covariate shift problem occurs in a neural network when the distribution of a layer's inputs varies because the parameters of the previous layers change during training. This problem was addressed by Batch Normalization, which is a very useful technique when training with a high learning rate [47]. For training all the DL models, the ReLU activation function was used as it is computationally efficient [24, 30] and reduces the possibility of vanishing gradients. The specifications of all the DL optimizers are summarized in Table 1.
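As a rough illustration of the two building blocks mentioned above, the following sketch normalizes one feature over a mini-batch and then applies ReLU. It shows only the training-time forward computation of batch normalization for a single feature. The sample activations, the default `gamma`, `beta`, and `eps` values, and the function names are assumptions for this example and are not taken from the models in this study.

```python
import math

def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature over a mini-batch, then scale and shift.

    gamma and beta stand in for the learnable scale and shift
    parameters; they are fixed here for simplicity.
    """
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]

def relu(values):
    """Rectified linear unit: negative activations become zero."""
    return [max(0.0, v) for v in values]

activations = [2.0, 4.0, 6.0, 8.0]  # made-up mini-batch of one feature
normalized = batch_norm(activations)
print(normalized)
print(relu(normalized))
```

After normalization the batch has approximately zero mean and unit variance regardless of how the preceding layer's parameters have shifted, which is the property that counteracts internal covariate shift.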

This section first presents the comparative analysis of DL architectures used to select the best model, and then the results obtained when the performance of the best-suited models was improved with various DL optimization algorithms. All the results were evaluated in terms of training/validation accuracy, training/validation loss, and F1-score. The F1-score is considered an important performance metric, especially when the classes are unevenly distributed, as in the PlantVillage dataset (for example, the Potato healthy class contains the fewest images (152), whereas the Citrus greening class has the most (5507) [29]). Therefore, the model/optimizer that attained the highest F1-score was considered the most suitable architecture for the classification of plant disease. The performances of all DL architectures are represented by line graphs (Figures 4-6), and it was empirically observed that they required 60 epochs (an epoch is a complete cycle of training on each image sample in the training dataset) for the training/validation accuracy and loss to converge. The overall performance of the DL architectures is also summarized in Table 2.
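The reason the F1-score is more informative than accuracy on imbalanced classes can be seen in a small sketch. The text does not state how the per-class scores were averaged, so the unweighted (macro) average used here is an assumption, as are the function names and the toy labels, which are made up and are not PlantVillage data.

```python
def f1_per_class(y_true, y_pred):
    """Return {label: F1} computed from true and predicted labels."""
    pairs = list(zip(y_true, y_pred))
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores

def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    scores = f1_per_class(y_true, y_pred)
    return sum(scores.values()) / len(scores)

# Toy imbalanced case: nine samples of class "a", one of class "b",
# and a classifier that always predicts the majority class "a".
y_true = ["a"] * 9 + ["b"]
y_pred = ["a"] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy, macro_f1(y_true, y_pred))
```

In this toy case the accuracy looks high even though the minority class is never recognized, whereas the macro F1-score is pulled down sharply by the zero score of that class.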

The performance of the well-known CNN architectures is presented in Figure 4, which shows no sign of underfitting (the model fails to learn the training data adequately, which appears as a training loss that does not change or is still decreasing when training ends) or overfitting (the model does not perform appropriately on new data such as the validation dataset, which typically appears as a validation loss that decreases to some extent and then suddenly increases for the remaining epochs). Overall, 10 well-known CNN architectures were considered. A few important observations were made from Figure 4 and Table 2:

• The Xception model attained the highest validation accuracy and F1-score and the lowest validation loss among all the well-known CNN models. Therefore, this model can be considered the best CNN architecture to classify plant disease on the PlantVillage dataset. This implies that the modified version of depthwise separable convolution [38] used in the Xception model is a useful way to obtain higher classification results. Moreover, this DL model converged to its final value at the 34th epoch, which is the lowest number of epochs compared to all the other DL architectures. On the other hand, it required a significant amount of time to complete one epoch (around 3400 s). Therefore, future studies should propose another version of this DL architecture that can achieve Xception-level accuracy with a smaller training time per epoch.

• The second highest F1-score/validation accuracy was attained by the ZFNet architecture. Hence, the smaller filter size and the increased number of activation maps used in ZFNet (as compared to AlexNet) improved its performance.

• The MobileNet, DenseNet, and AlexNet architectures also achieved a good F1-score, followed by the Inception-v4, ResNet-50, and Inception ResNet-v2 architectures. MobileNet is a comparatively preferable model because of its lower number of parameters, which reduced its computation time significantly. The depthwise and pointwise convolutional layers helped it achieve a better classification result. Therefore, a CNN model based on the MobileNet architecture could be proposed in future research. Moreover, this model required a lower number of epochs to reach its final accuracy and loss as compared to the DenseNet and AlexNet models (as shown in Table 2).

• From Table 2, it is also noticed that DL models such as Inception-v4, Inception ResNet-v2, OverFeat, and VGG-16 required 58-59 epochs for their training/validation plots to converge (also shown in Figure 4), which significantly increased their training time.

• VGG-16 and OverFeat were found to be unsuitable models for plant disease classification, as they achieved lower validation accuracy/F1-score and higher validation loss than the other well-known DL architectures. The smaller filter size of the VGG model degraded its performance. The larger filter size of the OverFeat model significantly reduced its training time, but this was not enough to provide a noticeable classification performance. Additionally, both models had a higher number of parameters (in millions), which slowed their training considerably.
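The parameter savings of the depthwise separable convolutions used in Xception and MobileNet, noted in the observations above, can be checked with simple arithmetic. The sketch below counts the weights of a single layer. The layer sizes (3 × 3 kernel, 128 input channels, 256 output channels) are illustrative assumptions and are not taken from any architecture in this study, and bias terms are ignored.

```python
def conv_params(kernel, c_in, c_out):
    """Weights in a standard kernel x kernel convolution (bias omitted)."""
    return kernel * kernel * c_in * c_out

def separable_conv_params(kernel, c_in, c_out):
    """Weights in a depthwise convolution (one kernel x kernel filter per
    input channel) followed by a pointwise 1 x 1 convolution (bias omitted)."""
    depthwise = kernel * kernel * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise

# Hypothetical layer sizes chosen only for illustration.
standard = conv_params(3, 128, 256)
separable = separable_conv_params(3, 128, 256)
print(standard, separable, round(standard / separable, 2))
```

For a 3 × 3 kernel the reduction approaches a factor of nine as the number of output channels grows, which is consistent with the lower parameter counts and shorter computation times reported for the MobileNet-style models.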

In this article, six modified/improved versions of CNN architectures were also considered. Their performance is presented in Figure 5 from which the following points are discussed:

• The improved GoogLeNet architecture achieved the best performance in terms of validation accuracy/loss and F1-score among all the modified versions of CNN architectures by using the Inception module from the original GoogLeNet model. Moreover, it reached its final accuracy and loss values in 53 epochs, which is the lowest among the modified/improved versions of the DL models considered in this article, but it required more training time per epoch than models such as Modified and Reduced MobileNet.

• The MLCNN architecture provided a good F1-score owing to the inclusion of a dropout layer after each max-pooling layer and a reduction in the number of filters of the first convolution layers of the original AlexNet architecture. However, because of its higher number of parameters, this modified DL architecture required considerably more training time per epoch.

• The two versions of MobileNet, named Modified MobileNet and Reduced MobileNet, achieved acceptable F1-scores that were close to each other. These modified versions used depthwise separable convolutional layers, which helped to attain a good classification result, and they had six times fewer parameters than the original MobileNet model, which reduced their training time per epoch.

• Moreover, models such as Improved Cifar-10 and LeafNet had a lower number of parameters, which increased their training speed per epoch. The Improved Cifar-10 model achieved a noticeable F1-score, but the reduced parameters of the LeafNet model were not enough to obtain a good F1-score/validation accuracy. Therefore, LeafNet is not a suitable model to classify diseases in the selected dataset. It is also observed that these two models required a higher number of epochs than the other modified versions of DL architectures. Hence, future research could propose a DL model similar to Improved Cifar-10 and LeafNet to reduce the training time, with some convolutional layers added to attain acceptable validation/testing accuracy.

Figure 6 presents the performance of the cascaded/hybrid versions of CNN models, as explained below:

• The cascaded AlexNet with GoogLeNet architecture outperformed all the DL models in terms of validation accuracy; moreover, except for the Xception architecture, this model achieved the highest F1-score among all the DL architectures considered in this research (as shown in Table 2). Although it required almost 57 epochs to reach its final accuracy/loss values (as shown in Figure 6), it completed one epoch in a shorter period, which clearly shows its effectiveness in terms of training time. A few important modifications to the original AlexNet model helped to extract the features of diseased plants, including smaller convolution kernels in different layers, the inclusion of a max-pooling layer, cascading the Inception module with the modified AlexNet layers, and convolutional layers after Inception to replace two fully connected layers [18].

• Moreover, a hybrid version of the AlexNet and VGG architectures was also studied. It provided good performance in terms of validation accuracy (as shown in Figure 6) and F1-score, but it had the highest number of parameters, which significantly increased its training time per epoch. This model performed well because it uses normalization and the selection of filter depth from the AlexNet and VGG models, respectively [40].

In this article, an improvement in the performance of CNN architectures was also attempted by training the best models (obtained from the previous step) with different deep learning optimization functions. In this regard, the best DL model was selected from each of the three categories, namely Xception, Improved GoogLeNet, and the cascaded version of AlexNet with GoogLeNet. Table 3 summarizes the results obtained with the various optimization algorithms. Some important observations can be made as follows:

• Considerable changes were observed in training/validation accuracy, loss, precision, recall, and F1-score when the DL models were trained with various deep learning optimizers.

• Adam and Adadelta were the most successful optimizers for all three selected DL architectures.

• The Xception model trained with the Adam optimizer achieved the highest validation accuracy and F1-score of 99.81% and 0.9978, respectively, which clearly shows the effectiveness of the proposed approach. Moreover, these results are better than those of previous studies that used the same dataset but different approaches [12, 16, 19, 24]. Therefore, the methodology proposed in this article could be used for various other agricultural operations.

• The cascaded AlexNet with GoogLeNet and improved GoogLeNet models achieved their best classification results with the Adadelta and Adam optimizers, respectively.

• However, a degradation in performance was also observed when the optimizing function was changed from SGD to Adagrad and RMSProp for the Xception and cascaded models, respectively.

• It is also noticed that the Improved GoogLeNet showed its lowest validation accuracy/F1-score when it was trained with the SGD optimizer.

In this article, a comprehensive comparative analysis has been performed between various state-of-the-art deep learning architectures divided into three categories, namely well-known, modified, and cascaded versions. Moreover, the performance of the best models was further improved by using various deep learning optimization algorithms. It was found that the Xception, Improved GoogLeNet, and cascaded AlexNet with GoogLeNet models obtained the highest validation accuracy and F1-score in their respective categories. When these three DL models were trained with various deep learning optimizers, the Xception model trained with the Adam optimizer achieved the highest F1-score of 0.9978, which suggests that this combination of CNN model and optimization algorithm is the most suitable way to classify plant disease. This research points to some interesting directions for future work:

• Various deep learning optimizers, such as Adam and Adadelta, can also be used to enhance research on other agricultural applications, such as crop/weed discrimination, classification of weeds, plant recognition, etc.

• The classification performance on other datasets related to plant disease could also be improved by adopting the methodology proposed in this research.

• Furthermore, although the Xception model provided the best results according to the analysis in this article, it required a significant amount of time to complete each epoch. Therefore, an attempt should be made to achieve Xception-level accuracy with a smaller training time.

Machine learning in agriculture: A review

Robust fitting of fluorescence spectra for pre-symptomatic wheat leaf rust detection with support vector machines

Detection of peanut leaf spots disease using canopy hyperspectral reflectance

Assessment of Dothistroma needle blight of Pinus radiata using airborne hyperspectral imagery

Improvement of lesion phenotyping in Cercospora beticola-sugar beet interaction by hyperspectral imaging

Plant Disease Detection and Classification by Deep Learning

Hyperspectral imaging for classification of healthy and gray mold diseased tomato leaves with different infection severities

Detection of rice panicle blast with multispectral radiometer and the potential of using airborne multispectral scanners

Deep learning for tomato diseases: Classification and symptoms visualization

Deep learning models for plant disease detection and diagnosis

A robust deep-learning-based detector for real-time tomato plant diseases and pests recognition

Using deep learning for image-based plant disease detection

Plant disease and pest detection using deep learning-based features

Can Deep Learning Identify Tomato Leaf Disease?

Using Deep Learning for Image-Based Potato Tuber Disease Detection

A comparative study of fine-tuning deep learning models for plant disease identification

Multilayer Convolution Neural Network for the Classification of Mango Leaves Infected by Anthracnose Disease

Identification of plant leaf diseases using a nine-layer deep convolutional neural network

Identification of maize leaf diseases using improved deep convolutional neural networks

Visual Tea Leaf Disease Recognition Using a Convolutional Neural Network Model. Symmetry

Depthwise separable convolution architectures for plant disease classification

Real-Time Detection of Apple Leaf Diseases Using Deep Learning Approach Based on Improved Convolutional Neural Networks

Deep learning for plant diseases: Detection and saliency map visualisation

Deep interpretable architecture for plant diseases classification

Automated identification of northern leaf blight-infected maize plants from field imagery using deep learning

Random hyper-parameter search-based deep neural network for power consumption forecasting

Coronavirus optimization algorithm: A bioinspired metaheuristic based on the COVID-19 propagation model

An open access repository of images on plant health to enable the development of mobile disease diagnostics

Imagenet classification with deep convolutional neural networks

Integrated recognition, localization and detection using convolutional networks. arXiv 2013

Very deep convolutional networks for large-scale image recognition

Visualizing and understanding convolutional networks

Deep residual learning for image recognition

Inception-v4, inception-resnet and the impact of residual connections on learning

Efficient convolutional neural networks for mobile vision applications. arXiv 2017

Densely connected convolutional networks

Xception: Deep learning with depthwise separable convolutions

Going deeper with convolutions

AgroAVNET for crops and weeds classification: A step forward in automatic farming

An overview of gradient descent optimization algorithms

Adaptive subgradient methods for online learning and stochastic optimization

Neural networks for machine learning

An adaptive learning rate method. arXiv 2012

A method for stochastic optimization. arXiv

Random search for hyper-parameter optimization

Batch normalization: Accelerating deep network training by reducing internal covariate shift

The authors declare no conflict of interest.