key: cord-0790415-2byrffcv
authors: Bahadur Chandra, Tej; Verma, Kesari; Kumar Singh, Bikesh; Jain, Deepak; Singh Netam, Satyabhuwan
title: Coronavirus Disease (COVID-19) Detection in Chest X-Ray Images using Majority Voting Based Classifier Ensemble
date: 2020-08-26
journal: Expert Syst Appl
DOI: 10.1016/j.eswa.2020.113909
sha: 6941d1c3e6200cebeb601598cfd774f0d09d76a4
doc_id: 790415
cord_uid: 2byrffcv

Novel coronavirus disease (nCOVID-19) is one of the most challenging problems facing the world. The disease is caused by severe acute respiratory syndrome coronavirus-2 (SARS-COV-2), leading to high morbidity and mortality worldwide. Studies reveal that infected patients exhibit distinct radiographic visual characteristics along with fever, dry cough, fatigue, dyspnea, etc. Chest X-Ray (CXR) is one of the important, non-invasive clinical adjuncts that play an essential role in the detection of such visual responses associated with SARS-COV-2 infection. However, the limited availability of expert radiologists to interpret the CXR images and the subtle appearance of disease radiographic responses remain the biggest bottlenecks in manual diagnosis. In this study, we present an automatic COVID screening (ACoS) system that uses radiomic texture descriptors extracted from CXR images to identify the normal, suspected, and nCOVID-19 infected patients. The proposed system uses a two-phase classification approach (normal vs. abnormal and nCOVID-19 vs. pneumonia) using a majority vote based classifier ensemble of five benchmark supervised classification algorithms. The training-testing and validation of the ACoS system are performed using 2088 (696 normal, 696 pneumonia and 696 nCOVID-19) and 258 (86 images of each category) CXR images, respectively. The obtained validation results for phase-I (accuracy (ACC) = 98.062%, area under curve (AUC) = 0.956) and phase-II (ACC = 91.329% and AUC = 0.831) show the promising performance of the proposed system. Further, the Friedman post-hoc multiple comparisons and z-test statistics reveal that the results of the ACoS system are statistically significant. Finally, the obtained performance is compared with the existing state-of-the-art methods.

The recent outbreak of the novel coronavirus disease has infected millions of people and caused numerous deaths across the world ("Coronavirus Disease 2019," 2020; "Johns Hopkins University, Corona Resource Center," 2020). The World Health Organization (WHO) has declared this epidemic a global health emergency. nCOVID-19 is caused by a highly contagious virus named severe acute respiratory syndrome coronavirus-2 (SARS-COV-2), in which transmission of infection can occur even from asymptomatic patients during the incubation period. According to expert opinion, the virus mainly infects the human respiratory tract, leading to severe bronchopneumonia with symptoms of fever, dyspnea, dry cough, fatigue, respiratory failure, etc. (N. Cheng et al., 2020). There is no specific vaccine or medication available to cure the disease and prevent further spread. Also, the standard confirmatory clinical test, the reverse transcription-polymerase chain reaction (RT-PCR) test for detecting nCOVID-19, is manual, complex, and time-consuming (Chowdhury et al., 2020). The limited availability of test kits and domain experts in hospitals and the rapid increase in the number of infected patients necessitate an automatic screening system, which can act as a second opinion for expert physicians to quickly identify the infected patients who require immediate isolation and further clinical confirmation.

Chest X-Ray (CXR) is one of the important, non-invasive clinical adjuncts that play an essential role in the preliminary investigation of different pulmonary abnormalities (Chandra & Verma, 2020a, 2020b; Ke et al., 2019). It can act as an alternative screening modality for the detection of nCOVID-19 or to validate the related diagnosis, where the CXR images are interpreted by expert radiologists to look for infectious lesions associated with nCOVID-19. Earlier studies reveal that the infected patients exhibit distinct visual characteristics in CXR images, as shown in Figure 1 (Cheng et al., 2020; Chowdhury et al., 2020; Chung et al., 2020; Zhang et al., 2020). These characteristics typically include multi-focal, bilateral ground-glass opacities and patchy reticular (or reticulonodular) opacities in non-ICU patients, and dense pulmonary consolidations in ICU patients (Hosseiny, Kooraki, Gholamrezanezhad, Reddy, & Myers, 2020). However, the manual interpretation of these subtle visual characteristics on CXR images is challenging and requires domain expertise (Kanne, Little, Chung, Elicker, & Ketai, 2020; L. Wang & Wong, 2020). Moreover, the exponential increase in the number of infected patients makes it difficult for radiologists to complete the diagnosis in time, leading to high morbidity and mortality (Asnaoui, Chawki, & Idri, 2020).

To fight against the nCOVID-19 epidemic, recent machine learning (ML) techniques can be embedded to develop an automatic computer-aided diagnosis (CAD) system. In this direction, many clinical and radiological studies have been reported, describing various radio-imaging findings and the epidemiology of nCOVID-19 (N. Huang et al., 2020; Kooraki et al., 2020; Yoon et al., 2020). Further, many deep-learning models like deep convolutional networks, recursive networks, transfer learning models, etc. have been implemented to automatically analyze the radiological disease characteristics (Chouhan et al., 2020; Jaiswal et al., 2019). Xue et al. (2018) used a convolutional neural network (CNN) to assign a class label to different superpixels extracted from the lung parenchyma and localize tuberculosis-infected regions in CXR images with an average dice index of 0.67. Another work by Pesce et al. (2019) used two novel models. The first model is based on a backpropagation neural network that uses weakly labeled CXR images and generates visual attention feedback for accurate localization of pulmonary lesions. The second model used a reinforcement learning-based recurrent attention model, which learns the sequence of images to find the nodules. Recently, Purkayastha et al. (2020) introduced CheXNet, a deep learning (DL) based model integrated with the LibreHealth Radiology Information System, which analyzes uploaded CXR images and assigns one of 14 diagnostic labels.

Motivated by the promising performance of DL models reported in the literature and the urgent need for an alternate screening tool for early detection of nCOVID-19 infected patients, the research community has applied different DL techniques on chest radiograph images (Abbas, Abdelsamea, & Gaber, 2020; Xu et al., 2020). The detailed description of different state-of-the-art methods, including the imaging modality, dataset size, algorithms, and obtained performance, is recapitulated in Table 1. Initially, the authors used a mixture of CXR images collected from different hospitals, publications, and older repositories (Abbas et al., 2020; Hemdan, Shouman, & Karar, 2020; Narin, Kaya, & Pamuk, 2020). However, the limited availability of annotated CXR images for nCOVID-19 cases to train the data-hungry DL models turned out to be the biggest bottleneck (X. Wang et al., 2017). Later, to avoid overfitting of the models, the studies used data augmentation techniques, which generate different variants of the source image by applying random photometric transformations like blurring, sharpening, contrast adjustment, etc. (Chowdhury et al., 2020; Xu et al., 2020). Further, CT images have also been used to perform in-depth volumetric analysis of subtle disease responses (similar to viral pneumonia or other inflammatory lung diseases) (Maghdid, Asaad, Ghafoor, Sadiq, & Khan, 2020; Xu et al., 2020). After a retrospective analysis of the above literature, we found that the existing studies had been performed using a limited number of input CXR or CT images, which may lead to under-fitting of the data-hungry DL models (X. Wang et al., 2017). Moreover, the DL approach requires huge computational resources along with a large number of accurately annotated CXR images to train the model, which restrains its clinical acceptability (Altaf, Islam, Akhtar, & Janjua, 2019; Ho & Gwak, 2019).
Conventional ML techniques can be better integrated with CAD systems to overcome these shortcomings. To the best of our knowledge, no study has used conventional ML approaches with ensemble learning using majority voting for the classification of normal and nCOVID-19 infected CXR images.

In this study, we developed an automatic COVID screening (ACoS) system that employs hierarchical classification using conventional ML algorithms and radiomic texture descriptors to segregate normal, pneumonia, and nCOVID-19 infected patients. The major advantage of the proposed system is that it can be easily modeled using a limited number of annotated images and can be deployed even in a resource-constrained environment.

The contributions of this study are recapitulated as follows:

-Proposed an ACoS system for detection of nCOVID-19 infected patients using hierarchical classification and augmented images. The proposed model can be used as a retrospective tool or to validate the related diagnosis.

-Applied majority vote based classifier ensemble to aggregate the prediction results of five supervised classification algorithms.

-Reviewed and compared the performance of the proposed ACoS system with the state-of-the-art methods.

The remaining sections of the paper are organized as follows. Section 2 describes the materials and methods used in this study. The obtained results and their detailed analysis are discussed in Section 3. The paper is concluded in Section 4.

In this study, we have used datasets from three public repositories: COVID-Chestxray set (Cohen, Morrison, & Dao, 2020), Montgomery set (Jaeger et al., 2014), and NIH ChestX-ray14 set (X. Wang et al., 2017). The detailed statistics of the number of posterior-anterior (PA) view CXR images used from each repository are shown in Table 2. Initially, all input images are preprocessed, which includes image resizing (512×512 pixels), format conversion (Portable Network Graphics), and color space conversion (Gray Scale). Subsequently, the texture preserving guided filter is applied to reduce the inherent quantum noise (Sprawls, 2018). The choice of the de-noising filter is based on our previous study (Chandra & Verma, 2020a).

Table 2. Statistics of the number of CXR images used from different repositories for performance evaluation in training, testing, and validation set.

Table 3. Image augmentation using various photometric transformations.

Transformation | Range | Description
Sharpening | Automatic | Highlight the fine details by adjusting the contrast between bright and dark pixels.
Gaussian Blur | 0.1 to 1.5 | Random smoothing of texture information between the specified range of sigma.
Brightness | -20 to 20 | Randomly increase or decrease the pixel's intensity between the given range.
Contrast | Automatic | Adjust the contrast of the image.

The preprocessed images are divided into two sub-sets: training-testing set (80%) and validation set (20%). Further, the image augmentation technique is applied to the images of the training-testing set to build a generalized model by incorporating the possible variability in the images, which might occur due to diverse imaging conditions. We applied different random photometric transformations with random parameters between the specified ranges, as described in Table 3 .
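The random photometric transformations described above can be illustrated with a minimal pure-Python sketch. It covers only two of the transformations (brightness shift and Gaussian blur) with the ranges listed in Table 3; the function names, the border handling, the 3-sigma kernel radius, and the 50/50 choice between transformations are assumptions made for illustration and are not taken from the authors' MATLAB implementation.

```python
import math
import random


def clip(value, low=0, high=255):
    """Clamp a pixel value to the valid 8-bit range."""
    return max(low, min(high, value))


def adjust_brightness(image, shift):
    """Add a constant intensity shift to every pixel, clipping to [0, 255]."""
    return [[clip(p + shift) for p in row] for row in image]


def gaussian_kernel(sigma):
    """Build a normalized 1-D Gaussian kernel covering about +/- 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Convolve every row with the kernel, replicating border pixels."""
    radius = len(kernel) // 2
    width = len(image[0])
    out = []
    for row in image:
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(image):
    """Swap rows and columns of a 2-D list."""
    return [list(col) for col in zip(*image)]


def gaussian_blur(image, sigma):
    """Separable Gaussian blur: filter rows, then columns."""
    kernel = gaussian_kernel(sigma)
    horizontal = blur_rows(image, kernel)
    vertical = transpose(blur_rows(transpose(horizontal), kernel))
    return [[clip(int(round(p))) for p in row] for row in vertical]


def augment(image, rng):
    """Return one randomly transformed variant (assumed 50/50 choice)."""
    if rng.random() < 0.5:
        return adjust_brightness(image, rng.randint(-20, 20))  # Table 3 range
    return gaussian_blur(image, rng.uniform(0.1, 1.5))         # Table 3 range
```

Calling `augment(image, random.Random(seed))` several times with different seeds would yield several variants of one grayscale image stored as a list of rows.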

The nCOVID-19 infected patients exhibit different radiographic texture patterns such as patchy ground-glass opacities (Figure 2.a), pulmonary consolidations (Figure 2.c), reticulonodular opacities (Figure 2.b), etc. on CXR images. These subtle visual characteristics can be efficiently represented with the help of radiomic texture descriptors. The study uses eight first order statistical features (FOSF) (Srinivasan & Shobha, 2008), 88 grey level co-occurrence matrix (GLCM) (Gómez, Pereira, & Infantosi, 2012; Haralick, Shanmugam, & Dinstein, 1973) features, and 8100 histogram of oriented gradients (HOG) features. The FOSF describe the complete image at a glance using the mean, variance, roughness, smoothness, kurtosis, energy, entropy, etc. They can easily quantify the global texture patterns; however, they do not capture the local neighborhood information. To overcome this shortcoming, the GLCM and HOG feature descriptors are used to perform the in-depth texture analysis. The GLCM feature describes the spatial correlation among the pixel intensities in radiographic texture patterns along four distinct directions (i.e., 0°, 45°, 90°, 135°), whereas the HOG feature encodes the local shape/texture information. The selection of these statistical texture features is motivated by the fact that they can efficiently encode natural texture patterns and are widely used in medical image analysis (Chandra & Verma, 2020a; Chandra, Verma, Singh, Jain, & Netam, 2020; Vajda et al., 2018). In this study, a total of 8196 features (8 FOSF, 88 GLCM, 8100 HOG) are extracted from each CXR image (described in Appendix-A). However, not all the extracted features are relevant for accurate characterization of visual indicators associated with nCOVID-19. Thus, to select the most informative features, we used a recently developed meta-heuristic approach called binary grey wolf optimization (BGWO) (Mirjalili, Mirjalili, & Lewis, 2014; Too, Abdullah, Mohd Saad, Mohd Ali, & Tee, 2018).
The method imitates the leadership, encircling, and hunting strategy of grey wolves. Compared with other evolutionary algorithms, the method is reported to be less prone to getting trapped in local minima, which motivated us to use it in our study (Emary, Zawbaa, & Hassanien, 2016).
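To make the GLCM descriptor mentioned above more concrete, the following is a small pure-Python sketch that builds a normalized, symmetric co-occurrence matrix for one direction and derives two Haralick-style statistics from it. The pixel offsets chosen for the four directions, the distance of one pixel, and the two statistics shown (contrast and energy) are illustrative assumptions; the 88 GLCM features actually used are listed in the paper's Appendix-A and are not reproduced here.

```python
# Assumed (dx, dy) pixel offsets for the four directions at distance 1.
OFFSETS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}


def glcm(image, dx, dy, levels):
    """Normalized, symmetric grey level co-occurrence matrix for one offset.

    `image` is a list of rows holding integer grey levels in [0, levels).
    """
    counts = [[0] * levels for _ in range(levels)]
    height, width = len(image), len(image[0])
    for y in range(height):
        for x in range(width):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < height and 0 <= x2 < width:
                i, j = image[y][x], image[y2][x2]
                counts[i][j] += 1
                counts[j][i] += 1  # count each pair in both directions
    total = sum(sum(row) for row in counts)
    if total == 0:
        raise ValueError("image too small for the requested offset")
    return [[c / total for c in row] for row in counts]


def contrast(p):
    """Weighted sum of squared grey-level differences."""
    size = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(size) for j in range(size))


def energy(p):
    """Sum of squared matrix entries (angular second moment)."""
    return sum(v * v for row in p for v in row)


def glcm_features(image, levels):
    """Contrast and energy for each of the four directions."""
    features = {}
    for angle, (dx, dy) in OFFSETS.items():
        p = glcm(image, dx, dy, levels)
        features[angle] = (contrast(p), energy(p))
    return features
```

In practice the grey levels of a 512×512 CXR image would first be quantized to a small number of levels so that the matrix stays compact.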

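Mathematically, the grey wolves are divided into four categories denoted by alpha ($\alpha$), beta ($\beta$), delta ($\delta$), and omega ($\omega$). The $\alpha$-wolf is the decision-maker and administers the hunting process with the help of beta. The $\beta$-wolves are the fittest candidates to replace the alpha when the alpha is very old or dead. The $\delta$-wolves are the next in the hierarchy, which obey the orders from $\alpha$ and $\beta$-wolves but command omega wolves. The $\omega$-wolves are the lowest in the hierarchy and report to these leader wolves. The encircling strategy of the wolves is described in Eq. 1.

$$\vec{X}(t+1) = \vec{X}_p(t) - \vec{A} \cdot \vec{D} \quad (1)$$

$$\vec{D} = \left| \vec{C} \cdot \vec{X}_p(t) - \vec{X}(t) \right| \quad (2)$$

where $\vec{X}_p$ and $\vec{X}$ denote the position of the prey and the grey wolf in the $t$-th iteration, respectively. $\vec{A}$ and $\vec{C}$ are the coefficient vectors computed using Eq. 3 and Eq. 4, respectively.

$$\vec{A} = 2\vec{a} \cdot \vec{r}_1 - \vec{a} \quad (3)$$

$$\vec{C} = 2\vec{r}_2 \quad (4)$$

where $\vec{r}_1, \vec{r}_2$ denote two random numbers between 0 and 1, and $\vec{a}$ denotes the linearly decreasing encircling coefficient (from 2 to 0) used to balance the tradeoff between searching and exploitation.

Further, the optimal position of the wolves ($\vec{X}(t+1)$) at iteration $t$ is updated using Eq. 5;

$$\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3} \quad (5)$$

where $\vec{X}_1 = \vec{X}_\alpha - \vec{A}_1 \cdot \vec{D}_\alpha$, $\vec{X}_2 = \vec{X}_\beta - \vec{A}_2 \cdot \vec{D}_\beta$, and $\vec{X}_3 = \vec{X}_\delta - \vec{A}_3 \cdot \vec{D}_\delta$; $\vec{A}_1$, $\vec{A}_2$, and $\vec{A}_3$ are computed using Eq. 3; and $\vec{D}_\alpha$, $\vec{D}_\beta$, and $\vec{D}_\delta$ are calculated using Eq. 2, with the prey position replaced by the position of the corresponding leader wolf.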
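The position update of the grey wolf optimizer can be sketched in a few lines of Python. The sketch follows the standard formulation of Mirjalili et al. (2014) for a single wolf and a single iteration. The final binarization step, which maps a continuous position to a 0/1 feature mask through a sigmoid transfer function, is an assumption about how the binary variant is obtained; the fitness function and the full optimization loop are omitted, and all function names are hypothetical.

```python
import math
import random


def linear_a(iteration, max_iterations):
    """Encircling coefficient decreasing linearly from 2 to 0."""
    return 2 - 2 * iteration / max_iterations


def coefficient_vectors(a, dim, rng):
    """Draw A = 2*a*r1 - a and C = 2*r2 component-wise."""
    A = [2 * a * rng.random() - a for _ in range(dim)]
    C = [2 * rng.random() for _ in range(dim)]
    return A, C


def step_towards(leader, wolf, a, rng):
    """Candidate position X_leader - A * |C * X_leader - X| for one leader."""
    A, C = coefficient_vectors(a, len(wolf), rng)
    return [l - A_d * abs(C_d * l - x)
            for l, x, A_d, C_d in zip(leader, wolf, A, C)]


def update_position(wolf, alpha, beta, delta, a, rng):
    """Average of the three candidates guided by alpha, beta, and delta."""
    x1 = step_towards(alpha, wolf, a, rng)
    x2 = step_towards(beta, wolf, a, rng)
    x3 = step_towards(delta, wolf, a, rng)
    return [(p + q + r) / 3 for p, q, r in zip(x1, x2, x3)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1 / (1 + math.exp(-z))
    e = math.exp(z)
    return e / (1 + e)


def binarize(position, rng):
    """ASSUMPTION: sigmoid transfer function turning positions into a mask.

    A component becomes 1 (feature selected) when the squashed value is at
    least as large as a fresh uniform random number.
    """
    return [1 if sigmoid(10 * (x - 0.5)) >= rng.random() else 0
            for x in position]
```

With `a = 0` the coefficient vector A vanishes, so the updated position collapses to the plain average of the three leaders, which is a convenient sanity check for the update rule.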

To develop a robust ACoS system, we hypothesized the following:

Hypothesis 1-The image augmentation technique could improve the robustness of the ACoS system by incorporating variability in input CXR images, which might occur due to diverse imaging modality, exposure time, radiation dose, and varying patient's posture.

Hypothesis 2-The solid mathematical foundation and better generalization capability of support vector machine (SVM) could uncover the subtle radiological characteristics associated with nCOVID-19.

Hypothesis 3-The majority voting based classifier ensemble could act as a multi-expert recommendation and reduce the probable chance of false diagnosis.

To evaluate the hypothetical assumptions, we proposed a prototype (ACoS system) model, as shown in Figure 3. The proposed system consists of five major steps: pre-processing, image augmentation, feature extraction, classification, and performance evaluation. Initially, to examine Hypothesis-1, the input CXR images are preprocessed (resize, de-noise), and the image augmentation technique is applied (described in Section 2.1). Subsequently, the radiomic texture descriptors are extracted from the complete CXR image, and the binary grey wolf optimization (BGWO) (Emary et al., 2016; Mirjalili et al., 2014) based feature selection technique is applied to pick the most relevant features. Further, to examine Hypothesis-2, the selected features are used to train the model using five supervised classification algorithms, namely decision tree (DT) (Shalev-Shwartz & Ben-David, 2014), support vector machine (SVM) (Vapnik, 1998), k-nearest neighbor (KNN) (Han, Kamber, & Pei, 2012), naïve Bayes (NB) (Rish & others, 2001), and artificial neural network (ANN) ("Artificial Neural Networks," 2013). The proposed methodology uses a two-phase classification approach. In phase-I, the normal and abnormal (containing nCOVID-19 and pneumonia) images are segregated. Subsequently, the abnormal images are further classified in phase-II to segregate nCOVID-19 and pneumonia. Moreover, the fully trained model is validated using a separate validation set. The final prediction for the validation set is the majority vote of seven benchmark classifiers (ANN, KNN, NB, DT, SVM (linear kernel), SVM (radial basis function (RBF) kernel), and SVM (polynomial kernel)), which reduces the probable chance of misclassification (Hypothesis-3). Finally, the performance measures are evaluated for the testing and validation sets. All the experiments in this study are implemented using MATLAB R2018a.
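The two-phase decision flow with majority voting described above can be sketched as follows. The classifiers are represented by plain callables that map a feature vector to a label; these callables, the label strings, and the function names are hypothetical stand-ins for the trained models and are not part of the authors' MATLAB code.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by the largest number of classifiers."""
    return Counter(labels).most_common(1)[0][0]


def two_phase_predict(features, phase1_models, phase2_models):
    """Phase-I separates normal from abnormal; phase-II splits abnormal cases.

    Each model is a callable taking a feature vector and returning a label.
    """
    phase1_label = majority_vote([model(features) for model in phase1_models])
    if phase1_label == "normal":
        return "normal"
    # Only images voted abnormal reach the nCOVID-19 vs. pneumonia stage.
    return majority_vote([model(features) for model in phase2_models])
```

With seven voters and two labels per phase, a tie cannot occur, so the vote is always decisive.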

To compute the discriminative performance of the aforementioned features, we have used popular supervised classification algorithms: SVM (linear, radial basis function, polynomial) (Chandra & Verma, 2020a; Vapnik, 1998), ANN ("Artificial Neural Networks," 2013), KNN (Han et al., 2012), NB (Khatami, Khosravi, Nguyen, Lim, & Nahavandi, 2017; Venegas-Barrera & Manjarrez, 2011), and DT (Han et al., 2012; Pantazi, Moshou, & Bochtis, 2020). These algorithms are fast and are widely used in the literature for the classification of pulmonary diseases using CXR images (Chandra & Verma, 2020a). The selection of these classifiers is motivated by the fact that these algorithms can be efficiently trained using smaller datasets without compromising performance. In this study, a separate set of models was created for phase-I and phase-II, respectively. In phase-I, the models were trained using normal and abnormal images (containing nCOVID-19 and pneumonia) from the training-testing set. However, in phase-II, only abnormal images (containing nCOVID-19 and pneumonia) were used to train the models. In both phases, the performance of the classifiers was evaluated using a 10-fold cross-validation setup. In each fold, all the optimizable learning hyper-parameters were tuned using the Bayesian automatic optimization method (Snoek, Larochelle, & Adams, 2012).

In general, the learning hyper-parameters can be optimized in two ways: manual and automatic searching. Manual parameter tuning requires expertise. However, when dealing with numerous models and larger datasets, even expertise may not be sufficient (Ucar & Korkmaz, 2020). To overcome this shortcoming, automatic parameter tuning is used as an alternative. In this study, the grid search algorithm is used to select the best hyper-parameters by minimizing the cross-validation loss automatically.

Moreover, to examine Hypothesis-3, the majority vote based classifier ensemble technique (described in Appendix-B, Algorithm-1) is applied (shown in Figure 3) using a separate validation set (258 CXR images). In order to select the optimal combination of evaluated classifiers for the majority vote, we implemented an exhaustive search using the recursive elimination method (Chatterjee, Dey, & Munshi, 2019; Q. Chen, Meng, & Su, 2020). The method starts with all evaluated classifiers and, according to the selection criteria, iteratively eliminates classifiers until all possible combinations are exhausted.
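One way to picture the exhaustive search over classifier combinations is the sketch below, which scores every candidate subset by the accuracy of its majority vote. The selection criterion (accuracy), the restriction to odd-sized subsets (to avoid ties), and the minimum subset size of three are assumptions made for illustration; the paper's Algorithm-1 in Appendix-B may differ. `predictions` is a hypothetical dictionary mapping a classifier name to its list of predicted labels.

```python
from collections import Counter
from itertools import combinations


def vote(labels):
    """Majority label among the given predictions."""
    return Counter(labels).most_common(1)[0][0]


def ensemble_accuracy(predictions, names, truth):
    """Accuracy of the majority vote over the classifiers listed in `names`."""
    correct = 0
    for i, expected in enumerate(truth):
        if vote([predictions[name][i] for name in names]) == expected:
            correct += 1
    return correct / len(truth)


def best_subset(predictions, truth, min_size=3):
    """Try every odd-sized subset and keep the first one with top accuracy."""
    names = sorted(predictions)
    best_names, best_acc = None, -1.0
    for size in range(min_size, len(names) + 1, 2):  # odd sizes only (assumed)
        for subset in combinations(names, size):
            acc = ensemble_accuracy(predictions, subset, truth)
            if acc > best_acc:
                best_names, best_acc = subset, acc
    return best_names, best_acc
```

For seven classifiers this examines 35 + 21 + 1 = 57 subsets, which is cheap enough to enumerate directly.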

The performance of the proposed ACoS system is assessed using seven performance measures, as shown in Eq. 9 to Eq. 15 (Han et al., 2012), where the number of infected and normal CXR images correctly predicted by the proposed system is denoted by true positive (TP) and true negative (TN), respectively; the false positive (FP) and false negative (FN) denote the misclassification of normal and infected images, respectively; P=TP+FN and N=TN+FP.

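$$\text{Area Under Curve (AUC)} = \frac{1}{2}\left(\frac{TP}{P} + \frac{TN}{N}\right)$$

$$\text{Matthews Correlation Coefficient (MCC)} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP+FP)(TP+FN)(TN+FP)(TN+FN)}}$$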
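The confusion-matrix based measures can be computed directly from the four counts, as in the short sketch below. It covers three of the seven measures: accuracy in its standard form, the balanced AUC estimate, and MCC. The function name and the convention of returning an MCC of 0 when the denominator vanishes are assumptions.

```python
import math


def metrics(tp, tn, fp, fn):
    """Compute ACC, AUC, and MCC from confusion-matrix counts."""
    p = tp + fn  # all infected images
    n = tn + fp  # all normal images
    acc = (tp + tn) / (p + n)
    auc = 0.5 * (tp / p + tn / n)  # mean of sensitivity and specificity
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"ACC": acc, "AUC": auc, "MCC": mcc}
```

For example, with TP = 90, TN = 80, FP = 20, and FN = 10, both ACC and AUC evaluate to 0.85, while MCC is roughly 0.70.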

Finally, the obtained results are statistically validated using the z-test, the Friedman average ranking, and the Holm (Holm, 1979) and Shaffer (Shaffer, 1986) post-hoc multiple comparison methods.

This section presents a detailed discussion of the obtained experimental results of the proposed ACoS system. To evaluate the hypothetical assumptions (Hypothesis 1-3) , the following experiments were formulated:

Experiment 1-Different photometric transformations were randomly applied to input CXR images, and classification performance was evaluated (Hypothesis-1).

Experiment 2-The classification performance of SVM was assessed and compared with the other benchmark classifiers (Hypothesis-2) .

Experiment 3-The classification performance of the majority voting technique and individual benchmark classifiers are evaluated using a separate validation set (Hypothesis-3).

In this study, we used a two-phase classification technique to discriminate the normal, nCOVID-19, and pneumonia X-ray images. Initially, two sets of classification models were created for phase-I and phase-II, respectively, using original CXR images from the training-testing set. Subsequently, image augmentation was performed using different photometric transformations, as discussed in Section 2.1. The augmented images along with the original CXR images were used to re-train the models, and the classification performance was evaluated. From the obtained results shown in Table 4 and Table 5, it was observed that the supervised models trained using augmented images performed significantly better compared to the models trained using original CXR images for both phases (phase-I and phase-II), which confirms the validity of Hypothesis-1. The promising performance obtained using augmented images can be justified by the fact that the augmented images provide sufficient instances to train the model for possible variations in input CXR images, which might occur due to diverse imaging parameters and platforms in different hospitals.

Moreover, the results obtained using different supervised algorithms for phase-I (shown in Table 4) and phase-II (shown in Table 5) demonstrate that the SVM (linear kernel) outperformed the others using a selected feature set (1546 features for phase-I and 2018 features for phase-II). The significantly better performance of SVM is due to its generalization capability and its ability to learn and infer intricate natural patterns by efficiently adapting the hyperplane and the soft margins using support vectors. Further, the obtained higher accuracy (ACC) of 99.67±0.31%, area under the curve (AUC) of 1±0.00, and Matthews Correlation Coefficient (MCC) of 0.99±0.00 for phase-I and ACC of 98.78±0.96%, AUC of 0.99±0.01, and MCC of 0.98±0.02 for phase-II demonstrate its promising performance, thus justifying the validity of Hypothesis-2.

Table 4. Phase-I (Normal vs. Abnormal) classification performance of different supervised models using Training-Testing in 10-fold cross-validation setup.

The nCOVID-19 is highly contagious, and even a single false negative may lead to community spread of the infection. Therefore, to reduce the probable chance of misclassification, we used the majority voting based classifier ensemble of seven benchmark supervised models, as shown in Figure 3. Further, the classification performance of the majority voting technique and the individual benchmark classifiers is evaluated using a separate validation set (which was not used during training of the models). The set consists of 258 CXR images (86 normal, 86 nCOVID-19 and 86 pneumonia). Initially, the radiomic texture features (described in Section 2.2) were extracted from the input CXR images and classified using different supervised models in phase-I. The output of each model acts as an expert suggestion to segregate the input CXR images into normal or abnormal (nCOVID-19 or pneumonia). The classification performance of each model using the validation set in phase-I is shown in Table 6.
From the obtained results, it can be observed that the performance of the majority voting algorithm in phase-I (ACC of 98.062%, AUC of 0.977, and MCC of 0.956) is significantly better compared to the others. Further, all the images that were classified into the abnormal category in phase-I were passed to phase-II for differential diagnosis between nCOVID-19 and pneumonia.

In phase-II, the abnormal input images were classified using each supervised model, and the prediction results were aggregated using the majority voting based classifier ensemble. From the obtained results shown in Table 7, it was observed that the majority voting based classifier ensemble achieved significantly higher performance (ACC of 91.279%, AUC of 0.913, and MCC of 0.830) compared to the individual models, which confirms the robustness of the proposed ACoS system (justifying the validity of Hypothesis-3). To break the community spread of nCOVID-19, one of the desired properties of any screening system is that it should have the least number of Type-II (false negative) errors without compromising on the number of Type-I (false positive) errors. Figure 4.(a) and Figure 4.(b) show the confusion matrix (CM) of the majority voting algorithm for phase-I and phase-II, respectively, using the validation set. From the CM, it was observed that the majority voting approach outperformed the others, achieving fewer Type-I and Type-II errors.

This section describes the statistical significance of the results obtained from the various experiments performed in this study. Initially, the statistical significance of the obtained performance (ACC and F1-measure) of different supervised models with and without augmented images for phase-I and phase-II was validated using z-test statistics. The test considers the null hypothesis that the performance of the supervised models before and after applying image augmentation is equal. Alternatively, the models trained using augmented images exhibit higher performance. The test statistics for phase-I and phase-II at the 95% confidence level (or α = 0.05) are shown in Table 8. From the statistical results of phase-I, it was observed that the classification performance of the ANN, SVM (linear and RBF kernel), DT, and KNN models is significantly higher using augmented images (accepting the alternate hypothesis) compared to the models trained using original CXR images. Similarly, in phase-II, all the models strongly accept the alternate hypothesis (i.e., models trained using augmented images exhibit higher performance, or p-value < 0.05).

Table 8. Computed z-score for comparing the performance (accuracy and F-measure) of different supervised models using augmented images vs. without using augmented images for Training-Testing set in 10-fold cross-validation setup (at 95% confidence level or alpha = 0.05).

The statistical significance of the proposed ACoS system was evaluated using the Friedman average ranking method and the Holm and Shaffer pairwise comparison methods for the validation set (Chandra & Verma, 2020a; Chandra, Verma, Singh, et al., 2020). The Friedman test statistics compare the mean ranks of different classifiers, assuming that the performance of all classifiers is equal (null hypothesis).
From the average ranks shown in Table 9, we found that the test strongly accepts the alternate hypothesis while rejecting the null, which confirms the substantial difference in the performance of different classification algorithms (at α = 0.05) for both phases. The result can also be verified from the Friedman test (at 7 degrees of freedom) with p-value = 0.0000003993 < 0.05 for phase-I and p-value = 0.000001428 < 0.05 for phase-II. Further, the validity of Hypothesis-3 can be verified from the fact that the majority voting algorithm achieved the minimum rank (first rank) in both phases.

Table 9. Average ranking of classifiers based on different classification performance metrics using the Friedman test with 7 degrees of freedom.

Further, the Friedman average rankings shown in Table 9 demonstrate that the mean ranks of different classification algorithms are significantly different (p-value < α); therefore, it is meaningful to perform the pairwise post-hoc comparisons. In this study, the Holm (Holm, 1979) and Shaffer (Shaffer, 1986) post-hoc procedures were used to perform multiple pairwise comparisons. The methods consider the null hypothesis that all algorithms performed equally.
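As a rough illustration of how a step-down post-hoc correction operates, the sketch below implements Holm's procedure on a list of unadjusted p-values. It is a generic textbook version and not the authors' code; Shaffer's refinement, which exploits logical dependencies among pairwise hypotheses, is not shown. The helper `pairwise_count` only illustrates why eight compared methods (assumed here to be the seven classifiers plus the ensemble, consistent with 7 degrees of freedom) give 28 pairs.

```python
def pairwise_count(k):
    """Number of distinct pairs among k compared methods."""
    return k * (k - 1) // 2


def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: True where the null hypothesis is rejected.

    The smallest p-value is compared with alpha/m, the next with
    alpha/(m-1), and so on; testing stops at the first failure.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are retained as well
    return reject
```

For instance, `holm_reject([0.01, 0.04, 0.03])` rejects only the first hypothesis, because 0.03 exceeds the second threshold of 0.05/2.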

In this study, 28 pairs of classification algorithms were compared at the α = 0.05 level of significance. The Holm and Shaffer methods reject those hypotheses that have an unadjusted p-value ≤ 0.002381 and p-value ≤ 0.001786, respectively, for both phase-I and phase-II. The test statistics for phase-I and phase-II are shown in Table 10 and Table 11, respectively. From the statistical results, it was observed that the performance of the proposed majority vote based classifier ensemble method is significantly better compared to the other classification algorithms for both phases, confirming the validity of Hypothesis-3.

Finally, the performance of the proposed system is compared with the existing state-of-the-art methods (summarized in Table 1). Initially, the proposed method is compared for two classes (normal vs. abnormal/nCOVID-19), as shown in Table 12. From the table, it was observed that the proposed method performed significantly better compared to Panwar et al. (2020), Hemdan et al. (2020) and Maghdid et al. (2020). Further, it achieved performance comparable to Narin et al. (2020) and Ozturk et al. (2020). However, one should note that Narin et al. (2020) and Ozturk et al. (2020) used comparatively fewer CXR images to train the DL model. Further, the overall accuracy (for three classes: normal vs. nCOVID-19 vs. pneumonia) of the proposed model is evaluated and compared with the existing state-of-the-art methods, as shown in Table 13. The table reveals that the proposed method performed significantly better in terms of overall accuracy (ACC=93.411%) compared to Ozturk et al. (2020) and L. . However, it achieved comparatively lower performance than Abbas et al. (2020), Chowdhury et al. (2020), Ucar et al. (2020) and Toğaçar et al. (2020), which is due to the fact that Abbas et al. (2020) and Toğaçar et al. (2020) used very few CXR images to train the DL models.
Further, the radiological responses of pneumonia and nCOVID-19 are subtle, which confuses the classifier. Overcoming this limitation is still an open-ended research area.

The morbidity and mortality rate due to nCOVID-19 is rapidly increasing, with thousands of reported deaths worldwide. The WHO has already declared this pandemic a global health emergency ("Coronavirus Disease 2019," 2020). In this study, we presented an ACoS system to detect nCOVID-19 infected patients using CXR image data. We performed two-phase classification to segregate normal, nCOVID-19 and pneumonia infected images. The major challenges we experienced in this study are:

-The publicly available nCOVID-19 infected CXR images are limited and lack standardization.

-The radiological characteristics of nCOVID-19 and pneumonia infections are ambiguous.

Moreover, several studies using DL approaches have been reported in the literature for detection of nCOVID-19 infection in CXR and CT images (as shown in Table 1). Although the DL methods reported promising performance, they suffer from the following shortcomings:

-Resize the input CXR images to lower resolution (like 64x64 or 224x224, etc.) before processing, which may result in loss of crucial discriminative texture information.

-Demands massive training data to sufficiently train the model.

-Requires expertise to define suitable network architecture and set the many hyper-parameters (like input resolution, number of layers, filters, and filter shape, etc.).

-Requires high computational resources, extensive memory and a significant amount of time to train the network.

-Unlike conventional machine learning, DL approaches are unexplainable in nature.

To overcome the aforementioned limitations, we have used a combination of radiomic texture features with conventional ML algorithms. The following facts can justify the promising performance of the proposed ACoS system:

-The radiomic texture descriptors (FOSF, GLCM, and HOG features) are highly efficient in encoding natural textures and thus can easily quantize the correlation attributes of radiological visual characteristics associated with nCOVID-19 infection.

-The image augmentation technique provides sufficient instances to train the model for possible variable inputs, making the model robust.

-The conventional ML algorithms can be efficiently trained using smaller datasets, fewer resources, and minimal hyper-parameter tuning without compromising performance.

-The majority vote based classifier ensemble method used in the proposed ACoS system acts as a multi-expert recommendation system and reduces the probable chance of misclassification.
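To make the first point above concrete, the following minimal sketch builds a grey-level co-occurrence matrix for a single pixel offset and derives two classic Haralick statistics from it. It is an illustration under stated assumptions, not the feature extractor used in the study: the function names, the 4-level toy image, the non-symmetric counting, and the choice of contrast and energy are all illustrative.

```python
def glcm(image, dx, dy, levels):
    """Normalised co-occurrence matrix P[i][j] for one offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Count the pair only if the neighbour lies inside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[n / total for n in row] for row in counts]


def contrast(p):
    """Sum of P[i][j] * (i - j)^2: large when neighbours differ strongly."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


def energy(p):
    """Sum of squared probabilities: large for uniform textures."""
    return sum(v * v for row in p for v in row)


# Toy 4x4 image with grey levels 0..3 (illustrative data only).
toy = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
p = glcm(toy, dx=1, dy=0, levels=4)  # right-hand neighbour, i.e. 0 degrees
print(contrast(p), energy(p))
```

Repeating the computation for the offsets that correspond to 45°, 90°, and 135° would give one matrix per direction, from which the same statistics can be read.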

The disadvantages of the proposed system are as follows:

-The subtle radiographic responses of different abnormalities such as TB, pneumonia, and influenza confuse the classifier, limiting the diagnostic performance of the system.

In the proposed ACoS system, a majority vote based classifier ensemble technique has been exploited to reduce the probable chance of misclassification of nCOVID-19 infected patients. Such a method can be easily integrated into a mobile radiology van and can serve the welfare of society.
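The two-phase decision flow in which the ensemble operates (normal vs. abnormal, then nCOVID-19 vs. pneumonia) can be sketched as a simple cascade. This is a minimal sketch under assumptions: `two_phase_screen` is a hypothetical name, and the two toy predictors are placeholders with arbitrary thresholds standing in for the trained phase-I and phase-II ensembles.

```python
def two_phase_screen(features, phase1_predict, phase2_predict):
    """Cascade two binary decisions into one of three labels."""
    # Phase-I: normal vs. abnormal.
    if phase1_predict(features) == "normal":
        return "normal"
    # Phase-II: only abnormal cases reach this point.
    return phase2_predict(features)


# Hypothetical stand-ins for the trained ensembles (not from the paper).
def toy_phase1(features):
    return "abnormal" if features[0] > 0.5 else "normal"


def toy_phase2(features):
    return "nCOVID-19" if features[1] > 0.5 else "pneumonia"


for sample in ([0.2, 0.9], [0.8, 0.9], [0.8, 0.1]):
    print(sample, two_phase_screen(sample, toy_phase1, toy_phase2))
```

One consequence of this design is that a phase-I error cannot be corrected in phase-II, since a sample labelled normal never reaches the second stage.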

In this study, we have presented an ACoS system for preliminary diagnosis of nCOVID-19 infected patients, so that proper precautionary measures (like isolation and RT-PCR test) can be taken to prevent the further outbreak of the infection. The key findings of the study are summarized as follows:

-The proposed ACoS system demonstrated promising potential to segregate normal, pneumonia, and nCOVID-19 infected patients, as verified by the performance of phase-I (ACC=98.062%, AUC=0.977, and MCC=0.956) and phase-II (ACC=91.329%, AUC=0.914, and MCC=0.831) on the validation set.

-There are significant variations in the input CXR images due to diverse imaging conditions in different hospitals. The proposed system used augmented images, which generate sufficient variability to train the model and improve its robustness.

-The radiomic texture descriptors like FOSF, GLCM, and HOG features are highly efficient in quantizing the correlation attributes of radiological visual characteristics associated with nCOVID-19 infection.

-Unlike the data-hungry DL approaches, the proposed ACoS system used conventional ML algorithms to train the model with limited annotated images and fewer computational resources. This type of system may have greater clinical acceptability and can be deployed even in a resource-constrained environment.

-The Friedman post-hoc multiple comparison and z-score statistics confirm the statistical significance of the proposed system.

The future work of this study should focus on improving the reliability and clinical acceptability of the system. The integration of the patient's symptomatology and radiologist's feedback with the CAD system could be helpful in making a robust screening system. Further, an in-depth analytical comparison of performances between conventional algorithms and deep learning methods could help in establishing its clinical acceptability.

Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network

Going Deep in Medical Image Analysis: Concepts, Methods, Challenges and Future Directions

Application of deep learning technique to manage COVID-19 in routine clinical practice using CT images: Results of 10 convolutional neural networks

Automated Methods for Detection and Classification Pneumonia based on X-Ray Images Using Deep Learning

Lung segmentation in chest radiographs using anatomical atlases with nonrigid registration

Analysis of quantum noise-reducing filters on chest X-ray images: A review

Pneumonia Detection on Chest X-Ray Using Machine Learning Paradigm

Localization of the Suspected Abnormal Region in Chest Radiograph Images

Automatic detection of tuberculosis related abnormalities in Chest X-ray images using hierarchical feature extraction scheme

Integration of morphological preprocessing and fractal based feature extraction with recursive feature elimination for skin lesion types classification

Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study

WERFE: A Gene Selection Algorithm Based on Recursive Feature Elimination and Ensemble Strategy

First case of Coronavirus Disease 2019 (COVID-19) pneumonia in Taiwan

A novel transfer learning based approach for pneumonia detection in chest X-ray images

Can AI help in screening Viral and COVID-19 pneumonia?

CT Imaging Features of 2019 Novel Coronavirus (2019-nCoV)

COVID-Chestxray Database

Coronavirus Disease

Histograms of Oriented Gradients for Human Detection

Binary grey wolf optimization approaches for feature selection

Analysis of Co-Occurrence Texture Statistics as a Function of Gray-Level Quantization for Classifying Breast Ultrasound

Data Mining: Concepts and Techniques

Textural Features for Image Classification

COVIDX-Net: A Framework of Deep Learning Classifiers to Diagnose COVID-19 in X-Ray Images

Multiple Feature Integration for Classification of Thoracic Disease in Chest Radiography

A simple sequentially rejective multiple test procedure

Radiology Perspective of Coronavirus Disease 2019 (COVID-19): Lessons From Severe Acute Respiratory Syndrome and Middle East Respiratory Syndrome

Clinical features of patients infected with 2019 novel coronavirus in Wuhan

Automatic Tuberculosis Screening Using Chest Radiographs

Identifying pneumonia in chest X-rays: A deep learning approach

Essentials for Radiologists on COVID-19: An Update-Radiology Scientific Expert Panel

A neuro-heuristic approach for recognition of lung diseases from X-ray images

Medical image analysis using wavelet transform and deep belief networks

Coronavirus (COVID-19) Outbreak: What the Department of Radiology Should Know

Diagnosing COVID-19 Pneumonia from X-Ray and CT Images using Deep Learning and Transfer Learning Algorithms

Grey Wolf Optimizer

Automatic Detection of Coronavirus Disease (COVID-19) Using X-ray Images and Deep Convolutional Neural Networks

A Novel Medical Diagnosis model for COVID-19 infection detection based on Deep Features and Bayesian Optimization

Automated detection of COVID-19 cases using deep neural networks with X-ray images

Artificial intelligence in agriculture

Application of deep learning for fast detection of COVID-19 in X-Rays using nCOVnet

Deep Transfer Learning Based Classification Model for COVID-19 Disease

Learning to detect chest radiographs containing pulmonary lesions using visual attention networks

Evaluating the Implementation of Deep Learning in LibreHealth Radiology on Chest X-Rays

An empirical study of the naive Bayes classifier

Automated chest x-ray screening: Can lung region symmetry help detect pulmonary abnormalities?

Modified sequentially rejective multiple test procedures

Understanding machine learning: From theory to algorithms

Practical Bayesian Optimization of Machine Learning Algorithms

The Physical Principles of Medical Imaging

Statistical texture analysis

COVID-19 detection using deep learning models to exploit Social Mimic Optimization and structured chest X-ray images using fuzzy color and stacking approaches

A New Competitive Binary Grey Wolf Optimizer to Solve the Feature Selection Problem in EMG Signals Classification

COVIDiagnosis-Net: Deep Bayes-SqueezeNet based diagnosis of the coronavirus disease 2019 (COVID-19) from X-ray images

Feature Selection for Automatic Tuberculosis Screening in Frontal Chest Radiographs

Statistical learning theory

Visual Categorization with Bags of Keypoints

COVID-Net: A Tailored Deep Convolutional Neural Network Design for Detection of COVID-19 Cases from Chest Radiography Images

A deep learning algorithm using CT images to screen for corona virus disease (COVID-19)

ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases. Computer Vision and Pattern Recognition (CVPR

Deep Learning System to Screen Coronavirus Disease

Localizing tuberculosis in chest radiographs with deep learning

Chest Radiographic and CT Findings of the 2019 Novel Coronavirus Disease (COVID-19): Analysis of Nine Patients Treated in Korea

Recent advances in the detection of respiratory virus infection in humans

The authors would like to acknowledge J.P. Cohen and group for collecting images from various sources and making them publicly available (COVID-Chestxray Database) for the research community.

The authors declare no conflict of interest.

The ambiguous texture patterns that different infectious diseases produce in CXR images make it challenging for the radiologist to diagnose and correlate the patterns to a specific disease accurately. nCOVID-19 exhibits various radiological characteristics, as described in Section 2.2. These visual indicators can be efficiently quantized using statistical texture descriptors (FOSF, GLCM, and HOG features). The FOSF encodes the texture according to the statistical distribution of pixel intensities over the entire image, deriving a set of histogram statistics (such as mean, variance, smoothness, kurtosis, energy, and entropy) while ignoring the correlation among the pixels (Srinivasan & Shobha, 2008).

Further, to encode the spatial correlation of grey-level distributions of disease texture patterns in the local neighborhood, the GLCM texture feature (Gómez et al., 2012; Haralick et al., 1973) is used. It considers the relative position of pixel intensities in a given neighborhood of size M × N of an input image I(x, y), encoding natural textures (Gómez et al., 2012). It represents the relative frequency of two grey levels 'i' and 'j' as a statistical probability value P(i, j | d, θ) or P(i, j | Δx, Δy) that occurs at a pair of points separated by distance vector 'd' along angle θ = 0°, 45°, 90°, 135°. The statistical probability value is described in Eq. 16. These features are summarized in Table 14.

The HOG features (Dalal & Triggs, 2005; ) extract the gradient magnitude and direction of the local neighborhood, encoding the local shape and texture information. Initially, the input image is divided into smaller blocks, for which the gradient magnitude and orientation are computed, as shown in Eq. 17 and Eq. 18, respectively. Further, the histogram bins are created and normalized to form a feature descriptor. Here, g_x and g_y represent the gradient in the x and y direction, respectively, and g and θ represent the gradient magnitude and direction.

Appendix-B

Initially, in steps 1 and 2, each image from dataset D is retrieved, and the class variables are initialized. The extracted images are tested using all the classification algorithms (i.e., SVM, ANN, KNN, NB, and DT), as shown in step 4. If a classifier assigns the sample to the healthy class, the 'Healthy' class count is incremented by 1, as shown in step 6. Similarly, for samples assigned to the unhealthy class, the 'Unhealthy' class count is incremented (shown in step 8). Finally, based on the majority vote, a class label is assigned to each input image, as shown in step 11. Here, the majority vote acts as a multi-expert recommendation and reduces the probable chance of false diagnosis.
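The voting steps described above can be sketched in a few lines. This is a minimal sketch, assuming each trained classifier is available as a callable that returns 'Healthy' or 'Unhealthy'; the names `majority_vote` and `ensemble_predict` and the stub classifiers are hypothetical and stand in for the trained SVM, ANN, KNN, NB, and DT models.

```python
from collections import Counter


def majority_vote(votes):
    """Return the label that received the most votes."""
    label, _count = Counter(votes).most_common(1)[0]
    return label


def ensemble_predict(sample, classifiers):
    """Collect one vote per classifier and return the majority label."""
    votes = [clf(sample) for clf in classifiers]
    return majority_vote(votes)


# Stub classifiers with arbitrary thresholds (illustrative only).
stubs = [
    lambda x: "Unhealthy" if x > 0.3 else "Healthy",
    lambda x: "Unhealthy" if x > 0.4 else "Healthy",
    lambda x: "Unhealthy" if x > 0.5 else "Healthy",
    lambda x: "Unhealthy" if x > 0.6 else "Healthy",
    lambda x: "Unhealthy" if x > 0.7 else "Healthy",
]

print(ensemble_predict(0.55, stubs))  # three of five stubs vote 'Unhealthy'
```

With five voters and two classes a tie cannot occur, which is one practical reason to use an odd number of base classifiers.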

Validation dataset D = {I_1, I_2, …, I_n}. // n = number of images in the dataset

1. Proposed Automatic COVID Screening (ACoS) system for detection of infected patients.
2. Random image augmentation is applied to incorporate variability in the images.
3. Applied hierarchical (two-phase) classification to segregate three classes.
4. Majority vote based classifier ensemble is used to combine the models' predictions.
5. Proposed method shows promising potential to detect nCOVID-19 infected patients.