key: cord-0835924-z7b03pfk authors: Pi, Pengpeng; Lima, Dimas title: Gray level co-occurrence matrix and extreme learning machine for Covid-19 diagnosis date: 2021-06-04 journal: nan DOI: 10.1016/j.ijcce.2021.05.001 sha: 92f347560e7167ff0245d9ee8398d7434ee5d446 doc_id: 835924 cord_uid: z7b03pfk

Background Chest CT is considered a more accurate method for diagnosing suspected COVID-19 patients. However, as the epidemic has spread, traditional diagnostic methods have been unable to meet the requirements of efficiency and speed. It is therefore necessary to use artificial intelligence to help people make efficient and accurate judgments. A number of studies have shown that it is feasible to use deep learning methods to help diagnose COVID-19. However, most of the existing methods are single-layer neural network structures, and their accuracy and efficiency need to be improved. Method In this scheme, a hybrid model is adopted. First, the gray-level co-occurrence matrix is used to extract the features of the images, and then the extreme learning machine is used for classification. Results The experimental results show that the model proposed in this paper is feasible and can help medical staff accurately identify suspected patients for subsequent isolation and treatment.

Within months of the discovery of the first COVID-19 patient in December 2019, COVID-19 had spread to most countries around the world. According to data from the World Health Organization, as of December 27, 2020, there were more than 79 million confirmed cases of COVID-19 worldwide and more than 1.7 million deaths. The speed and scope of the epidemic have been striking, and it has had a major impact on the world economy and politics. According to clinical observations, most patients with COVID-19 infection have typical features such as fever, dry cough, and weakness of the limbs.
According to current investigations, the incubation period of the new coronavirus is generally 3-14 days, but in some cases it can reach 24 days. The patient has no typical clinical symptoms during the incubation period. Most patients develop the typical clinical manifestations mentioned above once the incubation period has passed, but a small number of patients may still show no clinical manifestations afterwards. In summary, COVID-19 is characterized by strong infectivity, strong pathogenicity, and a long incubation period, so efficient and rapid detection of suspected patients is very difficult.

The most commonly used detection method for COVID-19 is nucleic acid detection, which combines reverse transcription with polymerase chain reaction (RT-PCR) to detect viral RNA fragments. A suspected patient can be confirmed only when the nucleic acid test result is positive. However, the sensitivity of RT-PCR screening is low, so even if the test results of some patients are negative, the possibility of infection cannot be completely ruled out. In addition, nucleic acid testing requires special test kits, and some countries and regions cannot afford the corresponding costs. Nucleic acid test results are also relatively slow to obtain, which hinders follow-up treatment and isolation. There is therefore an urgent need for an efficient and accurate detection method. At the same time, according to the COVID-19 Diagnosis and Treatment Program (Trial Seventh Edition) issued by the National Health Commission of China, "imaging features" are listed as one of the three clinical features of suspected COVID-19 cases, and the clinical diagnosis results of chest imaging can also be used as a criterion for judging COVID-19 cases.
Because COVID-19 patients have typical lung imaging features such as ground glass opacities (GGO), pulmonary sclerosis, pulmonary fibrosis, and multiple foci, chest imaging is very helpful for diagnosing suspected patients. In addition, the literature compares detection and diagnosis by CT and by RT-PCR, and the results show that diagnosis based on chest CT is faster and more efficient than RT-PCR. Traditionally, the judgment is made by a radiologist, and the doctor's personal condition has a great influence on the result. Intelligent diagnosis systems based on computer vision and artificial intelligence will benefit patients, doctors and hospitals in many ways.

Ozturk et al. proposed a binary classification model for distinguishing normal people from those infected with COVID-19, with an accuracy of 98.08%. They also proposed a multi-class model for distinguishing between normal people, people infected with viral pneumonia, and people infected with COVID-19. The multi-class model has an accuracy of 87.02%, which still needs to be improved [1]. Lu [2] proposed a Radial Basis Function Neural Network (RBFNN) for brain classification, and this method can be used as a comparison baseline. Li et al. [3] proposed Real-Coded Biogeography-based Optimization (RCBBO) for brain detection. Similarly, RCBBO is used as a comparison baseline. Matteo et al. proposed an improved light convolutional neural network based on SqueezeNet. Compared with the traditional SqueezeNet, this network structure has 85.03% accuracy, higher efficiency and fewer parameters [4]. It can effectively distinguish the chest CT images of COVID-19 patients from those of healthy people. Benbrahim et al. used deep learning methods to classify chest CT images of COVID-19 patients using the Inception-v3 model and the ResNet-50 model, which reached accuracy rates of 99.01% and 98.03%, respectively [5].
Shervin et al. trained four commonly used neural networks, ResNet-18, ResNet-50, SqueezeNet and DenseNet-121, to classify chest CT images of suspected COVID-19 patients [6]. Among them, SqueezeNet demonstrated the best performance, reaching a sensitivity of 98% and a specificity of 92%. Toraman et al. proposed a new type of artificial convolutional neural network called CapsNet, which achieves rapid and accurate diagnosis of COVID-19 [7]. This method classifies chest CT images of suspected COVID-19 patients in two ways, binary classification and multi-class classification, with accuracy rates of 97.24% and 84.22%, respectively [7]. To improve the accuracy and efficiency of the model, researchers have put forward the concept of a hybrid network model. Özkaya et al. used convolutional networks and support vector machine algorithms in classification tasks, but did not perform feature selection, so the application scenarios are very limited. Yao [8] combined wavelet entropy (WE) and biogeography-based optimization (BBO) for COVID-19 recognition. Chen [9] employed the Gray-Level Co-occurrence Matrix (GLCM) and support vector machine (SVM) for COVID-19 recognition.

The Radial Basis Function neural network has three main characteristics: optimal approximation, simple training, and fast learning and convergence. An RBFNN can approximate any continuous nonlinear function with arbitrary accuracy [10], and is widely used in image recognition, function recognition and other fields. An RBFNN consists of three layers, as shown in Figure 1, with only one hidden layer: (1) The connection weight between the input layer and the hidden layer is 1, so the input layer only plays the role of data input. (2) The hidden layer contains N radial basis neurons (the activation function is an RBF), which map the low-dimensional linearly inseparable input to a high-dimensional linearly separable space.
When the input is near the center of the basis function, the hidden layer node generates a larger output. As the input moves away from the center, the output drops sharply. (3) The output layer contains J linear neurons, and the final output is the linear weighted sum of the outputs of the hidden layer.

The radial basis function (RBF) is a non-negative real-valued function that is radially symmetric about a center point; its value depends only on the distance from the center point. The most commonly used RBF combines the Euclidean distance with a Gaussian function, as shown in Figure 2. Let $c_i$ be the center point of the Gaussian function of the $i$-th node in the hidden layer, and $\sigma_i$ be the width parameter of the $i$-th node:

$$\varphi_i(x) = \exp\left(-\frac{\|x - c_i\|^2}{2\sigma_i^2}\right)$$

The main purpose of the RBF is to map inseparable low-dimensional data to a high-dimensional space, where it becomes separable. The RBF only needs to find the center points that represent the data. Compared with the traditional BP algorithm, the RBF network does not need to train all connection weights globally; it only needs to adjust the weights that affect the output. The overall training speed is therefore improved, and this function is also called a local response function. An RBFNN needs to select the hidden layer basis functions. The smaller the distance between the input vector and the center point, the larger the corresponding output. The size of the center point matrix is the number of neurons in the hidden layer × the number of neurons in the input layer. The width parameter $\sigma_i$ corresponding to each center $c_i$ enables different input information to be reflected by different neurons in the hidden layer to the greatest extent [11]. The final output is:

$$y_j = \sum_{i=1}^{N} w_{ij}\,\varphi_i(x), \quad j = 1, \ldots, J$$

Biogeography-based optimization (BBO) is a global optimization algorithm based on biogeography, the study of the geographical distribution of organisms. The BBO algorithm shares some features with other biology-based algorithms, such as sharing information between solutions.
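The RBFNN forward pass described earlier in this section (Gaussian hidden units followed by a linear weighted sum) can be sketched in a few lines. This is a minimal illustration rather than the implementation used in [2]; the function names, the toy centers, widths and weights are all assumptions chosen for the example.

```python
import math

def gaussian_rbf(x, center, sigma):
    """Gaussian RBF: depends only on the Euclidean distance from x to center."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def rbfnn_forward(x, centers, sigmas, weights):
    """Hidden layer of N Gaussian units, then J linear output neurons.

    centers: N center points, each the same length as x
    sigmas:  N width parameters
    weights: N rows of J values; weights[i][j] links hidden unit i to output j
    """
    hidden = [gaussian_rbf(x, c, s) for c, s in zip(centers, sigmas)]
    n_outputs = len(weights[0])
    return [sum(h * w[j] for h, w in zip(hidden, weights))
            for j in range(n_outputs)]

# Illustrative values only (not taken from the paper).
centers = [[0.0, 0.0], [1.0, 1.0]]
sigmas = [1.0, 1.0]
weights = [[1.0], [-1.0]]

at_center = gaussian_rbf([0.0, 0.0], centers[0], sigmas[0])  # maximal response
far_away = gaussian_rbf([3.0, 4.0], centers[0], sigmas[0])   # response decays
output = rbfnn_forward([0.0, 0.0], centers, sigmas, weights)
```

The two probe points show the local-response behaviour: the unit responds most strongly at its center and its output falls off quickly as the input moves away.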
Algorithms similar to BBO include the genetic algorithm and the particle swarm optimization algorithm. However, the BBO algorithm has its own distinctive features [12]: it maintains its set of solutions from one iteration to the next, relying on migration to adapt these solutions probabilistically, and through migration operations BBO shares information between solutions. In addition, adding a mutation operator to BBO helps increase the diversity of the population [13].

Among the many improvements to BBO, the RCBBO algorithm is a real-coded BBO algorithm for global optimization problems in a continuous domain. Compared with the BBO algorithm, the RCBBO algorithm introduces a mutation operator to enhance the search ability of the algorithm and improve population diversity [14]. In addition, in the RCBBO algorithm, each individual is represented by a D-dimensional real-valued vector.

WEBBO uses the wavelet entropy method to extract the feature values of the image, then uses the BBO algorithm as the training algorithm, and finally uses a single-hidden-layer network as the classifier. The WEBBO algorithm can therefore complete the classification task efficiently and quickly. Chest CT is now considered an efficient method for quickly detecting whether a suspected COVID-19 patient is infected, but traditional CT diagnosis relies on the judgment of experts and professors, so the reliability of the results depends heavily on the condition of the individual reader. Using the WEBBO model to classify chest CT images is both efficient and fast.

The Support Vector Machine (SVM) is a binary classification model. The purpose of SVM is to find a hyperplane in a sample set containing positive and negative examples, so that the plane can separate the sample set [15]. Its basic model is defined as a linear classifier with the largest margin in the feature space, and the learning strategy of SVM is to maximize this margin [16].
Linear functions are points in one-dimensional space, lines in two-dimensional space, planes in three-dimensional space, and so on. If a linear function separates the samples, they are said to be linearly separable; otherwise, they are non-linearly separable. Such linear functions are collectively called hyperplanes. SVM is a supervised learning method, which is widely used in statistical classification and regression analysis. In the sample space, the separating hyperplane can be described by the following linear equation:

$$f(x) = w^T x + b = 0$$

If the hyperplane separates the samples and the two classes are labelled (+1, −1), then for a classifier, $f(x) > 0$ and $f(x) < 0$ represent the two different categories, +1 and −1. The core idea of SVM is to keep the two categories as far apart as possible, so that the separation is more reliable [17]. Merely separating the training samples is far from enough: the classifier must also classify and predict unknown new samples well (called generalization ability in machine learning) [18]. To maximize the margin, the SVM maximizes the distance from the separating surface to the data points closest to it. To describe the data points closest to the separating hyperplane, we need to find two hyperplanes that are parallel to and equidistant from it:

$$H_1: w^T x + b = 1, \qquad H_2: w^T x + b = -1$$

As shown in Figure 3, the sample points lying on these two hyperplanes determine the positions of $H_1$ and $H_2$. They are the points closest to the separating hyperplane. They support the dividing line and are called support vectors, which is the origin of the name support vector machine. The distance between two parallel hyperplanes $w^T x + b = c_1$ and $w^T x + b = c_2$ can be expressed as follows:

$$d = \frac{|c_1 - c_2|}{\|w\|}$$

From this, the margin between the two hyperplanes $H_1$ and $H_2$ is $2/\|w\|$, and the purpose is to maximize this margin. Therefore, the support vector machine is also called the Maximum Margin Hyperplane Classifier.
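The linear decision rule and the margin can be illustrated with a short sketch. It assumes the separating hyperplane is written as w·x + b = 0 and scaled so that the two supporting hyperplanes are w·x + b = ±1, which gives a margin width of 2/||w||. The function names and the numeric values are hypothetical and serve only as an example.

```python
import math

def decision_value(w, b, x):
    """f(x) = w . x + b; its sign gives the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def predict(w, b, x):
    """Return +1 on the positive side of the hyperplane, otherwise -1."""
    return 1 if decision_value(w, b, x) > 0 else -1

def margin_width(w):
    """Distance between the hyperplanes w . x + b = +1 and w . x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Illustrative hyperplane (not learned from data).
w, b = [3.0, 4.0], -5.0

width = margin_width(w)                       # ||w|| = 5, so the width is 0.4
on_h1 = decision_value(w, b, [2.0, 0.0])      # this point lies on w . x + b = 1
label_pos = predict(w, b, [3.0, 4.0])
label_neg = predict(w, b, [0.0, 0.0])
```

Shrinking ||w|| widens the margin, which is why the training problem is usually stated as a minimization over w.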
Maximizing the margin $2/\|w\|$ is equivalent to minimizing $\|w\|$; for convenience in the subsequent derivation, this is further equivalent to minimizing $\frac{1}{2}\|w\|^2$. Assuming that the hyperplane classifies the samples correctly, we require:

$$w^T x_i + b \ge +1 \ \text{ for } y_i = +1, \qquad w^T x_i + b \le -1 \ \text{ for } y_i = -1$$

When the two formulas are combined, we get:

$$y_i (w^T x_i + b) \ge 1, \quad i = 1, \ldots, n$$

This is the constraint on the objective function. The problem now becomes an optimization problem:

$$\min_{w, b} \ \frac{1}{2}\|w\|^2 \quad \text{s.t.} \quad y_i (w^T x_i + b) \ge 1, \quad i = 1, \ldots, n$$

This is a convex quadratic programming problem, and Lagrange duality provides an effective way to solve it. The Lagrange function is constructed first:

$$L(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{n} \alpha_i \left[ y_i (w^T x_i + b) - 1 \right], \quad \alpha_i \ge 0$$

Setting the derivatives with respect to $w$ and $b$ to zero gives:

$$w = \sum_{i=1}^{n} \alpha_i y_i x_i, \qquad \sum_{i=1}^{n} \alpha_i y_i = 0$$

Substituting these into the Lagrange function yields the dual of the original problem:

$$\max_{\alpha} \ \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^T x_j \quad \text{s.t.} \quad \alpha_i \ge 0, \quad \sum_{i=1}^{n} \alpha_i y_i = 0$$

The modeling process for this problem is now complete. To classify a data point, substitute it into $f(x)$ and read off the sign of the result. Substituting the $w$ obtained above into $f(x)$ gives:

$$f(x) = \sum_{i=1}^{n} \alpha_i y_i \langle x_i, x \rangle + b$$

This expression shows that predicting a new point $x$ only requires the inner product of $x$ with the training points. This is the basic prerequisite for the nonlinear generalization through kernels. Moreover, not every training point is needed: only the support vectors contribute, because the coefficient $\alpha_i$ of every non-support vector is 0.

When a sample set that is inseparable in a low-dimensional space is mapped to a high-dimensional space, it can become separable there. Only after it becomes separable in the high-dimensional space can the linear SVM be applied. Let the mapping function be $\phi(x)$; the classification function in the mapped space then becomes:

$$f(x) = \sum_{i=1}^{n} \alpha_i y_i \langle \phi(x_i), \phi(x) \rangle + b$$

However, if we directly map the data from the low-dimensional space to the high-dimensional space, the number of dimensions increases sharply. Therefore, we need to introduce a kernel function. The main idea of the kernel function is to find a function that, computed in the low-dimensional space, gives the same result as the inner product computed in the high-dimensional space.
That is, the inner product $\langle \phi(x_1), \phi(x_2) \rangle$ is obtained as $K(x_1, x_2)$ directly from the low-dimensional inputs. We thus get the same result while avoiding explicit computation in the high-dimensional space. The classification function now looks like this:

$$f(x) = \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b$$

where $K$ is the kernel.

Another issue in the current research field is that most gradient-based algorithms for training neural networks have limited training speed, because all parameters in the network have to be updated and adjusted in every iteration. As a result, the training speed of feedforward neural networks is much slower than expected. In this paper, a hybrid network model is proposed. First, the features of the images are extracted using the gray-level co-occurrence matrix, and then the classification results are obtained using the extreme learning machine algorithm. The hybrid network model helps improve the accuracy and overall efficiency of image classification. Because the extreme learning machine assigns the input weights randomly and determines the output weights of the network analytically, it has a great advantage in learning speed over the traditional feedforward neural network. It can help medical staff diagnose suspected patients more efficiently and facilitate subsequent isolation and treatment.

For each subject, 1 to 4 CT slices were selected. For COVID-19 patients, the slices showing the largest size and number of lesions were selected, that is, the slice-level selection method was used, while for healthy subjects any slice of the image could be chosen without slice-level selection. Table 1 shows the detailed information of the subjects. HC represents the healthy control. The resolution of all images we use is 1024×1024. The final label of each image is obtained by majority voting over the labelling of all three experts. We often use contrast (CON), entropy (ENT), inverse difference moment (IDM) and energy (ASM, angular second moment) to represent texture features.
CON: Contrast measures the difference in brightness between a pixel and its neighboring pixels; the deeper the texture grooves, the more pronounced the contrast. ENT: Entropy is a measure of image information content and represents the unevenness or complexity of the image texture. When the image is noisy or random, the entropy value is very large. The gray-level co-occurrence matrix describes the texture of a gray image by studying the spatial correlation of gray levels. Each entry $(i, j)$ of the matrix counts the pixel pairs with gray values $i$ and $j$ that are separated by a given distance in a given direction. Examples of gray level co-occurrence matrix are as follows:

Extreme Learning Machine (ELM) is an algorithm proposed by Huang Guangbin for training single-hidden-layer neural networks. Compared with traditional single-hidden-layer neural networks, ELM's most significant feature [19] is that it achieves higher efficiency while preserving learning accuracy [20]. ELM is a fast learning algorithm. For a single-hidden-layer neural network, ELM initializes the input weights and biases randomly and then computes the corresponding output weights. For $N$ training samples $(x_j, t_j)$, a single-hidden-layer network with $L$ hidden nodes can be written as:

$$\sum_{i=1}^{L} \beta_i \, g(w_i \cdot x_j + b_i) = o_j, \quad j = 1, \ldots, N$$

where $g(\cdot)$ is the activation function, $w_i = [w_{i,1}, w_{i,2}, \ldots, w_{i,n}]^T$ is the input weight, $\beta_i$ is the output weight, and $b_i$ is the bias of the $i$-th hidden layer unit [21]. $w_i \cdot x_j$ is the inner product of $w_i$ and $x_j$. The goal of single-hidden-layer neural network learning is to minimize the output error, which can be expressed as:

$$\sum_{j=1}^{N} \|o_j - t_j\| = 0$$

that is, there exist $\beta_i$, $w_i$ and $b_i$ such that $\sum_{i=1}^{L} \beta_i \, g(w_i \cdot x_j + b_i) = t_j$ for $j = 1, \ldots, N$. In matrix form this is $H\beta = T$, where $H$ is the output of the hidden layer nodes, $\beta$ is the output weight, and $T$ is the expected output. To train a single-hidden-layer neural network, we hope to obtain $\hat{w}_i$, $\hat{b}_i$ and $\hat{\beta}$ such that:

$$\|H(\hat{w}_i, \hat{b}_i)\hat{\beta} - T\| = \min_{w, b, \beta} \|H(w_i, b_i)\beta - T\|$$

where $i = 1, \ldots, L$, which is equivalent to minimizing the loss function:

$$E = \sum_{j=1}^{N} \left\| \sum_{i=1}^{L} \beta_i \, g(w_i \cdot x_j + b_i) - t_j \right\|^2$$

Traditional gradient descent-based algorithms can solve this type of problem, but almost all gradient-based algorithms need to adjust all parameters in every iteration.
In the ELM algorithm [22], once the input weights $w_i$ and the hidden layer biases $b_i$ are randomly determined, the output matrix $H$ of the hidden layer is uniquely determined. Training a single-hidden-layer neural network is then transformed into solving the linear system $H\beta = T$ [23], and the output weight can be determined as:

$$\hat{\beta} = H^{\dagger} T \tag{33}$$

where $H^{\dagger}$ is the Moore-Penrose generalized inverse of the matrix $H$. It can be proved that the solution $\hat{\beta}$ has the minimum norm and is unique. In the future, some deep neural networks will be tested, such as convolutional neural networks [24] [25] [26].

Generally, when building a machine learning model, we divide the data set into two parts: a training set and a test set. The data in the test set is used only to evaluate the performance of the model and does not participate in any training [27]. However, we often encounter over-fitting during training: the model predicts the training data very well [28], but its accuracy is unsatisfactory when tested on data outside the training set. If we then use the test data to adjust the model, the test data is effectively known in advance during training, and the final evaluation of the model is no longer reliable. The usual approach is to set aside part of the training data as a validation set for evaluating the training results of the model [29] [30] [31]. The validation data comes from the training set but does not participate in training, so the degree to which the model fits data outside the training set can be evaluated more reliably. Cross-validation, also called rotation validation, is commonly used to carry out this evaluation.
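The ELM training step described earlier in this section (random input weights and biases, then output weights obtained from the Moore-Penrose inverse) can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the sigmoid activation, the number of hidden nodes, the function names and the toy XOR data are all chosen for the example.

```python
import numpy as np

def hidden_output(X, W, b):
    """Hidden-layer output matrix H with a sigmoid activation g (an assumption)."""
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))

def elm_train(X, T, n_hidden, rng):
    """ELM: draw W and b at random, then solve H @ beta = T in one step."""
    W = rng.standard_normal((X.shape[1], n_hidden))  # input weights, never updated
    b = rng.standard_normal(n_hidden)                # hidden biases, never updated
    H = hidden_output(X, W, b)
    beta = np.linalg.pinv(H) @ T   # minimum-norm least-squares output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Network output for the inputs X."""
    return hidden_output(X, W, b) @ beta

# Toy data (XOR), used only to show that the closed-form solve fits the targets.
rng = np.random.default_rng(0)
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
T = np.array([[0.0], [1.0], [1.0], [0.0]])

W, b, beta = elm_train(X, T, n_hidden=10, rng=rng)
pred = elm_predict(X, W, b, beta)
```

No iterative parameter updates appear anywhere in the sketch: the only learned quantity is `beta`, and it comes from a single pseudoinverse computation, which is where the speed advantage over gradient-based training comes from.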
We first divide the initial sample into k sub-samples, use one of the sub-samples as the validation set, and use the remaining k-1 sub-samples as the training set. This is repeated k times so that each sub-sample is used for validation exactly once; the k results are then averaged (or otherwise combined) to obtain a single estimate. The advantage of this method is that every randomly generated sub-sample is used for both training and validation, and each sub-sample is validated exactly once. Ten-fold cross validation is the most common choice. The data set is divided into 10 parts, and in turn 9 parts are used as training data and 1 part as test data. Each run yields one result, and the average of the 10 results is used as an estimate of algorithm performance. The diagram below shows the ten-fold cross validation method.

There are many ways to evaluate the performance of a classifier; we use the confusion matrix to derive the evaluation indexes. Seven performance metrics are used to evaluate the results of the classifier [32]. The classification results were calculated based on the confusion matrix [33] and are reported as sensitivity, specificity, precision, accuracy, MCC, F1-score and FMI. The formulas for the performance measures are given as follows.

1. Confusion Matrix: A confusion matrix is a table that is typically used to describe the performance of a classification model over a set of test data with known true values [34]. It includes true positives, true negatives, false positives and false negatives. The sum of true positives and true negatives is the number of correct predictions, while the sum of false positives and false negatives is the number of wrong predictions.
1) True Positive: the class is predicted as positive, and the prediction is correct.
2) False Negative: the class is predicted as negative, and the prediction is wrong.
3) False Positive: the class is predicted as positive, and the prediction is wrong.
4) True Negative: the class is predicted as negative, and the prediction is correct.

2. Accuracy: Accuracy [35] represents the degree to which the predicted values agree with the true values, expressed as the ratio of the sum of true positives and true negatives to all samples:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

3. Sensitivity: the proportion of actual positives that are predicted as positive, $\text{Sensitivity} = \frac{TP}{TP + FN}$.

4. Specificity: the proportion of actual negatives that are predicted as negative, $\text{Specificity} = \frac{TN}{TN + FP}$.

5. Precision: the proportion of predicted positives that are actually positive, $\text{Precision} = \frac{TP}{TP + FP}$.

6. F1-score: the harmonic mean of precision and sensitivity, $F_1 = \frac{2\,TP}{2\,TP + FP + FN}$.

7. MCC: The Matthews correlation coefficient takes into account TP, TN, FP, and FN, and is mainly used to measure binary classification problems. When the value is -1, the predicted result completely contradicts the actual result [36]. When the value is 0, the predicted result is no better than random prediction. When the value is 1, the predicted result is completely consistent with the actual result [37]. The MCC thus indicates the degree of agreement between the predicted results and the actual results.

$$\text{MCC} = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \tag{40}$$

8. FMI: The Fowlkes-Mallows index is defined as the geometric mean of the pairwise precision and recall [38].

$$\text{FMI} = \sqrt{\frac{TP}{TP + FP} \times \frac{TP}{TP + FN}} \tag{41}$$

A total of 640 images were used in this study, including 320 CT images of the lungs of COVID-19 patients and 320 HC images for comparison. Before model training, we first preprocessed the data set. First, the width of all images was set to 1024 and the height to 1024. Then, to improve the contrast of the images, histogram processing was carried out. Finally, through edge cropping and down-sampling, all images were converted to a width of 256, a height of 256, and a single channel. After that, the gray-level co-occurrence matrix of each gray image was calculated to extract the image features, and the extracted feature values were submitted to the Extreme Learning Machine algorithm to obtain the final classification results.
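The feature-extraction step described above can be sketched in pure Python. This is a minimal illustration, not the authors' implementation: the function names, the 4-level toy image and the single horizontal offset are assumptions, and the four texture features use common Haralick-style definitions because the text does not give the exact formulas.

```python
import math

def glcm(image, dx=1, dy=0, levels=4):
    """Normalized gray-level co-occurrence matrix for the pixel offset (dx, dy).

    Entry [i][j] is the fraction of pixel pairs whose first pixel has gray
    value i and whose neighbor at the given offset has gray value j.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    total = sum(map(sum, counts))
    return [[v / total for v in row] for row in counts]

def texture_features(p):
    """Contrast, entropy, inverse difference moment and angular second moment."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "CON": sum((i - j) ** 2 * v for i, j, v in cells),
        "ENT": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
        "IDM": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "ASM": sum(v * v for _, _, v in cells),
    }

# Toy 4x4 image with gray values 0..3 (illustrative only).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image, dx=1, dy=0, levels=4)
features = texture_features(p)
```

In a full pipeline the resulting feature values for each image would form one row of the input matrix passed to the classifier; a real CT slice would first be quantized to a fixed number of gray levels.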
We conducted ten tests on the samples through ten-fold cross validation, and the results are as follows: Table 3, Table 4 and Table 5 give the detailed data of the specificity, precision and accuracy of the 10-fold cross-validation of the model. Figure 10, Figure 11 and Figure 12 are line charts of the variation trend of the specificity, precision and accuracy of the model.

[Figure 11: The trend of specificity]

If we replace ELM with one fully-connected layer, the results are shown in Table 7. Comparing the detailed data for precision, accuracy, F1-score, MCC and FMI in Table 7, our model outperforms the variant in which ELM is replaced with a single fully-connected layer. Therefore, the ELM algorithm not only has strong generalization ability and fast learning speed, but also has better overall performance than traditional feedforward neural networks in some specific cases.

To test the effectiveness of the proposed model, we verified it with ten-fold cross-validation. In terms of accuracy, [...] is 73.45%, GLCM-SVM is 74.88%, and the accuracy of our model is 76.00%. Therefore, the accuracy of our model is the best among the above models. In terms of F1-score, RBFNN was 69.39%, RCBBO was 73.72%, WEBBO was 73.31%, GLCM-SVM was 74.21%, and the F1-score of our model was 75.54%. Our model therefore also had the best performance in terms of F1-score. For MCC, RBFNN was 41.10%, RCBBO was 51.10%, WEBBO was 46.91%, GLCM-SVM was 49.8%, and the MCC of our model was 52.08%. In terms of FMI, RBFNN was 69.46%, RCBBO was 73.93%, WEBBO was 73.32%, GLCM-SVM was 74.25%, and the FMI of our model was 75.57%. Our model also had the best performance in terms of FMI.
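The metrics compared above can all be derived from the four confusion-matrix counts. The following is a minimal sketch using the standard definitions; the function name and the example counts (one hypothetical fold of 64 images) are assumptions for illustration and are not results from this study.

```python
import math

def classification_metrics(tp, tn, fp, fn):
    """Seven binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)      # recall on the positive class
    specificity = tn / (tn + fp)      # recall on the negative class
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    f1 = 2 * tp / (2 * tp + fp + fn)  # harmonic mean of precision and sensitivity
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    fmi = math.sqrt(precision * sensitivity)  # geometric mean
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "accuracy": accuracy,
        "f1": f1,
        "mcc": mcc,
        "fmi": fmi,
    }

# Hypothetical fold: 32 positive and 32 negative images.
m = classification_metrics(tp=24, tn=25, fp=7, fn=8)

# Boundary cases for MCC: perfect agreement and complete disagreement.
perfect = classification_metrics(tp=10, tn=10, fp=0, fn=0)
inverted = classification_metrics(tp=0, tn=0, fp=10, fn=10)
```

The sketch does not guard against empty classes: if any of the sums in a denominator is zero, the corresponding metric is undefined and the function raises `ZeroDivisionError`.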
Through the analysis of the above data, we found that the proposed model had the best performance among the five models in terms of specificity, accuracy, F1-score, MCC and FMI, although it was slightly lower than RCBBO in terms of precision and specificity. The stability of our model is also relatively good, which helps in drawing more accurate conclusions about the condition of the tested subjects.

In this paper, a hybrid network model was proposed, which used the gray-level co-occurrence matrix as the feature extractor and then used the extreme learning machine algorithm to classify lung CT images into two categories: those with COVID-19 and those without. In this method, the required grayscale images are obtained by preprocessing, and the gray-level co-occurrence matrix is then computed to obtain the features of the images. Finally, the features are passed to the extreme learning machine algorithm to classify the corresponding images. We then evaluated the model on seven metrics: sensitivity, specificity, precision, accuracy, F1-score, MCC and FMI. The average sensitivity, specificity, precision, accuracy, F1-score, MCC and FMI were (74.19 ± 2.74%), (77.81 ± 2.03%), (77.01 ± 1.29%), (76.00 ± 0.98%), (75.54 ± 1.31%), (52.08 ± 1.95%), and (75.57 ± 1.28%). We compared the proposed model with RBFNN, RCBBO, WEBBO and GLCM-SVM. The results show that our proposed model not only has good sensitivity, precision and accuracy, but also has relatively stable performance, and can be used to classify COVID-19 well. However, there is still room for improvement in the precision and accuracy of the model. We will continue to improve the model in the future to seek better performance so that it can be better applied to COVID-19 diagnosis.

There is no conflict of interest.
[1] Automated detection of COVID-19 cases using deep neural networks with X-ray images
[2] A Pathological Brain Detection System Based on Radial Basis Function Neural Network
[3] Pathological Brain Detection via Wavelet Packet Tsallis Entropy and Real-Coded Biogeography-based Optimization
[4] A light CNN for detecting COVID-19 from CT scans of the chest
[5] Deep Transfer Learning with Apache Spark to Detect COVID-19 in chest X-ray Images
[6] Deep-COVID: Predicting COVID-19 from chest X-ray images using deep transfer learning
[7] Convolutional capsnet: A novel artificial neural network approach to detect COVID-19 disease from X-ray images using capsule networks
[8] COVID-19 Detection via Wavelet Entropy and Biogeography-based Optimization
[9] Covid-19 Classification Based on Gray-Level Co-occurrence Matrix and Support Vector Machine
[10] Hardware implementation of radial-basis neural networks with Gaussian activation functions on FPGA
[11] Removal of hydrochlorothiazide from molecular liquids using carbon nanotubes: Radial basis function neural network modeling and culture algorithm optimization
[12] Binary biogeography-based optimization based SVM-RFE for feature selection
[13] Smart detection on abnormal breasts in digital mammography based on contrast-limited adaptive histogram equalization and chaotic adaptive real-coded biogeography-based optimization
[14] Biogeography based optimization for mining rules to assess credit risk. Intelligent Systems in Accounting Finance & Management
[15] Facial Emotion Recognition Based on Biorthogonal Wavelet Entropy, Fuzzy Support Vector Machine, and Stratified Cross Validation
[16] Authoritative subspecies diagnosis tool for European honey bees based on ancestry informative SNPs
[17] Detection of Dendritic Spines using Wavelet Packet Entropy and Fuzzy Support Vector Machine
[18] Development and validation of a prognostic COVID-19 severity assessment (COSA) score and machine learning models for patient triage at a tertiary hospital
[19] Extreme learning machine used for focal liver lesion identification
[20] Ductal carcinoma in situ detection in breast thermography by extreme learning machine and combination of statistical measure and fractal dimension
[21] Smart Pathological brain detection by Synthetic Minority Oversampling Technique, Extreme Learning Machine, and Jaya Algorithm
[22] A fingerprint technique for indoor localization using autoencoder based semi-supervised deep extreme learning machine
[23] Combined i-Vector and Extreme Learning Machine Approach for Robust Speaker Identification and Evaluation with SITW
[24] Improved Breast Cancer Classification Through Combining Graph Convolutional Network and Convolutional Neural Network
[25] A five-layer deep convolutional neural network with stochastic pooling for chest CT-based COVID-19 diagnosis
[26] Covid-19 Classification by FGCNet with Deep Feature Fusion from Graph Convolutional Network and Convolutional Neural Network
[27] Classification of normal and depressed EEG signals based on centered correntropy of rhythms in empirical wavelet transform domain
[28] Regularization of a nonlinear inverse problem by discrete mollification method
[29] Covid-19 diagnosis via DenseNet and optimization of transfer learning setting
[30] AVNC: Attention-based VGG-style network for COVID-19 diagnosis by CBAM
[31] PSSPNN: PatchShuffle Stochastic Pooling Neural Network for an Explainable Diagnosis of COVID-19 with Multiple-Way Data Augmentation
[32] Improving ductal carcinoma in situ classification by convolutional neural network with exponential linear unit and rank-based weighted pooling
[33] About the Pitfall of Erroneous Validation Data in the Estimation of Confusion Matrices
[34] A seven-layer convolutional neural network for chest CT based COVID-19 diagnosis using stochastic pooling
[35] Indices for rough set approximation and the application to confusion matrices
[36] Diagnosis of COVID-19 by Wavelet Renyi Entropy and Three-Segment Biogeography-Based Optimization
[37] Advances in multimodal data fusion in neuroimaging: Overview, challenges, and novel orientation
[38] Comparative Analysis of Supervised and Unsupervised Approaches Applied to Large-Scale "In The Wild" Face Verification