key: cord-0749337-h363rb5w
authors: Laxmi Lydia, E.; Anupama, C. S. S.; Beno, A.; Elhoseny, Mohamed; Alshehri, Mohammad Dahman; Selim, Mahmoud M.
title: Cognitive computing-based COVID-19 detection on Internet of things-enabled edge computing environment
date: 2021-11-18
journal: Soft comput
DOI: 10.1007/s00500-021-06514-6
sha: 65ae3391a6a4fa4ba7f273c10b93e49cb917ef11
doc_id: 749337
cord_uid: h363rb5w

In the current pandemic, smart technologies such as cognitive computing, artificial intelligence, pattern recognition, chatbots, wearables, and blockchain can support the collection, analysis, and processing of medical data for decision making. In particular, cognitive computing aids medical professionals in disease diagnosis by rapidly processing massive quantities of data and generating customized smart recommendations. At the same time, the world is facing the COVID-19 pandemic, and early detection is essential to reduce the mortality rate. Deep learning (DL) models are useful in assisting radiologists in examining large quantities of chest X-ray (CXR) images. However, they require a large amount of training data, which normally has to be centralized for processing. Federated learning (FL) can instead be used to generate a shared model for DL-based COVID-19 detection without sharing local data. In this view, this paper presents a federated deep learning-based COVID-19 (FDL-COVID) detection model on an IoT-enabled edge computing environment. First, the IoT devices capture the patient data, and the DL model is designed using the SqueezeNet model. The IoT devices upload the encrypted variables to the cloud server, which then performs FL on the major variables using the SqueezeNet model to produce a global cloud model. Moreover, the glowworm swarm optimization algorithm is used to tune the hyperparameters of the SqueezeNet architecture. A wide range of experiments were conducted on a benchmark CXR dataset, and the outcomes are assessed with respect to different measures. The experimental outcomes showed the enhanced performance of the FDL-COVID technique over the other methods.

Recently, cognitive computing has rapidly transformed the healthcare industry by assisting physicians in the treatment of diseases and improving patient services. Cognitive computing examines huge quantities of data promptly to respond to particular queries and offer intelligent recommendations. At the same time, the rapid growth of social networking and Internet of things (IoT) applications has produced a dramatic increase in the data created at network edges (Wang et al. 2019). It is anticipated that the data generation rate will soon surpass the capacity of the present Internet (Chiang and Zhang 2016). Because of data privacy concerns and limited network bandwidth, it is impractical and often unnecessary to send all of the data to a remote cloud (Kelly 2016).

Local data processing and storage with global management is made possible by the emerging technology of mobile edge computing (MEC) (Kelly 2016), in which edge nodes such as home gateways, sensors, small cells, and micro-servers are equipped with computation and storage capacity. Multiple edge nodes collaborate with the remote cloud to perform large-scale distributed tasks that involve both local processing and remote coordination and execution. To analyze huge amounts of data and obtain useful information for the prediction, detection, and classification of upcoming events, ML methods are frequently employed. This form of distributed ML over a federation of edge nodes is called federated learning (FL) (Mao et al. 2017). Figure 1 shows the framework of the IoT-enabled MEC system.

Communicated by Gopal Chaudhary.

The COVID-19 pandemic has caused continual harm to the health and daily lives of people globally. Studies on diagnosing and detecting COVID-19 patients are therefore highly significant (Shankar et al. 2021; Dash et al. 2021). The main clinical manifestations of COVID-19 pneumonia include systemic pain, fever, chills, and dry cough. Some people have abdominal symptoms. Hence, it is essential to test more people without delay (Satpathy et al. 2021; Khadidos et al. 2020). ML and computer vision technologies play a significant part in this process.

DL has contributed greatly to image classification in the healthcare sector, and it is becoming an efficient tool that helps physicians analyze and judge a patient's condition. The key requirement for obtaining a robust and accurate deep model is massive and widely varied training data. Gathering such training data is one of the main problems, and to some extent it has left DL methods for detecting COVID-19 with insufficient data. FL is one of the available ways of addressing this problem. FL is an architecture for learning across multiple organizations without sharing any individual's data. It can therefore address the fundamental challenges of data silos and data privacy, and FL applications in medical big data are a promising line of study. The basic idea of FL is to use the datasets distributed over multiple devices to collectively build a shared model without requiring the local raw data to be exchanged. This protects personal data.

This paper presents a federated deep learning-based COVID-19 (FDL-COVID) detection model on an IoT-enabled edge computing environment. Initially, the FDL-COVID technique enables the IoT devices to capture the patient data, and the DL model is designed based on the SqueezeNet architecture. The IoT devices upload the encrypted variables to the cloud server, which then performs FL on the major variables using the SqueezeNet model to produce a global cloud model. Furthermore, a glowworm swarm optimization (GSO) algorithm-based hyperparameter optimizer is applied for optimal selection of the hyperparameters of the SqueezeNet model. A wide range of experimental analysis is performed on the CXR dataset, and the outcomes are evaluated with respect to diverse measures.

Liu et al. (2020) proposed a scheme that uses FL for training on COVID-19 data and carried out experiments to verify its efficiency. They also compared the performance of four common models (COVID-Net, MobileNet, ResNet18, and ResNeXt) with and without the FL architecture. Park et al. (2021) proposed a method that uses PSO, rather than the FedAvg scheme commonly used in FL, to update the global model from the weights of the locally learned models. The method, called FedPSO, improves robustness on unstable networks by transferring score values instead of the larger weights. Dou et al. (2021) examined the feasibility of the FL technique for detecting COVID-19-related CT abnormalities, with external validation on patients from a multinational study.

Abdul Salam et al. (2021) investigated the efficiency of FL versus conventional learning by developing two ML models (one federated and one conventional) with Keras and TensorFlow Federated. During model training, they recorded and plotted the predictive loss and accuracy for every training round to determine which factors affect model performance, such as the model optimizer, activation function, data size, number of rounds, and learning rate. They found that the softmax activation function and the SGD optimizer gave the best predictive loss and accuracy, that altering the number of rounds and the learning rate had a slight influence on loss and accuracy, and that increasing the data size had no effect on them. Feki et al. (2021) presented a collaborative FL architecture that allows multiple medical organizations to screen for COVID-19 from chest X-ray images using DL without sharing personal data. They examined several important properties of FL settings, including the non-IID and unbalanced data distributions that arise naturally. Qayyum et al. (2021) used the emerging concept of CFL for automated diagnosis of COVID-19. They evaluated the performance of the presented architecture under distinct experimental settings on two standard datasets. Zhang et al. (2021) presented a new dynamic fusion-based FL method for medical diagnostic image analysis to identify COVID-19 infections. First, they designed a framework for a dynamic fusion-based FL system for analyzing medical diagnostic images. They also summarized a categorization of the medical diagnostic image datasets for COVID-19 detection that can be used by ML methods for image analysis. Kumar et al. (2021) proposed an architecture that gathers a small quantity of data from distinct sources and trains a global DL model using blockchain-based FL.
The blockchain technique validates the data, and FL trains the model globally while maintaining the confidentiality of each organization. First, they proposed a data normalization method that handles data heterogeneity, since the data are collected from distinct medical institutions using distinct types of CT scanner. Next, they used CapsNet-based classification and segmentation for detecting COVID-19 patients. Lastly, they designed a technique that can collectively train a global model using blockchain with FL while maintaining privacy. Vaid et al. (2021) used FL and ML methods, which avoid aggregating raw medical data from multiple organizations in one place, to predict the death of hospitalized COVID-19 patients within 7 days. Logistic regression (LR) with L1 regularization (LASSO) and MLP models were trained using the local data at each site.

FL is a learning approach proposed by Koněcný et al. (2017) for distributed datasets. It involves training a model on datasets distributed across several devices while avoiding leakage of data. Its benefits are that it enhances privacy and decreases transmission costs. With FL, ANN models can be trained without data breaches or exposure of private data. In contrast, transmitting the entire data from many devices to a central server increases storage costs and network traffic. FL considerably decreases the transmission cost by exchanging only the weights obtained from the trained models. Figure 2 summarizes the FL procedure.

1. The server transmits the learning model to all the clients.
2. The received models are trained on user data.
3. Every user transmits its trained model to the server.
4. The server collects the gathered models and aggregates them into an updated model.
5. The server transmits the updated model to all the clients, and steps 1 to 5 are repeated.
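The five steps above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: it assumes a linear least-squares model in place of a neural network, plain gradient descent on each client, and sample-weighted averaging (the FedAvg rule) as the aggregation step. The names `local_update`, `federated_round`, `run_fl`, and `make_clients` are hypothetical.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Step 2: train the received model on one client's private data.
    Assumption: linear least-squares model trained by gradient descent."""
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_round(global_w, clients):
    """Steps 1, 3 and 4: broadcast, collect the trained models, aggregate.
    Assumption: aggregation is a sample-weighted mean (FedAvg)."""
    updates, sizes = [], []
    for X, y in clients:
        updates.append(local_update(global_w, X, y))
        sizes.append(len(y))
    weights = np.array(sizes, dtype=float)
    return np.average(np.stack(updates), axis=0, weights=weights)

def run_fl(clients, dim, rounds=50):
    """Step 5: repeat the round with the updated global model."""
    w = np.zeros(dim)
    for _ in range(rounds):
        w = federated_round(w, clients)
    return w

def make_clients(true_w, sizes, seed=0):
    """Synthetic, noise-free client datasets; raw data never leaves this list."""
    rng = np.random.default_rng(seed)
    clients = []
    for n in sizes:
        X = rng.normal(size=(n, len(true_w)))
        clients.append((X, X @ true_w))
    return clients
```

Only model weights cross the client/server boundary in `federated_round`; the arrays `X` and `y` stay with each client, which is the property the procedure relies on.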

In this study, assume $N$ owners of COVID-19 CXR images, $F_1, F_2, \ldots, F_N$, each of whom wishes to train a model using its own data $D_1, D_2, \ldots, D_N$. The traditional approach pools all of the data and uses $D = D_1 \cup D_2 \cup \cdots \cup D_N$ to train a model $M_{SUM}$. In FL, the data owners instead collectively train a model $M_{FED}$, and no data owner $F_i$ reveals its personal data $D_i$ to the others. Moreover, the accuracy of $M_{FED}$, denoted $V_{FED}$, must be close to the accuracy $V_{SUM}$ of $M_{SUM}$: with $\epsilon$ a nonnegative real number, FL is considered acceptable when $|V_{FED} - V_{SUM}| < \epsilon$. Figure 3 illustrates the overall framework of the proposed model. FL is a distributed learning technique. The server maintains the overall model and distributes it to several user terminals (UTs). The server sets a score $S$ and selects UTs on the basis of this proportion to update its central model. The selected UTs then upload their locally improved model parameters to the server, which uses them to update the server model parameters.

The updated server model is then distributed to the UTs to improve the UT models. This approach preserves the privacy and accuracy of the UTs, exploits the computing power of the UTs and the large amount of client data for learning, and maintains a high-quality central model. Clients train on their own private data and provide the trained local model to the FL server. In general, the FL training procedure comprises the following three phases. Here, the local model denotes the model trained on each participating device, and the global model denotes the model obtained after aggregation by the FL server.

• Step 1: Task initialization. The server defines the training task, i.e., the target application and the corresponding data requirements. At the same time, the server specifies the global model and sets the training hyperparameters, for example the learning rate. The server then sends the initialized global model $W_G^0$ and the training task to the participating clients to complete the task distribution.
• Step 2: Local model training and update. Training starts from the global model $W_G^t$, where $t$ denotes the current iteration index, and every participating client uses its local data and devices to update the local model parameters $W_i^t$. The goal of participant $i$ in iteration $t$ is to find the optimal parameters $W_i^t$ that minimize the loss function $L(W_i^t)$.
• Step 3: Global model aggregation and update. The server aggregates the local models of the participating clients and sends the updated global model parameters $W_G^{t+1}$ back to the clients that hold the data.
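In the notation of the three steps, the local goal in Step 2 and one common choice for the aggregation in Step 3 can be written as follows. The sample-weighted average is the FedAvg rule; it is an assumption here, since the text does not state the aggregation formula. The symbols $K$ (number of participating clients) and $n_i$ (number of local samples held by client $i$) are introduced only for illustration.

```latex
W_i^{t} \leftarrow \arg\min_{W} L(W) \quad \text{(local training, started from } W_G^{t}\text{)}, \qquad
W_G^{t+1} = \sum_{i=1}^{K} \frac{n_i}{\sum_{k=1}^{K} n_k}\, W_i^{t}
```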

In this work, SqueezeNet is used for both the cloud model and the client-side models. To suit the characteristics of FL, this study modifies the local SqueezeNet network: the key parameters before the last hidden layer are shared, while the parameters between the last hidden layer and the output layer are not shared. The details are described in the following subsections.
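The split between shared and private parameters can be illustrated with a small sketch. It assumes that a model is represented as a dictionary mapping layer names to NumPy arrays; the layer names (`fire2`, `classifier`) and the helper names `split_params` and `merge_global` are hypothetical and are not taken from the paper.

```python
import numpy as np

def split_params(params, private_keys):
    """Separate the parameters a client uploads from those it keeps.
    `private_keys` names the layers after the last hidden layer."""
    shared = {k: v for k, v in params.items() if k not in private_keys}
    private = {k: v for k, v in params.items() if k in private_keys}
    return shared, private

def merge_global(local_params, global_shared):
    """Overwrite only the shared layers with the global values;
    the client's private output layer is left untouched."""
    merged = dict(local_params)
    merged.update({k: v.copy() for k, v in global_shared.items()})
    return merged

# Illustrative usage with made-up layer names and shapes.
local = {"fire2": np.ones(3), "classifier": np.zeros(2)}
shared, private = split_params(local, {"classifier"})
updated = merge_global(local, {"fire2": np.full(3, 2.0)})
```

Keeping the output layer local lets each client retain a classifier head fitted to its own data while still benefiting from globally learned features.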

The cloud server uses public data and the parameters uploaded by the clients to establish a global cloud model $f_s$. The training objective is

$$\arg\min_{\Theta} \mathcal{L} = \sum_{i=1}^{n} \ell\big(y_i, f_s(x_i)\big),$$

where $\ell(\cdot, \cdot)$ represents the loss function of the trained model, for example the cross-entropy loss, $\{x_i, y_i\}$ denotes sample $x_i$ and its corresponding label $y_i$, and $n$ is the sample size of the public data. $\Theta$ denotes the parameter matrix to be learned, comprising the weights and biases of the hidden layers. Once the cloud model has been determined, the parameters $\Theta$ are distributed to every user.

Each user creates a local SqueezeNet model with the same structure as the cloud model. The training procedure remains largely the same, except that the sample data are comparatively small and private. For a user $u$, the local SqueezeNet model is denoted $f_u$, and the objective function is

$$\arg\min_{\Theta_u} \mathcal{L}_u = \sum_{i=1}^{n_u} \ell\big(y_i^u, f_u(x_i^u)\big),$$

where $\{x_i^u, y_i^u\}$ are the private samples of user $u$ and $n_u$ is their number.

The key parameters of each local SqueezeNet, $\Theta_u$, are uploaded to the cloud in encrypted form. The cloud uses the parameter set $\{\Theta_1, \Theta_2, \ldots, \Theta_n\}$ collected from the users to update the global cloud model and its parameters $\Theta$, and then distributes the updated parameters $\Theta$ to each user. Depending on actual requirements, the local parameters can be updated regularly, for example every night. This procedure is a dynamic, iterative optimization that continually improves the detection capability of the model.
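As a concrete reading of the objective, the sketch below evaluates a summed cross-entropy loss for a batch of model outputs. It assumes that the model returns one row of unnormalized class scores (logits) per sample and that labels are integer class indices; the function names are hypothetical.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy_objective(logits, labels):
    """Sum over samples of the cross-entropy between label and prediction,
    i.e. the quantity minimized over the model parameters."""
    probs = softmax(np.asarray(logits, dtype=float))
    labels = np.asarray(labels)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked).sum())

# Three classes (e.g. normal, pneumonia, COVID-19), two samples.
uninformed = cross_entropy_objective(np.zeros((2, 3)), [0, 2])
confident = cross_entropy_objective(np.array([[5.0, 0.0, 0.0],
                                              [0.0, 0.0, 5.0]]), [0, 2])
```

The same function applies to the cloud objective and to each user's objective; only the data passed in (public versus private samples) differs.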

In this study, the SqueezeNet architecture is employed for the detection of COVID-19 from CXR images. The CNN is one of the most common DL techniques and is mainly used for image classification problems. It generally contains five types of layers: input, convolution, pooling, fully connected, and output layers. The practical advantage of the CNN is that it has fewer parameters, which considerably decreases the learning time and the amount of data required to train the model. Additionally, a CNN can be trained end to end for feature selection and extraction from images and can finally be used for predicting or classifying the image. Because networks such as AlexNet and VGGNet have ever larger numbers of parameters, the SqueezeNet network was proposed to keep the number of parameters low while maintaining accuracy (Xu et al. 2020).

The fire module is the basic building block of SqueezeNet, and its structure is displayed in Fig. 4. The module is divided into a squeeze layer and an expand layer. The squeeze layer has $S$ $1 \times 1$ convolution kernels. The expand layer has $1 \times 1$ and $3 \times 3$ convolution kernels; the number of $1 \times 1$ kernels is $E_{1\times1}$, and the number of $3 \times 3$ kernels is $E_{3\times3}$. The module must satisfy $S < (E_{1\times1} + E_{3\times3})$. Min utilized an MLP rather than the conventional linear convolution kernel to improve the expressive power of networks. When the numbers of input and output channels are large, the number of convolution kernel parameters becomes large. Inception modules therefore add a $1 \times 1$ convolution that reduces the number of input channels, so that both the convolution kernel parameters and the computational complexity decrease. Finally, a $1 \times 1$ convolution is added to increase the number of channels and improve feature extraction (Iandola et al. 2016). SqueezeNet replaces $3 \times 3$ convolutions with $1 \times 1$ convolutions, which reduces the number of parameters of each replaced kernel to one-ninth.
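The parameter savings described above can be checked with simple arithmetic. The sketch below counts the weights and biases of a fire module; the channel sizes used in the example (96 input channels, $S = 16$, $E_{1\times1} = E_{3\times3} = 64$) are illustrative assumptions, and the function names are hypothetical.

```python
def conv_params(in_ch, out_ch, k, bias=True):
    """Number of learnable parameters in a k x k convolution layer."""
    return in_ch * out_ch * k * k + (out_ch if bias else 0)

def fire_params(in_ch, s, e1, e3):
    """Parameters of a fire module: a 1x1 squeeze layer followed by
    parallel 1x1 and 3x3 expand layers that both read the squeezed output."""
    if not s < e1 + e3:
        raise ValueError("fire module requires S < E_1x1 + E_3x3")
    squeeze = conv_params(in_ch, s, 1)
    expand_1x1 = conv_params(s, e1, 1)
    expand_3x3 = conv_params(s, e3, 3)
    return squeeze + expand_1x1 + expand_3x3

# Fire module versus a plain 3x3 convolution with the same 128 output channels.
fire_total = fire_params(96, 16, 64, 64)
plain_total = conv_params(96, 128, 3)
```

With these sizes the fire module needs roughly a tenth of the parameters of the plain $3 \times 3$ layer, which comes from both the squeeze step and the partial replacement of $3 \times 3$ kernels by $1 \times 1$ kernels.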

To optimally adjust the hyperparameters of the SqueezeNet model, the GSO algorithm is employed, which in turn improves the classification performance. In GSO, a swarm of glowworms is initialized at random positions so that it is well distributed over the solution space. Each glowworm (agent) carries a luminescent quantity known as luciferin, with an identical initial value for all agents. The intensity of the emitted light is tied to the luciferin level, which in turn is associated with the location at which the glowworm is placed during its motion. Each glowworm has a dynamic decision domain $r_d^i(t)$ bounded by a spherical sensor range $r_s$ ($0 < r_d^i \le r_s$). A glowworm $i$ regards another glowworm $j$ as a neighbor, in a probabilistic manner, when $j$ lies within the current local decision domain of $i$. Glowworms that emit a higher quantity of luciferin attract more glowworms to move toward them.

Initially, all glowworms carry the same luciferin level $l_0$. Depending on the luciferin values, glowworm $i$ chooses a neighbor $j$ with probability $p_{ij}$ and moves toward it within the decision range $r_d^i$ ($0 < r_d^i \le r_s$), where the location of glowworm $i$ is denoted by $x_i$ ($x_i \in \mathbb{R}^m$, $i = 1, 2, \ldots, n$). Each iteration comprises a luciferin update stage followed by a movement stage based on a transition rule. First, luciferin proportional to the objective value measured at the glowworm's current position is added. At the same time, a fraction of the luciferin value is subtracted to simulate the decay of luciferin with time. The luciferin update is given by

$$l_i(t) = (1 - \rho)\, l_i(t-1) + \gamma\, J\big(x_i(t)\big),$$

where $l_i(t)$ signifies the luciferin level of glowworm $i$ at time $t$, $\rho$ denotes the luciferin decay constant ($0 < \rho < 1$), $\gamma$ signifies the luciferin enhancement constant, and $J(x_i(t))$ indicates the value of the objective function at the position of agent $i$ at time $t$. In the GSO method, glowworms are attracted by neighbors that glow brighter. As a result, during the movement stage each glowworm uses a probabilistic rule to move toward a neighbor with a higher luciferin intensity. For each glowworm $i$, the probability of moving toward a neighboring glowworm $j$ is

$$p_{ij}(t) = \frac{l_j(t) - l_i(t)}{\sum_{k \in N_i(t)} \big(l_k(t) - l_i(t)\big)},$$

where $j \in N_i(t)$, and $N_i(t) = \{ j : d_{ij}(t) < r_d^i(t),\ l_i(t) < l_j(t) \}$ represents the set of neighbors of glowworm $i$ at time $t$; $d_{ij}(t)$ signifies the Euclidean distance between glowworms $i$ and $j$ at time $t$, and $r_d^i(t)$ indicates the variable neighborhood range of glowworm $i$ at time $t$. This range is bounded by the radial sensor range ($0 < r_d^i \le r_s$). The discrete-time model of the glowworm movement is then

$$x_i(t+1) = x_i(t) + s \left( \frac{x_j(t) - x_i(t)}{\lVert x_j(t) - x_i(t) \rVert} \right),$$

where $s$ ($> 0$) denotes the step size and $\lVert \cdot \rVert$ represents the Euclidean norm operator. Besides, $x_i(t) \in \mathbb{R}^m$ indicates the position of glowworm $i$ at time $t$ in the $m$-dimensional real space $\mathbb{R}^m$. The neighborhood range update phase is used for detecting multiple peaks in a multimodal function landscape. Let $r_0$ be the initial neighborhood range of every glowworm (i.e., $r_d^i(0) = r_0$, $\forall i$). To dynamically update the decision range of every glowworm, the following rule is used:

$$r_d^i(t+1) = \min\big\{ r_s,\ \max\big\{ 0,\ r_d^i(t) + \beta \big( n_t - \lvert N_i(t) \rvert \big) \big\} \big\},$$

where $\beta$ indicates a constant and $n_t$ is a parameter used to control the number of neighbors.
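The GSO stages described above (luciferin update, probabilistic movement, and decision range update) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the objective is maximized, the search space is a box, and all default parameter values and the function name `gso` are hypothetical. For hyperparameter tuning, the objective would be a validation score of the network as a function of the hyperparameter vector, which is not shown here.

```python
import numpy as np

def gso(objective, n_agents=30, dim=2, bounds=(-3.0, 3.0), iters=100,
        rho=0.4, gamma=0.6, beta=0.08, n_t=5, step=0.03,
        l0=5.0, r0=1.5, r_s=3.0, seed=0):
    """Glowworm swarm optimization (maximization) on a box-shaped domain."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    x = rng.uniform(lo, hi, size=(n_agents, dim))   # random initial positions
    lucif = np.full(n_agents, l0)                   # identical initial luciferin
    r_d = np.full(n_agents, r0)                     # initial decision ranges
    for _ in range(iters):
        # Luciferin update: decay plus reward from the objective value.
        fitness = np.array([objective(p) for p in x])
        lucif = (1.0 - rho) * lucif + gamma * fitness
        new_x, new_r = x.copy(), r_d.copy()
        for i in range(n_agents):
            # Neighbors: inside the decision range and strictly brighter.
            dist = np.linalg.norm(x - x[i], axis=1)
            nbrs = np.flatnonzero((dist < r_d[i]) & (lucif > lucif[i]))
            if nbrs.size > 0:
                # Pick a neighbor with probability proportional to the
                # luciferin difference, then take one fixed-size step.
                diff = lucif[nbrs] - lucif[i]
                j = rng.choice(nbrs, p=diff / diff.sum())
                direction = x[j] - x[i]
                norm = np.linalg.norm(direction)
                if norm > 0:
                    new_x[i] = x[i] + step * direction / norm
            # Decision range update, clipped to [0, r_s].
            new_r[i] = min(r_s, max(0.0, r_d[i] + beta * (n_t - nbrs.size)))
        x = np.clip(new_x, lo, hi)
        r_d = new_r
    return x, lucif, r_d
```

A usage example on a toy objective would be `positions, luciferin, ranges = gso(lambda p: -np.sum(p ** 2), iters=50)`, after which the glowworm with the largest objective value gives the best candidate found.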

The proposed FDL-COVID technique is validated on the freely accessible COVIDx dataset. The parameter settings of the proposed model are as follows: batch size 500, maximum epochs 15, learning rate 0.05, dropout rate 0.2, and momentum 0.9. The dataset contains 15,282 images, of which 13,703 are used to train the model and the remaining 1579 are used to test it. The dataset comprises images of three classes, namely normal, pneumonia, and COVID-19. First, the sensitivity of the various models under FL is analyzed in Table 1 and Fig. 5. The table values show that the MN-v2 model gives the lowest outcome, with sensitivities of 0.912 and 0.868 on the training and testing sets, respectively.

The COVID-Net and Res-NXT techniques show slightly better performance, with somewhat higher sensitivity values on the training and testing sets. The RN-18 technique attains moderate performance, with sensitivities of 0.962 and 0.913 on the training and testing sets, respectively. The FDL-COVID technique attains the best performance, with sensitivities of 0.976 and 0.965 on the training and testing sets, respectively.

Next, the perplexity of the different methods under FL is analyzed in Table 2. The table values show that the MN-v2 method gives the worst results, with perplexity values of 0.949, 0.872, and 0.503 on the normal, pneumonia, and COVID-19 classes, respectively. The COVID-Net and Res-NXT methods show slightly better results, with somewhat higher perplexity values on the three classes. The RN-18 approach attains moderate performance, with perplexity values of 0.962, 0.927, and 0.736 on the normal, pneumonia, and COVID-19 classes, respectively. The FDL-COVID algorithm achieves the highest performance, with perplexity values of 0.987, 0.949, and 0.898 on the normal, pneumonia, and COVID-19 classes, respectively.

Next, the loss convergence speed of the different techniques under FL is analyzed in Table 3 and Fig. 6. Figure 7 shows the confusion matrices generated by the existing techniques. The COVID-Net technique classified 531 images as pneumonia, 854 images as normal, and 23 images as COVID-19. The MN-v2 model classified 555 images as pneumonia, 777 images as normal, and 39 images as COVID-19. The RN-18 method classified 555 images as pneumonia, 845 images as normal, and 41 images as COVID-19. Lastly, the Res-NXT algorithm classified 564 images as pneumonia, 805 images as normal, and 58 images as COVID-19.

The confusion matrix produced by the FDL-COVID technique on the benchmark images is shown in Fig. 8. The figure shows that the FDL-COVID technique classified 575 images as pneumonia, 866 images as normal, and 66 images as COVID-19.

A detailed comparison of the classification results of the FDL-COVID technique with the existing techniques is given in Table 4 and Fig. 9. The proposed FDL-COVID technique achieves the maximum performance, with an average sensitivity of 0.869, specificity of 0.974, and accuracy of 0.970. A further comparison of the classification outcomes of the FDL-COVID technique with existing methods is given in Table 5 and Fig. 10.
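For reference, per-class sensitivity, specificity, and accuracy can be derived from a multi-class confusion matrix in a one-vs-rest manner, as sketched below. The assumption is that rows hold true classes and columns hold predicted classes. The matrix in the example is made up purely for illustration and does not reproduce the values in the paper's figures or tables; the function name is hypothetical.

```python
import numpy as np

def per_class_metrics(cm):
    """One-vs-rest metrics from a confusion matrix
    (rows = true class, columns = predicted class)."""
    cm = np.asarray(cm, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)                 # correctly classified per class
    fn = cm.sum(axis=1) - tp         # missed samples of the class
    fp = cm.sum(axis=0) - tp         # other classes predicted as this class
    tn = total - tp - fn - fp
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
    }

# Illustrative 3-class matrix (hypothetical counts).
toy_cm = [[8, 1, 1],
          [2, 7, 1],
          [0, 1, 9]]
metrics = per_class_metrics(toy_cm)
average_sensitivity = float(metrics["sensitivity"].mean())
```

Averaging each metric over the classes yields summary figures of the kind quoted as average sensitivity, specificity, and accuracy.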

This paper has presented a new FDL-COVID technique to detect and classify COVID-19 on an IoT-enabled MEC environment. The proposed FDL-COVID technique involves an IoT-based data acquisition process in which the CXR images of the patients are collected. In addition, the SqueezeNet method is employed for the detection and classification of COVID-19 from CXR images. The IoT devices upload the encrypted variables to the cloud server, which then performs FL on the major variables using the SqueezeNet model to produce a global cloud model. Finally, a GSO algorithm-based hyperparameter optimizer is applied for optimal selection of the hyperparameters of the SqueezeNet model, which in turn considerably enhances the COVID-19 detection outcomes. An extensive set of simulations was carried out on the benchmark CXR dataset, and the proposed model outperformed the existing techniques with a maximum accuracy of 0.9700. As part of future work, data offloading and resource management approaches for the IoT-enabled MEC platform can be investigated.

COVID-19 detection using federated machine learning

Fog and IoT: an overview of research opportunities

A deep learning method to forecast COVID-19 outbreak

Federated deep learning for detecting COVID-19 lung abnormalities in CT: a privacy-preserving multinational validation study

Federated learning for COVID-19 screening from chest X-ray images

SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size

Internet of Things data to top 1.6 zettabytes by 2020

Analysis of COVID-19 infections on a CT image using deepsense model

Federated learning: strategies for improving communication efficiency

Blockchain-federated-learning and deep learning models for COVID-19 detection using CT imaging

A survey on mobile edge computing: the communication perspective

Identification of pneumonia disease applying an intelligent computational framework based on deep learning and machine learning techniques

FedPSO: federated learning using particle swarm optimization to reduce communication costs

Collaborative federated learning for healthcare: Multi-modal covid

Predicting mortality rate and associated risks in COVID-19 patients

Automated COVID-19 diagnosis and classification using convolutional neural network with fusion based feature extraction model

Federated learning of electronic health records to improve mortality prediction in hospitalized patients with COVID-19: machine learning approach

Adaptive federated learning in resource constrained edge computing systems

Fed-SCNN: a federated shallow-CNN recognition framework for distracted driving

Electronic component recognition algorithm based on deep learning with a faster SqueezeNet

Dynamic fusion-based federated learning for COVID-19 detection



Laxmi Lydia elaxmi2002@yahoo

Selim m.selim@pssu.edu.sa 1 Department of Computer Science and Engineering, Vignan's Institute of Information Technology (Autonomous)

Acknowledgements This work was supported by the Taif University Researchers Supporting Project number (TURSP-2020/126), Taif University, Taif, Saudi Arabia.

Conflict of interest The authors declare that they have no conflict of interest. The manuscript was written through contributions of all authors. All authors have given approval to the final version of the manuscript.

Data availability Data sharing is not applicable to this article as no datasets were generated during the current study.

Ethical approval This article does not contain any studies with human participants performed by any of the authors.

Informed consent Informed consent was obtained from all individual participants included in the study.