key: cord-0473981-7ibvlt6v
authors: Kamalov, Firuz; Cherukuri, Aswani; Sulieman, Hana; Thabtah, Fadi; Hossain, Akbar
title: Machine learning applications for COVID-19: A state-of-the-art review
date: 2021-01-19
journal: nan
DOI: nan
sha: 2243634f5f96b6e1ff2dd0a9f814664b27f77c69
doc_id: 473981
cord_uid: 7ibvlt6v

The COVID-19 pandemic has galvanized the machine learning community to create new solutions that can help in the fight against the virus. The body of literature related to applications of machine learning and artificial intelligence to COVID-19 is constantly growing. The goal of this article is to present the latest advances in machine learning research applied to COVID-19. We cover four major areas of research: forecasting, medical diagnostics, drug development, and contact tracing. We review and analyze the most successful state-of-the-art studies. In contrast to other existing surveys on the subject, our article presents a high-level overview of the current research that is sufficiently detailed to provide informed insight.

The field of machine learning has made tremendous progress over the past decade. Improved deep learning algorithms coupled with increased computational capacity have catalyzed the rapid growth of the field. As a result, machine learning has been used in a diverse array of applications. Arguably the most crucial application of machine learning has been in the fight against the COVID-19 pandemic. Researchers have aggressively, and often successfully, pursued a number of different avenues using machine learning to battle COVID-19. A range of machine learning applications have been developed to tackle various issues related to the virus. In this paper, we present the latest results and achievements of the machine learning community in the battle against the global pandemic. In contrast to other existing surveys on the subject, we provide a general overview that is nuanced enough to provide substantial insight. Our survey includes preprint works to ensure the most up-to-date coverage of the topics. The current applications of machine learning to COVID-19 can be divided into four groups:

• forecasting
• medical diagnostics
• drug development
• contact tracing

Deep learning algorithms have been successfully deployed to forecast the number of new infections. Recurrent neural networks have shown superior performance in time-series forecasting over traditional approaches such as ARIMA models. Researchers have used recurrent networks, and their variant long short-term memory networks, to successfully model the spread of the infection and predict the future number of infections in a population. Arguably the most important application of machine learning is in the field of medical diagnostics, which is made possible by the advances in computer vision. Machine learning has achieved near human-level accuracy in many image recognition tasks. Therefore, it is no surprise that image recognition software is successfully being used to detect signs of COVID-19 in patient chest X-ray images. In many parts of the world where an effective clinical testing procedure is unavailable or unaffordable, chest X-ray images and CT scans provide the only option to diagnose the virus. Studies have shown that deep learning approaches can diagnose COVID-19 based on chest X-ray images with over 99% accuracy. Smart contact tracing using artificial intelligence has helped authorities locate potentially infected persons. A number of software solutions based on artificial intelligence are currently in use to trace the spread of the virus. Machine learning has been used to help guide researchers to new discoveries in pharmacology. In particular, variational autoencoders have the ability to analyze perturbations in chemical composition that can lead to possible new medicines. Applying autoencoders to the existing flu vaccines can help identify potential avenues to creating a COVID-19 vaccine.

Date: January 21, 2021

The challenge to fight off the global pandemic and help humanity has spurred researchers across disciplines. In an effort to accelerate scientific research on COVID-19, the publishing community has made all the related publications freely available to the public. As a result, we are able to access and assess all the current research and present our survey to the readers. Our goal is to provide a quick, but sufficiently detailed, overview of the current state of the art in machine learning research applied to COVID-19. We hope our survey will supply the reader with the necessary information to facilitate a deeper investigation into the topic.

The paper is structured as follows. In Section 2, we discuss the use of machine learning in forecasting the number of new infections. Section 3 discusses the use of deep learning in detection and diagnosis of the infection. Section 4 contains the information about the use of machine learning in drug discovery and development. Section 5 discusses the current research related to the application of machine learning for contact tracing. Finally, Section 6 concludes the paper with a few closing remarks.

Forecasting the number of infections is critical for proper planning and allocation of resources. Modern machine learning (ML) algorithms such as long short-term memory (LSTM) networks have been shown to outperform traditional time series models such as ARIMA and GARCH. As a result, LSTMs have been used in various applications involving time series projections [17, 19]. Several countries employ ML-based software to estimate the number of future infections and the trajectory of the infected population. In this subsection, we will provide an overview of the latest advances in ML related to forecasting the number of COVID-19 infections. The results of our survey are summarized in Table 1 and a more detailed discussion follows below.

A comparative study of ML-based algorithms for COVID-19 forecasting was done in [7]. The authors analyzed a number of evolutionary algorithms such as Genetic Algorithm, Particle Swarm Optimization, and Gray Wolf Optimizer as well as ML algorithms such as the multilayer perceptron (MLP) and the adaptive network-based fuzzy inference system (ANFIS). The models were evaluated on the basis of their accuracy for different prediction lead times. The authors employed data from 5 different countries in their study. Experiments revealed that the MLP and ANFIS algorithms produce the best results, achieving a correlation level of 0.999. A novel approach from Google Research that combines temporal and spatial data is proposed in [21]. Using graph neural networks (GNN) and Google mobility data, the authors uncover the rich interactions between time and space that are often present in the spread of a pandemic. Numerical experiments demonstrate the power of mobility data with the GNN framework. In [31], the authors employ an ensemble neural network to predict the number of confirmed cases and deaths in Mexico. The proposed ensemble network (MNNF) consists of 3 modules: nonlinear autoregressive and function fitting neural networks. The module predictions are combined via a fuzzy integrator, designed to handle uncertainty, into a single output. The method is tested on data from Mexico. The authors carried out experiments to predict the number of confirmed cases and deaths 10 days ahead. Results reveal that the MNNF method outperforms single neural network models. The authors in [9] test 3 LSTM-based models to forecast the number of infected individuals for 32 states in India. The tested models include stacked, convolutional, and bi-directional LSTM neural networks. The predictions are made one day and one week ahead. The results show that the bi-directional LSTM produces the best results. Several ML models are compared in [42] to forecast confirmed cases in Brazil and the US. The models under consideration include Bayesian neural network, cubist regression, kNN, random forest, and SVR. In addition, variational mode decomposition (VMD) is applied as a preprocessing step. The authors also consider exogenous variables such as temperature and precipitation. Numerical experiments produce mixed results with no clear favorite. It can only be noted that VMD improves model performance when the prediction horizon is 6 days ahead. The authors in [38] compare statistical and ML approaches to time series forecasting. In particular, they study autoregressive integrated moving average (ARIMA), support vector regression (SVR), and LSTM models to forecast the number of infections, deaths, and recoveries. The model input consists of the data from the previous 110 days. The model is used to predict infections for the next 48 days. The study is based on data from 10 countries. The results show that LSTM models generally outperform ARIMA and SVR. Machine learning approaches do not always outperform traditional methods. In [37], the authors compare classic statistical methods to SVR to predict the number of positive cases, death rate, and recovery rate. The study covers a large number of countries. Results show that statistical models outperform SVR. In [15], the authors apply deep learning to forecast the number of infections and deaths regionally and worldwide. The LSTM models use the last 3 days of observed data to forecast 10 days ahead. In their analysis the authors considered data for the Middle East, Europe, China, and the world as a whole. The results show that the forecasts achieve a root mean square error (RMSE) of 1.5%.
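Several of the studies above frame forecasting as mapping a fixed window of past daily counts to a horizon of future counts (for example, 3 observed days used to forecast 10 days ahead) and score the forecasts with error measures such as RMSE or MAE. The following is a minimal sketch of that framing only, not the code of any surveyed study. The function names, the toy series, and the naive persistence baseline (standing in for a trained model) are illustrative assumptions.

```python
from math import sqrt
from typing import List, Sequence, Tuple

Window = Tuple[List[float], List[float]]


def make_windows(series: Sequence[float], n_in: int, n_out: int) -> List[Window]:
    """Slice a daily series into (past window, future target) pairs."""
    pairs: List[Window] = []
    for start in range(len(series) - n_in - n_out + 1):
        past = list(series[start:start + n_in])
        future = list(series[start + n_in:start + n_in + n_out])
        pairs.append((past, future))
    return pairs


def persistence_forecast(past: Sequence[float], horizon: int) -> List[float]:
    """Naive baseline (an assumption, not an LSTM): repeat the last observed value."""
    return [past[-1]] * horizon


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error between two equally long sequences."""
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy case counts (made-up numbers, for illustration only).
toy_series = [10, 12, 15, 19, 24, 30, 37, 45]
pairs = make_windows(toy_series, n_in=3, n_out=2)
past, future = pairs[0]
baseline = persistence_forecast(past, horizon=2)
print(rmse(future, baseline), mae(future, baseline))
```

In practice the `(past, future)` pairs would be fed to a learned model such as an LSTM, ARIMA, or SVR, and the same error functions would be applied to held-out windows so that the models can be compared on equal terms.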

Table excerpt (forecasting studies):

Objective: Predict the number of new cases 1, 3, and 6 days ahead.
Method: Various ML models (Bayesian neural network, cubist regression, kNN, random forest, and SVR) are considered. In addition, variational mode decomposition (VMD) preprocessing is applied. Exogenous input variables (temperature and precipitation) are also considered.
Data: Johns Hopkins University repository and Brazilian State Health Offices API. Number of daily positive cases for 5 states in the US and Brazil until Apr 28, 2020.
Results: Mixed results with different models achieving the best outcomes on various subsets of data. The best models achieved an out-of-sample forecasting error of 3%.

Objective: Predict the number of positive, death, and recovery cases.
Method: ARIMA, SVR, and LSTM models are compared. The models are applied to data from 10 countries.
Data: China Data Lab, 2020, "World COVID-19 Daily Cases." Number of confirmed, death, and recovery cases over the period of Jan 22, 2020 to Jun 27, 2020.
Results: Mixed results, but in general LSTM appears to produce better results. For instance, LSTM has the lowest MAE values for confirmed cases and deaths, 2.0463 and 0.0095, respectively.

Rustam

Diagnosing COVID-19 infection is a key first step in fighting the virus. The rapid spread of the disease across the globe has made early diagnosis important not only for the individual patient but also for preventing community spread of the disease. Polymerase chain reaction (PCR) tests that are currently employed to detect the presence of the COVID-19 virus require time and capital to administer. Despite recent improvements, PCR tests remain scarce and costly in developing countries and rural areas. PCR tests may further suffer from sample preparation and quality control issues, which can lead to insufficient sensitivity [41]. Therefore, developing alternative approaches to testing is a vital research area. At present there are several ML applications that support the diagnostic process. Deep neural networks have demonstrated the capability to achieve high accuracy in image detection tasks. Consequently, applying deep learning and other ML techniques to X-ray and CT scan images has been one of the most intensely researched areas. In addition, detection approaches based on clinical data have also been tried and tested. Artificial intelligence (AI) based methods augment the diagnosis process and accelerate the treatment of the disease. These models can assist physicians and healthcare professionals not only during testing and treatment but also in planning and managing resources [27]. The results of our survey on the current AI/ML research for COVID-19 diagnostics are summarized in Tables 2 and 3.

Imaging techniques such as X-rays and CT scans are widely used as diagnostic tools for many lung diseases including tuberculosis, lung cancer, and viral pneumonia. CT scan images provide fast and detailed information about the pathology and prognosis of diseases. As a result, ML techniques are being increasingly integrated with imaging and computer vision methods for applications in disease diagnosis. The success of deep learning techniques in detecting and diagnosing various types of pneumonia has already been reported in the literature. The authors in [28] developed a robust model based on a 3-dimensional convolutional neural network (CNN) framework to extract features from CT scan images and distinguish COVID-19 from community-acquired pneumonia. When diagnosing patients in early stages, AI models proved to be successful by integrating both CT scan imaging and clinical information [32]. By combining the output of a CNN model on CT scan images with the output of ML models such as SVM and random forests on clinical data, the accuracy of diagnosis reaches the level of human healthcare experts. CT scan imaging is the diagnostic tool predominantly used in treating pulmonary infections. It has been employed during the current outbreak by many countries in diagnosing COVID-19 patients, particularly at early stages. Further progress was made by Zhou et al. [50], who identified the importance of segmentation and proposed deep learning based models to address these issues in ML-based diagnosis of COVID-19.

Despite the promising research results, there is still a lot of room for growth in ML-based diagnostics. Production-ready applications that can be used in hospitals require further refinement. A great deal of research is yet to be conducted to improve their reliability. The main challenge in deploying AI/ML models for COVID-19 is the generalization ability of these models, a challenge that is also prevalent in AI-based models in other applications. Another major bottleneck in implementing AI/ML-based solutions in healthcare is the availability of patient data samples of the necessary size and quality to train the ML models. In some instances, even when the data is available, its format and structure pose another challenge. Integrating existing research solutions into practical applications and products is another challenge. Finally, it is vital to ensure that the studies and investigations conducted and reported during this pandemic and these pressing times are technically, scientifically, and ethically correct.

A wide array of ML models has been deployed to try to diagnose instances of COVID-19. The list of models includes CNN, RNN, SVM, transfer learning, XGBoost and others. Although these models demonstrate high performance and accuracy, they have limitations such as the lack of sufficient data to train the models and the inability to generalize the results [50]. Despite the ongoing efforts to apply ML/AI in COVID-19 diagnostics, some members of the radiology community have raised concerns regarding possible pitfalls. Laghi [26] has cautioned that while AI/ML should be used for diagnosis of COVID-19, a more objective and precise quantification is required in understanding the lung involvement of the disease. Wynants [46] reviewed the validity and usefulness of the various models published in the literature on COVID-19 diagnosis, prognosis and risk prediction. Their analysis of 145 models in 107 published documents showed that there exists a high risk of bias. The results of these models are probabilistic and hence are not recommended to be adopted for practical use. They call for more rigorous analysis of these models with proper methodological guidance and a description of the populations under study. They also warned that unreliable studies would lead to harmful effects in diagnosis and prognosis of the disease.

Based on a careful review of the existing literature on ML-based diagnostics for COVID-19, we conclude that the proposed models have significant potential. The existing models can be used as stepping stones for building more robust and resilient models that would assist healthcare professionals in diagnosis and decision making. AI/ML researchers should learn from the experiences of this pandemic and focus on developing models in collaboration with healthcare professionals and medical experts. We note that the most important challenge is the availability of data to train the models as well as the treatment of the data. Resolving this issue can have a big impact on the robustness and generalization ability of models for practical applications.

Table excerpt (diagnostic studies):

Objective: Diagnose COVID-19 based on CT scans.
Method: The proposed system consists of two segmentation models: one for lung lesion segmentation and another one for diagnosis prediction. 3D convolutional blocks are used for the classification.
Data: A large database of 532,506 CT scan images of 4,154 patients with COVID-19, common pneumonia, and normal controls.
Results: Results are evaluated using the Dice coefficient and pixel accuracy metrics. The system achieved an accuracy of 92.49%, a sensitivity of 94.93%, and a specificity of 91.13%.

Objective: Analyze the adoptability of CNN techniques for diagnosis of COVID-19 with the help of X-ray images.
Method: Transfer learning with CNN is used over the X-ray image data.
Data: An image dataset of 1427 X-ray images that include COVID-19 positive, common pneumonia, and normal conditions.

Method: Deep CNN network with three stages: the first stage has 3D convolution, batchnorm, and pooling layers; the second stage has two 3D residual blocks; the third stage is a progressive classifier which abstracts the information by 3D max pooling and outputs the probability of COVID-19. CT scan images were pre-processed by a simple 2D UNet to create a 3D lung mask.
Data: Chest CT scans of 540 patients are considered for evaluation. Further, travel and contact history of these patients, symptoms, and clinical lab findings are also considered.

Objective: Automatically detect COVID-19 in X-ray and CT scan images.
Method: Compared the performance of a pool of deep learning based feature extraction models such as DenseNet, MobileNet, ResNet, InceptionV3, and NASNet. Features extracted from these models are fed to classification techniques such as decision trees, random forests, XGBoost, AdaBoost, Bagging classifier, and LightGBM.
Data: A total of 137 images, of which 117 are X-ray and 20 are chest CT scan images. These are data from COVID-19 positive patients, with a similar number from healthy patients.
Results: PMobileNet and Inception V3 with Bagging classifier provided the best classification performance, with 99% for precision, recall, and F-score. However, the data is limited.

Objective: Predict the need and requirements for ventilation during the diagnosis of COVID-19 patients.
Method: XGBoost classifier based method which uses an ensemble learning technique to learn and classify patients into low-risk or high-risk categories.
Data: 197 patients with positive diagnosis of COVID-19.
Results: The specificity and sensitivity results indicate that the model was able to identify and discriminate the patients who require ventilation support.

Results: Significant gain in computational time. Accuracy of the method is shown to be better than other optimization methods.
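Several of the diagnostic studies in this section report accuracy, sensitivity, specificity, and the Dice coefficient. As a reminder of what these figures measure, the sketch below computes them from binary labels (1 = COVID-19 positive, 0 = negative) and from binary segmentation masks. The function names and the toy labels are illustrative assumptions and do not come from any of the surveyed studies. Returning 0.0 for an undefined rate and 1.0 for the Dice score of two empty masks are conventions chosen here.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 marks a positive case."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true positives that the model detects (recall)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def specificity(tn: int, fp: int) -> float:
    """Fraction of true negatives that the model correctly rejects."""
    return tn / (tn + fp) if (tn + fp) else 0.0


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all cases that are classified correctly."""
    total = tp + fp + tn + fn
    return (tp + tn) / total if total else 0.0


def dice_coefficient(mask_a: Sequence[int], mask_b: Sequence[int]) -> float:
    """Dice overlap 2|A and B| / (|A| + |B|) of two flattened binary masks."""
    intersection = sum(1 for a, b in zip(mask_a, mask_b) if a == 1 and b == 1)
    size = sum(mask_a) + sum(mask_b)
    return 2 * intersection / size if size else 1.0


# Toy labels (made up for illustration only).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
tp, fp, tn, fn = confusion_counts(y_true, y_pred)
print(sensitivity(tp, fn), specificity(tn, fp), accuracy(tp, fp, tn, fn))
```

Because accuracy alone can look high on imbalanced data, reporting sensitivity and specificity side by side, as several of the surveyed studies do, gives a clearer picture of missed infections versus false alarms.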

Machine learning algorithms are increasingly being used to search for new chemical combinations that can lead to effective medicine. Artificial intelligence and machine learning techniques have become an integral part of the pharmaceutical world. Integrating these techniques into the complex drug development pipeline has proven to be both cost-effective and less time-consuming. Machine learning techniques are particularly useful as they provide a set of tools that improve the process of drug discovery and development for specific situations with the help of available data that is reliable and of high quality. As a result, a large effort has been under way to apply AI/ML-based solutions in pharmacology. A summary of the survey of the current literature in the field is provided in Table 5. Several pharmaceutical companies have employed ML-based algorithms such as artificial neural networks, support vector machines (SVM), deep learning and many others to develop various drugs and vaccines [36]. The authors in [36] provide a review of recently developed algorithms to design automated drug development pipelines consisting of drug discovery, drug testing and drug re-purposing. In drug discovery, generative adversarial networks (GAN), a class of deep learning models, are used to identify DNA sequences associated with specific functions, and Bayesian optimization (BO) is used to produce proteins of interest at lower cost. In drug testing, sequential decision-making algorithms such as Bayesian multi-armed bandit (MAB) algorithms are used to test several drug candidates and determine the best treatments. In drug re-purposing, text mining methods and graph-based recommender systems are used to identify correlations and predict drug-disease interactions. The authors also compiled a list of relevant data sets for drug development pipeline studies.
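To illustrate the idea behind Bayesian multi-armed bandit algorithms in sequential testing, the sketch below runs Thompson sampling with Beta-Bernoulli posteriors over a few simulated candidates. This is a generic textbook instance of the technique, not the procedure of [36] or of any clinical trial. The response rates, the number of rounds, and the function name are hypothetical values chosen for illustration.

```python
import random
from typing import List, Sequence, Tuple


def thompson_sampling(true_rates: Sequence[float], n_rounds: int,
                      seed: int = 0) -> Tuple[List[int], List[int]]:
    """Simulate Beta-Bernoulli Thompson sampling over len(true_rates) candidates.

    true_rates holds hypothetical success probabilities that are used only to
    simulate outcomes; the algorithm itself never reads them directly.
    """
    rng = random.Random(seed)
    k = len(true_rates)
    successes = [0] * k
    failures = [0] * k
    for _ in range(n_rounds):
        # Draw one plausible success rate per candidate from its Beta posterior
        # (a uniform Beta(1, 1) prior is assumed).
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1) for i in range(k)]
        arm = max(range(k), key=samples.__getitem__)
        # Simulate the outcome of testing the chosen candidate once.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


# Three simulated candidates with made-up response rates.
wins, losses = thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000, seed=1)
pulls = [w + l for w, l in zip(wins, losses)]
print(pulls)
```

The design choice worth noting is that the allocation adapts as evidence accumulates: candidates that look poor are tried less often, so most of the trial budget ends up on the most promising candidate.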

In 2019, the National Institute of Allergy and Infectious Diseases sponsored the first U.S. clinical trial to develop a vaccine against SARS-CoV-2 using an AI-based model [5] . An AI program called synthetic chemist was created to generate trillions of synthetic compounds and another AI-based program called Search Algorithm for Ligands (SAM) was used to sift through the trillions of compounds and determine the most suitable candidates as vaccine adjuvants. With the fast spread of COVID-19, there has recently been a race in utilizing ML techniques and AI capabilities to develop an effective vaccine and antivirals.

The authors in [1] incorporated reverse vaccinology, bioinformatics, immunoinformatics and deep learning strategies to build a computational framework for identifying probable vaccine candidates and constructing an epitope-based vaccine against COVID-19. The screening of viral proteome sequences resulted in the shortlisting of the Spike protein, or Surface Glycoprotein, of SARS-CoV-2 as a potential protein target that can be used to design the vaccine. The physicochemical properties of the protein were further examined using LSTMs, and the results showed that the protein is primarily responsible for the pathophysiology of SARS-CoV-2. The authors proposed that their computational pipeline can be used to design an effective and safe vaccine against COVID-19. In [47], the authors used an in silico analysis to design a potent multi-epitope peptide vaccine against SARS-CoV-2. MLP and SVM algorithms were used to screen for potential epitopes. The vaccine immunogenicity was enhanced using three potent adjuvants, and its tertiary structure was predicted, refined and validated using appropriate strategies. The results showed that the vaccine can interact effectively with toll-like receptors (TLR) 3, 5, and 8 and, through in silico cloning, that it has a high-quality structure, high stability and potential for expression in Escherichia coli. The authors in [35] surveyed existing literature about COVID-19 and vaccine development. They used the Vaxign Reverse Vaccinology (Vaxign-RV) tool and Vaxign-ML, a machine learning-based vaccine candidate prediction and analysis system, to predict and evaluate potential vaccine candidates for COVID-19. The results showed that in addition to the commonly used S protein, the non-structural protein (nsp3) was found to be second highest in protective antigenicity. After further investigating the sequence conservation and immunogenicity of the multi-domain nsp3 protein, the authors concluded that nsp3 can be an effective and safe vaccine target against COVID-19.

For the development of drug treatments for COVID-19, the authors in [10] used a pre-trained deep learning-based drug-target interaction model called Molecule Transformer-Drug Target Interaction (MT-DTI) to predict commercially available antiviral drugs that could be effective against SARS-CoV-2. The model was compared to a CNN-based model called DeepDTA and two traditional machine learning based algorithms, gradient boosting and a regularized least-squares model, using various data sets. MT-DTI showed the best performance in predicting the drug-target interactions and was able to identify various antiviral drugs such as remdesivir, dolutegravir, efavirenz and atazanavir which could potentially be used in the treatment of SARS-CoV-2 infection. In [23], the authors used deep neural networks (DNN) and established an AI platform to identify potential old drugs that could be used against SARS-CoV-2. Different learning data sets consisting of compounds reported or proven active against SARS-CoV, SARS-CoV-2, human immunodeficiency virus (HIV), and influenza virus were generated and used to predict drugs potentially active against coronavirus out of the marketed drugs. The predicted drugs were then tested and verified to serve as feedback to the AI platform for relearning and thus to generate a modified AI model. The implemented AI-based framework was able to identify eight drugs with activities against Feline Infectious Peritonitis (FIP) coronavirus. The authors suggested that, given prior experience with their use in patients, these identified old drugs can potentially be proven to have anti-SARS-CoV-2 activity and hence be applied to fight the COVID-19 pandemic. The authors in [25] analyzed over 10 million compounds using a machine learning pipeline in order to predict chemicals that interfere with SARS-CoV-2 targets. The pipeline involves selection of important physicochemical features for each target using recursive feature elimination algorithms, followed by fitting aggregated multiple support vector machine (SVM) models and a regularized random forest algorithm (regRF) to improve generalizability, and then evaluating model performance using various computational validation methods. The authors concluded that their identified chemicals can accelerate testing of short-term and long-term treatment strategies for COVID-19. The importance of AI and machine learning (ML) techniques that can accelerate the discovery of a possible cure for COVID-19 is discussed in a recent review article by [8]. The review article by [20] focused on the recent advances in COVID-19 drug and vaccine development using artificial intelligence and discussed the potential of intelligent training for the discovery of COVID-19 therapeutics.

Effective contact tracing is a major factor in a virus containment strategy [2]. In conventional contact tracing, a health care professional interviews the infected patient to trace and discover other individuals who may potentially be infected through contact with the patient. The main challenge of the conventional approach is the difficulty for an individual to recall all of their contacts. In addition, the process requires the availability of specialized clinicians using their experience and other resources [13]. Recent technological improvements have allowed the contact tracing process to be optimized with less human intervention in an intelligent approach known as digital proximity (DP) contact tracing. The DP approach utilizes network technologies to identify and locate individuals who could be potentially infected through contact.

With the widespread availability of computing networks and mobile applications, and their associated technologies including smartphones, smartwatches, and others, most of the technology-based contact tracing systems are built on mobile platforms [4, 30]. These systems, named digital contact tracing (DCT), enable a registered user's exposure to be evaluated through wireless signals such as Bluetooth Low Energy. Alternative technology-based tracing systems that are non-mobile and application-based utilize tracking information collected from a variety of sources such as banking transactions, security camera footage, GPS data from vehicles, mobile phones and others to estimate the proximity of an individual to an infected person.

Table excerpt (drug and vaccine development studies):

Method: Vaxign-RV and Vaxign-ML strategies are used. The sequence conservation and immunogenicity of the predicted protein were further investigated.
Data: NCBI and UniProt databases.
Results: S protein had the highest protective antigenicity score. The nsp3 protein, with the second highest antigenicity score, was predicted as an alternate vaccine candidate.

Beck et al.
Objective: Predict binding affinity values between commercially available antiviral drugs and target proteins.
Method: Deep learning-based MT-DTI was compared to DeepDTA, gradient boosting and a regularized least-squares model.
Data: DrugBank, Drug Target Common (DTC) database, BindingDB and NCBI databases.
Results: MT-DTI performed best compared to DeepDTA and ML-based algorithms (SimBoost and KronRLS). Atazanavir is the best chemical compound against the SARS-CoV-2 3C-like proteinase.

Ke et al.

Results: ML-aided molecular docking is one of the most prevalent approaches for virtual screening. 3CLpro is the most popular target for virtual screening. Spike protein has been the most popular candidate for virtual vaccine discovery.

Artificial intelligence and machine learning, in particular deep learning algorithms, have been successfully used in medical diagnosis and screening systems due to their exceptional learning capabilities. In the context of DCT systems, these technologies can be incorporated to aid the decision-making process and improve the detection accuracy of contact tracing. Concretely, the data collected from registered users in the DCT system, such as their daily tracks and geolocations, are explored by the ML algorithm within digital platforms to provide medical professionals and government officials with useful insights. Artificial intelligence and machine learning applications are currently utilized through the entire life cycle of COVID-19, from detection to mitigation [27]. In contact tracing, a virtual AI agent is an alternative to the health professional of classical contact tracing. The virtual AI agent with natural language capabilities can collect the information previously gathered by a health professional. In DCT systems, Bluetooth technology is widely employed as a proximity detector for COVID cases. However, the performance of Bluetooth-based contact tracing apps may be affected by changing signal intensity, which can vary with different mobile devices, mobile positions, body positions, and physical barriers [51]. Generic wireless multipath effects and shadowing are persistent issues which can lead to false positive and false negative identification. To improve the proximity detection accuracy in DCT systems, ML techniques can be used to analyze the Bluetooth signal and other phone sensors' data.

Recently, a 2-stage classifier was proposed that utilizes a vanilla neural network to extract features from a signal emanating from different sources [18]. Employing a deep learning technique directly on a smartphone involves high computational cost and power consumption. Therefore, during the first stage raw data from different sources is converted into fixed-length vectors and stored in the database. In the second stage, the vanilla deep learning algorithm is applied to detect proximity [18]. A similar project under the TC4TL challenge compares several models including Conv 1d [29], support vector machines [40], and decision tree-based algorithms [14] to evaluate the accuracy of Bluetooth-based distance measurement [39, 33]. The performance of the different techniques is compared using the normalized decision cost function (NDCF), which measures proximity detection performance by combining false negatives and false positives; lower values are better. The results show that the Conv 1d network has the lowest NDCF.
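A normalized decision cost function of the kind mentioned above can be sketched as follows. The form used here is the common detection-cost normalization: a weighted sum of the miss rate and the false alarm rate, divided by the cost of the best trivial system that always answers the same way. The weights `c_miss`, `c_fa` and the prior `p_target` are assumptions chosen for illustration and are not taken from the TC4TL evaluation plan. Labels use 1 for "too close" and 0 for "far enough".

```python
from typing import Sequence, Tuple


def miss_and_false_alarm_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (P_miss, P_fa) for binary proximity decisions."""
    n_target = sum(1 for t in y_true if t == 1)
    n_nontarget = len(y_true) - n_target
    misses = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    false_alarms = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    p_miss = misses / n_target if n_target else 0.0
    p_fa = false_alarms / n_nontarget if n_nontarget else 0.0
    return p_miss, p_fa


def normalized_dcf(p_miss: float, p_fa: float, c_miss: float = 1.0,
                   c_fa: float = 1.0, p_target: float = 0.5) -> float:
    """Weighted error cost divided by the cost of the best trivial decision rule."""
    raw_cost = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    trivial_cost = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return raw_cost / trivial_cost


# Toy decisions (made up for illustration only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 1, 0, 0]
p_miss, p_fa = miss_and_false_alarm_rates(y_true, y_pred)
print(normalized_dcf(p_miss, p_fa))
```

With this normalization, a value of 1.0 corresponds to a system that is no better than always giving the same answer, which makes scores comparable across different cost settings.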

It is evident that the performance of classification algorithms varies widely based on proximity thresholds. For example, Song [43] reported that when classifying Bluetooth beacon RSSI values for two people six feet apart, a Gaussian support vector machine classifier yielded better accuracy than a decision tree classifier. For validation, each experiment was conducted by placing two Raspberry Pi devices six feet apart and measuring the RSSI values.

An AI-based contact tracing app named COVI, developed in Canada, leverages probabilistic risk levels to profile an individual's infection risk [3]. COVI uses ML algorithms to optimize and automate the integration of pseudonymized user data in assessing the risk levels. A simulated dataset generated a priori from an epidemiological model is used to pre-train the ML models. Upon collection of real data through the app, the simulator parameters are tuned to match the real data. The impact of ML in the COVI app is observed by using the ML predictor inside the simulator to influence the behavior of the agents through the recommended risk levels.

Contact tracing applications can also be used to predict the lockdown area based on the places visited by an infected patient. In [30] the authors proposed a K-Means clustering algorithm with DASV seeding to predict the lockdown area. The proposed method was tested in Denver, USA, and successfully identified the area to be locked down as the one in which users walking in the area approach each other very frequently.

Despite the significant advantages of using DCT systems, there are issues related to data privacy and use. However, these are outside the scope of this review paper.
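The clustering idea behind the lockdown-area prediction can be sketched with plain K-Means (Lloyd's algorithm). This is a minimal illustration only: it seeds the centroids with the first k points rather than the DASV seeding used in [30], and the coordinates, function names, and the "densest cluster" rule are assumptions made for the example.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's k-means on 2-D points, seeded with the first k points."""
    centroids = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                centroids[j] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, assignment


def densest_cluster(assignment, k):
    """Index of the cluster with the most points (candidate lockdown area)."""
    counts = [assignment.count(j) for j in range(k)]
    return max(range(k), key=counts.__getitem__)


# Hypothetical user positions in meters on a local grid.
positions = [(0, 0), (100, 100), (1, 1), (2, 0), (0, 2), (101, 99), (1, 2)]
centroids, assignment = kmeans(positions, k=2)
hotspot = densest_cluster(assignment, k=2)
```

Seeding with the first k points keeps the example deterministic but is fragile when those points lie close together; a dedicated seeding strategy addresses exactly that weakness.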

Capture the dependencies across the whole history of the user.

Transformer deep learning architecture is used as the base algorithm.

Simulated dataset based on epidemiological model.

ML-based risk prediction could reduce the reproduction number compared to standard digital contact tracing applications.

Predict the lockdown area based on people movement.

K-Means unsupervised machine learning algorithm is used for prediction.

Android-based smartphone application for user data collection.

Using a threshold of five meters, the proposed protocol predicts the lockdown area.

Machine learning has become a potent tool in many applications. In particular, it has recently been employed in the battle against COVID-19. There exists a growing body of literature that is dedicated to the subject. The decision by the major publishers to make all COVID-19 related research publicly available has improved information flow. In this paper, we attempt to provide an overview of the rapidly increasing corpus of research in machine learning related to COVID-19. We discuss the state-of-the-art research, including material from research archives. In particular, we cover four major areas of ML research related to COVID-19: forecasting, medical diagnostics, drug development, and contact tracing.

Our survey revealed the following key observations. In forecasting, recurrent neural networks such as LSTMs have been used to predict future infection and death rates. Many studies are focused on the North American region, but also on other countries including Brazil and China. The best models achieve a correlation of 0.999. In medical diagnostics, deep learning models that have previously shown success in other domains are being deployed to detect the presence of the infection based on CT scans and X-rays. The best models achieve an accuracy rate of 99%. In drug discovery, a variety of algorithms are being used to develop new vaccines against the infection. However, the majority of the studies are still in the initial stage. In contact tracing, AI-based applications are utilized to identify and locate potential virus carriers, though with limited success.

Despite the tremendous progress, the current machine learning approaches suffer from two major drawbacks. First, the underlying algorithms have not yet reached the level of human reasoning. Deep learning models such as CNNs, LSTMs, Transformers, and others remain imperfect and cannot consistently outperform a human expert. Second, the lack of data hinders the training and development of the models. Patient data is notoriously difficult to obtain. Since deep learning models rely on an abundance of data, the lack thereof results in suboptimal generalization performance.

Our main recommendation, based on the extensive survey of current literature, is the involvement of government agencies to facilitate procurement of COVID-19 related data. Public institutions and government agencies can play a key role in obtaining and disseminating data from hospitals to researchers. Since machine learning algorithms rely heavily on large amounts of data, its availability can drastically improve results.

Identification of vaccine targets; design of vaccine against SARS-CoV-2 coronavirus using computational and deep learning-based approaches

A survey of covid-19 contact tracing apps

COVI White Paper. arXiv preprint 2020

Application of deep learning technique to manage COVID-19 in routine clinical practice using CT images: results of 10 convolutional neural networks

Artificial intelligence and COVID-19: A multidisciplinary approach

Covid-19: automatic detection from x-ray images utilizing transfer learning with convolutional neural networks

Covid-19 outbreak prediction with machine learning

Artificial Intelligence for COVID-19 Drug Discovery and Vaccine Development

Prediction and analysis of COVID-19 positive cases using deep learning models: A descriptive case study of India

Predicting commercially available antiviral drugs that may act on the novel coronavirus (SARS-CoV-2) through a drug-target interaction deep learning model

Prediction of respiratory decompensation in Covid-19 patients using machine learning: The READY trial

COVID-19 Rapid Response Virtual Agent -Google Cloud. (n.d.). Retrieved

Applicability of mobile contact tracing in fighting pandemic (covid-19): Issues, challenges and solutions

Development of a data-driven COVID-19 prognostication tool to inform triage and step-down care for hospitalised patients in Hong Kong: A population based cohort study

Worldwide and Regional Forecasting of Coronavirus (Covid-19) Spread using a Deep Learning Model

New machine learning method for image-based diagnosis of COVID-19

The Effect of Energy Cryptos on Efficient Portfolios of Key Energy Listed Companies in the S&P Composite 1500 Energy Index

A 2-stage Classifier for Contact Detection with BluetoothLE and INS Signals

Forecasting significant stock price changes using neural networks

The Role of Artificial Intelligence and Machine Learning Techniques: Race for COVID-19 Vaccine. Archives Of Clinical Infectious Diseases

Examining COVID-19 Forecasting using Spatio-Temporal Graph Neural Networks

Automatic Detection of Coronavirus Disease (COVID-19

Artificial intelligence approach fighting COVID-19 with repurposing drugs

Artificial Intelligence for COVID-19 Drug Discovery and Vaccine Development

Predicting novel drugs for SARS-CoV-2 using machine learning from a >10 million chemical space

Cautions about radiologic diagnosis of COVID-19 infection driven by artificial intelligence. The Lancet Digital Health

Applications of machine learning and artificial intelligence for Covid-19 (SARS-CoV-2) pandemic: A review

Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: evaluation of the diagnostic accuracy

COVID-19 Surveillance through Twitter using Self-Supervised Learning and Few Shot Learning

A smartphone enabled approach to manage COVID-19 lockdown and economic crisis

Multiple ensemble neural network models with fuzzy response aggregation for predicting COVID-19 time series: the case of Mexico

Artificial intelligence-enabled rapid diagnosis of patients with COVID-19

NIST Pilot TC4TL Challenge (2020) NIST TC4TL Challenge

A novel medical diagnosis model for COVID-19 infection detection based on deep features and Bayesian optimization

COVID-19 coronavirus vaccine design using reverse vaccinology and machine learning

Machine learning applications in drug development

COVID-19 Future Forecasting Using Supervised Machine Learning Models

Predictions for COVID-19 with deep learning models of LSTM

Proximity Sensing for Contact Tracing

Survey of Decentralized Solutions with Mobile Devices for User Location Tracking, Proximity Detection, and Contact Tracing in the COVID-19 Era

Review of artificial intelligence techniques in imaging data acquisition, segmentation and diagnosis for covid-19

Forecasting Brazilian and American COVID-19 cases based on artificial intelligence coupled with climatic exogenous variables

Using Machine Learning to Perform Proximity Detection-Classifying Bluetooth Beacon RSSI Values

Adaptive feature selection guided deep forest for covid-19 classification with chest ct

Precise pulmonary scanning and reducing medical radiation exposure by developing a clinically applicable intelligent CT system: Toward improving patient care

Prediction models for diagnosis and prognosis of covid-19: systematic review and critical appraisal

Design an efficient Multi-Epitope Peptide vaccine candidate against SARS-CoV-2: An in silico Analysis

Clinically applicable AI system for accurate diagnosis, quantitative measurements, and prognosis of covid-19 pneumonia using computed tomography

Deep learning-based detection for COVID-19 from chest CT using weak label

A rapid, accurate and machine-agnostic segmentation and quantification method for CT-based covid-19 diagnosis

On the accuracy of measured proximity of bluetooth-based contact tracing apps