key: cord-167889-um3djluz
authors: Chen, Jianguo; Li, Kenli; Zhang, Zhaolei; Li, Keqin; Yu, Philip S.
title: A Survey on Applications of Artificial Intelligence in Fighting Against COVID-19
date: 2020-07-04
doc_id: 167889
cord_uid: um3djluz

The COVID-19 pandemic caused by the SARS-CoV-2 virus has spread rapidly worldwide, leading to a global outbreak. Most governments, enterprises, and scientific research institutions are participating in the COVID-19 struggle to curb the spread of the pandemic. As a powerful tool against COVID-19, artificial intelligence (AI) technologies are widely used in combating this pandemic. In this survey, we investigate the main scope and contributions of AI in combating COVID-19 from the aspects of disease detection and diagnosis, virology and pathogenesis, drug and vaccine development, and epidemic and transmission prediction. In addition, we summarize the available data and resources that can be used for AI-based COVID-19 research. Finally, the main challenges and potential directions of AI in fighting against COVID-19 are discussed. Currently, AI mainly focuses on medical image inspection, genomics, drug development, and transmission prediction, and thus AI still has great potential in this field. This survey presents medical and AI researchers with a comprehensive view of the existing and potential applications of AI technology in combating COVID-19, with the goal of inspiring researchers to continue to maximize the advantages of AI and big data to fight COVID-19.

1 Introduction

Figure 1. Main scope of AI in fighting against COVID-19.

We collected 1273 online publications related to COVID-19, SARS-CoV-2, and 2019-nCoV from databases such as Nature, Elsevier, Google Scholar, arXiv, bioRxiv, and medRxiv. Then, we filtered these down to 267 papers that explicitly use AI methods.
[Abbreviation table, final entries: LSTM, long short-term memory [66]; VAE, variational auto-encoder [186].]

[...] serological diagnosis, chest X-ray and CT image inspection, and other noninvasive methods. Owing to its high sensitivity and specificity, real-time Reverse Transcriptase Polymerase Chain Reaction (RT-PCR) is the current standard detection technology for diagnosing the SARS-CoV-2 virus and bacterial infections. Using RT-PCR, 9 RNA positives were detected from pharyngeal swabs of patients, indicating that the SARS-CoV-2 virus had spread in communities of Wuhan, China, in early January 2020 [106]. The shedding of the SARS-CoV-2 virus detected in the throat, lungs, and feces suggests multiple routes of virus transmission [217, 227]. However, RT-PCR faces the limitations of complicated sample preparation, low detection efficiency, and a high false-negative rate [106, 212, 223]. Isothermal nucleic acid amplification and blood testing methods are also commonly used for the rapid screening of SARS-CoV-2 [103, 122, 223].

An ML classification method was used for blood testing to extract important routine hematological and biochemical characteristics and to provide COVID-19 classification. In [223], 105 blood test reports were collected, of which 27 were positive samples from patients with confirmed COVID-19; for comparison, negative samples were collected from patients with ordinary pneumonia, tuberculosis, and lung cancer. Each sample contains 49 feature variables, including routine hematological and biochemical parameters. Next, the authors applied the RF algorithm [22] to the training samples to perform feature learning and classification. Based on the 11 extracted key feature variables, they built an RF classifier and tested 253 samples from 169 patients with suspected COVID-19, achieving an accuracy of 96.97%.
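The sensitivity, specificity, and accuracy figures quoted above all derive from a confusion matrix of true and predicted labels. The following is a minimal sketch of that calculation in plain Python; the function names and the toy labels are assumptions made for illustration and are not taken from [223].

```python
from typing import Dict, Sequence


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, int]:
    """Count TP, TN, FP, FN for binary labels (1 = positive, 0 = negative)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        else:  # truth == 1 and pred == 0
            counts["fn"] += 1
    return counts


def diagnostic_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity (true-positive rate), and specificity (true-negative rate)."""
    c = confusion_counts(y_true, y_pred)
    total = sum(c.values())
    positives = c["tp"] + c["fn"]
    negatives = c["tn"] + c["fp"]
    return {
        "accuracy": (c["tp"] + c["tn"]) / total if total else float("nan"),
        "sensitivity": c["tp"] / positives if positives else float("nan"),
        "specificity": c["tn"] / negatives if negatives else float("nan"),
    }


# Toy labels invented for illustration only (not data from any cited study).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
```

A high false-negative rate, the limitation noted for RT-PCR, shows up here as a low sensitivity even when the overall accuracy looks acceptable.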
Although AI technologies rarely participate directly in RT-PCR and blood testing, the viral load and COVID-19 case data collected by these methods provide important data sources for subsequent AI-based analysis.

Medical imaging inspection is another widely used clinical approach for COVID-19 detection and diagnosis. COVID-19 medical image inspection mainly includes chest X-ray and lung CT imaging. AI technology plays an important role in medical image inspection and has achieved significant results in image acquisition, organ recognition, infection region segmentation, and disease classification. It not only greatly shortens the time of a radiologist's image diagnosis but also improves the accuracy and performance of the diagnosis. We will discuss in detail the contributions of AI methods to chest X-ray and lung CT imaging.

CT imaging provides an important basis for the early diagnosis of COVID-19. The CT imaging manifestations of COVID-19 are mainly Ground Glass Opacity (GGO) in the periphery of the subpleural region, and some are consolidated. If the situation improves, the area will be absorbed and form fibrous stripes. Examples of lung CT images of normal and COVID-19 cases are shown in Fig. 2 [42, 161, 184]. The process of AI-based CT image inspection usually includes the following steps: Region Of Interest (ROI) segmentation, lung tissue feature extraction, candidate infection region detection, and COVID-19 classification. The representative AI architecture for COVID-19 CT image classification is shown in Fig. 3.

The segmentation of lung organs and ROIs is a foundational step in AI-based image inspection. It delineates the ROIs in lung CT images (such as lungs, lung lobes, bronchopulmonary segments, and infected regions or lesions) for further evaluation and quantification. Different DL models (such as U-Net, V-Net, and VB-Net) have been used for CT image segmentation [32, 94, 114, 193, 226]. In [179], Shan et al.
collected 549 CT images from patients with confirmed COVID-19 and proposed an improved segmentation model (named VB-Net) based on the V-Net [117] and ResNet [69] models. In [32], Chen et al. built a DL model based on the U-Net++ structure [244] to extract the ROIs from each CT image and detect the training curve of suspicious lesions. In [226], Xu et al. used a 3D DL model to segment the infection regions from lung CT images. They then built a classification model using ResNet and location-attention structures and divided the segmented area images into three categories: COVID-19, influenza-A viral pneumonia, and normal. In [114], Li et al. used the U-Net segmentation model to extract the lung organ from each lung CT image as an ROI. In [94], Jin et al. proposed an AI-based COVID-19 diagnostic system, which consists of a lung segmentation module and a COVID-19 diagnostic module. The lung segmentation module is implemented based on DeepLabv1 [33]. In [193], Tang et al. used the VB-Net model [179] to accurately segment 18 lung regions and infected regions from lung CT images and further calculated 63 quantitative features.

Focusing on the detection and localization of candidate infection regions, different AI methods were proposed in [65, 85, 181, 212]. In [65], Gozes et al. used commercial software to identify lung nodules and small opacities within the 3D lung volume. Then, they constructed a DL model consisting of U-Net and ResNet structures, where the U-Net module was used to extract the ROI regions, and the ResNet module was used to detect and classify diffuse turbidity and ground glass infiltration. The authors compared the CT images of 56 patients with confirmed COVID-19 and 101 noncoronavirus patients and analyzed the CT features of COVID-19 in detail. In [181], Shi et al. used a CNN model based on V-Net to segment lung organs and infection regions from lung CT images.
Then, they used the LASSO method to calculate the best CT morphological features. Finally, the severity of COVID-19 was predicted and evaluated based on the best CT morphology and clinical features. In [212], Wang et al. collected 195 CT images from 44 patients with COVID-19 and 258 CT images from 55 negative patients. They used a CNN model with the Inception structure [191] to classify randomly selected ROI images and predict COVID-19 disease. In [85], Huang et al. used the AI-based InferRead CT pneumonia tool to quantitatively evaluate changes in the lung burden of patients with COVID-19. The tool includes three modules: lung and lobe region extraction, pneumonia segmentation, and quantitative analysis. The CT image features of COVID-19 pneumonia are divided into four types: mild, moderate, severe, and critical.

Based on ROI segmentation and candidate infection region detection, the important features of ROIs and infection regions are extracted for COVID-19 classification [156]. In [156], Qi et al. collected 71 CT images from 52 patients with confirmed COVID-19 in 5 hospitals. They used the pyradiomics method to extract 1,218 features from each CT image and then applied LR and RF methods to these features to distinguish between short-term and long-term hospital stays. In [180], Shi et al. used the VB-Net model [179] to segment the infection and lung fields from CT images and described them with 96 features, including 26 volume features, 31 digital features, 32 histogram features, and 7 surface features. Next, they proposed an iSARF method to classify features and predict COVID-19 disease. Comparative experiments showed that the iSARF method is superior to the LR, SVM, and NN methods. In [241], Zheng et al. proposed a 3D DCNN model (named DeCoVNet) to detect COVID-19 from CT images. The proposed DeCoVNet model includes three components.
The first component uses vanilla 3D convolutional layers to extract lung image features, the second component consists of two 3D residual blocks that perform element conversion on the 3D feature maps, and the third component gradually extracts the information in the 3D feature map through 3D max-pooling and outputs the probability of COVID-19. In [187], Song et al. collected 1990 CT images, including 777 images from 88 patients with COVID-19, 505 images from 100 patients with bacterial pneumonia, and 708 images from 86 healthy people. They proposed a DRE-Net DL model based on the pretrained ResNet50 structure and feature pyramid networks. DRE-Net extracts the top-K lesion features from each CT image to predict the classification of patients with COVID-19.

The lack of large-scale datasets is the main challenge that hinders the implementation of AI-based CT image inspection and affects diagnostic performance. To address this challenge, strategies such as transfer learning (TL), data augmentation, and "Human-In-The-Loop" were used in [94, 179, 239]. In [94], Jin et al. used the ImageNet dataset [47] to pretrain the proposed 2D classification network. In [239], Zhao et al. provided a public COVID-19 CT scan dataset, including 275 COVID-19 cases and 195 non-COVID-19 cases. They used data augmentation and TL methods to alleviate the shortage of training data. In terms of data augmentation, they used transformation operations such as random transformation, cropping, and rotation to expand the training dataset. In terms of TL, they pretrained the DenseNet model [84] on the chest X-ray dataset [213] and then used the pretrained model to predict COVID-19. In addition, a "Human-In-The-Loop" strategy was adopted to reduce the workload of radiologists in annotating the training samples [179]. Radiologists annotate a small portion of training samples in the first batch of training.
Then, they manually correct the segmentation results in the second batch and use them as annotations of the images. Iterative training is performed in this way to complete the annotation of all training samples.

It is commendable that several works provided open-source code of the designed models and online COVID-19 CT image inspection systems. For example, Li et al. [114], Jin et al. [94], Zheng et al. [241], and Zhao et al. [239] published the proposed DL models on GitHub [88]. In addition, Song et al. [187] provided an online CT diagnosis service. Wang et al. [212] provided a public website for CT image uploading and testing. In [32], Chen et al. developed a public online CT diagnostic system to which anyone can upload CT images for self-diagnosis. More detailed information about AI-based CT image segmentation and classification methods is provided in Table 2 and Fig. 4.

Compared with CT images, chest X-ray (CXR) images are easier to obtain in clinical radiology inspections. Although CXR image inspection is a typical imaging method used for COVID-19 diagnosis, it is generally considered to be less sensitive than CT image inspection. Some CXR images of patients with early COVID-19 showed normal characteristics. The radiological signs of COVID-19 in CXR images include airspace opacity, GGO, and later consolidation. In addition, a bilateral, peripheral, and lower-region distribution is mostly observed. Examples of CXR images of normal and COVID-19 cases are shown in Fig. 5, from [124, 160]. The AI-based CXR image inspection process usually includes steps such as data preprocessing, DL model training, and COVID-19 classification. The representative AI architecture for COVID-19 CXR image inspection is shown in Fig. 6. Compared with CT images, CXR image segmentation is more challenging because the ribs are projected onto soft tissues, which confounds the image contrast.
As a result, most DL models focus on the classification of the entire CXR image, while few works focus on segmenting ROIs and lung organs from CXR images. In [67], Hassanien et al. used a classification method to identify and classify COVID-19 on lung X-ray images through a multilevel threshold and SVM. A multilevel image segmentation threshold was used to segment the lung organs from the background, and then the SVM module separated the infected lungs from the uninfected lungs.

Focusing on COVID-19 classification based on CXR images, several studies built AI-based [...]. The work in [237] proposed a new DL model, which consists of a backbone network, a classification module, and an anomaly detection module. The backbone network extracts the features of each input CXR image. The classification module and anomaly detection module use the extracted features to generate classification scores and scalar anomaly scores, respectively. In [209], Wang et al. introduced a COVID-Net DCNN model to identify COVID-19 cases based on CXR images. The COVID-Net model uses a large number of convolutional layers in a projection-expansion-projection design pattern. They collected 13,800 CXR images from 13,725 patients (including 183 COVID-19 patients) to establish a CXR database (called COVIDx) for training COVID-Net. It is commendable that the authors open-sourced the proposed model code and the COVIDx database.

As with CT images, CXR image inspection also suffers from a lack of large-scale datasets for DL model training. In [121], Loey et al. used the GAN model [86] to generate more CXR images, thereby extending the scale of the CXR dataset. In addition, three DL models (AlexNet [109], GoogLeNet [190], and ResNet18 [69]) were used to classify CXR images into four categories: COVID-19, normal, bacterial pneumonia, and viral pneumonia. In [127], Maghdid et al. used CNN and AlexNet models to train CXR and CT images to diagnose COVID-19 cases, respectively.
Among them, the AlexNet model was pretrained on the ImageNet dataset to perform COVID-19 classification on the datasets in [41, 124, 140]. Unlike existing TL and image augmentation methods, Afshar et al. designed a capsule network model (named COVID-CAPS) suitable for small-scale CXR datasets [2]. Each layer of the COVID-CAPS model contains multiple capsules, and each capsule represents a specific image instance at a specific position through multiple neurons. The capsule module [75] uses routing by agreement to capture spatial information and attempts to reach a consensus on the existence of objects. In this way, the routing uses information from instances and objects to identify the relationship between them without the need for large-scale datasets. More detailed information about AI-based CXR image classification methods for COVID-19 inspection is shown in Table 3 and Fig. 7.

In addition to RT-PCR detection and image inspection techniques, some noninvasive measurement methods have also been used for COVID-19 detection and diagnosis, including cough sound judgment and breathing pattern detection.

(1) Monitoring COVID-19 through AI-based cough sound analysis. Schuller et al. [174] discussed the potential application of computer audition (CA) and AI in the analysis of cough sounds in patients with COVID-19. They first analyzed the ability of CA to automatically recognize and monitor speech and cough under different semantics, such as breathing, dry and wet coughing or sneezing, speech during colds, eating behaviors, drowsiness, or pain. Then, they suggested applying the CA technology to the diagnosis and treatment of patients with COVID-19. However, due to the lack of available datasets and annotation information, there is no report on the application of this technology in COVID-19 diagnosis. Similarly, Iqbal et al.
[90] also discussed an abstract framework that uses the speech recognition function of mobile applications to capture and analyze the cough sounds of suspected cases to determine whether the user is healthy or suffers from a respiratory disease. In [214], Wang et al. analyzed the respiratory patterns of patients with COVID-19 and the breathing patterns of patients with influenza and the common cold. In addition, they proposed a respiratory simulation model (named BI-AT-GRU) for COVID-19 diagnosis. The BI-AT-GRU model includes a GRU neural network with bidirectional and attention mechanisms and can classify 6 types of clinical respiratory patterns: Eupnea, Tachypnea, Bradypnea, Biot's, Cheyne-Stokes, and Central-Apnea.

(2) COVID-19 diagnosis based on noninvasive measurements. In [128], Maghdid et al. designed an abstract framework for COVID-19 diagnosis based on smartphone sensors. In the proposed framework, smartphones can be used to collect the disease characteristics of potential patients. For example, the sensors can acquire the patient's voice through the recording function and can obtain the patient's body temperature through the fingerprint recognition function. Then, the collected data are submitted to an AI-supported cloud server for disease diagnosis and analysis.

The virology and pathogenesis of SARS-CoV-2 are among the most important research topics in the fields of biology and medicine. Scientists have analyzed the virus characteristics of SARS-CoV-2 through proteomics and genomic studies [7, 77, 123]. In the field of virology, the origin and classification of SARS-CoV-2, its physical and chemical properties, receptor interactions, cell entry, and the ecology and genomic variation of SARS-CoV-2 have been studied [123, 217, 242]. We mainly discuss the contribution of AI to the pathological research of SARS-CoV-2 from the perspective of proteomics and genomics.
Since the advent of SARS-CoV-2, there have been a large number of research achievements in proteomics. Four types of structural proteins of SARS-CoV-2 were confirmed: nucleocapsid (N) proteins, envelope (E) proteins, membrane (M) proteins, and spike (S) proteins [145, 206, 240]. In addition, other proteins translated in the host cells that are essential for virus replication have also attracted the attention of researchers, such as non-structural protein 5 (NSP5) and 3C-like protease (3CLpro). Moreover, several studies have shown that SARS-CoV-2 uses the human Angiotensin-Converting Enzyme 2 (ACE2) to enter the host [77, 242]. In this field, AI techniques are used to predict protein structures and analyze the interaction network between proteins and drugs. The representative AI architecture for protein structure prediction is shown in Fig. 8.

In [176, 177], Senior et al. used DL models to implement the AlphaFold system for protein structure prediction. The AlphaFold system uses a ResNet model [69] to analyze the covariance and amino acid residue contacts in homologous gene sequences and to predict the corresponding protein structures. The AlphaFold system consists of a feature extraction module and a distance prediction neural network. The feature extraction module is responsible for searching for protein sequences that are similar to the input protein sequences and constructing the multiple sequence alignment (MSA). The module simultaneously generates residue position and sequence profile features, and the resulting 485 feature parameters are fed into the distance prediction neural network. The distance prediction neural network is a two-dimensional (2D) ResNet structure, which is responsible for accurately predicting the distance between all residue pairs of every two protein sequences. The authors added a one-dimensional output layer to the network to predict the accessible surface area, distance map, and secondary structure of each residue.
Finally, the generated potential is optimized by gradient descent to generate protein structures. Based on [176, 177], Jumper et al. [96] used the AlphaFold system to predict the structure of SARS-CoV-2 membrane proteins. They published the predicted protein structures such as 3a, Nsp2, Nsp4, Nsp6, and papain-like proteases. Although the structure of these proteins has not been verified by clinical experiments, this publication allows researchers to quickly conduct SARS-CoV-2 studies. In [145], Ortega et al. used a computational method to detect changes in the S1 subunit of the spike receptor-binding domain and determined mutations in the SARS-CoV-2 spike protein sequence, which may be beneficial for studying human-to-human transmission. They collected sequences for modeling from the Protein Data Bank (PDB) [19] and used SWISS-MODEL software [12] to construct the SARS-CoV-2 spike protein model. Then, Z-dock software [151] was used to dock the spike protein with ACE2, and a clustering algorithm was used to cluster the docking results. The work indicated that the SARS-CoV-2 spike protein has a higher affinity for human ACE2 receptors. Another branch of AI-assisted proteomics research involves finding new compounds and drug candidates for the treatment of COVID-19 by building interaction networks and knowledge maps between proteins and drugs. Please see Section 4 for details.

In SARS-CoV-2 research, genomics is mainly used for analyzing the origin of the virus, vaccine development, and RT-PCR detection. Various AI algorithms are applied for similarity comparisons of gene sequences, gene fragments, and miRNA prediction [46, 164]. In [164], Randhawa et al. used different ML methods to analyze the pathogen sequences of COVID-19 and identified the inherent features of the viral genomes, thereby rapidly classifying new pathogens.
They collected the complete reference genome of the COVID virus from NCBI [53], the bat β-coronavirus from GISAID [61], and all available virus sequences from Virus-Host DB [135]. Each genomic sequence was mapped to a corresponding genomic signal in a discrete digital sequence by using chaos game representation [93]. In addition, the amplitude spectrum of these genomic signals was calculated by using a discrete Fourier transform. On this basis, they used 6 ML classification models to train on the above sequence distance matrix and compared their performance. Finally, they applied the trained ML models to 29 COVID-19 sequences to classify COVID-19 pathogens. The results of this work support the hypothesis that COVID-19 originated in bats and its classification as a β-coronavirus.

In [46], Demirci et al. performed miRNA prediction on the SARS-CoV-2 genome based on 3 ML methods and identified miRNA-like hairpins and microRNA-mediated SARS-CoV-2 infection interactions. They collected the complete COVID-19 genome from NCBI [53] and human mature miRNA sequences from miRBase [108]. The genomic sequences are transcribed and divided into multiple overlapping fragments, which are folded into a secondary structure to extract the hairpin structure. On this basis, the authors used 3 ML methods (DT, Naive Bayes, and RF) to predict the category of each hairpin and determined the similarity between the hairpins and human miRNA. They searched for mature miRNA targets in human and SARS-CoV-2 genes and analyzed the potential interactions between SARS-CoV-2 miRNAs and human genes and between human miRNAs and SARS-CoV-2 genes. Finally, the gene ontology of SARS-CoV-2 miRNA targets in human genes was analyzed, and the similarity between SARS-CoV-2 miRNA candidates and mature miRNAs of any known organism was evaluated using the PANTHER classification system [134]. In [133], Metsky et al.
used genomic and AI technologies to rapidly design nucleic acid detection assays and improved current RT-PCR testing of SARS-CoV-2. They developed a CRISPR tool that uses enzymes to edit the genome by cutting specific genetic code chains and used different ML methods to predict the diversity of the target genome. The authors designed the RT-PCR test method through the CRISPR tool, and it can effectively detect 67 respiratory viruses, including SARS-CoV-2.

In the field of drug development, AI technologies can screen existing drug candidates for COVID-19 by analyzing the interaction between existing drugs and COVID-19 protein targets. In addition, AI technologies can help to discover new drug-like compounds against COVID-19 by constructing new molecular structures that have inhibitory effects on proteases at the molecular level. The representative AI architecture for new drug-like compound discovery is shown in Fig. 9.

Figure 9. Representative AI architecture for new drug-like compound discovery.

Drug development can be divided into small-molecule drug discovery and biological product development. Small-molecule drug discovery mainly focuses on chemically synthesized small molecules of active substances, which can be made into small-molecule drugs through chemical reactions between different organic and inorganic compounds. One group of AI-based drug development focuses on the discovery of new drug-like compounds at the molecular level. In [18, 182], Beck et al. proposed a DL-based drug-target interaction model (MT-DTI) to predict potential drug candidates for COVID-19. The MT-DTI model uses SMILES strings and amino-acid sequences to predict target proteins with 3D crystal structures. The authors collected the amino-acid sequences of 3C-like proteases and related antiviral drugs and drug targets from the databases of NCBI [53], Drug Target Commons (DTC) [194], and BindingDB [120].
In addition, they used a molecular docking and virtual screening tool (AutoDock Vina [202]) to predict the binding affinity between 3,410 drugs and SARS-CoV-2 3CLpro. The experimental results provided 6 potential drugs: Remdesivir, Atazanavir, Efavirenz, Ritonavir, Dolutegravir, and Kaletra (lopinavir/ritonavir). Note that Remdesivir has shown promise in clinical trials. In [136], Moskal et al. used AI methods to analyze the molecular similarity between anti-COVID-19 drugs (termed "parents") and drugs involving similar indications to screen out second-generation drugs (termed "progeny") for COVID-19. They first used the Mol2Vec [91] method to convert the molecular structure of the parent drugs into a high-dimensional vector space, treating the drug molecule as a "sentence" and mapping each molecular substructure to a "word". Then, they used the VAE [186] model to generate SMILES strings with 3D shape and pharmacodynamic properties similar to those of a given seed molecule [63]. In addition, CNN, LSTM, and MLP models are used to generate the corresponding SMILES strings and molecules. The authors selected 71 parent drugs as seed molecules from the literature and selected 4456 drugs as candidate progeny drugs from ZINC [245] and ChEMBL [56].

In [23], Bung et al. worked on the development of new chemical entities for the SARS-CoV-2 3CLpro based on DL technology. They constructed an RL-based RNN model to classify protease inhibitor molecules and obtained a smaller subset that favored the chemical space. Then, they collected 2515 protease inhibitor molecules in SMILES format from the ChEMBL database as training data, where each SMILES string is regarded as a time series, and each position or symbol is regarded as a time point. The generated small molecules were docked to the 3CLpro structure with minimal energy and ranked by their virtual screening scores [202] to select anti-SARS-CoV-2 candidates. In [192], Tang et al.
analyzed 3CLpro, whose 3D structure is similar to that of SARS-CoV, and evaluated it as an attractive target for anti-COVID-19 drug development. They proposed an advanced deep-Q learning network (called ADQN-FBDD) to generate potential lead compounds of SARS-CoV-2 3CLpro. They collected 284 molecules reported as SARS-CoV-2 3CLpro inhibitors. These molecules were split using the improved BRICS algorithm [45] to obtain the target fragment library of SARS-CoV-2 3CLpro. Then, the proposed ADQN-FBDD model trains on each target fragment and predicts the corresponding molecules and lead compounds. Through the proposed Structure-Based Optimization Policy (SBOP), they finally obtained 47 derivatives with inhibitory effects on SARS-CoV-2 3CLpro from these lead compounds, which are regarded as potential anti-SARS-CoV-2 drugs.

Another group of studies focused on screening candidate biological products for COVID-19. Biological products are protein products with therapeutic effects, which mainly bind to specific cell receptors involved in the disease process. Biological products are prepared from microbial cells such as genetically modified bacteria, yeast, or mammalian cell strains through biotechnology processes. In [79], Hu et al. established a multitask DL model to predict the possible binding between potential drugs and SARS-CoV-2 protein targets, thereby selecting available drugs for SARS-CoV-2. They first collected 8 SARS-CoV-2 viral proteins from GHDDI [58] as potential targets. The proposed DL model is based on the AtomNet model [188, 205] and includes a shared layer to learn the joint representation of all tasks and a task processing layer for performing specific tasks. By fine-tuning the DL model using a coronavirus-specific dataset, the model can predict the possible binding between the drugs and the protein targets and output the binding affinity score.
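The shared-layer plus task-specific-layer layout described above can be illustrated with a small forward pass. This is only a sketch under stated assumptions: it is not the AtomNet-based model of [79], the class name `MultiTaskScorer`, the feature vector, and the target names are hypothetical, and the weights are random and untrained, so the scores carry no chemical meaning.

```python
import random
from typing import Dict, List

Vector = List[float]
Matrix = List[Vector]


def linear(weights: Matrix, bias: Vector, x: Vector) -> Vector:
    """Dense layer: one output value per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(values: Vector) -> Vector:
    return [max(0.0, v) for v in values]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class MultiTaskScorer:
    """Hypothetical toy model: a shared hidden layer and one scalar head per target."""

    def __init__(self, n_features: int, n_hidden: int, targets: List[str], seed: int = 0):
        rng = random.Random(seed)
        # Shared layer: learns a joint representation used by every task.
        self.shared_w = random_matrix(n_hidden, n_features, rng)
        self.shared_b = [0.0] * n_hidden
        # Task-specific layer: an independent output head for each target.
        self.heads = {t: (random_matrix(1, n_hidden, rng), [0.0]) for t in targets}

    def forward(self, drug_features: Vector) -> Dict[str, float]:
        # The shared representation is computed once and reused by all heads.
        hidden = relu(linear(self.shared_w, self.shared_b, drug_features))
        return {t: linear(w, b, hidden)[0] for t, (w, b) in self.heads.items()}


# Illustrative target names and an invented 4-dimensional drug feature vector.
model = MultiTaskScorer(n_features=4, n_hidden=3, targets=["3CLpro", "RdRp", "PLpro"])
scores = model.forward([0.2, 0.0, 1.0, 0.5])
```

In a trained model of this shape, fine-tuning on a task-specific dataset would update both the shared weights and the heads; here only the data flow is shown.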
Based on existing studies, RdRp, 3CLpro, and papain-like protease have been confirmed as the three principal targets of SARS-CoV-2 [60, 146, 206]. Based on the prediction results [113, 210], the authors selected the top 10 potential drugs with a high likelihood of inhibition for each target. In [98], Kadioglu et al. used High-Performance Computing (HPC), virtual drug screening, molecular docking, and ML technologies to identify SARS-CoV-2 drug candidates. After performing virtual drug screening and molecular docking, two supervised ML models (NN and Naive Bayes) were used to analyze clinical drugs and test compounds to construct corresponding drug likelihood prediction models. Several approved drugs, including those used for the hepatitis C virus (HCV), the enveloped ssRNA virus, and other infectious diseases, were selected as SARS-CoV-2 drug candidates.

Targeting the known COVID-19 protease 3CLpro, Zhavoronkov et al. [240] designed a small-molecule drug-discovery pipeline to produce 3CLpro inhibitors, using the crystal structure of 3CLpro, homology modeling, and co-crystallized fragments to generate candidate molecules. They collected the crystal structure of COVID-19 3CLpro from [230] and constructed a homology model. At the same time, molecules with activity on various proteases were extracted from [56, 89] and assembled into a protease peptidomimetic dataset with 5,891 compounds. Then, they used 28 ML methods (such as GAE, GAN, and GA) and RL strategies to separately train on the input datasets (crystal structure, homology model, and co-crystal ligands) and generated new molecular structures with a high score. In [78], Hofmarcher et al. used a ChemAI DL model [154] based on the SmilesLSTM structure [76] to test the resistance of the molecules to COVID-19 proteases. They collected 3.6 million molecules from ChEMBL [56], ZINC [245], and PubChem [104] and formed a training dataset.
Then, the ChemAI model was trained on the dataset in a multitask parallel manner, where the output neurons of the model represent the biological effects of the input molecules. The authors used the ChemAI model to predict the inhibitory effects of these molecules on the 3CLpro and PLpro proteases of COVID-19. These molecules have binding, inhibitory, and toxic effects on the targets. A list of COVID-19 drug development methods based on AI is provided in Table 4.

Currently, there are three types of COVID-19 vaccine candidates: (1) whole virus vaccines, (2) recombinant protein subunit vaccines, and (3) nucleic acid vaccines [38, 238]. AI technology has been involved in the design and development of COVID-19 vaccines. Compared with its explicit applications in other fields, AI technology is usually used implicitly in the sub-processes of vaccine development. The AI algorithms netMHC and netMHCpan are used in the development of COVID-19 vaccines for epitope prediction [74, 97, 215]. In [74], Herst et al. obtained the SARS-CoV-2 protein sequences from GenBank and used the MSA algorithm to trim the nucleocapsid phosphoprotein sequences to possible peptide sequences. On this basis, they used the netMHC and netMHCpan AI algorithms to train and predict peptide sequences [8, 97]. The pan variant of netMHC integrates the in-vitro objects of 215 HLAs for prediction. Finally, they used the average value of the ANN, SVM, netMHC, and netMHCpan methods to calculate the vaccine candidates. In [215], Ward et al. downloaded the SARS-CoV-2 nucleotide sequences from the NCBI [53] and GISAID [61] databases and generated a consensus sequence for each SARS-CoV-2 protein. The sequences can be used as references for prediction, specificity, and epitope mapping analysis. Next, the authors used different epitope prediction tools to predict B cell epitopes and map them to the amino acid sequences of each gene.
On this basis, they used the AI-based netMHCpan algorithm to predict HLA-1 peptides and obtained a total of 2,915 alleles in all peptide lengths. The BLASTp tool [6] was used to locate the short amino acid epitope sequences in the canonical sequences of SARS-CoV-2 proteins. Finally, the authors provided an online tool for SARS-CoV-2 genetic variation analysis, epitope prediction, coronavirus homology analysis, and candidate proteome analysis.

In [143], Ong et al. used ML and Reverse Vaccinology (RV) methods to predict and evaluate potential vaccines for COVID-19. RV applies bioinformatics analysis to pathogen genomes to identify promising vaccine candidates. They obtained the SARS-CoV-2 sequences and all proteins of the 6 known human coronavirus strains from the NCBI [53] and UniProt [17] databases. Then, they used Vaxign and Vaxign-ML [71, 142] to analyze the complete proteome of the coronaviruses and predict their biological characteristics. Next, they improved the Vaxign-ML model based on ML and RV using LR, SVM, KNN, RF, and XGBoost methods and predicted the protein level of all SARS-CoV-2 proteins. The nsp3 protein was selected for phylogenetic analysis, and the immunogenicity of nsp3 was evaluated by predicting T cell MHC-I and MHC-II and linear B cell epitopes.

In [158], Qiao et al. used DL to predict a patient's mutated new antigens and to identify the best T-cell epitopes for peptide-based COVID-19 vaccines. They first sequenced the diseased cells in the patient's blood and extracted 6 human leukocyte antigen (HLA) types and T-cell receptor (TCR) sequences. Then, they proposed the DeepNovo model, which is trained on the patient's immune peptides, to identify the best T-cell epitope set based on a person's HLA alleles and immune peptide group information.
The DeepNovo model uses LSTM and RNN structures to capture sequence patterns in peptides or proteins and predicts HLA peptides from conserved regions of the virus, thereby predicting new mutant antigens in patients. In addition, they used the IEDB [204] tool to predict the immunogenicity of 177 peptides. They suggested designing an epitope-based COVID-19 vaccine specifically for each person based on their HLA alleles.

The prediction of immune stimulation ability is an important part of vaccine design [162, 166]. Different ML methods and position-specific scoring matrices (PSSM) are usually used to predict epitope and immune interactions, thereby predicting the generation of adaptive immunity in the target host. In [162], Rahman et al. used immuno-informatics and comparative genomic methods to design a multi-epitope peptide vaccine against SARS-CoV-2, which combines the epitopes of S, M, and E proteins. They used the Ellipro antibody epitope prediction tool [87] to predict linear B cell epitopes on the S protein. Ellipro uses multiple ML methods to predict and visualize B-cell epitopes in a given protein sequence or structure. In addition, Sarkar et al. [172] studied epitope-based vaccine design for COVID-19 and used the SVM method to predict the toxicity of the selected epitopes. In [153], Prachar et al. used 19 epitope-HLA combined prediction tools including IEDB, ANN, and PSSM algorithms to predict and verify 174 SARS-CoV-2 epitopes.

Thanks to well-developed information and multimedia technology, the outbreak and spread of COVID-19 were reported in a timely and accurate manner. The numbers of suspected, confirmed, cured, and dead COVID-19 cases in each country/region are announced in real time. In addition, passenger travel trajectories and related big data are shared for scientific research. Based on the rich data, numerous researchers have participated in predicting and tracking the outbreak and spread of COVID-19.
Researchers collected clinical COVID-19 case data and used different AI methods to extract important features and to predict the mortality and survival rate of patients with COVID-19. The representative AI architecture for prediction of patient mortality and survival rate is shown in Fig. 10.

Figure 10. Representative AI architecture for prediction of patient mortality and survival rate.

In [152], Pourhomayoun et al. used 6 AI methods to predict the mortality rate of patients with COVID-19. They used public data of patients with COVID-19 from 76 countries around the world [225] and counted 112 features, including 80 medical annotations and disease features and 32 features from the patients' demographic and physiological data. Based on the filtering method and wrapper method, the 42 best features were extracted, such as demographic features, general medical information, and patient symptoms. On this basis, 6 AI methods (SVM, NN, RF, DT, LR, and KNN) were used to predict the mortality of patients with COVID-19. In [173], Sarkar et al. used the RF model to analyze the records of 433 patients with COVID-19 from Kaggle [43] and identified the important features and their impact on mortality. Experimental results show that patients over 62 years of age have a higher risk of death. In [228, 229], Yan et al. analyzed a blood sample dataset of 404 patients with COVID-19 in Wuhan, China, and used the XGBoost classification method [37] to select three important biomarkers and to predict individual patient survival rates. Experimental results with an accuracy of 90% indicated that higher LDH levels seem to play an important role in distinguishing the most critical COVID-19 cases.

BlueDot [21] and Metabiota [132] are two AI companies that made accurate predictions for the COVID-19 outbreak.
BlueDot collected large-scale heterogeneous data from various sources, such as news reports, global ticketing data, animal diseases, global infectious disease alerts, and real-time climate conditions. Then, it used filtering tools to narrow its focus; used various ML and Natural Language Processing (NLP) techniques to detect, mark, and display the potential risk frequency of COVID-19; and predicted the outbreak time of transmission. It is worth mentioning that 9 days before the official announcement of the COVID-19 outbreak, BlueDot accurately predicted the epidemic of COVID-19 and cities with a high risk of virus outbreaks. Metabiota collected large-scale data from social and nonsocial sources (such as biological, socioeconomic, political, and environmental data) and used technologies such as AI, ML, big data, and NLP to accurately predict the outbreak, spread, and intervention measures of COVID-19. More AI-based COVID-19 outbreak and transmission prediction methods are shown in Table 5.

Table 5. COVID-19 outbreak and transmission prediction based on AI methods.
Study | Data sources | Methods | Country/region
Huang [82] | Yang [231], WHO [216] | CNN, LSTM, MLP, GRU | China
Hu [80, 81] | The Paper [148], WHO [216] | MAE, clustering | China
Yang [233] | Baidu [16] | SEIR, LSTM | China
Fong [51, 52] | NHC [139] | SVM, PNN | China
Ai [3] | WHO [54, 216] | ANFIS, FPA | China, USA
Rizk [168] | WHO [216] | ISACL-MFNN | USA, Italy, Spain
Giuliani [62] | Italy [144] | EMTMGL | Italy
Ayyoubzadeh [14] | Worldometer [218], Google [201] | LR, LSTM | Iran
Marini [129, 130] | Swiss population | Enerpol | Switzerland
Lai [110] | IATA [126], Worldpop [219] | ML | Global
Punn [155] | JHU CSSE [49] | SVR, PR, DNN, LSTM, RNN |
Lampos [111] | MediaCloud [131], PHE [64], ECDC [55] | Transfer learning | Global

Although the source of the COVID-19 epidemic has not yet been identified, it was first reported in Wuhan, China. Therefore, the outbreak and spread of COVID-19 in China have received extensive attention. In [82], Huang et al.
used 4 DL models (CNN, LSTM, GRU, and MLP) to train on and predict the COVID-19 case data from 7 severe epidemic cities in China. The input of these DL models is the features of the COVID-19 cases, including the number of confirmed cases, cured cases, and deaths. Based on the input of the previous 5 days, each model can predict the number of COVID-19 cases for the following few days. The architecture of the COVID-19 outbreak prediction model based on AI models is shown in Fig. 11.

Figure 11. Architecture of COVID-19 outbreak prediction model based on DL models.

In [80, 81], Hu et al. used AI methods such as MAE and clustering algorithms to predict the number of confirmed COVID-19 cases in different provinces and cities in China. In addition, they grouped 34 provinces and cities in China into 9 clusters based on the prediction results and further predicted the spread of COVID-19 among provinces and cities. In [233], Yang et al. used the SEIR model [101] and the LSTM model to predict COVID-19 in China. The population migration data and the latest COVID-19 epidemiological data from Baidu [16] were input into the SEIR model to derive the epidemic curves. In addition, they used SARS data from 2003 to pretrain the LSTM model to predict COVID-19 for the following few days, in which epidemiological parameters, such as the transmission, incubation, and recovery probabilities and the number of deaths, were selected as input features. Both the SEIR and LSTM models predicted a daily infection peak of 4,000 in the first week of February. In [51, 52], Fong et al. obtained early COVID-19 epidemiological data from NHC [139]. Then, they used traditional time series data analysis methods (e.g., ARIMA, Exponential, and Holt-Winters), ML methods (e.g., KR, SVM, and DT), and AI methods (e.g., PNN) to analyze and predict future outbreaks.
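For reference, the SEIR model mentioned above divides a population into susceptible (S), exposed (E), infectious (I), and recovered (R) compartments. The Python sketch below integrates the standard SEIR equations with a simple forward-Euler scheme. The parameter values (`beta`, `sigma`, `gamma`), the population size, and the initial conditions are illustrative assumptions, not the values fitted in [233], and the migration inputs used there are omitted.

```python
def simulate_seir(population, beta, sigma, gamma, e0, i0, days, dt=0.1):
    """Forward-Euler integration of the classic SEIR equations.

    beta:  transmission rate per day
    sigma: rate of leaving the exposed state (1 / incubation period)
    gamma: recovery rate (1 / infectious period)
    Returns a list of (S, E, I, R) tuples, one per day, starting at day 0.
    """
    s, e, i, r = population - e0 - i0, float(e0), float(i0), 0.0
    steps_per_day = round(1 / dt)
    history = [(s, e, i, r)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


# Illustrative (assumed) parameters: 5-day incubation, 7-day infectious period.
curve = simulate_seir(
    population=1_000_000, beta=0.6, sigma=1 / 5, gamma=1 / 7,
    e0=10, i0=1, days=120,
)
# Day on which the number of infectious individuals is largest.
peak_day = max(range(len(curve)), key=lambda d: curve[d][2])
```

Because every flow leaving one compartment enters the next, the four compartments always sum to the total population, which is a useful sanity check when adapting the sketch.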
In addition to China, the outbreak and spread of COVID-19 in other countries (including the United States, Italy, Spain, Iran, and Switzerland) have also received widespread attention. In [3], Ai et al. proposed an improved ANFIS method [92] to predict the number of COVID-19 cases. The proposed system connects fuzzy logic and neural networks and uses an enhanced Flower Pollination Algorithm (FPA) [232] for model parameter optimization and model training. In [168], Rizk et al. proposed an improved Multi-layer Feed-forward Neural Network (ISACL-MFNN) model, which uses an Interior Search Algorithm (ISA) to optimize model parameters and uses the CL strategy to enhance the ISA performance. From the official COVID-19 dataset reported by the WHO [216], data from January 22, 2020, to April 3, 2020, in the United States, Italy, and Spain were collected to train the ISACL-MFNN model and to predict the confirmed cases within the next 10 days. In [62], Giuliani et al. collected the number of infected people in Italian provinces [144] and used the EMTMGL model to simulate and predict the spatial and temporal distribution of COVID-19 infection in Italy. In [14], Ayyoubzadeh et al. used real-time COVID-19 epidemic data from Google Trends [201] and Worldometer [218] to predict COVID-19 cases in Iran. They collected daily epidemic data, stored them in a time-series format, and then used the LR and LSTM models to make predictions, thereby obtaining the outbreak and spread trend of COVID-19 in Iran. In [129, 130], Marini et al. developed an agent-based AI platform to predict the development of COVID-19 in Switzerland. The system accepts the entire Swiss population as input data to simulate and predict the spread of COVID-19 in Switzerland. It simulates people's daily trajectories by calibrating the micro-census data and effectively predicts the individual contacts and possible transmission routes.
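The time-series formulation used in several of these studies can be sketched as a sliding-window regression: each training example pairs the previous few daily counts with the next day's count. The Python example below is a minimal sketch under assumptions: it takes LR to mean linear regression, uses a synthetic series in place of real case data, and the helper names `make_windows`, `fit_linear`, and `forecast` are hypothetical.

```python
import numpy as np


def make_windows(series, window):
    """Build (X, y): each row of X holds `window` past values, y the next one."""
    series = np.asarray(series, dtype=float)
    x = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = series[window:]
    return x, y


def fit_linear(x, y):
    """Ordinary least squares with an intercept; returns the coefficients."""
    design = np.column_stack([x, np.ones(len(x))])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def forecast(series, coef, window, horizon):
    """Predict `horizon` future steps, feeding each prediction back as input."""
    history = [float(v) for v in series]
    for _ in range(horizon):
        recent = np.array(history[-window:] + [1.0])  # last values + intercept
        history.append(float(recent @ coef))
    return history[len(series):]


# Synthetic stand-in for a daily case series (assumed, not real data).
daily_cases = [3 * day + 2 for day in range(30)]
X, y = make_windows(daily_cases, window=5)
coef = fit_linear(X, y)
pred = forecast(daily_cases, coef, window=5, horizon=3)
```

The same windowed `(X, y)` pairs could be fed to an LSTM or another regressor; only `fit_linear` and the prediction step would change.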
Many studies have likewise focused on the prediction of the spread of COVID-19 around the world. They collected a large amount of travel data, mobile phone data, and social media data and used AI methods to accurately predict the potential transmission range and transmission route of COVID-19. In [110], Lai et al. collected a large amount of travel and mobile phone data from [219] and constructed corresponding models to predict the transmission risk of COVID-19 in different countries. On this basis, they established air travel network models between domestic cities and cities in other countries to predict high-risk cities at home and abroad. In [155], Punn et al. used 2 ML models (SVR [13] and PR [44]) and 3 DL regression models (DNN, LSTM [66], and RNN) to predict real-time COVID-19 cases. In [111], Lampos et al. used an automatic crawling tool to obtain daily confirmed COVID-19 case data and related articles from online media such as MediaCloud [131], Public Health England (PHE) [64], and European Centre for Disease Prevention and Control (ECDC) [55]. They used the TL strategy to transfer the COVID-19 model of a country where the disease had already spread to other countries still in the early stage of the epidemic curve, thus achieving epidemic prediction for the target country. In addition, companies such as Microsoft Bing [20], Google [201], and Baidu [16] have aggregated multiple available data sources and developed COVID-19 global tracking systems to provide a visual tracking interface.

In addition to AI methods, various methods based on statistics and epidemiology are used to predict the outbreak and spread of COVID-19. In [70], He et al. collected the highest viral load in the pharyngeal swabs of 94 patients with confirmed COVID-19. They fitted a generalized additive model with identity links and smooth spline curves to analyze its overall trend.
A gamma distribution was fitted to the transmission pair data to evaluate the serial interval distribution. The results of statistical analysis showed that patients with confirmed COVID-19 reach the peak of virus shedding before or during symptom onset, and that some transmission may occur before the initial symptoms. In [208], Wang et al. determined a set of technical indicators (e.g., number of infection cases in the hospital, daily infection rate, and daily cured rate) that reflect the infection status of COVID-19. Next, they proposed a calculation method based on statistical theory to quantify the distinguishing characteristics of each period and predict the turning point in the development of the epidemic. In addition, numerous studies based on the Susceptible-Infected-Recovered (SIR) and SEIR models have studied the spread of COVID-19 from an epidemiological perspective. Please see [25, 119, 169, 185, 195, 199, 234] for more information.

When COVID-19 appeared, most countries in the world adopted different forms of social control, social distancing, school closures, and lockdown measures to prevent the spread of the epidemic [203]. AI technologies have been widely used in epidemic control and social management, including individual temperature detection, video tracking, contact tracing, and intelligent robots. Many countries have used smart devices equipped with AI to detect suspected cases in public transportation hubs such as airports and train stations [40, 167]. For example, infrared cameras are used to scan for high temperatures in a crowd, and different AI methods perform efficient analysis to detect whether an individual is wearing a mask in real time. In addition, DL-based video tracking technology is used to detect and track suspected COVID-19 patients in public places [31]. Moreover, at the entrances and exits of cities, the identity information of each passing person is collected.
Then, AI-based systems are used to efficiently query the travel history and trajectory of each passing individual to check whether they are from a region seriously affected by COVID-19 [35, 197]. AI technologies are also used in contact tracing of patients with COVID-19 [100]. For each patient with confirmed COVID-19, personal data such as mobile phone positioning data, consumption records, and travel records may be integrated to identify the potential transmission trajectory [57]. In addition, when people are in social isolation, mobile phone positioning and AI frameworks can assist the government in better understanding the status of individuals [165]. Moreover, intelligent robots are used to perform site disinfection and product transfer, and mobile phone positioning functions are used to detect and track the distribution and flow of personnel. Another group of studies focused on the impact of various social control strategies on the spread of COVID-19.

The implementation and performance improvement of AI greatly depend on large-scale available data and resources. Therefore, we compiled available public resources that can be used for COVID-19 disease diagnosis, virology research, drug and vaccine development, and epidemic and transmission prediction. Three types of data and resources were summarized: medical images, biological data, and informatics resources.

We collected 17 groups of COVID-19 medical images, such as CXR and CT images, from individual researchers and organizations. Among them, the CXR image dataset published by Cohen et al. [41], which is a collection of CXR images from multiple references, is widely cited. In addition, many researchers uploaded CXR and CT images to Kaggle [15, 112, 124, 138, 160] for COVID-19 research. Moreover, organizations such as the British Society of Thoracic Imaging (BSTI), Eurorad, and Radiopaedia also released online CXR and CT images.
Table 6 displays the detailed description of medical image data resources of COVID-19.

Table 6. Medical image data resources for COVID-19 research.
Resource | Data type | Cited by Refs.
Zhao [239] | CT images | [239]
HRCT [48] | CT images | [94]
Armato [11] | CT images | [94]
Coronacases [42] | CT images | -
Medical segmentation [175] | CT images | -
Cohen [41] | CXR images | [2, 9, 73, 121, 127, 137, 178, 209, 237]
Wang [213] | CXR images | [237, 239]
COVIDx [24] | CXR images | [209]
Adrian [159] | CXR images | [73]
COVID-Net [209] | CXR images | [209]
Kermany [102] | CXR images | [9]
Mendeley data [5] | CXR images | -
Kaggle [15, 112, 124, 138, 160] | CXR and CT images | [2, 9, 121, 127, 137, 173, 178, 209]
BSTI [140] | CXR and CT images | [127]
SIRM [184] | CXR and CT images | -
Eurorad [50] | CXR and CT images | -
Radiopaedia [161] | CXR and CT images | -

We collected 10 biological data resources, such as NCBI, Protein Data Bank (PDB), UniProt, Clarivate Analytics Integrity (CAI), Drug Target Common (DTC), and Virus-Host DB (VHDB), as shown in Table 7. These resources provide abundant biological data, including gene sequences, proteins, drug molecules and compounds, and miRNA sequences.
Informatics resources such as COVID-19 situation reports, dashboards, COVID-19 cases, and demographic data are gathered in Table 8.

[135] | Genome sequences | Virus sequences | [164]
PDB [149] | Proteins | 3D shapes of proteins, nucleic acids, and assemblies | [19, 145]
UniProt [17] | Proteins | SARS-CoV-2 protein entries and receptors | [143]
miRBase [108] | miRNA sequences | Human mature miRNA sequences | [46]
ZINC [245] | Drug compounds | Drug compounds and molecules | [78, 98, 136]
DTC [194] | Drug molecules | Drug molecules for 3C-like proteases | [18, 182]
CAI [89] | Drug discovery | Empowering knowledge-based drug discovery and development | [240]
BindingDB [120] | Amino-acid sequences | Amino-acid sequences of 3C-like proteases | [18, 182]

[219] | Demographic data | Spatial demographic and air travel data | [110]
GHDDI [58] | Community | Drug discovery community | [79]
Humdata [68] | Community | Community perceptions of COVID-19 | -

We summarize the main challenges currently faced by AI against COVID-19 and provide the corresponding suggestions. At present, the application of AI in COVID-19 research mainly faces four challenges: the lack of available large-scale training data, massive noisy data and rumors, the limited knowledge at the intersection of computer science and medicine, and data privacy and human rights protection.
• Lack of available large-scale training data. Most AI methods rely on large-scale annotated training data, including medical images and various biological data. However, due to the rapid outbreak of COVID-19, there are insufficient datasets available for AI. In addition, annotating training samples is very time-consuming and requires professional medical personnel.
• Massive noisy data and rumors. Owing to the well-developed mobile Internet and social media, massive noisy information and fake news about COVID-19 have been published on various online media without rigorous review.
However, AI algorithms seem to be powerless in judging and filtering the noise and erroneous data. This problem limits the application and performance of AI, especially in epidemic prediction and transmission analysis.
• Limited knowledge at the intersection of computer science and medicine. Many AI scientists are from computer science, but the application of AI in the COVID-19 battle requires in-depth cooperation in computer science, medical imaging, bioinformatics, virology, and many other disciplines. Therefore, it is crucial to coordinate the cooperative work of researchers from different fields and integrate the knowledge of multiple subjects to jointly deal with COVID-19.
• Data privacy and human rights protection. In the era of big data and AI, the cost of obtaining personal privacy data is very low. Faced with public health issues such as COVID-19, many governments want to obtain various types of personal information, including mobile phone positioning data, personal travel trajectory data, and patient disease data. How to effectively protect personal privacy and human rights during information acquisition and AI-based processing is an issue worthy of discussion and attention.

In addition to the applications investigated in this paper, AI can also contribute to the battle against COVID-19 in the following 10 potential directions.
1. Noncontact disease detection. In CXR and CT image detection, the use of noncontact automatic image acquisition can effectively avoid the risk of infection between radiologists and patients during the COVID-19 pandemic. AI can be used for patient posture positioning, standard section acquisition of CXR and CT images, and movement of camera equipment.
2. Remote video diagnosis. AI and NLP technologies can be used to develop remote video diagnosis systems and chat robot systems and provide COVID-19 disease consultation and preliminary diagnosis to the public.
3. Patient prognosis management.
AI technology (such as intelligent image and video analysis) can be used to automatically monitor patient behavior during the follow-up monitoring and prognostic management process, in addition to long-term tracking and management of patients with COVID-19.
4. Biological research. In the field of biological research, AI can be used to discover protein structures and features of viruses through accurate analysis of biomedical information, such as large-scale protein structures, gene sequences, and viral trajectories.
5. Drug and vaccine development. AI can be used not only to discover potential drugs and vaccines but also to simulate the interaction between drugs and proteins and between vaccines and receptors, thereby predicting the potential responses to the drugs and vaccines of patients with COVID-19 with different constitutions.
6. Identification and filtering of fake news. AI can be used to reduce and eliminate fake news and noisy data on online social media platforms to provide reliable, correct, and scientific information about the COVID-19 pandemic.
7. Impact simulation and evaluation. Various simulation models can use AI to analyze the impact of different social control strategies on disease transmission. Then, they can be used to explore more effective and scientific approaches to disease prevention and social control.
8. Patient contact tracing. By constructing social relationship networks and knowledge graphs, AI can identify and track the trajectories of people in close contact with patients with COVID-19, thereby accurately predicting and controlling the potential spread of the disease.
9. Intelligent robots. Intelligent robots are expected to be used in applications such as disinfection and cleaning in public places, product distribution, and patient care.
10. Intelligent Internet of Things. AI is expected to be combined with the Internet of Things for deployment at customs, airports, railway stations, bus stations, and business centers.
In this way, the COVID-19 virus and suspected patients can be quickly identified through intelligent monitoring of the environment and personnel.

In this survey, we investigated the main scope and contributions of AI in combating COVID-19. In contrast to the SARS-CoV outbreak in 2003 and the MERS-CoV outbreak in 2012, AI technologies have been successfully applied in almost every corner of the COVID-19 battle. The application of AI in COVID-19 research can be summarized in four aspects: disease detection and diagnosis, virology research, drug and vaccine development, and epidemic and transmission prediction. Among them, medical image analysis, drug discovery, and epidemic prediction are the main battlefields of AI in the fight against COVID-19. We also summarized the currently available data and resources for COVID-19 research based on AI, including medical imaging data, biological data, and informatics resources. Finally, we highlighted the main challenges and potential directions in this field. This survey provides medical and AI researchers with a comprehensive view of the existing and potential contributions of AI in combating COVID-19, with the goal of inspiring them to continue to maximize the advantages of AI and big data to fight against this pandemic.

The covid-19 pandemic calls for spatial distancing and social closeness: not for social distancing Covid-caps: a capsule network-based framework for identification of covid-19 cases from x-ray images Optimization method for forecasting confirmed cases of covid-19 in china Artificial intelligence (ai) provided early detection of the coronavirus (covid-19) in china and will influence future urban health policy internationally Augmented covid-19 x-ray images dataset.
Mendeley Data Basic local alignment search tool The proximal origin of sars-cov-2 Gapped sequence alignment using artificial neural networks: application to the mhc class i system Covid-19: automatic detection from x-ray images utilizing transfer learning with convolutional neural networks Covid-19: a novel coronavirus and a novel challenge for critical care The lung image database consortium (lidc) and image database resource initiative (idri): a completed reference database of lung nodules on ct scans The swiss-model workspace: a web-based environment for protein structure homology modelling Support vector regression Predicting covid-19 incidence using google trends and data mining techniques: a pilot study in iran Covid-19 x-rays. Website Real-time covid-19 data. Website The universal protein resource (uniprot) Predicting commercially available antiviral drugs that may act on the novel coronavirus (sars-cov-2) through a drug-target interaction deep learning model Announcing the worldwide protein data bank Covid-19 tracker). Website An ai epidemiologist sent the first warnings of the wuhan virus. Website Random forests De novo design of new chemical entities (nces) for sars-cov-2 using artificial intelligence Covid-19 chest x-ray dataset initiative. Website Chinese and italian covid-19 outbreaks can be correctly described by a modified sird model. medRxiv Artificial intelligence applied on chest x-ray can aid in the diagnosis of covid-19 infection: a first experience from lombardy, italy. 