key: cord-0721685-u7tz6xzz authors: Ciotti, Marco; Angeletti, Silvia; Minieri, Marilena; Giovannetti, Marta; Benvenuto, Domenico; Pascarella, Stefano; Sagnelli, Caterina; Bianchi, Martina; Bernardini, Sergio; Ciccozzi, Massimo title: COVID-19 Outbreak: An Overview date: 2020-04-07 journal: Chemotherapy DOI: 10.1159/000507423 sha: 95d9f458494cc2e63461249f75c8575cecb202e7 doc_id: 721685 cord_uid: u7tz6xzz BACKGROUND: In late December 2019, Chinese health authorities reported an outbreak of pneumonia of unknown origin in Wuhan, Hubei Province. SUMMARY: A few days later, the genome of a novel coronavirus was released (http://virological.org/t/novel-2019-coronavirus-genome/319; Wuhan-Hu-1, GenBank accession No. MN908947) and made publicly available to the scientific community. This novel coronavirus was provisionally named 2019-nCoV, now SARS-CoV-2 according to the Coronavirus Study Group of the International Committee on Taxonomy of Viruses. SARS-CoV-2 belongs to the Coronaviridae family, Betacoronavirus genus, subgenus Sarbecovirus. Since its discovery, the virus has spread globally, causing thousands of deaths and having an enormous impact on our health systems and economies. In this review, we summarize the current knowledge about the epidemiology, phylogenesis, homology modeling, and molecular diagnostics of SARS-CoV-2. KEY MESSAGES: Phylogenetic analysis is essential to understand viral evolution, whereas homology modeling is important for vaccine strategies and therapies. Highly sensitive and specific diagnostic assays are key to case identification, contact tracing, identification of the animal source, and implementation of control measures. In December 2019, an outbreak of pneumonia of unknown origin was reported in Wuhan, Hubei Province, China. Most of these cases were epidemiologically linked to the Huanan Seafood Wholesale Market. 
Inoculation of bronchoalveolar lavage fluid obtained from patients with pneumonia of unknown origin into human airway epithelial cells and Vero E6 and Huh7 cell lines led to the isolation of a novel coronavirus, SARS-CoV-2, previously named 2019-nCoV [1]. Coronaviruses belong to the family Coronaviridae and are positive single-stranded RNA viruses surrounded by an envelope. They are divided into four genera: Alpha-, Beta-, Gamma-, and Deltacoronavirus. To date, seven human coronaviruses (HCoVs) have been identified, which fall within the Alpha- and Betacoronavirus genera. The Alphacoronavirus genus includes HCoV-NL63 and HCoV-229E, while the Betacoronavirus genus comprises HCoV-OC43, HCoV-HKU1, SARS-CoV (severe acute respiratory syndrome coronavirus), MERS-CoV (Middle East respiratory syndrome-related coronavirus), and the novel SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) [2] [3] [4] [5] [6] [7]. The alphacoronaviruses HCoV-NL63 and HCoV-229E and the betacoronaviruses HCoV-OC43 and HCoV-HKU1 usually cause common colds, but can also cause severe lower respiratory tract infections, especially in the elderly and children [8]. HCoV-NL63 infection has also been significantly associated with croup (laryngotracheitis) [9, 10], and HCoV-OC43 infection with severe lower respiratory tract infection in children [11]. SARS-CoV and MERS-CoV are zoonotic in origin; they cause severe respiratory syndrome and are often fatal [12]. Since the beginning of the epidemic in late December 2019, SARS-CoV-2 has spread to all continents, and as of March 18, 2020, the WHO communicated 179,111 confirmed cases and 7,426 deaths globally (Situation Report-57). In this review, we summarize the most recent knowledge about some epidemiological parameters, including clinical symptoms, transmissibility of the virus, and the incubation period. Furthermore, the molecular diagnostics, protein modeling of the spike glycoprotein, and phylogenesis of the virus will be discussed. 
Patients infected with SARS-CoV-2 can present a wide range of symptoms, from mild to severe. Fever, cough, and shortness of breath are the most common symptoms, reported in 83, 82, and 31% of patients, respectively [13]. In those patients who develop pneumonia, multiple mottling and ground-glass opacity are described on chest X-ray [1, 13]. Patients who develop acute respiratory distress syndrome may worsen rapidly and die of multiple organ failure [13]. It has also been reported that about 2-10% of the patients with COVID-19 had gastrointestinal symptoms such as vomiting, diarrhea, and abdominal pain [13, 14]. Diarrhea and nausea preceded the development of fever and respiratory symptoms in 10% of patients [13]. At present, the exact mechanism of transmission of SARS-CoV-2 is still not completely understood. Human-to-human transmission via droplets is the main route of transmission within a susceptible population. Chinese health authorities reported an R0 of 1.4-2.5 on January 23, 2020, to the WHO International Health Regulations (2005) Emergency Committee. Transmission by asymptomatic carriers cannot be ruled out. Indeed, it was reported that an asymptomatic family member who traveled from the epidemic center of Wuhan was most likely responsible for a familial cluster of COVID-19 pneumonia once back home. Her reverse transcription polymerase chain reaction (RT-PCR) result was positive for SARS-CoV-2, but her chest CT images did not show significant alterations [15]. Another possible route of viral transmission is the oral-fecal route. The scientific literature has shown that SARS-CoV and MERS-CoV are viable in environmental conditions that facilitate oral-fecal transmission. SARS-CoV has been detected in sewage water of two Chinese hospitals in which patients with SARS were treated, and MERS-CoV was found to be viable on different surfaces at low temperature and low humidity [16, 17]. 
SARS-CoV-2 was detected in stool of patients with COVID-19 pneumonia, as well as in respiratory samples [18]. Thus, it is plausible that SARS-CoV-2 can also be transmitted via the oral-fecal route as well as via fomites. Knowing the incubation period of SARS-CoV-2 infection is key for implementing control measures and surveillance. It has been estimated that the median incubation period is 5.1 days (95% CI, 4.5-5.8), and 97.5% of the infected subjects will develop symptoms within 11.5 days (95% CI, 8.2-15.6) of infection. Based on these estimates, it can be assumed that 101 out of 10,000 cases will develop symptoms after 14 days of observation or quarantine [19]. These estimates are consistent with those of other studies that reported a mean incubation period of 6.4 days (95% credible interval: 5.6-7.7), ranging from 2.1 to 11.1 days (2.5th to 97.5th percentile) [20], or 5.2 days (95% CI, 4.1-7.0), with the 95th percentile of the distribution at 12.5 days [21]. Thus, 14-day monitoring is advised following contact with a probable or confirmed SARS-CoV-2 case [22]. Confirmation of cases with suspected SARS-CoV-2 infection is performed by detection of unique viral sequences with nucleic acid amplification tests such as real-time reverse transcription PCR (rRT-PCR). As soon as the Chinese health authorities declared, on January 7, 2020, that a novel coronavirus was responsible for this outbreak of pneumonia in Wuhan, a European network of academic and public laboratories designed an rRT-PCR protocol based on the comparison and alignment of previously available SARS-CoV and bat-related coronavirus genome sequences as well as five sequences derived from the novel coronavirus SARS-CoV-2 made available by the Chinese authorities [23]. Three assays were developed. The first-line assay targets the E gene encoding the envelope protein, which is common to the Sarbecovirus subgenus, while the second, specific assay targets the RdRp gene encoding the RNA-dependent RNA polymerase. 
This assay contains two probes: one probe, which reacts with the SARS-CoV and SARS-CoV-2 RdRp gene, and a second probe (RdRP_SARSr-P2) which is specific to SARS-CoV-2. Finally, the third additional confirmatory assay targets the nucleocapsid (N) gene. This last assay was not further validated because it is slightly less sensitive [23] . This protocol was adopted in more than 30 European laboratories [24] . Recently, a novel rRT-PCR assay targeting a different region of the RdRp/Hel gene of SARS-CoV-2 has been developed that showed a higher sensitivity and specificity than the RdRp-P2 assay [25] . Currently, several amplification protocols are available on the market and validated for in vitro diagnostic use (CE marked): GeneFinder TM COVID-19 Plus Real-Amp Kit (OSANG Healthcare Co., Ltd, South Korea); genesig ® Real-Time PCR Coronavirus (COVID-19) (genesig, UK); Allplex TM 2019-nCoV Assay (Seegene, South Korea), etc. Highly sensitive and specific diagnostic assays are key to the identification of cases, contact tracing, identification of the animal source, and implementation of control measures [26] [27] [28] . When performing nucleic acid amplification test assays, it is useful to remind ourselves that several factors can be responsible for a negative result in an infected individual, such as the poor quality of a specimen, the time of specimen collection (specimen collected too early or too late during infection), inappropriate handling or shipment of the specimen, and technical reasons. Coronavirus entry into the host cell is mediated by the transmembrane spike (S) glycoprotein that forms homotrimers that protrude from the viral surface [29] . The S protein is composed of the two subunits S1 and S2 responsible for binding to the host cell receptor and fusion of the viral and cellular membranes, respectively. Different coronaviruses use different domains within the S1 subunit to enter the cell. These domains are named S A and S B . 
SARS-CoV and SARS-related coronaviruses interact with the angiotensin-converting enzyme 2 (ACE2) via domain S B to enter target cells [30] [31] [32] [33] [34]. It has recently been shown that SARS-CoV-2 binds the ACE2 receptor via the S B domain similarly to SARS-CoV, and that murine polyclonal antibodies inhibited SARS-CoV-2 entry into the cell mediated by S. These data suggest that cross-neutralizing antibodies targeting conserved S epitopes elicited by vaccination could be used against SARS-CoV-2, SARS-CoV, and SARS-related coronaviruses [35]. Previous studies have shown the presence of positive selective pressure on the Nucleocapsid, Spike glycoprotein, and ORF1ab regions, while until now no evidence of positive selective pressure has been found on the Envelope, Membrane, and other ORF proteins. In the Nucleocapsid region, significant (p < 0.05) pervasive episodic selection was found in 2 sites. In amino acid position 380 of the Wuhan coronavirus sequence there is a Gln residue instead of an Asn, while in amino acid position 410 there is a Thr residue instead of an Ala. Significant (p < 0.05) pervasive negative selection in 6 sites (14%) has been evidenced and confirmed by FUBAR (Fast Unconstrained Bayesian Approximation) analysis [36]. In the Spike glycoprotein region, significant (p < 0.05) pervasive episodic selection was found in 2 different sites (536th and 644th nucleotide position using the reference sequence). In the 536th amino acid position of the Wuhan coronavirus sequence there is an Asn residue instead of an Asp residue, while in amino acid position 644 there is a Thr residue instead of an Ala residue. Significant (p < 0.05) pervasive negative selection in 1,065 sites (87%) has been evidenced and confirmed by FUBAR analysis, suggesting that the S region could be highly conserved [36]. 
Regarding the sites under positive selective pressure found on the Spike glycoprotein, the results have shown that amino acid position 536 in COVID-19 has an Asn residue, while the Bat SARS-like coronavirus has a Gln residue; the SARS virus, instead, has an Asp residue. In amino acid position 644 of the COVID-19 sequence there is a Thr residue, while the Bat SARS-like virus has a Ser residue; the SARS virus, instead, has an Ala residue. Another study highlighted that several key residues responsible for binding of the SARS-CoV receptor-binding domain to the ACE2 receptor were variable in the COVID-19 receptor-binding domain (including Asn439, Asn501, Gln493, Gly485, and Phe486; COVID-19 numbering), and that a number of deletion events in amino acid positions 455-457, 463-464, and 485-497 occurred in the bat-derived strains [37]. Potential sites under positive selective pressure have also been found in the ORF1ab region (p < 0.05). In particular, in amino acid position 501, COVID-19 has a Gln residue, the Bat SARS-like coronavirus has a Thr residue, and the SARS virus has an Ala residue. In position 723 of the COVID-19 sequence there is a Ser residue, while the Bat SARS-like virus and the SARS virus have a Gly residue. In amino acid position 1,010, COVID-19 has a Pro residue, the Bat SARS-like coronavirus has a His residue, and the SARS virus has an Ile residue. As for the residue in position 723 (543 in the nsp3 protein), the COVID-19 sequence displays a Ser in place of the Gly found in the Bat SARS-like and SARS coronaviruses. In this case, it may be argued that this substitution could increase the local stiffness of the polypeptide chain, both through a steric effect (in contrast to Ser, Gly has no side chain) and through the ability of the Ser side chain to form H-bonds. Moreover, Ser can act as a nucleophile in certain structural environments, such as enzyme active sites, and can be a phosphorylation site. 
However, within the I-TASSER model, this position is predicted to have low solvent accessibility. Regarding the amino acid position 1,010 (corresponding to position 192 of the nsp3 protein), the homologous region of the Bat SARS-like coronavirus and SARS virus has a polar and an apolar amino acid, respectively, while COVID-19 has a Pro residue. In this case, it may be speculated that due to the steric bulge and stiffness of Pro, the molecular structure of COVID-19 may undergo a local conformational perturbation compared to the proteins of the other two viruses. In nsp3, the mutation falls near the polyprotein domain similar to a phosphatase present also in the SARS coronavirus (PDB code 2ACF) playing a key role in the replication process of the virus in infected cells [38] . According to the I-TASSER model, the position is partially accessible to the solvent. The sites under positive selective pressure in this protein may suggest a possible interpretation of some clinical features of this virus compared to SARS and Bat SARS-like coronavirus. This analysis should find which are probably the most common sites undergoing an amino acid change, providing insight into some important proteins of COVID-19 that are involved in the mechanism of viral entry and viral replication. These data should contribute to improving our understanding of how this virus acts in its pathogenicity. Furthermore, to identify a potential molecular target is fundamental to follow the molecular evolution of the virus, which can suggest some interesting sites for a potential therapy or vaccine. The structural similarity of the region in which the positive selective pressure occurs, and the stabilizing mutation falling in the endosome-associated protein-like domain of the nsp2 protein, should be probable reasons why this virus is more contagious than SARS. 
Conversely, the destabilizing mutation located near the phosphatase domain of the nsp3 protein may explain why viral replication is slower than in SARS, with a longer incubation period. However, further studies are needed on this aspect [39]. The availability of protein structural information is an essential prerequisite for the interpretation of biological phenomena. In this case, knowledge of the virus's protein structure would greatly enhance the possibility of understanding the biological meaning of the observed mutations. At present, only the X-ray structure of the COVID-19 nsp5 protease (PDB code 6LU7) is available, although it is expected that many other structures will become available soon. In the meantime, homology modeling could provide preliminary structural clues. Homology modeling needs structural templates sharing sufficient sequence similarity to the targets. Figure 1 and Table 1 list potential templates for homology modeling of the proteins coded by the COVID-19 genome; they include the structures with the largest coverage and the greatest sequence identity. According to this list, it is evident that most of the viral proteins are at modeling distance from PDB structures. This information should be exploited as soon as possible. Phylogenetic analysis of the SARS-CoV-2 genomes showed that the novel coronavirus responsible for the pneumonia outbreak in Wuhan, China, belongs to the Betacoronavirus genus, subgenus Sarbecovirus [37]. Within the Betacoronavirus genus, 2019-nCoV (SARS-CoV-2) is distant from SARS-CoV (about 79% identity) and MERS-CoV (about 50% identity), responsible for the 2002-2003 [4] and 2012 [7] epidemics, respectively, but closely related (88% identity) to the two bat-derived SARS-like coronaviruses bat-SL-CoVZC45 and bat-SL-CoVZXC21 [37]. 
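The identity percentages quoted above are taken from the cited genomic study [37]; the review does not state how they were computed. As a minimal sketch of one common definition, the Python function below counts matching columns in a pairwise alignment and divides by the number of columns in which neither sequence has a gap. The function name `percent_identity`, the choice of denominator, and the toy sequences are illustrative assumptions, not the method of the cited work, and other definitions of identity (for example, dividing by the full alignment length) give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over ungapped columns of a pairwise alignment.

    Assumption: both inputs are rows of the same alignment (equal length),
    and columns with a gap in either row are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = 0
    compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments (hypothetical, not real viral sequences):
    # 8 ungapped columns, 7 of them identical.
    print(percent_identity("ACGT-ACGT", "ACGTTACGA"))  # 87.5
```

For whole genomes, the two sequences would first have to be aligned with a dedicated alignment tool; the function only post-processes an existing alignment.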
The origin of the virus is still unclear; however, genomic analysis suggests that SARS-CoV-2 is most closely related to viruses previously identified in bats (Fig. 2). It is plausible that there were other intermediate animal transmissions before its introduction into humans. However, there is no evidence of snakes as an intermediary [36]. Using 74 publicly shared novel coronavirus (nCoV) genomes, we examined genetic diversity to infer the date of the common ancestor and the rate of spread. The high similarity of the genomes suggests they share a recent common ancestor. Otherwise, we would expect a greater number of differences between the samples. The jump from bats to humans most likely occurred in late November or early December 2019 (November 25, 2019; 95% HPD: September 28, 2019 to December 21, 2019) [40]. Previous research on related coronaviruses suggests that these viruses accumulate between 1 and 3 changes in their genome per month (rates of 3 × 10⁻⁴ to 1 × 10⁻³ per site per year, for a genome of roughly 30,000 nucleotides). Molecular clock calibration estimated the evolutionary rate of the SARS-CoV-2 whole genome sequences at 6.58 × 10⁻³ substitutions per site per year (95% HPD: 5.2 × 10⁻³ to 8.1 × 10⁻³). The outbreak first started in Wuhan, China, but cases have been identified in many East and South-East Asian countries, the USA, Australia, the Middle East, and Europe. Vietnam, Japan, and Germany have reported transmission within the country, albeit always with a known link to Wuhan, China (Fig. 3). This review gives a picture of the current research on molecular evolution, epidemiology, and diagnostics in response to the outbreak of COVID-19. Many studies have been published within different scientific disciplines with the intent to control and prevent this pandemic. Phylogenetic analysis and homology modeling add new knowledge together with epidemiological and diagnostic methods. 
Studies exploring the genome and the structure of the viral proteins are essential in order to define prevention and control measures to minimize the impact of the outbreak. All this knowledge will pave the way for the development of a vaccine and antiviral therapy.

[1] A Novel Coronavirus from Patients with Pneumonia in China
[2] The morphology of three previously uncharacterized human respiratory viruses that grow in organ culture
[3] Isolation from man of "avian infectious bronchitis viruslike" viruses (coronaviruses) similar to 229E virus, with some epidemiological observations
[4] Severe acute respiratory syndrome
[5] Identification of a new human coronavirus
[6] Characterization and complete genome sequence of a novel coronavirus, coronavirus HKU1, from patients with pneumonia
[7] Isolation of a novel coronavirus from a man with pneumonia in Saudi Arabia
[8] Recently discovered human coronaviruses
[9] Croup is associated with the novel coronavirus NL63
[10] The association of newly identified respiratory viruses with lower respiratory tract infections in Korean children
[11] A novel human coronavirus OC43 genotype detected in mainland China
[12] Origin and evolution of pathogenic coronaviruses. Nat Rev Microbiol
[13] Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan, China
[14] Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study
[15] Presumed asymptomatic carrier transmission of COVID-19
[16] Concentration and detection of SARS coronavirus in sewage from Xiao Tang Shan Hospital and the 309th Hospital of the Chinese People's Liberation Army
[17] Stability of Middle East respiratory syndrome coronavirus (MERS-CoV) under different environmental conditions
[18] Washington State 2019-nCoV Case Investigation Team. First Case of 2019 Novel Coronavirus in the United States
[19] The Incubation Period of Coronavirus Disease 2019 (COVID-19) from Publicly Reported Confirmed Cases: Estimation and Application
[20] Incubation period of 2019 novel coronavirus (2019-nCoV) infections among travellers from Wuhan
[21] Early Transmission Dynamics in Wuhan, China, of Novel Coronavirus-Infected Pneumonia
[22] European Centre for Disease Prevention and Control. Algorithm for the management of contacts of probable or confirmed COVID-19 cases
[23] Detection of 2019 novel coronavirus (2019-nCoV) by real-time RT-PCR
[24] Laboratory readiness and response for novel coronavirus (2019-nCoV) in expert laboratories in 30 EU/EEA countries
[25] Improved molecular diagnosis of COVID-19 by the novel, highly sensitive and specific COVID-19-RdRp/Hel real-time reverse transcription-polymerase chain reaction assay validated in vitro and with clinical specimens
[26] Coronavirus as a possible cause of severe acute respiratory syndrome
[27] Severe acute respiratory syndrome coronavirus as an agent of emerging and reemerging infection
[28] Middle East respiratory syndrome coronavirus: another zoonotic betacoronavirus causing SARS-like disease
[29] Structural insights into coronavirus entry
[30] Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor
[31] Stabilized coronavirus spikes are resistant to conformational changes induced by receptor recognition or proteolysis
[32] Cryo-EM structure of the SARS coronavirus spike glycoprotein in complex with its host cell receptor ACE2
[33] Angiotensin-converting enzyme 2 is a functional receptor for the SARS coronavirus
[34] Structure of SARS coronavirus spike receptor-binding domain complexed with receptor
[35] Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein
[36] The 2019-new coronavirus epidemic: evidence for virus evolution
[37] Genomic characterisation and epidemiology of 2019 novel coronavirus: implications for virus origins and receptor binding
[38] Structural basis of severe acute respiratory syndrome coronavirus ADP-ribose-1′′-phosphate dephosphorylation by a conserved domain of nsP3
[39] COVID-2019: the role of the nsp2 and nsp3 in its pathogenesis
[40] The global spread of 2019-nCoV: a molecular evolutionary analysis. Pathog Glob Health

The authors have no conflicts of interest to declare. There was no funding for this review.