key: cord-0740606-ne5r4d4b authors: Cui, Jie; Li, Fang; Shi, Zheng-Li title: Origin and evolution of pathogenic coronaviruses date: 2018-12-10 journal: Nat Rev Microbiol DOI: 10.1038/s41579-018-0118-9 sha: 8bcd1c3897124adec322dffb8a315fc4e24cb17e doc_id: 740606 cord_uid: ne5r4d4b

Severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV) are two highly transmissible and pathogenic viruses that emerged in humans at the beginning of the 21st century. Both viruses likely originated in bats, and genetically diverse coronaviruses that are related to SARS-CoV and MERS-CoV were discovered in bats worldwide. In this Review, we summarize the current knowledge on the origin and evolution of these two pathogenic coronaviruses and discuss their receptor usage; we also highlight the diversity and potential of spillover of bat-borne coronaviruses, as evidenced by the recent spillover of swine acute diarrhoea syndrome coronavirus (SADS-CoV) to pigs.

likely the major natural reservoirs of alphacoronaviruses and betacoronaviruses 24.

At the beginning of the SARS epidemic, almost all early index patients had animal exposure before developing disease. After the causative agent of SARS was identified, SARS-CoV and/or anti-SARS-CoV antibodies were found in masked palm civets (Paguma larvata) and animal handlers in a marketplace 12, 16, 39-42. However, later, wide-reaching investigations of farmed and wild-caught civets revealed that the SARS-CoV strains found in market civets were transmitted to them by other animals 16, 39. In 2005, two teams independently reported the discovery of novel coronaviruses related to human SARS-CoV, which were named SARS-CoV-related viruses or SARS-like coronaviruses, in horseshoe bats (genus Rhinolophus) 15, 43. These discoveries suggested that bats may be the natural hosts for SARS-CoV and that civets were only intermediate hosts.
Subsequently, many coronaviruses phylogenetically related to SARS-CoV (SARSr-CoVs) were discovered in bats from different provinces in China and also from European, African and Southeast Asian countries 15, 20, 38, 43-54 (Fig. 4; Supplementary Fig. S1a). According to the ICTV criteria, only the strains found in Rhinolophus bats in European countries, Southeast Asian countries and China are SARSr-CoV variants. Those from Hipposideros bats in Africa are less closely related to SARS-CoV and should be classified as a new coronavirus species 54. These data indicate that SARSr-CoVs have wide geographical spread and might have been prevalent in bats for a very long time. A 5-year longitudinal study revealed the coexistence of highly diverse SARSr-CoVs in bat populations in one cave of Yunnan province, China 18, 20, 55. This location is a diversity hot spot, and the SARSr-CoVs in this location contain all the genetic diversity found in other locations of China. Furthermore, the viral strains that exist in this one location contain all genetic elements that are needed to form SARS-CoV (Fig. 5). As no direct progenitor of SARS-CoV was found in bat populations despite 15 years of searching and as RNA recombination is frequent within coronaviruses 56, it is highly likely that SARS-CoV newly emerged through recombination of bat SARSr-CoVs in this or other yet-to-be-identified bat caves. This hypothesis is consistent with previous data showing that a direct progenitor of SARS-CoV emerged before 2002 (refs 42, 57, 58). Recombination analysis also strongly supported the hypothesis that the civet SARS-CoV strain SZ3 arose through recombination of two existing bat strains, WIV16 and Rf4092 (ref. 20). Furthermore, WIV16, the closest relative to SARS-CoV found in bats, likely arose through recombination of two other prevalent bat SARSr-CoV strains 20. The most frequent recombination breakpoints are within the S gene, which encodes the spike (S) protein that contains the receptor-binding domain (RBD), and upstream of orf8, which encodes an accessory protein 20, 58, 59. Given the prevalence and great genetic diversity of bat SARSr-CoVs, their close coexistence and the frequent recombination of the coronaviruses, it is expected that novel variants will emerge in the future 60, 61.

Coronaviruses form enveloped and spherical particles of 100-160 nm in diameter. They contain a positive-sense, single-stranded RNA (ssRNA) genome of 27-32 kb in size. The 5'-terminal two-thirds of the genome encodes a polyprotein, pp1ab, which is further cleaved into 16 nonstructural proteins that are involved in genome transcription and replication. The 3' terminus encodes structural proteins, including envelope glycoproteins spike (S), envelope (E), membrane (M) and nucleocapsid (N). In addition to the genes encoding structural proteins, there are accessory genes that are species-specific and dispensable for virus replication. Here, we compare prototypical and representative strains of four coronavirus genera: feline infectious peritonitis virus (FIPV), Rhinolophus bat coronavirus HKU2, severe acute respiratory syndrome coronavirus (SARS-CoV) strains GD02 and SZ3 from humans infected during the early phase of the SARS epidemic and from civets, respectively, SARS-CoV strain hTor02 from humans infected during the middle and late phases of the SARS epidemic, bat SARS-related coronavirus (SARSr-CoV) strain WIV1, Middle East respiratory syndrome coronavirus (MERS-CoV), mouse hepatitis virus (MHV), infectious bronchitis virus (IBV) and bulbul coronavirus HKU11.
Because there were no SARS cases in Yunnan province during the SARS outbreak, we hypothesize that the direct progenitor of SARS-CoV was produced by recombination within bats and then transmitted to farmed civets or another mammal, which then transmitted the virus to civets by faecal-oral transmission. When the virus-infected civets were transported to Guangdong market, the virus spread in market civets and acquired further mutations before spillover to humans.

The genome sequences of SARS-CoVs from market civets are almost identical to the genomes of human SARS-CoVs 42, 62. However, two genes show major variation. The first variable region is located in the S gene. The SARS-CoV S protein is functionally divided into two subunits, denoted S1 and S2, which are responsible for receptor binding and fusion with the cellular membrane, respectively 1. S1 is further divided into the amino-terminal domain (S1-NTD) and the carboxy-terminal domain (S1-CTD). The S1-CTD functions as the RBD and is responsible for binding ACE2 and entering cells 7, 63, 64. Two amino acid residues in the RBD, 479 and 487, were identified to be essential for ACE2-mediated SARS-CoV infection and critical for virus transmission from civets to humans 76, 78. The second major location of variation is the accessory gene orf8 (Fig. 5). On the basis of its spread, the 2002-2003 SARS outbreak can be divided into three phases: an early phase characterized by a limited number of localized cases, a middle phase during which a superspreader event occurred in a hospital and a late phase of international spread 62. The viral genomes from early-phase patients contain two genotypes of orf8, one with a complete orf8 (369 nucleotides) and the other containing an 82-nucleotide deletion.
By contrast, viral genomes from late-phase patients and most of the genomes from middle-phase patients contain a split orf8 (orf8a and orf8b) owing to a 29-nucleotide deletion; two exceptions were found in middle-phase genomes, one containing an 82-nucleotide deletion in orf8 and the other with the whole orf8 deleted. The human isolates from 2004 and all civet SARS-CoV genomes have a complete orf8 except one civet strain with an 82-nucleotide deletion 62. These data indicate that orf8 genes underwent adaptations during transmission from animals to humans during the SARS epidemic. A limited functional analysis suggested that the ORF8a protein is dispensable for SARS-CoV replication in Vero

Variability of bat SARSr-CoVs

SARS-CoVs and bat SARSr-CoVs mainly vary in three regions: S, ORF8 and ORF3 (Fig. 5). Bat SARSr-CoVs share high sequence identity with SARS-CoV in the S2 region but are highly variable in the S1 region. Compared with human and civet SARS-CoV, bat SARSr-CoV S1 can be divided into two clades: clade 1, which is found only in Yunnan province, has an S protein of the same size as that of human and civet isolates 18-20, 51, whereas clade 2, which is found in many locations, has a shorter S protein owing to deletions of 5, 12 or 13 amino acids in length 15.

The second variable region is located in ORF8. Most bat SARSr-CoVs retain an intact orf8 (366 or 369 nucleotides) and share 47.7-100% sequence identity among themselves and 50.6-98.4% with SARS-CoV in civets and early-phase patients. A split orf8 (364 nucleotides) owing to a 5-nucleotide deletion was found in one bat SARSr-CoV strain, similar to that of SARS-CoVs from middle-phase and late-phase patients 20. The European bat SARSr-CoV has completely lost orf8 (ref. 45). These data show that the orf8 genes in bat SARSr-CoVs are constantly evolving in their natural reservoirs.
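The orf8 deletion sizes reported above can be checked with simple arithmetic. The following is a minimal Python sketch, not part of the original analysis: the function name `describe_deletion` and its return format are hypothetical, and the reading-frame test (a deletion whose length is not a multiple of three shifts the frame) is a general property of coding sequences that the text itself does not invoke as the explanation for the split orf8.

```python
# Complete orf8 length in nucleotides, as stated in the text.
ORF8_COMPLETE_NT = 369


def describe_deletion(deleted_nt, complete_nt=ORF8_COMPLETE_NT):
    """Return the remaining gene length and whether the reading frame is kept.

    Hypothetical helper for illustration only. A deletion keeps the
    reading frame only if its length is a multiple of three.
    """
    if not 0 <= deleted_nt <= complete_nt:
        raise ValueError("deletion must be between 0 and the gene length")
    return {
        "remaining_nt": complete_nt - deleted_nt,
        "in_frame": deleted_nt % 3 == 0,
    }


# Deletion sizes mentioned in the text: 5 (bat strain), 29 and 82 (SARS-CoV).
for size in (5, 29, 82):
    print(size, describe_deletion(size))
```

For the 5-nucleotide deletion the sketch gives 364 remaining nucleotides, which matches the length quoted for the split orf8 in the bat strain. None of the three deletion sizes is a multiple of three.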
Considering the variability of orf8 in bats, civets and humans, investigating the function of orf8 is a priority, particularly the contribution of these different variants to viral pathogenesis.

The third variable region is in ORF3. The SARS-CoV genome encodes a 154-amino acid ORF3b, which is an interferon antagonist. Bat SARSr-CoVs and SARS-CoV are highly similar in ORF3a (96.4-98.9% amino acid identity), but bat SARSr-CoVs have different sizes of ORF3b (54-154 amino acids) (a large part of the region encoding ORF3b overlaps with the ORF3a coding region) 20, 70. ORF3b retains the anti-interferon function in some bat SARSr-CoVs but has lost the function in other bat SARSr-CoVs 70. A novel accessory gene, named orfx and located between orf6 and orf7, was identified in the genomes of several bat SARSr-CoVs from Yunnan province 18-20 (Fig. 5). A preliminary study indicated that ORFX is involved in an anti-interferon response 71.

Receptor usage of SARS-CoV and SARSr-CoV

ACE2 binding is a critical determinant for the host range of SARS-CoV 72, 73. Electron microscopic studies have shown that the SARS-CoV S protein forms a clover-shaped trimer, with three S1 heads and a trimeric S2 stalk 74, 75. The RBD is located on the tip of each S1 head. The RBD binds to the outer surface of ACE2, away from its zinc-chelating enzymatic site 77, 141 (Fig. 6a). Different SARS-CoV strains isolated from several hosts vary in their binding affinities for human ACE2 and consequently in their infectivity of human cells 76, 78 (Fig. 6b). The epidemic strain hTor02 was isolated from humans during the late phase of the outbreak in 2002-2003. It has a high affinity for human ACE2 and high infectivity in human cells, and consequently, it was transmitted efficiently between humans 62. Strains cSz02 and cHb05 were isolated from palm civets in 2002-2003 and 2005, respectively.
Both have low affinity for human ACE2 and low infectivity in human cells but have high affinity for civet ACE2 and high infectivity in civet cells 12, 79. Strain hcGd03 was isolated from both humans and palm civets in 2003-2004 and has moderate affinity for human ACE2 and moderate infectivity in human cells; it infected humans but did not transmit between humans 80. Strain hHae08 was isolated from human cell culture and has high affinity for human ACE2 and high infectivity in human cells 81. Understanding the molecular basis for human receptor usage by different SARS-CoV strains is crucial for understanding the cross-species transmission of SARS-CoV and for epidemiological monitoring of potential future outbreaks.

The strain Zhejiang2013 (GenBank No. KF636752) was used as a root. b | By contrast, Middle East respiratory syndrome-related coronaviruses (MERSr-CoVs) form two major viral lineages, L1 and L2. L1 is found in humans and camels, and L2 is found only in camels. Two small clusters, B1 (bat 1) and B2, and one single virus, SA, from South Africa, were found in bats. The phylogenetic tree of MERSr-CoVs is based on published trees 94, 139 and reconstructed using full-genome alignment of all coding regions using the same method as above. HKU4-1 (EF065505) and HKU5-1 (EF065509), two 2c betacoronaviruses, served as the root of the tree. Detailed phylogenetic trees and grouping information can be found in Supplementary Fig. S1. MERS-CoV, Middle East respiratory syndrome coronavirus.

Crystal structures of the SARS-CoV RBD complexed with human ACE2 revealed that the SARS-CoV RBD contains a core structure and a receptor-binding motif (RBM) 82, 141 (Fig. 6a). Two virus-binding hot spots have been identified at the interface of the RBD and human ACE2, centring on ACE2 residues Lys31 (hot spot 31) and Lys353 (hot spot 353) 83, 84 (Fig. 6b).
They both consist of a salt bridge (between Lys31 and Glu35 for hot spot 31 and between Lys353 and Asp38 for hot spot 353); both salt bridges are buried in hydrophobic pockets and contribute a substantial amount of energy to RBD-ACE2 binding as well as filling voids at the RBD-ACE2 interface. Naturally selected RBM mutations all interact with the hot spots (FIg. 6b; TAblE 1) and affect RBD-ACE2 binding. Mutations in RBM residue 479 had an important role in the civet-to-human transmission of SARS-CoV 42, 76, 78, 85 . Residue 479 is an asparagine in strains hTor02, hcGd03 and hHae08 but is a lysine in strain cSz02 and an arginine in strain cHb05 (TAblE 1) . Asn479 is located near hot spot 31, without interfering with the structure of hot spot 31 (rEF. 85 ) (FIg. 6b, c) . However, a change to Lys479 leads to steric and electrostatic interference with hot spot 31, reducing the binding affinity between the SARS-CoV RBD and human ACE2. By contrast, Arg479 reaches the vicinity of hot spot 353 and forms a salt bridge with ACE2 residue Asp38 (rEF. 83 ) (FIg. 6d) . Hence, strains hTor02, hcGd03 and hHae08 (all of which contain Asn479) and strain cHb05 (which contains Arg479) recognize human ACE2 and infect human cells efficiently, whereas strain cSz02 (which contains Lys479) recognizes human ACE2 inefficiently and infects human cells inefficiently. The above structural analyses are supported by biochemical, functional and epidemiological data 42, 76, 78, [83] [84] [85] . Because of residue differences between human ACE2 and civet ACE2, both Asn479 and Lys479 fit well into the interface between the RBD and civet ACE2, although Arg479 fits even better 83, 85 ; consequently, strains hTor02, cSz02, hcGd03 and cHb05 (which contain either Asn479, Lys479 or Arg479) recognize civet ACE2 and infect civet cells efficiently 79 . 
In sum, Asn479 and Arg479 are viral adaptations to human ACE2, whereas Lys479 is incompatible with human ACE2; Arg479 is a viral adaptation to civet ACE2, whereas Asn479 and Lys479 are also compatible with civet ACE2.

Mutations in RBM residue 487 had an important role in the human-to-human transmission of SARS-CoV. Residue 487 is a threonine in strain hTor02 but is a serine in the other strains isolated from humans and civets. The methyl group of Thr487 interacts with hot spot 353 in human ACE2 by providing stacking support for the formation of the salt bridge between Lys353 and Asp38; consequently, strain hTor02 recognizes human ACE2 efficiently and was transmitted between humans during the 2002-2003 SARS epidemic. By contrast, Ser487 cannot provide support to hot spot 353, and hence the other strains isolated from humans and civets recognize human ACE2 inefficiently. Consequently, neither cSz02 nor hcGd03 was transmitted between humans. The above structural analyses are supported by biochemical, functional and epidemiological data 42, 76, 78, 83-85. Because of residue differences between human ACE2 and civet ACE2, Ser487 fits well into the RBD-civet ACE2 interface although still not as well as Thr487 (refs 83, 85); consequently, strains cSz02, hcGd03 and cHb05 (which contain Ser487) recognize civet ACE2 and infect civet cells efficiently 79. In sum, Thr487 is a viral adaptation to both human and civet ACE2, and Ser487 is much more compatible with civet ACE2 than with human ACE2 (Fig. 6b).

RBM residues 442, 472 and 480 also contribute to receptor recognition and host range of SARS-CoV although not as much as residues 479 and 487. Detailed structural, biochemical and functional analyses showed that Phe442, Phe472 and Asp480 are viral adaptations to human ACE2, whereas Tyr442, Leu472 or Pro472, and Gly480 are viral adaptations to civet ACE2 (refs 72, 83).
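The residue-level conclusions above can be collected into a small lookup table. The Python sketch below is illustrative only: the table `ADAPTED` transcribes the adaptations named in the text (one-letter amino acid codes), whereas the function `count_adapted` and the idea of tallying adapted positions are assumptions introduced here. The tally is a bookkeeping aid, not a predictor of binding affinity, and residues described only as "compatible" (for example, Lys479 with civet ACE2) are deliberately not counted as adaptations.

```python
# RBM residues described in the text as viral adaptations to each host's ACE2.
# Keys are RBM positions; values use one-letter amino acid codes.
ADAPTED = {
    442: {"human": {"F"}, "civet": {"Y"}},
    472: {"human": {"F"}, "civet": {"L", "P"}},
    479: {"human": {"N", "R"}, "civet": {"R"}},
    480: {"human": {"D"}, "civet": {"G"}},
    487: {"human": {"T"}, "civet": {"T"}},
}


def count_adapted(rbm_residues, host):
    """Count positions whose residue is listed as adapted to the given host.

    rbm_residues maps an RBM position to a one-letter residue; positions
    missing from ADAPTED are ignored. Hypothetical helper for illustration.
    """
    if host not in ("human", "civet"):
        raise ValueError("host must be 'human' or 'civet'")
    return sum(
        1
        for position, residue in rbm_residues.items()
        if position in ADAPTED and residue in ADAPTED[position][host]
    )


# Residues 479 and 487 as given in the text for two strains.
htor02 = {479: "N", 487: "T"}  # epidemic strain hTor02
csz02 = {479: "K", 487: "S"}   # civet strain cSz02

print(count_adapted(htor02, "human"))  # both positions are human-adapted
print(count_adapted(csz02, "human"))   # neither position is human-adapted
```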
Variability and thus species adaptation mainly affect three severe acute respiratory syndrome coronavirus (SARS-CoV) and SARS-related coronavirus (SARSr-CoV) proteins: the spike protein (S) (both the S1 amino-terminal domain (S1-NTD) and the S1 receptor-binding domain (S1-RBD) show variability), ORF3 (3a and 3b) and ORF8 (8a and 8b). SARS-CoV GD02 and hTor02 represent strains that were isolated from patients during the early, and middle or late phase of the SARS epidemic in 2002-2003, respectively; SARS-CoV SZ3 is a representative of strains isolated from civets in 2003 and 2004 (refs 42, 62). All bat SARSr-CoVs, except HKU3 and Rp3, were discovered in Yunnan province during 2011-2015. On the basis of deletions in the RBD, bat SARSr-CoVs can be divided into two clades. Those without a deletion and thus an identical size in S1 to SARS-CoV can be further divided into four genotypes: genotype 1, represented by WIV16, is highly similar to SARS-CoV in both the NTD and the RBD; genotype 2, represented by WIV1, differs in NTD from SARS-CoV; genotype 3, represented by Rs4231, differs in RBD from SARS-CoV; and genotype 4, represented by SHC014 and Rs4084, differs in both NTD and RBD from SARS-CoV 20. The differences in S influence species-specific receptor binding, whereas differences in the accessory proteins, including potentially the newly discovered ORFX (X), mainly affect immune responses and viral immune evasion. Adapted from ref. 20, CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/).

Salt bridge: A structure in proteins that forms a bond between oppositely charged residues that are sufficiently close to each other to experience electrostatic attraction.
To corroborate the importance of these residues for SARS-CoV binding to either human or civet ACE2, two SARS-CoV S proteins, hOptimize and cOptimize, were rationally designed: the former contains all of the human ACE2-adapted residues (Phe442, Phe472, Asn479, Asp480 and Thr487), whereas the latter contains the civet ACE2-adapted residues (Tyr442, Pro472, Arg479, Gly480 and Thr487). These two S proteins demonstrate exceptionally high affinity for human ACE2 and civet ACE2, confirming that the human ACE2-adapted and civet ACE2-adapted RBM residues help determine SARS-CoV host range 72, 83

Fig. 6 | Receptor recognition by SARS-CoV and MERS-CoV. a | Severe acute respiratory syndrome coronavirus (SARS-CoV) uses its receptor-binding domain (RBD) (as shown in the structure of strain hTor02, containing core structure (cyan) and receptor-binding motif (RBM; magenta)) to bind human angiotensin-converting enzyme 2 (ACE2; green; Protein Data Bank ID: 2AJF). ACE2 is a peptidase with zinc (blue) in its active centre. b | Several residues in the host and viral receptor, as well as two salt bridges that stabilize the structure (dotted lines) and form two binding hot spots, are crucial for binding of the severe acute respiratory syndrome (SARS) epidemic strain hTor02. Hydrophobic residues surrounding the two salt bridges are present in the structure but are not shown in the figure. c | By contrast, the SARS-related coronavirus (SARSr-CoV) strain bWIV1, which was isolated from bats and can infect both civet and human cells, differs in residues 442, 472 and 487. The mutation from threonine to asparagine in residue 487 introduces a polar side chain and is predicted to interfere with binding at hot spot 353. The model shown here was built on the basis of the structure of hTor02 RBD complexed with human ACE2 (Protein Data Bank ID: 2AJF), in which residues 442, 472 and 487 were mutated from those in strain hTor02 to those in strain bWIV1. d | The bat SARSr-CoV strain bRsSHC014 can also infect human and civet cells; it carries an alanine in position 487, and the short side chain of this residue does not support the structure of hot spot 353. The model was built on the basis of the structure of cOptimize RBD complexed with human ACE2 (Protein Data Bank ID: 3SCJ), in which residues 442, 480 and 487 were mutated from those in strain cOptimize to those in strain bRsSHC014. e | The Middle East respiratory syndrome coronavirus (MERS-CoV) RBD (core structure in cyan and RBM in magenta) binds human dipeptidyl peptidase 4 (DPP4; green; Protein Data Bank ID: 4KR0). Structure figures were made using PyMOL 115. Modelled mutations in panels c and d were performed using Coot 140. Panels a-d are adapted from ref. 83

receptor binding, proteolytic cleavage of S and potentially other mutations that affect virion and trimer stability may also be important for virus transmissibility in different hosts, and these factors need to be studied further.

To date, numerous SARSr-CoV strains have been identified from bats 15, 16, 18-20. These bat SARSr-CoVs are the likely progenitors of SARS-CoV that infected humans and civets, and hence understanding their interactions with human or civet ACE2 is critical for tracing the origins of SARS-CoV and for preventing and controlling future SARS-CoV outbreaks in humans. The RBD sequences of these bat SARSr-CoVs fall into three major groups; the representative strains from each group are bHKU3 (isolated in 2005), bWIV1 (isolated in 2013) and bRsSHC014 (isolated in 2013) (Table 1). Strains bWIV1 and bRsSHC014, but not strain bHKU3, use both human and civet ACE2 and hence can infect both human and civet cells 16, 18-20, 86, 87. Strain bHKU3 has a truncated RBM (Table 1), which distorts the structure of the RBM and abolishes its binding to human and civet ACE2.
Neither strain bWIV1 nor strain bRsSHC014 contains truncations in its RBM, and hence, their RBMs likely retain the same structure as SARS-CoV RBMs. Here, we analysed the potential interactions between these two strains (bWIV1 and bRsSHC014) and human ACE2 by building homology structural models of their RBDs complexed with human ACE2, focusing on residues 479 and 487 (FIg. 6c, d) . Strain bWIV1 contains Asn479 and Asn487 in its RBM. Whereas Asn479 is a viral adaptation to human ACE2, the polar side chain of Asn487 may have unfavourable interactions with the aliphatic portion of residue Lys353 in human ACE2, which is part of hot spot 353 (FIg. 6c) . Strain bRsSHC014 contains Arg479 and Ala487 in its RBM. Whereas Arg479 is a viral adaptation to human ACE2, the small side chain of Ala487 does not provide support to the structure of hot spot 353 (FIg. 6d) . Therefore, although both bWIV1 and bRsSHC014 can infect human airway cells, they bind human ACE2 less well than hTor02 and produce less severe symptoms than the epidemic strain of SARS-CoV in vivo 88, 89 . Similarly, both bWIV1 and bRsSHC014 can infect civet cells, but they bind civet ACE2 less well than cSz02. Thus, it is predicted that both strains will be attenuated compared with early-phase or late-phase human SARS epidemic viruses. Future evolution of bat SARSr-CoV strains bWIV1 and bRsSHC014 in crucial RBM residues may allow them to cross the species barriers between bats, civets and humans, posing potential health threats. Whereas the emergence of SARS involved palm civets, most of the early MERS index cases had contact with dromedary camels. Indeed, MERS-CoV strains isolated from camels were almost identical to those isolated from humans [90] [91] [92] [93] [94] [95] . Moreover, MERS-CoV-specific antibodies were highly prevalent in camels from the Middle East, Africa and Asia 13, 14, [96] [97] [98] [99] [100] [101] [102] [103] . MERS-CoV infections were detected in camel serum samples collected in 1983 (rEF. 
100), suggesting that MERS-CoV was present in camels at least 30 years ago.

Genomic sequence analysis indicated that MERS-CoV, Tylonycteris bat coronavirus HKU4 and Pipistrellus bat coronavirus HKU5 are phylogenetically related (denoted as betacoronavirus lineage C) 21. The viruses in this lineage have identical genomic structures and are highly conserved in their polyproteins and most structural proteins, but their S proteins and accessory proteins are highly variable. MERSr-CoVs were found in at least 14 bat species from two bat families, Vespertilionidae and Nycteridae. However, none of these MERSr-CoVs is a direct progenitor of MERS-CoV, as their S proteins differ substantially from that of MERS-CoV 98, 104-106. To understand the evolutionary relationships between MERS-CoV and MERSr-CoVs, we constructed a phylogenetic tree on the basis of the alignment of all the coding regions (Fig. 4b; Supplementary Fig. S1b). The phylogenetic tree contains two main clusters and several small clades or strains. Overall, the genetic diversity within the L1 and L2 viral lineages is low, indicating that humans and camels have been infected by viruses from the same source within a short time period. The L1 viruses include human and camel MERS-CoVs mainly be other yet-to-be-identified viruses that are circulating in nature and directly contributed to the emergence of MERS-CoV in humans and camels. Hopefully, such viruses will be found in bats in the future.

Not surprisingly, recombination events have taken place in the evolution and emergence of MERS-CoV 94, 105, 107-109. Phylogenetic trees constructed using genes encoding orf1ab and S were incongruent with the tree topology of the complete genome, suggesting potential recombination in these genes 108. Numerous recombinations imply that MERS-CoV originated from the exchange of genetic elements between different viral ancestors, including those isolated from camels and the assumed natural host bats 94, 105, 107, 110, 111.
The full-length genomic sequences of MERS-CoVs isolated from humans and camels are almost identical (>99% identity). The major variations are located in S, ORF4b and ORF3, particularly in African camel MERS-CoVs 94 . Substitutions of a few amino acid residues were found in the S protein of some camel MERS-CoVs, but none of them was located in the RBD 94, 112 . Neutralization assays indicated that camel sera that are positive for MERS-CoV can completely neutralize the human MERS-CoV strains, suggesting that MERS-CoVs isolated from humans and camels are antigenically similar to each other 94 . MERS-CoVs from both humans and camels contain variable ORF3 and ORF4 proteins with different lengths owing to either terminal truncations or internal deletions 94 . ORF4b is known to be an interferon antagonist 113, 114 . MERS-CoV isolates from West African camels with a truncated ORF4b gene replicate less efficiently in human cell culture and are less pathogenic in human DPP4 transgenic mice 94 . Curiously, deletion of the orf4 gene in the human MERS-CoV strain EMC did not substantially reduce virus replication, although it induced a stronger interferon response 94 . Another study demonstrated that the deletion of orf3-orf5 dramatically attenuated MERS-CoV virulence, primarily through increased host responses, including disrupted cellular processes, increased activation of the interferon pathway and robust inflammation 115 . To date, bat MERSr-CoVs and human and camel MERS-CoVs share the same genomic structures but differ substantially in their genomic sequences 105, 106, 110, 111, 116 . The highest overall genomic sequence identity between bat MERSr-CoV and human and camel MERS-CoV is ~85%. On the basis of their genomic sequences, several bat MERSr-CoV strains discovered in China, such as Ii-MERSr-CoV, Ve-MERSr-CoV and Hy-MERSr-CoV, have just reached the taxonomic threshold to be considered the same species as MERS-CoV 106, 110, 111 . 
Compared with human and camel MERS-CoV, bat MERSr-CoVs vary most in S and accessory genes. The sequence identity of the S protein between bat MERSr-CoVs and human and camel MERS-CoVs is approximately 45-65%, with even lower sequence identity in the RBD region 110, 111 . The size of these S proteins differs in these strains, mainly because of deletions in their RBD region and/or the S1 and S2 boundary. These deletions are considered to be related to the differences in receptor binding and cell entry 111, 116 . The accessory genes, including those encoding ORF3, ORF4a, ORF4b and ORF5, are also highly variable in length and sequence between bat MERSr-CoVs and human and camel MERS-CoVs, suggesting substantial evolution of these genes in their natural hosts 105, 106, 110, 111, 116 . In contrast to SARS-CoV, which uses ACE2 as its receptor, MERS-CoV uses DPP4. Similar to SARS-CoV S1-CTD, MERS-CoV S1-CTD functions as the viral RBD 10, 117 . Like the SARS-CoV S1-CTD, the MERS-CoV S1-CTD also contains two subdomains, a core structure and an RBM 9,118-120 (FIg. 6e) . The core structures of these two S1-CTDs are similar to each other, with both containing a five-stranded β-sheet as the main scaffold. However, their RBMs differ substantially: whereas the SARS-CoV RBM mainly contains loops, the MERS-CoV RBM mainly contains a four-stranded β-sheet. The structural differences between MERS-CoV and SARS-CoV RBMs account for the different receptor specificities of the two viruses 121 . Like the interactions between SARS-CoV and ACE2, the interactions between MERS-CoV and DPP4 have been extensively examined. DPP4 from humans, camels, horses and bats can function as a receptor for MERS-CoV, whereas DPP4 from mice, hamsters and ferrets cannot 112, [122] [123] [124] [125] . Key residue differences between human DPP4 and the DPP4 from other species affect the species specificities of MERS-CoV. 
For example, two residues (288 and 330) in mouse DPP4 and five residues (291, 295, 336, 341 and 346) in hamster DPP4 are largely responsible for the incompatibility of mouse and hamster DPP4s with MERS-CoV 112, 123 . Mutating these residues to the corresponding residues in human DPP4 makes mouse and hamster DPP4 functional receptors for MERS-CoV. On the other hand, MERS-CoV and MERSr-CoVs have been isolated from camels and bats, respectively. MERS-CoV strains isolated from humans and camels are highly similar to each other, and they both use human DPP4 efficiently 112 . MERSr-CoVs from bats in general share only ~60-70% sequence identity with MERS-CoV in the RBD, and only some of these bat viruses, including HKU4, recognize DPP4 as the receptor 110, 111, 126 . However, they bind DPP4 less efficiently than MERS-CoV. Mutating three residues in the HKU4 RBD (540, 547 and 558) substantially increased its affinity for human DPP4 (rEF. 127 ). Overall, as in the case of SARS-CoV, receptor recognition is a crucial determinant of the host range of MERS-CoV. From 28 October 2016 to 2 May 2017, swine acute diarrhoea syndrome (SADS) was observed in four pig breeding farms in Guangdong province, with a mortality up to 90% for piglets 5 days or younger. A novel HKU2-related bat coronavirus, named SADS-CoV, was identified as the causative agent 34, 128, 129 . The SADS-CoV isolates from piglets of the four farms were almost identical and shared 95% identity with Rhinolophus bat coronavirus HKU2 (rEF. 130 ), indicating the bat origin of this pig virus. Immediately after the SADS outbreak, SADS-related CoVs (SADSr-CoVs) with 96-98% sequence identity to SADS-CoV were detected in 9.8% of anal swabs collected from different Rhinolophus species in Guangdong province during 2013-2016. Although genetically highly similar, bat SADSr-CoVs show high genetic diversity in the S gene, with 72-92% nucleotide and 80-98% amino acid identity to SADS-CoV. 
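Identity percentages such as those quoted above are computed from aligned sequences. As a minimal Python sketch, assuming two sequences that have already been aligned to equal length with "-" as the gap character, pairwise identity can be calculated as follows. The function name `percent_identity` is hypothetical, and the choice to skip gapped columns is only one of several conventions in use; the text does not state which convention the cited studies applied.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must come from the same alignment and have equal length.
    Hypothetical helper for illustration; other tools may count gaps differently.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (not real viral sequences).
print(percent_identity("ACGT", "ACGA"))
print(percent_identity("AC-T", "ACGT"))
```

The same function works for nucleotide or amino acid alignments, which is why the S gene can show different identity values at the two levels.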
Receptor analysis indicated that none of the known coronavirus receptors (ACE2, DPP4 and aminopeptidase N) is essential for SADS-CoV entry 34. The mechanism of transmission of SADS-CoV from bats to pigs and the pathogenesis of bat-originated SADSr-CoVs in pigs need further exploration. This is the first documented spillover of a bat coronavirus to cause severe disease in domestic animals, although molecular evolution data suggest that PEDV also probably originated in bats 37, 38.

Collectively, the data on genetic evolution, receptor binding and pathogenesis indicate that SARS-CoV most likely originated in bats through sequential recombination of bat SARSr-CoVs. Recombination likely occurred in bats before SARS-CoV was introduced into Guangdong province through infected civets or other infected mammals from Yunnan. The introduced SARS-CoV underwent rapid mutations in S and orf8 and successfully spread in market civets. After several independent spillovers to humans, some of the strains underwent further mutations in S and became epidemic during the SARS outbreak in 2002-2003. However, a recent serological investigation revealed antibodies against the SARSr-CoV nucleocapsid in people who lived near a bat cave and had shown no clinical signs of disease, suggesting that the virus can infect humans through frequent contact 131.

A similar scenario might have happened for MERS-CoV. Since the emergence of MERS-CoV in 2012, MERSr-CoVs and related viruses (HKU4 and HKU5) have been found in different bat species on five continents 17, 21, 106, 110, 111, 116, 126, 127, 132. The ORF1ab of these viruses is highly similar to MERS-CoV ORF1ab, but their S proteins are highly diverse. Surprisingly, some bat MERSr-CoVs and HKU4 can use the same receptor, DPP4, as MERS-CoV 110, 111, 126, 127.
Given the massive number of coronaviruses carried by different bat species, the high plasticity in receptor usage and other features such as adaptive mutation and recombination, frequent interspecies transmission from bats to other animals and humans is to be expected.

Currently, no clinical treatments or prevention strategies are available for any human coronavirus. Given the conserved RBDs of SARS-CoV and bat SARSr-CoVs, some anti-SARS-CoV strategies in development, such as anti-RBD antibodies or RBD-based vaccines, should be tested against bat SARSr-CoVs. Recent studies demonstrated that anti-SARS-CoV strategies were effective against WIV1 but not SHC014 (refs 71, 88, 89). In addition, little information is available on HKU3-related strains, which have a much wider geographical distribution and bear truncations in their RBD. Similarly, anti-S antibodies against MERS-CoV did not protect against infection by a pseudovirus bearing the S protein of a bat MERSr-CoV 111. Furthermore, little is known about the replication and pathogenesis of these bat viruses. Thus, future work should focus on the biological properties of these viruses using virus isolation, reverse genetics and in vitro and in vivo infection assays. The resulting data would help in the prevention and control of emerging SARS-like or MERS-like diseases in the future.

It is widely accepted that many viruses have existed in their natural reservoirs for a very long time. The constant spillover of viruses from natural hosts to humans and other animals is largely due to human activities, including modern agricultural practices and urbanization. Therefore, the most effective way to prevent viral zoonosis is to maintain the barriers between natural reservoirs and human society, in keeping with the 'One Health' concept.
Epidemiology and cause of severe acute respiratory syndrome (SARS) in Guangdong, People's Republic of China
Identification of a novel coronavirus in patients with severe acute respiratory syndrome
Aetiology: Koch's postulates fulfilled for SARS virus
A novel coronavirus associated with severe acute respiratory syndrome
Isolation of a novel coronavirus from a man with pneumonia in Saudi Arabia
Angiotensin-converting enzyme 2 is a functional receptor for the SARS coronavirus
Innate immune response of human alveolar type II cells infected with severe acute respiratory syndrome-coronavirus
Molecular basis of binding between novel human coronavirus MERS-CoV and its receptor CD26
Dipeptidyl peptidase 4 is a functional receptor for the emerging human coronavirus-EMC
Reverse genetics with a full-length infectious cDNA of the Middle East respiratory syndrome coronavirus
Isolation and characterization of viruses related to the SARS coronavirus from animals in southern China
Middle East respiratory syndrome coronavirus infection in dromedary camels in Saudi Arabia
Middle East Respiratory Syndrome (MERS) coronavirus seroprevalence in domestic livestock in Saudi Arabia
Severe acute respiratory syndrome coronavirus-like virus in Chinese horseshoe bats
Molecular evolution analysis and geographic investigation of severe acute respiratory syndrome coronavirus-like virus in palm civets at an animal market and on farms
Close relative of human Middle East respiratory syndrome coronavirus in bat
Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor
Isolation and characterization of a novel bat coronavirus closely related to the direct progenitor of severe acute respiratory syndrome coronavirus
Discovery of a rich gene pool of bat SARS-related coronaviruses provides new insights into the origin of SARS coronavirus
This paper identifies a gene pool of SARS-CoVs in bats
Genetic characterization of Betacoronavirus lineage C viruses in bats reveals marked sequence divergence in the spike protein of pipistrellus bat coronavirus HKU5 in Japanese pipistrelle: implications for the origin of the novel Middle East respiratory syndrome coronavirus
Coronavirus diversity, phylogeny and interspecies jumping
Coronaviruses post-SARS: update on replication and pathogenesis
Discovery of seven novel mammalian and avian coronaviruses in the genus deltacoronavirus supports bat coronaviruses as the gene source of alphacoronavirus and betacoronavirus and avian coronaviruses as the gene source of gammacoronavirus and deltacoronavirus
This paper describes coronavirus origins by phylogenetic analysis
A decade after SARS: strategies for controlling emerging coronaviruses
Bat origin of human coronaviruses
SARS and MERS: recent insights into emerging coronaviruses
Epidemiology, genetic recombination, and pathogenesis of coronaviruses
Molecular evolution of human coronavirus genomes
Global patterns in coronavirus diversity
Bat-origin coronaviruses expand their host range to pigs
Coronavirus genome structure and replication
Evolution, antigenicity and pathogenicity of global porcine epidemic diarrhea virus strains
Fatal swine acute diarrhoea syndrome caused by an HKU2-related coronavirus of bat origin
Origin, evolution, and genotyping of emergent porcine epidemic diarrhea virus strains in the United States
Receptor usage and cell entry of porcine epidemic diarrhea coronavirus
Bat coronavirus in Brazil related to appalachian ridge and porcine epidemic diarrhea viruses
Genetic diversity of coronaviruses in bats in Lao PDR and Cambodia
Antibodies to SARS coronavirus in civets
Analysis on the risk factors of severe acute respiratory syndromes coronavirus infection in workers from animal markets
An epidemiologic investigation on infection with severe acute respiratory syndrome coronavirus
This paper describes genetic evolution of SARS-CoV during transmission from animals and humans
Bats are natural reservoirs of SARS-like coronaviruses
Full-length genome sequences of two SARS-like coronaviruses in horseshoe bats and genetic variation analysis
Genomic characterization of severe acute respiratory syndrome-related coronavirus in European bats and classification of coronaviruses based on partial RNA-dependent RNA polymerase gene sequences
Ecoepidemiology and complete genome comparison of different strains of severe acute respiratory syndrome-related Rhinolophus bat coronavirus in China reveal bats as a reservoir for acute, self-limiting infection that allows recombination events
Identification of SARS-like coronaviruses in horseshoe bats (Rhinolophus hipposideros) in Slovenia
Intraspecies diversity of SARS-like coronaviruses in Rhinolophus sinicus and its implications for the origin of SARS coronaviruses in humans
A real-time PCR assay for bat SARS-like coronavirus detection and its application to Italian greater horseshoe bat faecal sample surveys
Novel SARS-like betacoronaviruses in bats, China
Identification of diverse alphacoronaviruses and genomic characterization of a novel severe acute respiratory syndrome-like coronavirus from bats in China
SARS-coronavirus ancestor's footprints in South-East Asian bat colonies and the refuge theory
Diversity of coronavirus in bats from Eastern Thailand
Detection of novel SARS-like and other coronaviruses in bats from Kenya
Longitudinal surveillance of SARS-like coronaviruses in bats by quantitative real-time PCR
The molecular biology of coronaviruses
Moderate mutation rate in the SARS coronavirus genome and its implications
Evidence of the recombinant origin of a bat severe acute respiratory syndrome (SARS)-like coronavirus and its implications on the direct ancestor of SARS coronavirus
ORF8-related genetic evidence for Chinese horseshoe bats as the source of human severe acute respiratory syndrome coronavirus
New insights into the mechanisms of RNA recombination
Generation of coronavirus spike deletion variants by high-frequency recombination at regions of predicted RNA secondary structure
Molecular evolution of the SARS coronavirus during the course of the SARS epidemic in China
Amino acids 270 to 510 of the severe acute respiratory syndrome coronavirus spike protein are required for interaction with receptor
193-amino acid fragment of the SARS coronavirus S protein efficiently binds angiotensin-converting enzyme 2
Expression, post-translational modification and biochemical characterization of proteins encoded by subgenomic mRNA8 of the severe acute respiratory syndrome coronavirus
The 29-nucleotide deletion present in human but not in animal severe acute respiratory syndrome coronaviruses disrupts the functional expression of open reading frame 8
Accessory proteins 8b and 8ab of severe acute respiratory syndrome coronavirus suppress the interferon signaling pathway by mediating ubiquitin-dependent rapid degradation of interferon regulatory factor 3
The 8ab protein of SARS-CoV is a luminal ER membrane-associated protein and induces the activation of ATF6
Open reading frame 8a of the human severe acute respiratory syndrome coronavirus not only promotes viral replication but also induces apoptosis
Bat severe acute respiratory syndrome-like coronavirus ORF3b homologues display different interferon antagonist activities
Cross-neutralization of SARS coronavirus-specific antibodies against bat SARS-like coronaviruses
Receptor recognition and cross-species infections of SARS coronavirus
Animal origins of the severe acute respiratory syndrome coronavirus: insight from ACE2-S-protein interactions
Conformational states of the severe acute respiratory syndrome coronavirus spike protein ectodomain
Cryo-EM structures of MERS-CoV and SARS-CoV spike glycoproteins reveal the dynamic receptor binding domains
Receptor and viral determinants of SARS-coronavirus adaptation to human ACE2
ACE2 X-ray structures reveal a large hinge-bending motion important for inhibitor binding and catalysis
Identification of two critical amino acid residues of the severe acute respiratory syndrome coronavirus spike protein for its variation in zoonotic tropism transition via a double substitution strategy
Natural mutations in the receptor binding domain of spike glycoprotein determine the reactivity of cross-neutralization between palm civet coronavirus and severe acute respiratory syndrome coronavirus
Laboratory diagnosis of four recent sporadic cases of community-acquired SARS, Guangdong Province, China
Mechanisms of zoonotic severe acute respiratory syndrome coronavirus host range expansion in human airway epithelium
The Nidoviruses: Toward Control of SARS and Other Nidovirus Diseases
Mechanisms of host receptor adaptation by severe acute respiratory syndrome coronavirus
A virus-binding hot spot on human angiotensin-converting enzyme 2 is critical for binding of two different coronaviruses
Structural analysis of major species barriers between humans and palm civets for severe acute respiratory syndrome coronavirus infections
Synthetic recombinant bat SARS-like coronavirus is infectious in cultured cells and in mice
Difference in receptor usage between severe acute respiratory syndrome (SARS) coronavirus and SARS-like coronavirus of bat origin
A SARS-like cluster of circulating bat coronaviruses shows potential for human emergence
SARS-like WIV1-CoV poised for human emergence
Middle East respiratory syndrome coronavirus in dromedary camels: an outbreak investigation
This paper provides the first identification of MERS-CoV in camels
Evidence for camel-to-human transmission of MERS coronavirus
Isolation of MERS coronavirus from a dromedary camel
Co-circulation of three camel coronavirus species and recombination of MERS-CoVs in Saudi Arabia
MERS coronaviruses from camels in Africa exhibit region-dependent genetic diversity
Zoonotic origin and transmission of Middle East respiratory syndrome coronavirus in the UAE
Seroepidemiology for MERS coronavirus using microneutralisation and pseudoparticle virus neutralisation assays reveal a high prevalence of antibody in dromedary camels in Egypt
Middle East respiratory syndrome coronavirus neutralising serum antibodies in dromedary camels: a comparative serological study
MERS coronavirus in dromedary camel herd, Saudi Arabia
Antibodies against MERS coronavirus in dromedary camels
Presence of Middle East respiratory syndrome coronavirus antibodies in Saudi Arabia: a nationwide, cross-sectional, serological study
Serologic evidence for MERS-CoV infection in dromedary camels
The prevalence of Middle East respiratory syndrome coronavirus (MERS-CoV) antibodies in dromedary camels in Israel
Middle East respiratory syndrome coronavirus (MERS-CoV): announcement of the Coronavirus Study Group
Rooting the phylogenetic tree of Middle East respiratory syndrome coronavirus by characterization of a conspecific virus from an African bat
MERS-related betacoronavirus in Vespertilio superans bats
MERS-CoV recombination: implications about the reservoir and potential for adaptation
Origin and possible genetic recombination of the Middle East respiratory syndrome coronavirus from the first imported case in China: phylogenetics and coalescence analysis
Evolutionary dynamics of MERS-CoV: potential recombination, positive selection and transmission
Receptor usage of a novel bat lineage C betacoronavirus reveals evolution of Middle East respiratory syndrome-related coronavirus spike proteins for human dipeptidyl peptidase 4 binding
Discovery of novel bat coronaviruses in South China that use the same receptor as Middle East respiratory syndrome coronavirus
Receptor variation and susceptibility to Middle East respiratory syndrome coronavirus infection
The ORF4b-encoded accessory proteins of Middle East respiratory syndrome coronavirus and two related bat coronaviruses localize to the nucleus and inhibit innate immune signalling
Middle East respiratory syndrome coronavirus ORF4b protein inhibits type I interferon production through both cytoplasmic and nuclear targets
MERS-CoV accessory ORFs play key role for infection and pathogenesis
Further evidence for bats as the evolutionary source of Middle East respiratory syndrome coronavirus
Identification of a receptor-binding domain in the S protein of the novel human coronavirus Middle East respiratory syndrome coronavirus as an essential target for vaccine development
The receptor binding domain of the new Middle East respiratory syndrome coronavirus maps to a 231-residue region in the spike protein that efficiently elicits neutralizing antibodies
Structure of MERS-CoV spike receptor-binding domain complexed with human receptor DPP4
Crystal structure of the receptor-binding domain from newly emerged Middle East respiratory syndrome coronavirus
Receptor recognition mechanisms of coronaviruses: a decade of structural studies
Mouse dipeptidyl peptidase 4 is not a functional receptor for Middle East respiratory syndrome coronavirus infection
Host species restriction of Middle East respiratory syndrome coronavirus through its receptor, dipeptidyl peptidase 4
Glycosylation of mouse DPP4 plays a role in inhibiting Middle East respiratory syndrome coronavirus infection
Permissivity of dipeptidyl peptidase 4 orthologs to Middle East respiratory syndrome coronavirus is governed by glycosylation and other complex determinants
Receptor usage and cell entry of bat coronavirus HKU4 provide insight into bat-to-human transmission of MERS coronavirus
Bat origins of MERS-CoV supported by bat coronavirus HKU4 usage of human receptor CD26
A new bat-HKU2-like coronavirus in swine
Discovery of a novel swine enteric alphacoronavirus (SeACoV) in southern China
Complete genome sequence of bat coronavirus HKU2 from Chinese horseshoe bats revealed a much smaller spike gene with a different evolutionary lineage from the rest of the genome
Serological evidence of bat SARS-related coronavirus infection in humans
Middle East respiratory syndrome coronavirus in bats, Saudi Arabia
Evidence supporting a
zoonotic origin of human coronavirus strain NL63
Surveillance of bat coronaviruses in Kenya identifies relatives of human coronaviruses NL63 and 229E and their recombination history
This paper describes the bat origins of two human coronaviruses
Link of a ubiquitous human coronavirus to dromedary camels
Ecology, evolution and classification of bat coronaviruses in the aftermath of SARS
New algorithms and methods to estimate maximum-likelihood phylogenies: assessing the performance of PhyML 3.0
Extensive diversity of coronaviruses in bats from China
MERS coronavirus: diagnostics, epidemiology and transmission
Coot: model-building tools for molecular graphics
Structure of SARS coronavirus spike receptor-binding domain complexed with receptor

All authors researched data for the article, contributed substantially to discussion of the content, wrote the article and reviewed and edited the manuscript before submission.

The authors declare no competing interests.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Nature Reviews Microbiology thanks R. Baric, B. Haagmans and K.-Y. Yuen for their contribution to the peer review of this work.

Supplementary information is available for this paper at https://doi.org/10.1038/s41579-018-0118-9.

International Committee on Taxonomy of Viruses: http://www.ictvonline.org/