key: cord-0855881-pbkwtyks authors: Huang, Szu-Wei; Wang, Sheng-Fan title: SARS-CoV-2 Entry Related Viral and Host Genetic Variations: Implications on COVID-19 Severity, Immune Escape, and Infectivity date: 2021-03-17 journal: Int J Mol Sci DOI: 10.3390/ijms22063060 sha: 248e27822d93a1bfab088d540cb456c07d81f43f doc_id: 855881 cord_uid: pbkwtyks Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has evolved to display particular patterns of genetic diversity in the genome across geographical regions. These variations in the virus and genetic variation in human populations can determine virus transmissibility and coronavirus disease 2019 (COVID-19) severity. Genetic variations and immune differences in human populations could be the driving forces in viral evolution. Recently emerged SARS-CoV-2 variants show several mutations at the receptor binding domain in the spike (S) glycoprotein and contribute to immune escape and enhanced binding with angiotensin 1-converting enzyme 2 (ACE2). Since ACE2 and transmembrane protease serine 2 (TMPRSS2) play important roles in SARS-CoV-2 entry into the cell, genetic variation in these host entry-related proteins may be a driving force for positive selection in the SARS-CoV-2 S glycoprotein. Dendritic or liver/lymph cell-specific intercellular adhesion molecule (ICAM)-3-grabbing non-integrin is also known to play vital roles in several pathogens. Genetic variations of these host proteins may affect the susceptibility to SARS-CoV-2. This review summarizes the latest research to describe the impacts of genetic variation in the viral S glycoprotein and critical host proteins and aims to provide better insights for understanding transmission and pathogenesis and more broadly for developing vaccine/antiviral drugs and precision medicine strategies, especially for high risk populations with genetic risk variants. 
The recently identified severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the pandemic of coronavirus disease 2019 (COVID-19), which began at the end of 2019 and is ongoing [1, 2]. The number of confirmed cases continues to rise rapidly worldwide, with nearly 78% of confirmed cases in the Americas (44.5%) and Europe (34%) (calculated until 22 February 2021) (World Health Organization (WHO) COVID-19 report, https://covid19.who.int/, accessed on 22 February 2021). According to a meta-analysis, the fatality rate of COVID-19 is around 3% [3]. SARS-CoV-2 is the third coronavirus to cause a pandemic, the other two being SARS-CoV [4] and Middle East Respiratory Syndrome Coronavirus (MERS-CoV) [5] in 2003 and 2012, respectively. Genetic evidence from the SARS-CoV-2 genome shows high identity with two bat-derived SARS-like coronaviruses, bat-SL-CoVZC45 (87.6%) and bat-SL-CoVZXC21 (87.5%) [6]. Due to the lack of antiviral drug and vaccine selection pressure, the current genetic diversity patterns of SARS-CoV-2 in different geographical regions may be associated with genetic variation in populations, with increasing genetic diversity in the virus attributed to natural selection driven by long periods of an evolutionary arms race between host and virus [7]. Several studies have demonstrated that there is a positive correlation between host cell genetic variation and the susceptibility to different viruses [8] [9] [10]. Older age, male sex, and some co-morbidities have been found to be risk factors associated with COVID-19 severity; however, those risk factors do not fully explain the differences between asymptomatic, mild, and severe patients [11]. A recent genome-wide association study by Ellinghaus and colleagues showed that the rs11385942 and rs657152 genetic variants were significantly associated with respiratory failure in severe COVID-19 patients.
Notably, rs657152 is located at the ABO blood group gene, and Ellinghaus and colleagues further confirmed that patients with blood group A showed higher risk than others [12]. Another study by Zeberg and Pääbo found that the core haplotype in chromosome 3 is strongly associated with COVID-19 severity. The frequency of this haplotype was found to vary between South Asian (30%), European (8%), admixed American (4%), and East Asian (almost absent) populations [13]. However, future study is required to investigate the significance of this variation on COVID-19 severity. It is believed that human genetic variation can result in different responses to SARS-CoV-2 infection, even with the same age, sex, and health status. SARS-CoV-2 has evolved to contain cumulative mutations in its genome, with the most highly mutated regions being the ORF1ab, spike, and nucleocapsid genes. It is inferred that positive selection contributes to the evolution of SARS-CoV-2 [14, 15]. Several recently emerged SARS-CoV-2 variants, B.1.1.7 lineage (a.k.a. 20B/501Y.V1 Variant and VUI202012/01), B.1.351 lineage (a.k.a. 20C/501Y.V2), P.1/P.2 lineages (descendants of B.1.1.28), and B.1.429, have been found responsible for the dramatic increase of infections in the United Kingdom [16], South Africa [17], Brazil [18], and North America [19], respectively. Viral genome analysis showed these variants to carry multiple mutations in the S glycoprotein, including some at the receptor binding domain (RBD). Some of these mutations are believed to be the result of adaptive evolution and have biological importance. The direct impacts of the mutations in the S glycoprotein of SARS-CoV-2 include affecting the viral transmissibility through interaction with the host cell binding receptor and contributing to the immune escape through changes in the RBD.
The most important host proteins involved in SARS-CoV-2 entry have been identified as angiotensin 1-converting enzyme 2 (ACE2) and cell-surface associated transmembrane protease serine 2 (TMPRSS2). Genetic variations in ACE2 and TMPRSS2 may provide the driving force for viral evolution, therefore causing positive selection for these emerging mutations in the SARS-CoV-2 S glycoprotein. Additionally, dendritic or liver/lymph cell-specific intercellular adhesion molecule (ICAM)-3-grabbing non-integrin (DC/L-SIGN) is known to play vital roles for several pathogens, including SARS-CoV [20]. Therefore, the genetic variation of these host proteins may also affect susceptibility to SARS-CoV-2. Investigating the correlations between genetic variation in populations and viral infectivity or clinical outcomes could provide great insights for developing precision medicine strategies. In this review, we aim to compile knowledge and current advances on the impacts of genetic variations in the viral S glycoprotein and critical host proteins on the susceptibility to SARS-CoV-2 infection and immune escape. This understanding is crucial for controlling the pandemic through enhanced surveillance and vaccine development. SARS-CoV-2 is a coronavirus (CoV): an enveloped, positive-sense ssRNA (~30 kb) virus which belongs to the Betacoronavirus genus, Nidovirales order. Two replicase open reading frames (ORFs), ORF1a (~13.2 kb) and ORF1b (~8.1 kb), occupy at least two-thirds of the CoV genome (Figure 1A). The polyprotein ORF1ab (also known as pp1ab) is translated due to a −1 ribosomal frameshift upstream of the ORF1a stop codon [21]. Polyprotein ORF1a (also known as pp1a) and pp1ab can be further processed into 16 functional non-structural proteins (nsps) by the self-produced nsp5 and nsp3 proteases.
Nsp5 protease (also called 3C-like protease, 3CLpro, Mpro) contains a chymotrypsin-like fold and is responsible for processing nsp4 to nsp16, whereas nsp3 papain-like protease (PLpro) is responsible for processing nsp1 to nsp4 [22]. SARS-CoV nsps have been well studied and characterized for their involvement in the different steps of the virus replication cycle [22].
Figure 1. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomic organization and structure of S glycoprotein. (A) The ORF1a and ORF1b genes can be translated to polyprotein 1a (pp1a) and polyprotein 1ab (pp1ab, −1 ribosomal frameshift). Pp1a and pp1ab can be processed into several functional non-structural proteins (nsps). Structural genes encode four structural proteins: S glycoprotein, envelope protein (E), membrane protein (M), and nucleocapsid protein (N). Several accessory proteins are encoded at the end of the genome, including ORF3a, ORF6, ORF7, ORF8, and ORF10. NTD, N-terminal domain; SD1 and SD2, subdomain 1 and 2; FP, fusion peptide; HR1 and HR2, heptad repeat 1 and 2; TM, transmembrane region. (B) Structure of SARS-CoV-2 S glycoprotein bound to angiotensin 1-converting enzyme 2 (ACE2). S glycoprotein consists of three S glycoprotein monomers which are shown in blue, pink, and light green. ACE2 is shown in green. The interface of SARS-CoV-2 receptor binding domain (RBD)-receptor binding motif (RBM) and ACE2 is enlarged in the right panel. The amino acid positions of RBD and ACE2 responsible for binding are shown in yellow (ACE2) and red (RBD). Structure depicting S glycoprotein bound to ACE2 using PDB: 7A94. Structure depicting RBD bound to ACE2 using PDB: 6VW1. Figures were generated with UCSF ChimeraX software.
Four structural protein genes are located in the 3′-terminal region of the CoV genome. The SARS-CoV-2 S glycoprotein contains a furin recognition cleavage site (polybasic cleavage site, PRRAR) which provides efficient proteolytic processing into S1 and S2 [23].
Additionally, recent evidence has shown that the furin cleavage site of the SARS-CoV-2 S glycoprotein plays a critical role in regulating viral replication and pathogenesis, which could be a potential therapeutic target against SARS-CoV-2 infection [24] [25] [26] . The SARS-CoV-2 RBD, located in the S1 domain, binds to the host cell ACE2 receptor, while S2 functions as the membrane fusion subunit [27, 28] . Within the RBD, the receptor binding motif (RBM) is in close contact with ACE2. Several amino acid positions of the interface between ACE2 and RBM have been found to play important roles for binding through the formation of hydrogen bonds and salt bridges ( Figure 1B ). The envelope (E) protein is involved in virion production and pathogenesis [29] , and the membrane (M) protein plays a pivotal role in mediating virus assembly and budding. In addition, the M protein interacts with the viral nucleocapsid (N) protein for viral RNA packaging and recruits other structural proteins to the endoplasmic reticulum (ER)-Golgi-intermediate compartment [30, 31] . The N protein encapsulates and protects the CoV genome in the virion and then enters the host cells to promote viral replication [32, 33] . Several accessory proteins can be found in CoV, however, though they can affect viral viability and pathogenesis, evidence has shown that accessory proteins are not essential for viral replication [34] . The first step of CoV infection of a target cell is the binding of the viral RBD of the S glycoprotein to the cell membrane receptor, ACE2 ( Figure 1B) . The SARS-CoV-2 RBD and ACE2 binding structure is nearly identical to that of the SARS-CoV RBD [35, 36] . There are several studies which have shown that SARS-CoV-2 is more infectious than SARS-CoV, which may explain why SARS-CoV-2 has caused a more severe pandemic than SARS-CoV [37] [38] [39] [40] . 
Three mechanisms which have been proposed to potentially play a role in this increased SARS-CoV-2 infectivity are: (1) higher binding affinity of RBD to ACE2, (2) less exposed RBD (immune escape), and (3) pre-activation by furin (enhanced viral entry) [41] . Other than ACE2, another protein which has been shown to be important in SARS-CoV-2 entry into the cell is TMPRSS2. The cleavage ability of TMPRSS2 to prime the S glycoprotein during viral attachment plays a vital role in SARS-CoV-2 entry into the cell [27] . Evidence has shown that the furin processing region (amino acid position 675 to 692) has the highest mutation density (number of distinct mutations in the region) [42] . Mutations which occur in this region may provide an advantage to the virus, allowing it to utilize a large number of host proteases to enhance infectivity. The S glycoprotein has also been shown to play pivotal roles in identifying host specificity, viral pathogenesis, and inducing human neutralizing antibodies. Viral surface proteins have been shown as promising targets to generate therapeutic or prevention purpose antibodies [43] . Current studies have focused on the mutation at the RBD (residues 319 to 541) in the SARS-CoV-2 S glycoprotein. Within the interface of the S glycoprotein (RBM) and ACE2, several mutations have been identified. The continuous transmission of SARS-CoV-2 has caused rapid accumulation of mutations in the S glycoprotein across geographical regions. One of the predominant mutations, D614G, was found to be circulating and rapidly spreading outside China in the early pandemic. The recently emerged variants in the United Kingdom (B. [44] . B.1.429, which has recently been found to be spreading rapidly in California, USA, includes three mutations in the S glycoprotein, S13I and W152C in the S1 domain and L452R in the RBD [19] . Those variants have caused a severe increase in SARS-CoV-2 infections since December 2020. 
Notably, the N501Y mutation, which is located in the RBD and has been found in most of the variants, is believed to enhance the transmissibility of SARS-CoV-2. The aforementioned variants all have the D614G mutation, though this is expected due to the predominance of D614G since the early pandemic. Other than the mutations found in those variants, several mutations in S glycoprotein have been found sporadically and contribute to the immune escape and the transmissibility of SARS-CoV-2. The impact of D614G on viral transmissibility has been widely studied due to its emergence in the early pandemic and its worldwide presence. The D614G mutation is of rising concern, as it has the potential to affect SARS-CoV-2 infectivity through changes to RBD structure, S1/S2 subunit interaction, viral entry, and immune response [45] . Becerra-Flores and colleagues have found that patients infected with SARS-CoV-2 containing the D614G mutation have a higher case fatality rate [46] . The effects of the D614G mutation on the SARS-CoV-2 S glycoprotein have been comprehensively investigated by several studies. First, this mutation has been found to be associated with higher viral load in the upper respiratory tract in patients, further confirmed in pseudotyped experiments and animal models [45, 47, 48] . Second, a detailed structure analysis showed that the D614G mutation shifts the conformation of the S glycoprotein to be more open, therefore contributing to enhanced ACE2 binding and fusion efficiency [49] . This conformation change of the S glycoprotein has been found to be important for SARS-CoV-2 binding with ACE2 [50] . Third, the D614G mutation can decrease S1 shedding, which indicates enhanced efficiency of processing by furin-like proprotein convertase [51] . However, the D614G mutation did not show resistance to neutralizing antibodies [52] . 
A recent report indicated that the D614G mutation can potentially affect the glycosylation at residue 616 which may be able to enhance virulence through DC/L-SIGN binding in dendritic cells [53] . Notably, D614G combined with other mutations in S glycoprotein exhibits more infectivity in different cell lines [54] . As there was no available treatment or vaccination selective pressure in SARS-CoV-2 infection in the early pandemic, how the D614G mutation occurred and became predominant outside China is not clear. Our previous study showed that the D614G mutation is significantly associated with the differences in ACE2 expression levels in populations [55] . This study indicates that populations with lower ACE2 expression, such as Europe and Africa, provide the environment for selective pressure for SARS-CoV-2 adaptive evolution. ∆H69/V70 are located at the N-terminal domain (NTD) of the S glycoprotein. The ∆H69/V70 deletion has been found globally, however, tracking of SARS-CoV-2 sequences has shown it to be mainly circulating in Europe. A single round infectivity experiment showed that SARS-CoV-2 carrying either ∆H69/V70 or ∆H69/V70 combined with N501Y can enhance infectivity in 293T/hACE2 cells. Additionally, a virus carrying the ∆H69/V70 mutation exhibits more S glycoprotein incorporated in the virion [56] . A recent study showed that ∆H69/V70 with D796H in the S glycoprotein can potentially contribute to immune escape in immunocompromised patients, while D796H itself decreased the infectivity but contributed to the reduction of susceptibility to neutralizing antibodies [57] . The loss of infectivity caused by D796H could be compensated for in cases where it cooccurs with ∆H69/V70. ∆H69/V70 also frequently co-occurs with N439K or Y453F, which are located at the RBD of the SARS-CoV-2 S glycoprotein. 
The binding affinity of Y453F with ACE2 is controversial, however, it is seen to contribute to the immune escape for neutralizing antibodies and human convalescent sera [58] [59] [60] . The N439K mutation could enhance the binding affinity with ACE2 through the formation of a new salt bridge and has resistance to some neutralizing antibodies and human convalescent sera [61] . S477N is located in the RBD and has been found to enhance binding with ACE2 [58] . The E484K mutation in the RBD of S glycoprotein is of rising concern due to its emergence in several current variants which cause severe transmission. A current study showed that E484K could enhance the binding with ACE2 through a conformational change of the S glycoprotein [62] . L452R, located at the RBD, has been shown to increase the infectivity by stabilizing the S glycoprotein and ACE2 interaction [63] [64] [65] . N501Y is another mutation of rising concern due to its co-occurrence in several current SARS-CoV-2 variants in the United Kingdom and South Africa. N501Y is also located in the RBD of the S glycoprotein and could potentially affect binding with ACE2. The N501Y mutation emerged in infected wild-type mice at early passage and is believed to be the result of adaptive evolution of the SARS-CoV-2 virus [66] . Studies using a comprehensive scanning approach [58] and in silico methods [67] have shown that the N501Y mutation can increase the binding affinity for ACE2. The resulting enhanced binding affinity may be due to additional hydrogen bonds with ACE2 at residues Y41 and K353 [68] and may contribute to a more open conformation of the RBD in the S glycoprotein [69] . The P681H mutation is juxtaposed to the furin processing site (amino acid position 682 to 685). The furin processing of the S glycoprotein into S1/S2 is an important step for virus fusion into cells [70] , however, whether the P681H mutation could affect viral infectivity and efficiency of furin processing needs further investigation. 
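The positional statements above (a mutation lies in the RBD, or next to the furin processing site) can be checked mechanically. The following is a minimal sketch, assuming only the residue ranges quoted in this review (RBD 319 to 541, furin processing region 675 to 692, furin processing site 682 to 685); the function names `parse_substitution` and `annotate` are hypothetical helpers introduced for illustration, and deletions such as ∆H69/V70 are deliberately not handled.

```python
import re

# Residue ranges as quoted in this review (S glycoprotein numbering).
RBD = (319, 541)
FURIN_PROCESSING_REGION = (675, 692)
FURIN_SITE = (682, 685)

# Single amino acid substitution written as <reference><position><alternate>, e.g. "D614G".
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label):
    """Split a label such as 'N501Y' into ('N', 501, 'Y')."""
    match = _SUBSTITUTION.match(label)
    if match is None:
        raise ValueError(f"not a simple substitution: {label!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def _within(pos, bounds):
    return bounds[0] <= pos <= bounds[1]


def annotate(label):
    """Return the named regions (from the ranges above) that contain the mutated residue."""
    _, pos, _ = parse_substitution(label)
    tags = []
    if _within(pos, RBD):
        tags.append("RBD")
    if _within(pos, FURIN_PROCESSING_REGION):
        tags.append("furin processing region")
    if _within(pos, FURIN_SITE):
        tags.append("furin site")
    return tags


if __name__ == "__main__":
    for name in ["N439K", "L452R", "E484K", "N501Y", "D614G", "P681H"]:
        print(name, annotate(name))
```

With these ranges, P681H falls inside the 675 to 692 processing region but one residue outside the 682 to 685 site itself, which matches its description as juxtaposed to the furin processing site.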
V1176 is located at the stalk domain of the S glycoprotein. The flexible stalk domain is necessary for viral entry and fusion into the cells [71] . According to a molecular dynamics simulations analysis, the V1176F mutation could enhance the flexibility of the S glycoprotein by increasing motility and inducing compactness [72] . Additionally, evidence has shown that V1176F is associated with higher patient mortality [72, 73] . The mutations N331Q and N343Q could disrupt the N-glycosylation site of the S glycoprotein and strongly decrease the viral infectivity, however, there is no current circulating SARS-CoV-2 carrying those mutations [54, 74] (Table 1) . The ∆Y144 and the L242/L244 deletions (∆L242/244) are located at the NTD of the S glycoprotein and show a loss of binding ability with neutralizing antibodies [75, 76] . Starr and colleagues mapped the mutations in the RBD of SARS-CoV-2 which could escape neutralization by the antibodies used to treat COVID-19 patients, Regeneron's REGN-COV2 cocktail (consisting of two antibodies, REGN10933 and REGN10987, emergency use authorization for treatment of COVID-19) and Eli Lilly's LY-CoV016 antibody (also known as CB6 or JS016, phase 3 clinical trials). They found that E406W can escape the neutralization by the REGN-COV2 cocktail. K417N can escape the neutralization by several monoclonal antibodies including LY-CoV016. N439K and N440D can escape the neutralization by the REGN10987 antibody. Y453F, F486L, Y489H, and Q493K also escape the neutralization by REGN10933 [77] . K417N is one of the major mutations found in B.1.351 which has also been recently shown to escape the neutralization by monoclonal antibodies [52, 77, 80] . A recent study used free energy perturbation calculations to show that the combination of N501Y and K417N could enhance the binding with ACE2 while dramatically decreasing the binding with antibodies [81] . 
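The escape findings attributed to Starr and colleagues above amount to a lookup from RBD mutation to the antibody it evades. The sketch below encodes only the pairs named in this paragraph; the names `ESCAPE_MAP` and `escaped_antibodies` are hypothetical and introduced for illustration, and the table is not a complete escape map (for example, K417N is described as escaping several monoclonal antibodies, of which only LY-CoV016 is named here).

```python
# Mutation -> antibodies it is reported to escape, limited to the pairs named in the text.
ESCAPE_MAP = {
    "E406W": {"REGN-COV2 cocktail"},
    "K417N": {"LY-CoV016"},
    "N439K": {"REGN10987"},
    "N440D": {"REGN10987"},
    "Y453F": {"REGN10933"},
    "F486L": {"REGN10933"},
    "Y489H": {"REGN10933"},
    "Q493K": {"REGN10933"},
}


def escaped_antibodies(mutations):
    """Map each listed mutation that appears in ESCAPE_MAP to a sorted list of antibodies."""
    hits = {}
    for mutation in mutations:
        if mutation in ESCAPE_MAP:
            hits[mutation] = sorted(ESCAPE_MAP[mutation])
    return hits


if __name__ == "__main__":
    # Illustrative mutation set; only K417N has an entry in the table above.
    print(escaped_antibodies(["K417N", "E484K", "N501Y"]))
```

Mutations without an entry, such as E484K or N501Y in this example, are simply omitted from the result; their absence reflects the limits of this table, not evidence of antibody sensitivity.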
The E484K mutation, located in the RBD, not only enhanced binding with ACE2, it also exhibited strong or moderate resistance to several human neutralizing antibodies and human convalescent sera [52, 76, 78, 80, 82, 83] , which indicates this mutation is important in the viral evolution to escape neutralizing antibodies. L452R can also reduce the sensitivity to several antibodies and human convalescent sera [54, 78] . Several rare mutations (<0.1%) have been found to contribute to the immune escape of neutralization by monoclonal antibodies and human convalescent sera, including N440D, K444N, G446D/V, N450K/Y/D, A475V, G476S, T478I, P479S, F486L, Y489H, Q493K, P499L, and D796H [54, 57, 77, 78] (Table 1) . There are several mutations which appear in more than 0.1% of sequences circulating across geographical regions, including ∆H69/V70 and ∆144 in S1 NTD; K417N/T/V, N439K, L452R, Y453F, S477N, E484K, and N501Y in RBD; D614G in S1 CTD; P681H and V1176F in S2. ∆H69/V70, ∆144, N439K, Y453F, and N501Y are found circulating mainly in Europe. K417N/T/V is carried by B.1.351 and B.1.1.28 which are mainly circulating in Africa and Europe. L452R is carried by B.1.429 which is mainly circulating in North America and Europe. S477N is mainly circulating in Europe and Oceania. E484K is carried by several current circulating variants and is found in Africa (33.8%) and Europe (30.8%), however, there are also more than 10% distributed in North and South Americas (Figure 2 ). P681H is carried by B.1.1.7 and is mainly found circulating in Europe. Notably, SARS-CoV-2 sequences carrying only P681H (excluded B.1.1.7) make up 4.8% of those circulating in North America. V1176F is mainly circulating in South America, however, there are more than 10% of SARS-CoV-2 sequences carrying this mutation in Europe and North America (Table 1) . Several functionally important mutations have been found to be circulating across geographical regions and co-occurring with other variants. 
E484K is one of the most concerning mutations which exhibits increased dynamics in several severe transmission regions. Recent evidence has shown that SARS-CoV-2 carrying E484K or K417N (less prevalent than E484K) in the S glycoprotein could contribute to broad immune escape from monoclonal antibodies and human convalescent sera [52, 76] . The current SARS-CoV-2 variants exhibit high transmissibility and immune escape ability as a result of several co-occurring mutations in the S glycoprotein. Increased transmissibility is conferred by ΔH69/V70 and N501Y in B.
The ACE2 gene contains 18 exons and is located on chromosome X. ACE2 consists of three parts: (1) an N-terminal peptidase domain (residues 19-615), (2) a C-terminal collectrin-like domain (CLD, residues 616-768), and (3) a hydrophobic transmembrane region followed by an intracellular segment of 43 residues [84, 85]. ACE2 is a member of the angiotensin converting enzyme (ACE) family. ACE is a widely distributed protein which converts angiotensin (Ang) I (inactive form) to AngII (active form). This conversion is known to play a vital role in several biological functions, such as controlling blood pressure [86, 87], regulating water and sodium absorption in the kidneys [88], and mediating cell proliferation [89]. ACE2 has been demonstrated to be involved in regulating heart function, hypertension (HT), diabetic heart disease, and dyslipidemia [90]. Several studies have shown that polymorphisms of ACE2 are significantly associated with blood pressure in different populations [91, 92]. Additionally, COVID-19 patients who have HT, heart disease, and diabetes are associated with severe infections and clinical outcomes [11, 93]. A previous study on a group 2 coronavirus demonstrated that the correlation between viral receptor genetic variation and viral binding activity can affect host susceptibility [94]. Similarly, the relationship between human immunodeficiency virus type 1 (HIV-1) gp120 and the CD4 T cell co-receptor CCR5 is another well-known example of a receptor polymorphism affecting viral entry. In individuals carrying the CCR5∆32 polymorphism (a 32 bp deletion in CCR5), HIV-1 entry into host cells is blocked and infection is prevented [9]. For CoV, the polymorphisms (three missense variants and one deletion) of the functional receptor dipeptidyl-peptidase 4 (DPP4/CD26) of MERS-CoV have recently been demonstrated to reduce the interaction with the S glycoprotein [95].
In addition, different expression levels of ACE2 have been demonstrated to be positively correlated with SARS-CoV and NL63 (another human related respiratory coronavirus) infection [96] . Jia and colleagues have shown that a point mutation (L584A) in ACE2 can facilitate SARS-CoV entry into the host cell [97] . Hence, the genetic variation of ACE2 between different populations may contribute to susceptibility to SARS-CoV-2. According to a recent report by Darbani and colleagues, 34 ACE2 variants have been defined with importance for SARS-CoV-2 entry and infection [98] . The ACE2 allele frequencies included six interaction-booster variants (S19P, I21V, K26R, T27A, N64K, and H378R) and eight interaction-inhibitor variants (E37K, N51D, K68E, F72V, M82I, G326E, Q388L, and P389H) which have been shown to vary significantly between populations. In addition, Darbani and colleagues showed that more than half of the variants were found in males, which may explain previous clinical observations showing higher mortality rates in males [99, 100] . However, eight ACE2 variants located at the binding interface showed no disruption of the interaction between ACE2 and the RBD [101] . Cao and colleagues recently found two ACE2 intron variants and ten other protein intron variants (located within or near the ACE2 gene, three from CLTRN, five from CA5B, and two from an unknown gene) which showed association with higher ACE2 expression levels by genetic analysis of expression quantitative trait loci (eQTLs) [102] . Intriguingly, nine of twelve intron variants showed significantly higher allele frequencies in Asian populations when compared to others (African, European, and American). Notably, most of these intron variants were located at the CLTRN and the CA5B genes. Future study is required to clarify the correlation between the ACE2 gene regulatory network and genetic variation. 
Another systematic ACE2 genetic analysis by Stawiski and colleagues used structural computational analysis to identify nine ACE2 variants which could increase susceptibility to SARS-CoV-2 and 17 ACE2 variants which displayed protective roles against SARS-CoV-2 infection [103]. The missense variants of ACE2 identified by recent studies, including key residues of ACE2 binding with SARS-CoV-2 RBD and residues with potential to affect binding, are summarized in Table 2. All of the ACE2 missense variants in Table 2 are rare variants (<0.01 allele frequency), and most of the rare missense variants are distributed in European populations. However, based on our current knowledge, the ACE2 rare missense variants in the population do not disrupt the interaction with the SARS-CoV-2 S glycoprotein. Hashizume and colleagues identified seven ACE2 missense variants which exist in Asian but not in American and European populations. They further demonstrated that these ACE2 missense variants have a limited effect on SARS-CoV-2 infectivity in vitro [104]. In addition to the missense variants, overall ACE2 expression level is another factor which could affect SARS-CoV-2 transmissibility. According to current studies, the ACE2 genetic variants with high allele frequencies are associated with the higher expression level of ACE2 in Asian populations. ACE2 expression is found to be significantly lower in North America, Europe, and Africa, in decreasing order [105]. Additionally, the difference in ACE2 expression is significantly correlated with the prevalence of the D614G variant across geographical regions [55]. Further biological study is required to confirm the relationship between the emergence of the D614G mutation and varying ACE2 expression levels. The TMPRSS2 gene contains 14 exons and is located on chromosome 21. TMPRSS2 is mainly expressed on the luminal side of the prostate epithelium.
The expression level is regulated by androgens, and overexpression of TMPRSS2 can be found in prostate cancer tissue [106] . In addition to its proteolytic activity, TMPRSS2 has been shown to be a critical cofactor in the fusion of influenza viruses and coronaviruses with target cells [107, 108] . A case-control genetic study identified two single nucleotide polymorphisms that are associated with high expression of TMPRSS2, and individuals who carry these polymorphisms were shown to be more susceptible to influenza virus infection [109] . TMPRSS2 is also essential for SARS-CoV-2 S glycoprotein priming [27] . The first question is whether the higher mortality rates in SARS-CoV-2 infected males are due to androgen-dependent TMPRSS2 expression. However, there is no difference in the expression level of TMPRSS2 between males and females in lung tissue [110] . The second question is whether genetic variation within TMPRSS2 could affect its expression level, protein structure, and function, and thereby affect individual susceptibility to SARS-CoV-2 infection. A systematic investigation of TMPRSS2 variants identified 13 intron variants, two exon variants (coding regions), and six 3′ UTR variants that can affect TMPRSS2 structure and function. rs12329760 and rs75603675 (both missense variants) potentially affect TMPRSS2 structure and post-translational modifications, respectively. The six 3′ UTR variants (rs456142, rs462574, rs456298, rs12627374, rs12473206, and rs75036690) potentially affect miRNA target activity [111] . Recently, another four variants, rs464397, rs469390, rs2070788, and rs38351, have been shown to increase TMPRSS2 expression and to have higher allele frequencies in European and American populations than in Asian populations [112] . The TMPRSS2 expression level is significantly lower in Africa due to genetic variability, which could possibly explain the lower number of reported infection cases in Africa [105] .
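As a bookkeeping aid, the TMPRSS2 variant counts reported in [111] can be tallied in a few lines of Python. This is only a minimal sketch: the dictionary names and the helper `total_variants` are hypothetical labels introduced here, and the numbers and rsIDs are copied from the text above rather than from the underlying dataset.

```python
# Variant counts per gene region, as quoted in the text from [111].
# The dictionary and function names are illustrative, not from the source.
tmprss2_variant_counts = {"intron": 13, "exon": 2, "3'UTR": 6}

# rsIDs that the text names explicitly, grouped by the annotation given there.
named_variants = {
    "missense": ["rs12329760", "rs75603675"],
    "3'UTR": [
        "rs456142", "rs462574", "rs456298",
        "rs12627374", "rs12473206", "rs75036690",
    ],
}


def total_variants(counts):
    """Sum the per-region counts into one overall number of variants."""
    return sum(counts.values())


if __name__ == "__main__":
    print("Total TMPRSS2 variants:", total_variants(tmprss2_variant_counts))
    for region, ids in named_variants.items():
        print(region, len(ids), ", ".join(ids))
```

The total obtained this way matches the 21 TMPRSS2 variants that the concluding section singles out for follow-up.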
Taken together, based on current evidence, genetic variations of ACE2 and TMPRSS2 are believed to affect individual susceptibility to SARS-CoV-2. However, a large scale clinical epigenetic study is needed to further confirm the effect of genetic variation on susceptibility to SARS-CoV-2 infection.

DC-SIGN is a C-type lectin receptor expressed on dendritic cells. A DC-SIGN related receptor called L-SIGN (also known as CD209L or DC-SIGNR) is expressed on lymph node and liver cells. The function of DC/L-SIGN is to recognize high mannose glycans on cell and pathogen surfaces [113, 114] . Moreover, DC/L-SIGN binding to viral surface proteins can affect viral pathogenesis [115] . Notably, DC/L-SIGN can bind the SARS-CoV S glycoprotein and facilitate virus transmission. Both L-SIGN and ACE2 are expressed on human type II alveolar cells, which suggests that SARS-CoV can use both as entry receptors [116] . A previous study on SARS-CoV by Han and colleagues showed that seven glycosylation sites on the S glycoprotein play a vital role in DC/L-SIGN mediated virus entry [117] . Several studies have demonstrated that the allele frequency distribution of the DC-SIGN (CD209) promoter variant (rs4804803, -336A>G) is strongly associated with the pathogenesis of HIV-1, Mycobacterium tuberculosis, and dengue virus infection [118] [119] [120] . Furthermore, Chan and colleagues showed that -336G is a protective allele for SARS-CoV infection [121] . Notably, the -336G allele frequency is significantly lower in Asian populations than in others (Asian (0.070) vs. African (0.426), South Asian (0.190), European (0.211), American (0.164), other (0.210), and global (0.244)). Therefore, it is speculated that the -336G allele may be positively associated with SARS-CoV-2 severity. Other than the -336G allele, homozygosity of L-SIGN has also been found to play a protective role in SARS-CoV infection [20] .
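To make the size of these population differences concrete, the -336G allele frequencies quoted above can be compared directly. The sketch below is illustrative only: the function names are hypothetical, and the carrier estimate assumes Hardy-Weinberg proportions, which is an assumption added here and not a claim made in the cited studies.

```python
# -336G allele frequencies as quoted in the text (population labels as given).
freq_336g = {
    "Asian": 0.070,
    "African": 0.426,
    "South Asian": 0.190,
    "European": 0.211,
    "American": 0.164,
    "other": 0.210,
    "global": 0.244,
}


def fold_vs(reference, freqs):
    """Allele frequency of each population relative to a reference population."""
    ref = freqs[reference]
    return {pop: f / ref for pop, f in freqs.items()}


def carrier_fraction(q):
    """Expected fraction carrying at least one copy of an allele with
    frequency q, ASSUMING Hardy-Weinberg proportions: 1 - (1 - q)^2."""
    return 1.0 - (1.0 - q) ** 2


if __name__ == "__main__":
    folds = fold_vs("Asian", freq_336g)
    for pop, q in freq_336g.items():
        print(f"{pop:12s} q={q:.3f} fold_vs_Asian={folds[pop]:.2f} "
              f"carriers~{carrier_fraction(q):.3f}")
```

Under that assumption, roughly 13.5% of individuals in the Asian group would carry at least one -336G allele, compared with about two thirds in the African group; real genotype frequencies may deviate from these estimates.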
Future case-control genetic studies are required to elucidate the correlation between DC/L-SIGN genetic variation and susceptibility to SARS-CoV-2 infection.

The COVID-19 pandemic has caused more than 111 million confirmed cases and more than 2.4 million deaths globally as of 22 February 2021, since the first case was reported from Wuhan, China. Confirmed cases and deaths are rising quickly, and the fast evolution and transmission of SARS-CoV-2 has generated several particular mutations across geographic regions [122] [123] [124] . Since there were no vaccine- or treatment-based selective pressures in the early pandemic, host genetic variability could have driven adaptive evolution by selecting for increased genetic diversity in SARS-CoV-2 across geographical regions. Mutations in the S glycoprotein have been shown to enhance viral transmissibility and immune escape ability; however, no current mutations are known to increase viral pathogenicity or COVID-19 severity. The recent emergence of B1. show variants with several mutations in the S glycoprotein, especially within the RBD. Some mutations have been found to enhance viral infectivity (∆H69/V70, N501Y, and P681H) or contribute to immune escape (∆Y144, ∆L242/244, E484K, L452R, and N501Y). Recent evidence has shown that K417N, E484K, and N501Y emerge under antibody selection pressure in in vitro cell culture experiments, suggesting that these mutations are important for SARS-CoV-2 immune escape evolution [52] . Current COVID-19 vaccines seem to maintain intact neutralization activity against B.1.1.7; however, a remarkable decrease in neutralization activity against B.1.351 has been seen using sera from vaccinees and monoclonal antibodies [76] . The decrease in neutralizing activity is believed to be caused by the E484K mutation in the S glycoprotein.
SARS-CoV-2 carrying the aforementioned mutations, which have been found to co-occur with other variants and are circulating across geographic regions, should be monitored, as these mutations contribute to decreased sensitivity to several clinically used monoclonal antibodies and to human convalescent sera. SARS-CoV-2 shows different patterns of nonsynonymous genetic diversity across the world, possibly driven by genetic variation across human populations. To understand the role of host entry factors for SARS-CoV-2, future study should first focus on the correlation between the 21 genetic variants of TMPRSS2 and susceptibility to SARS-CoV-2 in human populations. For DC/L-SIGN, future study is required to understand the correlation between genetic variants and the severity of COVID-19, especially focusing on -336G and the homozygous/heterozygous forms of L-SIGN. Continued and enhanced surveillance, monitoring of evolutionary changes of SARS-CoV-2 in different populations, and an understanding of the impact of mutations on viral transmissibility and immune escape ability are urgently needed to provide guidance on controlling and measuring transmission. Additionally, accelerating the roll-out of COVID-19 vaccines to the public is urgently needed to prevent SARS-CoV-2 from evolving to escape the current vaccines.

Coronaviridae Study Group of the International Committee on Taxonomy of Viruses.
The species Severe acute respiratory syndrome-related coronavirus: Classifying 2019-nCoV and naming it SARS-CoV-2
A pneumonia outbreak associated with a new coronavirus of probable bat origin
Novel coronavirus disease (COVID-19) pandemic: A recent mini review
Identification of a novel coronavirus in patients with severe acute respiratory syndrome
Isolation of a novel coronavirus from a man with pneumonia in Saudi Arabia
Genomic characterisation and epidemiology of 2019 novel coronavirus: Implications for virus origins and receptor binding
Rules of engagement: Molecular insights from host-virus arms races
C-type lectins DC-SIGN and L-SIGN mediate cellular entry by Ebola virus in cis and in trans
Resistance to HIV-1 infection in caucasian individuals bearing mutant alleles of the CCR-5 chemokine receptor gene
Altmeyer, R. C-type lectins L-SIGN and DC-SIGN capture and transmit infectious hepatitis C virus pseudotype particles
Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: A retrospective cohort study
Genomewide Association Study of Severe Covid-19 with Respiratory Failure
The major genetic risk factor for severe COVID-19 is inherited from Neanderthals
Large scale genomic analysis of 3067 SARS-CoV-2 genomes reveals a clonal geo-distribution and a rich genetic variations of hotspots mutations
Genome-Wide Identification and Characterization of Point Mutations in the SARS-CoV-2 Genome
Early transmissibility assessment of the N501Y mutant strains of SARS-CoV-2 in the United Kingdom
Sixteen novel lineages of SARS-CoV-2 in South Africa
Resurgence of COVID-19 in Manaus, Brazil, despite high seroprevalence
Emergence of a Novel SARS-CoV-2 Variant in Southern California
Homozygous L-SIGN (CLEC4M) plays a protective role in SARS coronavirus infection
Characterization of an efficient coronavirus ribosomal frameshifting signal: Requirement for an RNA pseudoknot
The Nonstructural Proteins Directing Coronavirus RNA Synthesis and Processing
The spike glycoprotein of the new coronavirus 2019-nCoV contains a furin-like cleavage site absent in CoV of the same clade
Loss of furin cleavage site attenuates SARS-CoV-2 pathogenesis
Structure of Furin Protease Binding to SARS-CoV-2 Spike Glycoprotein and Implications for Potential Targets and Virulence
Cleavage Site in the Spike Protein of SARS-CoV-2 Is Essential for Infection of Human Lung Cells
SARS-CoV-2 Cell Entry Depends on ACE2 and TMPRSS2 and Is Blocked by a Clinically Proven Protease Inhibitor
Structural and Functional Basis of SARS-CoV-2 Entry by Using Human ACE2
Coronavirus envelope protein: Current knowledge
Coronavirus M proteins accumulate in the Golgi complex beyond the site of virion budding
Studies on membrane topology, N-glycosylation and functionality of SARS-CoV membrane protein
Interactions between coronavirus nucleocapsid protein and viral RNAs: Implications for viral transcription
The nucleoprotein is required for efficient coronavirus genome replication
Accessory proteins of SARS-CoV and other coronaviruses
Structure of the SARS-CoV-2 spike receptor-binding domain bound to the ACE2 receptor
Structure of SARS coronavirus spike receptor-binding domain complexed with receptor
High Contagiousness and Rapid Spread of Severe Acute Respiratory Syndrome Coronavirus 2. Emerg
Comparative replication and immune activation profiles of SARS-CoV-2 and SARS-CoV in human lungs: An ex vivo study with implications for the pathogenesis of COVID-19
Functional assessment of cell entry and receptor usage for SARS-CoV-2 and other lineage B betacoronaviruses
Cryo-EM structure of the 2019-nCoV spike in the prefusion conformation
Cell entry mechanisms of SARS-CoV-2
Human SARS CoV-2 spike protein mutations
Neutralization of Virus Infectivity by Antibodies: Old Problems in New Perspectives
Introduction of Brazilian SARS-CoV-2 484K.V2 related variants into the UK
Tracking Changes in SARS-CoV-2 Spike: Evidence that D614G Increases Infectivity of the COVID-19 Virus
SARS-CoV-2 viral spike G614 mutation exhibits higher case fatality rate
Spike mutation D614G alters SARS-CoV-2 fitness
SARS-CoV-2 D614G variant exhibits efficient replication ex vivo and transmission in vivo
Structural and Functional Analysis of the D614G SARS-CoV-2 Spike Protein Variant
Structural basis of receptor recognition by SARS-CoV-2
SARS-CoV-2 spike-protein D614G mutation increases virion spike density and infectivity
mRNA vaccine-elicited antibodies to SARS-CoV-2 and circulating variants
DC/L-SIGNs of Hope in the COVID-19 Pandemic
The Impact of Mutations in SARS-CoV-2 Spike on Viral Infectivity and Antigenicity
Impact of Genetic Variability in ACE2 Expression on the Evolutionary Dynamics of SARS-CoV-2 Spike D614G Mutation
Recurrent emergence and transmission of a SARS-CoV-2 Spike deletion H69/V70. bioRxiv 2021
SARS-CoV-2 evolution during treatment of chronic infection
Deep Mutational Scanning of SARS-CoV-2 Receptor Binding Domain Reveals Constraints on Folding and ACE2 Binding
Antibody cocktail to SARS-CoV-2 spike protein prevents rapid mutational escape seen with individual antibodies
Effect of RBD (Y453F) mutation in spike glycoprotein of SARS-CoV-2 on neutralizing IgG affinity
Circulating SARS-CoV-2 spike N439K variants maintain fitness while evading antibody-mediated immunity
Soon-Shiong, P. Molecular dynamic simulation reveals E484K mutation enhances spike RBD-ACE2 affinity and the combination of E484K, K417N and N501Y mutations (501Y.V2 variant) induces conformational change greater than N501Y mutant alone, potentially resulting in an escape mutant
Systemic effects of missense mutations on SARS-CoV-2 spike glycoprotein stability and receptor-binding affinity
Mutations Strengthened SARS-CoV-2 Infectivity
Transmission, infectivity, and antibody neutralization of an emerging SARS-CoV-2 variant in California carrying a L452R spike protein mutation
Adaptation of SARS-CoV-2 in BALB/c mice for testing vaccine efficacy
SARS-CoV-2 Variants with a Focus at the ACE2-Spike RBD Interface
Impact of South African 501.V2 Variant on SARS-CoV-2 Spike Infectivity and Neutralization: A Structure-based Computational Assessment
Modelling conformational state dynamics and its role on infection for SARS-CoV-2 Spike protein variants
A Unique Protease Cleavage Site Predicted in the Spike Protein of the Novel Pneumonia Coronavirus (2019-nCoV) Potentially Related to Viral Transmissibility
In situ structural analysis of SARS-CoV-2 spike reveals flexibility mediated by three hinges
Large-scale population analysis of SARS-CoV-2 whole genome sequences reveals host-mediated viral evolution with emergence of mutations in the viral Spike protein associated with elevated mortality rates
Two mutations in the SARS-CoV-2 spike protein and RNA polymerase complex are associated with COVID-19 mortality risk
Site-specific characterisation of SARS-CoV-2 spike glycoprotein receptor binding domain
Recurrent deletions in the SARS-CoV-2 spike glycoprotein drive antibody escape
Antibody Resistance of SARS-CoV-2 Variants
Prospective mapping of viral mutations that escape antibodies used to treat COVID-19
Identification of SARS-CoV-2 spike mutations that attenuate monoclonal and serum antibody neutralization
Genetic Characteristics and Phylogeny of 969-bp S Gene Sequence of SARS-CoV-2 from Hawaii Reveals the Worldwide Emerging P681H Mutation
Complete Mapping of Mutations to the SARS-CoV-2 Spike Receptor-Binding Domain that Escape Antibody Recognition
The N501Y and K417N mutations in the spike protein of SARS-CoV-2 alter the interactions with both hACE2 and human derived antibody: A Free energy of perturbation study
Escape from neutralizing antibodies by SARS-CoV-2 spike protein variants
Comprehensive mapping of mutations in the SARS-CoV-2 receptor-binding domain that affect recognition by polyclonal human plasma antibodies
A novel angiotensin-converting enzyme-related carboxypeptidase (ACE2) converts angiotensin I to angiotensin 1-9
Collectrin, a collecting duct-specific transmembrane glycoprotein, is a novel homolog of ACE2 and is developmentally regulated in embryonic kidneys
The role of ACE2 in cardiovascular physiology
Exploring the structure and function of zinc metallopeptidases: Old enzymes and new discoveries
The renin-angiotensin-aldosterone system and the kidney: Effects on kidney disease
Aspects of gene polymorphisms in cardiovascular disease: The renin-angiotensin system
The renin-angiotensin system: Going beyond the classical paradigms
Correlation of angiotensin-converting enzyme 2 gene polymorphisms with stage 2 hypertension in Han Chinese
The combination of ACE I/D and ACE2 G8790A polymorphisms revels susceptibility to hypertension: A genetic association study in Brazilian patients
Preliminary Estimates of the Prevalence of Selected Underlying Health Conditions Among Patients with Coronavirus Disease 2019-United States
Mouse susceptibility to mouse hepatitis virus infection is linked to viral receptor genotype
Polymorphisms in dipeptidyl peptidase 4 reduce host cell entry of Middle East respiratory syndrome coronavirus
ACE2 receptor expression and severe acute respiratory syndrome coronavirus infection depend on differentiation of human airway epithelia
Ectodomain shedding of angiotensin converting enzyme 2 in human airway epithelia
The Expression and Polymorphism of Entry Machinery for COVID-19 in Human: Juxtaposing Population Groups, Gender, and Different Tissues
Chinese Center for Disease Control and Prevention. The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) in China
Clinical impact of COVID-19 on patients with cancer (CCC19): A cohort study
Interaction of the spike protein RBD from SARS-CoV-2 with ACE2: Similarity with SARS-CoV, hot-spot analysis and effect of the receptor polymorphism
Comparative genetic analysis of the novel coronavirus (2019-nCoV/SARS-CoV-2) receptor ACE2 in different populations
Human ACE2 receptor polymorphisms predict SARS-CoV-2 susceptibility
Population-Specific ACE2 Single-Nucleotide Polymorphisms Have Limited Impact on SARS-CoV-2 Infectivity In Vitro
Genetic variability in the expression of the SARS-CoV-2 host cell entry factors across populations
The androgen-regulated type II serine protease TMPRSS2 is differentially expressed and mislocalized in prostate adenocarcinoma
Proteolytic activation of influenza viruses by serine proteases TMPRSS2 and HAT from human airway epithelium
Efficient activation of the severe acute respiratory syndrome coronavirus spike protein by the transmembrane protease TMPRSS2
Identification of TMPRSS2 as a Susceptibility Gene for Severe 2009 Pandemic A(H1N1) Influenza and A(H7N9) Influenza
Serendipity or Opportunity for Intervention? Cancer Discov. 2020
First comprehensive computational analysis of functional consequences of TMPRSS2 SNPs in susceptibility to SARS-CoV-2 among different populations
Genetic variants that influence SARS-CoV-2 receptor TMPRSS2 expression among population cohorts from multiple continents
The physiological role of DC-SIGN: A tale of mice and men
Structural basis for selective recognition of oligosaccharides by DC-SIGN and DC-SIGNR
The SIGNs for infection
DC-SIGN and DC-SIGNR interact with the glycoprotein of Marburg virus and the S protein of severe acute respiratory syndrome coronavirus
Specific asparagine-linked glycosylation sites are critical for DC-SIGN-and L-SIGN-mediated severe acute respiratory syndrome coronavirus entry
Association of DC-SIGN promoter polymorphism with increased risk for parenteral, but not mucosal, acquisition of human immunodeficiency virus type 1 infection
Promoter variation in the DC-SIGN-encoding gene CD209 is associated with tuberculosis
A variant in the CD209 promoter is associated with severity of dengue disease
Association of a single nucleotide polymorphism in the CD209 (DC-SIGN) promoter with SARS severity
Evidence of Increasing Diversification of Emerging SARS-CoV-2 Strains
Comparative insight into the genomic landscape of SARS-CoV-2 and identification of mutations associated with the origin of infection and diversity
Genomic Diversity and Hotspot Mutations in 30,983 SARS-CoV-2 Genomes: Moving Toward a Universal Vaccine for the

The authors wish to thank the staff from Kaohsiung Medical University Hospital and Center for Tropical Medicine and Infectious Disease, Kaohsiung Medical University for their technical assistance. We thank C.H. Yen at the Kaohsiung Medical University for insightful discussions and advice. S.W.H. was supported by a Cancer Research Training Award from the National Cancer Institute (NCI).
The content of this publication does not necessarily reflect the views or policies of the NCI, National Institutes of Health, or Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government. The authors declare no conflict of interest.