key: cord-0957535-3rb915cn authors: Cruz, Juliana de O.; Conceição, Izabela M. C. A.; Sousa, Sandra Mara B.; Luizon, Marcelo R. title: Functional prediction and frequency of coding variants in human ACE2 at binding sites with SARS‐CoV‐2 spike protein on different populations date: 2020-06-03 journal: J Med Virol DOI: 10.1002/jmv.26126 sha: 9d10e8bb748ffaaf15487884fa941304165f18fa doc_id: 957535 cord_uid: 3rb915cn Hussain et al. (1) identified 17 natural coding variants of human angiotensin‐converting enzyme 2 (ACE2) located at positions important for the binding of ACE2 to the SARS‐CoV‐2 spike protein. They suggested that a positive prognosis of COVID‐19 may be due to the existence of ACE2 variants such as rs73635825 and rs143936283 in some individuals, and proposed screening the frequencies of candidate alleles in different populations to predict the prognosis of COVID‐19 (1). We contribute further data for these 17 ACE2 variants using other function prediction tools. Moreover, we searched for the minor allele frequency (MAF) of these ACE2 variants as reported in different populations and discuss their use in population genetic studies. This article is protected by copyright. All rights reserved.
Hussain et al. suggested that a positive prognosis of COVID-19 may be due to the existence of ACE2 variants such as rs73635825 and rs143936283 in some individuals, and that their findings provide clues for screening the frequencies of candidate alleles in different populations to predict the prognosis of COVID-19 (1).
We would like to contribute further data for these 17 ACE2 variants from other function prediction tools, and to discuss their use in population genetic studies. Combining functional predictors could yield more reliable findings, mainly for missense variants (2), as previously reported (3). Therefore, we searched for these ACE2 variants using other recommended predictors. No data were reported by ClinVar or ClinPred, but rs73635825, rs143936283, rs4646116 and rs146676783 were predicted as tolerated by FATHMM, using the inherited disease algorithm to analyze protein missense variants. In comparison, SIFT also predicted rs73635825 as tolerated but predicted rs146676783 as damaging (1). Both variants were predicted as probably damaging by PolyPhen-2, but as likely benign by CADD and REVEL (1). The identified ACE2 variants have functional importance in the binding of the viral spike protein to the ACE2 receptor, but the reported allele frequencies are < 1% (1). Table 1 shows the minor allele frequency (MAF) in the different populations where these 17 ACE2 variants have already been found. Notably, all of them are considered rare (MAF < 1%) genetic variants (4, 5). Rare variants do not necessarily contribute a large fraction of the genetic variance underlying complex traits (6). Much of the variance can be expected to be due to rare alleles only for diseases that are caused primarily by strongly deleterious mutations (6). In this context, we understand their goal in selecting coding variants in the region where the SARS-CoV-2 spike protein binds the ACE2 receptor (1). However, this increases the chance of selecting rare (MAF < 1%) and/or low-frequency (MAF 1-5%) variants (4, 5), which tend to be population or sample specific (6, 7). Conversely, population genetic studies usually focus on common variants (MAF ≥ 5%) that are expected to be found in different populations.
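The frequency classes used above can be made concrete with a minimal sketch. It assumes the thresholds quoted in the text (rare: MAF < 1%; common: MAF ≥ 5%) and, as an assumption of this sketch, treats low-frequency as 1% ≤ MAF < 5%. The function name `classify_maf` and the example values are hypothetical and are not taken from Table 1.

```python
def classify_maf(maf: float) -> str:
    """Classify a variant by its minor allele frequency (MAF, as a fraction).

    Thresholds follow the text: rare < 1%, common >= 5%.
    The low-frequency band (1% <= MAF < 5%) is an assumption of this sketch.
    """
    if not 0.0 <= maf <= 0.5:
        # A minor allele cannot, by definition, exceed a frequency of 0.5.
        raise ValueError("MAF must lie between 0 and 0.5")
    if maf < 0.01:
        return "rare"
    if maf < 0.05:
        return "low-frequency"
    return "common"


# Hypothetical MAF values for illustration only; NOT taken from Table 1.
example_mafs = {"variant_A": 0.0003, "variant_B": 0.02, "variant_C": 0.12}
classes = {name: classify_maf(maf) for name, maf in example_mafs.items()}
print(classes)
```

Under this scheme, a variant reported with MAF < 1% in every population where it has been observed falls in the "rare" class regardless of the population considered.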
Moreover, a previous study that analyzed coding variants of ACE2 found no direct genetic evidence supporting the existence of coronavirus S-protein binding-resistant ACE2 mutants in different populations (8). However, it is important to highlight that this study did not include the variants identified by Hussain et al. (1), and that further investigations of ACE2 polymorphisms are warranted (8). Indeed, the majority of associations with low-frequency and rare variants demonstrate relatively small effects on complex traits and disease (9). Despite limitations of power and resolution, rare-variant association studies are becoming increasingly mature and have been carried out with multiple alleles of different genes in many different populations (9). For example, rare-variant analytical approaches allow the identification of genes containing an excess of rare and presumably deleterious variation among cases ascertained for complex disease traits, relative to controls (10). SARS-CoV-2 has been in contact with humans for little evolutionary time, so there has not been enough time for selective pressure to act. In addition, not all the variants described were reported as deleterious (1), their effects are unknown, and they did not pass the filters used by these approaches to qualify a variant (10). Therefore, although different approaches are being developed to evaluate the use of rare variants in population studies (10), rare variants may not be suitable markers for population genetic studies and demographic differentiation related to COVID-19. For example, the low MAF of these 17 rare ACE2 variants (Table 1) may hinder analyses such as linkage disequilibrium maps, which assess the non-random association of alleles at different loci and could be used to examine the correlation between ACE2 variants as a factor of resistance or susceptibility to COVID-19 in different populations.
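One way to see why low MAF can hinder linkage disequilibrium (LD) analysis is to look at the standard pairwise measures D = p_AB - p_A p_B and r² = D² / (p_A (1 - p_A) p_B (1 - p_B)). The sketch below computes them and the largest r² attainable for a given pair of allele frequencies. The function names and the frequencies used are hypothetical illustrations, not values from Table 1 or from the cited studies.

```python
def ld_stats(p_ab: float, p_a: float, p_b: float):
    """Return (D, r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying both alleles of interest.
    p_a, p_b: frequencies of those alleles at the first and second locus.
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    d = p_ab - p_a * p_b
    r2 = d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))
    return d, r2


def max_r2(p_a: float, p_b: float) -> float:
    """Largest r^2 attainable given only the two allele frequencies."""
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly between 0 and 1")
    # Haplotype frequencies must stay non-negative, which bounds D on both sides.
    d_pos = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)  # largest positive D
    d_neg = min(p_a * p_b, (1.0 - p_a) * (1.0 - p_b))  # magnitude of most negative D
    d_max = max(d_pos, d_neg)
    return d_max * d_max / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Hypothetical frequencies: a rare variant (0.1%) and a common variant (20%).
print(max_r2(0.001, 0.2))
# Two variants with equal frequency can, in principle, be in complete LD.
print(max_r2(0.2, 0.2))
```

With these illustrative frequencies, the rare variant cannot reach an r² above roughly 0.004 with the common variant, even if every copy of the rare allele sits on the same haplotype as the common allele. This is only a frequency-based upper bound; actual LD estimates would also depend on sample size and phasing.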
Moreover, the absence of reported pathogenicity for these ACE2 variants according to several predictors suggests that they may not provide protection at a level that explains the differences in infection rates and mortality among the affected countries, as described by the World Health Organization.
References
1. Structural Variations in Human ACE2 may Influence its Binding with SARS-CoV-2 Spike Protein.
2. Missense variant pathogenicity predictors generalize well across a range of function-specific prediction challenges.
3. Candidate genes identified by whole-exome sequencing in preeclampsia families: insights into functional annotation and in-silico prediction of deleterious variants.
4. The UK10K project identifies rare variants in health and disease.
5. Functional architecture of low-frequency variants highlights strength of negative selection across coding and non-coding annotations.
6. The deleterious mutation load is insensitive to recent population history.
7. A map of human genome variation from population-scale sequencing.
8. Comparative genetic analysis of the novel coronavirus (2019-nCoV/SARS-CoV-2) receptor ACE2 in different populations.
9. The impact of rare and low-frequency genetic variants in common disease.
10. Rare-variant collapsing analyses for complex traits: guidelines and applications.
Abbreviations: GnomAD, The Genome Aggregation Database.