key: cord-0707579-sw34iotp
authors: Sheikh, Javaid Ahmad; Singh, Jasdeep; Singh, Hina; Jamal, Salma; Khubaib, Mohd.; Kohli, Sunil; Dobrindt, Ulrich; Rahman, Syed Asad; Ehtesham, Nasreen Zafar; Hasnain, Seyed Ehtesham
title: Emerging genetic diversity among clinical isolates of SARS-CoV-2: Lessons for today
date: 2020-04-24
journal: Infect Genet Evol
DOI: 10.1016/j.meegid.2020.104330
sha: 4ee0488ce59af4533cad65b4676c2e7dafa7eee9
doc_id: 707579
cord_uid: sw34iotp
Abstract: Given the current COVID-19 pandemic, it is imperative to gauge how molecular divergence in SARS-CoV-2 changes over time, for both clinical and epidemiological reasons. Our analysis, based on molecular phylogenetics, is a step toward understanding transmission clusters that can be correlated with the pathophysiology of the disease to gain insight into virulence mechanisms. As infections are increasing rapidly, more divergence is expected, possibly followed by viral adaptation. We identified mutational hotspots that appear to be major drivers of diversity among strains, with the RBD of the spike protein emerging as the key region involved in interaction with ACE2 and consequently a major determinant of infection outcome. We believe that such molecular analyses, correlated with clinical characteristics and host predisposition, need to be evaluated at the earliest to understand viral adaptability, disease prognosis, and transmission dynamics.
Humanity is on the verge of a serious crisis due to the COVID-19 pandemic caused by a novel coronavirus, SARS-CoV-2. The disease was first reported from Wuhan, China, in December 2019 and, in a matter of weeks, spread to almost every part of the world (Zhu et al., 2020). Newly emerging epicentres such as Italy, Iran, Spain and the USA have amplified the crisis. The looming threat of its spread to densely populated countries with fragile health infrastructure makes it even more urgent to implement population-centred approaches to understand this outbreak and fight the pandemic. Next-generation sequencing (NGS) has helped the community to explore the evolutionary diversity and mutational propensity of the virus. The efforts of the medical and scientific fraternity have been unprecedented, with genetic data becoming available at exhilarating speed along with rapid sharing of epidemiological and clinical data, all aiming to translate these into effective interventions (Baden and Rubin, 2020). We analysed the genome sequences of available clinical isolates (n=250+) to comprehend the phylogenetic diversity of the isolates (Shu and McCauley, 2017). The mutations arising during human-to-human spread could provide insights into transmission dynamics and, combined with clinical and epidemiological data, could help predict disease prognosis (Wertheim, 2020). Moreover, the mutational propensity of RNA viruses can impede the development of interventions, which is an emerging challenge for the scientific community.
Our analyses, employing advanced machine learning approaches, revealed a great deal of genetic diversity emerging among clinical isolates (Figure 1a). One striking observation is the diversity of viral strains on every continent (Figure 1b). There was, however, no geographical clustering of the isolates, and every continent seems to have had multiple introductions of different viral strains. Earlier pandemics had geographical signatures in their sequences that aided the tracing of new infections (Miller et al., 2009). Such markers are not yet apparent in the current pandemic, possibly because of rampant globalisation, and their absence is a major bottleneck in tracing lineages. The markedly higher-than-average transmission and mortality in the Lombardy region of Italy, compared with other European countries, the African continent or even China, could not be correlated with any specific molecular divergence pattern and is very likely a reflection of advanced age and co-morbidity. Intriguingly, we observed the 5' end of the genome to be more variable and prone to mutations than the 3' end. It appears that ORF1ab, spike, ORF3a and E are key drivers of diversity among strains, with the RBD of spike emerging as a mutational hotspot (Figure 1c). Because the interaction of the RBD with ACE2 could determine the outcome of infection, these mutations may be clinically important. Our phylogenetic analyses reveal at least five different clades circulating to date, and more divergence is expected with time (Figure 1d). We believe that these emerging mutations, once corroborated with viral pathogenesis, clinical characteristics and epidemiological correlates, would be valuable in predicting disease progression and in tracing pathogen mobility and re-emergence.
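The text does not spell out how per-site variability was scored, so the following is only an illustrative sketch of one common way to locate mutational hotspots and to compare the two halves of a genome. It assumes the isolates have already been aligned to equal length and uses Shannon entropy per alignment column as the variability measure. The function names (`site_entropy`, `variability_profile`, `half_means`) and the toy sequences are hypothetical and are not taken from the study.

```python
from collections import Counter
from math import log2


def site_entropy(column):
    """Shannon entropy (bits) of one alignment column; gaps and ambiguous bases are ignored."""
    bases = [b for b in column.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    total = len(bases)
    counts = Counter(bases)
    return -sum((n / total) * log2(n / total) for n in counts.values())


def variability_profile(aligned):
    """Per-position entropy for a list of aligned, equal-length sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to equal length")
    # zip(*aligned) walks the alignment column by column
    return [site_entropy("".join(col)) for col in zip(*aligned)]


def half_means(profile):
    """Mean entropy of the 5' half and the 3' half of the alignment."""
    if len(profile) < 2:
        raise ValueError("need at least two positions")
    mid = len(profile) // 2
    five_prime = sum(profile[:mid]) / mid
    three_prime = sum(profile[mid:]) / (len(profile) - mid)
    return five_prime, three_prime


# Toy alignment (invented for illustration, not real SARS-CoV-2 data)
toy = ["ACGTACGTAC", "ACGTACGTAC", "ATGTACGTAC", "ATGAACGTAC"]
profile = variability_profile(toy)
hotspots = [i for i, h in enumerate(profile) if h > 0]  # 0-based variable positions
five_prime, three_prime = half_means(profile)
```

On real data, the same profile could be averaged over annotated gene coordinates (for example ORF1ab or the spike RBD) rather than over genome halves; the coordinates would have to come from a reference annotation, which is not given here.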
References:
Baden and Rubin, 2020. Covid-19 - The Search for Effective Therapy.
Miller et al., 2009. The signature features of influenza pandemics - implications for policy.
Shu and McCauley, 2017. GISAID: Global initiative on sharing all influenza data - from vision to reality.
Wertheim, 2020. A glimpse into the origins of genetic diversity in SARS-CoV-2.
Zhu et al., 2020. A Novel Coronavirus from Patients with Pneumonia in China.
Highlights:
• Every continent seems to have multiple introductions of different viral strains.
• The 5' end of the viral genome is more prone to mutations than the 3' end.
• ORF1ab, spike, ORF3a and E are the key proteins prone to mutations.
• The Receptor Binding Domain of the spike protein emerged as a mutational hotspot.
• Our phylogenetic analyses reveal at least five different clades of SARS-CoV-2.