key: cord-0866277-niq3hosc authors: Bajaj, Priyanka; Arya, Prakash Chandra title: Climatic-niche evolution of SARS-CoV-2 date: 2020-10-28 journal: bioRxiv DOI: 10.1101/2020.06.18.147074 sha: 3585d02bb11d627248dbf912bfd23b15f143b3f8 doc_id: 866277 cord_uid: niq3hosc

The COVID-19 pandemic has been studied by experts from several fields. However, it is still unclear why it was restricted to higher latitudes during the initial days and only later cascaded into the tropics. Here, we analyzed 176 SARS-CoV-2 genomes across different latitudes and climates (Koppen’s climate), which provided insights into within-species virus evolution and its relation to abiotic factors. Two genetically variant groups, named G1 and G2, were identified, well defined by four mutations. The G1 group (ancestor) is mainly restricted to the warm and moist temperate climate (Koppen’s C climate), while its descendant G2 group surpasses the climatic restrictions of G1, initially cascading into the neighboring cold climate (D) of higher latitudes and later into the hot climate of the tropics (A). It appears that the gradation from temperate climate (Cfa-Cfb) to “cold climate” (Dfa-Dfb) drives the evolution of G1 into the G2 variant group, which later adapted to tropical climate (A) as well. It seems this virus followed an inverse latitudinal gradient in the beginning due to its preference for temperate (C) and cold (D) climates. Nevertheless, due to the uncertainty of COVID-19 data, the results must be interpreted cautiously and should not be extrapolated to climate types and climatic conditions other than those analyzed here for the early evolution period. Our work shows that virus evolutionary studies combined with climatic studies can provide crucial information about pathogenesis and natural spreading pathways in such outbreaks, which is hard to achieve through individual studies.
Graphical Abstract

In Brief
The authors elucidate the adaptation of SARS-CoV-2 to different climates by studying phylogenetics and the distribution of strains on Koppen’s climate map.

Highlights
• SARS-CoV-2 follows an inverse latitudinal gradient during the initial days.
• A phylogenetic network divides SARS-CoV-2 strains into two variant groups, G1 and G2.
• G1 strains are restricted to Koppen’s “temperate” climate (mainly Cfa-Cfb).
• G2 strains have evolved from G1 to persist mainly in “humid-continental” (Dfa-Dfb) and “tropical-savannah” (Aw) climates.

... cluster from the others is referred to as "virus cluster SNPs" throughout this paper.

Mapping virus strain on the Koppen's climate map
The location of each SARS-CoV-2 strain is obtained from the METADATA file provided in the GISAID database for each viral isolate (

Genomic coordinates in this study are based on the reference genome 21. The SNP positions are based on the reference genome. Nucleotide T represents nucleotide U in the SARS-CoV-2 RNA genome. Mutation at the protein level is not mentioned for SNPs arising in the non-coding region.
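The coordinate conventions stated above (reference-genome positions, T standing in for U, and no protein-level change reported for non-coding SNPs) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The function names `codon_number` and `to_rna` are hypothetical, and the spike CDS coordinates and the example positions 23403 and 241 are assumptions based on the NC_045512.2 annotation, which this text does not itself name.

```python
from typing import Optional, Tuple

# Assumption (not stated in this text): spike (S) CDS coordinates,
# 1-based and inclusive, as annotated on NC_045512.2.
SPIKE_CDS: Tuple[int, int] = (21563, 25384)


def codon_number(genome_pos: int, cds: Tuple[int, int]) -> Optional[int]:
    """Return the 1-based amino-acid position within a gene for a
    1-based genomic position, or None if the position lies outside the
    CDS (so no protein-level mutation would be reported for it)."""
    start, end = cds
    if not start <= genome_pos <= end:
        return None
    return (genome_pos - start) // 3 + 1


def to_rna(seq: str) -> str:
    """Rewrite a DNA-alphabet sequence in the RNA alphabet (T -> U)."""
    return seq.upper().replace("T", "U")


print(codon_number(23403, SPIKE_CDS))  # 614
print(codon_number(241, SPIKE_CDS))    # None (outside the spike CDS)
print(to_rna("ACGT"))                  # ACGU
```

The integer division assumes a single uninterrupted reading frame, so a gene translated with ribosomal frameshifting would need extra handling.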
The amino acid position numbering is according to its position within the specified gene (CDS

References
Coronavirus Acts as a Dominant Immunogen Revealed by a Clustering Region of Novel Functionally and Structurally Defined Cytotoxic T-Lymphocyte
The coronavirus E protein: Assembly and beyond
The SARS coronavirus nucleocapsid protein - Forms and functions
A scientific history of air, weather, and climate
Köppen-Geiger climate classification
A new coronavirus associated with human respiratory disease in
Identification of Severe Acute Respiratory Syndrome Coronavirus Replicase Products and Characterization of Papain-Like Protease
The D614G mutation in the SARS-CoV-2 spike protein reduces S1 shedding and increases infectivity
Spike mutation pipeline reveals the emergence of a more transmissible form of SARS-CoV-2. bioRxiv (2020)
SARS-CoV-2 viral spike G614 mutation exhibits higher case fatality rate
Severe Acute Respiratory Syndrome)
Codon influence on protein expression in E. coli correlates with mRNA levels
Synonymous Mutations and Ribosome Stalling Can Lead to Altered Folding Pathways and Distinct Minima
A periodic pattern of mRNA secondary structure created by the genetic code
Furin cleavage of the SARS coronavirus spike glycoprotein enhances cell-cell fusion but does not affect virion entry
Phylogenetic network analysis of SARS-CoV-2 genomes
Coronavirus Update (Live): 8,522,724 Cases and 453,714 Deaths from COVID-19 Virus Pandemic - Worldometer
Molecular evolutionary genetics analysis across computing platforms
The neighbor-joining method: a new method for reco PubMed result
Prospects for inferring very large phylogenies by using the neighbor-joining method
Travel Weather Averages (Weatherbase)
Climate data for cities worldwide - Climate-Data.org

The phylogenetic study was carried out by P.B. The GIS study and the interpretation of Koppen's climate map were done by P.C.A.
Worldometer data analysis was carried out by both authors.

The area of the pie chart covered by a climate zone is proportional to the percentage of COVID-19 cases occurring in that climate zone, as depicted by black squares. The percentage of COVID-19 cases for NFZ and SSTZ is extremely low and is therefore not shown in the pie chart.

The Temperate Zone is divided into intervals of 7° latitude. The area of the pie chart covered is directly proportional to the percentage of COVID-19 cases occurring in the respective latitude range, as depicted by black squares.

Molecular phylogeny analysis to infer genomic similarities of SARS-CoV-2 and their distribution across different climate zones 19 and Koppen's climate types 20. (a) Genomic architecture of the SARS-CoV-2 genome highlighting four positions; substitutions at these positions enabled the evolution of G1 into G2. (b, e-g) Strains found within a virus cluster (as shown in the phylogenetic tree and mentioned in Table 1) were analysed for significant mutations. The height of each bar is proportional to the percentage of virus strains occurring in the condition labelled on the x-axis. The box in the left panel contains the color code for each climate zone, and the box in the right panel contains the color code for Koppen's climate. The left panel shows the distribution of percent virus strains in different climate zones, and the right panel shows the distribution of percent virus strains in Koppen's climate. (b) Percent virus strains prevailing in different climate zones, stratified by SARS-CoV-2 variant groups. Abiotic factors influencing evolutionary dynamics of phylogenetic virus clusters. (f) Percent of virus strains with high-frequency SNPs in each gene. (g) Type of mutation, i.e. non-synonymous or synonymous, exhibited by viruses.

Each strain is labelled as per the strain ID (1 to 176) within parentheses.
The G1 strains are symbolized by a yellow circle and the G2 strains by a square: pink squares denote the strain cluster (80-115) stable across C, D and A climates, purple squares represent the strain cluster (126-176) stable mainly in D climate, and the remaining G2 strains (blue squares) are stable across C and D climates. Standard Koppen's climate-type symbols are given in the legend; the criteria for distinguishing these climate types are given in Table S3. Table S4 contains the full forms of these symbols. Aw) are of tropical climate, initials with 'B' belong to desert climate. The shades of blue on the map. Shades of yellow and green belong to C climate; shades of red, orange and pink belong to Desert climate.

Global distribution of SARS-CoV-2 strains (n=176) (a) in the coastal and continental regions. (b) The number of virus strains in the G1 population is represented by light grey and the number of virus strains in the G2 population by dark grey.
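The bar-chart quantities described in the captions (percent of virus strains per Koppen climate type, stratified by variant group) can be sketched as below. This is an illustrative sketch rather than the authors' code: the function name `percent_by_climate` is hypothetical, the six records are made-up toy data and not strains from the study, and the percentages are normalised within each variant group, which is only one possible reading of the caption.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def percent_by_climate(
    strains: Iterable[Tuple[str, str]]
) -> Dict[str, Dict[str, float]]:
    """Map (variant_group, climate_type) records to
    {group: {climate_type: percent of that group's strains}}."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for group, climate in strains:
        counts[group][climate] += 1
    result: Dict[str, Dict[str, float]] = {}
    for group, counter in counts.items():
        total = sum(counter.values())
        result[group] = {c: 100.0 * n / total for c, n in counter.items()}
    return result


# Toy records for illustration only (NOT data from the study).
toy_strains = [
    ("G1", "Cfa"), ("G1", "Cfa"), ("G1", "Cfb"), ("G1", "Dfa"),
    ("G2", "Dfb"), ("G2", "Aw"),
]

percentages = percent_by_climate(toy_strains)
print(percentages["G1"])  # {'Cfa': 50.0, 'Cfb': 25.0, 'Dfa': 25.0}
print(percentages["G2"])  # {'Dfb': 50.0, 'Aw': 50.0}
```

Normalising by the total number of strains across both groups would only require replacing `total` with the overall record count.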