key: cord-0994802-ru11bhyb authors: Parasar, J.; Pandey, R. K.; Patel, Y.; Singh, P. P.; Srivastava, A.; Mishra, R.; Kumar, B.; Rai, N.; Mishra, V. N.; Shrivastava, P.; Suravajhala, P.; Chaubey, G. title: Genetic diversity and spatiotemporal distribution of SARS-CoV-2 alpha variant in India date: 2022-04-21 journal: nan DOI: 10.1101/2022.04.20.22274084 sha: 8479fb1b67ce7dcd42b701713d4dd806d5170216 doc_id: 994802 cord_uid: ru11bhyb After the spill to humans, in the timeline of SARS-CoV-2, several positively selected variants have emerged. A phylogeographic study on these variants can reveal their spatial and temporal distribution. In December 2020, the alpha variant of the severe acute respiratory syndrome coronavirus (SARS-CoV-2), which has been designated as a variant of concern (VOC) by WHO, was discovered in the southeastern United Kingdom (UK). Slowly, it expanded across India, with a considerable number of cases, particularly in North India. The study focuses on determining the prevalence and expansion of the alpha variants in various parts of India. The genetic diversity estimation helped us understand various evolutionary forces that have shaped the spatial distribution of this variant during the peak. Overall, our study paves the way to understand the evolution and expansion of a virus variant. The COVID-19 pandemic, caused by the SARS Coronavirus 2 (SARS CoV-2), has impacted the world, with India having the second-highest number of confirmed cases (MOHFW-GoI, n.d.) . The first instances of COVID -19 in India were recorded in Kerala in January 2020 among three medical students who had returned from Wuhan, the pandemic's first epicentre (Wu et al., 2020) . SARS-CoV-2 is a single-stranded, positive-sense RNA (+ssRNA) virus in the Coronaviridae family that belongs to lineage B of the genus Beta-coronavirus. The genome size of SARS-CoV-2 is ~29.9 kb, sharing ~78% sequence homology with SARS-CoV (Qianqian Zhang et al., 2021) . SARS-CoV-2 has been evolving at a rate consistent with the virus gaining two mutations per month in the global population since December 2019 (William T. Harvey et al., 2021) . Mutation has given rise to different variants of SARS-CoV-2. Any virus strain becomes a variant of concern when it increases transmissibility and virulence or decreases its effectiveness towards vaccines (www.who.int, last accessed April 15, 2022.). The principal variants of concern are Alpha (B.1.1.7), Beta (B.1351), Delta (B.16172) and Gamma (P.1). Callaway, 2021) . These VOCs were thought to be more resistant to the neutralizing activity of antibodies produced during natural infection or vaccination and their increased transmission rate (Gobeil et al., 2021) . The viral Spike protein's Receptor Binding Domain (RBD) contains most of the mutations observed in VOCs (S). Mutation N501Y, present in all except the Delta VOC, confers a higher affinity for the cellular ACE2 (Angiotensin-converting Enzyme 2) viral receptor and may be related to increased transmissibility of VOCs carrying this mutation (Rossana C. Jaspe et al., 2021) . After being detected in the UK in no time, the Alpha variant became dominant in other countries. Its rapid success was due to its capacity to weaken our body's first immune defense line, giving the mutant more time to reproduce. Alpha has 23 mutations, eight of which are in the spike protein, that distinguish it from other coronaviruses. N501Y, spike deletion 69-70del, and P681H are the three mutations that have the greatest biological impact. It was also found that there was an 80 fold increase in Orf9b which leads to an increase in Orf9b protein levels in this variant. Orf9b protein of virus antagonizes innate immune response by . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 21, 2022. ; https://doi.org/10.1101 https://doi.org/10. /2022 interacting with Tom 70, a mitochondrial import receptor required for Type I interferon production (Lucy et al., 2021) . Beta and Delta differ from Alpha in terms of the production of Orf9b protein (Singh et al. 2020) . They drive down levels of interferon but to do so, they do not overproduce Orf9 proteins. According to the data obtained from GISAID (GISAID -Initiative, 2008), there is a variation in the number of alpha cases recorded in different parts of India. When the frequency of the Alpha variant was calculated across India, it was found that it is more common in various northern and central Indian states than in the southern states. In February and March 2020, several Indian states had the largest number of Alpha cases, after which India saw the second wave of the pandemic. Phylogenetic examination of samples from multiple Indian states revealed that it arose in the northern parts of the country from where it spread through the country. While affected regions vary in their value of genetic diversity, the relationship between genetic diversity and the number of cases provided a stance for these mutations in the conditions that prevailed in those regions, perhaps resulting in variable diversity in different regions. GISAID, an open source for Covid 19 and influenza viruses, was used to download the genomic data of the alpha variant (GISAID -Initiative, 2008; www.gisaid.org, last accessed April 15, 2022). Multiple Alignment using Fast Fourier Transform (MAFFT, 2020) was used to align genome sequences to reference sequences. Aligned sequences were downloaded in Fasta format and were run on MEGAX (MEGAX, 2021) for the phylogenetic tree. Charts and Graphs were constructed by (Google Sheets, 2009) . Frequency maps were generated by Datawrapper (2012), and haplotype diversity was derived from DnaSP (Julio Rozas et al., 2019) 1131 genomic sequences of the alpha variant were extracted on June 11/2021, with high coverage and aligned to the reference sequence (Wuhan/2019-EPI_ISL_402124). Each state, union territory and Wuhan sequences were labelled with distinct colours to be distinguished on a phylogenetic tree. Alpha variant samples were filtered from all the affected regions, and the frequency percentage was calculated. From December 2020 to May 2021, the frequency was determined separately for each month, and then distribution in different parts of India was evaluated. Its dominance in specific locations of India was discovered using the obtained . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 21, 2022. ; frequency. Samples from each state and union territory were loaded on DnaSP to determine mutations in all impacted regions. The value of haplotype diversity was calculated using Fu and Li's approach (Fu and Li 1993) . Among the available sequences from India, Punjab had the highest number of samples for alpha variant with a frequency of 0.74, followed by Chandigarh (0.467) and Madhya Pradesh (0.296) (Supplementary Table 1 ). According to GISAID data, cases of alpha in India began to appear in a few states in October and November 2020, while the UK's first sample was reported in In order to understand the diversity of this variant in different regions of India, we computed the haplotype diversity (Table 1 ). The diversity of this variant ranged from 0.798 to 0.997 in India. The highest haplotype diversity values were found in Chandigarh and Haryana (0.997), followed by Gujarat and Delhi, 0.995 (Table 1 ). This disparity in the diversity values range among various locations suggests variable viral dynamics in different regions (Table 1 ). Nearly high diversity in Chandigarh, Haryana, Gujarat and Delhi suggested either the multiple incoming or the prolonged presence of this variant and at least two of its lineages in these regions. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 21, 2022. ; https://doi.org/10.1101 https://doi.org/10. /2022 Irrespective of the high frequency of this variant in Punjab, we do not see a high diversity of this variant, similar to Chandigarh and New Delhi (Table 1) . This suggests that the spread of the virus in Punjab was rapid and associated with a limited number of founder events. On the other hand, farmers' protests in Punjab and other regions caused an exponential increase in frequency in Punjab, with the majority of the farmers hailing from Punjab and Haryana. Rallies were staged in several parts of north India, highlighting the critical role of social dynamics in the SARS-CoV-2 pandemic and the rise of the second wave in India. In April, cases of the Delta variant were more prevalent than those of the alpha type, and the frequency percentage in most locations declined. Concerning the alpha variant, there existed a strong link between Delhi and Punjab. The outbreak in Delhi in April 2021 was preceded by outbreaks in Maharashtra, Kerala, and Punjab, and we can say that the outbreak in Delhi in 2021 was fueled by alpha alone (GISAID -Initiative, 2008 ). Thus, this study outlines the need for phylogenetics to understand a virus variant's spatial and temporal distribution. Virus variant alpha was predominant in north India compared to the southern part. The rapid expansion of this variant has had limited founders in Punjab and was likely to be associated with the farmers' agitation. Its moderate presence in the south was likely due to a more virulent strain delta which has eventually taken the complete landscape of India after that. The outcome of this study is dependent on the number of sequences and available samples. However, there are places where proper testing, collecting, sequencing and submission of samples have not been adequately made, which may impact the outcome. All datasets generated for this study are included in the article/Supplementary Material. The authors declare no competing interests. is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 21, 2022. . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted April 21, 2022. ; https://doi.org/10.1101 https://doi.org/10. /2022 Tracking the Emergence of SARS-CoV-2 Alpha Variant in the United Kingdom Genetic Association of ACE2 rs2285666 Polymorphism With COVID-19 Spatial Distribution in India Delta coronavirus variant: Scientists brace for impact Increased transmissibility and global spread of SARS-CoV-2 variants of concern Evolution of enhanced innate immune evasion by the SARS-CoV-2 B.1.1.7 UK variant Multiple alignment program for amino acid or nucleotide sequences MEGAX (10.2.4). (2021) Molecular mechanism of interaction between SARS-CoV-2 and host cells and interventional therapy Risk of mortality in patients infected with SARS-CoV-2 variant of concern 202012/1: Matched cohort study Introduction and rapid dissemination of SARS-CoV-2 Gamma Variant of Concern in Venezuela Tracking SARS-CoV-2 variants Decoding SARS-CoV-2 Hijacking of Host Mitochondria in COVID-19 Pathogenesis SARS-CoV-2 variants, spike mutations and immune escape Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted April 21, 2022. ; https://doi.org/10.1101 https://doi.org/10. /2022