key: cord-0967152-wmpgh4wl authors: Weber, S.; Ramirez, C. C. M.; Weiser, B.; Burger, H.; Doerfler, W. title: SARS-CoV-2 Worldwide Replication Drives Rapid Rise and Selection of Mutations across the Viral Genome: A Time-Course Study - Potential Challenge for Vaccines and Therapies date: 2021-02-06 journal: nan DOI: 10.1101/2021.02.04.21251111 sha: c6ef722743493f01b187eb80f3873919eea62717 doc_id: 967152 cord_uid: wmpgh4wl

Scientists and the public were alarmed by the first viral variant of SARS-CoV-2 reported in December 2020. We have followed the time course of emerging viral mutants and variants during the SARS-CoV-2 pandemic in ten countries. We examined complete SARS-CoV-2 nucleotide sequences in GISAID with sampling extending until January 20, 2021. These sequences originated from ten different countries: United Kingdom, South Africa, Brazil, USA, India, Russia, France, Spain, Germany, and China. Among the novel mutations, some previously reported mutations waned and others increased over time. VUI2012/01 (B.1.1.7) and 501Y.V2 (B.1.351), the UK and South Africa variants, respectively, and two variants from Brazil, 484K.V2, P.1 and P.2, increased in prevalence. Despite lockdowns, worldwide active replication in genetically and socio-economically diverse populations facilitated selection of new mutations. The data on mutant and variant SARS-CoV-2 strains provided here comprise a global resource for easy access to the myriad mutations and variants detected to date globally. Rapidly evolving new variant and mutant strains might give rise to escape variants, capable of limiting the efficacy of vaccines, therapies, and diagnostic tests.

associated with higher transmissibility 9,10 and at least one confirmed case of reinfection 11, leading to lockdowns and travel bans in efforts to contain its spread. On December 23, 2020, the time of the lockdown, the variant was already found in Australia, Denmark and Italy.
As of January 29, 2021, this variant has been reported in 54 countries according to GISAID (https://www.gisaid.org/hcov19-variants). On December 18, 2020 12, another variant of concern, unrelated to the UK variant but also carrying the N501Y mutation, was announced in South Africa and dubbed 501Y.V2 or B.1.351 13. This variant is characterized by 8 mutations in Spike, including K417N, E484K and N501Y 13, 14 (Table 1). As of January 29, 2021, this variant has been reported in 24 countries on 5 continents. Also rising independently are two Brazil variants, now called P.1 and P.2. P.1 has 17 unique amino acid changes, 3 deletions, 4 synonymous mutations and one 4-nucleotide insertion 15 (Table 1). P.1 shares the 501Y mutation and a deletion in ORF1ab with both the UK and the South Africa variants. It is interesting to note that the N501Y mutation was not widespread in Brazil before this variant was described, whereas E484K is more prevalent, although Brazil is not sequencing large numbers of samples. The E484K and N501Y mutations are of particular concern in that they have been suggested to reduce neutralization by antibodies and increase the affinity for ACE2. P.1 and B.1.351 share both mutations, N501Y and E484K (Table 1). P.1 has been associated with a case of documented reinfection 16, and one case has been reported in the United States. P.2, unrelated to P.1, is characterized by the E484K mutation and has been implicated in two cases of reinfection 17, 18. These variants have raised concerns regarding the efficacy of the vaccines. Recently, Wu et al. described the efficacy of the mRNA-1273 vaccine against many spike mutations tested both separately and in combination 19. They show that sera from both vaccinated non-human primates and vaccinated humans are effective against the UK variant and various other spike mutations. They also found neutralization, albeit at lower levels, against the full South Africa variant B.1.351.
It has been shown that the Pfizer BNT162b2 vaccine is effective against the N501Y mutant alone 20 as well as the UK variant B.1.1.7 21. There have also been preliminary data from two other vaccine manufacturers showing efficacy against the South African variant.

medRxiv preprint doi: https://doi.org/10.1101/2021.02.04.21251111; this version posted February 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

To illustrate the rise of mutations and variants over time, we list the number of variants and mutations deposited in GISAID worldwide across time (Figure 1). Table 2 lists the number of variant sequences deposited in GISAID by country. The rapid appearance of the variants across the world illustrates the importance of sequencing viral pathogens and tracking mutations. There is emerging evidence that these variants may alter transmissibility and have the potential to reduce the efficacy of existing COVID-19 vaccines. Sequencing SARS-CoV-2 is both a scientific and clinical imperative 22. Because nucleic acid sequencing of SARS-CoV-2 samples is not part of routine clinical practice at this time, it is necessary to institute programs to monitor sequence variation as a matter of course in order to detect mutations in the viral genome. A consequence of the lack of routine viral sequencing is that it may contribute to selection bias. Sequences deposited to GISAID may not be representative of viral prevalence, as different countries contribute different numbers of sequences. It is also possible that selection bias may be inherent, as different countries deposit sequences at different rates. Further, it was found that the Spike ΔH69/ΔV70 deletion causes the so-called S-dropout, rendering the nucleic acid test (NAT) negative for Spike (S) and positive for nucleocapsid (N).
As this is one of the mutations in B.1.1.7, it has been used as a screening tool for this variant 23. While useful for screening, this deletion might create selection bias because samples from patients who were positive for SARS-CoV-2 with an S dropout may be preferentially sequenced while the prevalence of the new variant is being assessed. Rapid increases in the number and types of new SARS-CoV-2 mutations in the world population within a time span of weeks to months are a remarkable biologic event. The uncontrolled rapid replication of SARS-CoV-2 in an immunologically naïve world population during one year constituted a wake-up call on the need to sequence and track the evolution of novel pathogens, as these mutations and variants have raised concerns regarding increased transmissibility, immune escape, the efficacy of vaccines and the validity of diagnostic tests.

We analyzed complete SARS-CoV-2 genome sequences with known dates of sampling that were downloaded from GISAID: (i) Only complete sequences were included. (ii) For a chosen time period, all complete sequences with a sampling date from each country were included. Sequences were binned according to sampling date. (iii) Sequences were filtered by country using the GISAID interface 24. Nucleotide sequences from the UK, South Africa, Brazil, the US, India, Russia, France, Spain, Germany and China were compared to the reference genome of the SARS-CoV-2 isolate Wuhan-Hu-1, NCBI Reference Sequence: NC_045512.2. The programs Vector NTI Advance™ 11 (Invitrogen™), Tool Align X, or Snapgene (GSL Biotech) were used for the alignment of sequences, applying the algorithm MUSCLE (Multiple Sequence Comparison by Log-Expectation). Amino acid sequences were also analyzed with the program Snapgene. DNA sequence analyses of reverse transcripts of an RNA genome have to be interpreted with the possibility in mind that errors may have been introduced at several steps,
e.g., by preferred reading mistakes of the reverse transcriptase due to specific sequence or structural properties of SARS-CoV-2 RNA. We have tried to overcome this obvious complication by analyzing a large number of genomes. Percentages were calculated by dividing the number of sequences with the mutation that were sampled at that time and available in the database by the total number of complete sequences with a known sampling date. In addition to the determination of mutants for defined time spans in ten countries, the total number of individual mutations was also determined in all sequences deposited to GISAID up until January 20, 2021 by using GESS (Global Evaluation of SARS-COV-2/hCOV-19 Sequences) 21 as well as CoV-Glue 25 and PANGOLIN (Phylogenetic Assignment of Named Global Outbreak LINeages; https://github.com/hCoV-2019/pangolin) 26. In the present study, somewhat arbitrarily, we set a 2% mark of mutations at a given nucleotide in the viral sequence as the cutoff for hotspot status and for recording mutations in Tables 3 to 12.
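The percentage calculation and the 2% hotspot cutoff described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the record layout (a sampling date paired with a set of substitution labels), the function names `mutation_prevalence` and `hotspots`, and the toy records are all assumptions made for illustration, and the cutoff is applied as "strictly greater than 2%", following the ">2%" wording used in the Results.

```python
from collections import Counter
from datetime import date


def mutation_prevalence(records, start, end):
    """Percent of sequences sampled in [start, end] that carry each mutation.

    `records` is an iterable of (sampling_date, mutations) pairs; this layout
    is a hypothetical stand-in for sequences compared to the reference genome.
    """
    in_window = [set(muts) for sampled, muts in records if start <= sampled <= end]
    total = len(in_window)
    if total == 0:
        return {}
    counts = Counter(m for muts in in_window for m in muts)
    return {m: 100.0 * n / total for m, n in counts.items()}


def hotspots(prevalence, cutoff_percent=2.0):
    """Keep only mutations whose prevalence exceeds the cutoff."""
    return {m: p for m, p in prevalence.items() if p > cutoff_percent}


# Invented toy records, for illustration only (not data from this study).
records = [
    (date(2020, 3, 1), {"C241T", "A23403G"}),
    (date(2020, 3, 5), {"C241T"}),
    (date(2020, 3, 9), {"C241T", "A23403G", "G28881A"}),
    (date(2020, 4, 2), {"C241T"}),
    (date(2020, 9, 1), {"A23063T"}),  # falls outside the window used below
]

prev = mutation_prevalence(records, date(2020, 3, 1), date(2020, 4, 30))
for mutation, percent in sorted(hotspots(prev).items()):
    print(f"{mutation}: {percent:.1f}%")
```

Binning by sampling date, as in step (ii) of the Methods, then amounts to calling `mutation_prevalence` once per time window (for example US-I to US-IV) and comparing the resulting percentages across windows.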
The SARS-CoV-2 RNA sequences investigated for mutant status had been deposited at time intervals of 2020 as follows: Brazil: 02/25 to 08/15/2020; China-I: 12/23/2019 to 03/18/2020; China-II: 03/20 to 07/22/2020; France: April to 09/12/2020; Germany-I: February to 03/23/2020; Germany-II: February to 06/17/2020; Germany-III: 06/24 to 08/28/2020; Germany-IV 09/10 to 10/13; India: 01/27 to 05/27/2020 and 06/03 to 07/04/2020; Russia: 03/24 to 06/07/2020; South Africa: 09/01 to 12/07/2020; Spain: 06/01 to 09/20/2020; UK: 01/29 -12/04/2020; US-I: 02/29 to 04/26/2020; US-II: 06/12 to 07/07/2020; US-III: 07/09 to 07/22/2020; US-IV 08/01 to 12/01. Some of the data had been reported previously in Table 1 of Weber et al. 2020 1 , but were included here again for comparison. These data were designated with an asterisk. We examined mutations in 383,570 complete sequences with known sampling dates in GISAID up until January 20, 2021. Figure 1 shows the worldwide distribution of Spike mutations as well as other variants of interest over time from April 2020 to January 20, 2021. Table 1 lists the signature mutations for the variants. Table 2 shows the total number of each variant of interest (B.1.1.7 (the UK Variant), 501Y.V2 (the South African Variant) and 484K.V2 (B.1.1 lineage with S: E484K/D614G, V1176F N: A199S/R203K/G204R) deposited in GISAID by each country as of January 20, 2021. Selection of novel mutations in humans was rapid and frequent in 2020. Among the novel mutations discovered in the current study, some were seen only in one country and others occurred in several different countries. We will present the identified mutations arising in the SARS-CoV-2 RNA country by country for the designated time periods (Tables 3 to 12) . 
The data covering time course analyses of the appearance of mutations and their nature in most of the ten different countries are presented in Tables 3A to 12A. The corresponding B Tables summarize the total number of mutations at individual sequence positions at a cutoff of 2% preponderance for the time period 01/19/2020 to 01/20/2021, i.e. for the entire first Covid-19 year. The following paragraphs document the mutational repertoire of SARS-CoV-2 in different regions of the world. The results are somewhat biased in that countries differed considerably in the number of sequences that had become available for inspection in the GISAID database (www.gisaid.org) 24. We have emphasized the time course of appearance of novel mutations in SARS-CoV-2 isolates that had a history of vigorous replication in some of the most severely affected populations on the globe, such as the UK, South Africa, Brazil, the US, India, Russia, France, Spain, Germany and China. The most recent update [January 30, 2021] of Covid-19 cases and fatalities in the ten countries whose isolates were analyzed for mutations is presented in Table 13. For mutations arising in the UK, we have not followed the time course of emerging mutations during earlier periods of the pandemic. In a total of >71,000 viral isolates of SARS-CoV-2 genomes from around the world that were deposited between 01/19/2020 and 01/20/2021, 4 of the prevalent mutations found worldwide, at positions 241, 3,037, 14,408 and 23,403, had reached almost 100% representation (Table 3).
In a total of 70 sequence positions, >2% deviations in comparison to the Wuhan reference were noted; >50% were C to U (T) transitions (see also Tables 3 to 11B). Twelve novel mutations reached prevalence values between 15% and 49%, 7 of them around 49%. Several of these mutations were also found in other countries (Tables 4 to 12). High prevalence of new mutations correlated with active replication in countries of high Covid-19 incidence. On December 8, 2020, Rambaut et al. 5 described a novel variant of SARS-CoV-2 that had been circulating in England since October and was increasing in prevalence, suggesting a possible increase in transmissibility 9, 10, 22. An analysis of its genome revealed 14 non-synonymous mutations and 3 deletions that comprised a few nucleotides. Six of these mutations and 2 of the deletions were located in the spike glycoprotein, one of them N501Y due to an A23063T replacement. This particular variant is now considered a variant of concern, VOC202012/01 22. Current reports have described increased infectivity of this variant, whereas its pathogenicity is currently being assessed 10.

(ii) We analyzed 95 SARS-CoV-2 sequences from viral isolates in South Africa that were deposited in the GISAID databank [Table 4A]; 28 mutations overall were found in those sequences. Four of the 7 prevalent mutations, known from isolates all over the world, had reached 100% representation in the SARS-CoV-2 sequences, except those at positions 1,059 (~10%), 25,563 (~10%), and 28,881 (~63%). There were 7 new mutations unique to the South African variant, four of which caused non-synonymous amino acid exchanges.
Twelve of the novel mutations were shared with other countries; eight of these mutations led to amino acid exchanges, many of them to non-synonymous replacements. Twenty-five percent of the mutations affected the spike glycoprotein, a finding that should alert us to the capacity of the virus to respond to potential vaccines directed against the viral spikes. One mutation each involved the viral endoRNAse and the RNA-dependent RNA polymerase. For the entire year 2020 (January 19, 2020 to January 20, 2021), the four prevalent mutations at positions 241, 3,037, 14,408, and 23,403 were again represented close to 100%, and the mutation at 28,881/2/3 in the nucleocapsid phosphoprotein gene at about 70% [Table 4B]. There were 8 new mutations at >10% prevalence. In a total of 63 positions in the viral genome, deviations from the Wuhan reference sequence were noted above the 2% cutoff. Recently, the N501Y variant was detected in South Africa, which also had two additional point mutations, K417N and E484K. Data about its possible increased infectivity and transmissibility were preliminary 29. This variant, called 501Y.V2 or B.1.351, also known as the South African variant and likewise reported in December 2020, is characterized by 8 lineage-defining mutations, 3 of them in the receptor binding domain: K417N, E484K and N501Y. This variant also appeared to spread quickly in South Africa, giving rise to travel bans from South Africa. It has been suggested that this variant is able to escape neutralization by donor plasma 30. Increased transmissibility has also been suggested 29. Furthermore, there is early evidence that the efficacy of multiple existing vaccines against the B.1.351 variant may be diminished 19, 28, 31. It will be important to continue to perform sequence analysis of viral strains and to correlate the evolution of mutants and variants with viral transmission and vaccine efficacy.
Of the nine SARS-CoV-2 mutations identified in a subset of about 100 published sequences available from Brazil in one time frame between 02/25 and 08/15 [Table 5A], five belonged to the worldwide prevalent hotspots at nucleotide numbers 241, 3,037, 14,408, 23,403, and 28,881. Two mutations at positions 12,053 and 25,088 were unique to the sequences from Brazil, and were noted in between 15.7 and 34.4% of the analyzed sequences, respectively. Two of the novel shared mutations were also identified in sequences from France and Russia (27,299 and 29,148) at frequencies of about 40%. The mutation at nucleotide position 28,881 was found in 71.6% of the viral sequences studied. This mutation occurred in viral sequences from all countries investigated, except in those from China. Of note, among the nine different new mutations observed in the SARS-CoV-2 isolates from Brazil, two were not observed in isolates from any of the eight other countries investigated. Possibly, they had recently emerged in the Brazilian population in which the virus had been replicating very actively, and the mutations had been selected under conditions of pandemic viral abundance. C → T mutations accounted for 44.4% of the mutations in this selection. Note that the cutoff of the time-course analysis preceded the reported emergence of the variant strains P.1 and P.2. We include the related 484K.V2 variant in Table 5B along with the number of individual mutations for all complete sequences with known sampling dates deposited to GISAID by January 20, 2021.
Impact on Coding Capacity: The two Brazil-unique mutations at positions 12,053 (viral replicase) and 25,088 (viral spike protein) led to leu to phe and val to phe synonymous replacements, respectively. The two novel shared mutations at positions 27,299 (ORF6 protein) and 29,148 (nucleocapsid phosphoprotein) both caused ile to thr replacements of a nonconservative nature. Table 5B shows 27 individual mutations for the >1100 complete sequences with known sampling dates deposited to GISAID by January 20, 2021. The predominant mutations at positions 241, 3,037, 14,408 and 23,403 showed frequencies of 99%. The mutation in the nucleocapsid phosphoprotein at position 28,881/2/3 was present in 93% of sequences, the highest frequency for this mutation among all 10 countries studied. As shown in Table 5A, in the time course study the nucleocapsid mutation reached a similarly high value of 89%. C to U transitions in these samples reached only 29%. As of January 20, 5 cases of occurrence of the B.1.1.7 variant from the UK were reported.

(iv) USA

Table 6A lists mutations from a random subset of sequences selected in the US at 4 different time points. Some of the long-term prevalent mutations presented in the table under US-I and US-II were already included in a previous analysis, as indicated by an asterisk 1. They were listed here again to facilitate comparisons to the wider spectrum of new mutations that arose in the US (US-III, US-IV) and in different countries in the course of a few weeks. In addition to the worldwide occurring prevalent mutations at nucleotide (nt) numbers 241, 1,059, 3,037, 8,782, 14,408, 23,403, 25,563, 28,144, and 28,881, there were a total of 13 unique, i.e. not previously described, mutations in our analyses, of which nine were found exclusively in the US-III sample cohort at frequencies between 4% and 29.3% (Table 6A, unique). Except for three of these mutations, many attained their highest frequency of occurrence at the time point US-III.
Two of the novel unique mutations in sequence positions 17,858 and 18,060 had disappeared in the US-III samples. Seventeen of the novel mutations were shared by other regions in the world; seven appeared in most or all ten countries investigated. We listed 13 mutations that had disappeared in the July samples of US-III; possibly they had proved not to be penetrating enough or were not sampled due to selection bias. As apparent in the table, five of the 15 new mutations among the US-II sequences deposited between June 12 and July 07 occurred at low frequencies (<10%) exclusively in this collection of sequences; others, also at low frequencies, were also present in isolates from other countries as indicated. There were a number of novel shared mutations which were also represented in other countries: BR Brazil, CN China, FR France, DE Germany, IN India, RU Russia, ES Spain, ZA South Africa. The more recently selected SARS-CoV-2 mutations under US-III stemmed from the time period between July 09 and July 22, 2020. The comparison of June and July US-III sequences and their mutations to their counterparts from a month earlier (US-II) revealed the complex vitality of new mutants arising in a SARS-CoV-2 population that had been replicating during a most critical phase of the US pandemic during the summer of 2020. During the four-month period 08/01 to 12/01 (US-IV), another 117 SARS-CoV-2 sequences were added to Table 6A. Several of the predominant mutations reached 100% representation.
Eight novel mutations, some unique, others shared, were listed at nucleotide positions 8,083, 10,139, 18,424, 21,304, 25,907, 28,472, 28,869, and 28,887; most of them reached >20% representation. At many nucleotide positions in the viral genome, the frequencies of the long-term predominant mutations increased over the entire time period between the last days of February and the end of July. This study has thus allowed us to witness the spread of mutations in the US population and at the same time the constant emergence of novel mutations and their increase in frequency with time. One idea is that all mutations exist at a low level but are detected only once they are selected and proliferate. Of the 39 SARS-CoV-2 RNA sites mutated, 13 mutations, i.e. 42%, remained without effect on the encoded protein. In contrast, 18, i.e. 58%, exhibited changes in the genome's coding capacity [noted in red in Table 6A] which affected most of the virus-encoded proteins. Most amino acid exchanges were non-synonymous and were likely responsible for functionally important alterations as judged from the type of amino acid replacements, e.g. pro to ser (nucleotide position 4,226) in nsp3; leu to phe (7,837), also in nsp3; tyr to cys (17,858) in the viral helicase; asp to gly (23,403) in the spike glycoprotein; arg-gly to lys-arg (28,881) in the nucleocapsid phosphoprotein, and others. Among the additional eight mutations in the US-IV period, four led to non-synonymous amino acid exchanges in functionally important proteins such as the 2'-O ribose-methyltransferase, the 3'-5' exonuclease, and the nucleocapsid phosphoprotein. The asp to gly exchange due to the mutation in position 23,403, which affected the viral spike glycoprotein, was described earlier 18. The mutant grows to higher titers in cell cultures and reaches higher viral loads in the upper respiratory tract but does not lead to increased disease severity 18.
The mutation has been reported to increase susceptibility to neutralization 19. At this point, the functional consequences of most of the identified mutations for viral replication and/or pathogenicity need to be assessed. The SARS-CoV-2 variant discovered in the UK in December 2020 will be discussed in part (iii) of the Conclusion section. In addition, a total of 52,934 SARS-CoV-2 sequences from the US in GISAID was analyzed for the presence of mutations as compared to the original Wuhan sequence (Table 6B) over the entire year 2020. A total of 42 sequence positions showed >2% deviations from the reference sequence; 21 (50%) were C to U (T) transitions. Similarly high C to U preferences in sequence exchanges were observed in isolates from some of the other 9 countries that were analyzed. In the Conclusion section of this work, a presumptive editing function (APOBEC) is discussed to account for the prevalence of C to U transitions in all these viral genomes. SARS-CoV-2 presents itself as a highly adaptable virus that optimally utilizes its own and the host cell's capacities to generate mutations and has them efficiently selected under a wide range of conditions in human populations. As of January 20, 2021, a total of 81 isolates of the UK variant B.1.1.7 was reported in the US, which is probably a gross underestimate; by January 22 this variant had reached 12 US states. Worldwide, the occurrence of SARS-CoV-2 mutations and variants is changing daily, as expected at the height of this pandemic.
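The C to U (T) share quoted above for the US sequences (21 of 42 positions, i.e. 50%) is a simple tally over the list of substitutions. A minimal sketch of such a tally follows; the label format (`C241T`, reference base, position, replacement base), the function name `c_to_u_share`, and the specific base changes in the example list are assumptions for illustration and are not taken from the tables of this study.

```python
import re

# Labels are assumed to look like "C241T": reference base, position, new base.
SUBSTITUTION = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def c_to_u_share(substitutions):
    """Percent of the listed substitutions that are C>T (C to U in viral RNA)."""
    parsed = [SUBSTITUTION.match(s) for s in substitutions]
    if not parsed or any(m is None for m in parsed):
        raise ValueError("expected a non-empty list of labels such as 'C241T'")
    hits = sum(1 for m in parsed if m.group(1) == "C" and m.group(3) == "T")
    return 100.0 * hits / len(parsed)


# Illustrative labels for the four long-term prevalent positions; the base
# changes shown are assumptions, only the positions appear in the text.
example = ["C241T", "C3037T", "C14408T", "A23403G"]
print(c_to_u_share(example))  # three of the four labels are C>T
```

Each position is counted once regardless of its frequency, which matches how the share is stated in the text (positions above the 2% cutoff, not sequences).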
During the periods of sequence analyses from January 27, 2020 to May 27, 2020 (IN-I) and from June 03, 2020 to July 04, 2020 (IN-II), the prevalent hotspot mutations at sequence positions 241, 3,037, 14,408, 23,403, and 25,563 had reached values of representation approaching 100%, except at position 25,563, which amounted to 52% of sequences [Table 7A]. New mutations emerged during these time periods. A set of nine novel mutations, unique to the Indian population, were observed, i.e. 39.1% out of a total of 23 mutations in all sub-samples from India. These unique mutations were located in genome positions which were completely different from the newly arising SARS-CoV-2 mutations in the US or in any other population investigated in our study [Table 7A]. A total of seven of these novel mutations originated or increased in frequency in the late IN-II time period, whereas two of the mutations could no longer be detected during that same period. An additional nine newly arising mutations were shared with those in countries as indicated, some of which reached a frequency of up to 50%. Among all mutations from the Indian samples, C → T transitions held the majority of 15/23, i.e. 65.2%. We note that 18 out of 23 (78.3%) mutations in the SARS-CoV-2 isolates from our sub-samples from India were novel. About 7/9 of the India-unique mutations appeared de novo or increased in frequency within a time period of a few weeks of very active replication of the virus in the Indian population. New mutations are not only perpetually arising during the present stage of a nearly uncontrolled Covid-19 pandemic, but are also capable of becoming selected in the Indian population. Table 7B lists 46 individual mutations for >3270 complete sequences with known sampling dates deposited to GISAID by January 20, 2021. The prevalent mutations at positions 241, 3,037, 14,408, and 23,403 (Tables 3 to 12) were represented at about 86%, and that at position 28,881 at 44%.
In total, 46 positions showed mutations at frequency levels >2%, ten of them >10%. The frequency of C to U transitions among all mutations in the samples from India was 50%. The change in coding capacity of the long-term prevalent mutations in positions 241, 3,037, 14,408, and 23,403 was described for the US samples. Among the nine India-unique mutations, the following four led to functionally significant amino acid exchanges: position 2,292 (nsp2) gln → pro; 18,568 (3'-5' exonuclease) leu → phe; 19,154 (3'-5' exonuclease) thr → ile; and 28,311 (nucleocapsid phosphoprotein) ser → leu. Among the nine additional mutations, which were shared by one or several countries, only the following four led to amino acid exchanges: 6,312 (nsp3) thr → lys; 11,083 (nsp6) leu/tyr → phe; 21,724 (spike protein) leu-phe → phe-phe; 28,854 (nucleocapsid phosphoprotein) ser → leu (Table 7A). Again, many of the new SARS-CoV-2 mutations were responsible for functionally important non-synonymous amino acid exchanges in the corresponding protein.

Among the RU-I subsample of 226 SARS-CoV-2 RNA sequences analyzed between 03/24 and 06/07/2020 in the isolates from Russia, there were ten mutations, of which six belonged to the previously described long-term prevalent mutations at positions 241, 3,037, 14,408, 23,403, 25,563, and 28,881 [Table 8A]. The latter mutation in position 28,881, at a frequency of representation of 76.1%, stood out in that it was not a point mutation but involved a three-nucleotide exchange creating a highly basic domain in the 3' terminal region of the SARS-CoV-2 nucleocapsid phosphoprotein, as reported earlier 1.
The 28,881 mutation in the Russian sequences had reached one of the highest frequencies, at 88%. The four new mutations were located at sequence positions 3,140 (CC → TC, with a pro to asn-leu exchange in the amino acid sequence of nsp3), 20,268 (AG → GG, without change in amino acid composition in the endoRNAse), 26,750 (CA → TA, without effect on the membrane glycoprotein), and 27,415 (GC → TC, with an ala to ser change in the ORF6 protein). Table 8B presents similar results of analyses on about 1,330 sequences collected during one year between 01/19/2020 and 01/20/2021. Again, the prevalent mutations had reached close to 100% frequency, and the nucleocapsid phosphoprotein mutation about 90%. New mutations were not apparent. C to U transitions stood at 38%. As of January 20, the detection of any of the new SARS-CoV-2 variants was not reported in Russia.

Mutation frequencies were determined between 04 and 09/12, 2020 (116 SARS-CoV-2 sequences). In the sequences, a total of 27 mutations were documented. Among them, seven of the previously described long-term prevalent mutations were identified at frequencies as follows: nucleotide position 241 (100%), 1,059 (13.8%), 3,037 (99.1%), 14,408 (98.3%), 23,403 (100%), 25,563 (49.1%), 28,881 (14.7%). There were 20 new mutations at frequencies between 10 and 20% that were not described previously 1. C to U transitions reached 37% [Table 9A]. Of interest, none of the new mutations was unique to France in the 116 sequences displayed in Table 9A. Instead, a large percentage of the mutations were shared with Germany and Spain, both neighboring countries. Most novel mutations occurred at frequencies between 10% and 20% [Table 8A]. C to U transitions were found in 46% of sequences. Among the novel mutations, 20 occurred at >10%, many of them at >20% frequency. Table 9B lists mutational frequency in sequences deposited up until January 20, 2021, including data on variants of interest and of concern.
There are scant data on the occurrence of variants [Table 2]: the UK variant B.1.1.7 was counted 20 times, the South African one 5 times. As complete sequence analyses on Covid-19 isolates are progressing rapidly, new data on the emergence of new variants can be expected. Impact on Coding Capacity: Among these 20 not previously described novel mutations, eight did not affect the coding capacity of the relevant viral proteins. Most of the 12 coding-relevant mutations led to non-synonymous amino acid exchanges, affecting nsp2, nsp3, nsp4, the RNA-dependent RNA polymerase, the helicase, the endoRNAse, the spike glycoprotein, and the nucleocapsid phosphoprotein [Table 9A, B].

In the Spanish isolates from the period between 06/01 and 09/20/2020, we analyzed 135 sequences and observed 20 mutations [Table 10A]. Of these, four, the long-term prevalent ones, had been described earlier in positions 241, 3,037, 14,408, and 28,881. Except for the latter one at 10.4% frequency, they came close to 100% occurrence.
Of the 16 new mutations, six occurred in Spanish isolates exclusively (termed unique), namely in positions 5,572 (GT → TT, frequency 8.1%, changing the amino acid sequence met to ile in nsp3), 5,784 (CT → TT, frequency 9.6%, thr to ile in nsp3), 25,062 (GT → TT, frequency 13.3%, amino acid change gly to val in the spike glycoprotein), 27,982 (CA → TA, frequency 9.6%, changing the sequence from pro to leu in the ORF8 protein), 28,657 (CG → TG, at frequency of 14.1%, without affecting the nucleocapsid phosphoprotein), and 28,932 (CT → TT at frequency of 65.9% and altering the amino acid composition in this position in the nucleocapsid phosphoprotein from ala to val). The remaining 10 novel shared mutants were also found in isolates from other countries and were located in positions as shown in previous tables. With the exception of a point mutation at position 25,049 in the spike glycoprotein and an ensuing amino acid exchange from asp to tyr, none of the other nine mutations in the shared category led to an amino acid exchange. We also note that in the Spanish collection of SARS-CoV-2 mutations, there were four in the spike glycoprotein, all different from the well-known position 23,403. Two of these new spike mutations led to non-synonymous amino acid exchanges in the spike glycoprotein: In position 25,049 asp to tyr, and in 25,062 gly to val [ Table 10A ]. Such mutations might become relevant when evaluating the efficacy of a solely spike-directed SARS-CoV-2 vaccine. As a note of caution, one should not rule out functional consequences of nominally silent mutations for SARS-CoV-2 competence, since they might affect the secondary structure of the viral RNA with sequelae in replication and relevant interactions of the viral genome with viral and/or cellular proteins. 
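Whether a point mutation changes the coding capacity can be checked by translating the affected codon before and after the exchange. The sketch below uses the standard genetic code; the function name is hypothetical, amino acids are returned in one-letter code (G = gly, V = val, D = asp, Y = tyr, F = phe), and the example codons are illustrative assumptions chosen to reproduce a gly to val and an asp to tyr exchange, not codons read from the deposited genomes.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_point_mutation(codon, offset, new_base):
    """Replace the base at `offset` (0, 1 or 2) of `codon` and report the
    amino acid before and after, plus the kind of exchange."""
    mutated = codon[:offset] + new_base + codon[offset + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutated]
    kind = "synonymous" if before == after else "non-synonymous"
    return before, after, kind

# Illustrative codons (assumed, not taken from the reference genome):
print(classify_point_mutation("GGT", 1, "T"))  # G to T in a gly codon
print(classify_point_mutation("GAT", 0, "T"))  # G to T in an asp codon
print(classify_point_mutation("TTC", 2, "T"))  # C to T at a third codon position
```

A third-position exchange such as the last example often leaves the amino acid unchanged, which is why many of the shared mutations listed above are silent at the protein level.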
It is interesting to note that although the latest Spanish collection of SARS-CoV-2 mutations contains four mutations in the spike glycoprotein, at earlier time points the D614G mutation at position 23,403, the site of a prevalent mutation 1, 32, was not present [Table 10A]. In Table 10B, describing mutant frequencies between 01/19/2020 and 01/20/2021, the 23,403 mutant was present at about 80%, whereas in France and England prevalence was >96%. Moreover, for the 01/2020 to 01/2021 period, mutations in 38 sequences lay above the 2% cut-off. The predominant mutations reached values around 80% representation. C to T transitions were at 42%. Among the novel mutations, 17 showed a prevalence of >10%, 8 of them >20%.

During the course of the pandemic, we tabulated the occurrence of SARS-CoV-2 mutants in isolates from Germany which arose between February and 03/23 (DE-I) 1. There were mutations in six positions which had also been observed in isolates from other countries, as indicated, and all of them showed modest frequencies. It is interesting to note that 52% of the mutations detected in sequences from France were shared with Germany, but only 16% of the mutations identified from Germany were shared with those from France [Table 9A]. During a time interval of about a month, 09/10 to 10/13 (DE-IV), that immediately preceded a marked rise in Covid-19 cases in Germany, 23 new mutations were identified, 6 of which reached a prevalence of >20% and 7 of >10% in the SARS-CoV-2 sequences studied. During the same period, 4 of the prevalent mutations were represented in 100% of sequences, and one, at 28,881, in 54%.
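The asymmetry between the two sharing percentages for France and Germany follows from the different denominators: the same set of shared positions is divided by the number of mutations found in each country. A minimal sketch of that arithmetic, using a hypothetical helper and made-up position sets rather than the positions in the tables:

```python
def percent_shared(own_positions, other_positions):
    """Percentage of one country's mutated positions that also occur in the other."""
    own, other = set(own_positions), set(other_positions)
    if not own:
        return 0.0
    return 100.0 * len(own & other) / len(own)

# Hypothetical position sets: country A has few mutations, country B has many.
country_a = {101, 202, 303, 404}
country_b = {303, 404, 505, 606, 707, 808, 909, 1010}

print(percent_shared(country_a, country_b))  # share seen from country A
print(percent_shared(country_b, country_a))  # share seen from country B
```

With these toy sets, the two shared positions make up half of the smaller set but only a quarter of the larger one, which is the same kind of imbalance as reported above.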
Table 11B lists the total number of mutations and variants up until January 20, 2021 from GISAID complete sequences, with 52 entries at >2% incidence. The prevalent mutations reach about 86% occurrence. Mutations were found at >10% at only three sites. C to U transitions were recorded in 46% of the studied sites. Impact on Coding Capacity: With the exception of the point mutation at 6,941, which was synonymous, the five other mutations were non-synonymous: 3,602 his to tyr (nsp3); 21,855 ser to phe (spike glycoprotein); 25,505 glu to arg (ORF3a protein); 25,906 gly to arg (ORF3a protein); 28,869 pro to leu (nucleocapsid phosphoprotein).

In late December of 2019, the first cases of Covid-19 emerged in Wuhan, Hubei Province in China, reportedly among workers and customers of the Huanan Seafood Market. The Chinese authorities eventually reacted with a very strict shutdown in Hubei Province, the epicenter of Covid-19, to limit the spread of the new disease. At present, most new cases of Covid-19 are reportedly being registered in Shanghai and a few additional places. The analyses of SARS-CoV-2 mutants up to March 18, 2020 (CN-I) revealed point mutations in only two genome positions, 8,782 (CC → TC, without amino acid exchange) and 28,144 (TA → CA, causing a leu to ser exchange in the ORF8 protein), both at frequencies of 29.3% [Table 12A].
An extension of our mutant research, among a relatively limited number of published sequences, to the period from 03/20 to 06/22, 2020 (CN-II) revealed mutations in five of the long-term prevalently affected sequence positions: 241 (CG → TG at a frequency of 69.7%, without coding changes), 3,037 (CT → TT at a frequency of 69.7%, without coding changes), 14,408 (CT → TT at a frequency of 57.6%, with a codon change pro to leu in the gene for the RNA-dependent RNA polymerase), 23,403 (AT → GT at a frequency of 66.7%, with an asp to gly exchange in the spike glycoprotein), and 28,881 (GGG → AAC at a frequency of 33.3%, with the codon exchange arg-gly to lys-arg, reported previously). Remarkably, the novel shared point mutations in positions 8,782 and 28,144 had disappeared at the later time point [Table 21]. These latter mutations may have been introduced to China by visitors or business travelers and then died out because they did not confer a strong evolutionary advantage, or they may have gone undetected because of insufficient sequencing. The total counts of mutations up until January 20, 2021 are presented in Table 12B.

It has been the intent of this project to follow the genetic evolution of SARS-CoV-2 after the virus transgressed a host barrier and during the ensuing major pandemic in the human population. The virus has shown great replicative and mutagenic potential and appeared in the large human population of 7.8 billion that lacked previous encounters with SARS-CoV-2. In this context, the primary question was not to understand viral mutagenesis in general in its biochemical or genetic details, but to identify mutants that have the potential to become prevalent with possible fitness advantages. Which mutants and variants would have the capability to persist and multiply in the course of rapid spread of SARS-CoV-2 within the human population?
It will be a continuing long-term challenge to pursue the outcome and time course of a competition in which 29,903 nucleotides in the viral genome are pitted against about 3 billion in the human genome. SARS-CoV-2 has a repertoire of mutable sites in a stretch of 29,903 nucleotides that can not only be varied by introducing point mutations but also be extended by an almost inexhaustible combination of multiple mutations in the same genome, by deletions, and by insertions. Before the viral dominance in the human population began, SARS-CoV-2 had already made a major leap, its transition from an animal to the novel human host, an undocumented step in its own right in which mutagenesis and selection must have played a major role. Thus, the impact of ethnic and socio-economic differences in the human population will have to be considered as important factors. In a summary of all mutation analyses, we have compared the number and types of mutations to the extent of the Covid-19 pandemic in ten different countries that currently report high numbers of cases and fatalities [Table 13]. Of course, this summary offers only a broad temporal correlation of mutant data and extent of the pandemic in individual countries. High current incidence of Covid-19 is paralleled by high numbers of new mutations and variants, although this relationship was not observed in Brazil or Russia. In anticipation, it will be a further challenge to evaluate the real-world success of the numerous Covid-19 vaccination programs. Rapid worldwide replication of SARS-CoV-2 in heterogeneous populations has been paralleled by the rise of novel mutations.
In this report, we have studied mutations in SARS-CoV-2 RNA sequences isolated in the UK, South Africa, Brazil, the US, India, Russia, France, Spain, Germany and China that became available in the GISAID database during a one-year period between January 19, 2020 and January 20, 2021. We have examined the rise of novel mutations both in sequence subsets segregated by date and overall in a large cross-section. It seems that towards the end of the year, more mutations in combination were found and propagated rapidly despite lockdowns and other efforts to contain the spread, perhaps owing to increased transmissibility. The current data are compatible with the interpretation that rapid regional expansion and efficient viral replication in human populations of very different genetic and socio-economic backgrounds further the selection of new mutations in the viral RNA genome. Differences in defense mechanisms operative in various populations infected by SARS-CoV-2 and/or the various therapeutic measures employed in fighting the infection might also have influenced the selection of new mutants. It is uncertain whether there was region-specific selection of specific mutations or whether other factors might have furthered differences in unique versus shared novel mutations. Figure 1 and Table 1 show the number of novel variants in each country as of January 20, 2021. The speed with which the virus traveled even during lockdowns emphasizes the difficulty of suppressing transmission of highly contagious respiratory viruses. The new variants have not been associated with increased pathogenesis, although more research needs to be done. The preliminary finding of increased transmissibility of the B.1.1.7 and B.1.351 variants hinders efforts to contain the virus 5, 9, 10, 13, 22, 27, 29, 30. The vaccines are expected to work against the novel variants, although against some with reduced efficacy 19, 20, 28, 31, but caution is urged and viral evolution must be watched.
After initially demonstrating the prevalence of about 10 mutants in at least 10 different countries, SARS-CoV-2 evolved to display new point mutations worldwide that were selected among affected populations within a time period of weeks [Tables 3A, B to 12A, B]. In Table 13, column 5, the number of new point mutations in some of the countries analyzed ranged between 16 and 38. As a consequence of highly efficient sequencing programs in the UK [UK Consort], previously unrecognized variants started to appear in late 2020 and are currently spreading worldwide (Figure 1, Table 2). The impact of these variants on potential increases of the already existing pathogenicity cannot be assessed at present.

We posit the following hypothesis: SARS-CoV-2 uses the ACE2 (angiotensin-converting enzyme 2) receptor for its entry into human cells 2. It remains to be determined how the interaction of the ACE2 receptor with the spike protein of SARS-CoV-2 affects the location and activity of APOBEC (apolipoprotein B mRNA-editing enzyme, catalytic polypeptide). This class of mRNA-editing functions causes deamination of cytosine to uracil 33. The high frequency of C to U (T) transitions among SARS-CoV-2 mutations, for example 40.7% (France), 59% (US), and 69.9% (India), and up to >88% in the UK samples (see Tables 4B to 12B and Table 13), might be linked to an mRNA-editing mechanism 1, 34, 35. Moreover, the high incidence of C to U (T) transitions renders research on the occurrence of methyl-cytosine bases in SARS-CoV-2 RNA a project of considerable importance.
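The share of C to U (T) transitions reported for each country is the number of mutations whose reference base is C and whose variant base is T (U in the RNA), divided by all mutations counted. A minimal sketch, with a hypothetical function name and, as input, four of the long-term prevalent exchanges as they are written in the text (241, 3,037 and 14,408 as C to T; 23,403 as A to G):

```python
def c_to_u_percent(mutations):
    """Percentage of (position, reference base, variant base) entries
    that are C to U transitions (written C to T in cDNA notation)."""
    entries = list(mutations)
    if not entries:
        return 0.0
    hits = sum(
        1 for _, ref, alt in entries
        if ref.upper() == "C" and alt.upper() in ("T", "U")
    )
    return 100.0 * hits / len(entries)

# Four long-term prevalent exchanges as given in the text.
prevalent = [(241, "C", "T"), (3037, "C", "T"), (14408, "C", "T"), (23403, "A", "G")]
print(c_to_u_percent(prevalent))
```

Whether the percentage is taken over mutated sites or over sequences changes the result, so the denominator should be stated alongside any such figure.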
In this context, the introduction of 14 point mutations and 3 small deletions in the genome of the B.1.1.7 variant suggests mRNA editing as a plausible model. The rapid generation of SARS-CoV-2 point mutations and the sudden rise of ubiquitous and efficiently selected SARS-CoV-2 variants also support the mRNA-editing mechanism as an attractive hypothesis. The editing function has been interpreted as a cellular defense against intruding viral genomes, and SARS-CoV-2 may exploit exactly this mechanism to further its mutagenic potential.

With this conceptual bridge, Hie et al. studied viral escape mechanisms to circumvent cellular defenses by adapting machine learning algorithms that had been developed to analyze human languages 37. A viral mutation escapes by making the virus look different to the immune system, akin to word changes that keep the grammar of a sentence while altering its meaning. With that seemingly remote approach, the authors hope to predict viral structures that will be able to escape immunological defenses. This idea has been applied to influenza hemagglutinin, HIV-1 envelope glycoprotein and SARS-CoV-2 spike glycoprotein. Whether an intellectual approach borrowed from linguistics can actually help solve a complex biological problem remains to be seen and is a subject for future work.

We acknowledge and are very grateful to the GISAID Initiative and for the hard work and open science of the individual research labs and public health agencies that have made their genome data accessible on GISAID, on which this research is based.

The authors declare no competing interests.

S.W. carried out all work involving sequence selection and formal analyses, and was involved in the conceptualization of the project and in the analysis and interpretation of data. C.R. performed the analysis on the large sequence database and variants of interest/concern using GISAID, GESS, CoV-Glue and other computational tools, statistical analyses, interpretation of the data, and writing of the manuscript. B.W. and H.B. contributed to the analysis and interpretation of the data. B.W. contributed to writing the manuscript. W.D. initiated the project, was involved in the conceptualization of the project and in the analysis and interpretation of data, and wrote the manuscript with C.R.'s and B.W.'s contributions.

Variants of SARS-CoV-2 by country as of January 20, 2021. Currently, new variants are being detected and characterized in rapid succession. This Table could be outdated by the time of publication. For updating of data consult GISAID 24.

Details of the mutant analyses of 7,144 SARS-CoV-2 isolates for deviations from the Wuhan reference sequence.
These sequences were deposited in the GISAID initiative between 01/19/2020 and 01/20/2021. For design of Tables see legend to Table 4.

The Table presents characteristics of SARS-CoV-2 mutants from South African isolates. For Table design, see legend to Table 6.

The Table presents characteristics of SARS-CoV-2 mutants from isolates collected in the Brazilian population. For Table design, see legend to Table 6.

Table 6 - USA. The general design of this Table is similar to Tables 3 to 5, and 7. The GGG → AAC is a non-point mutation in nucleotide position 28,881 that generated a highly basic amino acid sequence in the SARS-CoV-2 nucleocapsid phosphoprotein. We have speculated that this mutation might have originated from a recombination event between different viral RNA molecules [Weber et al., 2020 1]. Part B: A total of 5,710 SARS-CoV-2 sequences from the GISAID source were analyzed. Deviations from the Wuhan reference sequence of >2% incidence were found at 42 sites in the sequence. Further details are described in the text.

The general design of these Tables follows the outline described in detail in the legend to Table 6 (USA). The number of sequences investigated for SARS-CoV-2 mutations is detailed in the Tables for individual countries.
Signal hotspot mutations in SARS-CoV-2 genomes evolve as the virus spreads and actively replicates in different parts of the world
SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor
An interactive web-based dashboard to track COVID-19 in real time
Data, disease and diplomacy: GISAID's innovative contribution to global health
Preliminary genomic characterisation of an emergent SARS-CoV-2 lineage in the UK defined by a novel set of spike mutations
Neutralizing antibodies in Spike mediated SARS-CoV-2 adaption
Recurrent emergence and transmission of a SARS-CoV-2 Spike deletion ΔH69/ΔV70. MedRxiv 2020
Key residues of the receptor binding motif in the spike protein of SARS-CoV-2 that interact with ACE2 and neutralizing antibodies
NERVTAG. New and Emerging Respiratory Virus Threats Advisory Group. NERVTAG meeting on SARS-CoV-2 variant under investigation
Transmission of SARS-CoV-2 Lineage B.1.1.7 in England: Insights from linking epidemiological and genetic data
Confirmed Reinfection with SARS-CoV-2 Variant VOC-202012/01
Emergence and rapid spread of a new severe acute respiratory syndrome-related coronavirus 2 (SARS-CoV-2) lineage with multiple spike mutations in South Africa
A preliminary selection analysis of the South African V501.V2 SARS-CoV-2 clade
Genomic characterisation of an emergent SARS-CoV-2 lineage in Manaus: preliminary findings
Genomic evidence of a SARS-CoV-2 reinfection case with E484K spike mutation in Brazil
Spike E484K mutation in the first SARS-CoV-2 reinfection case confirmed in Brazil
mRNA-1273 vaccine induces neutralizing antibodies against spike mutations from global SARS-CoV-2 variants
Neutralization of N501Y mutant SARS-CoV-2 by BNT162b2 vaccine-elicited sera

medRxiv preprint doi: https://doi.org/10.1101/2021.02.04.21251111; this version posted February 6, 2021. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.