key: cord-1034399-npxdsm94 authors: Mari, A.; Roloff, T.-C.; Stange, M.; Soegaard, K. K.; Asllanaj, E.; Tauriello, G.; Alexander, L. T.; Schweitzer, M.; Leuzinger, K.; Gensch, A.; Martinez, A.; Bielicki, J.; Pargger, H.; Siegemund, M.; Nickel, C.; Bingisser, R.; Osthoff, M.; Bassetti, S.; Sendi, P.; Battegay, M.; Marzolini, C.; Seth-Smith, H.; Schwede, T.; Hirsch, H. H.; Egli, A. title: Global surveillance of potential antiviral drug resistance in SARS-CoV-2: proof of concept focussing on the RNA-dependent RNA polymerase date: 2021-01-04 journal: nan DOI: 10.1101/2020.12.28.20248663 sha: e14c3655c13c03b25162786e41a5f7c6219589c8 doc_id: 1034399 cord_uid: npxdsm94

Antiviral treatments for COVID-19 have involved many repurposed drugs. Currently, SARS-CoV-2 RNA-dependent RNA polymerase (RdRp, encoded by nsp12-nsp7-nsp8) has been targeted by numerous inhibitors with debated clinical impact. Among these, remdesivir has been conditionally approved for the treatment of COVID-19 patients. Although the emergence of antiviral resistance, an indirect proxy for antiviral efficacy, poses a considerable healthcare threat, an evolutionary perspective on emerging resistant mutants is still lacking. Here we show that SARS-CoV-2 RdRp is under purifying selection, that potential escape mutations are rare, and unlikely to lead to viral fitness loss. In more than 56,000 viral genomes from 105 countries dating from December 2019 to July 2020 we found negative selective pressure affecting nsp12 (Tajima's D = -2.62), with potential antiviral escape mutations in only 0.3% of sequenced genomes. Those affected known key residues, such as Nsp12:Val473 and Nsp12:Arg555. For the potential escape mutations found globally, in silico structural models show that they rarely imply a loss of stability in RdRp. No potential escape mutations were found in our local cohort of remdesivir-treated patients from the first wave (n=8).
Our results indicate that RdRp is a suitable drug target, and that remdesivir does not seem to exert high selective pressure. Our study could be the starting point of a larger monitoring effort of drug resistance throughout the COVID-19 pandemic. We recommend the application of repetitive genome sequencing of SARS-CoV-2 from patients treated with antivirals to provide early insights into the evolution of antiviral resistance.

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission. This version posted January 4, 2021.

Infection with SARS-CoV-2 is associated with substantial morbidity and mortality. As no approved therapy is available to date, there have recently been multiple efforts in drug repurposing. The selective pressure on the virus generated by potential antiviral drugs and the eventual emergence of antiviral resistance provide interesting information about the mode of action of antiviral drugs and their efficacy. In this context, the surveillance of resistance emergence is paramount for public health during the COVID-19 pandemic. The first drugs considered for antiviral treatment were inhibitors targeting 3C-like proteases (3CL-Pro) and Spike proteins (S) 1, 2. However, the alarming number of side effects and the lack of clinical efficacy forced the identification of new targets in the viral replication machinery 3. Because of its high degree of amino acid conservation within beta-coronaviridae (96% identity 4), RNA-dependent RNA polymerase (RdRp) is a key target of antiviral drug development and, recently, of drug repurposing 5, 4. Several studies have indicated in vitro efficacy against SARS-CoV-2 of potential inhibitory candidates such as remdesivir, sofosbuvir, galidesivir, tenofovir, and ribavirin 6, 7, 8.
All of them share the same mechanism of action: as nucleoside analogues, they bind RdRp in its active site and interrupt RNA polymerisation 6 (Figure S1). Antiviral treatments have been shown to lead to the emergence of antiviral resistance in infectious diseases such as those caused by Hepatitis B Virus, Human Immunodeficiency Virus, Hepatitis C Virus (HCV), and influenza virus 9, 10, 11. For example, the emergence of ribavirin-resistant mutants has been described for HCV RdRp and has been largely attributed to resistance-conferring single nucleotide polymorphisms (SNPs), usually resulting in a remarkable fitness loss 10, 12. Similarly, in vitro experiments on SARS-CoV, the cause of Severe Acute Respiratory Syndrome (SARS) and the virus most closely related to SARS-CoV-2, show that specific SNPs within nsp12 may alter the effectiveness of remdesivir, namely Nsp12:Phe480Leu and Nsp12:Val557Leu 13. The substitution Nsp12:Phe480Leu destabilises the interface between different sub-domains of the protein ("palm" and "fingers"), likely affecting the proofreading capacity of RdRp 4. Nsp12:Val557Leu affects binding to the template RNA and, indirectly, to remdesivir 4. The EC50 of remdesivir increased six-fold, from 0.01 µM to 0.06 µM, in cultures of SARS-CoV carrying the Nsp12:Phe480Leu or Nsp12:Val557Leu mutations 13. In the absence of remdesivir, these viral mutants were found to replicate less efficiently and showed a substantially reduced fitness 13.

The clinical efficacy of remdesivir in COVID-19 treatment has recently been debated. Two studies suggested a reduction in recovery time 14, 15, while others did not show a reduction in mortality 16, 17. At present, expert panels from authorities have made recommendations on the use of remdesivir in patients with different severities of disease (https://www.covid19treatmentguidelines.nih.gov/whats-new/). […] is essential for disease surveillance.
In the genomes of circulating SARS-CoV-2, no SNPs have been associated with clinical failure of remdesivir treatment to date, although mutations in nsp12 have been reported. In this study, we address the suitability of RdRp as a drug target by evaluating the selective pressure affecting RdRp and monitoring the emergence of potential escape mutations in key drug-binding motifs. We screen real-world data from both a global dataset consisting of more than 56,000 genomes from 105 countries and an inpatient longitudinal genome cohort of 197 patients (189 untreated patients and eight remdesivir-treated patients with follow-up time points). We show that potential escape mutations to remdesivir are rare, as RdRp is under negative selection, and that these escape mutations generally do not hamper RdRp stability, such that compensatory mutations are not necessary.

Understanding selection pressure on viral genes is critical in studying potential effects of antiviral drugs. Therefore, we inferred the selection pressures on the whole nsp12 gene (2793 nucleotides, 931 codons) applying Tajima's D ([…] Figure 2). These results indicate that nsp12 is under purifying selection, implying that accumulation and fixation of mutations is evolutionarily unfavoured, with deleterious mutations being eliminated from the coding sequence.

The structure of the active site of RdRp was screened in silico to identify motifs likely to be involved in nucleoside binding, to affect the binding affinity of remdesivir, and to compromise the stability or the proofreading activity of RdRp 4, 18, 19.
Those include Nsp12:Phe480 and Nsp12:Val557 and were identified in the buried chains running from Nsp12:Arg467 to Nsp12:Val493 and from Nsp12:Leu544 to Nsp12:Gln570, referred to as the first and second potential escape motifs for ease of nomenclature.

Building on our published genome analysis pipeline COVGAP 20, we developed an updated version that is able to track genetic diversification of drug target residues, among others the mentioned potential escape motifs of RdRp, particularly in its main chain Nsp12. A total of 56,806 high-quality publicly available SARS-CoV-2 genomes were collected between 24th December 2019 and 12th July 2020 from GISAID 21 (SI Table 1).

Figure 1: Potential escape mutations inside the 1st and 2nd escape motifs are not uniformly distributed; they are rare worldwide and stable over time. (A) Mutation prevalence in the two potential escape motifs: SNPs are highlighted in the upper genome panel, their frequency in the lower variant count panel, and genome origin in the upper right country distribution panel. (B) Frequency changes of potentially resistant genomes over time become stable after the 13th calendar week of 2020 and settle at 0.32%. The cumulative count panel displays available genomes at a defined week. The time frame considered spans from 25th December 2019 to 12th July 2020; NA/ND stand for no date information available or incomplete date information, respectively. (C) Escape mutants are distributed heterogeneously over the world, with the Philippines harbouring 16% of potential escape mutants (total sequences n=12).
In Switzerland, the frequency of potential escape mutants amounts to 2.2% (total sequences n=676), more than 7 times the global average (C-zoom).

Non-synonymous mutations in the nsp12 coding sequence were found in 46,469 of the 56,806 (81.08%) viral genomes. However, most of these mutations (e.g. C14408T, leading to Pro323Leu; SI Figure 3) were neither within the potential escape motifs nor located around key residues of the active site, which includes, among others, the residues Arg555, Lys545, Asn691 and Asp623 19. Only 182/56,806 (0.32%) genomes contained non-synonymous mutations within the potential escape motifs, for a total of 85 different mutations (SI Table 2, SI Table 3) (Figure 1A), and are therefore potential candidates for reduced remdesivir effectiveness. Of the 182, three genomes exhibited non-synonymous mutations affecting Nsp12:Phe480: Nsp12:Phe480Leu, Nsp12:Phe480Ser and Nsp12:Phe480Cys. The residue Nsp12:Val557 was found to be mutated in a single genome (Nsp12:Val557Glu). Additional high-frequency non-synonymous SNPs included those encoding Nsp12:Asn491Ser (occurring in 34 genomes) and Nsp12:Val473Phe (19 genomes) (Figure 1A, SI Table 3).

The first genome carrying a non-synonymous variant falling into a potential escape motif (Nsp12:Ser564Ile) was registered on 20th January 2020 in a 30-year-old female patient from China 22 (Figure 1). The sample carrying Nsp12:Phe480Leu was collected on 3rd March 2020 from a 72-year-old male patient from England.
With increasing numbers of available sequences, the proportion of genomes carrying non-synonymous mutations in potential escape motifs settled at 0.21% by calendar week 13 and reached a stable rate of 0.3% (+/-0.064) from week 15 on (available genomes n=35,055). No proportional increase has been detected at any time point after that date, even after conditional approval of remdesivir (Figure 1B).

To investigate the geographical distribution of the identified escape mutants, we considered only the countries that had submitted at least 100 high-quality genomes. We found that Switzerland (2.2%), Chile (1.4%), and Bangladesh (1.02%) showed the highest percentages of genomes featuring potential escape mutations (Figure 1C, SI Table 4). To add granularity to the Swiss data, we collected an open cohort of 690 individuals from the University Hospital Basel, who were tested only once (single-time tested) between 23rd of February and 30th of April 2020. We did not find any mutation in the potential escape motifs, nor minority alleles in the samples that would hint at intra-host diversity. The variant distribution across RdRp is in line with the distribution observed in the global dataset, with the exception of a high frequency of a synonymous mutation at nucleotide position 15324 (located in RdRp), recently described as a Basel-area specific mutation 20 (SI Figure 4).

We determined the phylogenetic lineages (pangolin nomenclature) of the genomes carrying potential escape motif mutations 23, and found that they are distributed across 37 different lineages in 21 countries.
Among these is the B.1.108 lineage (32 genomes), as yet only identified in the USA, first detected on 14th March 2020 and not seen after 24th April 2020. This lineage is defined by the nucleotide mutation nsp12:A14912G, a non-synonymous SNP leading to Nsp12:Asn491Ser. This lineage thus shows a potential escape mutation rate of 100% (SI Table 5). These results indicate that emergence of escape mutations leading to potential antiviral resistance is a rare event, independent of geographical location.

([…] Table 6). Through our COVGAP pipeline, we automatically monitored the occurrence of major and minor SNPs causing amino acid changes in any specific motifs.

Patients who received remdesivir treatment in the study period. Successfully sequenced samples are marked with black arrows. Grey arrows indicate samples that could not be sequenced for reasons of accessibility, sample quality or low viral load. Remdesivir treatment is indicated by black horizontal bars.

We did not observe any highly supported variants in the two potential escape motifs in the remdesivir-treated patients. However, in one sample from a hospitalised 47-year-old male patient, who did not receive remdesivir treatment, we observed a minority allele carrying a non-synonymous mutation in nsp12 encoding Thr374Cys (21% alternative allele support at a read coverage of 27). This substitution is not within the escape motifs and is outside the active site of RdRp. These findings are in line with the rarity of escape mutants observed in the global dataset, which supports the hypothesis of a low selective pressure provided by remdesivir treatment.
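The prevalence figures reported in this section (0.32% of the global dataset, per-country values restricted to countries with at least 100 genomes, 100% within one lineage) are grouped proportions. The following minimal Python sketch shows one way such values could be tabulated; the function name `escape_prevalence`, the record layout and the toy data are illustrative assumptions and are not taken from the COVGAP pipeline.

```python
from collections import defaultdict

def escape_prevalence(records, min_genomes=100):
    """Percent of genomes flagged as potential escape mutants, per group.

    records: iterable of (group, is_escape) pairs, one pair per genome,
             where group is e.g. a country or a lineage label.
    Groups with fewer than min_genomes genomes are left out, mirroring the
    "at least 100 high quality genomes" restriction described in the text.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for group, is_escape in records:
        totals[group] += 1
        if is_escape:
            hits[group] += 1
    return {g: 100.0 * hits[g] / n for g, n in totals.items() if n >= min_genomes}

# Hypothetical toy data: group "A" has 150 genomes (3 flagged),
# group "B" has only 12 genomes and is therefore excluded.
toy = [("A", i < 3) for i in range(150)] + [("B", True) for _ in range(12)]
prevalence = escape_prevalence(toy)

# Global figure quoted in the text: 182 of 56,806 genomes.
global_pct = 100.0 * 182 / 56806
```

Setting `min_genomes=1` would give unrestricted proportions, which is how a small group can show a very high percentage from only a handful of genomes.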
We found that the level of genetic diversity was significantly higher for potential escape mutants compared to non-mutants for both escape motifs after a t-test run over a Monte-Carlo simulation, used to correct for group size disparity (chi-sq =115989.4, df=20000, merged-pval << 0.001) and (chi-sq =191521.7, df=20000, merged-pval << 0.001) (Figure 3A) (Chao2 diversity incidence index, SI Table 1).

Figure 3: Mutations in escape motifs harbour higher genome diversity and tightly associated mutations. The genome entropy is significantly higher in 1st and 2nd motif escape mutants. (A) Diversity is calculated as incidence-based mutation richness using the chao2 diversity index; the boxplot represents the interquartile range, and red dots indicate the mean. Stars indicate the merged p-value, calculated with a Monte Carlo t-test simulation (see methods). (B) Association between escape mutants and other mutations across the genome. Candidates are evaluated through generalised linear models with lineage correction. Upper panel: depiction of the mutation incidence ratio escape/non-escape. Only mutations showing a ratio > 95th percentile are considered escape-associated (light blue area); of note, the mutation at position 16,210 is significantly associated with escape mutants. Below: p-value adjusted for multiple testing according to Benjamini-Yekutieli (only mutations with significant p-values are shown). (C,D) Location of escape mutations and associated mutation on RdRp bound to RNA template and remdesivir; the 1st escape motif is indicated in blue, the 2nd escape motif in red. (E,F) Met924Leu (encoded by a SNP at position 16210) decreases the distance to Ile864 by more than 2-fold.
We then screened the escape mutants to find escape-associated mutations. Within the 177 escape mutants, we discarded five genomes because they carried stop-codon gain substitutions, and we identified 1,056 non-synonymous potential escape-associated mutations. We evaluated lineage-corrected co-occurrences through generalised linear models, and 174 non-synonymous mutations showed a significant response to the predictor (Figure 3B, SI Table 7). Among these, nine were positively associated with potential escape mutants. Only one of the escape-associated mutations found falls into nsp12, at genome position 16210 (Nsp12:Met924Leu) (SI Table 8). This variant occurs in genomes carrying the potential escape mutations Nsp12:Arg555Pro, Nsp12:Met566Val, Nsp12:Thr567Ser and Nsp12:Arg569Gly and could hamper the binding to the RNA template 19. We found that the substitution to Leu924 shortens the distance to the nearby Ile864 residue from 5.6 Å to 2.3 Å (Figure 3).

To determine the protein stability change in each scenario, we calculated the energy state of RdRp under each combination of escape and escape-associated mutations co-occurring in the same genomes (SI Table 9). We included the combinations between the potential escape mutations Nsp12:Met566Val, Nsp12:Arg555Pro, Nsp12:Thr567Ser, Nsp12:Arg569Gly and Nsp12:Val473Phe, and the associated mutation Nsp12:Met924Leu. As a control, we evaluated Nsp12:Phe480Leu and Nsp12:Val557Leu as well (SI Table 9). To correct for possible poor protein resolution, we inferred each combination on six different RdRp structures (SI Figure 5, SI Table 9). We detected significant destabilisation only in the case of Nsp12:Met566Val (2.53 kcal/mol) and Nsp12:Val473Phe (3.97 kcal/mol). The former is associated with Nsp12:Met924Leu, which does not show any stabilising effect on its own, nor in combination with any other escape mutation.
Nsp12:Val473Phe is strongly associated with a SNP located at genomic position 24378 (S:Ser939Phe, adjusted p value=0.00012, SI Table 7). While Ser939Phe yields no effect on pre-fusion S structures (-0.17 kcal/mol), this substitution yields a stabilising effect on post-fusion S structures (-1.9 kcal/mol) (SI Table 9). The physiological gain of this stabilisation is yet to be understood. These results show that escape-associated mutations within RdRp are unlikely to compensate for RdRp stability losses.

The rapid global spread of SARS-CoV-2 has led to the emergence of many different variants […] Those two early mutations did not spread further, as there is no trace of them at later time points in the global dataset, probably because of purifying selection.

Remdesivir treatment for COVID-19 was initiated at our hospital as part of a clinical trial in mid-March 2020; it was approved for general use in Swiss hospitals as of June 30. Among the patients enrolled in the trial, only eight fulfilled the criteria for treatment with remdesivir in our study period. The only mutation affecting RdRp was not in a potential escape motif, and was found in a patient not treated with remdesivir. Although sampling bias could explain these results, another potential explanation is that remdesivir treatment, at the administered doses, does not provide enough selective pressure against the virus due to a low efficacy 30.

Among the escape mutations showing stable association with other variants, the only instances of destabilisation include Nsp12:Met566Val and Nsp12:Phe480Leu.
We found that Nsp12:Met924Leu, the only significantly associated mutation on RdRp, does not rescue Nsp12:Met566Val destabilisation, nor does it have a stabilising effect on its own. If the recovered escape mutations were costly in terms of fitness, we would have expected to observe a destabilising effect on RdRp, and a corresponding stabilising effect of associated mutations. This is surprisingly not the case, especially considering Nsp12:Arg555Pro, a key residue involved in remdesivir binding, and Nsp12:Val557Leu 18, 19. These results, however, do not exclude that a different form of fitness loss/compensation not involving stability could take place 31. For example, Nsp12:Arg555Pro could disrupt the binding affinity of remdesivir and Nsp12:Met924Leu could improve the binding of the protein to the template. In other motifs of RdRp, a depolarising mutation such as Nsp12:Met924Leu has been linked to a reduction of the polymerase proofreading activity 4. Alternatively, such an association could involve the viral physiology more widely, as for example the association between the potential escape mutation Nsp12:Val473Phe and S:Ser939Phe. The nature of such cross-protein association remains speculative, and additional data would be necessary to characterise it in full.

The rarity of potential escape mutations and the purifying selection acting on RdRp could indicate that remdesivir use has not yet selected for resistant variants observable on a global scale.
However, any selective pressure caused by remdesivir will not necessarily be captured within the entire global dataset (n=56,806), as available sequences represent (i) largely samples from non-remdesivir-treated patients (data available from the GISAID repository) and (ii) likely the first isolate from patients, prior to treatment with remdesivir. Therefore, it remains important to continuously sequence isolates from antiviral-treated patients and monitor emerging mutations.

In summary, our study offers a surveillance framework for SARS-CoV-2 evolution focusing on the potential emergence of antiviral resistance. Our findings demonstrate the high conservation of RdRp worldwide and point towards a low selective pressure provided by anti-RdRp drugs (e.g. remdesivir). Potential remdesivir escape mutations were very rare, which could be an indicator of little selective pressure and hence no therapeutic effect of remdesivir. Notably, our analysis could be extended to other repurposed RdRp-targeting drugs, such as sofosbuvir and ribavirin, that have the same mechanism of RdRp inhibition as remdesivir. Our data indicate RdRp as a potential drug target candidate because, of the many minor variants screened, none has gained a selective advantage against remdesivir. Patients treated with a potential antiviral drug should be closely monitored for the potential emergence of antiviral resistance.
We inferred selection on the nsp12 gene using Tajima's D […]. To infer the selection pressure acting on the entire nsp12 gene region, we calculated Tajima's D statistic 35, 36 using MEGA version 7 37 to test for neutrality over the entire coding sequence. The test statistic is based on two estimates gained from pairwise comparisons: the number of segregating sites 38 and the average number of nucleotide differences. A Tajima's D equal to zero means that the number of segregating sites roughly equals the average number of nucleotide differences; a Tajima's D smaller than zero means that segregating sites are more abundant than average nucleotide differences, indicating an excess of rare alleles or a recent population expansion after a bottleneck; and a Tajima's D larger than zero means that fewer segregating sites than average nucleotide differences exist, indicating that rare alleles are selected against. Further, we inferred the ratio of nonsynonymous to synonymous mutations (dN/dS or omega) in a Bayesian sliding window approach using genomegamap 39.

[…] respectively. We focussed on the motifs neighbouring those two mutations along the following rationale: for Val557 we included the entire beta-sheet in which Val557 lies, and also the loose loops and part of the neighbouring alpha helix closer to the active site 18.
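For the Tajima's D statistic described earlier in this section, a small worked example may help to make the comparison between segregating sites and pairwise differences concrete. The sketch below is not the MEGA implementation used in the study; it is a plain-Python version of the standard formula, and it assumes an alignment of equal-length sequences without gaps or ambiguous bases. The function name `tajimas_d` and the toy alignments are illustrative.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences (no gap handling).

    Returns None when there are no segregating sites, where D is undefined.
    """
    n = len(seqs)
    if n < 3 or any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("need at least 3 aligned sequences of equal length")

    # S: number of segregating (polymorphic) alignment columns.
    S = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if S == 0:
        return None

    # pi: average number of nucleotide differences over all sequence pairs.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Normalising constants, which depend only on the sample size n.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # D compares pi with S / a1, scaled by the estimated standard deviation.
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))

# Toy alignments (hypothetical): every variant is a singleton in `rare`,
# while `balanced` carries two variants at intermediate frequency.
rare = ["AAAA", "TAAA", "ATAA", "AATA"]
balanced = ["AA", "AA", "TT", "TT"]
d_rare = tajimas_d(rare)          # negative: excess of rare alleles
d_balanced = tajimas_d(balanced)  # positive: intermediate-frequency alleles
```

On real data the pairwise loop scales quadratically with the number of genomes, so a per-site allele-frequency formulation of pi would be preferable for tens of thousands of sequences.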
For Phe480, as seemingly no neighbouring structure binds either the template or the nucleoside, we included the entire loose loop and alpha helix neighbouring Phe480, as part of the hydrophobic core of RdRp and crucial for the protein proof-reading stability 4. The resulting motifs stretch from Arg467 to Val493 for the first motif (neighbouring Phe480), and from Leu544 to Gln570 for the second motif (neighbouring Val557).

77,150 SARS-CoV-2 consensus genomes were downloaded as of 23rd July 2020 from the GISAID platform 21. The genomes were quality filtered: 18,942 genomes were discarded for containing more than 10% ambiguous basecalls (Ns), and 1,402 genomes were discarded for containing ambiguous variant calls. Mutations were considered only when a clear alternative allele was found; genomes reporting errors in the alternative allele calls were discarded. Degenerate basecalls other than N in the variant call did not lead to exclusion of the genome from the dataset. Using our COVGAP pipeline 20, the sequences were first aligned to the Wuhan-1 reference 40 using mafft v7.467, and point mutations were identified with snp-sites v2.5.1 41. The collected variants were then annotated using snpeff v4.5covid19 42. The variant positions, together with the variant annotation, were screened using R base. Genomes whose variants fell into the potential escape motifs were labelled as potential escape mutants.
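The filtering and labelling steps above can be sketched in a few lines. This is an illustration only, written in Python rather than the R used by the authors; the motif boundaries and the 10% threshold come from the text, while the function names and the representation of a genome as a plain string and of its annotation as a list of mutated Nsp12 residue numbers are assumptions.

```python
# Nsp12 residue ranges of the two potential escape motifs, as given in the text.
ESCAPE_MOTIFS = {"first": (467, 493), "second": (544, 570)}
MAX_N_FRACTION = 0.10  # genomes with more than 10% Ns are discarded

def passes_n_filter(genome, max_n_fraction=MAX_N_FRACTION):
    """True if the fraction of ambiguous 'N' basecalls is at most the threshold."""
    seq = genome.upper()
    return len(seq) > 0 and seq.count("N") / len(seq) <= max_n_fraction

def motif_of(residue):
    """Name of the escape motif containing an Nsp12 residue number, else None."""
    for name, (start, end) in ESCAPE_MOTIFS.items():
        if start <= residue <= end:
            return name
    return None

def is_potential_escape_mutant(nonsyn_residues):
    """True if any non-synonymous Nsp12 change falls inside either motif."""
    return any(motif_of(r) is not None for r in nonsyn_residues)

# Genome counts reported in the text: downloaded minus the two discarded sets.
retained = 77150 - 18942 - 1402
```

As a usage example, a genome carrying only Pro323Leu would not be labelled (`is_potential_escape_mutant([323])` is false), whereas one that also carries Asn491Ser would be (`[323, 491]`).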
Statistics on spatio-temporal trends were calculated via R base v3.6.2; lineage classification was inferred through Pangolin 23.

We generated COVGAP2, an adaptation of our bioinformatic COVID-19 […]

All persons with a first-time positive SARS-CoV-2 PCR test between the 26th of February (first case in Switzerland) and the 30th of April 2020 were eligible for inclusion in the cohort.

To calculate the diversity of each genome, the chao2 index was chosen because of its incidence-based nature 48. Log values were used to ensure normal distribution. Given the large size disparity, a Monte-Carlo approach was chosen. For both the first and second motif (n=115, n=69 respectively), the corresponding no-escape mutant group ("None") (n=56691, n=56737 respectively) was randomly subsampled without replacement into 10000 groups, each with n=150. Each subgroup was tested against the same motif group through Student's t-test. Resulting p-values were adjusted following the Benjamini-Yekutieli correction 49, and subsequently merged using the sum of logs Fisher method 50. The R package metap v1.4 was used.

To infer escape-associated mutations, all the samples from the global cohort (n=56,806) were labelled based on whether or not they showed at least one non-synonymous mutation in either of the potential escape motifs.
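For the p-value handling in the diversity comparison above, the two named steps (Benjamini-Yekutieli adjustment, then Fisher's sum-of-logs merge) can be written out explicitly. The study used R and the metap package; the following standard-library Python sketch is only meant to show the arithmetic, and the function names are assumptions. Input p-values are assumed to lie in (0, 1].

```python
import math

def benjamini_yekutieli(pvalues):
    """Benjamini-Yekutieli adjusted p-values, returned in the input order."""
    m = len(pvalues)
    c_m = sum(1.0 / i for i in range(1, m + 1))  # harmonic correction factor
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m * c_m / rank)
        adjusted[idx] = running_min
    return adjusted

def fisher_combine(pvalues):
    """Fisher's method: returns (chi-square statistic, df, merged p-value)."""
    k = len(pvalues)
    stat = -2.0 * sum(math.log(p) for p in pvalues)
    half = stat / 2.0
    if half == 0.0:
        return stat, 2 * k, 1.0
    # For even df = 2k the chi-square upper tail has a closed form:
    # exp(-x/2) * sum_{i<k} (x/2)^i / i!, evaluated here in log space.
    log_terms = [i * math.log(half) - half - math.lgamma(i + 1) for i in range(k)]
    top = max(log_terms)
    log_sf = top + math.log(sum(math.exp(t - top) for t in log_terms))
    return stat, 2 * k, min(1.0, math.exp(log_sf))

adj = benjamini_yekutieli([0.5, 0.001])
stat, df, merged = fisher_combine([0.05, 0.05])
```

Because Fisher's statistic has two degrees of freedom per merged p-value, combining the 10000 subsample tests gives df = 20000, which agrees with the degrees of freedom reported alongside the chi-square values in the results.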
From the initial 16,051 mutations detected over the entire cohort, only mutations appearing in both non-escape and potential escape mutants were considered (n=1,056). Generalised linear models were used to evaluate the correlation between each point mutation and the escape/non-escape predictor. Lineages were added as a fixed effect only when the mutation appeared in multiple lineages. The response error was assumed to be distributed binomially. The retrieved p-value was then corrected for multiple testing using the Benjamini-Yekutieli correction 49 and considered significant only for those mutations showing an adjusted p value < 0.05. To assess with which group the significant mutations were associated, we first calculated for each mutation the frequency in escape mutants divided by the frequency in non-escape mutants. We then defined mutations as escape-associated only if the escape/non-escape frequency ratio was above the 95th percentile of the significant mutation ratio distribution. Conversely, mutations showing a ratio below the 5th percentile were considered not escape-associated. Percentiles were established according to Hyndman and Fan (1996) in order to obtain median-unbiased quantiles. The glm function of the R base stats package 3.6.2 was used.

The retrieved associated mutations were fitted into the protein structure reported by Yin et al. 19 (PDB ID 7BV2). Visualisation and figure drafting took place with pymol version 2.3.5 51. The effect of mutations on the stability of RdRp was calculated by using the FoldX 5.0 advanced protein design suite 52. FoldX 5.0 can measure the stability changes of a protein structure model upon several mutations and is able to consider the interactions of the protein with RNA in its calculations, which is vital for an analysis in the RdRp system. The input PDB files are identical to those downloaded from the PDB and were not otherwise preprocessed.
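For the escape-association step described above (frequency ratio followed by a percentile cutoff), a brief sketch may clarify how the cutoff is obtained. It assumes that the "median-unbiased" quantile corresponds to definition 8 of Hyndman and Fan (type 8 in R's quantile function); the text does not name the type explicitly. The function names and the toy frequencies are hypothetical, and Python is used here instead of the authors' R code.

```python
import math

def quantile_type8(values, p):
    """Sample quantile following Hyndman and Fan definition 8 (assumed type)."""
    x = sorted(values)
    n = len(x)
    if n == 0 or not 0.0 <= p <= 1.0:
        raise ValueError("need at least one value and 0 <= p <= 1")
    h = n * p + (p + 1.0) / 3.0          # 1-based fractional position
    j = math.floor(h)
    g = h - j                            # interpolation weight
    lo = x[min(max(j, 1), n) - 1]        # clamp indices to the sample range
    hi = x[min(max(j + 1, 1), n) - 1]
    return lo + g * (hi - lo)

def escape_associated(freqs, upper=0.95):
    """Mutations whose escape/non-escape frequency ratio exceeds the cutoff.

    freqs: {mutation: (frequency in escape mutants, frequency in non-escape
           mutants)}; both frequencies are positive because only mutations
           seen in both groups are considered.
    """
    ratios = {m: fe / fn for m, (fe, fn) in freqs.items()}
    cutoff = quantile_type8(ratios.values(), upper)
    return sorted(m for m, r in ratios.items() if r > cutoff)

# Hypothetical toy input: twenty mutations with ratios 1, 2, ..., 20.
toy = {"m%d" % i: (float(i), 1.0) for i in range(1, 21)}
flagged = escape_associated(toy)
```

The same helper with `p=0.05` and a "less than" comparison would give the lower tail used to call mutations not escape-associated.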
We used the FoldX command "BuildModel" and enabled the calculation of interactions between the protein and RNA. For every combination of protein structure and mutations, the calculations of Gibbs energies of protein folding were repeated five times, and the differences between respective wild types and mutated proteins were reported. We then calculated the average total Gibbs energy difference over the six protein structures of RdRp to have a final estimate of the stability difference between the wild type and the mutated protein. To distinguish genuine stability predictions from noise, we filtered the results based on the reported standard deviations of FoldX energy calculations as described in Buss et al 53. The accuracy of FoldX is described to be dependent on the resolution of the investigated structures. Given that the resolution of the used protein structures ranges on the lower end with values between 2.5 and 2.9 Ångstrom, we used the conservative threshold of 1.78 kcal/mol. Energy differences below this threshold were not considered to denote changes in the stability of the investigated protein. To have a diverse perspective on the impact of the candidate resistance and compensatory mutations, a stability analysis was conducted on multiple experimentally resolved structures of the SARS-CoV-2 RdRp: 7CXM, 7AAP, 7BV2, 7C2K, 6YYT, and 6M71. The PDB files of these structures were directly downloaded from the Protein Data Bank 54.

The study was conducted according to good laboratory practice and in accordance with the Declaration of Helsinki and national and institutional standards and was approved by the ethical committee (EKNZ 2020-00769).
The clinical trial accession numbers are NCT04323761 and NCT04351503 (clinicaltrials.gov).

Figure S1: Remdesivir mechanism of action. The prodrug is metabolized into an ATP analogue and triggers the termination of RNA polymerization after binding to the active site. The two mutations are shown schematically at their locations: Phe480 is buried, and Val557 is adjacent to the binding site. Credits to Biorender.com.

Figure S2: Codon-based estimates of the ratio of nonsynonymous to synonymous mutations (omega, dN/dS) along nsp12 using a sliding window model in a Bayesian computation approach. An excess of nonsynonymous mutations (dN/dS > 1) is interpreted as the site being under positive selection, whereas an excess of synonymous mutations (dN/dS < 1) is interpreted as purifying or balancing selection acting on this particular position. A balance of nonsynonymous and synonymous mutations (dN/dS = 1) is interpreted as neutral evolution.
Average energy difference between wild-type and mutated RdRp across six different experimental structures. Destabilizing mutations have a positive energy difference, while stabilizing mutations have a negative one. The upper panel shows the single variants, while the lower panel shows the mutation combinations found. The threshold above which an energy difference is considered significant is marked with a light grey rectangle.

Genome assemblies from the Basel open cohort were submitted to GISAID. Amplicon sequences from the remdesivir-treated patients are available upon request.

No dedicated funding was received for this work.
Potential inhibitors against 2019-nCoV coronavirus M protease from clinically approved medicines
Abelson Kinase Inhibitors Are Potent Inhibitors of Severe Acute Respiratory Syndrome Coronavirus and Middle East Respiratory Syndrome Coronavirus Fusion
Stopping lopinavir/ritonavir in COVID-19 patients: duration of the drug interacting effect
Remdesivir and SARS-CoV-2: Structural requirements at both nsp12 RdRp and nsp14 Exonuclease active-sites
Structural and Functional Basis of the Fidelity of Nucleotide Selection by Flavivirus RNA-Dependent RNA Polymerases
Remdesivir, Sofosbuvir, Galidesivir, and Tenofovir against SARS-CoV-2 RNA dependent RNA polymerase (RdRp): A molecular docking study
Nucleotide Analogues as Inhibitors of SARS-CoV-2 Polymerase, a Key Drug Target for COVID-19
Sofosbuvir terminated RNA is more resistant to SARS-CoV-2 proofreader than RNA terminated by
Comparisons of the HBV and HIV polymerase, and antiviral resistance mutations
Inhibitors of the Hepatitis C Virus Polymerase; Mode of Action and Resistance
Permissive secondary mutations enable the evolution of influenza oseltamivir resistance
Selected replicon variants with low-level in vitro resistance to the hepatitis C virus NS5B polymerase inhibitor PSI-6130 lack cross-resistance with R1479
Effect of Remdesivir vs Standard Care on Clinical Status at 11 Days in Patients With Moderate COVID-19: A Randomized Clinical Trial
Repurposed Antiviral Drugs for Covid-19 - Interim WHO Solidarity Trial Results
Remdesivir in adults with severe COVID-19: a randomised, double-blind, placebo-controlled, multicentre trial
Structure of the RNA-dependent RNA polymerase from COVID-19 virus
Structural basis for inhibition of the RNA-dependent RNA polymerase from SARS-CoV-2 by remdesivir
SARS-CoV-2 outbreak in a tri-national urban area is dominated by a B.1 lineage variant linked to mass gathering events
Data, disease and diplomacy: GISAID's innovative contribution to global health
Transmission Potential of Asymptomatic and Paucisymptomatic Severe Acute Respiratory Syndrome Coronavirus 2 Infections: A 3-Family Cluster Study in China
A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology
Emerging SARS-CoV-2 mutation hot spots include a novel RNA-dependent-RNA polymerase variant
Remdesivir is a direct-acting antiviral that inhibits RNA-dependent RNA polymerase from severe acute respiratory syndrome coronavirus 2 with high potency
Remdesivir Inhibits SARS-CoV-2 in Human Lung Cells and Chimeric SARS-CoV Expressing the SARS-CoV-2 RNA Polymerase in Mice
Clinical benefit of remdesivir in rhesus macaques infected with SARS-CoV-2
The distinct contributions of fitness and genetic barrier to the development of antiviral drug resistance
Compensatory mutations and epistasis for protein function
A Degradome-Based Polymerase Chain Reaction to Resolve the Potential of Environmental Samples for 2,4-Dichlorophenol Biodegradation
Nextstrain: real-time tracking of pathogen evolution
MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability
Statistical method for testing the neutral mutation hypothesis by DNA polymorphism
Molecular Evolutionary Genetics Analysis Version 7.0 for Bigger Datasets
On the number of segregating sites in genetical models without recombination. Theor
GenomegaMap: Within-Species Genome-Wide dN/dS Estimation from over 10,000 Genomes
A new coronavirus associated with human respiratory disease in China
SNP-sites: rapid efficient extraction of SNPs from multi-FASTA alignments
A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain
Reshaping Data with the reshape Package
Orchestrating high-throughput genomic analysis with Bioconductor
Software for computing and annotating genomic ranges
Dates and Times Made Easy with lubridate
ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics

We are grateful to Dr

Algeria