key: cord-0891622-0dm161b4 authors: Conti, Brian J.; Kirchdoerfer, Robert N.; Sussman, Michael R. title: Mass spectrometric based detection of protein nucleotidylation in the RNA polymerase of SARS-CoV-2 date: 2020-10-07 journal: bioRxiv DOI: 10.1101/2020.10.07.330324 sha: 9797fc5f4f12c817ec3ddcd8deb9d184971ffcb3 doc_id: 891622 cord_uid: 0dm161b4

Coronaviruses, like SARS-CoV-2, encode a nucleotidyl transferase in the N-terminal NiRAN domain of non-structural protein (nsp) 12 within the RNA-dependent RNA polymerase (RdRP) 1-3. Though the substrate targets of the viral nucleotidyl transferase are unknown, NiRAN active sites are highly conserved and essential for viral replication 3. We show, for the first time, the detection and sequence location of GMP-modified amino acids in nidovirus RdRP-associated proteins using heavy isotope-assisted MS and MS/MS peptide sequencing. We identified lys-143 in the equine arteritis virus (EAV) protein nsp7 as a primary site of nucleotidylation in vitro, with GMP covalently attached through a phosphoramide bond. In SARS-CoV-2 replicase proteins, we demonstrate a unique O-linked GMP attachment on nsp7 ser-1, whose formation required the presence of nsp12. It is clear that additional nucleotidylation sites remain undiscovered, including the possibility that nsp12 itself may form a transient GMP adduct in the NiRAN active site that has eluded detection in these initial studies due to instability of the covalent attachment. Our results demonstrate new strategies for detecting GMP-peptide linkages that can be adapted for higher throughput screening using mass spectrometric technologies. These data are expected to be important for a rapid and timely characterization of a new enzymatic activity in SARS-CoV-2 that may be an attractive drug target aimed at limiting viral replication in infected patients.
The world is faced with a pandemic disease, COVID-19, caused by the emergence and global spread of a new species of coronavirus. Although it is clear that the lethality of the disease is often caused by pulmonary problems associated with severe respiratory symptoms, this virus continues to surprise the medical community with new pathologies that are creating challenges for effective treatment. While the causative agent of COVID-19, SARS-CoV-2, is similar to other coronaviruses that infect humans and animals, many molecular details of this virus's macromolecular structures and functions remain unknown. Immunological approaches to neutralize the virus, using either passive (e.g., injection of antibodies into patients) or active (e.g., injection of DNA, mRNA or protein that generates a neutralizing immune response) immunity, are being pursued by scientists in both the private and public sectors. Since it is unclear whether any of these methods will succeed, it is prudent to explore other approaches to treat this disease, including drug therapy. Small molecule drug therapy has been highly successful in controlling HIV-1 infections and curing infections of hepatitis C virus [4] [5] [6] [7]. Creating small molecule drugs that target the SARS-CoV-2 replication machinery would provide a complementary approach to ongoing vaccine development efforts as well as preparing the world for future outbreaks caused by emerging coronaviruses.

A logical target for drugs that inhibit the virus is the conserved protein machinery responsible for replication of the viral RNA genome. Targeting proteins associated with the viral RNA-dependent RNA polymerase (RdRP) is particularly attractive given the absence of similar RNA replication machinery in human cells. The small molecule Remdesivir targets the coronavirus replication machinery, causing premature RNA chain termination 8.
Remdesivir is now undergoing clinical trials and has been approved for compassionate use in the treatment of COVID-19. Discovery of new enzymatic and protein binding activities in SARS-CoV-2 is a critical part of elucidating the viral life cycle and, importantly, of developing novel strategies to combat this disease as well as potentially newly emerging viruses that use similar enzymes.

[...] residue in SARS-CoV, which is also conserved in SARS-CoV-2, abolishes viral propagation in Vero-E6 cells 3,14. In our assay, a mutation of SARS-CoV-2 nsp12 that replaced a basic amino acid with a neutral amino acid, i.e. K73A, abolished all incorporation of radioactivity in nsp7 and nsp8, further demonstrating the central role of this nsp12 RdRP protein in nucleotidylation. Promiscuous labeling of a non-relevant protein, bovine serum albumin (BSA), was not observed, suggesting that nsp12-mediated nucleotidylation exhibits substrate specificity, possibly through close localization of proteins within the macromolecular polymerase complex.

We next utilized heavy isotope incorporation and the high mass accuracy of a tribrid Orbitrap-based mass spectrometer to confirm that the transfer of radioactivity to the nsp proteins was a result of nucleotidylation. We performed nucleotidylation reactions in the absence of nucleotide, in the presence of natural GTP or, alternatively, GTP containing the heavy isotopes 15N or 13C. This allowed us to identify GMP-labeled peptide peaks using LC-MS alone, independent of MS/MS spectrum acquisition and matching. Unlabeled nsp peptides that were present in all sample injections ran at similar retention times (Fig. 2a, Extended Data Fig. 3). LC-MS peaks that corresponded to GMP-labeled peptides were identified by two criteria (Fig. 
2b): (1) their absence in unlabeled sample injections, and (2) the presence of appropriately mass-shifted peptides at the same retention time in samples that were labeled with 15N- and 13C-GTP. Although a typical SARS-CoV-2 sample had ~13,000 different LC-MS peaks, only three peaks were found that met the above criteria of GMP-labeled peptides (Extended Data Fig. 4, Supplementary Tables 1 and 2). Similarly, the sample set for EAV nsp proteins contained ~91,000 peaks, but only 50 peaks satisfied the specified criteria as GMP-modified candidates (Extended Data Fig. 5, Supplementary Table 3).

For the next step of analysis, candidate peaks were subjected to MS/MS analysis in the tribrid mass spectrometer using higher energy collisional dissociation (HCD) fragmentation methodology as well as electron transfer/higher-energy collision dissociation (EThcD). EThcD often preserves the attachment of labile peptide modifications that are difficult to preserve with HCD alone [15] [16] [17]. Table 1 summarizes all GMP-modified peptides identified by MS/MS using the Sequest HT database search algorithm (also see Supplementary Table 4) 18,19. In SARS-CoV-2, we could only verify a single GMP-modified site by MS/MS, located at ser-2 of the nsp7 recombinant protein, which is the N-terminal ser-1 residue in the nsp7 protein produced by the virus. Multiple forms of this same modified peptide were identified. For EAV, five total sites were identified in nsp proteins. This included nucleotidylation of nsp7 lys-143, which was observed in multiple peptides, and one site in nsp9. All GMP linkages in the EAV nsp proteins occurred via a phosphoramidate (P-N) bond rather than the O-linked GMP found in SARS-CoV-2 (Table 1).
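The two LC-MS screening criteria stated above (absence from the unlabeled injection, plus co-eluting 15N- and 13C-shifted partner peaks) can be illustrated with a minimal Python sketch. This is not the authors' workflow, which used Proteome Discoverer and Microsoft Excel. The `Peak` class, the function names, and the 10 ppm m/z tolerance are hypothetical, and the heavy-label mass shifts assume uniformly labeled GTP (15N5 and 13C10), which the text does not state explicitly. The +/-0.5 minute retention-time window is taken from the Methods.

```python
from dataclasses import dataclass

# ASSUMPTION: uniformly labeled GTP, i.e. 5 nitrogens or 10 carbons replaced.
N15_SHIFT_DA = 5 * 0.9970349    # mass gain of 15N5-GMP over GMP, in Da
C13_SHIFT_DA = 10 * 1.0033548   # mass gain of 13C10-GMP over GMP, in Da


@dataclass(frozen=True)
class Peak:
    """Hypothetical LC-MS feature record."""
    mz: float            # observed m/z
    charge: int          # charge state z
    rt: float            # retention time in minutes
    in_unlabeled: bool   # True if the peak also appears without nucleotide


def has_partner(peak, peaks, shift_da, rt_tol, ppm_tol):
    """True if a same-charge peak sits at m/z + shift/z within both tolerances."""
    target = peak.mz + shift_da / peak.charge
    for other in peaks:
        if other.charge != peak.charge:
            continue
        if abs(other.rt - peak.rt) > rt_tol:
            continue
        if abs(other.mz - target) / target * 1e6 <= ppm_tol:
            return True
    return False


def gmp_candidates(peaks, rt_tol=0.5, ppm_tol=10.0):
    """Return peaks meeting both criteria: not in the unlabeled sample,
    and accompanied by 15N- and 13C-shifted partners at a similar RT."""
    return [
        p for p in peaks
        if not p.in_unlabeled
        and has_partner(p, peaks, N15_SHIFT_DA, rt_tol, ppm_tol)
        and has_partner(p, peaks, C13_SHIFT_DA, rt_tol, ppm_tol)
    ]
```

Requiring both heavy partners is what makes the screen selective: a peak that is merely absent from the unlabeled injection is not reported unless it is also accompanied by the expected isotopic triplet.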
Compared to HCD, EThcD spectra provided clean, higher-quality, and more interpretable data that had improved spectrum matching scores (Xcorrelation scores) and that contained both N-terminal and C-terminal fragment ions (c-ion series and z- and y-ion series, respectively) (Fig. 2c, Fig. 3, Extended Data Fig. 6-9). Preservation of the GMP attachment allowed us to determine the modification site. However, some loss of the GMP modification during fragmentation was observed. One example is the 1438.73035 ion in Figure 2c that corresponds to the z13+ fragment that lost the GMP modification. If not for the presence of the y13+ ion at 1798.78857, it would be easy to assign the GMP attachment site to the N-terminal glycine residue instead.

The most prominent feature in HCD spectra was the fragmentation of the GMP moiety itself that, in every instance, leaves a characteristic and dominant guanine (C5N5OH6) tracer ion (Table 1, Fig. 3, and Extended Data Fig. 6-9). The guanine ions were also observed in EThcD spectra, but at a low intensity. The mass of the guanine fragment changed appropriately depending on whether the peptide was labeled with GMP, 13C-GMP or 15N-GMP. The theoretical masses of guanine derived from each of these versions of GMP are 152.0572, 157.0740 and 157.0424, respectively. Peptide ions that do not contain the modification were prominent in these HCD spectra. For example, when the modification was located closer to the N-terminus, the y-ion series is easily identified, and the b-ion series is missing (Fig. 3a, b). [...] series is generally absent (Fig. 3c). The missing ion series was often observed with the loss of the GMP modification (Extended Data Fig. 6 and Extended Data Fig. 10), which makes its localization difficult if HCD is used alone.
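The three theoretical guanine tracer-ion masses quoted above can be checked with a short calculation. The sketch below sums standard monoisotopic atomic masses for the stated formula C5N5OH6 and then swaps all five carbons or all five nitrogens for their heavy isotopes. The function name and the dictionaries are illustrative. Full labeling of the guanine base is an assumption, although it is the one that reproduces the reported values. The electron mass is not subtracted here, which is the convention that matches the numbers in the text to four decimal places.

```python
# Monoisotopic atomic masses (Da) of the light isotopes.
MONO = {"C": 12.0, "H": 1.00782503, "N": 14.0030740, "O": 15.9949146}

# Mass gained per atom when 12C -> 13C or 14N -> 15N.
HEAVY_DELTA = {"C": 1.0033548, "N": 0.9970349}

# Guanine tracer ion as written in the text: C5N5OH6.
GUANINE_ION = {"C": 5, "H": 6, "N": 5, "O": 1}


def guanine_ion_mass(heavy=None):
    """Mass of the guanine tracer ion.

    heavy=None -> natural isotopes
    heavy="C"  -> all five carbons are 13C (assumed full labeling)
    heavy="N"  -> all five nitrogens are 15N (assumed full labeling)
    """
    mass = sum(MONO[element] * count for element, count in GUANINE_ION.items())
    if heavy is not None:
        mass += HEAVY_DELTA[heavy] * GUANINE_ION[heavy]
    return mass


for label, heavy in (("GMP", None), ("13C-GMP", "C"), ("15N-GMP", "N")):
    print(f"{label}: {guanine_ion_mass(heavy):.4f}")
```

The 13C and 15N tracer ions differ by only about 0.03 Da, so distinguishing them relies on the high mass accuracy of the Orbitrap analyzer noted earlier in the text.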
One main ion that has lost the GMP modification along with one or two H2O molecules often stands out, for example, the 926.46 and 917.45 m/z ions in Figure 3a [...]

We failed to identify GMP-modified sites in SARS-CoV-2 nsp8, even though the proteins were clearly labeled with radioactivity in the gel-based assay. Although we utilized a variety of proteases to examine our samples by LC-MS, it is possible that the modified peptides could not easily be isolated under the liquid chromatography conditions or that they failed to ionize efficiently, both of which are necessary to produce a strong MS signal. GMP modification added hydrophobicity to some peptides, as they eluted with higher organic solvent on a reverse phase column compared to their unmodified counterparts (Fig. 2a vs 2b). It is also possible that the stability of GMP attachment is dependent on the chemical environment provided by neighboring amino acids. Additionally, attachment to amides through the formation of phosphoramide bonds is known to be unstable in the acid conditions used in LC-MS (Extended Data Fig. 1) 3,12,13. Potentially, the observation of the GMP-modified ser-2 on recombinant SARS-CoV-2 nsp7 in the mass spectrometer could be the manifestation of a lys-3 GMP modification via a phosphoramide bond that was transferred to a more energetically favorable position on the neighboring residue during sample handling. It should be noted that non-canonical phosphorylation on basic amino acids, which uses identical phosphoramidate chemistry (i.e., the P-N bond versus P-O in S, T and Y amino acids) for the GMP linkage, is beginning to emerge as an important component of signaling systems in nature 16,23. These phosphoramidate bonds are more labile under the conditions commonly used in analytical instrumentation such as HPLC and mass spectrometry and are therefore understudied.
This highlights the need for more comprehensive studies that examine the impact of additional factors on their in vivo and in vitro lability, such as the specific local chemical environment provided by unique amino acid sequences on either side of the modified residue.

Nucleotidyl transferase activity plays a role in diverse biological processes in bacteria, viruses and eukaryotes. The addition of AMP (AMPylation or adenylation) was first described as a mechanism to regulate E. coli glutamine synthetase activity 24. It is now recognized, along with UMPylation, as a post-translational modification that impacts bacterial DNA replication 25,26, ER and oxidative stress responses 27-29, bacterial pathogenesis 30,31, and viral RNA synthesis 32, where nucleotides are stably attached to serine, threonine or tyrosine residues 33,34. AMP and GMP adducts are also formed as intermediates in RNA capping 12,35,36, DNA ligation 37,38, SUMOylation and ubiquitinylation 39-43 as well as other enzymatic reactions 44,45. In these cases, as with non-canonical phosphorylation of lysine and histidine, the nucleotide forms less stable linkages with amine (lysine or histidine) or carboxyl groups that promote its transfer to substrate.

Here we have demonstrated nucleotidylation activity of SARS-CoV-2 proteins that is hypothesized to be critical for viral propagation 3. Mutation of nsp12 K73 caused a loss of transferase activity, which indicates the N-terminal NiRAN domain is necessary for formation of GMP adducts on nsp7 and nsp8. [...] (residues 400-932) and also the binding sites for nsp7 and nsp8 1,2. Nsp7 and nsp8 also form alternate, oligomeric structures 46. Unlike other conserved nucleotidyl transferase domains, the NiRAN domain is newly defined and its functional role in viral life cycles remains to be elucidated.
While many components of coronavirus RNA capping machinery have been characterized, the hypothesized guanylyltransferase, which traditionally forms GMP intermediates as observed here, remains undiscovered 12,35,36,47. In other RNA viruses, a uridylylated protein, known as VPg, is used to initiate RNA synthesis 48-53. SARS-CoV-2 nsp7 and nsp8 are positioned to interact with the RNA 5' terminus when bound to nsp12 54,55 and are critical for robust RNA replication 9,56. We expect the data provided here will lay the foundation for future studies defining the exact mechanistic and functional purposes of this novel nucleotidylation activity in SARS-CoV-2 and related coronaviruses. The translational utility of this information for discovering and/or designing new inhibitors of the process that function to thwart COVID-19 is an important future goal for research on SARS-CoV-2 as well as unknown but potentially newly emerging coronaviruses.

1 The virally produced protein does not contain a glycine at position 1.
2 Assigned GMP localization for the top-scoring spectrum match.
3 Top score amongst GMP, 15N-GMP and 13C-GMP labeled peptides are shown. "-" indicates that corresponding HCD spectra did not pass MS/MS search criteria.
4 Indicates guanine fragment ion was the predominant ion in corresponding HCD spectra, even when they were not identified as peptide spectrum matches using the MS/MS search criteria.

Signals are shown for samples that were incubated without GTP (no nucleotide), with GTP, or with a mixture of GTP, 15N-GTP and 13C-GTP (top, middle and lower panels, respectively). b, MS1 signals at the m/z of 928.4078 for the same samples are shown (left panel). This mass corresponds to the GMP-modified version of the same peptide with a charge state of +2.
The average mass spectrum across these peaks is shown from the m/z range of 927-938 in order to visualize the presence or absence of the isotopic mass envelopes for the GMP-labeled peptide and the corresponding 15N- and 13C-labeled GMP peptides (right panel). For the unlabeled sample, the average mass spectrum for retention times of ~45.68-45.78 is shown. The inset indicates the mass added to any peptide modified by GMP, 15N-GMP and 13C-GMP as well as the mass differences between the GMP-labeled and the heavy-labeled GMP peptides for the specified charge states (z). c, The top-scoring EThcD peptide spectrum match for the same GMP-labeled peptide is shown with labeling of main fragment ions. Inset shows an abbreviated list of expected peptide mass fragments, which are colored red or blue if they were matched in the spectrum within an error of 0.04 Daltons. Note that not all matched ions are labeled in the spectrum.

Examples of top-scoring HCD peptide spectrum matches from two EAV nsp7 peptides are shown that are either labeled closer to the N-terminus (a, b) or the C-terminus (c). Each spectrum represents a peptide modified with either GMP, 15N-GMP or 13C-GMP as indicated. Inset lists expected fragment ions in an identical manner as in Figure 2c. Spectra are additionally labeled with guanine fragment ions and dominant peptide fragments that have lost the GMP modification plus one or two H2O molecules. Note that Sequest HT assigned the GMP modification site for the spectrum in panel c to R17, instead of the correct site at K16.

DNA for SARS-CoV-2 nsp12 encompassing amino acids 1-932 and EAV nsp9 1-693 was synthesized with codon optimization (Genscript) and cloned into pFastBac with an N-terminal MG addition and a C-terminal TEV protease site and two Strep tags. Expression was performed by transducing recombinant baculoviruses into Sf21 insect cells (Expression Systems).
Cells were harvested by centrifugation and resuspended in 25 mM HEPES pH 7.4, 300 mM sodium chloride, 1 mM magnesium chloride, and 2 mM dithiothreitol. The resuspended cells were then lysed using a microfluidizer, clarified by centrifugation at 25,000 × g for 30 min and filtered using a 0.45 μm filter. Nsp12 was purified using Streptactin Agarose (IBA), eluting with 2 mM desthiobiotin. Eluted protein was further purified by size exclusion chromatography using a Superdex200 column (GE Life Sciences) in 25 mM HEPES pH 7.4, 300 mM NaCl, 0.1 mM magnesium chloride, and 2 mM tris(2-carboxyethyl)phosphine.

Full-length, codon-optimized nsp7 and nsp8 genes were cloned into pET46 for expression in E. coli. The N-terminal tag for SARS-CoV-2 nsp7 and nsp8 is MAHHHHHHVDDDDKMENLYFQG and for EAV nsp7 is MAHHHHHHVGTENLYFQ. The TEV protease cleavage sites (ENLYFQ|G) were positioned to leave N-terminal glycines on SARS-CoV-2 nsp7 and nsp8 and no additional amino acids on EAV nsp7. Proteins were expressed in Rosetta2 pLysS E. coli (Novagen). Bacterial cultures were grown to an OD600 of 0.8 at 37 °C, and then expression was induced with a final concentration of 0.5 mM isopropyl β-D-1-thiogalactopyranoside and the growth temperature was reduced to 16 °C for 14-16 h. Cells were harvested by centrifugation and were resuspended in 10 mM HEPES pH 7.4, 300 mM sodium chloride, 30 mM imidazole, and 2 mM dithiothreitol. Resuspended cells were lysed using a microfluidizer. Lysates were cleared by centrifugation at 25,000 × g for 30 min and then filtered using a 0.45 μm filter. Protein was purified using Ni-NTA agarose (Qiagen), eluting with 300 mM imidazole. Eluted proteins were digested with 1% (w/w) TEV protease.
TEV protease-digested proteins were passed over Ni-NTA agarose to remove uncleaved proteins and then further purified by size exclusion chromatography using a Superdex200 column (GE Life Sciences) in 25 mM HEPES pH 7.4, 300 mM sodium chloride, 0.1 mM magnesium chloride, and 2 mM dithiothreitol.

Nucleotidylation assays were performed similarly to those previously described 1 [...] Samples were mixed with 4x LDS loading dye (Thermo) containing 200 mM DTT and analyzed on Bolt Bis-Tris Plus gels (Thermo) according to manufacturer's instructions. Gels were either stained with GelCode Blue Protein Stain (Fisher Scientific) or prepared for autoradiography by incubation in a 10% polyethylene glycol 8000 (Sigma) solution for 30 minutes and gel drying at 70 °C for 45 minutes using a Hoefer gel drier, Savant condenser unit and a TRIVAC pump (Leybold). Gels were exposed to phosphor screens for 2 to 48 hours and developed with a Molecular Dynamics Storm 850 imager. Gel images were examined using ImageJ open-source software.

Nucleotidylation reactions were brought to 6.7 M urea in 50 mM ammonium bicarbonate, 5 mM DTT and incubated at 42 °C for 15 minutes. Cysteines were alkylated by the addition of 15 mM iodoacetamide for 30 minutes at room temperature. Following the neutralization of unreacted iodoacetamide with 15 mM DTT, nsp proteins were diluted to a final concentration of 1 M urea and digested overnight with either Trypsin/LysC mix, Chymotrypsin, or GluC (Promega) according to manufacturer's instructions. The resulting peptides were desalted using OMIX C18 pipette tips (Agilent Technologies) or 1 mL C18 Sep Pak cartridges (Waters) in 10 mM ammonium formate at pH 7.0, eluted in 75% acetonitrile, and dried to completion with a vacuum centrifuge.

For LC-MS/MS analysis, each digested sample was suspended in 0.1% formic acid and maintained at 7 °C until analysis within 12 hours.
Samples were loaded onto a Thermo Scientific Easyspray C4 or C18 nanocolumn and eluted with a gradient to 80% acetonitrile / 0.1% formic acid at 300 nL/min at room temperature using an Ultimate 3000 series liquid chromatography system. Eluted peptides were ionized in-line with a Thermo Scientific Easy Spray source for direct analysis with a Thermo Scientific Fusion Lumos Orbitrap mass spectrometer and subjected to a targeted or data-dependent MS/MS acquisition scheme that collected both HCD and EThcD spectra in the high resolution orbitrap. LC-MS peaks were defined by analyzing the data with the feature mapping and precursor quantification nodes in Proteome Discoverer 2.4, which determined the retention time, charge state, and abundance of every peak in each searched file. The data were filtered and exported to Microsoft Excel to search for peaks that 1) were not present in unlabeled samples and 2) possessed two peaks with the same charge state at a similar retention time (+/- 0.5 minute) and were mass-shifted [...] 15N and 13C labeled GMP) to H, S, T, Y, K or R, based on known phosphodiester or phosphoramide attachment chemistry [3] [4] [5], although all possible sites were considered in initial, preliminary searches. For the SARS-CoV-2 GMP localization on nsp7, initial searches assigned the modification site to the N-terminal glycine at peptide position 1, but the modification site was manually determined to be located at ser-2 (ser-1 in natural nsp7) because of the presence of the y13+ ion (ser-2) at an m/z of 1798.78857 that contained the GMP moiety. For each unique peptide m/z, the top-scoring peptide spectrum matches for both EThcD and HCD are presented in Supplementary Table 4 along with GMP-attachment site assignments and additional information. The top-scoring, overall match for each peptide is presented in Table 1 along with the corresponding attachment site assignment.
All matches had a precursor mass error of +/- 1.5 ppm or less and were validated with a target-decoy search using a cut-off false discovery rate value of 1%. Only peptides with a top spectrum match Xcorrelation score of 2.5 or greater were reported. Extracted ion chromatograms and spectra were generated using Freestyle software (Thermo Scientific). Note that for EAV reactions, not all GMP-modified candidate peaks were examined in MS2, because all EAV data were acquired prior to data analysis and MS2 was performed in a data-dependent acquisition scheme. Though a single site was found in SARS-CoV-2, additional phosphoramidate-linked GMP sites likely exist that may not have been observed because of the lability of the GMP attachment in the acidic conditions used for LC-MS/MS (Extended Data Figure 1).

The raw mass spectrometry datasets and Proteome Discoverer results that were generated and [...]

Structure of the RNA-dependent RNA polymerase from COVID-19 virus
Structure of the SARS-CoV nsp12 polymerase bound to nsp7 and nsp8 co-factors
Discovery of an essential nucleotidylating activity associated with a newly delineated conserved domain in the RNA polymerase-containing protein of all nidoviruses
HIV-1 antiretroviral drug therapy.
Cold Spring Harb Perspect Med 2, a007161
The complex challenges of HIV vaccine development require renewed and expanded global commitment
Oral Direct-Acting Agent Therapy for Hepatitis C Virus Infection: A Systematic Review
The case for a universal hepatitis C vaccine to achieve hepatitis C elimination
Remdesivir is a direct-acting antiviral that inhibits RNA-dependent RNA polymerase from severe acute respiratory syndrome coronavirus 2 with high potency
One severe acute respiratory syndrome coronavirus protein complex integrates processive RNA polymerase and exonuclease activities
Structural Analysis of Porcine Reproductive and Respiratory Syndrome Virus Nonstructural Protein 7alpha (NSP7alpha) and Identification of Its Interaction with NSP9
Structure and genetic analysis of the arterivirus nonstructural protein 7alpha
Mechanism of the mRNA guanylyltransferase reaction: isolation of N epsilon-phospholysine and GMP (5' leads to N epsilon) lysine from the guanylyl-enzyme intermediate
Chemical properties and separation of phosphoamino acids by thin-layer chromatography and/or electrophoresis
Remdesivir and SARS-CoV-2: Structural requirements at both nsp12 RdRp and nsp14 Exonuclease active-sites
The Role of Electron Transfer Dissociation in Modern Proteomics
Electron Transfer/Higher Energy Collisional Dissociation of Doubly Charged Peptide Ions: Identification of Labile Protein Phosphorylations
Quantitative phosphoproteomics reveals the role of protein arginine phosphorylation in the bacterial stress response

SARS-CoV-2 nsp12, nsp8 and nsp7 nucleotidylation reactions: 1) No nucleotide [...]
Verify candidate GMP-modified peptides meet criteria (see "*_B" raw files and also Fig. 2a,b).
[...] 2 mM) Digest with Trypsin/LysC. MS analysis: 2 replicate injections per sample. Resulting MS files: EAV_Tryp_Nonucleotide_A.raw, EAV_Tryp_Nonucleotide_B.raw, EAV_Tryp_GTP_A.raw, EAV_Tryp_GTP_B.raw, EAV_Tryp_15NGTP_A.raw, EAV_Tryp_15NGTP_B.raw, EAV_Tryp_13CGTP_A.raw, EAV_Tryp_13CGTP_B.raw

Supplementary Information Guide:
Supplementary Table 1: Data analysis of LC-MS peaks for a single SARS-CoV-2 chymotrypsin-digested sample that contained peptides labeled with GMP, 15N-GMP and 13C-GMP. Two additional tabs show search results and filtered results.
Supplementary Table 2: Data analysis of LC-MS peaks for a single SARS-CoV-2 GluC-digested sample that contained peptides labeled with GMP, 15N-GMP and 13C-GMP. Two additional tabs show search results and filtered results.
Supplementary Table 3: Data analysis of LC-MS peaks for EAV trypsin-digested samples that were not labeled with GMP or were labeled with either GMP, 15N-GMP or 13C-GMP. Peaks unique to two labeled, replicate injections (GMP-, 15N-GMP- or 13C-GMP-labeled) were exported and searched against all others.
Supplementary Table 4: Summary of all MS/MS peptide spectrum matches containing a GMP. Top scoring hits for each peptide mass (m/z) and for each fragmentation method (HCD or EThcD) are shown along with peptide sequence, Xcorrelation score, precursor mass error, raw data file name, scan number, and other relevant information.

Funding to support this work was provided to R.N.K. from NIH/NIAID (Grant No. AI123498) and M.R.S. from NSF (Grant No. MCB-1713899). The authors declare no competing interests. Supplementary Information is available for this paper. Correspondence and requests for materials should be addressed to Michael R. Sussman, Center for

Extended Data Figure 1. Stability of GMP covalent attachment on SARS-CoV-2 nsp7 and nsp8.
SARS-CoV-2 proteins nsp7 and nsp8, radiolabeled with 32P-GTP, were incubated in the indicated conditions and were analyzed by SDS-PAGE. Autoradiography was used to visualize radioactive proteins. SDS was necessary in the heated sample to prevent aggregation. Abbreviations: FA, formic acid; TEA, triethylamine; RT, room temperature.

Extended Data Figure 6. HCD spectrum for GMP-modified SARS-CoV-2 nsp7 peptide 1-14. The top scoring HCD spectrum match for GMP-modified nsp7 peptide is shown for direct comparison to the EThcD spectrum in Fig. 2c. Note that the modification site was assigned to K3 by Sequest HT for this spectrum, instead of S2. Guanine and fragments that lost the GMP modification were manually labeled after inspection.