key: cord-0720670-m4dlisqm authors: Korn, Sophie M.; Dhamotharan, Karthikeyan; Fürtig, Boris; Hengesbach, Martin; Löhr, Frank; Qureshi, Nusrat S.; Richter, Christian; Saxena, Krishna; Schwalbe, Harald; Tants, Jan-Niklas; Weigand, Julia E.; Wöhnert, Jens; Schlundt, Andreas title: (1)H, (13)C, and (15)N backbone chemical shift assignments of the nucleic acid-binding domain of SARS-CoV-2 non-structural protein 3e date: 2020-08-08 journal: Biomol NMR Assign DOI: 10.1007/s12104-020-09971-6 sha: 647f671afa4b3a3a1c690246740e85fd66009992 doc_id: 720670 cord_uid: m4dlisqm

The ongoing pandemic caused by the Betacoronavirus SARS-CoV-2 (Severe Acute Respiratory Syndrome Coronavirus-2) demonstrates the urgent need for coordinated and rapid research towards inhibitors of the COVID-19 lung disease. The covid19-nmr consortium seeks to support drug development by providing publicly accessible NMR data on the viral RNA elements and proteins. The SARS-CoV-2 genome encodes approximately 30 proteins, among them the 16 so-called non-structural proteins (Nsps) of the replication/transcription complex. The 217-kDa Nsp3 is a single polypeptide chain, but comprises multiple independent, yet functionally related domains, including the viral papain-like protease. The Nsp3e sub-moiety contains a putative nucleic acid-binding domain (NAB) whose function and consensus target sequences are so far unknown; its interactions are thought to involve both viral and host RNAs and DNAs, as well as protein-protein contacts. Its NMR-suitable size makes it an attractive object of study, both for understanding the SARS-CoV-2 architecture and for assessing druggability beyond the classical viral proteases. We here report the near-complete NMR backbone chemical shifts of the putative Nsp3e NAB, which reveal the secondary structure and compactness of the domain and provide a basis for NMR-based investigations towards understanding and interfering with RNA- and small-molecule binding by Nsp3e.
SARS-CoV-2, the cause of the early 2020 pandemic accompanied by the respiratory disease called COVID-19, is the latest representative of the Coronaviridae family, which also comprises the 2002 first-generation SARS-CoV and the Middle East Respiratory Syndrome (MERS)-CoV. The rapid spread of the virus, based on its unexpectedly high infectivity, demands rapid action towards both the development of a vaccine and potent viral inhibitors to weaken or eliminate symptoms that are a major threat to life, especially for older generations worldwide. The almost 30-kb positive-sense single-stranded RNA of the enveloped virus SARS-CoV-2 represents one of the largest known viral genomes. It contains 14 possible open reading frames (ORFs) that encode up to 30 transcripts, the majority of which have been confirmed at the protein level (Gordon et al. 2020). Among the highly conserved proteins of Betacoronaviruses (Yoshimoto 2020), the ORF1a/b-encoded non-structural proteins (Nsp) 1-16 assemble the replication/transcription complex, which comprises an incompletely understood network of viral-viral and host-viral protein-protein and RNA-protein interactions. Besides the structural Spike protein, which is important for viral entry, a set of non-structural proteins represents the canonical protein drug targets, among them the two proteases Nsp5 (Mpro) and Nsp3d (PLpro), the Nsp3b ADP-ribose-phosphatase macrodomain, and the Nsp7/8/12 RNA-dependent RNA polymerase complex. Nsp3, the largest Nsp (Snijder et al. 2003), is one of the most enigmatic coronavirus proteins, as it is composed of a plethora of functionally related, yet independent subunits. After cleavage from the full-length ORF1-encoded polypeptide chain, Nsp3 is a 1945-residue multi-domain protein whose individual functional entities are subclassified from Nsp3a to Nsp3e, followed by the ectodomain embedded between two transmembrane regions and the very C-terminal CoV-Y domain.
Nsp3e is unique to Betacoronaviruses and consists of a nucleic acid-binding domain (NAB) and the so-called group 2-specific marker (G2M) (Neuman et al. 2008). Structural information is scarce: while the G2M is predicted to be intrinsically disordered (Lei et al. 2018), the only available experimental structure of an Nsp3e NAB is that of SARS-CoV, solved by the Wüthrich lab using solution NMR (Serrano et al. 2009). The SARS-CoV Nsp3e NAB was shown to bind G-rich ssRNA and to possess DNA-unwinding capability (Neuman et al. 2008), while its precise function and well-defined consensus target sequences have remained unknown. Given its exclusive occurrence in Betacoronaviruses, Nsp3e represents a potential drug target for both the current and potential future Betacoronavirus epidemic waves. The research consortium covid19-nmr, founded in 2020, seeks to rapidly and publicly support the search for antiviral drugs using an NMR-based screening approach. This requires the broad production of all druggable proteins and RNAs, an assignment of their NMR resonances that is as comprehensive as possible, and eventually the determination of structures to be used in rational drug design. We here provide the near-complete backbone assignment of the SARS-CoV-2 Nsp3e NAB and thereby enable its use in follow-up applications, such as residue-resolved drug screening and interaction mapping.

This study uses the SARS-CoV-2 NCBI reference genome entry NC_045512.2, identical to GenBank entry MN908947.3 (Wu et al. 2020). The definition of domain boundaries for the Nsp3e NAB was guided by the available NMR structure (PDB 2K87) of its closest homologue, i.e. Nsp3e from the 2002 first-generation SARS-CoV (Serrano et al. 2009), which shares 82% sequence identity.
Based on the sequence alignment of the entire SARS-CoV-2 Nsp3e with SARS-CoV Nsp3e and consideration of the flexible overhangs observed in the structure, we defined our expression construct to span amino acids 1088-1203 of the overall Nsp3 primary sequence. A codon-optimized expression construct of the SARS-CoV-2 Nsp3e NAB was obtained from GenScript Biotech (Netherlands) and inserted into the pET3b-based vector pKM263, which contains an N-terminal His6-tag, a GST-tag and a tobacco etch virus (TEV) cleavage site. Due to the nature of the TEV cleavage site, the produced protein contained four artificial N-terminal residues (Gly-3, Ala-2, Met-1 and Gly0) after cleavage, before the original protein sequence starts with Tyr1, corresponding to Tyr1088 in the full-length Nsp3 sequence. Uniformly 13C,15N-labelled Nsp3e NAB protein was expressed in E. coli strain BL21 (DE3) in M9 minimal medium containing 1 g/L 15NH4Cl (Cambridge Isotope Laboratories), 2 g/L 13C6-D-glucose (Eurisotop) and 100 µg/mL ampicillin. Protein expression was induced at an OD600 of 0.7 with 1 mM isopropyl-beta-thiogalactopyranoside for 18 h at room temperature. The cell pellet was resuspended in 50 mM sodium phosphate, 300 mM sodium chloride, 10 mM imidazole, 2 mM Tris-(2-carboxyethyl)-phosphine (TCEP), pH 6.5, supplemented with 100 µL protease inhibitor mix (SERVA) per 1 L of culture. Cells were disrupted by sonication, and the lysate was cleared by centrifugation (20 min, 7000×g, 4 °C). The cleared supernatant was passed over a Ni2+-NTA gravity flow column (Sigma-Aldrich), and the His6-GST-tag was cleaved overnight at 4 °C with 0.5 mg of TEV protease per 1 L of culture while dialyzing into size exclusion buffer (25 mM sodium phosphate, 150 mM sodium chloride, 2 mM DTT, 0.02% NaN3, pH 7.0). TEV protease and the cleaved tag were removed via a second Ni2+-NTA gravity flow column, and Nsp3e was further purified via size exclusion on a HiLoad 16/600 SD 75 (GE Healthcare).
Fractions containing pure Nsp3e were determined by SDS-PAGE, pooled and concentrated using Amicon centrifugal concentrators (molecular weight cutoff 3 kDa). NMR samples were prepared in 25 mM sodium phosphate pH 7.0, 150 mM sodium chloride, 2 mM TCEP, 0.02% NaN3, 10% (v/v) D2O, and 300 µM 4,4-dimethyl-4-silapentane-1-sulfonic acid (DSS) as internal chemical shift standard, at Nsp3e concentrations of 0.6 to 1.1 mM.

(Pervushin et al. 1997; Salzmann et al. 1998) and used sensitivity-enhanced gradient echo/antiecho coherence selection (Czisch and Boelens 1998; Schulte-Herbrüggen and Sorensen 2000). Acceleration of longitudinal 1H relaxation between scans was achieved in the Band-Selective Excitation Short-Transient (BEST) manner (Lescop et al. 2007; Schanda et al. 2006; Solyom et al. 2013) using exclusively shaped proton pulses with bandwidths/offsets of 4.8/8.3 ppm and the inter-scan delay set to 0.3 or 0.4 s. A 3D NOESY-[15N,1H]-HSQC (Marion et al. 1989; Zuiderweg and Fesik 1989) with water suppression using a WATERGATE sequence (Piotto et al. 1992) was recorded to complete backbone assignments and also provided Gln/Asn NH2 group assignments. Sequence-specific assignments of tryptophan side chain 1Hε1/15Nε1 resonances were obtained with a [15N,1H]-BEST-TROSY version of the HN(CDCG)CB experiment (Lohr and Ruterjans 2002) with proton pulses centered at 10 ppm and covering a bandwidth of 4 ppm. A slowly exchanging histidine imidazole 1H Nε2 resonance was assigned using a 2D BEST-TROSY-H(NCDCG)CB version with the magnetization transfer pathway adapted to histidine side chains and proton pulses centered at 12 ppm (Andersson et al. 1998). The 15N heteronuclear NOE experiment was performed as an interleaved pseudo-3D TROSY version (Lakomek et al. 2012) using 256 indirect complex points.
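As an illustration of how per-residue hetNOE values are commonly derived from such an interleaved experiment, the following minimal sketch computes the ratio of peak intensities with and without proton saturation and propagates an uncertainty from the spectral noise. The function name `het_noe`, its arguments and the first-order error propagation are assumptions made for illustration; they are not taken from the original analysis, which was carried out in CCPNMR Analysis.

```python
import math


def het_noe(i_sat, i_ref, noise_sat, noise_ref):
    """Return (hetNOE, uncertainty) for one amide peak.

    i_sat / i_ref: peak intensities with and without 1H saturation.
    noise_sat / noise_ref: noise levels of the two sub-spectra.
    Hypothetical helper; the error model is simple first-order propagation
    and is an assumption, not the authors' documented procedure.
    """
    if i_ref == 0:
        raise ValueError("reference intensity must be non-zero")
    ratio = i_sat / i_ref
    # First-order propagation of uncorrelated noise through a quotient.
    sigma = math.sqrt((noise_sat / i_ref) ** 2
                      + (i_sat * noise_ref / i_ref ** 2) ** 2)
    return ratio, sigma


if __name__ == "__main__":
    value, error = het_noe(i_sat=80.0, i_ref=100.0, noise_sat=2.0, noise_ref=2.0)
    print(f"hetNOE = {value:.2f} +/- {error:.3f}")
```

With this error model, weak peaks yield large relative errors, so such residues would typically be excluded from interpretation.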
All NMR experiments were carried out at a sample temperature of 25 °C using Bruker Avance spectrometers of 600, 700 and 950 MHz proton Larmor frequency, equipped with cryogenic z-axis gradient probes. Data acquisition and processing were undertaken using Topspin versions 3 and 4. Cosine-squared window functions were applied for apodization in all dimensions. Spectra were referenced with respect to internal DSS, and for 13C and 15N as described in (Wishart et al. 1995). All assignments of the Nsp3e NAB were performed using the CCPNMR analysis 2.4 software suite (Vranken et al. 2005) and the program Sparky (Lee et al. 2015).

The 1H,15N-HSQC of the Nsp3e NAB (Fig. 1) shows excellent peak dispersion. Of note, we obtained an even better resolved amide correlation spectrum at 950 MHz proton frequency; however, some resonances, e.g. Phe23, were exchange-broadened at this field and only visible at lower field strength. For convenience, residues were numbered starting with 1 on Tyr1088. The overall high quality of all spectra allowed the assignment of > 98% of all backbone amides within the natural sequence (Tyr1-Thr116, corresponding to Tyr1088-Thr1203), all Trp and Gln side-chain amides, and 3 out of 10 Asn side-chain amides (17, 90, 101). The assignments are in good agreement with the previously published assignments of the 2002 SARS-CoV Nsp3e NAB (residues 1066-1181) (Serrano et al. 2008), which reflects the high sequence similarity (Yoshimoto 2020). Only two residues of the natural sequence (Asn22 and Ser73, both likely in flexible loop regions) could not be assigned in their backbone amides due to line broadening beyond detectability, which notably had also been observed for the SARS-CoV Nsp3e (Serrano et al. 2008). For amino acids Glu5, Ile7, Asn8, Asp57, Leu58, and Val114 we observed a second, minor conformation, which we attribute to the preceding prolines populating both cis and trans isomers.
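Because residues are numbered both within the construct (Tyr1-Thr116) and within full-length Nsp3 (Tyr1088-Thr1203), a small helper can make the conversion explicit. The sketch below is illustrative only: the names `to_full_length`, `to_construct` and `TAG_RESIDUES` are hypothetical, while the offset and the four artificial N-terminal residues follow the numbering stated in the text.

```python
# Tyr1 of the construct corresponds to Tyr1088 of full-length Nsp3.
OFFSET = 1087
FIRST_NATIVE, LAST_NATIVE = 1, 116

# Artificial residues left after TEV cleavage; they have no Nsp3 counterpart.
TAG_RESIDUES = {-3: "Gly", -2: "Ala", -1: "Met", 0: "Gly"}


def to_full_length(construct_number):
    """Map construct numbering (1-116) to full-length Nsp3 numbering."""
    if not FIRST_NATIVE <= construct_number <= LAST_NATIVE:
        raise ValueError(f"residue {construct_number} is not part of the native sequence")
    return construct_number + OFFSET


def to_construct(full_length_number):
    """Map full-length Nsp3 numbering (1088-1203) to construct numbering."""
    construct_number = full_length_number - OFFSET
    if not FIRST_NATIVE <= construct_number <= LAST_NATIVE:
        raise ValueError(f"residue {full_length_number} lies outside the construct")
    return construct_number


if __name__ == "__main__":
    print(to_full_length(1), to_full_length(116))  # boundaries of the construct
    print(to_construct(1110))                      # construct number of Nsp3 residue 1110
```

Raising an error for the tag residues, rather than returning a number, avoids silently assigning them positions in the viral sequence.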
To assess the overall compactness of the NAB and its internal dynamics, we recorded hetNOE data (Fig. 2a) as a function of the primary sequence. For residues 8-109, hetNOE values of 0.65 or higher were measured, indicating an overall rigid structure of the protein. No regions of increased flexibility were observed except for the two termini (residues 1-7 and 110-116). We also calculated carbon secondary chemical shifts based on the chemical shifts of Cα and Cβ (Fig. 2b bottom) relative to random coil values, essentially as described by (Wishart and Sykes 1994). Four consecutive residues with significant negative or positive shifts were used to define either β-strands or α-helices, respectively. Our data suggest a ββαββαββα-fold, which is in agreement with the structure of its homologue from SARS-CoV (Fig. 2b top). While all secondary structure elements align well between the two homologues, helix-2, according to our data, is shorter and directly connects to β-strand 4. The very terminal residues do not display secondary structure content, which is in line with the increased flexibility observed in the hetNOE experiment. Our data thus suggest that the NAB of SARS-CoV-2 Nsp3e adopts a structure similar to that of the SARS-CoV Nsp3e (Serrano et al. 2009). Our NMR resonance assignments and the spectral quality clearly support the druggability of the Nsp3e NAB and will now pave the way towards a solution structure, RNA- and protein-interaction studies, and residue-resolved high-throughput drug screening as a crucial contribution to the initiative of screening all potential SARS-CoV-2 proteins.

(Vranken et al. 2005) based on the respective signal-to-noise of spectra. No values are shown for Asn22 and Ser73 (missing assignments) and Phe23, His82 and Lys95 due to large relative errors based on the overall low peak intensities of these amides. Additional gaps derive from prolines.
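The secondary-structure rule described in the text (four consecutive residues with significant positive or negative secondary shifts) can be sketched as follows. This is a minimal illustration under stated assumptions: the combined value ΔδCα - ΔδCβ, the 1.0 ppm significance threshold, and the function names are not given in the source and are chosen here only for demonstration.

```python
from typing import List, Optional


def combined_scs(ca_obs, cb_obs, ca_rc, cb_rc):
    """Combined secondary shift: (Ca_obs - Ca_rc) - (Cb_obs - Cb_rc), in ppm.

    Using this difference is an assumed convention; rc = random coil value.
    """
    return (ca_obs - ca_rc) - (cb_obs - cb_rc)


def assign_secondary_structure(scs: List[Optional[float]],
                               threshold: float = 1.0,
                               min_run: int = 4) -> str:
    """Return one character per residue: 'H' (helix), 'E' (strand) or '-'.

    A residue counts as significant if |SCS| >= threshold (assumed value).
    Only runs of at least `min_run` consecutive residues of the same sign
    are kept. None marks residues without a usable value.
    """
    signs = []
    for value in scs:
        if value is None:
            signs.append("-")
        elif value >= threshold:
            signs.append("H")
        elif value <= -threshold:
            signs.append("E")
        else:
            signs.append("-")

    result = ["-"] * len(signs)
    start = 0
    while start < len(signs):
        end = start
        while end < len(signs) and signs[end] == signs[start]:
            end += 1
        if signs[start] != "-" and end - start >= min_run:
            result[start:end] = [signs[start]] * (end - start)
        start = end
    return "".join(result)


if __name__ == "__main__":
    example = [0.2, 2.0, 2.5, 1.8, 3.0, 0.1, -1.5, -2.0, -1.2, None,
               -2.0, -2.2, -1.9, -3.0]
    print(assign_secondary_structure(example))
```

In the example, the run of three negative values is discarded because it is shorter than four residues, and the missing value interrupts what would otherwise be a longer strand.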
b SCS are interpreted towards their underlying secondary structure as shown above the panel (experimental) and compared to the SARS-CoV Nsp3e homologue structure (Serrano et al. 2008, 2009) from PDB entry 2K87. α-helices are shown as red bars and β-strands as blue arrows. Light colors indicate the presence of elements with imperfect geometry in the structure or merely tentative secondary chemical shifts.

The assignments have been deposited at the BioMagResBank (https://www.bmrb.wisc.edu) under accession number 50334. Spectral raw data (upon request) and assignments are also accessible through https://covid19-nmr.de.

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
An alpha/beta-HSQC-alpha/beta experiment for spin-state selective editing of IS cross peaks
Sensitivity enhancement in the TROSY experiment
A SARS-CoV-2 protein interaction map reveals targets for drug repurposing
Measurement of 15N relaxation rates in perdeuterated proteins by TROSY-based methods
NMRFAM-SPARKY: enhanced software for biomolecular NMR spectroscopy
Nsp3 of coronaviruses: structures and functions of a large multi-domain protein
A set of BEST triple-resonance experiments for time-optimized protein resonance assignment
Correlation of backbone amide and sidechain (13)C resonances in perdeuterated proteins
Overcoming the overlap problem in the assignment of 1H NMR spectra of larger proteins by use of three-dimensional heteronuclear 1H-15N Hartmann-Hahn-multiple quantum coherence and nuclear Overhauser-multiple quantum coherence spectroscopy: application to interleukin 1 beta
Proteomics analysis unravels the functional repertoire of coronavirus nonstructural protein 3
Attenuated T2 relaxation by mutual cancellation of dipole-dipole coupling and chemical shift anisotropy indicates an avenue to NMR structures of very large biological macromolecules in solution
Gradient-tailored excitation for single-quantum NMR spectroscopy of aqueous solutions
TROSY in triple-resonance experiments: new perspectives for sequential NMR assignment of large proteins
Speeding up three-dimensional protein NMR experiments to a few minutes
Clean TROSY: compensation for relaxation-induced artifacts
NMR assignment of the nonstructural protein nsp3(1066-1181) from SARS-CoV
Nuclear magnetic resonance structure of the nucleic acid-binding domain of severe acute respiratory syndrome coronavirus nonstructural protein 3
Unique and conserved features of genome and proteome of SARS-coronavirus, an early split-off from the coronavirus group 2 lineage
BEST-TROSY experiments for time-efficient sequential resonance assignment of large disordered proteins
The CCPN data model for NMR spectroscopy: development of a software pipeline
The 13C chemical-shift index: a simple method for the identification of protein secondary structure using 13C chemical-shift data
1H, 13C and 15N chemical shift referencing in biomolecular NMR
A new coronavirus associated with human respiratory disease in China
The proteins of severe acute respiratory syndrome coronavirus-2 (SARS CoV-2 or n-COV19), the cause of COVID-19
Heteronuclear three-dimensional NMR spectroscopy of the inflammatory protein C5a

Acknowledgements Open Access funding provided by Projekt DEAL. We thank Katharina Targaczewski and Sabrina Töws for excellent technical support in the wet lab work. The Frankfurt BMRZ (Center for Biomolecular Resonance) is supported by the Federal state of Hesse.

Funding This work was funded by the Deutsche Forschungsgemeinschaft through grant numbers SFB902/B18 (to covid19-nmr), SCHL2062/2-1 (to A.S.), and by the Johanna Quandt Young Academy at Goethe (Grant Number 2019/AS01 to A.S.).