key: cord-0300687-hnk3voby
authors: Sobeh, Amr M.; Eichhorn, Catherine D.
title: C-terminal determinants for RNA binding motif 7 protein stability and RNA recognition
date: 2022-05-13
journal: bioRxiv
DOI: 10.1101/2022.05.13.491737
sha: c70d0520f4143f2caf622058d6a36bc0414b0fcd
doc_id: 300687
cord_uid: hnk3voby

The 7SK ribonucleoprotein (RNP) is a critical regulator of eukaryotic transcription. Recently, RNA binding motif 7 (RBM7), which contains an RNA recognition motif (RRM), was reported to associate with 7SK RNA and core 7SK RNP protein components in response to DNA damage. However, little is known about the mode of RBM7-7SK RNA recognition. Here, we recombinantly expressed and purified RBM7 RRM constructs and found that constructs containing extended C-termini have increased solubility and stability compared to shorter constructs. To identify potential RBM7-7SK RNA binding sites, we analyzed deposited data from in cellulo crosslinking experiments and found that RBM7 crosslinks specifically to the distal region of 7SK stem-loop 3 (SL3). Electrophoretic mobility shift assays and NMR chemical shift perturbation experiments showed weak binding to 7SK SL3 constructs in vitro. Together, these results provide new insights into RBM7 RRM folding and recognition of 7SK RNA.

Highlights
Extending the RBM7 RRM C-terminus improves protein expression and solubility
The RBM7 RRM interacts weakly with 7SK stem-loop 3 single-stranded RNA
Solution state NMR dynamics measurements support presence of additional β4' strand

[...] cells showed reduced viability after UV irradiation, with impaired activation of UV-response genes requiring 7SK RNP-P-TEFb to rescue activation [43]. In HEK293 and HeLa cells, exposure to a chemical mimetic of UV-induced DNA damage (4-nitroquinoline 1-oxide, 4-NQO) similarly resulted in decreased viability [44, 45] [...] is an alternate regulatory mechanism independent of function in NEXT.
When phosphorylated, RBM7 has globally reduced interaction with RNA and reduced NEXT function [55]. Importantly, 7SK RNA abundance is not reduced after 4-NQO exposure, indicating that RBM7 is functioning independently of its role in NEXT [34].

Together, these data suggest that RBM7 plays an essential role in the release of P-TEFb from the 7SK RNP complex to respond to genotoxic stress [34]. However, little is known regarding how RBM7 assembles with 7SK RNA and core 7SK RNP protein components. RBM7 contains an N-terminal canonical RNA recognition motif (RRM) with conserved RNP1 and RNP2 sequences [56, 57] (Fig. 1A-B) and a C-terminal serine-rich domain that is phosphorylated by p38 MAPK [34, 46, 56, 57]. In vitro, the RBM7 RRM preferentially interacts with polypyrimidine repeat sequences with micromolar binding affinity [56]. Four high-resolution structures of the RBM7 RRM have been determined by solution NMR spectroscopy or X-ray crystallography, with varying N- and C-terminal ends and observed structural features [56, 58, 59]. Here, we recombinantly expressed and purified RBM7 RRM constructs with various N-terminal fusion tags and C-terminal ends. Using solution NMR spectroscopy, we found that constructs containing [...] shift assays and NMR chemical shift perturbation experiments. We found that RBM7 crosslinks specifically to the distal region of 7SK stem-loop 3 (SL3) and binds weakly to 7SK SL3 constructs. Together, these data provide new insights into RBM7 RRM folding and recognition of the 7SK RNA.

Recombinant protein construct design, expression, and purification

The DNA sequence encoding human RBM7 was codon-optimized for E. coli and purchased as a gBlock (Integrated DNA Technologies).
The RBM7 gene was cloned into a pET vector containing N-terminal His6, tobacco etch virus (TEV) protease recognition site, and maltose binding protein (MBP) domains with a kanamycin resistance gene (Addgene plasmid #29656) using in vivo assembly [60]. Briefly, Benchling was used to design overlapping primers complementary to the 5' and 3' ends of both the gene insert and vector insertion sites. PCR was performed to produce gene insert and vector with complementary 5' and 3' ends. PCR products were purified using a PCR cleanup kit (Zymo Research) and directly transformed into DH5α competent cells. Individual colonies were selected and cultured in LB media with kanamycin, followed by plasmid purification using a plasmid miniprep kit (Qiagen). All constructs derived from this plasmid were generated using site-directed mutagenesis, with the exception of the GB1 domain, which was inserted using in vivo assembly. Sequences were verified using Eurofins [...] recombinantly expressed and purified in house) for 2-3 hours at room temperature. After dialysis, cleaved RBM7 protein was purified from His6-MBP or His6-GB1 by a second Ni-NTA purification step (Qiagen). As a final purification step, proteins were purified by size exclusion chromatography using [...]

[...] single-stranded RNA (ssRNA) oligonucleotides were purchased from IDT, resuspended in ultrapure water, and diluted to 1 mM in ultrapure water. 7SK SL3 hairpin RNA constructs were produced using in vitro RNA transcription with T7 RNA polymerase (Addgene #124138 [63], prepared in-house) and chemically synthesized DNA templates (IDT) following established protocols [18].

[...] ~9.4) (Fig. 2A-B). The minimal HT-RBM7 86 construct had poor yield after purification (approximately 2 mg per L cell culture), with significant amounts of protein observed in the insoluble fraction after cell lysis (Fig. 2C).
HT-RBM7 86 exhibited poor solubility and readily precipitated throughout [...] or shifted relative to constructs with extended C-termini, indicative of global chemical exchange and conformational instability (Fig. 3A). Within 24 hours the sample began to precipitate, indicating poor solubility and stability of the minimal HT-RBM7 86 construct.

In contrast, the 2D amide 1H-15N HSQC spectrum of HT-RBM7 91 showed excellent dispersion and uniform peak intensity (Fig. 3B). The majority of RRM resonances were observed and could be assigned; however, the spectral quality was not stable over time and several resonances were no longer observed 4 days after sample preparation (Fig. 3B and Supplementary Figure S2A). These resonances correspond to the same resonances missing in HT-RBM7 86, suggesting that while HT-RBM7 91 is more stable, this construct experiences similar chemical exchange and instability. Similar to HT-RBM7 91, HT-RBM7 101 showed excellent chemical shift dispersion and uniform peak intensity (Fig. 3C). The 1H-15N HSQC spectra of HT-RBM7 91 and HT-RBM7 101 were nearly identical (Fig. 3B). We expected to observe [...]

Due to the observed interaction of the RBM7 RRM with solubility tags, we next evaluated the potential impact of the N-terminal His6 tag on RRM folding using solution NMR. The construct RBM7 101 was generated by using TEV protease to cleave the His6 tag (Fig. 3D). The 2D amide [...] to the RRM (Fig. 3D). Approximately fifteen resonances, attributed to the sixteen N-terminal residues cleaved from HT-RBM7 101, are absent in the RBM7 101 construct. In addition, chemical shift perturbations are observed for N-terminal residues adjacent to the TEV cleavage site (A4, A6, A8) (Fig. 3D).

To assess RBM7 RRM conformational dynamics for both HT-RBM7 101 and RBM7 101 constructs, we performed 1H-15N heteronuclear NOE experiments, which report on ps-ns motional timescales (Fig. 3E).
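
As a rough illustration of how per-residue heteronuclear NOE values of the kind shown in Fig. 3E are conventionally obtained: the steady-state 1H-15N NOE is commonly taken as the ratio of a residue's peak intensity recorded with proton saturation to its intensity in a reference spectrum without saturation. The sketch below assumes this standard ratio. The function name, residue labels, and intensities are hypothetical and are not taken from the study.

```python
def heteronuclear_noe(i_saturated, i_reference):
    """Return the steady-state NOE as I(saturated) / I(reference).

    Assumption: peak intensities come from two interleaved 1H-15N spectra,
    one with and one without proton saturation.
    """
    if i_reference == 0:
        raise ValueError("reference intensity must be non-zero")
    return i_saturated / i_reference


# Hypothetical peak intensities (arbitrary units): residue -> (saturated, reference)
peaks = {
    "res_A": (80.0, 100.0),   # hypothetical ordered residue
    "res_B": (30.0, 100.0),   # hypothetical flexible residue
    "res_C": (-10.0, 100.0),  # hypothetical highly flexible terminal residue
}

noe_values = {res: heteronuclear_noe(sat, ref) for res, (sat, ref) in peaks.items()}

for res, noe in noe_values.items():
    print(f"{res}: NOE = {noe:.2f}")
```

Higher ratios generally indicate restricted motion on the ps-ns timescale, while low or negative ratios indicate flexibility; the exact values expected for ordered residues depend on field strength and molecular tumbling.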
NOE values for several residues could not be determined due to lack of assignments, particularly for α2 (K58-V63) and the C-terminus (D94-Q101), severe line [...] to be α2-β4 loop residues in the NMR structure and β4'-β4 strands in X-ray structures, respectively (Supplemental Figure S3), have NOE values consistent with β-strand and α-helical residues in the RRM (Fig. 3E). Together, the heteronuclear NOE data show disordered N- and C-termini, a flexible β2-β3 loop, and the presence of a β4' strand and extended β4 strand in both HT-RBM7 101 and RBM7 101 constructs.

In the prior study determining the solution NMR structural ensemble, the buffer conditions differed in pH and ionic strength; in addition, we found that inclusion of L-arginine and L-glutamic acid improved construct solubility. Further, the construct for structure determination had shortened N- and C-termini (aa 6-94) compared to our construct (aa 1-101). Together, these differences may account for the observed structural differences in the α2-β4 region. To further compare these two constructs, we used the chemical shift information in the BMRB deposition as input for the Talos+ server [66] to compute predicted secondary structure and S2 order parameters for the construct used previously for structural studies (RBM7 2M8H) (Fig. 3F). The overall pattern of the predicted S2 values for RBM7 2M8H shows excellent agreement with our observed NOE values, particularly for proline-adjacent residues V36 and Q50 and C-terminal residues. The predicted S2 values are decreased for residues G72-Y76, and these residues are predicted to be unstructured, supporting the determined solution NMR structure model in which these residues are disordered.

[...] showed that the RBM7 RRM RNP1 and RNP2 sequences were required for 7SK RNP association in cellulo [34].
To gain additional insights into specific 7SK RNA residues that crosslinked to RBM7, we obtained iCLIP data deposited to EMBL and processed and analyzed the data. The per-residue 7SK RT stops were normalized to the total number of 7SK reads for both DMSO control and 4-NQO treated samples to measure crosslinking enrichment across the 7SK RNA sequence (Fig. 4A). Consistent with the previous study, RBM7 crosslinking to 7SK RNA was observed in both control and treated samples (Fig. 4A) [34], suggesting an interaction between RBM7 and the [...] observed adjacent to this site as well as G232-C235 (Fig. 4B).

To investigate RBM7 RRM binding to 7SK SL3 in vitro, we performed qualitative EMSAs using HT-RBM7 101 and various constructs of 7SK SL3 comprising either hairpin or ssRNA sequences (Fig. 4C-F). To ensure protein samples did not contain contaminating nucleic acids that co-purified from E. coli, a control lane containing only protein was included. Unfortunately, we observed sample precipitation shortly after addition of protein to RNA, with EMSAs showing only unbound RNA and not the corresponding protein-bound band. Binding was therefore inferred from the decreasing intensity of the unbound band with increasing protein concentration. We first used a hairpin construct comprising the upper stem of 7SK RNA SL3 (SL3 210-264) (Fig. 4B) and observed extremely weak binding requiring significant HT-RBM7 101 stoichiometric excess (Fig. 4C) [...] loop, and SL3 244-262 (Fig. 4D-E). Although these constructs also require excess protein to saturate [...]

As a component of NEXT, RBM7 targets RNA for processing and turnover [53, 54]. An alternate function for RBM7 has recently been identified, independent of NEXT, in which RBM7 assembles with 7SK RNP to promote P-TEFb release and subsequent transcription activation of genes in response to genotoxic stress [34]. RBM7 assembly with proteins is essential for its function in NEXT [50, 58].
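
The normalization described for the iCLIP analysis (Fig. 4A), in which per-residue 7SK RT stops are divided by the total number of 7SK reads in each sample, can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function name, the input format (a list of 7SK positions, one per RT stop), and the toy counts are all hypothetical.

```python
from collections import Counter


def normalized_rt_stops(stop_positions, total_7sk_reads):
    """Return {position: RT stops at that position / total 7SK reads}.

    Assumption: stop_positions holds one 7SK nucleotide position per
    RT-stop event, already mapped and filtered upstream.
    """
    if total_7sk_reads <= 0:
        raise ValueError("total_7sk_reads must be positive")
    counts = Counter(stop_positions)
    return {pos: n / total_7sk_reads for pos, n in sorted(counts.items())}


# Hypothetical toy data: RT-stop positions for a control and a treated sample
dmso_stops = [232, 232, 233, 240, 240, 240]
nqo_stops = [232, 233, 233, 233, 240, 240, 240, 240]

dmso_profile = normalized_rt_stops(dmso_stops, total_7sk_reads=len(dmso_stops))
nqo_profile = normalized_rt_stops(nqo_stops, total_7sk_reads=len(nqo_stops))

# Compare the normalized signal at each position between the two samples
for pos in sorted(set(dmso_profile) | set(nqo_profile)):
    print(pos, dmso_profile.get(pos, 0.0), nqo_profile.get(pos, 0.0))
```

Dividing by each sample's own 7SK read total puts the control and treated profiles on a common scale, so differences in sequencing depth do not appear as differences in crosslinking.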
Similarly, RBM7 interacts with both 7SK RNA and core 7SK RNP [...]

To investigate the C-terminal boundary requirements for RBM7 RRM folding and stability, we generated RRM constructs corresponding to the X-ray structure of the individual RRM (aa 1-86, PDB ID 5IQQ), the RRM-ZCCHC8 complex (aa 1-91, PDB IDs 5LXR and 5LXY), and an extended construct ending at residue 101. We found that while all constructs could be recombinantly expressed and purified, the minimal HT-RBM7 86 construct was the least stable, with the lowest yields, rapid precipitation, and deterioration of 1H-15N HSQC spectral quality within one day. Although HT-RBM7 91 showed improved solubility, this construct also showed similar deterioration [...]

[...] In addition, there is a discrepancy between the observed secondary structures in the solution NMR and X-ray structures, in which the X-ray structures contain an additional β4' strand between α2 and β4, as well as an extended β4 strand relative to the solution NMR structure (Supplemental Figure S3). X-ray structures report a lowest-energy conformation whereas NMR structures report a structural ensemble that is often more conformationally dynamic compared to a crystal structure. The α2-β4 region is part of the protein-protein interface (Supplemental Figures 3-4), and the β4' and extended β4 strands observed in the X-ray structures may be further stabilized by protein-protein interactions. Our heteronuclear NOE data show that in the HT-RBM7 101 construct these residues are ordered, consistent with β4' and extended β4 strand formation. However, the Talos+ order parameters and secondary structure predicted from the chemical shifts of RBM7 2M8H suggest an unstructured loop rather than a β strand, in support of the solution NMR structure previously determined. Our study uses an extended construct, with additional residues at both the N- and C-termini.
In addition, we use a different buffer with a reduced pH and the amino acids L-arginine and L-glutamic acid to improve solubility. These differences in construct boundaries and buffer may account for the observed differences in secondary structure.

Unusually, the beginning of the extended β4 strand contains a proline (P79). Prolines are generally disfavored in β-strands because they lack an amide proton that participates in hydrogen [...]

Promoter-proximal pausing of RNA polymerase II: emerging roles in metazoans
RNA polymerase II elongation control
Getting up to speed with transcription elongation by RNA polymerase II
Promoter-proximal pausing of RNA polymerase II: a nexus of gene regulation
Cracking the control of RNA polymerase II elongation by 7SK snRNP and P-TEFb
Transcription elongation control by the 7SK snRNP complex: Releasing the pause
Domains in the SPT5 protein that modulate its transcriptional regulatory properties
P-TEFb-the final frontier
Control of RNA polymerase II elongation potential by a novel carboxyl-terminal domain kinase
The 7SK small nuclear RNA inhibits the CDK9/cyclin T1 kinase to control transcription
7SK small nuclear RNA binds to and inhibits the activity of CDK9/cyclin T complexes
Evolutionary conservation of the human 7 S RNA sequences
Keeping the balance: The noncoding RNA 7SK as a master regulator for neuron development and function
Release of positive transcription elongation factor b (P-TEFb) from 7SK small nuclear ribonucleoprotein (snRNP) activates hexamethylene bisacetamide-inducible protein (HEXIM1) transcription
RNA-based affinity purification reveals 7SK RNPs with distinct composition and regulation
CTIP2 is a negative regulator of P-TEFb
7SK snRNA-mediated, gene-specific cooperativity of HMGA1 and P-TEFb
HMGA1 directly interacts with TAR to modulate basal and Tat-dependent HIV transcription
KAP1 Recruitment of the 7SK snRNP Complex to Promoters Enables Transcription Elongation by RNA Polymerase II
Hexim1 To Block P-TEFb Assembly into the 7SK snRNP and Sustain Transcription Elongation
RNA helicase DDX21 coordinates transcription and ribosomal RNA processing
P-TEFb Activation by RBM7 Shapes a Pro-survival Transcriptional Response to Genotoxic Stress
The bromodomain protein Brd4 is a positive regulatory component of P-TEFb and stimulates RNA polymerase II-dependent transcription
The Yin and Yang of P-TEFb regulation: implications for human immunodeficiency virus gene expression and global control of cell growth and differentiation
CDK9: a signaling hub for transcriptional control
Kick-sTARting HIV-1 transcription elongation by 7SK snRNP deporTATion
The Global Phosphorylation Landscape of SARS-CoV-2 Infection
SR proteins collaborate with 7SK and promoter-associated nascent RNA to release paused polymerase
Genome-wide analysis of KAP1, the 7SK snRNP complex, and RNA polymerase II
Global unleashing of transcription elongation waves in response to genotoxic stress restricts somatic mutation rate
The 7SK/P-TEFb snRNP controls ultraviolet radiation-induced transcriptional reprogramming
A quantitative p38-MK2 signaling axis regulates RNA metabolism after UV-light-induced DNA damage
p53-deficient cells rely on ATM- and ATR-mediated checkpoint signaling through the p38MAPK/MK2 pathway for survival after DNA damage
Highly interacting regions of the human genome are enriched with enhancers and bound by DNA repair proteins
p53 regulates enhancer accessibility and activity in response to DNA damage
SARS-CoV-2 hijacks p38β/MAPK11 to promote viral protein translation, bioRxiv
Zcchc8 is a glycogen synthase kinase-3 substrate that interacts with RNA-binding proteins
Interaction profiling identifies the human nuclear exosome targeting complex
The human nuclear exosome targeting complex is loaded onto newly synthesized RNA to direct early ribonucleolysis
The regulation and functions of the nuclear RNA exosome complex
p38MAPK/MK2-mediated phosphorylation of RBM7 regulates the human nuclear exosome targeting complex
RBM7 subunit of the NEXT complex binds U-rich sequences and targets 3'-end extended forms of snRNAs
The RNA recognition motif, a plastic RNA-binding platform to regulate post-transcriptional gene expression
Structure of the RBM7-ZCCHC8 core of the NEXT complex reveals connections to splicing factors
RRM domain of human RBM7: purification, crystallization and structure determination
In vivo DNA assembly using common laboratory bacteria: A re-emerging tool to simplify molecular cloning
Removal of Affinity Tags
Swiss Bioinformatics Resource Portal, as designed by its users
Cell-free translation reconstituted with purified components
Isolation of DNA Fragments from Polyacrylamide Gels by the Crush and Soak Method
BioMagResBank (BMRB) as a Resource for Structural Biology
TALOS+: a hybrid method for predicting protein backbone torsion angles from NMR chemical shifts
A heteronuclear correlation experiment for simultaneous determination of 15N longitudinal decay and chemical exchange rates of systems in slow equilibrium
Spectroscopy Shows Four Dynamic Domains for Phospholamban
NMRPipe: a multidimensional spectral processing system based on UNIX pipes
NMRFAM-SPARKY: enhanced software for biomolecular NMR spectroscopy
iCLIP data analysis: A complete pipeline from sequencing reads to RBP binding sites
iCLIP: protein-RNA interactions at nucleotide resolution
A systems view of spliceosomal assembly and branchpoints with iCLIP
BEDTools: a flexible suite of utilities for comparing genomic features
VARNA: Interactive drawing and editing of the RNA secondary structure
Structural basis for MTR4-ZCCHC8 interactions that stimulate the MTR4 helicase in the nuclear exosome-targeting complex
Discovery of a large-scale, cell-state-responsive allosteric switch in the 7SK RNA using DANCE-MaP
HnRNP A1/A2 Proteins Assemble onto 7SK snRNA via Context Dependent Interactions
Chemical reversible crosslinking enables measurement of RNA 3D distances and alternative conformations in cells
RRM-RNA recognition: NMR or crystallography...and new findings
Position-specific propensities of amino acids in the beta-strand
The transcription-dependent dissociation of P-TEFb-HEXIM1-7SK RNA relies upon formation of hnRNP-7SK RNA complexes