key: cord-0263988-p54rcypn authors: Cheong, Ezekiel Ze Ken; Quek, Jun Ping; Xin, Liu; Li, Chaoqiang; Chan, Jing Yi; Liew, Chong Wai; Mu, Yuguang; Zheng, Jie; Luo, Dahai title: Crystal structure of the Rubella virus protease reveals a unique papain-like protease fold date: 2022-04-16 journal: bioRxiv DOI: 10.1101/2022.04.15.488536 sha: 50a065b1a5346f2847538deae427c15c93bdacf5 doc_id: 263988 cord_uid: p54rcypn

Rubella is well controlled thanks to an effective vaccine, but outbreaks still occur and no antiviral treatments are available. There is still much to learn about the rubella virus (RUBV) papain-like protease (RubPro), which could be a potential drug target. This protease is crucial to RUBV replication, cleaving the non-structural polyprotein p200 into two multi-functional proteins, p150 and p90. Here we report a novel crystal structure of RubPro at 1.64 Å resolution. It has a similar catalytic core structure to that of the SARS-CoV-2 and foot-and-mouth disease virus (FMDV) proteases. RubPro has well-conserved sequence motifs that are also found in its newly discovered Rubivirus relatives. The RubPro construct was shown to have protease activity in trans against a RUBV protease-helicase construct and against fluorogenic peptide substrates. A protease-helicase construct was also cleaved during expression in E. coli. RubPro was demonstrated to possess deubiquitylation activity, suggesting a potential role of RubPro in modulating the host's innate immune responses. These structural and functional insights into RubPro will advance our current understanding of its function and point to more structure-based research into the RUBV replication machinery, in hopes of developing antiviral therapeutics in the future.

Rubella is an infectious disease that is characterised by rashes [1]. It is often confused with measles, which also causes rashes. Confoundingly, rubella is also known as German measles, and measles is also known as rubeola [2].
However, rubella and measles are different diseases caused by different viruses. The rubella virus (RUBV) is the aetiological agent of rubella and belongs to the genus Rubivirus of the newly created family Matonaviridae [3]. Rubella infection during the first trimester of pregnancy can result in miscarriage or congenital rubella syndrome (CRS). CRS is characterised by foetal cataracts, deafness, heart defects and global developmental delay [4]. Infection at the early stages of pregnancy typically has the worst prognosis [5]. There is currently no treatment for CRS apart from symptomatic treatment [6]. Before the rubella vaccine was developed in 1969, rubella epidemics occurred every 6-9 years [7]. Modern rubella vaccines utilise the RA27/3 strain [8], a strain of RUBV obtained from an aborted foetus infected with the virus [9]. The vaccine is typically administered as part of the combined measles, mumps and rubella (MMR) vaccine. With 97% effectiveness [10], the vaccine has virtually eliminated rubella in more than 130 countries [11]. As such, there appears to be very little impetus to research deeply into its virology.

RUBV is a Group IV virus with a single-stranded, positive-sense RNA genome enclosed by an icosahedral capsid [4, 12]. The viral genome is around 10 kb in size and has the highest GC content of RNA viruses, at 70% [13]. The genome has a 5' cap structure and a poly(A) tail at its 3' end. The 5'-proximal open reading frame (ORF) encodes the non-structural polypeptide p200, while the 3'-proximal ORF encodes the structural proteins: the capsid and the surface glycoproteins E1 and E2 (Fig 1A) [14]. The non-structural polyprotein p200 is then processed into two non-structural proteins, p150 and p90 [15]. The p150 protein consists of methyltransferase (MTase) and protease domains [16], while the p90 protein has both helicase and RNA-dependent RNA polymerase (RdRp) domains [17].
These non-structural proteins are crucial to RNA viruses for replication [18] and polyprotein processing [19]. The RUBV protease, RubPro, while still part of p200, cleaves the polyprotein between residues G1301 and G1302 [20]. This cleavage site, SRGG/GTCA, lies at the C-terminus of RubPro in p150 and the N-terminus of the helicase domain of p90 (Fig 1B) [21]. Based on computer alignments, RubPro is predicted to be a papain-like cysteine protease (PCP). PCPs form a large family among cysteine proteases and can be found in plants, viruses and parasites [22]. Coronaviruses and alphaviruses utilise PCPs for polypeptide processing and immune evasion mechanisms [23, 24]. Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) papain-like protease (PLpro) [25] and foot-and-mouth disease virus (FMDV) PCP [26, 27] cleave inflammatory ubiquitin and other ubiquitin-like antiviral proteins, such as interferon-stimulated gene 15 (ISG15), to modulate host innate immune responses [28]. The functional importance of PCPs for viral replication and survival makes them viable drug targets for antiviral treatment [29].

RubPro has residues C1152 and H1273 as the catalytic dyad [30], confirmed via site-directed mutagenesis [31]. RubPro requires divalent cations such as Zn2+, Co2+ and Cd2+ for protease activity [32]. Additionally, the Ca2+-dependent association of calmodulin with RubPro is necessary for the proteolytic activity of RubPro [33]. Using FMDV PCP as a reference, homology modelling has been used to partially model the RubPro structure [34]. RubPro was reported to contain a Ca2+-binding EF-hand domain that plays a role in maintaining the structure of RubPro [34]. The cysteine residues are arranged close to Zn2+, and this Zn2+ coordination also contributes to structural stability [35]. The structure of RubPro has not been solved, and neither its enzymatic activity nor its inhibition has been characterised quantitatively.
Scientific knowledge of RUBV is relatively superficial compared to other viruses, likely because the effective vaccine has reduced the need for deeper research. The only proteins of RUBV with solved structures are the structural proteins E1 [36] and capsid [37]. The structures and interactions of the enzymatic and essential non-structural proteins are not well understood or adequately characterised. While only RUBV is found naturally in humans [38], two newly discovered rubiviruses, Ruhugu virus (RUHV) and Rustrela virus (RUSV), have crossed barriers between other mammalian host species. This raises concerns about the potential zoonotic transmission of RUBV-like viruses into human hosts [39]. Furthermore, despite an effective RUBV vaccine, rubella outbreaks still occur because of gaps in national immunisation programmes. Japan experienced three rubella outbreaks during the last decades [40, 41]. There were outbreaks in Poland [42] and Romania [43] in 2012 as well. Once an outbreak breaks through the first line of defence, which is vaccine coverage, there are no antiviral treatments available to combat the virus and the disease.

Effective antiviral therapeutics require a fundamental understanding of viral replication processes. Many successful drugs inhibit proteins of the replication machinery, such as proteases [44, 45] and RNA polymerases [46-48]. The lack of knowledge about RUBV highlights a need for deeper research into its replication, both to potentially develop antiviral treatments and to expand the knowledge of virology in ways that could shed light on the replication and pathogenesis mechanisms of other viruses. In this study, the crystal structure of RubPro at a resolution of 1.64 Å is reported. This novel structure provides the molecular basis for substrate recognition and a characterisation of the active site for potential inhibitor studies. Next, sequence alignment was conducted with RUHV and RUSV proteases.
RubPro was also structurally aligned with other viral PCPs deposited in the Protein Data Bank (PDB) for further structural analysis. The protease's activity was also characterised using RUBV protease-helicase as the natural substrate as well as using fluorogenic peptides. RubPro was shown to have K48-linkage-specific deubiquitylation activity. This study provides structural and functional insight into RubPro to facilitate antiviral development against RUBV, as well as to expand the field of knowledge of viral proteins.

Cloning and expression tests of various proteins were carried out by the Protein Production Platform (PPP) at Nanyang Technological University (NTU). RubPro, the RubPro C1152A mutant and the RUBV protease-helicase (RubProHel) C1152A mutant were cloned into the pSUMO-LIC vector, which encodes an N-terminal His-Sumo tag that is cleavable by Sumo protease. RubPro consisted of RUBV residues 1021-1301 and RubProHel consisted of RUBV residues 1021-1616. C1152A mutants were generated by site-directed mutagenesis of the respective wild type (WT) constructs. C1152A mutagenesis was performed for both RubPro and RubProHel using Q5 Melbourne, Australia. Data processing was carried out using the XDS data processing package. The structure of RubPro was determined using the single-wavelength anomalous diffraction (SAD) method. Measurements of anomalous signals from Zn2+ were used to derive the positions of the heavy atom substructure. Crank2 [49] and BUCCANEER [50] were used for initial phase calculation and automatic model building, respectively. The structure was subjected to iterative rounds of refinement using phenix.refine [51-53] and manual rebuilding using Coot [54]. Figures were generated using PyMOL (Schrödinger). Data collection and refinement statistics can be found in Table S1.

The amino acid sequence of our RubPro construct was aligned with the sequences of the non-structural p200 polypeptides of RUHV and RUSV.
The sequences were obtained from UniProtKB with the identifiers A0A7L5KV54 for RUHV and A0A7L5KV68 for RUSV. Alignments were done using Clustal Omega [55].

Testing of RubProHel C1152A cleavage by RubPro

0.5 mg/mL of RubProHel C1152A was incubated with doubling concentrations of RubPro, from 0.125 mg/mL to 2 mg/mL, at room temperature for 18 h in cleavage buffer (25 mM HEPES buffer at pH 7.5, 150 mM NaCl, 5% w/v glycerol and 2 mM DTT). Samples were analysed by electrophoresis on a 15% SDS-PAGE gel and visualised by Coomassie blue staining.

The assay was performed in a 96-well half-area black clear-bottom microplate. The assay was repeated using another chemically synthesised peptide substrate, Acryl-LSRGG-AMC (GenScript, Hong Kong).

Structural prediction of RubPro was performed using the AlphaFold2 Colab web server [57].

Isolated peptides, Ace-Leu-Ser-Arg-Gly-Gly-Gly-Nme (LSRGGG) and Ace-Arg-Leu-Arg-Gly-Gly-Gly-Nme (RLRGGG), were simulated in a water box with a size of 3.82 nm × 3.82 nm × 3.82 nm for 500 ns. From the second 250 ns of each trajectory, one structure was taken every 100 ps (2500 structures in total for each peptide) and docked onto the protein structure near the active site (C1152 and H1273). The QuickVina 2 tool [56] was used for docking, with each peptide conformation held rigid. In this way, the protein-LSRGGG complex structure was obtained. The initial protein-RLRGGG structure was obtained from the protein-LSRGGG structure by mutating LSRGGG to RLRGGG. Parameters of the protein and peptides were based on the AMBER99SB-ILDN force field [57]. The system was solvated with TIP3P [58] water molecules, and counterions were added to neutralise it. The molecular dynamics (MD) simulations were performed using GROMACS [59] 5.1.2. The LINCS [60] algorithm was used to constrain bonds between heavy atoms and hydrogen to enable a timestep of 2 fs.
A 1.2 nm cutoff was used for van der Waals and short-range electrostatic interactions, and the Particle Mesh Ewald method was used for long-range electrostatics. The simulation temperature was maintained at 300 K using a V-rescale thermostat [61] and the pressure at 1 bar using a Parrinello-Rahman [62] barostat.

To analyse the interactions between the protein and the peptides, we calculated the binding energies between them using the MM-GBSA (Molecular Mechanics Generalized Born Surface Area) method [63]. The entropy was not calculated:

$$\Delta G_{\mathrm{bind}} = G_{\mathrm{complex}} - \left(G_{\mathrm{pro}} + G_{\mathrm{pep}}\right) \quad (1)$$

where $G_{\mathrm{complex}}$, $G_{\mathrm{pro}}$ and $G_{\mathrm{pep}}$ are the free energies of the complex, the protein and the peptide, respectively. The free energy ($G$) of each state was calculated as follows:

$$G = E_{\mathrm{MM}} + G_{\mathrm{GB}} + G_{\mathrm{SA}} - TS \quad (2)$$

where $E_{\mathrm{MM}}$ is the molecular mechanical energy, $G_{\mathrm{GB}}$ is the polar contribution to the solvation energy calculated by the Generalized Born (GB) method, $G_{\mathrm{SA}}$ is the nonpolar contribution to the solvation energy, and $TS$ is the entropic contribution of the system. $E_{\mathrm{MM}}$ was obtained by summing the electrostatic energy ($E_{\mathrm{ele}}$), the van der Waals energy ($E_{\mathrm{vdw}}$), and the internal energy including bond, angle and torsional angle terms ($E_{\mathrm{int}}$), using the same force field as that of the MD simulations. $G_{\mathrm{GB}}$ was calculated with Onufriev's method [64]. $G_{\mathrm{SA}}$ in equation 2 is proportional to the solvent accessible surface area (SASA) and was computed with the molsurf module:

$$G_{\mathrm{SA}} = \gamma \cdot \mathrm{SASA} + b \quad (3)$$

where the surface tension proportionality constant ($\gamma$) was set to 0.0072 kcal/mol/Å^2, while the free energy of nonpolar solvation for a point solute ($b$) was set to a default value.

The protein substrates for the deubiquitination assays were prepared as previously described [65].

The novel crystal structure of RubPro was determined at a resolution of 1.64 Å, from data collected at the Zn K absorption edge of 9.7 keV, using the RubPro domain of p150 (residues 1021-1301) (Fig 2 and Fig S1). (Fig 2C) .
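The MM-GBSA bookkeeping described in the methods above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' analysis pipeline: the `StateEnergy` container, the function names and the example numbers are hypothetical, the offset `b` is left as a placeholder because the text only says it was "set to a default value", and the entropic term is omitted because the text states it was not calculated.

```python
from dataclasses import dataclass

# Surface tension constant quoted in the text, in kcal/mol/Å^2.
GAMMA = 0.0072


@dataclass
class StateEnergy:
    """Averaged energy terms for one state (complex, protein or peptide)."""
    e_mm: float  # E_ele + E_vdw + E_int, kcal/mol
    g_gb: float  # polar solvation energy from the GB model, kcal/mol
    sasa: float  # solvent accessible surface area, Å^2


def nonpolar_solvation(sasa: float, gamma: float = GAMMA, b: float = 0.0) -> float:
    """G_SA = gamma * SASA + b. The default b = 0.0 is a placeholder (assumption)."""
    return gamma * sasa + b


def free_energy(state: StateEnergy, b: float = 0.0) -> float:
    """G = E_MM + G_GB + G_SA, with the entropic term -TS left out."""
    return state.e_mm + state.g_gb + nonpolar_solvation(state.sasa, b=b)


def binding_energy(complex_: StateEnergy, protein: StateEnergy,
                   peptide: StateEnergy, b: float = 0.0) -> float:
    """Delta G_bind = G_complex - (G_pro + G_pep)."""
    return free_energy(complex_, b) - (free_energy(protein, b) + free_energy(peptide, b))


if __name__ == "__main__":
    # Made-up numbers for illustration only, not results from the study.
    cplx = StateEnergy(e_mm=-1000.0, g_gb=-500.0, sasa=10000.0)
    prot = StateEnergy(e_mm=-900.0, g_gb=-480.0, sasa=9500.0)
    pep = StateEnergy(e_mm=-60.0, g_gb=-40.0, sasa=1000.0)
    print(f"{binding_energy(cplx, prot, pep):.1f} kcal/mol")
```

In practice each term would be an average over the sampled MD frames; the sketch only shows how the three states combine into a single binding energy.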
The catalytic dyad residues C1152 and H1273 are located on the palm and the thumb respectively. The positions of these catalytic residues are similar to those of SARS-CoV-2 PLpro, whereby the catalytic triad is also found at the interface between the palm and the thumb regions [66]. The sulfur atom of C1152 is 4.2 Å from the delta nitrogen of H1273 (Fig 2C). In contrast to other cysteine proteases, no stabilising Asn/Asp residue was found near H1273.

Sequence alignment of RubPro with the protease domains of RUHV and RUSV was performed to look for conserved motifs within the Rubivirus genus. The protease domains of RUHV and RUSV show 53% and 40% amino acid similarity respectively to RubPro, with conserved catalytic dyad and zinc-coordinating residues (Fig 2E).

The Dali server was used to search for structural homologs in the PDB (Table S2) [67]. We selected the crystal structures of human ubiquitin specific protease (USP) 30 (PDB: 5OHN) [68], FMDV PCP (PDB: 4QBB) [69] and SARS-CoV-2 PLpro (PDB: 6WZU) [66] for further analysis against that of RubPro (Fig 3). For all three cysteine proteases, there was a good alignment of the core domains around the catalytic sites. The catalytic residues were in similar positions, with deviations of less than 5 Å (Fig S2). While RubPro has a Cys-His catalytic dyad, the other proteases have catalytic triads: Cys-His-Asp for PLpro and FMDV PCP, and Cys-His-Ser for USP30. The N-terminal fingers domain of RubPro was distinct and had no structural counterpart in any of the other three proteases.

RubProHel C1152A, with the natural cleavage junction between RUBV protease and helicase, was used to assess the trans cleavage activity of RubPro. A catalytic C1152A mutation was introduced into RubProHel to prevent any self-cleavage in cis (Fig 4A). Expression of RubProHel in E.
coli cells was used to ascertain cleavage of the p150-

Enzyme concentration and buffer pH were first optimised to identify suitable conditions for the protease assay. An enzyme concentration greater than 3.75 µM was chosen for easier quantification and calculation of the enzymatic rate (Fig S4A). Buffer pH 6.4 to 6.6 gave the highest enzymatic rate (Fig S4B). Finally, the protease activity of RubPro was measured at a protein concentration of 5 µM in 50 mM potassium phosphate buffer at pH 6.6. The protease assays of RubPro and the RubPro C1152A mutant were performed using the Z-RLRGG-AMC and Acryl-LSRGG-AMC substrates. RubPro has a higher affinity towards the Z-RLRGG-AMC substrate than the Acryl-LSRGG-AMC substrate, with Km values of 580 µM and 1460 µM respectively (Fig 4C and Fig 4D). RubPro also exhibits a higher catalytic rate of 7.41 × 10^-5 s^-1 against the Z-RLRGG-AMC substrate, compared with an approximately 10-fold slower catalytic rate of 7.91 × 10^-6 s^-1 against the Acryl-LSRGG-AMC substrate (Fig 4C and Fig 4D).

Two blocked hexamer peptides, Ace-Leu-Ser-Arg-Gly-Gly-Gly-Nme (LSRGGG) and Ace-Arg-Leu-Arg-Gly-Gly-Gly-Nme (RLRGGG), were simulated together with the

To determine RubPro activity on ubiquitin (Ub) and the Ub-like (Ubl) modifier interferon-stimulated gene 15 (ISG15), we purified K63-triubiquitin, K48-triubiquitin and M1-triubiquitin (linear) chains and ISG15-HIS according to our previous work [65]. Starting with ubiquitin, robust RubPro activity and high specificity were observed towards Lys48-linked polyubiquitin (Fig 5B), where triubiquitin was cleaved to monoubiquitin products, but not towards Lys63-linked triubiquitin (Fig 5A) or linear triubiquitin (Fig 5D). Results similar to those for Lys63-linked and linear triubiquitin were obtained in gel-based analysis of the cleavage of ISG15-HIS to mature ISG15: the HIS tag was not removed from the ISG15 C-terminus (Fig 5C).
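The kinetic constants quoted above for the two fluorogenic peptide substrates can be compared through the catalytic efficiency kcat/Km. The short Python sketch below is an illustration only: it assumes the Km values and enzyme concentration are in micromolar (the unit prefix appears to have been lost in the text), and the function and dictionary names are hypothetical.

```python
def catalytic_efficiency(kcat_per_s: float, km_micromolar: float) -> float:
    """Return kcat/Km in M^-1 s^-1, converting Km from µM to M."""
    return kcat_per_s / (km_micromolar * 1e-6)


def initial_rate(kcat_per_s: float, km_micromolar: float,
                 enzyme_micromolar: float, substrate_micromolar: float) -> float:
    """Michaelis-Menten initial rate in µM/s: v = kcat*[E]*[S] / (Km + [S])."""
    return (kcat_per_s * enzyme_micromolar * substrate_micromolar
            / (km_micromolar + substrate_micromolar))


# Values quoted in the text; the µM unit for Km is an assumption.
SUBSTRATES = {
    "Z-RLRGG-AMC": {"kcat": 7.41e-5, "km_uM": 580.0},
    "Acryl-LSRGG-AMC": {"kcat": 7.91e-6, "km_uM": 1460.0},
}

if __name__ == "__main__":
    for name, p in SUBSTRATES.items():
        eff = catalytic_efficiency(p["kcat"], p["km_uM"])
        # 5 µM enzyme as in the assay description (unit assumed), [S] = Km.
        v_half = initial_rate(p["kcat"], p["km_uM"], 5.0, p["km_uM"])
        print(f"{name}: kcat/Km = {eff:.4f} M^-1 s^-1, v at [S]=Km = {v_half:.2e} µM/s")
```

Under these unit assumptions, the efficiency ratio between the two substrates is larger than the roughly 10-fold difference in kcat alone, because the Km values also differ.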
A mutant of the catalytic dyad, C1152A, showed no cleavage of K63-triubiquitin, K48-triubiquitin and M1-triubiquitin (linear) chains or of ISG15-HIS (Fig 5A-D). Time course analysis also suggests that RubPro is a K48-linkage-specific deubiquitinase (Fig 5E).

The novel structure of RubPro presented here offers considerable insight into the poorly understood RUBV (Fig 2). The novelty of the structure is further exemplified by the unsuccessful prediction using the AlphaFold2 server (Fig S6). This 1.64 Å structure represents the first structure of a RUBV non-structural protein. In contrast to the previous study [34], we did not identify Ca2+-binding sites or an EF-hand domain in our crystal structure. We tried supplementing CaCl2 into the crystallisation buffer and soaking the crystals in a solution containing CaCl2. One possible direction for future work would be to conduct a denaturation-refolding assay of RubPro in the presence of Ca2+.

One surprise of the RubPro structure is that the C-terminus ends in a β-sheet β8 that points away from the catalytic site (Fig 2B). This suggests that the cleavage junction SRGG/GTCA in the p200 polypeptide could be inaccessible to the RubPro catalytic domain for cis cleavage of the p150-p90 junction. Another possibility is that RubPro functions like the Chikungunya virus capsid protease domain, which is only active for one proteolytic reaction, after which the active site is inaccessible [70]. Structural work needs to be carried out to solve the structure of RubProHel, to shed light on this apparent structural disagreement. Co-crystallisation can also be carried out with a bound inhibitor or a substrate to further characterise the active site. A C1152A or H1273A mutant could be used to bind a substrate stably without cleaving it.

The newly discovered Rubivirus members, RUHV and RUSV, are the closest relatives of RUBV.
From sequence alignment of RubPro with RUHV and RUSV proteases (Fig 2E)

Despite limited sequence similarity with other cysteine proteases, RubPro is structurally similar to other cysteine proteases, namely human USP30, FMDV PCP and SARS-CoV-2 PLpro, at the catalytic core (Fig 3 and Table S2). The structure of RubPro was also compared with the protease domain of the Chikungunya virus, as RUBV was previously classified under the Togaviridae family. However, the two structures are distinctly different (Fig S7), and no members of the Togaviridae family were identified as hits in the Dali structural homology search (Table S2).

Expression of the WT RubProHel showed significant cleavage into the 42.8 kDa and 34.7 kDa products (Fig 4B), demonstrating in vivo protease activity in E. coli cells. As such, the RubProHel C1152A mutant is likely to be a good candidate for structural studies, as it can bring structural insights into the self-cleavage mechanism of p200 into p150 and p90. The orientation of the cleavage junction SRGG/GTCA relative to the catalytic site can indicate whether RubPro acts on p200 predominantly in cis or in trans. RubProHel C1152A was cleaved by RubPro in trans (Fig 4A), which differs from Liang et al.'s conclusion that trans cleavage requires additional residues 920-974 [20]. The additional residues might improve protease activity but are not essential for it. In our study,

Epidemiology and prevention of vaccine-preventable diseases Molecular biology of rubella virus Changes to virus taxonomy and to the International Code of Virus Classification and Nomenclature ratified by the International Committee on Taxonomy of Viruses (2021) Congenital rubella syndrome: ophthalmic manifestations and associated systemic disorders Congenital Rubella, in StatPearls. 2020: Treasure Island (FL) Congenital rubella syndrome.
Manual for the Surveillance of Vaccine-Preventable Diseases World Health Organization. Observed rate of vaccine reactions -MMR vaccines New York: Random House. 10. Centers for Disease Control and Prevention. Rubella vaccination. 2020. 11. World Health Organization, WHO position on measles vaccines Three-dimensional structure of a membrane-containing virus Analysis of base and codon usage by rubella virus Intracellular distribution of rubella virus nonstructural protein P150 Sequence of the genome RNA of rubella virus: evidence for genetic rearrangement during togavirus evolution Expression of the rubella virus nonstructural protein ORF and demonstration of proteolytic processing Rubella virus and birth defects: molecular insights into the viral teratogenesis at the cellular level Rubella virus replication and links to teratogenicity Viral proteinases Rubella virus nonstructural protein protease domains involved in trans-and cis-cleavage activities Rubella virus: structural and non-structural proteins Papain-like cysteine proteases Coronaviruses: an overview of their replication and pathogenesis Alphavirus protease inhibitors from natural sources: A homology modeling and molecular docking investigation Papain-like protease regulates SARS-CoV-2 viral spread and innate immunity Severe acute respiratory syndrome coronavirus papain-like protease: structure of a viral deubiquitinating enzyme The leader proteinase of foot-and-mouth disease virus negatively regulates the type I interferon pathway by acting as a viral deubiquitinase The ubiquitin system: a critical regulator of innate immunity and pathogen-host interactions Drug Development and Medicinal Chemistry Efforts toward SARS-Coronavirus and Covid-19 Therapeutics Characterization of the zinc binding activity of the rubella virus nonstructural protease Characterization of the rubella virus nonstructural protease domain and its cleavage site The rubella virus nonstructural protease requires divalent cations for 
activity and functions in trans Calcium-dependent association of calmodulin with the rubella virus nonstructural protease domain Identification of a Ca2+-binding domain in the rubella virus nonstructural protease A cysteine-rich metal-binding domain from rubella virus nonstructural protein is essential for viral protease activity and virus replication Functional and evolutionary insight from the crystal structure of rubella virus protein E1 Assembly, maturation and threedimensional helical structure of the teratogenic rubella virus Neurological aspects of rubella virus infection Relatives of rubella virus in diverse mammals Rubella outbreak in Japan. Lancet Epidemiological characteristics of rubella and congenital rubella syndrome in the 2012-2013 epidemics in Ongoing outbreak of rubella among young male adults in Poland: increased risk of congenital rubella infections Ongoing rubella outbreak among adolescents in Salaj Recent advances in hepatitis C virus treatment: review of HCV protease inhibitor clinical trials HIV protease inhibitors: a review of molecular selectivity and toxicity The journey of remdesivir: from Ebola to COVID-19. Drugs Context, 2020 Efficacy and safety of favipiravir, an oral RNA-dependent RNA polymerase inhibitor, in mild-to-moderate COVID-19: A randomized, comparative, open-label, multicenter, phase 3 clinical trial Review article: the efficacy and safety of sofosbuvir, a novel, oral nucleotide NS5B polymerase inhibitor, in the treatment of chronic hepatitis C virus infection Automatic protein structure solution from weak X-ray data The Buccaneer software for automated model building. 1. 
Tracing protein chains Towards automated crystallographic structure refinement with phenix.refine PHENIX: a comprehensive Python-based system for macromolecular structure solution Use of knowledge-based restraints in phenix.refine to improve macromolecular refinement at low resolution Coot: model-building tools for molecular graphics The EMBL-EBI search and sequence analysis tools APIs in 2019 Fast, accurate, and reliable molecular docking with QuickVina 2 Improved side-chain torsion potentials for the Amber ff99SB protein force field Comparison of simple potential functions for simulating liquid water GROMACS: fast, flexible, and free A Parallel Linear Constraint Solver for Molecular Simulation Canonical sampling through velocity rescaling Polymorphic transitions in single crystals: A new molecular dynamics method The MM/PBSA and MM/GBSA methods to estimate ligand-binding affinities Exploring protein native states and largescale conformational changes with a modified generalized born model Ordered assembly of the cytosolic RNA-sensing MDA5-MAVS signaling complex via binding to unanchored K63-linked poly-ubiquitin chains Structure of papain-like protease from SARS-CoV-2 and its complexes with non-covalent inhibitors DALI and the persistence of protein shape Mechanism and regulation of the Lys6-selective deubiquitinase USP30 Foot-and-mouth disease virus leader proteinase: structural insights into the mechanism of intermolecular cleavage Kinetic characterization of trans-proteolytic activity of Chikungunya virus capsid protease and development of a FRET-based HTS assay Herpes Simplex Virus 1 Ubiquitin-Specific Protease UL36 Abrogates NF-kappaB Activation in DNA Sensing Signal Pathway The cysteine protease domain of porcine reproductive and respiratory syndrome virus nonstructural protein 2 possesses deubiquitinating and interferon antagonism functions This research is supported by the Ministry of Education, Singapore, under its MOE AcRF Tier 1 Award 2021-T1-002-021. 
We wish to acknowledge the funding support for this project from Nanyang Technological University under the URECA Undergraduate Research Programme. J. P. Q. is supported by the Nanyang