key: cord-0881909-eovwh0c0 authors: Mukherjee, Shruti; Harikishore, Amaravadhi; Bhunia, Anirban title: Targeting C-terminal helical bundle of NCOVID19 Envelope (E) protein date: 2021-02-04 journal: Int J Biol Macromol DOI: 10.1016/j.ijbiomac.2021.02.011 sha: b2c8637b9ba7f4eb61f5e5ce6578fe1cbde4bf25 doc_id: 881909 cord_uid: eovwh0c0

One of the most crucial traits of the Envelope (E) proteins of the severe acute respiratory syndrome viruses SARS-CoV-1 and NCOVID19 is their membrane-associated oligomerization, which leads to ion channel activity, virion assembly, and replication. NMR spectroscopic structural studies of the envelope proteins from both SARS-CoV-1 and SARS-CoV-2 reveal that this protein assembles into a homopentamer. Proof-of-concept studies using truncation mutants of the transmembrane segment (VFLLV), the glycosylation motif (CACCN), or the hydrophobic helical bundle (PVYVY), as well as replacement of the C-terminal “DLLV” segment or point mutation of residues such as S68 and E69 to cysteine, significantly reduced viral titers of SARS-CoV-1. In the present study, we first developed a SARS-2 E protein homology model based on the pentamer coordinates of SARS-CoV-1 E protein (86.4% structural identity) with good stereochemical quality. Next, we focused on the glycosylation motif and hydrophobic helical bundle regions of E protein, which have been shown to be important for viral replication. A four-feature (4F) model comprising an acceptor targeting the S60 hydroxyl group, a donor feature anchoring the C40 residue, and two hydrophobic features anchoring the V47, L28, L31, Y55, and P51 residues formed the protein-based pharmacophore model targeting the glycosylation motif and helical bundle of E protein. Database screening with this 4F protein pharmacophore and ADMET property filtering on the Enamine small-molecule discovery collection yielded a focused library of ~7000 hits.
Further molecular docking and visual inspection of docked-pose interactions at the above-mentioned V47, L28, L31, Y55, P51, S60, and C40 residues led to the identification of the 10 best hits. Our STD NMR binding assay results demonstrate that the smaller ligand 3, 2-(2-amino-2-oxo-ethoxy)-N-benzyl-benzamide, binds to NCOVID19 E protein with a binding affinity (K_D) of 141.7 ± 13.6 μM. Furthermore, ligand 3 also bound the C-terminal peptide (NR25), as evidenced by STD spectra similar to those obtained with wild-type E protein, which serves to confirm the involvement of the C-terminal helical bundle as envisaged in this study.

cell membrane permeability by forming oligomeric cation-selective ion channels [17]. More importantly, the transmembrane regions of E protein are known to interact with M protein and aid in its colocalization [18], thus maintaining integrity in viral morphogenesis, especially during assembly and budding [14]. Biophysical studies using the "VYVY" motif [19] have shown that this region assumes a helical orientation in a membrane environment, a characteristic trait of its known amyloidogenic propensity [20], and enables the self-aggregating motif [21] to insert into membrane areas, which might contribute to entry into the host environment or to viral assembly [19, 22]. Additionally, the association of the C-terminal residues of SARS-CoV-1 E protein with nucleocapsid (N) [23] and the PALS1 PDZ domain [24] has been implicated in the regulation of viral pathogenesis and viral replication by both in vitro and in vivo studies [24]. Several mutations focusing on the transmembrane "VFLV" segment, the glycosylation motif "RLCAYCCN", and the hydrophobic helical bundle "PSFYVYVYSR" stretches of SARS-CoV-1 E protein resulted in a complete loss of viral replication.
Similarly, point mutation studies replacing S68-E69 with a single cysteine residue, or replacing the C-terminal tail (DLLV) with glycine residues, drastically reduced viral counts in SARS-CoV-1 [25]. More importantly, genotyping analysis of recent NCOVID19 infections worldwide points to a lower mutational frequency in the E protein, highlighting its suitability for targeted drug discovery [26].

In this study, we targeted the C-terminal helical bundle to identify small-molecule ligands of the NCOVID19 Envelope (E) protein and to characterize their binding biophysically, using structure-based drug design and STD NMR tools. First, we developed a 3D structure model of NCOVID19 E protein with reliable stereochemical quality from its closest homolog by homology modeling. Next, a four-feature protein-based pharmacophore model was developed by targeting residues of the glycosylation motif and the helical bundle of the C-terminal region. Database screening of the Enamine small-molecule library [27], ADMET filtering [28], molecular docking, and visual inspection of interactions at the above-described epitope residues led to the purchase of a focused library of 10 molecules. Next, Saturation Transfer Difference (STD) NMR binding studies of these 10 molecules with E protein led to the identification of two ligands, 1 and 3, that bind to E protein with high-micromolar affinity.

The SARS-CoV-1 E protein NMR structure (PDB 5X29) has its cysteine residues mutated to alanine and also lacks the C-terminal ten residues. It was utilized as a template to build the model of the NCOVID19 envelope small membrane protein. Phyre 2 [29] was employed to construct the 3D model of NCOVID19 E protein comprising amino acid residues 1-65. Later, the pentameric assembly of the NCOVID19 ion channel was built by superimposing the 3D protein models on the structure template (5X29).
Protein Preparation: Next, the pentameric assembly of NCOVID19 E protein was checked for stereochemical clashes within the arrangement. Residue F23, which clashed sterically with V25 of the adjacent subunit at the central cavity of the pentamer, was adjusted using the rotamer library. The heavy atoms of the pentameric assembly were constrained, and the added hydrogen atoms were energy minimized using steepest descent and conjugate gradient algorithms for 2000 steps with the Prepare Protein module in Discovery Studio [30]. Upon satisfying the energy/RMSD convergence criterion, the energy-minimized structures were saved for further modeling studies.

Structure-based pharmacophore model: The first step in developing the structure-based pharmacophore model on this domain was to enumerate the possible hotspot features, namely donor (green), acceptor (magenta), and hydrophobic (cyan) feature vector site points at the CTD, using the interaction site generation module in Discovery Studio 2020 [30] [31] [32] [33] [34]. These numerous site points of the three feature types were hierarchically clustered based on their RMSD within their respective feature type, and only the cluster centers for each of the three feature types were included in the study. Finally, a donor feature mapping the carbonyl atoms of the C40 residue, an acceptor partially mapped to the hydroxyl group of S60, and two hydrophobic features in the near vicinity of the hydrophobic residues V47, L28, Y57, and P54 were included in the structure-based pharmacophore model. Site points that were further away from the above-mentioned residues of the CTD were discarded.

3D-Database Preparation: 3D databases of Enamine datasets from the ZINC library [35] were employed in this study. 3D coordinates were generated using the Prepare Ligands tools. Next, ADMET [28] property filters were applied to the Enamine libraries.
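As a rough illustration of this kind of property filtering, the step can be pictured as a predicate applied to each ligand's predicted ADMET flags. The sketch below is not the study's implementation: the class, field names, and example ligands are hypothetical, and the exclusion rules simply mirror the criteria named in the text (CYP2D6 inhibition, hepatotoxicity, high plasma protein binding, low solubility, low absorption).

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class AdmetPrediction:
    """Hypothetical per-ligand ADMET predictions (field names are illustrative)."""
    ligand_id: str
    cyp2d6_inhibitor: bool   # predicted to inhibit CYP2D6
    hepatotoxic: bool        # predicted hepatotoxicity
    high_ppb: bool           # predicted high plasma protein binding
    solubility_ok: bool      # predicted aqueous solubility is acceptable
    absorption_ok: bool      # predicted intestinal absorption is acceptable


def passes_admet(p: AdmetPrediction) -> bool:
    """Return True only if none of the exclusion criteria is triggered."""
    has_liability = p.cyp2d6_inhibitor or p.hepatotoxic or p.high_ppb
    return (not has_liability) and p.solubility_ok and p.absorption_ok


def filter_library(predictions: Iterable[AdmetPrediction]) -> List[str]:
    """Keep the identifiers of ligands that pass every filter, in input order."""
    return [p.ligand_id for p in predictions if passes_admet(p)]


# Made-up example ligands; none of these correspond to compounds in the study.
library = [
    AdmetPrediction("lig_a", False, False, False, True, True),
    AdmetPrediction("lig_b", True, False, False, True, True),    # CYP2D6 liability
    AdmetPrediction("lig_c", False, False, False, False, True),  # poor solubility
    AdmetPrediction("lig_d", False, False, False, True, True),
]
clean = filter_library(library)
```

Encoding each criterion as a boolean keeps the sketch short; real ADMET predictors typically report graded levels or probabilities, so a threshold per property would replace the flags.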
Ligands predicted to inhibit the liver enzyme CYP2D6, to be hepatotoxic, or to show high plasma protein binding, as well as ligands with low solubility and absorption, were excluded from further consideration. The resulting clean ligands were utilized to build 3D databases using the "CAESAR" search algorithm with an option to generate 50 conformations per ligand in Discovery Studio 2020 [30]. Next, the 4F pharmacophore was used to search the 3D database, resulting in a focused library of < 7000 hits.

Docking: The GOLD molecular docking program (2020) was utilized to dock the focused library at the previously discussed C-terminal domain using the default parameters available with GOLD [36]. GOLD PLP fitness scores, together with visual inspection of interactions with the intended C-terminal residue (S60), led us to identify the 10 best poses that dock at the C-terminal helical bundle of NCOVID19 E protein and could potentially inhibit viral pathogenesis and viral replication, as suggested by SARS-CoV-1 E protein knockout studies.

The NMR sample for STD NMR experiments was prepared with 5 µM NCOVID19 E protein / The experiments were performed at 25 °C on a Bruker Avance III 700 MHz spectrometer equipped with an RT probe, using the Topspin v3.2 software [37] for data acquisition, processing, and analysis. The STD NMR spectra were acquired using a double pulsed-field gradient spin-echo (DPFGSE) pulse sequence, which provides a better baseline and improved water suppression. The STD NMR spectra (on-resonance = 0 ppm and off-resonance = 40 ppm) for each ligand were acquired with a total saturation time of 2 s (a train of 40 selective Gaussian pulses of 49 ms each, separated by 1 ms intervals) at 50 dB with 4K scans, while the reference spectra were acquired with 2K scans. The STD spectrum, which reflects the transfer of saturation from the E protein to the respective ligand, was obtained by subtracting the on-resonance spectrum from the off-resonance spectrum through phase cycling.
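For orientation, STD build-up and binding isotherms are commonly analyzed in the STD NMR literature with a mono-exponential build-up, STD(t) = STD_max (1 - exp(-k_st t)), whose initial slope is STD_max k_st, followed by a hyperbolic isotherm, slope([L]) = B_max [L] / (K_D + [L]). The Python sketch below assumes those forms; it is not the authors' fitting procedure. B_max, the function names, and the synthetic concentrations are introduced here purely for illustration, and the isotherm is fitted through a Hanes-Woolf linearization only to keep the example dependency-free.

```python
import math
from typing import Sequence, Tuple


def std_buildup(t: float, std_max: float, k_st: float) -> float:
    """Assumed mono-exponential STD build-up at saturation time t."""
    return std_max * (1.0 - math.exp(-k_st * t))


def initial_slope(std_max: float, k_st: float) -> float:
    """Slope of the assumed build-up curve at t = 0."""
    return std_max * k_st


def isotherm(ligand: float, kd: float, b_max: float) -> float:
    """Assumed hyperbolic (one-site) dependence of the initial slope on [L]."""
    return b_max * ligand / (kd + ligand)


def fit_kd_hanes_woolf(conc: Sequence[float],
                       slopes: Sequence[float]) -> Tuple[float, float]:
    """Estimate (K_D, B_max) from [L]/slope = [L]/B_max + K_D/B_max."""
    if len(conc) != len(slopes) or len(conc) < 2:
        raise ValueError("need at least two matched (concentration, slope) pairs")
    x = list(conc)
    y = [c / s for c, s in zip(conc, slopes)]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    line_slope = sxy / sxx
    intercept = mean_y - line_slope * mean_x
    b_max = 1.0 / line_slope
    kd = intercept * b_max
    return kd, b_max


# Synthetic, noise-free data generated from known parameters (not study data).
TRUE_KD_UM = 150.0
TRUE_B_MAX = 12.0
ligand_um = [50.0, 100.0, 200.0, 400.0, 800.0]
std_slopes = [isotherm(c, TRUE_KD_UM, TRUE_B_MAX) for c in ligand_um]
kd_fit, b_max_fit = fit_kd_hanes_woolf(ligand_um, std_slopes)
```

With real, noisy STD data a direct nonlinear least-squares fit of the hyperbola is usually preferred, because the linearization distorts the error weighting at low ligand concentration.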
The competitive inhibition experiment was performed with a molar ratio of protein : ligand 1 : ligand 3 = 1:100:100. A series of one-dimensional STD spectra were acquired by stepwise

Where, STD_max is the maximal STD intensity, t is the saturation time, and k_st is the saturation rate constant. From Eq. (IV), the initial slope, which is known as the total STD value, can be obtained using Eq. (V): The initial slope of the STD total isotherm can be plotted as a function of ligand concentration to obtain the dissociation constant, K_D:

Comparison of E protein of NCOVID19 to SARS-CoV-1 revealed a very high sequence identity

A BLAST protein search against the Protein Data Bank revealed that the NCOVID19 E protein sequence shares 86.4% identity with SARS-CoV-1 E protein (PDB 5X29), its closest homolog structure. So far, only NMR structures have been elucidated, describing the monomer (wild-type protein [13]) and the pentameric arrangement (using Cys-to-Ala mutated constructs) of SARS-CoV-1 E protein. Despite the lack of structural information on the last 10 C-terminal residues, the pentameric arrangement of E protein provided the structural basis for its ion channel activity. Very recently, the NMR structure of the NCOVID19 E protein transmembrane (TM) domain (PDB 7K3G) in homopentameric assembly was also determined, highlighting the importance of TM residues in mediating ion channel activity [39]. However, the lack of structural details for the C-terminal self-aggregating helical domain and the PDZ domain (Figure 1) limits our understanding of these domains at the molecular or atomistic level. Therefore, in this study, we utilized the NMR structure (PDB 5X29) [22], with 86.4% sequence identity, as a template to build the 3D structure of NCOVID19 E protein (amino acids 1-65) using the Phyre 2.0 program [40].
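A sequence identity figure such as the one quoted above is, in essence, the fraction of aligned positions that carry the same residue in both sequences. The sketch below shows one way to compute it from a pre-computed pairwise alignment. The two sequences are made-up toy strings, not the E protein sequences, and the choice of denominator (all alignment columns, gaps included) is an assumption; tools differ on this convention.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences share a residue.

    Both inputs must come from the same pairwise alignment and therefore
    have equal length. Columns that are gaps in both rows are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == gap and b == gap)
    ]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


# Toy alignment (hypothetical sequences): one substitution and one gap.
toy_query = "MKTAYIAK-QR"
toy_template = "MKTVYIAKEQR"
identity = percent_identity(toy_query, toy_template)
```

Counting gap columns in the denominator penalizes insertions and deletions; dividing by the shorter sequence length instead would give a slightly higher value for the same alignment.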
Next, our Ramachandran plot results

As highlighted in the earlier section, the C-terminal helical bundle, the glycosylation motif, and the central channel residues (NTD) are nicely juxtaposed in the pentameric assembly to form a ligand-binding site. Residues such as C40, C44, V47, I48, L28, and L31 from the NTD of one subunit and the C-terminal residues P54, F56, S60, and R61 from another subunit form the ligand-binding site (Figure 3B). Three feature probes, comprising donor, acceptor, and hydrophobe site points, were used to enumerate the interaction hot spots (vector site points). Hierarchical clustering based on RMS distance within each of the three feature types provided the pharmacophore features targeting the above-mentioned residues (Figure 4A). Feature or site points further away from these residues were discarded from further model development. Finally, a four-feature model comprising a donor feature anchored to the carbonyl atoms of the C40 residue, an acceptor feature near the S60 hydroxyl atoms, and two hydrophobic features in the vicinity of L28, V47, P54, and Y57 (Figure 4B) was developed to screen the Enamine databases.

Saturation Transfer Difference (STD) NMR spectroscopy is a versatile and widely used tool [42] that not only identifies potential binders (high-micromolar to low-millimolar affinity) but also provides information on the ligand's binding epitope for the protein (here, NCOVID19 E protein). STD NMR experiments with the three ligands helped us to clearly delineate their divergent binding affinities for the NCOVID19 E protein. Of the 3 ligands purchased, only ligands 1 and 3 (Table 1) showed weak and moderate affinity, respectively. Relative to the reference spectrum, the STD spectrum of ligand 1 acquired with 4096 scans shows low-intensity peaks at 6.8-7.0 ppm and 7.8-8.0 ppm (data not shown). Increasing the number of STD scans to 8192 revealed conspicuous ligand peaks from the aromatic phenyl rings in the 6.8-7.0 ppm and 7.8-8.0 ppm regions.
This data suggests that the aromatic ring protons of ligand 1 were involved in its binding to the E protein (Figure 5A). Ligand 2, however, did not show any STD signal, indicating that it does not bind to the E protein (Supporting information Figure 2). In contrast, ligand 3, with a much smaller molecular mass than ligand 1, showed moderate binding for the protons at 7.0-7.7 ppm, signifying the involvement of its aromatic benzene rings in binding to the E protein (Figure 5B). These observations indicate the importance of aromatic interactions in mediating the association of both ligand 1 and ligand 3 with the E protein.

MHz and at a molar ratio of E protein:ligands=1:100.

Our pose selection criterion was based on PLP fitness scores as well as interactions with T32, S60, and R62 and hydrophobic interactions involving the L28, L31, V42, I46, L51, and P54 residues of the E protein, which led to the identification of three hits (Table 1 and Supporting information Figure 3).

Figure 3B ). Among the two aryl groups, the benzodioxine moiety (at one end of the molecule) was found to be engaged in hydrophobic interactions with the I46, L51, and P54 residues of the E protein (light yellow arrows). The other aryl group, i.e., the 1,3-dioxoisoindoline moiety (at the other end of the molecule), was engaged in H-bonding interactions with S60 and also took part in hydrophobic interactions with C44 and V47 (residues not shown). The 4-methylthiobutanoate fragment, on the other hand, maintained alkyl hydrophobic interactions with L31, L34, and C40 (Figure 6) of the E protein.

Figure 8A ). This data clearly suggested a common or comparable binding site in the E protein for the two ligands (Figure 8B). Nevertheless, the sharper signals from ligand 3 indicated a relatively stronger binding affinity, which further attenuates the weak binding of ligand 1 in this competition for the protein site.
The fact that the aromatic rings partake in the binding interactions with the E protein suggests that the difference in the structural orientation of the aromatic moieties could correlate directly with the divergent functional association of the two ligands (Figure 8B). This in turn points to the advantage of the benzene rings in the two ligands in mediating a plausible, hydrophobicity-driven stable interaction. This understanding further prompted us to focus our studies on the hydrophobic-rich segment of the E protein, which might contain the key residues that define the binding site. We next determined the binding affinity (K_D) of ligand 3 using five different molar ratios of the E protein/ligand 3 complex. Our binding affinity titrations (Figure 9A

To define any involvement of the hydrophobic stretch from the C-terminal region of the E protein, peptide fragment-based studies were performed with a 25-residue peptide spanning the C-terminal segment from N45 to R69 (NR25). The NR25 peptide was synthesized with selectively 15N-labeled Val and Leu residues to gain specific insight into any possible role of hydrophobicity-mediated interactions. Two-dimensional 1H-15N HSQC spectra as well as STD titrations were recorded for the NR25 peptide in the presence of either ligand 1 or ligand 3. However, the HSQC results for ligands 1 and 3 with the NR25 peptide did not reveal any chemical shift perturbation of the amide resonances of the Val and Leu residues (Supporting information Figure 4). On the other hand, and surprisingly, the STD NMR titration of ligand 3 with NR25 (N45-R69) showed the peaks of the ligand's aromatic moieties at 7.0-7.7 ppm in the STD spectrum (Figure 9C), in a manner similar to that observed with the E protein.
This crucial data further maps the C-terminal helical bundle as the binding epitope for the ligand and corroborates our hypothesis that the C-terminal helical bundle could play an important role in viral replication.

Our present study identified a micromolar-affinity hit molecule targeting NCOVID19 E protein by employing protein-based pharmacophore generation, database screening, docking fitness, and

Severe acute respiratory syndrome coronavirus as an agent of emerging and reemerging infection
Emerging coronaviruses: Genome structure, replication, and pathogenesis
Coronaviruses: an overview of their replication and pathogenesis
Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2): An Update
Mechanisms of coronavirus cell entry mediated by the viral spike protein
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2): An overview of viral structure and host response
Drug targets for corona virus: A systematic review
Severe acute respiratory syndrome coronavirus envelope protein ion channel activity promotes virus fitness and pathogenesis
A severe acute respiratory syndrome coronavirus that lacks the E gene is attenuated in vitro and in vivo
The small envelope protein E is not essential for murine coronavirus replication
Absence of E protein arrests transmissible gastroenteritis coronavirus maturation in the secretory pathway
Expression and purification of coronavirus envelope proteins using a modified β-barrel construct
The coronavirus E protein: assembly and beyond
Severe acute respiratory syndrome coronavirus envelope protein regulates cell stress response and apoptosis
Generation of a replication-competent, propagation-deficient virus vector based on the transmissible gastroenteritis coronavirus genome
SARS coronavirus E protein forms cation-selective ion channels
Expression and membrane integration of SARS-CoV E protein and its interaction with M protein
Structural insights of a self-assembling 9-residue peptide from the C-terminal tail of the SARS corona virus E-protein in DPC and SDS micelles: A combined high and low resolution spectroscopic study
Self-assembly of a nine-residue amyloid-forming peptide fragment of SARS corona virus E-protein: mechanism of self aggregation and amyloid-inhibition of hIAPP
Host-membrane interacting interface of the SARS coronavirus envelope protein: Immense functional potential of C-terminal domain
Structural model of the SARS coronavirus E channel in LMPG micelles
SARS-CoV envelope protein palmitoylation or nucleocapsid association is not required for promoting virus-like particle production
The SARS coronavirus E protein interacts with PALS1 and alters tight junction formation and epithelial morphogenesis
Role of Severe Acute Respiratory Syndrome Coronavirus Viroporins E, 3a, and 8a in Replication and Pathogenesis
Decoding SARS-CoV-2 Transmission and Evolution and Ramifications for COVID-19 Diagnosis, Vaccine, and Medicine
Enamine, 1 Distribution Way
Computer-Aided Prediction of Pharmacokinetic (ADMET) Properties, Dosage Form Design Parameters, 2018
The Phyre2 web portal for protein modeling, prediction and analysis
Discovery of a Novel Mycobacterial F-ATP Synthase Inhibitor and its Potency in Combination with Diarylquinolines
Small molecule Plasmodium FKBP35 inhibitor as a potential antimalaria agent
Revisiting de novo drug design: receptor based pharmacophore screening
Application of structure-based focusing to the estrogen receptor
ZINC--a free database of commercially available compounds for virtual screening
Development and validation of a genetic algorithm for flexible docking
Topspin v3.2, Bruker Biospin GmbH
Deciphering key features in protein structures with the new ENDscript server
Structure and drug binding of the SARS-CoV-2 envelope protein transmembrane domain in lipid bilayers
The Phyre2 web portal for protein modeling, prediction and analysis
ChEMBL_27 SARS-CoV-2 release
Applications of saturation transfer difference NMR in biological systems