key: cord-0982386-x5isgtsu
authors: Khan, Muhammad Tahir; Zeb, Muhammad Tariq; Ahsan, Hina; Ahmed, Abrar; Ali, Arif; Akhtar, Khalid; Malik, Shaukat Iqbal; Cui, Zhilei; Ali, Sajid; Khan, Anwar Sheed; Ahmad, Manzoor; Wei, Dong-Qing; Irfan, Muhammad
title: SARS-CoV-2 nucleocapsid and Nsp3 binding: an in silico study
date: 2020-08-04
journal: Arch Microbiol
DOI: 10.1007/s00203-020-01998-6
sha: 7d07c39627a2cfd51ed8ef398f1663d78cc2750d
doc_id: 982386
cord_uid: x5isgtsu

SARS-CoV-2 belongs to the single-stranded positive-sense RNA virus family (Anand et al. 2005; Chang et al. 2014). This virus family has a large genome that encodes four structural proteins, small envelope (E), matrix (M), nucleocapsid phosphoprotein (N), and spike (S), and 16 nonstructural proteins (nsp1-16) that together ensure replication of the virus in the host cell. The nonstructural proteins, mostly associated with RNA replication, carry out the enzymatic functions required for viral replication (Anand et al. 2005; Zhang et al. 2005; Ahmed et al. 2020). The genome of SARS-CoV-2 also encodes nsp7, nsp8, and nsp12, which together form the RNA-dependent RNA polymerase complex; nsp10, nsp13, nsp14, and nsp16, which form the RNA capping machinery; and the proteases nsp3 (PLpro) and nsp5 (3CLpro), which impede innate immunity and are also essential for cleaving viral polyproteins (Snijder et al. 2016). Among the viral proteins, the nucleocapsid phosphoprotein (N) is an essential component that links the viral genome to the viral membrane. The N-terminal RNA-binding domain (N-NTD) captures the RNA genome, while the C-terminal domain anchors the ribonucleoprotein complex to the viral membrane via its interaction with the M protein. The four structural proteins, together with the viral positive-strand (+) RNA genome and the envelope, constitute the complete virion. The nucleocapsid phosphoprotein consists of an N-terminal (NTD) and a C-terminal (CTD) domain.
Both of these domains have RNA-binding affinity, while the CTD also binds the M protein, establishing the physical linkage between the envelope and the + RNA. The SARS N proteins also play regulatory roles in the viral life cycle through the host intracellular machinery (Chang et al. 2014). A more recent study describes the structure of the N protein as a right hand-like fold composed of a β-sheet core with an extended central loop. The core region adopts a five-stranded U-shaped right-handed antiparallel β-sheet platform with the topology β4-β2-β3-β1-β5, flanked by two short α-helices. A prominent feature of the structure is a large extended loop between β2 and β3 that forms a long basic β-hairpin (β2′ and β3′) (Dinesh et al. 2020). We docked the crystal structure of RNA (Sheng et al. 2014) and nucleocapsid protein using HADDOCK (Dominguez et al. 2003). The RNA duplex was found to be bound between the basic finger and the palm of the N-NTD, with highly positive arginine residues (R92, R107, R149) directly contacting the RNA. The model predicts several hydrophobic interactions with the side chains of residues I94 and L104 that contribute to RNA binding. The surrounding residues A50, T57, H59, R92, I94, S105, R107, R149, and Y172 may also form interactions with RNA. However, no experiment or long-term molecular dynamics (MD) simulation was performed to confirm what kind of interactions exist.

The nucleocapsid N-terminal domain (N-NTD) interacts with the genomic RNA of SARS-CoV-2 and incorporates it into progeny particles (Tang et al. 2005; Cong et al. 2020; Kang et al. 2020). The replication-transcription complexes (RTCs) are the sites where CoV RNA is synthesized. N proteins are also present at RTC sites and aid viral replication by interacting with nonstructural protein 3 (Nsp3), a component of the RTCs. These two aspects of the CoV life cycle, however, have not been linked. The N protein was found to bind exclusively to nsp3 and to no other RTC component.
The questions that arise are what kind of interactions occur and how these interactions would be affected by mutations. To address these questions, we docked the N protein with Nsp3 using the HADDOCK server to identify the residues involved in Nsp3-N complex formation for future inhibitor design. This study will provide a better understanding for rapid drug design to control the global epidemic of SARS-CoV-2.

The PDB is a database of biomolecular structures, providing crystal structures for proteomics and in silico studies. Only structures relevant to the current study were chosen. The crystal structure of the nucleocapsid N-terminal domain (N-NTD) was retrieved from the Protein Data Bank (PDB ID: 6yi3). However, for a better understanding of the interactions, the full structure of the N protein (QHD43423) was downloaded from I-TASSER (Zheng et al. 2019) and that of Nsp3 (6w6y) was downloaded from the Protein Data Bank (Berman 2008). The structures were subjected to the Protein Preparation Wizard in the Molecular Operating Environment (MOE) (Vilar et al. 2008) and also prepared in Chimera (Pettersen et al. 2004). The missing hydrogens were added, and partial charges were assigned. The protein structures were energy minimized using default values with the Amber 10 force field in MOE. Selenomethionines were changed into methionines. The prepared structures were saved in PDB format for docking.

To gain insight into the interactions, the N protein and Nsp3 were docked using the HADDOCK server (de Vries et al. 2010; van Zundert et al. 2016). The server supports nucleic acids and small molecules, and can make use of experimental data. HADDOCK is widely used for protein-protein and protein-DNA/RNA complexes. The server allows conformational changes of the side chains and also of the backbone of the biomolecules during complex formation. In addition, the server directly supports the docking of PDB and NMR data that contain multiple structural models.
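The selenomethionine-to-methionine step mentioned in the structure preparation can be illustrated with a minimal Python sketch that rewrites MSE records as MET directly in PDB text. This is not the procedure run in MOE or Chimera; the function names `pdb_atom_line` and `mse_to_met` and the toy coordinates are hypothetical, and the only thing assumed is the standard fixed-column layout of PDB coordinate records.

```python
def pdb_atom_line(record, serial, name, resname, chain, resseq, x, y, z, element):
    """Format one fixed-width PDB coordinate record (hypothetical demo helper)."""
    return (
        f"{record:<6}{serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}{'':4}"
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{20.0:6.2f}{'':10}{element:>2}"
    )


def mse_to_met(pdb_lines):
    """Return a copy of the PDB lines with selenomethionine rewritten as methionine."""
    converted = []
    for line in pdb_lines:
        is_coordinate = line.startswith(("ATOM  ", "HETATM"))
        if is_coordinate and line[17:20] == "MSE":
            # Standard residues use ATOM records and the residue name MET.
            line = "ATOM  " + line[6:17] + "MET" + line[20:]
            if line[12:16].strip() == "SE":
                # Selenium becomes the methionine sulfur: atom name SD, element S.
                line = line[:12] + " SD " + line[16:]
                if len(line) >= 78:
                    line = line[:76] + " S" + line[78:]
        converted.append(line)
    return converted


# Toy records with made-up coordinates, for illustration only.
example = [
    pdb_atom_line("HETATM", 1, " CA", "MSE", "A", 1, 1.0, 2.0, 3.0, "C"),
    pdb_atom_line("HETATM", 2, "SE", "MSE", "A", 1, 2.0, 3.0, 4.0, "SE"),
    pdb_atom_line("HETATM", 3, " O", "HOH", "A", 2, 5.0, 5.0, 5.0, "O"),
]
for converted_line in mse_to_met(example):
    print(converted_line)
```

Only the record type, residue name, atom name, and element columns change; coordinates are left untouched, so any geometric adjustment of the sulfur position would still be left to the subsequent energy minimization.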
Default parameters were used for docking except that correlation was set to shape and electrostatics. Intermolecular bonds were analyzed using UCSF Chimera (Pettersen et al. 2004), MOE, and PyMOL (Lill and Danielson 2011).

The crystal structures of N-NTD (PDB ID 6M3M) and of a 21-base-pair dsRNA (PDB ID 2LK2; Liu et al. 2008) were extracted from the PDB and docked with the fully optimized 6yi3 using the HEX docking server (Macindoe et al. 2010). The HEX docking method considers the electrostatic potential and is based on a rigid-body docking algorithm (Macindoe et al. 2010). The default parameters were used for docking except that correlation was set to shape and electrostatics. Intermolecular bonds were analyzed using UCSF Chimera (Pettersen et al. 2004) and PyMOL (Lill and Danielson 2011).

The top clusters based on HADDOCK score are shown in Table 1. Among the seven clusters, cluster 1, containing 39 structures, was selected. However, the top-ranked structure based on HADDOCK score, van der Waals, and electrostatic energies was selected for the current investigation. In the majority of protein-protein dockings, the combination of van der Waals and electrostatic energies gives accurate results as the scoring function (Mandell et al. 2001). Van der Waals (dispersion) forces contribute to protein interactions with surfaces or other biomolecules. A total of 116 structures were clustered by the HADDOCK server into 8 groups, representing 57% of the water-refined models. The statistics of the most reliable clusters are given in Table 1. The Z-score (the more negative the better) specifies how many standard deviations from the average a cluster lies in terms of its score. Van der Waals and electrostatic interactions are important when assessing protein-protein docking, where more negative van der Waals (vdW) and electrostatic energies indicate well-docked molecules.
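The Z-score described above can be sketched in a few lines of Python. The cluster names and scores below are made-up placeholders, not the values from Table 1, and the use of the population standard deviation is an assumption, since the text does not state which estimator the server applies.

```python
from statistics import mean, pstdev


def cluster_z_scores(scores):
    """Map each cluster name to (score - mean) / standard deviation.

    `scores` maps cluster names to docking scores; a more negative
    Z-score marks a cluster that scores better than the average.
    """
    values = list(scores.values())
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("all clusters have the same score")
    return {name: (score - mu) / sigma for name, score in scores.items()}


# Hypothetical scores for illustration only.
toy_scores = {"cluster_1": -120.0, "cluster_2": -100.0, "cluster_3": -80.0}
z = cluster_z_scores(toy_scores)
best = min(z, key=z.get)
print(best, round(z[best], 2))
```

Ranking by Z-score rather than by raw score makes it easier to judge how far the top cluster stands out from the other solutions of the same run.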
The strength of binding between the two proteins depends on the number of residues present at the interface and on the corresponding interface area. Large interfaces are associated with high binding energy (sum of vdW, H-bonds, and electrostatics) (Roth et al. 1996; Nilofer et al. 2017). Docking results were examined with PyMOL, MOE, and Chimera (DeLano 2002; Pettersen et al. 2004; Vilar et al. 2008); for the interactions between the N protein and Nsp3, the closest interacting residues were labeled for better understanding. In the current investigation, we found that residues of the N-CTD might be involved in interacting with Nsp3 (Fig. 1a), a member of the RTCs for SARS-CoV RNA processing. The interacting residues of Nsp3, V207, N208, S209, F210, S211, G212, Y213, L214, K215, L216, T217, and D218, might be important targets for inhibitors to prevent viral RNA synthesis mediated by N protein and Nsp3 interactions (Fig. 1a-c). The surface representation of the Nsp3-N complex indicated that the contacts are stable enough to serve in tethering the genome at a very early stage of infection.

Protein-protein interactions are important in every life activity, leading to the execution of numerous elementary roles inside the cell (Koegl and Uetz 2007; Wang et al. 2019). The CoV N protein plays a vital role in virions through interactions with the positive-strand RNA, the M protein, and other nsps. Previous studies report that the N-terminal end of Nsp3 (amino acids 1 to 233) may be central for interactions with the N protein (Hurst et al. 2010, 2013), which induces RTC-facilitated CoV RNA replication. However, the specific residues that interact with RTCs for viral RNA synthesis have not been highlighted.
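One common way to list interface residues in a docked complex is a simple distance cutoff between atoms of the two partners. The Python sketch below illustrates that idea only; it is not the analysis performed in PyMOL, MOE, or Chimera for this study. The 4.0 Å cutoff, the function name `interface_residues`, and the residue labels and coordinates are all hypothetical.

```python
from math import dist


def interface_residues(atoms_a, atoms_b, cutoff=4.0):
    """Return sorted (residue_a, residue_b) pairs with any atoms within `cutoff`.

    Each atom is given as (residue_label, (x, y, z)), with coordinates in
    angstroms. The cutoff value is an assumption chosen for illustration.
    """
    contacts = set()
    for res_a, xyz_a in atoms_a:
        for res_b, xyz_b in atoms_b:
            if dist(xyz_a, xyz_b) <= cutoff:
                contacts.add((res_a, res_b))
    return sorted(contacts)


# Toy atoms with made-up coordinates; labels are placeholders, not real residues.
partner_a = [("A:R10", (0.0, 0.0, 0.0)), ("A:S11", (10.0, 0.0, 0.0))]
partner_b = [("B:D20", (3.0, 0.0, 0.0)), ("B:L21", (20.0, 0.0, 0.0))]

pairs = interface_residues(partner_a, partner_b)
print(pairs)
```

The all-against-all loop is quadratic in the number of atoms, which is acceptable for a single complex; a spatial grid or k-d tree would be the usual choice for larger systems.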
Residues S188, S190, R191, N192, R195, S197, T198, P199, G200, S201, K237, G238, Q239, Q241, G243, Q244, T245, V246, T247, K248, F314, P309, S310, A311, S312, and A313 of Nsp3 were detected interacting with the N protein (S183, S184, R185, S186, S187, S188, R189, S190, R191, S193, S194, R195, and N196). Previous studies also confirmed that the N-terminal part of nsp3, from residues 1 to 233, is essential for interaction with the N protein (Hurst et al. 2010, 2013). In a more recent study (Cong et al. 2020), it was shown that interactions with Nsp3 are mediated by N1a-N1b (amino acids 1 to 194) and to a minor extent by N2a (amino acids 195 to 257) (Fig. 2). Although there are some other important targets (Khan et al. 2020b), designing potential inhibitors against these interacting sites of both N and Nsp3 might be useful to block SARS-CoV-2 replication in host cells.

The N protein plays a key role in linking the viral + RNA to the membrane. It has two domains: the N-terminal RNA-binding domain (N-NTD), which binds the RNA, and the C-terminal domain (CTD), which, after interaction with the M protein, is involved in anchoring the ribonucleoprotein to the viral membrane. A more recent study also reported that amino acid residues A50, T57, H59, R92, I94, S105, R107, R149, and Y172 are important in the establishment of interactions with SARS-CoV-2 RNA (Dinesh et al. 2020) (Fig. 3). For more effective inhibitors, conserved residues may be identified for better management of CoV infections. Residues S105 and R107 are conserved among the N-NTDs of the coronaviruses compared (SARS-CoV-2, SARS-CoV, MERS-CoV, and HCoV-OC43) (Kang et al. 2020). Inhibitors may be designed to block the ssRNA-binding site of the N-NTD. Binding of drugs at protein interfaces is mostly controlled by a few specific residues that contribute disproportionately to the Gibbs free energy of binding and to the dynamics of proteins (Massova and Kollman 1999; Weiss et al. 2000; Arkin and Wells 2004; Zhao and Chmielewski 2005; Moreira et al. 2007; Wells and McClendon 2007; Boukharta et al. 2014; Ibarra et al. 2019; Khan et al. 2020a). A previous study reports the importance of the N-Nsp3 interaction for SARS-CoV-2 replication. A wide range of supporting evidence confirmed the interaction between N proteins and the Nsp3 of the replicase-transcriptase complex (Hurst et al. 2013). These findings suggest that peptide inhibitors may be designed to prevent the interaction of Nsp3 and N proteins, which is required for viral replication and propagation.

In conclusion, the N protein seems to be the most potent drug target in SARS-CoV-2. Its N-NTD interacts with the viral RNA, while the N-CTD interacts with the Nsp3 of RTCs during RNA synthesis. Domains N1b, N2a, and N2b are all potential targets. Residues involved in the interactions may be potential targets of drug and peptide inhibitors. Drug development and screening against the interacting residues may be useful for better management of SARS-CoV-2 infections.

[Figure caption, beginning missing] ... domains mediate the binding to Nsp3. The N1b is required for RNA binding during replication, in which N2b interacts with M proteins.

Fig. 3 Crystal structure of nucleocapsid (PDB ID: 6YI3) and suggested interaction site residues with RNA. The interaction site was predicted in a more recent study. (a) Residues predicted to interact with RNA. Positively charged cleft between the basic finger and the palm creating a putative RNA-binding site in the hinge/junction region between the palm and basic finger. (b) RNA-binding site
Preliminary identification of potential vaccine targets for the COVID-19 coronavirus (SARS-CoV-2) based on SARS-CoV immunological studies
Coronavirus main proteinase: target for antiviral drug therapy
Small-molecule inhibitors of protein-protein interactions: progressing towards the dream
The protein data bank: a historical perspective
Computational prediction of alanine scanning and ligand binding energetics in G-protein coupled receptors
The SARS coronavirus nucleocapsid protein: forms and functions
Nucleocapsid protein recruitment to replication-transcription complexes plays a crucial role in coronaviral life cycle
The HADDOCK web server for data-driven biomolecular docking
Structural basis of RNA recognition by the SARS-CoV-2 nucleocapsid phosphoprotein
HADDOCK: a protein-protein docking approach based on biochemical or biophysical information
Characterization of a critical interaction between the coronavirus nucleocapsid protein and nonstructural protein 3 of the viral replicase-transcriptase complex
An interaction between the nucleocapsid protein and a component of the replicase-transcriptase complex is crucial for the infectivity of coronavirus genomic RNA
Predicting and experimentally validating hot-spot residues at protein-protein interfaces
Crystal structure of SARS-CoV-2 nucleocapsid protein RNA binding domain reveals potential unique drug targeting sites
Phylogenetic analysis and structural perspectives of RNA-dependent RNA-polymerase inhibition from SARs-CoV-2 with natural products
Marine natural compounds as potents inhibitors against the main protease of SARS-CoV-2. A molecular dynamic study
Improving yeast two-hybrid screening systems
Computer-aided drug design platform using PyMOL
Pymol: An open-source molecular graphics tool
Structural basis of toll-like receptor 3 signaling with double-stranded RNA
HexServer: an FFT-based protein docking server powered by graphics processors
Protein docking using continuum electrostatics and geometric fit
Computational alanine scanning to probe protein-protein interactions: a novel approach to evaluate binding free energies
Computational alanine scanning mutagenesis: an improved methodological approach
Protein-protein interfaces are vdW dominant with selective H-bonds and (or) electrostatics towards broad functional specificity
UCSF Chimera: a visualization system for exploratory research and analysis
Van der Waals interactions involving proteins
Crystal structure studies of RNA duplexes containing s(2)U:A and s(2)U:U base pairs
The nonstructural proteins directing coronavirus RNA synthesis and processing
Biochemical and immunological studies of nucleocapsid proteins of severe acute respiratory syndrome and 229E human coronaviruses
The HADDOCK2.2 web server: user-friendly integrative modeling of biomolecular complexes
Medicinal chemistry and the molecular operating environment (MOE): application of QSAR and molecular docking to drug discovery
A high efficient biological language model for predicting protein-protein interactions
Rapid mapping of protein functional epitopes by combinatorial alanine scanning
Reaching for high-hanging fruit in drug discovery at protein-protein interfaces
A molecular docking model of SARS-CoV S1 protein in complex with its receptor
Inhibiting protein-protein interactions using designed molecules
Deep-learning contact-map guided protein structure prediction in CASP13