key: cord-0883877-6kvxh7td authors: Tanaka, Shigenori; Watanabe, Chiduru; Honma, Teruki; Fukuzawa, Kaori; Ohishi, Kazue; Maruyama, Tadashi title: Identification of correlated inter-residue interactions in protein complex based on the fragment molecular orbital method date: 2020-07-09 journal: J Mol Graph Model DOI: 10.1016/j.jmgm.2020.107650 sha: 3213795421a67caa1cdf2bf398b98b5acafc326c doc_id: 883877 cord_uid: 6kvxh7td
A theoretical scheme to systematically describe correlated (network-like) interactions between molecular fragments is proposed within the framework of the fragment molecular orbital (FMO) method. The method is mathematically based on the singular value decomposition (SVD) of the inter-fragment interaction energy (IFIE) matrix obtained by the FMO calculation, and can be applied to a comprehensive description of protein-protein interactions in the context of molecular recognition. In the present study we apply the proposed method to a complex of measles virus hemagglutinin and the human SLAM receptor, and find it useful for efficiently eliciting the correlated interactions among the amino-acid residues of the two proteins. Additionally, collective interaction networks formed by amino-acid residues that are important in mutation experiments can be clearly visualized.
Since its proposal in 1999 [1, 2], the fragment molecular orbital (FMO) method [3] [4] [5] has provided a powerful and useful tool to perform ab initio electronic-state calculations for biomolecular and other related systems. One of the most advantageous features of the FMO method for biomolecular analyses is its ability to evaluate "effective interactions" between "fragments", which are usually chosen to be an amino-acid residue or a (small) ligand molecule as a unit component of, for instance, a protein-ligand complex system.
This inter-fragment interaction is referred to as IFIE (Inter-Fragment Interaction Energy) or PIE (Pair Interaction Energy) in the literature [3] [4] [5] [6], and plays a vital role in, e.g., docking analysis by specifying the important interactions in the target system. In fact, the FMO-IFIE analysis has been extensively applied to investigations of the mechanisms of molecular recognition associated with protein-protein [7] [8] [9] [10] [11], protein-nucleic acid [12] [13] [14], protein-drug [15] [16] [17] [18] [19], and other [20] [21] [22] [23] [24] [25] systems.
In earlier studies, we applied the FMO-IFIE method to the analyses of intermolecular interactions of influenza virus hemagglutinin (HA) protein in complex with sialosaccharide receptors [23, 25] and with the Fab fragment of an antibody [7, 8] for the investigation of mutation effects. In the former analysis [25], in particular, we found an interesting phenomenon, referred to as the indirect effect, in which mutations of HA residues that do not strongly interact with the receptor significantly change the binding affinity of the complex, while the interactions between some unmutated residues in HA and the receptor often vary substantially due to mutations at other residues. This unexpected effect suggests the presence of correlated (network-like) inter-fragment interactions, whose detailed mechanism remains to be elucidated.
In biomolecular complex systems, the inter-fragment interactions inherently involve multiple fragments. Although the electron-correlated FMO-IFIE itself refers to an effective, renormalized interaction between single fragments in which some many-body effects are included, the total interactions in the complex should be described as a whole in terms of the set of all the IFIEs. In earlier investigations on protein-protein interaction (PPI) [10, 11], the network structure of IFIE (or PIE) was revealed in terms of the concept of Protein Residue Network (PRN).
Concerning this issue of describing the correlated interactions due to multiple fragments, we have recently found the technique of singular value decomposition (SVD) to be useful. In our previous study of protein-ligand systems [26], we applied the SVD to the calculated IFIE matrix (amino-acid residues × various ligand compounds) to elicit the essential interactions and consequently improve the correlation between FMO results and experimental ligand (small compound) binding affinities. Through this method, we obtained an improved correlation with experimental results by extracting the important singular vectors that play essential roles in ligand binding. In the present study we extend this SVD methodology to the description of protein-protein interaction (PPI) in order to comprehensively identify the correlated interactions among residues. By means of the SVD, which enables a data compression similar to the principal component analysis (see Sec. 2.3), the network structure of IFIEs is systematically extracted.
We here employ a complex system of measles virus hemagglutinin (MVH) and the human SLAM (signaling lymphocyte activation molecule) receptor as an example for the PPI analysis. Measles virus (MV) causes an acute and highly devastating contagious disease in humans. In a previous study [27], employing the crystal structures of three human receptors, SLAM, CD46, and Nectin-4, in complex with the measles virus hemagglutinin (MVH or HA), we computationally elucidated the details of the binding energies between the amino-acid residues of HA and those of the receptors by means of the ab initio FMO method. The calculated IFIEs revealed a number of significantly interacting amino-acid residues of HA that played essential roles in binding to the receptors. As predicted from previously reported experiments, some important amino-acid residues of HA were shown to be common to the three receptors, while others were specific to the interaction with an individual receptor.
Further, we carried out FMO calculations for in silico experiments of amino-acid mutations, finding reasonable agreement with virological experiments concerning the substitution effect of residues. Thus, our study demonstrated that the electron-correlated FMO method is a powerful tool to exhaustively search for amino-acid residues that contribute to interactions with receptor molecules. It is known that SLAM is the most important receptor for wild-type MV, because it is responsible for invasion and propagation, and also for pathogenesis in the infected animals [28]. Here, employing the IFIE matrix composed of the HA residues and the SLAM residues as the column and row elements, respectively, we assess the usefulness of the SVD analysis to comprehensively describe the PPIs. It is noted that we employ the result of the FMO calculation in vacuo because the primary purpose of the present work is to propose a novel method and assess its validity, while the incorporation of the solvent effect is actually feasible in an explicit or implicit way [29] [30] [31].
In the following section, we first illustrate the theoretical framework to obtain the correlated inter-fragment interactions in the FMO scheme. Test calculations employing the protein complex MVH-SLAM are carried out in Sec. 3, and their implications are discussed. Concluding remarks are given in Sec. 4.
The FMO method [1] [2] [3] [4] [5] is a computational method that divides large molecules such as proteins into relatively small units called fragments, and then calculates the energy and the electron density of the whole molecule quantum-chemically by molecular orbital (MO) calculations of fragment monomers and fragment dimers (sometimes, trimers and tetramers are also considered [5]). By using this method, we can apply an ab initio MO method that has been shown to succeed for small compounds to macromolecules such as proteins without a significant loss in accuracy.
When a large molecule is divided into $N_f$ fragments and the total electronic energies of a fragment $I$ and a fragment pair $IJ$ are denoted by $E_I$ and $E_{IJ}$, respectively, the total electronic energy of the molecule can be approximated (FMO2 approximation) as [3] [4] [5]:
$$E = \sum_{I>J}^{N_f} E_{IJ} - (N_f - 2) \sum_{I}^{N_f} E_I . \qquad (1)$$
If $\Delta P_{IJ}$ is the difference matrix between the electron densities of the monomers ($P_I$, $P_J$) and the dimer ($P_{IJ}$), Eq. (1) can be transformed into the following equation:
$$E = \sum_{I} E'_I + \sum_{I>J} \left( E'_{IJ} - E'_I - E'_J \right) + \sum_{I>J} \mathrm{Tr}\left( \Delta P_{IJ} V_{IJ} \right), \qquad (2)$$
with $E'_I = E_I - \mathrm{Tr}(P_I V_I)$ and $E'_{IJ} = E_{IJ} - \mathrm{Tr}(P_{IJ} V_{IJ})$, where $V_I$ and $V_{IJ}$ are the electrostatic potentials that fragment $I$ and fragment pair $IJ$ receive from surrounding fragments, respectively. We thus find
$$E = \sum_{I} E'_I + \sum_{I>J} \Delta E_{IJ}, \qquad \Delta E_{IJ} = \left( E'_{IJ} - E'_I - E'_J \right) + \mathrm{Tr}\left( \Delta P_{IJ} V_{IJ} \right), \qquad (3)$$
where $\Delta E_{IJ}$ can be interpreted as the interaction energy between the fragment pair of $I$ and $J$. This $\Delta E_{IJ}$ is referred to as the inter-fragment interaction energy (IFIE) [3] [4] [5]. In the FMO method, the interaction between amino-acid residues in a protein complex can be identified as an IFIE. Then, the total or partial summation of IFIEs (namely, the IFIE-sum) is an index representing the strength of the binding between a specific residue and a set of other residues. This notion of IFIE-sum has played a significant role in the FMO analysis of biomolecular recognition [5]. However, in the present study, we propose an alternative approach to systematically extracting the collective interactions among clustered residues, as detailed below.
To analyze the molecular interactions between MVH and human SLAM, we retrieved the crystal structure of the complex (Fig. 1) from the Protein Data Bank (PDB entry: 3ALZ) for the FMO calculation. In this structure, the MVH-SLAM complex was composed of an HA monomer (Chain A) and one SLAM molecule (Chain B). In the present study, molecular interactions were analyzed [27] by the FMO method with the electron-correlated MP2/6-31G* scheme using the software ABINIT-MP [5]. The preparation of the complex structure used for The summation of all the IFIEs between residue pairs of MVH and the SLAM receptor gives the binding energy [5].
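The IFIE-sum and binding-energy summations described above can be illustrated with a minimal sketch. The matrix values, its size, and the variable names below are hypothetical toy data chosen for illustration, not results of the FMO calculation; rows are assumed to be SLAM residues and columns HA residues.

```python
import numpy as np

# Hypothetical toy IFIE block in kcal/mol (NOT data from the paper).
# Assumed layout: rows = SLAM residues, columns = HA residues.
ifie = np.array([
    [-1.5,   0.2, -30.0],
    [ 0.0, -12.5,   4.0],
])

# IFIE-sum of each HA residue against all SLAM residues (column sums).
ha_ifie_sum = ifie.sum(axis=0)

# IFIE-sum of each SLAM residue against all HA residues (row sums).
slam_ifie_sum = ifie.sum(axis=1)

# Summing every inter-protein residue pair gives the binding energy.
binding_energy = ifie.sum()

print("HA IFIE-sums:  ", ha_ifie_sum)
print("SLAM IFIE-sums:", slam_ifie_sum)
print("Binding energy:", binding_energy)
```

Such sums rank residues one at a time, which is why they cannot by themselves reveal which groups of residues act together.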
An m × n matrix S, regarded as a set of m-dimensional column vectors and n-dimensional row vectors, can be related to an m × n diagonal matrix Σ that satisfies the following equation:
$$U^{T} S V = \Sigma .$$
Here, U is an m × m orthogonal matrix, and V is an n × n orthogonal matrix. If U and V are chosen appropriately, a pair of matrices can be made with Σ satisfying the condition described below. When it is rewritten, the following equation is obtained:
$$S = U \Sigma V^{T} .$$
This type of decomposition is called Singular Value Decomposition (SVD) [33] [34] [35] [36]. For simplicity of description, we assume m ≥ n. Otherwise, we can consider the transposed matrix $S^{T}$ of S. If $\sigma_{ij}$ is an element of Σ, then $\sigma_{ij} = 0$ for $i \neq j$, while for $i = j$ the element $\sigma_{ii}$ is called a singular value of S; a column vector of U is a left singular vector, and a row of $V^{T}$ is a right singular vector [33] [34] [35] [36]. When the original matrix S is m × n, the size of each matrix is the
In this way, through the application of SVD to the IFIE matrix S for a protein complex, we can extract the correlated inter-residue interactions specified as singular components. Through the FMO calculation for MVH-SLAM complex [27] ,
As seen in Figs. 2-4, we can detect D505, D507, D530 and R533 on the HA side as important residues, which corresponds well to experimental findings [37] [38] [39]. On the other hand, we observe the important residues E75, K77 and E123 on the SLAM side, as was remarked in the previous study [27]. A novel feature of the SVD methodology proposed in the present work is that it explicitly elicits the correlated or cooperative IFIEs shared among multiple residue fragments. For example, as seen in Fig. 4 as an alternative to the earlier attempts based on other viewpoints such as the 3D Scattered Pair Interaction Energies (3D-SPIEs) [9] and the PIE-PRN [10, 11].
In some virological mutagenesis experiments, the importance of the H536 residue in MVH for protein binding to SLAM was remarked [37] [38] [39].
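The decomposition of an IFIE matrix into singular components, as described above, can be sketched with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the random toy matrix, the function names, and in particular the definition of the contribution fraction (squared singular value over the sum of squares) are assumptions, since the text here does not give the exact definition used for Table 1.

```python
import numpy as np


def svd_components(ifie):
    """Decompose an IFIE matrix into rank-1 singular components.

    Returns U, the singular values s (descending), V^T, and a list of
    matrices s[k] * outer(U[:, k], V^T[k, :]) that sum back to `ifie`.
    """
    u, s, vt = np.linalg.svd(ifie, full_matrices=False)
    components = [s[k] * np.outer(u[:, k], vt[k, :]) for k in range(len(s))]
    return u, s, vt, components


def contribution_fractions(s):
    """Assumed definition: share of each squared singular value.

    The second return value is the running (accumulated) total.
    """
    weights = s ** 2
    fractions = weights / weights.sum()
    return fractions, np.cumsum(fractions)


# Hypothetical toy matrix (m = 6 rows, n = 4 columns, so m >= n);
# random numbers standing in for IFIEs, NOT data from the paper.
rng = np.random.default_rng(0)
toy_ifie = rng.normal(scale=5.0, size=(6, 4))

u, s, vt, components = svd_components(toy_ifie)
fractions, accumulated = contribution_fractions(s)

# Summing all rank-1 components recovers the original matrix.
reconstructed = sum(components)

# Residues with large entries in the leading singular vectors are the
# ones that participate in the dominant correlated interaction pattern.
leading_row_weights = np.abs(u[:, 0])
leading_col_weights = np.abs(vt[0, :])

print("singular values:", s)
print("accumulated fractions:", accumulated)
```

Truncating the sum over `components` after the first few terms gives the compressed, low-rank description of the interaction network that the SVD analysis relies on.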
It is well known that FMO calculations performed in vacuo give IFIE values that overestimate the electrostatic interactions at long distances [5]. Although we may take explicit or implicit account of solvation effects [29] [30] [31] to overcome this difficulty, we here resort to a post-FMO processing scheme called Statistically Corrected IFIE (SCIFIE) [40]
Table 1
In this study, we have proposed a novel methodology to efficiently obtain and visualize the correlated inter-fragment interactions in the FMO calculations for a protein complex. With the aid of the SVD technique for efficient data compression, the IFIE matrix composed of the residue components from each protein can be decomposed into the contributions from individual singular-value components that comprehensively describe the cooperative interaction network among amino-acid residues. We have applied this method to the analysis of molecular recognition between the measles virus HA and its SLAM receptor to assess its feasibility and usefulness. Collective interaction patterns formed by some important residues, which could not be described in earlier FMO-IFIE studies based on individual fragment interactions, were thus identified in agreement with experimental observations. In addition, the significance of the protonation state of His536 and of the electrostatic screening effect was addressed in the present study, while the explicit or implicit inclusion of the solvation effects would be an important issue [29] [30] [31] in actual pharmaceutical applications. As a future development, the incorporation of the pair interaction energy decomposition analysis (PIEDA) [41] would be of interest for discriminating the contributions from various kinds of molecular interactions.
The proposed method would also be applicable to the analysis of molecular recognition associated with the new coronavirus SARS-CoV-2, e.g., for a complex between the spike protein and the angiotensin-converting enzyme 2 (ACE2) [42, 43].
Their contribution fractions and the accumulated occupancies are also shown.
There is no conflict of interest among authors.
Modern Methods for Theoretical Physical Chemistry of Biopolymers
The Fragment Molecular Orbital Method: Practical Applications to Large Molecular Systems
A new method to describe correlated molecular-fragment interactions is proposed.
Singular value decomposition is applied to the IFIE matrix for protein complex.
Correlated inter-residue interactions in the complex are extracted systematically.
Complex of measles virus hemagglutinin and SLAM receptor is analyzed.
Figures S1-S4 are given with their legends in the Supplementary data.