key: cord-0891505-c1bmzd6o authors: Cao, Yipeng; Yang, Rui; Wang, Wei; Jiang, Shengpeng; Yang, Chengwen; Liu, Ningbo; Dai, Hongji; Lee, Imshik; Meng, Xiangfei; Yuan, Zhiyong title: Probing the Formation, Structure and Free Energy Relationships of M Protein Dimers of SARS-CoV-2 date: 2022-01-11 journal: Comput Struct Biotechnol J DOI: 10.1016/j.csbj.2022.01.007 sha: 35cae5694da6c2a6dd4c8a71e4d493dcd9929011 doc_id: 891505 cord_uid: c1bmzd6o The M protein of the novel coronavirus 2019 (SARS-CoV-2) is the major structural component of the viral envelope and is also the minimum requirement for virus particle budding. M proteins generally exist as dimers. In virus assembly, they are the main driving force for envelope formation through lateral interactions and interactions with other viral structural proteins that play a central role. We built 100 candidate models and finally analyzed the six most convincing structural features of the SARS-CoV-2 M protein dimer based on long-timescale molecular dynamics (MD) simulations, multiple free energy analyses (potential mean force (PMF) and molecular mechanics Poisson-Boltzmann surface area (MMPBSA)) and principal component analysis (PCA) to obtain the most reasonable structure. The dimer stability was found to depend on the Leu-Ile zipper motif and aromatic amino acids in the transmembrane domain (TMD). Furthermore, the C-terminal domain (CTD) effects were relatively small. These results highlight a model in which there is sufficient binding affinity between the TMDs of M proteins to form dimers through the residues at the interface of the three transmembrane helices (TMHs). This study aims to help find more effective inhibitors of SARS-CoV-2 M dimers and to develop vaccines based on structural information. Severe acute respiratory syndrome is caused by novel coronavirus 2019 (SARS-CoV-2). SARS-CoV-2 and its mutants have caused a large number of infections and deaths worldwide. At present, it is the most serious crisis in the global community [1] [2] , and there is a risk of reduced effectiveness or even ineffectiveness of vaccines in the face of novel mutations, such as the delta mutation, suggesting that we need to better understand the structural properties of SARS-CoV-2 to improve treatment [3] [4] [5] . In the last two years, great progress has been made in research on the SARS-CoV-2 virus particle structure: for example, Li et al. determined the high-resolution structure of whole virion particles with cryo-EM [6] . However, due to resolution limitations, the details of the properties of the structural proteins are still vague. Improvement in treatment and prevention strategies is clearly limited by the lack of structural information about viral proteins, and a deeper understanding of these structures can lead to the development of more effective anti-coronavirus drugs and vaccines [7] [8] . SARS-CoV-2 virus particles include four structural proteins-the spike (S), nucleocapsid (N), envelope (E), and membrane (M) proteins-that are important for the infectivity and pathogenicity of coronaviruses [9] [10] [11] . The S protein, which is prominent on the virus surface, contains immune recognition sites and is the most important protein for inducing an immune response [12] [13] . The N protein immobilizes the viral RNA in the capsid [14] . The E protein appears in smaller numbers in the virus, and its role is less understood. It may form a pentamer channel to regulate the ion concentration in the host cell. [15] The M protein is a viral membrane protein that can interact with the S, N and E proteins in the viral envelope to stabilize other structural proteins [16] [17] [18] . The shape and structural protein homeostasis of the viral envelope are mainly determined by the M protein, which is the most abundant structural protein in the coronavirus family and is thought to be important for the formation of viral membranes [19] [20] . The M protein consists mainly of the transmembrane domain (TMD), which is embedded in the viral envelope, and the C-terminal domain (CTD). The CTD of the M protein is localized inside the virion and plays an important role in the structural stability of the N protein [21] . The structure of the M protein is still unknown, mostly because the virus TMD complicates the identification of the M protein by cryo-EM [6] . M proteins interact with other structural proteins to form new virus particles in the endoplasmic reticulum-Golgi apparatus (ERGIC) [22] [23] . In previous studies, the M protein was found to form functional dimers in virus particles, and its abundance was found to affect the curvature of the viral envelope. The binding between the M protein dimer and the S protein is closely related to the virulence of the virus [24] . This suggests that the M protein dimer forms a matrix in the viral envelope and could be defined as a network of molecular interactions able to maintain the balance of the whole virion [17] . It has been shown that the M proteins of SARS-CoV-2 are more immunogenic in terms of T cell response than nonstructural viral proteins [25] , and directly targeting viral replication can target the M protein [26] . The M protein also antagonizes type I and III interferon production by targeting retinoic acid-inducible gene I/melanoma differentiationassociated protein 5 (RIG-I/MDA-5) signaling, enhancing viral replication and reducing the body's immunity to the virus [27] . No crystal structure has been obtained for the SARS-CoV-2 M protein that can demonstrate the dimer conformation. According to the previous cryo-EM structural protein models of SARS-COV produced by Neuman et al., M-M interactions may take various forms [19] . Yu et al used coarsegrained molecular dynamics simulations (CGMD) to simulate M-M interactions and thereby explain the role of M-M dimers in the membrane remodeling of the virus [28] , and Ouzounis et al. predicted low-resolution three-dimensional models of M proteins based on Orf-3a as a structural template [29] . Although there are few reports of M protein dimers, these studies still elucidate the dimer formation process through M-M interactions to some extent. In this study, we established a series of M protein dimer models using long timescale molecular dynamics simulations, free energy calculations and other methods to further study both qualitative and quantitative aspects of the dimerization of M proteins in a complex membrane environment. This study provides new insight into the molecular mechanisms of the structure of the M protein, its influence on other structural proteins and the morphology of the viral envelope. Because the initial model has a large effect on the accuracy of the simulation results of the M protein, we chose an M protein monomer model based on AlphaFold 2 and finely optimized it with the Feig laboratory. The model was highly rated in CASP14 in 2020 [30] [31] [32] . Subsequently, the HDOCK server was used to construct initial models of the M-M dimer [33] . HDOCK is a fast protein-protein docking server that integrates a variety of bioinformatics and structural biology methods to achieve robust and accurate models [34] . It has been successfully applied to the construction of many proteins complex models [35] [36] . The HDOCK server has established a total of 100 M protein dimer models. Previous research has shown that the formation of M protein dimers depends mainly on the surface interaction of the TMD; therefore, the initial models were selected based on the following criteria: 1. the M protein TMD has different interacting surfaces, requiring the dimer models to cover all possible TMD binding surfaces; 2. The TMDs are on the same plane, ensuring that the transmembrane region is always embedded in the membrane; 3. the CTD planes are positively parallel, oriented in the same direction; and 4. the favored region > 95% when MOLPROBITY software is used to evaluate the models [37] . We identified six models by filtering on the criteria above as well as using the HDOCK server's docking score provided by the server as the initial model. Subsequently, we used additional protein-protein docking server tools (CLUSPRO, ZDOCK and HADDOCK) [38] [39] [40] to build M protein dimer models for comparison with the HDOCK results. The construction of the simulation systems was based on our previous study [15] . The membrane composition was obtained from the reference Mandala et al. [41] and some related studies on lipid bilayer models in ERGIC-like environments and defined as a mixture of neutral phosphatidylcholine/phosphatidylethanolamine lipids (PC/PE), charged phosphatidylinositol and phosphatidylserine lipids (PI/PS), cholesterol (CHOL) and a small portion of sphingomyelin (PSM). The proportion of each part is shown in Table 1 . The software TMHMM [42] was used to determine the location of the M protein TMD part in the membrane. The Chemistry at HARvard Macromolecular Mechanics (CHARMM)-GUI server was used to build six simulation systems containing M protein dimers, pre-equilibrium membranes, water and ions, and the system size was ~ 17×17×10 nm 3 . The total atom counts were between 370,000 and 410,000 due to the different dimer topologies [43] . All the simulation systems had 0.15 M NACL added to simulate the physiological environment, and based on CHARMM36, all atomic force fields were applied [44] . First, the fastest descent algorithm was used to perform the energy minimization for 20-50 K steps. Then, six 5 ns equilibrium cycles were performed by gradually closing the position limits of lipid molecules, and the pre-equilibrium systems were optimized for a 20-50 ns NPT simulation. Finally, molecular dynamics simulations were performed for each of the six M protein dimer-membrane systems at 3000 ns. The molecular dynamics (MD) time step was set to 2 fs. Electrostatic interactions were described using the particle mesh Ewald (PME) algorithm with a cutoff of 1.0-1.2 nm. The LINear Constraint Solver (LINCS) algorithm was used to constrain the bonds. The pressure was maintained semi-isotopically at 1 bar in both the x-and y-directions using the Parrinello-Rahman barostat algorithm [45] , and the system temperature was maintained at 310 K using the Nose-Hoover thermostat. The initial systems used for umbrella sampling simulations were derived from the 3000 ns simulation of the M protein dimer mentioned above. The gmx trjconv protocol was used to keep the M protein dimer in the center of the system. One of the dimer chains was restricted, and the other chain was slowly pulled with 1000 kJ/mol energy using the method described by Lu et al [46] . After 5-10 ns, one chain was pulled to the edge of the system, and the distance between the two chains' TMD center of mass was approximately 5 nm. For the umbrella sampling simulation, from the center of mass to 3 nm, each successive window was 0.1 nm, and from 3 nm to 5 nm, the windows were 0.2 nm. Each system had approximately 30-40 windows in total. To maximize the convergence of the membrane protein systems, Domański et al.'s method [47] was applied. Each window was subjected to a long-term simulation for equilibration. A 50-70 ns simulation with a timestep of 3 fs and a force constant of 1,000 kJ/mol -1 nm -2 harmonic potential was used. The initial 20-30 ns were used for system equilibration, and the subsequent 30-50 ns were applied for analysis. For the CTD of the M protein, we chose the closest CTD conformations (dimer 1, see Figure 1 ) for the umbrella sampling simulation. The CTD did not have a TMD, and we removed the TMD and N-terminal domain (NTD) and the GROMACS 'gmx solvate' to insert the CTD part into a water box. We performed 60 ns enhanced sampling for each window. The initial 20 ns was used to equilibrate the system. All simulation parameters were consistent with those mentioned above. The reaction coordinate of all systems was the center of mass of the protein (or CTD). Finally, the potential mean forces (PMFs) were computed using the weighted histogram analysis method (WHAM), and the profile was generated using the 'gmx wham' protocol [48] . A bootstrap analysis (N = 50) was performed to evaluate the statistical error. The PMF balance value was normalized to the vicinity of the y-axis 0, and the energy results were all set to kcal/mol. The molecular mechanics Poisson-Boltzmann surface area (MMPBSA) method was applied to calculate the binding energy between M proteins in all dimers. For all systems, 200 snapshots of the equilibrated last 400 ns were used to calculate the binding free energies with an interval of 2 ns [49] . In all calculations, water, ions and lipids were removed, and the implicit calculation model was used. The binding free energy of the ligand-receptor (M-M proteins) complex could be calculated as: ΔG bind,solv = ΔG polar + ΔG non−polar (4) ΔG protein and ΔG ligand are defined as two chains of the M protein dimer. The binding energy was decomposed per residue to calculate each residue's contribution to the M protein dimer association. Principal component analysis (PCA) [50] was used as a part of the quasiharmonic analysis method to explore the structural and binding energy of six SARS-CoV-2 M protein dimers. The rotational and translational motions of the systems were eliminated by fitting the initial reference structure. The 5000 snapshots from the equilibrated 400 ns of all six M protein dimers were taken to generate the covariance matrix between the Cα atoms of the M protein. The GROMACS tools gmx Sham, Covar and Anaeig were used to perform PCA and obtain the approximate free energy landscape (aFEL). Two feature vectors (PC1 and PC2) generated from PCA were used for the 2d aFEL. The software Chimera [51] was used to visualize the protein structure. All of the simulations were performed using the GROMACS 2018 package [52] on the new-generation supercomputer in the National Supercomputer Center in Tianjin. The total simulation times were ∼70 µs. To build all possible M protein dimer models, we used the HDOCK online docking server of the Huang laboratory [33] [53] . The HDOCK server based on a hybrid algorithm of template-based modeling and the ab initio method can accurately predict the structure of protein-protein or protein-DNA docking according to extensive experiments [54] [55] . With the input of the M protein monomer predicted by the AlphaFold 2 AI algorithm and the optimization of M with the Feig laboratory (the highest prediction score on CASP14), the HDOCK server established a total of 100 possible M protein dimers. Through a filtering process (see the 'Methods' section for the filter criteria), six SARS-CoV-2 M protein dimer models were selected for subsequent simulation, as shown in Figure 1A . To further confirm the accuracy of these models, we used different servers for M protein docking. The results of ClusPro, ZDOCK and HADDOCK included 10, 30 and 10 different dimer models respectively. Using the filtering process mentioned in the method section, 3, 3, and 2 models were obtained for each server, respectively. By comparing the dimer models, we found that they were consistent with the six HDOCK models. There were only slight differences in the CTD (the results are presented in Supplementary Figure 1-3) , and the results indicate that all six dimers are representative. The M protein contains a TMD, and TMHMM was used to define the TMDs of the M protein as three helices, transmembrane helix (TMH)1 (W20-A40), TMH2 (K50-V70) and TMH3 (I80-F100). The CTD (L120-G202) was defined as well and is located within the envelope of the virus particle. Some evidence suggests that the CTD binds viral RNA, and its stability is important for the interaction of RNA with the M protein and M with the N protein [56] . Of the six M protein dimers, the associated formation of TMH1-TMH3 ( Figure 1B) is different. Dimer 1 and dimer 3-dimer 5 show a symmetrical dimerization structure, but dimer 2 shows a parallel dimerization structure. The three transmembrane helices of the M protein monomer interact with another monomer, and the binding positions of THM1-THM3 are different in each dimer. For the CTD, the two states -open (dimer 2, dimer 4-dimer 6) and closed (dimer 1 and dimer 3) -can be clearly observed in Figure 1B . To evaluate the stability of the M protein dimer model in a complex membrane environment, six 3000 ns (3 μs) atom molecular dynamics simulations for each system were performed. RMSD of which is shown in Figure 2 . We evaluated the stability of the TMD and CTD models by analyzing the root mean square deviation (RMSD), as shown in Figure The association energy of the SARS-CoV-2 M protein dimer with different structural characteristics can be characterized by the potential of mean force (PMF), which describes the change in free energy between two M protein monomers along a particular reaction coordinate and is derived from the probability distribution along this coordinate. We analyzed the free energy distribution of M protein dimerization in a complex membrane system by using the umbrella sampling method and calculated the PMF for the lateral interaction of M protein monomers in six conformations. A steered MD simulation was performed in which a force was applied to pull the M protein monomer away from its binding site on the dimer. This generated a lateral 1D reaction coordinate perpendicular to the M protein surface, ranging from the bound to unbound state. The reaction coordinate Z was defined as the distance between the center of mass of the M protein TMD binding site. The PMF difference of the six dimers determines the dimerization capacity. According to the PMF profile, dimer 1 has the largest binding energy, at ~20 kcal/mol, when it is dimerized, while the PMFs of dimers 4, 3, 6, 5 and 2 gradually decrease, and the PMFs range from 18 to 10 kcal/mol. The maximum energy difference ΔG is ~10 kcal/mol. To evaluate whether the TMD or CTD of the M protein has different influences on the formation of the dimer, the PMF of dimer 1 CTD was also calculated. When the system has only CTD, the maximum PMF is only ~3 kcal/mol. These observations indicate that the TMDs of the M protein are more affected by dimerization than CTDs. figure 5B ). For dimer 1, dimer 2, dimer 4 and dimer 5, the higher binding affinity is concentrated in TMH2 and TMH3. For dimer 6, the major contributors to M protein association are TMH1 and TMH2. For dimer 4, the binding energy is equally distributed between TMH1-TMH3. The MMPBSA results clarify that the TMD maintains the dimers through the TMH interaction in the simulation process, but the binding energy of decomposition to each dimer is different, suggesting that the TMHs have an important influence on the stability of M protein dimers with different conformations. Neuman et al. indicated that the tightness of the M protein dimer could affect the curvature of the viral membrane [19] . Our results show that the TMD of the M protein dimer being in close contact would be better for maintaining the stability of the viral membrane. However, when the TMD is loosely bound, the dimer is more likely to disturb the membrane. The transition of the dimer between these conformations changes the shape of the virus. In summary, the TMD of the M protein is critical for maintaining the stability of the virus structure and morphology, as well as its virulence and infectious ability. In summary, TMDs are critical to viral morphology, virulence, and infectious ability. To identify the dominant motions in the TMD of the M protein dimer, PCA was performed on the 400 ns MD simulation trajectory of each system after equilibrium to obtain the decomposition diagram defined by two feature vectors PC1 and PC2. Then, the feature vector is converted to an aFEL, which displays the variance in the TMD conformational motion, as shown in Figure 6 . The blue to red bars represent the free energy from the lowest to the highest. The results show that all the aFEL dimers separate into four free energy gradients, 0, 0.8, 1.6 and 3.5 kcal/mol. Obviously, dimer 1 and dimer 3's aFELs show a lower metastable state, and the lowest free energies are -4.963 and -4.940 kcal/mol. Dimer 2, dimer 4-dimer 6 have smaller regions of blue and purple than dimer 1 and dimer 3, demonstrating that there are large motions in the TMD. Overall, compared with the PMF and MMPBSA, these plots show similar characteristics demonstrating the involvement of these TMDs in the dynamic stability of the M protein dimers. COVID-19 is an unprecedented threat to humans, and currently, the vaccines available are effective against nonmutant SARS-CoV-2 [58] . However, the vaccines are less effective against mutant variants, especially the delta mutation. Researchers studying the pathogenesis and infection routes of COVID-19 have mainly focused on the S protein, but information on other structural proteins is still lacking. A more detailed understanding of the role of structural proteins in viruses is very important for revealing how the virus enters a cell, replicates and is released. As the most abundant structural protein in a coronavirus, the M protein is the main coordinator of virus assembly. The M protein is a transmembrane protein that also has a major function as a dimer in interactions with S, E and N proteins. The M protein includes two main domains, the TMD and CTD. The TMD binds stably to the virus membrane as well as the transmembrane region of the S and E proteins. There is evidence that the CTD of the M protein, which is located in the interior of virus particles, immobilizes viral RNA by N and M protein interactions [59] [60] . Therefore, the SARS-CoV-2 M protein is important for maintaining virus morphology, stabilizing the position of other structural proteins and the size of the coronavirus. In this study, we thoroughly investigated the SARS-CoV-2 M protein dimer structure binding free energy and structural characteristics. By using the optimized AlphaFold-2-built M protein model, the HDOCK server modeled six candidate dimers, and the CLUSPRO, ZDOCK and HADDOCK servers were used to build a total of 50 models to verify the binding coverage of the dimers. The conformations of the M protein dimer can be divided into CTD open states (dimer 2 and dimers 4-6) and CTD closed states (dimer 1 and dimer 3) . The difference in the TMD in the dimer is mainly due to the difference in the transmembrane interaction surfaces (Figure 2 and Supplementary Figure 6 ). After long timescale MD simulations, we find that all six systems are stable and do not dissociate in water or the membrane region, but the dimer conformation shows different steady-state dynamic behaviors. The RMSD values of the TMD and CTD suggest the critical influence of the initial structure on the overall structural stability, especially for the open state of the CTD of the M protein. The RMSD shows higher fluctuation, but the closed state does not appear. This is due to the binding affinity of the residues in CTD being very low. Notably, dimer 4 shows a "semi-open" state, and the CTD shows structural stability between dimer 1 and dimer 2. Compared with dimer 1, the TMDs of dimer 4 interaction surfaces are different. We analyzed the MD trajectory of dimer 4, which does not show a CTD closed state similar to that of dimer 1. The RMSF values (Figure 3) show that the three transmembrane helices, TMH1-TMH3, of the six dimers have high stability, with a fluctuation range of 0.1-0.2 nm, and the average RMSF is 0.12 nm. The CTD shows a large RMSF, especially for four spikes (red arrows in Figure 3 ), and the value exceeds 0.4 nm. The average RMSF of the CTD is 0.18 nm, 0.06 nm higher than that of the TMD. Supplementary Figure 4 shows the positions of the amino acids for the four spikes in the M protein CTD, which are all located at the hairpin of the B-sheet, indicating that the B-sheet affects the overall stability of the CTD. Combined with the RMSD and RMSF, although the overall stability of the M protein depends on both the CTD and TMH, the TMD plays a key role in the stability of the M protein dimer. Regarding the TMD interactions of different M protein dimers, the PMF was calculated. The PMF is a monotonically decreasing function of the degree of separation; it describes the change in free energy between two M protein monomers along a particular reaction coordinate and is derived from the probability distribution along this coordinate. Starting from the critical separation distance of ~2.5 nm A, the M protein binding energy landscape forms a steep downslope funnel depending on the M protein center-of-mass distance. In particular, each M protein dimer has a different PMF profile. Dimer 1 needs more free energy for disassociation, approximately 20 kcal/mol, while dimer 2 needs less, approximately 10 kcal/mol, as shown in Figure 4 . The PMF result of dimer 4 is the closest to that of dimer 1, which corresponds to the structural characteristics between the two dimers. We also built a system to calculate the association energy of the CTD of dimer 1 (the center of mass of dimer 1's CTD is the closest). Under the same MD simulation conditions, the CTD disassociation free energy is only ~3 kcal/mol. Although the separate CTD simulation system cannot fully reflect the role of the M protein CTD in dimerization, some simulations suggest that the CTD contributes very little (<10%) to M protein binding [61] [62] [63] . The PMF results further demonstrate that the difference in the binding affinity of the six dimers is greater for the three THM binding surfaces than the CTD of the M protein. Notably, the structural characteristics of dimer 3 are significantly different from those of other dimers. The M protein monomers contact in parallel, and the dynamic behavior is very stable. The disassociation free energy of the dimer is 16 kcal/mol, indicating that it is also a high probability M protein dimer conformation. We suspect that in some cases, the M protein monomer can form a head-tail contact oligomer structure through a centered dimer 3-like model; if the virus particles need to be recombinant or changed in structure, the M protein oligomer can dissociate into dimers and interact with N, E, S, proteins or the membrane. Our hypothesis can also explain the results of some previous experiments [23] [20] . Since the current umbrella sampling method cannot achieve complete convergence for membrane proteins, it has some influence on the accuracy of the PMF results. The MMPBSA method was used to quantitatively calculate the binding energy of each residue of the M protein dimer during the stability period of MD simulation. especially the contact residues in the TMH1-TMH3 surfaces. Figure 5B shows that the TMHs of the dimer do not contribute the same binding energy. The energy contributions of dimer 1, dimer 2, dimer 4 and dimer 5 are mainly from TMH2 and TMH3. Those of Dimer 6 are from TMH1 and TMH2, and those of dimer 3 are from all three TMHs (each residue binding energy is shown in Supplementary Figure 5 and Table 2 ). The differences in the binding free energies of the six dimer TMDs suggest the stability of the M protein dimers in the complex membrane environment, and the MMPBSA results are in agreement with the PMF results, indicating that the PMF value is convergent. According to the MMPBSA results for each residue, the large numbers of leucine, isoleucine, phenylalanine and tryptophan residues in areas including TMH1-TMH3 suggest that aromatic amino acids and Leu-Ile zippers may play an important role in maintaining dimer stability, which further confirms previous studies based on structural characteristics [64] [65]. The morphology and mechanism of SARS-CoV-2 M protein dimers may be varied. Although the topological stability and binding free energies of the six dimers differ, they do not directly lead to dissociation of the dimers, suggesting that the dimers may coexist in various forms (see Figure 7) . We propose the following reasons: (1) By changing the binding site, the different dimer conformations can change the curvature of the viral envelope and control its shape to adapt to different environments. (2) The TMD of the dimer can bind with the S and E proteins, which requires changing the M protein dimer conformation to expose its binding sites to interact with other structural proteins. (3) When the CTD of the M protein interacts with the N protein and viral RNA, the stability of N and RNA in the virus is not compromised because of the specific morphology of the M protein dimer. (4) When the M dimer is not necessary, M exists as an oligomer with lower binding energy that can easily and quickly dissociate when the environment changes to form the dimer. So far, the virus is still mutating. The amino acid sequences of the M protein of the two common mutations showed that only 1-2 amino acids were mutated compared with the original strain, indicating that the M protein is highly conserved (the amino acid sequence multiple alignments results are shown in supplementary figure 7 ). This conservation may be closely related to the function of the M protein in the virus, such as maintaining the viral envelope morphology, other structural proteins and genetic material stability. These factors provide SARS-CoV-2 with high adaptability to the human environment, which will provide valuable clues for future research on SARS-COV-2 M protein and its mutations. Although the crystal structure of the novel coronavirus 2019 M protein has not yet been determined, the M protein monomer could still be built based on AlphaFold 2, and the dimers were obtained by the HDOCK server. The dimers were characterized, and their structural properties were estimated by all-atom MD simulations in the complex membrane composition. Long timescale MD simulation and free energy results reveal the most stable state of the M protein dimer. The structural differences in M protein dimers are mainly from their binding surface, and the free energy results suggest that the TMD plays an important role in the stability of dimers. The important amino acids for dimers were concentrated in the Leu-Ile zipper and benzene ring amino acids in THM1 to THM3. In addition, this study suggests that the flexible part of the CTD of the M protein is located within the virus envelope but that it has little effect on the overall structure of the dimer. We suggest that the TMD of the M protein dimer may maintain the stability of other structural proteins (S, N, and E) on the viral envelope through similar mechanisms, but the interaction mechanisms need more study. In conclusion, our study revealed the mechanisms of M protein dimerization at the molecular level. In the future, relevant strategies should be used to destroy the assembly of the M protein to reduce the replication ability of the novel coronavirus and to even kill the virus. This study provides novel insights into important structural features of interface residues for the advancement of effective therapeutic strategies to target the role of M proteins in the viral life cycle. Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China CoV-2: a storm is raging SARS-CoV-2 variants, spike mutations and immune escape Neutralization of SARS-CoV-2 variants B. 1.429 and B. 1.351 Reduced sensitivity of SARS-CoV-2 variant Delta to antibody neutralization Molecular architecture of the SARS-CoV-2 virus Two-component spike nanoparticle vaccine protects macaques from SARS-CoV-2 infection Single-component, self-assembling, protein nanoparticles presenting the receptor binding domain and stabilized spike as SARS-CoV-2 vaccine candidates Structural insights into SARS coronavirus proteins Structural basis of receptor recognition by SARS-CoV-2 Structural characterization of SARS-CoV-2: Where we are, and where we need to be The spike protein of SARS-CoV-a target for vaccine and therapeutic development Structural and functional basis of SARS-CoV-2 entry by using human ACE2 Phosphoregulation of phase separation by the SARS-CoV-2 N protein suggests a biophysical basis for its dual functions Computational Study of the Ion and Water Permeation and Transport Mechanisms of the SARS-CoV-2 Pentameric E Protein Channel The M protein of SARS-CoV: basic structural and immunological properties Characterization of protein-protein interactions between the nucleocapsid protein and membrane protein of the SARS coronavirus In silico identification of Tretinoin as a SARS-CoV-2 envelope (E) protein ion channel inhibitor A structural analysis of M protein in coronavirus assembly and morphology Assembly of the coronavirus envelope: homotypic interactions between the M proteins Structural insights into the mechanism of RNA recognition by the N-terminal RNA-binding domain of the SARS-CoV-2 nucleocapsid phosphoprotein Understanding SARS-CoV-2 budding through molecular dynamics simulations of M and E protein complexes The C-terminal domain of the MERS coronavirus M protein contains a trans-Golgi network localization signal Mapping of the coronavirus membrane protein domains involved in interaction with the spike protein SARS-CoV-2-specific T cells are rapidly expanded for therapeutic use and target conserved regions of the membrane protein The SARS-Coronavirus Membrane protein induces apoptosis through modulating the Akt survival pathway Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) membrane (M) protein inhibits type I and III interferon production by targeting RIG-I/MDA-5 signaling A multiscale coarse-grained model of the SARS-CoV-2 virion A recent origin of Orf3a from M protein across the coronavirus lineage arising by sharp divergence It will change everything": DeepMind's AI makes gigantic leap in solving protein structures Physics-Based Protein Structure Refinement in the Era of Artificial Intelligence High-accuracy protein structure prediction in CASP14 The HDOCK server for integrated protein-protein docking HDOCK: a web server for protein-protein and protein-DNA/RNA docking based on a hybrid strategy Challenges and opportunities of automated protein-protein docking: HDOCK server vs human predictions in CAPRI Rounds Computational structure modeling for diverse categories of macromolecular interactions MolProbity: More and better reference data for improved all-atom structure validation The HADDOCK web server for data-driven biomolecular docking ZDOCK server: interactive docking prediction of protein-protein complexes and symmetric multimers The ClusPro web server for protein-protein docking Structure and drug binding of the SARS-CoV-2 envelope protein transmembrane domain in lipid bilayers Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes CHARMM-GUI: a web-based graphical user interface for CHARMM CHARMM General Force Field (CGenFF): A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields Lee Holian B. Hoover NPT dynamics for systems varying in shape and size Steered molecular dynamics simulations of force-induced protein domain unfolding Convergence and sampling in determining free energy landscapes for membrane protein association The weighted histogram analysis method for free-energy calculations on biomolecules. I. The method In silico screening identifies a novel potential PARP1 inhibitor targeting synthetic lethality in cancer treatment Principal component analysis and long time protein dynamics Structure visualization for researchers, educators, and developers High performance molecular simulations through multi-level parallelism from laptops to supercomputers Comparative hostcoronavirus protein interaction networks reveal pan-viral disease mechanisms Identification of small molecule inhibitors of the deubiquitinating activity of the SARS-CoV-2 papain-like protease: in silico molecular docking studies and in vitro enzymatic activity assay The Card1 nuclease provides defence during type III CRISPR immunity Structure of the SARS coronavirus nucleocapsid protein RNA-binding dimerization domain suggests a mechanism for helical packaging of viral RNA Self-assembling study of sarcolipin and its mutants in multiple molecular dynamic simulations Transmission event of SARS-CoV-2 Delta variant reveals multiple vaccine breakthrough infections Efficient assembly and release of SARS coronavirus-like particles by a heterologous expression system Evidence that TMPRSS2 activates the severe acute respiratory syndrome coronavirus spike protein for membrane fusion and reduces viral control by the humoral immune response HERG channel and cancer: a mechanistic review of carcinogenic processes and therapeutic potential Association of peripheral membrane proteins with membranes: Free energy of binding of GRP1 PH domain with phosphatidylinositol phosphatecontaining model bilayers Conformational dynamics and free energy of BHRF1 binding to Bim BH3 Identifying SARS-CoV membrane protein amino acid residues linked to virus-like particle assembly The SARS-coronavirus infection cycle: a survey of viral membrane proteins, their functional interactions and pathogenesis The SARS-CoV-2 nucleocapsid phosphoprotein forms mutually exclusive condensates with RNA and the membrane-associated M protein Evolutionary dynamics of SARS-CoV-2 nucleocapsid protein and its consequences Authorship contribution statement Rui Yang: Conceptualization, Methodology, Data curation This study was supported by the National Natural Science Foundation of China (NSFC-31900894) and the China Postdoctoral Science Foundation (2021M700975). We thank the National Supercomputer Center in Tianjin (NSCC-TJ) for the long-timescale MD simulations performed on the New Generation Exascale Tianhe Supercomputer.