key: cord-0771129-e8hiz4r7 authors: Santo, Kolattukudy P.; Neimark, Alexander V. title: Adsorption of Pulmonary and Exogeneous Surfactants on SARS-CoV-2 Spike Protein date: 2022-05-04 journal: bioRxiv DOI: 10.1101/2022.05.04.490631 sha: b9ec5e418731a24eb5339d73dfd430237cd5569c doc_id: 771129 cord_uid: e8hiz4r7 COVID-19 is transmitted by inhaling SARS-CoV-2 virions, which are enveloped by a lipid bilayer decorated by a “crown” of Spike protein protrusions. In the respiratory tract, virions interact with surfactant films composed of phospholipids and cholesterol that coat lung airways. Here, we explore by using coarse-grained molecular dynamics simulations the physico-chemical mechanisms of surfactant adsorption on Spike proteins. With examples of zwitterionic dipalmitoyl phosphatidyl choline, cholesterol, and anionic sodium dodecyl sulphate, we show that surfactants form micellar aggregates that selectively adhere to the specific regions of S1 domain of the Spike protein that are responsible for binding with ACE2 receptors and virus transmission into the cells. We find high cholesterol adsorption and preferential affinity of anionic surfactants to Arginine and Lysine residues within S1 receptor binding motif. These findings have important implications for informing the search for extraneous therapeutic surfactants for curing and preventing COVID-19 by SARS-CoV-2 and its variants. The COVID-19 pandemic, that affected over 2 years more than half-a-billion people and claimed more than six million deaths around the world, has triggered broad research activities aiming at preventing and curing coronavirus disease. COVID-19 is transmitted by inhaling airborne SARS-CoV-2 virus particles, which infect the respiratory system and cause severe acute respiratory syndrome (SARS) that can lead to lung failure. Whereas our knowledge of the biochemical structure and functions of SARS-CoV-2 is quickly growing, the physico-chemical aspects of virus interactions with the respiratory system environment have been sparsely addressed and are poorly understood. Bridging this knowledge gap is important for informing clinical studies on surfactant therapies by administering respiratory and exogenous surfactants that can hinder virus proliferation and penetrations into the cells. The search for suitable therapeutic surfactants is actively expanding. [1] [2] [3] [4] Here, we explore by using coarse-grained molecular dynamics (CGMD) simulations the mechanisms of preferential adsorption of typical pulmonary and exogenous surfactants on SARS-CoV-2 virions that may lead to virus inactivation. SARS-CoV-2 virions, which contain and protect the virus RNA, are spheroidal nanoparticles, with a lipid bilayer envelope of about 85 nm diameter decorated by a "crown" of 20 nm long protrusions of Spike protein. Being inhaled, SARS-CoV-2 virions travel through lung airways and interact with respiratory surfactant monolayer films coating alveoli that composed of phospholipids, cholesterol, and surfactant proteins, known as the lung surfactant (LS). The virion attachment to cells occurs by binding of S1 domain of Spike protein to angiotensin converting enzyme 2 (ACE2) receptor on the type II pneumocytes of the alveoli. [5] [6] The LS film regulates surface tension at the alveolar interfaces to help breathing and prevent lung collapse during expansion-contraction cycles. 7 LS film is also the first line of defense against inhaled airborne particles. 3 Surfactant-induced inactivation of viruses has been a traditional topic of clinical studies before the COVID-19 pandemic. 4, [8] [9] [10] [11] [12] [13] Surfactants, that have been found effective in inactivating various types viruses and in restoring the lung damage, include soap 10 , sodium dodecyl sulfate (SDS), 11 phospholipids, 12 sodium laureth sulphate, 13 and potassium oleate 9 (see review by Simon et al 4 ). The virus inactivation may occur (1) by surfactant penetration into viral envelope lipid membrane leading to its lysis and (2) surfactant adsorption on viral proteins that plays crucial role in the infection process and the stability of the virus. 4 While viral lysis was observed at high surfactant concentrations that can be cytotoxic, functional inactivation of viral proteins by surfactant adsorption was found at low surfactant concentrations suitable for therapeutic administration. 4 Despite recent great progress in molecular simulations of coronaviruses, including simulations of systems containing several to hundreds of millions of atoms, [14] [15] [16] no attempts have been made to study the interfacial interactions of SARS-CoV-2 virions with pulmonary and exogenous surfactants that is the focus of this work. Our main goal is to mimic in-silico surfactant adsorption on the S1 domain that is responsible for ACE2 binding. Spike protein, a trimer of the 1273-residue S-protein, 17 has two functional domains, S1 (1-685) and S2 (686-1273). S1 binds to ACE2 with its receptor binding domain (RBD 319-546), while S2 assists in fusion of the virion and host cell membranes. The N-terminal domain (NTD 14-305) of S1 binds to sugar molecules. 18 The main functional region in RBD is the receptor binding motif (RBM 438-506). 19 We perform CGMD MARTINI 20-21 simulations of surfactant adsorption on the S1 domain. Three typical surfactants are considered: zwitterionic phospholipid dipalmitoyl phosphatidyl choline (DPPC) and cholesterol (CHOL), that are main components of LS films, and anionic sodium dodecyl sulphate, that is considered as a potential exogenous surfactant with a therapeutic effect. 22 The MARTINI model of S1 domain is constructed utilizing the Spike protein structure predicted by I-TASSER 23-24 ( Figure S1a ). Simulations are performed at the physiological conditions (310 K, 1 bar). The S1 domain is initially placed in the center of the 18 nm cubic simulation box filled with water and surfactant molecules dispersed around the protein at random positions. To demonstrate the difference in surfactant -protein interactions, we explore 7 surfactant compositions: 3 single component solutions and 4 binary mixtures containing DPPC and CHOL at 4:1 and 1:1 ratios, DPPC and SDS at 1:1 ratio, and CHOL and SDS at 1:1 ratio. The total number of surfactant molecules varies between 10-50. 4 replicas of each system are simulated for good statistics (in total -136 simulations). The simulations are run for 1 and the statistics is averaged over the last 200 ns and over the replicas. Details of the CG models schematically shown in Figure 1 and CGMD simulations are provided in the Supporting Information (SI) Section 1. We find that the differences in sorption of various surfactants on S1 domain are determined by an interplay of surfactant-surfactant and surfactant-protein interactions. The former interactions cause surfactant aggregation (micellization). In the single component surfactant systems ( Figure 1a ), DPPC shows a high tendency to aggregate into large clusters that adhere to the protein, which results in fewer direct lipid-protein contacts. In comparison, cholesterol exhibits a much lower aggregation tendency owing to its wedge-like shape that hinders micellization and, consequently, enhances direct lipid-protein interactions. SDS forms pronounced micellar aggregates, similarly to DPPC. Being a smaller molecule with a simpler structure, SDS adsorbs to the protein more effectively compared to DPPC (see below). All three surfactants show strong affinity to specific S1 functional subdomains, NTD, RBD and RBM. Preferential adsorption of surfactants on the amino acid residues along the S1 residue sequence is quantified by the probability, ( ), of the n-th amino acid residue along the sequence ( = 1, . . , 685) to be in direct contact with adsorbed surfactant. ( ) is calculated Figure 1 . a) Representative snapshots of the surfactant-adsorbed S1 sub-domains of Spike protein: DPPC (left), CHOL (middle) and SDS (right). S1 colors: NTD -purple, RBM -red, RBD -ice blue, other S1 regions -gray. The CG models of surfactants are shown in left upper corners. b) The surfactant adhesion probability of residues along the S1 sequence. The highlighted (light-green) regions correspond to high adsorption probabilities. c) The S1 atomistic structure with residues colored according to the surfactant adsorption probability. as the fraction of time that given S1 residue is within 0.62 nm distance from a surfactant molecule (SI, section 2). Note that here we deal with reversible physical adsorption of surfactants. ( ) is averaged over the data from 4 replicate simulations with 5 different surfactant concentrations. Averaging over a total of 20 simulations for each surfactant provides reliable statistics for determining the regions of the protein that are preferential for surfactant adsorption. As depicted in Figure 1b -c, ( ) exhibits similar patterns for all the three surfactants, which identify the blocks of residues,1-14, 100-150, 170-250, 320-506 and 530-600, as having high affinity to surfactants, in contrast to blocks 14-100, and 250-320, where surfactant adsorption is less probable (see detailed analysis in Figure S3 -S4). Figure 1c demonstrates the distribution of S1 residues with respect of their affinity to surfactant adsorption. Note that the RBD and RBM sub-domains represent preferential regions for adhesion of surfactant aggregates. The difference in adsorption of the considered surfactants is further demonstrated with the overall surfactant adsorption probability averaged across the whole S1 domain (SI section 2) at a given molar surfactant concentration, that predicts distinctively a higher adsorption probability of cholesterol compared to DPPC and SDS, (DPPC), indicating distinctively higher cholesterol adsorption. Note that SDS has a higher affinity to S1 compared to DPPC, despite lower adsorption probabilities (Figure 1b-c and 2a) . SDS is a much smaller molecule and thus make less direct contacts with residues compared to DPPC. The fractions of different surfactant fragments (heads and tails) or a bead type that are in contact with the protein, provides a more detailed characterization of the adsorbed surfactants ( Figure 3 ). For instance, 4 represents the ratio of number of PO4 beads in direct contact with the protein residues to the total number of PO4 beads, and so on. In one-component surfactant systems (Figure 3b) , reveals: (CHOL) > (SDS) > (DPPC), which is a much-pronounced trend compared to . Adsorption of DPPC occurs preferentially by hydrophobic tail and glycerol backbone. There is a preference for adsorption of the anionic PO4 over the cationic choline that has the least affinity to the protein. In SDS, the anionic SO3 group has a like affinity to the protein as other uncharged fragments. In general, a clear preference for adsorption of anionic fragments is observed in all systems considered. For cholesterol, the highest affinity to the protein is found for the hydrophilic ROH fragment followed by the aromatic R1-R5 and the aliphatic tail C1/C2 fragments. Note that the observed difference in surfactant adsorption correlates with the affinity of the surfactant head groups. This is explained in part by the fact that in surfactant aggregates adhering to the protein, direct contacts of the tail groups are hindered, while the head groups are located closer to the protein. In mixtures, surfactant adsorption is affected by the interactions between different components. In DPPC-CHOL mixtures (Figure 3c-d) , cholesterol adsorption is reduced, while DPPC adhesion is slightly enhanced compared to the respective single-component solutions. This is explained by CHOL association with DPPC that hinders the CHOL-protein contacts and reduces the aggregate size, enhancing DPPC-protein adsorption. In DPPC-SDS systems, SDS adsorption is suppressed (Figure 3e ) due to the electrostatic attraction between 3 − and choline groups that favors DPPC-SDS aggregation. Adsorption in SDS-CHOL systems seem to be unaffected by the inter-surfactant interactions (Figure 3f ), although the 3 − adsorption appears to be somewhat enhanced. The surfactant affinity to different protein residues is analyzed by calculating the fraction of residues of given type occupied by adsorbed surfactant molecules, ( ), where =A,G, …., E, the amino acid residues (SI, section SIII); i.e., ( ) = 0.2 means that 20% of the total Alanine residues are occupied. The distribution of different amino acids in S1 domain is given in Figure 4a , and the respective distributions of ( ) in Figure 4b . Interestingly, the distributions ( ) exhibit the same patterns as shown in Figure 4b independent of the surfactant concentration ( Figure S5 ). Among the hydrophobic residues, which are present in large quantities, ILE, LEU, PHE, PRO and VAL have a higher affinity to surfactants compared to ALA and GLY. Among the polar residues, Cystine and Tyrosine are distinctively preferred by surfactants while abundant ASN, SER and THR are not liked by surfactants. Cationic residues, Arginine and Lysine, are preferred over anionic residues ASP and GLU by all surfactants considered, not just by zwitterionic DPPC and anionic SDS, but also by nonionic cholesterol. These ionic residues, a total of 113, present in similar amounts evenly distributed along the chain, gives the S1 domain an overall cationic character. The observed ability of surfactants to be adsorbed on cationic residues presents a special interest, as the overall cationic character of S1 domain with a net charge of +5 is the driving force for the Spike protein binding to highly negatively charged ACE2 protein. 25 Understanding the surfactant affinities of various amino acid residues shown in Figure 4 helps to predict the behavior of the CoV-2 variants. S1 domain is strongly cationic with ARG and LYS being preferential sites for surfactant adsorption. However, the roles of ARG and LYS might be not just electrostatic but also residue-specific with additional surfactant interactions with the side chains, which are attractive to non-ionic and zwitterionic surfactants. Noteworthy, the key mutations in the S1 domain of the highly infectious Delta 26 and, especially, Omicron 27 variants (Table S2 ) increase the total number of Arginine and Lysine. The S1 mutations in Delta increase the numbers of ARG by 2 and LYS by 1 and reduce number of GLU by 1, while in the S1 domain of Omicron, ARG is increased by 2 and LYS by 3. The increase in cationic residues may be one of the factors of the high infectivity of CoV-2 variants. In this respect, it is instructive that our simulations suggest that the cationic residues can be effectively targeted by surfactants. To conclude, our simulations demonstrate that typical zwitterionic and anionic surfactants and cholesterol are strongly adsorbed to the functional fragments of the SARS-CoV-2 Spike S1 domain. The adsorption process proceeds by forming surfactant aggregates adhered on the protein. Adsorption of cholesterol is more efficient than other surfactants, that explains recent experimental observations that cholesterol enhances the Spike interactions with model membranes. 28 The anionic surfactants have a higher affinity to S1 domain, tending to be adsorbed on cationic Arginine and Lysine residues, which are present in higher quantities in Delta and Omicron variants. The latter effect may have important implications for the search of extraneous therapeutic surfactant that can potentially hinder binding of Spike proteins with ACE2. The authors thank Andrew Gow and Jared Radbel for useful discussions. The SARS-CoV-2 Spike protein trimer model ( Figure S1a ) in the closed state is obtained from the I-TASSER website (https://zhanggroup.org//COVID-19/, lineage A) from which the S1 domain (residues 1-685) is cut. This S1 fragment is converted to MARTINI CG representation ( Figure S1a ) and the corresponding force field parameters are obtained, using the martinize.py script. The CG MARTINI 1-3 employs a general 4-1 mapping; approximately, 4 heavy atoms are grouped into a CG bead. In the all-atom representation, the S1 domain consist of 10681atoms (including hydrogens) while in the CG model it consists of just 1556 CG atoms. The CG models of the surfactants ( Figure S1b) , DPPC, Cholesterol and SDS, are taken from the structure files available from the MARTINI website (http://cgmartini.nl/). DPPC is modelled into a 12-bead representation with head group containing the choline, and phosphate beads and two glycerol beads, and two tails containing 4 beads each. Cholesterol has a head group with OH containing ROH group connected to the 5-bead aromatic ring region (R1-R5) and a 2-bead aliphatic tail. SDS is dissociated in the solutions into DSand sodium ions. The DSis modelled as consisting of the anionic SO3and a 3bead aliphatic hydrophobic tail. The standard MARTINI water model is used as solvent, in which the CG water bead represents 4 water molecules. 3 GROMACS 2019 4 versions are used for performing simulations. The systems are constructed with the S1 domain at the center of the Figure S3 . a) The atomistic models of Spike trimer and the monomeric S protein, and the MARTINI model of the S1 domain. NTD, RBD and RBM are colored respectively pink, iceblue and red. All other residues are shown in while. b) The MARTINI CG representation of the surfactants studied in the simulations. Table S1 . The details of the simulation systems. *The number of water CG beads given represents that in one of the replicas. N_w in the other three replicas may change by a few, typically less than 5. ( The S1 domain has a net charge of +5 and to neutralize this charge, 5 Cl-ions are added. In SDS systems, since the SO3 Na head group is ionized and dissociated, and therefore, Na+ ions equal to are added as well (Table S1 ). Then four replicas of each system are constructed with a different surfactant placement and the subsequent solvation with water, to improve statistics of the analyzed data, making it a total of 136 systems. Note that since each replica at the same are constructed independently, the number of water molecules in each replica may differ by a few (~5). The initial system size is 18 × 18 × 18 3 and the typical number of water CG beads is ~49400. Details of the systems are provided in Table S1 . MARTINI version 2 1-2 is used for the simulations. The initial S1-surfactant systems are simulated a high temperature of 800 K with position restraints on the S1 fragment under NVT conditions, in order to randomize the surfactant distribution. The Coulomb interactions are incorporated with reaction-field scheme with a general non-bond interaction cutoff 1.1 nm and a dielectric constant =15 that is appropriate for MARTINI standard water. Berendsen thermostat is used for temperature control and a time step of 20 fs is used. The final snapshot of this simulation is used as the initial configuration for the production runs. The initial configurations for the four replicas of each system are obtained with running the NVT randomization for different simulation times and with different random seeds for initial velocity generation; generally, NVT simulations are run in the range 0.2 -1.0 ns. Typical snapshots of initial configurations are provided in Figure S2 . The production runs are performed for 1 for each system, at NPT conditions at physiological temperature 310 K and pressure 1 bar, controlled by Berendsen thermostat and barostat with a compressibility value 3 × 10 −4 bar -1 and a coupling constant 12.0 ps. The surfactant adsorption probability of the nth residue ( ) along the protein sequence is calculated as the fraction of time the residue is in contact with a lipid CG bead. The lipid contacts are counted when the distance between the protein and the lipid beads is less than or equal to 0.62 nm which is 1.3-1.4 of the LJ diameter of the MARITNI beads (0.47 nm). ( ) is calculated over 101 frames in the last 200 ns of each simulations. This is averaged over the 4 replicas at each surfactant concentrations, and further averaged over the 5 surfactant concentrations to finally obtain the values plotted in Figure 1b , Figure S3 and Figure S4 . In Figures S3 & S4 , the average probability in regions of relatively high (green) and low (red) adhesion are given. The highly adhesive regions are identified with relatively high probability of adhesion. The regions of residues,1-14, 100-150, 170-250, 320-506 and 530-600, are found to have high affinity to surfactants, in contrast to blocks 14-100, and 250-320, where surfactant adsorption is less probable. Note that similar patterns are observed in single component ( Figure S3 ) and binary mixture ( Figure S4 ) systems. The average probability of an S1 residue to have adhered with a lipid bead in each simulation is calculated as, Figure S4 . The snapshots of S1-DPPC system with = 50 after the initial surfactant placement and solvation with water (left), and after randomization with NVT simulation at 800 K for 400 ps (right). Colors: S1 backbone is colored orange and the side chains in yellow, the choline group of DPPC in blue, the DPPC glycerol backbone in lime, DPPC tails in cyan, and the Clions in magenta. The water beads are shown as tiny green dots. where = 685, the total number of residues in the S1 domain, and is the average number of residues adsorbed with surfactants. The fraction of surfactants obtained from the average number of surfactants in contact with the protein, calculated in the simulation window 800-1000 ns over 101 frames as = ( 2) . Note that, a surfactant may adhere to the protein, by its one or several CG beads. Thus a more quantitative picture will be obtained when the fraction of adhered surfactant CG beads are calculated, which is given by, Figure S5 . The surfactant adsorption probability of residues in the single componet surfactant systems. The residue regions with relatively high adsorption probability are indicated in green, while lower surfactant adsorption regions indicated in red. The average adsorption probabilities in those regions are also given. Other colors: purple-NTD, red-RBM, iceblue -RBD other than RBM and the remaining S1 regions -gray. where is the average surfactant CG beads in contact with the protein. The fraction of a particular bead type, such as choline or ROH etc is calculated in the same way. in different cases are provided in Figure 3 of the paper. The fraction of the surfactant adsorbed residue type, where = , , … . . , , the amino acid types, is the number of lipid-adsorbed amino acid residue of type and is the total number of -type amino acids in the S1 chain. ( ) for a given is averaged over the four replicas. ( ) is plotted in the case of single type surfactant systems at various concentrations in Figure S5 and Figure 4b of the paper, which shows similar patterns for all surfactants at all concentrations. The hydrophobic residues ILE, LEU, PHE, PRO and VAL, and the polar CYS and TYR have pronounced affinity for surfactants. The cationic residues ARG and LYS are preferred by all surfactants, zwitterionic, anionic and non-ionic. Table S2 lists the ionic mutations in Delta and Omicron variants. [5] [6] Each variant is associated with an increase in ARG and LYS in the S1 domain. = 30 and = 50. The hydrophobic, polar, cationic and anionic residues are respectively colored in cyan, green, blue and red. The Role of Pulmonary Surfactants in the Treatment of Acute Respiratory Distress Syndrome in COVID-19 Assessment of pulmonary surfactant in COVID-19 patients Methods and models to investigate the physicochemical functionality of pulmonary surfactant Surfactants -Compounds for inactivation of SARS-CoV-2 and other enveloped viruses Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study Mechanisms of SARS-CoV-2 entry into cells Lung surfactant Elution of viruses by ionic and nonionic surfactants Inactivation of human and avian influenza viruses by potassium oleate of natural soap component through exothermic interaction The inactivation of the virus of epidemic influenza by soaps Inactivation of human immunodeficiency virus type 1 by nonoxynol-9, C31G, or an alkyl sulfate, sodium dodecyl sulfate Laninamivir octanoate and artificial surfactant combination therapy significantly increases survival of mice infected with lethal influenza H1N1 virus AI-driven multiscale simulations illuminate mechanisms of SARS-CoV-2 spike dynamics Molecular architecture of SARS-CoV-2 envelope by integrative modeling Beyond Shielding: The Roles of Glycans in the SARS-CoV-2 Spike Protein Structural biology of SARS-CoV-2 and implications for therapeutic development Structural and functional properties of SARS-CoV-2 spike protein: potential antivirus drug development for COVID-19 Key residues of the receptor binding motif in the spike protein of SARS-CoV-2 that interact with ACE2 and neutralizing antibodies The MARTINI Force Field: Coarse Grained Model for Biomolecular Simulations The MARTINI Coarse-Grained Force Field: Extension to Proteins Surfactant-based prophylaxis and therapy against COVID-19: A possibility The I-TASSER Suite: protein structure and function prediction I-TASSER: a unified platform for automated protein structure and function prediction Interaction Analysis on the SARS-CoV-2 Spike Protein Receptor Binding Domain Using Visualization of the Interfacial Electrostatic Complementarity Evolutionary analysis of the Delta and Delta Plus variants of the SARS-CoV-2 viruses Cryo-EM structure of a SARS-CoV-2 omicron spike protein ectodomain SARS-CoV-2 spike protein removes lipids from model membranes and interferes with the capacity of high density lipoprotein to exchange lipids The MARTINI Force Field: Coarse Grained Model for Biomolecular Simulations The MARTINI Coarse-Grained Force Field: Extension to Proteins Coarse Grained Model for Semiquantitative Lipid Simulations GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers Evolutionary analysis of the Delta and Delta Plus variants of the SARS-CoV-2 viruses Table S2. Analysis of ionic mutations in Delta and Omicron Variant Mutations (S1-green, S2-red) Net change