key: cord-1008293-3z5aooi1
authors: Sun, Qinfang; Biswas, Avik; Vijayan, R. S. K.; Craveur, Pierrick; Forli, Stefano; Olson, Arthur J.; Castaner, Andres Emanuelli; Kirby, Karen A.; Sarafianos, Stefan G.; Deng, Nanjie; Levy, Ronald
title: Structure-based virtual screening workflow to identify antivirals targeting HIV-1 capsid
date: 2022-03-09
journal: J Comput Aided Mol Des
DOI: 10.1007/s10822-022-00446-5
sha: cf7237d83c431231117a96b34613d304fde19a10
doc_id: 1008293
cord_uid: 3z5aooi1

We have identified novel HIV-1 capsid inhibitors targeting the PF74 binding site. Acting as the building block of the HIV-1 capsid core, the HIV-1 capsid protein plays an important role in the viral life cycle and is an attractive target for antiviral development. A structure-based virtual screening workflow for hit identification was employed, which includes docking 1.6 million commercially available drug-like compounds from the ZINC database to the capsid dimer, followed by applying two absolute binding free energy (ABFE) filters to the 500 top-ranked molecules from docking. The first employs the Binding Energy Distribution Analysis Method (BEDAM) in implicit solvent. The top-ranked compounds are then refined using the double decoupling method (DDM) in explicit solvent. Both docking and BEDAM refinement were carried out on the IBM World Community Grid as part of the FightAIDS@Home project. Using this virtual screening workflow, we identified 24 molecules with calculated binding free energies between −6 and −12 kcal/mol. We performed thermal shift assays on these molecules to examine their potential effects on the stability of the HIV-1 capsid hexamer and found that two compounds, ZINC520357473 and ZINC4119064, increased the melting point of the hexamer by 14.8 °C and 33 °C, respectively. These results support the conclusion that the two ZINC compounds are primary hits targeting the capsid dimer interface. Our simulations also suggest that the two hit molecules may bind at the capsid dimer interface by occupying a new sub-pocket that has not been exploited by existing CA inhibitors. The possible reasons why other top-scored compounds suggested by the ABFE filters failed to show measurable activity are discussed. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s10822-022-00446-5.

HIV-1 capsid protein (CA) plays an essential role in the HIV-1 life cycle [1]. In addition to acting as the building block of the HIV-1 capsid core, CA also interacts with several host factors, including CPSF6 and NUP153, to regulate important molecular events such as uncoating and nuclear transport [2, 3]. As a result, CA has been recognized as an attractive drug target for antiviral development. Among the several small molecule binding sites on CA, medicinal chemistry efforts have largely focused on targeting the PF74 binding site. X-ray structures reveal that PF74 binds to a pocket located at the inter-helix space between the H3 and H4 helices of one CA subunit (CA-NTD) and the H8 helix of an adjacent CA subunit (CA-CTD) (Fig. 1) [1, 2]. PF74 is a phenylalanine-derived peptidomimetic identified by Pfizer from a high-throughput screen and displays antiviral activity at sub-micromolar potency (EC50 = 8-640 nM) [4].

Further progression of this chemical series was halted because of its poor metabolic stability profile. The most advanced small molecule inhibitor targeting the HIV capsid is GS-6207 (Lenacapavir), an investigational compound in Phase III clinical trials. Despite the recent progress in the discovery of potent and novel capsid inhibitors, the emergence of Lenacapavir-resistant mutations (e.g., M66I and N74D/Q67H) emphasizes the need to develop newer antivirals that overcome resistance to existing ones.

In this work, we apply a virtual screening workflow (see Fig. 2, left) that consists of docking, which is run as part of the FightAIDS@Home project on the IBM World Community Grid, and absolute binding free energy (ABFE) calculations to identify small molecule HIV-1 inhibitors targeting the PF74 site of CA. Computational methods are increasingly being used in various stages of the drug discovery process. For example, in virtual screening of large compound libraries, docking and pharmacophore mapping are widely used in hit identification [5, 6]; meanwhile, the more rigorous free energy perturbation (FEP) method is playing an increasingly important role in hit-to-lead and lead optimization because of its ability to accurately estimate the relative binding free energy between closely related ligands [7–9].

While docking is a powerful tool for predicting binding modes and for rapidly filtering out molecules that are unlikely to bind, its accuracy is hampered by the approximate treatment of desolvation, receptor reorganization, and entropic effects. Furthermore, the hit rates from docking are highly dependent on the nature of the drug target and the chemical space of the virtual library it is screened against, as demonstrated by the spread of the hit rates (0.2 to 100%) reported in Shoichet's review [10]. For targets that are notoriously hard to drug (e.g., shallow solvated sites), hit rates tend to be poor. Conversely, established drug targets like kinases and proteases tend to yield better hit rates. A target-focused library such as a "kinase-targeted library" would yield more hits, as the compounds contained in the library have been designed to interact with the target protein. In addition, structural knowledge of the target protein complexed with small molecules/endogenous substrates could help improve hit rates, as this information could be leveraged for guided docking approaches (docking based on pharmacophoric/interaction constraints).

Fig. 1 A Crystal structure of PF74 (purple) bound at the dimeric interface of a CA hexamer (PDB ID: 4XFZ [1]). B The PF74 binding site is located in the inter-helix space formed between helices H3 and H4 of the CA1 subunit (blue) and H8 of the CA2 subunit (orange)

Fig. 2 A Screening cascade used for the discovery of small molecules targeting the PF74 site of HIV-1 capsid protein. B Illustration of the thermodynamic cycle scheme used by DDM for ABFE calculation

To better account for desolvation, receptor reorganization, and entropic effects in docking, the popular MM-GBSA method is widely used to rerank the docked complexes [11, 12]. To capture these physical effects more fully, including receptor reorganization and entropic effects, absolute binding free energy (ABFE) methods such as the double decoupling method (DDM, see Fig. 2, right panel) [13, 14] and BEDAM [15], which are based on molecular dynamics simulations, can be used following docking to improve the accuracy of virtual screening. We note that the scoring functions used in MM-GBSA and BEDAM are closely related; BEDAM implements more accurate estimators for the GB and non-polar solvation terms [16–18].

In recent years, advances in methodology and energy functions, together with increased computing power, have enabled the use of ABFE calculations for the study of biomolecular recognition [19–27]. While ABFE has yet to be widely applied to screen libraries containing millions of compounds because of its high computational cost relative to docking, we and others demonstrated a few years ago that such methods can be successfully employed as an additional filter on hundreds of top docked compounds to obtain significantly improved enrichment in virtual screening applications [28]. In our previous virtual screening studies for the SAMPL4 [26] and D3R Grand Challenge [22], we used the implicit solvent-based ABFE method BEDAM [15]. To our knowledge, those two studies are among the first large-scale applications of ABFE in a virtual screening campaign. More recently, automated high-throughput explicit solvent ABFE protocols have been rapidly developed for docking refinement in virtual screening on GPU and cloud computing platforms [29–31]. We also note that similar virtual screening workflows have been successfully applied in recent drug discovery studies against SARS-CoV-2 [32, 33] and Hepatitis B Virus (HBV) capsid [34]. Acharya et al. [32] combined enhanced sampling molecular simulation and ensemble docking to predict a set of compounds binding to SARS-CoV-2 targets for experimental validation. Li et al. [33] recently reported the discovery of 16 inhibitors targeting the SARS-CoV-2 main protease (Mpro), using a computational virtual screening approach that includes molecular docking followed by accelerated free energy perturbation-based absolute binding free energy (FEP-ABFE) simulations. Senaweera et al. [34] identified novel HBV capsid assembly modulator hits by employing structure-based virtual screening against a small molecule protein-protein interaction (PPI) library and pharmacophore-guided compound design, which were subsequently validated by synthesis and biological evaluation.

In the present study, we apply two ABFE filters (implicit solvent BEDAM, explicit solvent DDM) to improve the discriminative power of the virtual screening pipeline. As described below, out of the 24 compounds suggested by this virtual screening pipeline to be the most promising compounds, we identified two primary hit molecules using thermal shift assays. Our simulation results suggest that these two hit molecules may interact with CA by occupying a new sub-pocket. We also analyze the reason why many top compounds suggested by ABFE filters failed to show measurable activity in assays. Among these, the poor solubility of several compounds and overestimation of the nitro group's hydrogen bonding propensity by the current force field are identified as the most likely causes that have reduced the enrichment in virtual screening.

To identify novel hits targeting HIV-1 capsid (CA), about 1.6 million drug-like, commercially available compounds from the ZINC library were screened using AutoDock Vina [35, 36]. The top 500 compounds from docking, with the four top poses for each compound, were refined using the implicit solvent binding free energy method BEDAM (Binding Energy Distribution Analysis Method) [15, 37]. This corresponds to a total of 2000 absolute binding free energy simulations. When used with docking, the main strength of BEDAM, as demonstrated in previous studies [21, 22, 24, 26, 38], is its ability to distinguish binders from nonbinders and thereby improve enrichment.

Before running BEDAM on the docked complexes, the method was validated with both negative and positive controls for screening against CA targets. As a negative control, we ran BEDAM to compute the absolute binding free energy for a set of compounds that are known to be nonbinders according to previous experimental assays (unpublished). As shown in Fig. 3, the BEDAM-calculated binding free energies, which are all unfavorable, are consistent with the fact that these compounds do not bind CA in the thermal shift assays. We also validated BEDAM using the known binders PF74 and GS-6207 and obtained favorable absolute binding free energies of −1.9 and −7.8 kcal/mol, respectively. While the predicted ABFE values are too weak in magnitude, the free energy difference between these two binders correlates well with the ratio of their experimentally determined EC50 values [39, 40]. These negative and positive control results help validate BEDAM for use as a screening tool targeting CA and establish 0 kcal/mol as the cutoff for the BEDAM absolute binding free energy ΔG°_bind.
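As a rough worked check of what the BEDAM free energy gap between the two positive controls implies, the difference can be converted into an affinity ratio. This is a sketch under stated assumptions: T = 298.15 K (so RT ≈ 0.593 kcal/mol) is assumed, and treating the EC50 ratio as tracking the ratio of dissociation constants is an approximation rather than something the calculation establishes.

```latex
\begin{align}
\Delta\Delta G^{0} &= \Delta G^{0}_{\mathrm{bind}}(\text{GS-6207}) - \Delta G^{0}_{\mathrm{bind}}(\text{PF74})
                    = -7.8 - (-1.9) = -5.9\ \text{kcal/mol} \\
\frac{K_D(\text{PF74})}{K_D(\text{GS-6207})} &= \exp\!\left(-\frac{\Delta\Delta G^{0}}{RT}\right)
                    = \exp\!\left(\frac{5.9}{0.593}\right) \approx 2 \times 10^{4}
\end{align}
```

Under these assumptions, the computed gap corresponds to roughly four orders of magnitude in affinity between the two compounds.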

While the receptor is kept fixed in a typical docking simulation, AutoDock provides an option to allow limited receptor flexibility. Using BEDAM to refine two sets of docked structures from top-ranked flexible and rigid receptor docking, we find that the docked structures from flexible docking yield consistently more favorable BEDAM ΔG°_bind (see Fig. 4). Allowing receptor flexibility in docking leads to the sampling of more low-energy structures, as the ligands and the receptor atoms can adapt to each other to achieve more optimal interactions.

The results from the BEDAM rescoring show that only a small number of top docked ligands, i.e., ~50 out of 500, showed negative (favorable) binding free energies for binding to the PF74 site of CA (Fig. 5). Thus, including the physics-based BEDAM ABFE filter substantially reduces the number of ligands whose binding is further examined with more accurate explicit solvent ABFE simulations.
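The filtering step described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the record layout, the function name `bedam_filter`, and the choice to represent each ligand by its most favorable pose are assumptions, and the numbers are made up.

```python
# Hypothetical records: (ligand_id, pose_index, BEDAM dG in kcal/mol).
# The values are invented for illustration and are not results from the study.
def bedam_filter(records, cutoff=0.0):
    """Keep ligands whose most favorable pose has dG below the cutoff.

    Representing a ligand by its best (lowest) pose is an assumption of
    this sketch; the cutoff of 0 kcal/mol follows the text.
    """
    best = {}
    for ligand_id, _pose, dg in records:
        if ligand_id not in best or dg < best[ligand_id]:
            best[ligand_id] = dg
    return {lig: dg for lig, dg in best.items() if dg < cutoff}


records = [
    ("lig_A", 0, 1.2), ("lig_A", 1, -2.4),
    ("lig_B", 0, 3.1), ("lig_B", 1, 0.5),
    ("lig_C", 0, -0.7),
]
passed = bedam_filter(records)
# Rank survivors from most to least favorable for the next (explicit solvent) stage.
ranked = sorted(passed.items(), key=lambda item: item[1])
```

Only ligands with at least one pose below the cutoff move on, which mirrors the reduction from ~500 docked ligands to ~50 candidates for explicit solvent refinement.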

Before carrying out absolute binding free energy simulations with explicit solvent (DDM) to refine the 50 top-ranked ligands from BEDAM, we first used DDM (Fig. 2, right) to estimate the binding affinities for three known binders to the PF74 site of CA: PF74 [2], GS-6207 [40, 41], and ZW-1261 [42]. Figure 6 shows that the DDM-calculated absolute binding free energies agree reasonably well with the experimental measurements (K_D and/or EC50), with error bars less than 1.5 kcal/mol. These results suggest that DDM can be used to provide reasonable estimates of absolute binding free energies of ligands targeting the PF74 binding site of CA.

The absolute binding free energies for the 50 top compounds from BEDAM were calculated using DDM (Fig. 7). Using a ΔG°_bind of −6.0 kcal/mol (which translates into a K_D of 40 μM) as the threshold for binders, 11 compounds with ΔG°_bind more favorable than −6.0 kcal/mol were identified as the most promising compounds to be tested experimentally for activity. In addition, these 11 ligands were used as seed compounds to identify additional congeneric molecules in the ZINC library based on chemical similarity. Using the FEP+ program [7] from Schrödinger, Inc., ~40 relative binding free energies of the congeneric compounds with respect to the seed molecules were calculated. The congeneric compounds with more favorable binding free energies than the original seed compounds were retained. This procedure led to a total of 24 compounds that were purchased and experimentally tested (Table S1).
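The correspondence between the −6.0 kcal/mol threshold and a K_D of about 40 μM follows from ΔG° = RT ln K_D with a 1 M standard state. The sketch below shows the conversion; the temperature of 298.15 K and the function names are assumptions made for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dg_to_kd(dg_kcal, temperature=298.15):
    """Dissociation constant in mol/L from a standard binding free energy.

    Uses dG = RT ln(K_D / 1 M); the temperature default is an assumption.
    """
    return math.exp(dg_kcal / (R_KCAL * temperature))


def kd_to_dg(kd_molar, temperature=298.15):
    """Standard binding free energy in kcal/mol from K_D in mol/L."""
    return R_KCAL * temperature * math.log(kd_molar)


kd_threshold = dg_to_kd(-6.0)
print(f"K_D at -6.0 kcal/mol: {kd_threshold * 1e6:.1f} uM")
```

The same helpers can be used in reverse to express a measured K_D as a free energy for comparison with calculated values.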

Biophysical screening consisting of fluorescence-based thermal shift assays (TSA) was carried out on the 24 top-ranked compounds suggested by the virtual screening pipeline. The TSA provides an estimate of the effect of compound binding on the thermal stability of covalently crosslinked CA hexamers by measuring the melting temperature shift, ΔT_m, of the crosslinked CA hexamer in the presence and absence of a small molecule. Table S1 (Supporting Information) shows the measured ΔT_m at a small molecule concentration of 40 μM for the 24 suggested ZINC compounds and PF74. Among the ZINC compounds, two show ΔT_m greater than 1 °C: ZINC520357473 and ZINC4119064, with ΔT_m of 14.8 °C and 33 °C, respectively. These values are comparable to or larger than the ΔT_m of ~9 °C and ~11 °C caused by the addition of the known CA binders PF74 (shown in Table S1) and GS-6207 [41]. Interestingly, these two molecules share similar binding modes, which are largely preserved throughout the docking and MD free energy simulation stages (Fig. S1 and Table S2). These results show that the two molecules ZINC520357473 and ZINC4119064 are primary hits targeting HIV-1 CA.

We examine the protein-ligand interactions based on the representative molecular dynamics (MD)-simulated structures of the two hits ZINC520357473 and ZINC4119064 in complex with CA by superimposing the MD structures containing these molecules onto that containing the known inhibitor PF74 (Fig. 8B, C). As seen from Fig. 8A, ZINC520357473 and ZINC4119064 have different chemical functionality and possibly improved metabolic stability compared with PF74, which suffers from low metabolic stability [43]. Based on the MD-simulated structures shown in Fig. 8, while these two hit molecules and PF74 all appear to interact with sub-pocket 1 (residues Asn53, Lys70, and Tyr130), the more extended ZINC520357473 and ZINC4119064 also potentially occupy a new nonpolar sub-pocket 2 (residues Pro34, Ile37, Pro38, Arg173, and Ala174). We note that although the two ZINC molecules potentially occupy both sub-pocket 1 and sub-pocket 2, their absolute binding free energies are weaker than that of PF74 (Table 1). The weaker binding affinities of the two ZINC molecules could be attributable to two factors: (1) as seen from Fig. 8, the interactions between the ZINC molecules and sub-pocket 1 are less optimal than those in the PF74-CA complex; (2) for the two hit molecules to occupy the new sub-pocket 2, the protein intramolecular hydrogen bonds Arg173-Asn57 and Arg173-Val59 that are present in the CA-PF74 structure need to be broken to accommodate the more extended ZINC compounds; this can be seen from the RMSD of the sidechains of the pocket 2 residues in the ZINC4119064-CA complex (using apo CA as the reference), which is larger than that in the PF74-CA complex (Table S3).

However, despite exhibiting weaker interactions compared with PF74, the new potential protein-ligand interaction motifs featured by the two ZINC molecules suggest possible ways to design new and more potent inhibitors (Fig. 8). For example, ZINC520357473 and ZINC4119064 could be modified to optimize their interactions with the sub-pocket 1 residues. It is also possible to "hybridize" PF74 with ZINC4119064 to occupy sub-pocket 2 without sacrificing the intermolecular hydrogen bonds involving the Arg173 side chain. Thus, the new chemotypes and possible novel interaction motifs found in these two hit molecules could present opportunities for new lead design towards compounds with better potency and metabolic stability.

We now examine why many compounds predicted by DDM to bind CA with low nanomolar affinity fail to show measurable activity in assays. One possible reason for the lack of measurable thermal shift signals is poor compound solubility, which leaves too little dissolved compound available to bind CA. In fact, from visual inspection we found that eight compounds show poor solubility in either DMSO or aqueous buffer solutions (highlighted in Table S1). As seen from Table S1, half of the ZINC molecules screened have logP values greater than 6. Furthermore, Table S4 shows that while the average logP in the ZINC database is ~3.1, the average logP values of the top ~500 docking-ranked and top ~50 BEDAM-ranked compounds are shifted up to 5.3 and 5.7, respectively. These values are also higher than the logP of ~4.5 for the known inhibitor PF74 and the logP values of the two primary hits ZINC520357473 and ZINC4119064 (3.5 and 4.5, respectively). This may suggest that the scoring functions overly reward the burial of nonpolar surface area. A more balanced treatment of the desolvation contribution in the scoring function could improve the accuracy of virtual screening [44].

Another possible reason for the false positives from DDM could be the overestimation of the hydrogen bonding propensity of the nitro group in several top-ranked compounds (Table S1). For example, in the MD structures, ZINC58660738, which carries a nitro group predicted to form pi-cation interactions with Lys70 of CA, is predicted to bind CA with nanomolar affinity by DDM (Fig. 9). However, experimental studies suggest that the nitro group is a weak hydrogen bond acceptor [45]. In such molecules, the force field likely overestimates the strength of the intermolecular hydrogen bond involving the nitro group, which leads to false positive predictions by the DDM calculations.

[Figure caption fragment] One CA monomer is shown as blue cartoon, while the neighboring CA monomer is shown as orange cartoon. Pocket-1 is shown in blue circle and pocket-2 is shown in red circle

In this work, a computational workflow was applied to identify hits targeting the PF74 binding site of HIV-1 capsid. The virtual screening workflow includes docking ~1.6 million drug-like molecules from the ZINC database, followed by absolute binding free energy simulations in both implicit (BEDAM) and explicit solvent (DDM). The computational screening identified 11 compounds with binding free energies between −6 and −11.7 kcal/mol. Additional small modifications were made based on chemical similarity to identify another 13 compounds with binding free energies estimated by FEP to be between −6 and −12.3 kcal/mol. Among the 24 compounds predicted to be most promising, two showed strong signals in thermal shift assays, demonstrating their ability to stabilize the HIV-1 capsid hexamer. The workflow used in this study is similar in spirit to an industrial drug discovery setting, where a virtual screening campaign is run against a vendor or "make-on-demand" library to shortlist around 250-350 compounds for sourcing/purchase. Lastly, our modeling suggests that the two hit molecules may interact with HIV-1 CA by occupying a new sub-pocket that has not been exploited by existing CA inhibitors. While the comparison of this possible novel interaction motif obtained from modeling with those exhibited by the known inhibitor PF74 provides insights for the design of improved CA-targeting HIV-1 inhibitors, the new interaction motif revealed by the MD simulations needs to be structurally validated by X-ray crystallography.

The binding free energy calculations were performed on a set of ligands that dock favorably to the PF74 binding site of HIV CA. The structures of the protein-ligand complexes from AutoDock Vina [35, 36] were used as the starting point for the free energy calculations. Three kinds of free energy methods were employed: the binding energy distribution analysis method (BEDAM) [15], the double decoupling method (DDM) [19, 46], and free energy perturbation (FEP) [7, 47]. DDM [13, 14, 48] is the standard method for computing absolute binding free energy in explicit solvent, and FEP [7] is the standard method for computing relative binding free energy in explicit solvent. The more recently developed BEDAM [15] method employs Hamiltonian replica exchange in an implicit solvent model to accelerate the sampling of phase space.

AutoDock Vina and BEDAM were run on the IBM World Community Grid (WCG), a volunteer-supported computing platform devoted to running projects that will benefit human health. As one of the 31 research projects supported by WCG, the FightAIDS@Home project (FAAH; http://fightaidsathome.scripps.edu/, https://fightaidsathome2.cst.temple.edu/) utilizes the WCG distributed computing network to conduct virtual screens for discovering new inhibitors against HIV capsid using AutoDock Vina (FAAH phase I) and BEDAM (FAAH phase II). DDM and FEP were carried out on high performance computing (HPC) resources from XSEDE and the local computing resource CB2RR at Temple University.

Fig. 9 The predicted interaction diagram between ZINC58660738 and CA based on the molecular model of the CA-ZINC58660738 complex, which features an intermolecular H-bond involving a nitro group. According to DDM calculations, this molecule, which shows no measurable activity in experimental assays, is predicted to bind with nanomolar affinity

The docking calculations were performed using the docking software AutoDock Vina [35, 36]. For docking, a total of 10 different CA structures were selected from the crystal structure of PF74 in complex with the native CA hexamer (PDB 4XFZ) [1] and structures from two models of the whole CA core (PDB 3J3Q and 3J3Y) [49]. Among these 10 structures, 6 represent the conformations found in CA hexamers and 4 those found in CA pentamers. Each of these target conformations was used in the docking computation either as a fully rigid structure or with a specific combination of flexible side chains in the PF74 binding pocket. The flexible parts of the target were included in the docked poses of the ligands in the structure files. The ZINC sub-database libraries screened in this work include Maybridge, Chembridge, FDA-approved drugs, and the human metabolite database. A total of 1,677,767 commercially available compounds from these libraries were docked against the target structures.

A set of 500 top-ranked ligand-CA complexes was obtained from the docking screen, of which 250 involve rigid CA receptors and 250 involve flexible CA receptors. For each ligand, the top 4 predicted poses from AutoDock Vina [35, 36] were retained for further processing by ABFE screening.
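The selection above (250 rigid-receptor and 250 flexible-receptor complexes, 4 poses each) yields the 2000 ABFE simulations mentioned earlier. A minimal sketch of assembling that job list is shown below; the function name, the tuple layout, and ranking purely by docking score are assumptions for illustration, and the input data are synthetic.

```python
def build_abfe_jobs(rigid_hits, flexible_hits, n_per_mode=250, n_poses=4):
    """Build (mode, ligand_id, pose_index) jobs for the ABFE stage.

    rigid_hits / flexible_hits: lists of (ligand_id, docking_score), where a
    lower (more negative) score is better. Ranking by score alone is an
    assumption of this sketch.
    """
    jobs = []
    for mode, hits in (("rigid", rigid_hits), ("flexible", flexible_hits)):
        top = sorted(hits, key=lambda hit: hit[1])[:n_per_mode]
        for ligand_id, _score in top:
            for pose in range(n_poses):
                jobs.append((mode, ligand_id, pose))
    return jobs


# Synthetic docking results, 300 ligands per receptor mode.
rigid = [(f"r{i}", -5.0 - 0.01 * i) for i in range(300)]
flexible = [(f"f{i}", -5.5 - 0.01 * i) for i in range(300)]
jobs = build_abfe_jobs(rigid, flexible)
print(len(jobs))
```

With the defaults, the list holds 2 x 250 x 4 entries, one per planned binding free energy simulation.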

In the BEDAM (Binding Energy Distribution Analysis Method) approach [15], the protein-ligand system is described by the OPLS2005 force field [50, 51] and the implicit solvation model AGBNP2 [17, 52]. The standard binding free energy ΔG°_b is computed using a hybrid effective potential connecting the unbound state (λ = 0) and the bound state (λ = 1), without going through the gas-phase ligand state as in the explicit solvent double decoupling method. The methodology of BEDAM has been described in previous papers [15, 17]. The setup of the BEDAM simulations has been described previously [53]. The software implemented on the WCG to run BEDAM is the academic version of IMPACT [54].

For the explicit solvent double decoupling (DDM) calculations [13, 14, 19, 27, 55–58] of absolute binding free energy performed in this study, the protein receptor is modeled with the Amber ff14sb-ILDN force field [59], and the ligands are described by the Amber GAFF2 parameter set [60] and the AM1-BCC charge model [61]. The starting structure of GS-6207 in complex with a CA dimer is extracted from the crystal structure coordinates of the complex of GS-6207 with a cross-linked CA hexamer (PDB 6V2F) [40]; the starting structure of PF74 in complex with a CA dimer is extracted from the crystal structure coordinates of the complex of PF74 with the native CA hexamer (PDB 4XFZ) [1]; and the starting structure of ZW-1261 in complex with a CA dimer is extracted from the crystal structure coordinates of the complex of ZW-1261 with the native CA hexamer (PDB 7M9F) [42].

A DDM calculation involves two legs of simulation, in which a restrained ligand is gradually decoupled from the receptor binding pocket or from the aqueous solution. In each leg of the decoupling simulations, the Coulomb interaction is turned off first using 11 λ-windows, and the Lennard-Jones interactions are then turned off in 17 λ-windows. 
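The two-stage decoupling described above can be written down as a list of λ states. The sketch below assumes evenly spaced windows and a convention in which 1.0 means fully interacting; the spacing and the function name are assumptions, since the actual λ values are not given in the text.

```python
def decoupling_schedule(n_coulomb=11, n_lj=17):
    """Return the lambda states for one decoupling leg.

    Each state is (lambda_coulomb, lambda_lj), with 1.0 = fully interacting
    and 0.0 = fully decoupled. Evenly spaced windows are an assumption of
    this sketch; the window counts follow the text.
    """
    # Stage 1: turn off Coulomb interactions while Lennard-Jones stays on.
    coulomb = [(1.0 - i / (n_coulomb - 1), 1.0) for i in range(n_coulomb)]
    # Stage 2: with charges off, turn off Lennard-Jones interactions.
    lj = [(0.0, 1.0 - j / (n_lj - 1)) for j in range(n_lj)]
    return {"coulomb": coulomb, "lj": lj}


schedule = decoupling_schedule()
for stage, states in schedule.items():
    print(stage, len(states), states[0], states[-1])
```

In this layout the last Coulomb state and the first Lennard-Jones state describe the same uncharged, fully Lennard-Jones-coupled ligand; whether the original protocol counts that shared state once or twice is not stated. The same schedule would be applied to both legs (ligand in the binding pocket and ligand in solution).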

The FEP+ program [7] of Schrödinger Suite 2020-1 was used to calculate the relative binding free energies of ligands binding to the capsid dimer. The complexes were prepared with the Protein Preparation Wizard and the simulations were run from the FEP+ panel. For every pair of ligands, the FEP+ GUI in Maestro was used to build the perturbation map. The OPLS3e force field was used to model proteins and ligands. Torsion parameters were checked for all ligand fragments using Force Field Builder. A 10 Å cubic box filled with ~31,300 SPC water molecules was used for the complex and solvent perturbation legs. The number of alchemical λ windows was set to the default of 12, with the intermediate windows spanning the two end states (referred to as the wild-type and mutant states). For each λ window, the production run was 15 ns in the NPT ensemble. The Bennett acceptance ratio (BAR) method was used to calculate the free energy differences between neighboring λ windows.
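To illustrate how the Bennett acceptance ratio combines forward and reverse energy differences between two neighboring λ windows, a minimal standard-library sketch is given below. It is not the FEP+ implementation: the function names, the bisection solver, and the synthetic Gaussian test data are all assumptions made for illustration, and energies are in units of kT.

```python
import math
import random


def fermi(x):
    """Numerically stable 1 / (1 + exp(x))."""
    if x > 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def bar_free_energy(w_forward, w_reverse, lo=-500.0, hi=500.0, n_iter=60):
    """Solve the BAR self-consistency equation for dF (in kT) by bisection.

    w_forward: U_1 - U_0 evaluated on samples drawn from window 0
    w_reverse: U_0 - U_1 evaluated on samples drawn from window 1
    """
    m = math.log(len(w_forward) / len(w_reverse))

    def imbalance(df):
        fwd = sum(fermi(m + w - df) for w in w_forward)
        rev = sum(fermi(-m + w + df) for w in w_reverse)
        return fwd - rev  # increases monotonically with df

    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if imbalance(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Synthetic Gaussian energy differences consistent with a true dF of 2 kT.
random.seed(0)
true_df, sigma, n = 2.0, 1.0, 2000
w_f = [random.gauss(true_df + 0.5 * sigma**2, sigma) for _ in range(n)]
w_r = [random.gauss(-true_df + 0.5 * sigma**2, sigma) for _ in range(n)]
estimate = bar_free_energy(w_f, w_r)
print(f"BAR estimate: {estimate:.2f} kT (true value {true_df} kT)")
```

Summing such pairwise estimates over all neighboring windows gives the free energy change for one leg of a perturbation. Bisection is used here only because the BAR equation is monotonic in dF, which keeps the sketch short and robust.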

TSAs used purified covalently crosslinked hexameric CA A14C/E45C/W184A/M185A (CA121). CA121 cloned in a pET11a expression plasmid was kindly provided by Dr. Owen Pornillos (University of Virginia, Charlottesville, VA, USA). CA121 was expressed in E. coli BL21(DE3) RIL cells and purified according to reported protocols [62]. The TSAs were conducted as previously described [63–65], with each reaction containing 7.5 µM crosslinked CA hexamer in 50 mM sodium phosphate buffer (pH 8.0), 1× Sypro Orange Protein Gel Stain (Life Technologies, Carlsbad, CA, USA), and either 1% DMSO (control) or 40 µM compound (1% DMSO final). The plate was heated from 25 to 95 °C at a heating rate of 0.2 °C every 10 s in the QuantStudio 3 real-time PCR system (Thermo Fisher Scientific). The fluorescence intensity was measured with an Ex range of 475-500 nm and an Em range of 520-590 nm. The difference in the melting temperature (ΔT_m) of the crosslinked CA hexamer in the presence of compound (T_m) versus the DMSO control (T_0) was calculated for each compound tested using Eq. (1):

$$\Delta T_m = T_m - T_0 \tag{1}$$
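The melting temperature shift defined above, together with the 1 °C criterion used in the results to single out hits, can be sketched as follows. The function names and the melting temperatures are hypothetical and chosen only for illustration; they are not measurements from the study.

```python
def delta_tm(tm_compound, tm_dmso):
    """Melting temperature shift: T_m with compound minus T_0 (DMSO control)."""
    return tm_compound - tm_dmso


def flag_hits(tm_by_compound, tm_dmso, threshold=1.0):
    """Return {compound: shift} for compounds whose shift exceeds the threshold (deg C)."""
    shifts = {name: delta_tm(tm, tm_dmso) for name, tm in tm_by_compound.items()}
    return {name: shift for name, shift in shifts.items() if shift > threshold}


# Hypothetical melting temperatures in deg C (not data from the study).
tm_control = 60.0
measured = {"cmpd_X": 70.0, "cmpd_Y": 60.5, "cmpd_Z": 59.0}
hits = flag_hits(measured, tm_control)
print(hits)
```

A negative shift, as for the hypothetical `cmpd_Z`, would indicate destabilization rather than stabilization and is not flagged by this criterion.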

The online version contains supplementary material available at https://doi.org/10.1007/s10822-022-00446-5.

X-ray crystal structures of native HIV-1 capsid protein reveal conformational variability

Structural basis of HIV-1 capsid recognition by PF74 and CPSF6

HIV-1 resistance to the capsid-targeting inhibitor PF74 results in altered dependence on host factors required for virus nuclear entry

HIV capsid is a tractable target for small molecule therapeutic intervention

Discovery of small molecule inhibitors of MyD88-dependent signaling pathways using a computational screen

Design, synthesis, and antiviral activity of entry inhibitors that target the CD4-binding site of HIV-1

Accurate and reliable prediction of relative ligand binding potency in prospective drug discovery by way of a modern free-energy calculation protocol and force field

Molecular dynamics and Monte Carlo simulations for protein-ligand binding and inhibitor design

Fast, accurate, and reliable protocols for routine calculations of protein-ligand binding affinities in drug design projects using AMBER GPU-TI with ff14SB/GAFF

Docking screens for novel ligands conferring new biology

The MM/PBSA and MM/GBSA methods to estimate ligand-binding affinities

Assessing the performance of the MM/PBSA and MM/GBSA methods. 1. The accuracy of binding free energy calculations based on molecular dynamics simulations

The statistical-thermodynamic basis for computation of binding affinities: a critical review

Absolute binding free energies: a quantitative approach for their calculation

The binding energy distribution analysis method (BEDAM) for the estimation of protein-ligand binding affinities

On the nonpolar hydration free energy of proteins: surface area and continuum solvent models for the solute-solvent interaction energy

The AGBNP2 implicit solvation model

Free energy surfaces of beta-hairpin and alpha-helical peptides generated by replica exchange molecular dynamics with the AGBNP implicit solvent model

Comparing alchemical and physical pathway methods for computing the absolute binding free energy of charged ligands

Resolving the ligand-binding specificity in c-MYC G-quadruplex DNA: absolute binding free energy calculations and SPR experiment

Binding energy distribution analysis method: Hamiltonian replica exchange with torsional flattening for binding mode prediction and binding free energy estimation

Large scale free energy calculations for blind predictions of protein-ligand binding: the D3R Grand Challenge

Parameterization of an effective potential for protein-ligand binding from host-guest affinity data

Distinguishing binders from false positives by free energy calculations: fragment screening against the flap site of HIV protease

The mechanism of H171T resistance reveals the importance of Ndelta-protonated His171 for the binding of allosteric inhibitor BI-D to HIV-1 integrase

Virtual screening of integrase inhibitors by large scale binding free energy calculations: the SAMPL4 challenge

Elucidating the energetics of entropically driven protein-ligand association: calculations of absolute binding free energy and entropy

Rigorous free energy simulations in virtual screening

Automation of absolute protein-ligand binding free energy calculations for docking refinement and compound evaluation

GPU-accelerated molecular dynamics and free energy methods in Amber18: performance enhancements and new features

A cloud computing platform for scalable relative and absolute binding free energy predictions: new opportunities and challenges for drug discovery

Supercomputer-based ensemble docking drug discovery pipeline with application to Covid-19

Identify potent SARS-CoV-2 main protease inhibitors via accelerated free energy perturbation-based virtual screening of existing drugs

Discovery of new small molecule hits as Hepatitis B virus capsid assembly modulators: structure and pharmacophore-based approaches

AutoDock4 and AutoDockTools4: automated docking with selective receptor flexibility

AutoDock vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading

Large scale affinity calculations of cyclodextrin host-guest complexes: understanding the role of reorganization in the molecular recognition process

Improving prediction accuracy of binding free energies and poses of HIV integrase complexes using the binding energy distribution analysis method with flattening potentials

Novel HIV-1 capsid-targeting small molecules of the PF74 binding site

Structural and mechanistic bases for a potent HIV-1 capsid inhibitor

Molecular dynamics free energy simulations reveal the mechanism for the antiviral resistance of the M66I HIV-1 capsid mutation

Toward structurally novel and metabolically stable HIV-1 capsid-targeting small molecules

Testing inhomogeneous solvation theory in structure-based ligand discovery

Detection of weak hydrogen bonding to fluoro and nitro groups in solution using H/D exchange

Insights into the dynamics of HIV-1 protease: a kinetic network model constructed from atomistic simulations

Computational design of small molecular modulators of protein-protein interactions with a novel thermodynamic cycle: allosteric inhibitors of HIV-1 integrase

Faculty opinions recommendation of absolute binding free energy calculations using molecular dynamics simulations with restraining potentials

Mature HIV-1 capsid structure by cryo-electron microscopy and all-atom molecular dynamics

Evaluation and reparametrization of the OPLS-AA force field for proteins via comparison with accurate quantum chemical calculations on peptides †

Development and testing of the OPLS all-atom force field on conformational energetics and properties of organic liquids

AGBNP: an analytic implicit solvent model suitable for molecular dynamics simulations and highresolution modeling

Blind prediction of HIV integrase binding from the SAMPL4 challenge

Integrated modeling program, applied chemical theory (IMPACT)

Calculation of standard binding free energies: aromatic molecules in the T4 lysozyme L99A mutant

Computations of standard binding free energies with molecular dynamics simulations

Ligand binding thermodynamic cycles: hysteresis, the locally weighted histogram analysis method, and the overlapping states matrix

Perturbation potentials to overcome order/disorder transitions in alchemical binding free energy calculations

Simmerling C (2015) ff14SB: improving the accuracy of protein side chain and backbone parameters from ff99SB

Development and testing of a general amber force field

Fast, efficient generation of high-quality atomic. Charges AM1-BCC model: I. Method

X-ray structures of the hexameric building block of the HIV capsid

Evaluation of fluorescence-based thermal shift assays for hit identification in drug discovery

Novel in vitro screening system based on differential scanning fluorimetry to search for small molecules against the disassembly or assembly of HIV-1 capsid protein

High-density miniaturized thermal shift assays as a general strategy for drug discovery

We thank the IBM WCG team for providing IT support for running the FAAH project, and the WCG volunteers for their generous contribution of computing resources. Explicit solvent ABFE and FEP simulations were run on two local computing resources, the CB2RR HPC cluster and the OWLSNEST high performance cluster at Temple University, and the Comet clusters of XSEDE (MCB100145).

Data availability AutoDock Vina, which is free of charge, was used for docking (http://vina.scripps.edu/). The academic version of IMPACT, which can be obtained free of charge, was used to run BEDAM (https://github.com/ComputationalBiophysicsCollaborative/bedam_workflow). GROMACS 2018.3 was used for DDM calculations in explicit solvent (https://manual.gromacs.org/documentation/2018.3/download.html). The commercial Schrödinger Suite 2020-1 was used for relative binding free energy calculations (FEP+) and visualization (Maestro); the vendor's distribution can be found at https://www.schrodinger.com/downloads/releases. The calculated absolute binding free energies in explicit solvent, logP, experimental TSA results, and solubility in DMSO for the screened molecules are provided in the Supporting Information. Input molecular structures are available from https://zinc.docking.org/ and https://www.rcsb.org/. Output files are available from the corresponding author upon request.

The authors declare no competing financial interest.