key: cord-0903302-9miwxxzy
authors: Sprenger, Kayla G.; Conti, Simone; Ovchinnikov, Victor; Chakraborty, Arup K.; Karplus, Martin
title: Multiscale affinity maturation simulations to elicit broadly neutralizing antibodies against HIV
date: 2021-09-02
journal: bioRxiv
DOI: 10.1101/2021.09.01.458482
sha: 1500a3c2ced58d253ad57e495701aeded4a2a39a
doc_id: 903302
cord_uid: 9miwxxzy

The design of vaccines against highly mutable pathogens, such as HIV and influenza, requires a detailed understanding of how the adaptive immune system responds to encountering multiple variant antigens (Ags). Here, we describe a multiscale model of B cell receptor (BCR) affinity maturation that employs actual BCR nucleotide sequences and treats BCR/Ag interactions in atomistic detail. We apply the model to simulate the maturation of a broadly neutralizing Ab (bnAb) against HIV. Starting from a germline precursor sequence of the VRC01 anti-HIV Ab, we simulate BCR evolution in response to different vaccination protocols and different Ags, which were previously designed by us. The simulation results provide qualitative guidelines for future vaccine design and reveal unique insights into bnAb evolution against the CD4 binding site of HIV. Our model makes possible direct comparisons of simulated BCR populations with results of deep sequencing data, which will be explored in future applications.

Author Summary

Vaccination has saved more lives than any other medical procedure, and the impending end of the COVID-19 pandemic is also due to the rapid development of highly efficacious vaccines. However, we do not have robust ways to develop vaccines against highly mutable pathogens. For example, there is no effective vaccine against HIV, and a universal vaccine against diverse strains of influenza is also not available.
The development of immunization strategies to elicit antibodies that can neutralize diverse strains of highly mutable pathogens (so-called ‘broadly neutralizing antibodies’, or bnAbs) would enable the design of universal vaccines against such pathogens, as well as other viruses that may emerge in the future. In this paper, we present an agent-based model of affinity maturation (the Darwinian process by which antibodies evolve against a pathogen) that, for the first time, enables the in silico investigation of real germline nucleotide sequences of antibodies known to evolve into potent bnAbs, evolving against real amino acid sequences of HIV-based vaccine-candidate proteins. Our results provide new insights into bnAb evolution against HIV, and can be used to qualitatively guide the future design of vaccines against highly mutable pathogens.

and HQ (see Methods). The concentration profile was kept the same across all vaccination protocols (see Methods); therefore, differences in simulation outcomes are due solely to differences in Ag sequences or temporal patterns of administration.

Upon a single immunization with each of the four Ags, we find that the mean BCR breadth increases as follows: WT (breadth = 0.00) < KR (breadth = 0.005) < EU (breadth = 0.11) < HQ (breadth = 0.15). These results show that the choice of administered Ag has a significant influence on breadth. We next investigated whether the mean BCR breadth could be optimized by administering these Ags in different temporal patterns (Fig. 2A). Averaging the results across all 1-, 2-, or 3-Ag protocols, we find that the mean BCR breadth increases with the number of immunizations (breadth = 0.07, 0.39, and 0.73 for 1-, 2-, and 3-Ag protocols, respectively). However, the mean breadth is also observed to vary widely across the 2- or 3-Ag protocols, which shows that the temporal Ag administration pattern can have a significant impact on BCR breadth.
We also find that the mean number of GC cycles is correlated with breadth (Fig. 2B), as more GC cycles afford BCRs more time to accumulate mutations that confer breadth.

To understand the effects on AM of changing the temporal Ag administration pattern, we calculated the frustration imposed on the GC reactions by the different vaccination protocols, which was defined as the opposite sign of the average binding free energy of the GC-seeding BCR(s) and the Ag administered in the subsequent immunization. As such, the "frustration" is really a metric of the extent to which the existing B cell population is thrown off a previous steady state. The change in imposed frustration versus the change in mean BCR breadth between the current and former immunization is shown in Fig. 2C. We find a positive correlation between the change in frustration and the change in mean BCR breadth, with the increase being higher in each subsequent immunization (slopes reported in Fig. 2C). In addition, with the current set of four Ags, we find that progressively fewer protocols result in actual increases in frustration (i.e., rather than smaller decreases; those to the right of zero on the x-axis) with each additional immunization. This could indicate that three sequential immunizations with these Ags are not necessary to elicit sufficiently high-breadth Abs. Mechanistically, this is because BCRs become increasingly capable of compensating for increasing frustration using strengthened interactions with conserved Ag residues. We demonstrated this by calculating the mean number of BCR mutations towards Ag residues that are considered to be conserved in the CD4bs [20] (see Methods and Table S3 in the SI). Fig. 2D shows that the mean number of BCR mutations towards conserved Ag residues increases from 0.95 to 2.2 to 3.0 after one, two, and three immunizations, respectively.
Consequently, we observe a strong positive correlation ($R^2 = 0.79$) between the number of conserved site mutations and the mean BCR breadth.

In summary, we find that the elicitation of bnAbs is strongly dependent on the temporal

Fig. 3B shows that these additional Abs also employed ICM during their maturation. This is particularly interesting for CH103, which binds to the CD4bs in a significantly different pose than the other bnAbs we considered (Fig. S1). Overall, these results imply that ICM is likely to be a general mechanism employed during AM and is an observable that can be monitored in the simulations. In the SI, we explore the biological driving forces behind these results, which include differences in the mutability of the codons encoding different amino acid types at the interface, and the number of nucleotide mutations required to transition from one amino acid type to another.

The above comparison is only qualitative because many more simulations could be performed with the coarse-grained model due to the much lower computational cost, enabling conclusions about B cell survival rates. However, if trends from our past work hold true here as well, they would imply that rather than attempting to maximize the level of frustration in each immunization, there is an optimally increased level of frustration to impose (i.e., Ag sequence to use, in this case) in each immunization. Such an approach is predicted to simultaneously optimize the production of high-breadth Abs and B cell survival rates [8].

An important finding from this work, which was not possible to obtain with previous coarse-grained models, is that the VRC01 bnAb evolved mutations at the interface that enabled interfacial composition/electrostatic pattern matching against the CD4bs itself.
We showed that three other anti-CD4bs bnAbs also employed this mechanism, which we term

We discuss various driving forces behind ICM in the SI. In addition, better ICM may enable BCRs to more closely approach an Ag, thus allowing contacts with the conserved residues of the Ag. This, in turn, may increase the selection force to evolve mutations that bind better to the conserved residues, thus conferring breadth. Such a mechanism may promote bnAb evolution even though ICM per se does not correlate highly with breadth. Also, it remains unknown whether potent (but not necessarily broad) antibodies increase their degree of ICM during maturation. If they do, this would indicate that ICM is a more general feature of antibody evolution: a pattern matching mechanism that allows for general compatibility in the interactions between BCRs and Ags, before precise mutations are made to enable BCR binding to conserved Ag residues. Future longitudinal studies of bnAb evolution may help to shed light on this factor.

In view of our results, we suggest that a promising route to a universal vaccine for HIV is to first administer an Ag that promotes BCR mutations which increase the degree of ICM against the CD4bs (although we acknowledge that designing such immunogens may be nontrivial). This step should be followed by sequential immunizations with Ags that have an optimal number of increasingly different amino acids in the variable residues surrounding the conserved residues to achieve an optimally increasing temporal frustration profile. This procedure is expected to optimize the breadth of the mature BCR/Ab population, and according to our past work [8], it will also maximize the produced bnAb titers.

It is of high importance to understand how differences in Ag sequences lead to differences in the types of evolved BCR mutations.
An advantage of our model is the ability

The breadth of a BCR is defined as the fraction of Ags in a panel that can be neutralized by the Ab. We compute the binding breadth $B$, which is based on the binding free energy $\Delta G$ of the BCR for all Ags in the panel:

In this expression, $N$ is the number of Ags in the panel, $\Delta G_i$ is the binding free energy for the $i$-th Ag, and $\Delta G_{\mathrm{cutoff}}$ is an upper limit in the binding free energy above which the

for all 106 Ags in the Seaman panel and compared the two distributions (Fig. 5). For the calculations, PDB 5FYJ [28] was used as the template. From Fig. 5, we can see that the average binding free energy of the germline is shifted to higher values, indicating weaker binding, with a clear separation between the histograms for the mature and germline Abs. Using a cutoff binding free energy value of -9.9 kcal/mol, the computed breadths of the germline and mature Abs are 0.0 and 0.31, respectively. We note that less stringent (less negative) thresholds lead to even greater separation between these values (e.g., a threshold of -9.7 kcal/mol leads to germline and mature Ab breadths of 0.0 and 0.64, respectively). However, -9.9 kcal/mol was found to be the best threshold to use to discriminate among the BCRs produced in the AM simulations, for which the binding affinity distributions were observed to vary widely depending on the temporal Ag administration protocol.

In the crystal structure of VRC01 in complex with HIV gp120, nine residues were found to be in direct contact (within 4 Å) with glycans on the surface of the Ag (Fig. 6A, right). Our scoring function for the BCR/Ag binding free energy was not parameterized to include the effect of glycans.
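As a brief aside on the breadth definition given earlier in this section, the following is a minimal Python sketch rather than the authors' implementation. It assumes that breadth is simply the fraction of panel Ags whose binding free energy is at or below the cutoff, which is consistent with the description of the cutoff as an upper limit and with the reported effect of changing the threshold. The function name `binding_breadth` and the example energies are hypothetical.

```python
from typing import Sequence


def binding_breadth(delta_g: Sequence[float], delta_g_cutoff: float = -9.9) -> float:
    """Fraction of panel antigens counted as bound.

    Assumption: an antigen counts as bound when its binding free energy
    (kcal/mol) is at or below the cutoff; more negative means stronger binding.
    """
    if len(delta_g) == 0:
        raise ValueError("the antigen panel must contain at least one entry")
    n_bound = sum(1 for dg in delta_g if dg <= delta_g_cutoff)
    return n_bound / len(delta_g)


# Hypothetical binding free energies of one BCR against a four-antigen panel.
panel = [-10.4, -9.95, -9.8, -9.2]
print(binding_breadth(panel))        # cutoff of -9.9 kcal/mol
print(binding_breadth(panel, -9.7))  # a less stringent cutoff admits more antigens
```

With this form, relaxing the cutoff from -9.9 to -9.7 kcal/mol can only keep or raise the breadth, in line with the larger mature-Ab breadth reported for the -9.7 kcal/mol threshold.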
Thus, we removed the glycan moieties from the structure files prior to the calculations and excluded these residues from our CD4bs interfacial Ab residue definition, which was used to determine the degree of interfacial composition matching of the BCRs from the AM simulations. We also excluded an additional two interfacial residues in VRC01GL that positionally aligned with these nine residues (Fig. 6A, left). All other interfacial residues identified for VRC01GL and VRC01 were considered, for a total of 22 residues (see Table S2 in the SI). We note that inclusion of the 11 glycan-contacting residues in our calculations would lead to incorrect assessments of the degree of ICM

BCR interfacial residues in contact with conserved Ag residues were similarly identified using VMD. A total of 15 Ab residues were identified to be within 4 Å of one of eight conserved Ag residues in PDB code 5FYJ (see Table S3 in the SI). These 15 residues represent 68% of the 22 total BCR interfacial residues listed in Table S2. The conserved Ag

where the four type labels stand for polar, apolar, acidic, and basic amino acid types, respectively, the first fraction is the interfacial fraction of amino acid type $k$ for clone $i$, and the second is the average interfacial fraction of amino acid type $k$ for the Ag panel.

and 3 single-Ag sequential immunizations, (B) mean BCR breadth vs. mean number of GC cycles, (C) changes in mean BCR breadth vs. changes in frustration between current (t) and former (t-1) immunizations, and (D) mean BCR breadth vs.
mean number of mutations towards conserved Ag residues

Tackling influenza with broadly neutralizing antibodies
Germinal centers
Optimal immunization cocktails can promote induction of broadly neutralizing Abs against highly mutable pathogens
Manipulating the selection forces during affinity maturation to generate cross-reactive HIV antibodies
Optimizing immunization protocols to elicit broadly neutralizing antibodies
A stochastic model of the germinal center integrating local antigen competition, individualistic T-B interactions, and B cell receptor signaling
Simulation of B cell affinity maturation explains enhanced antibody cross-reactivity induced by the polyvalent malaria vaccine AMA1
Optimal sequential immunization can focus antibody responses against diversity loss and distraction
Trade-offs in antibody repertoires to complex antigens
Competitive exclusion by autologous antibodies can prevent broad HIV-1 antibodies from arising
Preferential presentation of high-affinity immune complexes in germinal centers can explain how passive immunization improves the humoral response
Induction of broadly neutralizing antibodies in Germinal Centre simulations
Homology modeling-based in silico affinity maturation improves the affinity of a nanobody
A sound strategy for homology modeling-based affinity maturation of a HIF-1α single-domain intrabody
A consensus protocol for the in silico optimisation of antibody fragments
Models of somatic hypermutation targeting and substitution based on synonymous mutations from high-throughput immunoglobulin sequencing data
Design of immunogens to elicit broadly neutralizing antibodies against HIV targeting the CD4 binding site
Rational HIV immunogen design to target specific germline B cell receptors
Somatic mutation leads to efficient affinity maturation when centrocytes recycle back to centroblasts
Role of framework mutations and antibody flexibility in the evolution of broadly neutralizing antibodies. eLife
Comparative protein structure modeling using MODELLER
New statistical potential for quality assessment of protein models and a survey of energy functions
Mechanisms underlying vaccination protocols that may optimally elicit broadly neutralizing antibodies against highly mutable pathogens
Trimeric HIV-1-Env structures define glycan shields from clades
Tiered categorization of a diverse panel of HIV-1 Env pseudoviruses for assessment of neutralizing antibodies
Structural basis for germline antibody recognition of HIV-1 immunogens
Increasing the potency and breadth of an HIV antibody by using structure-based rational design
Co-evolution of a broadly neutralizing HIV-1 antibody and founder virus
Sequential immunization elicits broadly neutralizing anti-HIV-1 antibodies in Ig knockin mice
Tailored immunogens direct affinity maturation toward HIV neutralizing antibodies
Sustained antigen availability during germinal center initiation enhances antibody responses to vaccination
Sequential immunization with a subtype B HIV-1 envelope quasispecies partially mimics the in vivo development of neutralizing antibodies
Sequential and simultaneous immunization of rabbits with HIV-1 envelope glycoprotein SOSIP.664 trimers from clades
Sequential immunizations with a panel of HIV-1 Env virus-like particles coach immune system to make broadly neutralizing antibodies
Immunization for HIV-1 broadly neutralizing antibodies in human Ig knockin mice
Visualizing antibody affinity maturation in germinal centers
Mutation drift and repertoire shift in the maturation of the immune response
Optimality of mutation and selection in germinal centers
T follicular helper cell differentiation, function, and roles in disease. Immunity
Clonal selection in the germinal centre by regulated proliferation and hypermutation
Germinal center dynamics revealed by multiphoton microscopy with a photoactivatable fluorescent reporter
Estimation of the breadth of CD4bs targeting HIV antibodies by molecular modeling and machine learning
Fitness landscape of the human immunodeficiency virus envelope protein that is targeted by antibodies
VMD: Visual molecular dynamics
Structural basis for broad and potent neutralization of HIV-1 by antibody VRC01. Science
Structural repertoire of HIV-1-neutralizing antibodies targeting the CD4 supersite in 14 donors

Simulated vaccination protocols with 1, 2, or 3 sequentially administered Ags. Results are shown for the mean BCR breadth, mean degree of interfacial composition/electrostatic pattern matching (ICM), and the mean number of GC cycles. Standard deviations are on average 0.05 for both mean BCR breadth and ICM, and 2.6 for the number of GC cycles.

List of 22 amino acid residues used in the in silico CD4bs interfacial BCR residue definition and Fig. 7 (main text). Residue numbers and identities correspond to those in PDB

List of conserved residues in the CD4bs of HIV, as determined by Conti et al. [3], and the corresponding residues of VRC01 in contact with the Ag residues, used to characterize BCR conserved site binding (Fig. 2D, main text).

Snapshots from Visual Molecular Dynamics (VMD [4]) of VRC01 (left) and CH103 (right) in complex with gp120-based Ags. Ags are shown in pink and Abs in transparent gray, with the six complementarity-determining regions (CDRs) colored in red (CDRH2), orange (CDRL2), yellow (CDRH3), green (CDRL3), blue (CDRH1), and purple (CDRL1).

Basic biological principles govern the mutations acquired by VRC01 during AM.
AA type, is computed as the sum of the relevant column values minus the sum of the relevant row values.
(B) Heat map of the average number of nucleotides needed to transition from one AA type to another. (C) Mean mutability of the different AA types, based on the model of Yaari

Mean BCR breadth of vaccination protocols with t=1, 2, and 3 single-Ag sequential immunizations versus the mean weighted degree of interfacial composition and electrostatic pattern matching (ICM), computed using different thresholds for determining the BCR breadth. Error bars are only shown for the degree of ICM for clarity and represent the standard deviation of two

Fraction of different AA types at the BCR/Ag interface in BCR sequences produced after administration of protocols beginning with (A) the EU Ag, (B) the KR Ag
Black dashed lines indicate the interfacial amino acid fractions in the CD4bs of HIV.

Linear correlation between experimental and computed binding affinities using the RF_HA_SRS statistical pair potential. Each dot represents one binding affinity value. The data in orange are used for the training of the linear regression

To gain insight into other factors that might influence the evolutionary trajectories of these Abs, we

into its bnAb form. We considered 33 interfacial residue indices in this analysis, including the 21 overlapping residue indices between VRC01GL and VRC01, as well as the 4 additional residue indices identified solely in VRC01GL and 8 additional residue indices identified solely in VRC01. Of these 33 residues, 14 were mutated during the evolution of VRC01GL into VRC01.
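The mutation bookkeeping used in the analysis that follows can be illustrated with a minimal Python sketch. It classifies each mutation by the AA type it was mutated from and to, then computes a selection tendency for one AA type as the column sum minus the row sum of the resulting fraction matrix, as stated in the Fig. S2 caption (rows are taken here to be the "from" type and columns the "to" type). The four-class residue grouping, the function names, and the example mutations are illustrative assumptions, not data or code from this work.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Assumed grouping of the 20 amino acids into four types (illustrative only;
# the grouping used in the paper may differ).
GROUPS = {
    "polar": "STNQYCG",
    "apolar": "AVLIMFWP",
    "acidic": "DE",
    "basic": "KRH",
}
AA_TYPE = {aa: aa_type for aa_type, residues in GROUPS.items() for aa in residues}


def transition_fractions(mutations: Iterable[Tuple[str, str]]) -> Dict[Tuple[str, str], float]:
    """Map (from_type, to_type) to the fraction of all mutations of that kind."""
    counts = Counter((AA_TYPE[src], AA_TYPE[dst]) for src, dst in mutations)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("at least one mutation is required")
    return {pair: n / total for pair, n in counts.items()}


def selection_tendency(fractions: Dict[Tuple[str, str], float], aa_type: str) -> float:
    """Column sum (mutations to aa_type) minus row sum (mutations away from it)."""
    to_type = sum(f for (src, dst), f in fractions.items() if dst == aa_type)
    from_type = sum(f for (src, dst), f in fractions.items() if src == aa_type)
    return to_type - from_type


# Hypothetical germline-to-mature substitutions, written as (from, to).
example = [("S", "R"), ("T", "A"), ("N", "Q"), ("A", "V")]
fractions = transition_fractions(example)
for aa_type in GROUPS:
    print(aa_type, selection_tendency(fractions, aa_type))
```

Same-type substitutions (for example polar-to-polar) appear in both the row and the column sum and therefore cancel, so the tendencies across the four types always sum to zero.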
We characterized each of these 14 mutations according to which amino acid (AA) type it was mutated from/to in VRC01GL and VRC01, respectively

3A (main text), the results of Fig. S2A show that a significant fraction of the interfacial mutations was made away from polar AAs

even considering the large fraction of mutations made away from polar AAs simply to other polar AAs. We sum up the fraction of mutations made away from polar AAs (Fig. S2A, top row) and subtract this from the summed fraction of mutations made to polar AAs

to see that the overall selection tendency for polar AAs at the interface is strongly negative (-0.50). Similarly, we find a strong positive selection tendency towards apolar (+0.15) and basic

3A (main text). However, the same analysis shows a mild positive selection tendency towards acidic residues

3A. In VRC01GL, 3/25 interfacial residues are acidic, compared to 3/29 residues in VRC01, resulting in the decreased interfacial acidic fraction in Fig. 3A. For one of the acidic interfacial residues in VRC01

These results raise a series of questions, such as why the need for increasing the basic interfacial fraction was met primarily through making mutations away from polar residues versus away from apolar residues, or why so many seemingly unessential polar-to-polar and apolar-to-apolar transitions occurred

Another contributing factor for these mutations is that many polar-to-polar and apolar-to-apolar mutations are "neutral" mutations, which are neither positively nor negatively selected during AM. It is known that VRC01-class Abs typically emerge only after many years of infection [5,6], and so the Abs have long maturation periods over which to accumulate these largely irrelevant mutations. The reasons for the lack of basic-to-basic and acidic-to-acidic mutations (or any mutations away from charged AAs, for that matter) are two-fold. We can observe from Fig.
3A (main text) that charged AA types make up the lowest interfacial fractions by far in VRC01GL

more mutability score of all 5-mers containing a given codon for all codons encoding for a particular AA type. Computing the mutability scores for the individual nucleotides in the CDRs of VRC01GL in a similar manner, Fig. S2D shows that there are many more nucleotides in codons that encode for polar AAs (and even apolar AAs) with high mutability scores than there are basic or acidic AAs with high mutability scores

To summarize, four key biological principles dictate the types of observed mutations in

simulations. From a targeting standpoint, both the initial, relative proportions of different AA types at the interface (principle #1) and the mutability of the codons encoding for the interfacial AAs

Lastly, from a selection standpoint, a large initial difference

these principles and the effects that changes in the parameters have on the ensuing number and type of acquired Ab mutations will be explored in detail in future studies (e.g., the impact on AM of using germline sequences with different initial mutability patterns or degrees of interfacial composition matching). Here, we explore the impact of administering multiple variant Ags in different

Analysis of mutational trajectories of individual clonal BCR sequences

The results for the EU-based protocols (Fig. S4A) and HQ-based protocols (Fig. S4D) were discussed in the main text. For the KR-based protocols (Fig. S4B), we find that many B cell clones evolved basic-to-polar mutations after just the first immunization with KR, and this trend continued after a second immunization with KR (Fig. S4B, blue line

basic interfacial fractions (and the lower ICM score) obtained after the first immunization with KR. For the rest of the KR-based protocols, no clear trends exist in the types of mutations that are evolved after the second and third immunizations, leading to the wide array of final ICM scores

After the second immunization, regardless of which Ag was administered, we observe relatively large decreases and increases in the polar and basic interfacial fractions, respectively. However, these ICM gains are offset by mutations that decreased the apolar interfacial fraction for all four WT-based protocols. The third and final immunization, again regardless of which Ag was administered, generally led to little change in the interfacial fraction of any amino acid type

At each cycle of the affinity maturation simulation this protocol will be called hundreds of times, so it is imperative to reduce the computational time as much as possible. The methods we use are based on all-atom descriptions of the antibody-antigen protein-protein complex

As templates we use crystal structures of HIV antibodies bound to the gp120 HIV surface protein.
The first step of our protocol is thus to create one model of the full complex with Modeller, by grafting the

to reduce the computation time. Hydrogen atoms are not included in this step, since it was found that in this quick protocol, they often generate the wrong chirality, so only the positions of heavy atoms are generated. The second step is to optimize the structure. We need to add hydrogen atoms, check for the presence of disulfide bonds, and check for the presence of errors in chirality

To compute the binding affinity we use the Rykunov-Fiser statistical pair potentials (RFSPP) [10]. These potentials compute the stability of each protein ($G_x$), and the binding affinity ($dG$) can be computed as the difference $dG = G_{cpx} - G_{lig} - G_{rec}$. This whole procedure was repeated 12 times starting with the creation of the model with Modeller using different initial random seeds. The final RFSPP scoring was obtained as the average over the 12 values. Two RFSPP are available: an all-heavy atoms model (RF_HA_SRS) and a beta-carbon directional potential (RF_CB_SRS_OD)

The described protocol requires about 12 minutes of computer time to compute one binding affinity (about 1 minute per model)

Data were collected for four complexes: the resurfaced stabilized core 3 (RSC3) antigen bound to antibodies VRC01, VRC03, and VRC-PG04, and antigen 93TH057 bound

point mutations and, in the last case, also for a few insertions and deletions. For each available experimental dG, the RFSPP score was evaluated as described, and the obtained scores were compared with the experimental dG. As templates we used PDBs 3NGB [12], 3SE8 [13], 3SE9 [13], and 5FYJ [1], respectively, for the four base complexes. The scale and unit of measure of the RFSPP score is uncertain

Due to the use of a random split in the training and validation sets, the training can be repeated a number of times and different models can be obtained.
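The scoring step described above (a difference of three stability scores, averaged over independently built models) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ModelScores` container, the function names, and the example numbers are hypothetical, and the stability scores themselves are taken as given rather than computed with the RFSPP code.

```python
from statistics import mean
from typing import Iterable, NamedTuple


class ModelScores(NamedTuple):
    """Stability scores for one structural model (hypothetical container)."""
    g_cpx: float  # score of the full antibody-antigen complex
    g_lig: float  # score of one isolated binding partner
    g_rec: float  # score of the other isolated binding partner


def binding_score(scores: ModelScores) -> float:
    """dG = G_cpx - G_lig - G_rec for a single model."""
    return scores.g_cpx - scores.g_lig - scores.g_rec


def averaged_binding_score(models: Iterable[ModelScores]) -> float:
    """Average dG over independently built models (12 in the protocol above)."""
    values = [binding_score(m) for m in models]
    if not values:
        raise ValueError("at least one model is required")
    return mean(values)


# Three hypothetical models of the same complex built from different random seeds.
replicas = [
    ModelScores(-100.0, -40.0, -50.0),
    ModelScores(-104.0, -41.0, -51.0),
    ModelScores(-102.0, -40.0, -51.0),
]
print(averaged_binding_score(replicas))
```

Averaging over replicas reduces the sensitivity of the score to the random seed used to build any single model, which is the stated reason for repeating the procedure.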
To reduce this randomness and avoid picking an arbitrary initial random seed, the full training was repeated 10,000 times and all Rp, Rs, MAE and RMSE were stored. To remove all solutions with obviously bad properties (like very low Rp, or very high MAE) a Pareto optimization strategy [14] was adopted

Seeking a model that is balanced between the two limits, the third model of the front was chosen. The obtained linear fit with the RF_HA_SRS statistical pair potential shows a Rp=0

The main negative feature is the low slope of the linear regression

This problem is common to all tested linear regressions, even using very different descriptors (other statistical potentials, implicit solvents, buried surface area based

The final linear regression result is dG score = m * RFSPP + q, with m=0.00619 and q=

Linear correlation between experimental and computed binding affinities using the RF_HA_SRS statistical pair potential. Each dot represents one binding affinity value. The data in orange are used for the training of the linear regression, while in blue for the validation.

Scoring function validation: scan for spurious mutations

The affinity maturation protocol involves the stepwise accumulation of mutations from the germline to move towards some mature-like antibody. Intuitively, we expect that a single point mutation cannot produce too large a gain in binding affinity, and also that there are no particularly good mutations that always increase the binding affinity. Due to the use of a very crude and fast scoring function

This is the expected behavior for single point random mutations. With the same data, it is also possible to check if some mutations (e.g., mutation to cysteine) are always beneficial mutations. This would imply that a bias towards that particular amino acid is present in the scoring function. The bar plot in Fig.
S6b shows the probability that mutating to each amino acid results in an improved binding affinity (i.e., an amino acid with a probability of one means that while scanning the 25 residues in the binding site, mutating any of those residues to that amino acid results in better binding). Intuitively, we expect that no "preferred" amino acid should exist, which means that the probability should never be close to one, but "terrible" amino acids

S6. a) Relative binding affinity changes while mutating the 25 residues of VRC01GL at the binding site into each possible amino acid, sorted by magnitude. The range of the magnitudes is, as expected, limited, with most mutations neutral (close to zero). b) Bar plot of the probability that mutating to a given amino acid improves the binding affinity

1. Trimeric HIV-1-Env structures define glycan shields from clades A, B, and G
2. Rational HIV immunogen design to target specific germline B cell receptors
3. Design of immunogens to elicit broadly neutralizing antibodies against HIV targeting the CD4 binding site
4. VMD: Visual molecular dynamics
5. Broadly neutralizing antibodies and the search for an HIV-1 vaccine: The end of the beginning
6. Manipulating the selection forces during affinity maturation to generate cross-reactive HIV antibodies
7. Models of somatic hypermutation targeting and substitution based on synonymous mutations from high-throughput immunoglobulin sequencing data
8. Comparative protein structure modeling using MODELLER
9. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations
10. New statistical potential for quality assessment of protein models and a survey of energy functions
11. Tiered categorization of a diverse panel of HIV-1 Env pseudoviruses for assessment of neutralizing antibodies
12. Structural basis for broad and potent neutralization of HIV-1 by antibody VRC01
13. Focused evolution of HIV-1 neutralizing antibodies revealed by structures and deep sequencing
14. Multi-Criteria Decision Analysis for Supporting the Selection of Engineering Materials in Product Design. Multi-criteria Decision