key: cord-0991290-ywhu66o7 authors: Watabe, Teruaki; Kishino, Hirohisa title: Structural Considerations in the Fitness Landscape of a Virus date: 2010-03-01 journal: Mol Biol Evol DOI: 10.1093/molbev/msq056 sha: a69d7201e40dfdad69b0319d749365f627c375e7 doc_id: 991290 cord_uid: ywhu66o7 Viral fitness is determined by replication within hosts and transmission between them. We examine how pleiotropic mutations that have antagonistic effects (i.e., antibody evasion vs. receptor binding) on viral replication within hosts can impact viral immune escape in the host population. When the host population is vaccinated, the virus escapes from passive immunity by mutations in the antibody-binding region on the surface of the target protein. However, the reduced ability of the antibody to bind the virus is often accompanied by a reduced ability of the virus to bind the cell receptor because the antibody-binding region overlaps with the receptor-binding domain (RBD). The types of permitted mutations are limited. To investigate the causal relation between a mutation in a viral genome and adaptive evolution of a viral population, we developed a mathematical model that describes the population dynamics of viruses, antibodies, and normal/infected cells within a host. The coefficients describe the binding affinity between the virus and the induced antibody and that between the virus and its receptor. Our knowledge-based index enables us to estimate the effect of a mutation in a binding region on the binding affinity. Using population genetic theory, we evaluated the probability that a mutant is fixed in a host population. The mutations that can be fixed with high probabilities may determine how long a vaccine remains effective. We simulate the adaptive evolution of coronavirus, the etiological agent of severe acute respiratory syndrome, and show that some of mutations in the RBD may have high fixation probabilities in the vaccinated host population. Viruses provide an ideal opportunity to quantify the causal effect of a mutation in a genome on the adaptive evolution of the population. Phenotypic evolution is driven by selection pressure acting on viability and/or reproductive ability. Physiological and developmental processes of higher organisms are robust against genomic mutations (Wagner 2000; Li et al. 2004 ). Due to this buffering mechanism, genetic variations are preserved in phenotypically normal populations, and it is therefore difficult to establish a causal relationship between genomic mutations and phenotypic alterations (Rutherford 2000) . On the other hand, viruses have a simple life cycle, so their adaptive evolution can be explained by a few factors (Beerenwinkel et al. 2002) . In viral evolution, mutations in genetic coding sequences may correlate with phenotypic evolution through structural changes in proteins that function at various stages of the life cycle. This enables the simultaneous investigation of the phenotypic and genotypic evolution of viruses by monitoring the structural changes in proteins that are caused by genotypic changes and that are reflected in phenotypic differences. We focus this study on the adaptive evolution of virus under the selection pressure of passive immunity induced by a vaccine. A mutation in a viral genome may change the structure of the antibody-binding region that is often overlapped with or close to the receptor-binding domain (RBD) . The structural change of the antibody-binding re-gion may affect the binding affinity between the virus and the host cell receptor as well as the binding affinity between the virus and the antibody. Therefore, the virus must decrease its affinity to antibodies while maintaining its ability to bind the receptor. To quantify the causal effect of a mutation on the improved infectivity, we first develop a mathematical model to describe viral dynamics in a vaccinated host and to construct the fitness landscape of the virus. The model describes the dynamics of three interacting populations: host cells, viruses, and induced antibodies. Two types of binding affinities are incorporated into the model as coefficients in the differential equations. Our knowledgebased index of binding affinities, or association constants (Watabe et al. 2007 ), enables us to estimate changes in the values of the coefficients due to mutation. Assuming that epidemic factors are unchanged, fitness is measured as the size of the mutant population within a host relative to the size of the wild-type population. Adopting a population genetic approach, we calculate the probability that a mutant is fixed in a viral population within and between hosts. As an empirical example, we analyze the adaptive evolution of a coronavirus, which was the etiological agent of severe acute respiratory syndrome (SARS; Ksiazek et al. 2003; Rota et al. 2003) . SARS coronavirus (SARS-CoV) utilizes angiotensin-converting enzyme 2 (ACE2) to attach its spike (S) protein to the host cell (Li et al. 2003) . The cross-species transmission of the virus was permitted by a few amino acid substitutions in the RBD of the S protein Song et al. 2005) . In order to prevent reemergence of SARS, potent therapeutics and vaccines are needed. The S protein is a major antigenic determinant and is therefore an attractive target for neutralizing antibodies. It has been suggested that passive immunization with human monoclonal antibodies (mAbs) might be used to treat SARS (Holmes 2003) , and human mAbs with a potent neutralizing effect against the RBD of the S protein have been identified (Sui et al. 2004; Prabakaran et al. 2006 ). Using our knowledge-based model of viral population dynamics and population genetics, we predict amino acid mutations that are most likely to fix in the viral population. Summary of the Whole Process Figure 1 summarizes the whole process described below. In this study, we determine a mutant virus by different values of the two parameters of the mathematical model and assume that the parameters are proportional to the association constants of the protein complexes: the S protein-ACE2 receptor complex and the S proteinantibody complex. We simulate the emergence of mutants by changing the amino acid sequence of the RBD of the S protein. The amino acid substitutions in the RBD affect the association constants of the protein complexes, and hence, the parameters of the mathematical model that characterize the virus change. We determine the amino acid mutation rate by structural constraints on the muta-tions. In the simulation, the rate is incorporated by generating mutations according to the amino acid preferences in the local structural environment, and the association constants are evaluated by a knowledge-based sequencestructure fitness measure (Watabe et al. 2007 ). To gain insight into the population dynamics of infectious viruses in a vaccinated host, we here develop a mathematical model that describes the dynamics of three interacting populations: host cells, virus particles, and neutralizing antibodies (Nowak and Bangham 1996) . For vaccinemediated protection against viral diseases, the crucial agents are antibodies, which mostly neutralize free viral particles (Burton 2002) . Hence, the model contains four variables: uninfected cells (X), infected cells (Y), free viral particles (V), and neutralizing antibodies (W). These variables obey the following differential equations: Uninfected cells are constantly produced at rate k and infected by free viruses at rate bXV. Infected cells produce free viral particles at rate kY. The half-lives of uninfected cells, infected cells, and free viral particles are 1/d, 1/a, and 1/u, respectively. The basic reproductive ratio of the virus is given by R 0 5 bkk/(adu). In the work of Nowak and Bangham (1996) , the variable W represents B cells, and its production rate is set to rVW. In the present study, the variable W represents neutralizing antibodies; therefore, its production rate is set to be proportional to the amount of free viral particles, rV. The half-life of neutralizing antibodies is 1/h. Free viral particles are neutralized by antibodies at rate qVW. In order to avoid ambiguity caused by the large number of parameters, we normalized the variables by the parameters in equation (1) and solved the equations numerically (see Supplementary Material online). The half-life of uninfected cells, 1/d, is set to 100 days (Nowak and Bangham 1996) and that of neutralizing antibodies (immunoglobulin G), 1/h, is set to 20 days (Insel and Looney 2004) . The virion half-life has been estimated for several viruses. In an analysis of viral dynamics during antiviral therapy, the hepatitis C virus (HCV) has been estimated to have an average half-life of 2.7 h (Neumann et al. 1998) . For the hepatitis B virus (HBV), the estimated virion half-life is about 1 day . The halflife of the human immunodeficiency virus has been estimated to be 5.8 h (Perelson et al. 1996) . The estimated half-life of infected cells not only depends on the virus but also shows considerable interpatient variation. Halflives of 1.7-70 days for HCV (Neumann et al. 1998 ) and 10-100 days for HBV have been obtained. In the present study, we employ two values We introduce viral mutations that result in different binding affinities to the receptor and/or to the antibody. Assuming that a mutant virus emerges immediately after infection, we extend the mathematical model (eq. 1) as follows. Until a mutant emerges ð0 t,tÞ, the form of the extended equation is the same as the original, except that subscripts are used: Following emergence of the mutant ðt !tÞ, mutant viruses V 2 infect cells, and cells infected by mutant viruses, denoted as Y 2 , produce free viral particles of the mutant: Here the subscript i is 1 or 2. We consider a mutant emerges on a time scale of a half-life of the virus. We set the timet equal to the half-life of free viral particles 1/u. As with equation (1), we normalize the variables by the parameters in equations (2) We define the fitness of a viral population (characterized by a parameter set (b, q)5(x, y) in eq. 1) within vaccinated hosts by integrating the variable V over time: We assume that transmission between hosts is directly related to viral load within infected hosts (i.e., eq. 4). The incubation period of SARS is usually 3-5 days (World Health Organization 2003) , and the virus can infect new hosts until medical treatment occurs. The medical treatment may start within a few days of onset. Hence we should consider 5-8 days as a time period that the virus can infect to other hosts. Therefore, we set the range of integration as 0 to s. On the (b,q)-plane, a viral fitness landscape is given based on changes in w. Viruses with higher values of w can infect new hosts more efficiently. On the other hand, the final results of our analysis do not depend much on the value of s (data not shown). The aspect of the fitness landscape is characterized by the truncation of the integration itself rather than the time point of the truncation. Hence, we set the value of s equal to 10 days. In order to evaluate the fixation probability of a mutant virus within a vaccinated host population, we must determine the selective advantage and initial frequency of the mutant. The selective advantage can be defined by a ratio between the fitness of the mutant virus and that of the ancestral virus: s 5 wðx#; y#Þ wðx; yÞ À 1: Here, w in the numerator is the fitness of the mutant virus characterized by a parameter set (b,q) 5 (x # ,y # ), whereas w in the denominator is the fitness of the ancestor characterized by a parameter set (b,q) 5 (x,y). When the fitness of the mutant virus is the same as that of the ancestral virus (w(x # ,y # ) 5 w(x,y)), the value of its selective advantage is zero. On the other hand, a positive (or negative) value of the selective advantage corresponds to the mutant virus with higher (or lower) fitness than that of the ancestor. The fixation probability of an allele with frequency p is obtained as the solution of the ordinary differential equation vðpÞ 2 d 2 uðpÞ dp 2 þ mðpÞ duðpÞ dp 5 0; with boundary conditions u(0) 5 0 and u(1) 5 1 (Kimura 1962) . Here, the second term represents the change in the frequency that is driven by the selective advantage and the first term corresponds to the drift of the frequency. m(p) and v(p) are expressed as Note that the selective advantage is not necessarily small in our case (Wright 1931) . N is the host population size, in which all hosts are vaccinated and infected by the ancestral virus before the mutant emerges. We assume that the size of the host population is constant until the mutant virus becomes fixed in the population although the individuals in the population change. In the present study, we consider a million scale of vaccinee and assume that infected vaccinee population stands at 1% (N 5 10,000). As a result, we obtain the following formula for the fixation probability of the mutant virus: Population Dynamics of an Emerging Mutant Virus within Its First Host and the Initial Frequency in the Host Population An emerging mutant virus within a host, which is characterized by the parameter set (b 2 ,q 2 ) 5 (x # ,y # ) in equation (3), inevitably competes with the existing viral strain characterized by the parameter set (b 1 ,q 1 ) 5 (x,y). The result of this competition determines whether the emerging mutant can infect new hosts. To infect new hosts, the mutant must achieve a sufficient viral load within its first host. We define the integrals of the variables in equation (3) We consider the ratio between w 2 and the sum of w's of all hosts within the population, (N À 1)w(x,y) þ w 1 (x,y) þ w 2 (x # ,y # ), as a frequency that determines the number of new hosts infected by the mutant virus. We employ the Poisson distribution: Here, k represents the number of new hosts infected by the mutant in the next generation, and P(g;k) is the probability that the number of new hosts is k. g represents the expected value of the number of the new hosts. Combining the fixation probability with a given initial frequency (eq. 6) and the probability of the number of new hosts infected by the mutant (eq. 7), the fixation probability of the mutant virus in the host population is given by: The rate at which one amino acid residue changes to another is the product of two factors: the rate at which mutations occur at the corresponding residue (a i at residue i) and the frequency with which the new amino acid is selected at the residue (f i aa #for a change from amino acid a to a # at residue i). The rate a i is determined by several factors, including the error rate of the viral polymerase. We assume that the rate a i is constant among residues in the RBD. Here, we do not explicitly consider the time span of occurrence of a resistant virus to the vaccine. Hence, we do not fix the explicit value of the rate. The frequency f aa # i should be determined mostly by structural constraints on mutations. In a simulation of amino acid substitutions, we incorporated the frequency by generating mutations according to the amino acid preferences in the local structural environment (P(a # i jX); Watabe et al. 2006 Watabe et al. , 2007 . In the present study, we examine the adaptive evolution of the virus by using the viral fitness landscape within a vaccinated host population that was obtained from the mathematical model of viral population dynamics in vaccinated hosts (eq. 1). The virus starts from a point on the fitness landscape and moves through the landscape based on changes of binding affinities resulting from particular mutations in the amino acid sequence of the RBD. The parameters of the mathematical model, (b,q), represent the strengths of couplings between virus particles and uninfected cells and between virus particles and antibodies. Those couplings are mediated by proteins: the S proteins, the ACE2 receptors, and the fragments of antigen-binding (Fabs) of antibody. Hence, we assume that the parameters of the mathematical model, (b,q), have the following relationships to the association constants of the protein complexes (K a . The trajectory of mutants on the fitness landscape was characterized by the changes in association constants calculated for the mutated amino acid sequence of the RBD. In silico modeling based on molecular dynamics is a tool to observe structural changes caused by mutations occurring in genome. In order to carry out in silico calculations, information about the structure of the corresponding molecular complex is needed as a template. In the case of SARS-CoV, the structure of the complex formed between the RBD and the peptidase domain of human ACE2 has been determined (Li F et al. 2005) , and X-ray crystallography data are available in the Protein Data Bank (PDB). The structure of the S protein when bound by a neutralizing antibody has also been described (Hwang et al. 2006; Prabakaran et al. 2006) . In order to estimate the binding free energy of the protein complex, the protein-protein interactions must be investigated. Several computational methods have been proposed (Elcock et al. 2001) . Molecular dynamics simulations have increasingly been used to study protein-protein interactions. However, because of long convergence time, molecular dynamics simulations are prohibitive for large systems such as protein complexes (Cui et al. 2008) . In this study, we employ a Structural Considerations in the Fitness Landscape · doi:10.1093/molbev/msq056 knowledge-based sequence-structure fitness measure (Watabe et al. 2007 ) to evaluate the binding affinities between the S protein of SARS-CoV and the Fabs of mAbs and also those between the S protein and its receptor. The sequence-structure fitness measure is a likelihood-based index, which quantifies a degree to which a sequence fits in a given structure. The measure is calculated empirically based on amino acid preferences and pairwise interactions in structural environments present in template structures. By bypassing structural prediction, it is possible to analyze thousands of sequences and correlate high-fitness sequences with structural changes along evolutionary pathways. The binding affinity of a complex formed between proteins A and B is measured as the association constant of the complex, K a 5 ½AÁB ½A½B , where [A], [B] , and [AÁB] are the concentrations of proteins A, B, and their complex in equilibrium. We can relate the ratio to the likelihood ratio, K a } PðAÁBÞ PðAÞPðBÞ (see Supplementary Material online), and predict the ratio using information from the protein database. Using the fact that the amino acid sequences in the binding region and the structure of the complex characterize the binding affinity of the complex, we represent the likelihood as the likelihood of the amino acid sequences given the structure: Because we do not consider structural changes in the main chains of the proteins, the likelihoods P(str AþB ), P(str A ), and P(str B ) are constant. Thus, the index of the affinity of the complex is defined as the likelihood ratio of the amino acid sequences in the binding region of the complex to those in the binding region of the uncomplexed proteins: The sequence distribution of a structure P(seqjstr) in equation (8), which we call the sequence-structure fitness measure, expresses the fitness of the amino acids given the structural environment and can be represented by the amino acid preference and the pairwise interaction (Watabe et al. 2006 (Watabe et al. , 2007 : Here, an amino acid sequence and the positions of the C a atoms of amino acid residues are denoted by A 5 (a 1 ,. . .,a n ) and X 5 (x 1 ,. . .,x n ), respectively. Expecting that the fitness of an amino acid depends primarily on the local structure surrounding each amino acid residue, Simons et al. (1999a Simons et al. ( , 1999b have categorized local environments and calculated amino acid frequencies in the proteins registered in the PDB for each category. Applying equation (9) to equation (8), the index of binding affinity is expressed by the product of the three parts: The first and third parts on the right-hand side of the above equation can be interpreted as contributions from hydrophobic interactions. The second part corresponds to contributions from the pairwise interactions between residues of the proteins A and B. Its explicit formula is The amino acid preferences and interactions of the amino acid pairs are measured by empirical frequencies observed in the nonredundant set of protein structures in the PDB (Hobohm and Sander 1994 , the latest library is available from ftp://ftp.embl-heidelberg.de/pub/databases/protein_ extras/pdb_select/). The degree of residue burial characterizes the local environment of a residue, which is defined as the number of C a atoms surrounding the C a atom of the corresponding residue within a distance of 1 nm (10 Å ) (Watabe et al. 2007) . The spatial distances between the C a atoms of each amino acid pair are categorized into intervals of 0.1 nm from 0.7 to 1 nm. Pairs with spatial distances of 0.7 nm are compiled into one category because such small distances are rare (Watabe et al. 2007 ). The interaction between a residue and its neighborhood in the primary sequence plays a major role in making local structural conformations, such as an a-helix. Hence, we treat interactions between residues with small site intervals j À i 4 separately from those with larger site intervals (Watabe et al. 2006 ). To achieve a balance between robust estimation of amino acid preferences and pairwise interactions and sensitive detection of changes in microstructure, we adopt a sliding window approach (Watabe et al. 2007 ). We estimate the binding affinities of the S proteinreceptor and S protein-antibody complexes using the index developed above (eq. 8) along with X-ray crystallography data from the PDB. have determined the structure of the complex formed between the RBD of the S protein of SARS-CoV and the peptidase domain of human ACE2 (PDB code: 2AJF). Hwang et al. (2006) have described the structure of the complex formed between the S protein and a neutralizing antibody, 80R (PDB code: 2GHW). The 80R-binding epitope on the RBD overlaps very closely with the ACE2-binding site (Hwang et al. 2006 ). Prabakaran et al. (2006) have reported the crystal structure of the RBD of the S protein in complex with a neutralizing antibody, m396 (PDB code: 2DD8). This antibody recognizes a region that overlaps half of the receptor-binding region. We used these three structural data sets on protein complexes as templates to estimate the binding affinity of mutated viruses in our simulation. Here, we assume that the main chain structure of the S protein is conserved for the mutated viruses. In figure 2A , we show how the viral dynamics depends on the basic reproductive ratio R 0 5 100 and the three values of the parameter q that yield the three minimum values of the variable for the uninfected cell X(Xd/kj min 5 0.99, 0.90, and 0.50). Because the variables were normalized, the value of Xd/k corresponds to the proportion of uninfected cells to its maximum amount when the virus does not invade. During the early stage of infection, viral load increases linearly on a logarithmic scale and does not depend on the ability of the neutralizing antibody (q). In figure 2B , the viral time courses of an ancestor and a mutant are shown. The ancestor is characterized by R 0 5 100 and the minimum value of the variable for the uninfected cell X(Xd/ kj min 5 0.99). The basic reproductive ratio of the mutant is R 0 5 200. It is shown that the mutant drives out the ancestor. We obtained the fitness landscapes by solving the mathematical model ( fig. 3 and supplementary fig. S1 , Supplementary Material online). The vertical axis and horizontal axis represent the model parameters q and b, respectively. The parameter b is proportional to the basic reproductive ratio: b 5 R 0 adu/(kk). We show vertical lines characterized by the basic reproductive ratios of 1, 10, and 100. Moreover, we show curves expressing minimum values of the variable for the uninfected cell X (Xd/kj min 5 0.99, 0.90, and 0.50). The fitness landscapes have common features. Fitness is defined by the integral of V (in eq. 1) over time, which is truncated at time s 5 10 days to reflect the infectious period. Furthermore, as predicted from equation (1), the dynamics of a virus with slow replication (R 0 ; 1) does not depend on the binding affinity of neutralizing antibody during the early stages of infection (see supplementary material, Supplementary Material online). Hence, when R 0 is low, the fitness landscape does not depend much on binding affinity of the neutralizing antibody, whereas when R 0 is high, fitness varies with q and b. Changes in Binding Affinity due to Single-Amino Acid Mutations In the structural data available for the three protein complexes in the PDB (2AJF, 2GHW, and 2DD8), there are 180 common amino acid residues in the RBD. Among these, 78 amino acid residues (45%) are related to receptor binding (supplementary fig. S2A , Supplementary Material online). In the case of the complex with the antibody 80R, 79 amino acid residues (46%) of the RBD are utilized by the mAb to bind, completely covering the amino acid residues for receptor binding (supplementary fig. S2B , Supplementary Material online). In the case of the complex with the antibody m396, 84 amino acid residues (48%) of the RBD contribute to mAb binding. Half of these (46 amino acid residues or 26%) overlap with the region for receptor binding (supplementary fig. S2C , Supplementary Material online). To see the distribution of the selection intensities of mutants, we performed 1,000 single amino acid (SAA) mutations at random at each of amino acid positions related to receptor binding and evaluated the association constants for each mutant. We plot the positions of the mutated sequences on the landscape for the antibodies 80R ( fig. 3) and m396 (supplementary fig. S3 , Supplementary Material online). Here we employed a point, (b*,q*) 5 (1.20Â10 3 ,2.25Â10 7 ), such that the minimum value of the variable for the uninfected cell is Xd/kj min 5 0.99 and the basic reproductive ratio is R 0 5 100, as the point for the original sequence. Within the range of variation of the SAA mutations on the fitness landscape, some mutants gain large increases of fitness that achieve large selective advantages. We simulated SAA mutations at all residues of the RBD and evaluated the fixation probability of each SAA mutation. In figure 4 , we indicate the amino acid residues at which mutations can be fixed with high probabilities within the host populations. In table 1, we list the fixation probabilities of mutations at the amino acid residues indicated in figure 4. At the starting point on the fitness landscape, mutants realizing greater affinity to the receptor and/or lower affinity to the antibody gain a higher fixation probability. In the case of vaccination using the antibody 80R ( fig. 4A ), the average fixation probability at residue 489I of the S protein is 0.71. However, at the residues on either side of 489I, the average fixation probabilities are less than 0.10. On the other hand, in the case of vaccination using the antibody m396 ( fig. 4B ), mutations to charged amino acid residues at residues 488G and 489I have high fixation probabilities, and the average fixation probabilities are greater than 0.40. These differences are caused by structural differences in the S protein-antibody binding. Mutations at residues 488-490 do not substantially change the binding affinity of the complex between the S protein and the antibody 80R. In contrast, mutations at residue 489I increase the binding affinity of the S protein-receptor complex; hence, their fixation probabilities within hosts vaccinated with the antibody 80R are high (supplementary fig. S4A, Supplementary Material online) . Mutations at the residues on either side of 489I decrease the binding affinity of the S protein-receptor complex; hence, their fixation probabilities are very low (supplementary fig. S5A, Supplementary Material online) . On the other hand, the binding affinity of the complex between the S protein and the antibody m396 is affected by mutations at residues 488-490. Mutations that increase (or decrease) the binding affinity of the S protein-receptor complex also FIG. 4 . Amino acid residues at which mutations can be fixed with high probabilities within the host populations. The colored spheres are at the positions of the C a atoms of the corresponding amino acid residues. The red sphere indicates the residue at which the average fixation probability over the 20 possible mutations is within the category of 0:6 Ũ ,0:8, whereas the orange sphere indicates the residue at which the average fixation probability is within 0:4 Ũ ,0:6 (similarly, the yellow sphere corresponds to 0:2 Ũ ,0:4, the green sphere to 0:1 Ũ ,0:2, and the blue sphere toŨ,0:1). We performed simulations of adaptive evolution on the fitness landscape with parameters of 2 h for virion half-life and 2 days for infected cell half-life. We employed a starting point located along the line expressing the minimum value of the variable for the uninfected cell X (Xd/kj min 5 0.99). The point was at (b*,q*) 5 (1.20Â10 3 ,2.25Â10 7 ), where the basic reproductive ratio is R 0 5 100. The receptor ACE2 (light blue) and the antibodies (flesh color) are superimposed. increase (or decrease) the binding affinity of the S proteinantibody complex. Among these mutations, mutations to charged residues decrease the binding affinity of the S protein-antibody complex due to hydrophobic interaction. In the S protein-antibody m396 complex, the degrees of residue burial for residues 488-490 are greater than those in the S protein-receptor complex. This makes the influence of hydrophobic interaction stronger for residues 488-490 in the S protein-antibody m396 complex than in the S protein-receptor complex. Hence, their fixation probabilities within hosts vaccinated with the antibody m396 are high (supplementary figs. S4B and S5B, Supplementary Material online). The mutations at residue 489I that increase the binding affinity of the S protein-receptor complex primarily affect the pairwise interactions with the two glycines whose C a atoms are 0.7-0.8 nm away from the C a atom of residue 489I ((r, aa, m#, Dlog 10 K a ) 5 (0.77, G, 17, 0.25) and (0.73, G, 21, 0.26) in table 2 and (0.77, G, 17, 0.19) and (0.73, G, 21, 0.19) in supplementary table S1, Supplementary Material online). The pairwise interaction of isoleucine at residue 489 (whose degree of residue burial is 20) with the glycine (whose degree of residue burial is about 20) is the weakest interaction among the potential amino acid residues at position 489. Hence, most mutations at residue 489 increase the binding affinity of the S protein-receptor complex. On the other hand, the pairwise interaction of glycine at residue 488 (whose degree of residue burial is 20) with the glycine on the receptor (whose degree of residue burial is about 20) is the strongest interaction among the potential amino acid residues at position 488. Hence, most mutations at residue 488 decrease the binding affinity of the S protein-receptor complex (supplementary table S2, Supplementary Material online). Here, the pairwise interaction is denoted by the ratio of the pairwise frequency to the product of individual frequencies in the same local structural environment (eq. 10). The strength of pairwise interaction is measured by amino acid preferences in the local structural environment, and the amino acid preferences are evaluated by using the nonredundant set of protein structures in the PDB. At residue 480D, the number of interacting residues on the surface of the receptor is three and the distances to those residues are larger than 0.9 nm (supplementary tables S3 and S4, Supplementary Material online). Hence, at residue 480D, the interaction with residues on the surface of the receptor seems to be weak. Except for a mutation to cysteine, no mutation substantially changes the ability to bind to the receptor (supplementary figs. S6A and B, Supplementary Material online). Residue 480D does not seem to contribute substantially to binding to the receptor. However, mutations to hydrophobic or positively charged amino acids at this residue decrease the ability to bind to the antibody 80R (supplementary fig. S6A , Supplementary Material online), so residue 480D may play a key role in NOTE.-n, the degree of residue burial of residue 489 in the individual S protein; m, the degree of residue burial of residue 489 in the S protein within a proteinprotein complex; r, the distance (in nm) between the C a atom of residue 489 and that of the target residue; aa, the target amino acid residue of the pairwise interaction; and m#, the degree of residue burial of the target residue in the target protein within a protein-protein complex. binding to the antibody 80R ( fig. 4A ). In the S proteinantibody m396 complex, there is no residue on the surface of the antibody m396 whose C a atom is within a 1-nm sphere around the C a atom of reside 480 of the S protein (figs. 4B and supplementary fig. S6B , Supplementary Material online). At residue 480D, a mutation to valine, whose residue is hydrophobic, affects the hydrophobic interaction with residues of the antibody 80R, increasing the binding affinity. However, the pairwise interactions with the four residues (R, S, N, D) whose C a atoms are 0.7-0.8 nm away from the C a atom of residue 480 become weaker. As a consequence, the mutation D480V decreases the binding affinity of the complex between the S protein and the antibody 80R (supplementary table S3, Supplementary Material online). At the same time, the mutation D480V does not substantially change the binding affinity of the S protein-receptor complex. Hence, its fixation probability within hosts vaccinated with the antibody 80R is high. The mutation to lysine, whose residue is positively charged, affects the hydrophobic interaction with residues of the antibody 80R, decreasing the binding affinity (supplementary table S4, Supplementary Material online). Therefore, the fixation probability of this mutation within hosts vaccinated with the antibody 80R is also high. As we have seen previously, residue 480D located at the center of the binding region of the S protein does not contribute substantially to binding to the receptor. On the other hand, the residues located at the edge of the binding region (e.g., 489I, 488G, and 473N) contribute to binding to the receptor. Residue 473N has five contacting sites on the surface of the receptor and also on the surface of the antibody 80R. Most mutations at residue 473N increase the ability to bind to the receptor and decrease the ability to bind to the antibody 80R (supplementary fig. S7 , Supplementary Material online). The mutation N473L increases the strength of the hydrophobic interaction both with the receptor and with the antibody 80R. However, the binding affinity of the S protein-antibody 80R complex is decreased by weakened interactions with the two serines whose C a atoms are 0.70-0.75 nm away from the C a atom of residue 473 (supplementary table S5, Supplementary Material online). The fixation probability of this mutation within hosts vaccinated with the antibody 80R is high. Within a host population vaccinated with the antibody 80R, the mutation D480V shows a high fixation probability. This is a result of decreased binding affinity of the antibody 80R to the S protein. Residue 480D has been previously suggested as a key residue determining binding to the antibody 80R (Sui et al. 2008 ). It has also been suggested that mutations at this residue do not affect binding to the receptor (Wong et al. 2004) . By a mutation to a hydrophobic amino acid from another type of amino acid, the hydrophobic interaction becomes stronger, increasing the binding affinity. However, the binding affinity of the antibody 80R to the S protein is dominated by pairwise interactions. The mutation D480V decreases the binding affinity. In proteinprotein associations, hydrophobic interactions seem to be less dominant (Gohlke et al. 2003) . The mutation I489K reduces the intensity of hydrophobic interactions. However, the binding affinity of the S protein-receptor complex increases, whereas that of the S protein-antibody 80R complex decreases. This difference is due to the different effects of the mutation on the pairwise interactions. We used a mathematical model that describes the dynamics of interacting populations: host cells, virus particles of an ancestor and a mutant, and mAbs. We examined how immune escape occurs when mutations have antagonistic effects on binding antibody and receptors. We simplified this complex, multiscale problem by modeling response to mAbs in order to gain a fundamental understanding of viral evolutionary response to adaptive immunity. Our simplified model takes advantage of the rich supply of protein structural data and sequence data that are readily available for several viruses, providing a framework that can be adapted for other viruses. In future studies, we plan to extend our model to deal with polyclonal antibodies, which is a more realistic immune response. We found that some amino acid mutations in the RBD may have a high fixation probability (greater than 0.9). Within the range of variation of the SAA mutations on the fitness landscape, some mutants produce sufficient changes in binding strengths to achieve high fixation probabilities within the host population. The variation in binding strengths corresponds to the variation in association constants calculated by using the structural preference of the amino acid sequence, which was estimated using structural data stored in the PDB. Hence, the appearance of the mutants that may be fixed with high probability within the host population is not a grandiose figment but in reality it can happen. The time span of the occurrence of the mutants fixed within the host population can be measured by setting an explicit value of the rate a i at which mutations occur at the corresponding residue. By measuring the time span, we can determine the effective period of the vaccine. In development of vaccine, it may be crucial to address the emergencies of such mutations in the RBD. Supplementary figures S1-S7, tables S1-S5, and other materials are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/). Diversity and complexity of HIV-1 drug resistance: a bioinformatics approach to predicting phenotype from genotype Antibodies, viruses and vaccines Molecular dynamics-solvated interaction energy studies of protein-protein interactions: the MP1-p14 scaffolding complex Computer simulation of protein-protein interactions Insights into protein-protein binding by binding free energy calculation and free energy decomposition for the Ras-Raf and Ras-RalGDS complexes Enlarged representative set of protein structures SARS coronavirus: a new challenge for prevention and therapy Structural basis of neutralization by a human anti-severe acute respiratory syndrome spike protein antibody, 80R The B-lymphocyte system: fundamental immunology On the probability of fixation of mutant genes in a population A novel coronavirus associated with severe acute respiratory syndrome Structure of SARS coronavirus spike receptor-binding domain complexed with receptor The yeast cell-cycle network is robustly designed Angiotensin-converting enzyme 2 is a functional receptor for the SARS coronavirus Receptor and viral determinants of SARS-coronavirus adaptation to human ACE2 Hepatitis C viral dynamics in vivo and the antiviral efficacy of interferon-a therapy Population dynamics of immune responses to persistent viruses Viral dynamics in hepatitis B virus infection HIV-1 dynamics in vivo: virion clearance rate, infected cell lifespan, and viral generation time Structure of severe acute respiratory syndrome coronavirus receptor-binding domain complexed with neutralizing antibody Characterization of a novel coronavirus associated with severe acute respiratory syndrome From genotype to phenotype: buffering mechanisms and the storage of genetic information Ab initio protein structure prediction of CASP III targets using ROSETTA Improved recognition of native-like protein structures using a combination of sequence-dependent and sequenceindependent features of proteins Crosshost evolution of severe acute respiratory syndrome coronavirus in palm civet and human Broadening of neutralization activity to directly block a dominant antibodydriven SARS-coronavirus evolution pathway Potent neutralization of severe acute respiratory syndrome (SARS) coronavirus by a human mAb to S1 protein that blocks receptor association Robustness against mutations in genetic networks of yeast A likelihood-based index of protein-protein binding affinities with application to influenza HA escape from antibodies Fold recognition of the human immunodeficiency virus type 1 V3 loop and flexibility of its crown structure during the course of adaptation to a host A 193-amino acid fragment of the SARS coronavirus S protein efficiently binds angiotensin-converting enzyme 2 Outbreak news-severe acute respiratory syndrome (SARS) Evolution in Mendelian populations We thank the three anonymous reviewers for their constructive comments. We thank Keiko Udaka for useful comments on the half-life periods of viruses and antibodies. This study was supported by JSPS KAKENHI Grant-in-Aid for Scientific Research (B) 19300094 and Scientific Research (C) 20570223. Watabe and Kishino · doi:10.1093/molbev/msq056 MBE