key: cord-0890702-ege6sfcx authors: Shaw, Leah B.; Billings, Lora; Schwartz, Ira B. title: Using dimension reduction to improve outbreak predictability of multistrain diseases date: 2007-02-22 journal: J Math Biol DOI: 10.1007/s00285-007-0074-x sha: 1729909e30f65c7371cc9c3fed9e91ca51f68ecd doc_id: 890702 cord_uid: ege6sfcx Multistrain diseases have multiple distinct coexisting serotypes (strains). For some diseases, such as dengue fever, the serotypes interact by antibody-dependent enhancement (ADE), in which infection with a single serotype is asymptomatic, but contact with a second serotype leads to higher viral load and greater infectivity. We present and analyze a dynamic compartmental model for multiple serotypes exhibiting ADE. Using center manifold techniques, we show how the dynamics rapidly collapses to a lower dimensional system. Using the constructed reduced model, we can explain previously observed synchrony between certain classes of primary and secondary infectives (Schwartz et al. in Phys Rev E 72:066201, 2005). Additionally, we show numerically that the center manifold equations apply even to noisy systems. Both deterministic and stochastic versions of the model enable prediction of asymptomatic individuals that are difficult to track during an epidemic. We also show how this technique may be applicable to other multistrain disease models, such as those with cross-immunity. In the study of infectious disease, a problem of interest is the coexistence of multiple strains. Examples of persistent co-circulating multistrain diseases include influenza [1] , malaria [12] , and dengue fever [11] . More recently, the avian flu viruses, including H5N1, were reported to coexist in several genotypes until 2004 [17] . Other examples of viruses possessing multiple strains may be seen in corona viruses, such as severe acute respiratory syndrome (SARS) [18] . The presence of multiple strains adds greater complication to models of disease dynamics due to an increasing number of stages through different infectionrecovery combinatorics. Reducing the dimensionality of the model is desirable, both to obtain a simpler model and to understand how the strains interact. In this paper, we focus on dengue fever, which has four co-circulating serotypes. It is believed that following infection with and recovery from one serotype, cross-reactive antibodies act to enhance the infectiousness of a subsequent infection by another serotype [16] . This phenomenon is termed antibody-dependent enhancement (ADE). Primary infections are less severe, often asymptomatic, but secondary infections are associated with more serious illness and greater risk for dengue hemorrhagic fever (DHF). Multistrain diseases with ADE-induced dynamics have been modeled previously [5, 10, 15] . Models of multistrain steady state endemic behavior and its stability were considered for two strains in [9] , where both mosquito population and humans were included. Although ADE was not explicitly modeled, the authors did consider different parameter values of contact susceptibility. Values of contact susceptibility less than unity model cross-immunity, whereas values greater than unity model increased susceptibility due to secondary infection. Ferguson et al. [10] modeled two serotypes of dengue and showed that ADE can lead to oscillations and chaotic behavior. Schwartz et al. [15] developed a similar model with non-overlapping compartments for greater ease of interpretation. The bifurcation structure was obtained for both the autonomous model and a model with seasonal forcing. Chaotic solutions were found for a wide range of parameter values. It was observed from numerical simulations that outbreaks of the four serotypes could occur asynchronously but that certain primary and secondary infective compartments remained synchronized. In particular, all compartments currently infected with a particular strain exhibited in-phase outbreaks. Cummings et al., [5] using the same model, explored the competitive advantage that ADE confers to viruses. Other models for dengue fever (e.g., [8] ) have included the mosquito vector as in [9] but have not included the ADE effect. Several models of multistrain diseases with cross-immunity rather than ADE have appeared in the literature [1, 6, 8] . In these models, individuals who have recovered from infection with one serotype have reduced susceptibility to infection with other serotypes. Oscillatory dynamics have been observed. The analysis of Dawes and Gog [6] employed the center manifold theorem that we will use in this paper but for a different purpose. Dawes and Gog showed a relationship between different serotypes, while we will focus on the relationship between primary and secondary infections with the same serotype, which are quantities that were not independently tracked in the Dawes and Gog model. Summarizing the results to date, in models of secondary infections where either ADE or cross-immunity is considered, the steady state equilibrium is not always stable, and persistent oscillations may occur. For models with ADE, although ADE is the cause of oscillation onset, more complicated chaotic or stochastic behavior is observed over a much wider range of parameter values than those of periodic oscillations [15] . Currently, there is no theory for predicting the interrelationship between primary and secondary infective compartments. In this paper, we consider the problem of revealing the dynamical structure between the secondary and primary infectives. Specifically, we analyze the model presented in [15] and demonstrate that the dynamics can be reduced to a lower dimensional model. Our analysis motivates the in-phase behavior previously reported for primary and secondary infectives. The lower dimensional system allows the prediction of primary infective levels from knowledge of other compartment values. This result may be useful in disease monitoring because little epidemiological data exists for the asymptomatic primary infectives. We show numerically that our predictions hold even in noisy systems. We also discuss the relevance of this technique to other multistrain diseases that do not display ADE, such as influenza, in which infection with one strain yields partial immunity to other strains. We study the model of [15] for n co-circulating serotypes. An individual can contract each serotype only once and is assumed to gain immunity to all serotypes after infection from two distinct serotypes. This compartmental model divides the population into disjoint sets and follows the size of these compartments as a percentage of the whole population over time. The variable definitions are as follows, where each variable is a dimensionless quantity representing a fraction of the total population: s Susceptible to all serotypes; x i Primary infectious with serotype i r i Primary recovered from serotype i x ij Secondary infectious, currently infected with serotype j, but previously had The model is a system of n 2 + n + 1 ordinary differential equations describing the rates of change of the population within each compartment: The parameters µ, µ d , β, and σ denote birth, death, contact, and recovery rates, respectively. Antibody-dependent enhancement is governed by the parameters φ i . Rates of infection due to primary infectives are of the form βsx i , as in a standard SIR model, while infection rates due to secondary infectives are weighted by the appropriate φ, taking the form βφ j sx ij . When φ = 1, there is no ADE, and both primary and secondary infectives are equally infectious. When φ > 1, the ADE appears as an enhancement factor in the nonlinear terms involving secondary infectives. Although the contact rate β could be time dependent (e.g., due to seasonal fluctuations in the mosquito vector population), we assume it to be constant here for simplicity. The fixed parameters throughout the paper are given by: µ = 0.02, µ d = 0 or µ d = 0.02, β = 400, and σ = 100, all with units of years −1 . These values are consistent with estimates used previously in modeling dengue [2, 5] . Time t is measured in years. We assume all serotypes have equal ADE factors: φ i = φ for all i. The ADE factor has not been measured from epidemiological data, so we test our results for various ADE values. The dynamics of Eq. 1-4 have been studied previously [2, 15] . A bifurcation diagram in the ADE factor φ is given in Fig. 1 for the four serotype model. The dynamics are qualitatively similar for other values of n, the number of serotypes. For φ ∈ [1, φ c ), the system has a stable endemic steady state. At a critical φ value, φ c , the system undergoes a Hopf bifurcation and begins to oscillate periodically. For slightly larger φ values, the periodic solutions become unstable and the system oscillates chaotically. Chaotic oscillations persist for most larger values for φ, with the exception of narrow windows of stable periodic solutions. It should be noted that previous studies of this model [2, 15] used a death rate equal to the birth rate (µ d = µ = 0.02) so that the population remained constant in time. However, the endemic steady state cannot easily be obtained analytically when the mortality terms of Eqs. 1-4 are included. In our analysis, we omit the small mortality terms by setting µ d = 0. The dynamics with µ d = 0 are qualitatively similar to that with µ d = 0.02 and have the same bifurcation structure. We demonstrate numerically that our results hold well even for the system with mortality. Setting µ d = 0 in Eqs. 1-4 still permits death to occur in the classes that have recovered from a secondary infection. For a disease such as dengue, for which the mortality rate is low and the average age at infection is believed to be fairly young [13] , this assumption is physically reasonable. The larger parameters β, σ may be scaled by the small parameter µ, defining where β 0 , σ 0 are O(1). Eqs. 1-4 with µ d = 0 have an endemic steady state of for all i, j. Although the full n-serotype model has n 2 + n + 1 dimensions, the attractor lies in a much lower dimensional space. We claim that the attractor actually lies in 2n + 1 dimensions to a good approximation, due to a relationship between primary and secondary infectives. This result can be shown using center manifold analysis. We now outline a derivation for the case of two serotypes and state the center manifold equations for several other general cases of interest. We begin with Eqs. 1-4 for two serotypes with no mortality (µ d = 0). We shift the variables so that the endemic fixed point (Eq. 6) is at the origin: The Jacobian in the shifted variables, evaluated at the origin, is given by to zeroth order in µ. The Jacobian is not diagonalizable, as it has only five linearly independent eigenvectors. A new set of variables w is defined as follows: The transformation matrix for this change of variables contains the five linearly independent eigenvectors of the Jacobian and two additional vectors selected to be linearly independent and put the system in the desired form. In the new variables, rescaling time to τ = t/µ, the dynamics are described by The Jacobian of Eqs. 9-15, evaluated at the origin and to zeroth order in µ, is Equations 9-16 may be written in the compact forṁ , where A and B are constant matrices such that the eigenvalues of A have negative real parts and the eigenvalues of B have zero real parts, and f and g are second order in u, v, µ. Therefore, center manifold theory [3] can be applied. The system will rapidly collapse onto a lower dimensional manifold defined by We expand h and l to second order To solve for the coefficients of the expansion, we write equalities for dw 1 dτ and dτ . This is done by equating the right hand sides of Eqs. 9-10 with the time derivatives of Eqs. 20-21, using Eqs. 11-15 to substitute for dw 3 dτ , . . . , dw 7 dτ and Eqs. 20-21 to substitute for w 1 , w 2 as needed. The coefficients of the expansion can be obtained by equating coefficients of like powers. The coefficients h 0 , h i , h µ , h µ,i and l 0 , l i , l µ , l µ,i are all 0, while some of the h i,j and l i,j are nonzero. After carrying out the above program, Eqs. 20,21 simplify to when the coefficients are substituted. Finally, we rewrite Eqs. 22-23 in the original variables. Thus, we obtain the following approximation for the invariant manifold onto which the system collapses: The above equations for the center manifold are our main result of the paper. To bring out more of the structure, we make the following observations. We generally observe in numerical simulations that the infective compartments (and their deviations from the fixed point) are small compared to the susceptibles and recovereds. Thus the infective correction terms tos −r i may be dropped to obtain a simpler expression for the center manifold: Given values for the susceptibles, primary recovereds, and secondary infectives, the primary infective compartments may be computed from the approximate center manifold equations (either Eqs. 24 or 25). Figure 2 compares the center manifold prediction for primary infectives with that from simulations. (Results from Eqs. 24 and 25 are nearly indistinguishable.) The second order approximation to the center manifold leads to reasonable predictions for the primary infectives. In particular, we accurately predict the time and approximate magnitude of the bursts, which correspond to epidemic outbreaks. The prediction for the infectives is occasionally negative because the center manifold equations are for thex i , the deviations of infectives from their fixed point, and sometimes the predicted deviations are large enough that adding the fixed point Antibody-dependent enhancement factors have not been measured experimentally, and it is possible that different serotypes could have different levels of enhancement. We address the possibility of unequal ADE factors for the two serotype case. A procedure similar to that used with symmetric ADE factors may be followed. In this case, an expansion for the location of the fixed point must also be used. For two strains with φ 1 = φ, φ 2 = φ(1 + ) and | | 1, the center manifold is approximated (to second order in the variables and µ, ) by For ease of analysis, we have assumed here that the small parameters µ, are of the same order of magnitude. The center manifold equations again allow fairly accurate predictions of the primary infective compartments for small (data not shown). Even when the assumption that is the same order as µ does not hold, the timing and amplitude of bursts in the primary infectives are generally predicted accurately (Fig. 3 ). It is important to extend the analysis to a larger number of serotypes. In particular, there are four observed serotypes for dengue fever. The center manifold technique may again be applied for n > 2 strains. In the expressions for the center manifold, it is convenient to define the sum of secondary infectives currently infected with strain k:z For dengue fever, primary infectives are generally asymptomatic, and most hospital cases are secondary infectives. The current infecting serotype can be determined from serology measurements [13] .z k , the sum of secondary infectives with serotype k, may be proportional to the number of hospital cases with serotype k. This quantity may be relevant to disease monitoring in addition to providing a shorthand notation for expressing the center manifold equations. We have completed center manifold analysis for n = 2, 3, 4 serotypes. The following equations for the center manifold summarize our results for n = 2, 3, 4 and extrapolate them for general n: where We observe that the terms f k and g jk may be neglected. They are linear functions of primary and secondary infective compartments. At the fixed point, the infective compartments are order µ, while the susceptibles and recovereds are order 1 (Eq. 6). It therefore is reasonable that deviations of the infectives from the fixed point, represented by the variablesx p ,x pq , would generally be small compared tos,r i . This assumption generally holds true in our simulations. For each of the n strains, Eqs. 29 and 30 provide n − 1 linearly independent equations, since some of the Eqs. 30 are linearly dependent, allowing n(n − 1) dimensions to be eliminated. Thus the dynamics of the system have dimension 2n + 1. If the susceptible and primary recovered compartments are known, a single quantityz k (Eq. 28) for each strain is sufficient to describe the infective dynamics for that strain. Primary and secondary infectives can then easily be computed using Eqs. 29 and 30. We find that similar accuracy is attained when the f k , g jk are set to 0 for convenience as with the f k , g jk terms retained. Figure 4 compares the predictions from the center manifold equations (Eqs. 29 and 30) with actual values for sample primary and secondary infectives. The full four-strain model with mortality (µ d = 0.02) was used in the simulations. Predictions were made by assuming that susceptibles, primary recovereds, and sums of secondary infectives (z k ) were known. Although the derivation of the center manifold equations omitted mortality terms, they generate accurate predictions for the full model. It has been observed in numerical simulations that primary and secondary infectives currently infected with the same serotype oscillate in phase synchrony [15] . Based on the center manifold reduction in dimension, we provide some analysis to motivate this effect. We convert the shifted variables (s,x i ,r i ,x ij ) to the original variables (s, x i , r i , x ij ) and use the center manifold equations 29, 30 (with f , g terms omitted as explained above) for substitution in the model ordinary differential equations, Eqs. 2 and 4. We obtain where we have defined for convenience. Taking the difference between Eqs. 33 and 34 and converting back to rescaled variables yields The difference between Eqs. 29 and 30 provides a useful substitution relation: Substitution of Eq. 37 into 36 gives an equation for how the difference between primary and secondary infectives evolves in time: where K = σ (x k,0 + φz k,0 ) = µ(1 + φ)/n is a positive constant of order µ. We restrict our consideration of Eq. 38 to those intervals on whichx That is, we consider intervals in which the sum of primary and secondary infectives with serotype k (weighted by the ADE factor) is below the steady state value. In the chaotic regime, these intervals occur during dropouts when serotype k is present in low levels. Such intervals cover the majority of the time domain and are interrupted by outbreaks of serotype k. (cf. Fig. 4 .) Integrating Eq. 38, we find that From our previous assumption, the argument of the integral is negative and is of order 1/µ 2 , since the infectives range between 0 and the negative of the fixed point. Therefore, x k (t) − (n − 1)x jk (t) 1, andx k (t) and (n−1)x jk (t) approach each other rapidly, with their difference decreasing with exp −O(1/µ) , where |µ| 1. We have demonstrated that primary and secondary infectives with serotype k are phase synchronized, differing only by an exponentially small term, during the intervals in which levels of serotype k are low. Outbreaks in the infectives occur in bursts, so although the relation x k (t) − (n − 1)x jk (t) 1 is not expected to hold during outbreaks, the outbreaks have fast time scale dynamics. Therefore the duration of the outbreak is short enough that phase synchrony is not lost. It may be possible to examine the phase locking during the outbreaks, but that would require a detailed multiscale analysis, which is beyond the scope of the present paper. Even so, our results help to explain the phase synchrony previously observed between primary and secondary infectives currently infected with the same strain [15] . We next consider Eqs. 1-4 with multiplicative noise and discuss whether Eqs. 29 and 30 can be applied to a noisy system. We include the noise in our numerical integration as follows. Let We add noise to the natural log coordinates: Here, η is a vector of random variables with mean zero and standard deviation σ n . Transforming back to the original phase space, we obtaiṅ Therefore, the noise is multiplicative and scaled by each component. Equation 41 was integrated numerically using a stochastic Euler method. In this section, we demonstrate the robustness of our predictions in the presence of a type of noise. Although there are other ways in which stochastic perturbations could be included in the model, multiplicative noise is the simplest choice because we already use natural log coordinates when numerically integrating Eqs. 1-4. In the original coordinates, multiplicative noise scales the fractions so that when they approach zero, the stochastic perturbation remains small, and the resulting vector field does not force the solution to leave the positive octant. The result is that the trajectories remain asymptotically meaningful fractions so that sufficient statistics may be generated. The predictions from Eqs. 29 and 30 hold approximately true for a wide range of noise standard deviations σ n . Sample results are shown in Fig. 5 . For the parameter values in Fig. 5 , the endemic steady state is stable for σ n = 0. Adding noise perturbs the system away from the steady state, and as the noise standard deviation σ n increases, the amplitude of the disease outbreaks increases. To quantify the accuracy of the center manifold prediction for various σ n , we define a metric for the "distance" of a trajectory from the center manifold: where x i,pred is the predicted primary infective value using Eqs. 29 and 30 (neglecting f k , g jk terms) and · t denotes a time average. We compute d by sampling every 0.01 year for 10 4 years (after first running for 10 4 years to remove transients) in order to obtain good statistics. Sample results are given in Fig. 6a . The "distance" d from the center manifold, i.e., the error in the center manifold prediction, increases as the noise increases. However, the amplitude of the outbreaks also increases with larger noise, so it is not unexpected that d would increase. We additionally compute a fractional error (percent error) in the primary infective compartment x 1 so that the effect of noise may be more carefully assessed. We exclude from the calculation the time intervals in which x 1 is less than its steady state value of µ/(nσ ), an arbitrarily chosen cutoff. In this way, we exclude the times when x 1 is small and the absolute error is also small, as those intervals would artificially inflate the fractional error measurement. The times of interest, which are the epidemic outbreaks, are included in the calculation. Results are given in Fig. 6b . We have analyzed a model for multistrain diseases with antibody-dependent enhancement. For a disease with n strains and two sequential infections, the model has n 2 +n+1 equations. Using center manifold analysis, we have reduced the model to 2n + 1 dimensions. Although we have derived an approximation to the manifold on which the solutions for both deterministic and stochastic systems lie, if we use this approximation and evolve the dynamics of a model constrained to the manifold, we do not necessarily see the same bifurcation structure. This is due to the fact that the analysis itself is local and done near the steady state endemic point. Despite that limitation, the lower dimensional system has other uses which we have described here. We have partially explained the synchrony between primary and secondary infective compartments currently infected with the same serotype. During the intervals when a serotype is present in low levels, primary and secondary infectives of that serotype are approximately related by a constant of proportionality. Although our derivation does not hold during outbreaks of that serotype, the outbreaks occur in short bursts during which phase synchrony is not lost. As mentioned above, a full multiscale analytic treatment may lend more insight into the phase relationship between primary and secondary infections of a given serotype during the outbreak periods. Our center manifold approximation enables prediction of each primary and secondary infective value if the disease parameters, susceptible and recovered compartments, and sums of secondary infectives (Eq. 28) are known. The prediction matches well with numerical simulations even when significant amounts of noise are added to the system. It has been observed [13] that most hospital cases of dengue fever are secondary infections. From serotype measurements of these hospital cases, estimation of the number of secondary infectives that have a particular serotype i should be possible. On the other hand, there is little epidemiological data on the frequency of primary infections because patients with primary infections tend not to be as seriously ill and are not admitted to hospitals. They are typically considered asymptomatic. Obtaining primary infective data for dengue monitoring purposes might require random sampling of large numbers of apparently healthy individuals, which would not be feasible economically nor practically. Our center manifold equations might be used to improve monitoring of dengue outbreaks by predicting compartments that cannot readily be measured, such as the primary infectives. The fact that susceptible and recovered compartment data is necessary to make predictions for primary infectives is a potential drawback. However, we observe in numerical simulations that the susceptibles and recovereds vary more slowly than the infective compartments, which quickly spike during an outbreak. Therefore, it is possible that susceptible and recovered data could be obtained by sampling a population less frequently than would be required for infective data. From blood samples, a person's immunologic background can be obtained, and it can be determined whether he has been previously infected with a particular serotype (see e.g., [14] ). If such sampling is feasible, the center manifold equations will yield information about primary infectives from indirect measurements from population data. The applicability of our method to other models is also of interest. Multistrain models for diseases with cross-immunity, in which recovery from one strain leads to reduced susceptibility to further infection with other strains, have been developed [1, 4, 6, 8] . We have analyzed a two-strain model with cross-immunity (LB Shaw, unpublished): where ψ < 1 expresses the degree of cross-immunity and n = 2. This two-strain cross-immunity model is introduced in [4] . This model differs from the ADE model in that rather than giving the secondary infectives a different infectivity (weighted by φ), we give the primary recovereds a different susceptibility to further infection (weighted by ψ) than the susceptible compartment. Using the coordinate transformation given in Eq. 8 (setting φ = 1), we can reduce Eqs. 45-48 from 7 to 5 equations and show a relationship between primary and secondary infectives. The eigenvalues of the Jacobian corresponding to rapidly collapsing directions are n(n − 1) eigenvalues equal to −σ , the recovery rate. The center manifold analysis holds because the recovery rate is rapid compared to the birth rate. It seems reasonable to expect that other models with crossimmunity might also be reduced and a relationship between primary, secondary, and perhaps higher order infectives might be derived. Because cross-immunity models are often written in a different form with overlapping compartments (e.g., [1, 6, 8] ), some reformulation of the equations may be necessary before a center manifold approach can be used. The results presented here were primarily for a symmetric multistrain model with equal disease parameters for each strain. However, the case of unequal ADE factors was briefly considered, and a center manifold was found when the difference between ADE factors was a small parameter (Eqs. 26-27). To model a multistrain disease with realistic parameters, it may be necessary to relax the symmetry constraint and derive results for an asymmetric system. Such a derivation is unwieldy because the endemic steady state can no longer be determined analytically (except as an approximate expansion). New techniques may be necessary to extend our current approach to more asymmetric models. Another potentially interesting extension is to include spatial spread of the disease. It has been observed from epidemiological data in Thailand that while dengue outbreaks within a single town may be of primarily one serotype, concurrent outbreaks in another town may have a different serotype predominate [7] . A similar center manifold analysis might be attempted on a patch model, in which the population is divided into several spatially distinct patches that are coupled. Research was supported by the Office of Naval Research and the Center for Army Analysis. L.B. was supported by the National Science Foundation under Grants Nos. DMS-0414087 and CTS-0319555. L.B.S. is currently a National Research Council postdoctoral fellow. Applications of centre manifold theory Proc. Natl. Acad. Sci. USA Proc. Natl. Acad. Sci. USA 96