key: cord-0278702-s601i9h3
authors: Peterson, Joseph D.; Adhikari, Ronojoy
title: Efficient and flexible methods for time since infection models
date: 2020-10-21
journal: nan
DOI: 10.1103/physreve.104.024410
sha: a93ed3ab9cf34bfd59d5c61f16a744c905a3e96b
doc_id: 278702
cord_uid: s601i9h3

Epidemic models are useful tools in the fight against infectious diseases, as they allow policy makers to test and compare various strategies to limit disease transmission while mitigating collateral damage on the economy. Epidemic models that are more faithful to the microscopic details of disease transmission can offer more reliable projections, which in turn can lead to more reliable control strategies. For example, many epidemic models describe disease progression via a series of artificial 'stages' or 'compartments' (e.g. exposed, activated, infectious, etc.) but an epidemic model that explicitly tracks time since infection (TSI) can provide a more precise description. At present, epidemic models with 'compartments' are more common than TSI models , largely due to higher computational cost and complexity typically associated with TSI models. Here, however, we show that with the right discretization scheme a TSI model is not much more difficult to solve than a comparment model with three or four 'stages' for the infected class. We also provide a new perspective for adding 'stages' to a TSI model in a way that decouples the disease transmission dynamics from the residence time distributions at each stage. These results are also generalized for age-structured TSI models in an appendix. Finally, as proof-of-principle for the efficiency of the proposed numerical methods, we provide calculations for optimal epidemic control by non-pharmaceutical intervention. Many of the tools described in this report are available through the software package 'pyross'

Epidemic modelling is a valuable tool for both (1) predicting how a disease will spread through a population and (2) testing and optimizing control strategies to mitigate the damage. Better control strategies are possible with better epidemic modelling, and ultimately an epidemic model is only as reliable as the assumptions it makes. Good epidemic models must be able to provide fine-grained descriptions for how a disease is spreading. It is not enough to simply predict a number of infected persons -one must also be able to predict the spread of disease from city-to-city and age group to age group, for example [1, 2, 3, 4] . Given the full burden of complexity an epidemic model is expected to bear, one cannot usually afford much more than an elementary model of disease transmission at each level.

To that end, some of the most popular classes of epidemic models are generalizations of the standard SIR framework, in which a population is partitioned into discrete compartments of susceptibles (S), infecteds (I), and recovered (R) [5] . In the simplest instance of this framework, a susceptible individual becomes infected by encountering an infected person in some way or another. An infected person is immediately infectious, and the subsequent transition to recovered has no memory of the time spent being infected (i.e. it is a random process with an exponential distribution of residence times). In the limit of a large population, the mean result over a large number of random transitions is predictable, and so deterministic versions of the model (like those considered in this report) are often preferred.

Compartment-based models like SIR model do capture important features of disease transmission, but the reliance on an artificial construction of discrete stages with Markov transitions is at odds with the complex medical realities of disease progression. This can cause problems, for example, when predicting the reproduction number of a disease based on its initial growth dynamics [7] . A more realistic model of disease transmission describes the infectiousness of an infected individual as a function of the infection age, or the 'time since infection' (TSI) [5, 2, 6, 11, 12] Does a person become infectious right away, or is there a delay of several days before they can transmit? When is a person most infectious? What is the probability distribution for recovery times? These details (which define the actual transmission dynamics) are naturally incorporated into a TSI model, but they can be difficult to match precisely in a compartment model [7, 8] . On the surface, it might seem that any epidemic model based on time since infection will be substantially more complex than the basic SIR model and therefore of limited value to fine-grained epidemic modelling applications. In this report, however, we will show that the SIR model can be extended to consider time-since-infection without making the model much more difficult to solve.

The advantages of a TSI model for disease dynamics do not eliminate the need to resolve discrete compartments. For example, in modeling the transmission of SARS-CoV-2, in addition to knowing the total number of infected at any point in time, one must also make predictions for the total number of hospitalizations if there is a risk that the healthcare system might become overwhelmed [13, 15] . In this report, we present a very simple and flexible strategy for decoupling the residence time distributions at each stage from the overall dynamics for transmission. Instead of explicitly modelling the occupancy and infectiousness in each compartment, we simply keep track of the total infected population at every TSI and then apply a filter to determine the total occupancy of each compartment. This makes the whole modelling framework more flexible and it should also be more straightforward to parameterize from case data.

By way of overview, this report is arranged as follows: First, in section 2, we present the governing equations for a TSI model, including the use of a filter over the infected population. We nondimensionalize the governing equations (section 2.4) and then offer three different numerical methods by which the TSI model can be integrated in time. In particular, section 2.7 introduces a novel Galerkin approximation with a spectral discretization of the infected population. The generalization of section 2 to age-structured epidemic models is provided in the appendix. In section 3, we provide sample results for epidemic curves obtained from the three discretization strategies, and in section 4, we discuss potential limitations of the Galerkin discretization. Finally, in section 5, we go beyond simple epidemic curves to consider optimal control by non-pharmaceutical interventions. To our knowledge, this is the first time that such calculations have been performed for a TSI modelling framework.

The methods described in the present report are available for use in the pyRoss epidemic modelling package (https://github.com/rajeshrinet/pyross).

For pedagogical purposes, we first review the deterministic SIR model. The SIR model is simpler than the basic TSI model, but it shares a similar underlying structure. The number of susceptibles, S, infecteds, I, and recovereds, R evolve in time by:

where β and γ are rate constants for infection and recovery, respectively (units of 1/time) and N = S + I + R is the total population.

Models that include time since infection are not new, and in fact the equations presented here (apart from superficial differences pertaining to the filtering convention, c.f. Figure 1 ) are covered by existing models for epidemics, many of which are more general [2, 6, 9] . Historically, however, models with TSI have been under-utilized due to the higher computational cost and complexity associated therewith.

It should also be noted that compartment models like SIR can also be represented as TSI models [14] and TSI models can also be discretized into compartment models (c.f. section 2.5). The distinction between the two is one of frameworks, not physics. When it is important to accurately describe disease dynamics in terms of time since infection, it is our view that the TSI modelling framework provides a more efficient and flexible set of tools to work with.

In a model with time since infection, we will speak of the infected class in terms of the number density of persons (per unit time) whose infections began at a time s prior to the current time t, I(t, s).

Thus, the total number of persons infected in the narrow interval between s and s + δs prior to time t is given by I(t, s)δs. As time passes, the time since infection for all infected persons also continues to pass in the same way. Given I(t, s), one can calculate the rate at which susceptible persons become infected: 

As with the standard SIR model, the equations for susceptibles and infecteds are closed. However, for practical applications of an epidemic model one may also need to describe the various sub-classes of the infected population (e.g. infectious, recovered, hospitalized, deceased, quarantined, asymptomatic, etc), with each sub-class α representing a fraction Φ α (s) of the total with number density I α (t, s) = Φ α (s)I(t, s) . The total population in each sub-class at any time is then given by:

This partitioning, or 'filtering', is also represented graphically in Figure 1 . The partitioning strategy is somewhat different than what one often sees in a TSI model containing sub-classes of the infected population, and it is only viable because we are interested in the deterministic (as opposed to stochastic) version of the model. To our understanding, the more common treatment (for both deterministic and stochastic models) is to explicitly move the infected population between different compartments while keeping track of the 'time since arrival' in each compartment [13, 14] . We prefer the partitioning scheme presented here for two reasons: First, it eliminates the need for and the influence of artificially constructed stages (e.g. 'exposed', 'asymptomatic', etc.). Second, it decouples assignments to compartments from the disease dynamics defined by β(s). Third, and perhaps most importantly, it is a severe approximation to say that the distribution of exit times from a present stage (e.g. hospitalization) depends only on the time since arrival and not on the time since infection.

In our approach, we note that since time since arrival depends on time since infection, the overall entry/exit rates for hospitaization can be exactly described as a function of time since infection with no approximations necessary.

For many compartment models (and indeed many TSI models as well) it is not unusual to suppose that infectiousness of an individual depends on the subclass or subclasses to which the individual has been assigned [15] . In that case, if each sub-class has a rate constant for infection β α (s) equation 4 would look like:

In our construction of a TSI model, however, explicitly evaluating the summation is unnecessary -we have already stated that β(s) is the mean infectiousness of the infected class at time since infection s.

In other words, equation 8 is equivalent to equation 4 when the mean value β(s) is correctly defined as:

Thus, we can independently control the overall disease dynamics given by β(s) and the overall partitioning given by Φ α without having to settle the details for each β α (s).

For the analysis in this report, we will only present calculations for two subclasses of infectedsthose who are recovered, with population R, and those that are not. If all infected persons are delcared recovered only once their infection has aged beyond the longest possible transmission time, then Φ R (s)

is just a step function at s = T . Taking a time derivative of equation 7, this can also be written as:

Or, more generally, given φ α (s) = dΦ α /ds the population in any subclass of infecteds evolves by:

In general, one can choose any number of subclasses with arbitrarily complex shapes for each Φ α (s)

with minimal added cost for obtaining solutions numerically (provided one chooses an appropriate numerical method). We refer to equations 4 -7 as the TSI model. Equations for an age-structured generalization of this TSI model are provided in the appendix.

In equations 4 -7, it was assumed that the functions Φ α partitioning the infecteds depend on time since infection but not on time itself. This approximation works well for modelling subclasses whose populations reflect the nature of the disease itself (like hospitalizations and fatalities), but it is less suited to subclasses whose populations reflect a behavioral or societal response to the disease. Consider a system in which quarantines are assigned by a track-and-trace progam -the probability of being under quarantine at time since infection s depends on the history of testing to that point and not necessarily on the current state of testing. Therefore, in the most general case one can write a dynamical equation:

where φ α in (t, s) and φ α out (t, s) are the probability density for moving in/out of subclass α at time t and time since infection s. Note, however, that if the criterion for moving people in and out of subclass α are changing on timescales that are slow compared to the typical residence time for subclass α, then to leading order one can apply a quasi-static approximation for which equation 12 reverts to:

and hence equation 7 becomes:

Finally, while the underlying probability density functions φ α (t, s) can be obtained from track-andtrace data, we have found that the available data is not sufficiently detailed to precisely specify all necessary inputs to a time since infection model. To that end, we suggest that the probability density functions can be approximated as beta distributions:

where p α in is the probability that an infected person is ever assigned to subclass α and the 'shape' of the distribution is defined by parameters a and b. The same parameterization can be done for each φ α out as well. Finally, the shape parameters α and β can be explicitly related to the meanx and variance v of the distribution:

This same parameterization strategy also applies to the rate constant for infection, β(s), which is itself a probability density function for transmitting the disease at time since infection s.

Before proceeding further, it is helpful to recast the governing equations into dimensionless form. We T 0 β(s)ds we rescaleβ = βT C /R 0 so that 1 −1 dsβ(s) = 1, and we rescale allφ α = φ α T C in the same way so that 1 −1 dsφ α (s) = p α , where p α is the probability that α is the final state (e.g. recovered, deceased, etc) for an infected person.

For compactness of notation, we will suppress tildes in our dimensionless variables -from this point forward, all variables are dimensionless unless otherwise stated. The dimensionless governing equations of the TSI model with time since infection are given by:

These equations can be solved by any number of means. In the section that follows, we will present three discretization stategies for numerical solution. First, we show that existing multi-stage models can be derived using a particularly inefficient method-of-lines discretization. Second, we show that with a few small changes the accuracy of the standard multi-stage model can be improved dramatically at a small cost -time-stepping is inflexible. Finally, for both spectral accuracy in s and flexible time-stepping in t, we present a novel discretization based on a Galerkin approximation.

The PDE can be converted to an approximate ODE representation via the method of lines, perhaps the most common technique used to solve TSI models [14] . Here, we discretize s in uniform intervals of width h and represent advection/integration in s via upwinding and right Riemann sums, respectively.

These methods are robust but generally poor performing, with accuracy O(h) -about as bad as any stable numerical scheme can be. We will show that this choice of discretization maps the TSI model to the standard multi-stage generalization of the SIR model [10] . Details about the truncation errors introduced by this particular choice of discretization will be discussed shortly.

We evaluate the solution at discrete points s 1 , s 2 , . . . , s N where s n = −1+(n−1)h and h = 2/(N −1).

In this section, we denote I n as the number density of infected persons at each s n and β n ∼ hβ(s n ).

For best results, the value of β n should be re-scaled so that the right Riemann sum approximation still evaluates to unity: N n=2 β n = 1. Applying this discretization scheme to the PDE equations, we obtain evolution equations for S, I n , and R:

Recall, however, that equation 27 applies to the special case where a person recovers only after aging beyond the maximum infectious period, φ R (s) = δ(s − 1). For more realistic versions of φ R (s), equation 27 becomes:

For best results, the value of φ R,n should be re-scaled so that the right Riemann integral still evaluates to one, N n=2 φ R,n = 1. Equations 24 -27 map exactly to the SIR model when N = 2 or to a multi-stage SIkR model with k = N −1 stages more generally. In practice, however, the convergence is so slow that truncation errors will have a large influence on the predictions. Therefore, we discuss the truncation errors introduced by this scheme and how they systematically influence the resulting predictions.

First, we consider truncation error when integrating over s using a left Riemann sum. Assuming β n = β 0 for all n, the right Riemann sum will under/over estimate the infectivity of the infected class during the rising/falling of the epidemic by O(h). Next, upwinding to approximate derivatives in s introduces numerical diffusion, meaning that the numerical solution behaves like a PDE model to which an additional diffusion term has been added: [12] . The principle disadvantage of this method is that time-stepping is inflexible.

We evaluate the solution at discrete points s 1 , s 2 , s 3 . . . , s N where s n = −1 + (n − 1)h and h = 2/(N − 1) and also at discrete times t (k) = kh for integer k. We denote I n as the number density of infected persons at each s n and β n ∼ hβ(s n ) for n = 0, N and β n ∼ hβ(s n )/2 for n = 0, N . As in section 2.5, for best results the value of β n should be re-scaled so that N n=1 β n = 1. The time-step k = 0, 1, 2, ... will be indicated as a superscript in parentheticals, e.g. S (k) is the susceptible population at time t (k) . In the language of a predictor/corrector method, intermediate predictions will be further denoted by the letter p inside the parenthetical, e.g. S (k+1,p) .

Beginning from time t (k) , we can calculate the state of the system at time t (k+1) = t (k) + h as follows. We begin with an explicit prediction step:

Then, we proceed to the correction: 

This method has second order accuracy O(h 2 ), so compared to the SIkR model one needs only a few stages to obtain good results. The principle weakness of this method is that time-stepping is inflexible -each time step must advance the solution forward by a time of exactly h. Note also that for numerical stability, the time-step h must be chosen to be sufficiently small (e.g. h(R 0 − 1) < 1).

For spectral accuracy in s and flexible time-stepping, we provide a discretization based on a Galerkin approximation. On the interval s ∈ [−1, 1], we can write I(t, s) as weighted sum of Legendre polynomials:

Any set of orthogonal basis functions could be used for this purpose, but we have found that Legendre polynomials work particularly well in this application. If we truncate the sum after m + 1 terms and insert into the governing equations, we obtain a closed system of equations for S, I 0 , I 1 , I 2 , . . . , I m , and R. This technique is known as a Galerkin approximation. The population of susceptibles evolves by:

where the coefficients a n are:

For n < m, we obtain evolution equations for the coefficients I n : 

In practice, this constraint can sometimes be used to eliminate the highest order Legendre polynomial coefficient, I m , as an independent variable.

However, this trick of eliminating I m works best when the reproduction number R 0 (or more generally, the contact matrix in an age-structured model) is not varying in time (c.f. section 4). Implicit algebraic constraints can be costly during numerical integration, so the predictor/corrector method may be preferred in these cases.

Finally, we consider the growth dynamics for a recovered class:

The Galerkin discretization does offer spectral resolution in s and flexible time-stepping -both of which are improvements over the second-order accuracy and inflexible time-stepping of the predictorcorrector method. However, as is often the case with spectral methods, the improvements in accuracy come with costs regarding the robustness of the method. These costs will be discussed in more detail in section 4.

In this section, we will provide sample calculations for each of the discretization schemes given in the preceding sections. In each case, we will solve for the growth dynamics of a system with R 0 = 2 and β(s) ∼ (1 + s)(1 − s) 4 . We also consider non-trivial recovery dynamics, φ R (s) = (1 + s) 4 (1 − s). This choice of input parameters is intended for proof-of-concept calculations only and does not purport to represent the dynamics of any particular disease. For an initial condition, we seed with a small infected population distributed in s to follow the fastest growing linear mode.

To find the fastest growing linear mode, we return to the PDE formulation of the model. From equation 20 we find:

with growth rate λ and density of recent infecteds I 0 1. From the boundary condition at s = −1 we also have:

This gives an implicit relationship between the growth rate λ and the reproduction number R 0 and β(s) that can be solved numerically:

Thus, we see that the growth rate λ (in units of 2/T ) is not uniquely determined by the reproduction number R 0 but also depends on how the infectiousness of a disease rises and falls over time within the infectious period. Equivalent results have been previously reported for TSI models [10] .

In this section, all simulations begin from an initial condition that approximates the fastest growing linear mode, I(0, s) = 10 −3 exp(−λs). Analysis of linear growth dynamics for an age-structured version of the TSI model are provided in the appendix to the present report.

First, we consider the SIkR discretization, per section 2.5. In Figure 2 , we see that there is noticeable error for small values of N ≤ 6 and trends for convergence are not evident between N = 4 and N = 6.

The convergence of the SIkR discretization is just O(h), so it is not surprising that the model performs so poorly here. Next, we consider the predictor/corrector (PC) discretization, per section 2.6. In Figure 3 , we see that the absolute error at small N happens to be larger in this case, but the trends for convergence are clear, and the PC method will vastly out-perform the SIkR method for slightly larger values of N . Note also that the PC method takes large time steps, so the cost to implement a larger value of N is still very good by comparison. In this specific application, it is clear that the Galerkin method vastly out-performs the predictor/corrector and SIkR discretization strategies. This should not be too surprising, since the Galerkin discretization provides spectral accuracy in s. However, as mentioned in section 2.7, models with spectral accuracy can be less robust in certain applications. In the following section, we discuss common problems that one might encounter with the Galerkin method.

Considering the Galerkin method, the dramatic improvements in accuracy come at a cost in the robustness of the method. In this section, we will discuss problems that the Galerkin method has with

(1) abrupt changes in R 0 , (2) smooth changes in R 0 , and (3) arbitrarily chosen initial conditions.

For the first two, we must explain what it means to think about abrupt/smooth changes in R 0 .

To this point, we have described the reproduction number as a dimensionless constant, but in this section we will allow it to be a function of time, R 0 = R 0 (t), as though the underlying dynamics of transmission were also subject to some time-dependent modulation β(t, s) = u(t)β(s). Physically, this construction might be useful for describing changes in non-pharmaceutical interventions (e.g. social distancing, lockdowns, etc.) or the inherent seasonality of an epidemic.

If there are step-wise changes or 'kinks' in R 0 (t) at some time t 0 , then the distribution of infecteds I(t, s) within the infectious range s ∈ [−1, 1] will also be non-smooth for all times t ∈ [t 0 , t 0 + 2]. Spectral methods (including those based on Legendre Polynomials) are less accurate when representing non-smooth functions, so the Galerkin discretization will sometimes (but not always!) perform poorly following non-smooth dynamics in R 0 (t). As an example, we replicate the calculations of the preceding section but impose R 0 (t > 3) = 1. In Figure 5 , we compare the converged solution from the PC method with the Galerkin result for m + 1 = 4. In this case, the Galerkin method is not as good as it was in Figure 4 , but overall it still appears to match the converged solution very well. The quality of agreement found in Figure 5 is perhaps partially attributed to the smooth choice of functions β(s) and φ R (s). If we instead choose φ R (s) = δ(s − 1), we find that shortly after the change in R 0 , the recovered population temporarily decreases -a result that is clearly non-physical. As a matter of best practice, one may want to consider using other methods whenever R 0 (t) is non-smooth -for example, it might be better to use the predictor-corrector method for the short

A similar problem also emerges whenever one attempts to apply an arbitrary choice of initial condition. The initial condition contains a history of infections going back to time t = −2. If the choice of initial condition is not consistent with the same constant value of R 0 that is applied for t > 0, the distribution of infecteds I(t, s) within the infectious range s ∈ [−1, 1] will be non-smooth until time t = 2. It is for this reason that the starting condition for all the simulations in section 3.2 was chosen to be aligned with the fastest growing linear mode. As a matter of best practice, when applying an arbitrary initial condition we suggest using the PC method to integrate to time t = 2.

However, even with a good choice of initial conditions and a smoothly varying R 0 (t), there are potential limitations to the Galerkin method. In any TSI method (including PC and SIkR), one must always be sure that the numerical method has sufficient resolution to capture the unsteady behavior in R 0 (t): when R 0 (t) is varying on timescales that are fast compared to the infection period, a higher number of modes will be needed to capture that variation. For example, if the reproduction number is changing hour-by-hour and the infectious period is several weeks, one will need hundreds or thousands of modes to capture all the effects of a time-varying reproduction number. Fortunately, whenever the reproduction number varies on such fast time-scales, one can often use a smoothed out approximation (e.g. moving average) to obtain very similar results in the trajectories for S and R with a much smaller number of modes. In Figure 7 , we compare converged predictions for R 0 = 2 and R 0 = 2 + sin(2πt)

with the same values of β(s) and φ R (s) = δ(s − 1). The unsteady reproduction number requires substantially more modes (m + 1 ≈ 16) for convergence, and unrealistic results (c.f. Figure 6 ) are seen when too few modes are present. Finally, whenever the reproduction number is varying in time -on any timescale -we have empirically found that there are minor problems with finding an approximation to I(t, s) that approximately preserves the property I(t, s > t − 1) = I(t − (s + 1), −1) whenever the boundary condition 48 is handled explicitly per equation 49. Fortunately, we have not yet found an example in which this particular difficulty in describing I(t, s) leads to notable errors in the trajectories for S and R. Whatever the underlying issue here may be, we have found that it is absolved when the boundary condition 48 is handled implicitly, but this will often make implementation of the Galerkin method much more computationally expensive.

A principal motivation of the present report is to develop numerical methods for TSI models that are computationally tractable for solving complex problems in the fight against infectious diseases. The example calculations given in section 3.2 do affirm the computational efficacy of our methods, but they do not represent a complex calculation that could not be reasonably attained with pre-existing (and less efficient) methods. Therefore, in this section we will provide proof-of-concept results for optimal control caclulations -calculations that (to our knowledge) have not yet been reported for a TSI model.

Here, we use the same model parameters as in section 3.2, but we consider a time-dependent R 0 (t) = u(t)R 0 , where u(t) is a control function that modulates the effective reproduction number. This control function could describe non-pharmaceutical interventions such as social distancing and lockdowns to prevent the transmission of a disease. For simplicity, we choose to describe the control function u(t) as a piece-wise linear interpolation between a set of N u interpolation points. Since Optimal control calculations require a cost function, which we represent as the sum of a social distancing cost and the cost of infections evaluated on an interval t ∈ [0, T f ], with T f = 35. The cost of infections is given by the total number of infections 1 − S at the end time, and we assume the cost of modulating the reproduction number at any time scales as (1 − u) 2 S. We use the parameter Ω as a constant of proportionality between these two costs so that the total cost C is given by:

Large values of Ω will represent diseases that are (a) not very deadly and/or (b) extremely costly to contain (e.g. the common cold), whereas small values of Ω describe epidemics that are more deadly and/or less costly to contain (e.g. Ebola or MERS). Note that our analysis here has not considered the possibility of re-infection -for diseases with short-lived immunity, the conversation on optimal control can be completely different.

Naievely, one might think that given a TSI model with a maximum infectious period (and no possibility of re-infection), the best control solution will always be to eradicate the disease with a full lockdown lasting one infection period, giving a cost C = 2Ω. However, this is not practical for mild diseases, Ω 1, which are not deadly enough to warrant such drastic measures. For sufficiently large values of Ω one will always find that that the best option is to let the disease run its course with targeted interventions that guide the system towards a less costly 'herd immunity' state. As an example, we present results for optimal control with Ω = 0.3, for which a solution with C = 0.54 < 2Ω has been found. The time-dependent R 0 (t) is shown in Figure 8 , and the resulting evolution of S and R are shown in Figure 9 . Note that the interventions mostly occur after the epidemic has already peaked (the inflection point in S(t)). This is because for large values of Ω, containing the disease is futile -the focus is not on preventing the epidemic but on finding a less costly path to herd immunity. 

Epidemic models are useful tools in the fight against infectious diseases, but they are often forced to strike a balance between accuracy and complexity. Infection-age or time since infection (TSI) models allow for a more realistic description of disease dynamics, but a percieved burden of computational complexity has here-to limited their use in many applications. In the present work, we have shown that one can retain a full TSI description of disease transmission with only a small increase in the computational costs, especially if one employs our Galerkin discretization strategy. We have discussed the advantages (accuracy with low modes) and disadvantages (robustness to non-smooth dynamics) of the Galerkin approximation, and also provided proof-of-concept calculations for 'optimal control' using a 'predictor/corrector' scheme. Besides providing more efficient numerical methods, we have also devised a 'filtering' technique that eliminates the need to assign different disease dynamics to different sub-populations of the infected population. Finally, the governing equations for an age-structured generalization of the model are given in the appendix. At this point, it is our view that TSI models can now be employed for a wide range of applications in which they were previously not considered feasible. The tools and methods described in the present report (including the age-structured version) are all freely available through the software package pyross.

This work was undertaken as a contribution to the Rapid Assistance in Modelling the Pandemic (RAMP) initiative, coordinated by the Royal Society. This work was funded by the European Research

Council, under the Horizon 2020 Programme, ERC grant 740269. The authors would especially like to thank Simon Frost and Robert Jack for their valuable feedback on this project, and also to Rajesh Singh for his help incorporating the methods outlined in this report into the epidemic modelling framework of pyross.

A Class-structured TSI model

In many cases, it is important to resolve predictions for how an epidemic will spread through different sub-populations (e.g. age, occupation, socioeconomic status, etc.). If there are C classes in the population, we can denote population that is susceptible and assigned to class i = 1, 2, . . . , M as S i .

Likewise, in this appendix the population that is assigned to class i and was infected during the narrow interval ofs ∈ [s, s + δs] in the past is I i (t, s)δs, and the total population assigned to class i is N i . The mean frequency of contact between members of class i and j is given by the quantity C ij /N j , where C ij is referred to as the 'contact matrix'. In the most general case, we allow each class i to have a different value of β i (s). The full PDE version 1 of the governing equations are given by:

Sub-classes of the infected population (e.g. infectious, recovered, hosptialized, symptomatic, asymptomatic, etc.) can be computed in the same manner as before. For example, if we assume that all infecteds are classified as recovered once their infection ages beyond time T , then the fraction of the population that is recovered and assigned to sub-class i evolves as:

As before, we prefer to work with the governing equations in a dimensionless form. We rescale S i and R i by the total population N = i N i , such thatS i = S i /N andR i = R i /N . Using a characteristic time T C = T /2, we rescale time ast = t/T C and number density of infecteds asĨ i = I i C/N . The time since infection is rescaled to the variables = s/T C − 1 so that the range s ∈ [0, T ] is mapped tõ s ∈ [−1, 1]. Finally, we define a class-specific reproduction number

and a dimensionless matrix of rate constants that subsumes information about the contact structure:

Note that the matrixβ ij has the property: dsβ ij (s) = 1 (62)

For compactness of notation, we will suppress tildes in our dimensionless variables -from this point forward, all variables are dimensionless unless otherwise stated. The dimensionless governing equations for the age-structured TSI model are given by: 

As before, there are many valid discretizations of the PDE model but as before we will focus on the Galerkin approximation and the method of lines.

Once again, we discretize s in uniform intervals of width h and represent advection/integration in s via upwinding and left Riemann sums, respectively. We evaluate the solution at discrete points in s, evaluating at s 1 , s 2 , s 3 . . . , s N where s n = −1 + (n − 1)h and h = 2/(N − 1). In this section, we denote I i,n as the number density of infected persons at each s n and β ij,n = hβ ij (s n ). Applying this discretization scheme to the PDE equations, we obtain evolution equations for S i , I i,n , and R i :

This maps exactly to the standard age-structured multi-stage SIkR model. Therefore, we confirm that in the limit of h → 0, the age-structured SIkR model is equivalent to an age-structured time since infection model. In practice, however, the convergence is so slow that truncation errors will have a large influence on the predictions. As before, the truncation error can be represented as numerical diffusion in the s−direction.

As before, one can also employ the more efficient predictor/corrector discretization using the method of lines.

We can represent each I i (t, s) on s ∈ [−1, 1] via a truncated Legendre polynomial expansion with m+1 

As a generalization of the linear stability analysis for the single compartment case, we suppose an initial condition in which nearly the whole population is susceptible and there is a small population of infected growing as I i (t, s) = I 0,i exp(λ(t − s)). The growth rate λ and relative number of infected in each age compartment I 0,i can be obtained by considering the following matrix:

The growth rate λ is the value of lambda for which the largest eigenvalue of A ij is equal to one, and the relative number of infected in each age compartment I 0,i is given by the corresponding eigenvector.

A structured epidemic model incorporating geographic mobility among regions

An age dependent epidemic model

Modeling the spatial spread of infectious diseases: The GLobal Epidemic and Mobility computational model

Global behavior of an age-structured epidemic model

Containing papers of a mathematical and physical character

Prelude to Hopf bifurcation in an epidemic model: analysis of a characteristic equation associated with a nonlinear Volterra integral equation

Dynamics of COVID-19 epidemics: SEIR models underestimate peak infection rates and overestimate epidemic duration

Effects of the infectious period distribution on predicted transitions in childhood disease dynamics

An epidemic model structured by the time since last infection

Realistic distributions of infectious periods in epidemic models: changing patterns of persistence and dynamics

Simple approximations for epidemics with exponential and fixed infectious periods

Why case fatality ratios can be misleading: individual-and population-based mortality estimates and factors influencing them

Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand

A theoretical framework for transitioning from patient-level to population-scale epidemiological dynamics: influenza A as a case study

Inference, prediction and optimization of non-pharmaceutical interventions using compartment models: the PyRoss library