Path Integral Approach to Uncertainties in SIR-type Systems

Steven Gratton

2020-06-02

Abstract

In this paper I show how path integral techniques can be used to put measures on histories in "susceptible-infectious-recovered" (SIR)-type systems. The standard SIR solution emerges as the classical saddle point of the action describing the measure. One can then expand perturbatively around the background solution, and this paper goes on to work out the covariance of fluctuations around the background solution. Using a Green's function type approach, one simply needs to solve additional ordinary differential equations; an explicit matrix inversion is not required. The computed covariance matrix should be useful in the construction of fast likelihoods for fitting the parameters of SIR-type models to data. A comparison of the predictions of the approach to an ensemble of simulations is presented.

1 Introduction

In understanding the spread of infectious diseases throughout a population, broadly speaking one can either take a continuous approach and solve deterministic differential equations, or else take a stochastic simulation-based approach and compute many Monte Carlo realizations of the epidemic. A disadvantage of the first option is that it gives no indication of the properties of the uncertainties one should expect to see about the solution, while a disadvantage of the second is that it requires the potentially expensive numerical computation of many realizations in order to fully understand the behaviour. In this note I present a scheme that walks a middle ground between these two options: a deterministic procedure that allows for the efficient computation of the properties of uncertainties around the mean history. This paper will treat the simple "susceptible-infectious-recovered" (SIR) model [1], but the technique is applicable to more complicated models.

The basic idea is to start by formulating a suitable discrete-time stochastic generalization of the SIR model, allowing for appropriate fluctuations in the numbers being infected and recovering. One then writes down an approximation to how unlikely some putative "history" would be in the stochastic framework. Taking the continuum limit in time, one then obtains a "measure" of the likelihood of that history occurring if the model is true. This measure is written as the exponential of minus the "action" of the history (following physics naming conventions in mechanics and quantum theory). Once one has an action, one has a probability distribution on entire histories of the evolution of the system under consideration, and so has access to all possible information about the statistics of such histories. This is an "imaginary time" version of Feynman's path integral for quantum systems [2]. As in mechanics, the saddle point of the action determines the most likely history to occur. Then, as in quantum mechanics, by considering deviations away from the saddle-point solution, one can work out the statistical properties of the deviations of the variables, such as their covariances at both equal and unequal times.
Actually, it turns out that one need not explicitly evaluate the path integral over the fluctuations in order to obtain the unequal time correlators (or UETCs) in the case where the fluctuations are small; rather, one can obtain them from the form of the expansion of the action to second order about the saddle-point history. This effectively gives one an inverse covariance matrix, whose inverse can be obtained without a matrix inverse operation via a multi-variable version of the "Green's function" technique. It is shown that this only requires solutions of the classical Hamiltonian equations for the fluctuation variables. So, simply by solving ordinary differential equations, one can deterministically obtain both the typical behaviour of disease spread in SIR-type models and the properties of the uncertainties around that spread.

Analogous path integral techniques have been found useful by the author and colleagues in understanding the properties of quantum fluctuations in a variety of situations in models of the early Universe; see e.g. [3, 4, 5, 6]. The work of [7] also discusses fluctuations around SIR-type models, from an alternative viewpoint, and provides references to the literature on infectious disease modelling and inference.

Section (2) describes the model situation, and Sec. (3) constructs the action weighting each history in the path integral approach. Sections (4) and (5) then study saddle points and fluctuations around them, showing how to compute covariance matrices for quantities of interest. Section (6) presents a comparison of these results with numerical simulations, and a discussion and conclusions are given in Sec. (7).

2 The Model

Consider a population of total number N, with S(t) susceptible people and I(t) infectious people at time t. Let us imagine that every day an infectious person meets with d other people, and infects each of them with probability µ if they have not already been infected. Overnight, each infectious person (not including those just infected that day) recovers with probability γ. So from day to day we would expect

    ⟨∆S(t)⟩ = −µd I(t) S(t)/N,    (1)
    ⟨∆I(t)⟩ = µd I(t) S(t)/N − γ I(t).    (2)

The S/N factor in the first equation accounts for the fact that only susceptible people can be infected (assuming one cannot catch the disease twice for now) and assumes that a typical infectious person is surrounded by typical non-infectious people.

What fluctuations around these means might we expect? In terms of infections, we have a total of dI potential infection events a day, which we may take to be independent. So we expect a binomial distribution of successes, with the number of trials being dI and the probability of success per trial being µS/N. This indeed has mean µdIS/N, and has variance µdIS/N (1 − µS/N). The number of recoveries might also be binomially distributed, and we shall consider this case here. There will be I trials per day with a probability of success per trial of γ, giving mean γI and variance γI(1 − γ).

To match convention we can denote µd by β. In the context of COVID-19, if people are typically infectious for about five days say, then γ will be around 1/5. For the disease to be able to spread initially (when S ≈ N), from Eq. (2) we see we need β − γ > 0, or the reproduction number R_0 ≡ β/γ to be greater than unity. COVID-19 seems to typically spread with R_0 ∼ 2.5, so β ∼ 0.5. If d is of order ten say, then µ must be around 0.05. Hence we are close to being in the Poisson limit for the number of daily infections, with mean and variance then both βIS/N. For the number of recoveries it is not so clear that we are in the Poisson limit, but for simplicity we shall assume this for now.
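This stochastic model is easy to simulate directly. The sketch below is a minimal illustration, not the author's own code: the parameter defaults (N = 5000, γ = 0.2, µd = β = 0.3, four updates per day) anticipate the values used in the comparison of Sec. (6), while I0, the duration and the random seed are arbitrary choices made here for concreteness.

    import numpy as np

    rng = np.random.default_rng(1)

    def simulate(N=5000, I0=5, d=10, mu=0.03, gamma=0.2, days=100,
                 steps_per_day=4):
        """One stochastic realization of the model of Sec. (2): per step,
        d*I contact 'trials' each infect with probability mu*S/N (thinned
        by the step size), and each infectious person recovers with daily
        probability gamma. Returns S and I sampled once per day."""
        dt = 1.0 / steps_per_day
        S, I = N - I0, I0
        Ss, Is = [S], [I]
        for step in range(days * steps_per_day):
            new_inf = rng.binomial(d * I, min(mu * S / N * dt, 1.0))
            new_inf = min(new_inf, S)   # cannot infect more than S people
            new_rec = rng.binomial(I, min(gamma * dt, 1.0))
            S -= new_inf
            I += new_inf - new_rec
            if step % steps_per_day == steps_per_day - 1:
                Ss.append(S)
                Is.append(I)
        return np.array(Ss), np.array(Is)

Averaging many such realizations recovers Eqs. (1) and (2); their scatter about the mean is what the path integral approach below characterizes deterministically.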
3 The Action

If things are not changing too quickly per day, we can treat a few, ∆t, days together at a time. We are then just effectively multiplying the number of potential events by ∆t. Keeping ∆t's around will also presently help us in taking the continuum limit.

From Eqs. (1) and (2) and our discussion above, we see that the behaviours of S and I are coupled: an infection event changes both at once. To disentangle this, we introduce a new variable T ≡ S + I; then S changes only through infection events and T only through recovery events.

Now, consider a possible "history" of the disease spread, corresponding to values for S and T for all times t (in days). Then, over a few ∆t days, focussing for the moment on S, the difference between the change in S in this history and the expected change ∆t⟨∆S(t)⟩ coming from Eq. (1) must be put down to fluctuations. Approximating the Poisson by a gaussian with the corresponding mean −βI(S/N)∆t (negative since an infection reduces S) and variance βI(S/N)∆t, the probability density of such a fluctuation is then given by

    p(∆S) ∝ exp[ −(∆S + βI(S/N)∆t)² / (2βI(S/N)∆t) ].    (3)

Note how the ∆t factors in the exponent can be rearranged to write the exponent as

    −(1/2) (∆S/∆t + βI(S/N))² / (βI(S/N)) ∆t.    (4)

Multiplying the probability distributions from the groups of time steps making up the entire history, we obtain an exponent

    −(1/2) Σ_steps (∆S/∆t + βI(S/N))² / (βI(S/N)) ∆t.    (5)

Now we may take the continuum limit ∆t → 0 and thus obtain a probability density or "measure" e^{−S_S} with "action" S_S given by

    S_S = (1/2) ∫₀^{t_f} dt (Ṡ + f)²/f,    (6)

where we have defined f ≡ βI(S/N) for notational simplicity (and we have neglected any history dependence of the prefactor), assumed initial and final times of 0 and t_f respectively, and denoted time derivatives by an overdot. Following a similar argument through for T, we obtain an action for T of

    S_T = (1/2) ∫₀^{t_f} dt (Ṫ + g)²/g,    (7)

with g ≡ γ(T − S) (since I = T − S). Hence our full measure on a history is given by an action

    S_E = S_S + S_T = (1/2) ∫₀^{t_f} dt [ (Ṡ + f)²/f + (Ṫ + g)²/g ].    (8)

(Here we have followed a physics convention and used the subscript E to denote a "Euclidean", as opposed to "Lorentzian", action.) Had we not introduced T, we could have introduced a 2-dimensional correlated gaussian for changes in S and I at each time step and ended up with an equivalent result.

With a measure, we can obtain the expectation value of some quantity ⟨···⟩ with:

    ⟨···⟩ = ∫ Dq (···) e^{−S_E} / ∫ Dq e^{−S_E}.    (9)

Here Dq denotes the functional path integral over the variables collectively written together as q. One considers integrating over all paths subject to appropriate boundary conditions. If one knew the system started in some state, for example, then one would constrain all histories to start in that manner. One can also impose final-state conditions if one is interested in knowing what might have happened in between two known states, or leave the final conditions free. Else, evaluating the path integral as a function of final state (or equivalently choosing (···) = δ(q(t_f) − q_f)) will give the probability for the final state q_f. Some common choices for (···) are one of the variables at some time t_1, e.g. S(t_1), or say the square of a variable at equal times (e.g. S(t_1)²). From these the variance of S at t_1 can be derived. More generally we can work out the covariance between two variables at different times, ⟨q_i(t_1) q_j(t_2)⟩, also known as an unequal time correlator (UETC).

Equation (9) can in principle be used to compute all quantities of interest within a model. However, the functional integral cannot usually be evaluated. Instead a saddle point approximation can be performed. Here one finds the history that minimizes the action, and considers histories around this one.
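To make the measure concrete, here is a minimal sketch (the function and its discretization choices are illustrative, not taken from the paper's code) that evaluates the discretized action of Eq. (8) for a history sampled at spacing dt, using the same forward differences as the construction above; exp(−S_E) then weights the history, with the SIR solution itself attaining S_E = 0.

    import numpy as np

    def euclidean_action(S, T, dt, beta, gamma, N):
        """Discretized S_E of Eq. (8) for a history (S(t), T(t)) given on a
        uniform grid of spacing dt; T = S + I, f = beta*I*S/N, g = gamma*I.
        f and g are evaluated at the start of each step, matching the
        forward-difference construction of the path integral."""
        I = T - S
        f = beta * I[:-1] * S[:-1] / N
        g = gamma * I[:-1]
        Sdot = np.diff(S) / dt
        Tdot = np.diff(T) / dt
        eps = 1e-12   # guard against f or g vanishing when I = 0
        return 0.5 * dt * np.sum((Sdot + f) ** 2 / (f + eps)
                                 + (Tdot + g) ** 2 / (g + eps))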
4 Saddle Points

The Euler-Lagrange equations for varying S_E with respect to S and T yield

    d/dt [(Ṡ + f)/f] = f_S (Ṡ + f)/f − (f_S/2) ((Ṡ + f)/f)² + g_S (Ṫ + g)/g − (g_S/2) ((Ṫ + g)/g)²,    (10)
    d/dt [(Ṫ + g)/g] = f_T (Ṡ + f)/f − (f_T/2) ((Ṡ + f)/f)² + g_T (Ṫ + g)/g − (g_T/2) ((Ṫ + g)/g)².    (11)

Here subscripts S and T denote partial derivatives with respect to S and T. We see that if Ṡ + f = 0 and Ṫ + g = 0 hold at some time, then they must actually hold at all times. But Ṡ + f = 0 and Ṫ + g = 0 are just the standard SIR equations, so we have seen how their solutions can emerge as saddle points of the action in the path integral approach.

We now go on to argue that the SIR-type solutions are actually the most likely saddle point if the imposed boundary conditions allow. As discussed above, the probability for obtaining the final state q_f given some initial state q_i is given by evaluating the path integral over all paths that interpolate between q_i and q_f. In the saddle point approximation, to leading order this is just e^{−S_E(q_f, q_i)}, where S_E(q_f, q_i) is the action calculated for the interpolating path. The interpolating path's initial velocities will be chosen to make the q(t_f) under evolution of Eqs. (10) and (11) indeed be the desired q_f. By inspection of Eq. (8), we see that the history that has the S(t_f) and T(t_f) given by integrating Ṡ + f = 0 and Ṫ + g = 0 from the initial S(0) and T(0) must have an action of 0. Then, as the action is positive semi-definite, all other paths, which have different field values at the final time, must have higher action and so be less likely.

5 Fluctuations

Having found the most likely path or background solution for S(t) and T(t), we now consider fluctuations around it, expanding S_E to second order in δS(t) and δT(t). The form of the action and the most likely path having Ṡ + f = 0 and Ṫ + g = 0 make this particularly straightforward, yielding

    S_2 = (1/2) ∫₀^{t_f} dt [ (δṠ + f_S δS + f_T δT)²/f + (δṪ + g_S δS + g_T δT)²/g ].    (12)

Evaluating e^{−S_2}, with S_2 from Eq. (12), for any given δS(t) and δT(t) will tell us how much more unlikely such a history is than the standard SIR solution. Being quadratic in the fluctuations, S_2 can productively be thought of as minus the exponent of a multi-dimensional gaussian probability distribution, in the limit in which the number of dimensions tends to infinity as the duration of the time step tends to zero.

In fact, let us pass back to discrete time for clarity in the following. (Note that now ∆t is no longer being thought of as a multiple number of days, just some non-zero fraction of a day.) In matrix notation, −2S_2 corresponds to

    −∆t (δS^T δT^T) C^{-1} (δS)
                           (δT),    (13)

where δS and δT are now viewed as column vectors with each row corresponding to a timestep, f, g and their partial derivatives are now viewed as appropriate (diagonal) matrices, superscript T denotes matrix transpose, C^{-1} (so-named by analogy to the inverse covariance matrix for a gaussian distribution) is the matrix

    C^{-1} = ( (D + f_S)^T f^{-1} (D + f_S) + g_S g^{-1} g_S    (D + f_S)^T f^{-1} f_T + g_S g^{-1} (D + g_T) )
             ( f_T f^{-1} (D + f_S) + (D + g_T)^T g^{-1} g_S    f_T f^{-1} f_T + (D + g_T)^T g^{-1} (D + g_T) ),    (14)

and D is the appropriate discrete-time time-derivative operator:

    D = (1/∆t) (  1  0  0 ···
                 −1  1  0 ···
                  0 −1  1 ···
                  ⋮        ⋱ ).    (15)

It is important at this stage to consider the sorts of boundary conditions we shall wish to impose. For the problem as formulated here, we suppose we are given the initial values for S and T. So δS(0) and δT(0) should initially be zero, and any variation of them should not be considered. Hence our vectors for δS and δT start from the first timestep, with δS(∆t) and δT(∆t). By the construction of the path integral, we see it involves "forward" differences, and so we have correspondingly constructed our D matrix not to include anything from a step "beyond" the end of t_f. We need not impose any conditions on δS and δT at the final timestep; indeed, we often want to determine what these should be at the end of the evolution. We shall see presently the role of the potentially dangerous unbalanced 1/∆t in the first row of D (and of the unbalanced 1/∆t in the last row of D^T) in ensuring we find the correct solution. (Unlike for D, we can allow f, g and their partial derivatives to "overrun" as required to give invertible operators with vanishing error in the continuum limit.)

If the number of timesteps is not too large, one could feasibly directly calculate the inverse of C^{-1}, which, when divided by ∆t, would directly give the covariance matrix for the elements of δS and δT.
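For modest numbers of timesteps this brute-force route is easy to sketch. The code below uses illustrative names of my own; it assumes a background solution with I > 0 throughout (so that f and g are invertible) and uses the factorization C^{-1} = B^T A^{-1} B derived just below, with A the diagonal matrix diag(f, g).

    import numpy as np

    def covariance_direct(S, T, dt, beta, gamma, N):
        """Build the discrete C^{-1} of Eq. (14) on a background SIR
        solution (S, T at timesteps dt, 2dt, ..., n dt) and invert it
        directly. O(n^3); useful for checking the Green's function
        method on small problems."""
        n = len(S)
        I = T - S
        f, g = beta * I * S / N, gamma * I
        f_S, f_T = beta * (T - 2 * S) / N, beta * S / N
        g_S, g_T = -gamma * np.ones(n), gamma * np.ones(n)
        # Forward-difference operator of Eq. (15): note the unbalanced
        # 1/dt in the first row.
        D = (np.eye(n) - np.eye(n, k=-1)) / dt
        B = np.block([[D + np.diag(f_S), np.diag(f_T)],
                      [np.diag(g_S), D + np.diag(g_T)]])
        Ainv = np.diag(np.concatenate([1.0 / f, 1.0 / g]))
        Cinv = B.T @ Ainv @ B
        # Covariance of (deltaS, deltaT) over all pairs of timesteps.
        return np.linalg.inv(Cinv) / dt

The diagonal of the returned matrix gives the variances of δS and δT at each timestep, and the off-diagonal entries are the UETCs.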
In any case, we can proceed without having to explicitly perform the matrix inverse, as follows. We want to find G, the matrix that satisfies:

    C^{-1} G = I/∆t,    (16)

where I is the appropriately-sized identity matrix. Now, consider a matrix with twice as many rows and columns as C^{-1}, written in block form as:

    ( −a   b )
    ( b^T  c ).    (17)

The lower-right component of its inverse is just (c + b^T a^{-1} b)^{-1} (see e.g. [8]). Reversing the argument, we see that the inverse of a matrix that can be written in the form c + b^T a^{-1} b is just the lower-right block of the inverse of the larger matrix (17). Now, (14) can be written as:

    C^{-1} = B^T A^{-1} B,    (18)

and so G must just be the lower-right block of the matrix K, where

    K = [ ∆t ( −A   B ) ]^{-1}
             ( B^T  0 )    (19)

and B is the matrix

    B = ( D + f_S    f_T   )
        (   g_S    D + g_T ),    (20)

with A the (diagonal) matrix with f and g down its diagonal. By construction, D is a discrete-time time-derivative operator acting on column vectors to its right. So D^T must act as the time-derivative operator on row vectors to its left. But, away from the ends, by inspection of (15), D^T also acts as minus a time-derivative operator on column vectors to its right, which we can emphasize by defining D̄ ≡ −D^T. B is very sparse, only mixing variables amongst themselves at equal or neighbouring times.

Consider for a moment a column from the right-hand half of K. Writing it as a stack of vectors P_δS, P_δT, δS and δT, from Eq. (19) it must satisfy:

    −f P_δS + (D + f_S) δS + f_T δT = 0,    (21)
    −g P_δT + g_S δS + (D + g_T) δT = 0,    (22)
    (−D̄ + f_S) P_δS + g_S P_δT = δ_{t,t₂}/∆t or 0,    (23)
    f_T P_δS + (−D̄ + g_T) P_δT = δ_{t,t₂}/∆t or 0,    (24)

where the delta function appears in one equation or the other, and at the particular timestep t₂, depending on the precise column considered. Apart from the few rows with "unbalanced" 1/∆t's in them from the derivative operator, we are now in a position to take the continuum limit ∆t → 0 again to find that the column must satisfy:

    −f P_δS + δṠ + f_S δS + f_T δT = 0,
    −g P_δT + δṪ + g_S δS + g_T δT = 0,
    −Ṗ_δS + f_S P_δS + g_S P_δT = δ(t₁ − t₂) or 0,
    −Ṗ_δT + f_T P_δS + g_T P_δT = δ(t₁ − t₂) or 0.    (25)

Neglecting the delta functions on the right-hand side, Hamilton's equations for the action S_2 (Eq. (12)) have emerged, the momenta serving as auxiliary variables enabling us to straightforwardly solve for the correlation functions of the degrees of freedom δS and δT. Explicitly, varying S_2 with respect to δṠ and δṪ and introducing the Hamiltonian H as usual in classical mechanics, one finds

    P_δS = (δṠ + f_S δS + f_T δT)/f,    P_δT = (δṪ + g_S δS + g_T δT)/g,    (26)

with Hamilton's equations then just reading:

    −f P_δS + δṠ + f_S δS + f_T δT = 0,    (27)
    −g P_δT + δṪ + g_S δS + g_T δT = 0,    (28)
    −Ṗ_δS + f_S P_δS + g_S P_δT = 0,    (29)
    −Ṗ_δT + f_T P_δS + g_T P_δT = 0,    (30)

to compare to (25).

Equations (25) constitute a two-dimensional generalization of a Green's function equation. For the elements ⟨δS(t₁)δS(t₂)⟩ and ⟨δT(t₁)δS(t₂)⟩, considered as functions of t₁, we need to patch together a solution to Hamilton's equations for t₁ < t₂ and for t₁ > t₂ such that there is a jump of −1 in P_δS going from t₁ = t₂ − ε to t₁ = t₂ + ε (as seen by integrating the third equation of (25) over t₁ from t₂ − ε to t₂ + ε). Similarly, the elements ⟨δS(t₁)δT(t₂)⟩ and ⟨δT(t₁)δT(t₂)⟩ need solutions with a jump of −1 in P_δT at t₁ = t₂.

Let us now think about the "unbalanced" rows; we shall see that they naturally provide the appropriate boundary conditions we need. The first rows of Eqs. (21) and (22), containing the terms δS(1)/∆t and δT(1)/∆t respectively ("1" denoting the first element), tell us that the first elements of δS and δT must tend to zero as ∆t → 0, whereas the last rows of Eqs. (23) and (24), containing the terms −P_δS(N)/∆t and −P_δT(N)/∆t respectively ("N" here denoting the final element), tell us that the last elements of P_δS and P_δT must tend to zero as ∆t → 0. So, in the continuum limit, we need δS = δT = 0 initially, and P_δS = P_δT = 0 finally. As Eqs. (27)-(30) are linear, we need not re-solve Hamilton's equations for each value of t₂ that we are interested in knowing correlation functions for; rather, we may just take different linear combinations of an appropriate set of pre-computed solutions to satisfy either the initial or final boundary conditions. Here, we need four pre-computed solutions, conveniently chosen to each have a different single canonical variable being initially non-zero. As it happens, in constructing our correlation functions for this problem we are aided by the special circumstance that Eqs. (29) and (30) preserve P_δS = P_δT = 0, so the solutions satisfying the final boundary conditions have P_δS = P_δT = 0 initially also.
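A minimal implementation of this recipe might look as follows (an illustrative sketch with plain Euler steps, not the author's code; for brevity it re-solves per t₂ rather than reusing the four pre-computed solutions, and it exploits the fact, noted above, that the t₁ > t₂ branch has vanishing momenta). It returns the equal-time covariance of (δS, δT) at a chosen time t₂; the variance of δI follows from Var(δI) = Var(δT) + Var(δS) − 2 Cov(δS, δT).

    import numpy as np

    def fluctuation_covariance(beta, gamma, N, S0, I0, t2, dt=0.01):
        """Equal-time covariance of (deltaS, deltaT) at time t2 via the
        Green's function method: integrate Eqs. (29)-(30) backwards from
        the jump condition at t2, then Eqs. (27)-(28) forwards from
        deltaS = deltaT = 0 at t = 0. Euler steps, for brevity."""
        n = int(round(t2 / dt))
        S = np.empty(n + 1); T = np.empty(n + 1)
        S[0], T[0] = S0, S0 + I0
        for k in range(n):   # background SIR: Sdot = -f, Tdot = -g
            f = beta * (T[k] - S[k]) * S[k] / N
            g = gamma * (T[k] - S[k])
            S[k + 1] = S[k] - f * dt
            T[k + 1] = T[k] - g * dt
        f = beta * (T - S) * S / N
        g = gamma * (T - S)
        f_S = beta * (T - 2 * S) / N
        f_T = beta * S / N
        g_S, g_T = -gamma, gamma
        cov = np.empty((2, 2))
        for j, P2 in enumerate([(1.0, 0.0), (0.0, 1.0)]):
            # P(t2-) is set by the jump of -1 down to P = 0 for t1 > t2.
            P = np.empty((n + 1, 2)); P[n] = P2
            for k in range(n, 0, -1):   # Eqs. (29)-(30), backwards
                P[k-1, 0] = P[k, 0] - dt * (f_S[k] * P[k, 0] + g_S * P[k, 1])
                P[k-1, 1] = P[k, 1] - dt * (f_T[k] * P[k, 0] + g_T * P[k, 1])
            dS = dT = 0.0               # Eqs. (27)-(28), forwards
            for k in range(n):
                dS, dT = (dS + dt * (f[k] * P[k, 0] - f_S[k] * dS - f_T[k] * dT),
                          dT + dt * (g[k] * P[k, 1] - g_S * dS - g_T * dT))
            cov[:, j] = dS, dT   # column j: <(dS, dT)(t2) dq_j(t2)>
        return cov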
6 Comparison with Simulations

The author has written code that computes the Green's functions for fluctuations around an SIR model. The code also computes a number of stochastic realizations of the situation as discussed in Sec. (2), allowing for comparison between the analytic and stochastic handling of fluctuations. Parameters assumed for the model are presented in Table 1. N has been deliberately chosen to be quite small in order to yield quite large variations. Indeed, from the form of the action (8), we can, up to the discreteness of the initial conditions (i.e. keeping I(0) some finite integer greater than zero independent of how large N is), expect the fractional fluctuations to scale as 1/√N, and this is verified numerically. In addition, whilst keeping γ at a COVID-19-type level, we have lowered β somewhat from the unmitigated COVID-19-type level in the description of Sec. (2) to obtain an R_0 of 1.5, in order to better illustrate subtleties associated with epidemics that fail to take off. Such a value might also be appropriate for modelling COVID-19 dynamics with some intervention schemes in place. To achieve sub-percent level agreement of the simulations with the model, it was necessary to allow S and I to evolve about four or more times over the course of a single day rather than the once a day described in Sec. (2); in particular, the daily updating case saw a final infected fraction that was some way from the continuum-limit value.

[Table 1: parameters assumed for the model.]

Figure (1) compares the infected fraction from the stochastic simulations to the analytic saddle point solution and ±1σ and ±2σ uncertainty envelopes computed using the path integral technique. Figure (3) compares the scatter in the final asymptotic infected fraction from 10000 simulation realizations to a gaussian with the computed variance. The path integral technique evidently performs very well. However, as foreshadowed by the lower 2σ contour in Fig. (1) going negative for a time early on, in a minority of cases, roughly 150 in the test shown here, the epidemic never takes off, and so there are negative outliers for which the gaussian path integral approximation fails critically. The histogram with the outliers removed has standard deviation 0.027, to be compared with the path integral value of 0.026 (whereas including the outliers yields a standard deviation of 0.080). Actually, Fig. (1) suggests that it might be possible to estimate the fraction of paths that peter out by evaluating the path integral at a relatively early time, whilst the gaussian approximation is still reasonable, and seeing what fraction of the distribution is below zero.
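That estimate amounts to reading off a gaussian tail weight. A sketch (the function name, and the framing of feeding it the background I(t₂) together with the variance of δI(t₂) from, say, the covariance code above, are assumptions made here for illustration):

    from math import erf, sqrt

    def failed_fraction(mean_I, var_I):
        """Gaussian estimate of the fraction of epidemics that peter out:
        the weight of the normal distribution for I at an early time t2
        (mean from the background, variance from the path integral)
        lying below I = 0, i.e. Phi(-mean/sd)."""
        return 0.5 * (1.0 - erf(mean_I / sqrt(2.0 * var_I)))

Such a number would be compared against the roughly 150 out of 10000 realizations seen to fail in the test above.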
[Figure caption: The contours are taken from a correlated 2D gaussian distribution with covariances given by the path integral method. Note the non-gaussian "tail" in the simulation results coming from "failed" epidemics, resulting in a "pile-up" of cases near to the limit of zero infectious, ∆I(30) = −I(30), shown by the vertical dashed line. Axis labels: "Asymptotic Infected Fraction", "Gaussian Computation".]

7 Discussion and Conclusions

In this paper we have seen how it is possible to compute fluctuations around SIR-type models by deterministically solving additional ordinary differential equations instead of by randomly simulating many histories. While we have only looked at the simplest case here, it would be straightforward to extend this to SEIR models (which add an "exposed" stage before infectiousness starts), SEIR models stratified by age or degree of connectivity, and so on. Suitable modifications of the action could handle vaccination schemes, and could also change the stochastic model of the recovery process if deemed necessary. Effects of interventions, such as lockdowns, social distancing and isolating, can be incorporated by changing β and/or γ with time as appropriate. Some of the equations would then pick up additional time-derivative terms. With an "entire history" view, one can also investigate questions such as how unlikely one would have to be to overwhelm critical care facilities at some period during an epidemic. In addition, one could go on to consider whether a path integral approach might also be usefully taken to the analysis of network models and of "real-world" simulations.

With our example case using only a relatively small population number of 5000, we have seen a number of situations for which a naïve "pure gaussian" interpretation of the path integral results could lead one into error, typically involving "early-failing" epidemics. One can try to creatively reinterpret some of the gaussian outputs as discussed earlier, but it should also be possible to develop other, more principled ways of treating this within the path integral framework. One could add weighting terms to the path integral to implement "absorbing" boundary conditions at I = 0 for example, or change the form of the action in certain regimes to better capture the underlying situation.

Although in this work we have focussed on the most likely path going forward from a known initial state and considering the fluctuations around it, there are situations in which one might wish to expand around a path that doesn't necessarily satisfy Eqs. (1) and (2) whilst still satisfying Eqs. (10) and (11). For example, at the end of an epidemic (where I(t_f) = 0) one might be able to perform accurate serological surveys giving final boundary conditions on S(t_f) also. Then, if the epidemic was known to start at a given time with a given individual, so I(0) = 1 and S(0) = N − 1, there are both initial and final boundary conditions to satisfy. The extra freedom of Eqs. (10) and (11) compared to Eqs. (1) and (2) allows a suitable solution to be found for given β and γ. (By comparing the actions for different β's and γ's, one then effectively has a likelihood for those parameters solely in terms of the initial and final conditions.) Expanding around such a saddle point, requiring a more complicated version of (12) with additional terms now present but with no difference of principle other than a change of boundary conditions for the fluctuations, would then inform one about the spread of variables to be expected in the intermediate stages of that "constrained" epidemic.

With a description of fluctuations around a model in hand, one only needs a description of the way chosen data relate to the realization of the model to be able to go on to construct a likelihood function for the estimation of model parameters and for model comparisons (see also [7]). Such data might consist of test results, hospitalization and death numbers at various times, with parameters (that could be sampled over and solved for) determining their relation to the underlying epidemic, accounting for time delays, incompleteness and so on.
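In the simplest imaginable setting, with direct, noise-free observations of the fluctuations at a few times (an assumption made purely for illustration; real data would need the observation model just described), such a likelihood reduces to a correlated gaussian built from the path-integral covariance:

    import numpy as np

    def log_likelihood(residuals, cov):
        """Gaussian log-likelihood of observed-minus-background residuals
        at a set of observation times, given the block of the path
        integral covariance matrix restricted to those times."""
        sign, logdet = np.linalg.slogdet(cov)
        chi2 = residuals @ np.linalg.solve(cov, residuals)
        return -0.5 * (chi2 + logdet + len(residuals) * np.log(2 * np.pi))

Maximizing or sampling this over β and γ, recomputing the background and covariance each time, would then constitute the "fast likelihood" application mentioned in the abstract.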
Given the present (as of writing) state of COVID-19 in the UK and elsewhere, to inform lockdown-easing choices it could be very instructive to look at situations in which the effective reproduction number R_0 ≈ 1 and there is much variability in what might happen next.

References

[1] W. O. Kermack and A. G. McKendrick. A Contribution to the Mathematical Theory of Epidemics.
[2] R. P. Feynman and A. R. Hibbs. Quantum Mechanics and Path Integrals. International Series in Pure and Applied Physics.
[3] Cosmological perturbations from the no boundary Euclidean path integral.
[4] Homogeneous modes of cosmological instantons.
[5] Closed universes from cosmological instantons.
[6] Path Integral for Stochastic Inflation: Non-Perturbative Volume Weighting, Complex Histories, Initial Conditions and the End of Inflation.
[7] …, Rajesh Singh, and Günther Turk. Inference, prediction and optimization of non-pharmaceutical interventions using compartment models: the PyRoss library.
[8] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery. Numerical Recipes 3rd Edition: The Art of Scientific Computing.

Acknowledgements

I thank Christine Gratton for encouragement and for comments on a manuscript of this work.