key: cord-0690712-onvhcb89
authors: Osborn, Jenna; Berman, Shayna; Bender-Bier, Sara; D’Souza, Gavin; Myers, Matthew
title: Retrospective analysis of interventions to epidemics using dynamic simulation of population behavior
date: 2021-09-20
journal: Math Biosci
DOI: 10.1016/j.mbs.2021.108712
sha: fbd5ceb50c3ec8b1018613daaecc80a427298397
doc_id: 690712
cord_uid: onvhcb89

Retrospective analyses of interventions to epidemics, in which the effectiveness of strategies implemented are compared to hypothetical alternatives, are valuable for performing the cost-benefit calculations necessary to optimize infection countermeasures. SIR (susceptible-infected-removed) models are useful in this regard but are limited by the challenge of deciding how and when to update the numerous parameters as the epidemic changes in response to population behaviors. Behaviors of particular interest include facemasks adoption (at various levels) and social distancing. We present a method that uses a “dynamic spread function” to systematically capture the continuous variation in the population behavior and the gradual change in infection evolution, resulting from interventions. No parameter updates are made by the user. We use the tool to quantify the reduction in infection rate realizable from the population of New York City adopting different facemask strategies during COVID-19. Assuming a baseline facemask of 67% filtration efficiency, calculations show that increasing the efficiency to 80% could have reduced the roughly 5000 new infections per day occurring at the peak of the epidemic to around 4000. Population behavior that may not be varied as part of the retrospective analysis, such as social distancing in a facemask analysis, are automatically captured as part of the calibration of the dynamic spread function.

Retrospective analyses of strategies used to contain epidemics such as COVID-19 are valuable for countering successive waves of the infection, selecting countermeasures for future epidemics, and educating the population regarding the efficacy of implementing behavioral modifications. In particular, public-health agencies responsible for recommending types of personal protective equipment (PPE) to stockpile in anticipation of a future epidemic can benefit from the cost-benefit information yielded by retrospective analyses. Mathematical models, including those of the SIR type, can be helpful in providing a quantitative framework for the analyses. SIR models have been applied during the COVID-19 pandemic (Stutt et al. 2020 , Giordano et al. 2020 , Cooper et al. 2020 , Bertozzi et al. 2020 , primarily in a predictive capacity.

Some of the studies (Ngonghala et al. 2020 ) have predicted the infection dynamics for different intervention strategies, using a specific infection scenario (e.g. New York State).

A formidable challenge in applying SIR models is prescribing the values of the numerous parameters, and updating them to simulate evolving infection dynamics, as population behaviors (such as facemask adoption and social distancing) change in response to interventions. Typically, behaviors will change in a continuous manner rather than abruptly, for example, the gradual adoption of face masks by an affected population. Such gradual changes are difficult to capture in SIR models by periodically adjusting the parameters manually. The challenge is further accentuated by the high sensitivity of the predictions to some of the parameter values (Giordano et al. 2020) . Often, parameter choices are based upon best guesses, or closeness of fit (sometimes visual) of computed profiles with published curves (Cooper et al. 2020 ) .

In this paper, we introduce a modification of traditional SIR models that incorporates a "dynamic spread" function that captures changes in population behavior in a continuous manner.

There is no need to adjust parameters manually as interventions are implemented during the course of the infection. The dynamic spread function satisfies a differential equation with variable coefficients. These coefficient functions are obtained from a calibration procedure employing the published infection profile for the region of interest. The computed dynamic spread function reproduces the infection profile resulting from the baseline intervention strategy implemented over the course of the epidemic. Subsequently, the spread function can be systematically modified to analyze the effect of alternate intervention strategies. We illustrate the process using the COVID-19 crisis in New York City (CNYC) and New York State. The reduction in infection rate realizable in CNYC from alternative intervention strategies, including increased levels of mask usage and deployment of masks with higher levels of filtration, is estimated.

We illustrate the technique using a 4-equation SIR model (Stilianakis and Drossinos 2010 Myers et al. 2016) . The evolution of the susceptible, infected, and removed populations is simulated, as is the droplet transmission. The model assumes that the infection dynamics are dominated by one transmission mode (e.g. airborne particulates), and the parameter values are appropriate for all particle sizes contributing to that mode (though interpretation of the resulting equations as an average for a broad particle distribution is possible (Myers et al. 2016 ). More complicated SIR models can be useful, particularly if it is desired to model the details of the infection dynamics, e.g. quantifying the roles played by symptomatic and asymptomatic individuals (Stutt et al. 2020) . Our intention is to use the simplest model that can capture the baseline population behaviors and vary the critical ones retrospectively, with the hope that the model can be understood and used by non-experts such as policy makers. Additionally, as noted by Siegenfeld et al. (2020) , simpler models can prove more useful than complex ones, in part because accurate data is often not available to inform complicated formulations. Finally, we expect that some of the technique we present can be extended to more complex models.

The dynamic-spread function is the critical element of a systematic procedure for repurposing SIR models to perform retrospective studies. The 5 steps in the procedure are listed below and implemented subsequently. 1)Use the rate of change (measured by the number of new infections per day), dS/dt, of the susceptible population S, as the primary dependent variable. The derivative profile, which we call T(t), does not require the number of recovered patients to be tracked.

2)Normalize variables and identify critical dimensionless parameters. Formulating the model in terms of dimensionless clusters of parameters reduces the number of independent quantities that must be prescribed to run the model, and aids in identifying the most critical parameters.

3. Allow the dimensionless parameter δ, which contains the product of the infection transmission rate and the virus production rate, to vary with time, and account for its time dependence in the governing differential equation for T(t) . We denote δ(t) the "dynamic-spread" function, as it contains the elements that both vary with time and govern the rate of spread of the infection. The dynamic spread function is the critical element of the proposed strategy.

J o u r n a l P r e -p r o o f Journal Pre-proof 4)Derive the governing equation for δ(t). Provide the required coefficient functions using published T(t) profiles for a baseline infection scenario. Systematically alter the baseline spread function, and solve the governing equations, to simulate alternative strategies for countering the infection. 5)Designate the time origin for the dynamic analysis as the point of the first intervention into the epidemic. For CNYC, we identify this as day 17 (from the first reported infection), when shelterin-place was instituted. Prior to that point, it is assumed that δ is constant in time, and a traditional SIR model applies. The parameters for the traditional SIR model can be estimated from the published growth rate and reproduction number. The resulting values serve as initial conditions for the dynamic-spread-function analysis.

The model is based upon a 4-equation set consisting of 3 standard SIR equations for the susceptible, infected, and removed populations, plus an additional relation to describe droplet dynamics. The set was introduced by Stilianakis and Drossinos (2010) and extended by Myers et al. (2016) to explicitly account for the influence of protective equipment. We introduce the equations using a notation in which the primes denote dimensional quantities, with units such as numbers of persons or 1/day. The primes will be dropped following nondimensionalization. The basic set is as follows

Here ′ is the number of susceptible individuals in the total population N, ′ is the number of infected individuals, ′ is the number of removed (recovered or died) individuals, ′ is the total number of droplets contributing to the spread of the disease, ̃ is the transmission rate, μI is the infection recovery rate, κ is the droplet production rate, and −1 is the droplet removal rate. We allow the transmission rate ̃ and the production rate κ to vary with time. The removed population is not of interest in the model, hence, the equation for ′ will not be considered further.

In the model applications, it is convenient to work with a nondimensional set of equations.

Nondimensionalization of the dependent and independent variables, followed by arrangement of the resulting parameters in each equation in clusters, effectively reduces the total number of parameters. The resulting parameter combinations represent ratios than can lend insight into the infection dynamics. For example, the significance of a droplet production rate  given in number of droplets per day can be difficult to appreciate, but the ratio  / −1 of the droplet production rate to the droplet removal rate can help explain a rapid increase in the number of infection. In the nondimensionalization procedure we attempt to use scalings that represent order-of-magnitude estimates for the relevant variable. In that way, the values of the dimensionless parameters represent realistic estimates for the ratio of two competing effects in the infection scenario.

Retrospective analyses possess the advantage that some of the representative scales can be variables to the nondimensionalize (unprimed) variables are:

J o u r n a l P r e -p r o o f ′ = Δ 2 (2e)

Using these relations in Equations (1a -1c) and combining terms yields:

where = is the dimensionless infection recovery rate, = is the dimensionless droplet removal rate, and

Since ̃ and vary with time,  is also a function of time. As noted above, we designate (t) the "spread function".

We apply the model only to large-population scenarios, where ( ) << . For CNYC, this inequality is well satisfied throughout the course of the COVID-19 epidemic. In that case, the last term in Eq. (3a) ( 1 / ) can be ignored. Additionally, the last term in Eq. (3c), which can be written as ln( ) , is considerably smaller than  . In dimensional terms (removing the 

As noted above, we take the origin to be the time of first intervention.

To determine the dynamic spread function for the baseline scenario, we reformulate Eq.

(5a) as an equation for δ(t), assuming T(t) and I(t) to be known. The T(t) profile is obtained from the published number of new infections per day in the locale of interest (e.g. New York City). We label this published profile Tb(t), where the "b" subscript denotes baseline, and the resulting (from Eq. (5b)) infection profile Ib(t), and we insert them into the equation for δ(t). The resulting equation for the baseline dynamic spread function is:

J o u r n a l P r e -p r o o f

Because the governing equation for δb(t) is informed by the published Tb(t) profile, solving Eq's (5a,b) using this dynamic spread function will reproduce (within numerical tolerances) the published Tb(t) curve. The utility of δ(t) derives from modifying it to model alternative intervention strategies and solving Eq's (5) to determine the new infection curves (i.e., T(t) and I(t) profiles). Modifications to account for protective strategies were performed in the following manner.

We build upon a previously developed SIR model (Myers et al. 2016 , Yan et al. 2018 ) that systematically accounts for the presence of protective equipment. Assuming both ̃ and κ vary with time, we can write (Eq. (3d)) = ̃̃+ .

We apportion a fraction ϵκ (e.g. 1/5) of the change in δ to changes in droplet production, and accordingly set =

Since from Eq. 

which can be integrated to

In Myers et al. (2016) , it was shown that the droplet production rate in the presence of protective equipment can be written as

Here FEout is the filtration efficiency (e.g. the FE for an N95 respirator is 95%) of the mask against outward-going particles produced by the infected individual wearing the mask (also referred to as source control). If FEout varies with particle size, we assume that a dominant particle size exists in the distribution generated by the infected population, and the FE for that size applies.

The quantity fi is the fraction of the infected population wearing the mask at any given time, also known as compliance rate. We assume that the infected population wearing masks includes both symptomatic and asymptomatic persons, and that the masks have an equal effect on reducing the droplet production rate of symptomatic and asymptomatic persons. Eqs. (9) and (10) can be combined to give

and J o u r n a l P r e -p r o o f

For a given scenario with spread function δb(t) (derived from Eq. (5c)) and a given baseline mask material (i.e. FEout,b), Eq. (12) allows the population compliance rate fi to be determined as a function of time, once the relative importance of change in production (ϵκ) compared to change in transmission (1-ϵκ) is estimated. Thus, to perform a retrospective analysis in which a barrier material of different outgoing-particle-capturing efficiency is investigated, the baseline compliance rate as a function of time would first be determined from Eq. (12), then that compliance profile and the new FEout value would be used in Eq. (10) which, with Eq. (9), would be used to create a new dynamic-spread function. The modified spread function would then be used (in Eqs (5a,b)) to estimate the change in infection rate. A detailed example is provided in Section 3.1.

The initial conditions for Equations (5) are obtained by first simulating the dynamics of the infection prior to any intervention, e.g. days 1 -17 for CNYC. In that case, the derivative of the dynamic spread function is zero and Equations (5a,b) revert to a traditional SIR model. Seeking solutions that have an exponential time dependence of the form exp(Mt) for T(t) results in the algebraic equation

The subscript "0" on δ implies that the value applies to the initial period of the infection, before 

The symbol R0 represents the reproduction number (Myers et al. 2016) for the standard SIR model. Estimates of R0 for the early stages of epidemics are also published. In the simulations we perform, a range of recovery times μI (with corresponding dimensionless recovery times γ ) ranging from 2 days to 10 days was considered. For any given value of γ, λ and δ0 were obtained using published values of the reproduction number R0 (Eq. (14) and growth rate M (Eq. (13)) for the scenario of interest. The Tb(t) profile for CNYC was obtained from the Johns Hopkins

Coronavirus Resource Center, wherein the data beginning at day 17 was used. Ib(t) was derived from Tb(t) using Eq. (3b), rather than using a published infection profile, so that it was not necessary to ascertain how well recoveries were tabulated in the published infection curves.

Uncertainties were determined by performing simulations for an ensemble of (μI ,R0 ) combinations, with each parameter selected from the range of published values for a given scenario. Six parameter sets were typically used to determine the uncertainty. The mean (over the six-parameter ensemble) T(t) profile, as well as a standard deviation above and below the mean at each instance of time, are reported.

In the simulations performed to examine different alternate intervention strategies, baseline δ(t) profiles using the actual intervention strategy were first obtained, and then modified to reflect J o u r n a l P r e -p r o o f alternatives. The governing equations (5 a,b) containing the modified spread function were solved using a Runge-Kutta method (Matlab ode45, Mathworks Inc.).

We performed a variety of retrospective analyses involving protective equipment. For CNYC, days 17 -37 were analyzed. This interval was chosen because day 17 is the day of the first intervention (shelter in place), and day 37 is the time of maximum new infections per day, based upon a 7-day average (Johns Hopkins Coronavirus Resource Center 2020). Initial reproduction numbers between 2 and 6 were considered, along with recovery times between 2 days and 10 days. The fraction ϵκ required to determine the compliance (Eq. (11)) was assumed to be 1/5. The infection scenario involving the entire state of New York was also considered in another set of simulations.

Using the procedure described in Section 2.3, we analyzed scenarios where the infected population in New York City deployed different types of masks. It was assumed that only the infected population deployed the masks, i.e. we considered a source-control measure. To illustrate the procedure of Section 2.2 for this scenario, we identify the outward filtration efficiency of the baseline mask as FEout,b , where the "b" subscript denotes baseline, and the outward filtration 

We assume the compliance profile for the modified mask strategy is identical to baseline, i.e. the population uses a higher-efficiency mask but with the same adoption rate. Using the modified mask filtration rate FEout,mod , along with the baseline compliance profile (Eq (15)), in Eq. (10) for the droplet production gives the modified droplet production rate in the dynamic spread function is observed. A slight decrease in dynamic spread function value is associated with a much larger decrease in new infections (Fig. 1b) For the same increase of FE from 67% to 75% , the number of infected individuals (Fig. 1c) at day 37 is reduced by about 30%. The uncertainty is considerably larger for the infected population infected population is much more strongly influenced by the recovery time than the number of new infections. The recovery time spanned a factor of 5 over all the simulations performed. The uncertainty for the dynamic spread function (Fig 1a) is comparable to that for the infected population, though for clarity it is not shown.

To evaluate the effect of mask compliance, we assume that the adoption rate for the mask follows a temporal profile identical to baseline, but larger or smaller by a factor of F. As in the previous set of calculations, it is assumed that only the infected population deploys the masks.

Baseline compliance as a function of time is again given by Eq. (15), and the compliance profile for the modified scenarios is F times this expression. Using this modified compliance in Eq (10) gives the modified production rate

analogous to Eq. (16) for the variable filtration-efficiency simulations. The baseline spread function ( ) is identical to that for the filtration-efficiency study; it is given by the top curve in 

For the mask compliance analysis, a mask of FE of 67% was assumed. Upon solving for fi(t), (Eq.

(15)) it was found that the baseline fraction fi(t) of the CNYC population wearing the mask increased from 0% at day 17 to about 42% on day 37. The 42% maximum compliance was increased to 50%, 60%, and 70% in a series of computations by adjusting the F value in Eq (20), and solving Eqs (5a,b) with this modified spread function. F was iteratively adjusted to yield the target compliance at day 37.

Increasing the mask compliance to 50% reduced the maximum number of new infections per day from about 5100 to 4000 (Fig. 2) . The turn-around time is reduced from approximately 37 days to 34 days. Similarly, increasing the mask compliance from baseline to 60% and 70% reduced the number of new infections per day by 42% and 58%, respectively. The corresponding turn-around times reduced to 31 days and 28 days (Fig. 2) . To incorporate variable social distancing into the dynamic spread model, modifications to the spread function were made in the following manner. The level of social distancing is related to the transmission rate ̃( ) . As described in Stilianakis and Drossinos (2010) and Myers et al. (2016) , the transmission rate is proportional to the number of contacts between a susceptible individual and an infected person. Thus, the level of social distancing is proportional to ̃( ) .

This function is similar to the effective contact rate in Ngonghala et al. (2020) . Since the dynamic spread function δ(t) is proportional to the transmission rate ̃( ) (Eq. (3d)), different levels of social distancing could be implemented in a straightforward manner by scaling the baseline spread function (dashed curve in Fig. 3a ) by the desired amount. That is, is predicted by the dynamic-spread model for the lowest reduction in level of social distancing, but otherwise comparable reductions are predicted by the two models. J o u r n a l P r e -p r o o f

To the extent that reduction in the spread of infection is due primarily to reductions in social distancing compared to lower rates of droplet production, i.e. assuming ϵκ < < 1, then the spread function curve (top line, Fig. 1a) represents the level of social distancing in New York City during the COVID-19 crisis. Within the first 5 days after the stay-at-home order, the level of social distancing drops by half. A slower adoption of the stay-at-home order is observed after that, but by the turn-around point (day 37), another factor-of-two reduction in social-distancing is achieved. These trends illustrate the continuous nature by which interventions take place during epidemics. The state of New York adopted the required behavioral changes faster than the city of New York (Fig. 3a) . This faster rate of behavior modification resulted in a considerably smaller This was felt to be a relatively small difference, given the large number of assumptions in SIR models. Some caveats are in order, though. First, it is unclear to what extent relative (to baseline) changes in new infections can be compared with relative changes in hospitalizations. Second, the time course for the hypothetical scenarios was quite different between the dynamic-spread model and that of Ngonghala et al. (2020) . For example, in Ngonghala et al (2020) , application of masks with higher FE than baseline resulted in curves that are flatter than baseline, while in the dynamic -spread model the curves largely retained the same shape (Fig. 2) . This is due, at least in part, to the fact that behavioral changes occurred gradually in the dynamic-spread model, and they did not commence until the time of first intervention. In Ngonghala et al. (2020) , and many other constant-parameter models, parameter values characterizing hypothesized scenarios with higher degrees of protection are implemented at the onset of the epidemic. This results in a flat infection

profile. An additional difference between Ngonghala et al. (2020) The parameter ϵκ , which quantifies the level of change in the dynamic spread function due to droplet production compared to that due to transmission, was chosen to be small (1/5) based upon the intuition that social distancing is more important than mask usage in altering infection dynamics. Other relatively small values of ϵκ , e.g. 1/3, yielded similar results to those reported above. The ϵκ parameter is useful for estimating the mask compliance (Eq. (12)), as well as the level of social distancing, as a function of time. Alternatively, if information is available on the compliance profile for the scenario of interest, it can be used in Eq. (12) to determine ϵκ more rigorously. provided. Also required are the initial reproduction number and an estimate of the recovery rate.

Though the model is not a forecasting tool, it can be useful for designing future countermeasures (e.g. for successive waves of an epidemic), particularly if elements of the anticipated scenario are similar to those of the baseline scenario used to compute the dynamic spread function δ(t). These elements include, most importantly, population behaviors such as face mask adoption (affecting both κ and ̃ in Eq. 5) and social distancing (affecting ̃ ) , but also environmental factors such as the pathogen inactivation rate. We refer the reader to Stilianakis and Drossinos (2010) for the dependence of infection dynamics on the numerous properties of the pathogen, the population, and the environment. Here we emphasize that the dynamic spread function implicitly captures the influence of all these factors, even though no functional dependence of the parameters is J o u r n a l P r e -p r o o f introduced. Only when considering an alternative scenario that varies one of the factors (e.g. the production rate κ in this study) does the explicit parametric dependence enter. An important consequence of this property is that social distancing, likely the dominant factor affecting infection dynamics, was captured in the CNYC study without having to be explicitly modeled.

The method by which the dynamic-source model simulates changes in population behavior differs from the manner in which it is typically done with SIR models in two important ways.

With most SIR models, changes in population behavior are addressed by updating parameters at discrete times. However, the original set of parameters and parameter updates is not unique.

Depending upon the strategy used by an investigator to minimize differences with published infection rates, and the "best guesses" made for the parameters that aren't well informed by data, the investigator can derive significantly different parameter values from another investigator using the same equations, even when both investigators show good agreement with published infection curves. When the two investigators perform retrospective analyses in which a single parameter is varied, the results of the analyses can be sensitive to the baseline parameter values, which can vary for the two investigators. The dynamic-source method bypasses this potential uncertainty. The second important difference between the dynamic-source model and traditional SIR approaches regarding modeling behavior dynamics is the dynamic-source method treats the variation in population behavior over time as a mechanism affecting the infection dynamics. This mechanism is described by the last term in Eq. (5a). Even if a traditional SIR model updates the parameters periodically, this term is not included. The mechanism is analogous to transport in physical systems (e.g. fluids), where there is temporal or spatial variation in a property (e.g.viscosity). The gradient of the property multiplied by the transported entity (e.g. momentum) is a transport J o u r n a l P r e -p r o o f mechanism that should be included to fully capture the effects of property variation, in addition to simply updating the variable property at different times or spatial locations.

As noted above, the lack of forecasting ability is a limitation of the model. Similarly, because of the model's simplicity, it cannot quantify the effects of the different factors affecting the infection dynamics, e.g. the roles played by symptomatic vs. asymptomatic infected persons.

While, as mentioned above, any complicated mechanism that affects either the droplet generation (parameter  in Eq. (5c)) or the infection transmission (parameter ̃ in Eq. (5c)) is automatically (i.e. as part of the calibration process, with no user input involved) included in the spread function during the calibration process, these effects are integrated with all other effects. If it is desired to explore how an intervention involving a particular mechanism would retrospectively alter the infection rate, a direct connection between that mechanism and either the droplet generation or the infection transmission would need to be specified. While that may not be possible given the model's level of simplicity, we note that in future generations of the model an enhanced level of specificity should be possible. For example, in our calculations, all of the types of facemasks used in New York City were averaged together into a single representative barrier. If information is available to prescribe the levels at which different types of masks were deployed, it is likely that a modified spread function containing a sum over different mask designs weighted by their popularity could be constructed.

Like previous studies (Stutt et al. 2020 , our simulations predict that considerable benefit can be obtained from higher FE masks without requiring N95 levels of efficiency (Fig 1) . It is important to emphasize that for the benefits to be realized, the FE for the barrier material must be attainable for the particle-size range of the dominant transmission mode J o u r n a l P r e -p r o o f for the given scenario. One way of assuring this is for the barrier material to provide the given FE across the spectrum of particle sizes. Otherwise, knowledge of the material FE for the intended application (e.g. reducing airborne particulates generated by coughing or sneezing by infected persons indoors) is required in order to generate useful estimates. The complex issues of dominant transmission mode for COVID-19, and the FE of different masks designs for the different modes, will be addressed in future applications of the model.

The challenges of modeling and forecasting the spread of COVID-19

A SIR model assumption for the spread of COVID-19 in different communities

To mask or not to mask: Modeling the potential for face mask use by the general public to curtail the COVID-19 pandemic

Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy

Proc. Nat. Acad. Sci

A Mathematical Model for Assessing the Effectiveness of Protective Devices in Reducing Risk of Infection by Inhalable Droplets

Mathematical assessment of the impact of non-pharmaceutical interventions on curtailing the 2019 novel Coronavirus

New York Times (2021) New York Coronavirus Map and Case Count -The New York Times (nytimes.com)

What models can and cannot tell us about COVID-19

Dynamics of infectious disease transmission by inhalable respiratory droplets

A modelling framework to