key: cord-221669-uokr4mjq authors: Barnes, T. title: The SHIR Model: Realistic Fits to COVID-19 Case Numbers date: 2020-07-28 journal: nan DOI: nan sha: doc_id: 221669 cord_uid: uokr4mjq We consider a global (location independent) model of pandemic growth which generalizes the SIR model to accommodate important features of the COVID-19 pandemic, notably the implementation of pandemic reduction measures. This"SHIR"model is applied to COVID-19 data, and shows promise as a simple, tractable formalism with few parameters that can be used to model pandemic case numbers. As an example we show that the average time dependence of new COVID-19 cases per day from 15 Central and Western European countries is in good agreement with the analytic, parameter-free prediction of the model In the models typically employed in epidemiology to follow the evolution of a pandemic [1] , one begins by partitioning a total population of N individuals into a complete set of disjoint subsets, according to their medical histories. The simplest models of this type have only a few categories, such as 1) the susceptible but as yet uninfected population S, 2) the infected but not recovered population I, who can infect S, and 3) the recovered population R, who can no longer infect S, and can themselves no longer be infected. First-order ordinary differential equations (ODEs) in time are assumed to describe the coupling between these groups. Once the appropriate initial conditions are specified, these ODEs can be integrated, which gives predictions for the subsequent evolution of the group populations. In this example, given constant rates {r ij } for the two assumed transitions, S → I and I → R, the resulting ODEs for this simple "SIR" model are dS/dt = −r S→I S dI/dt = +r S→I S − r I→R I dR/dt = +r I→R I We will refer to these two rate coefficients as r S→I = r I and r I→R = r R . On adding these equations we find that so the total number of individuals is a constant, N 0 . Since these equations are linear and homogeneous, it is convenient to divide by N 0 and solve for the fraction of individuals in each category, such as f S = S/N 0 . R or * Email: tbarnes@utk.edu f R can be inferred from the other two populations, using R = N 0 − S − I, or equivalently f R = 1 − f S − f I , so we need only solve the set (3), We will assume that the entire population is initially in category S (susceptible but not infected or recovered), so that f S (t = 0) = 1 and f I (t = 0) = 0. Given these initial conditions we can solve the equations (3) , which also imply f R through f R = 1 − f S − f I , with the results The behavior of this system is just what one would expect from the dynamics described in the SIR-type defining equations (1) : The initial population of susceptibles (S) decays exponentially at the specified rate r I as they transition into infected (I); the infected fraction increases to a maximum [2] , and then declines as it populates the final, recovered population (R). As t → ∞ one asymptotically approaches a state in which all of the population has transitioned to recovered, f R (t = +∞) = 1. As anticipated, at any intermediate time 0 < t < +∞, f S (t) + f I (t) + f I (t) = 1. As a specific example, the time dependence predicted by this SIR model is shown in Fig.1 , for r I = 0.5 [days −1 ] and r R = 0.1 [days −1 ] (an infection rate of five times the recovery rate). Two important modifications to this simple pedagogical SIR model appear appropriate for a more realistic description of the COVID-19 pandemic. One is that the rate of transitions from susceptible (S) to infected (I) should be proportional to the product of these two populations, since their interaction determines this rate. This modifies the second evolution equation for the population fraction f I (t) to This "second-order reaction kinetics" in the infection process is a standard assumption in epidemic models [1, 3] . A second, more novel modification of the SIR model is to incorporate the effects of a social response to a highprofile pandemic such as COVID-19. In this case there has been strong encouragement from governments and medical authorities to reduce the growth of the pandemic by implementing social distancing and related measures. In this SHIR model these actions reduce the susceptible population by transferring them to a new category of "hidden" individuals (H) who are not susceptible to infection, which is populated from the original group of susceptibles (S) through another rate process, The individuals in H are not only the physically sequestered; simply changing behavior, such as avoiding public assemblies, crowded schools and public transportation, implementing rigorous hygienic practices, and many other measures contribute to the continued reduction of the effective susceptible population fraction f S [4] . We consider τ = 1/r H , the "social response time," to be a measure of how quickly a society responds to a developing pandemic threat. We shall subsequently find that there are interesting differences between countries in the fitted values of this parameter. The partner DE for f S , similarly modified to include a second-order infection rate process as well as the new first-order social response transition term for S → H, is The final "recovery" DE for dR/dt is unchanged from the SIR model formula in (1) . However since the R cases in practice include mortality as well as recovery, we will instead refer to the R population as "resolved" [6] . Collecting these results gives our complete set of SHIR model DEs, Note that one recovers the three original pandemic DEs of Kermack et al. [3] (their Eqs.(29)) through the substitutions f (S,I,R) → (x, y, z)/N and (r I , r R ) → (κ, ℓ), and removing the "hidden population" of the SHIR model by setting r H = 0 and f H = 0. We note in passing that the infected (and still infective) individuals (all those in box I) are also referred to in the literature as the "active" cases, numbering A(t). Here, I(t) and A(t) are identical; any individuals in box I who are no longer infective are immediately reclassified as "resolved," and moved to box R. Since (8) is a nonlinear set of ODEs for the four population fractions f S,H,I,R , an exact analytical solution will not be possible, except in certain limits and approximations. However some important results can still be proven for this nonlinear system. One such result, obtained by adding the 4 equations in (8) , is that the sum of the four populations fractions f is a constant of motion. Thus f H + f S + f I + f R = 1, or H + S + I + R remains a constant, N 0 . Another useful result is that the population fractions of the two groups H "hidden" and R "resolved" can be expressed simply in terms of the susceptible (S) and infected (I) fractions f S and f I , and Our principle task will therefore be to solve the two coupled equations for f S (t) and f I (t), the second and third equations in the set (8) . Since the rate of increase of the infected fraction df I /dt is proportional to the product f I f S , a trivial solution of the model follows from an initial condition of no infections, f I (t = 0) = 0; this implies f I (t) = 0, i.e. that there are no infections at any later (or earlier) time. In this limit all that happens is that the initial fully susceptible population f S (t = 0) = 1; f H (t = 0) = 0 evolves into an increasingly hidden population, as a decaying exponential; In practice the fraction of infected individuals f I (t) will usually be much smaller than the fraction of susceptibles f S (t). This suggests that we use the no-infection limit (11) for f S (t) as a first approximation in determining f I (t) in the small-f I (t) limit. With this substitution, we find that the increase in df I /dt, the fractional rate of new cases per day, will be For completeness we note that the rate of decrease of the "active" (still infective) population fraction f I (t) per day due to transitions from "infected" to "resolved" status is given by so the full DE for f I (t) in this small-f I (t) limit is The formula for the fractional "new cases per day" (cpd) growth rate in the small-f I (t) approximation, Eq.(12), is attractive for several reasons. First, it is easily compared to data, since there is copious information available from many countries regarding the number of new cases per day. (This quantity is simply N 0 times the df (in) I /dt above.) Second, since (14) is a linear, homogeneous, first-order differential equation involving only f I (t), and the associated integral over time is tractable, we may solve for f I (t) analytically in this small-f I (t) approximation. In the early and intermediate stages of a pandemic in which the infected fraction still dominates the resolved fraction, we can ignore the assumed slower transition of infected to resolved by setting r R = 0, which gives lim S>>I>>R df I /dt = +r I e −rHt f I . This relation is especially amenable to comparison with data, as it relates two observed quantities, the number of new cases per day (N 0 df I (t)/dt) and the cumulative number of cases (N 0 f I (t)), assuming that N 0 f R (t) can be neglected. The SHIR model in this small-infectedfraction limit predicts that this ratio should be a decaying exponential in time, In the following section we will compare the expected time dependence of several pandemic models to COVID-19 data, including the SHIR model prediction (16) . We will follow this with more detailed SHIR-model calculations of the cumulative case numbers N 0 f I (t) and especially the new cases per day (cpd), N 0 df I (t)/dt, and will describe detailed fits of these functions to COVID-19 data from a range of countries. Here we will first discuss the time dependence predicted by two familiar dynamic models of pandemics, and will show how these models can be compared to COVID-19 data. We will find that there are serious difficulties in applying these simple models to COVID-19. In this discussion we will consider only the cumulative number of cases as a function of time (here called P(t)), and the rate of appearance of new cases per day, dP/dt. These numbers are widely reported for COVID-19, and require little interpretation. One important point is that the number of cumulative cases to time t, P(t), is the sum of the infected I(t) and the resolved R(t) populations distinguished in the previous section; As a first, simplest model of the growth of a pandemic, one can assume that the rate of change of the number of cases dP/dt is proportional to the number of cases P, The solution of this differential equation is an exponential times the number of cases at some arbitrary initial time, This simple model clearly makes unrealistic assumptions, notably that there are infinitely many potential cases. It also neglects spatial dependence or other "local" aspects of the problem, as well as the effect of transitions from infected to resolved individuals who are no longer infective. This exponential model predicts that the ratio of new cases per day to cumulative cases should be a constant, In Fig.3 we compare this expectation to the ratio of new cases reported per day (∆P/∆t) to the cumulative number of cases (P) for the UK, using data from Ref. [7] over the period 18 Mar -04 May 2020. Evidently this ratio is far from constant; it falls by about a factor of 10 over the 47-day period shown. The simple exponential model can be improved through the introduction of a limit in the number of the population that can be infected, P max . One can assume that the infection rate is initially equivalent to the exponential model, but later in the pandemic is suppressed in proportion to the number of uninfected individuals remaining, P max − P. A simple DE with these properties is the "logistic model," defined by The solution of this model, which is symmetric about P = P max /2, is the inverse of an exponential plus a constant, translated by a t 0 that is determined by the initial conditions; This logistic model can also easily be tested by comparison with data on the ratio of new cases per day to cumulative cases. In this case, the model predicts that this ratio should be a simple, linear function of P, So, when data on this ratio (y-axis) is plotted against P (x-axis), this model anticipates linear P-dependence, with an x-axis intercept (where dP/dt = 0) at P = P max . In Fig.4 we show an example of this plot for COVID-19 data, using the USA cumulative cases and new cases per day data from Ref. [7] through 15 July 2020. Although this ratio initially has some qualitative resemblance to the logistic model form (23), (see Fig.4 upper left), the more recent data (lower right) is clearly inconsistent with (23), and cannot be extrapolated linearly to zero dP/dt. A study of the pandemic time dependence predicted by the logistic model shows another specific aspect of the model that makes it unrealistic for COVID-19; the rate of new cases dP/dt predicted by this model is symmetric about the maximum, at the P(t) inflection point t = t 0 . We shall see that in contrast the rate of new cases dP/dt observed in the COVID-19 data is rather skewed, and has a faster rise to maximum than the subsequent decline. Here we will test a third prediction for time dependence, which is the form predicted by the SHIR model of the previous section. There we noted that the infected fraction f I of the population is predicted to satisfy the differential equation in the limits of a small fraction of the population being infected (f I << 1) and the resolved fraction being small relative to the infected fraction (f R << f I ). The existing COVID-19 data suggests that these are both reasonable approximations in the early and intermediate stages of the COVID-19 pandemic. The expectation (24) can immediately be compared to data, since (df I /dt)/f I is just the ratio shown on the y-axis of Fig.3 . The dotted line in the figure shows a decaying exponential fit of exactly the form (24), which evidently gives a very good description of the data. The parameter values found in this fit are r I = 0.233 [days −1 ] and τ ≡ 1/r H = 20.9 [days] . Motivated by the good agreement between the SHIR model prediction of a decaying exponential in time for (df I /dt)/f I and the COVID-19 data evident in Fig.3 , we will subsequently discuss SHIR model predictions in more detail, and will show fits of this model to COVID-19 data from many European countries. This will follow a short discussion of some general aspects of mathematically similar models, which will be used in the detailed discussion of SHIR results. The results discussed above motivate the consideration of generalizations of the simplest "pure exponential" pandemic model, in which the constant λ of that model (20) is replaced by an explicit function of time, For a given function λ(t), the defining differential equation (25) gives the predicted growth of case numbers as Formally this may be regarded as exponential growth in a new (nonlinear) time variable u, defined by du = λ(t)dt. Since this is a model of the growth of a pandemic, the cumulative number of cases (26) can only increase, or become flat when the pandemic has been completely halted, which occurs if and when we reach a time t f for which λ(t f ) = 0. Accordingly we constrain λ(t) to be positive (pandemic ongoing, t < t f ) or zero (t = t f , pandemic halted). The inflection point of P(t) is of considerable importance, since this is the time of peak pandemic growth. The time t inf l of the inflection point is specified by the rate of change of case numbers dP/dt reaching a maximum; this occurs when the second derivative d 2 P/dt 2 is zero, On substituting the general solution P(t) from (26) into this constraint, we find an equivalent definition of the in-flection point in terms of the proportional growth function λ(t), A. Cumulative cases P(t) and cases per day (cpd) dP/dt(t) In Sec.I.B. we introduced the "SHIR" model as a generalization of the SIR model, with the effects of pandemic reduction measures incorporated through a "hidden" population (H) that migrates from the initial susceptible population (S) with a rate constant r H . In the limit of a small infected population fraction (f I << 1) relative to the remaining fraction of susceptibles (f S ), and assuming that the resolved fraction f R can be neglected, we showed that the infected fraction satisfies a differential equation of the form where the SHIR growth rate function is a decaying exponential in time, The rate coefficients r H and r I can be interpreted respectively as a characteristic "social response time" τ = 1/r H for the implementation of pandemic reduction measures, and an initial condition (at some t = 0) on the growth rate function λ(t); If we multiply the infected fraction f I by the total population N 0 , and similarly scale the fractional new cases per day, we can compare the results for the cumulative cases P(t) and new cases per day dP/dt directly to the data as normally reported. Given the form (31), on solving (29) we find that the predicted number of cases as a function of time in the SHIR model is a nested exponential, This may be simplified by replacing the t = 0 initial condition λ(0) by an equivalent time shift which gives a simple three-parameter form for the general solution of the cumulative number of cases in the SHIR model, This function increases monotonically from P = 0 at t = −∞ to P max at t = +∞, and has a single inflection point at t = t 0 . This is an example of a Gompertz [8] curve, y = ab q x [9] ; these are often encountered in problems in which a proportional rate of growthḟ /f is driven by an exponential. The rate of appearance of new cases per day (cpd) in the model, dP/dt, can be evaluated directly from P(t) above, and is This rate of growth is zero at t = −∞, increases to a maximum value at the P(t) inflection point t 0 , and then decreases with time, approaching zero exponentially as t → +∞. dP/dt integrates to a finite number of cases P max as t → +∞. The maximum number of cases per day, at the time t 0 of the P(t) inflection point, is given by The three parameters that specify a solution of the SHIR model, P max , τ and t 0 , all have simple interpretations; they are respectively the height, width scale, and time shift of the pandemic profile P(t) (or of the rate of new cases, dP/dt). P max is by definition the maximum value of P(t), and gives the total number of cases, which is P(t) at t = +∞. The social response time τ sets the scale for all time-interval features of P(t) and dP/dt, since changing t 0 only translates these functions. The time shift t 0 is the time of the inflection point of P(t), This inflection point t 0 may be considered the peak of the local pandemic, since the number of new cases per day estimated by the fit is then a maximum. For discussion purposes a convenient choice for t 0 in (36) and (37) is t 0 = 0, which places the inflection point of P(t) at t = 0. Plotting P(t) or dP/dt against t in units of τ , normalized to their maximum values, gives a parameter-independent "universal profile" for each of these functions. The universal form for P(t), scaled by its maximum value P max , is shown in Fig.5 . The inflection point of P(t) at t = 0 is indicated, where both P(0)/P max and its slope (in the dimensionless time variable t/τ ) equal e −1 . This last property implies that the total number of cases P max and the cumulative number of cases at the time of the P(t) inflection point (pandemic maximum), P(t 0 ), are related by P max = e · P(t 0 ) . This provides a simple estimate of the total expected cases P max from the number observed up to the pandemic peak, P(t 0 ), which may already be known. Some additional times of interest are those at which P(t) has reached 10%, 50%, 90% and 99% of the total cumulative cases; these are respectively (t − t 0 )/τ = −0.8340, +0.3665, +2.250, and +4.600. The universal curve for dP/dt, the new cases per day, is shown in Fig.6 , scaled by its maximum value of P max /eτ . The mean and variance of t for this distribution are < t >= γτ (γ is Euler's constant) and σ = (π/ √ 6)τ . The skewness [10] of dP/dt is 12 √ 6 ζ(3)/π 3 ≈ +1.140, a positive value indicating a more heavily weighted RHS. Several additional interesting features of dP/dt are the FWHM, 2.446τ (indicated in Fig.6 ), the rise time of dP/dt from half-max to peak, which is 0.985τ ; and the much slower decline of dP/dt from peak to half-max, which requires 1.461τ . The asymptotic approach of P(t) to P max cases at large times is exponential, as is the (decreasing) rate of growth of new cases at large times, We note in passing that these results suggest an improved estimate of the asymptotic total number of cases P max which has accelerated convergence, although this improved behavior would likely be masked in practice by the background of sporadic cases. In this section we will show results from fitting the predicted early pandemic time dependence from the SHIR model as derived in Sec.IIIA to data for the confirmed COVID-19 case numbers from many different countries. We will primarily consider fits of the new case numbers per day (cpd) to the SHIR model prediction, Eq.(37). As an initial example we will show results for Austria, including fits to both the cpd and the cumulative case numbers. Following this we will give the results of SHIR model fits to the cpds reported by a representative set of Western countries, and will compare and contrast these results. We will also discuss procedures for fitting more complicated datasets that show evidence for multiple infection sites, and will show examples of such fits. Finally we will show how datasets from multiple countries can be combined, and will use this procedure to test the predicted SHIR model profile for new cases per day, displayed as a universal (parameter-free) curve. As a first application we will fit the three-parameter SHIR model to the data on the number of new daily COVID-19 cases per day in Austria, as reported on the Worldometer website, Ref. [7] . Austria was chosen because it represents one of the earliest COVID-19 outbreaks in Europe, so their data allows us to follow both the earlier and later stages of the pandemic. For this fit we will use the entire Austrian cpd dataset from the initial website date of 15 Feb 2020 through 06 May 2020, comprising a nominal 82 data points. (For most countries we consider, including Austria, there are actually many "null" or zero entries in the early dates.) As a reminder, in the small fraction of infections limit assumed here one may solve the model for the cumulative case numbers of infected people P(t) and hence the rate of new cases per day dP/dt in closed form, given by Eqs. (36,37) . The three free parameters in these functions are P max (total number of cases), τ (social response time), and t 0 (the P(t) inflection point; time of the maximum cases per day). Our common t = 0 reference time in all these numerical fits and discussions is taken to be 18 Mar 2020. The fits are all simple, unweighted leastsquares without data errors. The result of this fit for Austria is shown in Fig.7 . This is a surprisingly good fit to the full range of the cpd data, including the rapid initial rise after the early cases, the peak region, and the subsequent slower decline; all are reasonably well described by the model [11] . The SHIR model parameters resulting from this fit to the cpd data from Austria are given below. These and other fitted parameters are typically quoted to three-place accuracy in this paper, to allow numerical tests such as debugging of numerical routines and checking analytic results. There is evidently reasonable agreement between the SHIR model and the data with these parameters for the number of new cases per day, although there is considerable scatter in the data. The numerical value of the peak predicted cpd, P max /eτ , and the time of the peak, t inf l = t 0 are indicated by a single large point in Fig.7 . These values are respectively 728 [cases/day] and 7.32 [days] (measured from t = 0 on 18 Mar 2020), giving 25 Mar 2020 as the peak date for new cases per day in Austria implied by this fit. The fitted value of the Austrian "social response time," τ ≈ 7.5 [days], is of special note. This is the smallest value (hence the fastest social response) we have found in our fits to the COVID-19 pandemic numbers from a large set of 15 representative Central and Western European countries (see Table I ). This suggests that Austria deployed an especially rapid and effective response to this pandemic, which presumably had a strongly limiting effect on the total number of cases. A minor discrepancy between the model and data is evident at the early times in Fig.7 that are characterized by small case numbers. (See times before about t = −4 [days] in the figure.) The number of early cases is clearly underpredicted by the fit. This may represent their growth from a small number of initial cases, closely monitored and hence developing rather slowly, which were evident before the primary pandemic contribution became dominant. We will note a similar, more prominent effect in the early data from Denmark and Norway at about the same date. An alternative approach would be to fit the cumulative number of reported cases as a function of time, which is typically reported together with the new cases per day. Since one is the derivative of the other, these should give similar parameter values. Of course the results will not be identical, because the model is not exact, and the cumulative cases data will give higher fitting weight to the later stages of the pandemic. The choice of which dataset to fit, cumulative cases or new cases per day, may be suggested by the subject of greatest interest; the peak of the pandemic is best determined by fitting the new cases per day, but the final number of cases will be best determined if constrained by the large total case numbers reported in the later stages of the pandemic. Here as an example of the second fit we give the SHIR model results that follow from fitting the cumulative case numbers from Austria. We again use a set of 82 points of Austrian data [7] from the initial reporting date of 15 Feb 2020 through the fitting date, 06 May 2020. The SHIR model fit to the Austrian cumulative cases data is shown in Fig.8 Evidently the SHIR model also provides a reasonably accurate description of the cumulative cases for Austria. The parameter values resulting from the two fits, to cpd data (44) and cumulative cases data (45), are quite similar [12] . The total case numbers P max differ from their mean by ±2%, the fitted social response times τ Next we will follow the same procedure described in the first Austrian fit above, and will fit the SHIR model to the cpd data for each of a representative set of Central and Western European countries. For each fit, as discussed for Austria, we have again used 82 pts of daily new case numbers per day [7] for the period 15 Feb -06 May 2020 as the only input data. This data fitted to the threeparameter SHIR model form for dP(t)/dt, (37). This fixes the three model parameters P max (with units of [cases]), τ [days], and the inflection point t inf l = t 0 [days] (measured from t = 0 on 18 Mar 2020) separately for each country. The fitted parameter values for the entire set of 15 Central and Western European countries considered here are given in Table. I, together with the calendar date of the inflection point t 0 , and the fitted peak number of new cases per day, which occurs at t 0 . Perhaps the most remarkable aspect of the SHIR model fits to 15 Central and Western European countries in Table. I is the wide range of values found for the "social response time" τ , and how it appears to correlate with national coronavirus policy. As τ sets the width of the pandemic peak, it is a measure of how long a country will have to endure the pandemic. (Recall for example that the FWHM of the new cases per day, dP/dt, is about 2.4·τ .) The extreme cases are Austria (τ ≈ 7.5 days) and Sweden (τ ≈ 26 days), over 3 times longer than Austria. The Swedish cpd data and fit are shown in Fig.9 . The very slow Swedish response time is apparently the result of a rather controversial COVID-19 policy pro- moted by the Swedish State Epidemiologist, A. Tegnell [13] , which did not impose the strict school closings, border controls, and restrictions on large gatherings that characterized the policies of other Scandinavian countries. Tegnell instead advocated a more open society during the pandemic, to promote herd immunity and a quick economic recovery. Although the Swedish policy was evidently successful in broadening the curve, their per capita number of cases is over twice that of Denmark and Nor-way, and the Swedish fatalities per capita are larger by a similar amount. This argues that an aggressive national policy of minimizing the social response time τ , for example through contact tracing, extensive testing, rapid medical response and isolation of positive cases through quarantine, may be the most effective in reducing the total case and fatality numbers. It may also prove easier to sustain a strict policy over a short period of time than a more lax one over a longer period. The UK and Dutch social response times τ are also quite large, and can also be correlated with their early national policies that promoted "flattening the curve" and developing herd immunity. To quote the Dutch Prime Minister regarding COVID-19 policy in a national address [14] , (in translation) "... we can slow down the spread of the virus while at the same time building group immunity in a controlled way." In a national address [15] the UK Prime Minister stated that "It is vital to slow the spread of the disease. That is the way we reduce the number of people needing hospital treatment at any one time." Of course the argument that slowing the spread of the pandemic is advantageous because it "flattens the curve" is specious. This tacitly assumes that the total number of cases P max is constant, so a broader rate curve must have a lower mean. In practice there is instead evidence that a longer τ may result in more infections per capita, as seen in comparing Sweden with Denmark and Norway. If so, flattening the curve by slowing the spread of the pandemic unfortunately increases the total number of cases and fatalities. This complicated and perhaps counterintuitive issue certainly merits careful future investigation. We have also carried out fits of the SHIR model to COVID-19 data for three North American countries, Canada, Mexico and the USA. These three countries were treated as a separate exercise because of concerns that European and North American policies differed considerably, which might lead to very different fit parameters and pandemic curves. This was indeed found to be the case. Most of the fits of the SHIR model to European countries described above were carried out in May 2020, using 82-point Worldometer coronavirus datasets [7] covering the period 15 Feb -06 May 2020. Initially we used the same procedures and datasets for the three North American countries. This appeared satisfactory for Canada and the USA, however the data for Mexico showed that this country was still in the early stages of the pandemic, and had a very long social response time of about 63 days, with a predicted pandemic peak in late July. For this reason we decided to carry out a fit to Mexico incorporating later data, specifically a 151-point Worldometer cpd dataset for the period 15 Feb -14 July 2020, and have fitted the corresponding data for Canada as well. Recent developments in the USA have shown a dramatic departure from a single peak model, which suggests that the USA should be treated as a superposition of several more localized outbreaks. Accordingly, here we quote SHIR model fit results for New York State alone, as a representative component of the USA with an effective social response. The cpd data in this case is available from 12 Mar 2020, so we fitted a dataset covering the period 12 Mar -14 July 2020, comprising 125 data points. The results of these fits are given in Table. Of course each of the national pandemic datasets we are studying is in some sense the sum of multiple events, displaced in time and encountering special local conditions. In the USA for example the early data is dominated by events on the West Coast, but these were superseded by the major outbreak involving New York City, and then by a smaller event in New Jersey, and a larger number of local events in the Midwest and South. Fitting the data from many local outbreaks as a single event is clearly problematic. A series of local events separated in time but still overlapping significantly, if fitted as a single outbreak, could give a misleadingly large τ . Separating the individual events will require accurate local data, and a more detailed investigation. Despite this concern, the SHIR model does nonetheless allow a reasonably accurate fit to most of the national datasets we have considered. Of the 15 European countries we have discussed here, just two showed clear evidence of a more complicated pandemic history that could not be fitted as a single-peak event. These were Denmark and Norway. In both countries there was evidence of an early, short-lived outbreak, which led to about 800 cases of COVID-19 in each country, both during the period of 10-13 Mar 2020. In both countries this early outbreak was rapidly suppressed, presumably through contact tracing and quarantine. We found that the cpd data from Denmark and Norway could be described by using Eq.(37) to represent two independent outbreaks, a short, early one, with a very short τ ≈ 1.2 [days], and a dominant second outbreak, with the more typical COVID-19 parameters listed in Table.I. The resulting two-peak fit for Norway is shown in Fig.10 [11] . . The total cases predicted for Denmark and for Norway are simply the sum of the early and later peaks, thus 11.01k and 7.72k cases respectively. The idea of simultaneously fitting multiple countries to the SHIR model seems appealing, particularly as this model gives a good description of the case numbers from many different European countries. Unfortunately there are complications with this procedure; individual countries vary considerably in the fitted values of the total cases P max , the characteristic social response time τ , and the local time of the pandemic peak, t 0 . These differences suggest a scaling approach, in which we determine these parameters separately for each country in a SHIR model fit, and then translate (by t 0 ) and scale (by P max and τ ) the datasets for each country separately. If the model and the data are both accurate, these translated and scaled datasets should provide points that fall on the universal curves shown in Figs.5 and 6. Of course there are indications that the COVID-19 data itself is occasionally problematic, with considerable scatter, single-bin spikes, and in some countries a modulation with a 7-day period that may reflect a weekly reporting regimen. Nonetheless it is of interest to generate this translated and scaled scatterplot, to see whether a common pandemic profile is indeed evident in the European data. This scatterplot is shown in Fig.11 for the 82-point cpd datasets of all 15 European countries in Table. I. Each national dataset was translated and scaled separately by its particular values of t 0 , P max , and τ (as given in the table) before being added to this figure. The data from Denmark and Norway before 20 Mar 2020 was excluded from the plot, to suppress contributions from their initial early outbreaks. Table. I, translated and scaled to fit on the universal plot for dP/dt, Fig.6 . The (parameterfree) SHIR model prediction for dP/dt, Eq.(46), is shown as a solid line in this figure. The scaled, centered universal curve for dP/dt is shown as a solid line in Fig.11 , and is given by dP dt We can use this scatterplot to combine the data from the 15 European countries we have considered, since they have now been superimposed through translating in time and scaling both axes. On binning in the scaled time variable (t − t 0 )/τ , we can calculate the average and variance of the scaled cpd datapoints in each bin, which provides an average and error for dP/dt at that value of (t − t 0 )/τ . As an example, we have used bins of width ∆(t − t 0 )/τ = 0.5, centered on t − t 0 = 0, which gives the result shown in Fig.12 . This appears approximately consistent with the parameter-free SHIR model result for dP/dt (46), although there may be relatively minor discrepancies at onset ((t − t 0 )/τ ≈ −2) and well after peak The Europe-averaged COVID-19 data for the new cases per day shown as points with errors in Fig.12 can be used to calculate expectation values under this common COVID-19 dP/dt distribution. Some of these results are given in Table. III, where they are compared to theoretical values predicted by the SHIR model dP/dt, Eq.(46). Evidently the simpler expected values are in reasonably good agreement, notably the displaced value of the mean from the maximum. There is however a discrepancy in the cubic expected values, which are sensitive to the relatively less well determined wings of the distribution at larger |t − t 0 |/τ . Fig.12 . Here for simplicity we abbreviate (t − t0)/τ = x. We have shown that a SHIR (generalized SIR) model and its associated differential equations give rather good closed-form results for the COVID-19 data on new cases per day and cumulative cases, when solved under certain assumptions. These assumptions can be relaxed in future studies. One important assumption was to ignore the effects of the "resolved" population fraction f R . This quantity, which is certainly of interest in planning economic recovery, is simple to evaluate from the "infected" fraction we have discussed here; it is simply the integral in Eq.10. This can be evaluated analytically, together with the full DE for f I , including the r R term, to accommodate transitions out of "infected" and into "resolved." The cumulative case numbers P(t) discussed in relation to COVID-19 data should then be defined by P(t) = N 0 (f I + f R ), i.e. "infected" plus "resolved," since f R is no longer assumed to be negligible. Calculations of mortality rates are certainly of interest, and are implicitly included in the transition from the infected population I (or I S ) to the resolved population R. Mortality can easily be treated separately, for example by specifying separate I → R rate parameters for mortality and for recovery. In the extended SHIR model of Fig.13 , mortality would presumably only contribute to the combined rate for I S → R. The mortality rate coefficients may of course depend strongly on the procedures followed at each country's medical facilities. Numerical methods can be used to extract SHIR model results for the full model, including the S − I quadratic infection term, and can test the importance of the "small fraction of infections" approximation used here. A straightforward generalization of the model could accommodate the effect of "asymptomatic" infected individuals on the dynamics of the COVID-19 pandemic. Recent antibody results from Italy [16] suggest that the asymptomatic cases are a large but not dominant fraction of all COVID-19 cases. Once better data on the number of asymptomatic cases becomes available, it may prove useful to extend the model to include this population. This group follows a progression S → I A → R, and while in the I A population presumably contributes to the infection of other individuals still in S. Although adding this branch to the model may indicate that the fraction of resolved individuals f R is much larger than the SHIR model alone suggests, the pathway through symptomatic cases will remain as before. The most visible predictions of the model, for the number of symptomatic cases (now I S ), may not change significantly. A block diagram of this extended SHIR model is shown in Fig.13 . Finally, although we have primarily considered COVID-19 in Central and Western European countries, largely because of an assumed similarity in standards for reporting data, there is extensive data available from many other countries which could be addressed using this type of model, undoubtedly with interesting results. In this paper we have introduced a generalization of the SIR model in order to incorporate the effects of pandemic reduction measures. This "SHIR" model adds a rate process that allows transitions from the susceptible population (S) to a new "hidden" population (H), which we assume is not susceptible to infection. We have solved this model analytically in the limit of a small infected population fraction, and showed that the predicted time dependence of cumulative cases and new cases per day is in surprisingly good agreement with COVID-19 data. (We primarily considered Central and Western European countries in this study.) The solution of the model with our approximations involves three free parameters; one is the "social response time" τ , which varies widely between countries, in a manner that correlates with their coronavirus policies. We are also grateful to E.S.Swanson for assistance with calculations of expected values and errors in the SHIR model, and with the model block diagrams. Dynamics of infectious diseases In this simple SIR model the time at which fI is a maximum and that maximum value can be solved for analytically, and are respectively t = ln(rI/rR)/(rI − rR) and fI = (ρ/(ρ − 1))(ρ −1/(ρ−1) − ρ −ρ A contribution to the mathematical theory of epidemics Protected groups similar to the "H" box of Fig.2 have previously been considered in generalizations of the SIR model. For example, the effects of vaccination are discussed using an SIRV model [1], and the 2003 SARS model of Lipsitch Transmission Dynamics and Control of Severe Acute Respiratory Syndrome The use of "resolved" for the post-infected population R, the sum of deceased and recovered, was suggested by P.Ardaury. This population is also referred to as "closed cases Worldometer Coronavirus Updates On the Nature of the Function Expressive of the Law of Human Mortality, and on a New Mode of Determining the Value of Life Contingencies From MathWorld -A Wolfram Web Resource Skewness is a dimensionless property of a distribution We note in passing that a single "outlier" point in the Austrian cpd data, 1321 new cases on 26 Mar 2020, lies above the upper cutoff of 1000 cases per day in Fig.7. It was of course included in the fit. Similarly, the Norwegian cpd data includes an outlier of 399 new cases on 27 Mar 2020 and a negative entry of -44 new cases on 08 the cumulative "cas" data fit are much smaller than in the preceding new cases per day "cpd" fit. The cas parameter errors are likely underestimated, since the cas data points represent a running sum, and are therefore highly correlated. The cpd parameter errors are presumably more realistic National Address British Prime Minister Boris Johnson Coronavirus Address Suppression of a SARS-CoV-2 outbreak in the Italian municipality of Vo'