key: cord-0824819-dqspyo5e authors: Kaxiras, Efthimios; Neofotistos, Georgios; Angelaki, Eleni title: The first 100 days: modeling the evolution of the COVID-19 pandemic date: 2020-07-10 journal: Chaos Solitons Fractals DOI: 10.1016/j.chaos.2020.110114 sha: 4f209dfb65b61193c664dec6fe33593fcefd2312 doc_id: 824819 cord_uid: dqspyo5e A simple analytical model for modeling the evolution of the 2020 COVID-19 pandemic is presented. The model is based on the numerical solution of the widely used Susceptible-Infectious-Removed (SIR) populations model for describing epidemics. We consider an expanded version of the original Kermack-McKendrick model, which includes a decaying value of the parameter β (the effective contact rate) due to externally imposed conditions, to which we refer as the forced-SIR (FSIR) model. We introduce an approximate analytical solution to the differential equations that represent the FSIR model which gives very reasonable fits to real data for a number of countries over a period of 100 days (from the first onset of exponential increase, in China). The proposed model contains 3 adjustable parameters which are obtained by fitting actual data (up to April 28, 2020). We analyze these results to infer the physical meaning of the parameters involved. We use the model to make predictions about the total expected number of infections in each country as well as the date when the number of infections will have reached 99% of this total. We also compare key findings of the model with recently reported results on the high contagiousness and rapid spread of the disease. tion to the differential equations that represent the FSIR model which gives very reasonable fits to real data for a number of countries over a period of 100 days (from the first onset of exponential increase, in China). The proposed model contains 3 adjustable parameters which are obtained by fitting actual data (up to April 28, 2020) . We analyze these results to infer the physical meaning of the parameters involved. We use the model to make predictions about the total expected number of infections in each country as well as the date when the number of infections will have reached 99% of this total. We also compare key findings of the model with recently reported results on the high contagiousness The recent pandemic due to the COVID-19 virus has created unprecedented turmoil and changed the daily life of people over the entire planet. It has also yielded a grim toll of victims that succumb to its attack. While there is great expertise in the medical community and the community of statisticians in dealing with epidemics, less is known about this particular disease to make reliable predictions for the evolution of the current pandemic. In studying past epidemics, scientists have systematically applied "random mixing" models which assume that an infectious individual may spread the disease to any susceptible member of the population, as originally considered by Kermack and McKendrick [1] . More recent modeling approaches considered contact networks in which the epidemic spreads only across the edges of a contact network within a population ( [2] [3] [4] ), Bayesian inference models [5] , models of spatial contacts in real cities or countries or in large-scale artificial cities and synthetic populations ( [6] [7] [8] ), and computational predictions of protein structures [9] , to name just a few of the modeling efforts. In the case of COVID-19, there is considerable uncertainty in the data collected about infected individuals due to the difficulty of testing large numbers of suspected cases. Although an avalanche of research studies are currently investigating the COVID-19 epidemiological characteristics ( [10] [11] [12] [13] [14] ), it appears that a simple model which can capture the basic behavior of the pandemic phenomenon, in spite of the large uncertainty in the data, can possibly offer useful guidance for its near-term and longer-term evolution. This paper aims to provide such a simple model with very few adjustable parameters. The original mathematical description of the spread of an infectious disease in a population is the so-called SIR model, due to Kermack and McKendrick [1] which divides the (fixed) population of N individuals into three compartments (groups, classes): • S(t) the number of individuals susceptible but not yet infected with the disease; • I(t) the number of infected individuals; • R(t) the number of individuals removed (recovered) from the infected group, either by becoming healthy again with long-term immunity or by passing away. Models that involve additional compartments can be constructed by considering more detailed stages of the infection or flows among the various compartments. The choice of the compartments is related to the disease that is being studied. For example, in the SIS model, susceptible individuals can become infected/infectious but when they are recovered/removed they can become susceptible again, that is, no permanent immunity is acquired. In the SEIR model, individuals are grouped in four compartments, one more than in the SIR model: the additional compartment is the group of "Exposed" (E) individuals, who have become infected but not yet infectious themselves. A recent model [15] considers eight stages of infection: susceptible (S), infected (I), diagnosed (D), ailing (A), recognized (R), threatened (T), healed (H) and extinct (E), collectively termed SIDARTHE. In the present study we have chosen to focus on the SIR model, which is the simplest model incorporating the Covid-19 epidemic dynamic. We are thus effectively combining the exposed and the infectious compartments of the SEIR model into one group; I(t) = infected and/or infectious individuals (referred hereafter as infected individuals). Although this combination of compartments misses some details of the disease evolution, in light of the approximate analytical solution we introduce later, it should not substantially affect the conclusions to be drawn from the approximate model. The SIR model involves two positive parameters, β and γ which have the following meaning: -β describes the effective contact rate of the disease: an infected individual comes into contact with β other individuals per unit time (the fraction that are susceptible to contracting the disease is S/N ); -γ is the mean removal (recovery) rate, that is, 1 γ is the mean period of time during which an infected individual can pass it on before being removed from the group of the infected individuals. This model obeys the following differential equations: Many recent studies (e.g. [16] , [17] , [18] ), have attempted to model the data of the COVID-19 pandemic by imposing time-dependence conditions (or by data assimilation) on the rates β and γ involved in the original SIR model, in order to account for the imposition of social-distancing measures, quarantine of infected individuals, and other interventions designed to slow down the spread of the disease. Motivated by such considerations, we will introduce a variation of the original model in which the parameter β is a time-dependent, monotonically decreasing function. This change can drastically affect the evolution of the populations. We give below a specific example to illustrate this point. Since the presence of time-dependence in β introduces a forcing term, which for reasonable parameter values lowers the number of infected individuals ("flattens the curve"). The system of equations that describe the SIR, with or without the timedependence in the parameter β, can be easily solved numerically, as shown in We observe from the numerical solution shown of the SIR model, shown in Fig. 1 , that both the susceptible and the removed populations (S and R, respectively) behave like sigmoids, which is the typical behavior of solutions to differential equations that involve exponential growth and decay. Moreover, the infected population is always given by the following expression From these observations, we take the approximate solutions to be given by: Since the analytical model of Eq. (3) can capture the behavior of the SIR model including a time-dependent β, which represent the "forcing" or "flattening" of the curve of infected individuals, we refer to it as the "FSIR" model. Here we derive relations between the parameters used in the model of Eq. (3), and the parameters of the original set of differential equations, Eq. (1). To keep the expressions simple, we will assume α 1 = α 2 = α and define ∆t = t 2 − t 1 . By inserting the expressions forS andĨ in Eq. (1b) we find: where we have defined n = N /N . Similarly, by inserting the expressions forR andĨ in Eq. (1c) we obtain: Thus, in the approximate model described by Eq. this quantity: n → 1, the limit in which the entire susceptible population was exposed and eventually becomes removed population, and n → 0, the limit in which only a tiny fraction of the susceptible population was exposed. In the first limit we obtain: while in the second limit we obtain: From the first expression we see that for t t 1 the value of β 1 (t) increases exponentially, which is an unphysical result. From the second expression, we see that β is a monotonically decreasing function of time and for t t 1 tends to the constant value lim t t1 which is the expected behavior in the FSIR model. For t t 1 and assuming that α∆t > 1 we find that which relates the adjustable parameter α of the analytical model to the value of the parameter β appearing in the original SIR model. The quantity R 0 = β/γ of the SIR model is used to estimate the value of the basic reproduction number of an epidemic. From our analytical model, in the limit n → 0, the quantity β/γ takes the form For t = 0, and assuming that αt 1 1 (as is the case for the fits to reported data discussed in the next section), this quantity becomes For t t 2 , the quantity β/γ becomes The first number is very large for typical values of the parameters in the FSIR model obtained from fits to reported data, while the second value is very small, close to zero. Neither result is realistic. In the important range t 1 < t < t 2 , we find from numerical results that this quantity is approximately described by a decaying exponential in time . This result implies that in this range we would expect β ∼ e −αt (the functional form we assumed for illustrative purposes in Fig. 1 ), and γ ∼ e αt . From this last expression, taking the time average of γ in the range where we have used from the expression of Eq. (5), a reasonable approximation for t t 2 . The last relation for the average value of γ is obeyed to a good approximation for each case of reported data we examined. Using the preceding analysis that led to the relations β ≈ α for the initial value of β and γ ∼ 1/∆t for the average value of γ, we suggest that a reasonable representation of the quantity β/γ is given by the value of α∆t. Thus, we will use this value as a proxy for R 0 , and will refer to it as R 0 . The parameters estimated from the fit of our analytical model to reported data give a value of R 0 which is in agreement with the recently reported median value of R 0 for the pandemic. We use our analytical FSIR model to fit the behavior of infected populations of different countries, as obtained from the European Centre for Disease Prevention and Control (ECDC) [21], for a period ending on April 28, 2020 which corresponds to 100 days from the onset of the exponential growth of reported cases in China. In order to obtain a meaningful fit, we had to consider data for each country that show a monotonic increase at the beginning. This means that a few data points in each case were excluded, as they corresponded to sporadic reports of very few isolated cases, typically 1 to 10 in a given day, interspersed by several days of zero cases. In practice this means that the fitting begins at a certain cutoff day denoted as t 0 . In order to make the fit more robust and simpler, we chose α 1 = α 2 = α. Moreover, we found by trial-and-error that the value In Fig. 4 we give some examples of the actual fits for the "outlier" countries (China, Greece, USA and Spain). To have a measure of the fit that is comparable between different countries, we defined the "quality of fit" as: which is expressed as a percentage (multiplied by a factor of 100). The resulting values of the parameters, including our choices of t 0 , are given in Table 1 . The values of the parameters obtained reveal interesting behavior. • ∆t: This value indicates the lag between the sigmoid that describes the recovered population (R) and the sigmoid of the susceptible population (S). As such, it can be interpreted as the effective rate of removal (γ in the SIR model). In Table 1 we present the values of ∆t for each country. The average of ∆t is close to 27.5 days (∼ 4 weeks), a value consistent with a recently reported estimated median time of approximately 2 weeks from onset to clinical recovery for mild cases, and 3-6 weeks for patients with severe or critical disease ( [22] , [23] , [24] ). Australia and China show an unusual low value, ∆t= 5.7, and 11.1 days, respectively. The value of this parameter has a significant effect on the total expected number of cases (see below). • N : the value of this parameter is representative of the number of daily cases near the peak of theĨ curve. It is close to reported values for this quantity for all the countries. Interestingly, if we assume that the total number of susceptible individuals is close to the population of each country, which in all cases is in the range of N ∼ 10 7 − 10 9 , then the ratio n = N /N → 0, as we assumed for the FSIR model earlier. In Table 1 we also include the values for the quality of the fit, which range from 6.6 (AUS) and 12.6 (USA) to 36.8 (Greece), representing a measure of the relative noise in the data; the noise is largest for Greece because the numbers are rather small. We have also considered fitting the FSIR model to seven-day running averages of the reported cases, and this in general makes almost no difference to the value of the parameters or the quality of the fit (see Fig. 4 for examples). Using the analytic expression forĨ(t) we can extrapolate to long times and try to obtain an estimate for the total value of cases over a long period, when the number of daily cases of infection have essentially dropped to negligible levels (this corresponds toĨ(t) ≈ 0). We call this asymptotic value N T and report it in Table 1 . [25] ). However, fitting real country data tends to produce higher values of ∆t because a seemingly single epidemic wave in a country can be the aggregate result of the superposition of smaller or larger sub-epidemics [26] . Table 1 ). As measured by the estimate of the basic reproduction number R 0 , Italy is the country most adversely affected by the disease (R 0 ≈ 10. The model presented here offers several advantages, but also has certain Second, the extrapolation to future cases of infection is only a lower limit. This point has been discussed in an elegant mathematical analysis of the data by Fokas et al. [28] . We emphasize here that this shortcoming is not traced to the analytical FSIR model, but rather to the underlying differential equations assumed to describe the phenomenon (the original SIR model). The reason is that any set of linear differential equations that describe exponential decay and growth, as those of Eq.'s (1) do, will necessarily have exponential (logistic) behavior of the tails; this is captured by our analytical model, but apparently does not represent the actual data for the long-term behavior of the total number of infected individuals. This is evident in Fig. 4 : in the countries that have long passed the peak of the reported cases, the tail does not asymptote to a constant value, as the sigmoid (logistic) model predicts, but the number actually keeps growing at a slow rate. To capture this behavior, non-linear terms are needed in the underlying differential equations (see [28] ). We suggest that, despite its limitations, the simple analytical model presented here can be useful for a quantitative evaluation of different efforts to contain the pandemic. Acknowledgement: This work grew out of a lecture and a lab exercise on numerical solutions of ordinary differential equations, presented on April 10 2020 in the course "APMTH-10: Computing for Science and Engineering", which the three authors teach. We gratefully acknowledge support from the Computing in Engineering Education (CEE) group of the Active Learning Labs in developing and offering this course. Funding: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. Table 1 ; these values are useful in estimating the reproduction number R0. Right: Estimated values for the reproduction number R0 for the 10 countries considered, as obtained from the FSIR model. The larger the value R0, the more adversely affected by the disease the country is. A contribution to the mathematical theory of epidemics Dynamical patterns of epidemic outbreaks in complex heterogeneous networks Network frailty and the geometry of herd immunity SIR dynamics in random networks with heterogeneous connectivity Bayesian Inference for Contact Networks Given Epidemic Data Lessons from being challenged by COVID-19 Measurability of the epidemic reproduction number in data-driven contact networks Modeling Spatial Contacts for Epidemic Prediction in a Large-Scale Artificial City Computational predictions of protein structures associated with COVID-19 High contagiousness and rapid spread of severe acute respiratory syndrome coronavirus 2" . Emerg Infect Dis Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia Estimating the potential total number of novel coronavirus cases in Wuhan City, China Transmission of 2019-nCoV infection from an asymptomatic contact in Germany Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy Estimating the infection horizon of COVID-19 in eight countries with a data-driven approach A feedback SIR (fSIR) model highlights advantages and limitations of infection-based social distancing An Epidemiological Modelling Approach for Covid19 via Data Assimilation Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe Inferring change points in the spread of COVID-19 reveals the effectiveness of interventions Report of the WHO-China Joint Mission on Coronavirus Disease Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study Virological assessment of hospitalized patients with COVID-2019 The reproductive number of COVID-19 is higher compared to SARS coronavirus A novel sub-epidemic modeling framework for short-term forecasting epidemic waves Predictive mathematical models for the number of individuals infected with COVID-19 MANUSCRIPT ENTITLED: The first 100 days: modeling the evolution of the COVID-19 pandemic Re-Submitted for publication to CHAOS, SOLITONS and FRACTALS Journal, on June 24, 2020. ☒ The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.☐The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: