key: cord-0918481-sehbkay1 authors: Neves, Armando G.M.; Guerrero, Gustavo title: Predicting the evolution of the COVID-19 epidemic with the A-SIR model: Lombardy, Italy and São Paulo state, Brazil date: 2020-08-12 journal: Physica D DOI: 10.1016/j.physd.2020.132693 sha: b20f2c59193b55c37d6e0fd84cb61f3c8494eea8 doc_id: 918481 cord_uid: sehbkay1 The presence of a large number of infected individuals with few or no symptoms is an important epidemiological difficulty and the main mathematical feature of COVID-19. The A-SIR model, i.e. a SIR (Susceptible-Infected-Removed) model with a compartment for infected individuals with no symptoms or few symptoms was proposed by Giuseppe Gaeta, Mathematics in Engineering, 3(2): 1–39 (2020). In this paper we investigate a slightly generalized version of the same model and propose a scheme for fitting the parameters of the model to real data using the time series only of the deceased individuals. The scheme is applied to the concrete cases of Lombardy, Italy and São Paulo state, Brazil, showing different aspects of the epidemics. In both cases we see strong evidence that the adoption of social distancing measures contributed to a slower increase in the number of deceased individuals when compared to the baseline of no reduction in the infection rate. Both for Lombardy and São Paulo we show that we may have good fits to the data up to the present, but with very large differences in the future behavior. The reasons behind such disparate outcomes are the uncertainty on the value of a key parameter, the probability that an infected individual is fully symptomatic, and on the intensity of the social distancing measures adopted. This conclusion enforces the necessity of trying to determine the real number of infected individuals in a population, symptomatic or asymptomatic. Although there are good models for predicting the time evolution of an epidemic of diseases such as influenza or measles, models of the same type are not working for the COVID-19. An important feature of the COVID-19 is that it may be asymptomatic or mildly symptomatic in some patients, although 5 causing severe respiratory symptoms in others. As a consequence, there is a large number of undocumented infections [17] . The lack of tests for assessing the health state of large samples of the populations contributes to the spread of the COVID-19, as asymptomatic individuals may not isolate themselves. Although there is no clear distinction be- 10 tween symptomatic and asymptomatic, in this paper we will use the acronym MSA (mildly symptomatic or asymptomatic) to mean a case of COVID-19 weak enough for not causing death or lead to hospitalization and probably unreported due to the lack of tests. We consider that a good step towards a good predicting model for COVID- 15 19 has been taken in [13] . In Sect. 2 we will describe a slight generalization of the A-SIR model proposed in that work and use it in the rest of this paper in predicting the possible evolution of the COVID-19 epidemics in Lombardy, Italy and São Paulo state, Brazil. COVID-19 is that the models usually contain parameters for which reasonable values are taken, but sometimes without full scientific support. In particular, the models are extremely sensitive to the infection rate β 0 , see (1) . We will show in this paper how to ignore the data on the number of currently infected people. These are prone to a large uncertainty, because of the MSA cases, but also 60 underreporting of the symptomatic cases due to the lack of tests. We will use only the data on the number of deaths due to the COVID-19. Such a decision has also been taken in [1] . The reason behind it is that we believe that the deaths data are a more faithful indicator than the number of cases. In most instances only the patients in severe conditions are being tested for COVID-19. 65 We are aware, however, that underreporting in deaths is also possible due to the lack of tests, and also that political manipulation -both for increasing and for decreasing deaths numbers -cannot be discarded. We will restrict for the time being to the study of the development of COVID- 19 in Lombardy and in the state of São Paulo. Both cases result in good fits 70 of the model to the data. It will turn out that an important part of the fitting procedure is the way of tuning the value of the infection rate β 0 to the data. One important conclusion supported by our good fits -both in Lombardy and in São Paulo -is that the adopted social distancing measures taken in both localities did contribute to diminishing the number of deaths due to with respect to the expected behavior if no measures were taken. An important question is what will happen when the social distancing measures currently in act in most countries are relaxed. One bad possibility is that a second wave of COVID-19 will arise. If not mitigated, the potential number of deaths in the second wave may be larger than the deaths in the first wave. 80 Another possibility is that sufficient herd immunity will have been acquired by the populations after the present epidemic and no large increase of cases should happen after relaxation of the social distancing. We will show in this paper that neither of the above possibilities can be ruled out for Lombardy. Part of our ignorance is due to the fact that one key 85 parameter of the A-SIR model, the probability that a newly infected individual 4 J o u r n a l P r e -p r o o f is symptomatic, is still largely unknown. Another reason for not being able to predict the future of the epidemic is that we do not know how much the social distancing measures adopted were effective in reducing the infection rate of the model. 90 In the case of São Paulo state, Brazil, the fraction of deaths up to now is much smaller than in Lombardy. Although this is good, it also means that the population is still very susceptible. Strong economic pressure is being exerted on politicians for relaxation of the social distancing measures. We predict that even in the best of the possibilities, the number of infected individuals will steadily 95 grow for a large period and in its peak it will be much larger than present values. Thus, social distancing measures should not be relaxed before the number of infected individuals is small. We see that the increase in the number of cases may be catastrophic if social distancing measures are not strengthened. The paper is organized as follows. Sect. 2 starts with a mathematical 100 description of the model and all its parameters. Then we will talk about the linear regime, i.e. the behavior of the solutions of the model for a short time after the beginning of the epidemic. Finally, we will describe conditions for the population fraction of infected individuals to decrease and also the limit behavior after the epidemic is finished. Sect. 3 describes the procedure for 105 finding values of the parameters such that the output of the A-SIR model fits well the deaths data, both in Lombardy and in São Paulo. The paper is closed by Sect. 4, in which we draw some conclusions on the results obtained, and by Sect. 5, in which we account for some changes in the conclusions because of new data released during the time the paper was being written. symptoms. By removed we mean individuals which were either healed after infection, isolated (at a hospital or at home) or deceased. The fraction of removed individuals is composed by the sum of symptomatic removed R s (t) and MSA removed individuals R a (t), according to whether the individuals were fully 120 symptomatic before removal, or had mild or no symptoms. The time span we are going to consider is of a few months, thus we may ignore births and disregard deaths by reasons other than infection. In particular, we suppose that MSA or susceptible individuals do not die and that all infected individuals do not become susceptible again, at least for the time span we are considering. The A-SIR model is described by the following set of nonlinear ordinary differential equations: The latter two equations are not essential for solving the system. In fact, we may calculate R s and R a simply by integrating respectively I(t) and A(t). Another simple property of the model, proved by summing all the 5 equations, is that the sum of the fractions S, I, A, R s and R a is a constant. If we take µ = 1, this is exactly the same model as in [13], although with different notation. All parameters above are considered to be positive and are interpreted as follows: • β 0 is the infection rate of symptomatic individuals; • µ ∈ (0, 1] is a reduction factor such that the infection rate for the MSA is • ξ ∈ (0, 1) is the probability that a new infection event leads to a symptomatic case; 6 J o u r n a l P r e -p r o o f • γ s and γ a are respectively the inverses of the mean time symptomatic and MSA individuals remain infective. We suppose that γ s > γ a . The mean removal time for symptomatic individuals will be considered to 140 be around a week. This does not mean that individuals with symptoms will be healed after one week, but that these individuals, after showing symptoms for some days will either be hospitalized, or stay isolated at home. We will take then γ s = 1/7 (days) −1 . Following [13], we will take γ a = 1/21 (days) −1 , meaning that MSA individuals will remain active in the population for a larger 145 time than symptomatic individuals. We believe that we may take values such as above for γ s and γ a without risk of overestimating or underestimating the size of the epidemics. Other values could have been considered, but would not change the main conclusions. In the following we will explain how to use mortality data to infer the value of the 150 contact rate β 0 . It will turn out that the value of the parameter µ will not alter almost anything in the numeric predictions. On the contrary, we will see that the remaining parameter ξ alters drastically the outcome of the model in the future. As a preparation for understanding things to come, we show in Fig To better understand this important issue, in Fig. 2 we plot, for the same set of parameters and initial conditions, the ratio I(t)/(I(t) + A(t)) of symptomatic 170 to total cases. The figure shows that, because of the initial condition I(0) = A(0), the ratio of symptomatic to total infected individuals starts equal to 1/2. After a transient, it becomes almost constant around 0.6, not 2/3, and then decays to 0. Part of this behavior will be explained in the next subsection. Another feature of the A-SIR model, already noticed in [13] , is that the fraction S(t) is very well approximated by 1 for the initial times. We may use this to approximate the solution of Eqs. (1) for small times. Substituting S(t) which is a linear system of ordinary differential equations with constant coefficients. Although the exact system (1) cannot be exactly solved, its linear 9 J o u r n a l P r e -p r o o f approximation for initial times can be solved in terms of the eigenvalues and eigenvectors of its coefficient matrix As M is 2 × 2, its eigenvalues can be easily calculated as roots of a quadratic polynomial. It can be shown that the eigenvalues of M are always real and that the smaller of them, denoted λ − , is negative. The larger eigenvalue of M will be denoted λ + and is positive, provided that β 0 is not too small, as will be seen ahead. Although the formula for λ + in terms of the parameters of the model is somewhat large, we may invert it and find a rather simple formula for β 0 as a function of λ + and the remaining parameters: In Sect. 3, we will use the above formula along with an estimate of λ + derived from the data to fix the parameter β 0 . A straightforward, but lengthy, calculation shows that for any initial conditions the solution of Eqs. (2) satisfies with In (2) breaks down as an approximation for larger times. This is also shown in Fig. 2 , because is not always close to the value defined in Eq. (6). Another consequence of Fig. 2 is that the ratio of symptomatic to total infected individuals is a dynamic quantity. It cannot be included as a parameter in the model's equations as in [17] . Moreover, as shown by Eq. (6), not even for the small interval of time in which the ratio I(t) I(t)+A(t) is approximately constant, it equals parameter ξ. 190 We will call linear regime the time interval in which the true solution of (1) is well approximated by the solution of (2). For the parameter values in Figs It can be shown [7] that the total fraction of infected individuals I(t) + A(t) in the solutions of (1) will decrease for all t > 0 if and only if λ + ≤ 0. The same condition is generally written in terms not of λ + , but of the basic reproduction ratio R 0 . R 0 is defined as the mean number of individuals infected by a single infected individual during its whole infective period if the population is entirely susceptible. If R 0 ≤ 1, it can be shown that the number of infected individuals will initially decrease and always decrease in the A-SIR model. It is straightforward to calculate R 0 for the A-SIR model following any of the recipes given in [7] . The result is 11 J o u r n a l P r e -p r o o f As commented before, the Malthusian parameter λ + will be positive if β 0 is sufficiently large. The exact condition is exactly that the right-hand side in the 205 above equation is larger than 1, i.e. β 0 > γ s γ a /(γ a ξ + γ s µ(1 − ξ)). If R 0 > 1 and the whole population is susceptible, the total number of infected individuals will initially increase, but as the number of susceptible individual decreases, contagion becomes more difficult, and, consequently, the total number of infected will reach a maximum at some time t * . In the simpler As Fig. 2 illustrates, whenever γ s > γ a and t is large enough, the fraction of MSA individuals is much larger than the fraction of symptomatic infected individuals, so that I(t) + A(t) ≈ A(t). We may use this fact to give an approximate condition for the fraction of total infected individuals to decrease. In fact, the third equation in (1) shows that A(t) decreases whenever . Substituting I+A A in the above formula for its approximate value 1 for large times, we get the approximate condition for the decrease of the total number of infected individuals. We stress that the above condition is approximate and holds whenever γ s > γ a and t is large enough. Notice that the threshold in the right-hand side of (8) is higher for smaller β 0 . As S(t) starts close to 1 and decreases, the threshold will be easier to attain Individuals exit the susceptible compartment of the model either as symptomatic or MSA infected, and they do so at ratios ξ and 1−ξ, respectively. Since they will eventually become either symptomatic removed, or MSA removed, then the symptomatic removed and MSA removed fractions at equilibrium obey Solving the set formed by the latter two In the important case of a very contagious virus, This approximation is also well illustrated in Fig. 1 . According to Crisanti [3] and Fenga [9] , the lack of efficient testing has been responsible for a substantial underestimation of the number of cases in the COVID-19 epidemic in Italy. In particular, probably due to testing preferentially the most severe cases, the mortality rate in the regions of Lombardy in both locations as the sources for our fitting. This approach was also taken e.g. in [11] . The official deaths data, plotted in Fig. 3 where ω ∈ (0, 1) is thus interpreted as the case fatality rate. The value for ω 250 must also be found. As the lack of tests is a reality, an examination of data for several countries, as in [24] , shows that the ratio of deaths to confirmed cases varies broadly among them. Before entering into details, we describe the fitting procedure in general. Both in Lombardy and in São Paulo state, the epidemic started uncontrolled. will be to assess whether this lower rate is a natural consequence of the A-SIR model, or if it is due to the mitigation measures adopted in both locations. As will be fully explained in the following, the first step will be using the deaths data in the uncontrolled epidemic phase to estimate the values of λ + and ω. In the next steps, for each phase of social distancing we will have another parameter indicating the intensity of the adopted measures. In Lombardy we will consider two phases of different intensities 1 and 2 for the period of social distancing. In São Paulo state, only one phase of social distancing will be considered. We will see that it is possible to use the number of deaths data to 270 obtain estimates for ξ and for the intensities i . Although in general an optimal choice for these parameters exists, we will see that many possible choices are almost as good for the purpose of fitting the data with the model. It results that more than one possible good fit of the model to the data exists. We will explore the consequences of this approximate degeneracy in the optimization 275 procedure. Since parameters γ a and γ e are fixed, as already explained, and β 0 will be related to λ + by Eq. (4), the parameter µ still remains undetermined. After several experiences we noticed that, as long as λ + is determined and β 0 is related to it by (4), the results of the model are to large extent independent of µ. This In the above formula, besides other notations already introduced, N is the total We will describe first the fitting procedure for Lombardy, where more data 300 are available and then proceed to São Paulo state. The national lockdown in Italy started on March 9, 2020, day number 14 after Feb. 24, the start date of the Italian time-series. We will define Mar. 9, 2020 to be the end of the uncontrolled epidemic phase for Lombardy. For the numerical solution of Eqs. (1) in Lombardy we use initial conditions Here we used that the population of Lombardy is 10 7 inhabitants and that the We stress that the initial conditions for I(0) and A(0) are the only places in our fitting procedure where some information on the number of confirmed cases is used. We will not use such information in any other day. It is conceivable 315 that at day 0 the number of confirmed cases is more reliable than in later days. Moreover, the exact values of I(0) and A(0) do not matter so much, as I(t) and A(t) initially grow exponentially by a rate to be determined by the number of deaths. In Fig. 5 we show contour plots of the objective function c 14 (λ + , 0.1, 0.5, ω). in Italy) the data increase less than the predicted R s based on an uncontrolled 340 epidemic. This is a good evidence that the lockdown was effectively important in reducing the number of deaths due to COVID-19 in Lombardy. We will incorporate the effect of social distancing measures in our model by introducing a decrease of the infection rate β 0 to a smaller value 1 β 0 , where 0 < 1 < 1. More precisely, in order to avoid introducing a discontinuous function into the system of differential equations, we replace β 0 in equations (1) by the smooth function with In the above formula, θ(t) may be any continuous approximation of the unit step function. We used where erf (z) = 2/ √ π z 0 e −t 2 dt is the integral of the normal distribution. The graph of θ(t) is shown in Fig. 7 In the second step of our fitting procedure, we will fix the values of λ + and ω already estimated and evaluate the values for the intensity 1 of the first phase of the lockdown, and parameter ξ, which was not relevant in the uncontrolled 350 epidemic phase. We referred above to the first phase of the lockdown, because a strengthening of it occurred on Mar. 22, day 27 after Feb. 24. Therefore, to estimate 1 and ξ we will use d = 27 in the cost function, Eq. (11). In both panels of Fig. 8 we also see that after day 27 the green dashed curve of R s (t) grows faster than the data points. This suggests us that the strengthening of the lockdown in Lombardy after Mar. 22 did produce effects in slowing down the number of deaths. 375 We will then introduce a further parameter 2 such that β(t) = 2 β 0 for t > 27. In order to see the effects of relaxing the social isolation measures, we will also restore β(t) smoothly to its value β 0 for t > 100. Precisely, we will take The value for 2 will be determined by minimizing the cost function up to day 63, April 27. It turns out that the best choice for 2 will depend on the choice made for ξ and 1 . In Fig. 10 we show the behavior of the cost function, c 63 , when varying 2 for both the optimistic and pessimistic choices for ( 1 , ξ) already considered in Fig. 9 . As the final result of the fitting procedure for Lombardy, we show in Fig. 11 the graphs of the solutions to the A-SIR model with two different sets of parameter values such that the model fits rather well the deaths data. A conclusion to be drawn from the results in Fig. 11 is that the A-SIR Due to Eqs. (10) and (9), these are also close to the extreme possibilities for the number of deaths at the end of the epidemics. The optimistic choice, corresponding to the first row of Fig. 11 , is such that after complete relaxation of the social distancing, the number of cases continues decreasing and the number of accumulated deaths increases slowly. This happens because the fraction of susceptible individuals since day 38 falls below the threshold in the right-hand side of Eq. (8) and does not increase very 400 much above it after the social isolation measures are removed at day 100. On the other hand, for the pessimistic choice, depicted in the second row of As far as 1 and 2 are concerned, some attempts have been made to mea-420 sure the intensity of the social distancing using cell phone localization data, in particular for the regions in Italy [19] . A problem is that we do not know how to relate these measures with the decreases 1 and 2 in the infection rate. In Brazil the first imported case of COVID-19 was identified in the city of The choice of these conditions is analogous to what was done for Lombardy. The uncontrolled epidemic phase lasted from day 0 (Mar. 17) to day 6 (Mar. 23). We used the deaths during that phase to obtain estimates for λ + and ω. In the optimistic case, we see that if the social distancing measures remain with the present intensity until day 100 (Jun. 23), the fraction of infected individuals will increase up to a maximum by day 80 and start decreasing. If the social distancing measures are removed at day 100, the fraction of infected 470 individuals will resume growth for some days and then it will decrease again. In the pessimistic case, the number of infected individuals will grow to a number much larger than the present one. Even if the social distancing measures are relaxed after day 120, there will still be a rapid increase in the number of infected people. In both cases, we predict that the number of infected individuals at present will still increase considerably before attaining its peak value, which will be by day 90 in the optimistic case, or by day 130 in the pessimistic case. Given that the health services at São Paulo are already operating close to their maximum capacity, in both cases we see that it is necessary that social distancing is in-480 tensified in order to raise the threshold for the decrease of the infected fraction and prevent their collapse. The COVID-19 pandemic forced us, scientists, to tackle the difficult task of trying to understand a new disease at the same time it is killing people in our 485 neighborhoods and stressing our health services. As shown in this paper, the lack of solid knowledge on basic questions produces also an ignorance of what may happen in the future, even the present situation being well described by a simple mathematical model. We hope that more basic research may help fill the knowledge gaps, but probably that will take time. In this paper we used a model as simple as possible in order to have the minimum number of parameters, and devised a procedure to estimate these parameters based only on the more faithful data, the number of deaths. We 500 avoided using information on the number of confirmed cases, or trying to guess underreporting factors. If we had considered more complete models, we would probably have to estimate a larger number of parameters, resulting on a larger uncertainty. We believe that our results, uncertain as they are, may be useful in showing both a worst and a best possible outcome. One important result of our 505 calculations is that the epidemics both in Lombardy and in São Paulo would have been much worse if social distancing measures had not been taken. We saw that in an optimistic possibility, the number of cases of COVID-19 in Lombardy might not increase after the social distancing measures are relaxed. However, we also saw that it is likely that the number of cases shows a new 510 quick rise, being then necessary either to keep these measures for longer, or use alternative measures. As social distancing is being relaxed at many countries, particularly in Italy, it is possible that the uncertainty in our results will be solved in the next days according to whether the number of cases will grow rapidly, or not. In São Paulo state, as the fraction of infected individuals is up to now much smaller than in Lombardy, herd immunity is still far. As a consequence, our calculations predict that the fraction of infected individuals will still increase for some time, even in the most optimistic case. As stressed before, strengthening social distance measures could alleviate this situation. Regarding the epidemic in São Paulo, Table 1 shows the ratio of the maxi- with ξ = 0.12, = 0.405 and social distance measures with the present intensity up to day 120. The table also shows the predicted date of the maximum. 555 We see that even in an optimistic situation, an increase of 38% in the number of cases in only 16 days is expected from the results of the A-SIR model. Intensified social distancing measures might help mitigating this situation. In the other two cases, the recommendation for intensifying social distancing is of course still stronger. A Simulation of a COVID-19 Epidemic Based on a Deterministic SEIR Model, Frontiers in Public Health Assessing the COVID-19 spreading in Rio de Janeiro, Brazil: Do the policies of social isolation really work? Metapopulation modeling of COVID-19 advancing into the countryside: an analysis of mitigation strategies for Brazil Vertical social distancing policy is ineffective to 585 contain the COVID-19 pandemic The construction of next-generation matrices for compartmental epidemic models Analysis and forecast of COVID-19 spreading in China, Italy and France Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand Estimating the effects of non-pharmaceutical interven-600 tions on COVID-19 in Europe Localized outbreaks in an S-I-R model with diffusion A simple SIR model with a large set of asymptomatic infectives Spread of SARS-CoV-2 in the Icelandic Population Contributions to the Mathematical 610 Theory of Epidemics Proc. R. Soc. Lond. A Suppression of a SARS-CoV-2 outbreak in the Italian municipality of Vo Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV2) Predicting the number of reported and unreported cases for the COVID-19 epidemic in South Korea, Italy, France and Ger-620 many COVID-19 outbreak response: first assessment of mobil-36 in Italy following lockdown The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study Differential Evolution -A Simple and Efficient Heuristic 630 for Global Optimization over Continuous Spaces Funding: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. Epidemic, e-print arXiv:2004.03539v1 [q-bio.PE] (2020).[3] A.Crisanti, In one Italian town, we showed mass testing could eradicate the coronavirus,