key: cord-0431892-zmc6y3wu authors: Gaeta, Giuseppe title: Data analysis for the COVID-19 early dynamics in Northern Italy date: 2020-03-04 journal: nan DOI: nan sha: 0ef3864f3509fa701635a93bbd61a699d0b62db5 doc_id: 431892 cord_uid: zmc6y3wu

The COVID-19 epidemic, which started in China in January 2020, was recognized to have reached Italy around February 20; recent estimates show that most probably the virus was already circulating in the country in January, but was not recognized. Data for the early dynamics of COVID-19 in Northern Italy are analyzed.

The COVID-19 epidemic, which started in China in January 2020, was recognized to have reached Italy around February 20; recent estimates show that most probably the virus was already circulating in the country in January, but was not recognized. The development of the epidemic in Italy was characterized by a rather large number of new cases even in the early days. This led to the isolation of two relatively small areas (one with about 50,000 inhabitants and the other with about 3,000), and in these areas the search for infected cases has been carried out through a large number of laboratory tests, which led to the discovery of a rather large number of cases with weak symptoms or no symptoms at all. In fact, about 50% of the identified cases are being treated by simple home isolation. This should be compared with early data from China, and in particular from Wuhan (where the sheer number of serious medical cases and the size of the total population presumably prevented the screening of contacts of infected people not showing symptoms, except perhaps for the medical doctors most involved in fighting the virus), which reported a very small fraction of asymptomatic infections, of the order of a few percent [3]. On the other hand, the Chinese report made a well defined distinction between seriously and less seriously affected patients, showing that the former amounted to about 20% of the total COVID-19 patients, all of whom were hospitalized.
In this note we want to discuss raw data for the Italian situation; in a companion paper we will put forward a very simple model (of the SIR type [7]) taking into account the presence of a large number of asymptomatic cases, many of which are most probably neither detected nor detectable except in very small communities (see footnote 2); this will then be used to understand how much a prompt identification of such cases (not practically feasible at this stage, but which could become possible if rapid and inexpensive tests were developed commercially) could help in reducing the spread of the infection. The data used were obtained from the publicly available daily reports of the Italian "Protezione Civile".

We start by recalling some facts and figures about the situation in China. The spread of COVID-19 in all of China (thus making no distinction between the Hubei region and the rest of the country) is summarized in the semi-logarithmic plot of Figure 1. Fitting the total number of cases N(k) on day k by an exponential law,

$$N(k) = N_0 \, e^{\alpha k} \qquad (1)$$

shows that the growth rate α passed from α_i = 0.330 for the early period (this fit was obtained for the period January 23 to February 2) to α_f = 0.006 for the last available data (this fit was obtained for the period February 21 to March 2), showing that the drastic measures taken by the Chinese government produced a substantial effect. The doubling time τ, obtained from the above via

$$\tau = \frac{\ln 2}{\alpha} \qquad (2)$$

passed correspondingly from τ_i = 2.1 to τ_f = 122 days. The daily growth factor γ is determined as

$$\gamma = e^{\alpha} \qquad (3)$$

and passed from γ_i = 1.391 to γ_f = 1.006.

Footnote 2: We understand that in the smaller of the isolated communities in Northern Italy, with about 3,000 inhabitants in total, a mass screening is being conducted; this should provide more reliable data on the fraction of asymptomatic infections, together with other precious information.
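The conversions between growth rate, doubling time and daily growth factor quoted above for China can be checked with a few lines of Python. This is only a minimal sketch: it assumes the standard relations τ = ln(2)/α and γ = exp(α) for exponential growth, and the function names are illustrative, not taken from the paper.

```python
import math


def doubling_time(alpha: float) -> float:
    """Doubling time in days, assuming exponential growth N(k) ~ exp(alpha * k)."""
    return math.log(2.0) / alpha


def daily_growth_factor(alpha: float) -> float:
    """Ratio N(k+1) / N(k), assuming exponential growth with rate alpha."""
    return math.exp(alpha)


# Early-period growth rate quoted in the text for China.
alpha_early = 0.330
print(round(doubling_time(alpha_early), 1))        # 2.1
print(round(daily_growth_factor(alpha_early), 3))  # 1.391
```

The rounded outputs agree with the values τ_i = 2.1 and γ_i = 1.391 reported in the text; small differences for other quoted figures can arise because the reported α values are themselves rounded.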
The Republic of Korea (also known as South Korea; population around 52,000,000) is another country severely struck by COVID-19; it is also a relevant benchmark for the Italian situation, given the similar population of the two countries and the similar political system (parliamentary democracy). Here there was a sudden increase in the number of cases starting from February 19. The restrictive measures also led to some reduction in the propagation speed of the epidemic, albeit a smaller one than in China. For Korea the "initial" fit refers to the days from February 18 to February 24, while the "final" one refers to the days from February 24 to March 2. The parameter α passed from α_i = 0.56 to α_f = 0.26; correspondingly, the doubling time passed from τ_i = 1.23 to τ_f = 2.68 days and the daily growth factor from γ_i = 1.75 to γ_f = 1.29.

The statistics of data from Northern Italy are (luckily) more limited, both in the number of cases and in the length of the time series. On the other hand, this means that, in view of the relatively long incubation period of the COVID-19 infection, these time series have not yet had time to reflect the restrictive measures taken by the Italian government and the modified habits of the population; thus in building our model we will consider the epidemic parameters as being constant over the time covered by these data. The epidemic has so far developed mainly in three regions: Lombardia, Veneto and Emilia-Romagna; from now on these will be denoted as L, V and ER respectively (data referring to all of Italy will be denoted as I). The total population of Italy is about 60,000,000; that of the most affected regions is roughly as follows: Lombardia 10,000,000; Veneto 5,000,000; Emilia-Romagna 4,500,000. The two isolated areas (red areas) have a population of 50,000 for the Lombardia one and 3,000 for the Veneto one.
Publicly available data are broken down by region (we have not so far analyzed specific data about the "red areas") and, only for the overall count for all of Italy, by the state of the patient. In particular, patients are subdivided into the following classes (each identified below by an acronym): hospitalized in Intensive Care units (IC); hospitalized in standard care units (SC); isolated at home (HI); recovered (REC); dead (D). At the date of this study, as far as we know, there was no shortage of IC facilities, so we assume that the state of the patient is well reflected by the type of treatment he/she is undergoing.

The data for all of Italy and for the different regions are reported in Figures 3 to 6; these are also fitted by an exponential law, but here the fit is performed on data from day 3 (February 23) onwards, since in the first days there seems to be an anomalous growth, most probably due to the recognition as COVID-19 of cases which had previously been considered as standard flu with complications. The fit for the whole of Italy, and for the different regions considered, produces the following parameter α and related quantities: [...] and daily growth factor γ (see eqs. (1), (2) and (3)) for the different Departments considered in Table V.a. See Figure 7 for the fit.

We note that while Lodi, Cremona and Pavia (which is however more distant from the red area) grow at about the same rate, the growth rate in the Department of Piacenza, which is contiguous to the Lombardia red area, appears to be especially high. The same applies to Bergamo and Brescia. It is possible that some areas in these Departments would be better isolated in the same way as the red area in the Lodi Department, and indeed such measures are presently under consideration. The impact of the red area in Vo' Euganeo on the Padova Department appears to be rather limited, as could be expected from its small size.
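An exponential fit of the kind described above is a straight-line fit on a semi-logarithmic plot. The following sketch shows one common way to obtain α by ordinary least squares on the logarithm of the counts; it is not necessarily the procedure used for the figures, the function name is illustrative, and the series is synthetic (it is not the Protezione Civile data).

```python
import math


def fit_exponential(days, counts):
    """Least-squares fit of log(counts) = log(n0) + alpha * day.

    Returns the pair (alpha, n0).
    """
    logs = [math.log(c) for c in counts]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, logs))
    alpha = sxy / sxx                       # slope of the semi-log line
    n0 = math.exp(mean_y - alpha * mean_x)  # intercept, back on a linear scale
    return alpha, n0


# Synthetic, hypothetical series starting at day 3 (as in the fits of the text).
days = list(range(3, 11))
counts = [100.0 * math.exp(0.3 * k) for k in days]
alpha, n0 = fit_exponential(days, counts)
print(f"alpha = {alpha:.3f}, n0 = {n0:.1f}")
```

Starting the fit at day 3 mirrors the choice made in the text to discard the first days, where the growth appears anomalous.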
The growth rate in the Milano area is higher than in the other areas; this was to be expected, since the growth rate is related to population size and density, but for exactly the same reasons it appears worrying.

Figure 7: Semi-logarithmic plot of data for certain Departments (see text) together with best fit (see Table V.b for the values of the parameters).

So far we have only provided raw data; in this short Section we present some considerations based on simple models and on projections of the progression of the epidemic at a reduced speed. The figures given above for the growth rates and daily growth factors in different areas would allow for an easy estimate of the situation in the next few days should growth go on at the same rate. However, some restrictive measures and a campaign of public awareness were started about ten days ago, so, in view of the COVID-19 incubation time, these should start having some effect at any moment. Comparison with the Korean experience suggests this could go as far as halving the growth rate, i.e. doubling the doubling time. In Table VI we give the forecast of known infections within one week (i.e. at March 10) for different areas under different hypotheses for the reduction of the growth rate, i.e. assuming we get a reduced growth rate

$$\alpha_r = r \, \alpha \qquad (4)$$

with α the present one. In this Table we have considered reductions as drastic as r = 0.3, but the Korean experience suggests it is difficult to go beyond r = 0.5 to 0.6.

r     1.0    0.9    0.8    0.7    0.6    0.5    0.4    0.3
L     9500   7800   6400   5200   4200   3500   2800   2300
V     2800   2200   1700   1400   1100   900    700    500
ER    7800   5700   4200   3000   2200   1600   1200   900

Table VI. Projection of infection cases at March 10 under the hypothesis that the restrictive measures reduce the growth rate by a factor r, see (4).
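The kind of projection tabulated above can be sketched in Python as follows. The sketch assumes that the count keeps growing exponentially over the projection window with the reduced rate r·α; the starting count, the growth rate and the function name are hypothetical values chosen only for illustration and are not the inputs behind Table VI.

```python
import math


def project_cases(current_cases: float, alpha: float, r: float, days: float) -> float:
    """Cases after `days` days if the growth rate drops from alpha to r * alpha."""
    return current_cases * math.exp(r * alpha * days)


# Hypothetical inputs, for illustration only (not taken from the tables).
current_cases = 1000.0
alpha = 0.30
horizon_days = 7

reduction_factors = [1.0, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3]
projections = {
    r: project_cases(current_cases, alpha, r, horizon_days)
    for r in reduction_factors
}
for r, cases in projections.items():
    print(f"r = {r:.1f}: about {cases:.0f} cases")
```

Because the projection is exponential in r, each step of 0.1 in r multiplies the projected count by the same factor exp(-0.1·α·T), which is the regular pattern visible along each row of the table.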
Although we have not discussed models for the COVID epidemic, and we intend to present a dedicated model in a companion paper, it may be worth making some general considerations on the relevance of lowering the parameter α based on epidemiological models, and in particular on the simple SIR model [7]. The SIR model is described by the equations

$$\frac{dS}{dt} = -a\,S\,I , \qquad \frac{dI}{dt} = a\,S\,I - b\,I , \qquad \frac{dR}{dt} = b\,I .$$

Here S represents the susceptible individuals, I the infected ones, and R those removed from the epidemic dynamics. Note that the total population N = S + I + R is assumed to remain constant. It is immediately apparent that in the SIR model the number of infected will grow as long as S > Γ := b/a; thus Γ is also known as the epidemic threshold. The epidemic can develop only if the population is above the epidemic threshold. The parameters a and b describe the contact rate and the removal rate; they depend both on the characteristics of the pathogen and on social behavior. For example, a prompt isolation of infected individuals is reflected in raising b, a reduction of social contacts is reflected in lowering a, and both these actions raise the epidemic threshold Γ. If this is raised above the level of the total population N, the epidemic stops (which means that the number of infected individuals starts to decrease, although new individuals will still be infected). The same effect can be obtained by reducing the population N, i.e. by partitioning it into non-communicating compartments, each of them with a population below the epidemic threshold. One can easily obtain the relation between I and S by writing

$$\frac{dI}{dS} = -1 + \frac{\Gamma}{S} ,$$

which upon elementary integration yields

$$I = I_0 + (S_0 - S) + \Gamma \ln\!\left(\frac{S}{S_0}\right) ,$$

with I_0, S_0 the initial data for I(t) and S(t); in ordinary circumstances, i.e. unless there are naturally immune individuals, S_0 = N − I_0 ≃ N.
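The relation between I and S can be checked numerically. The sketch below assumes the standard SIR form dS/dt = −a S I, dI/dt = a S I − b I, integrates it with a classical fourth-order Runge-Kutta scheme, and compares the computed trajectory with I = I_0 + (S_0 − S) + Γ ln(S/S_0). The parameter values and function names are hypothetical and chosen only for illustration; they are not fitted to any COVID-19 data.

```python
import math


def sir_rhs(s, i, a, b):
    """Right-hand side of dS/dt = -a S I and dI/dt = a S I - b I."""
    return -a * s * i, a * s * i - b * i


def integrate_sir(s0, i0, a, b, dt=0.01, t_max=200.0):
    """Integrate S and I with classical RK4; return the list of (S, I) points."""
    s, i = s0, i0
    path = [(s, i)]
    for _ in range(int(t_max / dt)):
        k1s, k1i = sir_rhs(s, i, a, b)
        k2s, k2i = sir_rhs(s + 0.5 * dt * k1s, i + 0.5 * dt * k1i, a, b)
        k3s, k3i = sir_rhs(s + 0.5 * dt * k2s, i + 0.5 * dt * k2i, a, b)
        k4s, k4i = sir_rhs(s + dt * k3s, i + dt * k3i, a, b)
        s += dt * (k1s + 2 * k2s + 2 * k3s + k4s) / 6.0
        i += dt * (k1i + 2 * k2i + 2 * k3i + k4i) / 6.0
        path.append((s, i))
    return path


def infected_from_susceptible(s, s0, i0, threshold):
    """I as a function of S: I = I0 + (S0 - S) + Gamma * ln(S / S0)."""
    return i0 + (s0 - s) + threshold * math.log(s / s0)


# Hypothetical parameters: population 1000, one initial infected individual.
a, b = 5.0e-4, 0.2
s0, i0 = 999.0, 1.0
threshold = b / a  # epidemic threshold Gamma = b / a

path = integrate_sir(s0, i0, a, b)
max_gap = max(
    abs(i - infected_from_susceptible(s, s0, i0, threshold)) for s, i in path
)
peak_numeric = max(i for _, i in path)
peak_formula = infected_from_susceptible(threshold, s0, i0, threshold)
print(f"largest deviation from the I(S) relation: {max_gap:.2e}")
print(f"peak of I: numeric {peak_numeric:.2f}, from relation at S = Gamma {peak_formula:.2f}")
```

Evaluating the relation at S = Γ gives the height of the epidemic peak without integrating in time, since I stops growing exactly when S crosses the threshold; the time at which the peak occurs still requires the numerical integration.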
As we know that the maximum I* of I will be reached when S = Γ, we obtain from this an estimate of this maximum (note that we do not have an analytical estimate of the time needed to reach this maximum); writing Γ = σN (with σ < 1) this reads

$$I_* \simeq N \left[ 1 - \sigma + \sigma \ln \sigma \right] .$$

Thus increasing Γ, even if we do not manage to take it above the population N, leads to a reduction of the epidemic peak; this reduction can be quite significant even for a relatively moderate reduction of the contact rate a, and thus increase of Γ. See the example condensed in Table VII. We stress that a reduction of the parameter a does not only lead to a lowering of the epidemic peak, but also slows down the whole epidemic dynamics. This is shown in Figure 8.

r      Γ             I*
1.0    1.21 · 10^7   1.81 · 10^6
0.9    1.35 · 10^7   1.21 · 10^6
0.8    1.52 · 10^7   6.42 · 10^5
0.7    1.73 · 10^7   1.89 · 10^5
0.6    2.02 · 10^7   1.02 · 10^3

Table VII. Exemplification of the effect of a reduction of the contact rate a on the epidemic peak. Here N = 20,000,000, b = 1/5 and a_0 = (5/3) · 10^{-8}, a = r a_0. The values of Γ = b/a and I*(Γ) are tabulated for different values of r ≤ 1 such that the population is still above the epidemic threshold, Γ < N.

We stress that the SIR model is too simple to extract from it any prediction on the progression of the COVID-19 epidemic, all the more so with a naive estimation of the parameters. However, it points to the absolute need to reduce the contact rate a (which implies reducing the parameter α describing the growth of the infection) in order to reduce the epidemic peak and to slow down the epidemic dynamics, gaining more time to prepare to face the epidemic peak.

The spread of the COVID-19 epidemic in Northern Italy is well described by an exponential growth with parameter α ≃ 0.32, with higher values in some regions. This should be compared with the growth rates in China, and in this sense the observed figure is quite worrying because: 1.
The absolute value of the growth rate is rather large, and very similar to the growth rate in the early stage of the epidemic in China (α = 0.33). 2. The substantial reduction of the growth rate in China was due to rather strict measures by the central and local Governments. Restrictive measures have also been taken in Italy, but these are substantially weaker than those adopted in China. As these were implemented around February 24, their effects will be readable (in view of the rather long incubation time of COVID) only in the next few days. 3. The experience of Korea, which is in some ways more comparable to Italy in terms of total population and political decision system, can give some hint on the expected outcome of less drastic restrictive measures; in that case, the α parameter has been nearly halved, and correspondingly the doubling time has doubled. This leads, both in Korea and (assuming the same effect is obtained by the Italian restrictive measures and public awareness) in Italy, to a slowing down of the epidemic growth, but not to recovering control over the situation, i.e. taking the dynamics below the epidemic threshold [7]. 4. We expect the growth rate of an epidemic in a given region to be proportional to, or at least correlated with, its overall population and population density. In this respect, one should recall that the population of Hubei is around 60,000,000; on the other hand, the total population of the Italian regions mainly involved (Lombardia, Veneto, Emilia-Romagna) is about 20,000,000. This means that the virus is circulating at a very high rate, and social contacts should be substantially reduced in order to have any hope of stopping, or at least substantially slowing down, its spread. 5.
The standard theory of SIR epidemics (not completely adapted to the present situation) shows that a reduction in the contact rate can lead to a substantially more manageable development of the epidemic, in particular postponing the peak period and making it less severe in terms of the number of cases to be treated simultaneously. One should not forget, indeed, that although COVID is lethal only in a relatively small fraction of cases, of the order of 1% (to be compared, however, with the 10^{-4} of flu), this refers to the situation where patients can be properly assisted; as is well known, a saturation of hospital facilities, or even just of Intensive Care units, would lead to a dramatic increase of the mortality rate.

Vital Surveillances: The Epidemiological Characteristics of an Outbreak of 2019 Novel Coronavirus Diseases (COVID-19) - China, 2020
Mathematical Biology. I: An Introduction

I thank Luca Peliti for useful discussions; the opinions expressed in this note are in no way involving his responsibility. I also thank Mariano Cadoni and Enrico Franco for pointing out a blur in the first version of this report. The paper was prepared during a stay at SMRI. The author is also a member of GNFM-INdAM.