key: cord-0511161-1ttt1fgr authors: Cardoso, Ben-Hur Francisco; Gonçalves, Sebastián title: Urban Scaling of COVID-19 epidemics date: 2020-05-15 journal: nan DOI: nan sha: a1383291523807cbdd9f4bf5bcde5ccfd823d007 doc_id: 511161 cord_uid: 1ttt1fgr

Susceptible-Infective-Recovered (SIR) mathematical models are in high demand due to the COVID-19 pandemic. They are used in their standard formulation, or in one of their many variants, to fit and, hopefully, predict the number of new cases over the next days or weeks in any place, city, or country. This knowledge is key for the authorities to prepare for the demand on health systems or to apply restrictions that slow down the infectives curve. Even though the model can be easily solved (with specialized software or by programming the numerical solution of the differential equations that represent it), prediction is not an easy task, because the behavioral change of people is reflected in a continuous change of the parameters. A relevant question is what can be transferred from one city to another: whether what happened in Madrid could have been applied to New York and, in turn, whether what we have learned from New York would be of use for São Paulo. With this idea in mind, we present an analysis of a spreading-rate-related measure of COVID-19 as a function of population density and population size for all US counties, as well as for Brazilian and German cities. Contrary to the common hypothesis in epidemic modeling, we observe a higher per-capita contact rate for higher population density and population size. We also find that population size has more explanatory power than population density. A contact rate scaling theory is proposed to explain the results.

The COVID-19 epidemic, which started in the Chinese city of Wuhan in December 2019, was declared a pandemic by the World Health Organization (WHO) on March 11th, 2020.
Presently, the epicenter is in the US, while cases are still growing in many European countries. Soon, the center of gravity will shift to Russia or Brazil, where the epidemic has the potential to hit even harder than in the US. Many groups still struggle to obtain precise medium- and long-term predictions of the number of expected cases, especially hospitalized or ICU ones. This is because the epidemic parameters change continuously as the population changes its behavior, with or without government interventions [1]. However, some general properties, specifically related to demographic characteristics, can already be identified in aggregated data.

To ground our analysis, which compares empirical municipal-level data from several countries, we use a version of the SIR model that we call the SIRD model because of its fourth compartment, D. In this simple model, each municipal-level region (city or county) with population size N and land area A is composed of the following epidemiological compartments: susceptible (S), infected (I), recovered (R), and dead (D). Considering only within-county transmission, the dynamics of these compartments is driven by the following system of differential equations:

$$
\frac{dS}{dt} = -\beta \frac{S I}{N}, \qquad
\frac{dI}{dt} = \beta \frac{S I}{N} - \gamma I, \qquad
\frac{dR}{dt} = (1-\phi)\,\gamma I, \qquad
\frac{dD}{dt} = \phi\,\gamma I, \tag{1}
$$

where β is the transmission rate, γ is the removal rate, and φ is the case fatality rate. The factor N in the denominator makes β a disease-only parameter, supposedly independent of the size or any other characteristic of the population. Indeed, in the book of Keeling and Rohani [2] this formulation is referred to as frequency-dependent (or mass action) transmission. Called proportionate mixing by Anderson and May [3], it assumes that the number of contacts is independent of the population size, resulting in similar patterns of transmission whether in a town or in a large city. However, the unprecedented evidence being collected from the ongoing COVID-19 pandemic suggests that this common assumption is not generally valid.
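To make the frequency-dependent formulation concrete, the following is a minimal sketch that integrates an SIRD system numerically. It assumes the standard form in which new infections occur at rate βSI/N and a fraction φ of the removals γI die; the forward-Euler scheme, the function name `simulate_sird`, and all parameter values are illustrative assumptions, not taken from the paper.

```python
def simulate_sird(n, i0, beta, gamma, phi, days, dt=0.01):
    """Forward-Euler integration of the SIRD equations; returns one (S, I, R, D) tuple per day."""
    s, i, r, d = float(n - i0), float(i0), 0.0, 0.0
    steps_per_day = round(1 / dt)
    history = [(s, i, r, d)]
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt  # frequency-dependent term
            removals = gamma * i * dt
            s -= new_infections
            i += new_infections - removals
            r += (1 - phi) * removals
            d += phi * removals
        history.append((s, i, r, d))
    return history


# Illustrative parameter values only (not fitted to any data set).
N = 1_000_000
history = simulate_sird(n=N, i0=10, beta=0.3, gamma=0.1, phi=0.01, days=200)
s_end, i_end, r_end, d_end = history[-1]
print(f"final susceptible fraction: {s_end / N:.3f}")
print(f"cumulative deaths: {d_end:.0f}")
```

Because β is divided by N, rerunning the sketch with a different population size and the same initial infected fraction gives the same fractional curves, which is exactly the size independence that this formulation assumes.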
On the opposite side, there is the pseudo-mass action formulation [2], in which the infection rate is directly proportional to the population size, and which is not usually applied to human infectious diseases. Our analysis shows that neither of these extreme formulations can satisfactorily explain the available COVID-19 data. The best fit corresponds to a formulation that lies between the two, and which can be explained in terms of a contact rate scaling theory.

The time evolution of the compartments is governed by the three parameters φ, γ, and β. The last one can be factorized as β = pC, where p is the probability of transmission and C is the per capita contact rate [4]. The probability of transmission p¹ is a characteristic of the disease, most likely universal, and a key to epidemics, because if it is too low we would probably not have an outbreak. C, on the other hand, condenses all the human factors that give rise to different epidemic patterns in different places, countries, or cultures. It is the only parameter that non-pharmaceutical interventions, like activity restrictions or lockdowns, can modify. Here, however, we restrict ourselves to its dependence on urban characteristics.

There are two main competing hypotheses that try to explain how C varies with N and A: the population size-driven contact rate, where C = C(N); and the population density-driven contact rate, where C = C(ρ), with ρ = N/A. While the first approach assumes that the social mobility network grows in larger areas, allowing more distant people to interact [5], the second one assumes that the distance traveled by individuals is independent of the city's size [6]. Intriguingly, based on data of disease transmission in the United States, both approaches appear to be valid [4, 6, 7]. The reason for this is the quasi-linear correlation between the population density and the population size of US counties, as shown in Fig. 1. Indeed, we have found that the best fit is $\rho \propto N^{\lambda}$, with λ ≈ 1.03.
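As an illustration of how such an exponent can be estimated, the sketch below fits a power law by ordinary least squares in log-log space. The text does not state which fitting procedure was used, so this choice is an assumption, as are the function name `fit_power_law` and the synthetic, noise-free input data.

```python
import math


def fit_power_law(x, y):
    """Least-squares fit of log(y) = log(c) + lam * log(x); returns (c, lam)."""
    log_x = [math.log(v) for v in x]
    log_y = [math.log(v) for v in y]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((a - mean_x) ** 2 for a in log_x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    lam = sxy / sxx
    c = math.exp(mean_y - lam * mean_x)
    return c, lam


# Synthetic data with a known exponent (hypothetical values, not the county data).
populations = [1e3, 1e4, 1e5, 1e6]
densities = [5e-4 * n ** 1.03 for n in populations]
c, lam = fit_power_law(populations, densities)
print(f"prefactor = {c:.2e}, exponent = {lam:.3f}")
```

With real data the points scatter around the line, so the fitted exponent should be read together with its uncertainty and with the quality of the competing fits.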
Assuming instead a linear relation, ρ = kN, gives an equally valid fit, with $k = A^{-1} = 0.00059\ \mathrm{km}^{-2}$. Neither an almost constant density across counties nor the absence of correlation between ρ and N can explain the data. Note that from the value of the constant k we obtain a typical county diameter in the United States of 46.5 km.

In addition to the US, we study COVID-19 transmission in the cities of Brazil and Germany. In these two countries, the population size of a city does not correlate well with its population density (see Fig. 2). The linear fit, ρ = kN, is weak for the Brazilian cities and almost nonexistent for Germany. The constant case also cannot explain the data. Since there is no correlation between ρ and N, we can use these two countries to check the validity of the population size-driven and the population density-driven approaches. The results can be useful during the present COVID-19 pandemic and for future ones.

¹ The probability of transmission p can be drastically attenuated or even suppressed by the use of masks, for example.

Let us assume that individuals are distributed uniformly in a two-dimensional space with density ρ. As introduced by Noulas et al. [8], we can expect that individual j interacts with individual i with probability

$$
P_{ij} = \frac{1}{\mathrm{rank}_i(j)^{\,1-\alpha}},
$$

where $\mathrm{rank}_i(j)$ is the number of neighbors closer to i than j and 0 ≤ α ≤ 1 is a scaling factor. Assuming that the distance between these two individuals is r, we have $\mathrm{rank}_i(j) = \rho \pi r^2$, so that

$$
P(r) = \frac{1}{(\rho \pi r^2)^{1-\alpha}}.
$$

First, since 0 ≤ P ≤ 1, we must impose a lower cutoff radius $r_0$ such that

$$
\rho \pi r_0^2 = 1.
$$

Second, it is natural to assume an upper cutoff radius $r_1$ for long distances such that $P(r > r_1) = 0$. So, the per capita contact rate is given by

$$
C = \frac{1}{2} \int_{r_0}^{r_1} P(r)\, \rho\, 2\pi r \, dr = \frac{(\rho A_1)^{\alpha} - 1}{2\alpha}, \tag{2}
$$

where $A_1 \equiv \pi r_1^2$ is the coverage area of individual mobility and the factor 1/2 eliminates double counting. This result generalizes the α = 0 case deduced by Krumme et al. [6], where

$$
C = \frac{1}{2} \ln(\rho A_1).
$$

In the population size-driven approach, the mobility area grows with the city, $A_1 = A$, so that $\rho A_1 = N$ and Eq. 2 becomes $C(N) = (N^{\alpha} - 1)/(2\alpha)$. On the other hand, the population density-driven approach states that $A_1$ is invariant; thus, from the same Eq.
2 we get

$$
C(\rho) = \frac{(\rho A_1)^{\alpha} - 1}{2\alpha}, \qquad A_1 = \text{constant}.
$$

We use the municipal-level time series of confirmed cases and deaths for the United States [9], Brazil [10] and Germany [11]. We also use the municipal-level population size and land area for the United States [12, 13], Brazil [14, 15] and Germany [?].

Due to social distancing measures, the value of β is expected to vary in time, but we can consider it constant over a sufficiently short interval. So, let [t, t + ∆t] be an interval such that S(t + ∆t) < S(t) and D(t + ∆t) > D(t). Assuming that β(t) is constant in this interval, we get from Eq. 1:

$$
\frac{dS}{dD} = -\frac{\beta}{\phi \gamma N}\, S
\quad \Longrightarrow \quad
S = B\, e^{-D/D_0}, \qquad D_0 \equiv \frac{\phi \gamma N}{\beta},
$$

where B is the integration constant. To cancel day-of-week seasonality bias, we choose ∆t = 1 week. Now, noting that S is the population size minus the confirmed cases, we can construct a weekly time series of $D_0$ (see Fig. 3 for an example) such that

$$
D_0(t) = -\frac{D(t + \Delta t) - D(t)}{\ln\left[ S(t + \Delta t) / S(t) \right]}.
$$

In the example of Fig. 3, the maximum value occurs in the 5th week, at the beginning of the PAUSE order [16]. Assuming that the social distancing measures are sufficiently evenly distributed across cities and over time, we can expect that $D_0$ varies in time around a value proportional to the theoretical one.

With the population density-driven contact rate hypothesis, we relate the population density of each city to its $D_0/A$ value, where $D_0$ now denotes the time average of $D_0(t)$ for this location. So, in this framework, and using β = pC, we have

$$
\frac{D_0}{A} = \frac{\phi \gamma}{p}\, \frac{\rho}{C(\rho)} \propto \frac{\rho}{(\rho A_1)^{\alpha} - 1}. \tag{3}
$$

Considering the population size-driven contact rate hypothesis, we instead relate the population size of each city to its time-averaged $D_0$. In this approach, we expect

$$
D_0 = \frac{\phi \gamma}{p}\, \frac{N}{C(N)} \propto \frac{N}{N^{\alpha} - 1}. \tag{4}
$$

Results

If mass action (constant contact rate) is valid, we expect $D_0/A$ to scale linearly with ρ. At the other extreme, if pseudo-mass action (C ∝ N) is valid, we expect $D_0/A$ to be constant. In Figs. 4, 5 and 6 we show, respectively, the comparison between $D_0/A$ and the population density for the counties of the United States, the cities of Brazil, and the cities of Germany.
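As a brief aside on how the weekly values behind these comparisons could be computed, the sketch below estimates $D_0$ on consecutive one-week windows from cumulative case and death counts. It assumes the estimator $D_0(t) = -[D(t+\Delta t) - D(t)] / \ln[S(t+\Delta t)/S(t)]$, which follows from dividing dS/dt by dD/dt in the SIRD equations with β held constant over the window, and it takes S as the population minus the cumulative confirmed cases. The function name `weekly_d0` and the synthetic input series are hypothetical.

```python
import math


def weekly_d0(cases, deaths, population, step=7):
    """Estimate D_0 on consecutive windows of `step` days from cumulative daily series."""
    estimates = []
    for t in range(0, len(cases) - step, step):
        s_now = population - cases[t]
        s_next = population - cases[t + step]
        d_now, d_next = deaths[t], deaths[t + step]
        # Keep only windows where S strictly decreases and D strictly increases.
        if 0 < s_next < s_now and d_next > d_now:
            estimates.append(-(d_next - d_now) / math.log(s_next / s_now))
    return estimates


# Synthetic check (hypothetical numbers): series that obey S = N * exp(-D / D_0).
N, TRUE_D0 = 100_000, 500.0
deaths = [float(day) for day in range(29)]  # one cumulative death per day
cases = [N * (1 - math.exp(-d / TRUE_D0)) for d in deaths]
estimates = weekly_d0(cases, deaths, N)
mean_d0 = sum(estimates) / len(estimates)  # time average for this location
print(len(estimates), round(mean_d0, 3))
```

Windows that violate the two monotonicity conditions are skipped rather than patched, since the logarithm or the death increment would otherwise be zero and the estimate undefined or uninformative.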
We can note that the model of Eq. 3 provides a good fit for the United States, better than the mass action hypothesis. However, this is not the case for Brazil and Germany, where the model of Eq. 3 has almost the same predictive power as the mass action hypothesis. The pseudo-mass action hypothesis cannot explain these results. This is also true in a more general scope, as shown in Fig. 7. This result indicates that the contact rate may in fact be related to the population size and not to the population density. The case of the United States can be explained by the linear scaling between the population size and the population density of its counties, as shown before in Fig. 1.

If mass action (constant contact rate) is valid, we expect $D_0$ to scale linearly with N. At the other extreme, if pseudo-mass action (C ∝ N) is valid, we expect $D_0$ to be constant. In Figs. 8, 9 and 10 we show, respectively, the comparison between $D_0$ and the population size for the counties of the United States, the cities of Brazil, and the cities of Germany. Now, we can note that the model of Eq. 4 provides a good fit for the three countries and is better than the mass action hypothesis. The pseudo-mass action hypothesis cannot explain these results. This can be viewed in a more general scope in Fig. 11, where we find universally that α ≈ 1/3.

Our main hypothesis is that closer people interact more frequently. So, we expect that if we increase the geographical scale (counties → metropolitan areas → states), we reduce the dependence of the contact rate on the population size. In Figs. 12 and 13 we show, respectively, the relation between $D_0$ and N for the Metropolitan Areas and the States of the United States. As expected, the scaling dependence is stronger at finer geographical scales. The scaling α ≈ 0.2 for Metropolitan Areas is very close to the power-law scaling found in a previous work [18].
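To see how the exponent α shapes the expected scaling of $D_0$ with N, the sketch below evaluates the contact rate under the closed form $C = (n^{\alpha} - 1)/(2\alpha)$, with $n = \rho A_1$ the number of people inside the mobility area. This closed form is an assumption here: it is what one obtains by integrating $P(r) = (\rho\pi r^2)^{\alpha-1}$ between the two cutoff radii, with the logarithm as the α → 0 limit. The function names and the chosen population sizes are hypothetical.

```python
import math


def contact_rate(n_eff, alpha):
    """Per-capita contact rate C for n_eff = rho * A_1 individuals within the mobility area."""
    if n_eff <= 1:
        return 0.0  # nobody beyond the lower cutoff radius
    if alpha == 0:
        return 0.5 * math.log(n_eff)  # logarithmic limit of the general expression
    return (n_eff ** alpha - 1) / (2 * alpha)


def local_exponent(n, alpha, factor=10.0):
    """Log-log slope of N / C(N), which is proportional to D_0, between n and factor * n."""
    low = n / contact_rate(n, alpha)
    high = factor * n / contact_rate(factor * n, alpha)
    return math.log(high / low) / math.log(factor)


for alpha in (0.0, 1 / 3, 1.0):
    slope = local_exponent(1e5, alpha)
    print(f"alpha = {alpha:.2f}: local slope of D_0 versus N near N = 1e5 is {slope:.2f}")
```

Under these assumptions, α = 1 gives C growing linearly with N and hence an almost flat $D_0$, as in the pseudo-mass action extreme, while small α gives a slowly varying C and a nearly linear $D_0$, close to mass action; intermediate values such as α ≈ 1/3 produce a sublinear power law in between.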
In Ref. [18], using the approximation S ≈ N for short times, the authors measured the growth rate of confirmed COVID-19 cases with an exponential fit between March 16th and March 25th. Here we follow the approach described in the methodology section, since it does not involve that approximation, uses all available data (both confirmed cases and deaths), and allows for the weekly variation of β, now that data for a longer period are available.

Epidemic dynamics are traditionally explained by two hypotheses: mass action and pseudo-mass action. Here we have shown empirically that neither describes the data well. We have also developed a theory to explain the relation found. Our analysis and results support the validity of the population size-driven contact rate for the COVID-19 pandemic. This result can also explain the super-linear scaling of criminality in Brazil [19], Japan [5] and the United States [20]. Such is the downside of living in large urban centers. From our analysis, it is clear that the scaling is valid at the municipal, county, or city level. If we broaden it to the region, province, or state level, it is washed out by the different scales averaged over such large regions. This conclusion can provide useful insight regarding the urgent problem that cities, and the world in general, are facing. As other authors [18] have already pointed out, larger cities require stricter social distancing policies. On the other hand, smaller cities may relax controls before larger cities.
Trend analysis of the COVID-19 pandemic in China and the rest of the world
Modeling infectious diseases in humans and animals
Infectious diseases of humans
The scaling of contact rates with population density for the infectious disease models
The origins of scaling in cities
Urban characteristics attributable to density-driven tie formation
Dynamics of measles epidemics: estimating scaling of transmission rates using a time series SIR model
A tale of many cities: universal patterns in human urban mobility
CSSEGISandData / COVID-19
Brasil.IO / COVID-19
County population totals
2010 census urban and rural classification and urban area criteria
Declaring a disaster emergency in the state of New York
COVID-19 attack rate increases with city size. Mansueto Institute for Urban Innovation Research Paper Forthcoming
The statistics of urban scaling and their connection to Zipf's law
The scaling of crime concentration in cities