key: cord-0820761-5sijaqjg
authors: Abbasi, M.; Bollini, A.L.; Castillo, J.L.B.; Deppman, A.; Guidio, J.P.; Matuoka, P.T.; Meirelles, A.D.; Policarpo, J.M.P.; Ramos, A.A.G.F.; Simionatto, S.; Varona, A.R.P.; Andrade-II, E.; Panjeh, H.; Trevisan, L.A.
title: Fractal signatures of the COVID-19 spread
date: 2020-07-11
journal: Chaos Solitons Fractals
DOI: 10.1016/j.chaos.2020.110119
sha: 86b61a735ac4946266f81a73696ba98c294fa6af
doc_id: 820761
cord_uid: 5sijaqjg

Recent quantitative approaches for studying several aspects of urban life and infrastructure have shown that scale properties allow the understanding of many features of urban infrastructure and of human activity in cities. In this paper, we show that COVID-19 virus contamination follows a similar pattern in different regions of the world. A superlinear power-law behavior for the number of contamination cases as a function of the city population, with exponent β of the order of 1.15, is obtained in every region analyzed. Due to the strong indication that scaling is a determinant feature of the COVID-19 spread, we propose an epidemiological model that embodies a fractal structure, allowing a more detailed description of the observed data about the virus spread in different countries and regions. The hypothesis that fractal structures can be formed in cities as well as in larger networks is tested, indicating that self-similarity may indeed be found in networks connecting several cities.

The pandemic spread of the COVID-19 virus has changed, in a few weeks, the life of people across the world. The velocity of contamination, the lack of natural immunological defenses in humans and the high mortality rates due to the disease have challenged the health system capacity in many countries and caused the temporary shutdown of almost all everyday activities in urban centers in order to provide the necessary social distancing.
It is clear that social distancing, when it lasts too long, implies some disruption to the socioeconomic structures of societies. Therefore, the discussion about the optimal equilibrium between the two risks, health and socioeconomic, has involved citizens, scientists and politicians in all regions. The most used models to describe the time evolution of an epidemic disease are the SIR model and its variants [1, 2], which divide the population of a region into three classes: the susceptible, S(t), the infected, I(t), and the removed, R(t). The time dependence of the number of individuals in each class is due to the fact that a susceptible person can get infected, and the infected will eventually be removed due to either immunization or death. The time evolution of the three classes is described by the set of equations

$$\frac{dS}{dt} = -\kappa S I, \qquad \frac{dI}{dt} = \kappa S I - \rho I, \qquad \frac{dR}{dt} = \rho I, \qquad (1)$$

which, under the usual conditions of contamination, lead to a function I(t) that correctly describes the increase of the infected population up to a peak of maximum contamination, as well as its slow decrease until no infectious person remains. In Eq. (1), ρ and κ refer respectively to the removal rate of infected people and to the probability per unit of time that one infected subject will transmit the disease to a susceptible one. Although the SIR model is a very useful tool to investigate the time evolution of an epidemic disease, it gives little insight into how the disease evolves in space, while the geographic distribution of the infection is fundamental to plan optimal social distancing programs that effectively protect the health of populations while minimizing their socioeconomic impact. In the present work, we describe a model that can provide a good description of both the spatial and the temporal evolution of epidemic diseases, and therefore can be useful for the design of social distancing policies if future pandemics have to be faced.
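As a reference point for the comparison with the SIR model made later in the text, the SIR dynamics can be integrated numerically. The sketch below is only an illustration: it assumes the standard form dS/dt = −κSI, dI/dt = κSI − ρI, dR/dt = ρI, and the function name `simulate_sir`, the forward-Euler scheme and the parameter values are choices made here, not taken from the paper.

```python
def simulate_sir(s0, i0, r0, kappa, rho, dt=0.01, t_max=100.0):
    """Integrate the standard SIR equations with a forward-Euler scheme.

    kappa: transmission probability per unit time (illustrative value below)
    rho:   removal rate of infected people (illustrative value below)
    Returns a list of (t, S, I, R) tuples.
    """
    s, i, r = float(s0), float(i0), float(r0)
    steps = int(round(t_max / dt))
    history = [(0.0, s, i, r)]
    for n in range(1, steps + 1):
        new_infections = kappa * s * i * dt  # S -> I transfers in this step
        new_removals = rho * i * dt          # I -> R transfers in this step
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((n * dt, s, i, r))
    return history


if __name__ == "__main__":
    # Populations expressed as fractions of the total (an assumption).
    trajectory = simulate_sir(0.999, 0.001, 0.0, kappa=0.5, rho=0.1)
    peak_t, _, peak_i, _ = max(trajectory, key=lambda row: row[2])
    print(f"infected fraction peaks at t = {peak_t:.1f} with I = {peak_i:.3f}")
```

With these hypothetical parameters the infected fraction rises to a single peak and then decays, which is the qualitative behavior of I(t) described above.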
As mentioned above, while the SIR approach can give a good description of the way contamination increases with time in a city, it is not good enough to describe the spread of viruses in a very large population, nor the spread of the virus in a network of interconnected cities, as we will see below. In general, virus contamination requires close contact between the infected person and the susceptible one; hence they need to be at the same place at the same time. Of course, some viruses may survive long enough on surfaces that the close-contact requirement needs to be relaxed in some respects, but here we will study the cases where infection due to long-term survival of the virus is negligible. If close contact is needed, it is clear that one cannot assume that people in a large city have equal probability of meeting any other person in the city, and the same holds for people located in different cities. People organize themselves so as to meet more often a very small group of individuals, which constitutes their family and close friends, colleagues in school or at work, but they will not have contact with most of the population of a large city in their lifespan. Therefore, the basic hypothesis of the SIR model does not hold in those cases. In order to derive a better expression to describe contamination, we initially analyze what happens in small groups and then introduce the hypothesis of fractal cities [3]. In Fig. 1, we depict how one individual interacts with groups of people, and not with the whole population of the city. For the sake of clarity, we use family, relatives and schoolmates as examples of contexts where the social contact takes place, but these are to be considered merely as instances of a large class of social institutions where people interact. We observe that one person usually has very close daily contact with a small group of people, let us say, his/her family.
This person may also have contact with people in other similar groups that are their closest relatives, and the contact may involve all people in his/her group of closest contacts with the other groups, so we may consider one family visiting other families in the group of relatives. In this case, as we will see below, the probability of transmission of the disease refers to the contact between groups, and not between individuals, because on such occasions it is not only one individual who is making contact with the members of the other group but his/her entire group of closest contacts, that is, his/her family is in contact with another family. And there is a small group of families they will visit on a weekly basis. Therefore, as we shift the reference from one individual to a small group, the number of contacts increases but the frequency of contacts decreases, because entire groups interact less often while the number of close contacts is higher each time. The small groups may comprise yet another level of interaction, a larger one now among groups, with whom the initial individual has even less frequent contact. We can find even more levels of contact with larger groups and smaller frequencies, but at some point the frequency of contacts vanishes or becomes negligible. This is the reason why people can live in a large city for many years and still not meet most of its inhabitants. There is a limit to the number of levels that are relevant when we consider the contacts one person maintains during his/her life.

Fig. 1. Schematic view of social contact based on different levels of contact frequency. The red square represents the infected person in the social group. a and b are small groups of close and frequent contacts such as families and colleagues. A and B are larger groupings between which the contact rate is lower than between a and b.
According to the hypothesis of social contact depicted in Fig. 1, the virus transmission will be different from that assumed in the SIR model. People from the same family and those who work together will interact daily, so they have a much higher probability of getting infected than those who contact each other only weekly or less. But another aspect is important as well. Since in the next level the contact happens between entire groups, not only our initial infected person can be a vector for the virus but also, with some probability, his/her closest relatives or colleagues. The same holds when we move to larger groups. Let us say that there is a probability ε of direct contamination by the infected individual; then, after some time, there will be a number of individuals who will be directly infected, and this number is given by Eq. (2), where α is the number of susceptible individuals in the group. If we include secondary contagion, that is, the possibility that the first infected will transmit the virus to a second person who in turn can transmit it to others, then we have to add powers of the term ε which correspond to secondary transmissions, and we get

$$N(\varepsilon) = 1 + \alpha\,\varepsilon + \frac{1}{2}\,\alpha(\alpha - 1)\,\varepsilon^{2} + \cdots + C(\alpha, k)\,\varepsilon^{k} + \cdots + \varepsilon^{\alpha}. \qquad (3)$$

The expression above can be understood as the sum of all contributions of infection by different paths of contamination of each individual in the group. Since it is a binomial expansion, it can be simplified to

$$N(\varepsilon) = (1 + \varepsilon)^{\alpha}. \qquad (4)$$

Let Ñ be the average number of people who are relevant to the contagion, that is, those who have a large enough probability of being infected in the period considered; this quantity enters Eq. (5). In case we have initially N_o uncorrelated infected individuals, that is, individuals in different groups, the number of infected people would increase according to Eq. (6). From Eq.
(6), we can find a relation between the quantity ε and the multiplication factor μ for the virus spreading, that is, the average number of people who will be contaminated by the virus while the first infected person is a potential vector for contamination. μ can be defined from Eq. (7), where N_o is the initial number of infected individuals, and N(ε) is the total number of newly infected people after the period of time considered, during which the initial individual is able to transmit the virus to others. Comparing Eqs. (6) and (7), we get Eq. (8). The meaning of the quantity ε becomes clearer in view of that equation. If τ is the probability of virus transmission in a single contact, then Eq. (9) follows, where ν is the average number of contacts the infected individual makes in his/her closest social group in the period of transmission, and (Ñ − 1)λ is the average number of non-infected people in that social group during the period in which the first individual was transmitting the virus. In terms of the quantities just introduced, Eq. (4) gives Eq. (10). All the discussion above was made while keeping in mind only the closest social group. It is necessary to go further, since it is common that an infected person has contact with larger groups, and this is the main cause of epidemic transmission. To include larger groups, we will retain our model of social interaction, as depicted in Fig. 1, and introduce the fractal hypothesis, which will allow us to obtain recurrence formulas that facilitate the description of larger social groups.

The formation and growth of cities have been a long-standing concern, but more rigorous scientific approaches applied to the social sciences have been possible only after the advent of large databases providing information on different social aspects of human life. Although still mostly phenomenological, new models that explain different aspects of socioeconomic activities and human interactions on more solid grounds have appeared.
These models allow the discovery of patterns that, before the recent developments, were unsuspected [4-9], and trigger the emergence of a science of cities enabled by accurate information about how people live and interact. These studies are now possible due to the advances in information technology. The scientific approach to understanding the organic evolution of cities and how people live in the urban environment has produced some promising results, revealing patterns that apparently are valid across the globe. These patterns are emergent characteristics of cities [10, 11], since they were not planned in any way, and therefore reflect fundamental aspects of human activity in urban centers that, in a coarse-grained analysis, are determinant in shaping urban structures and the way people interact in the urban environment. One fascinating finding is the fact that many aspects of urban life and city infrastructure increase as a power-law function of the city population, that is

$$N = N_o\,\nu^{\beta}, \qquad (11)$$

where N is the quantity of a measurable feature of the city, such as the number of gas stations, the number of patents produced or the number of squares in the city, while ν is the city population, with N_o and β being constants to be determined from data analysis, the first one representing a reference value for the quantity measured and the second one being the slope of the increase of log N with log ν. It turns out that the exponent β is in many cases different from unity, a result that shows that the usual per capita or linear method used to compare socioeconomic or infrastructural data among different cities or countries is not necessarily the best form of analysis. More interestingly, studies have shown that when N refers to an infrastructure quantity, the exponent is β < 1, resulting in a sublinear increase of this infrastructure with the city size.
On the other hand, when N refers to a measurement of a human activity, such as wages or crimes, the increase is superlinear, with β > 1. Surprisingly, the exponents turn out to be always around the values β = 0.85 in the first case and β = 1.15 in the second, independently of other characteristics such as culture, ethnicity or the country where the city is located, while N_o may vary more significantly from one place to another [3, 12, 13]. The theoretical assumption underlying the power-law behavior given by Eq. (11) is that cities (and their aggregations) are constituted of networks through which information, goods and people flow. The interconnections between human interactions and city infrastructure, therefore, determine how people live in the cities and how cities grow shaped by human needs [11, 14]. Hence, the power law emerges as a consequence of scaling properties common to all cities in a given country [5].

In order to verify whether the fractal hypothesis can hold for the COVID-19 pandemic spread, we analyze the number of infected individuals in a region or city as a function of its population. If the power-law behavior is indeed obtained with its exponent around the expected value, β = 1.15, it may show that the virus contamination follows a geographical distribution similar to the flux of people between cities, provinces or states. This information may be of great value for the design of the best strategy for containment of the virus, while minimizing the side effects on the socioeconomic activities inside the country. This verification is based on fitting the collected data with Eq. (11) through the ROOT analysis platform [22], obtaining the two fitting parameters, N_o and β. The results are shown in Fig. 2, and they suggest the existence of fractal scaling across continents and across scales, in this case from state to country. The analysis performed above shows that the fractal hypothesis can be valid for the COVID-19 disease propagation.
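The two-parameter fit of the power law N = N_o ν^β becomes a straight-line fit after taking logarithms, log N = log N_o + β log ν. The sketch below illustrates this with NumPy on synthetic data generated here for the purpose; it is not the ROOT-based fit of the paper, it does not use the paper's data, and the function name `fit_power_law` and all numerical values are hypothetical.

```python
import numpy as np


def fit_power_law(population, cases):
    """Least-squares fit of cases = N_o * population**beta in log-log space.

    Returns the pair (N_o, beta).
    """
    log_pop = np.log(np.asarray(population, dtype=float))
    log_cases = np.log(np.asarray(cases, dtype=float))
    # A degree-1 polynomial fit returns (slope, intercept) = (beta, log N_o).
    beta, log_n0 = np.polyfit(log_pop, log_cases, 1)
    return float(np.exp(log_n0)), float(beta)


if __name__ == "__main__":
    # Synthetic cities (assumption): populations from 1e4 to 1e7,
    # cases following a power law with multiplicative log-normal noise.
    rng = np.random.default_rng(0)
    population = np.logspace(4, 7, 30)
    true_n0, true_beta = 1e-3, 1.15
    noise = rng.normal(0.0, 0.1, size=population.size)
    cases = true_n0 * population**true_beta * np.exp(noise)

    n0, beta = fit_power_law(population, cases)
    print(f"fitted N_o = {n0:.3g}, fitted beta = {beta:.3f}")
```

Fitting in log-log space weights all population scales equally, which is one common choice; a weighted or nonlinear fit could give slightly different parameters on real data.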
The fractal hypothesis is that large groups reproduce what is observed in the smallest social group by keeping the ratio ν/λ. This is a strong hypothesis that can be proved only by observing whether the model is able to describe correctly the spread of the virus over large populations. The basic concept behind this fractal hypothesis is that when one changes the scale from the closest social group, such as the groups a and b in Fig. 1, to a higher social grouping, such as the groups A and B, then each smaller group a and b may be considered, for the sake of virus transmission, as a single agent of transmission, in the same way a person was an agent of transmission. This may happen because the transmission rate is high enough to infect most of the individuals in a close contact group, or because the contact between the different groups in the higher level involves most of the individuals of the smaller social group. In this condition, Ñ represents the number of small groups that have frequent contact, perhaps on a weekly basis, with the infected group, and we suppose that this number is, on average, the same as the number of people in one small group. We introduce the index q through [...], which will lead to a well-known distribution associated with Tsallis statistics. Also, the average number of contacts per non-infected group keeps the same ratio ν/λ, which we assume to be constant throughout the large population in a region that can be enclosed by the fractal network of social contacts. As mentioned above, this network can be as large as an entire country. The meaning of the quantity λ can now be clarified. It is a scale variable that indicates the number of independent groups into which a population with ν individuals can be divided.
Independent group, here, is intended to mean a group of people who interact as a single set with other social groups having a number of individuals of the same order, i.e., keeping some specific feature under study, like the virus infection, more or less uniform when compared to the other groups, as the groups A and B in Fig. 1. These more or less uniform groups are called agents in some models. In these groups, we can include families or coworkers, in a scale of units, enlarged family groups or small companies in the scale of tens, neighborhoods or large companies in the scale of hundreds, small cities in the scale of thousands, and so on. Then we can write Eq. (11) as [...]. Observe that the result is not an exponential function, so the increase in the number of infected people with the population size is smaller than what would be expected from the SIR model. This happens because in the present case the different possibilities of contact between individuals in different groups were taken into account. The fractal scale parameter λ determines the size of the agents in the population one wants to describe. If one is studying one family, the agent is the individual, and λ = λ_o = 1. If one wants to study a city, the agent can have the size of its main centers. In a network, the number of agents, Ñ, as well as q, is constant, and [...]. We observe that for ν/λ_o ≫ 1, the multiplication factor can be well approximated by [...], which is the known power-law behavior observed in cities. In the application of the power law to study the urban environment, it was found that the exponent α = (1 − q)^(−1) = 1.15, resulting in q = 0.13. Using the relation between α and (1 − q) in [...]. Keeping just terms up to 2nd order in τ/λ_o and using the definition of the multiplication rate in Eq. (7), we find [...] and obtain [...]. If we suppose the closest contact group to be formed by Ñ ∼ 4 individuals, and considering that we are counting numbers of people, so λ = λ_o, and considering μ ∼ 3, then τ ∼ 0.3.
This number represents the probability that an infected person will transmit the virus directly to another one in his/her family in the period in which that person is a potential vector of infection. Of course, our model will be useful only if we can make the calculation for large groups, not only for families. To do so, we use the scale-free variable ν/λ and apply the hypothesis of fractal structure to the dissemination of the virus. According to this assumption, τ must be independent of the scale λ, which means that the probability that a family will transmit the virus to another family is equal to the probability of transmission from one person to another. The value for τ obtained above corresponds to a family group, so we can estimate λ = 5 as the size of a typical family. However, it is convenient to fix the scale to λ = λ_o = 1, that is, although we will be dealing with large groups, they will be characterized by the number of inhabitants, and therefore we use τ_o ≈ 0.3, giving the virus transmission probability per person. So we finally get, from Eq. (13), that [...]. The description above gives the distribution of infected people by group of people, or agents, and assumes that the probability is scale-free. In addition, the assumptions made are valid as long as only one person in the group initiates the contamination in that group. This restriction imposes a maximum value for the scale, λ = λ_m, since at some point, in real cases, groups where the contamination is initiated by different individuals will come into contact. Since it is difficult to determine the value of λ_m, it is usually left as a free parameter in fitting to data, and the ratio β = τ/λ_m cannot be disentangled only by fitting data. Despite this difficulty, we can proceed to obtain a description of the time evolution of the contamination. We will, as before, assume fractal properties of urban life. Notice that τ is the transmission probability in the period in which the initial infected person is a vector of transmission.
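The relation α = (1 − q)^(−1) between the urban-scaling exponent and the index q, and the way a q-exponential turns into a power law for large arguments, can be checked numerically. The sketch below uses the standard Tsallis q-exponential, e_q(x) = [1 + (1 − q)x]^(1/(1 − q)); treating the growth factor as a q-exponential of a generic argument x is an assumption made here for illustration, since the exact expression of the paper is not reproduced in the text, and the function names are hypothetical.

```python
import math


def q_from_exponent(alpha):
    """Invert alpha = 1 / (1 - q) to obtain the index q."""
    return 1.0 - 1.0 / alpha


def q_exponential(x, q):
    """Tsallis q-exponential, e_q(x) = [1 + (1 - q) x]^(1 / (1 - q)), for q <= 1."""
    if q == 1.0:
        return math.exp(x)  # ordinary exponential, recovered in the limit q -> 1
    base = 1.0 + (1.0 - q) * x
    if base <= 0.0:
        return 0.0  # usual cutoff convention for q < 1
    return base ** (1.0 / (1.0 - q))


def log_log_slope(x, q, factor=10.0):
    """Slope of log e_q versus log x, estimated between x and factor * x."""
    ratio = q_exponential(factor * x, q) / q_exponential(x, q)
    return math.log(ratio) / math.log(factor)


if __name__ == "__main__":
    q = q_from_exponent(1.15)
    print(f"q = {q:.2f}")
    # The local log-log slope approaches 1 / (1 - q) as x grows.
    for x in (1.0, 1e2, 1e4, 1e6):
        print(f"x = {x:g}: local slope = {log_log_slope(x, q):.3f}")
```

For α = 1.15 the first function returns q ≈ 0.13, the value quoted in the text, and the local slope tends to α for large x, which is the power-law regime referred to above.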
We can include the time variable in our description by assuming that the contacts between individuals in the same group happen randomly in that period, so that τ = κt, with κ being a contamination constant. Let us introduce the period for one contamination as [...], which is specific for a scale λ, with n being an integer. From this expression we get the interval Δt, [...], which gives the average time between consecutive spikes of contamination in groups of people at the scale λ, in a region with population ν. Observe that the intervals between spikes are proportional to the size of the agent we are considering, given by λ, for a given population ν, and each spike in the contamination rate corresponds to a new agent with size around the scale λ which is contaminated. Once the contamination starts in an agent, one has to go to a smaller scale to describe it, until the fundamental scale λ_o is reached. If t_o = 0 is the moment when the first infected person gets into the group, then the spikes of the order of λ in the number of newly infected people occur at t_n = t_o + nΔt, with n = 1, 2, ..., and the cumulative number of infected people increases as an approximate step function. If one wants to describe the evolution of contamination in greater detail, then λ is reduced, and the spikes become smaller and more frequent, at intervals Δt. The cumulative number tends to a smoother function of time. We can include the effects of the reduction of susceptible people in the contamination, developing a sort of non-extensive SIR model. To do so, we first separate the population into the number of infected, i(t), and the number of susceptible, σ(t). Observe that there is no distinction between infected and removed in our model. This can be done because the limited time for infection is already incorporated in the model by the fact that the index q is a constant different from unity, which means that α is constant and finite.
If the removed individuals were not taken into account, the number of people directly infected by the first infectious person would increase and make α → ∞, which would lead to the exponential behavior observed in the SIR model. In these terms, the equation of disease transmission above becomes Eq. (22). Observe that t_λ is the time measured according to the scale λ, as discussed above, and we are supposing that t_λ = 0 when the first i_o infected individuals appeared in the group. Differentiating that expression with respect to t_λ gives [...], where σ̇ is the time derivative of the susceptible population. Here we consider that σ(t_λ) ≫ t_λ σ̇(t_λ), which is possible because t_λ is limited to the period during which the first agent in the group is a vector of transmission, and in most cases σ(t_λ) can be approximated, in the period of time considered, by a straight line. With this approximation we obtain [...]. To complete the model, we follow the SIR approach and describe the variation of the susceptible population by imposing that the total population is constant in the time period relevant to the problem, so [...], and we get [...]. The model developed here results in a set of two coupled differential equations, since [...], which are to be compared to Eqs. (1). The main difference with respect to the SIR equations is the power q in the infected population. This power is a characteristic feature of the Tsallis version of non-extensive statistics, where the probabilities acquire this power to depart from the additive property of entropy that is valid in Boltzmann-Gibbs statistics. The solutions to the coupled equations are [...]. It is clear that the solutions above are recursive, a feature that is common to the description of fractal systems.
An approximate solution can be easily obtained by noticing that, in most cases of interest, we can approximate [...], so that the equation [...] is a separable equation resulting in [...]. Integrating both sides we get [...]. This equation results in [...]. Observe that the equation above can be conveniently written as [...], and defining [...] we obtain [...], that is, the susceptible population follows a negative q-exponential function of the modified time variable, θ. As already mentioned, we do not need to treat the removed population separately from the infected one; however, it is convenient to divide the infected population into infectious and removed. We can obtain the removed population by considering that after a time Δt_λ all the infectious will be removed either by immunization or death, so the removed population can be obtained by [...], with r(t_λ) = 0 if t_λ − Δt_λ ≤ 0. Then, the infectious population is i_n(t_λ) = i(t_λ) − r(t_λ). The infected population according to the SIR model corresponds to i_n(t) in our model. We have implemented the non-extensive SIR model in the Python programming language, with the parameters q, κ and λ set to constant values and the initial number of infected individuals set to i_o = 10^2. In Fig. 3, one can see the numerical solution for two population sizes. We observe that in about 10 days the number of infectious people goes to zero. Notice that the ratio κ/λ defines the time scale for the complete contamination of the population, and is roughly independent of the population size. In fact, in Fig. 3(b) the population is 100 times larger than the population in Fig. 3(a), but the contamination is mostly over in the same time period. The appropriate use of the model implies scaling λ with the population, which is equivalent to extending the contamination time. Another possibility is to add the contributions of different population groups of the same size, with initial times of contamination spread over a larger range of time.
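A numerical experiment in the spirit of the one described above can be sketched as follows. The coupled equations are not recoverable from the text here, so the code assumes the form di/dt = (κ/λ) i^q σ with dσ/dt = −di/dt, chosen only because the text states that the departure from SIR is the power q in the infected population and that the total population is constant. It also assumes that the removed population is the infected population delayed by a fixed removal time, r(t) = i(t − Δt). The function name and every parameter value other than i_o = 10^2 are hypothetical, and the sketch is not the authors' implementation.

```python
def simulate_nonextensive_sir(population, i0, q, kappa, lam, removal_delay,
                              dt=0.01, t_max=30.0):
    """Forward-Euler integration of an ASSUMED non-extensive SIR form:

        di/dt     = (kappa / lam) * i**q * sigma
        dsigma/dt = -di/dt
        r(t)      = i(t - removal_delay), and 0 while t <= removal_delay

    Returns a list of (t, sigma, i, r, i_n) tuples, with i_n = i - r.
    """
    rate = kappa / lam
    steps = int(round(t_max / dt))
    delay_steps = int(round(removal_delay / dt))
    infected = [float(i0)]                   # cumulative infected, per step
    sigma = float(population) - float(i0)    # susceptible population
    rows = []
    for n in range(steps + 1):
        i_now = infected[n]
        removed = infected[n - delay_steps] if n > delay_steps else 0.0
        rows.append((n * dt, sigma, i_now, removed, i_now - removed))
        # New cases in this step, never exceeding the remaining susceptibles.
        new_cases = min(rate * i_now**q * sigma * dt, sigma)
        sigma -= new_cases
        infected.append(i_now + new_cases)
    return rows


if __name__ == "__main__":
    # Two population sizes differing by a factor 100 (values are illustrative).
    for population in (1e5, 1e7):
        rows = simulate_nonextensive_sir(population, i0=100.0, q=0.13,
                                         kappa=0.2, lam=1.0, removal_delay=5.0)
        peak_t, _, _, _, peak_in = max(rows, key=lambda row: row[4])
        print(f"population {population:g}: infectious peak {peak_in:.3g} "
              f"at t = {peak_t:.1f}")
```

Running it for two population sizes gives a rough way to examine the remark that, at fixed κ/λ, the contamination time depends only weakly on the population size; how closely this matches Fig. 3 depends on the assumed equations and parameters.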
This model is compared to the real case observed in the State of São Paulo, Brazil, in Fig. 4 and to data for Europe in Fig. 5. The parameters used to describe the data for Europe are presented in Table 1. The population of the cities around the center is chosen so as to give the best description of the available data. For Europe, all cities around the center have a population of 1.3 × 10^4 individuals. For São Paulo, 3 levels of cities were required, with populations progressively decreasing as fractions of the population at the center. The fractions are 1/7, 1/15 and 1/40. The model not only reproduces very closely the number of infected individuals but also matches to a good extent the periodicity with which new population centers enter the larger set of infected people. Notice that in the case of São Paulo the description includes more levels of the fractal structure in the network of cities than in the case of Europe. We observe that a better description is obtained in the second case, both in the daily increase of cases and in the cumulative data. The process of adjusting parameters to a fractal structure is not trivial. The usual method to investigate the fractal spectrum of dimensions is the intermittency method [23, 24]. To our knowledge, no work has been done on such an analysis for the spread of COVID-19.

In the present work, we investigated the existence of signatures of fractality in the spread of COVID-19 in China, the USA, the State of São Paulo and Europe. We obtain the characteristic power-law behavior, indicative of scaling properties in the epidemic process. A model based on the fractal structure of cities is proposed, and with this model we can describe not only the contamination rates for different populations, but also the time evolution of the contamination in each region. One important characteristic of the model is the definition of a probability of contamination that follows the q-exponential function common in Tsallis statistics [25].
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

A contribution to the mathematical theory of epidemics
Mathematical biology. An introduction. 3rd ed.
Fractal cities: a geometry of form and function
The new science of cities
A unified theory of urban living
Cities, from information to interaction
Urban scaling and its deviations: revealing the structure of wealth, innovation and crime across cities
Bigger cities do more with less
Growth, innovation, scaling, and the pace of life in cities
The size, scale, and shape of cities
Scale: the universal laws of life, growth, and death in organisms, cities and companies
Scaling laws for the movement of people between locations in a large city
The scaling of human interactions with city size

This work was supported by the Conselho Nacional de