key: cord-0984229-g6bevqxq authors: Ahundjanov, B. B.; Akhundjanov, S. B.; Okhunjanov, B. B. title: Power Law in COVID-19 Cases in China date: 2020-07-27 journal: nan DOI: 10.1101/2020.07.25.20161984 sha: 9dd5a9e01fb8bd7ed37293a86ed4d994245c57f4 doc_id: 984229 cord_uid: g6bevqxq

The novel coronavirus (COVID-19) was first identified in China in December 2019. Within a short period of time, the infectious disease has spread far and wide. This study focuses on the distribution of COVID-19 confirmed cases in China, the original epicenter of the outbreak. We show that the upper tail of COVID-19 cases in Chinese cities is well described by a power law distribution, with exponent less than one, and that a random proportionate growth model predicated by Gibrat's law is a plausible explanation for the emergence of the observed power law behavior. This finding is significant because it implies that the distribution of COVID-19 cases in China is heavy-tailed and disperse, that a few cities account for a disproportionate share of COVID-19 cases, and that the distribution has no finite mean or variance. The power-law distributedness has implications for effective planning and policy design as well as efficient use of government resources.

, farmland (Akhundjanov and Chamberlain, 2019), city size (Krugman, 1996; Gabaix, 1999; Ioannides and Overman, 2003; Devadoss et al., 2016), natural gas and oil production (Balthrop, 2016), carbon dioxide (CO2) emissions (Akhundjanov et al., 2017), frequency of words (Zipf, 1949; Irmay, 1997), among others. Our paper finds the existence of a power law in epidemiology as well. The omnipresence of power laws is partly explained by the fact that they are preserved over an extensive array of mathematical transformations (Gabaix, 2009).
An interesting aspect of the power law distribution is that it is a macro-level steady-state phenomenon that, in theory, can arise from a micro-level random proportionate growth process, known as Gibrat's law (Gibrat, 1931),[2] whereby each unit's (e.g., city's) growth rate is drawn randomly and independently of its current size.[3] Because power law and Gibrat's law often go hand in hand, Gibrat's law has also been extensively documented in the social and natural sciences.[4] The robust fit of a power law to the cross-sectional distribution of COVID-19 cases in Chinese cities potentially provides macro-level evidence for the random proportionate growth posited by Gibrat's law. However, it is well known that power laws can similarly be obtained from other models and systems (Barabási and Albert, 1999; Carlson and Doyle, 1999; Mitzenmacher, 2004; Newman, 2005; Gabaix, 2016). Therefore, we formally test for random proportionate growth at the micro level by analyzing growth rates of COVID-19 cases in Chinese cities. Our empirical analysis provides support for Gibrat's law of proportionate growth, which, in turn, offers a plausible explanation for the emergence of power law behavior in the data.

[2] For a detailed review of Gibrat's law, see Sutton (1997).

[3] Gibrat's law alone is not sufficient to give rise to a power law. In fact, it leads to the lognormal distribution, as shown by Gibrat (1931), though many examples used by Gibrat (1931) have recently been shown to actually follow a Pareto-type distribution rather than the lognormal (Akhundjanov and Toda, 2020). Nonetheless, Gibrat's law can generate a power law with an auxiliary assumption (Gabaix, 1999). Section 4.1 elaborates on the link between Gibrat's law and power law.
[4] In particular, Gibrat's law has been shown to explain the growth process of consumption (Battistin et al., 2009), firms (Luttmer, 2007), farms (Clark et al., 1992), the trucking industry (Balthrop, 2020), cities (Ioannides and Overman, 2003; Eeckhout, 2004; Luckstead and Devadoss, 2014), countries (Rose, 2006; González-Val and Sanso-Navarro, 2010), and bird populations (Keitt and Stanley, 1998), among others.

There are a number of practical implications of the power law fit. First, given that the estimated Pareto exponent is less than one ($\gamma < 1$), the distribution is heavy-tailed and so disperse that observations near the mean account for little of the cumulative distribution of COVID-19 cases. This implies that the average number of COVID-19 cases is uninformative, as it no longer represents the majority of cases. In fact, even though it is possible to compute the sample mean and variance for the observed data, these moments are generally non-convergent. In this case, quantile analysis or order statistics would be more appropriate. Second, the heavy upper tail of the distribution (along with the confirmation of Gibrat's law) is suggestive of a concentration of COVID-19 cases in China, with the total cases essentially determined by a few cities that bore the brunt of the outbreak, which is indeed the case in China (Han et al., 2020). This has implications for more effective epidemi-

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license. This version posted July 27, 2020. https://doi.org/10.1101/2020.07.25.20161984

spread, peak, and decline to zero daily cases.
In contrast, the analyses of Beare and Toda (2020) and Blasius (2020) are based on data sets that were still evolving at the time, as both the United States of America and a host of other countries are still battling to contain the spread of the virus to this date. Thus, the results of the above studies are likely subject to change with newer data.

The remainder of the paper is structured as follows. Section 2 introduces the data for COVID-19 cases in China. Section 3 presents the methods and findings for power law analysis. Section 4 provides the methods and results for Gibrat's law analysis. Section 5 includes some concluding remarks.

Daily data on the cumulative number of COVID-19 confirmed cases for Chinese cities come from Harvard Dataverse (China Data Lab, 2020). The dataset includes 339 cities in China and covers the period from January 15, 2020, to May 23, 2020, as determined by the data source. Our main (power-law) analysis focuses on COVID-19 cases as of May 23, 2020, the latest data on cumulative cases, while an auxiliary analysis potentially explaining the emergence of power law behavior uses the data between January 15, 2020, and May 23, 2020. A power law analysis is data intensive, with Clauset et al. (2009) recommending a minimum of 50 observations for reliable analysis. This condition is well satisfied here, including for the upper tail (see Section 3.3).

Figure 1 shows the evolution of the empirical distribution of COVID-19 cases in Chinese cities over select dates. The distribution is right-skewed, with a heavy right tail, and it has gradually shifted rightward, which reflects the increasing number of COVID-19 cases across Chinese cities over time.

In this section, we study the distribution of the cumulative number of COVID-19 confirmed cases for Chinese cities. We first present the methodology for power law analysis, followed by estimation results.
Suppose $X$ is a random variable whose data generating process is a continuous power law (Pareto) distribution. The corresponding probability density function (PDF) is specified as

$$f(x) = \frac{\alpha - 1}{x_{\min}} \left( \frac{x}{x_{\min}} \right)^{-\alpha}, \tag{1}$$

where $x$ is an outcome of $X$ for $x \in \mathbb{R}_{+}$, where $\mathbb{R}_{+} = \{x \in \mathbb{R} \mid x > 0\}$, $x_{\min}$ is the threshold beyond which (i.e., $x \geq x_{\min}$) power-law behavior sets in, and $\alpha$ is the power-law (Pareto) exponent, the parameter of interest. The $m$th non-central moment of the power law distribution is given by

$$E[X^m] = \int_{x_{\min}}^{\infty} x^m f(x)\, dx = \frac{\alpha - 1}{\alpha - 1 - m}\, x_{\min}^{m}, \quad m < \alpha - 1. \tag{2}$$

Hence the $m$th moment exists only for $m < \alpha - 1$. Although higher-order moments can be calculated for any finite sample, these estimates do not asymptotically converge to any particular value.

Given the sample $x_1, \ldots, x_n$, the joint log-likelihood function can be written as

$$\ln L(\alpha \mid x) = n \ln(\alpha - 1) - n \ln x_{\min} - \alpha \sum_{i=1}^{n} \ln \frac{x_i}{x_{\min}}. \tag{3}$$

The first-order condition yields the maximum likelihood estimate (MLE)

$$\alpha_{MLE} = 1 + n \left[ \sum_{i=1}^{n} \ln \frac{x_i}{x_{\min}} \right]^{-1}, \tag{4}$$

with the standard error (SE) of the estimate given by

$$SE(\alpha_{MLE}) = \frac{\alpha_{MLE} - 1}{\sqrt{n}}. \tag{5}$$

It is standard to report the counter-cumulative parameter $\gamma = \alpha_{MLE} - 1$, known as the Hill estimator (Hill, 1975), instead of (4). The Hill estimator is obtained from (4), after a small-sample adjustment, and takes the following form

$$\gamma_{Hill} = (n - 2) \left[ \sum_{i=1}^{n} \ln \frac{x_i}{x_{\min}} \right]^{-1}, \tag{6}$$

with the standard error of the estimate given by

$$SE(\gamma_{Hill}) = \frac{\gamma_{Hill}}{\sqrt{n - 3}}. \tag{7}$$

The power-law fit to data is depicted by plotting the counter- (complementary) cumulative distribution function (CDF) on doubly logarithmic axes. The counter-CDF of a power law is specified as

$$\text{Prob}(X > x) = \left( \frac{x}{x_{\min}} \right)^{-(\alpha - 1)} = k\, x^{-\gamma}, \tag{8}$$

where $k = x_{\min}^{\alpha - 1}$ is a constant.
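The maximum likelihood and Hill estimators described above reduce to a few lines of arithmetic once the threshold is fixed. The following is a minimal sketch, not the authors' code: it assumes the standard closed-form expressions (the MLE with standard error $(\alpha_{MLE} - 1)/\sqrt{n}$, and the small-sample-adjusted Hill estimator with $n - 2$ in the numerator and standard error $\gamma_{Hill}/\sqrt{n - 3}$). The function names `fit_power_law_tail` and `sample_pareto` are hypothetical, and the synthetic sample only stands in for real data.

```python
import numpy as np


def fit_power_law_tail(x, x_min):
    """Estimate the tail exponent from observations with x >= x_min.

    Returns (alpha_mle, se_alpha, gamma_hill, se_gamma). The formulas are the
    standard closed forms and are an assumption of this sketch.
    """
    x = np.asarray(x, dtype=float)
    tail = x[x >= x_min]
    n = tail.size
    if n < 4:
        raise ValueError("need at least 4 tail observations")
    log_ratio_sum = np.sum(np.log(tail / x_min))
    if log_ratio_sum <= 0.0:
        raise ValueError("tail observations must not all equal x_min")

    alpha_mle = 1.0 + n / log_ratio_sum          # MLE of the Pareto exponent
    se_alpha = (alpha_mle - 1.0) / np.sqrt(n)    # its standard error
    gamma_hill = (n - 2) / log_ratio_sum         # small-sample-adjusted Hill estimate
    se_gamma = gamma_hill / np.sqrt(n - 3)       # its standard error
    return alpha_mle, se_alpha, gamma_hill, se_gamma


def sample_pareto(n, gamma, x_min, seed=0):
    """Draw n synthetic Pareto observations by inverting the counter-CDF."""
    rng = np.random.default_rng(seed)
    u = rng.random(n)                            # uniform on [0, 1)
    return x_min * (1.0 - u) ** (-1.0 / gamma)


if __name__ == "__main__":
    synthetic = sample_pareto(5000, gamma=0.8, x_min=10.0, seed=1)
    print(fit_power_law_tail(synthetic, x_min=10.0))
```

On a synthetic sample with a known exponent, the Hill estimate should land within a few standard errors of the true value, which is a quick sanity check before applying the estimator to real counts.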
Taking the log of both sides of (8) yields a linear relationship between the log counter-cumulative probability (i.e., $\ln \text{Prob}(X > x)$) and the log data (i.e., $\ln x$), with $-\gamma$ being the slope of the line.

An alternative approach to estimating the counter-cumulative parameter $\gamma$ is a regression-based technique. Specifically, estimate the following regression equation with ordinary least squares (OLS)

$$\ln\left(rank_i - \tfrac{1}{2}\right) = \varphi - \gamma_{OLS} \ln x_i + \varepsilon_i, \tag{9}$$

where $rank_i$ is observation $i$'s rank in the distribution, $\varphi$ is the intercept term, $\gamma_{OLS}$ is the parameter of interest, and $\varepsilon_i$ is the idiosyncratic disturbance term. Equation (9) also shows that a power law distributed process appears approximately linear on a log-log plot of $rank_i$ against $x_i$, with a slope of $-\gamma_{OLS}$. The asymptotic standard error for $\gamma_{OLS}$ is given by (Gabaix and Ibragimov, 2011)

$$SE(\gamma_{OLS}) = \gamma_{OLS} \left( \frac{2}{n} \right)^{1/2}. \tag{10}$$

An important consideration in power law analysis is the specification of the threshold parameter $x_{\min}$, beyond which power-law behavior takes hold. Several approaches have been proposed in the literature. For instance, one strand of literature suggests selecting $x_{\min}$ arbitrarily, at either the 95% quantile of the data (Gabaix, 2009) or the point where the empirical PDF or CDF roughly straightens out on a log-log plot. Both of these approaches are rather subjective and thus leave uncertainty about whether they capture the true starting point of power-law behavior. In fact, Perline (2005), exploring the empirical consequences of this concern, shows that sufficiently truncated Gumbel-type distributions (e.g., the lognormal) can also produce a linear pattern on a log-log plot, hence imitating the power law distribution.
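The regression-based estimator described above can be sketched as follows. This is an illustrative implementation under stated assumptions rather than the authors' code: it assumes the rank − 1/2 form of the dependent variable (Gabaix and Ibragimov, 2011), assigns rank 1 to the largest observation, and uses the asymptotic standard error $\gamma_{OLS}\sqrt{2/n}$. The function name `rank_minus_half_ols` is hypothetical.

```python
import numpy as np


def rank_minus_half_ols(x, x_min):
    """OLS estimate of the counter-cumulative exponent from a log rank-size regression.

    Regresses ln(rank - 1/2) on a constant and ln(x) for observations with
    x >= x_min and returns (gamma_ols, asymptotic standard error).
    """
    x = np.asarray(x, dtype=float)
    tail = np.sort(x[x >= x_min])[::-1]           # descending: largest value has rank 1
    n = tail.size
    rank = np.arange(1, n + 1)
    design = np.column_stack([np.ones(n), np.log(tail)])
    coef, *_ = np.linalg.lstsq(design, np.log(rank - 0.5), rcond=None)
    gamma_ols = -coef[1]                          # slope is -gamma
    se_gamma = gamma_ols * np.sqrt(2.0 / n)       # not the usual OLS standard error
    return gamma_ols, se_gamma
```

The conventional OLS standard error understates the uncertainty here because ranks are correlated by construction, which is why the sketch returns the asymptotic expression instead.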
Consequently, we adopt a more systematic, data-driven procedure proposed by another strand of literature (Clauset et al., 2009) to select $x_{\min}$. This approach essentially treats each observation in the sample as a potential candidate for $x_{\min}$ and selects the best candidate based on the minimization of the Kolmogorov-Smirnov (KS) goodness-of-fit statistic, which is given by

$$KS = \max_{x \geq x_{\min}} \left| E(x) - \hat{F}(x) \right|, \tag{11}$$

where $E(x)$ is the empirical CDF and $\hat{F}(x)$ is the estimated power-law CDF. The optimal $x_{\min}$ minimizes the distance between the empirical CDF and the estimated power-law CDF. The computational algorithm takes the following form:

Step 1: Set $x_{\min} = x_1$;
Step 2: Perform power-law parameter estimation using $x \geq x_{\min}$;
Step 3: Compute the KS statistic in (11);
Step 4: Repeat Steps 1-3 with $x_{\min} = x_i$ for $i = 2, \ldots, n$;
Step 5: Select the $x_{\min}$ with the lowest KS statistic.

Power-law analysis is accompanied by a series of diagnostic tests because significant parameter estimates alone do not provide sufficient evidence in favor of a power-law fit to data. In order to guard against potential misspecification issues, one needs to conduct a goodness-of-fit test and compare the power-law fit to data with those of alternative distributions. Gabaix and Ibragimov (2011) proposed the 'rank − 1/2' test to verify the goodness of fit of the power law distribution. Let $x^{*}$ be defined as

$$x^{*} = \frac{\text{cov}\left( (\ln x_i)^2, \ln x_i \right)}{2\, \text{var}(\ln x_i)}. \tag{12}$$

Then, regress the bias-adjusted log rank against the log data and a quadratic deviation term, as in

$$\ln\left(rank_i - \tfrac{1}{2}\right) = \varphi - \zeta \ln x_i + q \left( \ln x_i - x^{*} \right)^2 + \varepsilon_i. \tag{13}$$

The goodness-of-fit statistic is specified as $q/\zeta^2$. The null hypothesis of power-law distributedness is rejected if $q/\zeta^2 > 1.95\,(2n)^{-1/2}$, where the latter term is the goodness-of-fit threshold. Further, Clauset et al.
(2009) suggest comparisons of the power-law fit with those of other, competing, heavy-tailed distributions, such as the lognormal and exponential. Accordingly, we fit these alternative distributions to the data by MLE and provide visual comparisons of the distributions' fits on a doubly logarithmic plot as detailed above. We also implement the likelihood ratio test of Clauset et al. (2009) for a more formal comparison. The likelihood ratio statistic is specified as

$$R = \sum_{i=1}^{n} \left[ \ln \hat{f}_1(x_i) - \ln \hat{f}_2(x_i) \right], \tag{14}$$

where $\hat{f}_1(x_i)$ and $\hat{f}_2(x_i)$ are the probabilities predicted by the power law and an alternative distribution, respectively. If the likelihood ratio statistic is positive, it indicates that the power law distribution fits the data more closely. If it is negative, then the alternative distribution yields a better fit.

The methods discussed in Section 3 are applied to the cumulative number of COVID-19 confirmed cases in Chinese cities ($x$) as of May 23, 2020. The results from the power law analysis are provided in Tables 1-2 and Figure 2. As noted earlier, the requirement placed on sample size for credible power law analysis is a minimum of 50 observations (Clauset et al., 2009). This condition is well satisfied here, as the upper-tail sample ($x > x_{\min}$) contains 151 observations. The Hill and OLS estimates of the counter-cumulative parameter $\gamma$ are around 0.80 and highly statistically significant. Because moments of order $m$ exist only for $m < \gamma \approx 0.80$, the moments of the fitted power law distribution (including the mean and variance) are generally non-convergent. The goodness-of-fit test of Gabaix and Ibragimov (2011) suggests that we fail to reject the null hypothesis of power-law distributedness, which provides strong evidence in favor of a power law fit to COVID-19 cases in China.
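For reference, the KS-minimizing threshold search of Clauset et al. (2009) that defines the upper-tail sample can be sketched as below. This is a simplified illustration under stated assumptions, not the authors' implementation: each distinct data value is tried as a candidate threshold, the exponent is re-estimated by MLE for each candidate, and candidates leaving fewer than `min_tail` observations are skipped (the default of 50 mirrors the sample-size recommendation cited in the text, but enforcing it inside the scan is a choice made here). The function names are hypothetical, and the data are assumed to be strictly positive.

```python
import numpy as np


def ks_for_threshold(sorted_x, x_min):
    """Fit a power law to x >= x_min by MLE; return (KS distance, gamma, tail size)."""
    tail = sorted_x[sorted_x >= x_min]
    n = tail.size
    log_sum = np.sum(np.log(tail / x_min))
    if log_sum <= 0.0:                        # degenerate tail: every value equals x_min
        return np.inf, np.nan, n
    gamma = n / log_sum                       # alpha_MLE - 1
    fitted_cdf = 1.0 - (tail / x_min) ** (-gamma)
    # The empirical CDF is a step function, so check both sides of each step.
    upper = np.arange(1, n + 1) / n - fitted_cdf
    lower = fitted_cdf - np.arange(0, n) / n
    return max(upper.max(), lower.max()), gamma, n


def select_x_min(x, min_tail=50):
    """Scan candidate thresholds; return (ks, x_min, gamma, n_tail) with the lowest KS."""
    sorted_x = np.sort(np.asarray(x, dtype=float))
    best = (np.inf, np.nan, np.nan, 0)
    for candidate in np.unique(sorted_x):     # candidates in increasing order
        ks, gamma, n = ks_for_threshold(sorted_x, candidate)
        if n < min_tail:
            break                             # larger candidates leave even fewer points
        if ks < best[0]:
            best = (ks, candidate, gamma, n)
    return best
```

The scan costs one MLE fit per distinct value, which is negligible for a few hundred cities. Because the threshold is itself estimated, standard errors computed conditional on the selected threshold do not reflect that extra source of uncertainty.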
The fits of the competing distributions (the lognormal and exponential) noticeably deviate from the empirical data throughout the domain. The likelihood ratio tests in Table 2 provide formal evidence in this regard. As is evident from the large positive likelihood ratio statistics, the power law distribution significantly outperforms both the lognormal and exponential distributions in fitting COVID-19 cases in China, which is in line with our observations from Figure 2. We reject both the lognormal and exponential as adequate specifications for COVID-19 cases. In summary, our estimation results and diagnostic tests provide strong evidence that COVID-19 cases in Chinese cities can be well characterized by the power law (Pareto) distribution.

In this section, we explore whether a growth model involving Gibrat's law (Gibrat, 1931) can potentially explain the emergence of the observed power-law behavior in COVID-19 cases in China. We focus on Gibrat's law specifically because random multiplicative growth (with a caveat) is the prevalent attribute of models explaining the genesis of power laws (Gabaix, 1999, 2009). Different mechanisms that can generate power laws have been proposed in the literature, including the Yule process (Willis and Yule, 1922; Yule, 1925) and random growth models with geometrically distributed age (Wold and Whittle, 1957; Reed, 2001; Toda, 2014; Beare and Toda, 2017).[6] In what follows, we describe one simple such mechanism.

Suppose $S_{it}$ is the size of a stochastic process of interest for unit $i$ at time $t$, for instance, the number of COVID-19 cases in city $i$ up to day $t$.
According to Gibrat's law, the size of the process (at least in the upper tail) exhibits random multiplicative growth, evolving as

$$S_{it+1} = \mu_{it+1} S_{it} \tag{15}$$

over time, where $\mu_{it+1}$ is an independently and identically distributed (i.i.d.) random variable with an associated PDF of $f(\mu)$. Hence, the random growth factor $\mu_{it+1} = S_{it+1}/S_{it}$ is independent of the current size $S_{it}$, which is commonly known as Gibrat's law of proportionate growth. Gibrat's law alone does not generate a power law but, instead, gives rise to the lognormal distribution for the size of the process (see Section 4.2 for details), which was noted by Gibrat (1931) himself early on. Later, Gabaix (1999) showed that a power law can arise from Gibrat's law with an additional assumption, a sketch of which we provide below.

[6] For a detailed review, see Newman (2005).

Let $G_t(s) = \text{Prob}(S_t > s)$ be the counter-CDF of $S_t$. Substituting (15) into the counter-CDF, the equation of motion for $G_{t+1}(s)$ boils down to

$$G_{t+1}(s) = \text{Prob}(\mu_{t+1} S_t > s) = \int_0^{\infty} G_t\left( \frac{s}{\mu} \right) f(\mu)\, d\mu. \tag{16}$$

If there is a steady state process $G_t = G$, then

$$G(s) = \int_0^{\infty} G\left( \frac{s}{\mu} \right) f(\mu)\, d\mu. \tag{17}$$

The power law distribution is the (only) suitable steady state distribution in (17) if $S_t$ has a lower reflecting barrier $S_{\min}$, i.e., a minimal size of the process, such that $S_t > S_{\min}$ (Gabaix, 1999, Proposition 1). In this case, $G(s) = k\, s^{-\gamma}$, as in (8). Thus, Gibrat's law combined with a lower bound on $S_t$ can plausibly yield a power law distribution.

For empirical purposes, we consider a continuous time representation of Gibrat's law, given by geometric Brownian motion

$$dS_{it} = g\, S_{it}\, dt + \nu\, S_{it}\, dB_{it}, \tag{18}$$

where $g$ is the expected growth rate, $\nu > 0$ is the volatility, and $B_{it}$ is a standard Brownian motion that is i.i.d. across cross-sectional units.
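The discrete-time mechanism described above (proportionate growth plus a lower reflecting barrier) is easy to illustrate by simulation. The sketch below is an illustration under stated assumptions, not part of the paper's analysis: growth factors are taken to be lognormal with $\ln \mu \sim N(\text{drift}, \text{vol}^2)$, the barrier is imposed by resetting any unit that falls below $S_{\min}$ back to $S_{\min}$, and all parameter values are hypothetical. Under these assumptions, a standard result (stated here as an assumption, not taken from the text) is that the steady-state tail exponent solves $E[\mu^{\gamma}] = 1$, which for lognormal factors gives $\gamma = -2\,\text{drift}/\text{vol}^2$, i.e., 0.8 for the defaults below.

```python
import numpy as np


def simulate_reflected_growth(n_units=10_000, n_steps=3_000, drift=-0.016,
                              vol=0.2, s_min=1.0, seed=0):
    """Simulate S_{t+1} = max(mu_{t+1} * S_t, s_min) with lognormal growth factors."""
    rng = np.random.default_rng(seed)
    log_size = np.zeros(n_units)              # ln(S / s_min); all units start at the barrier
    for _ in range(n_steps):
        # ln(mu) is drawn independently of current size (Gibrat's law).
        log_size += rng.normal(drift, vol, size=n_units)
        # Reflecting barrier: no unit may fall below s_min.
        np.maximum(log_size, 0.0, out=log_size)
    return s_min * np.exp(log_size)


def hill_exponent(sizes, tail_share=0.10):
    """Hill estimate of the counter-cumulative exponent from the largest tail_share of units."""
    ordered = np.sort(np.asarray(sizes, dtype=float))[::-1]
    k = int(tail_share * ordered.size)
    tail = ordered[:k]
    return k / np.sum(np.log(tail / tail[-1]))


if __name__ == "__main__":
    sizes = simulate_reflected_growth()
    print("theoretical gamma:", -2 * (-0.016) / 0.2 ** 2)
    print("Hill estimate    :", hill_exponent(sizes))
```

Removing the `np.maximum` line turns the process into a pure random walk in logs, whose cross-section is lognormal with ever-growing variance rather than a stationary power law, which is the distinction the text draws between Gibrat's law alone and Gibrat's law with a lower bound.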
Applying Itô's lemma to (18) yields

$$d \ln S_{it} = \left( g - \frac{\nu^2}{2} \right) dt + \nu\, dB_{it}, \tag{19}$$

meaning the cross-sectional distribution of $S_{it}$, with the initial size of $S_{i0}$, is lognormal

$$\ln S_{it} \sim N\left( \ln S_{i0} + \left( g - \frac{\nu^2}{2} \right) t,\; \nu^2 t \right). \tag{20}$$

Equation (19) (along with Proposition 1 in Gabaix (1999)) suggests that growth rates under Gibrat's law can be described by a random walk process of the form (Sutton, 1997; Eeckhout, 2004; Gabaix, 2009)

$$\ln S_{it+1} = \ln S_{it} + \zeta_{it}. \tag{21}$$

Setting the random growth component $\zeta_{it} = \varphi_i + \xi_{it}$, where $\varphi_i$ is the effect of unit-wide factors and $\xi_{it}$ is an i.i.d. random effect, produces a random walk with drift. A standard method to test for Gibrat's law is through estimation of the following cross-sectional regression equation

$$\ln S_{it+1} = \varphi + \rho \ln S_{it} + \xi_{it}. \tag{22}$$

In (22), $\rho$ is the parameter of interest, with $\rho \approx 1$ providing statistical evidence that the growth process of $S_t$ adheres to Gibrat's law.

An alternative approach to testing for Gibrat's law of proportionate growth is through estimation of a cross-sectional regression equation of the form (Beare and Toda, 2020)

$$\Delta \ln S_{it+1} = \beta_{0t} + \beta_{1t} \ln S_{it} + \beta_{2t} \Delta \ln S_{it} + \beta_{3t} D_{it} + e_{it}, \tag{23}$$

where $\Delta$ is the difference operator, $\Delta \ln S_{it+1}$ is the COVID-19 growth rate in city $i$ between day $t$ and $t + 1$, $\Delta \ln S_{it}$ is the COVID-19 growth rate in city $i$ between day $t - 1$ and $t$, $D_{it}$ is the number of days between day $t$ and the day of the first COVID-19 case in city $i$, and $e_{it}$ is an i.i.d. error term. The parameters of interest are $\beta_{1t}, \beta_{2t}, \beta_{3t}$, with $\beta_{1t} \approx 0$, $\beta_{2t} \approx 0$, $\beta_{3t} \approx 0$ providing empirical evidence for the presence of Gibrat's law. The distinctive feature of equation (23) is the inclusion of the age distribution (days since outbreak for each city) in addition to the growth rate. Obtaining the age distribution has traditionally been cumbersome in power law analysis (e.g., of cities). Fortunately, our data conveniently affords us this
variable as we observe the entire timeline of the evolution of COVID-19 across Chinese cities.

We apply the methods discussed in Section 4.2 to each day between January 23, 2020, and February 25, 2020 (inclusive). We start from January 23 because at least 30 cities had a positive number of cumulative cases ($S_{it} > 0$) from that date onward (see Figure 3). We stop at February 25 because COVID-19 dynamics in China had largely stabilized by then ($S_{it+1} \approx S_{it}$), with few or no new daily cases after February 25, which left the distribution of cumulative cases virtually unaffected afterwards (see Figure 1). This will also become apparent from our findings below.

Figure 4 shows the estimation results for $\rho_t$ in equation (22) for $t$ = Jan 23, ..., Feb 25. Clearly, the estimates of $\rho_t$ are statistically indistinguishable from unity ($\rho_t \approx 1$), which confirms the random growth model predicated by Gibrat's law. The 95% confidence interval shrinks moving left to right, which can be attributed to two factors. First, it reflects the increasing sample size (i.e., the increasing number of cities with confirmed cases), at least until February 8, when most Chinese cities had reported a positive number of cases (Figure 3). Second, the narrowing of the confidence interval can also be attributed to the stabilization of the COVID-19 situation in China, which saw a rapid decline in new daily cases starting from mid-February, with the daily change (growth rate) approaching zero.

Figure 5 reports the estimation results for $\beta_{0t}, \beta_{1t}, \beta_{2t}, \beta_{3t}$ in equation (23) for $t$ = Jan 23, ..., Feb 25.
Panels (b)-(d) contain the estimates for $\beta_{1t}, \beta_{2t}, \beta_{3t}$, which are of main interest here. These estimates are largely equal or close to zero ($\beta_{1t} \approx 0$, $\beta_{2t} \approx 0$, $\beta_{3t} \approx 0$), which indicates that the growth rate between days $t$ and $t + 1$ does not depend on the number of cases on day $t$, nor on the growth rate between days $t - 1$ and $t$, nor on the number of days since the first confirmed case. This also provides evidence for the presence of Gibrat's law for COVID-19 cases in Chinese cities. The estimates of $\beta_{0t}$ in panel (a) show that the expected growth rate of confirmed cases declined over the study period, with some fluctuations, and eventually approached zero around February 9, which is consistent with the observed data.

In light of the discussion in Section 4.1, the confirmation of Gibrat's law for COVID-19 cases in Chinese cities provides a plausible explanation for the emergence of the power law behavior shown for the data.

The dynamics of the novel coronavirus pandemic are complex and affected by a plethora of factors, which are yet to be fully understood. In spite of the apparently chaotic evolution of the pandemic, surprising regularities can still be observed in the size distribution and growth process of COVID-19 cases. In this paper, we examined the distribution of novel coronavirus cases in China, the original epicenter of the ongoing pandemic. We presented empirical evidence for a power law distribution for the upper tail of the number of COVID-19 cases in Chinese cities. The power law fit is robust to different estimation methods, passes rigorous diagnostic tests, and fits the data better than a number of rival distributions.
The implications of the power law fit are that the number of COVID-19 cases in Chinese cities is heavy-tailed and disperse, so that the average number of COVID-19 cases is a problematic summary; that COVID-19 cases are concentrated within a few cities that account for a disproportionately large share of infections; and that the mean and variance are generally not finite. Admittedly, there may always be a distribution that fits the data better than a power law, given that there is a virtually infinite number of distributions. What we showed here is that the power law distribution is able to capture the upper tail of the data, and that it does so parsimoniously and better than a couple of 'go-to' distributions. In addition, given that the data are not lognormally distributed, we reject Gibrat's law of random proportionate growth in its standard form. However, the nuanced version of Gibrat's law (Gabaix, 1999), as discussed in Section 4.1, is demonstrated to be a plausible mechanism for the emergence of power law behavior in COVID-19 cases in China.
Figure 2: Plot of empirical and fitted log counter-cumulative probability and log COVID-19 confirmed cases. Estimation is based on upper-tail observations $x > x_{\min}$ as of May 23, 2020, where $x_{\min}$ is determined based on the minimization of the KS statistic. Clauset et al. (2009) recommend at least 50 observations for accurate power law analysis, a condition well satisfied here.

Figure 4: Estimates of $\rho_t$ in equation (22) between January 23, 2020, and February 25, 2020, with 95% confidence bands. $\rho_t \approx 1$ provides empirical evidence for Gibrat's law.

Figure 5: Estimates of $\beta_{0t}, \beta_{1t}, \beta_{2t}, \beta_{3t}$ in equation (23) between January 23, 2020, and February 25, 2020, with 95% confidence bands.
The parameters of interest are $\beta_{1t}, \beta_{2t}, \beta_{3t}$, with $\beta_{1t} \approx 0$, $\beta_{2t} \approx 0$, $\beta_{3t} \approx 0$ providing empirical evidence for Gibrat's law.

Note: Estimation is based on upper-tail observations $x > x_{\min}$ as of May 23, 2020, where $x_{\min}$ is determined based on the minimization of the KS statistic. For the Gabaix and Ibragimov (2011) test, the null hypothesis that COVID-19 confirmed cases are distributed according to a power law is rejected if the test statistic exceeds the threshold. Clauset et al. (2009) recommend at least 50 observations for accurate power law analysis, a condition well satisfied here.

Note: Estimation is based on upper-tail observations $x > x_{\min}$ as of May 23, 2020, where $x_{\min}$ is determined based on the minimization of the KS statistic. A positive value of the likelihood ratio statistic indicates that the power law is the better fitting distribution. A negative value indicates that the alternative distribution fits the data more closely. P-values are calculated using the methods detailed in Clauset et al. (2009). The null hypothesis is that there is no significant difference in the likelihoods of the distributions tested.
The Power-Law Distribution of Agricultural Land Size
Size Distribution of National CO2 Emissions
Is Gibrat's 'Economic Inequality' Lognormal?
Zipf Distribution of US Firm Sizes
Power Laws in Oil and Natural Gas Production
Gibrat's Law in the Trucking Industry
Emergence of Scaling in Random Networks
Why is Consumption more Log Normal than Income? Gibrat's Law Revisited
Geometrically Stopped Markovian Random Growth Processes and Pareto Tails
On the Emergence of a Power Law in the Distribution of COVID-19 Cases
Power-law Distribution in the Number of Confirmed COVID-19 Cases
Highly Optimized Tolerance: A Mechanism for Power Laws in Designed Systems
A Model of Income Distribution
Power-Law Distributions in Empirical Data
WHO Declares COVID-19 a Pandemic
The Power Law Distribution for Lower Tail Cities in India
Gibrat's Law for (All) Cities
Zipf's Law for Cities: An Explanation
Power Laws in Economics and Finance
Power Laws in Economics: An Introduction
Rank -1/2: A Simple Way to Improve the OLS Estimation of Tail Exponents
The Evolution of US City Size Distribution from a Long-Term Perspective (1900-2000)
Gibrat's law for countries
Epidemiological Assessment of Imported Coronavirus Disease 2019 (COVID-19) Cases in the Most Affected City Outside of Hubei Province
A Simple General Approach to Inference about the Tail of a Distribution
Zipf's Law for Cities: An Empirical Examination
The Relationship between Zipf's Law and the Distribution of First Digits
Dynamics of North American Breeding Bird Populations
The Forbes 400 and the Pareto Wealth Distribution
The Self-Organizing Economy
Do the World's Largest Cities Follow Zipf's and Gibrat's Laws?
Selection, Growth, and the Size Distribution of Firms
A Brief History of Generative Models for Power Law and Lognormal Distributions
Power Laws, Pareto Distributions and Zipf's Law. Contemporary Physics
Cours d'économie politique professé à l'université de Lausanne
A Graphical Test for Local Self-Similarity in Univariate Data
Strong, Weak and False Inverse Power Laws
The Pareto, Zipf and Other Power Laws
Cities and Countries
A Function for Size Distribution of Incomes
Critical Phenomena in Natural Sciences
Zipf Plots and the Size Distribution of Firms
Gibrat's Legacy
The Double Power Law in Income Distribution: Explanations and Evidence
Incomplete Market Dynamics and Cross-Sectional Distributions
A Note on the Size Distribution of Consumption: More Double Pareto than Lognormal
The Double Power Law in Consumption and Implications for Testing Euler Equations
Some Statistics of Evolution and Geographical Distribution in Plants and Animals, and their Significance
A Model Explaining the Pareto Distribution of Wealth
A Mathematical Theory of Evolution, Based on the Conclusions of Dr
A Novel Coronavirus from Patients with Pneumonia in China
Human Behavior and the Principle of Least Effort