key: cord-0839135-p8cyn487
authors: Messner, Wolfgang; Payson, Sarah E.
title: Variation in COVID-19 Outbreaks at U.S. State and County Levels
date: 2020-08-03
journal: Public Health
DOI: 10.1016/j.puhe.2020.07.035
sha: 72d9ed2e8324f20aca7a9962a68511c67c36935a
doc_id: 839135
cord_uid: p8cyn487

Abstract

Objective: The COVID-19 pandemic poses an unprecedented threat to the health and economic prosperity of the world’s population. Yet, because not all regions are affected equally, this research aims to understand whether the relative growth rate of the initial outbreak in early 2020 varied significantly between U.S. states and counties.

Study design: Based on publicly available case data from across the U.S., the initial outbreak is statistically modeled as an exponential curve.

Methods: Regional differences are visually compared using geo maps and spaghetti lines. Additionally, they are statistically analyzed as an unconditional model (one-way random effects ANOVA estimated in HLM 7.03); the bias between state- and county-level models is evidenced with distribution tests and Bland-Altman plots (using SPSS 26).

Results: At the state level, the outbreak rate follows a normal distribution with an average relative growth rate of 0.197 (doubling time 3.518 days). But there is a low degree of reliability between state-wide and county-specific data reported (ICC = 0.169, p < 0.001), with a bias of 0.070 (standard deviation 0.062) as shown with a Bland-Altman plot. Hence, there is significant variation in the outbreak between U.S. states and counties.

Conclusions: The results emphasize the need for policy makers to look at the pandemic from the smallest population subdivision possible, so that countermeasures can be implemented, and critical resources provided effectively. Further research is needed to understand the reasons for these regional differences.

On January 20, 2020, the first case of the novel coronavirus disease 2019 (COVID-19) was reported on U.S.
soil, with cases in the U.S. growing to more than 579,197 as of April 13, 2020. 1 In the struggle to contain the pandemic's growth rate, the U.S. Government took unprecedented action. At the federal level, international and domestic travel restrictions were imposed; at the state level, business closures, stay-at-home orders, and social distancing mandates were enacted. A community's susceptibility to any virus is determined by a variety of factors, including but not limited to biological determinants, demographic profiles, type of habitat, and socioeconomic characteristics. 2 Because these factors vary significantly across the U.S., there is likely to be considerable intra-country variation in the outbreak as well.

In the current study, we examine the relative growth rate of the COVID-19 outbreak and its variation at the state and county level across the U.S. We show, through both visualizations and statistical analysis, that the outbreak varies significantly across counties and that an aggregate view at the state level, as most often reported in the media, hides differences at a lower level. This demonstrates the need for analyses below the state level.

We obtain COVID-19 outbreak data from the China Data Lab published at Harvard Dataverse (as of April 13, 2020) and USA Facts (as of April 14, 2020); 1,3 we check for consistency between the two databases. Since January 22, 2020, the latter database has aggregated data from the Centers for Disease Control and Prevention (CDC) and state- and local-level public health agencies, confirming them by referencing state and local agencies directly. For our county-level analysis, we discard cases that USA Facts can allocate only at the state level, but not at the county level, due to a lack of information.
On average, the number of unallocated cases is small, but a few states contribute as many as 4866 (New Jersey), 1300 (both Rhode Island and Georgia), or 1216 (Washington State) unallocated cases, resulting in an average of 308 unallocated cases per state, again as of April 14, 2020.

This exponential curve model is statistical, not epidemiological; that is, we are neither trying to model infection transmission nor estimate epidemiological parameters, such as the pathogen's reproductive or attack rate. Instead, we are fitting a curve to observed case data at the country, state, and county level, so that the estimated outbreak rate is independent of the population in the respective unit. However, it does not control for confounders specific to the habitat. A change-point analysis using the Fisher discriminant ratio as a kernel function does not show any significant change points in the outbreak, and therefore justifies modeling the COVID-19 outbreak as a phenomenon of unrestricted population growth. 6 Because outbreak rates change over time and their estimation is somewhat sensitive to the starting figure, we alternatively calculate the outbreak rate after it reached 10 and 25 cases in the respective unit, finding a high correlation among the rates. We are aware that differences in testing between states may also be an important confound. However, because the number of tests administered and the number of confirmed cases correlate to varying extents, 7 this is difficult to control for. A disadvantage of this statistical approach is that we cannot forecast outbreak dynamics, though we do not require extrapolated data in our work.

In the U.S., the initial outbreak of the COVID-19 pandemic varied considerably not only between states, but also within the counties of a state. The outbreak rate followed a normal distribution across 50 states plus Washington, D.C.
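The curve-fitting step described above can be illustrated with a minimal sketch. It assumes a log-linear least-squares fit of y = a * exp(b * t) to cumulative case counts, starting on the first day a case threshold is reached; the estimation procedure actually used by the authors is not specified in this text, and the function names and the synthetic series below are purely illustrative.

```python
import math

import numpy as np


def relative_growth_rate(cumulative_cases, min_cases=10):
    """Estimate b in y = a * exp(b * t) by least squares on log(y).

    Assumption: `cumulative_cases` is a daily, non-decreasing series;
    the fit starts on the first day with at least `min_cases` cases.
    """
    y = np.asarray(cumulative_cases, dtype=float)
    reached = np.flatnonzero(y >= min_cases)
    if reached.size == 0:
        raise ValueError("series never reaches min_cases")
    y = y[reached[0]:]
    if y.size < 2:
        raise ValueError("need at least two observations after the threshold")
    t = np.arange(y.size)  # days since the threshold was reached
    slope, _intercept = np.polyfit(t, np.log(y), 1)
    return float(slope)


def doubling_time(rate):
    """Days needed for cases to double at a constant relative growth rate."""
    return math.log(2) / rate


# Synthetic (not real) data: exact exponential growth at 0.197 per day.
days = np.arange(30)
cases = 3.0 * np.exp(0.197 * days)

rate_10 = relative_growth_rate(cases, min_cases=10)
rate_25 = relative_growth_rate(cases, min_cases=25)
print(rate_10, rate_25, doubling_time(rate_10))
```

Because the slope is estimated on the log scale, it does not depend on the absolute size of the case counts, which is why units with very different populations can be compared. With a rate of 0.197 per day, the doubling time ln(2)/b comes to roughly 3.5 days, in line with the value reported in the abstract.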
When we extend this analysis to the county data, we find that the outbreak rate deviated significantly from a normal distribution, even when omitting the counties with little to no outbreak. When graphed, this variation in case counts from county to county is easily visible (Figure 3). In comparison with the state-level depiction (Figure 2), there is great variation between a state's ranking and the situation in its individual counties.

In the U.S., most response measures to the pandemic are devised and effected at the state level. Although this is certainly better targeted than an overall response at the federal level, which might spread resources too thinly in some regions, it still may not account sufficiently for local differences in outbreak and resource utilization. For example, while many counties in South Carolina still reported a utilized hospital bed capacity of less than 50% (as of April 21, 2020), Lexington County reported 90.6%, followed by Orangeburg and Colleton Counties at 82.2% and 78.0%, respectively. 8

Politics and political partisanship play a large role in the resolution of national health emergencies, and have been found to be the strongest predictor of the early adoption of social distancing policies. 9 But such policies tend to generalize strategy and target larger populations. Various institutional, societal, and cultural factors influence the development and adoption of these policies, and are important in the analysis of variations in the pandemic's growth rate across states and counties. At the international level, previous research indicates an association between such contextual factors and the outbreak rate across countries. 10 For the U.S., we expect comparable findings, and aim to understand potential reasons for the differences in further research.

More generally, our study indicates that governments must track a pandemic's outbreak and tailor appropriate response strategies to the most granular level possible.
This would not only increase effectiveness of political policy and response strategy, but also allow for a redistribution of excess resources to areas most vulnerable to the pandemic. This will become increasingly important as the world begins returning to normalcy, and attempts to prevent further waves of the COVID-19 pandemic.

Figure 2: This geo map displays the variation in outbreak rates at U.S. state level. Lighter colors signify that the pandemic has a slower relative growth rate, and darker colors point to a faster growth.

Figure 3: Variation in outbreak rates at U.S. county level. This geo map reveals the larger variation in outbreak rates at U.S. county level. The color band is the same as in Figure 2.

References

1. China Data Lab. US COVID-19 daily cases with basemap
2. County-level characteristics to inform equitable COVID-19 response
3. Coronavirus locations: COVID-19 map by county and state
4. Forecasting COVID-19 impact on hospital bed-days, ICU-days, ventilator-days and deaths by US state in the next 4 months
5. Projections for first-wave COVID-19 deaths across the U.S. using social-distancing measures derived from mobile phones measures [Internet]. The University of Texas at Austin COVID-19 Modeling Consortium
6. Outbreak definition by change point analysis: A tool for public health decision
7. COVID-19 positive cases, evidence on the time evolution of the epidemic or an indicator of local testing capabilities? A case study in the United States
8. SCDHEC. Hospital bed capacity
9. Pandemic politics: Timing state-level social distancing responses to COVID-19
10. The institutional and cultural context of cross-national variation in COVID-19 outbreaks (forthcoming)

We would like to thank the anonymous peer reviewer for helpful feedback, which helped to improve the manuscript. We gratefully acknowledge support through the Darla Moore School of Business and the Center for International Business Education and Research (CIBER) at the University of South Carolina.

• The COVID-19 pandemic appears to affect some countries or regions in different ways.
• Based on a statistical model, the research aims to understand whether the outbreak varies significantly between U.S. states and counties.
• A low degree of reliability between state-wide and county-specific data is found.
• Policy makers are advised to implement countermeasures and provide critical resources at the smallest population subdivision possible.
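The Bland-Altman comparison between state-level and county-level outbreak rates reported in the abstract (bias 0.070, standard deviation 0.062) can be sketched as follows. This is a minimal illustration that assumes each county's rate is paired with the rate of its state; the exact pairing used by the authors in SPSS is not described in this text, and the function name and the small input arrays are hypothetical.

```python
import numpy as np


def bland_altman(rates_a, rates_b):
    """Return bias, SD of differences, and 95% limits of agreement.

    Assumption: rates_a[i] and rates_b[i] are two estimates for the same
    unit (for example, a state-level rate and one of its county rates).
    """
    a = np.asarray(rates_a, dtype=float)
    b = np.asarray(rates_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("inputs must have the same shape")
    diff = a - b
    bias = float(diff.mean())        # mean difference between the methods
    sd = float(diff.std(ddof=1))     # sample standard deviation
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return bias, sd, limits


# Hypothetical values for illustration only (not data from the study).
state_rates = [0.20, 0.30, 0.25]
county_rates = [0.10, 0.20, 0.20]

bias, sd, limits = bland_altman(state_rates, county_rates)
print(bias, sd, limits)
```

A bias well away from zero, together with wide limits of agreement, indicates that the two levels of aggregation do not yield interchangeable estimates, which is the point the article makes about state-wide versus county-specific data.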