key: cord-0894227-mwp5dfkw
authors: Khose, Swapnil; Chan, Hei Kit; Wang, Henry E.; Moore, Justin Xavier
title: Predictors for County Level Variations in Initial 4-week COVID-19 Incidence and Case Fatality Risk in the United States
date: 2020-12-21
journal: Res Sq
DOI: 10.21203/rs.3.rs-131858/v1
sha: 509162921f1b1fe53389d2a5801afa58fe2c6609
doc_id: 894227
cord_uid: mwp5dfkw

While studies indicate differences in incidence and case fatality risk of COVID-19, few efforts have shed light on regional variations in the intensity of initial community spread. We conducted a nationwide study using county-level data on COVID-19 from the Center for Systems Science and Engineering at Johns Hopkins University. We characterized intensity of initial community COVID-19 attack by calculating the incidence and case fatality risk (CFR) for the first 4-week period of COVID-19 spread in each county. We used multivariate multilevel multinomial logistic regression to estimate the association of county-level characteristics with COVID-19 incidence and CFR. Of 3,143 counties, we included 1,052 with at least 100 reported cases on June 1st. Median incidence was 193.4 per 100,000 population (IQR: 94.2-397.5). Median case fatality risk was 3.6% (IQR: 1.4-7.3). Median age, rural population, population density, lower education, uninsured population, obesity, and COPD prevalence were positively associated, while population, female sex, races (Asian, white), higher education, and excessive drinking were negatively associated with initial COVID-19 incidence. Median age, female sex, Asian race, population density, higher education, excessive drinking, Intensive Care Unit beds, and airborne infection isolation rooms were positively associated, while Hispanic ethnicity, lower education, obesity (paradox), and uninsured population were negatively associated with initial COVID-19 CFR.

As of 10th August, 2020, the pandemic of COVID-19, caused by SARS-CoV-2, has claimed more than half a million lives worldwide.
With nearly 5 million cases and 163,461 deaths, the United States has experienced the highest burden [1, 2]. Many studies have shown stark differences in incidence and case fatality risk of COVID-19 across counties, with some counties disproportionately affected [3] [4] [5]. Further, various studies have identified individual-level risk factors for COVID-19 hospitalization [6, 7], associated complications [8] [9] [10] [11] [12] [13], as well as mortality [14] [15] [16] [17]. However, very few studies have examined the population-level factors that explain the differential rate of spread of COVID-19 infection and the rate of fatality in different geographical regions [18] [19] [20] [21] [22]. Furthermore, few studies have focused on the initial community spread, which may indicate regions and communities particularly vulnerable to the effect of COVID-19. The initial intensity with which a disease spreads through a community may be influenced by numerous factors, such as the virulence of the pathogen, the health behaviors of citizens, the biologic susceptibility of the population, or the health resources of the community. Understanding the factors responsible for the variation in initial incidence and case fatality risk could help efforts to identify high-risk communities as well as targets for mitigating the spread of infection. The primary objectives of this study were 1) to determine county-level variations in initial COVID-19 incidence and case fatality risk indexed to the start of the epidemic in each county and 2) to identify the predictors of county-level variations in initial incidence and case fatality risk of COVID-19. We performed an ecological study examining the regional variation of COVID-19 across counties in the United States.
We obtained county-level data on COVID-19 confirmed cases and deaths from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University through 29th of June, 2020 [1, 2]. We included counties with at least 100 cases on 1st June, 2020, to allow for a 4-week follow-up period before the data cutoff of 29th June, 2020. The primary outcomes of the study were incidence (number of new confirmed cases per 100,000 population) and case fatality risk [23] (CFR: ratio of the number of new deaths to new confirmed cases, expressed as a percentage) of COVID-19. We calculated the incidence and case fatality risk for the 4-week period starting from the day each county first reported at least 100 cases, to ensure a fair comparison between counties. We focused primarily on initial community spread so as to identify high-risk communities and their characteristics. County-level data on socio-demographic factors, health behaviors, prevalence of chronic medical conditions and availability of healthcare resources were obtained from the 2020 County Health Rankings (CHR) [24], the 2018-2019 Area Health Resources File (AHRF) [25] and the 2017 Centers for Medicare & Medicaid Services (CMS) [26] report on chronic medical conditions. We linked these county-level community characteristics with COVID-19 data using Federal Information Processing Standards (FIPS) codes. The details of sources and definitions for the variables used can be found in appendix I. We estimated descriptive statistics for COVID-19 outcomes as well as various community characteristics of the counties included in the study. We fit multilevel multinomial logistic regression models to estimate the association of county-level factors (socio-demographics, health behaviors, air pollution level, prevalence of chronic medical conditions and availability of healthcare resources) with incidence and case fatality risk (CFR) of COVID-19. We used quartiles of incidence and CFR of COVID-19 as dependent variables.
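The outcome definitions above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names `four_week_outcomes` and `assign_quartiles` are hypothetical, the inputs are assumed to be daily cumulative counts for one county, and "new" cases and deaths are assumed to mean the increase in cumulative counts over the 28 days after the county first reaches 100 cases (the text does not state how the baseline day is handled). The quartile cut points use the default method of Python's `statistics.quantiles`, which may differ from the authors' software.

```python
import bisect
import statistics


def four_week_outcomes(cum_cases, cum_deaths, population,
                       threshold=100, window_days=28):
    """Return (incidence per 100,000, CFR in %) for one county, or None.

    cum_cases and cum_deaths are daily cumulative counts aligned by day.
    Assumption: the window starts on the first day with at least
    `threshold` cumulative cases and ends `window_days` later.
    """
    start = next((i for i, c in enumerate(cum_cases) if c >= threshold), None)
    if start is None:
        return None  # county never reached the case threshold
    end = start + window_days
    if end >= len(cum_cases):
        return None  # fewer than four weeks of follow-up available
    new_cases = cum_cases[end] - cum_cases[start]
    new_deaths = cum_deaths[end] - cum_deaths[start]
    incidence = new_cases * 100_000 / population
    cfr = new_deaths * 100 / new_cases if new_cases > 0 else None
    return incidence, cfr


def assign_quartiles(values):
    """Map each value to its quartile (1 = lowest, 4 = highest)."""
    cuts = statistics.quantiles(values, n=4)  # three cut points
    return [bisect.bisect_left(cuts, v) + 1 for v in values]
```

Applied county by county, the first function yields the two outcomes; applying the second function to the resulting incidence or CFR values gives the four-category dependent variable used in the regression models.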
The models also included a random intercept for each state to account for unknown variations among states, such as weather, social distancing norms, timing of stay-at-home orders, etc. All models were adjusted for median age, sex (females) and race/ethnicity (Asian, Hispanic, non-Hispanic black, non-Hispanic white). All analyses were conducted at the county level. We performed all statistical analyses This study was considered exempt from Institutional Review Board (IRB) review as we used publicly available, population-level data. Of the total 3,143 counties, we included 1,052 with at least 100 cases on 1st June, 2020. The characteristics of these counties are presented in Table 1. We used multinomial regression to determine the association of county-level characteristics with the quartiles of 4-week COVID-19 incidence. Median age, rural population, population density, lower education (< HS diploma), adult obesity prevalence, COPD prevalence, and uninsured population were positively associated with the highest quartile of incidence compared to the lowest quartile. In contrast, population, female sex, races (Asian and non-Hispanic white), higher education (HS diploma or more, 4+ years of college), and excessive drinking were negatively associated with the highest quartile of incidence (Table 2). Table 2. Association of county-level characteristics with the quartiles of 4-week COVID-19 incidence (1st quartile as the reference category). Ours is the first study to examine the association of multiple population-level factors with the county-level variations in initial incidence and case fatality risk of COVID-19. We focused primarily on initial community spread so as to identify populations with higher susceptibility to COVID-19 infection and fatality. We found significant variation in the incidence We also identified various independent predictors of initial incidence of COVID-19.
The positive association with higher median age, male sex, and chronic medical conditions (obesity and COPD) is in accordance with the individual-level risk factors described by numerous clinical studies [6] [7] [8] [9] [10]. Elderly male populations with a higher chronic disease burden are likely to have high susceptibility to COVID-19. Interestingly, female sex was negatively associated with higher incidence. Biological susceptibility, occupational roles, and more responsible behavior with regard to following public health guidelines might explain this. Excessive drinking was also found to be a strong protective factor, which could be explained by less mobility and social interaction in this population. On the other hand, population density was positively associated with higher incidence, supporting the role of social mobility in driving the spread of infection. All of these factors underscore the utility of social distancing in slowing the transmission of COVID-19. Additionally, higher education was negatively associated and percent uninsured population was positively associated with the highest quartile of incidence. This highlights the importance of regular academic education as well as health education (with percent uninsured population as a proxy) in slowing the spread of the virus. Furthermore, we identified independent predictors of case fatality risk of COVID-19 during initial community spread. Higher age and female sex were the strongest predictors associated with higher CFR, as shown by other individual-level clinical studies [14] [15] [16] [17]. We also found a significant positive association of Asian race with higher CFR, whereas Hispanic ethnicity was found to be negatively associated. Non-Hispanic black race was not found to be significantly associated with higher CFR. Various other studies have found a non-significant association of black race with CFR [27] [28] [29], while some have shown significantly higher mortality [30].
Further research is needed in this area. Unexpectedly, we did not find an association of higher CFR with the prevalence of any of the included chronic medical conditions, except adult obesity. Adult obesity was negatively associated with the highest quartile of CFR (aOR: 0.95; 95% CI: 0.90, 0.99), supporting the 'obesity paradox'. The obesity paradox has been described as an association of obesity with decreased mortality in patients with acute respiratory distress syndrome (ARDS), reported previously in various studies [31] [32] [33]. However, whether such a phenomenon also holds true for ARDS following COVID-19 infection is not yet clear [32, 34]. Moreover, we found that fine particulate matter (PM 2.5) was not associated with CFR. This is in consonance with another nationwide cross-sectional study on the effect of air pollution, which showed an insignificant effect of PM 2.5 and ozone, but a significant effect of NO2, on COVID-19 death outcomes [22]. We also did not find an independent association of smoking with CFR. However, different meta-analyses have identified significant associations of smoking with severe complications as well as higher mortality from COVID-19 [35, 36]. Surprisingly, availability of healthcare resources, defined by the number of Intensive Care Unit beds and the number of airborne infection isolation rooms, was found to be positively associated, although weakly, and uninsured population was found to be negatively associated with the highest quartile of case fatality risk. The lesser disease burden as well as the rapidity of spread during the initial weeks of the epidemic in each county might explain this contradictory effect of healthcare resource availability on CFR variation. A study in China showed that the rapid escalation in the number of infections around the epicenter of the outbreak (Wuhan city) resulted in an insufficiency of health-care resources, thereby negatively affecting mortality in Hubei province, but not in other provinces of China [37].
Our study included an assessment of a comprehensive range of factors with potential predictive value for the spread and fatality of COVID-19. In contrast to other population-level studies on COVID-19, we were able to control for major confounding by epidemic timing as well as stage of the epidemic by identifying a common starting point for each county (i.e. reporting of the first 100 cases). We were also able to control for the unmeasurable effect of various factors such as diverse weather, varied social distancing norms, different timing of stay-at-home orders, etc. by including the group effect for each state. However, we acknowledge that our study is limited in several key areas. Firstly, the data on confirmed cases and deaths of COVID-19 at CSSE at Johns Hopkins University are derived from publicly available data from multiple sources such as the World Health Organization, the U.S. Centers for Disease Control and Prevention, state and national government health departments, local media reports, etc. [1, 2]. Because of the different COVID-19 case definitions used by different organizations, there could be artificial variability in the data itself. Secondly, the case fatality risk estimation used does not provide the true rate, as there is a substantial lag of reported deaths among reported cases (deaths among hospitalized patients typically occur after 2-3 weeks) [38]. However, this is a limitation of all population-level studies. Thirdly, because of the limited sample size, we were not able to control for all the plausible confounders in our modeling. Fourthly, we did not look at some other potential factors, as this was beyond the scope of this study. Specifically, we could not examine the effect of important chronic medical conditions identified by various other studies, such as hypertension [39], chronic heart disease [40], cancer [41], etc., as well as other air pollutants such as NO2 and ozone [22, 42, 43].
Fifthly, data on a few chronic medical conditions (asthma, COPD, chronic kidney disease) used in this study were obtained from CMS [26]. These are Medicare beneficiary data and hence are not generalizable to the general population. Caution should be taken while interpreting the findings with respect to these three factors. Since the beginning of the pandemic of novel coronavirus, there have been numerous efforts to build better prediction models. However, the predictive performance of these models has not met expectations. The predictors identified by our study will definitely help build better models. Additionally, these findings may help identify the most susceptible and high-risk populations and target public health interventions to focus areas. Lastly, our study also highlights the importance of social distancing as well as health education. To summarize, we identified various county-level independent predictors of initial incidence as well as case fatality risk of COVID-19. The findings can help build better future prediction models. The results also support targeted public health actions by identifying susceptible and high-risk populations as well as counties. Ethics approval and consent to participate This study was considered exempt from Institutional Review Board (IRB) review as we used publicly available, population-level data. Not applicable County-level 4-week COVID-19 incidence (per 100,000 population). Depicts incidence for the first four weeks of COVID-19 spread in each county. Four-week period defined by the date of the 100th reported case in each county. Includes counties with at least 100 cases as of June 1st, 2020. Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries.
This map has been provided by the authors. County-level 4-week COVID-19 case fatality risk. Depicts case fatality risk for the first four weeks of COVID-19 spread in each county. Four-week period defined by the date of the 100th reported case in each county. Includes counties with at least 100 cases as of June 1st, 2020. This is a list of supplementary files associated with this preprint. Appendix.docx COVID-19 Data Repository by the An interactive web-based dashboard to track COVID-19 in real time Geographic Differences in COVID-19 Cases, Deaths, and Incidence - United States Epidemiology of the 2020 pandemic of COVID-19 in the state of Georgia: Inadequate critical care resources and impact after 7 weeks of community spread Epidemiology of the 2020 Pandemic of COVID-19 in the State of Texas: The First Month of Community Spread Lifestyle risk factors, inflammatory mechanisms, and COVID-19 hospitalization: A community-based cohort study of 387,109 adults in UK Hospitalization Rates and Characteristics of Patients Hospitalized with Laboratory-Confirmed Coronavirus Disease 2019 - COVID-NET, 14 States Factors Associated With Intubation and Prolonged Intubation in Hospitalized Patients With COVID-19 Host susceptibility to severe COVID-19 and establishment of a host risk score: findings of 487 cases outside Wuhan Development and Validation of a Clinical Risk Score to Predict the Occurrence of Critical Illness in Hospitalized Patients With COVID-19 Predictive factors for disease progression in hospitalized patients with coronavirus disease 2019 in Wuhan Confirmation of the high cumulative incidence of
thrombotic complications in critically ill ICU patients with COVID-19: An updated analysis Association Between Clinical Manifestations and Prognosis in Patients with COVID-19 Clinical Characteristics and Risk Factors for Mortality of COVID-19 Patients With Diabetes in Wuhan, China: A Two-Center Risk Factors for Mortality in 244 Older Adults With COVID-19 in Wuhan, China: A Retrospective Study Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study Risk factors for severity and mortality in adult COVID-19 inpatients in Wuhan Impact of Social Vulnerability on COVID-19 Incidence and Outcomes in the United States Racial demographics and COVID-19 confirmed cases and deaths: a correlational analysis of 2886 US counties Social determinants of COVID-19 mortality at the county level S. county-level characteristics to inform equitable COVID-19 response Urban Air Pollution Case-Fatality and Mortality Rates in the United States Case Fatality: Rate County Health Rankings & Roadmaps. Robert Wood Johns Found Program 2020 Area Health Resources Files Hospitalization and Mortality among Black Patients and White Patients with Covid-19 Characteristics and Clinical Outcomes of Adult Patients Hospitalized with COVID-19 - Georgia Clinical Characteristics and Morbidity Associated With Coronavirus Disease Are black and Hispanic persons disproportionately affected by COVID-19 because of higher obesity rates? Surg Obes Relat Dis Off Can body mass index predict clinical outcomes for patients with acute lung injury/acute respiratory distress syndrome? A meta-analysis Does Coronavirus Disease 2019 Disprove the Obesity Paradox in Acute Respiratory Distress Syndrome? "Obesity Paradox" in Acute Respiratory Distress Syndrome: A Systematic Review and Meta-Analysis Obesity, overweight and survival in critically ill patients with SARS-CoV-2 pneumonia: is there an obesity paradox?
Preliminary results from Italy The impact of COPD and smoking history on the severity of COVID-19: A systemic review and meta-analysis Severity and Mortality associated with COPD and Smoking in patients with COVID-19: A Rapid Systematic Review and Meta-Analysis Potential association between COVID-19 mortality and healthcare resource availability Likelihood of survival of coronavirus disease 2019 Clinical Characteristics of Coronavirus Disease 2019 in China Characteristics and outcomes of patients hospitalized for COVID-19 and cardiac disease in Northern Italy Cancer patients in SARS-CoV-2 infection: a nationwide analysis in China Assessing nitrogen dioxide (NO2) levels as a contributing factor to coronavirus (COVID-19) fatality Can atmospheric pollution be considered a co-factor in extremely high level of SARS-CoV-2 lethality in Northern Italy? Environ Pollut Barking Essex The datasets analyzed during the current study are available in the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University, https://github.com/CSSEGISandData/COVID-19 [1, 2]. Competing interests The authors declare that they have no competing interests.