key: cord-0999932-j09r4c73 authors: Thomas, Michael M.; Mohammadi, Neda; Taylor, John E. title: Investigating the association between mass transit adoption and COVID-19 infections in US metropolitan areas date: 2021-12-10 journal: Sci Total Environ DOI: 10.1016/j.scitotenv.2021.152284 sha: f438b59eb61cabe7198b09275960c4651dc7f205 doc_id: 999932 cord_uid: j09r4c73

Urbanization introduces the threat of increased epidemic disease transmission resulting from crowding on mass transit. The coronavirus disease 2019 (COVID-19) pandemic, which had directly led to over 600,000 deaths in the US as of July 2021, triggered mass social distancing policies as a key deterrent of widespread infections. Social distancing can be challenging in the confined spaces required for transportation, such as mass transit systems. Little has been published regarding the degree to which mass transit adoption affected the rise of the COVID-19 pandemic in urban centers. Taking an ecological approach where areal data are the unit of observation, this national-scale study aims to measure the association between the adoption of mass transit and COVID-19 spread through confirmed cases in US metropolitan areas. National survey-based transit adoption measures are entered in a negative binomial regression model to evaluate differences between areas. The model results demonstrate that mass transit adoption in US metropolitan areas was associated with the magnitude of outbreaks. Higher incidence of COVID-19 early in the pandemic was associated with survey results conveying higher transit use. Increasing weekly bus transit usage in metropolitan statistical areas by one scaled unit was associated with a 1.38 times (95% CI: 1.25, 1.90) increase in the incidence rate of COVID-19; a 10% increase in weekly train transit usage was associated with a 1.54 times (95% CI: 1.42, 2.07) increase in the incidence rate.
These conclusions should inform early action practices in urban centers with busy transit systems in the event of future infectious disease outbreaks. Deeper understanding of these observed associations may also benefit modeling efforts by allowing researchers to include mathematical adjustments or better explain caveats to results when communicating with decision makers and the public in the crucial early stages of an epidemic.

Diseases that spread through aerosols, droplets, or fomites are often transmitted widely in crowded environments. Therefore, social distancing has been a key feature of prevention strategies for coronavirus disease 2019 (COVID-19). Despite these efforts, as of July 2021 the COVID-19 pandemic had directly led to over 600,000 deaths in the US according to the continually updated dashboard of the Johns Hopkins Center for Systems Science and Engineering (Dong et al., 2021). One important aspect of outbreak prevention involves targeting locations with increased transmission rates with information, testing, and other preventative or assessment measures. Although transmission rates of novel diseases are generally estimated on a population level, an emerging epidemic of such a disease can be challenging to characterize because the key analytic parameters estimated by early studies are specific to the initial outbreak's population. If ungeneralizable data are used to model transmission rates, it is possible that inadequate resources will be allocated to locations with greater disease burden (Jewell et al., 2020). Early estimates of COVID-19 transmission rates were based on data from Wuhan, a city with a popular mass transit system (BBC, 2020; X. Yang et al., 2014). As soon as the intensity of the local epidemic was acknowledged, transit systems in many major cities were reduced or temporarily shut down to avoid increased disease spread.
Once disease spread was marginally contained, authorities requested improved ventilation, sanitation, and social distancing where possible (BBC, 2020; Calvert, 2020; Chan et al., 2020; Shen et al., 2020). Although similar preventative measures were taken in many urban centers, COVID-19 was prevalent in many communities before political and public health agencies acted (Carteni et al., 2020; Jorden et al., 2020; Teixeira & Lopes, 2020). The spread of COVID-19 on transit can be especially challenging to control due to rapid spread and asymptomatic carriers who are not aware of their need to self-quarantine (Chipimo et al., 2020). It became clear as the pandemic unfolded that transmission rates do vary from population to population, so accounting for transit use alongside factors such as race, age, essential worker status, and household structure is integral to estimating disease spread accurately and to determining where disease prevention and medical resources are required (Dasgupta et al., 2020; Grijalva et al., 2020; Lewis et al., 2020; Oster, 2020; Scarpone et al., 2020; Stephanie et al., 2020). Researchers have investigated the impact of transit-acquired infections on previous outbreaks and on airborne endemic diseases such as tuberculosis (Andrews et al., 2013; Zamudio et al., 2015). Because of the challenges involved in contact tracing large crowds, few studies have been able to directly estimate the risks of mass transit use in high disease prevalence areas.
Moser and colleagues' work demonstrated a nearly indisputable instance of a single infectious individual spreading influenza, another disease that transmits well in close proximity, to a number of other riders on an unventilated aircraft (Moser et al., 1979). The circumstances surrounding this study are rare; however, the various facets of non-COVID-19 transit disease transmission risks have been illuminated using myriad alternative methods. Numerous epidemiological study designs have been used to estimate transit adoption's effect on disease spread while avoiding the challenges posed by contact tracing large crowds. Traditional mathematical modeling includes population dynamics methods to estimate the impact of travel on epidemics in ways that avoid having to collect specific individual-level ridership data (Xu et al., 2013). To further address issues of defining ridership habits of study participants and add granularity to the mathematical approach, simulation studies explained the impact of transit adoption in major cities with rail systems using agent-based models that leveraged contact probabilities to estimate spread (Cooley et al., 2011; Yashima & Sasaki, 2014). Similar computational methods in combination with household surveys led to simulation-based contact estimation; these studies found that mass transit acquired infections made up a small but nonnegligible proportion of simulated disease spread in South Africa and the UK, respectively (Johnstone-Robertson et al., 2011; Mossong et al., 2008). Cohort and case-control studies have been conducted to contrast transmissible disease incidence rates among populations who travel in close proximity to others with those among populations that do not (Baker et al., 2010; Horna-Campos et al., 2007; Maogui Hu et al., 2020). Each of these analyses suggests a relationship between infectious disease and transit adoption, but none focuses on COVID-19.
Although there was little published on COVID-19 and transit adoption, researchers conducted literature reviews to summarize findings that convey transportation disease transmission risks or best practices to public health officials (Browne et al., 2016; Edelson & Phypers, 2011; Mohr et al., 2012; Nasir et al., 2016; Tirachini & Cats, 2020). Many of these studies inferred that crowded modes of transit, poor ventilation, and long travel times each increase disease transmission on mass transit; however, few have examined this association in the context of COVID-19. Ecological studies have examined the hypothesis that there is an association between transit adoption and disease transmission by comparing measures of disease between geographic units, some granular enough to define proximity to a rail station as an exposure variable (Sanna & Hsieh, 2017); however, this ecological study was not specific to COVID-19. The association between COVID-19 infections and mass transportation has been established in China (Zheng et al., 2020), but a further national-scale study that considers covariates via explicit modeling of control variables has yet to be published. Implementing an ecological study allows for quick, large-scale analysis of emerging diseases that the aforementioned research methods cannot provide. However, measuring exposure to transit in an ecological analysis can also be a challenge given that there are little available data directly measuring mass transit adoption across the entire population of riders. The literature on transit modes as an exposure in ecological studies often relies on survey data to quantify exposure to different modes of transit, because those data offer a level of detail unavailable in most administrative data (e.g., ridership estimates by a transit authority).
This is relevant to disease modeling because using a diverse array of exposure estimation methods limits potential biases in assessing exposure-outcome relationships. For example, numerous studies have examined the link between ride time and location and bicycle accident frequency, exploiting time spent on a transportation mode as a control variable (Beck et al., 2007; Schneider et al., 2017). Weighted surveys including the National Household Travel Survey (NHTS) have been leveraged to estimate similar risk factors such as ride distance where a simple random sample is infeasible (Buehler & Pucher, 2017; Pucher & Renne, 2003). Researchers also strengthened their transit-related hypotheses by using multiple exposure measures to describe similar constructs (Ferenchak & Marshall, 2020). Many of these analyses are similar to ecological analyses of COVID-19 spread and transit in that they measure the relationship between transit mode adoption and some health outcome. Researchers have used ridership estimates to determine whether increases in the number of transit riders led to sharp increases in COVID-19 cases, but little effect was found after controlling for covariates such as essential worker density (Sy et al., 2020). Other studies have considered urban factors, but did not focus on mass transportation (Ming Hu et al., 2021; Scarpone et al., 2020). This study leverages survey-based exposure measures as opposed to ridership estimates in a national-scale ecological analysis to address this gap in the literature and account for this finding.

Understanding factors that impact disease spread helps modelers create more accurate estimates and, therefore, will improve the accuracy of information used by policymakers. Mass transit adoption may be an important factor in estimating disease spread parameters.
Although transit has been considered a potential risk factor for the spread of COVID-19, there is currently a gap in the research where this association has not been shown for COVID-19 specifically on a scale larger than one city or subnational region using exposure data regarding transit adoption attitudes. This study applies statistical modeling to assess the association between transit adoption and COVID-19 incidence while addressing potential confounding variables that may affect the association on a metropolitan-area level. A hierarchical regression model compares COVID-19 incidence early in the pandemic using transit attitude measures for metropolitan statistical areas (MSA) in the US.

The data used to model the association between mass transit adoption and COVID-19 incidence rates were gathered from the NHTS and the COVID-19 Data Repository by the Center scale with options like "Daily," "A few times a week," "A few times a month," "A few times a year," and "Never." "Daily" and "A few times a week" were grouped in this study, because frequent public transportation use is relatively uncommon in many U.S. cities (Blumenberg et al., 2020), and a combined group retains the same direction of increase in transit usage behavior. MSAs are combinations of counties created by the US Census Bureau to represent urbanized areas containing 50,000 or more residents (U.S. Census Bureau, 2020). The early time period was chosen to demonstrate effects that included disease transmission before and during individuals' changing attitudes towards transit and policies preventing them from using transit. Because many of the public health interventions and changes in human behavior happened early in the pandemic, it is likely that this exposure type did occur more in the first few months of the pandemic than later when policy and risks were clearer (Mohammadi & Taylor, 2020).
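The grouping of the "Daily" and "A few times a week" responses into a single frequent-use category can be illustrated with a minimal Python sketch. It is not the authors' pipeline (the study used the summarizeNHTS R package); the record layout, the `frequent_share_by_msa` helper, the survey weights, and the toy values are all hypothetical and serve only to show how a weighted share of frequent transit users per MSA could be computed.

```python
from collections import defaultdict

# Response categories treated as "frequent" use, following the grouping in the text.
FREQUENT = {"Daily", "A few times a week"}


def frequent_share_by_msa(responses):
    """Return the weighted share of frequent transit users for each MSA.

    `responses` is an iterable of (msa, answer, weight) tuples. The tuple
    layout and the weights are illustrative assumptions, not NHTS field names.
    """
    frequent = defaultdict(float)
    total = defaultdict(float)
    for msa, answer, weight in responses:
        total[msa] += weight
        if answer in FREQUENT:
            frequent[msa] += weight
    # Skip MSAs whose total weight is zero to avoid dividing by zero.
    return {msa: frequent[msa] / total[msa] for msa in total if total[msa] > 0}


# Toy records invented purely for illustration.
sample = [
    ("MSA A", "Daily", 2.0),
    ("MSA A", "Never", 6.0),
    ("MSA B", "A few times a week", 3.0),
    ("MSA B", "A few times a year", 1.0),
]
shares = frequent_share_by_msa(sample)
```

Summing weights rather than counting respondents reflects the weighted design of surveys such as the NHTS; with equal weights the function reduces to a simple proportion.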
Data from the NHTS were estimated using the summarizeNHTS R package (Fucci & Cates, 2018). First, LOESS lines, a locally estimated regression method, and their standard errors were applied to a plot of incidence rate versus the proportion of respondents stating they take mass transit either weekly or daily (Jacoby, 2000). LOESS smoothing is a non-parametric data visualization method used to describe the relationship between two continuous variables.

Negative binomial regression was used to estimate the association between mass transit adoption and COVID-19 incidence. The negative binomial regression model was chosen for its common use in ecological studies of disease involving incidence rates in areal data as well as its ability to replace the Poisson regression model when response data are overdispersed (Szklo & Nieto, 2014). For the negative binomial regression models, MSA-level estimates of COVID-19 incidence in US MSAs are modeled as the response variable $Y_i \sim \mathrm{NB}(\mu_i, \theta)$, where $\theta$ is the dispersion parameter, with an offset term for the population of each MSA. A negative binomial model is a generalized linear model with a log link and response modeled as a negative binomial random variable. Random effects error modeling is used to address the violation of the regression assumption of independent observations. In the context of this analysis, that means that the random effect term is used to model the spatial autocorrelation attributable to differences between US Census Regions, such as climate or elevation, that are not explicitly included among the other independent predictors. In this model, case counts in each MSA, $Y_i$, are assumed to be independent random variables each with mean $\mu_i$. The model is stated as

$$\log(\mu_i) = \beta_0 + \beta_1 x_i + \sum_{k=2}^{p} \beta_k z_{ik} + u_{r(i)} + \log(n_i)$$

In the model, each of the unknown parameters representing fixed effects is represented by $\beta$. $\beta_1$ measures the association between mass transit usage $x_i$ (i.e., bus and train) and cases, $\beta_2, \dots, \beta_p$ measure each of the covariates' ($z_{ik}$) impact on cases, $u_{r(i)}$ is the random effect for the Census Region containing MSA $i$, and an offset term ($\log(n_i)$) is included to control for the size of an area's population ($n_i$).
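The role of the log link and the population offset described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation (the study fit its models in R with lme4): the function names, the single-covariate linear predictor, the dispersion parameterization (variance equal to mu + mu^2 / theta), and all numeric inputs are hypothetical.

```python
import math


def expected_cases(population, beta0, beta_transit, transit_share, region_effect=0.0):
    """Mean case count under a log link with a log-population offset.

    All parameter names are hypothetical; `region_effect` stands in for a
    region-level random effect added on the log scale.
    """
    linear_predictor = beta0 + beta_transit * transit_share + region_effect
    # The offset log(population) enters with a fixed coefficient of 1,
    # so the remaining terms describe the incidence rate per person.
    return math.exp(linear_predictor + math.log(population))


def nb_log_pmf(y, mu, theta):
    """Log-probability of count y for a negative binomial with mean mu and
    dispersion theta (variance mu + mu**2 / theta), an assumed parameterization."""
    return (
        math.lgamma(y + theta) - math.lgamma(theta) - math.lgamma(y + 1)
        + theta * math.log(theta / (theta + mu))
        + y * math.log(mu / (theta + mu))
    )


def incidence_rate_ratio(beta, se, z=1.96):
    """Exponentiate a log-scale coefficient and its Wald interval bounds."""
    return math.exp(beta), (math.exp(beta - z * se), math.exp(beta + z * se))
```

Because the offset is added on the log scale, doubling `population` doubles the expected count while leaving the rate unchanged, and exponentiating a coefficient yields the multiplicative change in incidence rate per unit of the predictor, which is the form in which the abstract reports its estimates.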
The offset term is used to define the output in terms of impact on the rate for each location as opposed to raw case counts; this is necessary because the population in an area impacts the number of cases that arise. The primary exposure of interest is the proportion of survey respondents stating weekly or daily bus or train transit usage (i.e., they responded "daily" or "weekly" to the survey item asking about their transit use frequency). The negative binomial regression model parameters were estimated using iteratively reweighted least squares. Base R and lme4 software packages were used to estimate the parameters for the model (Bates et al., 2015; R Core Team, 2018).

The maps in Figure 1 show that survey transit crowding is likely highest in the New York City MSA, and therefore, that municipal area would suffer the greatest disease spread if the hypothesis of this study is correct.

The results of this study are of value to epidemiologists, policymakers, and municipal transit providers. Modeling attempts made earlier in the pandemic were integral to the allocation of resources in early stages of transmission in the US. Literature has criticized some of these models for using data from southern Italy and Hubei province to inform large areas that differ in key factors relating to the application of social distancing (Jewell et al., 2020). Paying attention to these factors, such as mass transit adoption in a region, when generating or presenting model results may help policy makers avoid mistakenly sending resources to areas with fewer transmission risks while leaving areas with high transmission risk without aid. Epidemiologic modelers can modify disease transmission parameters with mathematical adjustments or explain the implications of limited generalizability when presenting models to policy makers and the public.
Additionally, it may be possible to use the NHTS estimates of transit to weight the disease transmission rates in models, explicitly outlining their impact. Awareness regarding the risks of transit can keep the aforementioned entities ahead of heuristic assessments of the risks of riding transit. Preempting these risks could help prevent the magnitude of transit ridership drop-offs that occurred at the beginning of the pandemic and continued through the ensuing year (Sharifi & Khavarian-Garmsir, 2020). Some research has shown that these changes may persist, so it is imperative that those advocating for transit adoption remain aware of the risks and actively address them when they appear (Mohammadi & Taylor, 2020). Although this study identifies mass transit adoption as one of many risk factors for COVID-19 transmission in cities, it should be mentioned that there are key steps that can curtail this effect. Tirachini and Cats (2020) as well as Nasir et al. (2016)

The analysis in this study elucidated the association between the stated adoption of mass transit and COVID-19 incidence in US metropolitan statistical areas; there is evidence that urban centers with high-usage transit systems experience different disease transmission rates than those without. This finding adds to existing findings by focusing specifically on COVID-19 as opposed to other infectious diseases and by expanding the analysis to a national scale while controlling for relevant crowding and socioeconomic indicators. This finding implies that transit infrastructure must respond quickly when a potential pandemic situation is developing. Ventilation, social distancing, and other public health measures support meeting this need.
This finding also suggests that modeling efforts undertaken to estimate the number of cases and, by extension, impact of policies early in a pandemic, such as the 2020 COVID-19 pandemic, should take measures to control for the impact of mass transit and possibly other urban crowding measures in their estimates. This directly relates to models generated for COVID-19, some of which used transmission rates from Wuhan, China, a region with higher transit usage than most US urban centers. Although their work was integral to the COVID-19 response, in order to retain the trust of the public and lawmakers, it is essential that epidemiologic modelers provide as accurate and consistent models as possible. Considering transit and other relevant population characteristics provides an essential component of meeting that end and ultimately reducing the burden of devastating health emergencies.

Modeling the role of public transportation in sustaining tuberculosis transmission in South Africa
Transmission of pandemic A/H1N1 2009 influenza on passenger aircraft: retrospective cohort study
Fitting Linear Mixed-Effects Models Using lme4
Coronavirus: Wuhan Shuts Public Transport over Outbreak
Motor Vehicle Crash Injury Rates by Mode of Travel, United States: Using Exposure-Based Methods to Quantify Differences
2017) using four exposure metrics
summarizeNHTS. GitHub
Transmission of SARS-COV-2 Infections in Households - Tennessee and Wisconsin
Testing individuals for coronavirus disease 2019 (COVID-19)
Public transportation and pulmonary tuberculosis
Risk of Coronavirus Disease 2019 Transmission in Train Passengers: an Epidemiological and Modeling Study
The role of built and social environmental factors in Covid-19 transmission: A look at America's capital city
Loess: a nonparametric, graphical tool for depicting relationships between variables
Caution warranted: using the Institute for Health Metrics and Evaluation model for predicting the course of the COVID-19 pandemic
Social mixing patterns within a South African township community: implications for respiratory disease transmission and control
Evidence for limited early spread of COVID-19 within the United States
Disparities in COVID-19 Incidence, Hospitalizations, and Testing, by Area-Level Deprivation - Utah
Impact of meteorological factors on the COVID-19 transmission: A multi-city study in China
Human-Infrastructure Interactional Dynamics: Simulating COVID-19 Pandemic Regime Shifts. 2020 Winter Simulation Conference (WSC)
Evidence for airborne infectious disease transmission in public ground transport--a literature review
An outbreak of influenza aboard a commercial airliner
Social contacts and mixing patterns relevant to the spread of infectious diseases
Airborne biological hazards and urban transport infrastructure: current challenges and future directions
Transmission dynamics by age group in COVID-19 hotspot counties - United States
Safer cycling through improved infrastructure
Socioeconomics of urban travel
COVID-19 case fatality rate and tuberculosis in a metropolitan setting
Ascertaining the impact of public rapid transit system on spread of dengue in urban settings
A multimethod approach for county-scale geospatial analysis of emerging infectious diseases: a cross-sectional case study of COVID-19 incidence in Germany
Comparison of US metropolitan region pedestrian and bicyclist fatality rates
The COVID-19 pandemic: Impacts on cities and major lessons for urban planning, design, and management
Prevention and control of COVID-19 in public transportation: experience from China
Geographic Differences in COVID-19 Cases, Deaths, and Incidence - United States
Socioeconomic disparities in subway use and COVID-19 outcomes
Communicating Results of Epidemiologic Studies
The link between bike sharing and subway use during the COVID-19 pandemic: The case-study of New York's Citi Bike
COVID-19 and public transportation: Current assessment, prospects, and research needs
Spatial spread of an epidemic through public transportation systems with a hub
Changing disparities in COVID-19 burden in the ethnically homogeneous population of Hong Kong through pandemic waves: an observational study
Seasonal variation of newly notified pulmonary tuberculosis cases from
Epidemic process over the commute network in a metropolitan area
Public transportation and tuberculosis transmission in a high incidence setting
Spatial transmission of COVID-19 via public and private transportation in China