key: cord-0259807-kbv9kh6z
authors: Singer, Gregor; Graff Zivin, Joshua; Neidell, Matthew; Sanders, Nicholas
title: Air Pollution Increases Influenza Hospitalizations
date: 2020-04-10
journal: nan
DOI: 10.1101/2020.04.07.20057216
sha: 840262b2784d16557957bfdccf141c8bec761fe2
doc_id: 259807
cord_uid: kbv9kh6z

Seasonal influenza is a recurring health burden shared widely across the globe. We study whether air quality affects the occurrence of severe influenza cases that require inpatient hospitalization. Using longitudinal information on local air quality and hospital admissions across the United States, we find that poor air quality increases the incidence of significant influenza hospital admissions. Effects diminish in years with greater influenza vaccine effectiveness. Apart from increasing vaccination rates, improving air quality may help reduce the spread and severity of influenza.

Seasonal influenza is a global health threat, with an average of 3-5 million severe cases per year and 290,000 to 650,000 respiratory deaths (1, 2). The disease exhibits variability in spread and severity across individuals, regions, and over time. Prior research has produced two broad sets of findings to explain this variation: a) meteorological factors that affect the spread of the virus, such as temperature, sunlight and humidity (3, 4, 5, 6, 7, 8); and b) individual level host factors, such as age, sex, underlying health and smoking that affect the intensity of symptoms (9, 10). We know considerably less, however, about how air pollution affects influenza spread and severity, a surprising gap given the pervasiveness of air pollution around the world and the well-established policy tools available to control it. Air pollution could affect influenza hospitalizations via both susceptibility and exposure (11).
Like smoking (10), air pollution can impair the respiratory functioning of patients, e.g., by damaging the respiratory epithelium, thereby facilitating the progression of influenza virus beyond the epithelial barrier into the lungs (12, 13, 14, 15). Existing medical research finds that exposing respiratory epithelial cells in vitro to air pollution increases susceptibility to and penetration of influenza (13), and experimental exposure of mice to air pollution before influenza infections increased morbidity and mortality (16, 17). Like humidity and temperature (5, 6, 7, 18, 19), air pollution particles could also affect the airborne survival of viruses outside the body (18, 20, 21, 22, 23, 24) and thus increase the probability of disease transmission.

We build on the existing evidence that links ambient air pollution with influenza spread and severity (13, 16, 17, 21, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35) with two significant advancements toward improving causal inference (36). First, we use a long panel of influenza-specific hospital admissions from numerous states across the United States (U.S.) to estimate statistical models that exploit both spatial and temporal variation within counties over time, limiting threats from confounding factors. Second, to better understand the causal link, we explore the role of the influenza vaccine in moderating this relationship. If the vaccine reduces infections and the probability of influenza spread, seasons in which the vaccine is more effective should weaken the link between air pollution and influenza (37).

Our analysis uses patient level data on inpatient hospitalization (39), which allows us to focus on severe cases specifically limited to influenza (for details on data, descriptives, and empirical methods see Supplementary Appendix S.1, S.2 and S.3).
Our principal outcome of interest is the number of inpatient admissions per county-month where the primary diagnosis is influenza according to the International Classification of Diseases (ICD) (40). We combine this with high frequency air pollution readings of local ground monitors across the U.S., as well as data on local temperature, specific humidity, precipitation and wind speed (41). The richness of our data allows us to include a wide variety of both regional and temporal controls. Our preferred specification includes county-by-year and month-by-year fixed effects. County-by-year effects control for differences in unobserved characteristics such as demographics, socio-economic factors, and health care access and protocols that influence pollution exposure and health outcomes across counties separately for each year. The month-by-year fixed effects control for general monthly and seasonal trends within each year in both influenza and pollution (42).

As our measure of pollution, we use the U.S. Environmental Protection Agency's Air Quality Index (AQI), which we aggregate to county-by-month-by-year to match outcomes. The AQI is a measure of overall air quality based on the primary criteria pollutants specified in the Clean Air Act. Aggregation of pollutants means there are no real "units" for the measure. It is designed such that higher AQI values indicate worse air quality. To ensure we capture exposure to air pollution before diagnosis, we lag the AQI by one month. In all of our analyses, we focus on the influenza season (October to March).

Figure 1 shows the seasonality of inpatient hospitalizations in our data (Figure 1a), which matches closely with general influenza-like illnesses reported by the Centers for Disease Control and Prevention (CDC) (Figure 1b). Figure 1c shows the age distribution of hospital admissions, which has important implications for vaccine effectiveness, described in more detail below.
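The exposure construction described above (monthly county averages of the AQI, lagged by one month and restricted to the October to March season) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names, the tuple layout and the county label `county_a` are hypothetical, and the readings are made-up values.

```python
from collections import defaultdict
from statistics import mean

FLU_SEASON_MONTHS = {10, 11, 12, 1, 2, 3}  # October through March

def monthly_mean_aqi(daily_readings):
    """Average daily AQI values within each (county, year, month) cell."""
    cells = defaultdict(list)
    for county, year, month, aqi in daily_readings:
        cells[(county, year, month)].append(aqi)
    return {cell: mean(values) for cell, values in cells.items()}

def lagged_aqi(monthly, county, year, month):
    """Return the previous calendar month's mean AQI, or None if unavailable."""
    prev_year, prev_month = (year - 1, 12) if month == 1 else (year, month - 1)
    return monthly.get((county, prev_year, prev_month))

# Hypothetical daily readings: (county, year, month, AQI)
daily = [
    ("county_a", 2010, 12, 40), ("county_a", 2010, 12, 60),
    ("county_a", 2011, 1, 80),
]
monthly = monthly_mean_aqi(daily)

# Exposure assigned to January 2011 admissions is the December 2010 mean.
january_exposure = lagged_aqi(monthly, "county_a", 2011, 1)
```

Handling the January case explicitly matters because the lag crosses a calendar-year boundary in the middle of the influenza season.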
Figure 2a shows a clear positive correlation between air quality and count of influenza admissions in the raw data; higher AQI correlates with more influenza admissions (43). Figure 2b shows the correlation after adjusting both variables for fixed effects and weather controls. After this adjustment, a strong, positive correlation remains. Table 1 shows estimates from Poisson Pseudo-Maximum Likelihood regressions given the count nature of the dependent variable. The coefficients represent the change in the expected log of inpatient admission counts, which approximates a percentage change in number of county-year-month admissions within our data (44). Column (1) implies a 1-unit increase in the lagged monthly AQI results in a 0.56% increase in inpatient influenza admissions. To put this estimate in national context, a one standard deviation increase in AQI (12.79-unit increase in our data) amounts to approximately 4,064 additional inpatient hospitalizations for the 6-month influenza season in the U.S. (45). Column (2) replaces our continuous measure of air quality with the count of days in a month with air quality the EPA classifies as "unhealthy for sensitive groups" (AQI ≥ 100). These days are rare: in our data, the average county has around 0.4 such days per month. An additional unhealthy air quality day raises admission counts by approximately 5%. Continuing with our U.S.-wide calculation, an additional unhealthy air quality day in each county generates 2,786 additional inpatient hospitalizations per influenza season. We next interact our air quality measure with a measure of influenza vaccine effectiveness. Every year, the CDC reports results from small-scale studies of that season's influenza vaccine effectiveness rate by age group (see details in Supplementary Appendix S.1). 
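The national back-of-the-envelope figure for a one standard deviation increase in AQI can be reproduced with a few lines of arithmetic. The sketch below uses the factors the paper reports for this calculation (the 0.56% coefficient, the 12.79-unit standard deviation, 3.01 average admissions per county-month, 3,142 county equivalents and 6 season months); the function and variable names are illustrative, and the exponential variant is added here only for comparison with the linear approximation.

```python
import math

BETA_AQI = 0.0056        # Column (1): 0.56% more admissions per AQI unit
AQI_SD = 12.79           # one standard deviation of AQI in the data
MEAN_ADMISSIONS = 3.01   # average inpatient admissions per county-month
N_COUNTIES = 3142        # US county equivalents
SEASON_MONTHS = 6        # October through March

def additional_admissions(relative_increase):
    """Scale a relative increase to seasonal, nationwide admissions."""
    return relative_increase * MEAN_ADMISSIONS * N_COUNTIES * SEASON_MONTHS

# Linear approximation: coefficient times the AQI change.
linear = additional_admissions(BETA_AQI * AQI_SD)

# Exact Poisson implication, exp(beta * change) - 1, shown for comparison.
exact = additional_admissions(math.exp(BETA_AQI * AQI_SD) - 1)

print(round(linear))  # 4064
```

The linear version matches the reported 4,064; the exponential version is somewhat larger because the log-linear model compounds over a 12.79-unit change.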
Based on the histogram in Figure 1c, we use the vaccine effectiveness for the two age groups traditionally susceptible to health complications from influenza: children up to 8 and adults 65 and older. Together, these groups comprise 65% of inpatient hospitalizations in our data. Figure 3 shows the regression-adjusted relationship between AQI and influenza admissions separately in seasons of low vaccine effectiveness and high vaccine effectiveness for the up to 8-year-old group and 65-year-and-older group, as determined by a median sample split (46). For both age groups, the relationship between air quality and admissions rates flattens and effectively disappears in years of high vaccine effectiveness. Columns (3) and (4) of Table 1 show a similar story using a continuous measure of vaccine effectiveness. A vaccine effectiveness of 53% for the up to 8-year-old group or 34% for the 65-year-and-older group nullifies the link between air pollution and influenza hospitalizations (47).

While our fixed effects can address many unobservable factors, there remain possible confounders in establishing a causal link between pollution exposure and influenza hospitalizations. Air quality could trigger health problems in sensitive populations (e.g., asthmatics) who would then go to the hospital, where they might be observed to have influenza. For this reason, our analysis focuses on patients whose primary diagnosis is influenza and ignores occurrences of influenza in secondary diagnoses. We also repeat our analysis using two alternative measures: patients where influenza is the only diagnosis and patients where any diagnosis is influenza. Supplementary Appendix S.4 shows that our results are robust to either of these alternatives. We perform various falsification tests by repeating our analysis using health outcomes that should not be affected by air quality: diabetes mellitus with complications; urinary tract infections; skull and face fractures; and osteoarthritis (48).
The result of a falsification test in Column (5), using the combined number of the above health outcomes, indicates a precise zero to the thousandth decimal place. We present estimates on each of these four falsification outcomes individually in Supplementary Appendix S.4 with similar results.

Supplementary Appendix S.4 explores heterogeneity and conducts further sensitivity analysis and robustness checks. Our estimates are stable across gender and age groups. We find suggestively larger effects for Black and Hispanic patients, but the estimates are not statistically different from those for white patients. We show robustness to (i) different weather controls, (ii) additional fixed effects, (iii) multilevel clustering of standard errors, (iv) different winsorization and interpolation of the raw AQI data, (v) including out-of-state patients at hospitals, (vi) focusing on states with a long time series only, (vii) using missing values instead of zeros for county-months with no hospital admissions, and (viii) using a linear ordinary least squares estimator instead of a Poisson Pseudo-Maximum Likelihood estimator. We also show the effect of air pollution on outpatient hospitalizations is larger than for inpatient hospitalizations, consistent with the notion that emergency department encounters are more frequent (but also less severe) than those requiring admission to the hospital.

As a final consideration, we shift from additional influenza cases to an economic endpoint. Column (6) of Table 1 shows an ordinary least squares estimate of the effect of AQI on hospitalization charges for influenza admissions. This suggests a one-unit increase in AQI increases hospital billing by approximately $4,929 per month in the average county during influenza season. Across the U.S., a one standard deviation increase in AQI (12.79-unit increase) generates an additional $1.19 billion in inpatient hospital charges per influenza season.
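The nationwide charge figure follows from the per-county-month estimate by simple scaling. The sketch below assumes the same scaling factors as the admissions calculation (3,142 county equivalents and a 6-month season); the paper does not spell out the factors for this step, so treat them as an assumption that happens to reproduce the reported total.

```python
CHARGE_PER_AQI_UNIT = 4929   # USD per county-month, Column (6) of Table 1
AQI_SD = 12.79               # one standard deviation of AQI
N_COUNTIES = 3142            # assumption: same county count as for admissions
SEASON_MONTHS = 6            # assumption: October through March

def seasonal_charges(aqi_change):
    """Nationwide additional inpatient charges per influenza season, in USD."""
    return CHARGE_PER_AQI_UNIT * aqi_change * N_COUNTIES * SEASON_MONTHS

total = seasonal_charges(AQI_SD)
print(f"${total / 1e9:.2f} billion")  # $1.19 billion
```

Because the charge regression is linear (ordinary least squares), the scaling is exactly proportional to the AQI change, unlike the Poisson admissions model.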
Using a rich longitudinal dataset, we provide causal evidence that air pollution increases hospitalization rates for seasonal influenza. Our findings offer novel evidence important for policy making, highlighting the heightened importance of increasing vaccination rates in polluted urban centers (49). This is especially important in developing countries, which house the most polluted cities in the world and have very low baseline vaccination rates (50). They also imply pollution controls can provide an important hedge against antigenic drift or shift in the influenza virus that renders the vaccine significantly less effective in some years, helping reduce global medical spending, avoid lost productivity, and reduce loss of human life. If our results generalize to other respiratory viral infections, our estimates will significantly understate the infectious-disease-related benefits from environmental protection (51). They may also provide important insights for the ongoing fight against the COVID-19 pandemic (52). Social distancing and large scale reductions in economic activity aimed at reducing viral spread have also reduced air pollution (53), which may be helping reduce the impacts of the disease. As countries relax restrictions and economic activity resumes, they may choose to reduce environmental regulations in exchange for a more rapid return to economic growth: the U.S. EPA recently announced plans to suspend enforcement of environmental laws during the pandemic (54). Our results suggest there could be additional disease-related social costs to consider when worsening air quality during the economic recovery.

[8] A. I. Barreca, J. P. Shimshack, Absolute humidity, temperature, and influenza mortality: 30 years of county-level evidence from the United States. American Journal of Epidemiology 176, S114 (2012).
[11] A separate literature shows air pollution has mortality effects through and on top of concurrent influenza episodes (55, 56, 57).
[12] G. Diamond, D.
Legarda, L. K. Ryan, The innate immune response of the respiratory epithelium. Immunological Reviews 173, 27 (2000).
[40] For our baseline results we count patients whose primary disease ICD code is influenza, but show robustness to alternative definitions. That is, we exclude, for example, patients with bacteria-related pneumonia or other respiratory diseases as their primary diagnosis.
[41] We include weather controls to address the link between both influenza and weather (temperature and humidity can both influence influenza transmission rates) and weather and pollution (different climatic conditions can lead to different levels of air quality, all else held constant).
[42] In models where we interact vaccine effectiveness, we control for county-by-influenza season effects since vaccine effectiveness varies by season. The baseline model is robust to controlling for county-by-influenza season fixed effects (see Supplementary Appendix S.4).
[43] This relationship can be spuriously driven by external factors that affect both pollution levels and admission rates. For example, more populated counties typically have higher pollution levels and (mechanically) higher hospital admissions. Seasonality can also be a factor, as particulate matter and carbon monoxide, two common lung irritants included in the AQI, peak in winter months just as influenza admissions do. Our set of fixed effects can address both of these issues.
[44] We cluster all standard errors at the county level and provide further robustness checks in Supplementary Appendix S.4.
[45] We multiply the 12.79-unit increase by 0.0056, by the average inpatient admissions per county-month (3.01), the total number of US county equivalents according to the US Census Bureau (3142) (58) and by the 6 months within an influenza season. Since reporting is voluntary, our hospital data are not exhaustive.
If reporting behavior does not correlate with the likelihood of influenza infections, the 0.56% relative increase in admissions should not be affected. However, the translation into absolute admissions is likely underestimated.
[46] For the up to 8-year-olds the median vaccine effectiveness is 45% and for those 65 and older the median is 36%.
[47] Vaccine effectiveness during our study period ranges from 25-57% for those up to 8 and 0-50% for those 65 and older.
[48] These four diseases are a random selection of disease groups we think are unlikely to be correlated with air pollution, and also occur a sufficient number of times in the hospitalization data. See the Supplementary Appendix S.4 and S.1 for details on ICD codes, estimation and results.
[49] In a study of the Spanish flu in 1918, (32) show cities with higher coal-fired power generating capacity saw higher mortality rates, potentially through exposure to higher air pollution aggravating either of these vectors. Our data allow us to more narrowly investigate this link by directly assessing the impact of air pollution readings on influenza-diagnosed hospitalizations across the U.S. during the modern pollution control era.
[50] C.

Tables S.1 to S.6

All rights reserved. No reuse allowed without permission. author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint (which was not peer-reviewed) is the . https://doi.org/10.1101/2020.04.07.20057216 doi: medRxiv preprint

Hospitalization data: We use hospital admission data from the Healthcare Cost and Utilization Project (HCUP) and focus on the inpatient data from hospital stays (39). We exploit patient level information on diagnosed diseases per International Classification of Diseases (ICD) codes, patient zip codes, admission months, age, gender, race as well as hospital charges. The data are available for a subset of U.S. states and years from 1991 (see Table S.1).
We convert monetary hospital charges to common 2018 US$ using a GDP deflator from the World Bank (60) . To identify influenza hospitalizations, we count patients whose primary diagnosis is a strain of influenza. We use the Clinical Classifications Software (CCS) from the Agency for Healthcare Research and Quality (AHRQ) to classify relevant influenza ICD codes. These are the following ICD-9-CM codes: 4870, 4871, 4878, 488, 4880, 48801, 48802, 48809, 4881, 48811, 48812, 48819, 48881, 48882, 48889; and, for the period from October 2015 when the system was changed to ICD-10-CM, the following ICD-10-CM codes: J09X1, J09X2, J09X3, J09X9, J1000, J1001, J1008, J101, J102, J1081, J1082, J1083, J1089, J1100, J1108, J111, J112, J1181, J1182, J1183, J1189. We exclude patients whose primary diagnosis is not influenza, even if influenza is included among secondary diagnoses. Counting primary influenza diagnoses reflects a middle ground between two extreme alternatives for which we perform robustness checks. In one robustness check, we count patients who have any (primary or secondary) influenza diagnosis. In another robustness check we only count patients for whom influenza is their only diagnosis. We exclude patients whose zip code is from a different state than the hospital in which they are treated. Hospitalization data are available at the patient zip code-by-month level, which we aggregate to the county-by-month level. We assign a zero value for admissions to counties in the months with no reported influenza admission. We only do this for counties and months in states that report data in the given year. During the influenza season from October to March, 57% of county-months have no influenza related hospital admissions in the HCUP data. Our results are robust with and without using the zero valued county-months in our estimations. 
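The aggregation and zero-filling rule described above can be sketched in a few lines. This is an illustrative outline only: the zip-to-county crosswalk, the record layout and all labels (`zip_a`, `county_x`, and so on) are hypothetical placeholders, and `reporting_cells` stands in for the set of county-months belonging to states that report data in the given year.

```python
from collections import Counter

def county_month_counts(admissions, zip_to_county, reporting_cells):
    """
    Count primary-influenza admissions per (county, year, month).

    admissions:      iterable of (zip_code, year, month), one per patient stay
    zip_to_county:   dict mapping patient zip code to county (hypothetical crosswalk)
    reporting_cells: (county, year, month) cells in states reporting that year
    """
    counts = Counter()
    for zip_code, year, month in admissions:
        counts[(zip_to_county[zip_code], year, month)] += 1
    # Cells with no reported admission get an explicit zero; cells outside
    # reporting states are never created and so remain missing.
    return {cell: counts.get(cell, 0) for cell in reporting_cells}

# Made-up example: two stays in one county, none in the other.
admissions = [("zip_a", 2009, 1), ("zip_b", 2009, 1)]
zip_to_county = {"zip_a": "county_x", "zip_b": "county_x"}
reporting_cells = [("county_x", 2009, 1), ("county_y", 2009, 1)]

panel = county_month_counts(admissions, zip_to_county, reporting_cells)
```

Separating "zero admissions" from "state did not report" is what keeps the large share of zero-valued county-months from being confused with missing data.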
In four falsification tests, we use outcomes less likely to be affected by air quality: primary ICD codes associated with (i) diabetes mellitus with complications, (ii) urinary tract infections, (iii) skull and face fractures, and (iv) osteoarthritis. We use the categories and ICD codes from the Clinical Classifications Software (CCS) from the Agency for Healthcare Research and Quality (AHRQ). See Section S.1.1 for details. For a further robustness check, we use outpatient data from emergency departments (61) instead of the inpatient data, with the same strategy of counting influenza patients as above.

To measure air quality, we use the EPA Air Quality Index (AQI), which measures air quality derived from ground monitors (62). The AQI captures pollution from particulate matter (PM2.5), sulfur dioxide (SO2), carbon monoxide (CO), nitrogen dioxide (NO2) and ozone (O3). Further details on AQI calculation are provided by the EPA (63). We use the daily, county level, pre-aggregated data and further aggregate up to the county-by-month level. For missing county-months, we take the average value of the adjacent counties in the same month. We use the average value of the AQI within a month as well as the number of days with air at least "unhealthy for sensitive groups" according to the EPA (AQI ≥ 100). We winsorize the AQI at the top and bottom 1% for the main analysis and show robust results without winsorization. For our analysis, we take the one month lagged AQI to identify exposure to air pollution before influenza diagnosis and not afterwards.

We use pre-aggregated monthly weather averages from (64, 65), including temperature, specific humidity, wind speed, and precipitation, and aggregate grid points up to the county-by-month level.

We use data on the timing of national influenza-like illnesses from the CDC (38) to identify the main influenza months: October through March (see Figure 1b). This coincides with the reported influenza season in various CDC publications. We restrict our main analysis to this influenza season.

Vaccine effectiveness: We use the estimated vaccine effectiveness, for different age groups, by influenza season, from the CDC (66). Underlying cited studies are available from 2007/2008. Since vaccine effectiveness can vary across age groups during the same influenza season, we use the reported effectiveness of the two age groups most relevant for our study: children up to 8 years old and people 65 years and older. Figure 1c shows these are the main age groups observed in the HCUP inpatient data with primary influenza diagnoses.

We use the categories from the Clinical Classifications Software (CCS) from the Agency for Healthcare Research and Quality (AHRQ) to identify the relevant ICD codes.
E093392, E093393, E093399, E09341, E093411, E093412, E093413, E093419, E09349, E093491, E093492, E093493, E093499, E09351, E093511, E093512, E093513, E093519, E093521, E093522, E093523, E093529, E093531, E093532, E093533, E093539, E093541, E093542, E093543, E093549, E093551, E093552, E093553, E093559, E09359, E093591, E093592, E093593, E093599, E0936, E0937X1, E0937X2, E0937X3, E0937X9, E0939, E0940, E0941, E0942, E0943, E0944, E0949, E0951, E0952, E0959, E09610, E09618, E09620, E09621, E09622, E09628, E09630, E09638, E09641, E09649, E0965, E0969, E098, E1010, E1011, E1021, E1022, E1029, E10311, E10319, E10321, E103211, E103212, E103213, E103219, E10329, E103291, E103292, E103293, E103299, E10331, E103311, E103312, E103313, E103319, E10339, E103391, E103392, E103393, E103399, E10341, E103411, E103412, E103413, E103419, E10349, E103491, E103492, E103493, E103499, E10351, E103511, E103512, E103513, E103519, E103521, E103522, E103523, E103529, E103531, E103532, E103533, E103539, E103541, E103542, E103543, E103549, E103551, E103552, E103553, E103559, E10359, E103591, E103592, E103593, E103599, E1036, E1037X1, E1037X2, E1037X3, E1037X9, E1039, E1040, E1041, E1042, E1043, E1044, E1049, E1051, E1052, E1059, E10610, E10618, E10620, E10621, E10622, E10628, E10630, E10638, E10641, E10649, E1065, E1069, E108, E1100, E1101, E1110, E1111, E1121, E1122, E1129, E11311, E11319, E11321, E113211, E113212, E113213, E113219, E11329, E113291, E113292, E113293, E113299, E11331, E113311, E113312, E113313, E113319, E11339, E113391, E113392, E113393, E113399, E11341, E113411, E113412, E113413, E113419, E11349, E113491, E113492, E113493, E113499, E11351, E113511, E113512, E113513, E113519, E113521, E113522, E113523, E113529, E113531, E113532, E113533, E113539, E113541, E113542, E113543, E113549, E113551, E113552, E113553, E113559, E11359, E113591, E113592, E113593, E113599, E1136, E1137X1, E1137X2, E1137X3, E1137X9, E1139, E1140, E1141, E1142, E1143, E1144, E1149, E1151, E1152, E1159, 
E11610, E11618, E11620, E11621, E11622, E11628, E11630, E11638, E11641, E11649, E1165, E1169, E118, E1300, E1301, E1310, E1311, E1321, E1322, E1329, E13311, E13319, E13321, E133211, E133212, E133213, E133219, E13329, E133291, E133292, E133293, E133299, E13331, E133311, E133312, E133313, E133319, E13339, E133391, E133392, E133393, E133399, E13341, E133411, E133412, E133413, E133419, E13349, E133491, E133492, E133493, E133499, E13351, E133511, E133512, E133513, E133519, E133521, E133522, E133523, E133529, E133531, E133532, E133533, E133539, E133541, E133542, E133543, E133549, E133551, E133552, E133553, E133559, E13359, E133591, E133592, E133593, E133599, E1336, E1337X1, E1337X2, E1337X3, E1337X9, E1339, E1340, E1341, E1342, E1343, E1344, E1349, E1351, E1352, E1359, E13610, E13618, E13620, E13621, E13622, E13628, E13630, E13638, E13641, E13649, E1365, E1369, E138.

03284, 59000, 59001, 59010, 59011, 5902, 5903, 59080, 59081, 5909, 5950, 5951, 5952, 5953, 5954, 59581, 59582, 59589, 5959, 5970, 59780, 59781, 59789, 59800, 59801, 5990;

Table S.1 contains states and years with available admission months and patient zip codes in the inpatient hospitalization data we use. Table S.2 contains summary statistics for inpatient hospital admissions with a primary influenza diagnosis, average monthly AQI per county-month, and the number of days with AQI ≥ 100. We use the standard deviation of the AQI during the influenza season (12.79) as well as the average inpatient hospitalization numbers (3.01) for the calculation of absolute effects based on our Poisson Pseudo-Maximum Likelihood estimation.
State                            Years
Arizona                          1991-2015
(state name not recoverable)     1991-2009, 2011-2015
New York                         1993-2009, 2011-2015
Oregon                           1999, 2008, 2009
South Dakota                     2009
Utah                             2009
Vermont                          2009
Washington                       1993-2009, 2011-2013
Wisconsin                        1999, 2009

Notes: The table shows the states and years used in the main analysis.

We estimate the relationship between influenza-related inpatient hospitalizations $H_{cym}$ and the lagged air quality index $AQI_{cym-1}$ at the county $c$ by calendar month $m$ by year $y$ level using a Poisson model:

$H_{cym} = \exp(\beta AQI_{cym-1} + X_{cym}\delta + \gamma_{cy} + \mu_{ym} + \epsilon_{cym})$.

We include county-by-year fixed effects $\gamma_{cy}$ to control for changing factors such as population size, income, demography and influenza testing procedures across counties and time. This also captures unobserved annual shocks at the county level that affect both air pollution and hospitalizations. Calendar month-by-year fixed effects $\mu_{ym}$ control for a flexible overall time trend. Results are robust to including additional fixed effects such as state-by-calendar month or county-by-influenza season fixed effects.
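To make the interpretation of the coefficient on lagged AQI concrete, the sketch below evaluates the conditional mean implied by the exponential model and shows that a one-unit AQI change scales expected admissions by exp(β) regardless of the fixed effects. It is an illustration of the functional form only, not an estimation routine; the function names and all numeric inputs other than the 0.0056 coefficient reported in the main text are hypothetical.

```python
import math

def expected_admissions(beta, aqi_lag, weather, delta, gamma_cy, mu_ym):
    """Conditional mean exp(beta*AQI + X*delta + gamma_cy + mu_ym)."""
    weather_index = sum(x * d for x, d in zip(weather, delta))
    return math.exp(beta * aqi_lag + weather_index + gamma_cy + mu_ym)

def percent_change(beta, aqi_change):
    """Percent change in expected admissions for a given AQI change."""
    return 100 * (math.exp(beta * aqi_change) - 1)

BETA = 0.0056  # coefficient on lagged AQI reported in the main text

# Hypothetical covariates and fixed effects for one county-month.
base = expected_admissions(BETA, 40, [1.0, 0.0], [-0.1, 0.2], 0.8, 0.3)
plus_one = expected_admissions(BETA, 41, [1.0, 0.0], [-0.1, 0.2], 0.8, 0.3)

# The fixed effects cancel in the ratio, leaving exp(beta).
ratio = plus_one / base
```

Because the fixed effects enter multiplicatively, they shift the level of expected admissions in each county-year and month-year but leave the proportional effect of AQI unchanged, which is why the coefficient reads as an approximate percentage change.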
While county-by-year fixed effects capture the bulk of climatic differences across counties, we also control for within-year differences with a vector of weather control variables $X_{cym}$. This includes temperature, specific humidity, precipitation, and wind speed in various combinations. Temperature and humidity have been shown to affect both virus survival (5, 6, 7, 19, 67) and air pollution (18, 23, 55). In our baseline we include three temperature bins in °C (< 0, ≥ 0 & < 15, and > 15), five bins based on the quintiles of specific humidity, and linear terms for precipitation and wind speed.

We lag the AQI by one month because the hospital admissions data are at the monthly level. The goal is to capture pollution exposure prior to the influenza diagnosis, not after. In principle, air pollution could also affect patient progression after diagnosis, but we focus on the effect of pollution leading up to the diagnosis.

We estimate the model with a pseudo-maximum likelihood estimator (68, 69), which performs well with a large number of zeros and remains consistent under over- or under-dispersion in the data (70). We cluster standard errors at the county level to allow for arbitrary heteroskedasticity and serial correlation in the errors, and show robustness to two-way clustering at the added state-year level.

Table S.3 provides falsification tests with outcomes unlikely to be correlated with air pollution. Column 1 repeats our baseline results for influenza patients. The next four columns use inpatient hospitalizations with a primary diagnosis of diabetes mellitus with complications, urinary tract infections, skull and face fractures, and osteoarthritis. Coefficients and standard errors indicate a precise zero effect for these outcomes.

Table S.4 explores heterogeneous effects by age, gender and race.
Estimates across different groups are statistically indistinguishable from one another; however, the point estimates for Black and especially Hispanic patients are larger than for white patients.

Table S.5 explores robustness of our main results to different controls, fixed effects, and standard error calculations. Column (1) replicates the baseline results, and reports the estimates for our weather controls (reporting was suppressed in the manuscript for simplicity). Temperature and humidity controls are included as dummies for separate bins. While the coefficients on temperature are not statistically significant (county-year fixed effects absorb much of the large-scale variation), the sign is as expected. Temperatures below 0 °C as well as above 15 °C lead to fewer observed hospitalizations (see also (5, 6)). Humidity decreases hospitalizations, consistent with (5, 6, 7, 19), while precipitation and wind speed have no statistically or economically significant effects. In Column (2) of Table S.5, we drop weather controls, and in Column (3) we include alternative functional forms of the weather controls using second order polynomials in temperature and humidity with a full set of interactions. In Columns (4) and (5) we include county-by-influenza season (Oct-Mar) fixed effects and state-by-month of the year fixed effects. Columns (6) and (7) replicate (1) and (4), but cluster standard errors on the county level as well as on the state-by-year level to allow additional arbitrary spatial correlation of errors across counties within a state-year.

Table S.6 reports further robustness checks. Column (1) replicates the baseline results. Column (2) does not winsorize the AQI data. Column (3) drops county-month cells with missing AQI
measures (rather than interpolating them based on the average value of the adjacent counties). Column (4) includes patients whose zip code is from a different state than the hospital in which they are treated. Column (5) restricts to states with at least seven years of reported data: Arizona, Colorado, Kentucky, Massachusetts, New Jersey, New York, and Washington. Column (6) drops county-months with no reported influenza admissions (rather than assigning a zero value for admissions). Column (7) contains results from an ordinary least squares (OLS) regression instead of a Poisson Pseudo-Maximum Likelihood regression. Columns (8) and (9) use alternative assumptions on who to count as an influenza patient. Our baseline only counts patients whose primary diagnosis is influenza. Column (8) counts patients where all diagnoses are influenza, i.e., there are no other diagnosed conditions. Column (9) counts all patients with any influenza diagnosis, primary or non-primary. Columns (10) and (11) use the data on outpatient (instead of inpatient) hospitalizations as the outcome variable. The effect of AQI is slightly larger on outpatient hospitalizations, consistent with the notion that these are less severe but more frequent than inpatient hospitalizations.

The dependent variable is the count of hospital admissions with diagnosed influenza within a county and month. We include only the influenza intensive months of October through March. The results are from a Poisson Pseudo-Maximum Likelihood regression with specified fixed effects and control variables, except the last Column (9), which is an OLS regression.
The number of included observations can vary across different outcomes due to fixed effects and varied counts in each county-month cell. Temperature controls consist of three separate bins, specific humidity controls consist of five separate bins, and precipitation and wind speed enter as linear terms. All weather variables are based on the monthly county averages. A higher AQI means worse air quality. Standard errors in parentheses are clustered at the county level.
Influenza vaccines for the future
Estimates of global seasonal influenza-associated respiratory mortality: a modelling study
Epidemic influenza and vitamin D
On the epidemiology of influenza
Influenza virus transmission is dependent on relative humidity and temperature
Absolute humidity modulates influenza survival, transmission, and seasonality
Absolute humidity and the seasonal onset of influenza in the continental United States
Air pollution and case fatality of SARS in the People's Republic of China: an ecologic study
World Bank
Emergency Department Databases (SEDD) (Healthcare Cost and Utilization Project, Agency for Healthcare Research and Quality)
Air Quality System Data Mart (US Environmental Protection Agency)
Technical Assistance Document for the Reporting of Daily Air Quality (United States Environmental Protection Agency)
Continental-scale water and energy flux analysis and validation for the North American Land Data Assimilation System project phase 2 (NLDAS-2): 1. Intercomparison and application of model products
NLDAS Primary Forcing Data L4 Monthly 0.125 x 0.125 degree V002 (Goddard Earth Sciences Data and Information Services Center (GES DISC))
CDC, Seasonal Flu Vaccine Effectiveness Studies (Centers for Disease Control and Prevention, National Center for Immunization and Respiratory Diseases (NCIRD))
Airborne micro-organisms: survival tests with four viruses
Pseudo maximum likelihood methods: Applications to Poisson models
Ppmlhdfe: Fast Poisson estimation with high-dimensional fixed effects
The log of gravity
Skull and face fractures ICD-9-CM codes: 80000
Skull and face fractures ICD-10-CM codes: S020XXA
Osteoarthritis ICD-10-CM codes
We thank Luisa Osang and Jeffrey Shaman for helpful discussions. All errors are our own. The IRB for access to the HCUP data through the National Bureau of Economic Research (NBER) was approved by the NBER.
Funding: No specific grants were connected to this project.
Author contributions: GS, JGZ, MN and NS conceptualized the study, GS analyzed the data, and GS, JGZ, MN and NS wrote the manuscript.
Competing interests: The authors declare no competing interests.
Data and materials availability: The replication code and materials for both the manuscript and the supplementary materials will be made publicly available at Harvard Dataverse. The restricted access data can be accessed at (39).
Notes: The dependent variable is the count of hospital admissions with diagnosed influenza within a county-month. We include only the influenza-intensive months of October through March. Results are from a Poisson Pseudo-Maximum Likelihood regression with specified fixed effects and control variables. The number of included observations can vary across different outcomes due to fixed effects and varied counts in each county-month cell. A higher AQI means worse air quality. Standard errors in parentheses are one-way or two-way clustered as indicated.
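To make the estimator named in these table notes more concrete, the following is a minimal sketch of a Poisson pseudo-maximum likelihood regression with group fixed effects and group-clustered standard errors. It is an illustration under simplifying assumptions, not the estimation code behind the tables: it handles a single regressor (standing in for AQI) and one set of fixed effects (standing in for county-level cells), and it omits the weather controls and two-way clustering. It relies on the standard result that Poisson fixed effects can be concentrated out of the likelihood, so only the slope has to be solved for. The function names, the row layout, and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def _score_and_info(cells, beta):
    """Per-group scores and total information of the concentrated likelihood."""
    scores, info = [], 0.0
    for obs in cells:
        total = sum(y for _, y in obs)
        shift = max(beta * x for x, _ in obs)  # guards against overflow in exp
        w = [math.exp(beta * x - shift) for x, _ in obs]
        sw = sum(w)
        mean_x = sum(wi * x for wi, (x, _) in zip(w, obs)) / sw
        var_x = sum(wi * (x - mean_x) ** 2 for wi, (x, _) in zip(w, obs)) / sw
        scores.append(sum(y * x for x, y in obs) - total * mean_x)
        info += total * var_x
    return scores, info


def ppml_fe_slope(rows, tol=1e-12, max_iter=100):
    """Poisson pseudo-ML slope on one regressor with group fixed effects.

    rows: iterable of (group_id, x, y), e.g. (county, AQI, admissions).
    Returns (beta, se); se is clustered by group, without a small-sample
    correction.
    """
    groups = defaultdict(list)
    for g, x, y in rows:
        groups[g].append((float(x), float(y)))
    # Groups with zero total admissions carry no information about beta,
    # so they drop out of the estimation sample.
    cells = [obs for obs in groups.values() if sum(y for _, y in obs) > 0]

    beta = 0.0
    for _ in range(max_iter):  # Newton steps on a concave objective
        scores, info = _score_and_info(cells, beta)
        if info <= 0.0:
            raise ValueError("x does not vary within any informative group")
        step = sum(scores) / info
        beta += step
        if abs(step) < tol:
            break
    scores, info = _score_and_info(cells, beta)
    se = math.sqrt(sum(s * s for s in scores)) / info  # sandwich, by group
    return beta, se


# Hypothetical toy data: within groups A and B, counts double with each unit
# of x by construction; group C has no admissions and is dropped.
rows = [("A", 0, 1), ("A", 1, 2), ("A", 2, 4),
        ("B", 0, 3), ("B", 1, 6),
        ("C", 5, 0), ("C", 7, 0)]
beta, se = ppml_fe_slope(rows)
```

Because groups with zero total counts drop out, the sketch also shows one reason why the number of included observations can differ across outcomes even when the underlying panel is the same.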