key: cord-0472178-f6lore7a
authors: Malkov, Egor
title: Spousal Occupational Sorting and COVID-19 Incidence: Evidence from the United States
date: 2021-07-29
sha: b584015d1f2c2d0195b66f6ac6b71954d85d9634
doc_id: 472178
cord_uid: f6lore7a

How do the matching of spouses and the nature of work jointly shape the distribution of COVID-19 health risks? To address this question, I study the association between the incidence of COVID-19 and the degree of spousal sorting into occupations that differ by contact intensity at the workplace. The mechanism that I explore implies that a higher degree of positive spousal sorting mitigates intra-household contagion, which translates into a smaller number of individuals exposed to COVID-19 risk. Using U.S. data at the state level, I argue that spousal sorting is an important factor for understanding the disparities in the prevalence of COVID-19 during the early stages of the pandemic. First, I document that, under the observed sorting, about two-thirds of U.S. dual-earner couples are exposed to higher COVID-19 health risk due to within-household transmission. Moreover, I uncover substantial heterogeneity in the degree of spousal sorting by state. Next, for the first week of April 2020, I estimate that a one standard deviation increase in the measure of spousal sorting is associated with a 30% reduction in the total number of cases per 100,000 inhabitants and a 39.3% decline in the total number of deaths per 100,000 inhabitants. Furthermore, I find substantial temporal heterogeneity, as the coefficients decline in magnitude over time. My results speak to the importance of policies that mitigate intra-household contagion.

The coronavirus disease 2019 (COVID-19) pandemic created substantial challenges for health systems and economies all over the world. To reduce the spread of the disease, many countries imposed numerous mitigation measures, such as lockdowns and stay-at-home orders.
These policies forced many people to work from home. However, a sizeable fraction of jobs (in the United States, 21.6% of the workforce; Leibovici et al., 2020) requires high contact intensity at the workplace and cannot be performed remotely. The nature of work became a crucial factor behind the exposure to health, income, and unemployment risks. As for the health risk, workers whose jobs do not require high contact intensity at the workplace face a lower risk of being infected than those who work in close physical proximity to other individuals (ECDC, 2020; Mutambudzi et al., 2021). In turn, the members of these workers' households are less exposed to the risk of within-household COVID-19 transmission in the former case than in the latter. In other words, the patterns of intra-household contagion depend on the joint distribution of spouses by occupation: both one's own and one's spouse's degree of contact intensity at the job matter.

In this paper, I evaluate the importance of spousal occupational sorting in explaining the disparities in COVID-19 incidence across the United States. Following the aforementioned idea, if at least one of the spouses in a dual-earner couple works in a high contact intensity occupation, then the other family members are more exposed to the risk of being infected. Therefore, for the population to be less exposed to intra-household contagion risk, the fraction of couples with exactly one high contact intensity worker should be lower, while the fraction of couples where both spouses have low contact intensity jobs should be higher. In other words, a higher degree of positive occupational sorting is associated with a smaller number of individuals who are exposed to COVID-19 health risk. Two regions characterized by identical distributions of males and females by occupation may demonstrate substantially different exposure to COVID-19 contagion risk depending on the patterns in spousal sorting.
To address the question about the relevance of this mechanism, I construct a state-level measure of spousal sorting using the 2015-2019 American Community Survey (ACS) individual data merged with the classification of occupations by contact intensity from Mongey et al. (2021). In particular, I use the correlation between the contact intensity degrees of husbands' and wives' occupations. Next, I combine this measure with the state-level daily data on COVID-19 cases and deaths at the early stages of the pandemic, namely April-June 2020, provided by the Johns Hopkins University Center for Systems Science and Engineering (Dong et al., 2020), and with demographic and socioeconomic characteristics of the states from the 2015-2019 ACS. As for the latter, I use the data on racial, age, and gender composition, the share of married couples, average household size, population density, income, employment by occupation groups, and commuting patterns.

At the aggregate level, under the observed spousal occupational sorting, 64.2% of U.S. dual-earner couples have at least one spouse whose job requires high contact intensity at the workplace. These families are exposed to greater intra-household contagion risk. I also estimate this fraction under two counterfactual distributions: zero sorting (or random matching of spouses) and "ideal" sorting (the distribution with maximum feasible positive sorting). Under zero sorting, it is equal to 66.5%; under ideal sorting, 48.3%. Furthermore, I show that the aggregate distribution masks significant state-level heterogeneity, with the District of Columbia having the highest and North Dakota the lowest degree of spousal sorting.

In my empirical analysis, I focus on the relationship between the total number of COVID-19 cases and deaths per 100,000 inhabitants and the measure of spousal occupational sorting by state. I run the regressions week by week, hence allowing the coefficients to be time-varying.
In all the regressions, I control for a battery of demographic and socioeconomic variables that are widely considered potential factors of COVID-19 spread, and I include day fixed effects to capture the factors common to all states. I find that in the week of April 1-7 a one standard deviation increase in the measure of spousal sorting (this corresponds to moving from Oregon to New York) is associated with a decrease in the cumulative number of cases per 100,000 inhabitants by 30%. This represents 21.5 fewer cases per 100,000 inhabitants from a sample mean of 72.4 per 100,000. As for the number of deaths, I estimate that a one standard deviation increase in the measure of spousal sorting is associated with a decline in the cumulative number of deaths per 100,000 inhabitants by 39.3%. This represents 0.9 fewer deaths per 100,000 inhabitants from a sample mean of 2.2 per 100,000. I find that these effects are stronger at the early stage of the pandemic, as both coefficients decrease in magnitude over time. Furthermore, in the regressions for the number of cases, spousal sorting becomes insignificant starting from the week of April 22-28, while in the regressions for the number of deaths it is significant until the week of June 17-23.

To the best of my knowledge, my paper is the first to document the important role of spousal occupational sorting in the incidence of COVID-19. My findings about the effects of the other variables are in line with the existing research. In particular, I show that the share of males is positively correlated and the share of married couples is negatively correlated with the number of cases and deaths per capita. Furthermore, I find that population density and the use of public transportation are associated with a higher number of cases and deaths per capita.
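The percentage effects quoted above are transformations of coefficients measured in log points. As a minimal sketch of that arithmetic (the helper names are illustrative and not from the paper; the coefficient of -0.5 log points and the sample mean of 2.2 deaths per 100,000 are the April 1-7 figures reported in the results), a coefficient β on a log outcome corresponds to a percentage change of 100 × (exp(β) − 1):

```python
import math


def pct_change(beta):
    """Percentage change in the outcome implied by a log-point coefficient."""
    return 100.0 * (math.exp(beta) - 1.0)


def implied_level(mean_level, beta):
    """Outcome level after applying a log-point change to a baseline mean."""
    return mean_level * math.exp(beta)


# April 1-7 deaths regression: -0.5 log points per one standard deviation
# of spousal sorting, from a sample mean of 2.2 deaths per 100,000.
beta_deaths = -0.5
mean_deaths = 2.2

new_level = implied_level(mean_deaths, beta_deaths)
print(f"{pct_change(beta_deaths):.1f}%")             # -39.3%
print(f"{new_level:.1f} per 100,000")                # 1.3 per 100,000
print(f"{mean_deaths - new_level:.1f} fewer deaths")  # 0.9 fewer deaths
```

The same helpers apply to the cases regression once its log-point coefficient is plugged in.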
The main lesson from my findings is that the interaction between the nature of work (the distribution of workers by occupations of different contact intensity) and patterns in spousal sorting plays an important role in explaining COVID-19 incidence across the United States. Furthermore, my results suggest several policy implications. First, targeting individuals who work in occupations that require high contact intensity with testing and vaccination, and providing them with protective equipment, would reduce the health risk not only for these workers but also for the members of their households. This is an indirect way of mitigating the intra-household contagion channel. Second, it is also necessary to consider direct measures aimed at mitigating within-household transmission. For example, providing shelter for high contact intensity workers will likely reduce the risk of contracting the disease for their families. The scope of these policies is sizable since, as I document, about two-thirds of U.S. dual-earner couples are exposed to COVID-19 health risk through intra-household contagion.

Footnote 3: [...] Bwire (2020) points out that a significant part of the gap in the number of deaths between men and women is explained by differences in behavior by gender, e.g. men tend to smoke and drink more than women, and women are more likely to take preventive measures, such as handwashing and wearing a mask.

Footnote 4 (on population density and the use of public transportation): See the discussion of these factors by Almagro and Orane-Hutchinson (2020) and Glaeser et al. (2021) for major U.S. cities.

Footnote 5: In the previous version of this paper, I also discuss the role of spousal occupational sorting in shaping income risk during the COVID-19 pandemic.

My paper is related to the voluminous literature studying the characteristics that account for dis- [...] commuting patterns (Glaeser et al., 2021), and occupations (Almagro and Orane-Hutchinson, 2020).
I complement these studies by showing that spousal occupational sorting is a significant factor of COVID-19 incidence even after controlling for these variables. More broadly, my results confirm the idea that many effects of the pandemic are mediated through the economics of the household (Davis, 2021). In particular, this paper bridges the medi- [...]

The rest of the paper is organized as follows. In Section 2, I describe the data and the construction of the variables. In Section 3, I discuss the sorting of spouses by occupation contact intensity in the United States and provide the results describing its effects on COVID-19 incidence. Section 4 concludes.

I study the period between April 1, 2020 and July 1, 2020. Most U.S. states imposed restrictions, such as stay-at-home orders, school closures, and suspension of public gatherings, in late March 2020. My analysis starts at the point when the measures against COVID-19 contagion in public places had already gone into effect; hence the results are unlikely to be affected by the differential timing of these policies.

[...] (correspondingly, $\pi^s_{hh}$). Next, denote the fraction of couples where the male has a high contact intensity (CI) job and the female has a low CI job (correspondingly, the male has a low CI job and the female has a high CI job) [...]

Footnote 7: The data is extracted from IPUMS at https://usa.ipums.org/usa/.

In this section, I begin by describing the mechanism through which the distribution of couples by contact intensity of occupations may affect the prevalence of COVID-19. Second, I discuss the patterns of spousal occupational sorting in the United States at the aggregate and state levels. Finally, using the state-level variation, I show that spousal occupational sorting is an important factor for understanding the disparities in COVID-19 incidence even after controlling for a rich set of other characteristics that are commonly considered drivers of COVID-19 contagion risk.
Workers whose occupations require high contact intensity (CI) at the workplace face a higher risk of being infected than those who work in low physical proximity to other individuals. Since the presence of other family members creates the risk of intra-household contagion (Sun et al., 2020), if at least one of the spouses works in a high CI occupation, the other family members are more exposed to COVID-19 risk. Consider an illustrative example described [...] CI jobs. The resulting distribution implies that there are 50 couples with at least one worker having a high CI job, and hence these 50 couples are more exposed to intra-household contagion risk. In particular, this risk is concentrated in high-contact-intensity couples. At the other extreme, as shown in Figure 1d, spouses have different jobs in terms of contact intensity (perfect negative sorting). The resulting distribution implies that there are 100 couples with at least one worker having a high CI job, and hence all 100 couples are exposed to intra-household contagion risk. To complement the discussion, in Figure 1c, I also show the case of zero sorting, when males and females match at random. Under this scenario, 75 couples have at least one worker in a high CI job, and hence are more exposed to within-household COVID-19 transmission. In a nutshell, this example shows that a higher degree of positive occupational sorting is associated with a smaller number of individuals who are exposed to COVID-19 health risk. Two populations characterized by similar [...] and compares it against two counterfactual distributions: zero sorting (or random matching) and "ideal" sorting. I define ideal sorting as the distribution where the fraction of "mixed" couples (one high CI and one low CI worker) is minimized or, in other words, as the maximum feasible positive sorting (i.e. the risk of intra-household contagion is minimized).
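The illustrative example and the counterfactual distributions can be sketched numerically. The snippet below is a minimal illustration rather than the paper's code: it assumes binary high/low CI indicators and builds the 2x2 distribution of couples from the male and female shares of high CI workers under random matching, maximum feasible positive sorting, and maximum negative sorting. The function names are hypothetical, and the 50/50 marginals are an assumption inferred from the counts of 50, 75, and 100 couples quoted in the example.

```python
def at_risk_share(pi_hh, pi_hl, pi_lh):
    """Share of couples with at least one spouse in a high CI job."""
    return pi_hh + pi_hl + pi_lh


def zero_sorting(p_m, p_f):
    """Random matching: joint shares are products of the marginal shares."""
    return p_m * p_f, p_m * (1 - p_f), (1 - p_m) * p_f


def ideal_sorting(p_m, p_f):
    """Maximum feasible positive sorting: as many high-high couples as possible."""
    pi_hh = min(p_m, p_f)
    return pi_hh, p_m - pi_hh, p_f - pi_hh


def negative_sorting(p_m, p_f):
    """Maximum negative sorting: as few high-high couples as possible."""
    pi_hh = max(0.0, p_m + p_f - 1.0)
    return pi_hh, p_m - pi_hh, p_f - pi_hh


# Assumed marginals: half of males and half of females hold high CI jobs.
p_m, p_f = 0.5, 0.5
n_couples = 100
rules = [("positive", ideal_sorting), ("zero", zero_sorting), ("negative", negative_sorting)]
for name, rule in rules:
    share = at_risk_share(*rule(p_m, p_f))
    print(f"{name:>8} sorting: {share * n_couples:.0f} of {n_couples} couples at risk")
```

Each function returns the shares (both high, only male high, only female high), so the at-risk share is simply their sum. With equal marginals of one half, the three rules give 50, 75, and 100 couples at risk, matching the example.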
The actual sorting (black bars) leaves 64.2% (29.2% + 19.1% + 15.9%) of couples with at least one spouse whose job requires high contact intensity at the workplace. These couples are exposed to greater intra-household contagion risk. Under zero sorting (dark grey bars), this fraction rises to 66.5% (31.4% + 17.0% + 18.1%). Therefore, the existing occupational sorting among U.S. couples yields a lower fraction of individuals exposed to intra-household contagion risk than zero sorting would. Under ideal sorting (light grey bars), it falls to 48.3% (13.2% + 35.1%). Using expression (1), I obtain that the measure of spousal occupational sorting for the United States is equal to 0.091. As follows from Table 1, this aggregate statistic masks significant heterogeneity by state.

In my regression analysis, I use the state-level variation in spousal sorting. Figure 3 shows the value of correlation (1) by state. First, no state is characterized by negative sorting. Second, from simple inspection, we observe the following spatial pattern: states with the highest levels of sorting are located on the West Coast, on the East Coast, and in the South (Texas).

In this subsection, I present the main empirical results. To estimate the relationship between spousal occupational sorting and the incidence of COVID-19, and its evolution over time, I run the following regressions for every week $w(t)$:

$$\log \text{Total Cases}_{sw(t)} = \alpha_t + \beta_{w(t)} \, \text{Spousal Occ. Sorting}_s + \delta_{w(t)} X_s + \varepsilon_{st} \qquad (2)$$

$$\log \text{Total Deaths}_{sw(t)} = \alpha_t + \beta_{w(t)} \, \text{Spousal Occ. Sorting}_s + \delta_{w(t)} X_s + \varepsilon_{st} \qquad (3)$$

Regressions (2) and (3) [...]

Table 2 shows the results for COVID-19 cases over twelve weeks between April 1-7, 2020 and June 17-23, 2020. In particular, I report the coefficients for the household characteristics. Additionally, I plot the evolution of these coefficients and their 95% confidence intervals in the left panel of Figure 4 and the top panel of Figure 5.
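As a cross-check on the aggregate figures reported earlier in this subsection, the sketch below computes the correlation between two binary high CI indicators from a 2x2 distribution of couples. It rests on two assumptions that are not stated in the text: that expression (1) is the Pearson correlation of the spouses' binary indicators (the phi coefficient), and that among the three black-bar shares the 19.1% is the share of couples where both spouses hold high CI jobs. The latter is an inference, chosen because it is the assignment that yields a correlation close to the reported 0.091. Function and variable names are illustrative.

```python
import math


def sorting_correlation(pi_hh, pi_only_a, pi_only_b):
    """Pearson (phi) correlation between two binary high CI indicators.

    pi_hh     -- share of couples where both spouses hold high CI jobs
    pi_only_a -- share where only spouse A holds a high CI job
    pi_only_b -- share where only spouse B holds a high CI job
    """
    p_a = pi_hh + pi_only_a  # marginal share of high CI jobs among spouses A
    p_b = pi_hh + pi_only_b  # marginal share of high CI jobs among spouses B
    cov = pi_hh - p_a * p_b
    return cov / math.sqrt(p_a * (1 - p_a) * p_b * (1 - p_b))


# Actual-sorting shares; labelling 19.1% as the both-high share is an assumption.
pi_hh, pi_only_a, pi_only_b = 0.191, 0.292, 0.159
p_a, p_b = pi_hh + pi_only_a, pi_hh + pi_only_b

actual_at_risk = pi_hh + pi_only_a + pi_only_b  # observed sorting
zero_at_risk = 1 - (1 - p_a) * (1 - p_b)        # random matching
ideal_at_risk = max(p_a, p_b)                   # maximum feasible positive sorting
corr = sorting_correlation(pi_hh, pi_only_a, pi_only_b)

print(f"correlation:        {corr:.3f}")
print(f"at risk (actual):   {100 * actual_at_risk:.1f}%")
print(f"at risk (zero):     {100 * zero_at_risk:.1f}%")
print(f"at risk (ideal):    {100 * ideal_at_risk:.1f}%")
```

Because the inputs are rounded to one decimal place, the computed correlation and zero-sorting share agree with the reported 0.091 and 66.5% only up to rounding. Which spouse is labelled A or B does not affect any of the four quantities.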
The first finding is that spousal occupational [...]

Notes to Table 2: [...] (2) for different weeks. The dependent variable is the log of daily cumulative cases per 100,000 inhabitants up to date. All controls are described in the text. Standard errors are clustered at the state level and shown in parentheses. Significance: * p < 0.1, ** p < 0.05, *** p < 0.01.

Next, Table 3 reports the results for COVID-19 deaths over twelve weeks between April 1-7, 2020 and June 17-23, 2020. The right panel of Figure 4 and the bottom panel of Figure 5 plot the evolution of the coefficients over time. The first lesson from these results is that spousal occupational sorting is a statistically significant factor until the week of June 17-23. Speaking about magnitudes, for the week of April 1-7, a one standard deviation increase in the measure of spousal occupational sorting is associated with a decline in the cumulative number of deaths per 100,000 inhabitants by 0.5 log points (or 100 × (exp(−0.5) − 1) = −39.3%). Given that for the week of April 1-7 the average cumulative number of deaths across states was 2.2 per 100,000 inhabitants, a one standard deviation increase in spousal occupational sorting would lower this to 1.3 per 100,000. Similarly to the regressions with the number of cases per capita as a dependent variable, [...]

Footnote 10: It is worth noting that the coefficients may be biased due to measurement error. One source of measurement error is that a sizeable share of workers lost their jobs in the first months of the pandemic, and hence the pre-pandemic occupation shares and the measure of spousal occupational sorting differ from the actual ones. To minimize this measurement error, I conduct my analysis on the first weeks of the pandemic. Furthermore, the number of COVID-19 cases and deaths, as well as the variables from the ACS data (which uses a 5% sample), may be measured with error.

Table 3 (only fragments of the table survive). Surviving column headers: Apr 8-14, 2020; Apr 15-21, 2020; Apr 22-28, 2020; Apr 29-May 5, 2020; May 6-12, 2020; May 20-26, 2020; May 27-Jun 2, 2020; Jun 3-9, 2020; Jun 10-16, 2020; Jun 17-23, 2020. Row: Spousal occ. sorting (z-score); first surviving entry -0.500***; remaining entries [...]

Notes to Table 3: [...] (3) for different weeks. The dependent variable is the log of daily cumulative deaths per 100,000 inhabitants up to date. All controls are described in the text. Standard errors are clustered at the state level and shown in parentheses. Significance: * p < 0.1, ** p < 0.05, *** p < 0.01.

In this paper, I present evidence that the degree of spousal occupational sorting is an important factor for explaining disparities in the incidence of COVID-19 across the United States during the early stages of the pandemic. In particular, I test the hypothesis that a higher degree of positive sorting is associated with a lower incidence of COVID-19. To address this question, I construct a state-level measure of spousal sorting using the ACS data and merge it with the daily data on COVID-19 cases and deaths provided by the JHU CSSE. My first step is to document the distribution of U.S. dual-earner couples by occupation contact intensity. I show that, under the observed spousal sorting, 64.2% of families are exposed to greater intra-household contagion risk because at least one spouse has a high contact intensity job. Furthermore, I uncover substantial heterogeneity in the degree of spousal sorting by state. Next, using variation in the degree of spousal sorting by state, I estimate its effects on the number of COVID-19 cases and deaths per 100,000 inhabitants. In particular, I run the regressions for each week over the period between April 1, 2020 and July 1, 2020, hence allowing the coefficients to be time-varying. The estimates imply that, for the week of April 1-7, a one standard deviation increase in the measure of spousal sorting is associated with a 30% reduction in the number of cases per 100,000 and a 39.3% decline in the number of deaths per 100,000.
Furthermore, I find that the coefficients decline in magnitude over time and eventually lose statistical significance. Beyond that, I show that the share of males is positively correlated and the share of married couples is negatively correlated with the number of cases and deaths per capita. My findings emphasize the importance of the interaction between the nature of work and spousal sorting for the incidence of COVID-19 and suggest several policy implications. On the one hand, targeting workers in jobs that require high contact intensity with testing and vaccination, and providing them with protective equipment, would indirectly mitigate the intra-household contagion channel, as their family members would be less likely to be exposed to the risk of infection. On the other hand, other policy tools may directly address the problem of within-household transmission.

In this paper, I focus on spousal occupational sorting and hence limit my analysis to dual-earner couples. An important direction for future research is to explore the role of other family-related factors in the incidence of COVID-19. Furthermore, my analysis can be extended by considering the economic outcomes for couples with different exposure to contact intensity at work. In the previous version of this paper, I document a negative relationship between total household income and the share of couples where at least one spouse has to work in a high contact intensity job. Hence the COVID-19 pandemic likely exposes poorer households to greater health risks. Moreover, while I briefly discuss several policy implications, a more careful and formal study of optimal policies is necessary. Baqaee et al. (2020) is an example of a quantitative paper that studies economic reopening using data on contact intensity and teleworkability by sector.
Furthermore, my results emphasize the importance of accounting for contacts between individuals, both at work and within a household, for designing optimal policies (Azzimonti et al., 2020). Last but not least, my analysis can be extended using data from other countries. These are fruitful avenues for future research.

Figure note: [...] (3) for the weeks from April 1-7, 2020 to June 24-July 1, 2020. The shaded area represents the 95% confidence interval.

Essential - Professional: Management, business, and financial occupations (OCC codes 0010-0960 in the 2018-onward Census occupational classification system), Architecture and engineering occupations (OCC 1305-1560), Community and social service occupations (OCC 2001-2060), Educational instruction, and library occupations (OCC 2205-2555), Arts, design, entertainment, sports, and media (OCC 2600-2920), Office and administrative support occupations
Science: Life, physical, and social science occupations (OCC 1600-1980)
Law and Related: Legal occupations (OCC 2100-2180)
Healthcare Practitioners: Health diagnosing and treating practitioners and other technical occupations (OCC 3000-3270), Health technologists and technicians

The author did not receive support from any organization for the submitted work.