key: cord-0853485-evcrluhm
authors: Brandily, Paul; Brébion, Clément; Briole, Simon; Khoury, Laura
title: A poorly understood disease? The impact of COVID-19 on the income gradient in mortality over the course of the pandemic
date: 2021-10-06
journal: Eur Econ Rev
DOI: 10.1016/j.euroecorev.2021.103923
sha: 91a7a536d5979500c54ddf033751ad110399f649
doc_id: 853485
cord_uid: evcrluhm

Mortality inequalities remain substantial in many countries, and large shocks such as pandemics could amplify them further. The unequal distribution of COVID-19 confirmed cases suggests that this is the case. Yet, evidence on the causal effect of the epidemic on mortality inequalities remains scarce. In this paper, we exploit exhaustive municipality-level data in France, one of the most severely hit countries in the world, to identify a negative relationship between income and excess mortality within urban areas that persists over COVID-19 waves. Over the year 2020, the poorest municipalities experienced a 30% higher increase in excess mortality. Our analyses can rule out an independent contribution of lockdown policies to this heterogeneous impact. Finally, we find evidence that both labour-market exposure and housing conditions are major determinants of the epidemic-induced effects of COVID-19 on mortality inequalities, but that their respective roles depend on the state of the epidemic.

Despite an unprecedented worldwide decline in mortality over the last century, a substantial income gradient still persists within most countries (Case et al., 2002; Cutler et al., 2006; Chetty et al., 2016; Currie et al., 2020). These structural inequalities result primarily from individual differences between the rich and the poor in health behaviours, information, economic or social stress (Butikofer et al., 2021) and from ecological factors, like the living environment or public health institutions (Chetty et al., 2016; LeClere et al., 1997; Robert, 1999; Pickett and Pearl, 2001). However, sudden exogenous events like pandemics can also significantly affect these inequalities. Historical studies have shown that past epidemics had unequal effects on the mortality of different socio-economic groups, either because they revealed latent inequalities in individual health capital or because they spread differently across living environments (Snowden, 2019; Alfani, 2021; Beach et al., 2021).

The COVID-19 crisis epitomizes such a massive mortality shock, on a worldwide scale. In 2020, about 100 million people tested positive for COVID-19 (among whom 2.2% died), with a large number of infections and deaths also remaining undetected. 1 Assessing the causal impact of this pandemic on mortality inequalities is therefore key, not only to inform public policy in the context of the crisis, but also to study the role of pandemics in shaping mortality inequalities from a historical perspective. However, while a growing literature has pointed out that the most vulnerable socioeconomic groups were more often affected in the early months of the pandemic, most of the existing evidence remains essentially correlational.

A first challenge in measuring the causal impact of the epidemic on mortality inequality consists in isolating the specific contribution of COVID-19. Even in the absence of a pandemic, structural socioeconomic inequalities in health would have likely resulted in an income gradient in 2020 mortality. A second challenge mostly ignored by the literature is to disentangle the respective effects of the pandemic on the one hand and of the policy responses such as lockdown on the other hand. Finally, a third challenge relates to the dynamics of the pandemic: many studies are based on early and short-term measures. It is unclear whether these results only reflect harvesting effects due to a temporary mortality shock on individuals particularly at risk. In particular, one cannot exclude that the income gradient gradually disappears, or is even reversed, as time goes by and the disease spreads.

In this paper, we address these challenges in the context of France, one of the most severely hit countries in the world. Using exhaustive administrative data sets, we are able to study the links between income and the 2020-specific excess mortality at a very local level, for the entire country and over a long time that COVID-19 had a greater impact on poor municipalities because of a higher virus transmission among individuals living in these municipalities.

This article makes several contributions to the literature on the income gradient in the specific case of COVID-19. Many papers have shown, based on confirmed cases, that the first wave more strongly affected deprived neighborhoods and individuals (Chen and Krieger, 2021; Abedi et al., 2020; Ashraf, 2020; Kim and Bostwick, 2020; Williamson et al., 2020; Drefahl et al., 2020; Jung et al., 2020; Caul, 2020). 4 Our data improve the measurement of the specific contribution of COVID-19 to health inequalities in two ways.

First, confirmed cases underestimate the number of deaths attributable to the pandemic. In particular, they do not account for false negatives or for deaths on which no test has been carried out, either because testing material was lacking in hospitals or simply because they occurred at home. Unreported cases are highly unlikely to be independent of income and may therefore bias the results of the literature based on confirmed cases. To avoid such bias, we favour all-cause mortality over confirmed cases in our estimations.

Second, our approach allows us to better account for the structural income heterogeneity in mortality that is found absent COVID-19. A large literature documents mortality inequalities between socio-economic groups, and it could then be the case that the income heterogeneity in COVID-19 confirmed deaths reflects only these. In particular, structural health inequalities may be especially strong in the US (where most of the evidence on the income gradient comes from), a country where access to health care is very unevenly distributed among the population. 5 We therefore use all-cause excess mortality as our main dependent variable, defined as the deviation in 2020 all-cause mortality with respect to a counterfactual (pre-COVID) period, namely the average of all-cause mortality in 2019 and 2018. 6 To our knowledge, very few articles have worked on all-cause excess mortality, and all of them focused on the first wave (Calderón-Larrañaga et al., 2020; Krieger et al., 2020; Decoster et al., forthcoming). By using two successive waves in France, our paper adds to this literature by evaluating the evolution of the income gradient over time. Our analysis reveals that the relationship between income and COVID-19 mortality applies with regularity to each wave of the epidemic and that the mortality differential created does not disappear over time. 7 The epidemic caused an increase in differential mortality between rich and poor municipalities that is not a simple one-time shock or anticipation effect but rather deepens after each wave.

4 A few exceptions should be mentioned. Comparing US counties, Brown and Ravallion (2020), Desmet and Wacziarg (2020) and Knittel and Ozaltun (2020) find no association between median income or poverty rates and COVID-confirmed deaths. Jung et al. (2020) find a positive relation between poverty rates and confirmed deaths in counties with a low density but rather a U-shaped relation in dense areas. According to Sa (2020), the positive association between socioeconomic deprivation and confirmed COVID-19 deaths in the UK turns negative once self-declared health quality is taken into account.

5 By contrast, measuring the income gradient in a country like France is informative on the unequal impact of COVID-19 despite a rather equal access to health care. Comparing the income gradients found in both countries is interesting to the extent that France and the US systematically rank at opposite ends of the distribution of OECD countries in terms of equality of access to health care (source, see also Currie et al. (2020) for a comparison of mortality trends in the US and in France).

6 Two exceptions to the literature using confirmed cases should be mentioned: Caul (2020) describes both confirmed cases and all-cause mortality, but with no counterfactual; Chen and Krieger (2021) use a counterfactual but do not consider all-cause mortality.

This article is also the first to distinguish between epidemic-induced and lockdown-induced effects on mortality. The COVID-19 epidemic had many effects on mortality: directly on infected individuals but also more indirectly by increasing the pressure on health infrastructures (thereby delaying other types of treatment) and by modifying individuals' behavior. Likewise, lockdown policies may impact mortality in a way that is non-orthogonal to income (e.g. through mental-health consequences, increased domestic violence, reduced road accidents, etc.; cf. Banerjee et al. (2020); Belot et al. (2020); Caul (2020); Mulligan (2021); McIntyre and Lee (2020); Bullinger et al. (2020); Calderon-Anyosa and Kaufman (2021)), reinforcing or, on the contrary, mitigating the effects of the epidemic on mortality inequalities. We exploit the quasi-natural experiment provided by the first lockdown in France to separate the epidemic-induced and the lockdown-induced dimensions of mortality. The uniform implementation of a lockdown over the territory froze the epidemic at very different stages of development. In zones with a low level of infection at the start of the lockdown, the propagation of the virus was stopped before it could have a significant impact on mortality. By contrast, in the high-infection zone, the virus had already circulated enough to have a large impact on mortality, despite the lockdown. We find no income gradient in the within-urban-area mortality in the low-infection zone, suggesting that lockdown policies alone do not play a major role. Instead, the overall income gradient seems to be driven by the effect of COVID-19 infections and related health issues. Among these effects, we further argue that COVID-19 (immediate) mortality represents the main driver of excess mortality in 2020.

Finally, our paper speaks to the literature on the mechanisms underlying the relationship between income and COVID-19 mortality. Other papers have highlighted the importance of labor-market exposure or housing conditions (including the first version of this paper, as of July 2020, as well as Almagro and Orane-Hutchinson (2020); Almagro et al. (2020); Glaeser et al. (2020); Naticchioni et al. (2020)). Building on exhaustive administrative data, we confirm the role of increased exposure inside and outside the house. 8 We complement the literature by examining the dynamics of the mechanisms' effects over both waves and by trying to quantify the respective contribution of each mechanism in explaining the income gradient in excess mortality. 9 The contribution of the share of essential workers remains high in both waves, while the effect of the share of workers with frequent social contact at work decreases in magnitude during the second wave. This suggests that employers have partly adjusted the management of occupational risk over time. Housing conditions also matter less during the second wave, as the lockdown was less stringent.

7 More specifically, a surplus of 1.18 (resp. 1.08) excess deaths per 10k. inhabitants is found in the bottom income quartile in the first wave (resp. second wave), as compared to richer municipalities.

8 Although we argue that occupational and housing conditions capture a substantial share of this inside and outside the house exposure, we acknowledge the potential role of other mechanisms, such as: greater levels of air pollution in poorer areas (Cole et al., 2020; Persico and Johnson, 2020), lower levels of compliance with lockdown and self-protection measures among low-income individuals (Papageorge et al., 2020), and more comorbidities among individuals living in poor areas (Wiemers et al., 2020; Raifman and Raifman, 2020), among others. While population density is also an often debated mechanism, we do not treat it as a potential mechanism given its non-significant correlation with poverty in our data.

While we acknowledge that other mechanisms could be at play, a horse race performed between variables related to occupational exposure and housing conditions indicates that our variables, and especially the shares of essential workers and of overcrowded housing, capture a substantial share of the income gradient in COVID-19 related mortality.

The remainder of the paper is organized as follows: section 2 describes the data and the construction of our outcome variable, while section 3 explains the context. In section 4, we present the first evidence of an income gradient in COVID-19 mortality in France that persists across waves. In section 5, we separate the epidemic-induced from the lockdown-induced effects of COVID-19 on excess mortality. Given these results, section 6 explores potential mechanisms and section 7 concludes.

In this paper, we gather various individual-level data sets that we match using municipality-level identifiers. In France, municipalities are very small administrative units of 1,600 inhabitants and 15.3 sq.km on average; there are about 35,000 of them in 2020. Our analysis compares municipalities within urban areas. Urban areas are groups of neighboring municipalities, defined by the French National Statistical Institute (INSEE) on the basis of commuting patterns. We consider 16,640 municipalities distributed across 621 urban areas, each being made of 27 municipalities and hosting 85,000 inhabitants on average. 10 The majority of our sources are exhaustive administrative data sets made available by INSEE. We provide more details on all data in appendix A1.

Most other works studying the unequal impact of the pandemic are based on COVID-19 confirmed cases (infection or death). These measures suffer from several conceptual limitations. 11 First, the total impact of an epidemic on mortality is a function of both the direct effects of infections ($D_d$) and the indirect effects of infections, including the effect of public policies taken as a response to infections ($D_i$). Indirect effects include, among others, (i) physical, psychological, and social effects of distancing; 12 (ii) economic changes (Banerjee et al., 2020); (iii) deaths due to altered access to health services. Furthermore, $D_d$ can be broken down into reported infection-caused deaths ($D_{d,r}$) and unreported infection-caused deaths ($D_{d,u}$). Studies based on confirmed cases not only ignore the indirect effects of the epidemic but can also suffer from severe biases arising from the salience of unreported cases and the saturation of health services.

A growing literature confirms that measures based on COVID-19 reported cases actually include significant biases. First, testing capacities and strategies have proven to vary substantially over space and time, not only across countries but even across regions or neighbourhoods (Kung et al., 2020; Rivera et al., 2020; Borjas, 2020; Balmford et al., 2020; Silverman et al., 2020; Yorifuji et al., 2021; Wu et al., 2021). France is no exception to this rule: it ranked among the worst OECD countries in number of tests per inhabitant by the end of the first wave, in May 2020 (Scarpetta et al., 2020; Foucart and Horel, 2020).

Another source of bias comes from the non-random testing of the population in a context of a shortage in testing resources. Typically, tests are more often conducted on symptomatic persons, the elderly or socioeconomically vulnerable people, leading to an overestimation of infection rates (Beaney et al., 2020; Böttcher et al., 2021). Finally, some individuals who die from COVID-19 are not accurately identified, due to difficulties in attributing the cause of death. This may particularly bias the number of reported cases in France, where no system existed to report COVID-19 mortality at home (Fouillet et al., 2020).

A second limitation of analyses using only measures based on COVID-19 confirmed cases is that they ignore counterfactual mortality. Inferring an income gradient from observed COVID-19 deaths presumes that such a gradient would not exist absent the epidemic. In other words, it fails to take into account structural health inequalities between socio-economic groups. Such inequalities have been consistently documented in a wide range of countries (Case et al., 2002; Cutler et al., 2006; Adler and Rehkopf, 2008; Cutler et al., 2012; Currie and Schwandt, 2016; Mackenbach et al., 2019) and have been recently highlighted in the case of France, where the richest 5% of men (resp. women) have a life expectancy 13 (resp. 8) years longer than the poorest 5% on average over the 2012-2016 period (Blanpain, 2018). 13

As a consequence, accounting for the baseline inequality in mortality between rich and poor (in the absence of COVID-19) is key to accurately identifying the specific contribution of the epidemic to inequality in mortality.

To address these issues, we build measures of all-cause excess mortality at the municipality level based on daily counts of deaths from INSEE. For every single individual who died in France in 2018, 2019 or 2020, this data set provides the municipality and place of death (hospital or clinic, home, care home, etc.) as well as a set of individual-level characteristics such as sex, date of birth and municipality of residency. 14 Such data make it possible to compare the mortality for the residents of a given municipality, during a given period in 2020, to the mortality in the same municipality during the same period in the two previous years, 15 before the COVID-19 outbreak. For each municipality m, we therefore define excess mortality during period p as follows:

$$D_{m,p} = M_{m,p}^{2020} - \frac{M_{m,p}^{2019} + M_{m,p}^{2018}}{2} \qquad (1)$$

where $M_{m,p}^{y}$ denotes the all-cause mortality rate (deaths per 10,000 inhabitants) of municipality m during period p of year y.

Using all-cause excess mortality allows us to study the impact of the COVID-19 epidemic on mortality based on a measure that is of constant quality over space and time and that is available at a fine level for a very large territory. We are therefore able to avoid the usual biases found in the literature that are due to testing or death reporting issues. Furthermore, our counterfactual analysis also allows us to take into account the structural inequalities between rich and poor municipalities in terms of mortality in order to accurately identify the impact of the epidemic on these inequalities. Finally, our measure allows us to account for both the effects induced by the epidemic itself and those induced by the policy response, the latter being generally ignored in the literature. In section 5, we develop a triple-difference strategy based on a quasi-experimental setting, which exploits natural variations in infection rates over the French territory at a very early stage of the epidemic to net out the potential effects induced by the lockdown policy. This strategy allows us to isolate the overall effect of COVID-19 infections and to study how it is distributed based on a reliable and unbiased measure of mortality.
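As an illustration of the excess-mortality measure described in this section, the following minimal Python sketch computes the deviation of 2020 deaths from the 2018-2019 average, expressed per 10,000 inhabitants. The function name, the use of a single population figure as the denominator for all three years, and the example numbers are assumptions made for illustration; they are not taken from the paper's data or code.

```python
def excess_mortality_rate(deaths_2020, deaths_2019, deaths_2018,
                          population, per=10_000):
    """Excess deaths per `per` inhabitants for one municipality and period.

    The counterfactual is the average of the 2019 and 2018 death counts
    over the same calendar period. Using one population figure for all
    years is a simplifying assumption of this sketch.
    """
    baseline = (deaths_2019 + deaths_2018) / 2
    return (deaths_2020 - baseline) / population * per


# Hypothetical municipality: 30 deaths in March-April 2020,
# against 22 and 24 in the same months of 2019 and 2018.
example_rate = excess_mortality_rate(30, 22, 24, population=20_000)
print(example_rate)  # excess deaths per 10,000 inhabitants
```

A negative value simply means that mortality in the period was below its 2018-2019 average.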

A municipality-level analysis

The main goal of this paper is to estimate the impact of the COVID-19 epidemic on inequalities in mortality. Our approach at the municipal level allows us to study this issue from a geographical point of view, at a fine local level, and to estimate the average effect of COVID-19 on the mortality of individuals living in a poor area. 16 This approach captures mortality inequalities that primarily relate to ecological (or local) factors. Such factors are known to influence the income gradient in life expectancy in normal times (Chetty et al., 2016) . And given the nature of an epidemic, they are likely to be central to the spread of the virus. In other words, we think that location is an important source of heterogeneity in the impact of COVID-19 on mortality and we thus conduct our analysis at a fine geographic scale.

Because poorer individuals tend to both locate in poorer municipalities (by construction) and suffer from poorer health status, spatial inequalities in mortality may also come from individual factors that would be better captured using individual-level data. Local and individual aspects are equally important but relate to somewhat different and complementary questions. In particular, we think that only a local approach is able to take into account the interactions between individuals living or working in the same area. It is therefore especially suited for the analysis of an epidemic, where one person's risk spills over to their neighbors. To the best of our knowledge, the only study able to look at both individual and local levels is the one by Decoster et al. (forthcoming), who exploit mortality data from the first wave of the epidemic (March-May) in Belgium. Their results indicate that individual income has only a small, insignificant effect once one controls for local income. That is, during the epidemic, people died more like their neighbors than like others in their own income category.

Municipalities are very small geographic units and a good approximation of individuals' living environments. This precision allows us to ask how mortality differs across different municipalities (neighborhoods) of the same urban area (agglomeration). It also drives our focus on ecological aspects in section 6.

3 COVID-19 in France: context and background

3.1 The temporal dynamics of the epidemic in 2020

Overall in 2020, there were 49,495 excess deaths in French urban areas in comparison with the average of 2018 and 2019. Figure 1 represents the distribution across months of these extra deaths in France in 2020 (black line). It clearly appears that, as in many European countries, excess deaths in 2020 essentially occurred over two distinct waves that peaked in April (15,479 deaths) and November (12,537 deaths), respectively. In the remainder of this paper, we will refer to the period covering March and April 2020 as the "first wave" and to the period from October to December 2020 as the "second wave".

In both cases, the French government reacted by taking extraordinary containment measures. As COVID-19 first spread in the country in early 2020, the government imposed a national lockdown on March 17, which eventually lasted until May 11. This first lockdown was the most stringent and seems to have reduced the spread of COVID-19 drastically. All workers stayed home unless their activity was deemed essential for the country, and visits to elderly care homes (hereafter EHPADs) were forbidden.

After a lull in the summer, the epidemic developed again until a second lockdown was decided on October 30 and continued until December 15. This second lockdown was less stringent, however: (i) on top of essential private industries, factories, firms from the construction sector as well as most public services remained open; (ii) with the exception of universities, most schools kept on receiving students; (iii) visits to EHPADs were allowed; (iv) parks and gardens remained open. The second lockdown was also lifted more quickly, with the end of a first phase after a month, when all shops opened again. We provide anecdotal evidence on that aspect in Figure B1 (black line) using Google mobility data (Google LLC, 2021). The increase in time spent at home appears much higher when a lockdown policy is in place, and more so during the first lockdown than the second.

Figure 1: Monthly counts of excess deaths in French urban areas. NOTE: The figure represents the difference between the monthly number of deaths in 2020 and its average over 2019 and 2018 in the relevant zone. The "red" zone corresponds to the areas that were the most severely hit by the first wave, and that are located in the North-Eastern quarter of the country. This zone covers about 44% of the urban population of (mainland) France. The "green" zone encompasses the rest of the French territory.

An important feature of the French case is that the spread of the epidemic was still very uneven across the country when the first lockdown was implemented (March 17, 2020). While the North-East was severely hit, the West and the South were then almost spared by the epidemic. In the remainder of this paper we call the former region the "red zone" and the latter the "green zone". Figure 1 exhibits this spatial heterogeneity in excess deaths during the first wave. We reproduce the death toll separately [...] represents 44% of the total urban population. By contrast, the green zone suffered a much smaller increase in mortality over the same period. 17 Figure 1 thus summarizes the spatio-temporal dynamics of the COVID-19 epidemic in France in 2020.

We distinguish four phases. (1) January and February are marked by a relatively low mortality in all regions of France. 18 (2) March and April represent the "first wave", which mostly hit the North-Eastern quarter (red zone) of the country. (3) From May to September, mortality is close to normal. (4) From October to December, a second wave hit the country, this time more homogeneously across the territory.

As shown in the appendix (Tables B3 and B4), the red zone contains fewer urban areas than the green

This section highlights a key heterogeneity in all-cause excess mortality in France that appears consistently over the two waves: an income gradient in the total impact of the COVID-19 epidemic on mortality.

In 2020, the poorest municipalities in France suffered a greater increase in mortality than other municipalities. This stylized fact appears very clearly in Figure 2, which displays the (nonparametric) monthly cumulative number of excess deaths (per 10k. inhabitants) in 2020, separately for the poorest municipalities and other municipalities. 19

To categorize a municipality as poor, we first rank all municipalities according to the median income of their inhabitants and we define as "poor" all the municipalities that fall into the first quartile (Q1) of this distribution (cf. A1 and A2 for details and data). We weight each municipality by its total population in order to give an equal weight to each individual. The municipalities included in Q1 therefore represent the 25% of the French urban population living in the municipalities with the lowest median income.
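The population-weighted classification described above can be sketched as follows in Python. This is an illustrative reading rather than the authors' procedure: the function name, the input layout, and the treatment of a municipality that straddles the 25% threshold (it is included in Q1 here) are assumptions.

```python
def flag_poorest_quartile(municipalities):
    """Return the names of municipalities in the bottom income quartile.

    `municipalities` is a list of (name, median_income, population) tuples.
    Municipalities are ranked by median income and added to Q1 until they
    cover 25% of the total population. A municipality that straddles the
    threshold is included (an assumption of this sketch).
    """
    total_pop = sum(pop for _, _, pop in municipalities)
    poor = set()
    cumulative = 0
    for name, _income, pop in sorted(municipalities, key=lambda m: m[1]):
        if cumulative >= 0.25 * total_pop:
            break
        poor.add(name)
        cumulative += pop
    return poor


# Hypothetical urban population of 10,000 spread over four municipalities.
sample = [
    ("A", 15_000, 1_000),
    ("B", 17_000, 1_500),
    ("C", 20_000, 3_000),
    ("D", 23_000, 4_500),
]
poor = flag_poorest_quartile(sample)
print(sorted(poor))
```

Weighting by population means that Q1 holds a quarter of the inhabitants, not a quarter of the municipalities: in the example, two small municipalities out of four are flagged.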

Figure 2: Cumulative excess mortality rate per 10,000 inh. by poverty status. NOTE: The graph plots the cumulative sum of all excess deaths per 10,000 inhabitants from January 2020 for poor and non-poor municipalities. Poor is defined as belonging to the bottom quartile of the national distribution of municipal median income weighted by the municipality size.

Figure 2 shows no specific pattern in the cumulative excess mortality rate in the first three months of 2020 (phase 1, in section 3.1), followed by a marked increase from April (phase 2) that grows further as the second wave takes place (phase 4). By the end of 2020, there were, on average, 11.4 more deaths per 10k. inhabitants than usual in the poorest municipalities in France, against 8.7 in richer municipalities.

In the following paragraphs, we employ an econometric approach to check whether this stylized fact holds after accounting for the influence of age and local factors.

We check that the descriptive evidence displayed in Figure 2 holds parametrically and test its robustness. We contrast the evolution of excess mortality in poor municipalities with that of richer municipalities over the whole French territory, taking a synthetic year (the average of 2018 and 2019) as the baseline.

Our main model writes:

$$D_{m,ua} = \beta \, Q1_{m} + X_{m,ua}' \delta + \gamma_{ua} + \varepsilon_{m,ua} \qquad (2)$$

where m indicates the municipality and ua an urban area. $Q1_{m}$ is an indicator equal to one if municipality m belongs to the poorest income quartile, and $\varepsilon_{m,ua}$ is an error term. $D_{m,ua}$ represents municipalities' excess mortality rate as defined in section 2 (i.e. as the deviation from the average 2018 and 2019 mortality rate of the same municipality over the same calendar period). In our analysis, every municipality m belongs to one single urban area ua, and we therefore drop this redundant subscript in subsequent models. $X_{m,ua}$ is a vector of controls including the total population size and the share of the population above 65 years old. Importantly, we introduce $\gamma_{ua}$, an urban-area fixed effect, so that we only exploit differences between municipalities located within a contiguous urban environment. This is an important aspect of our model since it absorbs specific local factors that may foster or hinder the spread of the epidemic and are unlikely to be independent of municipalities' income. Our results are thus based on comparisons at a very fine spatial level. Municipalities are weighted by their total population. Standard errors are clustered at the urban-area level.
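To make the within-urban-area comparison concrete, here is a minimal Python sketch of a population-weighted fixed-effects (within) estimator of β. It is a deliberately simplified illustration: it omits the controls in X and the clustered standard errors described above, and the function name, data layout and numbers are hypothetical.

```python
from collections import defaultdict


def within_beta(rows):
    """Population-weighted within-urban-area estimate of beta.

    `rows` is a list of dicts with keys 'ua' (urban area), 'excess'
    (excess mortality rate), 'poor' (1 if in Q1, else 0) and 'pop'.
    Each variable is demeaned by its population-weighted urban-area mean,
    which is equivalent to including urban-area fixed effects in a
    weighted regression with a single regressor.
    """
    # Accumulate total weight, weighted outcome and weighted indicator by area.
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    for r in rows:
        s = sums[r["ua"]]
        s[0] += r["pop"]
        s[1] += r["pop"] * r["excess"]
        s[2] += r["pop"] * r["poor"]

    num = den = 0.0
    for r in rows:
        weight, weighted_y, weighted_x = sums[r["ua"]]
        y_dm = r["excess"] - weighted_y / weight
        x_dm = r["poor"] - weighted_x / weight
        num += r["pop"] * x_dm * y_dm
        den += r["pop"] * x_dm ** 2
    return num / den


# Hypothetical data: in each urban area the poor municipality has
# 3 more excess deaths per 10,000 inhabitants than the non-poor one.
rows = [
    {"ua": "UA1", "excess": 12.0, "poor": 1, "pop": 1_000},
    {"ua": "UA1", "excess": 9.0, "poor": 0, "pop": 3_000},
    {"ua": "UA2", "excess": 11.0, "poor": 1, "pop": 2_000},
    {"ua": "UA2", "excess": 8.0, "poor": 0, "pop": 2_000},
]
beta_hat = within_beta(rows)
print(beta_hat)
```

Because only deviations from urban-area means enter the estimate, differences in average mortality between urban areas (UA1 versus UA2 in the example) do not affect it.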

Our model can be viewed as a difference-in-differences design where the coefficient of interest is β. It estimates the difference in excess mortality rate (time dimension) between municipalities from the poorest (Q1) and the other quartiles (comparison group). This model identifies the heterogeneity in the causal (total) effect of the pandemic on excess mortality under the sole hypothesis that, absent COVID-19 and the associated public policies, the average difference in the evolution of mortality over a given calendar period (in 2020 vs. before) between rich and poor municipalities of the same urban area would have remained stable. Below, we use falsification tests to provide evidence that this assumption is sensible. Table 1 reports the estimates associated with equation 2. Column 1 is estimated on all urban areas from mainland France and using the cumulative excess mortality rate over 2020 as a dependent variable. On average, within a given urban area, and once population size and age are controlled for, municipalities of the poorest quartile had an excess mortality rate of 2.627 (deaths per 10k. inhabitants) higher than other municipalities. This has to be compared with the baseline average of 8.668 (deaths per 10k. inhabitants) across municipalities of the other quartiles. Given the respective average median income in poor (17,108 euros) and non-poor municipalities (22,204 euros), this result implies an elasticity of excess mortality with respect to income of 1.32 at the municipality level.
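The reported magnitudes can be checked with simple arithmetic. The Python sketch below shows one way to recover the elasticity from the figures quoted above, taking non-poor municipalities as the base for both relative gaps; this choice of base is an assumption, and under it the elasticity is negative, so the 1.32 quoted in the text corresponds to its absolute value.

```python
# Figures quoted in the text (Table 1, column 1).
beta = 2.627               # extra excess deaths per 10k inh. in poor municipalities
baseline = 8.668           # average excess deaths per 10k inh. in non-poor municipalities
income_poor = 17_108       # average median income, poor municipalities (euros)
income_nonpoor = 22_204    # average median income, non-poor municipalities (euros)

# Relative gaps, using non-poor municipalities as the base (assumption).
relative_mortality_gap = beta / baseline
relative_income_gap = (income_poor - income_nonpoor) / income_nonpoor

elasticity = relative_mortality_gap / relative_income_gap

print(f"relative mortality gap: {relative_mortality_gap:.3f}")
print(f"relative income gap:    {relative_income_gap:.3f}")
print(f"elasticity:             {elasticity:.2f}")
```

The relative mortality gap of about 0.30 is also consistent with the 30% higher increase in excess mortality stated for the poorest municipalities.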

We next consider three sub-periods for 2020 that correspond to the dynamics of the epidemic described in section 3.1. In column (2) we report the coefficients associated with equation 2 when considering only the first wave, that is for March and April, while in column (3) we consider the second wave, from October to December. The income gradient is significant in both waves and slightly higher than 1 extra death per 10,000 inhabitants. By contrast, we observe no gradient when we focus on excess mortality outside of the two waves. In column (4), we consider a synthetic period made of all the months in 2020 that we did not classify in either wave. In these months, mortality was very close to baseline (i.e. zero excess deaths).

Table 1 NOTE: The dependent variable, excess mortality rate, is computed considering four different time periods: the whole year (column 1), wave 1 (March to April, column 2), wave 2 (October to December, column 3) and other months in 2020 outside the two waves (January, February and from May to September, column 4). By construction, column 1 is the sum of columns 2 to 4. The non-poor average line reports the average of the predicted value of the dependent variable in non-poor municipalities. Controls include total population size and the share of the population over 65 years old.

We then further decompose the effect and estimate the model for each of the 12 months separately. Figure 3 plots the estimated β coefficients. The difference in monthly all-cause excess mortality between municipalities of the poorest quartile and richer municipalities closely follows the dynamics of the epidemic depicted in Figure 1, and β only differs from 0 at the peak of the first and second waves. In other words, in each epidemic wave, mortality increases on average (Figure 1) but even more so in the poorest municipalities (Figure 3). 20

Taken together, these results support the idea that the income gradient we estimate for the whole 2020 year actually reflects the causal (total) effect of COVID-19 on mortality inequalities. They also provide evidence that this causal effect appears with regularity in each wave of the epidemic. We provide clearer and more detailed evidence on this last finding in section 4.4.

Figure 3: Monthly difference in all-cause excess mortality rate by income

NOTE: The graph plots the point estimate and the 95% confidence intervals of the estimation of β from equation 2 evaluated each month. It accounts for the monthly difference in all-cause excess mortality between the poor municipalities and the rest, where poor is defined as belonging to the bottom quartile of the national distribution of municipal median income weighted by the municipality size.

Our main model compares municipalities from the poorest quartile (Q1) to the others. This approach has the advantage of simplicity: by discretizing income into a simple "poor" vs. "non-poor" comparison, it greatly reduces the dimension of the problem and allows the exploration of heterogeneity and mechanisms in the following sections. In appendix C1 we explain this definition carefully. In particular, we show that contrasting the evolution of mortality in each of the four quartiles leads to a clear monotonic pattern.

That is, within an urban area, the increase in excess mortality (during COVID-19 waves) decreases in municipalities' income. This monotonicity is robust to a number of alternative partitionings of the distribution. Given this monotonic income gradient, pooling non-poor municipalities together to form a comparison group actually attenuates the differences in mortality rates we estimate (as Q1 is closer to the Q2-Q4 average).

The presence of a stronger increase in mortality of the poorest municipalities is robust to a number of alternative specifications. We only summarize these results here; comprehensive analysis can be found in Appendix C. As already mentioned, we test alternative groupings of municipalities into distinct quartiles (Table C1) or deciles (Table C2) of the income distribution. We further implement a log-linear model which relates excess mortality per 10k. inhabitants to the log of municipalities' median income (Table C2). The table shows that comparing a municipality to another with a 10% higher median income is associated with a reduction of observed excess mortality by almost 1 death per 10k. inhabitants over the 2020 year. Our main result also holds when excluding elderly care homes. These were severely hit during the epidemic and one may worry that their location could drive a spurious correlation between municipality-level income and mortality. Removing all death records in such institutions does not alter our conclusion (Table C3). Next, we compute the income gradient separately for different age categories (Figure C2): death toll and the income gradient increase with age, except for the category of people over 85 years old, who may represent a particularly vulnerable population irrespective of the level of income.

In other words, except for the very old, the size of the gradient increases with the size of the death toll.

Finally, we run an additional falsification test which compares mortality in 2019 and 2020 to the same baseline of 2018. Reassuringly, we find a gradient in 2020 excess mortality but none in 2019 excess mortality (see appendix C2.2).

Figure 3 suggests that there is a proportional link between the income gradient and excess mortality at the country level. Our data allows us to test this relation at a much finer scale, namely the urban area level. To do so we proceed in three steps. First, we estimate the gradient (for each month of 2020) separately for each of the 421 urban areas that include both poor and non-poor municipalities. This new urban-area specific gradient measures how unequal the increase in mortality has been within each urban area. Second, we measure mortality shocks faced by each urban area. For this, we simply compute urban-area specific (monthly) excess mortality. Finally, we study the link between the measured (urban [...]

This exercise yields a clear message: when a given urban area faces an increase in its excess mortality rate by one death per 10k. inhabitants in a given month, we also observe an increase of that urban area's gradient that same month by a bit more than 0.3 deaths per 10k. inhabitants on average (cf. Table D1). This means that an increase in the intensity of an epidemic wave in a given urban area will disproportionately affect poorer municipalities of this urban area.
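The three steps described above can be illustrated with a minimal sketch. It is a simplification under stated assumptions: the gradient is a raw poor-minus-non-poor difference in means (without the controls of equation 2), the urban-area shock is an unweighted mean, and all data values and function names are invented for illustration.

```python
from statistics import mean

def income_gradient(rows):
    """Step 1: poor-minus-non-poor gap in mean excess mortality for one cell."""
    poor = [r["excess"] for r in rows if r["poor"]]
    rest = [r["excess"] for r in rows if not r["poor"]]
    return mean(poor) - mean(rest)

def mortality_shock(rows):
    """Step 2: unweighted mean excess mortality over the cell's municipalities."""
    return mean(r["excess"] for r in rows)

def ols_slope(x, y):
    """Step 3: slope of a univariate OLS regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

# Toy (urban area, month) cells; excess mortality in deaths per 10k inhabitants.
cells = {
    ("A", "April"): [{"poor": True, "excess": 4.6}, {"poor": False, "excess": 3.4}],
    ("A", "July"): [{"poor": True, "excess": 0.0}, {"poor": False, "excess": 0.0}],
    ("B", "April"): [{"poor": True, "excess": 2.3}, {"poor": False, "excess": 1.7}],
}

shocks = [mortality_shock(rows) for rows in cells.values()]
gradients = [income_gradient(rows) for rows in cells.values()]
slope = ols_slope(shocks, gradients)

print(f"slope of gradient on shock: {slope:.2f}")
```

The slope is read in the same way as the estimate discussed in the text: the change in the within-urban-area gradient associated with one additional excess death per 10k. inhabitants in that urban area and month.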

Figure 3 also shows that the gradient exists in both waves. Two different hypotheses could explain such a finding. In a "structural" hypothesis, there could be specific features of poorer municipalities that strengthen the mortality response to COVID-19. In a "circumstantial" hypothesis, poorer municipalities may have simply been hit first, and one could expect non-poor areas to subsequently catch up. The "circumstantial" hypothesis is potentially compatible with the reappearance of the gradient because our result could be driven by different urban areas in each wave. For instance, it could be that Paris was hit in both waves but only displayed a gradient in the first, while Marseille was unaffected in the first wave but displays an important gradient in the second wave. To the best of our knowledge, there are no papers disentangling these two hypotheses. Our data allow us to test these two scenarios over a one-year, two-wave span.

To discriminate between both hypotheses, we simply consider the 243 (among 421) urban areas that suffered positive excess mortality in each of the two waves. 21 We find that, for these hit-twice urban areas, the average income gradient in both waves is positive (cf. Table D2). We interpret this result as evidence that, in these urban areas, although residents of poorer municipalities suffered more in the first wave, they also did so in the second wave. In other words, over a one-year period, we find no evidence of catch-up of the richer municipalities.

This longer-term perspective provides new evidence on the impact of COVID-19 on mortality inequalities.

5 The role of lockdown: evidence from a quasi-natural experiment

The previous section has established evidence of an income-related heterogeneity in the total impact of the pandemic on mortality in France. This income gradient can result directly from COVID-19 infections, from the related effects on the health system and individual behaviors, or from important economic and social changes. In particular, it may be that lockdown policies have an independent impact on mortality that is heterogeneous between poor and non-poor municipalities, thereby playing a role in the observed mortality gradient. We call "epidemic channels" the COVID-19 impact on the health situation (broadly understood as infection and any indirect effect that increases with the level of infection, such as hospital congestion, increased anxiety, etc.) and "policy channel" the mortality changes caused by lockdown policies alone. 22 In this section, we take advantage of the quasi-natural experiment induced by the first lockdown to isolate the independent impact of lockdown policies on mortality inequalities and to estimate the impact of the epidemic net of this effect.

At the core of our strategy is the fact that the first lockdown was implemented uniformly over the country while the epidemic was at very heterogeneous stages of development across regions. In particular, based on an external classification provided by the government on May 7, we can distinguish regions strongly hit by the epidemic as early as mid-March (the red zone) from regions where the epidemic was at a much earlier stage of development when the first lockdown was decided (the green zone). Although this classification was only provided towards the end of the first lockdown, it strongly correlates with indicators of the spread of COVID-19 in mid-March, before lockdown: (i) the average occupancy rate of intensive care beds was 26.5% in red départements, against 7.0% on average in the green départements; 23 (ii) the likelihood to visit an emergency unit for suspicion of COVID-19 was about twice as large in red départements (12.0% on average in the red départements vs 6.3% in the low-infection ones). 24 Importantly for what follows, we also show that there is no differential pre-trend in mortality between both zones in the years, months, and weeks preceding the epidemic (Table 2). The table also confirms that COVID-19 infection rates were already quite high in the red zone in early March 2020, while the virus was almost absent from the green zone where excess mortality was still very close to 0 during the last two weeks of March.

22 According to McIntyre and Lee (2020), unemployment in times of lockdown is expected to have a strong impact on suicide rates, and we know that poor individuals are more likely to have jobs that cannot be done remotely and were therefore more often laid off during the lockdown (Gottlieb et al., 2021; Palomino et al., 2020). Bullinger et al. (2020) have shown that stay-at-home orders increase domestic violence in poor households but not in households with above-median income. The strong variation in car crashes and in pollution levels due to the lockdown could also differently affect municipalities depending on their income (Brodeur et al., 2021).

23 Note that red départements have more intensive care beds per inhabitant (7.58 vs 5.86 per 100K. inhabitants on December 31, 2018) so that the high occupancy rate of intensive care beds in the North-East of France is very unlikely to be driven by worse pre-existing health care infrastructure.

24 An alternative possibility could have been to group départements based on these figures from mid-March. While the nature of our results is similar with this alternative classification, we preferred to rely on an external classification provided by the government, which cuts France in half and minimizes the share of areas classified as highly infected that are contiguous with areas classified as low-infected.

Finally, we show in section 5.2 that trends in differential mortality between rich and poor municipalities are very similar across zones during pre-COVID periods, in line with Le Bras (2020) and Fouillet et al. (2020), who provide evidence on the absence of a relationship between the location of the first epicentres and socio-demographic characteristics at the département level.

NOTE: One can see no statistical difference between the two zones before the last two weeks of March 2020 (vs. 2018-2019). Excess mortality rates then increase significantly more in the red zone than in the green zone and this pattern remains during the first wave. In May 2020 (vs. 2018-2019) the two zones do not differ anymore in terms of excess mortality rates.

This quasi-experimental setting provides a unique opportunity to compare the evolution of excess mortality in two zones equally affected by lockdown restrictions 25 and sharing similar characteristics (health care system, other institutions, etc.) but unequally exposed to COVID-19. As before, we analyze the within-urban-area gradient. Our claim is that the evolution of the within-urban-area gradient would have been comparable (on average between green and red zones) absent COVID-19 and the associated public policies. In other words, we assume that the probability for an urban area to be exposed to COVID-19 is independent of the evolution of its poor vs. non-poor mortality differential that would have occurred absent COVID-19.

25 Figure B1 (Appendix B) supports the idea of a very uniform enforcement of the lockdown policy, since the increase in time spent at home compared to normal conditions is very similar across zones.

To disentangle the epidemic effects of COVID-19 on excess mortality from lockdown effects, we treat the green zone as a control group: we consider the income gradient in excess mortality found in this zone during wave 1 as a measure of the differentiated impact (between poor and non-poor municipalities) of the lockdown. By contrast, the income gradient found in the red zone captures both the effects of the epidemic and those of the lockdown. The difference between the two estimated gradients can thus be attributed to the aforementioned "epidemic channels". These channels are diverse: from direct death caused by infection to any indirect effect that increases with the level of infection (e.g. due to hospital congestion or any behavioral response such as increased anxiety or greater fear of commuting caused by the high circulation of the virus in the area). We argue, however, that direct death from COVID-19 infection remains the main driver of the 2020 excess mortality. First, aggregate numbers suggest so: confirmed COVID-19 fatalities amount to 118% of 2020 excess mortality in France (Le Minez and Roux, 2021). Second, the heterogeneity across age bins shows no excess mortality or income gradient among the youngest. Such a pattern restricts the set of possible confounding factors to those only affecting mortality of older people. Although we do not exclude other contributions, these are likely to be secondary at most.

We first re-estimate Equation 2 for each zone in each month and we plot the coefficients of interest in Figure 4 (the zone-specific equivalent of Figure 3). It clearly appears that no gradient is found in the green zone during the first wave, unlike in the red zone and despite common lockdown restrictions. This leads us to conclude that lockdown policies did not have a significant independent contribution to the excess mortality income gradient. By contrast, the figure shows a marked income gradient in excess mortality whenever the level of infection is high (i.e. wave 1 in the red zone; wave 2 in both zones). This suggests that the income gradient is only to be found in the epidemic effect of COVID-19.

NOTE: The graph plots the point estimate and the 95% confidence intervals of the estimation of β from equation 2 evaluated each month on each zone separately. It accounts for the monthly difference in all-cause excess mortality between the poor municipalities and the rest in each zone, where poor is defined as belonging to the bottom quartile of the national distribution of municipal median income weighted by the municipality size. The red zone corresponds to the areas that were the most severely hit by the first wave, and that are located in the North-Eastern quarter of the country. This zone covers about 44% of the urban population of (mainland) France. The green zone encompasses the rest of the French territory.

To test more formally the existence of an income gradient in the effect of the epidemic, we exploit the lockdown natural experiment by employing a triple-difference strategy. That is, we add the red vs. green dimension (defined at the département level) to the difference-in-differences setting used in section 4.

Formally, we estimate the following model:

The main coefficient of interest ρ estimates the difference between red and green zones in the within urban-area difference between rich and poor municipalities' excess mortality. All monthly coefficients are reported in Figure 5 . Consistent with our hypothesis, the figure shows no significant difference in excess mortality between zones outside of the first wave. This difference first increases in March, due to the early direct effects of the epidemic, but remains insignificant at a 5% level.

Conversely, in April, at the peak of wave 1, the main coefficient is strongly significant and is even stronger than in the difference-in-differences setting.
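The triple-difference logic described above can be written compactly. The following is only a sketch of a specification consistent with the text: the exact set of terms, the control vector $X_m$, and the symbols $\lambda$, $\gamma$, $\delta_{ua}$ and $\varepsilon$ are assumptions introduced here for illustration and should not be read as the authors' displayed equation.

```latex
\begin{equation*}
\mathit{ExcessMortality}_{m,ua,d}
  = \beta\,\mathit{Poor}_m
  + \lambda\,\mathit{Red}_d
  + \rho\,\bigl(\mathit{Poor}_m \times \mathit{Red}_d\bigr)
  + X_m'\gamma + \delta_{ua} + \varepsilon_{m,ua,d}
\end{equation*}
```

Under this form, β is the poor vs. non-poor gap in the green zone and ρ is the additional gap in the red zone, that is, the red-zone gap minus the green-zone gap. If every urban area lies entirely within one zone, the term in λ is absorbed by the urban-area fixed effects $\delta_{ua}$.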

26 Note that these urban-area fixed effects account for the risk that: (i) coincidentally, the differences in COVID-19 infection intensity across urban areas at a given point in time could be non-orthogonal to their income level; (ii) people may adapt their behaviour to the local level of infection which would bias our results (Almagro et al., 2020) .

27 Note that this assumption is much less demanding than (for instance) assuming that the impact of the lockdown alone is the same in both zones.

28 Note that clustering at the level of urban areas does not change the nature of our results. 

To check the robustness of our results, we take advantage of the panel nature of our data to estimate an explicit triple-difference model, which includes municipality fixed effects and uses death toll as the main dependent variable. Formally, we estimate the following model: 

Where $\mathit{Death\_nb}_{y,m,ua,d}$ is the number of deaths in year $y$, municipality $m$ in urban area $ua$ and département $d$. This alternative specification allows us to control for municipality fixed effects $c_m$, which capture all time-invariant factors influencing municipality-level mortality between 2018 and 2020. 29 This implies that urban-area fixed effects cannot be estimated (each municipality belongs to a single urban area). It follows that in this model, we do not restrict ourselves to comparing adjacent municipalities within the same urban area. We estimate this model on the first wave of the epidemic (March-April), when the situation was suited for the use of a triple difference. As shown in Table E1, the nature of the result remains: over March-April, mortality increased in 2020, more so in red than in green areas, and more so in poor than non-poor municipalities. This result holds true regardless of whether we include municipalities located in rural areas (column 1) or not (column 2) in the estimation sample. 30 Finally, this result is also robust to the exclusion of elderly care homes. 31 In total, this set of results confirms that the most severely hit municipalities are those belonging to the poorest quartile and to the red zone.

29 We, therefore, do not need to divide the number of deaths by the population in the dependent variable, because we measure the population only once in 2014.
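The intuition behind a triple difference on death counts can be shown with simple cell means. The sketch below is a hedged illustration rather than the estimated model: it compares average within-municipality changes in deaths (2020 vs. a baseline year) across the four poor/non-poor by red/green cells, omits controls and standard errors, and uses invented numbers and hypothetical field names.

```python
from statistics import mean

# Toy municipality records (all numbers invented for illustration):
# deaths over March-April in a baseline year and in 2020.
municipalities = [
    {"poor": True,  "red": True,  "deaths_base": 10, "deaths_2020": 19},
    {"poor": True,  "red": True,  "deaths_base": 12, "deaths_2020": 21},
    {"poor": False, "red": True,  "deaths_base": 11, "deaths_2020": 16},
    {"poor": False, "red": True,  "deaths_base": 9,  "deaths_2020": 14},
    {"poor": True,  "red": False, "deaths_base": 10, "deaths_2020": 12},
    {"poor": True,  "red": False, "deaths_base": 8,  "deaths_2020": 10},
    {"poor": False, "red": False, "deaths_base": 10, "deaths_2020": 11},
    {"poor": False, "red": False, "deaths_base": 12, "deaths_2020": 13},
]

def mean_change(poor, red):
    """First difference: average change in deaths within one (poor, red) cell."""
    return mean(
        m["deaths_2020"] - m["deaths_base"]
        for m in municipalities
        if m["poor"] == poor and m["red"] == red
    )

def income_gap(red):
    """Second difference: poor minus non-poor change, within one zone."""
    return mean_change(True, red) - mean_change(False, red)

gap_red = income_gap(True)         # epidemic plus lockdown
gap_green = income_gap(False)      # lockdown only, under the paper's logic
triple_diff = gap_red - gap_green  # third difference: red minus green

print(gap_red, gap_green, triple_diff)
```

Differencing within municipality plays the role of the municipality fixed effects here, since any time-invariant municipality characteristic cancels out of the change in deaths.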

6 Potential mechanisms

The heterogeneous impact of COVID-19 on excess mortality between rich and poor municipalities can cover many mechanisms. While we acknowledge that we cannot examine all the potential underlying channels, we choose to study closely the occupational and housing exposure mechanisms, for three reasons: (i) these mechanisms have been mentioned as potential powerful channels of transmission very early on, both by economists and public health experts; (ii) they relate to ecological infection factors; (iii) the availability of high-quality data. We provide more explanations on the first two rationales in the following paragraphs.

Both the scientific literature and newspapers have extensively covered the key role of housing conditions and occupational exposure from the beginning of the epidemic (Almagro and Orane-Hutchinson, 2020; Almagro et al., 2020; Glaeser et al., 2020; Naticchioni et al., 2020; Angelucci et al., 2020) . 32 Informed by the state of the scientific knowledge at that time and given the likely positive association of housing and occupational exposure with poverty, we focus on these mechanisms.

The second reason that leads us to focus on these mechanisms relates to the ecological approach that we follow throughout the paper. The probability of dying from COVID-19 can be decomposed as follows:

$$P(\text{death}) = P(\text{infection}) \times P(\text{death} \mid \text{infection})$$

We are primarily interested in mechanisms affecting the probability of getting infected, rather than the probability of dying conditional on being infected. While we cannot directly distinguish between both components due to the lack of individual data, we can argue, however, that the transmission component matters and is correlated with income. 33 To rule out the hypothesis that the entire gradient comes from a difference in lethality (because, for instance, all individuals with a comorbidity may also locate in poor municipalities) and make sure that there exists a difference in transmission, we use data on tests.

Finding a gradient in incidence rate would suggest that the mortality gradient comes at least in part from differential transmission. We use administrative testing data at the municipality level from the French public health agency (SI-DEP), which we describe in Appendix A1. The weekly incidence rate is computed as the number of positive tests per 100,000 inhabitants over the period covered by the data, October 6th to December 28th, 2020. Test data suffer from caveats that we already mentioned in section 2 and that make us favour all-cause excess mortality in our main analysis. We describe these caveats in Appendix A2. With these limitations in mind, 34 we regress the incidence rate on our Poor(Q1) variable along with our standard controls and urban-area fixed effects, and we cluster standard errors at the urban-area level (as in Equation 2). Figure 6 exhibits a pronounced gradient in infection at the peak of the second wave, in November and the beginning of December. 35

33 We regret the absence of individual data on health and cause of death, or of high-quality municipality-level measures of health that could directly control for lethality. We therefore notably ignore the role of comorbidities known to be related with poverty and that most likely played a role in the observed mortality gradient (Wiemers et al., 2020; Raifman and Raifman, 2020). Controlling for age, proven to be one of the main individual determinants of COVID-19 mortality (Zhou et al., 2020), presumably mitigates this concern. Furthermore, using Belgian individual-level data, Decoster et al. (forthcoming) find that the effect of individual income is not significant once one controls for place of residency.

34 One of the main limitations of the testing data (the fact that testing is limited and not random) is presumably less of a concern during the second wave, when testing was more generalized.

35 We confirm this result by further decomposing the gradient into four quartiles (as in Equation (5)). Here again, we observe a monotonic gradient in infection from early November to mid-December. Results available upon request.

NOTE: The graph plots the point estimate and the 95% confidence intervals of the estimation of β from equation 2 evaluated each week with the incidence rate as the dependent variable. It accounts for the weekly difference in incidence rate between the poor municipalities and the rest, where poor is defined as belonging to the bottom quartile of the national distribution of municipal median income weighted by the municipality size.
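For concreteness, the weekly incidence rate used as the dependent variable in Figure 6 is a simple ratio. The sketch below follows the definition given in the text (positive tests per 100,000 inhabitants); the function name and the example figures are hypothetical.

```python
def weekly_incidence_rate(positive_tests, population, per=100_000):
    """Positive tests in a given week per `per` inhabitants of a municipality."""
    if population <= 0:
        raise ValueError("population must be positive")
    return per * positive_tests / population

# Hypothetical municipality: 45 positive tests in one week, 30,000 inhabitants.
rate = weekly_incidence_rate(positive_tests=45, population=30_000)
print(rate)
```

Because testing was neither universal nor random, a rate computed this way measures detected cases rather than true infections, which is the caveat raised in the text.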

From this exercise we conclude that the entire gradient in mortality cannot be explained by a (simple) difference in underlying health conditions, but that ecological dimensions matter to a great extent. In the following subsections, we first describe how we measure such potential mechanisms and confirm that each of our measures is positively correlated with poverty (section 6.2). We then try to quantify the extent to which these mechanisms explain the income gradient in COVID-19 related mortality and to understand which of them prevails (section 6.3).

Highly-exposed occupations are successively defined as occupations with frequent direct contact with the public in usual (pre-COVID-19) business conditions, and as occupations in sectors which kept operating during lockdown periods (the so-called essential workers). Informed by administrative data on the occupational distribution in each municipality, we compute (i) the worker-weighted average frequency of contact (hereafter "index of frequent contact") and (ii) the share of essential workers, in every municipality. Regarding housing conditions, we use census data at the household level to measure the share of overcrowded housing units in the municipality. This variable improves on some other measures of household size used in the literature (Almagro and Orane-Hutchinson, 2020) since it is a function of household size, dwelling size and number of rooms. At the interaction between labor-market and housing dimensions, we finally compute the share of municipalities' households that gather at least one member aged 65 or more and one member from a younger generation who is currently employed. This measure requires occupation information from the Census, which is only available for municipalities with at least 2,000 inhabitants. With respect to other papers establishing a link between physical contact (at home or at work) or maintained activity in the workplace during lockdown and COVID-19 exposure, we are able to provide evidence at a fine level both geographically (since we have administrative data at the municipal level while covering the whole national territory) and sector-wise (as the definition of essential workers is at the 3-digit occupation level). 36 We are also able to uncover the role of the interaction of labor-market and housing channels: workers increase their risk of catching the disease in their workplace, and transmit it to vulnerable persons when living in multigenerational households.
We provide more details on the construction of these measures in Appendix A2 and reference the sources in Appendix A1. Table F1 shows descriptive statistics on the different mechanism variables. Table F2 shows the strength of the link between poverty and our (normalized) labor-market and housing measures, once our baseline controls and urban-area fixed effects are included. Municipalities of the poorest quartile have more occupations in contact with the public (column 1) and more essential workers (column 2). A one standard-deviation increase in the share of essential workers (respectively in the index of frequent contact) is associated with a 16pp (respectively 14pp) increase in the probability of falling in the bottom quartile of the national weighted income distribution. The association is even stronger with the housing-crowding variable (column 3): a one standard-deviation increase in the share of over-crowded housing amounts to a 29pp rise in the probability of being a poor municipality. The poorest municipalities are also more likely to have multigenerational households (column 4), although the reduced sample size of municipalities with more than 2,000 inhabitants decreases the precision. 37

36 Almagro et al. (2020) exploit both occupation data from the American Community Survey and ZIP-code level and individual geolocation data to track time spent outside of home, but they focus on New York City only. Angelucci et al. (2020) use individual survey data in the US but they define essential workers based on the ability to work remotely before COVID-19.

37 The coefficient indicates that a one standard-deviation increase in the share of multigenerational households increases the probability to be in the poorest quartile by 7pp, but the p-value is only equal to 0.177.

In order to understand which mechanism prevails and to better inform public policy, we perform a horse race between our mechanism variables. Column (1) of Table 3 reports the difference in excess mortality between poor and non-poor municipalities. In the first wave, each of the mechanism variables is positively related to excess mortality, when included along with the poverty indicator. 39 Including the frequent contact variable, however, has only a minor effect on the poverty coefficient (column (2)) whereas the inclusion of the share of essential workers (column (3)) and overcrowded housing (column (4)) makes the coefficient of poverty shrink. In the first wave, the initial coefficient of poverty diminishes by 35% when the share of essential workers is included, and by 97% when the share of over-crowded housing is included. The last column shows that the covariates altogether absorb all of the poverty coefficient, which is not significant anymore, while housing conditions and the share of essential workers are still significantly related to excess mortality.

Similarly, in wave 2, a high share of the poverty coefficient is absorbed by the mechanism variables (63%), but here the share of essential workers appears as the main channel of the income gradient. It alone absorbs 62% of the poverty coefficient, whereas the relationship between housing conditions and excess mortality decreases in magnitude, consistent with the less strict implementation of the lockdown policy (see section 3 for more details). It is reasonable to think that there is an overlap between the group of workers with frequent contacts in their job and essential workers (e.g. cashiers, bus drivers, etc.). 40
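The "share absorbed" figures quoted above compare the poverty coefficient before and after a mechanism variable is added. A minimal sketch of that computation follows; the coefficient values are hypothetical placeholders chosen to reproduce a 35% reduction, not estimates taken from Table 3.

```python
def share_absorbed(beta_baseline, beta_with_mechanism):
    """Fraction of the baseline poverty coefficient that disappears once
    a mechanism variable is added to the regression."""
    if beta_baseline == 0:
        raise ValueError("baseline coefficient must be non-zero")
    return 1.0 - beta_with_mechanism / beta_baseline

# Hypothetical coefficients (deaths per 10k inhabitants), for illustration only.
beta_poor_only = 1.00            # poverty dummy alone
beta_poor_with_essential = 0.65  # poverty dummy plus share of essential workers

share = share_absorbed(beta_poor_only, beta_poor_with_essential)
print(f"share of the poverty coefficient absorbed: {share:.0%}")
```

A share close to 1 means the mechanism variable accounts for nearly all of the raw poor vs. non-poor gap, while a share near 0 means the gap is essentially unchanged by its inclusion.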

The coefficient of the frequent contact variable drops when the share of essential workers is included during the first wave. This suggests that non-essential workers with frequent contacts in normal business conditions do not suffer more from the pandemic during the first wave, since the lockdown probably prevents them from having such contacts. Conversely, being an essential worker alone is associated with a higher risk of COVID-19 related mortality. During the second wave, the inclusion of both occupational exposure variables even yields a negative coefficient on the contact variable. While we do not want to put too much emphasis on this result, one explanation could be that more protective measures have been taken for these over-exposed workers, maybe to the extent that they become more protected than the average worker. 41 Essential workers, however, continue to be hit more severely by COVID-19 despite these protective measures because they are going to the workplace more frequently.

38 Although these mechanisms are likely to be correlated with each other, we also looked at their independent association with excess mortality and its dynamic. We checked, in particular, that this association is insignificant outside of the two epidemic waves. Results are available upon request.

39 A one standard-deviation increase in the mechanism variables is associated with an increase from around 0.5 death per 10k. inhabitants for the occupational variables to around 1.5 deaths per 10k. inhabitants for the housing variable.

40 The correlation between both variables is 0.4. Out of the ten occupation codes with the highest index of frequent contact, six include essential workers.

Examining the dynamics of the effect, Almagro and Orane-Hutchinson (2020) and Almagro et al. (2020) highlight a growing importance of housing crowding over time compared with commuting in explaining COVID-19 infection, which they attribute to the fact that essential workers were gradually laid off over time. The different policies implemented and the lower incidence of layoffs in France compared with the US during the pandemic may explain why we do not find a similar pattern.

In Appendix F, we reproduce the horse race on the incidence rate to check that our mechanism variables affect mortality through the transmission channel. Table F3 indicates that both horse races on excess mortality and incidence rate give very similar results: all mechanisms are positively associated with the incidence rate when sequentially added with the poverty variable. The last column of Table F3, which includes the mechanism variables altogether with the poverty measure, reports a poverty coefficient close to zero and non-significant. The table also suggests that housing conditions are more crucial in explaining differences in infection than in the probability of dying conditional on being infected, as compared to labor market measures.

41 Think for example of a shop assistant, who must constantly wear a face mask and use hydro-alcoholic gel, compared to a clerical worker sharing his office and not constantly wearing a face mask.

Columns (2) to (4) respectively include one additional variable capturing either the occupation or housing mechanism. Column (5) includes both the poverty dummy and all the mechanism variables. The last column for wave 2 includes the same variables but uses the incidence rate as the dependent variable instead. All regressions include urban-area fixed-effects and control for total population and for the share of inhabitants over 65 y.o. in the municipality. The mechanism variables have been normalized such that coefficients can be interpreted in terms of the effect of a one standard-deviation change, and can be compared with each other. The control outcome mean line reports the average excess mortality rate per 10k. inhabitants or incidence rate per 100K inhabitants in each wave in non-poor municipalities (conditional on controls and urban-area fixed effects).

All in all, Table 3 suggests that both housing conditions and the share of essential workers are important determinants of the relatively higher excess mortality due to COVID-19 in poor municipalities. The relative importance of housing conditions diminishes over time, potentially due to a less strict lockdown during the second wave, whereas remote work was prevalent throughout the period. The whole set of covariates explains about 13% to 18% of the overall variation in excess mortality according to the adjusted R². Additional evidence that our mechanism variables channel a significant share of the income gradient by affecting transmission is that the R² of columns 2 to 5 of Table F3 is much higher, at about 70%.

In this paper, we provide clear evidence that COVID-19 contributes to increasing inequalities in mortality through an unequal impact across municipalities. We find that the epidemic caused 2.6 more deaths per 10k. inhabitants on average in the poorest municipalities in 2020, relative to a baseline of 8.7 in non-poor municipalities (i.e. a 30% higher effect in the poorest municipalities). Importantly, the income gradient measured in the first wave (March-April) persists in the second wave (October-December), even within cities strongly affected during the first wave. We further show that lockdown policies do not appear to have a significant independent contribution to this gradient. Finally, our analysis suggests a key mediating role of labor market and housing conditions, in line with the idea that ecological factors are important determinants of the spread of epidemics. More specifically, we find that labor-market exposure remains an important determinant of COVID-19 mortality across both waves, while the role of housing conditions decreases over time.

Our results show that the correlation between income and COVID-19 mortality highlighted in the recent literature does not only mirror existing health inequalities: the baseline income gradient in mortality is amplified during each epidemic wave. Observing such an effect in both waves in a country with relatively egalitarian access to health care also reinforces the idea that the income gradient in mortality is a very robust feature of the epidemic. Potential future waves of COVID-19 or of similar infectious diseases are likely to follow the same pattern and further amplify existing health inequalities.

Filosofi database (a French acronym for "Localised Disposable Income System"), 2014 version, contains a series of local measures on income and poverty. INSEE uses fiscal and social benefits data to compute these indicators at the municipality level. In particular, we are interested in the median standard of living of each municipality's inhabitants. The concept of standard of living corresponds to the household disposable income divided by consumption units (using the OECD scale), to account for the size of the household.
URL: https://www.insee.fr/fr/metadonnees/source/operation/s1451/presentation

Population Censuses are used to compute a series of measures at the municipality level. Each year: (i) a fifth of municipalities with fewer than 10,000 inhabitants are covered by an exhaustive census and (ii) a sample of 8% of the population of municipalities with 10,000 or more inhabitants is surveyed. In-depth questions relative to economic activity and family structure of households are only available for about 20% of the households of municipalities under 10,000 inhabitants and about 40% of the households in the other municipalities (10,000 inhabitants or more). Analyses of in-depth questions cannot be done for municipalities with fewer than 2,000 inhabitants as samples get too small. We use INSEE annual population counts and both exhaustive and partial census data files.

URL: https://www.insee.fr/fr/information/2383265

SI-DEP data is provided by Santé Publique France. These testing data cover the period from October 6th, 2020 to December 28th, 2020. Over that period and for (almost) every day, we obtain the weekly incidence rate at the municipality level. The incidence rate is computed as the number of positive tests per 100,000 inhabitants. Open-access data deviates from the actual measure of the daily incidence rate in two dimensions. First, the reported incidence rate is computed over a rolling window of 7 days from d-9 to d-3. This methodology aims at reducing measurement error. Second, for privacy reasons, the open-data file transforms the continuous variable into a right-censored discrete variable taking 8 values.

We assign each municipality its median income as computed by INSEE from fiscal and social benefits data. We next order municipalities from the poorest to the richest.

Next, we classify municipalities into bins of equal population-weighted size. Put differently, we divide the French urban population into equal groups based on their municipality of residency. In our preferred specification we consider four quartiles, from Q1 the poorest to Q4 the richest. Because municipalities have different sizes, each group is made of a different number of municipalities. This weighting follows our empirical specifications.

suspicion; (ii) the occupancy rate of intensive care beds relative to the initial one; (iii) the testing rate relative to the needs. High-infection départements correspond to 44% of the urban population and to roughly 30% of mainland France départements, all in the North-Eastern quarter of the country.
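The population-weighted grouping of municipalities into income quartiles described above can be sketched as follows. This is a minimal illustration, not the exact procedure used in the paper: the function name, the toy figures and the rule that assigns a municipality according to the midpoint of its population in the cumulative distribution are all assumptions.

```python
def population_weighted_quartiles(municipalities):
    """Assign each municipality to an income quartile (1 = poorest, 4 = richest)
    so that each quartile holds roughly a quarter of the total population.

    `municipalities` is a list of (name, median_income, population) tuples.
    """
    ordered = sorted(municipalities, key=lambda m: m[1])  # poorest first
    total_pop = sum(m[2] for m in ordered)
    quartile = {}
    cumulative = 0
    for name, _income, pop in ordered:
        # Assumed tie-breaking rule: use the position of the municipality's
        # population midpoint in the cumulative population distribution.
        midpoint_share = (cumulative + pop / 2) / total_pop
        quartile[name] = min(int(midpoint_share * 4) + 1, 4)
        cumulative += pop
    return quartile


# Toy data (hypothetical): name, median income in euros, population.
toy = [("A", 15000, 150), ("B", 18000, 100), ("C", 20000, 250),
       ("D", 22000, 250), ("E", 25000, 150), ("F", 30000, 100)]
groups = population_weighted_quartiles(toy)
print(groups)
```

In this toy example each quartile contains 250 inhabitants, but Q1 and Q4 are made of two municipalities each while Q2 and Q3 contain only one, which illustrates why groups differ in their number of municipalities.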

Although this classification was published in May, we use it as a proximate measure of COVID-19 spread intensity before and during lockdown. We discuss this aspect in section 5.1.

We exploit incidence data provided by the French Public Health Institute on a weekly basis (see Appendix A1 for more details). For each day d, the incidence rate is given as a moving average from d − 9 to d − 3. The graphical evidence (Figure 6) therefore reports the coefficients on 12 key dates, corresponding to each day numbered 1, 7, 14, 21 or 28 of each of the 3 months of the observed period.
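To make the timing of the reported measure concrete, the sketch below computes the rate attached to a day d from daily counts of positive tests. It is only an illustration: the function name and the toy counts are hypothetical, and summing positive tests over the 7-day window (rather than averaging daily rates) is an assumption about how the published figure is built.

```python
from datetime import date, timedelta


def reported_incidence(daily_positives, population, d):
    """Incidence rate reported for day `d`: positive tests recorded from
    d-9 to d-3 (7 days), expressed per 100,000 inhabitants.

    `daily_positives` maps a date to the number of positive tests on that day.
    Summing over the window is an assumption made for this sketch.
    """
    window = [d - timedelta(days=k) for k in range(3, 10)]  # d-3 back to d-9
    positives = sum(daily_positives.get(day, 0) for day in window)
    return positives * 100_000 / population


# Toy example (hypothetical): 2 positive tests per day, 10,000 inhabitants.
start = date(2020, 10, 1)
toy_counts = {start + timedelta(days=i): 2 for i in range(30)}
rate = reported_incidence(toy_counts, population=10_000, d=date(2020, 10, 15))
print(rate)
```

The value reported for October 15 therefore only reflects tests recorded between October 6 and October 12.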

We call this measure a weekly incidence rate for simplification in the core of the paper. The regression reported in Table 3 averages the measure over dates spaced seven days apart, starting from the first observed date (October 6th, 2020), which corresponds to the average of the daily incidence rate over the observed period. We note the following important caveats about the data:

• Unfortunately, it does not allow us to consider the first wave;

• The data is categorized and right-censored, creating measurement error;

• The data also suffers from the limitations we describe in Section 2, namely that people do not get tested systematically and that individuals' propensity to get tested often reacts to the local intensity of the infection.

In our analyses, we treat the incidence rate as a continuous variable and ignore right-censoring, that is: we transform each bin-indicator into the value of the center of the bin. For the last right-censored bin, there is no mid-point. We extrapolate the increasing trend of mid-points of the lower bins to get the 8th bin mid-point.
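The conversion of the binned variable into a continuous one can be sketched as below. The bin edges are hypothetical placeholders, and the extrapolation rule for the open-ended bin (previous mid-point plus the last observed increase between mid-points) is only one possible reading of "extrapolating the increasing trend"; the paper does not spell out the exact rule.

```python
def bin_midpoints(lower_edges):
    """Return one representative value per bin.

    `lower_edges` lists the lower bound of each bin in increasing order; the
    last bin is open-ended (right-censored). Closed bins get their mid-point.
    The open bin gets the previous mid-point plus the last increase between
    mid-points (a simple linear extrapolation, assumed for this sketch).
    """
    mids = [(lo + hi) / 2 for lo, hi in zip(lower_edges, lower_edges[1:])]
    mids.append(mids[-1] + (mids[-1] - mids[-2]))
    return mids


# Hypothetical bin edges (cases per 100,000 inhabitants), for illustration only.
edges = [0, 10, 20, 50, 150, 250, 500, 1000]
midpoints = bin_midpoints(edges)
print(midpoints)
```

Any such rule is arbitrary for the censored bin, which is one reason why the measurement error noted among the caveats matters most for heavily affected municipalities.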

One novelty of this paper is to use administrative data to test whether labor market conditions explain the differential in excess mortality between poor and rich municipalities. We focus on two main characteristics of jobs that we expect to be key vectors for spreading the disease: (i) whether the worker kept on going to her workplace during the lockdown; (ii) whether the job involves frequent contacts with the public in usual business conditions (i.e. as measured before the lockdown).

As for the first dimension, for each municipality, we compute the share of essential workers 42 based on individual employment data from the DADS and on a list of essential occupations built by the Paris Region Health Observatory (Mangeney et al., 2020). 43 Note that, although this list was built by an administrative organisation, it remains arbitrary by nature. However, to our knowledge, there is no better way at the moment to characterise workers that kept on going to their workplace during the lockdown. 44

As for the second dimension, our proxy is based on the question "In your job, are you in direct contact with the public" that is available in a survey called DEFIS (see Appendix A1). For each occupation code (at the 3-digit level), we compute the share of workers answering "Often" (vs. "Sometimes" or "Never"). Using the DADS, we then compute the average of this index at the municipality level based on the occupation distribution of workers living in each municipality.
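The two-step construction of this proxy (occupation-level shares, then municipality-level averages) can be illustrated with the following sketch. The function names, occupation codes and toy records are hypothetical; the actual computation relies on the DEFIS survey and the DADS files, whose layout is not reproduced here.

```python
from collections import defaultdict


def occupation_contact_share(survey):
    """Share of respondents answering "Often" within each occupation code.
    `survey` is a list of (occupation_code, answer) pairs."""
    often = defaultdict(int)
    total = defaultdict(int)
    for occ, answer in survey:
        total[occ] += 1
        often[occ] += answer == "Often"
    return {occ: often[occ] / total[occ] for occ in total}


def municipal_contact_index(workers, contact_share):
    """Average the occupation-level share over workers living in each
    municipality. `workers` is a list of (municipality, occupation_code)."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for muni, occ in workers:
        if occ in contact_share:  # skip occupations absent from the survey
            sums[muni] += contact_share[occ]
            counts[muni] += 1
    return {m: sums[m] / counts[m] for m in counts}


# Toy data with made-up 3-character occupation codes.
toy_survey = [("A01", "Often"), ("A01", "Often"), ("A01", "Sometimes"),
              ("A01", "Often"), ("B02", "Never"), ("B02", "Often"),
              ("B02", "Sometimes"), ("B02", "Never")]
toy_workers = [("X", "A01"), ("X", "A01"), ("X", "B02"), ("X", "B02"),
               ("Y", "A01"), ("Y", "A01"), ("Y", "A01"), ("Y", "B02")]
shares = occupation_contact_share(toy_survey)
index = municipal_contact_index(toy_workers, shares)
print(shares, index)
```

Because workers are assigned to their municipality of residence, the index reflects the exposure of residents rather than that of local workplaces.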

Our last dimension of interest is the relation between housing conditions and excess mortality due to COVID-19. We use the average share of housing units that are overcrowded provided by INSEE at the municipality level. Overcrowded accommodations are those that have less than "one living room, one room for each couple, one room for each other adult aged 19 or older, one room for two children if they are of the same sex or are under 7 years of age, and one room per child otherwise".

To investigate the interactions between both mechanisms, we finally build an index at the municipality level, based on the partial census data files available for municipalities with more than 2,000 inhabitants.

We compute the share of households with both an elderly person (over 65 y.o.) and a worker who is younger by at least 18 years ("multi-generational households" hereafter). This variable is meant to capture the fact that having different generations living in the same apartment increases the likelihood

42 Workers are localised according to their municipality of residency and regardless of their municipality of work.

43 Essential workers include: health workers, auxiliary nurses, pharmacists, ambulance drivers, post office clerks, the police, public transport and funeral services, firefighters, persons working in the sale of food products, delivery workers, tobacconists and cleaning staff. On average, they include about 19.5% of workers in each municipality. The French National Statistical Institute also used this list in one of their papers (Papon and Robert-Bobée, 2020).

44 We tried an alternative approach based on sectors that remained active during the lockdown. Our results became very noisy as we could not systematically identify firms that remained open and workers that kept on going to their workplace at a disaggregated level.

Descriptive statistics across the 16,640 municipalities. The average mortality rate (per 10,000 inhabitants) was 84 in 2018 and 2019 and rose to 90 in 2020. The average population of municipalities is 3,188, of which 17% are aged 65 or older. Municipalities' median income is, on average, 22,000 euros.

In this appendix we take a more comprehensive approach and contrast the evolution of the excess mortality rate in all four quartiles of municipalities' median income, taking the richest (Q4) as the reference. To do so, we estimate the following model:

$$\text{ExcessMortality}_{[m,ua]} = \beta_1 Q1_{[m,ua]} + \beta_2 Q2_{[m,ua]} + \beta_3 Q3_{[m,ua]} + X_{[m,ua]}.\Lambda + \gamma_{ua} + \nu_{[m,ua]} \qquad (5)$$

where all definitions (concept, subscripts, controls and standard errors) are the same as in equation 2. The coefficients of interest are $\beta_1$, $\beta_2$ and $\beta_3$. They estimate the difference in excess mortality rate between the richest municipalities (Q4) and each of the three other quartiles (Q1, Q2 and Q3), respectively.
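As an illustration of how the three quartile coefficients can be obtained, the sketch below runs an ordinary least squares regression of excess mortality on Q1 to Q3 indicators, absorbing urban-area fixed effects by demeaning within urban areas. It is a deliberately simplified version of the specification: controls, weights and clustered standard errors are omitted, all names are hypothetical, and the toy data are noise-free by construction.

```python
def demean_within(values, groups):
    """Subtract the group mean from each value (within transformation)."""
    sums, counts = {}, {}
    for v, g in zip(values, groups):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def quartile_gradient(excess, quartile, urban_area):
    """OLS of excess mortality on Q1-Q3 dummies (Q4 is the reference),
    with urban-area fixed effects absorbed by demeaning."""
    y = demean_within(excess, urban_area)
    xs = [demean_within([1.0 if q == k else 0.0 for q in quartile], urban_area)
          for k in (1, 2, 3)]
    xtx = [[sum(a * b for a, b in zip(xi, xj)) for xj in xs] for xi in xs]
    xty = [sum(a * b for a, b in zip(xi, y)) for xi in xs]
    return solve(xtx, xty)


# Toy data: excess = urban-area effect (5 or 8) + quartile effect (4, 2, 1, 0).
excess = [9.0, 7.0, 6.0, 5.0, 12.0, 10.0, 9.0, 8.0, 8.0]
quartile = [1, 2, 3, 4, 1, 2, 3, 4, 4]
urban_area = ["U1"] * 4 + ["U2"] * 5
betas = quartile_gradient(excess, quartile, urban_area)
print(betas)
```

On these toy data the routine recovers the quartile effects built into the example (approximately 4, 2 and 1), since the fixed effects are fully absorbed by the within transformation.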

The identification assumption remains: the three coefficients identify the heterogeneity in the causal (total) effect of the pandemic on excess mortality under the hypothesis that, absent COVID-19 and the associated public policies, the average difference in the evolution of mortality in a given calendar period (in 2020 vs. before) between rich and poor municipalities of the same urban area would have remained stable.

Table C1 reports the estimates of coefficients β1, β2 and β3 from equation 5. It is the equivalent of Table 1. On average, within a given urban area, and once population size and age are controlled for, municipalities of the poorest quartile had an excess mortality rate 4.216 deaths per 10k. inhabitants higher than the richest municipalities. This has to be compared with the baseline average of 6.584 deaths per 10k. inhabitants across municipalities of the richest quartile. Excess mortality is also significantly higher in municipalities of the two other quartiles than in municipalities of the richest quartile, but for both Q2 and Q3 the difference with Q4 is about half the one observed between Q1 and Q4.

Columns (2) and (3) consider the first (i.e. March-April) and second (i.e. October-December) wave, respectively. The income gradient is striking in both waves and of very similar size. We observe no gradient when we focus on excess mortality outside of these two waves (column (4)).


The monotonic gradient can also be seen when using alternative groupings of municipalities instead of quartiles. In Table C2 below we estimate two alternative specifications of equation 5. Columns (1) to (4) reproduce Table C1 using 10 deciles of income (instead of four quartiles) to group municipalities.

Similarly, columns (5) to (8)

EHPADs are not distributed homogeneously across municipalities, they could drive the difference between income groups (in one way or another). To exclude the threat of this confounding factor, we exclude deaths recorded from such institutions from our sample and re-estimate our model of equation 2. Table C3 displays the results in the same format as Table 1. Although the estimates are smaller, the main conclusion remains: an income gradient appears during epidemic waves.

The gradient in wave 1 is both large (+1.021 per 10k inhabitants) and insignificant. As explained in the core of the paper, this is largely due to the unequal spread of COVID-19 when the first lockdown started. If one considers only the red zone, the estimated coefficient is 1.945 (se = 0.644***). We therefore conclude from this exercise that EHPADs do not explain the entire increase in the gradient.

The grey dotted line shows that using 2018 as the only reference point does not alter our main results too much, although having only one year of reference makes our measure more sensitive to year-specific shocks. Nevertheless, two takeaway points emerge from the black line. Overall, comparing 2019 to 2018 yields no significant difference in yearly excess mortality between poor and non-poor municipalities.

45 One could also have used 2019 as baseline, but all three measures are very close (with correlations between 0.5 and 0.9).


Excess mortality is smaller for richer municipalities over some short periods (in August and September) but the estimated effects always remain smaller than 1 death per 10,000 inhabitants (i.e. about 0.3 death in the average municipality of our sample).

(2) run separately on excess mortality over 2020 for different age categories. The first point reports the coefficient estimated on the whole population. 95% confidence intervals are reported. On the right-hand side, we show the magnitude of excess mortality defined as the number of excess deaths per 10,000 inhabitants over 2020, for the same age categories.


To explore the validity of our results at the urban-area level, we first measure urban-area specific gradients by estimating the following model:

We include our baseline controls: population and share of residents older than 65 years old. The main difference from the model of equation 2 lies in $\beta_{ua}$, which is urban-area specific (while $\beta$ in equation 2 is not). Put differently, this model adds an interaction between the urban area indicators ($\gamma$) and the Q1 indicator. 46 That is to say, this model estimates one gradient per urban area. It follows that equation 6 can only be estimated on the 421 urban areas that host both poor and non-poor municipalities.
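The display of equation 6 does not appear in this version of the text. Based on the description above (baseline controls, an urban-area-specific coefficient on the Q1 indicator, and the tilde notation mentioned in footnote 46), the model presumably takes a form along the following lines. This is a tentative reconstruction: the symbol used for the dependent variable and the exact arrangement of terms are assumptions.

```latex
\text{ExcessMortality}_{[m,ua]}
  = \tilde{\beta}_{ua} \, Q1_{[m,ua]}
  + X_{[m,ua]} . \Lambda
  + \tilde{\gamma}_{ua}
  + \nu_{[m,ua]}
  \qquad (6)
```

Under this form, each $\tilde{\beta}_{ua}$ measures the difference in excess mortality between poor (Q1) and non-poor municipalities within urban area $ua$, which is why the model requires both types of municipalities to be present in an urban area.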

Next, we measure the intensity of the mortality shocks at the urban-area level. To do this, we simply compute each urban-area specific excess mortality in 2020. This is equation 1 at the urban-area level (instead of municipality level). When measuring the urban-area excess mortality, we sum all excess deaths from every municipality, irrespective of their income level.

In Table D1 below we regress the urban-area month-specific gradient as measured in equation 6 on urban-area month-specific excess mortality. This is to understand how the local gradients evolve with the magnitude of the mortality shock. The coefficient in column (1) corresponds to a pooled regression where each of the 421 urban areas is present 12 times (one per month; and we cluster standard errors by urban area). The coefficient indicates that increasing urban-area level excess mortality rate by one death (per 10k. inhabitants) also increases the income gradient in mortality by 0.373 death (per 10k. inhabitants).

Column (2) next introduces an urban-area fixed effect and therefore only considers the changes across months to identify the effect. Columns (3) and (4) reproduce columns (1) and (2), respectively, weighting observations by urban-area population. They mostly ensure that the correlation between excess mortality and gradient is not driven by small urban areas only. Columns (2) to (4) report results that are qualitatively similar to column (1). That is to say: on average, when an urban area faces a greater mortality, a within urban area gradient appears (the share of all deaths coming from the poorest municipalities of the urban area increases).

46 To insist on the difference between the coefficients of interest in 2 and 6, we indicate both β and γ with a ∼ .

(1) is a pooled regression while column (2) introduces an urban area fixed effect. Column (3) and (4) repeat (1) and (2), respectively, weighting by urban-area population.

We next investigate whether the 234 urban areas that suffered increased mortality in both waves also display an income gradient in the second wave. This amounts to testing the hypothesis that the positive relationship between the income gradient and excess mortality is a "structural" feature of the epidemic. The alternative would be that the income gradient only captures a difference in the timing of deaths between inhabitants of poor and non-poor municipalities. Table D2 shows that, in both waves, the average (and median) gradient of these 234 urban areas was greater than 0. That indicates that residents of poorer municipalities within an urban area saw a larger increase in mortality in 2020 (relative to previous years) than residents of non-poor municipalities. Importantly, the positive gradient is also found in the second wave, indicating that there has been no convergence in mortality rates between poorer and richer municipalities of these urban areas over the two waves.

We remain cautious in interpreting the relative size of these gradients. The reduction of the average gradient (between wave 1 and 2) indicates a slower divergence between poorer and richer municipalities in the second wave. But such an effect could be the product of many factors: a) we document in the paper that wave 2 is more diffuse over France than wave 1 (an aspect that is mirrored in the higher dispersion of gradients in wave 2 shown in Table D2), b) first- and second-wave policy responses differed, c) there is potential learning from municipalities (i.e. residents, governing bodies, etc.).

* p<0.1, ** p<0.05, *** p<0.01. Standard errors in parentheses are clustered at the département level. This table shows the result of regressing the poverty dummy on mechanism variables measuring either housing conditions or occupational exposure and the interaction of these variables via the variable multigenerational household. The poverty dummy is equal to one for the bottom 25% of the national distribution of municipal median income weighted by population size. Each column reports the result of a separate regression examining one mechanism. The main coefficient corresponds to the correlation between each of the mechanism variables and poverty. The mechanism variables have been standardized such that coefficients can be interpreted in terms of the effect of a one standard-deviation change, and can be compared with each other. All regressions include urban-area fixed-effects and control for total population and for the share of inhabitants over 65 y.o. in the municipality.

The table only shows results for the second wave (October-December), that is when test data are available, on municipalities in all urban areas. The first column only examines the poverty channel. Columns (2) to (4) respectively include one additional variable capturing either the occupation or housing mechanism. The last column includes both the poverty dummy and all the mechanism variables. All regressions include urban-area fixed-effects and control for total population and for the share of inhabitants over 65 y.o. in the municipality. The mechanism variables have been normalized such that coefficients can be interpreted in terms of the effect of a one standard-deviation change, and can be compared with each other. The outcome-mean line reports the mean of the incidence rate per 100K inhabitants (conditional on controls and fixed effects) in each wave.

Racial, Economic, and Health Inequality and COVID-19 Infection in the United States

US disparities in health: descriptions, causes, and mechanisms

Epidemics, inequality, and poverty in preindustrial and early industrial time

The determinants of the differential exposure to covid-19 in new york city and their evolution over time. Covid Economics: Vetted and Real-Time Papers

Racial disparities in frontline workers and housing crowding during covid-19: Evidence from geolocation data

Remote Work and the Heterogeneous Impact of COVID-19 on Employment and Health

Socioeconomic conditions, government interventions and health outcomes during COVID-19

Cross-country comparisons of covid-19: Policy, politics and the price of life

Estimating excess 1-year mortality associated with the covid-19 pandemic according to underlying conditions and age: a population-based cohort study

The 1918 influenza pandemic and its lessons for covid-19

Excess mortality: the gold standard in measuring the impact of covid-19 worldwide

Unequal consequences of covid 19 across age and income: representative evidence from six countries

L'espérance de vie par niveau de vie: chez les hommes, 13 ans d'écart entre les plus aisés et les plus modestes

Demographic Determinants of Testing Incidence and COVID-19 Infections in New York City Neighborhoods. SSRN Scholarly Paper ID 3572329

Using excess deaths and testing statistics to improve estimates of covid-19 mortalities

On the effects of COVID-19 safer-at-home policies on social distancing, car crashes and pollution

Inequality and the Coronavirus: Socioeconomic Covariates of Behavioral Responses and Viral Outcomes Across US Counties

COVID-19 and Crime: Effects of Stay-at-Home Orders on Domestic Violence

Income inequality and mortality: A norwegian perspective

Impact of COVID-19 lockdown policy on homicide, suicide, and motor vehicle deaths in Peru

High excess mortality in areas with young and socially vulnerable populations during the COVID-19 outbreak in Stockholm Region

Economic status and health in childhood: The origins of the gradient

Deaths Involving Covid-19 by Local Area and Socioeconomic Deprivation: Deaths Occurring Between

Revealing the Unequal Burden of COVID-19 by Income, Race/Ethnicity, and Household Crowding: US County Versus Zip Code Analyses

The Association Between Income and Life Expectancy in the United States

Air pollution exposure and covid-19

Mortality inequality: the good news from a county-level approach

Pauvreté, egalité, mortalité: mortality (in) equality in france and the united states

The determinants of mortality

Socioeconomic status and health: Dimensions and mechanisms

The income gradient in mortality during the covid-19 crisis: evidence from belgium

Understanding Spatial Variation in COVID-19 across the United States

A population-based cohort study of socio-demographic risk factors for COVID-19 deaths in Sweden

Dépistage du coronavirus : les raisons du fiasco français sur les tests. Le Monde.fr

Excess all-cause mortality during the first wave of the covid-19 epidemic in france

How much does covid-19 increase with mobility? evidence from new york and four other us cities

Google covid-19 community mobility reports

Working from home in developing countries

Coronavirus Infections and Deaths by Poverty Status: The Effects of Social Distancing. SSRN Scholarly Paper

Social Vulnerability and Racial Inequality in COVID-19 Deaths in Chicago

What Does and Does Not Correlate with COVID-19 Death Rates

Impact of non-pharmaceutical interventions for SARS-CoV-2 on norovirus outbreaks: an analysis of outbreaks reported by 9 US States. medRxiv

COVID-19 and Overall Mortality Inequities in the Surge in Death Rates by Zip Code Characteristics: Massachusetts

Underestimation of covid-19 mortality during the pandemic

2020 : une hausse des décès inédite depuis 70 ans

Ethnicity and mortality in the united states: individual and community correlates

Progress against inequalities in mortality: register-based study of 15 european countries between

La surmortalité durant l'epidémie de covid-19 dans les départements franciliens

Projected increases in suicide in Canada as a consequence of COVID-19

Deaths of despair and the incidence of excess mortality in 2020

Partial lockdown and the spread of covid-19: lessons from the italian case

Wage inequality and poverty effects of lockdown and social distancing in Europe

Socio-demographic factors associated with self-protecting behavior during the covid-19 pandemic

Une hausse des décès deux fois plus forte pour les personnes nées à l'étranger que pour celles nées en France en mars-avril 2020

The effects of increased pollution on covid-19 cases and deaths. Available at SSRN 3633446

Multilevel analyses of neighbourhood socioeconomic context and health outcomes: a critical review

Disparities in the Population at Risk of Severe Illness From COVID-19 by Race/Ethnicity and Income

Excess mortality in the united states during the first three months of the covid-19 pandemic

Socioeconomic position and health: the independent contribution of community socioeconomic context. Annual review of sociology

Socioeconomic determinants of covid-19 infections and mortality: Evidence from england and wales

Testing for covid-19: A way to lift confinement restrictions

Using influenza surveillance networks to estimate state-specific prevalence of sars-cov-2 in the united states

Epidemics and society: from the black death to the present

Disparities in Vulnerability to Severe Complications from COVID-19 in the United States. medRxiv

Factors associated with COVID-19-related death using OpenSAFELY

The pandemic's hidden toll: Half a million deaths