key: cord-1018337-zzrfo0z7 authors: Nafilyan, V.; Pawelek, P.; Ayoubkhani, D.; Rhodes, S.; Pembrey, L.; Matz, M.; Coleman, M. P.; Allemani, C.; Windsor-Shellard, B.; van Tongeren, M.; Pearce, N. title: Occupation and COVID-19 mortality in England: a national linked data study of 14.3 million adults date: 2021-05-17 journal: nan DOI: 10.1101/2021.05.12.21257123 sha: aef2f7e6236f028991291a3ab51e142c99fe4f5d doc_id: 1018337 cord_uid: zzrfo0z7
Objective: To estimate occupational differences in COVID-19 mortality, and to test whether these are confounded by factors such as regional differences, ethnicity and education, or are due to non-workplace factors such as deprivation or pre-pandemic health.
Design: Retrospective cohort study.
Setting: People living in private households in England.
Participants: 14,295,900 people aged 40-64 years (mean age 52 years, 51% female) who were alive on 24 January 2020, living in private households in England in 2019, were employed in 2011, and completed the 2011 census.
Main outcome measures: COVID-19 related death, assessed between 24 January 2020 and 28 December 2020. We estimated age-standardised mortality rates per 100,000 person-years at risk (ASMR) stratified by sex and occupation. To estimate the effect of occupation due to work-related exposures, we used Cox proportional hazard models to adjust for confounding (region, ethnicity, education), as well as non-workplace factors that are related to occupation.
Results: There is wide variation between occupations in COVID-19 mortality. Several occupations, particularly those involving contact with patients or the public, show three-fold or four-fold risks. These elevated risks were greatly attenuated after adjustment for confounding and mediating non-workplace factors. For example, the hazard ratio (HR) for men working as taxi and cab drivers or chauffeurs changed from 4.60 [95% CI 3.62-5.84] to 1.47 [1.14-1.89] after adjustment.
More generally, the overall HR for men working in essential occupations compared with men in non-essential occupations changed from 1.45 [1.34-1.56] to 1.22 [1.13-1.32] after adjustment. For most occupations, confounding and other mediating factors explained about 70% to 80% of the age-adjusted hazard ratios.
Conclusions: Working conditions are likely to play a role in COVID-19 mortality, particularly in occupations involving contact with COVID-19 patients or the public. However, there is also a substantial contribution from non-workplace factors, including regional factors, socio-demographic factors, and pre-pandemic health.
The coronavirus pandemic has been particularly severe in the United Kingdom, where high infection and death rates have been reported. Whilst most deaths occur amongst elderly adults [1], many deaths have also occurred among those of working age, particularly among essential workers [2]. Several studies have reported important occupational differences in the risk of SARS-CoV-2 infection and death [3, 4, 5], but there have been relatively few systematic comparisons of death rates in different occupations. Infections in health care workers have received the most attention [6, 7], with evidence that intensive care unit workers who care for COVID-19 patients are at elevated risk. However, other occupations may also be at increased risk, particularly those which involve social care, or contact with the public [8]. In particular, age-standardised mortality rates (ASMRs) for COVID-19 by occupation are high among taxi drivers and chauffeurs, bus and coach drivers, chefs, sales and retail assistants, and social care workers [9]. Occupational inequality in COVID-19 mortality is a major public health problem [10, 8], but it is challenging to determine the extent to which working conditions drive these raised risks.
Occupational differences in COVID-19 mortality could be caused by non-workplace factors such as living conditions at home or poor underlying (pre-pandemic) health. Deprivation, poor health and occupation are all linked. For example, people working in low-paid, insecure jobs are also likely to experience poor housing conditions, overcrowding and low pre-pandemic health status. COVID-19 mortality is also higher in people with specific comorbidities [11, 12]. However, it is important to distinguish between situations where high COVID-19 mortality rates in a particular occupational group are directly due to workplace exposures, and those which are due to non-workplace factors. This distinction is particularly important for public health policy. If the excess risk in an occupation (e.g. bus drivers) is due to working conditions, this would suggest the need for preventive interventions in the workplace. However, if the excess is due to non-workplace factors such as living conditions at home (which may be associated with working conditions, but are not caused by them), then different interventions would be required. In both instances, the observed differences in risk would be real, and preventable, but the policy implications would be different. In this study, we estimated occupational differences in COVID-19 mortality in England during 2020. We examined how much these differences changed after adjustment for non-workplace factors, using Cox proportional hazard models.
We used individual-level data from the Public Health Data Asset. This dataset is based on the 2011 Census in England, linked with the NHS number to death records, Hospital Episode Statistics and the General Practice Extraction Service (GPES) data for pandemic planning and research. To obtain NHS numbers, the 2011 Census was linked to the 2011-2013 NHS Patient Registers, using deterministic and probabilistic matching, with an overall linkage success of 94.6%.
We excluded individuals (12.4%) who did not have a valid NHS number or were not linked to GPES primary care records. We used data on 14,295,900 individuals who were aged 31-55 years at the time of the 2011 Census and were therefore likely to be in stable employment both in 2011 and 2020 (by which time they were aged 40-64 years). We examined the differences between occupation groups in the risk of death involving COVID-19 during the 11 months from 24 January to 28 December 2020. Individuals in the study population were followed up from 24 January until 28 December 2020 for COVID-19 death (either in hospital or out of hospital), defined as confirmed or suspected COVID-19 death as identified by one of two ICD10 (International Classification of Diseases, 10th revision) codes (U07.1 or U07.2) derived from the medical certificate of cause of death. We chose the main exposure as the occupation at the time of the 2011 Census. Occupations are coded using a hierarchical classification, under the Standard Occupation Classification (SOC) 2010 (7) . The most detailed classification (Unit group, with 4-digit codes) includes 369 categories, whilst the most aggregated (Major group, with 1-digit codes) has only nine groups. We derived a hybrid classification based on the sub-major groups (2-digit codes), which include 25 categories. We also examined some 3-digit and 4-digit codes to assess selected occupations that have previously been shown to have high COVID-19 mortality, such as taxi drivers, security guards, or care home workers [4] . Our final classification contained 41 categories (Supplementary Table S1 in Appendix). We derived a classification of essential workers, based on the classification developed for a recent study using data from the UK Biobank [3] . Because we used the occupation recorded at the 2011 Census, our exposure variable is likely to be misclassified for some participants, since people may have left the labour force or changed occupation since 2011. 
To estimate the extent of misclassification, we used data from Understanding Society, a large-scale longitudinal household survey, to analyse occupational mobility across Major groups (1-digit SOC codes) between 2011 and 2019. We aimed to distinguish between situations where high COVID-19 mortality rates in a particular occupational group are likely to be directly due to workplace exposures, and those that may be due to non-workplace factors. In addition to the basic age-adjusted models, we adjusted for potential confounders such as geography and ethnicity. We also adjusted for factors that may be related to occupation, but which affect exposures outside the workplace, including markers of deprivation and housing conditions. These are all potential confounders of the association between workplace exposures and COVID-19, because they may be associated with the risk of COVID-19 mortality, either through the propensity to become infected or the propensity to die once infected. All covariates are summarised in Supplementary Table S2 in the Appendix. Geographical factors and sociodemographic characteristics were based on the 2011 Census; body mass index (BMI) and comorbidities were derived from the primary care and hospitalisation data following the definitions adopted by the QCOVID risk prediction model [11].
Preprint notice: this version was posted on medRxiv on May 17, 2021 (https://doi.org/10.1101/2021.05.12.21257123) and was not certified by peer review. The copyright holder is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.
For the period from 24 January 2020 to 28 December 2020, we calculated age-standardised mortality rates (ASMRs) for each occupation using the European Standard Population [13].
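As an illustration of the age standardisation described above, the following is a minimal sketch of direct standardisation, not the authors' code. The death counts and person-years are hypothetical, the function and variable names are our own, and the weights are assumed to be the 2013 European Standard Population values for the five-year bands covering ages 40-64 (they should be checked against the Eurostat report before reuse).

```python
# Direct age standardisation: weight each age-specific rate by the share of
# a standard population in that age band, then sum.

# ASSUMPTION: 2013 European Standard Population weights (per 100,000) for the
# five-year bands spanning ages 40-64; verify against the Eurostat report.
ESP_2013_WEIGHTS = {
    "40-44": 7000,
    "45-49": 7000,
    "50-54": 7000,
    "55-59": 6500,
    "60-64": 6000,
}


def asmr_per_100k(deaths, person_years, weights=ESP_2013_WEIGHTS):
    """Return the directly age-standardised rate per 100,000 person-years."""
    bands = list(deaths)
    total_weight = sum(weights[band] for band in bands)
    weighted_rate = sum(
        weights[band] * deaths[band] / person_years[band] for band in bands
    )
    return 100_000 * weighted_rate / total_weight


# HYPOTHETICAL data for one occupation (not taken from the paper).
hypothetical_deaths = {"40-44": 2, "45-49": 4, "50-54": 8, "55-59": 13, "60-64": 18}
hypothetical_person_years = {band: 10_000.0 for band in hypothetical_deaths}

asmr = asmr_per_100k(hypothetical_deaths, hypothetical_person_years)
crude = 100_000 * sum(hypothetical_deaths.values()) / sum(
    hypothetical_person_years.values()
)
print(f"ASMR: {asmr:.1f}, crude rate: {crude:.1f} per 100,000 person-years")
```

In this toy example the standardised rate is lower than the crude rate because the standard population gives less weight to the oldest bands, where the hypothetical rates are highest.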
To estimate the effect of occupation due to work-related exposures, we used Cox proportional hazard models to adjust for confounding (region, ethnicity, education), as well as non-workplace factors that are related to occupation. We estimated five models, sequentially adjusting for additional covariates to assess how they might confound or mediate the effect of differences in workplace exposure on the risk of death from COVID-19 (see Figure 1). Our first model was only adjusted for age. The second model also adjusted for geographical factors (region, population density, rural urban classification) to account for the differential spread of the virus in different areas. The third model further adjusted for other confounding factors, ethnicity and education, which are related both to occupation and COVID-19 risk. The fourth model also controlled for non-workplace factors (living conditions), including socio-economic characteristics. Finally, the last model adjusted for pre-pandemic health (BMI, chronic kidney disease, learning disability, cancer or immunosuppression, and other conditions; see Supplementary Table S2 for details on all the covariates). We used corporate managers and directors as the reference category, because it is a large group with a low absolute risk [9].
Our analytical sample comprises 14,295,900 people aged 40-64 years (mean age 52 years, 51% female) who were alive on 24 January 2020, living in private households in England in 2019, were employed in 2011, and completed the 2011 census. Between 24 January and 28 December 2020, 4,552 people (0.03%) died from a cause related to COVID-19; characteristics of these individuals are summarized in Table 1 (further details in Supplementary Table S3).
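The five sequentially adjusted Cox models described in the methods can be laid out as nested covariate sets. The sketch below only builds those sets; it does not fit any model. The variable names and block labels are our own shorthand, and the contents of the living-conditions block are assumed from the covariates listed in the paper's table and figure notes.

```python
# Each step adds one block of covariates to the previous model, so the five
# models are nested. Occupation is the exposure in every model.

ADJUSTMENT_STEPS = [
    ("age", ["age"]),
    ("geography", ["region", "population_density", "rural_urban_classification"]),
    ("ethnicity_education", ["ethnicity", "education"]),
    # ASSUMPTION: block contents taken from the notes on the fully adjusted models.
    ("living_conditions", [
        "imd_decile", "household_deprivation", "social_grade",
        "household_tenancy", "accommodation_type", "household_size",
        "multigenerational_household", "household_with_children",
    ]),
    ("pre_pandemic_health", [
        "bmi", "chronic_kidney_disease", "learning_disability",
        "cancer_or_immunosuppression", "other_conditions",
    ]),
]


def nested_models(steps, exposure="occupation"):
    """Return one specification per step, each including all earlier blocks."""
    models = []
    covariates = []
    for number, (label, block) in enumerate(steps, start=1):
        covariates = covariates + block  # new list, so earlier models are unchanged
        models.append(
            {"model": number, "adds": label, "terms": [exposure] + covariates}
        )
    return models


models = nested_models(ADJUSTMENT_STEPS)
for spec in models:
    print(f"Model {spec['model']} (+{spec['adds']}): {len(spec['terms'])} terms")
```

Comparing the occupation hazard ratios from one model to the next then shows how much each block of covariates attenuates the age-adjusted association.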
Age-standardized mortality rates
The age-adjusted hazard ratios (HRs), relative to corporate managers, indicated large differences in COVID-19 mortality between occupations for both men and women (Figure 1). For men, adjustment … For women, adjusting for confounding factors also greatly attenuated the estimated difference in risk between occupations. The highest age-adjusted HRs were observed for plant and machine operatives … For most occupations, confounding and other mediating factors explained about 70% to 80% of the age-adjusted hazard ratios. Adjusting for socio-economic status had the largest impact on the hazard ratios, followed by geographical factors (see Supplementary Tables S4 and S5 for men and women, respectively). A notable exception is health professionals, for whom adjustment for socio-economic factors did not affect the hazard ratios. Hazard ratios obtained when using all other occupations (rather than corporate managers and directors) as a reference group are similar: the unadjusted hazard ratios are slightly lower, but the adjusted estimates are similar to those in our main analyses (Supplementary Table S7).
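The text does not state how the share of a hazard ratio "explained" by adjustment was computed. One common convention, shown here purely as an assumption, is the proportion of the excess hazard (HR minus 1) that disappears after adjustment. The sketch applies it to the two pairs of hazard ratios reported in the abstract; the function name is our own.

```python
def excess_risk_explained(hr_age_adjusted, hr_fully_adjusted):
    """Share of the excess hazard (HR - 1) removed by further adjustment.

    ASSUMPTION: this is one conventional attenuation measure; the paper does
    not say which formula underlies its "70% to 80%" statement.
    """
    if hr_age_adjusted <= 1:
        raise ValueError("defined only for an age-adjusted HR above 1")
    return (hr_age_adjusted - hr_fully_adjusted) / (hr_age_adjusted - 1)


# Hazard ratios reported in the abstract for men.
taxi_drivers = excess_risk_explained(4.60, 1.47)
essential_workers = excess_risk_explained(1.45, 1.22)

print(f"Taxi and cab drivers or chauffeurs: {taxi_drivers:.0%}")  # 87%
print(f"Essential vs non-essential workers: {essential_workers:.0%}")  # 51%
```

Under this measure the attenuation differs considerably between a single high-risk occupation and the broad essential-worker grouping, so the figure quoted for "most occupations" should not be read as applying to every comparison.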
Note: Fully adjusted Cox regression models include geographical factors (region, population density, urban/rural classification), ethnicity, socio-economic characteristics (Index of Multiple Deprivation decile group, household deprivation, educational attainment, social grade, household tenancy, type of accommodation, household size, multigenerational household, household with children), health (body mass index, chronic kidney disease (CKD), learning disability, cancer and immunosuppression, other conditions). See Supplementary Table A1 for more details. Occupation classification and associated SOC 2010 codes are given in Supplementary Table S1. Numerical results can be found in Supplementary Tables S4 and S5.
Table 4 shows the hazard ratios for essential workers compared to non-essential workers as the reference category. Overall, essential workers are at higher risk of COVID-19 death than non-essential workers, and most categories of essential workers also have higher mortality. Once again, the differences are generally much attenuated after adjusting for potential confounding and mediating factors; a notable exception is health care professionals. We also report hazard ratios for major groups, compared to directors and managers, in Supplementary Table S7.
By combining data from the 2011 Census with electronic health records, the Public Health Data Asset has enabled us to analyse detailed information on a wide range of socio-demographic characteristics and individual (pre-pandemic) health status. Information on occupation is not available in traditional electronic health records in the United Kingdom, and the Census is the only source of population-wide occupation data. Our dataset contains over 14 million people aged 40 to 64 years who were living in England at the beginning of the pandemic. We were therefore able to estimate COVID-19 mortality for detailed occupational groups, and to estimate whether the differences in mortality are driven by workplace-related factors, or by other confounding and mediating factors. The main limitation of our study is that the information on occupation is nine years out of date. Our exposure is therefore likely to be misclassified for a proportion of people, because they have left the labour force or changed occupation since 2011. To mitigate measurement error, we restricted our analysis to people aged 40-64 years, who had relatively high occupational stability, as shown in our analysis of a large longitudinal household survey. Exposure misclassification is nonetheless likely to bias the estimated hazard ratios towards the null value of 1.0. However, we still observed strongly elevated hazard ratios for many occupations. Misclassification of occupation would be constant across our various analyses and could not explain the substantial decrease in most hazard ratios after adjustment for confounders. On the other hand, the confounders that we have addressed are also likely to be misclassified to some extent.
Given that adjustment for confounders produced large changes in the estimated occupational associations, it is possible that if more accurate or detailed confounder data were available, adjustment would have driven the hazard ratio estimates even lower towards the null value of 1.0. Another limitation is that our dataset excludes recent migrants, since it is based on people who were enumerated at the 2011 Census. Finally, some deaths may not have been registered by the end of the study period if they had been sent to a coroner, which could affect some occupational groups such as healthcare workers.
Our age-adjusted results are consistent with official estimates of COVID-19 mortality by occupation group [9]. However, we find that these elevated risks were greatly attenuated after adjustment for non-workplace factors, such as geographic factors, socio-demographic factors and pre-pandemic health. A recent study based on the UK Biobank found that compared to non-essential workers, medical support staff and healthcare professionals had the highest risk of severe COVID-19 [3]. We also found that, amongst men, healthcare professionals were at increased risk of death from COVID-19, but the HRs for healthcare professionals in our study were similar to those for men working as care workers, taxi drivers or in secretarial occupations, after full adjustment. Our results are also consistent with US studies documenting higher mortality rates in essential workers, such as transportation/logistics workers and healthcare workers [5, 14]. Our findings are also generally consistent with a recent analysis of data from the UK Coronavirus Infection Survey, which found increased risks for a similar list of occupations [15]. This analysis found that the occupational differences largely disappeared after adjustment for other factors, but the adjustment included factors that are likely to be inherent to working conditions (inability to work at home, and inability to socially distance at work), and are therefore on the causal pathway linking occupational exposure and infection. Thus, our adjusted findings are not directly comparable with those obtained from the Coronavirus Infection Survey.
Our age-standardised mortality rates and age-adjusted hazard ratios confirm that there is wide variation in the risk of COVID-19 mortality between occupations. However, workplace exposure is only one of several possible factors that drive the observed differences in the risk of COVID-19 mortality between occupations. After adjusting for these other factors, people who work in occupations that involve contact with patients (e.g. health and social care workers) or the public (e.g. bus and taxi drivers, retail workers) remain at elevated risk of COVID-19 related death. Other occupations that do not involve contact with patients or the public may also have increased risks due to specific working conditions (e.g. overcrowding in the workplace, lack of ventilation, lack of PPE, etc.), but our analyses indicate that these relative risks are generally small, after adjustment for confounding. This does not mean that infection is not occurring in specific workplaces. Whilst there have been a number of workplace outbreaks reported in various industries such as food processing which do not involve patient or public contact [16, 17], it appears that such outbreaks are not sufficient to produce strongly elevated sector-wide risks after adjustment for non-workplace factors.
Our analyses have confirmed previous findings that many occupations have elevated risks of COVID-19 mortality.
These associations were greatly attenuated, for many occupations, after adjustment for measures of deprivation and geographical factors, suggesting that differences in risk between occupations are a result of a complex mix of different factors. A number of occupations showed increased risks, even after comprehensive adjustment, and it is likely that working conditions played a role. However, our findings indicate that non-workplace factors also play a major role. Preventive measures therefore need to reduce workplace exposures, but also to reduce exposures outside the workplace, including overcrowding, inadequate housing, and deprivation.
VN had full access to all data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. The lead author (VN) affirms that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned have been explained. Dissemination of the results to study participants is not possible. Patients or the public were not involved in the design, conduct, reporting, or dissemination plans of our research.
Note: Household deprivation is defined according to four dimensions: employment (at least one household member is unemployed or long-term sick, excluding full-time students); education (no household members have at least Level 2 education, and no one aged 16-18 years is a full-time student); health and disability (at least one household member reported their health as being 'bad'/'very bad' or has a long-term health problem); and housing (the household's accommodation is overcrowded, with an occupancy rating of -1 or less, or is in a shared dwelling, or has no central heating). Key worker type is defined based on the occupation and industry code. 'Exposure to disease' and 'proximity to others' are derived from the O*NET database, which collects from industry experts a range of information about the working conditions and day-to-day tasks of a job. To calculate the proximity and exposure measures, the questions asked were: i) How physically close to other people is the worker in their current job? ii) How often is the worker exposed to diseases or infection? Scores ranging from 0 (no exposure) to 100 (maximum exposure) were calculated based on these questions using methods previously described by the ONS.
Note: Reference category: managers and directors. Fully adjusted Cox regression models include geographical factors (region, population density, urban/rural classification), ethnicity, socio-economic characteristics (IMD decile, household deprivation, educational attainment, social grade, household tenancy, type of accommodation, household size, multigenerational household, household with children), health (body mass index, chronic kidney disease (CKD), learning disability, cancer and immunosuppression, other conditions). See Supplementary Table A1 for more details.
References
[1] Deaths registered weekly in England and Wales, provisional: week ending
[2] Coronavirus (COVID-19) related deaths by occupation, England and Wales: deaths registered up to and including
[3] Occupation and risk of severe COVID-19: prospective cohort study of 120 075 UK Biobank participants
[4] Coronavirus (COVID-19) related deaths by occupation
[5] COVID-19 deaths by occupation
[6] The COVID-19 pandemic: major risks to healthcare and other workers on the front line
[7] Risk factors for positive and negative COVID-19 tests: A cautious and in-depth analysis of UK biobank data
[8] The Covid-19 (Coronavirus) pandemic: consequence for occupational health
[9] Coronavirus (COVID-19) related deaths by occupation, England and Wales: deaths registered between 9
[10] Covid-19 and health at work
[11] Living risk prediction algorithm (QCOVID) for risk of hospital admission and mortality from coronavirus 19 in adults: national derivation and validation cohort study
[12] OpenSAFELY: factors associated with COVID-19 death in 17 million patients
[13] Revision of the European Standard Population - Report of Eurostat's task force
[14] Excess mortality associated with the COVID-19 pandemic among Californians 18-65 years of age, by occupational sector and occupation
[15] Coronavirus (COVID-19) Infection Survey: characteristics of people testing positive for COVID-19 in England
[16] Characteristics of SARS-CoV-2 Transmission among Meat Processing Workers in Nebraska, USA, and Effectiveness of Risk Mitigation Measures
[17] Coronavirus disease among workers in food processing, food manufacturing, and agriculture workplaces