key: cord-0739656-n0egc9ak
authors: Kostka, Kristin; Duarte-Salles, Talita; Prats-Uribe, Albert; Sena, Anthony G; Pistillo, Andrea; Khalid, Sara; Lai, Lana Y H; Golozar, Asieh; Alshammari, Thamir M; Dawoud, Dalia M; Nyberg, Fredrik; Wilcox, Adam B; Andryc, Alan; Williams, Andrew; Ostropolets, Anna; Areia, Carlos; Jung, Chi Young; Harle, Christopher A; Reich, Christian G; Blacketer, Clair; Morales, Daniel R; Dorr, David A; Burn, Edward; Roel, Elena; Tan, Eng Hooi; Minty, Evan; DeFalco, Frank; de Maeztu, Gabriel; Lipori, Gigi; Alghoul, Hiba; Zhu, Hong; Thomas, Jason A; Bian, Jiang; Park, Jimyung; Martínez Roldán, Jordi; Posada, Jose D; Banda, Juan M; Horcajada, Juan P; Kohler, Julianna; Shah, Karishma; Natarajan, Karthik; Lynch, Kristine E; Liu, Li; Schilling, Lisa M; Recalde, Martina; Spotnitz, Matthew; Gong, Mengchun; Matheny, Michael E; Valveny, Neus; Weiskopf, Nicole G; Shah, Nigam; Alser, Osaid; Casajust, Paula; Park, Rae Woong; Schuff, Robert; Seager, Sarah; DuVall, Scott L; You, Seng Chan; Song, Seokyoung; Fernández-Bertolín, Sergio; Fortin, Stephen; Magoc, Tanja; Falconer, Thomas; Subbian, Vignesh; Huser, Vojtech; Ahmed, Waheed-Ul-Rahman; Carter, William; Guan, Yin; Galvan, Yankuic; He, Xing; Rijnbeek, Peter R; Hripcsak, George; Ryan, Patrick B; Suchard, Marc A; Prieto-Alhambra, Daniel
title: Unraveling COVID-19: A Large-Scale Characterization of 4.5 Million COVID-19 Cases Using CHARYBDIS
date: 2022-03-22
journal: Clin Epidemiol
DOI: 10.2147/clep.s323292
sha: 91a9194857175f6e9a0352bc308df1a376efd223
doc_id: 739656
cord_uid: n0egc9ak

PURPOSE: Routinely collected real world data (RWD) have great utility in aiding the novel coronavirus disease (COVID-19) pandemic response. Here we present the international Observational Health Data Sciences and Informatics (OHDSI) Characterizing Health Associated Risks and Your Baseline Disease In SARS-COV-2 (CHARYBDIS) framework for standardisation and analysis of COVID-19 RWD. PATIENTS AND METHODS: We conducted a descriptive retrospective database study using a federated network of data partners in the United States, Europe (the Netherlands, Spain, the UK, Germany, France and Italy) and Asia (South Korea and China). The study protocol and analytical package were released on 11th June 2020 and are iteratively updated via GitHub. We identified three non-mutually exclusive cohorts of 4,537,153 individuals with a clinical COVID-19 diagnosis or positive test, 886,193 hospitalized with COVID-19, and 113,627 hospitalized with COVID-19 requiring intensive services. RESULTS: We aggregated over 22,000 unique characteristics describing patients with COVID-19. All comorbidities, symptoms, medications, and outcomes are described by cohort in aggregate counts and are readily available online. Globally, we observed similarities in the USA and Europe: more women diagnosed than men but more men hospitalized than women, most diagnosed cases between 25 and 60 years of age versus most hospitalized cases between 60 and 80 years of age. South Korea differed with more women than men hospitalized. Common comorbidities included type 2 diabetes, hypertension, chronic kidney disease and heart disease. Common presenting symptoms were dyspnea, cough and fever. Symptom data availability was more common in hospitalized cohorts than diagnosed. CONCLUSION: We constructed a global, multi-centre view to describe trends in COVID-19 progression, management and evolution over time. 
By characterising baseline variability in patients and geography, our work provides critical context that may otherwise be misconstrued as data quality issues. This is important as we perform studies on adverse events of special interest in COVID-19 vaccine surveillance.

The World Health Organization (WHO) declared the coronavirus disease 2019 (COVID-19) pandemic on 11 March 2020 after 118,000 reported cases in over 110 countries. 5 By the end of 2021, the number of COVID-19 cases had increased to over 278 million globally, and the death toll exceeded 5 million. 6 Thousands of publications have attempted to aid our scientific understanding of this public health emergency. 7, 8 Characterisation studies, called descriptive epidemiology, provide important context for our understanding of disease by describing the basic attributes of who gets sick and in what setting. The initial body of COVID-19 characterisation work gave researchers information on the stark difference between the novel coronavirus and flu-like illnesses: patients were more often male, younger, and had fewer concurrent comorbidities and less documented prior medication use. 9 Utilising routinely collected real world data (RWD) can be a powerful asset for understanding an evolving pandemic response. 1, 2 Each data source provides novel information, be it the geographic variability of COVID-19, the impact of varying government strategies to contain spread or the evolution of treatment protocols. With extensive heterogeneity in public health strategies and clinical care across the world, 10 a large, repeated multi-center study that describes disease across locations, practices, and populations while holding the data analysis constant would go far in determining what factors drive observed differences.

RWD networks are vital in helping to understand the magnitude of the problem and in developing possible mitigation strategies both globally and locally. 11, 12 Here we present the response to the COVID-19 pandemic of the global Observational Health Data Sciences and Informatics (OHDSI) community, an international open-science initiative of more than 3500 collaborators from 34 countries. 3 Founded in 2015, the OHDSI data network enabled a rapid baseline understanding of COVID-19 in emerging hotspots (United States of America [USA], Spain and South Korea). 9 Our work evolved into a systematic framework for analysing and reporting COVID-19 RWD that we call Characterizing Health Associated Risks, and Your Baseline Disease In SARS-COV-2 (CHARYBDIS). CHARYBDIS offers multiple insights into COVID-19 clinical presentations, management and progression. Herein we aim to describe baseline demographics, clinical characteristics, treatments received, and outcomes among individuals diagnosed and hospitalized with COVID-19 in actual practice settings in nine countries from three continents. These data reflect an international community of research collaborators who are working to advance retrospective database research in RWD for COVID-19. Our body of research is a freely available, foundational result set that can provide benchmarks for how COVID-19 manifests over time, including its inevitable evolution as we roll out additional vaccines and treatments.

We conducted a descriptive retrospective database study using a federated network of data partners in the USA, Europe (the Netherlands, Spain, the UK, Germany, France and Italy) and Asia (South Korea and China). Each data partner mapped their source system to the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). [13] [14] [15] The use of a CDM ensured shared conventions, including consistent representation of clinical terms across coding systems. We assessed the plausibility, conformance and completeness of each contributing database using a common data quality tool for repeated assessment and for monitoring adherence to conventions across the network. 16, 17 We ensured technical reproducibility by using the same package of analytical code for all contributing data partners. 18 The study protocol and analytical package were released on 11 June 2020 and iterative updates have continued to be released via GitHub: https://github.com/ohdsi-studies/Covid19CharacterizationCharybdis. 4 Twenty-three real world healthcare databases contributed to the CHARYBDIS study (Supplementary Table 1). Contributing institutes ranged from major academic medical centers to small community hospitals across three continents. Data capture ranged from December 2019 to as recent as January 2021 (site-specific dates in Supplementary Table 1). Prior to performing these analyses, all the data partners obtained Institutional Review Board (IRB) or equivalent governance approval. Each data partner executed the study package locally on their OMOP CDM. Only aggregate results from each database were publicly shared. Minimum cell sizes were determined by institutional protocols. All data partners consented to the external sharing of the result set on data.ohdsi.org.
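
As an illustration of how a site might enforce a minimum cell size before sharing aggregate counts, the following is a minimal Python sketch. It is not taken from the study package: the function name `censor_small_cells`, the threshold argument, and the choice to mask values with `None` are all assumptions made for this example, and each data partner would apply its own institutional threshold.

```python
from typing import Dict, Optional


def censor_small_cells(
    counts: Dict[str, int], min_cell_count: int
) -> Dict[str, Optional[int]]:
    """Mask aggregate counts that fall below a site-specific threshold.

    Hypothetical helper: counts strictly below `min_cell_count` are
    replaced with None so that small groups cannot be re-identified.
    Zero counts are kept, since an empty cell reveals no individual.
    """
    censored: Dict[str, Optional[int]] = {}
    for characteristic, n in counts.items():
        if 0 < n < min_cell_count:
            censored[characteristic] = None  # suppressed before sharing
        else:
            censored[characteristic] = n
    return censored


# Example with made-up counts and an assumed threshold of 5.
site_counts = {"hypertension": 412, "tuberculosis": 3, "dementia": 0}
shared = censor_small_cells(site_counts, min_cell_count=5)
```

Only the censored dictionary would leave the site in this sketch; the patient-level rows used to produce the counts stay local.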

We focused on three non-mutually exclusive COVID-19 cohorts: i) diagnosed with COVID-19 (a positive SARS-CoV-2 laboratory test or clinical diagnosis code documenting COVID-19, with the earliest event serving as the index date); ii) hospitalized with COVID-19; and iii) hospitalized with COVID-19 and requiring intensive services. Due to variability in access to diagnostic testing, we specifically looked for the presence of a PCR or antigen laboratory test OR the use of clinical diagnosis codes documenting COVID-19 presentation. 19 The codes used to identify cohorts and more detail on the definitions of the above cohorts can be found in Supplementary Table 2. These cohorts were generated both with a requirement of at least 365 days of data availability prior to the index date, and without any requirement for prior observation time. Databases created specifically for COVID-19 tracking may be unable to support extensive lookback periods and thus, we used multiple definitions to ensure inclusiveness in our approach. Cohorts were followed from their cohort-specific index date to the earliest of death, end of the observation period, or 30 days post-index.
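
The index date and follow-up rules described above can be sketched in a few lines of Python. This is an illustration only, not the cohort logic shipped in the study package: the function names and arguments are hypothetical, and real cohorts are defined against OMOP CDM tables rather than plain date lists.

```python
from datetime import date, timedelta
from typing import Iterable, Optional


def index_date(test_dates: Iterable[date], diagnosis_dates: Iterable[date]) -> date:
    """Earliest positive test or COVID-19 diagnosis code (hypothetical helper)."""
    events = list(test_dates) + list(diagnosis_dates)
    if not events:
        raise ValueError("a cohort member needs at least one qualifying event")
    return min(events)


def follow_up_end(
    index: date,
    observation_end: date,
    death: Optional[date] = None,
    max_days: int = 30,
) -> date:
    """Earliest of death, end of observation, or `max_days` after index."""
    candidates = [observation_end, index + timedelta(days=max_days)]
    if death is not None:
        candidates.append(death)
    return min(candidates)


# Example with made-up dates: a diagnosis code precedes the positive test.
idx = index_date([date(2020, 3, 5)], [date(2020, 3, 1)])
end = follow_up_end(idx, observation_end=date(2020, 12, 31))
```

In this example the index date is 1 March 2020 and follow-up is capped at 30 days, ending on 31 March 2020.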

Each cohort was analyzed by the overall study population and stratified by additional available characteristics including: follow-up time; socio-demographics, baseline comorbidities, pregnancy status (yes/no), and flu-like symptom episodes (yes/no). Detailed definitions of each stratification are available in Supplementary 

Baseline Characteristics, Symptoms, Medication Use and Outcomes of Interest

Information on socio-demographics was identified at or before baseline (index date). All conditions, symptoms and medications were identified and described at four different time intervals (1 year prior, 30 days prior, at index and up to 30 days after index). The definition of each symptom and outcome is provided in Supplementary Table 2.
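
To make the four time intervals concrete, the sketch below assigns an event to every interval it falls in, based on its day offset from the index date. The exact day boundaries used here (for example, whether "1 year prior" runs from day -365 to day -1) are assumptions for illustration; the authoritative definitions are those in the study protocol and Supplementary Table 2.

```python
from typing import Dict, List, Tuple

# Assumed inclusive day offsets relative to the index date (day 0).
WINDOWS: Dict[str, Tuple[int, int]] = {
    "1 year prior": (-365, -1),
    "30 days prior": (-30, -1),
    "at index": (0, 0),
    "up to 30 days after": (0, 30),
}


def windows_for(offset_days: int) -> List[str]:
    """Return every interval that contains an event at `offset_days`.

    The intervals overlap, so one record can count in more than one.
    """
    return [
        name
        for name, (start, end) in WINDOWS.items()
        if start <= offset_days <= end
    ]


# A condition recorded 10 days before index counts in both prior windows.
prior_hit = windows_for(-10)
```

Because the intervals overlap, a characteristic's prevalence is tabulated separately per interval rather than summed across them.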

We built this analysis using the Health Analytics Data-to-Evidence Suite (HADES), a set of open source R packages for large scale analytics. 20 Proportions, standard deviations (SD), and standardized mean differences (SMD) within each subgroup were tabulated as pre-specified in our study protocol. This analysis was descriptive in nature with the explicit intention of building an initial, repeatable framework for constructing prevalence rates of disease. Only cohorts or stratified sub-cohorts with a minimum sample size of 140 subjects were characterized. This cut-off was deemed necessary to estimate with sufficient precision the prevalence of a previous condition or 30-day risk of an outcome affecting ≥10% of the study population. SMDs were plotted in Manhattan-style plots, a type of scatter plot designed to visualize large data with a distribution of higher-magnitude values. Scatter plots were also created to compare the described conditions, symptoms and demographics of patients diagnosed (Y axis) to those hospitalized (X axis) with COVID-19.
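
The two quantities discussed above can be illustrated with a short Python sketch. The SMD function uses the common textbook formula for two proportions, which may differ in detail from the HADES implementation, and the precision function uses a normal-approximation 95% confidence interval. The text does not state which precision criterion motivated the 140-subject cut-off, so the calculation is shown only as one plausible reading.

```python
import math


def smd_proportions(p1: float, p2: float) -> float:
    """Standardized mean difference between two proportions.

    Textbook form: (p1 - p2) / sqrt((p1(1-p1) + p2(1-p2)) / 2).
    """
    pooled_var = (p1 * (1 - p1) + p2 * (1 - p2)) / 2
    if pooled_var == 0:
        # Both proportions are exactly 0 or 1; no spread to standardize by.
        return 0.0 if p1 == p2 else math.copysign(math.inf, p1 - p2)
    return (p1 - p2) / math.sqrt(pooled_var)


def ci_half_width(p: float, n: int, z: float = 1.959964) -> float:
    """Half-width of a normal-approximation confidence interval for p."""
    return z * math.sqrt(p * (1 - p) / n)


# Made-up example: a comorbidity in 30% of hospitalized vs 10% of diagnosed.
example_smd = smd_proportions(0.30, 0.10)

# Precision for a 10% prevalence at the minimum cohort size of 140.
half_width_140 = ci_half_width(0.10, 140)
```

Under these assumptions, a 10% prevalence estimated from 140 subjects has a 95% interval of roughly plus or minus 5 percentage points, which is consistent with, though not confirmed as, the rationale for the cut-off.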

The USA data partners contributed 96% of the diagnosed with COVID-19 cohorts, including the single largest diagnosed cohort from IQVIA Open Claims (n=2,785,812). Europe contributed 4% of the diagnosed with COVID-19 cohorts, with the single largest regional diagnosed cohort from SIDIAP-Spain (n=124,305). Asia contributed less than 1% of the diagnosed with COVID-19 cohorts, with the single largest regional diagnosed cohort from Daegu Catholic University Medical Center (n=599).

In the USA, the proportion of diagnosed cases generally decreased with age, with most diagnosed cases being within the 25 to 60 year age group. The proportion of cases that were hospitalized or required intensive services increased with age, with the highest proportions of hospitalized or intensive services cases in the 60 to 80 year age group (Figure 2). A slightly higher proportion of women were diagnosed than men but a greater proportion of men were hospitalized (and where available, required intensive services) than women in the USA databases. In Europe, databases captured diagnosed or hospitalized cohorts but had limited information on intensive services. European databases capturing hospitalized cases (HMAR, HM-Hospitales, SIDIAP, and SIDIAP-H) showed a similar trend to the USA databases in that a higher proportion of men were hospitalized than women (Supplementary Figure 1). Unlike the USA and European databases, there was a higher proportion of women among hospitalized cases in the South Korean database (HIRA). Age-wise trends in the European and Asian databases were similar to those in the USA databases, in that the bulk of the diagnosed cases were in the 25 to 60 year age group, whilst the majority of the hospitalized cases were in the 60 to 80 year age group (Supplementary Figure 1).

Overall, the proportion of patients with type 2 diabetes mellitus, hypertension, chronic kidney disease, end stage renal disease, heart disease, malignant neoplasm, obesity, dementia, auto-immune condition, chronic obstructive pulmonary disease (COPD), and asthma was higher in the hospitalized cohort than in the diagnosed cohort (Tables 1 and 2). Data on tuberculosis, human immunodeficiency virus (HIV), and hepatitis C infections were sparse, and where available the proportions were generally low (≤1%). In the US databases, the proportion of pregnant women was generally higher in the hospitalized cohort than in the diagnosed cohort, but not so in two European databases (HM and SIDIAP). The remaining five European databases and one of the Asian databases had data on pregnant women only in the hospitalized cohort, the proportion of which was < 2%.

Dyspnea, cough, and fever were the most common symptoms in diagnosed and hospitalized cohorts globally (Supplementary Table 5). Where recorded, the proportion of dyspnea and malaise/fatigue was consistently higher in the hospitalized cohort than in the diagnosed cohort. Anosmia/hyposmia/dysgeusia was present in less than 1% of individuals in all but one database and was more common in the diagnosed than the hospitalized cohorts (Supplementary Table 6).

We further described a total of 19,222 conditions and 2973 medications registered during the year prior to the index date (Supplementary Figure 2). The same information is also described for 30 days prior to the index date, at index date, and during the first 30 days after index date (Supplementary Tables 4-6). The full result set of comorbidities, presenting symptoms, medications and outcomes is reported by cohort in aggregate counts, and is available on an interactive website: https://data.ohdsi.org/Covid19CharacterizationCharybdis/.

CHARYBDIS is the world's largest open science aggregate result set aimed at describing the baseline demographics, clinical characteristics, treatments received, and outcomes among individuals diagnosed and hospitalized with COVID-19. To accomplish this, we aggregated over 22,000 unique characteristics, creating a multi-centre view to describe trends in COVID-19 progression, management and evolution over time. Globally, we observed similarities in the USA and Europe in gender (more women diagnosed than men but more men hospitalized than women) and age (most diagnosed cases between 25-60 years of age versus most hospitalized cases between 60-80 years of age) distributions. Similar to previous studies, we observed that South Korea differed, with more women than men hospitalized. We found similarities in comorbidities and presenting symptoms. The large, diverse sample size also allows for the identification of populations of great interest, including children and adolescents, 25 pregnant women, 26 patients with a history of cancer, 27 and patients with a history of autoimmune disorders, 28 as well as patterns of drug utilization in COVID-19 treatment, 21 all of which were the focus of additional in-depth investigations.

We described characteristics of 4,537,153 individuals with a clinical COVID-19 diagnosis or positive test, 886,193 hospitalized with COVID-19, and 113,627 hospitalized with COVID-19 requiring intensive services from 9 countries. Up to 22,200 unique aggregate characteristics have been produced across databases, with all made publicly available on an accompanying website. The evidence framework is a method for systematically understanding cohort-level differences in COVID-19 from different regions and different points in the pandemic. In the months since we started this effort, our network has already aided in rapid study of coagulopathy and adverse events of special interest for COVID-19 vaccines to inform regulatory bodies. 22 This research community can be a public health utility to guide 1) better patient characterization and stratification, 2) identification of gaps in knowledge and evidence, and 3) generation of hypotheses for future research.

Comparison to Other Multi-Centre COVID-19 Consortia

We began our deep phenotyping work through an initial investigation of persons hospitalized with COVID-19 compared to prior flu seasons in our global federated network. 9 The National COVID Cohort Collaborative (N3C) is an NIH NCATS funded initiative collecting and centralizing patient-level data to study patterns in COVID-19 patients. 23 This effort has over 80 participating institutions contributing 4.5M COVID-19 patients to date to a centralized harmonized repository. The consortium has supported many US institutions in adopting common data models for COVID-19 research. 4CE is another multi-site data-sharing collaborative of 342 hospitals in the US and in Europe, utilizing i2b2 or OMOP data models. 24 The hospitalization cohorts presented in 4CE remain smaller than those of CHARYBDIS, with only 36,447 hospitalized patients with COVID-19 as of August 2020. 24 Even when adjusting for cohort overlap, our work to date with CHARYBDIS is nearly triple the diagnosed and double the hospitalized cohort sizes represented in prior research. Our results also have more international representation across the cascade of hotspots over the course of the pandemic's spread. As we continue our research, we are working with researchers to create inpatient-outpatient linkages and understand COVID-19 patient trajectories across care settings.

Our study has several strengths. This study is unique in its approach to characterizing COVID-19 cases across an international network of healthcare systems with varied policies enacted to combat this pandemic. This allows better understanding of the implications of the pandemic for different countries and regions, in the context of an international comparison. In particular, it provides visibility into the variability of patient characteristics across healthcare settings. This study draws on the most comprehensive federated network of healthcare sites in the world, creating the single largest cohort study on diagnosed and hospitalized COVID-19 cases to date. The large, diverse sample size allows for extensive investigation of subgroups of interest. CHARYBDIS is the framework for additional in-depth investigations on children and adolescents, 25 pregnant women, 26 patients with a history of cancer, 27 patients with a history of autoimmune disorders, 28 and patterns of drug utilization in COVID-19 treatment. 21 The result set is so large that hundreds of combinations of subgroups of interest remain unreported. There is significant opportunity for this framework to inform additional research.

We recognize there are limitations in our approach. First, this study is descriptive in nature. Further analyses are needed to utilize these findings in clinical application. The observed differences between groups (eg diagnosed versus hospitalized) should therefore not be interpreted as causal effects without further statistical scrutiny. Answering causal questions is especially difficult in COVID-19 because of the varying processes by which patients were screened, tested, admitted, and treated; the critical importance of knowing the exact timing of treatments and outcomes in severe cases; and the lack of appropriate comparison groups. Simple multivariable models by themselves will not sufficiently address bias for multiple questions and were purposely not applied here. This study was carried out using data recorded in routine clinical practice and based on electronic health records (EHRs) and/or claims data. The analysed data are therefore expected to be incomplete in some respects and may have erroneous entries, leading to potential misclassification. We have selectively reported database-specific outcomes to minimise the impact of incompleteness. We are aware that this may mean the network assembled is not inherently valuable for every follow-on analysis as each data partner may have different elements missing. Hospital-based data may be unable to ascertain outcomes experienced in outpatient settings. Our EHR partners rely on structured data and may be missing key findings from clinical notes. Additionally, the under-reporting of symptoms observed in these data is a key finding of this study, and should be taken into consideration in previous and future similar reports from "real world" cohorts.
Differential reporting in different databases is likely a function of differential coding practice as well as of variability in disease severity, with milder/less symptomatic cases more likely presenting in outpatient and primary care EHR, and more severe ones in hospital databases. In addition, the current result submissions are biased toward data from the initial wave of COVID-19 cases. Further analysis using this network requires stratification by calendar month. Lastly, we currently lack data partners in low to middle income countries and recognize that these data lack representation of some of the hardest hit areas in the world (eg Brazil, India). As data are accumulated over time, future updates of the results will provide the opportunity to study more recent cohorts of COVID-19 patients, who seem to have a better prognosis overall compared to those diagnosed in the first half of the pandemic.

We constructed a global, multi-centre view to describe trends in COVID-19 progression, management and evolution over time. By characterising baseline variability in demographics across geography, our work provides critical context for the reliability of the insight we generate. In retrospective database studies, one can struggle to identify whether heterogeneity occurs because of patient variability or because of the variability in source systems we use to capture patient data. Here we use a network of retrospective databases standardised to the same data model, adhering to a shared ontology and data quality processes. Our study provides a comprehensive view into the first year of the pandemic at a scale unlike most retrospective research. Our work sheds light on the natural history of millions of COVID-19 patients from the USA, 6 European countries and 2 Asian countries. This framework is open source and available for re-use, enabling a repeatable, reproducible method to capture the evolving natural history of this novel coronavirus, and can be extended to other diseases of international interest. We believe it is critically important to repeat and reproduce the findings we produce in real world studies. Leveraging this global federated network to corroborate single center findings can provide context to national database findings in the presence of regional variability in COVID-19 management including vaccine rollout and treatments.

Lead authors affirm that the manuscript is an honest, accurate, and transparent account of the study being reported; that no important aspects of the study have been omitted; and that any discrepancies from the study as planned have been explained.

Analyses were performed locally in compliance with all applicable data privacy laws. Although the underlying identified patient data are not readily available to be shared, authors contributing to this paper have direct access to the data sources used in this study. All results (eg aggregate statistics, not presented at a patient level and with redactions for minimum cell count) are available for public inquiry. These results include site identifiers for the contributing data sources to enable interrogation of each contributing site. All analytic code and result sets are made available at: https://github.com/ohdsi-studies/Covid19CharacterizationCharybdis.

All the data partners received Institutional Review Board (IRB) approval or exemption. STARR-OMOP had approval from IRB Panel #8 (RB-53248) registered to Leland Stanford Junior University under the Stanford Human Research Protection Program (HRPP). The use of VA data was reviewed by the Department of Veterans Affairs Central IRB, was determined to meet the criteria for exemption under Exemption Category 4(3), and approved for Waiver of HIPAA Authorization. The research was approved by the Columbia University Institutional Review Board as an OHDSI network study. The use of SIDIAP was approved by the Clinical Research Ethics Committee of the IDIAPJGol (project code: 20/070-PCV). The use of HMAR was approved by the Parc de Salut Mar Clinical Research Ethics Committee. The use of CPRD was approved by the Independent Scientific Advisory Committee (ISAC) (protocol number 20_059RA2). This study is approved by the University of Florida IRB under protocol IRB202100175. Some databases used (HealthVerity, Premier, IQVIA Open Claims, Optum EHR, and Optum SES) in these analyses are commercially available, syndicated data assets that are licensed by contributing authors for observational research. These assets are de-identified commercially available data products that could be purchased and licensed by any researcher. The collection and de-identification of these data assets is a process that is commercial intellectual property and not privileged to the data licensees and the co-authors on this study. Licensees of these data have signed Data Use Agreements with the data vendors which detail the usage protocols for running retrospective research on these databases. All analyses performed in this study were in accordance with Data Use Agreement terms as specified by the data owners. 
As these data are deemed commercial assets, there is no Institutional Review Board applicable to the usage and dissemination of these result sets or required registration of the protocol with additional ethics oversight. Compliance with Data Use Agreement terms, which stipulate how these data can be used and for what purpose, is sufficient for the licensing commercial entities. Further inquiry related to the governance oversight of these assets can be made with the respective commercial entities: HealthVerity (healthverity.com), Premier (premierinc.com), IQVIA (iqvia.com) and Optum (optum.com). At no point in the course of this study were the authors of this study exposed to identified patient-level data. All result sets represent aggregate, de-identified data that are represented at a minimum cell size of >5 to reduce potential for re-identification. Furthermore, the New England Institutional Review Board of Janssen Research & Development (Raritan, NJ) has determined that studies conducted on licensed copies of Premier, Optum EHR, Optum SES and HealthVerity are exempt from study-specific IRB review, as these studies do not qualify as human subjects research.

Common problems, common data model solutions: evidence generation for health technology assessment

PCORnet ® 2020: current state, accomplishments, and future directions

Observational Health Data Sciences and Informatics (OHDSI): opportunities for observational researchers

ohdsi-studies/Covid19CharacterizationCharybdis: Charybdis v1.1.1 -Publication Package

director-general/speeches/detail/who-director-general-s-opening-remarks-at-The-media-briefing-on-covid

Johns Hopkins Coronavirus Resource Center. COVID-19 map

COVID-19-related medical research: a meta-research and critical appraisal

Publishing volumes in major databases related to Covid-19

Deep phenotyping of 34,128 adult patients hospitalized with COVID-19 in an international network study

Ethics and Informatics in the age of COVID-19: challenges and recommendations for public health organization and public policy

Use of electronic health records to support a public health response to the COVID-19 pandemic in the United States: a perspective from 15 academic medical centers

Factors associated with COVID-19-related death using OpenSAFELY

Validation of a common data model for active safety surveillance research

Empirical assessment of methods for risk identification in healthcare data: results from the experiments of the Observational Medical Outcomes Partnership

Development and evaluation of a common data model enabling active drug safety surveillance using disparate healthcare databases

A harmonized data quality assessment terminology and framework for the secondary use of electronic health record data

Observational health data sciences, informatics

How confident are we about observational findings in healthcare: a benchmark study

Uptake and accuracy of the diagnosis code for COVID-19 among US hospitalizations

Observational health data sciences and informatics

Use of repurposed and adjuvant drugs in hospital patients with covid-19: multinational network cohort study

Characterising the background incidence rates of adverse events of special interest for covid-19 vaccines in eight countries: multinational network cohort study

The National COVID Cohort Collaborative (N3C): rationale, design, infrastructure, and deployment

International comparisons of harmonized laboratory value trajectories to predict severe COVID-19: leveraging the 4CE collaborative across 342 hospitals and 6 countries: a retrospective cohort study. bioRxiv medRxiv

Thirty-Day Outcomes of Children and Adolescents With COVID-19: An International Experience. Pediatrics

Clinical characteristics, symptoms, management and health outcomes in 8598 pregnant women diagnosed with COVID-19 compared to 27,510 with seasonal influenza in France, Spain and the US: a network cohort analysis

Characteristics and Outcomes of Over 300,000 Patients with COVID-19 and History of Cancer in the United States and Spain

COVID-19 in patients with autoimmune diseases: characteristics and outcomes in a multinational network of cohorts across three countries


We would like to acknowledge the patients who suffered from or died of this devastating disease, and their families and caregivers. We would also like to thank the social workers and healthcare professionals involved in the management of COVID-19 during these challenging times, from primary care to intensive care units. We also thank the database curation teams around the world including the COVIDMAR Group (R. Güerri Yamanouchi, Pfizer-Boehringer Ingelheim, GSK, Amgen, UCB, Novartis, Astra-Zeneca, Chiesi, Janssen Research and Development, none of which relate to the content of this work. Dr. Hripcsak reports grants from US NIH and Janssen Research. Dr. Ryan is an employee of Janssen Research and Development and shareholder of Johnson & Johnson. Dr. Suchard reports grants from US National Institutes of Health, Department of Veterans Affairs, during the conduct of the study; grants and/or personal fees from IQVIA, Janssen Research and Development, US Food and Drug Administration, and Private Health Management, outside the submitted work. Dr. Prieto-Alhambra reports grants, nonfinancial support, speaker/consultancy services and/or advisory board membership from AMGEN, UCB Biopharma, and Les Laboratoires Servier, outside the submitted work; and Janssen, on behalf of IMI-funded EHDEN and EMIF consortiums, and Synapse Management Partners have supported training programmes organised by DPA's Department and open for external participants. The views expressed are those of the authors and do not necessarily represent the views or policy of the Department of Veterans Affairs or the United States Government. No other relationships or activities that could appear to have influenced the submitted work. The authors report no other conflicts of interest in this work.

were critical to drafting the manuscript and the overall interpreting results. All authors made a significant contribution to the work reported, whether that is in the conception, study design, execution, acquisition of data, analysis and interpretation, or in all these areas; took part in drafting, revising or critically reviewing the article; gave final approval of the version to be published; have agreed on the journal to which the article has been submitted; and agree to be accountable for all aspects of the work.