key: cord-0703323-ssv8xf9d authors: Oelsner, E. C.; Allen, N. B.; Ali, T.; Anugu, P.; Andrews, H.; Asaro, A.; Balte, P. P.; Barr, R. G.; Bertoni, A.; Bon, J.; Boyle, R.; Chang, A. A.; Chen, G.; Cole, S. A.; Coresh, J.; Cornell, E.; Correa, A.; Couper, D.; Cushman, M.; Demmer, R. T.; Elkind, M. S.; Folsom, A. R.; Fretts, A. M.; Gabriel, K. P.; Gallo, L.; Gutierrez Contreras, J.; Han, M. K.; Henderson, J. M.; Howard, V. J.; Isasi, C. R.; Jacobs, D. R.; Judd, S. E.; Kamin Mukaz, D.; Kanaya, A. M.; Kandula, N. R.; Kaplan, R. C.; Krishnaswamy, A.; Kinney, G. L.; Kucharska-Newton, A.; Lee, J. S.; Lewis, C. E.; Levine, D. A.; Levitan, title: Collaborative Cohort of Cohorts for COVID-19 Research (C4R) Study: Study Design date: 2021-03-20 journal: medRxiv : the preprint server for health sciences DOI: 10.1101/2021.03.19.21253986 sha: 2334f35258690040758f318b750de72331f5ef1c doc_id: 703323 cord_uid: ssv8xf9d The Collaborative Cohort of Cohorts for COVID-19 Research (C4R) is a national prospective study of adults at risk for coronavirus disease 2019 (COVID-19) comprising 14 established United States (US) prospective cohort studies. For decades, C4R cohorts have collected extensive data on clinical and subclinical diseases and their risk factors, including behavior, cognition, biomarkers, and social determinants of health. C4R will link this pre-COVID phenotyping to information on SARS-CoV-2 infection and acute and post-acute COVID-related illness. C4R is largely population-based, has an age range of 18-108 years, and broadly reflects the racial, ethnic, socioeconomic, and geographic diversity of the US. C4R is ascertaining severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection and COVID-19 illness using standardized questionnaires, ascertainment of COVID-related hospitalizations and deaths, and a SARS-CoV-2 serosurvey via dried blood spots. 
Master protocols leverage existing robust retention rates for telephone and in-person examinations, and high-quality events surveillance. Extensive pre-pandemic data minimize referral, survival, and recall bias. Data are being harmonized with research-quality phenotyping unmatched by clinical and survey-based studies; these will be pooled and shared widely to expedite collaboration and scientific findings. This unique resource will allow evaluation of risk and resilience factors for COVID-19 severity and outcomes, including post-acute sequelae, and assessment of the social and behavioral impact of the pandemic on long-term trajectories of health and aging. [...] access and quality among vulnerable communities. Electronic health records (EHRs) typically lack detailed information on health-related behaviors, such as smoking, so that controlling for confounders is challenging. Moreover, in the course of usual clinical care, clinically actionable diagnostic testing is performed for sick persons, but not well persons; hence, subclinical disease is not well detected, and genomic and other mechanistic biomarkers are generally lacking. Although inception cohorts with longitudinal follow-up of clinically ascertained COVID-19 cases can address some of these knowledge gaps, survival bias, recall bias, and non-randomly missing data regarding pre-COVID health and behaviors are inevitable. In this context, strong assumptions are required to define phenotypes identified in COVID-19 survivors (e.g., fibrotic lung disease) as "sequelae" when they may have been present prior to the pandemic, and may actually be antecedent risk factors or effect modifiers. The Collaborative Cohort of Cohorts for COVID-19 Research (C4R) was established as a national, prospective study of adults at risk for incident COVID-19 that is relatively free of referral, survival, and recall biases.
C4R includes fourteen US prospective cohort studies that, collectively, constitute a large, well-characterized, population-based sample that ranges in age from young adults to centenarians, and reflects the racial, ethnic, socioeconomic, and geographic diversity of the US. Using standardized protocols, C4R is aggressively attempting full ascertainment of SARS-CoV-2 infection and COVID-19 illness across all cohorts. C4R offers the additional major advantages of standardized data collection protocols, including high-quality clinical events surveillance dating back as far as 1971 in some studies, and robust retention rates. All rights reserved. No reuse allowed without permission. (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. The copyright holder for this preprint this version posted March 20, 2021. ; https://doi.org/10.1101/2021.03. 19.21253986 doi: medRxiv preprint For decades, the C4R cohorts have collected extensive longitudinal data on clinical and subclinical disease, behaviors, cognition, biomarkers, and social determinants of health. C4R will link this "pre-COVID" phenotyping to information on SARS-CoV-2 infection and acute and post-acute COVID-related illness. The integration of antecedent and illness-related data will provide a unique opportunity to understand mechanisms and modifiers of risk and resilience for SARS-CoV-2 infection and adverse COVID-19 outcomes. C4R will also support comparisons of longitudinal changes in health measures over the course of the pandemic in persons with varying degrees of COVID-19 severity. Furthermore, the availability of well-characterized participants unaffected by COVID-19 will allow the assessment and differentiation of the effects of infection, illness, and pandemic-related social, economic, and behavioral changes. 
Overall, C4R aims to provide a valuable scientific resource to (1) evaluate risk and resilience factors for adverse COVID-19 outcomes, including severe COVID-19 illness and long-term complications, (2) assess the social and behavioral impact of the COVID-19 pandemic on long-term outcomes and trajectories of health and disease, and (3) examine disparities in COVID-19 risk and outcomes according to race, ethnicity, geography, and other social determinants of health.

Fourteen prospective cohorts are collaborating in C4R (Table 1). Eight of the cohorts were designed to study cardiovascular disease epidemiology: the Atherosclerosis Risk in Communities (ARIC) Study (20), the Coronary Artery Risk Development in Young Adults (CARDIA) Study (21), the Framingham Heart Study (FHS) (22), the Hispanic Community Health Study/Study of Latinos (HCHS/SOL) (23-25), the Jackson Heart Study (JHS) (26-28), the Mediators of Atherosclerosis in South Asians Living in America (MASALA) Study (29, 30), the Multi-Ethnic Study of Atherosclerosis (MESA) (31), and the Strong Heart Study (SHS) (32, 33). These cohorts generally recruited population-based samples, although only four (ARIC, CARDIA, FHS, HCHS/SOL) used representational sampling techniques at some or all sites. Four of the cardiovascular studies (ARIC, CARDIA, FHS, MESA) recruited multi-racial participants, and four were designed primarily to study specific racial or ethnic groups (Hispanic/Latino participants in HCHS/SOL, Black participants in JHS, South Asian participants in MASALA, American Indian participants in SHS).
Four multi-ethnic cohorts were established to study respiratory epidemiology: the Genetic Epidemiology of COPD (COPDGene) Study (34) and the SubPopulations and InteRmediate Outcome Measures in COPD Study (SPIROMICS) (35) were established as longitudinal case-control studies of cigarette smokers with and without COPD; Prevent Pulmonary Fibrosis (PrePF) is a study of early and established interstitial lung disease; and the Severe Asthma Research Program (SARP) is a study of the entire range of mild to severe asthma, enriched for severe disease (36). Two studies, the Northern Manhattan Study (NOMAS) and the REasons for Geographic and Racial Differences in Stroke (REGARDS) study, were established primarily to study neurological outcomes, including stroke and cognition. NOMAS is a multi-ethnic community study (37) and REGARDS is [...] (38). The cohorts that comprise C4R have collected detailed data on their participants' health and behavior for as long as fifty years of follow-up (Figure 1). As summarized in Table 2 [...] (42), and the genetic sequencing and multi-omics-focused Trans-Omics for Precision Medicine (TOPMed) Project (43). C4R is building and expanding upon these successes to advance COVID-19 research. Planning for C4R began in March 2020, when the need for a coordinated, cross-cohort response to the knowledge gaps posed by the COVID-19 pandemic became self-evident and urgent.
Cohort investigators initiated discussions regarding approaches to ascertain SARS-CoV-2 infections and COVID-related illnesses within the context of unprecedented cohort operational challenges associated with the outbreak. The National Heart, Lung, and Blood Institute (NHLBI) funded C4R via an Other Transaction Authority (OTA) mechanism in October 2020. Additional funding for inclusion of the neurology-focused cohorts was provided via the OTA by the National Institute of Neurological Disorders and Stroke (NINDS) and the National Institute on Aging (NIA). Leadership for C4R is provided by an organizing committee that includes leading (and often founding) principal investigators (PIs) from all C4R cohorts, PIs from the C4R Data Coordination and Harmonization Center (DCHC), PIs from the C4R Biorepository and Central Laboratory (BCL), and program officers from the NHLBI, NINDS, and NIA. This organizing committee developed master C4R protocols for COVID-19 data collection. Consistent with an ancillary studies model, each cohort in C4R is directly responsible for accomplishing its own data collection in accordance with the master protocol and under the [...] To promote and sustain this broad collaborative effort, C4R PIs invited additional investigators and cohort personnel to participate in C4R committees and working groups. Study materials, including protocols and meeting materials, are posted regularly on a password-protected investigator section of the C4R website (c4r-nih.org). Cohort participants previously consented for in-person, telephone, and/or email contact and for the abstraction of medical records.
Additional consent for ascertainment of COVID-19 data, including the serosurvey, is being obtained according to cohort-specific procedures, including verbal, remote, and traditional written informed consent. Of 73,119 active participants across the fourteen cohorts, 53,972 participants were readily available for recruitment into C4R. Anticipated socio-demographic characteristics of potential C4R participants, estimated from current active cohort participants, are shown in Table 3. Fifty-eight percent of potential participants are 65 years or older, and thus at high risk for severe COVID-19. The anticipated sample is racially and ethnically diverse, based on self-report (44), with approximately 6% American Indian participants, 2% Asian participants, 26% Black participants, and 20% Hispanic/Latino participants. All forty-eight continental states are represented among C4R participants, including rural, suburban, and urban communities (Figure 2). In all, C4R is being conducted across forty field/clinical centers, many of which are associated with more than one C4R cohort; one cohort with extensive geographic reach, REGARDS, operates via telephone and in-home exams only (38). COVID-19 questionnaires. C4R is ascertaining self-reported COVID-related experiences by questionnaire. Each cohort will deploy C4R questionnaires twice within 18 months following the initial outbreak in March 2020 via telephone, mail-in, online, email, or smartphone apps. Wave 1 questionnaires were developed as early as March 2020 in certain cohorts (45) and urgently administered in spring and summer 2020.
Although these efforts pre-dated C4R, early informal cross-cohort collaborations ensured that many cohorts used identical questionnaires, and all of them generated common data elements regarding infection, testing, hospitalization, and recovery. Wave 2 questionnaires were fully standardized to include domains on COVID-19 infection, testing, hospitalization, symptoms, recovery, re-infection, contacts, vaccination, behavioral changes, sleep, memory loss, depression, anxiety, fatigue, and resilience. The C4R questionnaire was developed collaboratively to include validated and PhenX toolkit instruments (https://www.phenxtoolkit.org) (46-55) in order to optimize comparability with pre-pandemic assessments and across C4R and other epidemiology cohorts. The C4R questionnaire, including [...] (56, 57)) programming may be available on request. COVID-related events ascertainment. C4R is ascertaining COVID-related hospitalizations and deaths that are identified via the C4R questionnaire or other surveillance methods available to the cohorts, including EHR linkages, where available. Each cohort is using its own established infrastructure for ascertainment of medical records and death certificates, including use of the National Death Index (NDI), the Centers for Medicare & Medicaid Services (CMS), International Classification of Diseases (ICD) codes (58), and linkage to records from local departments of health. Cohorts may review events locally at their Field/Coordinating Centers or transfer records for central review by C4R. The C4R events review is designed to assess severity and major complications of COVID-19 illness, including pneumonia, myocardial infarction, stroke, thromboembolism, and acute renal failure.
The protocols use, or are modeled after, longstanding cohort protocols to classify and validate cardiovascular, respiratory (19), and thromboembolic (59) events. Protocols for ascertainment, review, and classification are available on the study website (c4r-nih.org). Dried blood spot collection. C4R is ascertaining serostatus by dried blood spot (DBS) in 2021. Cohort field centers receive DBS collection kits from the BCL and are responsible for recruitment, consent, and distribution to participants. Updated details regarding vaccination status are obtained at the time of DBS consent and immediately prior to mailing the DBS kit to the participant. Participants mail the completed kits directly to the BCL or to the cohort field or coordinating center as an intermediary step. Participant instructions, including a video, are provided by the cohort and via the C4R website (c4r-nih.org) and/or cohort-specific websites. In cohorts with upcoming in-person exams, the DBS may be collected in person by research staff. C4R data collection will define a spectrum of COVID-19 outcomes. Ascertainment of COVID-related hospitalizations and deaths will characterize, classify, and validate moderate-to-severe COVID-19 illnesses. In addition to identifying these events, questionnaires are being used to obtain self-reported information on the nature, severity, and duration of symptoms during acute infection and in the post-acute setting. This will support classification of symptomatic and asymptomatic infections, as well as cases of prolonged recovery or post-acute sequelae of SARS-CoV-2 infection (PASC). Data on behaviors, attitudes, psychosocial impacts, and vaccinations will also be collected.
Seropositive individuals without self-reported infection will be reclassified as infected, whereas seronegative individuals with prior positive testing by self-report or health records will be classified as sero-reverted. Data management. C4R data collection is coordinated centrally at the DCHC at Columbia University Irving Medical Center. Electronic data collection forms are being programmed into REDCap for use or adaptation by the cohort coordinating centers. Metadata on completion of questionnaires, events ascertainment, and DBS kit collection status are reported and reviewed bi-weekly to ensure operational milestones are met. Cohort-specific coordinating centers assign each participant a C4R study identifier that is used for participant-level data transfers and analyses. The C4R BCL at the University of Vermont is responsible for establishing a C4R biorepository of DBS, plus other biospecimens that may be collected in the future, and for performing and/or coordinating performance of any centralized clinical and biomarker assays and serology assays. Individual DBS collection kits are produced by the BCL and shipped to the cohorts (either to the individual field centers or the cohort coordinating center, based on cohort preference). Kits and DBS cards are labeled with a biospecimen identifier, which is linked through a "linking key" to C4R identifiers that are maintained centrally and not shared with the BCL.
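The "linking key" arrangement described above can be illustrated with a minimal Python sketch. It assumes the key is a simple lookup table held outside the laboratory; the function names, the `DBS-` identifier format, and the result labels are hypothetical and are not taken from the C4R protocol.

```python
import secrets


def make_linking_key(c4r_ids):
    """Assign each C4R identifier a random biospecimen identifier.

    Returns a dict mapping biospecimen ID -> C4R ID. The ID format
    ("DBS-" plus 8 hex characters) is an illustrative assumption.
    """
    key = {}
    for c4r_id in c4r_ids:
        specimen_id = "DBS-" + secrets.token_hex(4)
        while specimen_id in key:  # regenerate on the rare collision
            specimen_id = "DBS-" + secrets.token_hex(4)
        key[specimen_id] = c4r_id
    return key


def recombine(results_by_specimen, linking_key):
    """Attach laboratory results (keyed by biospecimen ID) to C4R IDs.

    Specimen IDs missing from the key are returned separately so they
    can be investigated rather than silently dropped.
    """
    matched, unmatched = {}, []
    for specimen_id, result in results_by_specimen.items():
        c4r_id = linking_key.get(specimen_id)
        if c4r_id is None:
            unmatched.append(specimen_id)
        else:
            matched[c4r_id] = result
    return matched, unmatched
```

In this sketch the laboratory only ever sees the keys of `results_by_specimen`; the party holding `linking_key` performs the recombination.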
Filled DBS cards are returned to the BCL, and batches are prepared for serology assays performed by the New York State Wadsworth Center's Bloodborne Viruses Laboratory (BVL) under CLIA and New York State certification. The BVL performs a SARS-CoV-2 IgG Microsphere Immunoassay using Luminex bead technology for qualitative detection of human IgG antibodies to SARS-CoV-2 nucleocapsid (N) and spike subunit 1 (S1) antigens. Based on testing 730 pre-COVID DBS and >1100 DBS from individuals with laboratory-confirmed infection, specificity is 99.5% for both N and S1 and sensitivity ranged from 90 to 96% for symptomatic individuals and 77 to 91% for asymptomatic individuals. Sensitivity increased for both groups with time from positive PCR [...] Serology results are reported by the BVL to the C4R BCL, and then to the cohort coordinating centers, which are responsible for a) recombining the results with the proper participants based on the "linking key", and b) reporting results to participants according to usual cohort practices. Serological results are not believed to have clinical relevance, and the CDC does not currently recommend modifications to individual behavior or clinical care based on antibody status alone (60); hence, no protocols for "alert" findings have been established, and participants may opt out of results return. Protocols for the serosurvey are available on the study website (c4r-nih.org). Since all current vaccines in use in the U.S. generate an immune response to the Spike protein, we anticipate being able to distinguish vaccination from viral infection by the use of the anti-nucleocapsid assay results (61).
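As an illustration only, the classification rules stated in the text (seropositive participants are treated as infected, seronegative participants with a prior positive test as sero-reverted, and anti-nucleocapsid results are used to separate infection from vaccination) might be combined as in the Python sketch below. The decision order, the handling of S1-only results, and the label strings are assumptions made for this example, not the C4R algorithm.

```python
def interpret_serology(n_positive, s1_positive, vaccinated, prior_positive_test):
    """Return a descriptive label for one participant's DBS result.

    Assumptions (not from the protocol): anti-N reactivity is taken as
    evidence of infection; anti-S1 reactivity counts as evidence of
    infection only in unvaccinated participants.
    """
    evidence_of_infection = n_positive or (s1_positive and not vaccinated)
    if evidence_of_infection:
        return "infected"
    if prior_positive_test:
        # No serologic evidence now, but infection was documented earlier.
        return "sero-reverted"
    if s1_positive:
        # S1 reactivity without N reactivity in a vaccinated participant.
        return "vaccine response only"
    return "no evidence of infection"


# Example: a vaccinated participant with S1 but not N antibodies
# and no history of a positive test.
label = interpret_serology(
    n_positive=False, s1_positive=True, vaccinated=True, prior_positive_test=False
)
```

A real analysis would also need to account for assay sensitivity and for the timing of vaccination and testing relative to DBS collection, which this sketch ignores.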
Harmonization of COVID-19 and pre-pandemic data will be performed centrally to define COVID-19 common data elements and to align pre-pandemic data for large-scale, longitudinal analyses. This effort will leverage prior harmonization efforts across C4R cohorts in the TOPMed Project, the NHLBI Pooled Cohorts Study, the BP COG Study, and the CHARGE Working Groups (10, 40, 42, 62-67). Due to their significance to COVID-19 epidemiology, particular emphasis will [...] (68-74), and imaging-based (75-82) phenotyping collected within the decade prior to the outbreak using deep-learning (18, 83-85) and other methods (Table 4). Quality control. C4R cohorts have established protocols for checking data completeness and accuracy at the field center and coordinating center levels. Dual data entry is encouraged but not required, since it will not be feasible in all settings due to local impediments and COVID-related exigencies. Ten percent of event reviews will be randomly selected for re-review. Reviewers not meeting standards will receive regular feedback with recommendations for retraining and/or protocol modifications, as appropriate. Serological assays will be repeated on a random 5% subsample of blind duplicates. The C4R Commons Agreement, modeled on the CHARGE Analysis Commons Consortium Agreement (86), will expedite cross-cohort data harmonization and sharing, as allowed (87). Following review and approval, cohort-specific agreements would permit COVID-19 and pre-pandemic data to be uploaded to the NIH-supported cloud computing platform, hosted by BioData Catalyst.
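The random selections described under quality control (10% of event reviews for re-review, 5% of serological assays as blind duplicates) could be drawn reproducibly along the following lines. This is a minimal sketch: the function name, the identifier formats, the rounding-up rule, and the use of a fixed seed are assumptions, not part of the C4R protocol.

```python
import random


def select_random_subsample(ids, percent, seed):
    """Pick `percent`% of `ids` at random (rounded up), reproducibly.

    Sorting before sampling makes the draw independent of input order;
    a fixed seed lets the selection be audited later.
    """
    ordered = sorted(ids)
    k = (len(ordered) * percent + 99) // 100  # integer ceiling of n * percent / 100
    rng = random.Random(seed)
    return sorted(rng.sample(ordered, k))


# Hypothetical identifiers for illustration.
event_reviews = [f"EV{i:04d}" for i in range(200)]
dbs_specimens = [f"DBS{i:05d}" for i in range(1000)]

rereview = select_random_subsample(event_reviews, percent=10, seed=2021)
blind_duplicates = select_random_subsample(dbs_specimens, percent=5, seed=2021)
```

Rounding up guarantees that small batches still contribute at least one item to the quality-control sample.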
Access to the pooled C4R dataset would be granted to investigators involved in core harmonization efforts and those with manuscript proposals approved by C4R publications and cohort coordinating committees. Once harmonization and related quality control are completed, C4R common data elements will be transferred as a limited dataset for public access on BioData Catalyst in accord with cohort-specific consents and commitments. The administrative coordinating center for C4R is the NHLBI CONNECTS program (nhlbi- [...] based studies, C4R's repeated exams and cognitive assessments before and after COVID-19 also provide important opportunities to estimate the social and behavioral impact of the COVID-19-related pandemic response on changes in long-term mental and physical health across multiple domains. C4R will provide important opportunities for future studies using a range of epidemiologic study designs. For example, nested within C4R, longitudinal cohort studies of COVID-affected and unaffected participants could repeat a variety of subclinical measures (e.g., echocardiography, lung imaging, neuro-cognitive assessment) to define reliably the consequences of COVID-19 infection. Ongoing high-quality events follow-up will allow assessment of long-term clinical health outcomes following COVID-19 and the pandemic period. The extensive biobanks maintained by the cohorts could support measurement of prior viral infections, immune phenotypes, metabo-types, 'Omics, and other pre-COVID characteristics that may be risk determinants or modifiers for COVID-19 susceptibility and vaccine effectiveness.
The fact that the cohorts continue to follow their participants provides a dynamic resource to study emerging questions in COVID-19 epidemiology, including but not limited to viral variants and vaccination. Finally, C4R provides a model for cross-cohort collaboration and active data sharing that will promote consortium-based epidemiologic work on biological, social, and epidemiologic questions beyond the COVID-19 pandemic, in alignment with recommendations for the strategic transformation of population studies (88). COPDGene. The project described was supported by Award Number U01 HL089897 and Award Number U01 HL089856 from the National Heart, Lung, and Blood Institute. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Heart, Lung, and Blood Institute or the National Institutes of Health. COPDGene is also supported by the COPD Foundation through contributions made to an Industry Advisory Board comprised of AstraZeneca, Boehringer-Ingelheim, Genentech, GlaxoSmithKline, Novartis, Pfizer, Siemens, and Sunovion. The Framingham Heart Study (FHS) acknowledges the support of Contracts NO1-HC-25195, HHSN268201500001I and 75N92019D00031 from the National Heart, Lung and Blood Institute. We also acknowledge the dedication of the FHS study participants without whom this research would not be possible. The Hispanic Community Health Study/Study of Latinos is a collaborative study supported by contracts from the National Heart, Lung, and Blood Institute (NHLBI) to the University of North Carolina (HHSN268201300001I / N01-HC- [...]
This research project is supported by cooperative agreement U01 NS041588 co-funded by the National Institute of Neurological Disorders and Stroke (NINDS) and the National Institute on Aging (NIA), National Institutes of Health, Department of Health and Human Services. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NINDS or the NIA. Representatives of the NINDS were involved in the review of the manuscript but were not directly involved in the collection, management, analysis or interpretation of the data. The authors thank the other investigators, the staff, and the participants of the REGARDS study for their valuable [...] The Strong Heart Study has been funded in whole or in part with federal funds from the National Heart, Lung, and Blood Institute, National Institutes of Health, Department of Health and Human Services, under contract numbers 75N92019D00027, 75N92019D00028, 75N92019D00029, and 75N92019D00030. The study was previously supported by research grants R01HL109315, R01HL109301, R01HL109284, R01HL109282, and R01HL109319 and by cooperative agreements U01HL41642, U01HL41652, U01HL41654, U01HL65520, and U01HL65521. We thank the participants of each cohort for their dedication to the studies.
The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. Sally Wenzel receives funding for consulting and clinical trials from AstraZeneca, GSK, Sanofi-Genzyme, Novartis, Knopp; she also receives research support from Pieris and Regeneron.

Table 4. Estimated number of participants with recent pre-pandemic deep phenotyping for harmonization in C4R, by cohort, 2010-2020.
If the most recent exam was prior to 2010, data are not included. CT = computed tomography. ECG = electrocardiogram. GWAS = genome-wide association study. MRI = magnetic resonance imaging. Neurocog = neurocognitive.

Figure 1. Longitudinal pre-COVID follow-up and planned follow-up of C4R participants, by cohort, 1971-2025. Some visits were overlapping, which is not shown; instead, midpoints of the visits are indicated. COVID-era exams are shaded in blue. Solid lines indicate cohort follow-up, which typically includes regular contact by telephone and mail and ongoing events ascertainment.

a 424 gave restricted consent; b Includes 1,626 participants recruited from ARIC; c Withdrawal of consent by one participant; d MESA + 257 new recruits into the MESA Air Pollution Study.

Figure 2. C4R participants, field/clinical centers, and coordinating centers. Blue circles indicate field/clinical centers, and the size is proportional to the number of participants at that field/clinical center. Participants in REGARDS, which does not have field/clinical centers, are shown by additional blue shading according to their geocoded home addresses. Red squares indicate coordinating centers involved in the study. Yellow squares indicate C4R central resources: the data coordination and harmonization center, the biorepository and central laboratory, and the administrative coordinating center.
Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period
A weekly surveillance summary of US COVID-19 activity: Centers for Disease Control and Prevention
COVID-19 as the Leading Cause of Death in the United States
Reductions in 2020 US life expectancy due to COVID-19 and the disproportionate impact on the Black and Latino populations
6-month consequences of COVID-19 in patients discharged from hospital: a cohort study
At the Heart of the Matter: Unmasking and Addressing COVID-19's Toll on Diverse Populations
Racial Health Disparities and Covid-19 - Caution and Context
Racial and Ethnic Differences in Presentation and Outcomes for Patients Hospitalized with COVID-19: Findings from the American Heart Association's COVID-19 Cardiovascular Disease Registry
Coronavirus Disease 2019 (COVID-19) in Italy
Functional MRI using Fourier decomposition of lung signal: reproducibility of ventilation- and perfusion-weighted imaging in healthy volunteers
The impact of COPD and smoking history on the severity of COVID-19: A systemic review and meta-analysis
Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72314 Cases From the Chinese Center for Disease Control and Prevention
Jin-ping Zheng, Nuofu Zhang
Autopsy Findings and Venous Thromboembolism in Patients With COVID-19: A Prospective Cohort Study
Clinical features of patients infected with 2019 novel coronavirus in Wuhan
Deep Learning for Health Informatics
Classifying Chronic Lower Respiratory Disease Events in Epidemiologic Cohort Studies
The Atherosclerosis Risk in Communities (ARIC) Study: design and objectives. The ARIC investigators
CARDIA: study design, recruitment, and some characteristics of the examined subjects
Cohort Profile: The Framingham Heart Study (FHS): overview of milestones in cardiovascular epidemiology
Prevalence of major cardiovascular risk factors and cardiovascular diseases among Hispanic/Latino individuals of diverse backgrounds in the United States
Sample design and cohort selection in the Hispanic Community Health Study/Study of Latinos
Design and implementation of the Hispanic Community Health Study/Study of Latinos
Laboratory, reading center, and coordinating center data management methods in the Jackson Heart Study
Cardiovascular disease event classification in the Jackson Heart Study: methods and procedures
Toward resolution of cardiovascular health disparities in African Americans: design and methods of the Jackson Heart Study
Recruitment and retention of US South Asians for an epidemiologic cohort: Experience from the MASALA study
Mediators of Atherosclerosis in South Asians Living in America (MASALA) study: objectives, methods, and cohort description
Multi-Ethnic Study of Atherosclerosis: objectives and design
The Strong Heart Study. A study of cardiovascular disease in American Indians: design and methods
Genetic and environmental contributions to cardiovascular disease risk in American Indians: the strong heart family study
Genetic epidemiology of COPD (COPDGene) study design
Design of the Subpopulations and Intermediate Outcomes in COPD Study (SPIROMICS)
Baseline Features of the Severe Asthma Research Program (SARP III) Cohort: Differences with Age
Stroke incidence among white, black, and Hispanic residents of an urban community: the Northern Manhattan Stroke Study
The reasons for geographic and racial differences in stroke study: objectives and design
Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium: Design of prospective meta-analyses of genome-wide association studies from 5 cohorts
Harmonization of Respiratory Data From 9 US Population-Based Cohorts: The NHLBI Pooled Cohorts Study
Association Between Blood Pressure and Later-Life Cognition Among Black and White Individuals
Sequencing of 53,831 diverse genomes from the NHLBI TOPMed Program
The Reporting of Race and Ethnicity in Medical and Science Journals: Comments Invited
MACS/WIHS-CSS) MACSWsIHSCCS. COVID-19 Questionnaire 2020
Screening for depression in well older adults: evaluation of a short form of the CES-D (Center for Epidemiologic Studies Depression Scale)
A global measure of perceived stress
Reliability and validity of the Women's Health Initiative Insomnia Rating Scale
Item banks for measuring emotional distress from the Patient-Reported Outcomes Measurement Information System (PROMIS(R)): depression, anxiety, and anger
Social Support Survey Instrument
UCLA Loneliness Scale (Version 3): reliability, validity, and factor structure
The brief resilience scale: assessing the ability to bounce back
The REDCap consortium: Building an international community of software platform partners
Research electronic data capture (REDCap)--a metadata-driven methodology and workflow process for providing translational research informatics support
Racial and regional differences in venous thromboembolism in the United States in 3 cohorts
Antigen-based multiplex strategies to discriminate SARS-CoV-2 natural and vaccine induced immunity from seasonal human coronavirus humoral responses. medRxiv
Association of Nonobstructive Chronic Bronchitis With Respiratory Health Outcomes in Adults
Discriminative Accuracy of FEV1:FVC Thresholds for COPD-Related Hospitalization and Mortality
FEV1:FVC Thresholds for Defining Chronic Obstructive Pulmonary Disease-Reply
A dyadic growth modeling approach to weight gain and lung function loss: The NHLBI Pooled Cohorts Study
Lung function decline in former smokers and low-intensity current smokers: a secondary data analysis of the NHLBI Pooled Cohorts Study
Associations of Blood Pressure and Cholesterol Levels During Young Adulthood With Later Cardiovascular Events
Association Between Blood Pressure and Later-Life Cognition Among Black and White Individuals
Prepared by the McMaster University Evidence-based Practice Center under Contract No. 290-2007-10060-I.) AHRQ Publication No.13-EHC040-EF
Statistical approaches to harmonize data on cognitive measures in systematic reviews are rarely reported
Calibration and validation of an innovative approach for estimating general cognitive performance
Calibrating longitudinal cognition in Alzheimer's disease across diverse test batteries and datasets
Effects of education and race on cognitive decline: An integrative study of generalizability versus study-specific results
The Health and Retirement Study Harmonized Cognitive Assessment Protocol Project: Study Design and Methods
A Genetic Risk Score Associated with Chronic Obstructive Pulmonary Disease Susceptibility and Lung Structure on Computed Tomography
Association of Dysanapsis With Chronic Obstructive Pulmonary Disease Among Older Adults
Comparison of spatially matched airways reveals thinner airway walls in COPD. The Multi-Ethnic Study of Atherosclerosis (MESA) COPD Study and the Subpopulations and Intermediate Outcomes in COPD Study (SPIROMICS)
Adaptive quantification and longitudinal analysis of pulmonary emphysema with a hidden Markov measure field model
Association Between Long-term Exposure to Ambient Air Pollution and Change in Quantitatively Assessed Emphysema and Lung Function
Unsupervised Discovery of Spatially-Informed Lung Texture Patterns for Pulmonary Emphysema: The MESA COPD Study
Emphysema quantification on cardiac CT scans using hidden Markov measure field model: The MESA Lung Study
Unsupervised Domain Adaption With Adversarial Learning (UDAA) for Emphysema Subtyping on Cardiac CT Scans: The Mesa Study
Deepcare: A deep dynamic memory model for predictive medicine
Deep Patient: An Unsupervised Representation to Predict the Future of Patients from the Electronic Health Records
Analysis commons, a team approach to discovery in a big-data environment for genetic epidemiology
Strategic transformation of population studies: recommendations of the working group on epidemiology and population sciences from the
National Heart, Lung, and Blood Advisory Council and Board of External Experts
Timed walk 3140 3879 2431 2473 1064 9847
Hand-grip 3140 1342 2257 628 1160 8527
Spirometry 3612 3119 4000 6232 9436 5006 3334 2517 1123 380 1149 94 0 6427 5081 2849 2920 502 82 1901 25850
ECG 2409 0 9697 5289 892 3584 3803 502 7778 1910 35864
Brain 7570 7570
Brain MRI 771 653 1245 1095 1117 803 5684
Neurocog-long 3589 3354 3543 8400 1480 2000 600 528 817 31783
Neurocog-short 1360 2477 8400 3583 4570 1260 8000 817 30467
Neurocog-any 3589 3354 1360 3934 8400 3652 4570 1260 8000 817 38936
Sleep+Activity 3800 3800
Polysomnography 11 835 9195 913 1718 1779 14451
Actimetry 513 1397 0 9012 852 1764 11774
ECG monitoring 2257 4100 1480 1557 300 9694
Biomarkers
Blood 5046 4221 4000 7574 8400 2900 1144 3731 4683 1267 2500 8000 380 1800 2701 58347
Urine 5046 4221 7574 8400 2900 1144 3688 4683 1267 8000 1800 2701 51424
GWAS 4541 3799 4000 6817 12670 2610 3979 4215 2250 380 1620 46230
RNAseq (blood) 800 2730 1040 2973 342 4027 11912
Metabolomics 3025 8000 2750 787 2000 4000 20562
Methylation 3799 4000 1900 3000 1752 1179 1830 2325 19346
Proteomics 2813 1852 786 2000 7451
Sputum/bronch 380 1000 1380
Gut microbiome 607 8000 8607