key: cord-0724864-r18yrnyf authors: Pfaff, E.; Madlock-Brown, C.; Baratta, J. M.; Bhatia, A.; Davis, H.; Girvin, A. T.; Hill, E.; Kelly, L.; Kostka, K.; Loomba, J.; McMurry, J.; Wong, R.; Bennett, T. D.; Moffitt, R.; Chute, C. G.; Haendel, M.; The N3C Consortium,; Consortium, The RECOVER title: Coding Long COVID: Characterizing a new disease through an ICD-10 lens date: 2022-04-19 journal: nan DOI: 10.1101/2022.04.18.22273968 sha: 4ef4c5a338068cdf93e996e6b55a871ed7e8a164 doc_id: 724864 cord_uid: r18yrnyf Naming a newly discovered disease is always challenging; in the context of the COVID-19 pandemic and the existence of post-acute sequelae of SARS-CoV-2 infection (PASC), which includes Long COVID, it has proven especially challenging. Disease definitions and assignment of a diagnosis code are often asynchronous and iterative. The clinical definition and our understanding of the underlying mechanisms of Long COVID are still in flux. The deployment of an ICD-10-CM code for Long COVID in the US took nearly two years after patients had begun to describe their condition. Here we leverage the largest publicly available HIPAA-limited dataset about patients with COVID-19 in the US to examine the heterogeneity of adoption and use of U09.9, the ICD-10-CM code for "Post COVID-19 condition, unspecified." Our results include a characterization of common diagnostics, treatment-oriented procedures, and medications associated with U09.9-coded patients, which give us insight into current practice patterns around Long COVID. We also established the diagnoses most commonly co-occurring with U09.9, and algorithmically clustered them into three major categories: cardiopulmonary, neurological, and metabolic. 
We aim to apply the patterns gleaned from this analysis to flag probable Long COVID cases occurring prior to the existence of U09.9, thus establishing a mechanism to ensure patients with earlier cases of Long COVID are no less ascertainable for current and future research and treatment opportunities.

Naming diseases is an ever-present challenge, and there is no shortage of efforts that aim to better standardize, disambiguate, and keep track of disease nomenclature and definitions [1] [2] [3] [4]. Disease naming has always been controversial; for example, there are more than 400 names for syphilis dating back to the 15th century [5]. Naming a disease requires defining it, and assigning a standard code to the disease facilitates research, care, and patient engagement due to ease of patient classification and knowledge exchange. However, naming and coding a disease does not mean the disease did not exist prior to its naming or coding. For instance, although "SARS-CoV-2" and "COVID-19" were both coined February 11, 2020, by the International Committee on the Taxonomy of Viruses and the WHO, respectively [6, 7], we know that cases of COVID-19 began to surface in Wuhan, China in late December 2019 [8]. In the US, most diagnostic coding [...]

Despite the relatively early recognition of this condition, an ICD-10-CM code (U09.9, "Post COVID-19 condition, unspecified") was not made available for use in the clinical setting until October of 2021. Moreover, this single code may prove insufficient: considering the phenotypic and severity variation seen in Long COVID patients, it is likely that subtypes of Long COVID exist, and such subtypes may correlate with specific underlying mechanisms that should be targeted by different interventions. There is thus more naming to be done, and a particular need to define and refine computable phenotypes for Long COVID and its subtypes.
In doing so, we can appropriately define cohorts for clinical studies and provide more precise treatment and clinical decision support. This is a key priority for the parent program for this work, the NIH Researching COVID to Enhance Recovery (RECOVER) Initiative, [20] which seeks to understand, treat, and prevent PASC through a wide variety of research modalities, including electronic health record (EHR) and real-world data. In response to the COVID-19 pandemic, the informatics and clinical community harmonized an enormous amount of EHR data to reveal candidate risk factors and therapies associated with COVID-19. The NIH's National COVID Cohort Collaborative (N3C) is now the largest publicly available HIPAA limited dataset in U.S. history, with over 13 million patients, and is a testament to the partnership of over 290 organizations. Due to the scale and demographic and geographic diversity of data within the N3C, it is uniquely well-suited to characterize the early use of the new Long COVID ICD-10-CM code. In prior work, we proposed a machine learning-based computable phenotype definition for Long COVID using the N3C data [21] . Now that U09.9 is available, the presence of the code will be a valuable addition to that existing Long COVID model, especially since ascertainment of presumptive cases based on EHR data in the absence of a U09.9 diagnosis code is limited by the non-specificity of the clinical manifestations of the disease, the frequency with which these symptoms are seen in the general population, and the observation that the diagnosis of Long COVID is one of exclusion. However, due to the caveats noted above regarding newly introduced codes, we first sought to characterize the early clinical use patterns of U09.9 before accepting it into our model and cohort definition at face value. 
This characterization revealed interesting patterns that may enable us to glean a better understanding of both rough subtypes of Long COVID and current clinical practices for diagnosis and treatment of Long COVID. Ultimately, identifying patients with Long COVID based upon multiple means of inquiry (including U09.9) is critically important to recruit participants for research studies, assess the public health burden, and support nimble analytics across our heterogeneous health care systems.

To characterize the use of the U09.9 code, we used EHR data integrated and harmonized inside the NIH-hosted N3C Secure Data Enclave to identify clinical features co-occurring around the time of patients' U09.9 index date. The methods for patient identification, data acquisition, ingestion, and harmonization into the N3C Enclave have been described previously [22] [23] [24]. Briefly, N3C contains EHR data for patients who (1) tested positive for SARS-CoV-2 infection, (2) have symptoms consistent with a COVID-19 diagnosis, or (3) are demographically matched controls who have tested negative for SARS-CoV-2 infection (and have never tested positive) to support comparative studies. Lookback data are available from January 2018 forward for each patient.

In this analysis, we defined our initial population (n = 9,571, sourced from 28 different health care systems) as any non-deceased patient with one or more U09.9 diagnosis codes recorded on or after October 1, 2021. U09.9 codes appearing prior to this date were likely retroactively applied to these patients' records (e.g., as "onset dates" in an EHR Problem List), making it difficult to determine an index date that reflects the actual date of diagnosis. We excluded patients (n = 1,497) whose U09.9 index occurred during an inpatient hospitalization, due to the difficulty of distinguishing co-occurring clinical features related to Long COVID from those related to the primary reason for their hospitalization.
After these exclusions, a base population of 8,074 remained (see Supplemental Figure 1 ). Note that we did not require patients in our cohort to have a COVID-19 diagnosis code (U07.1) or positive SARS-CoV-2 test on record, as many patients with Long COVID do not have this documentation [19] . Data from 28 of the 71 N3C sites were used for this analysis. The remaining sites either (1) did not use the U09.9 code in their N3C data or had not refreshed data since November 1, 2021, meaning the U09.9 code would not be present even if used at the site (n = 30 sites), or (2) did not meet the minimum criteria we set for site data for all RECOVER-related analyses (n = 13 sites): (a) >=25% of inpatients with at least one white blood cell count and at least one serum creatinine (to ensure lab measurement completeness); (b) 75% of inpatient visits have valid end dates; and (c) dates must not be shifted by the site more than 30 days. Additional N3C data quality criteria have been described previously, and also apply to this work. [23] The 28 sites used here are diverse in geographic location and institution size, but cannot be specifically named due to N3C governance policies. We calculated person-level demographics and a number of social determinants of health variables at the area level. These variables are sourced from the Sharecare-Boston University School of Public Health Social Determinants of Health Index [23] , and were linked to patients based on the preferred county (majority residence) associated with the patient's 5-digit ZIP code. We then characterized this cohort by examining diagnoses, procedures, and medications that occurred between each patient's U09.9 index date and 60 days after index (hereafter referred to as our "analysis window"). 
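The cohort rules and analysis window described above can be illustrated with a minimal Python sketch. It is not the N3C pipeline: the `Diagnosis` record, the `inpatient` flag, and the choice of the earliest qualifying U09.9 record as the index date are assumptions made for illustration, and the two symptom codes in the example are arbitrary.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Optional

U099_RELEASE = date(2021, 10, 1)  # codes dated earlier are treated as retroactive
WINDOW_DAYS = 60                  # analysis window: index date through index + 60 days


@dataclass(frozen=True)
class Diagnosis:
    code: str                # ICD-10-CM code, e.g. "U09.9"
    recorded: date
    inpatient: bool = False  # hypothetical flag: recorded during an inpatient stay


def u099_index(records: Iterable[Diagnosis], deceased: bool = False) -> Optional[date]:
    """Return the patient's U09.9 index date, or None if the patient is excluded."""
    if deceased:
        return None
    qualifying = sorted(
        (r for r in records if r.code == "U09.9" and r.recorded >= U099_RELEASE),
        key=lambda r: r.recorded,
    )
    # Assumption: the index is the earliest qualifying U09.9 record.
    if not qualifying or qualifying[0].inpatient:
        return None
    return qualifying[0].recorded


def in_analysis_window(event: date, index: date) -> bool:
    """True if the event falls between the index date and 60 days after it."""
    return index <= event <= index + timedelta(days=WINDOW_DAYS)


history = [
    Diagnosis("U09.9", date(2021, 9, 15)),    # before release: ignored
    Diagnosis("U09.9", date(2021, 11, 2)),    # becomes the index date
    Diagnosis("R06.02", date(2021, 11, 20)),  # inside the window
    Diagnosis("R53.83", date(2022, 2, 1)),    # outside the window
]
index = u099_index(history)
co_occurring = [
    r.code
    for r in history
    if index and r.code != "U09.9" and in_analysis_window(r.recorded, index)
]
print(index, co_occurring)
```

Both window boundaries are treated as inclusive here, which matches the "0 through 60 days" wording used in the Figure 4 caption.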
Our objective in characterizing diagnoses around the U09.9 index date was not only to catalog conditions and symptoms that tend to co-occur with the U09.9 diagnosis, but also to determine which of those conditions and symptoms tend to co-occur with each other. In doing so, we begin to see clusters of conditions that are more likely to occur together within a single patient's record. First, we extracted all conditions in each patient's record within the analysis window, and identified the most frequently occurring conditions in the study population. We then constructed an adjacency matrix for the top 30 conditions, with values indicating the frequency of co-occurrence between two conditions in the study population. From this matrix, we constructed a weighted network with nodes representing individual diagnoses, edges between nodes representing co-occurrence, and edge weights corresponding to the count of patients with both conditions. In order to detect conditions that are more likely to co-occur in our study population than at random, we tested the Louvain [25] , Walktrap, [26] and Girvan-Newman [27] algorithms for community detection. We selected the Louvain algorithm in our final model, as it maximized modularity while retaining a reasonable resolution of detection. For further subgroup analyses, we present clusters detected within age-stratified condition co-occurrence networks. Additional details on community detection, network stability and subgroup analyses are available in Supplemental Methods. Characterizing common procedures around the time of U09.9 allowed us to assess current practice patterns (i.e., diagnostics and treatments) for patients receiving the code. We defined a "procedure" as any medical diagnostics or treatments rendered by a healthcare provider. 
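The diagnosis co-occurrence network described earlier in this section can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' code: the input layout (a mapping from a patient identifier to a set of condition names) and the toy data are hypothetical. The sketch does not implement Louvain itself; it only builds the weighted edge list and scores a given partition with the standard weighted modularity, the quantity Louvain tries to maximize.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(patient_conditions, top_n=30):
    """Count, for each pair of top-N conditions, the patients who have both."""
    freq = Counter(c for conds in patient_conditions.values() for c in set(conds))
    top = {c for c, _ in freq.most_common(top_n)}
    edges = Counter()
    for conds in patient_conditions.values():
        for a, b in combinations(sorted(set(conds) & top), 2):
            edges[(a, b)] += 1  # edge weight = number of patients with both
    return edges


def modularity(edges, community):
    """Weighted modularity Q = sum_c [ w_in(c)/m - (deg(c)/(2m))^2 ]."""
    m = sum(edges.values())  # total edge weight
    if m == 0:
        return 0.0
    within = Counter()  # edge weight inside each community
    degree = Counter()  # summed node strength of each community
    for (a, b), w in edges.items():
        degree[community[a]] += w
        degree[community[b]] += w
        if community[a] == community[b]:
            within[community[a]] += w
    return sum(within[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy data (hypothetical): two tight groups joined by one patient.
patients = {
    "p1": {"dyspnea", "cough"},
    "p2": {"dyspnea", "cough"},
    "p3": {"fatigue", "headache"},
    "p4": {"fatigue", "headache"},
    "p5": {"cough", "fatigue"},
}
edges = cooccurrence_edges(patients)
split = {"dyspnea": "A", "cough": "A", "fatigue": "B", "headache": "B"}
single = {name: "A" for name in split}
q_split = modularity(edges, split)
q_single = modularity(edges, single)
print(round(q_split, 3), round(q_single, 3))
```

On the toy data the two-group partition scores higher than placing every condition in one community, which is the sense in which a modularity-maximizing method prefers it. In practice a library implementation of Louvain (for example, `louvain_communities` in recent NetworkX releases) would search over partitions rather than scoring a hand-picked one.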
We excluded non-informative records that simply reflect that an encounter took place (e.g., CPT 99212, "Office or other outpatient visit"), despite their technical classification as "procedure codes." We then aggregated remaining procedures into high-level categories (e.g., "radiography," "physical therapy") in order to discern the diagnostics and treatments that occurred within each patient's analysis window.

As with diagnoses and procedures, we extracted all medication records occurring within each patient's analysis window, in order to characterize newly prescribed medications that may be used to treat symptoms of Long COVID. In order to focus on newly prescribed medications and not long-standing prescriptions, we excluded medications for each patient for which there were records prior to the patient's U09.9 index. Medications were categorized using the third level of the Anatomical Therapeutic Chemical (ATC) classification system [28].

Each of the patients in our base population came from one of 28 N3C data-submitting health care organizations. Table 1 shows the breakdown of the study cohort by person-level demographics and area-level social determinants of health. It should be noted that greater severity of acute SARS-CoV-2 infection does not appear to have outsize influence in determining which patients end up with a U09.9 code; 1,722 of the U09.9 patients (21.3%) were hospitalized during their acute SARS-CoV-2 infection.

The copyright holder for this preprint (this version posted April 19, 2022) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

In addition to demographics, the N3C data also enables us to examine medication use and procedures that occur in each patient's analysis window, as shown in Figures 1 and 2, respectively.
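The new-medication rule from the methods (keep medications recorded in the analysis window, drop any that also appear before the U09.9 index) can be sketched as follows. This is an illustration only: the source does not state the granularity at which a medication is matched against earlier records, so matching on a plain medication name is an assumption, and the records shown are hypothetical.

```python
from datetime import date, timedelta


def new_medications(med_records, index, window_days=60):
    """Return medications first recorded within the window after the index date.

    med_records: iterable of (medication_name, record_date) pairs.
    """
    records = list(med_records)
    # Anything recorded before the index is treated as a long-standing prescription.
    prior = {name for name, day in records if day < index}
    end = index + timedelta(days=window_days)
    return {name for name, day in records if index <= day <= end and name not in prior}


index = date(2021, 11, 2)
records = [
    ("albuterol", date(2021, 6, 1)),     # pre-index record: makes albuterol long-standing
    ("albuterol", date(2021, 11, 10)),   # in window, but not new
    ("prednisone", date(2021, 11, 12)),  # in window and new
    ("azithromycin", date(2022, 3, 1)),  # new, but after the window closes
]
newly_prescribed = new_medications(records, index)
print(newly_prescribed)
```

The retained medications would then be mapped to third-level ATC classes before counting patients per class.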
Figure 1. [...] associated with fewer than 20 patients are not shown, per N3C download policy. An additional 4,070 patients had no new recorded medications in the analysis window; percentages are shown relative to all patients in the final study population (8,074).

Figure 2. Procedures shown occur within 60 days after a patient's U09.9 diagnosis. Procedure records that simply reflect that an encounter took place (e.g., CPT 99212, "Office or other outpatient visit") are excluded. Category totals represent unique patient-procedure pairs, not necessarily unique individuals. Procedure classes associated with fewer than 20 patients are not shown, per N3C download policy. An additional 2,963 patients with a U09.9 code had no recorded procedures in the analysis window; percentages are shown relative to all patients in the final study population (8,074).

We also analyzed uptake of the code itself, among sites using the code. There is a rapid increase in use of U09.9 by sites following the code's release (Figure 3). Usage of U09.9 post-release is compared with usage of B94.8 ("Sequelae of other specified infectious and parasitic diseases") among COVID-positive patients; some sites may have used B94.8 at the CDC's initial recommendation [29] as a placeholder code prior to U09.9's release. Once U09.9 became available, use of B94.8 at the same sites levels off but does not decrease. This suggests that both codes are still being used; indeed, we see both codes used in the records of 1,614 (20%) of N3C patients in our included U09.9 population.

Figure 3. Clinical use of B94.8 levels off as U09.9 becomes available.
Prior to U09.9's release, the CDC recommended use of B94.8 ("Sequelae of other specified infectious and parasitic diseases") as a placeholder code to signify Long COVID. Among the 28 sites using U09.9, we plotted the use of B94.8 (orange line) as a percentage of patients who had an acute COVID index (to exclude instances of B94.8, a general purpose code, used for non-COVID-related purposes). Compare this trajectory with U09.9's (blue line), which quickly ramps up in use after October 1, 2021. (U09.9 codes shown prior to that date have been retroactively applied to patients' records.)

The definition of Long COVID [30] includes a wide-ranging list of symptoms and clinical features. Many of those features appear below in Figure 4, a visualization of diagnoses that commonly co-occur with U09.9, and each other. The mix of co-occurring diagnoses as well as the clusters produced by the Louvain algorithm change when the cohort is subset into age groups. These age-based clusters are included as Supplemental Figures 2a-d. A full accounting of diagnoses co-occurring with U09.9 (i.e., within the analysis window) in at least 20 patients from our cohort is included as Supplemental Table 1.

Figure 4. Clusters of co-occurring diagnoses among patients with a U09.9 code. When the Louvain algorithm is applied to the top 30 most frequent pairs of co-occurring diagnoses for U09.9 patients (i.e., diagnoses co-occurring in the same patient 0 through 60 days from U09.9 diagnosis date), three distinct clusters emerge (cardiopulmonary, neurological, metabolic). These clusters may represent rough subtypes of Long COVID presentations. The size of each box within a cluster reflects the frequency of that diagnosis relative to others in the diagram.
Condition names are derived from the SNOMED CT terminology, mapped from their ICD-10-CM equivalents.

Our findings suggest that Long COVID symptoms and associated functional disability may present differently depending on the patient, but commonly fall into one of these three identified clusters (cardiopulmonary, neurological, metabolic). When stratified by age, the diagnoses within each cluster change somewhat, though the themes remain constant (Supplemental Figure 2). For the youngest group (<21 years of age; Supplemental Figure 2a), note the appearance of multisystem inflammatory syndrome [31] within the respiratory cluster. Patients aged 65+ (Supplemental Figure 2d) were the most distinct, presenting with more chronic diseases associated with aging (e.g., congestive heart failure, atherosclerosis, atrial fibrillation).

Diagnosis codes are frequently used as criteria to define patient populations. While diagnosis codes alone may not define a cohort with perfect accuracy, they are a useful mechanism to narrow a population from "everyone in the EHR" to a cohort highly enriched with the condition of interest. Our analysis of U09.9 shows that this code may serve in a similar capacity to identify Long COVID patients. However, temporality and rate of uptake by providers are critical issues that must be considered. U09.9 was released for use nearly two years into the COVID-19 pandemic, resulting in potentially millions of patients with Long COVID who "missed out" on being assigned the code. Moreover, nearly six months after the code was introduced, only about half of N3C sites have utilized U09.9.
Our findings must thus be interpreted through this lens of partial and incremental adoption. More work is needed to understand clinical variability and barriers to uptake by providers. We investigated whether the use of non-specific coding such as B94.8 ("Sequelae of other specified infectious and parasitic diseases") could be used as a proxy for early case identification. Our findings show B94.8 use increasing among COVID patients from April 2021 to October 2021, indicating a potential shift in clinical practice patterns to code for Long COVID presentation as guided by the Centers for Disease Control [29] . However, B94.8 is used to code for any sequelae of any infectious disease, and is thus not likely specific enough to rely on for Long COVID case ascertainment in the EHR. The common procedures and medications around the time of U09.9 index provide insight into diagnostics and treatments currently used by providers for patients presenting with Long COVID, for which treatment guidelines remain under development [32] [33] [34] [35] . For new diseases where consensus is lacking, care is often ad hoc and informed by both the symptoms that patients present with and the available diagnostics and treatments that providers can offer. The identification and characterization of care patterns is an important step in designing future research to assess the efficacy and outcomes of these interventions. In our analysis of procedure and medication codes, the frequent use of respiratory medications and tests is unsurprising given that pulmonary manifestations of Long COVID are a predominant subtype of symptoms [19] . Interestingly, antibacterials were also used frequently; it is unclear whether patients with Long COVID are more susceptible to bacterial infections, or if there may be overuse of antibiotics in the setting of fluctuating respiratory Long COVID symptoms or viral infections [36, 37] . 
Both systemic and topical corticosteroids were also commonly used, presumably to treat persistent inflammation as a possible mechanism mediating Long COVID symptoms. Other frequently prescribed medication categories, such as cardiac, neuropsychiatric, gastrointestinal, and dermatologic medications, reflect the potential multi-system organ involvement and symptom clusters in Long COVID that we see in the analysis of conditions. Also of interest is the fact that some patients are receiving a number of rehabilitation services in the 60 days after diagnosis, such as physical and occupational therapy, which lends insight into the burden of functional disability for patients with Long COVID.

Our diagnosis clusters suggest that Long COVID may not be a single phenotype, but rather a collection of sub-phenotypes that may benefit from different diagnostics and treatments. This may explain the hesitancy behind uptake of U09.9, as clinical presentation is not universal. Each of these clusters (cardiopulmonary, neurological, and metabolic) contains conditions and symptoms reported in existing Long COVID literature [38], and clearly suggests that the definition of Long COVID is more expansive than lingering respiratory symptoms [39]. Of particular note is the appearance of myalgic encephalomyelitis (a disease which parallels Long COVID in many ways [40] [41] [42]) in the neurological cluster, suggesting not only frequent co-occurrence with a U09.9 diagnosis, but also co-occurrence with other neurological symptoms. The metabolic cluster is also hypothesis-generating, and follows prior research on the complex relationship between type II diabetes and COVID-19 [43, 44]. The cluster differences we see among age groups (Supplemental Figures 2a-d) make a strong case for age stratification when studying U09.9, and Long COVID in general.
Regardless, given Long COVID's heterogeneity in presentation, course, and outcome, the clustering of symptoms may prove informative for future development of classification and diagnostic criteria [45].

We also investigated how demographics and social determinants of health may contribute to variation in use of U09.9. We found more women than men presenting with Long COVID across all age groups, consistent with literature and anecdotes from Long COVID clinic providers [46]. When evaluating the U09.9 cohort across age groups and socioeconomic status, Long COVID presentation was more heterogeneous. While our findings do not present a clear socio-demographic trend (see Table 1), the role of access to providers and the economic means to afford Long COVID care should continue to be studied for their role as confounders.

All EHR data is limited in that patients with lower access or barriers to care are less likely to be represented. EHR heterogeneity across sites may mean that a U09.9 code at one site does not quite equate to a U09.9 code at another. Moreover, we are not able to know what type of provider issued the U09.9 diagnosis (i.e., specialty), and different clinical organizations have different coding practices.

As the U09.9 code is still quite new and our sample size is limited, we cannot yet confidently label these clusters as clear "Long COVID subtypes." Rather, these clusters are intended to be hypothesis generating, with additional work underway by the RECOVER consortium to further develop and validate these clusters. It should also be noted that many symptoms are not coded in the EHR (and may, for example, be more likely to appear in free-text notes rather than diagnosis code lists).
Future work will incorporate these non-structured sources of symptoms for use in our clustering methodology. Given the variable uptake of the U09.9 code, it is challenging to accurately identify comparator groups for this population; that is, the absence of a U09.9 code cannot, at this time, be interpreted as the absence of Long COVID. This will continue to be an issue in future research, especially when evaluating the effect of PASC on patient morbidity and utilization of diagnostic testing and treatments.

The recent release of ICD-10-CM code U09.9 to codify Long COVID will undoubtedly assist with future case ascertainment and computable phenotyping. However, a large number of patients who developed Long COVID prior to October 1, 2021 continue to be burdened with symptoms, and must also be included in data-driven cohort identification efforts for trial recruitment and retrospective analyses. Considering the caveats around rate of uptake among clinicians and late timing of the code's release, we recommend that when characterizing Long COVID using EHRs, U09.9 should not be used alone, but rather in combination with other strategies such as more complex computable phenotypes [21]. Our findings from the characterization of patients with the U09.9 diagnosis may be of use in refining phenotypes to identify pre-U09.9 patients that might have Long COVID. There is clear utility to the characterization of early use of U09.9, as it represents the first "hook" in EHR data that can be used to identify and assess current diagnostic and treatment patterns at scale. Moreover, given the heterogeneous presentation of Long COVID, clustering of co-existing conditions and potential symptoms may be valuable in informing future development of more detailed criteria for diagnosis of Long COVID and its subtypes.
[...] of Texas Medical Branch at Galveston - UL1TR001439: The Institute for Translational Sciences
• Medical University of South Carolina - UL1TR001450: South Carolina Clinical & Translational Research Institute (SCTR)
• University of Massachusetts Medical School Worcester - UL1TR001453: The UMass Center for Clinical and Translational Science (UMCCTS)
• University of Southern California - UL1TR001855: The Southern California Clinical and Translational Science Institute (SC CTSI)
• Columbia University Irving Medical Center - UL1TR001873: Irving Institute for Clinical and Translational Research
• George Washington Children's Research Institute - UL1TR001876: Clinical and Translational Science Institute at Children's National (CTSA-CN)

What's in a name? Issues to consider when naming Mendelian disorders
A dyadic approach to the delineation of diagnostic entities in clinical genomics
A Census of Disease Ontologies
ICD-11: an international classification of diseases for the twenty-first century
On Allusive Names for the Syphilitic Patient From the 16th to the 19th Century: The Role of Dermatopathology
Coronaviridae Study Group of the International Committee on Taxonomy of Viruses. The species Severe acute respiratory syndrome-related coronavirus: classifying 2019-nCoV and naming it SARS-CoV-2
World Health Organization. Novel Coronavirus (2019-nCoV) Situation Report - 22
Dissecting the early COVID-19 cases in Wuhan
Diagnostic Codes to Identify COVID-19 Among Hospitalized Patients
Positive Predictive Value of ICD-10 Diagnosis Codes for COVID-19
Positive predictive value of COVID-19 ICD-10 diagnosis codes across calendar time and clinical setting
"The long tail of Covid-19" - The detection of a prolonged inflammatory response after a SARS-CoV-2 infection in asymptomatic and mildly affected patients
The lasting misery of coronavirus long-haulers
Long covid: How to define it and how to manage it
Long-Term Sequelae of COVID-19: A Systematic Review and Meta-Analysis of One-Year Follow-Up Studies on Post-COVID Symptoms
WHO Clinical Case Definition Working Group on Post-COVID-19 Condition. A clinical case definition of post-COVID-19 condition by a Delphi consensus
Assessment of the Frequency and Variety of Persistent Symptoms Among Patients With COVID-19: A Systematic Review
Patient-Led Research Collaborative: embedding patients in the Long COVID narrative
Characterizing long COVID in an international cohort: 7 months of symptoms and their impact
Who has long-COVID? A big data approach
The National COVID Cohort Collaborative (N3C): Rationale, design, infrastructure, and deployment
Synergies between centralized and federated approaches to data quality: a report from the national COVID cohort collaborative
Fast unfolding of communities in large networks
Computing Communities in Large Networks Using Random Walks
Community structure in social and biological networks
Structure and principles
Coronavirus disease (COVID-19): Post COVID-19 condition
Information for healthcare providers about multisystem inflammatory syndrome in children (MIS-C)
COVID-19: evaluation and management of adults following acute viral illness
Multidisciplinary collaborative consensus guidance statement on the assessment and treatment of fatigue in postacute sequelae of SARS-CoV-2 infection (PASC) patients
Multi-disciplinary collaborative consensus guidance statement on the assessment and treatment of cognitive symptoms in patients with post-acute sequelae of SARS-CoV-2 infection (PASC)
Multi-disciplinary collaborative consensus guidance statement on the assessment and treatment of breathing discomfort and respiratory sequelae in patients with post-acute sequelae of SARS-CoV-2 infection (PASC)
High Value Care Task Force of the American College of Physicians and for the Centers for Disease Control and Prevention. Appropriate antibiotic use for acute respiratory tract infection in adults: Advice for high-value care from the American college of physicians and the centers for disease control and prevention
Outpatient Antibiotic Prescribing for Acute Respiratory Infections During Influenza Seasons
Characterizing Long COVID: Deep Phenotype of a Complex Condition
A clinical case definition of post COVID-19 condition by a Delphi consensus
Insights from myalgic encephalomyelitis/chronic fatigue syndrome may help unravel the pathogenesis of postacute COVID-19 syndrome
Long COVID or Post-acute Sequelae of COVID-19 (PASC): An Overview of Biological Factors That May Contribute to Persistent Symptoms
Will COVID-19 Lead to Myalgic Encephalomyelitis/Chronic Fatigue Syndrome? Front Med
Post-acute sequelae of COVID-19: A metabolic perspective
Diabetes and the Risk of Long-term Post-COVID Symptoms
Distinctions between diagnostic and classification criteria?
Findings From Mayo Clinic's Post-COVID Clinic: PASC Phenotypes Vary by Sex and Degree of IL-6 Elevation

This research was funded by the National Institutes of Health (NIH) Agreement OT2HL161847-01. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the NIH.

The analyses described in this publication were conducted with data or tools accessed through the NCATS N3C Data Enclave covid.cd2h.org/enclave and supported by NCATS U24 TR002306. This research was possible because of the patients whose information is included within the data from participating organizations (covid.cd2h.org/dtas) and the organizations and scientists (covid.cd2h.org/duas) who have contributed to the on-going development of this community resource [22].

The N3C data transfer to NCATS is performed under a Johns Hopkins University Reliance Protocol # IRB00249128 or individual site agreements with NIH. The N3C Data Enclave is managed under the authority of the NIH; information can be found at https://ncats.nih.gov/n3c/resources. The work was performed under DUR RP-5677B5.

Authorship was determined using ICMJE recommendations.

We gratefully acknowledge contributions from the following N3C core teams: (