key: cord-0883332-j7nymzwa authors: Lenehan, Patrick J.; Ramudu, Eshwan; Venkatakrishnan, A. J.; Berner, Gabriela; McMurry, Reid; O’Horo, John C.; Badley, Andrew D.; Morice, William; Halamka, John; Soundararajan, Venky title: Anemia during SARS-CoV-2 infection is associated with rehospitalization after viral clearance date: 2021-06-24 journal: iScience DOI: 10.1016/j.isci.2021.102780 sha: a9bf84d055f7eb4147792702efd86130e8da5f42 doc_id: 883332 cord_uid: j7nymzwa COVID-19 patients can experience symptoms and complications after viral clearance. It is important to identify clinical features of patients who are likely to experience these prolonged effects. We conducted a retrospective study to compare longitudinal lab test measurements (hemoglobin, hematocrit, estimated glomerular filtration rate, serum creatinine, and blood urea nitrogen) in patients rehospitalized after PCR-confirmed SARS-CoV-2 clearance (n=104) versus patients not rehospitalized after viral clearance (n=278). Rehospitalized patients had lower median hemoglobin levels in the year prior to COVID-19 diagnosis (cohen’s D = -0.50; p=1.2x10-3) and during their active SARS-CoV-2 infection (cohen’s D = -0.71; p=4.6x10-8). Rehospitalized patients were also more likely to be diagnosed with moderate or severe anemia during their active infection (OR = 2.18; p = 4.99x10-9). These findings suggest that anemia-related laboratory tests should be considered in risk stratification algorithms for COVID-19 patients. Since the first diagnosed case of COVID-19 in December 2019, over 170 million people have been infected with SARS-CoV-2 worldwide resulting in over 3.5 million deaths ("Johns Hopkins Coronavirus Resource Center," n.d.) . 
While significant progress has been made in understanding the pathogenesis of COVID-19, including the rapid development and clinical rollout of multiple vaccines (Bos et al., 2020; Corbett et al., 2020; Folegatti et al., 2020; Jackson et al., 2020; Mercado et al., 2020; Mulligan et al., 2020), along with detailed characterizations of the SARS-CoV-2 entry receptor ACE2 (Singh et al., 2020; Venkatakrishnan et al., 2020; Zhao et al., 2020; Ziegler et al., 2020), there are still few options available for effective treatment of patients with severe COVID-19. Further, as the pandemic has progressed, there have been reports of long-lasting effects of COVID-19 even in patients who did not experience a severe disease course during their active infection period (Carfì et al., 2020; del Rio et al., 2020; Yelin et al., 2020). However, the clinical, molecular, and demographic biomarkers characterizing patients who are more likely to experience these lasting effects after clearing SARS-CoV-2 ("long COVID") are not yet known. The need to answer such questions during the rapidly evolving COVID-19 pandemic has emphasized the requirement for tools that facilitate real-time analysis of patient data as it is obtained and stored in large electronic health record (EHR) systems. Specifically, clinical research efforts to understand the features defining COVID-19 patients, or subsets thereof, fundamentally require reliable systems that enable (1) conversion of unstructured information (e.g., patient notes written by healthcare professionals) into structured formats suitable for downstream analysis and (2) temporal alignment and integration of such unstructured data with the already structured information available in EHR databases (e.g., lab test results, disease diagnosis codes). 
With these requirements in mind, we have previously reported the development of augmented curation methods that enable the rapid creation and comparison of defined cohorts of COVID-19 patients within a large EHR system (Pawlowski et al., 2021; Wagner et al., 2020). For example, we have used natural language processing (NLP) to train disease diagnosis models which classify mentions of phenotypes in EHR notes as positive (i.e., Patient has Disease X), negative (i.e., Patient does not have Disease X), or other (e.g., Patient is suspected to have Disease X or has a family history of Disease X). Using this textual sentiment-based curation model, we found that diagnoses of anemia and acute kidney injury (AKI) were recorded more frequently in the notes of COVID-19 patients who were rehospitalized after PCR-confirmed SARS-CoV-2 clearance compared to patients who were not rehospitalized after viral clearance (Pawlowski et al., 2021). These findings applied to notes both in the year prior to COVID-19 diagnosis and during active SARS-CoV-2 infection. Anemia, defined as a deficiency of red blood cells or hemoglobin in circulation, has several etiologies including vitamin or mineral deficiencies, chronic inflammation, drug- or infection-induced hemolysis, bone marrow suppression, and blood loss (Turner et al., 2021). AKI is generally caused by reduced blood flow to the kidney (pre-renal AKI), direct damage to the kidney itself by drugs, infectious agents, or excessive inflammation (intrinsic AKI), or obstruction of outflow from the renal tubular system (post-renal AKI) (Makris and Spanou, 2016). 
Both anemia and AKI are common in critically ill patients (Case et al., 2013; Girling et al., 2020; Mohsenin, 2017; Roubinian et al., 2019; Walsh et al., 2006; Warner et al., 2020) and have been suggested as biomarkers for mortality and disease severity in COVID-19 patients (Chan et al., 2021; Faghih Dinevari et al., 2021; Hariyanto and Kurniawan, 2020; Nadim et al., 2020; Oh et al., 2021). However, our previously mentioned NLP analysis was the first to associate anemia with the long COVID syndrome and rehospitalization after SARS-CoV-2 clearance. Here we complement this work by evaluating whether diagnostic lab tests for anemia and AKI corroborate these phenotypic associations in a larger patient cohort. We split the set of hospitalized COVID-19 patients with confirmed viral clearance (n=382) into two groups: (1) post-clearance hospitalized ("PCH"; n=104) and (2) post-clearance non-hospitalized ("PCNH"; n=278), where viral clearance was defined by two consecutive negative SARS-CoV-2 PCR tests following a positive test (see Methods and Figure 1A-B). A demographic summary of these two cohorts is provided in Table 1. We then compared a set of selected lab test results between these cohorts during two time windows: (1) the year prior to COVID-19 diagnosis ("pre-COVID phase") and (2) the time during which each patient was SARS-CoV-2 positive according to their PCR results ("SARS-CoV-2 + phase"). Given our previous NLP-based findings (Pawlowski et al., 2021), we considered both anemia-related and kidney function lab tests including hemoglobin, hematocrit, estimated glomerular filtration rate (eGFR), serum creatinine, and serum blood urea nitrogen (BUN) levels (Figure 1C). For each patient, we first considered the median values of each lab test over the designated interval. 
Consistent with our previous augmented curation derived findings, PCH patients had significantly lower median hemoglobin and hematocrit levels in both the pre-COVID phase (cohen's D = -0.50, p=1.2x10 -3 ; cohen's D = -0.48, p=2.5x10 -3 ) and the SARS-CoV-2 + phase (cohen's D = -0.71, p=4.6x10 -8 ; cohen's D = -0.69, p=8.5x10 -8 ) ( Table 2, Figure 2A-D) . Further, PCH patients had lower median eGFR and higher median BUN levels during the pre-COVID phase (cohen's D = -0.46, p = 0.02; cohen's D = -0.45, p=1.2x10 -3 ) and the SARS-CoV-2 + phase (cohen's D = 0.46, p = 0.01; cohen's D = 0.42, p=8.9x10 -6 ) (Table 2, Figure S1 ). We also tested whether extreme (i.e. minimum or maximum) values of a given lab test over the designated periods varied between PCH and PCNH patients, as a measure of central tendency (e.g., median) may fail to capture a single occurrence of phenotypes such as anemia or AKI. PCH patients had lower minimum values of hemoglobin, hematocrit, and eGFR in both the pre-COVID phase (cohen's D = -0.49, p=2.8x10 -3 ; cohen's D = -0.45, p=3.0x10 -3 ; cohen's D = -0.57, p=3.0x10 -3 ) and the SARS-CoV-2 + phase (cohen's D = -0.85, p=1.6x10 -10 ; cohen's D = -0.79, p=1.2x10 -9 ; cohen's D = -0.51, p=4.4x10 -4 ) ( Table 3 ; Figure 3 and Figure S2 ). They also had higher maximum serum BUN levels during both the pre-COVID phase (cohen's D = 0.50, p=6.6x10 -4 ) and the SARS-CoV-2 + phase (cohen's D = 0.60, p=5.2x10 -8 ) ( Table 4 and Figure S2 ). Taken together, these analyses corroborate our prior textual sentiment-based EHR findings, suggesting that patients who are rehospitalized after SARS-CoV-2 clearance are more likely to have pathologically altered anemia-related and renal function lab tests both prior to and during SARS-CoV-2 infection. As males and females have different normal ranges of hemoglobin and hematocrit, we performed sex-split subanalyses of anemia-related lab tests similar to those described above. 
Patient-level median hemoglobin and hematocrit during the pre-COVID phase were significantly lower in both the female (cohen's D = -0.66, p=0.01; cohen's D = -0.67, p=0.01) and male (cohen's D = -0.42, p=0.02; cohen's D = -0.37, p=0.05) PCH cohorts versus their PCNH counterparts (Tables 5-6, Figure 4A-D). These trends were even stronger during the SARS-CoV-2 + phase among both the female (cohen's D = -0.85; p=7.0x10 -6 ; cohen's D = -0.91, p=7.0x10 -6 ) and male (cohen's D = -0.60; p=8.2x10 -4 ; cohen's D = -0.53, p=1.8x10 -3 ) cohorts (Tables 5-6, Figure 4E-H). Similarly, in our analysis of extreme values, minimum hemoglobin and hematocrit measurements during the pre-COVID phase tended to be lower in both female (cohen's D = -0.60; p=0.01; cohen's D = -0.58, p=0.01) and male (cohen's D = -0.42; p=0.05; cohen's D = -0.38, p=0.07) PCH patients (Tables 7-8, Figure 4I-L). These trends were again even stronger in the SARS-CoV-2 + phase among both females (cohen's D = -1.02; p=7.5x10 -7 ; cohen's D = -0.97, p=1.8x10 -6 ) and males (cohen's D = -0.74; p=1.1x10 -4 ; cohen's D = -0.66, p=2.7x10 -4 ) (Tables 7-8, Figure 4M-P). We next evaluated whether outright anemia occurred more frequently in the PCH cohort than the PCNH cohort (see Methods and Figure 5A). Anemia was indeed observed more frequently in the PCH cohort during both the pre-COVID phase ( (Figure 5D-E) . To determine whether this association is specific to COVID-19, we performed a similar analysis among patients who were hospitalized within one week of influenza diagnosis since 2003 (see Methods). Anemia was indeed more frequently observed in the year prior to flu diagnosis and during the initial influenza-associated hospitalization among patients who were subsequently rehospitalized compared to those who were not, with the strongest trend observed for moderate to severe anemia during the flu-positive phase (49/127 [39%] vs. 158/754 [21%]; OR=1.84; p=3.81x10 -5 ) (Figure S3). 
Similarly we assessed whether laboratory-diagnosed AKI occurs more frequently in the PCH cohort based on the creatinine-related components of the KDIGO (Kidney Disease: Improving Global Outcomes) criteria for diagnosis and staging of AKI in adults ( Figure S4A ) (Khwaja, 2012) . Any stage AKI was indeed more common in the PCH cohort than the PCNH (Figure S4D-E) . Intriguingly when split by sex, we found that this trend was driven by males, as PCH males were more likely to experience stage 2+ AKI in both the pre-COVID phase ( (Figure S4F-G) , whereas this was not true when comparing PCH and PCNH females (data not shown). To test whether the previous observations were affected by potential confounding demographic or clinical covariates, we performed a series of logistic regression analyses. First we evaluated the association between post-clearance rehospitalization and the following independent variables during the pre-COVID and SARS-CoV-2 + phases, separately: minimum hemoglobin, maximum BUN, sex, age, number of blood draws, and ICU admission status (for the SARS-CoV-2 + phase only) ( Table 9 ). The consideration of ICU admission status as a covariate was particularly important because ICU admission was more common among PCH than PCNH patients ( Table 1) . While none of these variables in the pre-COVID phase showed a significant association with rehospitalization status, minimum hemoglobin during the SARS-CoV-2 + phase was singularly associated with rehospitalization (β=-0.29, p=4.2x10 -5 ). We then modified our logistic regression analysis by replacing the minimum hemoglobin and maximum BUN terms with binary labels of moderate/severe anemia and stage 2+ AKI, respectively (Table 10) . Among the tested pre-COVID variables, only stage 2+ AKI was modestly associated with rehospitalization status (β=0.93, p=0.09). 
ICU admission during the SARS-CoV-2 + phase was also modestly associated with post-clearance rehospitalization (β=0.70, p=0.06), suggesting that patients with more severe courses of initial COVID-19 illness are more likely to experience subsequent hospitalization. However, moderate/severe anemia during the SARS-CoV-2 + phase was again the most strongly associated variable with rehospitalization (β=1.16, p=9.0x10 -5 ). Finally we performed a split cohort subanalysis to specifically test whether the robust association between anemia and rehospitalization status applied to both patients who were and were not admitted to the ICU during their index infection. Indeed, the rate of moderate/severe anemia was significantly higher in PCH patients than PCNH patients when considering only those who were admitted to the ICU ( (Figure S5A-B ). Approximately one year after the first confirmed case, the COVID-19 pandemic continues to ravage communities across the globe. While efforts early in the pandemic rightly focused on the acute lung inflammation caused by SARS-CoV-2, the subsequent realization that COVID-19 may have more lasting effects has mandated a better understanding of factors that predispose patients to experience long-term COVID-19 related complications. We have previously sought to address this knowledge gap using state-of-the-art NLP models deployed on a complete EHR system (Pawlowski et al., 2021) , and here we have expanded this effort to include the longitudinal analysis of laboratory measurements both prior to COVID-19 diagnosis and during active SARS-CoV-2 infection. This lab test analysis shows that anemia and renal function in the pre-COVID and SARS-CoV-2 + phases are associated with the risk of post viral clearance rehospitalization. 
Our logistic regression analyses suggest that AKI and static renal lab measurements are not independently associated with rehospitalization, with ICU admission during COVID-19 infection representing a likely confounding factor contributing to the observed trends. Indeed, AKI has previously been reported as a common morbidity in ICU patients (Case et al., 2013; Girling et al., 2020; Mohsenin, 2017) and was observed frequently among ICU-admitted COVID-19 patients in our cohort, with 78% (94/121) and 39% (47/121) of ICU-admitted patients experiencing stage 1+ and stage 2+ AKI, respectively. On the other hand, hemoglobin levels and the outright diagnosis of moderate or severe anemia are robustly associated with post-clearance rehospitalization independent of sex, age, number of blood draws, and ICU admission status. While the pathophysiologic foundations for these associations are not clear, the findings do merit consideration in the context of COVID-19 clinical care. Indeed, pre-existing conditions are already integrated in the clinical decision-making algorithms around COVID-19, as the Centers for Disease Control and Prevention (CDC) has designated various chronic conditions as risk factors for severe COVID-19 infection (e.g. cancer, chronic kidney disease, chronic obstructive pulmonary disease, and cardiovascular diseases such as heart disease, obesity, and diabetes) (CDC, 2020). However, there is much less known regarding factors or conditions that place people at risk for subsequent complications such as rehospitalization after viral clearance. Once identified, such factors and conditions should similarly be incorporated into the clinical decision-making process when treating COVID-19 patients. Our finding that lower hemoglobin levels and the outright diagnosis of moderate or severe anemia are associated with post viral clearance rehospitalization has not been previously reported. 
While this analysis certainly does not establish a causal role for anemia in post clearance hospitalization, the robust association warrants further studies to determine whether anemia-mitigating therapies (e.g., vitamin or mineral supplementation, erythropoietin administration, or blood transfusion) engender long-term benefits in select COVID-19 patients. Further, it is interesting that fatigue has been commonly reported as both an acute symptom and a lasting effect of COVID-19 (Pascarella et al., 2020; Townsend et al., 2020; Wagner et al., 2020), but the mechanisms underlying this phenotype have not been established. 295 of the 374 (79%) hospitalized COVID-19 patients in this study had at least mild anemia during their SARS-CoV-2 + phase, and 140 of the 374 (37%) patients had moderate or severe anemia (defined as hemoglobin < 10 g/dL) during this interval. It would be worthwhile to perform a longitudinal follow-up on these patients to determine whether they continue to experience anemia in the months following SARS-CoV-2 clearance, and whether the presence of such a post-COVID anemia is associated with reports of fatigue. Along with our previous analysis (Pawlowski et al., 2021), this study illustrates the value of deploying sophisticated platforms across EHR systems that enable the integrated analysis of diverse data types including sentiment-laden text and laboratory test measurements. Taken together, these studies exemplify the value of leveraging augmented curation methods to first identify phenotypes that distinguish defined clinical cohorts and then cross-checking these phenotypic associations through a hypothesis-driven analysis of related lab tests. This framework can be effectively scaled for other clinical research efforts not only in COVID-19 but also in any other disease areas of interest. This study has a few important limitations to consider. 
First, this analysis considers patients within only one EHR system; while this system does contain patients from multiple sites of clinical care in distinct geographic locations (Minnesota, Arizona, Florida), there are still likely underlying biases in important factors such as patient demographics and tendencies around the ordering of laboratory tests by clinicians. Such biases would prevent the studied cohort and their associated data points from serving as true representative samples of all COVID-19 patients. Second, the analyzed cohort was relatively small (n = 382) as most patients diagnosed with COVID-19 do not subsequently receive two confirmatory negative PCR tests, and only a subset of these 382 patients possessed data for each lab test of interest. Finally, the definition of the SARS-CoV-2 + window is imperfect as the true date of viral clearance for a given patient would likely precede their first negative PCR test by an unknown amount of time. ("phases") were defined relative to SARS-CoV-2 PCR testing results. (B) Of the 2,429 patients who were diagnosed by PCR with COVID-19 and subsequently confirmed to have cleared SARS-CoV-2 with two consecutive negative tests, we created two cohorts: (1) patients who were hospitalized during their index infection and not hospitalized after confirmed viral clearance (post-clearance non-hospitalized, or "PCNH"; n=278), and (2) patients who were hospitalized during their index infection and rehospitalized within 90 days of confirmed viral clearance (post clearance hospitalized, or "PCH"; n=104). (C) A defined set of anemia and kidney related lab test measurements were compared between the PCH and PCNH cohorts in the pre-COVID and SARS-CoV-2 + intervals. Red shading indicates normal ranges for hemoglobin and hematocrit, spanning from the lower limit of normal for females (12 g/dL hemoglobin, 35.5% hematocrit) to the upper limit of normal for males (17.5 g/dL hemoglobin, 48.6% hematocrit). 
For each comparison, statistics shown include the number of patients analyzed, Cohen's D, BH-corrected Mann Whitney U test p-value, and the difference of medians between the two cohorts. Box and whisker plots depict median and IQR along with the 10th and 90th percentiles. Table 1. Demographics and clinical characteristics of study cohorts, including patients who were and who were not rehospitalized after PCR-confirmed clearance of SARS-CoV-2. Each demographic variable or clinical characteristic was tested for difference in proportion with a Fisher Exact test or a difference in magnitude (for continuous variables) using a Mann-Whitney U test, and p-values shown without parentheses were corrected for multiple testing using the Benjamini-Hochberg (BH) correction. Statistically significant differences (p < 0.05) are denoted with an asterisk (*). Table 9. Logistic regression analyses to assess the association between post-viral clearance hospitalization and minimum hemoglobin, maximum BUN, or potential confounding variables during the pre-COVID and SARS-CoV-2 + phases. Confounding variables considered include sex, age, the number of blood draws in the given interval, and ICU admission status during the given interval. For each regression (row), the coefficient (β) and associated Bonferroni-adjusted p-value (p) are shown for each independent variable (column) assessed. The coefficient represents the log-odds ratio. P-values were calculated using the log likelihood ratio test and adjusted using the Bonferroni correction. An association between an independent variable and post clearance hospitalization is considered significant if p < 0.05 (shown in bold). 
The association between post-viral clearance hospitalization and ICU admission during the pre-COVID interval was not analyzed because this information was not available for our cohort prior to April 2020. Binary variables were assigned as follows: sex: 0 = female, 1 = male; ICU admission during interval: 0 = not admitted to ICU, 1 = admitted to ICU. Table 10 . Logistic regression analyses to assess the association between post-viral clearance hospitalization and the diagnosis of moderate to severe anemia, the diagnosis of stage 2+ AKI, or potential confounding variables during the pre-COVID and SARS-CoV-2 + phases. Confounding variables considered include sex, age, the number of blood draws in the given interval, and ICU admission status during the given interval. For each regression (row), the coefficient (β) and associated Bonferroni-adjusted p-value (p) are shown for each independent variable (column) assessed. The coefficient represents the log-odds ratio. P-values were calculated using the log likelihood ratio test and adjusted using the Bonferroni correction. An association between an independent variable and post clearance hospitalization is considered significant if p < 0.05 (shown in bold). The association between post-viral clearance hospitalization and ICU admission during the pre-COVID interval was not analyzed because this information was not available for our cohort prior to April 2020. Binary variables were assigned as follows: moderate/severe anemia: 0 = no anemia, 1 = anemia; sex: 0 = female, 1 = male; ICU admission during interval: 0 = not admitted to ICU, 1 = admitted to ICU. Data S1. Code for statistical analyses presented in this manuscript, related to STAR Methods. This zip file contains several python notebooks with code corresponding to statistical analyses including comparisons of individual lab test values between PCH and PCNH cohorts, along with the presented logistic regression analyses. 
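The captions for Tables 9 and 10 state that each coefficient (β) is a log-odds ratio and that p-values were Bonferroni-adjusted. As a reading aid, the following minimal sketch converts two of the coefficients reported in the text into odds ratios. The function names and the number of tests passed to the Bonferroni helper are illustrative assumptions, not values taken from Data S1, and the interpretation "per unit" assumes predictors were entered unscaled.

```python
from math import exp


def odds_ratio_from_coefficient(beta):
    """Convert a logistic regression coefficient (log-odds ratio) to an odds ratio."""
    return exp(beta)


def bonferroni(p_value, n_tests):
    """Bonferroni adjustment: multiply by the number of tests, capped at 1."""
    return min(1.0, p_value * n_tests)


# Coefficients quoted in the text for the SARS-CoV-2 + phase.
reported = {
    "minimum hemoglobin (per unit of the predictor)": -0.29,
    "moderate/severe anemia (binary label)": 1.16,
}

for label, beta in reported.items():
    print(f"{label}: beta={beta:+.2f}, odds ratio={odds_ratio_from_coefficient(beta):.2f}")

# Hypothetical example only: the number of tests per regression is an assumption.
print(bonferroni(0.02, n_tests=6))
```

Under this reading, a negative coefficient for minimum hemoglobin corresponds to an odds ratio below 1 (lower odds of rehospitalization as hemoglobin increases), and a positive coefficient for the anemia label corresponds to an odds ratio above 1.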
Further information and requests for information should be directed to and will be fulfilled by the lead contact, Venky Soundararajan (venky@nference.net). This study did not generate new reagents.
• Data: The data supporting this study has not been deposited in a public repository because it contains personally identifiable information from human subjects which are protected by national privacy regulations, but a de-identified version of this data may be made available from the lead contact on request. A proposal with detailed description of study objectives and statistical analysis plan will be needed for evaluation of the reasonability of requests. De-identified data will be provided after approval from the lead contact and the Mayo Clinic's standard IRB process for such requests.
• Code: Original code from this analysis is available in Data S1.
• Any additional information required to reanalyze the data reported in this paper is available from the lead contact upon request.
The total cohort included 382 individuals. Each individual was part of one of the following two cohorts, on the basis of whether they were rehospitalized after PCR-confirmed clearance of SARS-CoV-2: (1) post clearance hospitalized ("PCH"; n=104), or (2) post clearance non-hospitalized ("PCNH"; n=278). More details describing the participant selection algorithm are provided in the Method Details and are illustrated in Figure 1. Demographic and clinical characteristics of the analyzed cohorts (including age, sex, race, ethnicity, time to PCR-confirmed SARS-CoV-2 clearance, and ICU admission status during the index COVID-19 infection) are provided in Table 1. This study was reviewed and approved by the Mayo Clinic Institutional Review Board (IRB 20-003278) as a minimal risk study. Subjects were excluded if they did not have a research authorization on file. 
The approved IRB protocol was titled: Study of COVID-19 patient characteristics with augmented curation of Electronic Health Records (EHR) to inform strategic and operational decisions with the Mayo Clinic. The study was deemed exempt by the Mayo Clinic Institutional Review Board and waived from consent. The following resource provides further information on the Mayo Clinic Institutional Review Board and adherence to basic ethical principles underlying the conduct of research, and ensuring that the rights and well-being of potential research subjects are adequately protected (https://www.mayo.edu/research/institutional-review-board/overview). This was a case-control study. The primary outcome was rehospitalization status within 30 days of PCR-confirmed SARS-CoV-2 clearance. The exposure variables were anemia and kidney dysfunction as assessed through selected lab measurements detailed below. Cases and controls were selected from a cohort of 66,689 patients who presented to the Mayo Clinic Health System (including tertiary medical centers in Minnesota, Arizona, and Florida) and received at least one positive SARS-CoV-2 PCR test between the start of the COVID-19 pandemic and December 12, 2020 (see Figure 1B). Post clearance hospitalized ("PCH") cases (n=104) were defined as patients who were hospitalized for COVID-19, had two documented negative SARS-CoV-2 PCR tests following their last positive test result, and were subsequently admitted to the hospital within 30 days of clearance. Post clearance non-hospitalized ("PCNH") controls (n=278) were defined as those who were hospitalized for COVID-19, had two documented negative SARS-CoV-2 PCR tests following their last positive test result, and were not hospitalized within 30 days of clearance. Demographic and clinical features of the PCH and PCNH cohorts are summarized in Table 1. 
Laboratory results were assessed (1) during the year prior to COVID-19 diagnosis, referred to throughout this manuscript as the "pre-COVID phase" and (2) during the period in which a patient was positive for SARS-CoV-2 by PCR, referred to throughout this manuscript as the "SARS-CoV-2 + phase." COVID-19 diagnosis was conferred by a positive SARS-CoV-2 PCR test, and clearance was defined as two consecutive negative SARS-CoV-2 PCR tests occurring after a positive test. The estimated viral clearance date was taken as the date of the first negative PCR test in this sequence of two consecutive negative tests. The primary exposure variables were anemia and AKI. The selected laboratory measurements related to anemia included hemoglobin and hematocrit, and laboratory measurements related to AKI included serum creatinine, serum blood urea nitrogen (BUN), and estimated glomerular filtration rate (eGFR). The majority of eGFR measurements (~96%) were estimated by creatinine; these tests had a maximum recorded value of 90 mL/min/BSA, which corresponds to the lower limit of normal. The remaining 4% of eGFR measurements were estimated by cystatin C levels; for these tests, a value above 90 mL/min/BSA was possible and was indeed recorded in 5 of 31 cases. For a given lab test, we considered the median, maximum, and minimum measurements for each patient during the specified time windows (i.e. the pre-COVID and SARS-CoV-2 + phases). Histograms showing the number of measurements per patient in each time period for the selected tests are shown in Figures S6-S7 . Given the directionality of these tests (i.e. anemia is defined by low hemoglobin and hematocrit, while kidney dysfunction is characterized by increases in serum creatinine and BUN but a decrease in eGFR), we were primarily interested in comparing the patient-level minimum values of hemoglobin, hematocrit, and eGFR, and patient-level maximum values of serum creatinine and BUN in each time period. 
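The clearance rule and the per-patient summaries described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in Data S1: the function names, the tuple-based input layout, and the choice of a half-open window (start inclusive, end exclusive) are all hypothetical, and clearance is interpreted as at least two negative tests following the last positive test.

```python
from datetime import date
from statistics import median


def estimate_clearance_date(pcr_tests):
    """Return the estimated clearance date, or None if clearance is not confirmed.

    pcr_tests: list of (date, is_positive) tuples (assumed layout).
    Clearance requires two negative tests after the last positive test; the
    estimated date is the first of those negatives.
    """
    ordered = sorted(pcr_tests, key=lambda t: t[0])
    positive_idx = [i for i, (_, is_pos) in enumerate(ordered) if is_pos]
    if not positive_idx:
        return None
    after_last_positive = ordered[positive_idx[-1] + 1:]
    if len(after_last_positive) < 2:
        return None
    return after_last_positive[0][0]


def summarize_phase(measurements, start, end):
    """Patient-level median, minimum, and maximum of one lab test in [start, end).

    measurements: list of (date, value) tuples (assumed layout).
    """
    values = [v for d, v in measurements if start <= d < end]
    if not values:
        return None
    return {"median": median(values), "min": min(values), "max": max(values)}


# Hypothetical patient: two positive tests followed by two negative tests.
pcr = [
    (date(2020, 5, 1), True),
    (date(2020, 5, 10), True),
    (date(2020, 5, 20), False),
    (date(2020, 5, 22), False),
]
hemoglobin = [
    (date(2020, 5, 2), 11.0),
    (date(2020, 5, 8), 9.5),
    (date(2020, 5, 15), 10.2),
    (date(2020, 6, 1), 12.0),
]
cleared = estimate_clearance_date(pcr)
print(cleared, summarize_phase(hemoglobin, date(2020, 5, 1), cleared))
```

The same `summarize_phase` helper could be applied to the pre-COVID phase by passing the one-year window ending at the first positive test.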
As shown in Table 1 , there were no statistically significant differences between these groups in age, relative cleared date (defined as the time to the first negative SARS-CoV-2 PCR test in a series of two consecutive negative tests after the last positive test), race, ethnicity, or sex. However, we did note that a higher fraction of PCNH cases were male as compared to PCH counterparts (58% vs. 52%). This potential confounding factor was addressed by performing (1) sex-split subgroup analyses (see Tables 5-8) and (2) multivariate logistic regression (see Tables 9-10 and Statistics below). Although hospitalization during index infection was required for inclusion in both the PCH and PCNH cohorts, this criterion does not necessarily ensure comparable severities of index infection. To better assess potential differences in index infection severity, we compared the rates of ICU admission and found this to be significantly higher in the PCH cohort compared to the PCNH cohort (48/104 [46%] vs. 73/278 [26%], p=4.0x10 -3 ; Table 1 ). Further, patients admitted to the ICU during index infection had slightly lower median hemoglobin measurements during the SARS-CoV-2 + phase than patients not admitted to the ICU (cohen's D = -0.27, p=0.01; Figure S8 ). Thus, we considered ICU admission as a potential confounding factor in our analyses. We addressed this by performing (1) subgroup analyses to determine whether differences between the PCH and PCNH cohorts were observed both in patients who were and were not admitted to the ICU (see Figure S5) and (2) multivariate logistic regression (see Tables 9-10 and Statistics below). We observed that patients in the PCH cohort were more likely to experience anemia in both the pre-COVID and SARS-CoV-2 + phases than patients in the PCNH cohort. Because hospitalized patients can experience anemia due to repeated blood draws for laboratory testing, we also considered the number of blood draws per patient as a potential confounding variable. 
To address this, we performed multivariate logistic regression (see Tables 9-10 and Statistics below). We classified patients in a binary fashion for each time window based on whether their lab tests were consistent with the clinical diagnosis of anemia or acute kidney injury. Classifications were defined according to the Mayo Clinic reference ranges for anemia and the KDIGO (Kidney Disease: Improving Global Outcomes) criteria for AKI (Khwaja, 2012) as follows (see also Figure 5A and Figure S4A):
• Anemia (mild, moderate, or severe): for males, hemoglobin < 13.5 g/dL or hematocrit < 38.3%. For females, hemoglobin < 12.0 g/dL or hematocrit < 35.5%. Patient-level median values were considered for the pre-COVID phase, and patient-level minimum values were considered for the SARS-CoV-2 + phase.
• Anemia (moderate or severe): for both males and females, hemoglobin < 10.0 g/dL. Patient-level median values were considered for the pre-COVID phase, and patient-level minimum values were considered for the SARS-CoV-2 + phase.
• Acute kidney injury (stage 1, 2, or 3): increase in serum creatinine by ≥0.3 mg/dL within 48 hours or an increase in serum creatinine to ≥1.5x the baseline value which is known or assumed to have occurred in the prior 7 days. The baseline was defined as the minimum value among all serum creatinine tests for the given patient in the prior 7 days.
• Acute kidney injury (stage 2 or 3): increase in serum creatinine to ≥2x the baseline value which is known or assumed to have occurred in the prior 7 days, or a serum creatinine value of ≥4 mg/dL. The baseline was defined as the minimum value among all serum creatinine tests for the given patient in the prior 7 days.
Using these classifications, we then tested whether the experience of anemia or acute kidney injury during the pre-COVID or SARS-CoV-2 + intervals was associated with subsequent rehospitalization by computing odds ratios and Fisher exact test p-values. 
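The anemia and AKI thresholds listed above can be expressed as two small classifiers. The sketch below is illustrative only: the function names, return labels, and input layout are assumptions, the label "mild" is used here to mean anemia that does not meet the moderate/severe cutoff, and stage 2 and stage 3 AKI are not distinguished because the text groups them together.

```python
from datetime import datetime, timedelta


def classify_anemia(sex, hemoglobin=None, hematocrit=None):
    """Return 'moderate_or_severe', 'mild', or 'none' from g/dL and % values."""
    if hemoglobin is not None and hemoglobin < 10.0:
        return "moderate_or_severe"
    hgb_cut, hct_cut = (13.5, 38.3) if sex == "male" else (12.0, 35.5)
    low_hgb = hemoglobin is not None and hemoglobin < hgb_cut
    low_hct = hematocrit is not None and hematocrit < hct_cut
    return "mild" if (low_hgb or low_hct) else "none"


def max_aki_stage(creatinine):
    """Return 0 (no AKI), 1 (stage 1), or 2 (stage 2 or 3).

    creatinine: list of (datetime, mg/dL) tuples (assumed layout). The baseline
    for each measurement is the minimum creatinine in the prior 7 days.
    """
    ordered = sorted(creatinine, key=lambda m: m[0])
    stage = 0
    for i, (when, value) in enumerate(ordered):
        if value >= 4.0:
            stage = max(stage, 2)
        earlier = ordered[:i]
        prior_7d = [v for t, v in earlier if when - t <= timedelta(days=7)]
        prior_48h = [v for t, v in earlier if when - t <= timedelta(hours=48)]
        if prior_7d:
            baseline = min(prior_7d)
            if value >= 2.0 * baseline:
                stage = max(stage, 2)
            elif value >= 1.5 * baseline:
                stage = max(stage, 1)
        # Absolute rise of at least 0.3 mg/dL within 48 hours.
        if prior_48h and value - min(prior_48h) >= 0.3:
            stage = max(stage, 1)
    return stage


t0 = datetime(2020, 5, 1, 8, 0)
print(classify_anemia("female", hemoglobin=11.5, hematocrit=36.0))
print(max_aki_stage([(t0, 1.0), (t0 + timedelta(hours=24), 1.4)]))
```

In keeping with the text, the hemoglobin and hematocrit passed to `classify_anemia` would be the patient-level median for the pre-COVID phase and the patient-level minimum for the SARS-CoV-2 + phase.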
To test whether trends in anemia-related measurements could be explained by differences in the number of blood draws received in the pre-COVID phase or SARS-CoV-2+ phase, we counted the number of blood draws in these time intervals for each patient. All tests with a documented source of "Blood", "Plasma", or "Serum" were first collected for each patient. For a given patient on a given day, we then took the count of the most frequently obtained test as the number of blood draws for that patient on that day. For example, if the record for Patient P on Day D contained 5 serum sodium measurements, 3 hemoglobin measurements, and 1 plasma IL-6 measurement, then we inferred that Patient P received 5 blood draws on Day D.
To assess whether the observed association between anemia and rehospitalization is specific to COVID-19, we repeated a subset of our analyses on a cohort of influenza patients. We identified patients in the Mayo Clinic health system who were hospitalized between seven days prior to and seven days after a positive influenza diagnostic test in any year between 2003 and 2019. This hospitalization within one week of an influenza diagnosis was defined as the "index hospitalization." The group of hospitalized influenza patients was split into two cohorts based on whether they were rehospitalized within 30 days of discharge from their index hospitalization. We defined two time periods for each patient: (i) the pre-Flu phase, defined as the one year prior to the first positive influenza test for a given patient; and (ii) the Flu-positive phase, defined as the duration of the index hospitalization. Each patient was classified in a binary fashion for each time window based on whether their lab tests were consistent with the clinical diagnosis of anemia, as described previously for our COVID-19 analyses.
Using these classifications, we then tested whether the experience of anemia during the pre-Flu or Flu-positive phases was associated with subsequent rehospitalization by computing odds ratios and Fisher exact test p-values.
Laboratory values were assessed within each time interval as patient-wise medians, minima, or maxima. To perform statistical comparisons between the PCH and PCNH cohorts, one-sided Mann-Whitney U tests and Cohen's D were applied to continuous outcome measures, generating a p-value and an effect size measurement. The distributions of patient-wise median, minimum, and maximum values obtained for each laboratory measurement among this cohort were assessed with a Kolmogorov-Smirnov (KS) Test of Normality (Figures S9-S11). As these measurements did not follow a normal distribution (KS Test p-value < 0.05), the non-parametric Mann-Whitney U test was chosen for statistical comparisons. A one-sided test was used because these comparisons were performed as a follow-up to our previous EHR-based analysis, which found a higher prevalence of anemia and kidney injury in the PCH cohort (Pawlowski et al., 2021), providing a presupposed direction of change for each tested laboratory measurement. For each set of comparisons performed, p-values were corrected using a Benjamini-Hochberg (BH) correction for multiple hypothesis testing. Differences were considered statistically significant and biologically relevant if the BH-corrected p-value was ≤ 0.05 and the Cohen's D magnitude was ≥ 0.4.
To assess categorical outcome measures (i.e., contingency tables of anemia or AKI status versus rehospitalization status), we computed odds ratios and Fisher exact test p-values. All of the tests described above were applied using the SciPy package v1.6.3 (Virtanen et al., 2020) in Python (version 3.5). Code corresponding to these analyses is provided in Data S1.
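The effect-size and multiple-testing steps above can be illustrated with a short standard-library sketch. It is not the authors' implementation (see Data S1): the pooled-standard-deviation form of Cohen's D is an assumption, since the text does not state which variant was used, and all function names are hypothetical. The raw p-values fed into the correction would come from the one-sided Mann-Whitney U test, for example via `scipy.stats.mannwhitneyu` with its `alternative` argument.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Cohen's D using a pooled standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


def benjamini_hochberg(p_values):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def is_significant_and_relevant(adjusted_p, effect_size):
    """Thresholds stated in the text: adjusted p <= 0.05 and |D| >= 0.4."""
    return adjusted_p <= 0.05 and abs(effect_size) >= 0.4
```

Requiring both a corrected p-value and a minimum effect size guards against flagging differences that are statistically detectable in a large cohort but too small to matter clinically.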
To address potential confounding variables that may be related to the observed trends in laboratory measurements, we performed multivariate logistic regressions for both the pre-COVID and SARS-CoV-2+ phases (see Tables 9-10). For each regression, the binary dependent variable was defined as post viral clearance rehospitalization status (i.e., assignment to the PCH cohort vs. PCNH cohort), and the independent variables included one anemia metric and one renal function metric along with sex (binary), age (continuous), number of blood draws (continuous) in the given time interval, and ICU admission status during that time interval (binary). Stated explicitly, the logistic regression equation was as follows:
$\log\left(\frac{P_{\mathrm{PCH}}}{1 - P_{\mathrm{PCH}}}\right) = \beta_0 + \beta_1 (\text{Anemia Metric}) + \beta_2 (\text{Renal Function Metric}) + \beta_3 (\text{Sex}) + \beta_4 (\text{Age}) + \beta_5 (\text{Blood Draw Count}) + \beta_6 (\text{ICU Admission Status})$
In one set of logistic regressions, the anemia and renal function metrics employed were minimum hemoglobin (continuous) and maximum BUN (continuous), respectively (see Table 9). In another set, the anemia and renal function metrics employed were diagnosis status for moderate or severe anemia (binary) and diagnosis status for stage 2+ AKI (binary), respectively (see Table 10). Binary variables were assigned as follows: anemia: 0 = no anemia or mild anemia, 1 = moderate or severe anemia; AKI: 0 = no AKI or stage 1 AKI, 1 = stage 2+ AKI; sex: 0 = female, 1 = male; ICU admission during interval: 0 = not admitted to ICU, 1 = admitted to ICU. Of note, data regarding ICU admission status were not available prior to April 2020, so this feature was omitted from the pre-COVID regression analyses.
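To make the regression equation concrete, the sketch below evaluates its right-hand side for one patient and converts the log odds to a probability of belonging to the PCH cohort. It does not fit a model. The coefficient values, the patient record, and all names are hypothetical placeholders, not estimates from Tables 9-10; the binary encodings follow the assignments listed above.

```python
from math import exp

# Predictor names mirror the terms of the regression equation in the text.
PREDICTORS = ("anemia", "renal", "sex", "age", "blood_draws", "icu")


def log_odds(intercept, coefficients, patient):
    """Linear predictor: beta_0 plus the sum of beta_k times each predictor."""
    return intercept + sum(coefficients[name] * patient[name] for name in PREDICTORS)


def probability_pch(intercept, coefficients, patient):
    """Invert the logit to obtain P_PCH."""
    return 1.0 / (1.0 + exp(-log_odds(intercept, coefficients, patient)))


def coefficient_to_odds_ratio(beta):
    """A coefficient is a log odds ratio; exponentiate to get the odds ratio."""
    return exp(beta)


# Hypothetical coefficients, NOT fitted values from the paper.
example_coefficients = {
    "anemia": 0.8, "renal": 0.3, "sex": -0.1,
    "age": 0.01, "blood_draws": 0.02, "icu": 0.5,
}
# Hypothetical patient: moderate/severe anemia, no stage 2+ AKI, male,
# 60 years old, 10 blood draws, not admitted to the ICU.
example_patient = {
    "anemia": 1, "renal": 0, "sex": 1,
    "age": 60, "blood_draws": 10, "icu": 0,
}
example_probability = probability_pch(-2.0, example_coefficients, example_patient)
```

For the pre-COVID regressions, where ICU admission status was unavailable, the ICU term would simply be dropped from the predictor list.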
For each time interval, we performed separate regression analyses for the continuous and binarized anemia and AKI terms rather than including all terms in one model because these metrics are strongly correlated with each other, and multicollinearity of independent variables can negatively impact the estimation of logistic regression coefficients. Each regression yielded a coefficient (log odds ratios) and a p-value (calculated using the log likelihood ratio test) for each independent variable. P-values were adjusted within the output of each model using the
Rehospitalized patients have lower hemoglobin before and during SARS-CoV-2 infection
Rehospitalized patients are more likely to experience anemia during active infection
We thank Murali Aravamudan for his thoughtful review and feedback on this manuscript. We also thank Andrew Danielsen, Jason Ross, Jeff Anderson, and Sankar Ardhanari for their support that enabled the rapid completion of this study. This study was funded by nference.