key: cord-0699679-g66wjtk9
authors: Liu, Can; Martins, Andrew J.; Lau, William W.; Rachmaninoff, Nicholas; Chen, Jinguo; Imberti, Luisa; Mostaghimi, Darius; Fink, Danielle L.; Burbelo, Peter D.; Dobbs, Kerry; Delmonte, Ottavia M.; Bansal, Neha; Failla, Laura; Sottini, Alessandra; Quiros-Roldan, Eugenia; Lee Han, Kyu; Sellers, Brian A.; Cheung, Foo; Sparks, Rachel; Chun, Tae-Wook; Moir, Susan; Lionakis, Michail S.; Rossi, Camillo; Su, Helen C.; Kuhns, Douglas B.; Cohen, Jeffrey I.; Notarangelo, Luigi D.; Tsang, John S.; Abers, Michael S.; Apps, Richard; Bosticardo, Marita; Milanez-Almeida, Pedro; Mulè, Matthew P.; Shaw, Elana; Zhang, Yu; Castelli, Francesco; Muiesan, Maria Lorenza; Tomasoni, Gabriele; Scolari, Francesco; Tucci, Alessandra
title: Time-resolved Systems Immunology Reveals a Late Juncture Linked to Fatal COVID-19
date: 2021-02-10
journal: Cell
DOI: 10.1016/j.cell.2021.02.018
sha: da6e624c3b450bcf0399eecf0479f8336f6fa55a
doc_id: 699679
cord_uid: g66wjtk9

COVID-19 exhibits extensive patient-to-patient heterogeneity. To link immune response variation to disease severity and outcome over time, we longitudinally assessed circulating proteins as well as 188 surface protein markers, transcriptome, and T-cell receptor sequence simultaneously in single peripheral immune cells from COVID-19 patients. Conditional-independence network analysis revealed primary correlates of disease severity, including gene expression signatures of apoptosis in plasmacytoid dendritic cells and attenuated inflammation but increased fatty acid metabolism in CD56dimCD16hi NK cells linked positively to circulating IL-15. CD8+ T cell activation was apparent without signs of exhaustion. While cellular inflammation was depressed in severe patients early after hospitalization, it became elevated by days 17-23 post symptom onset, suggestive of a late wave of inflammatory responses. Furthermore, circulating protein trajectories at this time were divergent between and predictive of recovery-fatal outcomes. Our findings stress the importance of timing in the analysis, clinical monitoring, and therapeutic intervention of COVID-19.

COVID-19 is a potentially fatal disease caused by infection with the coronavirus SARS-CoV-2 (Zhou et al., 2020b) , which has been fueling a global pandemic since December 2019 with more than 98 million confirmed cases and 2.1 million deaths to date (https://www.who.int/emergencies/diseases/novelcoronavirus-2019). The clinical course of COVID-19 exhibits extensive heterogeneity: the infection can result in little to no symptoms, or mild disease with complete recovery, but also critical illness, hospitalization, and progression to Acute Respiratory Distress Syndrome (ARDS), tissue damage, and death (Wu and McGoogan, 2020) . Variables such as age and host genetic variations [e.g., TLR7 in men (van der Made et al., 2020) or genes involved in type I Interferon (IFN) production (Pairo-Castineira et al., 2020; Zhang et al., 2020c) ], presence of autoantibodies (Bastard et al., 2020; , and preexisting conditions, including obesity, diabetes mellitus, and cardiovascular disease have been identified as risk factors for severe disease and poor outcomes (CDC COVID-19 Response Team, 2020; Williamson et al., 2020) . The research community has rapidly mobilized to investigate how human immune response variations can contribute to disease severity, progression, and outcome (reviewed in (Grigoryan and Pulendran, 2020) ). Understanding the mechanisms underlying immune response dynamics and differences across patients is critical for designing effective therapeutics, prognostic tools, and prevention strategies better tailored to individual patients.

The immune system protects the host but can also drive life-threatening inflammatory pathologies (Matheson and Lehner, 2020; Tay et al., 2020) . Human immune responses to SARS-CoV-2 infection are highly dynamic; most of the data thus far came from studies of ho spitalized patients following symptom onset (Arunachalam et al., 2020; Laing et al., 2020; Lucas et al., 2020; Mathew et al., 2020; Moderbacher et al., 2020; Schulte-Schrepping et al., 2020; Su et al., 2020) . COVID-19 patients are broadly characterized by elevation in neutrophils and monocytes concomitant with lymphopenia in blood, particularly with decreases in CD4 + and CD8 + T cells (Lucas et al., 2020; Mathew et al., 2020) , including subsets such as γδ T cells (Laing et al., 2020) . However, consistent with ongoing adaptive responses, increases in the frequency of activated and cycling CD4 + and CD8 + T cells as well as activated B cells and plasmablasts have been detected (Laing et al., 2020; Mathew et al., 2020) . Accordingly, robust humoral responses have been observed with marked elevation of antibodies against SARS-CoV-2 proteins, such as the spike and nucleocapsid (Burbelo et al., 2020; Ju et al., 2020; Laing et al., 2020; Long et al., 2020; Zheng et al., 2020) . However, lack of coordination among and delay in antigen-specific T and B cell responses tend to mark severe disease (Moderbacher et al., 2020) . The level of proinflammatory cytokines such as IL-6, IL-8, IL-18, and TNF-α are often elevated in COVID-19 patients, and higher levels of IL-6 has been found to be predictive of poor outcomes in several studies Mandel et al., 2020) ; TNF-α, IP-10 (CXCL-10), and IL-10 levels have also been shown to carry prognostic information (Abers et al., 2021; Del Valle et al., 2020; Laing et al., 2020; Lucas et al., 2020) .

Innate immunity shows signs of dysregulation in COVID-19 . Early reports noted the low levels of type I Interferon (IFN) activity detected in peripheral blood or lungs of COVID-19 patients (Acharya et al., 2020; Blanco-Melo et al., 2020; Hadjadj et al., 2020) , suggesting a potential defect in viral defense. However, later reports found type I IFN transcriptional signatures (IFN-I signatures) in cells from bronchoalveolar lavage (BAL) fluids or peripheral blood of patients (Arunachalam et al., 2020; Liao et al., 2020; Schulte-Schrepping et al., 2020) . The levels of both circulating IFN-α proteins and IFN-I signature in blood were negatively associated with the time post symptom onset and more mildly, with disease severity (Arunachalam et al., 2020; Hadjadj et al., 2020) , suggesting that timing plays an important role in both the induction and experimental detection of type I IFN activities. In addition to the genetic evidence linking the type I IFN pathway to severe COVID-19 mentioned above, auto-antibodies against type I IFN have been found in some patients with life-threatening disease (Bastard et al., 2020; , further implicating the importance of type I IFN response to protection against severe disease. Aside from infected epithelial cells, plasmacytoid dendritic cells (pDC) are another major producer of type I IFNs in response to viral infections. The frequency of pDCs in peripheral blood has been found to be lower in COVID-19 patients than healthy controls (HCs) and negatively correlated with disease severity (Arunachalam et al., 2020; Kuri-Cervantes et al., 2020; Laing et al., 2020) , but substantial changes in pDC levels have not been found in BAL fluids or lung autopsies (Chua et al., 2020; Liao et al., 2020; Nienhold et al., 2020; Wauters et al., 2020) . In addition, cell surface levels of HLA-DR have also been found to be lower in classical monocytes (Giamarellos-Bourboulis et al., 2020; Moratto et al., 2020; Schulte-Schrepping et al., 2020) and myeloid DCs (Arunachalam et al., 2020) , suggesting that patients undergoing active inflammation and immune responses to SARS-CoV-2 have attenuated innate immune functions.

Given the importance of timing (Matheson and Lehner, 2020) and the involvement of diverse immune cells with evolving states in COVID-19, here we used multimodal single cell profiling and computational approaches to longitudinally assess hospitalized COVID-19 patients. In addition to providing a time-resolved single cell atlas of COVID-19 for further integration with other single cell datasets (Schulte-Schrepping et al., 2020; Su et al., 2020; Unterman et al., 2020; Wilk et al., 2020; Yao et al., 2020; Zhang et al., 2020b) , we revealed a network of dynamically evolving cell-type specific signatures linked to disease severity. Strikingly, our analyses uncovered a late time window during which the host immune response undergoes divergences correlated with disease severity and predictive of fatal outcomes. Our results raise the prospect of more personalized intervention strategies and stress the importance of timing in the analysis and clinical monitoring of COVID-19.

Peripheral blood mononuclear cells (PBMC) were obtained from 33 hospitalized patients from Brescia, Italy (Figures 1A and S1A) and 14 age-and gender-matched HCs (see Methods). The patients were classified as moderate (n=3), severe (n=5), and critical (n=25 with 4 deceased during hospitalization) based on the National Institutes of Health guidelines (https://www.covid19treatmentguidelines.nih.gov/overview/management-of-covid-19/) (Table S1 ). This distribution of disease severity is reflective of the early pandemic in Northern Italy when only the most severe patients were hospitalized due to capacity limitations. Longitudinal samples were obtained at up to four time-points from each patient [Time 0 (T0) to Time 3 (T3)] that covered a range of times post symptom onset, thus providing a unique opportunity for linking immune state variations at T0 (at or near the time of hospital admission) to disease status and severity and reconstructing temporal immune response trajectories by pooling information across patients ( Figure 1A ).

Peripheral single immune cells were profiled by CITE-seq (Stoeckius et al., 2017) , simultaneously measuring: 1) the expression of 188 surface protein markers (plus 4 isotype control antibodies) (Table  S2) ; 2) the mRNA transcriptome; and 3) B-and T-cell receptor (BCR/TCR) sequences of the variable/diversity/joining (V(D)J) region. In addition to total PBMCs, we sorted non-naïve B and T cells to increase their representation for T/B cell clonality and repertoire analysis. We integrated T cell clonality data here, while the corresponding B cell data will be reported separately. The data were analyzed at both the single cell and "pseudobulk" levels -the latter capturing the average gene expression profiles of cell clusters and states while mitigating single-cell mRNA measurement noise (Kotliarov et al., 2020; Soneson and Robinson, 2018) . In addition to multimodal single cell profiling, we have obtained and, in our analyses below, integrated blood cell count data and serum protein profiles (Abers et al., 2021) covering 63 circulating proteins including cytokines.

Our analyses focused on the gene expression program within cell clusters (capturing cell types and states). We built mixed effect statistical models to link these parameters to disease status (COVID-19 patients versus HCs), disease severity (after adjusting for the effects of timing; see below), and timing [time since symptom onset (TSO)] after accounting for age and experimental batch ( Figure 1B ). The goal was to first uncover cell type specific gene expression correlates of disease status and severity at T0 (near admission) followed by a dissection of how immune cell states evolve over time. Given the dynamic nature of the immune response to SARS-CoV-2, accounting for and disentangling the effects of timing when analyzing disease severity correlates were particularly important.

Given the large proportion of patients classified as "critical" [or a World Health Organization (WHO) ordinal score of ≥ 5] in our cohort, we first sought to use established clinical, inflammatory, and cell count parameters to develop a quantitative disease severity score; such a metric would allow us to delineate finer shades of disease severity (e.g., breaking ties among patients in the same severity category) and thus better power downstream analyses. Based on the literature (Del Valle et al., 2020; Lucas et al., 2020; Mandel et al., 2020; Schett et al., 2020) , we selected circulating proteins associated with inflammation (TNF-α/β, IL-6, IL-18), IFN responses (IP-10, CXCL9), immune responses (IFN-γ, IL-4, IL-13, IL-17), and clinical markers known to be corelated with COVID-19 severity, namely C-reactive protein (CRP), neutrophil/lymphocyte ratio (NLR), lymphocyte and platelet counts, fibrinogen, D-dimer, lactate dehydrogenase (LDH), and pulse oximetry to fraction of inspired oxygen (SpO2/FiO2) ratio. Excluding three patients with very late initial sample time-points (TSO ≥ 30 days), hierarchical clustering of the T0 patient samples using these parameters revealed several distinct patient profiles ( Figure 1C) , including 1) a profile of the most critical patients characterized by high levels of inflammatory markers but low in TNF-β and lymphocyte counts; 2) the moderate/severe patients with higher IL-17 and lymphocyte counts with either high or low levels of several cytokines; and 3) most of the critical-alive patients with a more mixed profile, consistent with the notion that a finer quantitative delineation of disease severity could be useful to characterize these patients.

Using ordinal partial least square regression and leave-one-out cross validation within the CITEseq cohort, we performed feature selection to identify parameters most predictive of the four severityoutcome categories ( Figure 1D ; SpO2/FiO2 ratio was not used because it was used to define clinical disease severity). Those positively associated with severity include NLR, LDH, IP-10, D-Dimer, IL-6, and IL-13, while lymphocyte count and TNF-β were correlated with less severe disease. These top parameters were then used along with the SpO 2 /FiO 2 ratio to derive a severity score for each patient. As expected, this disease severity metric (DSM), was significantly associated with the severity-outcome categories within our CITE-seq cohort ( Figure 1E ) and also with the WHO ordinal score ( Figure S1B ). This was validated in an independent set of 64 patients from Brescia for whom DSM was computed using the same formula derived from our CITE-seq cohort only ( Figure 1E ). The DSM of the critical-alive patients in both the CITE-seq and the validation cohorts spanned a wide range, from levels comparable to the moderate and severe categories, to those of the critical-deceased patients ( Figure 1E ). As a further indication that DSM can delineate clinically relevant disease severity, DSM was significantly associated with ICU admission status in the validation cohort, even when assessed in the critical-alive patients only ( Figure S1C ). DSM thus provides a quantitative tool to order patients on a continuous severity scale; in our analyses below, we also used DSM to bin patients into discrete balanced groups of lower (DSM-low: DSM<=median DSM) versus higher disease severity (DSM-high: DSM>median DSM).

Given the interpretability of immune cell subsets identified using surface protein markers , we used the 188 surface protein profile to cluster ~400,000 single cells and derived 30 coarse-resolution cell clusters (Table S3) Tables S2 and S3 ]. Additional higher resolution clustering within each coarse level subset identified finer subpopulations, e.g., distinct memory subsets within the CD4 + memory T cell cluster ( Figure 2C ). Manual gating was also used to obtain specific cell populations such as plasmablasts and circulating T follicular helper cells (cTfh). The relative frequency of the major cell populations obtained by CITE-seq is largely concordant with that obtained independently from 27-color flow cytometry ( Figure S2A ). Furthermore, after accounting for TSO and age, we confirmed several previously reported associations between peripheral immune cell frequencies and disease status or severity near hospital admission (T0) (Figures S2B and S2C) , including the negative correlation between severity (here captured by DSM) and the frequency of pDCs, CD8 + central memory T cells, and non-classical monocytes (Arunachalam et al., 2020; Grigoryan and Pulendran, 2020; Kuri-Cervantes et al., 2020; Laing et al., 2020; Mathew et al., 2020; Schulte-Schrepping et al., 2020) .

Due to the relatively lower number of cells obtained per sample by CITE-seq compared to flow cytometry, the frequency of rarer cell subsets showed weaker correlation with these two assays ( Figure  S2A ). However, the power of CITE-seq lies in its ability to assess gene expression within cell subsets defined by surface proteins. As an example, a mRNA-only UMAP visualization of classical monocytes (identified by surface proteins only) shows separation of cells based on disease status, DSM, and to a lesser extent, TSO ( Figures 2D-F) . Surface protein expression also differs across cells with distinct underlying RNA expression; for example, cells expressing CD163 or HLA-DR protein ( Figures 2G-H ) group together and are present in multiple donors ( Figure 2I ). Thus, these data (GEO Accession: GSE161918) constitute a rich multimodal time-resolved single cell dataset for exploring cell-type-specific transcriptomic and pathway signatures of disease severity and their dynamic trajectories.

We systematically searched for transcriptional correlates of disease status (COVID-19 versus HCs), severity (DSM), and time (TSO) within individual cell clusters. Our goal was to first examine correlates at or close to hospital admission (T0) after accounting for TSO differences across patients (Figures 3A and 3B) , followed by incorporation of longitudinal data within patients to reveal the effects of time (TSO) and the temporal evolution of cell type specific immune signatures in DSM-high and DSM-low patients separately ( Figures S3A and S3B) . We fit the expression of each gene within individual cell clusters ( Figure 2A , plus additional manually gated populations) via a model that incorporated the primary effect (COVID-19 versus HCs or DSM and TSO), age, and experimental batch, followed by gene set enrichment analysis (GSEA) (Subramanian et al., 2005) to uncover the biological processes and pathways involved.

IFN-I and early viral response gene signatures from live influenza challenge (Woods et al., 2013) or yellow fever vaccination (Querec et al., 2009 ) (see Methods for signature compilation) were elevated in COVID-19 relative to HCs across myeloid and lymphoid cell lineages ( Figure 3A , Table S4A ). These signatures were negatively associated with DSM (Figures 3B and 3C, Table S4B ): the DSM-high patients tended to have weaker IFN-I signatures than DSM-low patients even after accounting for TSO, although even some of the most critical patients had elevated IFN signatures relative to HCs ( Figure 3C ). IFN-I signatures decreased over time and the extent of the drop was more precipitous in DSM-low patients in most cell types, partly because those patients started off at a higher level ( Figures 3D and 3E ).

Translation and ribosome genes tended to be lower in most cell types with elevated IFN-I signatures (Figures 3A, 3G and S3C -Translation/Ribosome signatures, Table S4A ). Elevated type I IFNs are known to suppress protein translation to limit virus production (Ivashkiv and Donlin, 2014) . Consistently, as IFN-I signatures decreased over time, translation increased ( Figure S3D ). Interestingly, even though IFN-I signatures were negatively associated with DSM in most cell types, translation signatures were not positively associated with DSM in the same cell types ( Figure 3B , Table S4B ), indicating that additional regulation was involved to tune protein translation in COVID-19 patients.

pDC apoptosis is associated with elevated disease severity and reduced pDC frequency pDCs are major producers of type I IFNs and orchestrators of T and B cell responses upon viral infection (Swiecki, 2015) . It has been unclear why peripheral pDC frequencies were decreased in COVID-19 patients and negatively correlated with disease severity (Figures S2B and S2C) (Arunachalam et al., 2020; Kuri-Cervantes et al., 2020; Laing et al., 2020; Lucas et al., 2020) ; a better understanding of this observation could help explain the attenuated IFN-I signatures in the most severe patients ( Figure 3B , Table S4B ).

There is no evidence thus far that blood pDCs migrate into tissues after SARS-CoV-2 infection (Chua et al., 2020; Liao et al., 2020; Nienhold et al., 2020; Wauters et al., 2020) ; pDC infiltration was also not evident in bronchoscopy (Sánchez-Cerrillo et al., 2020) . Alternatively, our analysis revealed that apoptotic gene signatures in pDCs, including BRCA2, CASP3, CASP8, BID, BAK1, and XBP1, were positively J o u r n a l P r e -p r o o f associated with disease status and severity (some of the most critical patients had few or no pDCs, so they were filtered out), as were oxidative stress-induced senescence genes such as FOS, HIST1H2AC, PHC3, MDM4, CBX6, and CDKN2D (Figures 3A, 3B and 3F; Tables S4A and S4B) . We further validated that the pDC apoptosis signature was significantly increased in COVID-19 patients relative to HCs using publicly available single cell RNA-seq data from two independent COVID-19 cohorts from Germany (Figures S3E-G) (Schulte-Schrepping et al., 2020) , although the positive association with disease severity was not as apparent, likely due to weak statistical power given low cell numbers. The apoptotic gene signature score was also negatively correlated with pDC frequency (Pearson r=-0.72, p=0.045), which was positively associated with IFN-I signatures in several cell populations ( Figure 3G ), suggesting that pDC apoptosis might have contributed to lower peripheral pDC numbers, which then led to the depressed IFN-I signatures in cells. We also confirmed that circulating IFN-α2a levels in serum were positively correlated with IFN-I signatures in cells ( Figure 3G ; CD8 + memory T cell shown; similar for other cell types). These analyses further confirmed the negative association between IFN-I and translation signatures within cell types discussed above (Figures 3A and 3G) . Together, these results suggest that pDC apoptosis is a potential culprit of lower pDC frequencies and IFN-I signatures in severe COVID-19.

Thus far, our analyses identified cell-type-specific signatures associated with DSM near hospital admission (T0) after accounting for TSO ( Figure 3B and Table S4B ), but many of them might not reflect primary correlates of disease severity and could have emerged because they correlated with ones that were. Thus, we next applied conditional independence network analysis to infer primary (or direct) correlates of DSM (see Methods). Nodes in the resultant network represent the gene expression state of a biological process in a cell population, except that DSM itself is one of the nodes; two nodes are connected if the correlation between them across patients is statistically independent of their correlations with other nodes in the network.

We found four nodes directly connected to DSM ( Figure 4A ; the rest of the network is captured in Table S5 ). Oxidative stress-induced senescence in pDCs and fatty acid (FA) metabolism in CD56 dim CD16 hi NK cells were inferred as the primary positive correlates of DSM (Figure 4A and Tables  S4B and S5 ). The senescence signal is reflective of the aforementioned pDC apoptosis signature (Figures  3B and 3G, Table S4B ), e.g., the senescence and apoptosis signature scores in pDCs were significantly correlated across patients even though JUN is the only shared leading-edge gene in these two gene sets ( Figure S4A ). Thus, this result further supports that pDC apoptosis was a potential driver of disease severity.

The role of NK cell metabolism in COVID-19 is not known. However, CD56 dim CD16 hi NK cells are a cytolytic subset activated rapidly within hours of infection or after stimulation by cytokines such as IL-15 (Carson et al., 1994; De Maria et al., 2011) . Strikingly, circulating levels of IL-15 were indeed positively correlated with the FA signature score in these NK cells ( Figure 4B ), and this association was DSM dependent: IL-15 was itself correlated with DSM ( Figure 4C ) and IL-15 and the FA signature were no longer correlated once their associations with DSM were statistically removed ( Figure S4B ). Both FA biosynthetic and catabolic/oxidative genes were elevated in the most severe (e.g., the critical-deceased subjects) versus the milder patients, who tended to express these genes at lower levels than HCs ( Figure  4D ). By using single cell RNA-seq data from two independent German COVID-19 cohorts (Schulte-Schrepping et al., 2020) , we confirmed that this FA signature was indeed significantly associated with disease severity (Figures S4C-E) .

Despite the positive association between IL-15 (a proinflammatory cytokine) and the FA signature, the FA signature was negatively correlated with inflammation related processes, including the NF-κB (Figures 4E and 4G; p=0 .004) and IL-1 response signatures (see below; Figures 4F and 4G; see also Table S4A ), as well as with the mTORC1 signature (HALLMARK_mTORC1_signaling) (Pearson r=-0.63; p=0.0008) and the IFNG transcript (Pearson r=-0.46; p=0.02, Figures 4H and 4I ). This negative association between the FA and NF-κB related inflammation signature was also confirmed in the two independent German cohorts ( Figure S4F ) (Schulte-Schrepping et al., 2020) . Thus, CD56 dim CD16 hi NK cells from the most critical patients were residing in a potentially dysfunctional state with attenuated inflammation (relative to less severe cases) and low IFNG transcription despite exposure to increased IL-15, as well as elevation in both FA biosynthesis and oxidation genes. Consistent with this notion, while CD56 dim CD16 hi NK cells are known to produce IFN-γ early (within hours) after cytokine stimulation, IFN-γ production decreases or stops by 16 hours (De Maria et al., 2011) ; prolonged exposure of NK cells to inflammatory cytokines like IL-15 can actually lead to hyporesponsiveness to subsequent stimulation, partly via a loss in FA oxidation (Felices et al., 2018) . Thus, here the increased FA oxidation genes may reflect a compensatory mechanism to counteract this dysfunctional metabolic state, although it is unclear why FA biosynthetic genes were also increased.

Consistently, the two primary negative correlates of DSM, chemokine signaling and IL-1 response signatures in the same subset of NK cells (Figures 4A and 4F) , shared genes with the NF-κB and inflammation signatures, such as RELA, NFKB1B, STAT1, and IL8, although they also contained additional genes such as XCL1 and XCL2, which are chemokines that can be produced and secreted early after NK cell activation following an infection (Dorner et al., 2002) . This inflammatory attenuation in the most severe patients was also found in other cell types including classical monocytes ( Figures S4G and S4H ), as reported earlier by others (Arunachalam et al., 2020; Schulte-Schrepping et al., 2020) . Thus, the increased FA together with the attenuated inflammatory and mTORC1 signatures in the most severe patients might reflect an exhaustion-like NK cell state due to persistent exposure to inflammatory signals including IL-15.

Similar to the IFN-I signatures, the FA signature decreased over time, but here particularly in the DSM-high patients because those had the most elevated expression of the FA genes near T0 (Figures 4J and S3A and S3B) . In contrast, the inflammation signatures increased, including the NF-κB (HALLMARK_TNFα_signaling_via_NF-κB) and IL-1 signatures ( Figure 4K ; Tables S4C and S4D), and the IFNG transcript (data not shown). Consistently, circulating IL-15 levels decreased particularly in the DSMhigh patients ( Figure 4L ). Unlike the IFN-I signatures, however, both the FA and inflammation signatures deviated further from the HCs even though IL-15 dropped over time ( Figures 4J-L) . Thus, while exposure to IL-15 earlier in the disease course might have contributed to the dysregulated phenotypes in these NK cells, a reduction in IL-15 alone seemed insufficient to reset these cells; additional signals, such as those from other inflammatory cytokines, might have contributed as well (see below).

Exogenous corticosteroid use was associated with DSM (p=0.001, F test from a linear model accounting for age, sex, ICU and intubation statuses, and TSO) and thus might have played a role in driving the inflammatory attenuation signature in the CD56 dim CD16 hi NK cells. However, DSM was not associated with a glucocorticoid transcriptional response signature derived from human immune cells (Franco et al., 2019) ( Figure S5A ). Importantly, both this glucocorticoid transcriptional signature or another well-known marker of glucocorticoid or IL-10 exposure [TSC22D3/GILZ transcript (Berrebi et al., 2003; Cannarile et al., 2001) ] was not negatively associated with the NF-κB, inflammatory, and mTORC1 signatures and also not positively correlated with the FA signature ( Figures S5B and S5C) . Although TSC22D3 mRNA level was trending higher in COVID-19 patients than HCs, it was similar between patients on or off corticosteroids ( Figure S5D ). Unexpectedly, the NF-κB and glucocorticoid signatures were positively correlated across patients ( Figure S5B ). These observations together suggest that the glucocorticoid response signature might be driven by endogenous glucocorticoids as a part of negative feedback regulation resulting from earlier inflammatory activation in COVID-19 patients (Jamieson et al., 2010; Newton et al., 2017) . Thus, a "burnt-out"/exhausted, low inflammation, high FA gene expression state in CD56 dim CD16 hi NK cells could contribute to severe COVID-19 independent of exogenous corticosteroid use.

T cell signatures were interestingly not implicated as primary correlates of disease severity. Consistent with an acute viral infection, signs of T cell activation and proliferation such as cell cycle and related metabolic signatures appeared in both CD4 + and CD8 + T cells in COVID-19 patients compared to HCs ( Figures 3A and 3B ). Examining CD8 + T cells further at the single cell level including clonality information (utilizing our simultaneous TCR and gene expression measurements) revealed extensive heterogeneity ( Figures S6A and S6B ), including subsets of activated cells with high clonality that were more abundant in COVID-19 (i.e. cluster 14 in Figures S6A-D) . Similar patterns of heterogeneity were observed in CD4 + T cells, but most CD4 + subclusters were not expanded or less expanded compared to the CD8 + T cells (data not shown). Surprisingly, although clonality in all CD8 + memory T cells was unchanged between HCs and COVID-19 patients at T0, it was significantly higher in the DSM-high than the DSM-low patients after accounting for TSO ( Figure S6E ; p=0.023 from linear model testing the effects of DSM while accounting for age, batch and TSO), suggesting increased diversification (or depletion of clonal cells) in less severe patients relative to those with more severe disease.

Despite signs of activation and proliferation, we did not detect a cluster of "exhausted" CD8 + T cells associated with COVID-19 as suggested earlier (Laing et al., 2020; Zhang et al., 2020b) , based on surface markers such as PD-1 (CD279) (Figures S6B and S6C ). Furthermore, we did not detect association of exhaustion with disease status or severity by using surface markers or surface marker combinations (e.g., CD39 + PD1 + ) (Gupta et al., 2015) in clonally expanded CD8 + memory T cells in our cohort (Figures S6F and S6G) . Assessment using transcriptional signatures of exhaustion (Wherry et al., 2007) in our and in two German cohorts (Schulte-Schrepping et al., 2020) revealed a more complex picture given that these signatures overlap with those of cellular activation and translation/ribosome ( Figures S6H-K) , which were detected earlier as disease status or severity correlates of COVID-19 ( Figures 3A and 3B) . Thus, transcriptional signatures alone are not sufficiently specific to indicate whether the cells were more exhausted or merely more activated. Thus, while we uncovered an exhaustion related metabolic gene signature in CD56 dim CD16 hi NK cells as a primary disease severity correlate ( Figure 4A ), CD8 + T cells did not show signs of exhaustion beyond the transcriptional signatures expected of cellular activation or regulation by type I IFNs (Ivashkiv and Donlin, 2014) , even in patients with critical disease.

As shown earlier, while inflammatory signatures (e.g., HALLMARK_TNFα_signaling_via_NF-κB) were less elevated in the DSM-high patients than DSM-low patients early, they increased over time and stayed elevated compared to HCs as late as day 20 post symptom onset in cell subsets such as CD56 dim CD16 hi NK cells and classical monocytes (Figures 3B, 4K, S3A and S3B; Tables S4C and S4D) . Gene signatures of cytokine-mediated signaling pathways (e.g., IL-4/13 and IL-17 signaling) in classical monocytes also showed significant increases over time, particularly in the DSM-high patients ( Figure S3B ). Consistent with an earlier report (Lucas et al., 2020) , the frequency of classical monocytes and its CD163 hi subset increased in both patient groups over time and peaked around day 20 ( Figure 5A ). The same was true for absolute monocyte and neutrophil counts in blood ( Figure 5B ). Thus, we next further examine the dynamics around day 20, particularly to assess differences between the DSM-high versus -low patients.

We divided samples into three groups by TSO: before day 17 (27 samples), the week around day 20 (days 17-23 -16 samples), and after day 23 (5 samples), then fitted mixed effect models to evaluate changes between the first two periods in cell-type specific gene expression differences between DSMhigh and -low patients ( Figure S7A ; Table S4F ; see Methods). We detected several statistically significant "flips", for example, inflammatory signatures in CD56 dim CD16 hi NK cells and classical monocytes were lower in DSM-high patients before day 17 but rose to be higher than the DSM-low patients by days 17-23 ( Figure 5C ). We then assessed differences between the DSM-high and -low groups during days 17-23 ( Figure 5D ; Table S4E ). In addition to CD56 dim CD16 hi NK cells and classical monocytes, the NF-κB gene signature was elevated in non-classical monocytes, other NK cell subsets, MAIT cells, and memory and naïve B cells in DSM-high than -low patients ( Figure 5D ). The inflammation signature (i.e., the HALLMARK_Inflammatory_response gene set) behaved similarly (Figures 5D and 5E, Table S4E ). Heightened signals of cellular activation and proliferation were also detected, such as cell cycle signatures in CD8 + memory cells ( Figure 5D ; Table S4E ). We also assessed circulating proteins to reveal that several markers of inflammation (e.g., IL-6, soluble TNF receptors 1 and 2) were elevated in the DSM-high versus -low patients ( Figure 5F ; Table S7 ). Thus, this relative late period, or "juncture", in the disease course (days 17-23) was characterized by increased inflammatory divergence associated with disease severity.

We next hypothesized that the immune statuses and trajectories at the juncture period are also divergent between the critically ill patients with distinct survival outcomes ( Figure 6A ). Since our CITEseq cohort only has a few patients with fatal outcomes, we started from the larger Brescia cohort with circulating protein measurements (175 patients; see also (Abers et al., 2021) ) and generated two ageand gender-matched sub-cohorts of critical patients with either recovery (N=21) or fatal (N=17) outcomes. ICU status, intubation, and corticosteroid use were not significantly different between these two groups during days 17-23 (Table S6) .

As seen above ( Figure 5B ), blood monocyte and neutrophil counts also peaked around day 20 in these patients ( Figure S7B ). We found 20 proteins significantly different between the deceased and recovered groups during the days 17-23 juncture period (p<0.05 based on a mixed effect model and the estimated FDR is 16% -see Methods; Table S7 ), including proinflammatory cytokines such as TNF-α, IL-6, and IFN-γ. These were consistent with the DSM-high versus -low comparison ( Figure 5F and Table S7) , with 45/63 proteins showing the same direction of change between the deceased versus recovered patients as that between DSM-high versus -low groups (p=0.0896, Fisher's exact test; Figure S7C ), suggesting that the inflammatory divergences between patients of high vs. low disease severity were qualitatively similar to those between critical patients with distinct survival outcomes. Similarly, a majority of the DE proteins at the juncture had the same direction of change between the deceased and recovered individuals in an independent US cohort (Lucas et al., 2020) ( Figure S7D ).

We next reasoned that if, as suggested by the observations above, the days 17-23 window is an immune response juncture in the disease course, markers of inflammatory and other immune processes would not only differ between the two patient groups at the juncture but also exhibit temporal shifts during or after the juncture period relative to the period before the juncture-i.e., the juncture is a period of change and/or an inflection point orchestrating later changes. We thus searched for proteins that were DE between the patient groups during (days 17-23) or after the juncture (days 24-30 -"postjuncture") and also required that the differences between the deceased and recovered patients changed significantly over time between the pre-juncture (days 7-16) and juncture or between the pre-and postjuncture periods ( Figure 6A ). We found 12 proteins that satisfied these search criteria ( Figure 6B ) [all p<0.05 and overall FDR=14% based on a permutation test (i.e., ~2 proteins are expected to be false positives)] and are predictive of fatal outcomes based on cross-validation machine learning analysis ( Figure 6C ).

These proteins revealed striking divergences between deceased and recovered patients around the juncture ( Figures 6B and 6D ). Several markers of tissue inflammation or damage started to increase more at the juncture in the deceased cohort, such as E-Selectin, a marker of endothelial inflammation/leukocyte transmigration and Lipocalin-2/NGAL, which is known to be associated with acute kidney injury and innate immune response to bacterial infections (Flo et al., 2004; Haase et al., 2009) (Figures 6D) . Consistent with the IFN-I signatures above (Figures 3D-E, and S3A-B), IP-10, an IFNstimulated protein product, was decreasing prior to the juncture with a steeper decline in the recovered cohort to reach lower levels than the deceased group by day 20 ( Figure 6D ). As observed earlier in the DSM-high versus -low patients ( Figure 5F ), markers of inflammation and immune responses (e.g., IL-6, TNFRSF1B, IL-17) were increasing at a higher rate or persisting (e.g., IL-18) at an elevated level in the deceased group ( Figure 6D ). However, antibody responses to both the SARS-CoV-2 spike and nucleocapsid appeared slower in the deceased compared to the recovered cohorts ( Figure 6E ), consistent with the delay and potential miscoordination of antigen-specific adaptive immune responses in the most severe patients (Moderbacher et al., 2020) . Together, these findings further support our hypothesis that the days 17-23 period represents a critical juncture in the disease course characterized by a heightened wave of inflammatory responses in critical patients with fatal outcomes.

Here we integrated circulating cytokine and multimodal single cell profiling and computational approaches to assess the cell surface protein phenotype, transcriptome, and T-cell clonality of peripheral immune cells of COVID-19 patients over time. We revealed a network of cell type specific signatures linked to disease severity, dissected timing effects, and uncovered a late period during which the host immune response undergoes striking divergences between patients with distinct disease severity and outcomes. We validated key observations using public datasets in independent cohorts, including gene signatures of apoptosis in pDCs and FA metabolism and inflammatory attenuation in CD56 dim CD16 hi NK cell linked to disease severity.

Timing is a critical variable when assessing the host immune response to viral infections (Bernardes et al., 2020; Dunning et al., 2018; Huang et al., 2011; Zhang et al., 2020a) . Live influenza challenge studies in humans and animal models have painted a prototypical immune response trajectory involving antiviral/IFN-I signatures in blood a few days after infection concomitant with symptom onset, which peak by days 5-7 and wane shortly after, followed by T cell activation, germinal center reaction, and plasmablast expansion during days ~7-14, cumulating in peak circulating antibody levels by ~3-4 weeks after infection (Brandes et al., 2013; Huang et al., 2011; Pommerenke et al., 2012; Woods et al., 2013) . In contrast, SARS-CoV-2 infections often have asymptomatic upper respiratory infections early (Matheson and Lehner, 2020) , but more than 90% of cases displayed signs of pneumonitis a few days after symptom onset (Bernheim et al., 2020) , suggesting that the patients we are studying here also had asymptomatic infections in the upper airway early (first ~7-10 days after infection), followed by infection and inflammation in the lower airways and lung concomitant with symptom onset. Thus, at T0 (median=11 days post symptom onset), most patients in our study may be in the mid to late phases of epithelial type I interferon production in response to the lung infection Nienhold et al., 2020; Wauters et al., 2020) . Consistent with recent reports (Arunachalam et al., 2020; Hadjadj et al., 2020; Schulte-Schrepping et al., 2020) , we indeed detected increased IFN-I signatures in cells compared to matching HCs, including genes induced early in blood following influenza challenge or immunization with the Yellow Fever vaccine (Querec et al., 2009; Woods et al., 2013) .

As observed earlier (Hadjadj et al., 2020) but here with single cell resolution and after accounting for timing effects, we found that IFN-I signatures were negatively associated with disease severity at T0. Timing contributed substantially to the variability of the IFN-I signature given its steep temporal decline, especially in the less severe patients. Concomitant with the elevation of the IFN-I signature were decreases in protein translation genes; the IFN-I and translation signatures tended to be negatively correlated within cell types, potentially reflective of the suppressive effect of type I IFNs on protein translation to limit viral replication (Ivashkiv and Donlin, 2014) . However, whether a more prolonged suppression (e.g., with exogeneous type I IFN use as an intervention) could negatively impact protein production in cells vital to the immune response (e.g., plasma cells producing antibodies) warrants further investigation.

In addition to infected epithelial cells, pDCs are another potential major source of type I IFNs. Consistent with earlier studies (Arunachalam et al., 2020; Kuri-Cervantes et al., 2020; Laing et al., 2020) , we found that pDC frequencies were lower in COVID-19 patients than HCs and negatively associated with disease severity and IFN-I signatures in most immune cell subsets, suggesting that lower pDC levels might have contributed to the depressed IFN-I signatures in severe disease. Recent studies of lung autopsies and BAL fluids from COVID-19 patients did not indicate an association between pDC frequency and disease severity (Chua et al., 2020; Liao et al., 2020; Nienhold et al., 2020; Wauters et al., 2020) ; while pDCs can also migrate into secondary lymphoid organs via high endothelial venules (Diacovo et al., 2005; Randolph et al., 2008) , data is lacking in SARS-CoV-2 infections. Thus, it is less likely that lower blood pDC frequency was solely due to pDC migration out of the periphery. Instead, we detected an apoptosis signature in COVID-19 pDCs and the same signature was positively associated with disease severity and negatively associated with pDC frequency, thus providing a compelling hypothesis on why pDC frequencies were reduced in blood. Type I IFNs themselves can trigger apoptosis in pDCs based on animal studies (Swiecki et al., 2011) , presumably as a negative feedback to modulate pDC numbers and limit the extent of type I IFN activation. Corticosteroids are also known to induce apoptosis and thus reduce the frequency of circulating pDCs in humans (Boor et al., 2006; Shodell and Siegal, 2001) . Endogenous corticosteroids are induced by the hypothalamic-pituitary-adrenal system following viral infections to mediate negative feedforward regulation to partly limit the damaging effects of overt inflammation (Jamieson et al., 2010; Newton et al., 2017; Ruzek et al., 1997) . Indeed, the more depressed IFN-I signatures in critically ill patients may be driven, at least in part, by the stronger negative regulation induced by more severe infections and disease. These interactions should be taken into consideration when designing therapies that target the type I IFN-pDC axis (Acharya et al., 2020; Hung et al., 2020) .

Our disease severity network pointed to attenuated inflammation and an "exhaustion" like gene expression state in CD56 dim CD16 hi NK cells as a primary positive correlate of disease severity. This signature was positively associated with circulating levels of IL-15, had increased expression of both FA biosynthesis and oxidation genes, but paradoxically, lower IFNG transcripts, lower NF-κB, mTORC1 signaling, and inflammatory gene expression signatures. Long-term exposures (on the time scale of days but not hours) to inflammatory activation are known to induce metabolic changes in NK cells, including the upregulation and utilization of both glycolysis and OXPHOS programs (O'Brien and Finlay, 2019). The role of FA metabolism is not well known, but prolonged exposure to proinflammatory cytokines like IL-15 can induce an "exhaustion" like state linked to altered FA metabolism and hyporesponsiveness to future stimulations (Felices et al., 2018) . Consistent with our data on the lowering of the mTORC1 signature and IFNG mRNA levels, decreased mTORC1 activity in NK cells is known to be associated with lower IFN-γ production (Nandagopal et al., 2014) . The potential increase in lipid biosynthesis could also lead to lipid accumulation, which can downregulate mTORC1 signaling and thus NK cell effector functions (Michelet et al., 2018) . While the elevation in FA oxidation transcripts may reflect a compensatory mechanism to counteract the decreased FA oxidation capacity in NK cells persistently exposed to IL-15 (Felices et al., 2018) , the concomitant increases in both fatty acid biosynthesis and oxidation/catabolic genes may be a signature of dysfunction. While not much is known about NK cells, CD8 + T cells are known to upregulate FA biosynthetic genes briefly early after viral infection while FA oxidative genes appear later during the late effector/early memory phase to support memory cell functions, but not both simultaneously (Best et al., 2013; Windt et al., 2013) . Overall, our findings are consistent with earlier observations of NK cells from COVID-19 patients, including increases in "adaptive" NK cells known to be associated with chronic activation as well as signs of exhaustion suggested by impaired cytolytic functions and inflammatory cytokine production, and elevation of surface markers or corresponding transcripts of NK cell exhaustion (Maucourant et al., 2020; Osman et al., 2020; Varchetta et al., 2020; Wilk et al., 2020) .

The negative feedforward circuits regulating the delicate balance of type I IFN responses and pDC apoptosis discussed above might also play a role in and serve as cell "extrinsic" signals attenuating the inflammatory response in CD56 dim CD16 hi NK cells (and other cells such as classical monocytes) in severe disease. For example, endogenous corticosteroids can induce anti-inflammatory programs in monocytes (Franco et al., 2019) , and so can IL-10 produced by monocytes themselves (or macrophages in tissues) and other cell types such as regulatory T cells following inflammatory activation (Mosser and Edwards, 2008; Neumann et al., 2020; Wynn et al., 2013; Xue et al., 2014) . Indeed, circulating IL-10 levels are known to be elevated in COVID-19 patients (Lucas et al., 2020) ; here IL-10 persisted at a high level particularly in non-survivors but dropped rapidly in survivors (data not shown). IL-10 and glucocorticoids can also increase CD163 expression on the surface of monocytes (Sulahian et al., 2000; Vallelian et al., 2010) , thus might help explain the rise of CD163 + classical monocytes in our patients over time. While exogenous corticosteroid, which is known to lower all-cause mortality in COVID-19 (WHO Rapid Evidence Appraisal for COVID-19 Therapies (REACT) Working Group et al., 2020) , was used by some patients in our cohort and its use correlated with DSM, our assessment using glucocorticoid gene expression signatures suggests that exogeneous corticosteroid use did not play a role in shaping the inflammatory attenuation we observed. Our findings may also mirror some features of sepsis (López-Collazo et al., 2020) : after the initial phase of hyperinflammation, a secondary phase of compensatory responses mediated by negative feedforward circuits can leave the patient immunosuppressed (Bellinvia et al., 2020; Bone, 1996; Rubio et al., 2019; Venet and Monneret, 2018) .

Whether these attenuated inflammatory states were a consequence or cause of severe COVID-19 remains to be determined. Timing is again an important question. If attenuation happened early after the infection before or near symptom onset, it could negatively impact adaptive responses, which actually appeared largely intact in our patients. Consistent with previous reports, for example in (Laing et al., 2020; Mathew et al., 2020) , we observed CD8 + T cell activation and proliferation. However, in contrast to some earlier reports (Laing et al., 2020; Zhang et al., 2020b) , but consistent with a single cell study of severe patients using mRNA expression of exhaustion markers (Wilk et al., 2020) and a recent report that SARS-Cov-2-specific CD8 + PD1 + T cells are not exhausted (Rha et al., 2021) , clear signs of CD8 + T cell exhaustion were not detected. Another report found increased PD-1 and TIM3 positive cells in COVID-19 patients versus controls but not across severity groups (Laing et al., 2020) , thus suggesting that these might reflect T cell activation rather than exhaustion (Mathew et al., 2020) . Together, our observations suggest that the attenuated antigen presentation and inflammatory activation in certain innate immune cells likely occurred later after symptom onset and did not severely impact the initial adaptive response to the infection.

While the most severe patients had attenuated inflammation in CD56 dim CD16 hi NK cells and classical monocytes early, these signatures increased over time and were significantly more elevated than those in less severe patients by days 17-23. Circulating protein analysis confirmed that both severe (DSM-high) patients and non-survivors had elevated inflammatory and immune responses at this juncture compared to those with less severe disease (DSM-low) or critically ill survivors, respectively. This late period is distinct from known clinical junctures earlier in the disease course, e.g., the onset of J o u r n a l P r e -p r o o f ARDS and sepsis around days 10-12 post symptom onset (Zhou et al., 2020a) . However, the underlying driver of those earlier events might still be responsible for some of the divergences by days 17-23.

Secondary infections are potential triggers for this late wave of responses (Cox et al., 2020; Langford et al., 2020; Lansbury et al., 2020) . Day 17 was the reported median onset of coinfections in a large retrospective cohort study (Zhou et al., 2020a) and we observed elevation in potential markers of bacterial infections [e.g., Lipocalin-2/NGAL and LBP (Lipopolysaccharide (LPS) Binding Protein)] around that time. The secondary infection rates reported thus far have been highly variable (from a few to tens of percent) and therefore more data are needed to further assess the prevalence and types of co-infections associated with severe COVID-19 and whether they are triggers of the late inflammatory responses we observed. Given the acute organ injury and tissue damage in severe disease, translocation of commensal bacterial products and other danger signals of tissue origin could also contribute to this late response wave (Estes et al., 2010; Marchetti et al., 2013) . Indeed, the level of bacterial products (LPS and bacterial ribosomal DNA) in plasma has been reported to be associated with disease severity (Arunachalam et al., 2020) .

Intriguingly, blood transcriptomic analyses of hospitalized, severe influenza patients have also revealed a second wave of responses (peaking around day 13 post symptom onset) involving a "bacterial" transcriptional signature that is associated significantly more with disease severity than with detectable bacterial co-infections (Dunning et al., 2018) . Despite the timing differences and lack of information on the cellular origin of this late signature, these observations in influenza are qualitatively similar to ours in COVID-19. Thus, even without secondary infections, contributions from other factors including the lingering viral load and acute tissue injury in the critical patients, may play important roles in driving the responses at the late juncture (Hirsch et al., 2020; Zheng et al., 2020; Zhou et al., 2020c) . The lack of coordination among and delays in antigen-specific humoral and cellular immunity could be partly responsible for lingering viral loads (Moderbacher et al., 2020) , which could also then trigger a second wave of inflammatory responses at the late juncture, particularly as the strength of the negative feedforward regulation from earlier innate responses waned.

Our observations highlight the importance of timing in designing therapeutics strategies for COVID-19. For example, while persistent IL-15 exposure could potentially lead to an exhaustion like phenotype in NK cells (Felices et al., 2018) , an anti-IL-15 block followed by intermittent application of IL-15 (on-off-on) to revive and boost NK cell functions may help to ensure a productive but not overtly damaging response around the late juncture and thereafter. However, such strategies need to be guided by the immune and physiological profiles of the patient, e.g., whether there is an ongoing co-infection and the status of the patient around the critical juncture. Indeed, personalized intervention and clinical monitoring strategies hold promise to reduce the burden of COVID-19.

The relatively small size of our cohort of mostly severe and critical patients, sparse sampling of very early time-points (e.g., first week of symptom onset), and potential confounding factors such as therapeutic interventions in a natural history study are key limitations. Thus, further studies using larger patient cohorts with better longitudinal sampling will be helpful to further confirm and dissect our observations. Our longitudinal sampling within individual subjects was sparse; we thus borrowed information across subjects to infer trajectories and performed comparative analyses between patient groups. Denser sampling within individual subjects would be needed to dissect how earlier events may drive, for example, divergences at the late juncture. While we have performed conditional independence analyses to disentangle correlative relationships, we still lack the ability to dissect causality, which would require interventional and animal model studies. Methodologically, our single cell data does not directly measure parameters beyond gene and surface protein expression and variable lymphocyte receptor sequences. Thus, pathway activity and cellular parameters including metabolism were assessed using gene expression signatures as noted throughout. Direct measurements of those parameters would require further studies. See also Methods for limitations of antibody staining and surface protein quantification in CITE-seq.

J o u r n a l P r e -p r o o f ACKNOWLEDGMENTS We thank the NIAID OCICB (Office of Cyber Infrastructure and Computational Biology) and its Bioinformatics and Computational Biosciences (Contract HHSN316201300006W/HHSN27200002 to MSC, Inc) and Operations Engineering Branches for developing databases for patient and assay data; the NCI Advanced Biomedical Computational Science for data transformation support. We thank Helen Matthew, Sarah Webber, and Wade Green for logistical/regulatory assistance; the NCI CCR Sequencing Facility (CCR-SF) for sequencing support; the NIAID OCICB Locus team for high performance computing support; Iyadh Douagi, Thomas Moyer and Melanie Cohen from the NIAID Flow Cytometry Section for supporting BSL3 cell sorting experiments; Philip Johnson (University of Maryland) for discussions regarding T cell clonality analysis; Rohit Farmer for code review; Ronald Germain and Pamela Schwartzberg for comments on the manuscript. Illustrations in Figures 1B and 6A were created using BioRender.com. This research was supported by the Intramural Research Programs of the NIAID and NIDCR, Intramural Programs of the NIH Institutes supporting the NIH Center for Human Immunology, Regione Lombardia (project "Immune response in patients with COVID-19 and co-morbidities"), and federal funds from the National Cancer Institute, National Institutes of Health, under Contract No. HHSN261200800001E. The content of this publication does not necessarily reflect the views or policies of the Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.

JST and LDN conceived the study; AJM, CL, JST designed CITE-seq experiments with inputs from WWL, DM, JC, and SM; CL, JC, DM generated CITE-seq data with help from NB; CL, AJM, WWL, NR, and JST designed data analysis strategies and approaches; CL, AJM, WWL, NR performed data analysis and generated figures with help from FC; WWL organized sample meta-data with help from the NIAID COVID Consortium; KLH and BAS generated flow cytometry data; NIAID COVID Consortium provided logistical and infrastructure and analysis support; LI, AS, EQR and COVID Clinicians cared for patients, collected samples, and clinical data; OMD, LI, LF, AS, EQR, MSL, HCS, CR, and LDN clinically characterized patients with help from NIAID COVID Consortium and COVID Clinicians; LI, AS, KD, NB processed and handled patient samples; RS, LF, NB, TWC, SM contributed healthy control samples; DLF, DBK, MSL, LDN contributed circulating protein data; PDB and JIC contributed antibody data; CL, AJM, WWL, NR, and JST interpreted the data; JST drafted the manuscript; CL, AJM, WWL, NR and JST finalized the manuscript with inputs from other authors; JST supervised the study.

The authors declare no competing interests. (A) COVID-19 patient cohort overview and sample collection timeline. The distribution of sample collection timing since symptom onset (TSO) is summarized on the top while the sampling for each of the 33 donors (age and gender indicated in parentheses) is plotted below indicating individual T0 (first) up to T3 (fourth) timepoints. One patient (P098) with unknown TSO is depicted here by using his hospital admission date as the reference. Some samples were excluded from analysis after initial quality control as they did not have a sufficient number of cells (see Methods). (B) Experimental design, analysis questions, and approach. Frozen PBMCs from the collected samples indicated in (A) and age-matched healthy controls were thawed and pooled, followed by combined surface protein/mRNA single cell expression analysis (CITE-seq); data from individual samples were demultiplexed by a combination of SNP-based and 'hashtag' antibody staining (see Methods). After automated clustering (plus additional targeted cell populations identified by manual gating) and cell subset identification based on surface protein profiles, pooled RNA data ('pseudobulk') libraries were generated in silico for each sample per cell cluster/population, and then used for downstream analysis using mixed effect statistical modeling to assess the effect of disease severity and TSO, while controlling for other variables such as TSO (when characterizing the effect of disease severity), age and experimental batch. (C) Heatmap of 18 clinical and serum protein measurements of patients after correction for days since hospital admission. Measurement values are standardized across subjects. Samples are divided into four groups based on unsupervised hierarchical clustering. Patients admitted to the Intensive Care Unit (ICU) during their hospitalization are denoted with an asterisk (*) next to their labels. None of the T0 samples were collected at the ICU. (D) Parameter importance of the fitted coefficient values from partial-least-square (PLS) ordinal regression models of disease severity. Parameters are ordered by their absolute median coefficient values from independent training iterations in leave-one-out cross validation. Error bars indicate standard deviation of coefficient distribution across all iterations. (E) Distribution of patient disease severity metric (DSM) grouped based on the disease severity-outcome classification for the 30 patients with CITE-seq data (left panel, see Methods) and an independent set of 64 patients from Brescia (right panel). The DSM formula was trained using the CITE-seq cohort only. Significance of mean difference between groups is determined by two-tailed Wilcoxon test. *: p <= 0.05, **: p <= 0.01, and ***: p <= 0.001. See also Figure S1 and Table S1. (A) Gene set enrichment analysis (GSEA) result of COVID-19 versus HCs at T0. Selected gene sets (rowssee Methods) are grouped into functional/pathway categories. Differential expression moderated t statistics were computed from a linear model accounting for age, and experimental batch (TSO not accounted for as TSO does not exist for healthy individuals). Dot color denotes normalized gene set enrichment score (NES) and size denotes -log10(adjusted P value). P values were adjusted using the Benjamini-Hochberg method. The sign of the NES corresponds to increase/decrease of change in COVID-19 compared to HCs. Cell populations (columns) are grouped by lineage/type. See Table S4A for detailed results of all gene sets which passed the adjusted P value cutoff of 0.2. (B) Similar to (A) but the enrichment analysis was performed based on association with DSM at T0, and the model controlled for TSO. The sign of the NES score denotes the sign of the association with DSM. See also Table S4B . (C) Heatmap of type I IFN response in classical monocytes. Heatmap showing the scaled average mRNA expression (within the indicated cell cluster/population for a given subject) of overlapping/shared leading-edge (LE) genes from the GSEA analysis of COVID-19 versus HCs (see (A)) and association with DSM (see (B)). Selected top LE genes are labeled by gene symbol and only the sample from the first timepoint for each patient (T0) is shown. For gene expression heatmaps, subjects are grouped by HC and DSM classes; also shown are the DSM value and severity-outcome classification of each patient (top of the heatmaps). Subjects with a small number of cells not included in the GSEA test are marked by a # (# cell number < 8). (D) Per-sample gene set signature scores of the GO_Response to type I IFN gene set vs. TSO (in days) in DSM-low and DS-high groups. Classical monocyte is shown as an example. Individual samples are shown as dots and longitudinal samples from the same individual are connected by gray lines. Trajectories (using loess smoothing -see Methods) were fitted to the groups separately with the shaded areas representing standard error. The median score of age-and gender-matched HCs is shown as a green dotted line. Gene set scores were calculated using the union of LE genes from both the timing (TSO) and disease severity (DSM) association models using Gene Set Variation Analysis (See Methods). (E) Scatter plot comparing the effect size of association between TSO and the GO_Response to type I IFN gene set in the DSM-high (y axis) and DSM-low (x axis) patients. Each dot corresponds to a cell type/cluster. The effect size reflects the change of type I IFN signature enrichment with time (effect size < 0 corresponds to the enrichment decreasing with time). Cell types are colored by their statistical significance (adjusted p-value < 0.05) of whether the association with time is significantly different between the two DSM groups (based on a model with an DSM-TSO interaction term -see Methods). The area framed by blue borders corresponds to cell types with more precipitous drop of IFN signature over time in the DSM-low than -high groups. (F) Heatmap of apoptosis/cell death signature in pDCs. Similar to (C), but instead of the shared LE genes, all LE genes from COVID-19 versus HCs and association with DSM are shown. Shared LE genes and genes in the REACTOME_Oxidative stress induced senescence gene set are annotated on the right. (G) Heatmap showing the sample-level pairwise Pearson correlations among serum IFN-α2a level, pDC frequency, the apoptosis signature score in pDCs, as well as the IFN-I and protein translation signature scores in classical monocyte, CD56 dim CD16 hi NK, and CD8 memory T cell clusters (* p value < 0.05). Selected scatter plots are shown and the corresponding entry in the heatmap is indicated using bold borders. Samples were filtered to remove those with fewer than 8 cells or less than 40,000 unique molecular identifiers (UMI) in the pseudobulk library. Note that not all subjects had IFN-α2a measurements. Each dot indicates a subject and is colored by the severityoutcome category. See also Figure S3 and Table S4 . Edge value and width indicate Spearman correlation and its statistical significance, respectively, between DSM and the gene set signature score. In the legend: % leading-edge (LE) genes denotes the proportion of genes from the gene set that are in the LE based on gene set enrichment analysis. The top three LE genes based on effect size (association with DSM; see Methods) and other selected genes representative of the biology are shown for each gene set. (B and C, G and I) Scatter plots showing the correlation of circulating IL-15 level versus REACTOME_Fatty acid metabolism signature score (B) and DSM (C); REACTOME_Fatty acid metabolism score versus HALLMARK_TNFα signaling via NF-κB score and GO_Cellular response to IL-1 score (G); REACTOME_Fatty acid metabolism score versus HALLMARK_mTORC1 signaling score and normalized IFNG mRNA expression (I). Pearson correlation (R) and associated p value were computed using pairwise complete observations (see Methods). Each dot indicates a subject and is colored by the severity-outcome category. (D) Heatmap of REACTOME_Fatty acid metabolism LE genes from the GSEA analysis of DSM association in CD56 dim CD16 hi NK cells at T0 (see Figure 3B ). LE genes are grouped based on annotations obtained from Gene Ontology. Genes with an asterisk (*) are those with both biosynthetic and catabolic annotations. Genes in red are those in the fatty acid oxidation set. (E and F) Similar to (D), heatmaps of inflammation related gene sets in CD56 dim CD16 hi NK cells at T0: HALLMARK_TNFα signaling via NF-κB (E), GO_Cellular response to IL1 and KEGG_Chemokine signaling pathway (see (A)) (F). Heatmaps showing the scaled average mRNA expression (within the indicated cell cluster/population for a given subject) of LE genes from the GSEA analysis of DSM association in CD56 dim CD16 hi NK cells (see Figure 3B (A) Time course of monocyte subset frequencies in DSM-low and DSM-high groups. Classical monocyte is expressed as fraction of total PBMCs; the CD163 hi classical monocyte subset is expressed as a fraction of classical monocytes. The median cell frequency of age-and gender-matched HCs is shown as a green dotted line. Individual samples are shown as dots and longitudinal samples from the same individual are connected by gray lines. P-values shown are from mixed effect linear models indicating the statistical significance of the timing effect (i.e., TSO) accounting for age and experimental batch in DSM-low and DSM-high groups, respectively. Trajectories (using loess smoothing) were fitted to DSM-low and DSMhigh groups separately. TSO = days 17-23 period is highlighted with purple. (B) Similar to (A) but showing the absolute blood neutrophil and monocyte counts. The two green dotted lines mark the approximate reference range of cell counts in healthy adults. The shaded areas around trajectories denote standard error. (C) Effect size (normalized enrichment score from GSEA) comparison of the period before day 17 (TSO < day 17, green) and during the TSO = days 17-23 period (purple) for inflammatory related gene sets. Effect sizes correspond to differences between DSM-high vs. -low groups, e.g., effect size < 0 corresponds to the gene signature being less enriched in the DSM-high group than DSM-low group. P values shown are FDR adjusted from the test reflecting the temporal changes in the difference between DSM-high and -low groups from before the day 17 period to the days 17-23 time window (see Methods). (D) Similar to Figure 3A , but here showing GSEA results for assessing differences between the DSM-high versus DSM-low groups using only samples from days 17-23 since symptom onset. See Table S4E for detailed results of these selected gene sets and all sets that passed the adjusted P value cutoff of 0.2. (E) Time course of gene set signature scores of inflammatory related gene sets in DSM-low and -high patient groups in CD56 dim CD16 hi NK cells and classical monocytes. The gene set signature scores were calculated using the LE genes identified from the enrichment analysis shown in (C) above to highlight differences between the DSM-high versus -low groups during the days 17-23 period. The median score of age-and gender-matched HCs is shown as a green dotted line. (F) Time course of serum protein levels from DSM-low and -high patients respectively. Similar to (E). Top 4 serum proteins significantly different between DSM-high versus DSM-low groups during the days 17-23 period are shown. See Table S6 for the full list of differentially expressed proteins. See also Figure S7 , Tables S4 and S6. (A) Approach for assessing and validating the late immune response juncture hypothesis by using serum protein profiles of critical ill patients with either recovery or deceased outcomes. (B) Effect size plots of circulating serum proteins comparing the difference between critical deceased versus recovered patients before (days 7-16), during (days 17-23), and after (days 24-30) the juncture period. Mixed effect models were fitted to assess whether this difference between deceased and recovered groups changed significantly between 1) pre-juncture and juncture (top panel showing effect sizes in each period and its right bar plots showing the p value assessing the temporal difference) or 2) pre-juncture and after the juncture (bottom panel). The size of the circle denotes the statistical significance of the difference between deceased and recovered groups in the indicated period (p<0.05 is marked by solid outlines). See also Table S7 . (C) Outcome prediction performance (recovered vs. fatal) at (17-23 days; purple) or post (24-30 days; blue) juncture using leave-one-out cross-validation. Feature selection was performed using the same procedure that identified proteins shown in (B). Area under the curve (AUC) and permutation p-values are shown. (D) Similar to Figure 6F but showing serum protein levels of critical ill patients with recovered or deceased outcomes (see (A)). Trajectories (using loess smoothing) were fitted to the recovered versus deceased patient groups separately. Top proteins showing the largest temporal shifts in their differences between the deceased vs. recovered patients in (B) are shown. (E) Similar to (D) but for antibody measurements against SARS-CoV-2 spike and nucleocapsid proteins in critically ill patients with recovered or deceased outcomes. LU = Light Unit. See also Figure S7 , Tables S6 and S7. 

(A) GSEA results for glucocorticoid response signature (see Methods on how the signature was derived) in DSM association model. Positive enrichment corresponds to higher level in the DSM-high samples. (B and C) Scatter plot showing the correlations between the indicated signature scores (computed using GSVA) and the glucocorticoid response signature score (B) or the TSC22D3 mRNA expression level (C) in CD56 dim CD16 hi NK cells. Pearson correlations and p-values are reported for patients with corticosteroid use (orange; steroid use TRUE) and those without (cyan; steroid use FALSE) as well as for all samples (black). Each dot indicates a subject, shaped by steroid use status and colored by the severity-outcome category. (D) TSC22D3 mRNA expression levels of CD56 dim CD16 hi NK cells and classical monocytes in HC, no steroid-use and steroid-use COVID-19 patient groups. P-values from testing for differences between the no steroid-use and steroid-use groups are shown. P-values were obtained from an ANOVA test accounting for DSM, TSO, age and experimental batch.

Heatmap showing the gene expression profile of CD8 + T cell clusters identified based on single-cell mRNA expression of the leading-edge genes of selected pathways presented in Figure 3 (see text and Methods); only COVID-19 T0 samples are shown. Average scaled mRNA expression per cluster of genes (columns) in each of the CD8 + T cell clusters (rows) is shown. Gene memberships in selected gene sets are indicated by the color bars at the bottom of the heatmap. Total number of cells in each cluster is indicated on the right side of the heatmap. Bar plot shows the fraction of cells within each cluster that are defined as expanded (>1 cell detected per CDR3 α-β pair). Two small clusters (<32 cells each) are not shown. (B) Average expression of selected surface proteins in the clusters from (A). (C) Results of linear model accounting for age, and experimental batch, comparing the frequency of CD8 + T cell clusters from (A) between COVID-19 and HC samples. Shown is the difference in mean frequency between COVID-19 and HC (x-axis) versus the -log10 (p value); horizontal line denotes an unadjusted p-value of 0.05. 18 CD8 clusters were tested in linear models. (D) Fraction of overlap between RNA based clustering (from A) and surface protein based CD8 + naive and memory T cell cluster annotations (based on high resolution clustering). (E) CD8 + T cell clonality (median rarefied Simpson index, see Methods) in HC, DSM-low andhigh groups. P-values shown are from linear model adjusted for experimental batch, age, and TSO (green for assessing differences between COVID-19 versus HCs and red for correlation with DSM; TSO only adjusted for in the model assessing DSM association). (F) Coefficients from linear models of mean surface protein expression of canonical exhaustion markers fitted to COVID-19 patients and HCs. Positive coefficients (red bars) correspond to higher expression in COVID-19 versus HCs (above) or higher expression in DSM-high versus -low (below) and vice versa for negative coefficients (green bars). P values are indicated with red indicating significance at the 0.05 level (unadjusted). Among them, only CTLA4 was mildly significant but the association was in the opposite direction expected for exhaustion: its expression was lower in patients, particularly in the more severe. Similarly, insignificant changes or inconsistent directions of change were detected for the mRNA of these protein markers (data not shown). (G) Association of proportion of CD39 + PD1 + cells with COVID-19 versus HCs and DSM in clonal CD8 + memory T cells using different cutoffs for CD39 and PD1 surface protein expression DSB counts (0.5, 1). P values are from Wilcoxon tests. Similarly, we also tested whether the frequency of cells coexpressing multiple exhaustion markers in (F) was different. Independent of the surface marker combination or protein expression cutoff used, we saw no signs of exhaustion or association with DSM (data not shown). (H) GSEA results for assessing enrichment for known exhaustion signatures in DE genes for expanded CD8 + T cells in COVID-19 versus HCs and DSM-high versus DSM-low comparisons. Positive enrichment corresponds to higher level in the first group. Both exhaustion gene sets were not enriched in the DSM-high versus DSM-low comparison, indicating that exhaustion was not associated with disease severity. The enrichment in the COVID vs. HC comparisons could reflect cellular activation ("up" signature -see (I) below) and depression of translation genes in COVID-19 patients negatively associated with IFN-I signatures ("down" signature) -see also Figure 3A . (I) Gene set enrichment of Wherry et al. up and down genes in KEGG, GO BP, REACTOME, and Li BTM's. Only top hits (ranked by adjusted p values from Fisher's exact test) are shown. The "exhaustion up" genes appeared to reflect CD8 + T cell activation as they are enriched for NF-κB, JAK-STAT signaling, and IL-2 signaling genes. (J) GSEA result of Schulte-Schrepping et al. cohorts for exhaustion signatures of COVID-19 versus HCs and severe versus mild comparisons at T0. Differential expression moderated t statistics were computed from a linear model accounting for age, experimental batch and TSO (TSO only accounted for in the severe versus mild comparison). Dot color denotes normalized gene set enrichment score (NES) and size denotes -log10(adjusted P value). Dot shape indicates the significance of adjusted P values (adjusted P value < 0.05). Both cohorts 1 and 2 and both CD8 + T cell clusters in the original data and memory CD8 + T cell populations using transferred cell type labels from our CITE-seq data are shown. Figure 3A , but here showing GSEA results for assessing the differences of delta between disease severity groups (DSM-high versus DSM-low) between the days 17-23 time window and the period before (TSO < day 17). See Table S4F for detailed results of these selected gene sets and the entire set that passed the FDR/adjusted P value cutoff of 0.2. (B) Time course of blood neutrophil and monocyte counts in recovered and deceased groups. Corresponding range of HCs are shown with green dotted lines. Longitudinal samples from the same individual are connected by gray lines. Trajectories were fitted to recovered and deceased groups separately with the shaded areas representing standard error. (C) Effect size comparison of DSM-high versus DSM-low (CITE-seq cohort) and deceased versus recovered (critical patients with distinct outcome subcohorts -see Figure 6G ) at the days 17-23 period. Effect sizes were derived from the coefficients of the group comparison term in mixed-effect models. Proteins are colored by whether they were significantly different (with an unadjusted p value of less than or equal to 0.05) in the deceased vs. recovered ("outcome") and the DSM-high vs. -low comparisons ("DSM"). See also Table S7 for the full list of DE proteins. (D) Similar to (C). Effect size comparison of Brescia deceased versus recovered and an independent US cohort (Yale cohort, Lucas et al, 2020, Nature) deceased versus recovered patients (See Methods) for 38 overlapping circulating proteins/cytokines at the juncture period (TSO days 17-23). Proteins in red are those significantly different between the deceased and recovered patient groups in the Brescia cohort (unadjusted ANOVA p value < 0.05).

Further information and requests for resources and reagents should be directed to and will be fulfilled by the Lead Contact, John S. Tsang (john.tsang@nih.gov).

This study did not generate new reagents.

Raw and processed data from the single cell mRNA, surface protein, and TCR V(D)J sequencing and bulk RNAseq are available from the NCBI Gene Expression Omnibus, accession number GSE161918 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE161918). Analysis code, extended patient and sample metadata, serum protein and antibody data are available at: https://github.com/niaid/covid19-time-resolved. Additional Supplemental Items including Supplemental Tables S1-7 are available also from Mendeley Data at: http://dx.doi.org/10.17632/dzsgh44fwg.1.

De-identified patient plasma and serum samples were obtained from discarded, clinically indicated collection of blood samples obtained from 382 patients admitted at ASST Spedali Civili Brescia, following positive nasopharyngeal swab and/or positive serology for SARS-CoV-2 infection. As described in the following sections and the main text, a subset of 60 unique patients, including 33 patients who had peripheral immune cells collected and 38 critically ill patients with deceased or recovery outcomes (for circulating protein/cytokine-based validation of the "critical juncture" concept in Figure 7 ) were analyzed in this study (i.e., 7 recovered and 4 deceased patients also had PBMCs collected). Ethical approval was obtained from the Comitato Etico Provinciale (NP 4000 -Studio CORONAlab), Brescia (Italy). Blood samples were processed to obtain serum and plasma. In addition, EDTA-blood samples were processed using Ficoll density gradient (GE Healthcare, Marlborough, MA) to obtain PBMCs, then washed twice in RPMI-1640; 5x10 6 PBMC were then resuspended in 1 ml of freezing medium consisting of RPMI plus 20% fetal calf serum and 10% DMSO, and then frozen at -80ºC. Patient metadata is available in Tables S1 and S6 and deposited along with the sequencing data in GEO and code in Github listed above.

Blood samples of 14 age and gender matched HCs were collected at the NIH Clinical Center from individuals in Washington, DC metro area. These samples were obtained under NIH IRB-approved protocol 18-I-0101 and PBMCs were prepared using SepMate isolation tubes (STEMCELL Technologies, Vancouver, BC, Canada) and Ficoll density-gradient media, or under NIH IRB-approved protocol 99-CC-0168 and PBMCs were prepared using Ficoll-based density separation (without SepMate) according to manufacturer's protocol. Cells were cryopreserved by standard methods and stored in liquid nitrogen until needed. One anonymized donor was excluded from further analysis, as we discovered an extremely expanded B cell phenotype consistent lymphoma in this sample. Two additional healthy control samples were collected under NIH IRB-approved protocol 11-I-0187 and were used in preliminary testing to compare freezing of buffy coat samples with freezing of Ficoll-isolated PBMC (isolated as described for patients above); CITE-seq data was collected for both buffy and PBMC samples of each of these controls, but only the PBMC-derived data was used in our analysis. An additional leukopack-derived PBMC sample (anonymized and without demographic information, collected under NIH IRB-approved protocol 99-CC-0168 from the NIH Dept. of Transfusion Medicine) was used in each experimental batch as a technical control, but not used in later comparisons.

Cytokine/biomarker analysis was performed on EDTA plasma or sera obtained from either patients or HCs. The cytokine data used in this paper was previously reported for the total Brescia cohort (Abers et al., 2021) ; the detection method is repeated here for completeness: Because of limited available volume, patient samples were analyzed as single determinations. Duplicate determinations of control samples and samples from HCs yielded coefficients of variation that were normally <20%. Aliquots were stored in a -85ºC freezer prior to analysis. Cytokines (IL-1β, IL-1α, IL-2, IL-4, IL-5, IL-6, IL-7, IL-8, IL-10, IL-12p70, IL-12p40, IL-13, IL-15, IL-16, IL-17, IFN-γ, TNF-α, TNF-β, GM-CSF, VEGF, CCL-11/Eotaxin-1, CCL26/Eotaxin-3, CXCL10/IP-10, MCP-1/CCL2, MCP-4/CCL13, CCL22/MDC, MIP-1α/CCL3, MIP-1β/CCL4, CCL17/TARC) were measured using the V-PLEX Human Cytokine 30-Plex Kit (Meso Scale Discovery, Rockville, MD) an analyzed on a MESO QuickPlex SQ 120 reader (Meso Scale Discovery) according to the manufacturers specifications. Interferon-α2a was determined on a single analyte, ultra-sensitive S-PLEX IFN-α2a kit (Meso Scale Discovery) according to the manufacturer's specifications. The magnetic beads were analyzed on Bio-Plex 3D instrumentation (Bio-Rad, Hercules, CA). Standard curves were analyzed using nonlinear curve fitting and unknowns were calculated based on the derived equation. Samples that exceeded the highest standards were reanalyzed more dilute until the values fell within the range of the known standards.

LIPS assays were used to detect antibody to the SARS-CoV-2 nucleocapsid and spike proteins as previously described (Burbelo et al., 2020) . Briefly, plasmids expressing luciferase fused to either the SARS-CoV-2 nucleocapsid or spike protein were transfected into Cos-1 cells, and lysates were obtained. Heat-inactivated serum diluted 1:10 was added to the lysates, and protein A/G beads were added to capture antibody-antigen complexes. After washing the complexes, coelenterazine substrate was added and luciferase activity was quantified as light units (LUs) in a luminometer.

Pooled PBMC samples from different donors were washed with PBS and incubated with Zombie Red Fixable viability dye (1:1000 in PBS, BioLegend, San Diego, CA) for 20 minutes at 4°C protected from light. Then cells were washed with flow staining buffer (10% FBS in PBS) and Fc blocked (Human TruStain FcX, BioLegend) for 15 minutes on ice. The fluorescence-labeled antibody cocktail against human CD45 (APC/Cyanine7), CD3 (AF488), CD19 (APC), CCR7 (BV786), CD95 (BV650), IgD (PerCP-Cy5.5) and CD27(PE/Cyanine7; all antibodies obtained from Biolegend) were added at the end of blocking and incubated for 20 minutes at 4°C in the dark. Cells were washed and sorted on a BD Aria sorter (BD Biosciences, San Jose, CA) in Biosafety Level 3 (BSL3) lab. Non-naïve B cell population were gated by CD45 + CD19 + IgDor CD27 + and non-naïve T cell population were gated by CD45 + CD3 + CCR7 low or CD95 + .

Frozen PBMC samples were thawed, recovered and washed using RPMI media with 10% FBS and 10mg/mL DNase I (STEMCELL). Samples from different donors were pooled evenly before staining; single cells can be demultiplexed in silico using individual specific single nucleotide polymorphism (SNP) information during data analysis (see below). Multiple cell pools were prepared, such that cells from the same individual but different timepoints ( Figure 1B ) would be in different pools. PBMC pools were Fc blocked (Human TruStain FcX, BioLegend) and stained with Totalseq-C human 'hashtag' antibodies (BioLegend), washed with staining buffer (2% BSA in PBS). This round of hashtag staining allows the different pools to be identified in the analysis, which when combined with the SNP-based demultiplexing, allows full identification of each sample. A fraction of the combined cells was used for sorting non-naïve T and B cells (see above). Separately for the unsorted and sorted cell fractions, hashtagged PBMC pools were combined and cells were stained with a cocktail of TotalSeq-C human lyophilized panel (BioLegend) of 192 surface proteins (Table S2) . Then, cells were washed three times, resuspended in PBS, and counted before proceeding immediately to the single cell partition step. The antibody concentrations within the lyophilized panel were optimized by the manufacturer using healthy PBMC samples. Note that the antibody concentrations used for CITE-seq were optimized by the manufacturer based on healthy PBMC samples, thus may not be optimal for COVID-19 samples. We have not independently verified the specificity of each of the antibodies, although most if not all have been used in flow/mass cytometry in the field and a subset has been titrated for CITE-seq in our previous work that also shows concordance with data from flow cytometry (Kotliarov et al., 2020) . In addition, because we used staining conditions designed to retain optimal mRNA measurements, some surface markers that would stain better under different conditions (e.g., CD197/CCR7), were not detected well in our data; therefore, negative staining for certain markers should be interpreted with caution.

PBMC samples were mixed with the reverse transcription (RT) mix and partitioned into single cell Gel-Bead in Emulsion (GEM) using 10x 5' Chromium Single Cell Immune Profiling Next GEM v1.1 chemistry (10x Genomics, Pleasanton, CA). The RT step was conducted in the Veriti Thermo Cycler (ThermoFisher Scientific, Waltham, MA). Single cell gene expression, cell surface protein, T cell receptor (TCR) and B cell receptor (BCR) libraries were prepared as instructed by 10x Genomics user guides (https://www.10xgenomics.com/resources/user-guides/). All libraries were quality controlled using Bioanalyzer (Agilent, Santa Clara, CA) and quantified using Qubit Fluorometric (ThermoFisher). 10x Genomics 5' Single cell gene expression, cell surface protein tag, TCR and BCR libraries were pooled and sequenced on Illumina NovaSeq platform (Illumina, San Diego, CA) using the sequencing parameters recommended by the 10x Genomics 5' v1.1 user guide. Sequencing saturation of the libraries ranged from approximately 60-80% for the cDNA and 20-40% for the surface protein tag libraries.

Flow cytometry was performed for a subset of 33 PBMC samples using a Cytek Aurora spectral cytometer (Cytek Biosciences, Fremont, CA) with a 27-color panel comprising viability dye and antibodies to CD3, CD4, CD8, CD11c, CD14, CD16, CD19, CD20, CD24, CD25, CD27, CD38, CD45, CD45RA, CD45RO, CD56, CD123, CD127, CCR4, CCR6, CCR7, CCR10, CXCR3, CXCR5, HLA-DR, and IgD. The frequency of major populations was determined using Flowjo Software v10 (BD Biosciences) based on previously described manual gating strategies (Finak et al., 2016) .

For each sample, a sample of 100,000-500,000 cells was processed in Trizol using the miRNAeasy micro kit (Qiagen, Germantown, MD) and standard RNA sequencing libraries were generated using Illumina Truseq library preparation kits. These libraries were used to generate SNP calls for each donor. Sequencing results were demultiplexed and converted to FASTQ format using Illumina bcl2fastq software. The sequencing reads were adapter and quality trimmed and then aligned to the human genome using the splice-aware STAR aligner and SNP calls were generated using the previously published protocol (Blay et al., 2019) . We used the software package demuxlet (Kang et al., 2018) to then match single cells in the 10x RNAseq data to each donor and identify doublets. Because multiple samples from different timepoints for each donor may be collected and could not be demultiplexed by this method alone, we also used 'hashtag' antibodies (Biolegend) to uniquely label the different timepoints (Stoeckius et al., 2018) .

Details on the statistical testing and values reported associated with each figure are reported in the figure legend or results section text.

Clinical severity classification was assigned as follows:

Mild: modest symptoms, no pneumonia. Moderate: Fever and respiratory symptoms plus radiological evidence of pneumonia. Use of low-flow oxygen is still part of this phenotype; O2 saturation is >93% at rest.

Severe: oxygen saturation at rest 93% or lower, or respiratory rate >30/min, or PaO2/FiO2 <300; use of low-flow oxygen is still part of this phenotype.

Critical: any one of the following: Mechanical ventilation (CPAP, BiPAP, intubation, hi-flow oxygen), septic shock, organ damage requiring admission in the ICU.

Quantitative disease severity scores, namely DSM, were derived for the cohort of patients with PBMC samples. The three subjects with initial PBMC sample collected at or later than 30 days since their symptom onset were excluded from this analysis (see Figure 1A) . A total of 18 clinical/lab data and serum protein measurements were assembled from within 2 days of the earliest PBMC sample collection date for each of the remaining 30 subjects. These 18 measurements include TNF-α/β, IL-6, IL-18, IP-10, CXCL9, IFN-ϒ, IL-4, IL-13, IL-17, C-reactive protein (CRP), fibrinogen, D-dimer, lactate dehydrogenase (LDH), lymphocyte and platelet counts, neutrophil/lymphocyte ratio (NLR), and pulse oximetry to fraction of inspired oxygen (SpO 2 /FiO 2 ) ratio. Missing values were imputed using an additive regression approach implemented in the R Hmisc package (Harrell Jr et al., 2019) . The measurements were normalized by accounting for difference in days between hospital admission and sample collection (i.e., by regressing out the "days since admission" variable using the removeBatchEffect function from limma). Since SpO 2 /FiO 2 ratios were used in the initial clinical disease severity assessment, only the remaining 17 corrected measurements were employed as inputs to building machine learning models for predicting patient disease severity. We used an ordinal partial least squares (PLS) regression approach with the response categories ordered from the least to most severe (i.e. Moderate-Alive, Severe-Alive, Critical-Alive, and Critical-Deceased). Classification performance was determined based on leave-one-out cross-validation where in each iteration one sample was reserved for testing and the rest were utilized for training of the model. Standardized model coefficients from the cross-validation models were evaluated for importance of each parameter. The first principal component of the top 8 most important parameters along with SpO 2 /FiO 2 ratios were used as a continuous severity score, which we referred to as DSM. To validate our approach, DSM scores were also independently calculated using the same parameters for an additional 64 patients from the Brescia cohort with protein and clinical measurements collected <30 days since symptom onset ( Figure 1E ). The modeling analyses were carried out with the R plsRglm package (Bertrand and Maumy-Bertrand, 2019) using proportional-odds logit models.

For several of the later analyses, we divided the 30 patients equally into DSM-high and DSM-low groups by using the median DSM value as the cutoff. Given the frequent use of this patient grouping, we assessed its robustness against the random number generator "noise" in the imputation step used above. We drew 200 random seeds for the imputation step to generate different DSM-high and -low groups. We then computed the rand-index (McShane et al., 2002) , which measures the similarity between two different partitions of the subjects, between our observed DSM-high and -low partition and each of the random partitions. The average rand-index was 0.823, indicating that our disease severity grouping was robust to the random number generator "noise" in the imputation step. We have saved the seed used to generate the DSM-high and -low groups used in the paper (see Code Availability above).

CellRanger (10x Genomics) version 3.1.0 was used to map cDNA libraries to the hg19 genome reference and to count antibody tag features. Data were further processed using Seurat (v.3.1.0, 3.1.4, or 3.2.2) (Stuart et al., 2019) running in R v3.6.1. After filtering to single cell based on demuxlet output, we further demultiplexed the timepoints using the hashtag antibody staining using Seurat's HTOdemux function. We removed cells with less than 200 or greater than 4,000 detected genes, greater than 30% mitochondrial reads, cell surface protein tag or mRNA counts greater than 15,000, or hashtag antibody counts greater than 5,000.

We normalized the CITE-seq protein data using the DSB method (Mulè et al., 2020) [available at https://github.com/niaid/dsb], which removes technical noise associated with unbound antibody by rescaling each protein based on protein levels detected in empty droplets, then denoises each single cell by defining and regressing out the non-biological, technical component of the cell's protein counts. The following parameters were used in the dsb normalization function: define.pseudocount = TRUE, pseudocount.use = 10, denoise_counts = TRUE. We set the use.isotype.control parameter to TRUE which models and regresses out a covariate corresponding to the technical component of the cell's protein library by combining the per cell background and isotype control counts.

J o u r n a l P r e -p r o o f DSB normalized cell surface protein data were indexed and exported as csv file and gated in Flowjo Software v10 (BD Biosciences). Gated FCS files and gating files were imported and mapped to CITE-seq single cell data using R packages CytoML (Finak et al., 2018) and flowWorkspace R package version 4.2.0 (Finak and Jiang, 2020) . Treg cells were gated on CD4 + CD25 hi CD127 lo . Given the poor CXCR5 staining which could lead to contamination in the downstream differential expression analysis, cTfh cells were gated within CD4 + T cells following RNA based clustering as follows (He et al., 2013; King et al., 2008) : Clustering was first performed on the scaled RNA data using variable features from Seurat's FindVariableFeatures, ScaleData, RunPCA, FindNeighbors (first 15 PCs) and FindClusters (resolution 0.6) functions; ICOS hi PD1 hi clusters were further gated by using DSB value cutoffs of 4 and 0 for ICOS and PD1, respectively. The gated population exhibited higher expression of ICOS, MAF and SH2D1A (encoding SAP protein) mRNA (King et al., 2008; Xin et al., 2018) and was enriched for some reported human Tfh signatures (Locci et al., 2013; Weinstein et al., 2014) in comparison with other CD4 + T cell clusters. As a cautious note:: given our gating scheme using ICOS and PD1, the cTfh cluster could overlap with some activated non-cTfh CD4 + T cells. CD71 + B cells were gated with a DSB value cutoff of 3 for CD71 within memory B cells. Plasmablasts were gated as CD19 + CD20 -CD38 hi .

DSB-normalized protein data, excluding the isotype control antibodies, were used to generate the Euclidean distance matrix computed for all single cells. The matrix was then used to build the shared nearest neighbor (SNN) graph followed by k-nearest neighbors clustering using the FindNeighbors and FindClusters functions, respectively, in Seurat (v3.1.0) using the Louvain algorithm. Two stages of clustering were performed: each of the three batches of single cells were first clustered separately using a resolution of 0.5. Major cell type clusters were manually identified within each batch and matched across batches. Cells of each major cell type were then merged across batches, and a second round of clustering (resolution = 1) was performed within each cell type, using DSB-normalized protein data that were batch-corrected using removeBatchEffect from the limma R package (Ritchie et al., 2015) . The resulting clusters were manually annotated, and grouped into fine (e.g., CD4 central memory) and coarse levels (e.g. CD4 memory).

Cell frequencies expressed as percent parent or percent total were tested for association with COVID vs. healthy and DSM through linear models using the limma package. All models controlled for age, batch, and days since symptom onset, with the exception of models comparing to healthy, where days since symptom onset was not controlled for as healthy individuals do not have a time since onset. Age was modeled as a continuous variable representing the age in years of the subject. The batch variable was modeled as factor variable representing the experimental batch (3 in total), and days since symptom onset was modeled as a continuous variable representing the number of days since self-reported symptom onset.

Using the limma workflow as described in "Cell frequency differential abundance", cell populations different between patient timepoint 0 (T0) samples and HC were identified with a model with the following formula in R: ~ covid_vs_healthy + age + batch, where covid_vs_healthy is a factor variable with two levels, COVID-19 and HC. P values were determined using the moderated T statistics associated with the covid_vs_healthy coefficient.

Using the limma workflow as described in "Cell frequency differential abundance", cell populations associated with DSM in patient T0 samples were identified with a model with the following formula in R: ~ DSM + age + batch + days_since_onset, where DSM is the continuous measure of DSM. P values were determined using the moderated T statistics associated with the DSM coefficient.

To create "pseudobulk" RNA libraries for differential expression analysis, all unsorted cells in a given sample were computationally "pooled" according to their automated cluster assignment by summing all reads for a given gene in a given cell-type in a given sample. In addition, this procedure was applied to specifically gated populations of interest (circulating T follicular helper (cTfh), Plasmablast, CD71 high B cells) from the sorted cells as these were not present at high enough quantities in the unsorted cells. Pseudobulk libraries made up by few cells and therefore likely not modeled properly by bulk differential expression methods were removed from analysis for each cell-type separately to remove samples that contained fewer than 8 cells and less than 40000 unique molecular identifier counts detected after pooling. Lowly expressed genes were removed for each cell type individually using the filterByExpr function from edgeR (McCarthy et al., 2012) . Differentially expressed genes were identified using the limma voom (Law et al., 2014) workflow which models the log of the cpm (counts per million) of each gene. Scaling factors for library size normalization were calculated with the calcNormFactors function with method = "RLE". In models utilizing repeated samples from the same subject, subject random effects were incorporated using the duplicateCorrelation function in limma. Models controlled for age, batch, and days since symptom onset, with the exception of models comparing to HC, where days since symptom onset was not controlled for as HC do not have a time since onset. Age was modeled as a continuous variable representing the age in years of the subject. The batch was modeled as factor variable representing the experimental batch (3 in total), and days since symptom onset was modeled as a continuous variable representing the number of days since self-reported symptom onset.

For heatmaps showing pseudo-bulk, sample level data, the log(cpm) for each sample for a given celltype was calculated by pooling cells as described in "Pseudobulk differential expression analysis". Library size normalization was performed without additional scaling factors. All Single cell gene expression heatmaps show scaled log(counts per 10000) values computed using the logNormalize function in Seurat. Heatmaps were scaled to have mean = 0 and variance = 1 for each gene.

Enriched gene sets were identified using the pre-ranked gene-set enrichment analysis (GSEA) algorithm implemented in the FGSEA R package (Sergushichev, 2016) . Genes were ranked using the moderated T statistics for the relevant coefficient from the limma voom model. Enrichment was assessed with a gene set list that included GO BP, KEGG, Reactome, MSIGDB's Hallmark collection, Blood Transcriptomic Modules (Li et al., 2014) , T cell exhaustion signatures (both upregulated and downregulated genes) from chronic murine LCMV infection (Wherry et al., 2007) , the top genes that were consistently upregulated across two cohorts at early timepoints in response Yellow Fever vaccination (Querec et al., 2009) , and lastly the union of the top50 genes that allow for discrimination between symptomatic and asymptomatic infected inviduals in H1N1 and H3N2 infection in a human challenge study (Woods et al., 2013) . Selected pathways shown in figures (3A, 3B and 6D) and listed in Table S4 were manually curated to select gene sets relevant to immunology and often enriched in several cell-types across the various differential expression comparisons.

Specific module scores representing enriched pathway activities were calculated for relevant samples using leading edge genes identified by GSEA to enhance signal-to-noise ratio. Prior to score calculation, the pseudobulk gene counts were normalized with the varianceStabilizingTransformation function from DEseq2. The scores were generated using gene set variation analysis (GSVA) method implemented by the GSVA R package (Hänzelmann et al., 2013) . DSM-high and DSM-low group trajectories (module scores versus TSO) were generated respectively using loess smoothing function with default span parameter (span = 0.75).

Using the pseudobulk limma voom workflow as described in "Pseudobulk differential expression analysis", differentially expressed genes between patient T0 samples and HC were identified with a model with the following formula in R: ~ 0 + covid_vs_healthy + age + batch, where covid_vs_healthy is a factor variable with two levels, COVID-19 and HC. The contrasts.fit function was then used to compare the estimated means between COVID-19 and HCs. Gene set enrichment analysis of the moderated T statistics was performed as described in "Gene set enrichment of DE genes"

Using the pseudobulk limma voom workflow as described in "Pseudobulk differential expression analysis", genes associated with DSM in patient T0 samples were identified with a model with the following formula in R: ~ DSM + age + batch + days_since_onset, where DSM is the continuous measure J o u r n a l P r e -p r o o f of DSM. Gene set enrichment analysis of the moderated T statistics of the DSM term was performed as described in "Gene set enrichment of DE genes"

Genes induced/upregulated by glucocorticoid treatment were derived from Franco et al. (Franco et al., 2019) as part of a previous single cell RNAseq study of infant cord blood mononuclear cells (manuscript in preparation). Briefly, upregulated genes (between baseline and post glucocorticoid treatment) across all the tested cell types from Franco et al. were selected, filtered to only genes that were not also in the reactome 'immune system' pathway (to exclude immune genes), then further filtered to the top 50 most highly expressed based on average expression across all cells from our single cell RNA-seq data set generated from infant cord blood (to filter out genes with lower average expression across diverse hematopoietic cell lineages).

Single CD4 + and CD8 + T cells were separately clustered using mRNA expression profiles of genes found to be in the 'leading edge' of key gene sets significantly associated with COVID-19 vs. HCs or DSM from the above GSEA analyses (see pathways highlighted in Figure 5A ). Clustering was performed on the scaled RNA data after regressing out Batch and Donor variation using Seurat's ScaleData function. PCA was performed using the leading-edge genes, and the shared nearest neighbors graph was constructed using the first 15 PCs using the Seurat's FindNeighbors function. Louvain clustering was performed on the graph using a resolution of 1 using Seurat's FindClusters function. Association of these clusters with COVID-19 patients vs HC and DSM was assessed using linear models as described in "Cell frequency differential abundance: COVID-19 patients vs. HC" and "Cell frequency differential abundance: COVID-19 patients vs. HC", using all T0 samples and all HC samples. 18 CD8 clusters were tested and 15 CD4 clusters were tested.

CellRanger (10x Genomics) version 3.1.0 was used to assemble V(D)J contigs. Cells were considered part of the same clone they shared the same set of productive CDR3 nucleotide sequences (exact matches of alpha and beta chains). This was determined using the "raw_clonotype_id" of the filtered_contig_annotations.csv output from CellRanger (https://support.10xgenomics.com/single-cellvdj/software/pipelines/latest/algorithms/annotation).

Cells were removed from analysis as described in "Single cell data processing". Cells were assigned to clones as described in "TCR data processing". A cell was considered to be part of a non-singleton clone (i.e., clonal) if there were at least two cells from that clone in the same sample (Subject, timepoint combination). All other cells were deemed singleton clones.

Clonality in the CD8 + memory T cells in HC and patient T0 samples was determined as follows. Cells were assigned to clones as described in "TCR data processing". Using only the sorted T cells labelled as CD8 Memory according to the automated clustering, samples with fewer than 100 cells were removed from further analysis. As measurements of clonality are biased by differing number of total cells in a sample, all samples were down-sampled to 100 cells and Simpson's index (∑ , where R = the total number of unique clones in the subsample and p i = the number of cells from clone i in a given subsample) was computed as a measure of clonality; Simpson's index can be interpreted as the probability that two randomly selected cells are from the same clone. This process was repeated 1000 times with random subsamples of 100 cells per sample and the median Simpson's index over the 1000 subsamples was used as a point estimate for a given sample. Differences in clonality (median Simpson's index) between HC and COVID-19 patient and the association of clonality with DSM were then assessed with mixed effect models in the lme4 R package (Bates et al., 2015) . Models were adjusted for age and days since symptom onset. In addition, batch was included as a random effect. Thus, the formulas in R were:

• median_simpson_index ~ DSM + age + days_since_onset + (1|batch) • median_simpson_index ~ covid_vs_healthy + age + (1|batch)

The median_simpson_index is the median of the Simpson index over 1000 subsamples as described above; DSM is a continous measure of DSM; and covid_vs_healthy is factor variable with levels COVID19 and HC. The terms for age, batch and days_since_onset were coded as described in "Pseudobulk differential expression analysis". Coefficients representing the association with DSM and the difference in means between COVID-19 patient and HC samples were tested for significance using the summary function from the lmerTest package (Kuznetsova et al., 2017) .

Cells were filtered to include only the sorted, clonal (singleton clones removed) CD8 memory T cells. Pseudobulk differential expression of canonical exhaustion surface markers (CD279, TIGIT, CD244, CD152, CD223, CD366) was tested using linear models through the limma package. To find CITE-seq surface protein markers differentially expressed between healthy and COVID, and associated with DSM, the mean DSB count per sample in the clonal CD8 memory cells was modeled using the traditional limma workflow. However, to model the RNA expression of the related genes, limma voom was used as previously described "Pseudobulk differential expression analysis." The model comparing COVID to HC was identical to that described in "Differential expression: COVID vs. HC" and the model for DSM was identical to that described in "Differential expression: DSM."

Using the same models of RNA expression in clonal CD8 memory cells as described in "Clonal T cell exhaustion marker differential expression" (COVID vs HC and DSM models), differential expression was J o u r n a l P r e -p r o o f assessed for all genes passing read depth filters using filterByExpr function from edgeR. Enrichment of exhaustion gene sets in differentially expressed genes in clonal (singleton clones removed) CD8 memory cells using the pre-ranked GSEA algorithm implemented in the FGSEA R package. Genes were ranked using the moderated T statistics for the relevant coefficient from the limma voom model.

Enrichment of the Wherry exhaustion signatures (Wherry et al., 2007) in GO BP, Reactome, KEGG, MSIGDB Hallmark collection, and Blood Transcriptomic Modules (Li et al., 2014) was assessed using Fisher's exact tests comparing the overlap of each exhaustion signature with each gene set. The union of all gene sets including the exhaustion signatures was used as the background. P values were adjusted with the Benjamini-Hochberg method (Benjamini and Hochberg, 1995) .

Cells were filtered to include only the sorted, clonal CD8 memory T cells. To evaluate whether exhaustion was associated with DSM and COVID-19 patients vs. HC, we evaluated co-expression of canonical markers (CD279, TIGIT, CD244, CD152, CD223, CD366) since exhaustion is often defined by simultaneous expression of multiple markers (Wherry and Kurachi, 2015) . To robustly evaluate exhaustion marker co-expression, we used several cutoffs (.5, 1, 1.5, 2 DSB counts) to determine whether a cell was positive for a marker, and 2 cutoffs (2 or 3 markers positive) to determine whether the cell was positive for enough markers to be deemed exhausted. The proportion of clonal CD8 memory cells with an exhaustion-like phenotype was then calculated by dividing the number of "exhausted" and clonal CD8 memory cells by the total number of clonal CD8 memory cells in that sample. A similar approach was applied to determine the frequency of CD39 + PD1 + cells except it was required that both markers were above the given threshold.

A severity network ( Figure 4A ) was created using a more expansive collection of annotated gene sets (similar to Figures 3A and 3B ) with false discovery rate (FDR) ≤ 0.2 in the DSM gene set enrichment analysis (Table S4B) to infer "direct" correlations (see below) among biological processes in different cell subsets and their association with disease severity. Sample gene set scores generated using leading edge genes identified from the DSM differential expression GSEA were used for this analysis. Gene sets whose scores were not significantly correlated with DSM (Spearman's correlation; p > 0.05) were removed. To distinguish between direct and indirect associations ("direct" are those that are correlated conditional on all other variables), an inverse correlation matrix of the remaining gene sets and DSM was derived using a regularized regression approach called Lasso (Meinshausen and Bühlmann, 2006) as implemented in the R Huge package (Zhao et al., 2012) . Network nodes that are conditionally independent of each other would have zero coefficients in the inverse correlation matrix. By scanning the number of non-zeros in the matrix through a range of penalty values (λ) from 0 to 1, a parameter of 0.35 was selected as it represented an inflection point on the curve where the number of direct connections began to stabilize. Using DSM as the central node, a 2-level network was created with edges between nodes indicating possible direct and significant correlation (Table S5 ; Spearman's correlation p value ≤ 0.05).

Using the same limma workflow as described in "Cell frequency differential abundance", an interaction model was fit using all COVID-19 samples with the formula in R: ~ DSM_group:days_since_onset + DSM_group + days_since_onset + age + batch, where DSM_group is a factor variable with levels DSM_high and DSM_low. The following contrasts were then used with the contrasts.fit function to find cell populations associated with days_since_onset in DSM high and DSM low respectively (DSM_low was reference level of DSM_group factor variable, and thus the days_since_onset term represents the slope for days_since_onset in the DSM_low group, and DSM_high:days_since_onset represents the difference in slopes between the two groups):

• "days_onset_in_DSMhigh" : "DSM_high:days_since_onset + days_since_onset", • "days_onset_in_DSMlow" : "days_since_onset"

Using the same pseudobulk limma voom workflow as described in "Pseudobulk differential expression analysis", the DSM high group was contrasted with the DSM low group specifically at days 17-23 through the use of a contrast matrix and the contrasts.fit function in limma. The model was fit using all COVID-19 samples to improve variance estimates for individual genes and the following groups of samples were defined:

The formula used in R was ~ 0 + group + age + batch + days_since_onset, where group is a factor variable using the groups defined above. For the comparison of DSM high vs. DSM low at the juncture, the contrast matrix specifically compared the DSM_high_17<=days<=23 group to the DSM_low_17<=days<=23 group. For the interaction model, the contrast matrix compared the difference in means between DSM high and low at the juncture vs. before the juncture, i.e., (DSM_high_17<=days<=23 -DSM_low_17<=days<=23) -(DSM_high_days<17-DSM_low_days<17). Enriched gene sets for the contrast were identified as described in "Gene set enrichment of DE genes."

J o u r n a l P r e -p r o o f

Using the same pseudobulk limma voom workflow as described in Pseudobulk differential expression analysis", an interaction model was fit using the same approach described in "Differential abundance of cell populations: time since onset in DSM high and DSM low". Enriched gene sets for the days_onset_in_DSMlow and days_onset_in_DSMhigh contrasts were identified as described in "Gene

For each of the 63 serum protein measurements, we tested for abundance difference between DSMhigh and -low patients during the days 17-23 since symptom onset period. A total of 11 DSM-high and 10 DSM-low patients in our CITE-seq cohort had at least one measurement during this time interval. Protein levels were estimated by a mixed-effects model as follows:

Protein expression ~ dsm-group + sex + age + days from symptom onset to measurement + (1|subject)

ANOVA unadjusted p-value of 0.05 or below for the dsm-group coefficient term was flagged as significant. Models with singular fits were not discarded for this purpose.

All deceased patients (critical-deceased category) with longitudinal serum protein measurements (n=17) in the larger Brescia cohort (see Patients above) were identified. Each of them was then matched with critical-alive subjects of the same sex and within +/-3 years in age (n=21). We built two types of mixedeffect models to search for proteins exhibiting dynamic divergence around the days 17-23 juncture.

First, serum samples were divided into three time-interval groups based on TSO: days 7-16 (prejuncture; 35 samples), days 17-23 (juncture; 38 samples), and days 24-30 (post-juncture; 28 samples) from symptom onset.

We then searched for proteins that 1) were differentially abundant between deceased and recovered patients at the juncture or post-juncture, and 2) the differences between deceased and recovered patients increased significantly at the juncture compared to the pre-juncture, or at the post-juncture compared to the pre-juncture. Identifying these proteins would help assess the extent of divergences around the juncture and the corresponding biological processes that shifted temporally at or after the juncture. We included the post-juncture period in our analysis because significant divergences after the juncture could be a result of what happened during the juncture.

For each of the 63 proteins, differential abundance at each interval was assessed by the model:

Model 1: Protein expression ~ patient group + sex + age + days from symptom onset to measurement + (1|subject) J o u r n a l P r e -p r o o f ANOVA unadjusted p-value of 0.05 or below for the patient group (deceased vs. recovered) coefficient term was used to select proteins with significant differences for each interval (Table S7) .

Next, significant changes between time intervals was assessed by the following model:

Model 2: Protein expression ~ patient group*interval + sex + age + days from symptom onset to measurement + (1|subject)

where the group*interval term was used to evaluate the temporal changes (between intervals) in the differences between the deceased and recovered groups. Proteins with an unadjusted p-value of ≤ 0.05 from the resulting t statistics were selected (Table S7) .

A subset of the 12 proteins that were 1) significantly different between the two patient groups at the juncture or post-juncture based on model 1, and 2) significantly shifted temporally based on model 2 are shown in Figure 7B with select protein trajectories shown in Figure 7D .

False discovery rates (FDR) were estimated by running the same search procedures with 1000 permutations of the subject outcome labels. In total, 0.6% of permuted searches resulted in 12 or more proteins matching the search criteria (hits). All 12 proteins had an estimated p value of ≤ 0.05 individually based on this permutation test (Table S7 ). The average of number of protein hits was 1.73, leading to an overall FDR of 1.73/12 or 14.4% (i.e., ~2 of the 12 proteins may be false positives).

We also assessed the concordance between the two models. The effect sizes estimated by these models were highly correlated for shared terms (Pearson's correlation r = ~0.9 among all proteins tested). However, two identified proteins in this group, Reg3α and CD25/IL-2Rα, had differences of more than 0.5 in the pre-juncture effects (i.e., differences between deceased and recovered patients) estimated by the post-juncture v. pre-juncture interval model vs. the pre-juncture DE model. We thus did not focus on these two proteins in our data interpretation. Additionally, we also compared model 1 results at the juncture to results from the dsm-high and -low comparison (see previous section). Concordance of their effects had a p-value of 0.0896 as assessed by Fisher's exact test ( Figure S7C ). This mild concordance is not surprising given that the former is contrasting only very critical patients but with different outcomes, while the latter involves largely survivors with different disease severity.

To assess whether the above procedure can select proteins predictive of outcome using independent training and testing sets, using a leave-one-out cross-validation scheme we applied the above procedure to select proteins in training sets only and then the selected proteins were used as inputs to fit a logistic PLS model, which was then employed to compute a prediction score for each sample in the unseen patient. The prediction was carried out independently for the juncture (d17-23 TSO) and post-juncture (d24-30 TSO) periods. Based on these sample scores, classification performance was measured by the area under the receiver operating characteristic (ROC) curve ( Figure 7C ). Statistical significance of the prediction performance was evaluated by 1000 permutations of all subject labels to assess the ability of this procedure to generate predictive models. In each iteration of the permutation cross-validation, if no proteins were selected by the procedure, the unseen patient was assigned a score of 0.5 (middle point akin to best random guess) by default as no model could be constructed.

Time trajectories for the same cohort of recovered and deceased patients were created using all the measurements of the 12 temporally divergent proteins ( Figure 7D ). The same was done also for antibody measurements against SARS-CoV-2 spike and nucleocapsid proteins ( Figure 7E , (Burbelo et al., 2020) ). Group trajectories were generated using the loess smoothing function in R (v3.6.2) with default span parameter (span = 0.75). For each subject, Individual measurements collected on the same day were averaged in the plots.

Published serum protein measurement data from a US cohort of 113 COVID-19 patients (Lucas et al., 2020) were obtained to assess the differential abundance results observed in the Brescia cohort at the juncture. A sub-cohort of eight deceased/hospice (coded as 2 or 3 in the "Latest Outcome" column of the dataset) and four age-and sex-matched critical (4 or 5 in "Clinical Score" at any time point) patients with samples collected between days 17 and 23 since symptom onset TSO was identified. Differential abundance analysis was carried out using the same procedures for the 38 overlapping proteins measured in our and the Lucas et al. studies. Concordance of statistically significant results found in the Brescia cohort (only 12 of 20 DE proteins at the juncture identified in our Brescia cohort can be examined) was assessed by Spearman's correlation of effect sizes.

Data from cohorts 1 and 2 of Schulte-Schrepping et al. were downloaded from fastgenomics at the following url: https://beta.fastgenomics.org/datasets/detail-dataset-952687f71ef34322a850553c4a24e82e, https://beta.fastgenomics.org/datasets/detail-dataset-7ae02f5553074bda92c14a8f0bce2d24. Using the pre-annotated clusters from the original publication, single cell gene expression data were pooled into pseudobulk libraries and differential expression models were run as described in "Pseudobulk differential expression analysis". Associations with disease severity were determined using severity classifications present in the downloaded Seurat object metadata. Models controlled for TSO and age. Gene set enrichment was performed as described in "Gene set enrichment of differentially expressed genes" for select gene sets we wanted to validate. To assess exhaustion signatures in CD8 + T memory cells as close to ours as possible ( Fig. S6I and S6J) , we utilized Seurat's "TransferData" function to transfer cell cluster annotations from the single data of the Brescia cohort reported here to the data of Schulte-Schrepping et al. Note that since clonality/TCR sequencing data are not available in the Schulte-Schrepping et al. dataset, we could not assess clonally expanded cells only.

J o u r n a l P r e -p r o o f

The tidyverse suite of R packages was utilized for data table handling and data plotting (Wickham and RStudio, 2019) [https://cran.r-project.org/web/packages/tidyverse/index.html]. ComplexHeatmap was used for heatmap visualizations (Gu et al., 2016) .

J o u r n a l P r e -p r o o f SUPPLEMENTAL TABLE TITLES AND LEGENDS   Table S1 : Clinical characteristics of the COVID-19 patients in this cohort. Related to Figure 1 . Figure 6G ). The coefficients/effect sizes and associated p-values of each term in mixed-effect models are provided. All times are expressed as days since symptom onset. Related to Figures 6 and 7 .

• Multiparameter metric captures finer shades of disease severity to empower analyses • Network analysis reveals primary disease severity correlates in pDCs and NK cells • Analysis of timing effects suggests a late immune response juncture • Inflammatory divergence at the late juncture predicts fatal-recovery outcomes A multiparameter metric developed by integrating circulating cytokines, multimodal single cell profiling, cell surface protein and immune cell analyses of COVID-19 patients over time, reveals a late wave of diverging inflammatory and host immune responses that predicts recovery versus fatality.

J o u r n a l P r e -p r o o f T0   T0  T3   T0   T0   T1   T0   T0   T0   T0   T1   T1   T1   T0  T1   T0   T0   T0   T0   T0   T0   T1   T1   T1   T2   T1   T3   T1   T3   T0  T1   T0   T0   T1   T0  T1   T0   T0   T0   T0   T1   T1   T2   T2   T0   T0   T1   T0   T1 T2   T0   T1   T0  T1   T1  T2   T1   T0   T2   T2   T2   T2   T1   T1   T1 Juncture AUC:

An immune-based biomarker signature is associated with mortality in COVID-19 patients

Dysregulation of type I interferon responses in COVID-19

Systems biological assessment of immunity to mild versus severe COVID-19 infection in humans

Autoantibodies against type I IFNs in patients with life-threatening COVID-19

Fitting Linear Mixed-Effects Models Using lme4

The unleashing of the immune system in COVID-19 and sepsis: the calm before the storm?

Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing

Longitudinal Multi-omics Analyses Identify Responses of Megakaryocytes, Erythroid Cells, and Plasmablasts as Hallmarks of Severe COVID-19

COVID-19): Relationship to Duration of Infection

Synthesis of glucocorticoid-induced leucine zipper (GILZ) by macrophages: an anti-inflammatory and immunosuppressive mechanism shared by glucocorticoids and IL-10

Transcriptional insights into the CD8 + T cell response to infection and memory T cell formation

Imbalanced Host Response to SARS-CoV-2 Drives Development of COVID-19

Assessment of kinship detection using RNA-seq data

Sir Isaac Newton, sepsis, SIRS, and CARS

Prednisolone suppresses the function and promotes apoptosis of plasmacytoid dendritic cells

A Systems Analysis Identifies a Feedforward Inflammatory Circuit Leading to Lethal Influenza Infection

Sensitivity in Detection of Antibodies to Nucleocapsid and Spike Proteins of Severe Acute Respiratory Syndrome Coronavirus 2 in Patients With Coronavirus Disease

Cloning, chromosomal assignment and tissue distribution of human GILZ, a glucocorticoid hormone-induced gene

Interleukin (IL) 15 is a novel cytokine that activates human natural killer cells via components of the IL-2 receptor

Preliminary Estimates of the Prevalence of Selected Underlying Health Conditions Among Patients with Coronavirus Disease 2019 -United States

COVID-19 severity correlates with airway epithelium-immune cell interactions identified by single-cell analysis

Co-infections: potentially lethal and unexplored in COVID-19

Revisiting human natural killer cell subset function revealed cytolytic CD56(dim)CD16+ NK cells as rapid producers of abundant IFN-gamma on activation

An inflammatory cytokine signature predicts COVID-19 severity and survival

Adhesive mechanisms governing interferon-producing cell recruitment into lymph nodes

MIP-1α, MIP-1β, RANTES, and ATAC/lymphotactin function together with IFN-γ as type 1 cytokines

Progression of whole-blood transcriptional signatures from interferon-induced to neutrophil-associated patterns in severe influenza

Damaged intestinal epithelial integrity linked to microbial translocation in pathogenic simian immunodeficiency virus infections

Continuous treatment with IL-15 exhausts human NK cells via a metabolic defect

flowWorkspace: Infrastructure for representing and interacting with gated and ungated cytometry data sets

Standardizing Flow Cytometry Immunophenotyping Analysis from the Human ImmunoPhenotyping Consortium. Sci Rep

CytoML for cross-platform cytometry data sharing

Lipocalin 2 mediates an innate immune response to bacterial infection by sequestrating iron

Immune regulation by glucocorticoids can be linked to cell type-dependent transcriptional responses

Complex Immune Dysregulation in COVID-19 Patients with Severe Respiratory Failure

The immunology of SARS-CoV-2 infections and vaccines

Complex heatmaps reveal patterns and correlations in multidimensional genomic data

CD39 Expression Identifies Terminally Exhausted CD8+ T Cells

Accuracy of neutrophil gelatinase-associated lipocalin (NGAL) in diagnosis and prognosis in acute kidney injury: a systematic review and meta-analysis

Impaired type I interferon activity and inflammatory responses in severe COVID-19 patients

GSVA: gene set variation analysis for microarray and RNA-seq data

Circulating precursor CCR7(lo)PD-1(hi) CXCR5 + CD4 + T cells indicate Tfh cell activity and promote antibody responses upon antigen reexposure

Acute kidney injury in patients hospitalized with COVID-19

Temporal dynamics of host molecular responses differentiate symptomatic and asymptomatic influenza a infection

Triple combination of interferon beta-1b, lopinavir-ritonavir, and ribavirin in the treatment of patients admitted to hospital with COVID-19: an open-label, randomised, phase 2 trial

Regulation of type I interferon responses

Bacterial co-infection and secondary infection in patients with COVID-19: a living rapid review and meta-analysis

Co-infections in people with COVID-19: a systematic review and meta-analysis

voom: Precision weights unlock linear model analysis tools for RNA-seq read counts

Molecular signatures of antibody responses derived from a systems biology study of five human vaccines

Single-cell landscape of bronchoalveolar immune cells in patients with COVID-19

Prognostic value of interleukin-6, C-reactive protein, and procalcitonin in patients with COVID-19

Human Circulating PD-1+CXCR3−CXCR5+ Memory Tfh Cells Are Highly Functional and Correlate with Broadly Neutralizing HIV Antibody Responses

Antibody responses to SARS-CoV-2 in patients with COVID-19

Immune Response and COVID-19: A mirror image of Sepsis

Longitudinal analyses reveal immunological misfiring in severe COVID-19

Presence of Genetic Variants Among Young Men With Severe COVID-19

Cytokine prediction of mortality in COVID19 patients

Longitudinal immune profiling reveals key myeloid signatures associated with COVID-19

Microbial translocation in the pathogenesis of HIV infection and AIDS

How does SARS-CoV-2 cause COVID-19?

Deep immune profiling of COVID-19 patients reveals distinct immunotypes with therapeutic implications

Natural killer cell immunotypes related to COVID-19 disease severity

Differential expression analysis of multifactor RNA-Seq experiments with respect to biological variation

Methods for assessing reproducibility of clustering patterns observed in analyses of microarray data

High-Dimensional Graphs and Variable Selection with the Lasso

Metabolic reprogramming of natural killer cells in obesity limits antitumor responses

Antigen-Specific Adaptive Immunity to SARS-CoV-2 in Acute COVID-19 and Associations with Age and Disease Severity

Flow Cytometry Identifies Risk Factors and Dynamic Changes in Patients with COVID-19

Exploring the full spectrum of macrophage activation

Normalizing and denoising protein expression data from droplet-based single cell profiling

The Critical Role of IL-15-PI3K-mTOR Pathway in Natural Killer Cell Effector Functions

An open resource for T cell phenotype changes in COVID-19 identifies IL-10-producing regulatory T cells as characteristic of severe cases

Glucocorticoid and cytokine crosstalk: Feedback, feedforward, and co-regulatory interactions determine repression or resistance

Two distinct immunopathological profiles in autopsy lungs of COVID-19

Immunometabolism and natural killer cell responses

Impaired natural killer cell counts and cytolytic activity in patients with severe COVID-19

Genetic mechanisms of critical illness in Covid-19

Global Transcriptome Analysis in Influenza-Infected Mouse Lungs Reveals the Kinetics of Innate and Adaptive Host Immune Responses

Systems biology approach predicts immunogenicity of the yellow fever vaccine in humans

Migration of dendritic cell subsets and their precursors

PD-1-Expressing SARS-CoV-2-Specific CD8+ T Cells Are Not Exhausted, but Functional in Patients with COVID-19

limma powers differential expression analyses for RNA-sequencing and microarray studies

Current gaps in sepsis immunology: new opportunities for translational research. The Lancet Infectious Diseases

Characterization of Early Cytokine Responses and an Interleukin (IL)-6-dependent Pathway of Endogenous Glucocorticoid Induction during Murine Cytomegalovirus Infection

COVID-19 severity associates with pulmonary redistribution of CD1c + DCs and inflammatory transitional and nonclassical monocytes

COVID-19: risk for cytokine targeting in chronic inflammatory diseases?

Severe COVID-19 Is Marked by a Dysregulated Myeloid Cell Compartment

An algorithm for fast preranked gene set enrichment analysis using cumulative statistic calculation

Corticosteroids depress IFN-α-producing plasmacytoid dendritic cells in human blood

Elevated Calprotectin and Abnormal Myeloid Cell Subsets Discriminate Severe from Mild COVID-19

Bias, robustness and scalability in single-cell differential expression analysis

Simultaneous epitope and transcriptome measurement in single cells

Cell Hashing with barcoded antibodies enables multiplexing and doublet detection for single cell genomics

Comprehensive Integration of Single-Cell Data

Multi-Omics Resolves a Sharp Disease-State Shift between Mild and Moderate COVID-19

Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles

Human monocytes express CD163, which is upregulated by IL-10 and identical to p155

The multifaceted biology of plasmacytoid dendritic cells

Type I interferon negatively controls plasmacytoid dendritic cell numbers in vivo

The trinity of COVID-19: immunity, inflammation and intervention

Single-Cell Omics Reveals Dyssynchrony of the Innate and Adaptive Immune System in Progressive

Glucocorticoid treatment skews human monocyte differentiation into a hemoglobin-clearance phenotype with enhanced heme-iron recycling and antioxidant capacity

Unique immunological profile in patients with COVID-19

Advances in the understanding and treatment of sepsis-induced immunosuppression

Diverse Functional Autoantibodies in Patients with

Discriminating Mild from Critical COVID-19 by Innate and Adaptive Immune Single-cell Profiling of Bronchoalveolar Lavages

Global transcriptome analysis and enhancer landscape of human primary T follicular helper and T effector lymphocytes

Molecular and cellular insights into T cell exhaustion

Molecular Signature of CD8+ T Cell Exhaustion during Chronic Viral Infection

WHO Rapid Evidence Appraisal for COVID-19 Therapies (REACT) Working Group

Association Between Administration of Systemic Corticosteroids and Mortality Among Critically Ill Patients With COVID-19: A Meta-analysis

tidyverse: Easily Install and Load the "Tidyverse

A single-cell atlas of the peripheral immune response in patients with severe COVID-19

Factors associated with COVID-19-related death using OpenSAFELY

CD8 memory T cells have a bioenergetic advantage that underlies their rapid recall ability

A Host Transcriptional Signature for Presymptomatic Detection of Infection in Humans Exposed to Influenza H1N1 or H3N2

Characteristics of and Important Lessons From the Coronavirus Disease 2019 (COVID-19) Outbreak in China: Summary of a Report of 72 314 Cases From the Chinese Center for Disease Control and Prevention

Macrophage biology in development, homeostasis and disease

Single-cell RNA sequencing unveils an IL-10-producing helper subset that sustains humoral immunity during persistent infection

Transcriptome-Based Network Analysis Reveals a Spectrum Model of Human Macrophage Activation

Cell type-specific immune dysregulation in severely ill COVID-19 patients

Two waves of pro-inflammatory factors are released during the influenza A virus (IAV)-driven pulmonary immunopathogenesis

Single-cell landscape of immunological responses in patients with COVID-19

Inborn errors of type I IFN immunity in patients with life-threatening COVID-19

The huge Package for Highdimensional Undirected Graph Estimation in R

Viral load dynamics and disease severity in patients infected with SARS-CoV-2 in Zhejiang province

Clinical course and risk factors for mortality of adult inpatients with COVID-19 in Wuhan, China: a retrospective cohort study

A pneumonia outbreak associated with a new coronavirus of probable bat origin

Clinical Course of 195 Critically Ill COVID-19 Patients: A Retrospective Multicenter Study