key: cord-0991424-pc0adfr4
authors: Haynes, W. A.; Kamath, K.; Bozekowski, J.; Baum-Jones, E.; Campbell, M.; Casanovas-Massana, A.; Daugherty, P. S.; Dela Cruz, C. S.; Dhal, A.; Farhadian, S. F.; Fitzgibbons, L.; Fournier, J.; Jhatro, M.; Jordan, G.; Kessler, D.; Klein, J.; Lucas, C.; Luchsinger, L. L.; Martinez, B.; Muenker, M. C.; Pischel, L.; Reifert, J.; Sawyer, J. R.; Waitz, R.; Wunder, E. A.; Zhang, M.; Yale IMPACT Team; Iwasaki, A.; Ko, A. I.; Shon, J. C.
title: High-resolution mapping and characterization of epitopes in COVID-19 patients
date: 2020-11-26
journal: nan
DOI: 10.1101/2020.11.23.20235002
sha: f663d014e1faa2077c19bbf5ceaf941d6abfe59b
doc_id: 991424
cord_uid: pc0adfr4

Fine scale delineation of epitopes recognized by the antibody response to SARS-CoV-2 infection will be critical to understanding disease heterogeneity and informing development of safe and effective vaccines and therapeutics. The Serum Epitope Repertoire Analysis (SERA) platform leverages a high diversity random bacterial display library to identify epitope binding specificities with single amino acid resolution. We applied SERA broadly, across human, viral and viral strain proteomes in multiple cohorts with a wide range of outcomes from SARS-CoV-2 infection. We identify dominant epitope motifs and profiles which effectively classify COVID-19, distinguish mild from severe disease, and relate to neutralization activity. We identify a repertoire of epitopes shared by SARS-CoV-2 and endemic human coronaviruses and determine that a region of amino acid sequence identity shared by the SARS-CoV-2 furin cleavage site and the host protein ENaC-alpha is a potential cross-reactive epitope. Finally, we observe decreased epitope signal for mutant strains which points to reduced antibody response to mutant SARS-CoV-2.
Together, these findings indicate that SERA enables high resolution of antibody epitopes that can inform data-driven design and target selection for COVID-19 diagnostics, therapeutics and vaccines.

The novel coronavirus SARS-CoV-2 global pandemic has affected millions of people worldwide and led to a major healthcare crisis. Considerable research has gone into understanding the myriad symptoms that are seen in patients as well as the stark contrast between the large number of mild or asymptomatic cases and the staggering death toll around the world [1-5]. Determining the factors that contribute to different disease manifestations, severity and immunity is critical to adequate therapeutic intervention, improved patient outcomes, and vaccine design.

One avenue that is being extensively explored is the degree to which an immune response to the virus protects, or harms, an individual. Although it is possible that a pre-existing exposure to common coronaviruses may have a protective role during SARS-CoV-2 infection [6,7], it has also been proposed that antibodies to SARS-CoV-2 may sometimes be directly [...]

Of paramount urgency is the development of a vaccine against SARS-CoV-2. Along with the initial step of defining an effective vaccine for the immediate crisis, factors such as viral mutation rate and the uncertainty of long-term immunity could play a large role in ongoing management. It is unclear if it will be possible to develop "sterilizing immunity" to the virus, thus preventing infection completely [18] [19] [20]. A yearly "flu-type" immunization would necessitate [...]

[...] Epitope Repertoire Analysis (SERA), a high throughput, random bacterial peptide display technology that enables assessment of SARS-CoV-2 seropositivity and high-resolution mapping of epitopes across any arbitrary proteome, including wild-type SARS-CoV-2, its mutant strains, common coronaviruses, and the human proteome.
We have leveraged over 2,000 pre-pandemic immune repertoires and over 500 COVID-19 cases to identify the antigens and epitopes that elicit a SARS-CoV-2 humoral response. We show that while antibody profiles of individuals are heterogeneous, epitope-level resolution enables a range of analyses and visualizations, from the earliest epitopes to elicit an antibody response, to putative mapping of structural epitopes that may be important for neutralization or immunity. Combining epitope motifs into a panel yields a diagnostic classifier that distinguished nucleic acid test positive (NAT+) cases from controls with accuracy comparable to serological tests in current use. Differences in the quantity and quality of epitopes in mild versus moderate and severe disease can be seen at sites of biological and clinical interest. In silico analysis of epitope repertoires on wild-type and mutant SARS-CoV-2 proteins suggests that some mutations may result in loss of antibody reactivity to mutant SARS-CoV-2 infections, while analysis against the human proteome identified SARS-CoV-2 antibodies that may cross-react with human proteins and contribute to disease pathogenesis. These capabilities are all possible through informatics analysis of a single assay that requires a minimal amount of serum from each subject.

medRxiv preprint, this version posted November 26, 2020; https://doi.org/10.1101/2020.11.23.20235002. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

We applied SERA to discover and validate SARS-CoV-2 antigens and epitopes across the complete viral proteome from 779 COVID-19 serum samples taken from multiple cohorts of individuals with recent or past SARS-CoV-2 infection, which in total include 579 unique subjects (Table 1).
We additionally leverage a large database of 1997 pre-pandemic controls. The majority of the subjects were confirmed SARS-CoV-2 positive by nucleic acid testing (NAT). For Cohorts I, II, and III, extensive characterization was available for covariates that included disease severity, date of symptom onset, and in many cases, serological testing (Supplemental Table 1).

Patient samples were all screened using the previously published SERA assay, which enables high throughput characterization of antibody epitopes (Figure 1) [26,27]. In brief, serum or plasma is incubated with the randomized bacterial display peptide library; antibodies bind to peptides that mimic their natural epitopes and are then separated from unbound library members using affinity-coupled magnetic beads. The resulting bacterial pools are grown overnight, plasmids encoding the antibody-binding peptides are purified, and the peptide-encoding regions are PCR amplified and barcoded with well-specific PCR indices. Ninety-four samples are normalized, pooled together and sequenced via next-generation sequencing (NGS). The output of SERA is a set of approximately 1 million peptide sequences for each individual, representing their unique epitope repertoire. After SERA screening, we applied two complementary discovery tools, IMUNE and PIWAS, to identify antigens and epitopes involved in the SARS-CoV-2 immune response (Figure 1).
Characterization of SARS-CoV-2 proteome antigens and epitopes

To establish an understanding of relevant SARS-CoV-2 antigens and epitopes, we analyzed the SARS-CoV-2 proteome with protein-based immunome wide association studies (PIWAS). Briefly, PIWAS identifies epitope signal in the context of an arbitrary proteome by tiling and smoothing kmer sequences across the entire proteome [28]. PIWAS derives power at both the cohort and single sample level through statistical comparisons to a large database of pre-pandemic controls. Using the reference SARS-CoV-2 proteome from UniProt, we performed PIWAS of 579 COVID-19 samples compared to 497 pre-pandemic controls, with 1500 additional pre-pandemic controls serving as a normalization cohort. In addition to the established antigens spike and nucleocapsid, we observed highly significant signals for protein 3a, non-structural protein 8 (NSP-8), membrane protein, and replicase polyprotein 1ab (Figure 2A). We further examined epitope-level signal for the top IgG and IgM antigens identified by PIWAS (Figure 2B). Within spike and nucleoprotein, we observed multiple epitopes that are conserved across a large portion of the COVID-19 patient population. In contrast, epitope signals for protein 3a, NSP-8, and membrane protein (IgM) are largely characterized by a single, dominant epitope.

While the receptor binding domain (RBD) of spike is highly important in host infection by the virus, we observe no conserved epitope signal against this region of spike (amino acids 331-524). Instead, we observe private spike epitopes in a subset of patients in our cohorts (Figure 2C, Figure S1). We highlight patients with epitopes observed in multiple longitudinal draws, to decrease the likelihood of false positive signal.
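The tiling-and-smoothing idea described above for PIWAS can be illustrated with a minimal sketch. This is not the published PIWAS implementation: the k-mer length, the moving-average window, the `enrichment` lookup table, and the per-position z-score against controls are all assumptions chosen only to make the concept concrete.

```python
from statistics import mean, pstdev

def tile_kmers(protein, k=5):
    """Return every overlapping k-mer of the protein, in sequence order."""
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]

def smooth(values, window=3):
    """Centered moving average; the window shrinks at the sequence ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(mean(values[lo:hi]))
    return out

def tiled_signal(protein, enrichment, k=5, window=3):
    """Per-position signal: look up each k-mer's enrichment, then smooth.

    `enrichment` is a hypothetical dict mapping k-mers to an enrichment
    value for one sample; k-mers absent from the dict count as 0.
    """
    raw = [enrichment.get(kmer, 0.0) for kmer in tile_kmers(protein, k)]
    return smooth(raw, window)

def z_scores(case_signal, control_signals):
    """Standardize one sample's signal, position by position, against controls."""
    scores = []
    for pos, value in enumerate(case_signal):
        ctrl = [c[pos] for c in control_signals]
        sd = pstdev(ctrl) or 1.0  # fall back to 1.0 when controls are flat
        scores.append((value - mean(ctrl)) / sd)
    return scores
```

In this sketch a position stands out when its smoothed signal is far above what the control cohort shows at the same position, which mirrors the comparison to pre-pandemic controls described in the text.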
Unbiased, proteome-independent epitope analysis

The IMUNE algorithm identified mapping and non-mapping epitope motifs that were highly enriched in COVID-19 repertoires (Methods) [27]. Linear epitopes identified by IMUNE largely overlapped with those identified by PIWAS (Figure 2). The IgG linear motifs mapped to epitopes on spike glycoprotein (n=10), nucleoprotein (n=8) and NSP8 (n=2). IgM linear motifs mapped to a single epitope at the furin cleavage site on spike glycoprotein that was also a target for IgG antibodies, as well as one epitope on the SARS-CoV-2 membrane protein. A significant number of motifs identified by IMUNE did not directly map to linear regions of the SARS-CoV-2 proteome. We have observed from studies with monoclonal antibodies that non-mapping motifs tend to represent mimotopes of both linear and structural epitopes.

Motifs were selected for inclusion in the SARS-CoV-2 epitope map if they demonstrated a specificity of at least 98% in 497 pre-pandemic controls (Methods). The resulting SARS-CoV-2 panel of 45 IgG and 14 IgM motifs was compiled into a semi-quantitative epitope map, enabling visualization of motif enrichment for all evaluated COVID-19 and control samples (Figure 3A). We observed that an unlabeled, hierarchical clustering of samples based on these motif enrichments largely separates pre-pandemic control samples from COVID-19 patients.
Focusing on those motifs with linear hits to SARS-CoV-2, we further observed sub-clusters of patients with reactivity to specific isotypes and antigens, from left-to-right: spike IgG+IgM, spike and membrane IgM, spike IgM, nucleocapsid IgG, and broadly reactive.

[...] Table 2). We normalized and summed motif enrichments to generate a composite score and compared sub-panels to identify the panel with the maximal diagnostic performance on the training cohort (Methods). A composite score of ≥25 was set as a cutoff for both IgG and IgM panels to obtain a specificity of ≥99% on the pre-pandemic training controls (Table 1). The panel performance was evaluated on a test cohort of 427 COVID-19 samples that were confirmed positive by NAT (Table 1, testing cohorts I-III). The classifier with the best overall performance is shown in Figure 3B. The sensitivity varied between 54% and 82% across the NAT+ cases from different cohorts, primarily based on the timing of blood collection relative to symptom onset for each cohort. A specificity of 99.3% for IgG and 99.1% for IgM was achieved on a test set of 1500 pre-pandemic repertoires that were tested for acute illness. Combining the IgG and IgM panels into a single test resulted in a panel specificity of 98.7%. Notably, no pre-pandemic samples were co-positive for IgG and IgM, thus the specificity for subjects that were positive for both IgG and IgM was 100% in the test control set. Forty-two percent of all tested COVID-19 samples met these criteria.
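The composite-score classifier described above can be sketched as follows. The text states only that motif enrichments were normalized and summed and that a cutoff of 25 was used; the specific normalization here (dividing each enrichment by a per-motif scale factor) and all names are assumptions for illustration, not the authors' procedure.

```python
CUTOFF = 25.0  # composite-score cutoff quoted in the text

def composite_score(enrichments, scales):
    """Sum motif enrichments after dividing each by a per-motif scale.

    `scales` is a hypothetical normalization table; the actual
    normalization used for the SERA panel is described in the Methods.
    """
    return sum(enrichments[motif] / scales[motif] for motif in scales)

def is_positive(score, cutoff=CUTOFF):
    """A sample is called positive when its score meets or exceeds the cutoff."""
    return score >= cutoff

def sensitivity_specificity(case_scores, control_scores, cutoff=CUTOFF):
    """Sensitivity over known cases and specificity over known controls."""
    true_pos = sum(is_positive(s, cutoff) for s in case_scores)
    true_neg = sum(not is_positive(s, cutoff) for s in control_scores)
    return true_pos / len(case_scores), true_neg / len(control_scores)
```

Raising the cutoff trades sensitivity for specificity, which is why the cutoff is fixed on training controls before the panel is evaluated on a separate test cohort.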
We plotted the SERA scores for samples from cohorts I and II, where timing of the blood draw relative to date of symptom onset was provided (Figure 3B) [...] mapped to a single location on the surface of spike glycoprotein (Figure 3C). Using these same methods, we examined possible structural maps of motifs without linear maps to SARS-CoV-2. We highlight one exemplary motif, YWXYFXK, which was found to map to the RBD of spike glycoprotein (Figure S2). Based on our previous observations that tryptophan tends to confer structural characteristics on our peptide epitopes, we additionally examined the mapping of the slightly modified motif YXXYFXK, which we found also maps strongly to spike (Figure 3C). In addition to the highlighted match to spike, YXXYFXK had two less feasible maps to spike glycoprotein (Figure S2).

We also investigated the potential neutralization capacity of these motifs. We plotted the neutralization titer of each sample against the enrichment of the motif in those samples (Figure 3D). For both motifs, we observed that higher enrichment values tended to be present in patients with higher titer neutralization activity.

Motifs and epitopes associated with disease severity

Based on prior studies that described subjects with severe disease possessing a stronger and, perhaps, earlier humoral IgG response in both spike and nucleoprotein relative to subjects with mild disease [29,30], we examined differences in the epitopes detected by SERA across disease severity.
We compared the SERA IgG panel score (developed to distinguish COVID-19 patients from pre-pandemic controls) across the spectrum of severities present in our population (Figure 4A). We observed a significant elevation of the panel score in patients with severe or moderate disease compared to their mild disease counterparts. To understand the specific epitopes driving the severity delineation, we identified the 10 motifs with the most significant t-test p-value when comparing severe and mild disease (Figure 4B). We observe a potential confounding of days since onset of symptoms with the SERA IgG score (Figure S3). All 10 motifs were identified in the IgG screen and 9 out of 10 motifs did not possess a linear map to SARS-CoV-2. In the hierarchical clustering of samples, we observe subsets of severe patients with preferential enrichment for differing motifs. After splitting our data into 2/3 training and 1/3 testing cohorts, we built a simple LASSO model to classify moderate/severe from mild disease, and observed encouraging performance (training AUC 0.92, testing AUC 0.9, Figure S3).

One of the distinguishing features of the SARS-CoV-2 coronavirus is the acquisition of polybasic residues (RRAR) at the cleavage site of the S1/S2 boundary. Cleavage of spike protein at this site is required to enable viral membrane fusion [31,32].
It has been proposed that this novel sequence enables the virus to take advantage of host proteases, such as furin, that cleave proteins with this recognition sequence, thereby increasing the potential tropism of the virus relative to other coronaviruses [31,33]. We asked if this site elicited an immune response, and if so, whether it was seen differentially in subjects with different disease severity. In the spike epitope map, signal at this sequence location is both prominent and prevalent in the cohorts: 120 out of 385, or 31% of subjects, had epitope signals >99% of that seen in controls. We also determined that the site elicited a statistically significantly stronger immune response in subjects with severe disease relative to subjects with mild or moderate disease (Figure 4C). Specifically, 39%, 23%, and 20% of severe, moderate, and mild cases, respectively, had strong epitope signals greater than 99% of that in the controls.

The novel eight amino acid furin cleavage site (RRAR|SVAS) in spike maps identically to a peptide sequence in one protein in the human proteome, the amiloride-sensitive sodium channel ENaC-α protein [33]. This protein is expressed on the surface of multiple tissues implicated in COVID-19 pathology and, similar to spike, requires cleavage for activation. As the sites share the eight amino acid furin cleavage sequence, not surprisingly, we see a highly correlated PIWAS immune signal in both proteins (Figure S3) that is also statistically stronger in severe disease relative to mild or moderate disease (Figure 4D). We also note that in severe cases, a number of very strong epitopes in ENaC-α outside of the cleavage site are seen relative to mild cases. The signal at both sites was also seen to increase over time, particularly between 2 and 4 weeks, indicating a likely adaptive immune response to this site (Figure S3).
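The kind of exact sequence identity described above, an eight-residue stretch shared between a viral and a host protein, can be detected with a simple k-mer intersection. The sketch below is illustrative only: the two fragments are placeholders built around the RRARSVAS octamer quoted in the text, with made-up flanking residues, and are not the real spike or ENaC-α sequences. The `percent` helper simply reproduces the 120-of-385 prevalence arithmetic.

```python
def shared_kmers(seq_a, seq_b, k=8):
    """Return (kmer, position_in_a, position_in_b) for each k-mer found in both.

    Positions are 0-based; for seq_b the first occurrence is reported.
    """
    in_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    hits = []
    for i in range(len(seq_a) - k + 1):
        kmer = seq_a[i:i + k]
        if kmer in in_b:
            hits.append((kmer, i, seq_b.find(kmer)))
    return hits

def percent(count, total):
    """Prevalence as a whole-number percentage."""
    return round(100 * count / total)

# Placeholder fragments: only the central octamer comes from the text.
viral_fragment = "NSP" + "RRARSVAS" + "QSI"
host_fragment = "KLM" + "RRARSVAS" + "DEF"

print(shared_kmers(viral_fragment, host_fragment))
print(percent(120, 385))
```

An exact-match scan like this finds identical stretches only; it says nothing about whether antibodies actually cross-react, which the text treats as a hypothesis requiring functional follow-up.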
In addition to spike and nucleoprotein, a robust immune response has been described against the ORF8 protein [34]. Several reports have described a variant of SARS-CoV-2 with a 382-nucleotide deletion in ORF7b and ORF8 as well as an association of the deletion with a milder disease course [35]. While we do not have genotype information for all strains, based on the GISAID database we assume that most of the samples in our cohorts do not have this deletion. To explore the possible association of immune response with disease severity, we analyzed the PIWAS signal against ORF8, which encompasses most of the 382-nucleotide deletion. While there appear to be more extremely high signals in severe cases, using an outlier sum statistic, the PIWAS signal in ORF8 does not reach statistical significance in severe cases relative to mild [...]

[...] average PIWAS value. After evaluating enrichment for these cross-reactive spike epitopes in COVID-19 cases with different disease severity, we found there was no statistical difference between severe, moderate, and mild cases (Figure 5C).

In contrast to spike, nucleoprotein exclusively contained epitopes specific to SARS-CoV-2 and SARS, with 4 epitopes against the SARS-CoV-2 proteome (Figure 5D).
Strong epitopes were observed against SARS-CoV-2 at regions 150-178 (alignment indices [...]) and 392-419 (alignment indices 480-510) with no signal observed against hCoVs (Figure 5E). We determined that these nucleoprotein epitopes were significantly enriched in severe and/or moderate cases compared to mild cases (Figure 5F).

Epitope signal in mutated SARS-CoV-2 strains

To study the possible effects of known mutations to the SARS-CoV-2 virus on antibody response, we leveraged the ability of PIWAS to interrogate the SERA database with any sequence of interest. In the 96,437 sequenced strains from GISAID, we enumerated 21,127 distinct amino acid mutations to spike glycoprotein, nucleoprotein, envelope protein, and membrane protein [36,37]. For each mutation, we compared epitope signal against the wild-type (WT) and mutant position in every COVID-19 specimen. We observed a bias towards mutations yielding a decreased PIWAS signal relative to WT (Figure 6A). A subset of these mutations yielded decreased signal across a large number of COVID-19 patients (Figure 6B). To assess the significance of this decreased epitope signal, we randomly mutated amino acids in silico throughout the same protein sequences to generate a null distribution. The Kolmogorov-Smirnov test comparing the observed and null distributions was highly significant (p=3e-11), indicating that the bias towards mutants that generate a decreased epitope signal exceeds that which would be expected by chance alone (Figure S4). For membrane protein, nucleoprotein, and spike glycoprotein, we highlight exemplar mutations which resulted in decreased epitope signal across a large number of patients (Figure 6C) and, in the case of spike glycoprotein, are on the
surface of the protein according to the crystal structures considered in this paper [31,32]. In contrast, the dominant spike glycoprotein mutation D614G exhibits no epitope signal for either the wild-type or mutant strains (Figure 6D).

While conventional serology is a cornerstone of infectious disease diagnosis, the COVID-19 pandemic has raised many questions not answered by these testing modalities alone. Here we have shown that high-content random bacterial peptide display library screening using SERA provides a tool to broadly and deeply probe individual antibody repertoires. These profiles, both individually and in the aggregate, can yield insights into disease severity, immunity, cross-reactivity to other coronaviruses (including SARS-CoV-2 mutant strains), and autoimmune sequelae.

By taking a focused, proteome-constrained approach to identifying signal against the SARS-CoV-2 proteome, we both reiterate the established immunological relevance of spike and nucleoprotein and identify less described signals against protein 3a and NSP-8. Epitope-level characterization of these antigens highlights particularly immunogenic epitopes within each protein, which might serve as targets in the development of vaccines and therapeutics. In [...]

[...] likely yielding an abundance of structural mimotopes (also reflected in the quantity of non-mapping motifs in our diagnostic panel).
Leveraging our database of thousands of pre-pandemic repertoires collected from healthy individuals as well as people with infections, autoimmune diseases, and cancer across all age groups and geographies, we were able to assess the specificity of the SARS-CoV-2 antibody response and identify a panel of epitopes that could distinguish COVID-19 cases from controls with accuracy similar to conventional serological testing.

We further investigated the possible origins of the non-mapping motifs by attempting to map them structurally to spike glycoprotein. We validated the mapping by showing that it accurately identifies the linear motif LPFQQ, and then applied the method to non-mapping motifs. We find that the motif YWXYFXK exclusively maps to the RBD. However, previous observations have suggested that tryptophan (W) may be more important for conferring structure to the mimotope than for identity mapping, yet still the more general motif YXXYFXK maps to RBD as well. When combined with the neutralization titers of samples in which this motif is enriched, it is possible to speculate about the mechanism of neutralization. If the antibodies that recognize this motif bind to the RBD of spike glycoprotein, they may block the ability of SARS-CoV-2 to bind to ACE2 and inhibit viral entry and infection.

One of the ongoing areas of development in this approach is that while we have assumed that the identity of the residues remains constant between motif and the epitope, it is possible that amino acid substitutions could be allowed. We have attempted a first pass to mitigate this by allowing for a more general mapping with the modification of the W to an X, but we continue to iterate on this model to more accurately account for residue mismatches in mapping motifs to the structure. While these methods are still under development, the results here demonstrate the applications of such a method.
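The wildcard relaxation discussed above, replacing W with X so that the motif tolerates any residue at that position, can be made concrete for the linear case. The sketch below treats X as "any amino acid" and scans a sequence for matches. It covers only contiguous linear matching; the structural (surface) mapping used in the paper is a different and more involved procedure that is not reproduced here, and the test sequences are placeholders rather than real spike sequence.

```python
import re

def motif_to_regex(motif, wildcard="X"):
    """Translate a motif such as 'YXXYFXK' into a regex; X matches any residue."""
    return "".join("." if ch == wildcard else re.escape(ch) for ch in motif)

def find_motif(motif, sequence):
    """Return 0-based start positions of all linear matches, overlaps included."""
    # A lookahead makes each match zero-width so overlapping hits are reported.
    pattern = re.compile("(?=" + motif_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(sequence)]

def relax(motif, position, wildcard="X"):
    """Replace one fixed residue of the motif with the wildcard."""
    return motif[:position] + wildcard + motif[position + 1:]

strict = "YWXYFXK"
general = relax(strict, 1)          # W at index 1 becomes X
placeholder_seq = "AAYGGYFAKAA"     # made-up sequence, not spike

print(general)
print(find_motif(strict, placeholder_seq))
print(find_motif(general, placeholder_seq))
```

On the placeholder sequence the strict motif finds nothing while the relaxed one matches, which illustrates the trade-off the text describes: relaxing a position recovers candidate maps at the cost of admitting more, and possibly less plausible, hits.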
Consistent with previous studies, we find that the humoral immune response against SARS-CoV-2 is stronger in severe and moderate disease relative to mild disease [4,39,40]. This finding is consistent with a general pattern of disease associated with immunopathology in COVID-19. We also identified specific epitope profiles that correlate with disease severity and combined these epitopes into a preliminary disease severity classification model. To further validate these findings we would require a separate validation cohort of patients spanning mild and severe disease states. Importantly, many of our disease severity analyses are [...]

Using in silico analysis of repertoires on the human proteome, we are also able to identify candidate cross-reactive or novel autoantigen epitopes that may be important in disease pathogenesis. The polybasic cleavage site seen in SARS-CoV-2 is unique among coronaviruses and potentially enables it to increase its tissue tropism [33]. We demonstrate that the immune response at this site is predicted to be both significantly prominent and prevalent [...]

[...] has also noted that binding at this site reduces the RBD-ACE2 binding energy, and therefore could be a potential site for neutralizing antibodies [44].
Functional analyses and experiments are thus required to investigate whether cross-reactivity occurs and to distinguish the effect of antibodies binding to either the viral or host antigens at this site. The ability to query potential autoantigen signal using SERA in the context of SARS-CoV-2 infection and epitopes is an area for continued inquiry given the mounting data supporting the significance of autoantigens in the immunopathology of COVID-19 [10,45].

Milder disease has been described in subjects with the 382-nucleotide ORF8 deletion variant, and the ORF8 protein has been noted to be associated with strong humoral response [35,46]. In our study, we also see significant response relative to a pre-pandemic cohort in ORF8. While a few epitopes in ORF8 appear quite strong in some individuals with severe disease, the overall signal across the antigen did not differ significantly between mild and severe disease. Specific, strong epitope signals in ORF8 could be postulated to contribute to severe disease through a variety of mechanisms, but this would also need to be explored through further epidemiological and experimental analysis.

By evaluating epitope signal in COVID-19 cases against common human coronavirus (hCoV) proteomes, we predicted prevalent cross-reactive epitopes, particularly in the S2 domain of spike. Given the strength and prevalence of these cross-reactive epitopes, it is plausible that previous exposure to hCoVs contributed to these antibody responses, a boosting phenomenon recently described in COVID-19 cases [47]. In particular, the cross-reactive epitope at spike amino acids 809-834 near the fusion domain has been shown to elicit an antibody response in SARS-CoV-2 uninfected adolescents and adults [48]. Interestingly, antibodies targeting this epitope demonstrated neutralizing capacity using antibody depletion assays [38].
More broadly, the presence of spike-reactive T cells in healthy donors has been observed against SARS-CoV-2 as well as hCoVs 229E and OC43, primarily reactive towards the spike S2 domain [6]. While these findings suggest a role for cross-reactive epitopes in the response to SARS-CoV-2 infection, it is uncertain what impact pre-existing antibodies have towards protection, immunity, and disease progression. Recent studies suggest that pre-existing antibodies from hCoVs exist but are not associated with protection [47,49]. We observed that prevalent cross-reactive epitopes in spike were not associated with COVID-19 severity, while multiple nucleoprotein epitopes specific to SARS-CoV-2 and SARS were significantly enriched in severe cases compared to mild. Notably, it has been shown that convalescent COVID-19 patients exhibited a shift in antibody response towards spike compared to a nucleoprotein-directed antibody response in deceased patients [3]. Given that cross-reactive epitopes were observed in spike, additional investigation will be critical towards understanding pre-existing antibody responses that may impact SARS-CoV-2 infection and COVID-19 progression.

The decreased epitope signal in COVID-19 patients against mutant strains of SARS- [...]

The dominant strain of SARS-CoV-2 which is now in circulation possesses the D614G mutation. Based on our data, neither the wild-type nor the mutant confers a significant linear epitope, consistent with observations that the mutation is most notable for its effect on the structure of spike [55].
We acknowledge various limitations with the SERA platform that impact this study. Much of this study has focused on dominant epitopes prevalent in COVID-19 cases, but many of the private epitopes not explicitly discussed here, particularly in spike RBD, are critical to fully

While a random peptide library enables unique opportunities to identify structural mimics, much work remains in cataloguing and mapping these mimics to their cognate antigens.

In summary, we present the application of SERA to assess SARS-CoV-2 seropositivity and to characterize a high-resolution map of motifs and epitopes in individuals and populations. We demonstrate the ability of the platform to assess disease severity, to identify structural motifs associated with neutralization, to compare in silico epitope response to multiple coronavirus strains, to assess potential immune escape at sites of variation, to evaluate longitudinal changes in signal, and to reveal potential autoantigen response, all with one assay. The random nature of the libraries, the ability to identify linear mimics of structural epitopes, and the ability to leverage quality-controlled reference data from a large pre-pandemic cohort all contribute to SERA's ability to elucidate the humoral immune response in SARS-CoV-2 infection.

Our findings support those of other studies that find clear differences in the humoral response of individuals with different clinical severity and trajectories. While we may identify associations between high resolution epitope and motif signals and disease severity, much work is required to establish functional or causal relationships.
Examining and correlating epitopes to clinical efficacy in the context of vaccines and therapeutic antibodies will help to elucidate the connection between measured immune response and patient outcome.

Yet the epitope landscape can change, as it is already clear that coronaviruses mutate and SARS-CoV-2 is no exception. Potential changes in the infectivity of the virus in just this first year of the current health crisis underscore the need to track evolving immune responses and

Biobanked sera or plasma from individuals that previously tested positive for COVID-19 were provided by the SBCH Biobank. Clinical data, including age, sex, and disease severity, were obtained by SBCH staff for inclusion in the biobank. Specimens were collected from both

based on a positive SERA IgG or IgM result that was subsequently confirmed by S1 spike and nucleocapsid ELISA IgG in the majority of cases. Cases from healthy donors were classified as mild disease.
Spike S1 and nucleoprotein ELISA

The SARS-CoV-2 Spike S1 and N antigen ELISA data were provided by Yale and LabCorp. Spike S1 and nucleoprotein ELISAs on the SBCH COVID-19 samples were performed in house using recombinant proteins (ACRO Biosystems, S1N-C52H3 and NUN-C5227, respectively). A cut-off value for positivity was established using 3 times the standard deviation of 502 pre-pandemic controls for the IgG and 82 pre-pandemic controls for the IgM assays. Briefly, plates (Nunc MaxiSorp) were coated with recombinant proteins, 0.5 µg/ml for IgG and 1 µg/ml for IgM, at 4 °C overnight. After washing, plates were blocked with PBS containing 5% non-fat milk for 2 hours at room temperature. Plates were then incubated with serum diluted 1/250 in blocking buffer for 1 hour at room temperature. Plates were washed, then incubated with HRP-goat anti-human IgG or HRP-donkey anti-human IgM (Jackson ImmunoResearch) secondary antibody diluted 1/10,000 in blocking buffer for 1 hour at room temperature. After washing, the reaction was developed with 3,3',5,5'-tetramethylbenzidine substrate solution (ThermoFisher) for

Patient and healthy donor sera were isolated as before and then heat treated for 30 min at 56 °C.
Sixfold serially diluted plasma, from 1:3 to 1:2430, was incubated with SARS-CoV-2 for 1 h at 37 °C. The mixture was subsequently incubated with VeroE6 cells in a 6-well plate for 1 hour, for adsorption. Then, cells were overlaid with MEM supplemented with NaHCO3, 4% FBS

To generate a diagnostic score that classified subjects as serologically positive for antibodies to COVID-19, motif enrichment values were normalized using the mean and standard deviation of enrichments within the training set of pre-pandemic control repertoires. Individual SARS-CoV-2 motif normalized "z-scores" were then summed to obtain a composite score for each sample. A composite score of 25 was established as a cutoff for positivity for each panel to obtain a specificity of >99% on the pre-pandemic training controls.

Structural motif mapping

Structural motif mapping was carried out by identifying a network of neighboring residues on the surface of a protein structure and looking in that network for matches to the motif of interest. Neighboring residues were defined as residues whose α-carbons were within 8 Å of each other. The surface of the protein structure was calculated using the MSMS program 62 with a probe radius of 2.5 Å. These values are in line with other algorithms for mapping mimotopes to structures 63,64 and were further optimized using our in-house dataset of monoclonal antibodies (data not shown).
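The composite diagnostic score described in this section (per-motif z-scores against pre-pandemic controls, summed and compared with a cutoff of 25) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' pipeline: the motif names, the enrichment values, the function names, the use of the sample standard deviation, and the choice of `>=` at the cutoff are all hypothetical.

```python
from statistics import mean, stdev

CUTOFF = 25.0  # composite-score cutoff for positivity stated in the text


def fit_controls(control_enrichments):
    """Per-motif (mean, SD) from pre-pandemic control repertoires.

    control_enrichments maps a motif name to its enrichment values across
    the control training set. Sample SD is an assumption of this sketch.
    """
    return {m: (mean(v), stdev(v)) for m, v in control_enrichments.items()}


def composite_score(sample, params):
    """Sum of per-motif z-scores for one sample."""
    return sum((sample[m] - mu) / sd for m, (mu, sd) in params.items())


def is_positive(sample, params, cutoff=CUTOFF):
    """Assumption: a score equal to the cutoff counts as positive."""
    return composite_score(sample, params) >= cutoff


# Toy data with hypothetical motif names and enrichment values.
controls = {"motif_A": [1.0, 2.0, 3.0], "motif_B": [0.0, 2.0, 4.0]}
params = fit_controls(controls)
covid_like = {"motif_A": 22.0, "motif_B": 14.0}
control_like = {"motif_A": 2.5, "motif_B": 1.0}
print(composite_score(covid_like, params), is_positive(covid_like, params))
print(composite_score(control_like, params), is_positive(control_like, params))
```

In practice the cutoff would be chosen, as the text describes, so that fewer than 1% of the pre-pandemic training controls score above it.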
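As an illustration of the neighbor-network search described above, the following Python sketch builds the 8 Å α-carbon neighbor graph and enumerates motif matches within it. It is not the authors' implementation: the residue tuples, the function names `build_neighbors` and `find_motif_matches`, the toy coordinates, and the rule that consecutive motif positions must be neighbors are assumptions made for the example. Surface residues are assumed to have been selected beforehand (for instance with MSMS).

```python
from itertools import combinations
from math import dist


def build_neighbors(residues, cutoff=8.0):
    """Map each residue id to the ids whose C-alpha lies within cutoff (Å)."""
    neighbors = {rid: set() for rid, _, _ in residues}
    for (id_a, _, xyz_a), (id_b, _, xyz_b) in combinations(residues, 2):
        if dist(xyz_a, xyz_b) <= cutoff:
            neighbors[id_a].add(id_b)
            neighbors[id_b].add(id_a)
    return neighbors


def find_motif_matches(residues, motif, cutoff=8.0):
    """Return tuples of distinct residue ids that spell out the motif.

    motif is a list of sets of allowed amino acids, one set per position.
    Assumption: each residue must neighbor the residue matched just before it.
    """
    amino_acid = {rid: aa for rid, aa, _ in residues}
    neighbors = build_neighbors(residues, cutoff)
    matches = []

    def extend(path):
        if len(path) == len(motif):
            matches.append(tuple(path))
            return
        for nxt in sorted(neighbors[path[-1]]):
            if nxt not in path and amino_acid[nxt] in motif[len(path)]:
                extend(path + [nxt])

    for rid, aa, _ in residues:
        if aa in motif[0]:
            extend([rid])
    return matches


# Hypothetical surface residues: (id, amino acid, C-alpha coordinates in Å).
surface = [
    (1, "S", (0.0, 0.0, 0.0)),
    (2, "E", (5.0, 0.0, 0.0)),
    (3, "R", (10.0, 0.0, 0.0)),
    (4, "I", (5.0, 6.0, 0.0)),
    (5, "K", (30.0, 0.0, 0.0)),
]
motif = [{"S"}, {"E"}, {"R", "I"}]  # the motif SE[RI]
print(find_motif_matches(surface, motif))
```

With these toy coordinates the same S and E residues pair with either the R or the I residue, giving two matches for the one motif.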
Once each match to a motif was found in the surface network of neighbors, each residue was scored by the number of ways it was found to match a motif. For example, the motif SE[RI] might map to SER on the surface, and additionally to SEI on the surface, where the same S and

We identified UniProt reference proteomes for the four common human coronaviruses

calculated and compared PIWAS scores for the wild-type and mutant sequences. To assess significance of the observed bias, we generated in silico random mutations to these same proteins and performed the same analysis. We compared the actual and random signals using a

Serimmune, paid employment at Serimmune, board membership at Serimmune, and patent applications on behalf of Serimmune.
References

Estimation of the asymptomatic ratio of novel coronavirus infections (COVID-19)
Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship
Distinct Early Serological Signatures Track with SARS-CoV-2 Survival
Linear B-cell epitopes in the spike and nucleocapsid proteins as markers of SARS-CoV-2 exposure and disease severity
Deep immune profiling of COVID-19 patients reveals distinct immunotypes with therapeutic implications
SARS-CoV-2-reactive T cells in healthy donors and patients with COVID-19
Selective and cross-reactive SARS-CoV-2 T cell epitopes in unexposed humans. Science
Covid-19 and autoimmunity
Mapping Systemic Inflammation and Antibody Responses in Multisystem
10. Auto-antibodies against type I IFNs in patients with life-threatening COVID-19. Science
Do cross-reactive antibodies cause neuropathology in COVID-19?
Immune-mediated neurological syndromes in SARS-CoV-2-infected patients
COVID-19 and flu, a perfect storm
Coronavirus creates a flu season guessing game
The lasting misery of coronavirus long-haulers
Why the Patient-Made Term 'Long Covid' is needed
Rapid Decay of Anti-SARS-CoV-2 Antibodies in Persons with Mild
Seasonal coronavirus protective immunity is short-lasting
Will SARS-CoV-2 become endemic? Science (2020)
& Lipsitch, M. Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period. Science
22. Antibody responses to SARS-CoV-2 in patients with COVID-19
Robust neutralizing antibodies to SARS-CoV-2 infection persist for months
Viral epitope profiling of COVID-19 patients reveals cross-reactivity and correlates of severity
Linear epitopes of SARS-CoV-2 spike protein elicit neutralizing antibodies in COVID-19 patients
Antibody epitope repertoire analysis enables rapid antigen discovery and multiplex serology
Identification of disease-specific motifs in the antibody specificity repertoire via next-generation sequencing
Wide Association Studies (PIWAS) for the discovery of significant disease-associated antigens
Serum-IgG responses to SARS-CoV-2 after mild and severe COVID-19 infection and analysis of IgG non-responders
Antibody Profiles According to Mild or Severe SARS-CoV-2 Infection
Receptor binding and priming of the spike protein of SARS-CoV-2 for membrane fusion. Nature 1-8 (2020). doi:10.1038/s41586-020-2772-0
32. SARS-CoV-2 and bat RaTG13 spike glycoprotein structures inform on virus evolution and furin-cleavage effects
SARS-CoV-2 strategically mimics proteolytic activation of human ENaC. eLife (2020)
ORF8 and ORF3b antibodies are accurate serological markers of early and late SARS-CoV-2 infection
Effects of a major deletion in the SARS-CoV-2 genome on the severity of infection and the inflammatory response: an observational cohort study
Data, disease and diplomacy: GISAID's innovative contribution to global health
Global initiative on sharing all influenza data - from vision to reality
Two linear epitopes on the SARS-CoV-2 spike protein that elicit neutralising antibodies in COVID-19 patients
Neutralizing Antibody Responses to Severe Acute Respiratory Syndrome Coronavirus 2 in Coronavirus Disease 2019 Inpatients and Convalescent Patients
Antibody responses against SARS coronavirus are correlated with disease outcome of infected individuals
Lung symptoms in pseudohypoaldosteronism type 1 are associated with deficiency of the α-subunit of the epithelial sodium channel. J. Pediatr. 135, 739-745
Autosomal dominant pseudohypoaldosteronism type 1: Mechanisms, evidence for neonatal lethality, and phenotypic expression in adults
A Pathophysiological Model for COVID-19: Critical Importance of
Enhanced binding of SARS-CoV-2 spike protein to receptor by distal polybasic cleavage sites
Coagulopathy and Antiphospholipid Antibodies in Patients with Covid-19
The ORF8 Protein of SARS-CoV-2 Mediates Immune Evasion through
Seasonal human coronavirus antibodies are boosted upon SARS-CoV-2 infection but not associated with protection
Preexisting and de novo humoral immunity to SARS-CoV-2 in humans
Absence of SARS-CoV-2 neutralizing activity in pre-pandemic sera from individuals with recent seasonal coronavirus infection. medRxiv (2020)
Glycoprotein To Avoid Neutralization Escape
Effects of Human Anti-Spike Protein Receptor Binding Domain Antibodies on Severe Acute Respiratory Syndrome Coronavirus Neutralization Escape and Fitness
Neutralizing human monoclonal antibodies to severe acute respiratory syndrome coronavirus: target, mechanism of action, and therapeutic potential
Studies in humanized mice and convalescent humans yield a SARS-CoV-2 antibody cocktail. Science
Antibody cocktail to SARS-CoV-2 spike protein prevents rapid mutational escape seen with individual antibodies. Science
SARS-CoV-2 Spike protein variant D614G increases infectivity and retains sensitivity to antibodies that target the receptor binding domain. bioRxiv (2020)
Structure, Function, and Antigenicity of the SARS-CoV-2 Spike
Structural Impact of Mutation D614G in SARS-CoV-2 Spike Protein: Enhanced Infectivity and Therapeutic Opportunity
The D614G mutation of SARS-CoV-2 spike protein enhances viral infectivity and decreases neutralization sensitivity to individual convalescent sera
The D614G mutation in the SARS-CoV-2 spike protein reduces S1 shedding and increases infectivity. bioRxiv Prepr. Serv. Biol. (2020)
UniProt: the universal protein knowledgebase
Reduced surface: An efficient way to compute molecular surfaces
MIMOX: A web tool for phage display based epitope mapping
Pepitope: Epitope mapping from affinity-selected peptides
The Protein Data Bank
RCSB Protein Data Bank: Biological macromolecular structures enabling research and education in fundamental biology, biomedicine, biotechnology and energy
PDB file parser and structure class
Biopython: Freely available Python tools for computational molecular
All p-values were calculated using the outlier sum statistical test.

medRxiv preprint, this version posted November 26, 2020; https://doi.org/10.1101/2020.11.23.20235002. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.