key: cord-0426921-rx7z5jnx authors: Kubiniok, Peter; Marcu, Ana; Bichmann, Leon; Kuchenbecker, Leon; Schuster, Heiko; Hamelin, David; Despault, Jérome; Kovalchik, Kevin; Wessling, Laura; Kohlbacher, Oliver; Stevanovic, Stefan; Rammensee, Hans-Georg; Neidert, Marian C.; Sirois, Isabelle; Caron, Etienne title: The Global Architecture Shaping the Heterogeneity and Tissue-Dependency of the MHC Class I Immunopeptidome is Evolutionarily Conserved date: 2020-09-29 journal: bioRxiv DOI: 10.1101/2020.09.28.317750 sha: 29943e10346b2385d9aa8d8756a929b3748c2cc5 doc_id: 426921 cord_uid: rx7z5jnx
Understanding the molecular principles that govern the composition of the mammalian MHC-I immunopeptidome (MHC-Ii) across different primary tissues is fundamentally important to predict how T cells respond in different contexts in vivo. Here, we performed a global analysis of the mammalian MHC-Ii from 29 and 19 primary human and mouse tissues, respectively. First, we observed that different HLA-A, -B and -C allotypes do not contribute evenly to the global composition of the MHC-Ii across multiple human tissues. Second, we found that peptides that are presented in a tissue-dependent and -independent manner have very distinct properties. Third, we discovered that proteins that were evolutionarily hyperconserved represent the primary source of the MHC-Ii at the organism-wide scale. Finally, we uncovered a remarkable antigen processing and presentation network that may drive the high level of heterogeneity of the MHC-Ii across different tissues in mammals. This study opens up new avenues toward a system-wide understanding of antigen presentation in vivo and may serve as groundwork to understand tissue-dependent T cell responses in autoimmunity, infectious diseases and cancer.
In adaptive immunity, CD8+ T cells have the ability to eradicate abnormal cells through recognition of small peptide fragments presented by MHC (human leucocyte antigen (HLA) in humans) class I molecules. 
In this context, jawed vertebrates evolved an important antigen processing and presentation system capable of presenting thousands of different MHC class I peptides on the surface of virtually any nucleated cell (Neefjes et al., 2011). In mammals, around 200 different cell types are decorated by large repertoires of self-MHC-I-associated peptides, collectively referred to as the mammalian MHC-I immunopeptidome (MHC-Ii) (Caron et al., 2017; Vizcaíno et al., 2019). The inter- and intra-individual complexity of the MHC-Ii accounts for its overall heterogeneity (Gfeller and Bassani-Sternberg, 2018; Maccari et al., 2017; Vizcaíno et al., 2019). In fact, each MHC-I allotype generally presents a distinct subset of peptide antigens, which are characterized by the presence of specific anchor residues that are necessary to bind MHC-I (Falk et al., 1991). In humans, up to six different HLA-I allotypes are expressed at the individual level, and thousands, if not millions, of different HLA-I allotypes are expressed across human populations, hence increasing enormously the inter-individual heterogeneity of the MHC-Ii. In contrast, the murine MHC-Ii is much simpler. For instance, in the C57BL/6 mouse strain, peptide antigens are presented by only two classical MHC-I molecules (H2Db and H2Kb), and a total of approximately 200 different MHC-I allotypes are expressed among the most commonly used mouse strains (http://www.imgt.org/IMGTrepertoireMH/Polymorphism/haplotypes/mouse/MHC/Mu_haplotypes.html). In addition to its allotype-dependent composition, the mammalian MHC-Ii is also complicated by its tissue-dependency. In fact, two pioneering mapping studies recently pointed toward large variations in the repertoire of MHC-I-associated peptides across different tissues (Schuster et al., 2018) (Marcu et al. accompanying manuscript; https://www.biorxiv.org/content/10.1101/778944v2). 
However, very little is known about the molecular principles that shape the tissue-dependent processing and presentation of peptide antigens at the organism level. Classical biochemistry approaches have established the blueprint of antigen processing and presentation (Neefjes et al., 2011; Yewdell et al., 2003) . In a nutshell, the biogenesis of peptides presented by MHC-I molecules is initiated with the transcription and translation of the source genes, and the resulting proteins are typically degraded by the proteasome and/or immunoproteasome in the nucleus and cytosol (Kincaid et al., 2011) . Cytosolic peptides are rapidly targeted by cytosolic aminopeptidases, such as thimet oligopeptidase (TOP) (York et al., 2003) , leucine aminopeptidase (LAP) (Towne et al., 2005) , and tripeptidyl peptidase II (TPPII) (Reits et al., 2004) , which trim and destroy most peptides. A fraction of peptides escapes destruction by translocation into the endoplasmic reticulum (ER) lumen via transporter associated with antigen presentation (TAP) (Reits et al., 2000; Yewdell et al., 2003) . In the ER, peptides may be further trimmed by ER aminopeptidase associated with antigen processing (ERAAP) and then bind MHC-I molecules for stabilization by the peptide loading complex (Blees et al., 2017; Serwold et al., 2002) . Once stable, MHC-I-peptide complexes are released from the ER and are transported to the cell surface for peptide presentation to CD8+ T cells. Modern immunopeptidomics is driven by high-resolution mass spectrometry (MS) and investigates the composition and dynamics of the MHC-Ii (Caron et al., 2015) . Complementing classical biochemistry techniques, immunopeptidomic technology platforms have yielded important systematic insights into the biogenesis of the MHC-Ii (Granados et al., 2015) . 
For instance, they have refined binding motifs for a wide range of MHC-I alleles in human (Abelin et al., 2017; Gfeller and Bassani-Sternberg, 2018), they have indicated that large numbers of MHC-I peptides derive from genomic 'hotspots' (Müller et al., 2017; Pearson et al., 2016) (Marcu et al. accompanying manuscript) as well as non-coding genomic regions (Laumont et al., 2018), and they have demonstrated that abundant transcripts and proteins contribute preferentially to the composition of the MHC-Ii (Abelin et al., 2017; Bassani-Sternberg et al., 2015; Fortier et al., 2008; Granados et al., 2012; Pearson et al., 2016). Furthermore, immunopeptidomic approaches have validated that defective ribosomal products (DRiPs), immunoproteasome subunits as well as other key players involved in the processing of peptide antigens (e.g. proteasome, ERAAP) markedly influence the repertoire of peptides presented by MHC-I molecules (Bourdetsky et al., 2014; Milner et al., 2013; Nagarajan et al., 2016; Trentini et al., 2020; Verteuil et al., 2010). The understanding of how the MHC-Ii is generated in different primary tissues in vivo, in human as well as in animal models, is fundamentally important to rationalize and predict how T cells respond in various contexts (Tscharke et al., 2015). However, immunopeptidomics studies that focused on the systematic deciphering of the MHC-Ii biogenesis have been almost exclusively conducted in transformed cells. Therefore, the rules that govern the composition and tissue-dependency of the mammalian MHC-Ii remain poorly understood and many fundamental questions remain unanswered to date. For instance, what is the relative contribution of individual HLA-I allotypes to the composition of the MHC-Ii within and across tissues? To what extent does the MHC-Ii conceal tissue-specific patterns/signatures that are conserved across species? 
What are the many transcription factors, proteases and trafficking proteins involved in the generation, elimination and transport of MHC-I peptides in different tissues, and how do the expression and activity of those architects influence the tissue-dependency and overall heterogeneity of the MHC-Ii at the organism-wide scale? In this study, we applied a systems-level, cross-species approach to tackle these fundamental questions. Two immunopeptidomic mapping studies have very recently drafted the first tissue-based atlases of the mouse and human MHC-Ii (Schuster et al., 2018) (Marcu et al. accompanying manuscript). These pioneering mapping efforts provide qualitative and semi-quantitative information about the currently detectable repertoire of MHC-I peptides in most organs, both in mouse and human. Specifically, the mouse atlas was generated from 19 normal primary tissues extracted from C57BL/6 mice expressing H2Kb and H2Db (Schuster et al., 2018). The human atlas was generated from 29 human benign tissues extracted from 21 different subjects expressing a total of 51 different HLA-I allotypes (Marcu et al. accompanying manuscript; see NOTE below for the reviewers and editor) (Figure 1). Those HLA-I allotypes cover the most frequent HLA-A, -B and -C alleles in the world. Below, we first focused on the analysis of the MHC-Ii in different mouse and human tissues to provide a general understanding of the heterogeneity, tissue-dependency and conservation patterns of the MHC-Ii. Next, we connected tissue immunopeptidomes to RNA-seq and protein expression data found in various tissue-based atlases (Geiger et al., 2013; Söllner et al., 2017; Wang et al., 2019) to dissect how the mammalian MHC-Ii is being shaped in different tissues (Figure 1). 
[NOTE FOR THE REVIEWERS AND EDITOR: Please note that the entire bioinformatic analysis of the human MHC-Ii dataset in the current version of the manuscript was performed using an early/unreleased version of the HLA ligandomic dataset whereas the bioinformatic analysis in the original HLA Ligand Atlas manuscript (Marcu et al. accompanying manuscript; https://www.biorxiv.org/content/10.1101/778944v2) was performed using the latest version of the data. Since the two manuscripts were not created using the exact same version of the dataset, the reviewers may pinpoint technical (not conceptual) inconsistencies between the two manuscripts at this review stage. For instance, human thymuses were absent in the early/unreleased version and were incorporated in the latest version. In addition, in the latest version, sample names/data source were annotated differently, individual replicates (sometimes entire samples) were removed in the quality control step, inconsistent tissue names were merged and some tissues were excluded, including stomach-related tissues due to quality issues. To solve this technical inconsistency issue, we plan to re-perform our bioinformatic analysis using the latest dataset in preparation of a revised version of the manuscript. We anticipate that reprocessing all the data would reduce the noise and improve the quality of the figures but would not change any of the conceptual findings currently stated in the manuscript.] A key open question regarding the heterogeneity of the human MHC-Ii is whether individual HLA-I allotypes contribute evenly or unevenly to the composition of the MHC-Ii across different tissues. In fact, every subject presents up to two HLA-A, two HLA-B and two HLA-C allotypes. If all allotypes were evenly represented at the cell surface across tissues, one would expect similar proportions of peptides assigned to each allotype in all tissues. 
To address this question, we first assessed the global tissue distribution of all detectable peptides that were assigned to HLA-A, -B and -C. Among 33 sampled benign tissues extracted from a total of 13 different autopsy subjects, we found HLA-A, -B and -C immunopeptidomes to be unevenly represented across tissues (Figure 2A). We then investigated the contribution of each HLA-A, -B and -C allotype expressed in the three subjects for which the most tissues had been sampled (i.e. AUT-DN11, AUT-DN13 and AUT-DN12) (Figure 2B-D). Interestingly, we observed differential peptide distribution across tissues for many HLA-I allotypes. For instance, ~55% of peptides in the Colon of subject AUT-DN12 were assigned to A*02:01 compared to ~22% on average in all other tissues, resulting in an enrichment of about 2.5-fold for A*02:01 (Figure 2D). The enrichment of A*02:01 peptides in the Colon of subject AUT-DN12 was accompanied by an underrepresentation of A*11:01, B*15:01 and B*35:01 in the Colon, and an enrichment of the C*03:04 and C*04:01 alleles (Figure 2D). Similarly, we also noted that ~55% of peptides in the liver of subject AUT-DN13 were assigned to HLA-B*40:02 compared to ~20% on average in all other tissues, resulting in an enrichment of about 2.8-fold for this specific HLA-B subtype in this particular subject (Figure 2C). Furthermore, we noted that peptides associated with HLA-C were over-represented in the stomach (i.e. C*07:01, C*07:01 and C*04:01 in subject AUT-DN11, AUT-DN13 and AUT-DN12, respectively). To provide a global picture of the enrichment values that are associated with individual HLA-I allotypes, we calculated the average enrichment of all HLA-I allotypes across the investigated subjects and highlighted alleles that were enriched by more than 1.5-fold in at least one tissue (Figure 2E). This analysis highlighted 65 enrichment values distributed across 27 specific HLA-I subtypes and 29 different tissues (Figure 2E). 
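The fold-enrichment values quoted above can be illustrated with a minimal sketch. It assumes that enrichment is the fraction of a tissue's peptides assigned to one allotype divided by the mean of that fraction over all other tissues of the same subject, which is how the ~55% versus ~22% example reads; the exact computation used for Figure 2 is not restated here. The function names and the toy peptide assignments are hypothetical and were chosen only to reproduce the 2.5-fold arithmetic.

```python
from collections import Counter


def allotype_fractions(assignments):
    """Return {allotype: fraction of peptides} for one tissue."""
    counts = Counter(assignments)
    total = sum(counts.values())
    return {allotype: n / total for allotype, n in counts.items()}


def fold_enrichment(tissue_to_assignments, allotype, tissue):
    """Fraction of `allotype` in `tissue` divided by its mean fraction elsewhere.

    Assumption: this ratio is one plausible reading of the enrichment values
    in the text, not necessarily the authors' exact procedure.
    """
    fractions = {
        name: allotype_fractions(assigned).get(allotype, 0.0)
        for name, assigned in tissue_to_assignments.items()
    }
    others = [f for name, f in fractions.items() if name != tissue]
    background = sum(others) / len(others)
    if background == 0:
        raise ValueError("allotype not observed in any other tissue")
    return fractions[tissue] / background


# Hypothetical toy data: one allotype label per detected peptide.
toy_subject = {
    "Colon": ["A*02:01"] * 55 + ["B*15:01"] * 45,
    "Lung": ["A*02:01"] * 22 + ["B*15:01"] * 78,
    "Liver": ["A*02:01"] * 22 + ["B*15:01"] * 78,
}

colon_enrichment = fold_enrichment(toy_subject, "A*02:01", "Colon")
print(round(colon_enrichment, 2))
```

With these toy counts the Colon fraction is 0.55 against a background of 0.22, so the ratio is about 2.5, and it would pass the 1.5-fold cutoff used to highlight alleles.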
Overall, those enrichment values ranged from 1.5 to 6.1-fold, and seven (out of 11) HLA-A, 10 (out of 16) HLA-B and 10 (out of 10) HLA-C subtypes were assigned in at least one tissue with an enrichment value above 1.5-fold. It is possible that the quality of the sampled tissues introduced artifacts that affect these enrichment values; this would need to be further investigated. Nevertheless, these results strongly indicate that HLA-I allotypes do not contribute evenly to the composition of the MHC-Ii across different tissues and subjects, and therefore, considerably contribute to the overall heterogeneity of the human MHC-Ii. Antigen processing and presentation is a conserved and ubiquitous biological process. Here, we hypothesized that the MHC-Ii might conceal tissue-dependent immunopeptidomic signatures that are conserved across mammalian organisms. Therefore, we sought to identify immunopeptidomic patterns/signatures between mouse and human. First, we looked at the distribution of MHC-I peptide counts that were detected by MS across different mouse (Figure 3A) and human (Figure 3B) tissues. We noted that specific mouse organs yielded high numbers of MHC-I peptides (e.g. Spleen) whereas immune-privileged organs (e.g., Brain, Testis, Ovary) yielded low numbers of (Figure 3D and E). PCA was performed on highly heterogeneous immunopeptidomic data integrating peptides presented by two and 51 different MHC-I allotypes, respectively. Despite the high heterogeneity, our analysis revealed two main clusters in each species. Notably, immune-related organs clustered together in both species (see cluster 2 in Figure 3D and E). Immune clusters included Spleen, Bone Marrow, Lymph nodes and Thymus (mouse), as well as other types of non-immune related organs such as Kidney, Lung, Liver and Colon. This observation raised the following question: what are the MHC-I peptides that are either shared or unique across these tissues? 
To address the above question, we created connectivity matrices, which summarize the number of MHC-I peptides shared and uniquely observed between all possible pairs of tissues in mouse (Figure 3F) and human (Supplementary Figure 1). The number of uniquely observed/tissue-specific peptides can be found along the diagonal of the connectivity matrices in Figure 3F and Supplementary Figure 1. In mouse, we observed that 15% (910 out of the 6097 unique peptides found in mouse) of the total H2Db/Kb immunopeptidome was shared across Spleen, Bone Marrow, Kidney, Lung, Liver and Colon (Figure 3F). As an example, 1768 peptides (29% of the total H2Db/Kb immunopeptidome) were shared between Spleen and Kidney, and 1288 peptides (21% of the total H2Db/Kb immunopeptidome) were shared between Bone marrow and Liver (Figure 3A). In human, we observed that 4% of the total HLA-ABC immunopeptidome was shared across these six organs for all subjects. Once deconvolved by allotype or subject, we observed that, on average, 3% (range: 1% for HLA-C*07:04 to 10% for HLA-A*01:01) and 1.6% (range: 1% for AUT-DN13 to 2.6% for AUT-DN17) of HLA-I peptides were shared across these organs, respectively (Supplementary Figure 2). In contrast, larger fractions of MHC-I peptides were found to be uniquely observed/tissue-specific in each species. Overall, 36% (2193 out of 6097 unique peptides) and 42% (29375 out of 69919 unique peptides) of the total H2Db/Kb and HLA-ABC immunopeptidomes were uniquely observed in specific tissues, respectively. Thus, these data suggest that a significant proportion of MHC-I peptides might be tissue-specific whereas a relatively smaller proportion of peptides are shared across various immune and non-immune organs, both in mouse and human. This conclusion is drawn from the current data generated using the presently available state-of-the-art MS technology. 
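The layout of such a connectivity matrix can be sketched as follows. This is an illustration under stated assumptions rather than the code used for Figure 3F: each tissue is represented as a set of peptide identifiers, off-diagonal cells hold the number of peptides shared by a pair of tissues, and diagonal cells hold the number of peptides observed in that tissue only. The function name and the placeholder peptide identifiers are hypothetical.

```python
from itertools import combinations


def connectivity_matrix(tissue_peptides):
    """Build a nested dict: matrix[a][b] = shared peptides, matrix[a][a] = unique peptides."""
    tissues = sorted(tissue_peptides)
    matrix = {a: {b: 0 for b in tissues} for a in tissues}
    for a in tissues:
        # Peptides seen in any other tissue; what remains is unique to `a`.
        elsewhere = set().union(*(tissue_peptides[t] for t in tissues if t != a))
        matrix[a][a] = len(tissue_peptides[a] - elsewhere)
    for a, b in combinations(tissues, 2):
        shared = len(tissue_peptides[a] & tissue_peptides[b])
        matrix[a][b] = matrix[b][a] = shared
    return matrix


# Hypothetical toy data with placeholder peptide identifiers.
toy_tissues = {
    "Spleen": {"pep1", "pep2", "pep3"},
    "Kidney": {"pep2", "pep3", "pep4"},
    "Liver": {"pep3", "pep5"},
}

toy_matrix = connectivity_matrix(toy_tissues)
all_peptides = set().union(*toy_tissues.values())
shared_by_all = set.intersection(*toy_tissues.values())
shared_fraction = len(shared_by_all) / len(all_peptides)

for tissue, row in toy_matrix.items():
    print(tissue, row)
```

The same set operations give the percentages reported in the text: the fraction shared across a chosen group of organs is the size of their intersection divided by the number of unique peptides in the whole dataset.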
However, one has to consider that MS is not yet as sensitive as genomic technologies such as whole genome sequencing or RNA-seq methods. Therefore, it is possible that the tissue-specific peptides described above were also presented on a wider range of tissues, but were not detected due to their lower abundance and the detection limit of the mass spectrometer. Hence, the exact numbers and proportions mentioned above will possibly change in the future as MS technology evolves, but the concept of tissue-specific and shared peptides across tissues of an entire organism is unlikely to change. With this consideration in mind, we conclude that both the mouse and human MHC-Ii are composed of peptides that are tissue-specific as well as shared across different tissues. These two categories of MHC-I peptides may show distinct properties or trends, and were further investigated below. To investigate the properties of tissue-specific peptides versus those that are presented across a wide range of tissues, we sought to assess the influence of peptide intensity (MS1)/abundance and MHC binding affinity on tissue distribution. Hence, we plotted the frequency of peptide
Tissue-specific MHC-I peptides arise from genes that are almost uniquely expressed in the peptide-producing tissue.
Expression of tissue-specific source proteins contributes to shaping the tissue-specificity of the human MHC-Ii (Marcu et al. accompanying manuscript). Pioneering work in mice also proposed that transcriptomic signatures can be conveyed to the cell surface in the MHC-Ii (Fortier et al., 2008). However, how gene expression shapes the composition of the mouse MHC-Ii across many different tissues remains to be clarified. To address this, we first assigned every mouse MHC-I peptide found in the tissue draft atlas of the MHC-Ii to its source gene. 
Using an RNA-Seq atlas of gene expression in mouse normal tissues (Söllner et al., 2017), we next assessed the transcript abundance of the MHC-I peptide source genes in nine tissues for which mRNA expression data were available (i.e. Brain, Colon, Heart, Kidney, Liver, Pancreas, Small intestine, Stomach and Thymus) (Figure 4). By doing so, we found that genes coding for any detectable MHC-I peptides as well as for tissue-specific MHC-I peptides were more actively transcribed compared to genes that were not coding for any detectable MHC-I peptides (Figure 4A,B). Next, we reasoned that tissue-specific MHC-I peptides could derive from tissue-specific transcripts. To test this hypothesis, we averaged for every tissue the transcript abundance of genes coding for tissue-specific peptides and compared their expression across the nine tissues (Figure 4C). As depicted, we observed that brain-specific MHC-I peptides derived from genes that were uniquely expressed in the brain. Interestingly, liver-specific MHC-I peptides derived from genes that were predominantly, but not exclusively, expressed in the liver, an expression pattern that was observed for seven out of nine tissues (colon, kidney, liver, pancreas, small intestine, stomach and thymus; Figure 4C). These data suggest that tissue-specific MHC-I peptides are generally derived from genes that are highly expressed in the same tissue of origin. Together, these results are in accordance with conclusions drawn in human from Marcu et al. (accompanying manuscript) and reinforce the notion that gene expression plays a fundamental role in shaping the tissue specificity of the MHC-Ii in mammals. Above, we provided evidence that the MHC-Ii is composed of tissue-specific peptides as well as peptides that are widely presented across many different tissues. 
While tissue-specific MHC-I peptides appear to stem from genes predominantly expressed in the original tissue, we asked whether MHC-I peptides that were presented across most tissues derived from highly transcribed genes across the entire human or mouse genome. To answer this question, we created a selection of MHC-I peptides that were widely represented among the sampled tissues, referred to herein as 'housekeeping/universal MHC-I peptides' (Supplementary Figure 6A). While this selection is straightforward for the mouse data, where we considered peptides identified in at least 18 of the 19 tissues as housekeeping/universal peptides, a more complex approach was needed to select those peptides in the human dataset, where several subjects, each representing a specific set of HLA-I alleles, were present. Details about the selection of those peptides in the human immunopeptidome Figures 6 & 7) . Strikingly, we discovered that these genes were among the most transcriptionally expressed genes across the entire mouse (Figure 5A) and human (Figure 5B) genome. This result is in line with the above observation that widely presented peptides across the organism are of high intensity/abundance (Supplementary Figures 3 & 4). Moreover, it is noteworthy that those housekeeping/universal MHC-I peptides did not preferentially originate from large (heavy) proteins, as could have been expected due to the higher numbers of possible peptide antigen products from large proteins (Supplementary Figure 8). Genes expressed in the majority of tissues in an organism are widely known as housekeeping genes and are thought to play essential roles in cellular integrity, energy management and replication (Zeng et al., 2016). Due to their vital functions, several research groups have described an evolutionary hyperconservation of housekeeping genes across vertebrates and even yeast, suggesting that these vital genes build the foundation of life (She et al., 2009; Zhu et al., 2008). 
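The mouse selection rule stated above (a peptide counts as housekeeping/universal when it is identified in at least 18 of the 19 tissues) can be sketched in a few lines. The function name, tissue labels and peptide identifiers below are hypothetical placeholders, and the more complex human selection is not attempted here because it depends on per-subject HLA-I allele sets described elsewhere in the manuscript.

```python
from collections import Counter


def universal_peptides(tissue_peptides, min_tissues):
    """Return peptides detected in at least `min_tissues` tissues."""
    # Count each peptide once per tissue, even if it is listed more than once.
    tissue_counts = Counter(
        peptide
        for peptides in tissue_peptides.values()
        for peptide in set(peptides)
    )
    return {p for p, n in tissue_counts.items() if n >= min_tissues}


# Hypothetical toy atlas with 19 placeholder tissues.
tissue_names = [f"tissue_{i:02d}" for i in range(1, 20)]
toy_atlas = {name: {"pepA"} for name in tissue_names}
for name in tissue_names[:18]:
    toy_atlas[name].add("pepB")  # present in 18 of 19 tissues
for name in tissue_names[:5]:
    toy_atlas[name].add("pepC")  # present in 5 of 19 tissues

toy_universal = universal_peptides(toy_atlas, min_tissues=18)
print(sorted(toy_universal))
```

In this toy atlas only `pepA` and `pepB` meet the threshold; `pepC` is neither universal nor tissue-specific, which mirrors the fact that the two categories analysed in the text do not cover every peptide.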
Akin to housekeeping genes, peptides that are represented in most tissues across an entire organism (referred to above as housekeeping/universal MHC-I peptides) could hypothetically originate from proteins that are preferentially hyperconserved across evolution. In this regard, it is reasonable to speculate that hyperconserved source proteins may have co-evolved for millions of years with ancient and ubiquitous degradation systems to become the fundamental ground source of MHC-I peptides for most tissues. To address this concept, we took advantage of the genome alignments between mouse and 59 vertebrates as well as between human and 99 vertebrates, made available by the UCSC Genome Browser (Lee et al., 2020) (see Materials and Methods section 'Conservation of source genes from universal MHC-I peptides' for details). To assess evolutionary conservation across species, PhastCons scores (Siepel et al., 2005), which predict the probability of conservation for every base-pair in the aligned genomes, were consulted for mouse and human genes of interest (see Materials and Methods for details). When comparing the conservation scores of tissue-specific MHC-I peptide source genes with those from housekeeping/universal MHC-I peptide source genes, the latter were significantly more conserved at the Promoter and Exon level, both in Mouse (p-value = 1.2 × 10^-16; p-value = 2.0 × 10^-72) (Figure 5C) and Human (p-value = 3.4 × 10^-16; p-value = 3.3 × 10^-98) (Figure 5D). For example, the conservation probability (PhastCons score) of the more conserved half of the Exons (Cumulative Frequency > 0.5) of tissue-specific peptide source genes in mouse is greater than 45%, whereas the corresponding conservation probability for housekeeping/universal peptide source genes in mouse is greater than 80%, making the latter set of genes preferentially hyperconserved. 
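The cumulative-frequency statement above amounts to comparing the medians of two score distributions, which the following minimal sketch makes explicit. The exon scores are hypothetical values picked so that the two medians land on the 45% and 80% figures quoted in the text; they are not PhastCons data, and the statistical test behind the reported p-values is described in the Materials and Methods and is not reproduced here.

```python
from statistics import median


def fraction_above(scores, threshold):
    """Fraction of scores strictly greater than `threshold`."""
    return sum(score > threshold for score in scores) / len(scores)


# Hypothetical per-exon conservation probabilities (0 to 1) for two gene sets.
toy_tissue_specific = [0.10, 0.30, 0.45, 0.60, 0.90]
toy_universal_genes = [0.50, 0.70, 0.80, 0.95, 1.00]

# The score at cumulative frequency 0.5 is the median of each distribution.
median_specific = median(toy_tissue_specific)
median_universal = median(toy_universal_genes)

print(median_specific, median_universal)
print(fraction_above(toy_tissue_specific, 0.45), fraction_above(toy_universal_genes, 0.45))
```

Reading the curves this way, a rightward shift of the whole cumulative distribution for the housekeeping/universal set is what the text describes as preferential hyperconservation.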
Thus, this analysis indicates that tissue-specific versus housekeeping/universal MHC-I peptide source genes do not share the same degree of conservation across evolution. Together, our results suggest that highly expressed and hyperconserved genes are preferential sources of MHC-I peptides at the organism-wide scale.
Global correlative analysis between tissue proteomes and immunopeptidomes unveils a core antigen processing and presentation network.
We described above that the total number of MS-detectable MHC-I peptides was highly variable from one tissue to another, both in mouse and human (Figure 3 and Supplementary Figure 1). Differential expression and activity of antigen processing and presentation proteins in different tissues may contribute to this high variability (Rock et al., 2016). In fact, transcript levels of HLA-I, TAP1/2 and immunoproteasome were shown to correlate positively with the total number of MHC-I peptides detected across different human tissues (Marcu et al. accompanying manuscript). Notably, the same correlation pattern was observed for MHC-I proteins in mouse tissues (Schuster et al., 2018). To date, such correlative analysis has only been applied to a handful of proteins. A computational approach could be used to systematically identify any protein of the proteome whose abundance across tissues correlates with the total number of MHC-I peptides across those same tissues. Therefore, we set out to apply this correlative approach at the proteome-wide scale using protein abundances measured across different mouse and human tissues from two tissue-based proteomics atlases generated by MS (Figure 6A) (Geiger et al., 2013; Wang et al., 2019). First, we computed a total of 4,255 (on 4,255 proteins) and 80,024 (on 10,095 proteins) correlations in mouse and human, respectively (see Materials and Methods). 
Strikingly, we found a subset of 126 and 220 correlating proteins in mouse and human, respectively, whose abundance significantly correlated with the total MHC-I peptide count in a given tissue (p-value < 0.025 and R-squared > 0.5 in Mouse; p-value < 0.05 and R-squared > 0.4 for at least two subjects in Human) (Supplementary Figure 9A & B, Supplementary Tables 3 & 4). From the 126 mouse proteins, 106 correlated positively (84%) and 20 correlated negatively (16%) with MHC-I peptide counts. Out of the 220 significantly correlating human proteins, 153 correlated positively (70%) and 67 negatively (30%) (Supplementary Figure 10). To broadly assess biological processes and molecular functions in which these proteins are implicated, we performed gene ontology (GO) analysis on these significantly correlating proteins (Supplementary Table 5). From the top 100 most significantly enriched GO terms implicated in biological processes in mouse and human, 40 were shared across both species (Supplementary Figure 11A). Similarly, from the 59 (mouse) and 74 (human) enriched GO terms assigned to molecular functions, 29 were found in both organisms (Supplementary Figure 11B). Remarkably, the shared GO terms were attributed to proteins implicated in the regulation of the immune system, antigen generation and protein degradation (Figure 6B). Furthermore, manual curation of the literature allowed us to associate those proteins with specific functional modules known to orchestrate transcription (e.g., STAT2/3, NFKB), TCR-MHC signaling (e.g., CD4, LCP1, VAV1/3) and antigen processing (e.g., PSMB3, PSME1, ERAP1) (Figure 6C & D). Among the latter, many proteasome subunits, proteases, carboxy- and aminopeptidases, as well as vesicular traffic proteins were identified (Figure 7). 
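The proteome-wide screen described above can be sketched as a loop over proteins that correlates each abundance profile with the per-tissue peptide counts and keeps those above an R-squared cutoff. Several points are assumptions: Pearson correlation is used here for illustration (the actual procedure is given in the Materials and Methods), the p-value filter is omitted to keep the example free of external libraries, and all protein names, abundances and peptide counts are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def correlating_proteins(abundance, peptide_counts, min_r2):
    """Return {protein: (direction, R-squared)} for proteins with R-squared > min_r2."""
    tissues = sorted(peptide_counts)
    counts = [peptide_counts[t] for t in tissues]
    hits = {}
    for protein, per_tissue in abundance.items():
        r = pearson_r([per_tissue[t] for t in tissues], counts)
        if r * r > min_r2:
            hits[protein] = ("positive" if r > 0 else "negative", r * r)
    return hits


# Hypothetical toy data: peptide counts and protein abundances per tissue.
toy_counts = {"Spleen": 3000, "Liver": 2000, "Brain": 500, "Testis": 300}
toy_abundance = {
    "protein_up": {"Spleen": 9, "Liver": 7, "Brain": 2, "Testis": 1},
    "protein_down": {"Spleen": 1, "Liver": 2, "Brain": 8, "Testis": 9},
    "protein_flat": {"Spleen": 5, "Liver": 1, "Brain": 6, "Testis": 2},
}

toy_hits = correlating_proteins(toy_abundance, toy_counts, min_r2=0.5)
for protein, (direction, r2) in sorted(toy_hits.items()):
    print(protein, direction, round(r2, 2))
```

Recording the sign of the correlation alongside R-squared is what allows the positive and negative groups reported in the text to be separated; in a fuller version, a significance test and the requirement of at least two subjects in human would be added as further filters.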
For example, human PSME1 is part of the proteasome activator complex (PA28) and is required for efficient antigen processing (Sijts et al., 2001, 2002); PSMB3 is a component of the 20S core proteasome complex (Elenich et al., 1999; Huber et al., 2012); and ERAP1 plays a central role in peptide trimming for the generation and presentation of MHC-I peptides (Serwold et al., 2002). Table 5) . Thus, our systems-level analysis allowed us to identify many known key players as well as potentially new components of the antigen processing and presentation pathway. This led us to propose that we may have discovered a core antigen processing and presentation network composed of proteins involved in the generation, processing, presentation and recognition of MHC-I peptide antigens at the organism-wide scale. This study therefore opens up new avenues to further explore the architecture and dynamics of antigen processing and presentation in mammalian systems. The components of the antigen processing and presentation pathway shape how T cells respond to self and non-self (Rock et al., 2016). Those components have been traditionally discovered using hypothesis-driven approaches or genomic screening of cell lines presenting a phenotype of interest (Burr et al., 2019; Neefjes et al., 2011; Paul et al., 2011). MS-based immunopeptidomic approaches have also been used to validate the impact of those proteins on the global composition of the MHC-Ii using in vitro or ex vivo model systems (Alvarez-Navarro et al., 2015; Nagarajan et al., 2016; Verteuil et al., 2010). To date, no study has taken advantage of the uncharted combination of immunopeptidomic, proteomic, transcriptomic and genomic data from a range of different primary tissues to infer the fundamental principles that form the mammalian MHC-Ii. 
In fact, akin to systems immunology methods (Villani et al., 2018), we deployed in this study an unbiased immunopeptidomic data-driven strategy using multiple tissue-based omics datasets, both in mouse and human, to i) reinforce the notion that the composition of the mammalian MHC-Ii is highly context-dependent, ii) provide fundamental information about the tissue-dependency, conservation and biogenesis of the MHC-Ii at the organism-wide scale, and iii) uncover a remarkable number of proteins that may collectively orchestrate the content and tissue-specificity of the MHC-Ii. In this study, we found that many proteins of the ubiquitin-proteasome degradation system as well as many proteases, amino- and carboxypeptidases were much more abundant in organs presenting a large number of MHC-I-peptide complexes. In addition, proteins known to negatively regulate protein degradation were found to be more abundant in organs presenting low numbers of MHC-I peptides. In fact, correlations between protein abundances and numbers of MHC-I peptides detected in tissues were noted to be remarkably informative and could be used to systematically infer the role of new proteolytic enzymes in antigen processing. Proteolytic enzymes are critically important in antigen processing. Besides the proteasome, ~20 proteases act in the MHC-I presentation pathway and can alter presented peptides (Lázaro et al., 2015). ERAP1 is probably the most relevant example here since this aminopeptidase plays a major role in antigen processing through N-terminal peptide trimming in the endoplasmic reticulum and is associated with a number of different autoimmune diseases (Hanson et al., 2018; Serwold et al., 2002). Other aminopeptidases such as methionine aminopeptidase 2 (METAP2), leucine aminopeptidase 3 (LAP3), peptidase D (PEPD) and dipeptidyl peptidase 4 (DPP4) were showcased in this study, the latter also known to be involved in TCR-MHC signaling (Ghersi et al., 2006; Wagner et al., 2016). 
Interestingly, we also identified five carboxypeptidases (CPE, CNDP1, CPVL, CNDP2 and PRCP), none of which has been reported so far to influence the repertoire of MHC-I peptides. These carboxypeptidases might therefore represent new players of the antigen processing and presentation pathway. If tested and validated, such findings would be particularly interesting because angiotensin-converting enzyme (ACE) is the only ER-resident carboxypeptidase documented so far, and was shown to be immunologically relevant through production of minor histocompatibility antigens, polyoma virus epitopes and an HIV gp160 epitope (Neefjes et al., 2011; Shen et al., 2011). The use of chemical inhibitors and CRISPR technology together with high-throughput immunopeptidomic experiments would be of great value in this context to systematically investigate the role of those proteins in shaping the composition and heterogeneity of the MHC-Ii in different cell and tissue types. Two distinct categories of self-peptides were investigated in this study: those that are most likely tissue-specific and those that are widely presented across most tissues, i.e. the housekeeping/universal MHC-I peptides. Interestingly, these two categories of self-peptides have very distinct intrinsic features. The latter is composed of peptides that are highly abundant, are strong MHC-I binders, and derive from highly expressed genes that are preferentially hyperconserved across evolution. In contrast, tissue-specific peptides are relatively less abundant and are encoded by genes that are strongly expressed in the tissue of origin, but weakly or not expressed in most other tissues. Given the distinct properties of those self-peptides, their respective impact on various immunological processes could be dramatically different, in particular for triggering T cell tolerance.
In fact, tolerance mechanisms through recognition of self-peptides, both in the thymus and in the periphery, are critical to eliminate or control self-reactive T cells that would otherwise lead to autoimmunity (Granados et al., 2015; Verteuil et al., 2012; Xing and Hogquist, 2012). Failure of T cell tolerance against housekeeping/universal peptides would have devastating consequences, as self-reactive T cells would destroy most organs across the entire organism. Fortunately, we observed that genes coding for those peptides are among the most expressed across entire genomes, hence increasing the probability that those peptides will be abundantly presented in the thymus to trigger clonal deletion of immature self-reactive T cells recognizing those peptides. Moreover, we made the fundamental observation that housekeeping/universal peptides originate from hyperconserved genes. Therefore, the adaptive immune system may have evolved, over 500 million years, a remarkable mechanism enabling the elimination of those T cells in a highly efficient manner. In contrast, controlling self-reactivity of T cells recognizing tissue-specific peptides might be more challenging, thereby rationalizing the need for peripheral tolerance processes to avoid tissue-specific autoimmunity (Matsumoto et al., 2019). Another important observation in this study was that the multiple HLA-I allotypes expressed by a given individual may contribute unevenly to the composition of the MHC-Ii from one organ to another. For instance, HLA-B40:02-associated peptides were found to be particularly enriched in the liver of a given individual compared to all the other organs. Overall, 65 enrichment patterns were observed across 27 specific HLA-I subtypes and 29 different tissues. These are important basic observations because peptide antigens that are processed and presented in a tissue-dependent fashion may cause differential phenotypic consequences in response to the same signal.
For instance, in infectious diseases, Plasmodium parasites (malaria) and have the ability to reach and infect many host tissues (Coban et al., 2018; Wadman et al., 2020). In this context, CD8+ T cells may behave very differently from one tissue to another following tissue-dependent processing and presentation of pathogen-derived peptide antigens, thereby likely impacting the overall efficiency of viral clearance by T cells. Interestingly, tissue-dependent antigen presentation may also lead to a web of tissue-resident memory T cells that functionally adapt to their environment to stop viral spread across the organism (Kadoki et al., 2017). Hence, tissue-specific variations in the MHC-Ii likely play a role in controlling infections or determining the severity of a disease. One can anticipate that immunopeptidomics approaches will become increasingly powerful for investigating the dynamics of the MHC class I antigen processing and presentation pathway in vivo and for evaluating its impact on tissue-dependent T cell responses in the organism. Systems understanding of MHC-I antigen presentation at the organism level is at an early stage. In the future, we envision that further improvement in proteomics and immunopeptidomics technologies will enable robust, precise and comprehensive measurements of proteomes and immunopeptidomes in response to a wide range of immunological perturbations. Integration of those measurements over time, together with new high-throughput TCR-MHC peptide interaction studies (Dendrou et al., 2018; Moritz et al., 2019), will help understand how widespread and tissue-specific changes in peptide processing and presentation impact tissue-dependent T cell responses, and hence help understand inter-organ communications between T cell networks to shape the organismal circuitry of immunity (Chevrier, 2019; Kadoki et al., 2017).
From a synthetic biology perspective, in-depth understanding of how MHC-I-associated peptides are generated in vivo will enable accurate prediction of their dynamics, and ultimately, will accelerate the engineering of new biological systems to control their presentation and function in immunity. Raw data from the mouse immunopeptidome dataset (Schuster et al., 2018) were downloaded and re-analyzed using PEAKS 9 (Bioinformatics Solutions Inc., Waterloo, Ontario, Canada) (Tran et al., 2018). Peptides identified with an FDR < 1% were further assessed for binding to the MHC-I alleles H2Kb and H2Db using NetMHCpan-4.0 (Jurtz et al., 2017). Peptides with a length of 8, 9, 10, 11 or 12 amino acids and a NetMHCpan-4.0 Rank score smaller than 0.5 (Rank < 0.5) were selected as MHC-I peptides. A collection of all mouse MHC-I peptides is made available in Supplementary Table 1. All downstream data analysis is based on this set of MHC-I peptides. Mouse RNAseq data were obtained from the supplementary materials of Söllner et al. (2017) and can be found in Supplementary Table 1. Data were used for further analysis in the form provided. Mouse proteomic data were downloaded from the supplementary materials of Geiger et al. (2013) and can be found in Supplementary Table 3. Data were normalized across tissues based on median intensities and used for further analysis in the form provided. Human immunopeptidome data were obtained from an early/unreleased version of the data (see note above for the reviewers and editor). The resulting peptides that were assigned to their tissues, subjects and alleles can be found in Supplementary Table 2. Briefly, raw MS files were searched as described in Marcu et al. (accompanying manuscript). We used NetMHCpan-4.0 to annotate the best binding allele to each peptide. Peptide binding was predicted for the alleles present in the subject from which the peptide originated [based on allele genotyping as described in Marcu et al. (accompanying manuscript)].
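The peptide selection and best-allele annotation described in this section can be sketched as follows. This is a minimal illustration in Python, not the authors' workflow: the function name `select_mhc1_peptides`, the input layout (a dictionary of per-allele rank scores for each peptide) and all rank values are hypothetical. The length window of 8 to 12 residues and the Rank < 0.5 cutoff are the ones stated for the mouse data; the text gives a cutoff of Rank < 2 for the human data and does not state a length window there.

```python
# Illustrative sketch only; names, input layout and toy values are assumptions.

def select_mhc1_peptides(rank_by_peptide, max_rank, min_len=8, max_len=12):
    """Keep peptides within the length window whose best (lowest) rank is < max_rank.

    rank_by_peptide: {peptide_sequence: {allele: rank_score}} (hypothetical layout)
    Returns {peptide_sequence: (best_allele, best_rank)}.
    """
    selected = {}
    for peptide, ranks in rank_by_peptide.items():
        if not ranks or not (min_len <= len(peptide) <= max_len):
            continue
        # The allele with the lowest rank score is taken as the best binder.
        best_allele = min(ranks, key=ranks.get)
        best_rank = ranks[best_allele]
        if best_rank < max_rank:
            selected[peptide] = (best_allele, best_rank)
    return selected


# Toy rank values, invented for illustration.
mouse_ranks = {
    "SIINFEKL": {"H2-Kb": 0.01, "H2-Db": 4.2},    # 8-mer, passes
    "ASNENMETM": {"H2-Kb": 3.0, "H2-Db": 0.02},   # 9-mer, passes
    "AAAAAAAAA": {"H2-Kb": 12.0, "H2-Db": 7.5},   # 9-mer, rank too high
    "SIINFEK": {"H2-Kb": 0.3, "H2-Db": 0.4},      # 7-mer, too short
}
mouse_selected = select_mhc1_peptides(mouse_ranks, max_rank=0.5)
print(mouse_selected)
```

For a human subject, the same function could be called with that subject's genotyped alleles as dictionary keys and `max_rank=2`, adjusting or disabling the length window as appropriate.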
Of the six alleles genotyped for each subject, the allele with the lowest NetMHCpan-4.0 Rank score was assigned to a given peptide. Peptides with a NetMHCpan-4.0 rank smaller than 2 (Rank < 2) were considered MHC-I peptides. The quantitative information, as reported by MHCquant (Bichmann et al., 2019), was also used in the current manuscript. Raw peptide intensities were used as approximate quantitative information and no normalization was performed due to the heterogeneous nature of pulldowns and primary tissue samples. Note that the latest version of the qualitative data can be found at https://hla-ligand-atlas.org/. Human RNAseq data were obtained from the GTEx repository https://www.gtexportal.org/home/ (accessed January 10th, 2020); the dataset used was 'GTEx_Analysis_2017-06-05_v8_RNASeQCv1.1.9_gene_median_tpm.gct'. The subset of data used for this manuscript can be found in Supplementary Table 2. Data were used for further analysis in the form provided. publication can be found in Supplementary Table 4. Data were used for further analysis in the form provided unless stated differently. Principal component analysis and visualization were performed in R using the FactoMineR package (Lê et al., 2008). Input variables consist of all tissues for which immunopeptidomic data are available in mouse or human. For each tissue, a vector of individual peptide intensities (log10 transformed) was loaded. The first two dimensions accounting for most of the variability in the data were plotted (Figure 2 D & E). For every possible pair of tissues, the number of overlapping peptides was determined for the mouse and human immunopeptidomes, respectively. A peptide was considered overlapping if an intensity value had been reported in both tissues. A connectivity matrix was generated from the resulting data for mouse and human, respectively (Figure 2F and Supplementary Figure 1).
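The pairwise tissue overlap described above can be sketched as follows. This is an illustrative Python example under stated assumptions, not the authors' code: the function name `overlap_matrix`, the input layout (one set of peptides with a reported intensity per tissue) and the toy peptide labels are hypothetical. Off-diagonal cells count peptides reported in both tissues; diagonal cells hold the number of peptides unique to that tissue, following the convention given for the heatmaps in the figure legends.

```python
# Illustrative sketch only; names and toy data are assumptions.
from itertools import combinations


def overlap_matrix(peptides_by_tissue):
    """Build a symmetric tissue-by-tissue overlap matrix.

    peptides_by_tissue: {tissue: set of peptides with a reported intensity}
    Returns (sorted tissue names, nested dict matrix[tissue_a][tissue_b]).
    """
    tissues = sorted(peptides_by_tissue)
    matrix = {a: {b: 0 for b in tissues} for a in tissues}
    for a in tissues:
        # Diagonal: peptides seen in this tissue and in no other tissue.
        others = set().union(*(peptides_by_tissue[t] for t in tissues if t != a))
        matrix[a][a] = len(set(peptides_by_tissue[a]) - others)
    for a, b in combinations(tissues, 2):
        # Off-diagonal: peptides with an intensity value in both tissues.
        shared = len(set(peptides_by_tissue[a]) & set(peptides_by_tissue[b]))
        matrix[a][b] = matrix[b][a] = shared
    return tissues, matrix


# Toy peptide labels, invented for illustration.
toy = {
    "liver": {"P1", "P2", "P3"},
    "spleen": {"P2", "P3", "P4"},
    "lung": {"P3", "P5"},
}
tissues, matrix = overlap_matrix(toy)
for a in tissues:
    print(a, [matrix[a][b] for b in tissues])
```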
Of note, the number of peptides unique to a given tissue is shown along the diagonal of the heatmaps. The proportion of peptides represented by a specific allele in a given tissue was calculated for every subject. Similarly, the mean proportion of every allele across tissues was calculated for every subject. These values were then used to calculate the over- or under-representation of each allele in a tissue compared to the mean as follows: Subject dependent allele enrichment in tissues: Examples of subject-specific allele representations can be found in Figure 2 B-D. In order to assess trends across all subjects, we calculated the mean of these over- and under-representation values for all alleles across all subjects. To find trends among the data, we focused only on alleles over-represented by, on average, at least 1.5-fold in a given tissue across all subjects. Results are depicted in Figure 2 E. Note that every over-representation is coupled to an under-representation, allowing for straightforward visualization of results when only depicting over-representations. Source genes of mouse MHC-I peptides available from the PEAKS results were mapped to ENSEMBL identifiers using the mouse annotation package org.Mm.eg.db in R (DOI: 10.18129/B9.bioc.org.Mm.eg.db). These source genes were then mapped to the genes in the RNAseq dataset (Söllner et al., 2017) to assess their tissue-dependent RNAseq expression (Supplementary Table 1). All mappings between different gene identifiers were performed using the R package AnnotationHub (DOI: 10.18129/B9.bioc.AnnotationHub). Genes mapped to a peptide that is present in only one of the nineteen tissues analyzed in the mouse immunopeptidome are considered to be source genes of tissue-specific MHC-I peptides. We have not assessed to what extent additional MHC-I peptides from such a gene are represented across tissues (for genes where more than one immunopeptide was identified).
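The formula for the subject-dependent allele enrichment is not preserved in this text. The sketch below shows one plausible reading of the description, in Python and under loudly stated assumptions: enrichment is taken to be the ratio of an allele's peptide proportion in a tissue to that allele's mean proportion across the subject's tissues, which is consistent with the "1.5-fold" threshold mentioned above but may differ from the authors' exact definition. The function name `allele_enrichment`, the input layout and all counts are hypothetical.

```python
# Illustrative sketch only. ASSUMPTION: enrichment = proportion in tissue
# divided by the mean proportion across tissues (per subject). The original
# formula is missing from the text, so this ratio is a guess at its form.

def allele_enrichment(counts_by_tissue):
    """counts_by_tissue: {tissue: {allele: number of assigned peptides}} for one subject.

    Returns {tissue: {allele: fold over- or under-representation}}.
    """
    alleles = sorted({a for counts in counts_by_tissue.values() for a in counts})
    proportions = {}
    for tissue, counts in counts_by_tissue.items():
        total = sum(counts.values())
        if total == 0:
            continue  # proportions are undefined for a tissue without peptides
        proportions[tissue] = {a: counts.get(a, 0) / total for a in alleles}
    if not proportions:
        return {}
    n = len(proportions)
    mean_prop = {a: sum(p[a] for p in proportions.values()) / n for a in alleles}
    return {
        tissue: {a: p[a] / mean_prop[a] for a in alleles if mean_prop[a] > 0}
        for tissue, p in proportions.items()
    }


# Toy counts for a single hypothetical subject, invented for illustration.
subject_counts = {
    "liver": {"HLA-A*02:01": 10, "HLA-B*40:02": 90},
    "lung": {"HLA-A*02:01": 50, "HLA-B*40:02": 50},
}
enrichment = allele_enrichment(subject_counts)
# Allele/tissue pairs over-represented at least 1.5-fold in this subject.
over_represented = sorted(
    (tissue, allele)
    for tissue, per_allele in enrichment.items()
    for allele, fold in per_allele.items()
    if fold >= 1.5
)
print(over_represented)
```

The cross-subject step described in the text (averaging these values over all subjects before applying the 1.5-fold cutoff) is not shown here.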
We found 1620 source genes from tissue-specific MHC-I peptides in mouse. Universal MHC-I peptides are defined as MHC-I peptides present in at least 18 of the 19 mouse tissue samples (Supplementary Figure 6A). Ninety such peptides from 85 genes were found in the mouse dataset (Supplementary Table 1). We calculated the conservation of the exon and promoter regions of the corresponding source genes and compared their genetic conservation to those from source genes of tissue-specific MHC-I peptides. Conservation scores were extracted in the form of PhastCons conservation probabilities (Siepel and Haussler, 2004; Siepel et al., 2005). PhastCons values for the exon and promoter regions of source genes from universal MHC-I peptides and of source genes from tissue-specific MHC-I peptides were calculated and compared using the Wilcoxon rank sum test. This analysis and workflow were inspired by Zeng et al. (2016) and Zhu et al. (2008), who investigated the genomic conservation of housekeeping genes compared to tissue-specific genes in mouse and human, respectively. Furthermore, ideas for the interpretation of PhastCons conservation rates were derived from Sun et al. (2014). Molecular weights of proteins were retrieved from www.uniprot.org (Complete Mus musculus proteome, reviewed + un-reviewed proteins, accessed June 17 2020). Uniprot identifiers were matched to ENSEMBL gene identifiers and used for analysis. Source genes of human MHC-I peptides were mapped to ENSEMBL identifiers using the human annotation package org.Hs.eg.db in R (org.Hs.eg.db: Genome wide annotation for Human. R package version 3.8.2).
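The mouse definitions of tissue-specific and universal MHC-I peptides given above can be sketched as follows. This is an illustrative Python example, not the authors' code: the function name `classify_peptides`, the input layout and the toy peptide and gene labels are hypothetical. The cutoffs (exactly one tissue for tissue-specific, at least 18 of 19 tissues for universal) are those stated in the text; the subsequent PhastCons comparison is not shown.

```python
# Illustrative sketch only; names and toy data are assumptions.

def classify_peptides(tissues_by_peptide, universal_min=18):
    """Split peptides into tissue-specific (seen in one tissue) and universal.

    tissues_by_peptide: {peptide: iterable of tissues in which it was detected}
    Returns (tissue_specific_peptides, universal_peptides) as sets.
    """
    tissue_specific, universal = set(), set()
    for peptide, tissues in tissues_by_peptide.items():
        n_tissues = len(set(tissues))
        if n_tissues == 1:
            tissue_specific.add(peptide)
        elif n_tissues >= universal_min:
            universal.add(peptide)
    return tissue_specific, universal


# Toy data: 19 placeholder tissue labels and four placeholder peptides.
all_tissues = [f"tissue_{i}" for i in range(1, 20)]
tissues_by_peptide = {
    "PEP_ALL": all_tissues,            # 19 of 19 tissues
    "PEP_MOST": all_tissues[:18],      # 18 of 19 tissues
    "PEP_SOME": all_tissues[:5],       # neither category
    "PEP_ONE": ["tissue_3"],           # tissue-specific
}
gene_by_peptide = {"PEP_ALL": "GeneA", "PEP_MOST": "GeneA",
                   "PEP_SOME": "GeneB", "PEP_ONE": "GeneC"}

tissue_specific, universal = classify_peptides(tissues_by_peptide)
# Several peptides can map to one gene, so genes are collected in sets.
universal_genes = {gene_by_peptide[p] for p in universal}
tissue_specific_genes = {gene_by_peptide[p] for p in tissue_specific}
print(sorted(universal), sorted(universal_genes), sorted(tissue_specific_genes))
```

As in the text, a gene counted as a source of tissue-specific peptides is not checked for additional peptides detected in other tissues.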
These source genes were then mapped to the genes in the RNAseq dataset ('GTEx_Analysis_2017-06-05_v8_RNASeQCv1.1.9_gene_median_tpm.gct') to assess their tissue-dependent RNAseq expression (Supplementary Source genes representing one or more MHC-I peptides that were measured in only one tissue sample in the human immunopeptidome dataset were considered source genes from tissue-specific MHC-I peptides. In human, we found 8846 such genes. Similar to the mouse analysis, we did not assess to what extent these genes yield additional peptides present in more than one tissue sample. Defining source genes from universal MHC-I peptides in human is less straightforward than in mouse due to the heterogeneity of the subjects from which tissues were sampled and of HLA allele representation. Hence, we defined a source gene from universal MHC-I peptides in the available human immunopeptidome as a gene for which one or more MHC-I peptides were either 1) present across all tissues in at least two patients or 2) present across all samples in which the assigned

Annotating the molecular weight of immunopeptide source genes (Human)

Molecular weights of proteins were retrieved from www.uniprot.org (Complete Homo sapiens proteome, reviewed + un-reviewed proteins, accessed June 17 2020). Uniprot identifiers were matched to ENSEMBL gene identifiers and used for analysis. Protein expression data from mouse and human proteomic tissue drafts (Supplementary Tables 3 & 4) were obtained from (Geiger et al., 2013)

Gene set enrichment analysis (GSEA; http://www.broad.mit.edu/gsea/) was performed using GSEA software and the Molecular Signature Database (MSigDB) on proteins from the systematic cross-tissue analysis of MHC class I peptides and protein expression. The top 100 gene sets from the Biological Process and Molecular Function overlap analyses were considered significant at P value and FDR < 0.05.
We acknowledge our use of the GSEA, GSEA software, and MSigDB (Subramanian et al., 2005). Results can be found in Supplementary Table 5.

PK and IS analyzed the data. AM, LB, LK and HS generated and analyzed the data. EC and PK wrote the manuscript and all the co-authors provided critical comments.

Heiko Schuster is an employee of Immatics Biotechnologies GmbH. Stefan Stevanović is an inventor of patents owned by Immatics Biotechnologies GmbH. Hans-Georg Rammensee is a shareholder of Immatics Biotechnologies GmbH and Curevac AG.

Figure 11) were manually curated from the literature and were classified based on their respective biological function: proteasome, aminopeptidase, carboxypeptidase, protease, ubiquitin protein, GTPase-activating protein or regulator, guanine nucleotide-exchange factor (GEF), actin binding protein, transcriptional regulator, binding protein, cell adhesion protein, tyrosine or serine-threonine protein kinase (Kinase) and enzyme. The roles of specific proteases, amino- and carboxypeptidases in processing peptide antigens remain unexplored and are depicted in grey.

Supplementary Figure 1: Connectivity map of the human immunopeptidome. This human heat map integrates immunopeptidomic data from 38 HLAI allotypes and 13 different subjects. The human heat map is not deconvoluted by HLA allotype or subject and therefore provides a bird's eye view of the human class I immunopeptidome. Note: The number of uniquely observed/tissue-specific peptides can be found along the diagonal.

Supplementary Figure 9: Large scale correlation of protein intensities with the total count of MHCI peptides per tissue in human and mouse datasets. (A) R-squared of linear fits plotted against the corresponding p-values for the human data. Proteins whose fits show R-squared values > 0.4 (p-value < 0.05) in at least two subjects are considered significant. (B) R-squared of linear fits plotted against the corresponding p-values for the mouse data.
Proteins whose fits show R-squared values > 0.5 (p-value < 0.025) are considered significant.

Mass Spectrometry Profiling of HLA-Associated Peptidomes in Mono-allelic Cells Enables More Accurate Epitope Prediction
ERAP1 polymorphism relevant to inflammatory disease shapes the peptidome of the birdshot chorioretinopathy-associated HLA-A*29:02 antigen
Mass Spectrometry of Human Leukocyte Antigen Class I Peptidomes Reveals Strong Effects of Protein Abundance and Turnover on Antigen Presentation
MHCquant: Automated and reproducible data analysis for immunopeptidomics
Structure of the human MHC-I peptide-loading complex
The nature and extent of contributions by defective ribosome products to the HLA peptidome
An Evolutionarily Conserved Function of Polycomb Silences the MHC Class I Antigen Presentation Pathway and Enables Immune Evasion in Cancer
Analysis of Major Histocompatibility Complex (MHC) Immunopeptidomes Using Mass Spectrometry
A Case for a Human Immuno-Peptidome Project Consortium
Decoding the Body Language of Immunity: Tackling the Immune System at the Organism Level
Tissue-specific immunopathology during malaria infection
HLA variation and disease
The complete primary structure of mouse 20S proteasomes
Allele-specific motifs revealed by sequencing of self-peptides eluted from MHC molecules
The MHC class I peptide repertoire is molded by the transcriptome
Initial Quantitative Proteomic Map of 28 Mouse Tissues Using the SILAC Mouse
Predicting Antigen Presentation-What Could We Learn From a Million Peptides? Front Immunol 9
The Protease Complex Consisting of Dipeptidyl Peptidase IV and Seprase Plays a Role in the Migration and Invasion of Human Endothelial Cells in Collagenous Matrices
MHC I-associated peptides preferentially derive from transcripts bearing miRNA response elements
The nature of self for T cells-a systems-level perspective
The genetics, structure and function of the M1 aminopeptidase oxytocinase subfamily and their therapeutic potential in immune-mediated disease
Constitutive Proteasome Crystal Structures Reveal Differences in Substrate and Inhibitor Specificity
NetMHCpan-4.0: Improved Peptide-MHC Class I Interaction Predictions Integrating Eluted Ligand and Peptide Binding Affinity Data
Organism-Level Analysis of Vaccination Reveals Networks of Protection across Tissues
Mice completely lacking immunoproteasomes show major changes in antigen presentation
Noncoding regions are the main source of targetable tumor-specific antigens
rtracklayer: an R package for interfacing with genome browsers
Proteolytic enzymes involved in MHC class I antigen processing: A guerrilla army that partners with the proteasome
FactoMineR: An R Package for Multivariate Analysis
UCSC Genome Browser enters 20th year
IPD-MHC 2.0: an improved inter-species database for the study of the major histocompatibility complex
Tissue-specific autoimmunity controlled by Aire in thymic and peripheral tolerance mechanism
The effect of proteasome inhibition on the generation of the human leukocyte antigen (HLA) peptidome
High-throughput peptide-MHC complex generation and kinetic screenings of TCRs with peptide-receptive HLA-A*02:01 molecules
"Hotspots" of Antigen Presentation Revealed by Human Leukocyte Antigen Ligandomics for Neoantigen Prioritization
ERAAP Shapes the Peptidome Associated with Classical and Nonclassical MHC Class I Molecules
Towards a systems understanding of MHC class I and MHC class II antigen presentation
A Genome-wide Multidimensional RNAi Screen Reveals Pathways Controlling MHC Class II Antigen Presentation
MHC class I-associated peptides derive from selective regions of the human genome
A Major Role for TPPII in Trimming Proteasomal Degradation Products for MHC Class I Antigen Presentation
The major substrates for TAP in vivo are derived from newly synthesized proteins
Distinguishing functional polymorphism from random variation in the sequences of >10,000 HLA-A, -B and -C alleles
Present Yourself! By MHC Class I and MHC Class II Molecules
A tissue-based draft map of the murine MHC class I immunopeptidome
ERAAP customizes peptides for MHC class I molecules in the endoplasmic reticulum
Definition, conservation and epigenetics of housekeeping and tissue-enriched genes
The carboxypeptidase ACE shapes the MHC class I peptide repertoire
Combining Phylogenetic and Hidden Markov Models in Biosequence Analysis
Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes
The Role of the Ubiquitin-proteasome Pathway in MHC Class I Antigen Processing: Implications for Vaccine Design
The role of the proteasome activator PA28 in MHC class I antigen processing
An RNA-Seq atlas of gene expression in mouse and rat normal tissues
Gene set enrichment analysis: A knowledge-based approach for interpreting genome-wide expression profiles
Sebnif: An Integrated Bioinformatics Pipeline for the Identification of Novel Large Intergenic Noncoding RNAs (lincRNAs) - Application in Human Skeletal Muscle Cells
Leucine Aminopeptidase Is Not Essential for Trimming Peptides in the Cytosol or Generating Epitopes for MHC Class I Antigen Presentation
Deep learning enables de novo peptide sequencing from data-independent-acquisition mass spectrometry
Role for ribosome-associated quality control in sampling proteins for MHC class I-mediated antigen presentation
Sizing up the key determinants of the CD8(+) T cell response
Deletion of immunoproteasome subunits imprints on the transcriptome and has a broad impact on peptides presented by major histocompatibility complex I molecules
Origin and plasticity of MHC I-associated self peptides
Systems Immunology: Learning the Rules of the Immune System
The Human Immunopeptidome Project: a roadmap to predict and treat immune diseases
A rampage through the body
Unravelling the immunological roles of dipeptidyl peptidase 4 (DPP4) activity and/or structure homologue (DASH) proteins
A deep proteome and transcriptome abundance atlas of 29 healthy human tissues
T-Cell Tolerance: Central and Peripheral
Making sense of mass destruction: quantitating MHC class I antigen presentation
The Cytosolic Endopeptidase, Thimet Oligopeptidase, Destroys Antigenic Peptides and Limits the Extent of MHC Class I Antigen Presentation
Identification and analysis of house-keeping and tissue-specific genes based on RNA-seq data sets across 15 mouse tissues
On the nature of human housekeeping genes

Comparison of tissue dependent MHCI (Mouse) and HLAI (Human) peptide antigens. (A) MHCI peptide counts for each sampled mouse tissue, colors depict the MHCI alleles, respectively. (B) HLAI peptide counts for all sampled human tissues. Boxplots are represented as several tissues were sampled across different individuals Comparison of MHCI peptide counts/tissue (Mouse) and HLAI peptide counts/tissue (Human). (D) Principal component analysis of the measured intensities (log10) of MHCI peptides (Mouse). (E) Principal component analysis of the measured intensities (log10) of HLAI peptides (Human). (F) Tissue connectivity map of the C57BL/6 MHCI Ligand atlas. Heatmap depicts the number of shared MHCI peptides across tissues (Mouse)

Supplementary Figure 2: Proportion of peptides shared across Colon, Spleen, Liver, Lung, Bone marrow and Kidney. (A) Deconvoluted by best allele for which all 6 tissues were sampled.
(B) Deconvoluted by subjects for which all six tissues were sampled

Supplementary Figure 5: NetMHCpan-4.0 rank (best allele) plotted against the number of measurements across tissues in the human dataset

Histogram showing the number of MHCI peptides presented in one or more tissues. 1 measurement corresponds to MHCI peptides identified in only one tissue, whereas 19 measurements corresponds to MHCI peptides identified in all 19 tissues. Mouse source genes coding for MHCI peptides that were measured across more than 17 tissues were considered for further analysis. (B) Source genes of the top 100 peptides from the human dataset measured the most frequently across tissues were considered 'source genes of universal peptides'. (C) In addition to B, source genes that present peptides across all tissues in at least two patients are also considered as 'source genes of universal peptides'. (D) In addition to B and C, source genes that present peptides across all tissues for at least one allele are considered 'source genes of universal peptides'

Supplementary Figure 8: Molecular weight of universal-peptide and tissue-specific-peptide source proteins. (A) Mouse and (B) Human

Supplementary Figure 10: Negative correlations between total immunopeptide counts and protein intensities. (A) Proportion of direction of slopes of significant proteins in human and mouse. (B-D) Example fits of proteins in human and mouse with negative correlation

Supplementary Figure 11: Overlap of enriched gene ontology (GO) terms between Mouse and Human for genes significantly correlating with total MHCI/HLAI counts. (A) GO Biological Process (GO-BP). (B) GO Molecular Function (GO-MF)

Example fit curves of prominent proteins. (A) Example fit of the protein CD4 in human (only curves of subjects with statistically significantly fit curves are shown). (B) Example fit of the protein Erap1 in mouse
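The linear fits between protein intensities and per-tissue MHC-I peptide counts referred to in the legends of Supplementary Figures 9 and 10 can be sketched as follows. This is an illustrative ordinary least-squares example in plain Python, not the authors' code: the function name `linear_fit`, the choice of peptide counts as the x variable and all numeric values are hypothetical. Only the slope and R-squared are computed; the p-value part of the stated significance criteria is omitted because it would require a t-distribution that the standard library does not provide.

```python
# Illustrative sketch only; names and toy values are assumptions.

def linear_fit(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, r_squared)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary across tissues")
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, r_squared


# Toy values: total peptide counts in four tissues and two protein profiles.
peptide_counts = [100, 200, 300, 400]
protein_up = [2 * c + 1 for c in peptide_counts]     # rises with peptide counts
protein_down = [1000 - c for c in peptide_counts]    # falls with peptide counts

slope_up, r2_up = linear_fit(peptide_counts, protein_up)
slope_down, r2_down = linear_fit(peptide_counts, protein_down)
print(slope_up, r2_up, slope_down, r2_down)
```

The sign of the slope separates positively from negatively correlated proteins, and the R-squared value would then be compared with the cutoffs given in the legends (above 0.4 for human, above 0.5 for mouse) together with the corresponding p-value.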