key: cord-0859510-qcmdzkfw authors: Taylor, Hannah B.; Klaeger, Susan; Clauser, Karl R.; Sarkizova, Siranush; Weingarten-Gabbay, Shira; Graham, Daniel B.; Carr, Steven A.; Abelin, Jennifer G. title: MS-Based HLA-II Peptidomics Combined With Multiomics Will Aid the Development of Future Immunotherapies date: 2021-06-17 journal: Mol Cell Proteomics DOI: 10.1016/j.mcpro.2021.100116 sha: 16f7f683dc7aa2e7d11c9ef4eb30abcc30ee1f69 doc_id: 859510 cord_uid: qcmdzkfw Immunotherapies have emerged to treat diseases by selectively modulating a patient’s immune response. Although the roles of T and B cells in adaptive immunity have been well studied, it remains difficult to select targets for immunotherapeutic strategies. Because human leukocyte antigen class II (HLA-II) peptides activate CD4+ T cells and regulate B cell activation, proliferation, and differentiation, these peptide antigens represent a class of potential immunotherapy targets and biomarkers. To better understand the molecular basis of how HLA-II antigen presentation is involved in disease progression and treatment, systematic HLA-II peptidomics combined with multiomic analyses of diverse cell types in healthy and diseased states is required. For this reason, MS-based innovations that facilitate investigations into the interplay between disease pathologies and the presentation of HLA-II peptides to CD4+ T cells will aid in the development of patient-focused immunotherapies. Although challenges remain in leveraging MS-based HLA-II peptidomics, investigations into the interplay between disease pathologies and the presentation of HLA-II peptides to CD4+ T cells will enable the development of future immunotherapies. In this Review article, we discuss our current understanding of HLA-II peptidomics and outstanding questions in the field and how MS-based innovations will enable us to fill knowledge gaps and help improve our ability to select HLA-II-presented antigens as targets for personalized immunotherapies. 
The recognition of peptides presented on human leukocyte antigen class II (HLA-II) heterodimers by antigen-presenting cells (APCs) to CD4+ T cells is a key step in both T cell and B cell response pathways. HLA-II antigen presentation results in the production of T cell mediators, chemokines, and cytokines with proinflammatory, immunoprotective, and chemotactic properties that activate multiple immune cell types (Fig. 1A). Mounting evidence from multiple preclinical studies demonstrates that CD4+ T cell-directed therapies can control tumor growth (1-7) and have the potential to aid in the treatment of autoimmune diseases (8-11) and infectious diseases (12). Characterization of HLA-II peptidomes can rapidly improve our ability to understand the rules of HLA-II antigen processing and presentation in the context of different diseases and across multiple cell types.
An HLA-II peptidomics experiment involves capture and isolation of the HLA-II-presented peptides by immunoprecipitation or affinity tag purification followed by peptide sequencing and quantitation via LC-MS/MS. Although HLA-II peptidome studies combined with genomic, transcriptomic, and ribosomal profiling analyses have resulted in improvements to HLA-II epitope prediction approaches (13) (14) (15) (16) (17) (18) (19) , significant challenges remain in fully understanding the rules of HLA-II epitope presentation and selection of HLA-II antigens as immunotherapeutic targets. In this review, we will present and discuss evidence supporting the development of immunotherapies targeting HLA-II-presented antigens to either elicit or dampen CD4+ T cell responses. We will also put into context our current understanding of HLA-II antigen processing and presentation pathways within APCs. These topics will lead us to consider the key successes of HLA-II epitope prediction and outstanding questions related to antigen processing and presentation that MS-based HLA-II peptidomic technologies are uniquely able to help address. Finally, we will highlight a subset of knowledge gaps and challenges that we believe HLA-peptidomics technologies, in combination with multiomics studies, could overcome to better select targets for personalized immunotherapies across multiple diseases. The highly polymorphic nature of HLA-II alleles and the complexity of heterodimer pairing is an impediment for HLA-II peptidome prediction and profiling efforts (20, 21) . Each individual can theoretically express between 3 and 12 unique HLA-II heterodimers (HLA-DRB1/3/4/5, HLA-DRA1, HLA-DQB1, HLA-DQA1, HLA-DPB1, HLA-DPA1) (Fig. 1B) . HLA-DR alleles are often the most highly expressed, whereas HLA-DP and HLA-DQ are expressed at lower levels (22, 23) . 
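The range of 3 to 12 heterodimers per individual can be reproduced with a short enumeration. The Python sketch below is illustrative only. It assumes that every DRB allele (DRB1 and any linked DRB3/4/5) pairs with a single monomorphic DRA chain and that any DQ or DP alpha chain can pair with any beta chain of the same locus. The allele labels and the `enumerate_heterodimers` helper are hypothetical placeholders, not typing results.

```python
from itertools import product


def enumerate_heterodimers(genotype):
    """Return the set of distinct (alpha, beta) pairs for one individual.

    Assumed pairing rules (a simplification for illustration):
      - every DRB allele pairs with the monomorphic DRA chain;
      - every DQA1 allele can pair with every DQB1 allele, and
        every DPA1 allele can pair with every DPB1 allele.
    """
    dimers = set()
    for beta in genotype["DRB"]:
        dimers.add(("DRA", beta))
    for locus in ("DQ", "DP"):
        alphas = set(genotype[locus + "A1"])
        betas = set(genotype[locus + "B1"])
        dimers.update(product(alphas, betas))
    return dimers


# Placeholder allele labels (hypothetical, for counting only).
homozygous = {
    "DRB": ["DRB1*a", "DRB1*a"],  # no linked DRB3/4/5 allele
    "DQA1": ["DQA1*a", "DQA1*a"],
    "DQB1": ["DQB1*a", "DQB1*a"],
    "DPA1": ["DPA1*a", "DPA1*a"],
    "DPB1": ["DPB1*a", "DPB1*a"],
}
heterozygous = {
    "DRB": ["DRB1*a", "DRB1*b", "DRB3*a", "DRB4*a"],
    "DQA1": ["DQA1*a", "DQA1*b"],
    "DQB1": ["DQB1*a", "DQB1*b"],
    "DPA1": ["DPA1*a", "DPA1*b"],
    "DPB1": ["DPB1*a", "DPB1*b"],
}

print(len(enumerate_heterodimers(homozygous)))    # 3
print(len(enumerate_heterodimers(heterozygous)))  # 12
```

Under these assumptions, a fully homozygous individual without a linked DRB3/4/5 allele yields one heterodimer per locus, whereas a fully heterozygous individual yields four per locus.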
All HLA-II heterodimers consist of an alpha and a beta chain, but HLA-DR heterodimers differ from HLA-DP and HLA-DQ in their alpha and beta chain pairing characteristics. A single HLA-DP or HLA-DQ beta chain can bind with multiple polymorphic alpha chains, whereas each HLA-DR beta chain binds with a monomorphic HLA-DRA chain (16). In addition, linkage disequilibrium leads to the inheritance of DRB3, DRB4, and DRB5 alleles that form heterodimers with the same DRA chain as DRB1 alleles. Interestingly, despite the high prevalence of DRB3/4/5 alleles, many studies do not report them in their datasets because their samples are not fully HLA typed (2) (http://www.allelefrequencies.net/). This omission often leads to incorrect assignment of HLA-II peptides to specific HLA-DR heterodimers in HLA-II peptidome datasets. The variable heterodimer pairing of HLA-DQ and HLA-DP alpha and beta chains combined with the presence of DRB3/4/5 alleles creates over 7000 HLA-II alleles (http://www.allelefrequencies.net/), with few seen in more than half of the population (Fig. 1C, supplemental Fig. S1). Fortunately, HLA-II allele population frequencies suggest that studying a subset of around 100 HLA-II heterodimers will be sufficient to cover at least one HLA-II heterodimer in greater than 90% of the global population (24).
Fig. 1. A, APCs, such as dendritic cells (DCs), B cells, and macrophages, express HLA-II heterodimers that present peptide antigens to CD4+ T cells. Upon T cell receptor (TCR)-mediated antigen recognition, CD4+ T cells release cytokines and chemokines that activate both B cells and CD8+ T cells. These cytokines can induce B cell class switching to plasma cells to promote antibody production. Simultaneously, this cytokine and chemokine release can cause upregulation of HLA-I expression on APCs such as DCs. B, example of HLA-II heterodimer pairing between alpha and beta chains expressed from HLA-DR, HLA-DP, and HLA-DQ alleles.
HLA-DPA and HLA-DQA chains can pair with multiple HLA-DPB and HLA-DQB chains, respectively. Conversely, a monomorphic HLA-DRA chain binds to multiple DRB1 chains. DRB3/4/5 alleles in linkage with DRB1 can also be expressed and bind to the same DRA chain. Shown is an example heterodimer, DRB1*11:01/DRA1*01:01, to demonstrate possible binding motif registers and an example HLA-II-binding motif with anchor residues (colored) in positions P1, P4, P6, and P9. C, HLA-II allele frequency plots for HLA-DR (23) (http://www.allelefrequencies.net/) (24). APCs, antigen-presenting cells; HLA-II, human leukocyte antigen class II.
HLA-II peptidome profiling is accomplished through LC-MS/MS sequencing of peptides obtained from either multiallelic or monoallelic systems. Other methods for HLA-II peptidome profiling include yeast display and peptide microarray systems (25, 26). Monoallelic systems have only one HLA heterodimer expressed or purified to ensure that all peptides can be assigned to a single heterodimer. Conversely, multiallelic approaches utilize cell lines with multiple HLA heterodimers. Although multiallelic datasets have provided novel insights into the binding rules of HLA-II heterodimers, these datasets often contain some uncertainty, as peptides must be assigned to a specific heterodimer using either deconvolution or machine learning methods that are trained on previously learned HLA-II allele-specific peptide-binding motifs (14, 15, 27). It can be unclear which HLA-II heterodimer a peptide was bound to in multiallelic datasets because distinct HLA-II heterodimers can share an α or a β chain, creating overlap between peptide-binding groove preferences. Furthermore, HLA-II-binding peptides, usually 12 to 25 amino acids long, do not always bind in the same register because they overhang from the peptide-binding groove (Fig. 1B).
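The register ambiguity can be made concrete with a small Python sketch. It enumerates every nine-residue window of a peptide as a candidate binding core and counts matches to anchor preferences at P1, P4, P6, and P9. The motif (`TOY_MOTIF`) and the example peptide are hypothetical and chosen only for illustration; they are not a measured motif for any heterodimer, and real deconvolution tools use far richer scoring.

```python
def candidate_cores(peptide, core_len=9):
    """Yield (offset, core) for every possible binding register."""
    for start in range(len(peptide) - core_len + 1):
        yield start, peptide[start:start + core_len]


# Hypothetical anchor preferences keyed by 0-based core index
# (P1, P4, P6, P9 correspond to indices 0, 3, 5, 8).
TOY_MOTIF = {0: set("FWYLIV"), 3: set("AS"), 5: set("ST"), 8: set("KR")}


def anchor_matches(core, motif=TOY_MOTIF):
    """Count how many anchor positions of the core satisfy the motif."""
    return sum(core[i] in allowed for i, allowed in motif.items())


def best_registers(peptide, motif=TOY_MOTIF):
    """Return all (offset, core) pairs that share the top anchor score."""
    scored = [
        (anchor_matches(core, motif), start, core)
        for start, core in candidate_cores(peptide)
    ]
    top = max(score for score, _, _ in scored)
    return [(start, core) for score, start, core in scored if score == top]


peptide = "DPEFQGAESPLKTGN"  # hypothetical 15-mer
print(len(list(candidate_cores(peptide))))  # 7 possible registers
print(best_registers(peptide))              # [(3, 'FQGAESPLK')]
```

A 15-residue peptide already has seven possible registers, and a 25-residue peptide has seventeen, which is why a single well-defined motif (for example, from monoallelic data) helps resolve the binding core.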
The ambiguous binding register makes HLA-II binding cores, typically nine amino acids with 2 to 4 anchor positions, difficult to identify (13-16) (Fig. 1B). In contrast, monoallelic datasets that contain only one HLA-II heterodimer can be used to determine allele-specific HLA-II-binding registers without preexisting knowledge and leveraged to improve multiallelic deconvolution methods (16). Although monoallelic approaches decrease ambiguity in peptide assignment and binding motif determination, engineering monoallelic cell lines may alter processing and presentation machinery and introduce bias. Therefore, combining both multiallelic and monoallelic approaches to HLA-II peptidomics deepens our understanding of HLA-II processing and presentation rules.
Despite the fact that HLA-II peptidomics has been implemented since the early 1990s (28-30), obstacles remain that prevent broader use. For example, in contrast to HLA-I, there are no constant regions across all possible HLA-II heterodimers comparable with the beta-2 microglobulin chain of HLA-I complexes. The variability in protein sequence across HLA-II heterodimers creates the possibility that existing pan-HLA-II and HLA-DR-, HLA-DP-, and HLA-DQ-specific antibodies could bind to different heterodimers with varying affinities. Interestingly, some HLA-II peptidome studies leverage mixtures of pan-HLA-II and HLA-DR-specific antibodies to enable more efficient HLA-II peptidome purification (31). We are currently limited in our ability to determine anti-HLA-II antibody-binding bias because of the lack of reagents to enrich for single HLA-II heterodimers. To partially address this potential bias, evaluations of antibody purifications from monoallelic HLA-II cell lines (16, 32) compared with orthogonal HLA-II purification technologies, such as affinity-tagged HLA-II, across a diverse set of HLA-II heterodimers could provide insight.
Until these types of data are produced for a majority of HLA-II heterodimers, the potential bias of HLA-II immunopurification antibodies remains an open question.
In contrast to typical proteomic studies initiated by tryptic digestion of a proteome, immunopurified HLA-II peptides have unique sequence characteristics and are isolated at abundances that are orders of magnitude lower. Consequently, HLA peptidomic studies are challenging and require the use of sensitive instrumentation and chromatography techniques. Simply increasing cell input is not feasible for most HLA-II peptidome applications that utilize precious samples, such as patient tissues. Alternatively, the combination of gas-phase fractionation using ion mobility and microscale offline fractionation has been shown to increase peptide identifications for low-input samples (33-35). These methods are likely to benefit HLA-II peptidome efforts but should be altered from standard HLA-I workflows to enable better recovery of longer and more hydrophobic peptides (16). Potential changes include eluting peptides with a higher percentage of organic solvent in both offline desalting protocols and online LC-MS/MS gradients. Different reversed-phase stationary phase materials that enable more efficient capture of hydrophobic peptides may also be advantageous. Longer maximum injection times can also be utilized to increase the quality of data from these low-level peptides (14, 16). Compared with HLA-I, HLA-II LC-MS/MS methods should allow for a wider range of charge states (+2 to +5) and use increased collision energies for more complete peptide fragmentation (16). The long length of HLA-II peptides may also result in the selection of the 13C isotope peak for precursor isolation and MS2 fragmentation.
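A rough calculation illustrates why the 13C isotope peak becomes prominent for long peptides. The Python sketch below is a simplification under stated assumptions: it considers carbon isotopes only (ignoring 15N, 2H, 18O, and 34S), uses an approximate natural 13C abundance of 1.07%, and estimates carbon count from the "averagine" value of about 4.94 carbons per residue. The function names are hypothetical.

```python
C13_ABUNDANCE = 0.0107        # approximate natural abundance of 13C
C13_C12_MASS_DIFF = 1.003355  # Da, mass difference between 13C and 12C
CARBONS_PER_RESIDUE = 4.94    # averagine estimate (assumption)


def carbon_isotope_ratio(n_carbons, p=C13_ABUNDANCE):
    """Intensity ratio of the one-13C peak to the all-12C peak.

    From the binomial distribution over carbons:
    P(one 13C) / P(no 13C) = n * p / (1 - p).
    """
    return n_carbons * p / (1.0 - p)


def isotope_spacing_mz(charge):
    """Spacing between adjacent isotope peaks on the m/z axis."""
    return C13_C12_MASS_DIFF / charge


for length in (9, 12, 18, 19, 25):
    ratio = carbon_isotope_ratio(length * CARBONS_PER_RESIDUE)
    print(f"{length:2d} residues: 13C/12C peak ratio = {ratio:.2f}")

for charge in (2, 3, 4, 5):
    print(f"charge +{charge}: isotope spacing = {isotope_spacing_mz(charge):.3f} m/z")
```

Under these assumptions the 13C peak overtakes the monoisotopic peak at roughly 19 residues, within the typical HLA-II length range, whereas a nine-residue peptide remains dominated by the monoisotopic peak. For charge states of +2 and higher, adjacent isotope peaks are separated by about 0.5 m/z or less.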
To account for the possibility that the 13C peak is the most abundant, as is the case for longer peptides, the precursor ion isolation width should be set wider than for the shorter HLA-I peptides, to at least 1.1 m/z, to enable coisolation of both the precursor 12C and the 13C isotope peaks. Continuing to tailor LC-MS/MS instrument methods to suit the low abundance, nontryptic nature, and long length of HLA-II peptides will further improve their detection and identification.
The unique characteristics of HLA-II peptides necessitate the use of specialized search strategies for interpreting peptide sequences from MS/MS spectra that are different from conventional tryptic peptide-based approaches. MaxQuant, PEAKS, and Spectrum Mill are spectral interpretation software packages that provide suitable functionality and have been utilized for HLA-II peptidome searches (14, 16, 36). Typical database search settings for HLA-II searches include a false discovery rate of <1% at the peptide level, no enzyme specificity, and a maximum precursor mass range that accommodates peptides 12 to 25 amino acids in length (16, 36). Fragment ion type scoring and peak detection represent two major challenges presented by MS/MS spectra from HLA-II peptides. Although the HCD MS/MS spectrum of a typical tryptic peptide is primarily composed of y-ions because of the presence of a basic C-terminal Lys or Arg, HLA-II peptides have no such constraint. In fact, reported HLA-II peptide sequences demonstrate that basic amino acids can be at any position in the peptide. This results in a collection of spectra that include subsets that are y-ion rich, b-ion rich, mixed b and y ion, or internal ion rich. Internal ions are much more pronounced in HLA peptide spectra than in tryptic peptides, and many search engines do not currently account for them.
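The effect of the "no enzyme specificity" setting on search space size can be sketched in a few lines of Python. The example is a minimal illustration, not a description of how any of the named search engines generate candidates: it simply enumerates every subsequence of 12 to 25 residues from a protein. The function names and the 400-residue protein length are assumptions chosen for the example.

```python
def no_enzyme_peptides(sequence, min_len=12, max_len=25):
    """Yield every subsequence of the allowed lengths (no cleavage rule)."""
    for length in range(min_len, max_len + 1):
        for start in range(len(sequence) - length + 1):
            yield sequence[start:start + length]


def count_no_enzyme_peptides(protein_len, min_len=12, max_len=25):
    """Number of candidates the generator above yields for a given length."""
    return sum(
        max(protein_len - length + 1, 0)
        for length in range(min_len, max_len + 1)
    )


# A hypothetical 400-residue protein contributes thousands of candidates.
print(count_no_enzyme_peptides(400))  # 5355
```

Because every start position and every allowed length is a candidate, the search space grows roughly as the protein length times the number of allowed peptide lengths, which is one reason stringent false discovery rate control matters for these searches.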
The generally low abundance of peptides in an HLA-II peptidome LC-MS/MS experiment produces spectra with signal/noise ratios near the lower limits for peak detection. In addition, the aforementioned potential selection of the 13 C isotope peak for precursor ion isolation and MS2 fragmentation creates the need for more robust deisotoping during the spectral preprocessing and peak-detection steps of a database search. With incremental improvements in fragment ion type scoring and peak detection in recent versions of Spectrum Mill, the number of confident peptide identifications from HLA-I and HLA-II datasets was increased by 20 to 100%, with the greatest improvements apparent among datasets that are the least tryptic-like (16, 37) . After database searching, additional data-cleaning steps such as removal of common laboratory contaminant proteins and peptides can further improve data quality (16, 37) . Overall, HLA-II allelic diversity and the unique characteristics of HLA-II peptides will continue to drive improvements in spectral interpretation tools to further increase the quality of HLA-II peptide identifications from LC-MS/MS data. HLA-II epitope-prediction methods, such as NetMHCIIpan (38) , originally used data from biochemical HLA-II-binding assays reported in the Immune Epitope Database (39) and the SYFPEITHI database (40) . Epitope-prediction algorithms, such as NetMHCIIpan, have used machine learning methods to identify the consensus binding motifs of synthetic peptides that were able to bind to biochemically purified HLA-II heterodimers. Although these biochemical assays combined with prediction methods were vital for improving our understanding of the rules of HLA-II peptide binding, HLA-II binding assay datasets lack complete coverage of HLA-DP, HLA-DQ, and rare HLA-DR heterodimers and do not account for the rules that govern endogenous processing and presentation. 
Hence, much effort has been put into generating LC-MS/MS datasets that characterize endogenously processed peptide ligands from diverse HLA-II heterodimers (14, 16, 19, 36, 41). These eluted HLA-II peptide datasets enable rapid binding register motif determination and investigations into the rules of endogenous processing and presentation. Furthermore, most HLA-II epitope prediction methods now incorporate LC-MS/MS-eluted HLA-II peptide datasets (13, 15, 17). For example, NetMHCIIpan4.0 (13) leverages both LC-MS/MS-identified HLA-II-eluted ligand and biochemical HLA-II affinity datasets (27), while MARIA (15), MixMHC2pred (17), and neonmhc2 (16), among others, leverage endogenously processed HLA-II peptide LC-MS/MS datasets. As more alleles are profiled, instrumentation becomes more sensitive, and database search tools improve (42), we expect HLA-II peptide sequencing by LC-MS/MS to further improve our ability to predict HLA-II epitopes.
Genome-wide association studies (GWASs) have begun to reveal the role HLA-II genes play in autoimmune disorders, cancer, and infectious disease (43-46). GWASs have demonstrated relatively strong associations between certain autoimmune diseases and HLA-II alleles. However, GWASs have limitations that can prevent the causal relationship of HLA-II antigen presentation to CD4+ T cells from being fully elucidated (47, 48). This is due in part to the highly polymorphic nature of HLA alleles and the numerous HLA-II paralogs that create genotyping challenges. Linkage disequilibrium in the HLA locus exacerbates these intricacies because the nonrandom assortment of HLA genes makes it difficult to identify specific causal variants of alleles, as several alleles are inherited together (49, 50). Despite these complexities, GWASs have clearly illustrated that studying HLA-II presentation will help reveal the molecular pathways of autoimmune diseases, cancer, and infectious diseases.
Several studies have shown that HLA-II antigen presentation and CD4+ T cell activation are involved in positive responses to cancer immunotherapies (51-56). In vivo, CD4+ T cells appear to play a critical role in sustaining the effects of PD-1 blockade; depleting CD4+ T cells in mice completely reverses the antitumor effects of a PD-1 blockade, presumably because of a loss of CD4+ T helper immunity to prime cytotoxic CD8+ T cell responses (57). In addition, patients in multiple clinical studies have been found to respond positively to infusions of HLA-II neoantigen-specific CD4+ helper tumor-infiltrating lymphocytes (1, 58), and improved clinical response has been observed when CD4+ T cells are targeted in combination with CD8+ T cells compared with CD8+ T cell targeting alone (54). However, even with CD4+ T cell activation, the response rate to tumor-infiltrating lymphocyte therapies and HLA epitope-specific vaccinations ranges from 40 to 70%, and responses are not always durable (58, 59), implying that investigating antigen presentation in response to immunotherapies is critical for understanding how to improve the selection of HLA-II-presented antigens.
Heritable polymorphisms in HLA-II are frequently associated with autoimmune diseases such as celiac disease (CD), inflammatory bowel disease, systemic lupus erythematosus, and type 1 diabetes (60-65). Normally, HLA-II-bound peptides elicit CD4+ T cells to release cytokines that promote CD8+ T cell expansion and trigger B cells to drive robust immune responses against non-self-antigens (66, 67). In the context of some autoimmune diseases, CD4+ T cells are conversely activated by self-derived antigens. One example of this is CD, where overactive CD4+ T cells recognize peptides derived from transglutaminase 2 (a widely expressed enzyme that is active during digestion of gluten) presented on HLA-II heterodimers, leading to the production of self-reactive antibodies (60).
Although CD is triggered by a foreign substance, its pathology is caused by a reaction to self-antigen presentation on specific HLA-II alleles (HLA-DQA1*05 and HLA-DQB1*02) (68). Current studies have investigated the relationship between B-cell epitopes and disease onset (69, 70) and suggest that studying disease-specific heterodimers and their peptide repertoires will provide insights into peptide processing and presentation rules that govern both B cell and T cell responses. Moreover, in type 1 diabetes, considerable work from the Unanue lab has identified and validated autoantigens using HLA-II peptidomics and T cell responses (71, 72), thus enabling insights into the features of antigenicity that allow autoimmunity to develop and providing a potential workflow for autoantigen identification in other autoimmune diseases. Future studies evaluating HLA-II peptide presentation in the context of autoimmune diseases will likely reveal features of HLA-II peptide antigenicity or specific antigens that can be utilized as either biomarkers or as therapeutic targets.
Investigations into the role of HLA-II in immune responses in the context of infectious disease are important for understanding disease pathology and the development of potential treatments. To date, susceptibility to hepatitis B, dengue, and West Nile virus has been associated with specific HLA-II alleles (44, 73, 74), and CD4+ T cell responses have been documented in a range of infectious diseases, including chikungunya, tuberculosis, and COVID-19 (75-78). Moreover, many viruses successfully evade the immune system by altering HLA-II expression. For example, vaccinia virus infection has been shown to reduce HLA-II presentation in B cells, macrophages, and dendritic cells (DCs) (79), while human cytomegalovirus marks HLA-DR for proteasomal degradation (80). HIV specifically targets APCs that express the cell surface protein CD4 and coreceptors that include CCR5 and CXCR4 (81).
The main targets for HIV are CD4+ T cells and macrophages, and their levels in the body decrease upon infection. Once inside the host cell, HIV replicates efficiently while simultaneously evading the immune response. HIV also impairs HLA-II presentation by increasing cell surface expression of immature HLA-II (82), decreasing the capability of the immune system to respond to the virus. LC-MS/MS has proven to be useful in identifying possible HLA-II vaccine targets that are both presented on infected cells and have conserved sequences in tuberculosis (83), leishmaniasis (84), and COVID-19 (85). These encouraging findings demonstrate the potential value of adding HLA-II peptidomics to current vaccine development pipelines, as well as its ability to reveal uncharacterized viral mechanisms of action and provide novel sources of biomarkers and potential treatment targets.
HLA-II peptide presentation involves both endocytosis and autophagy pathways and is primarily a function of specialized APCs, but has also been observed in a few other cell types (Fig. 2). HLA-II gene transcription is controlled almost entirely by the class II major histocompatibility complex transactivator (CIITA), a non-DNA-binding coactivator that is regulated by multiple different signals, including IFNγ, STAT-1α, and USF-1 (86, 87). CIITA is transcriptionally regulated by multiple cell type-specific promoters, resulting in the translation of three different protein isoforms (88). Promoters I and III result in constitutive expression of CIITA in APCs. For example, CIITA expression in DCs and B cells is driven by promoters I and III, respectively (88). Inducible expression of CIITA in non-APCs, such as epithelial cells (ECs) and CD4+ T cells (89-93), is linked to CIITA's promoter IV. Promoter IV is activated by IFNγ expression and leads to the recruitment of several cis-acting elements and trans-acting factors that induce transcription of CIITA and therefore upregulation of MHC-II.
Once CIITA is activated, HLA-II heterodimers are translated into the endoplasmic reticulum. Because HLA-II heterodimers are unstable in the absence of bound peptide, the invariant chain (Ii; CD74) functions as a chaperone to help assemble a stable complex of CD74 and HLA-II heterodimers (94). Next, CD74 directs the trafficking of HLA-II complexes to the lysosome or MHC class II compartment (MIIC), where CD74 is cleaved by proteases such as legumain (95) and cathepsins, such as S, L, and F (96-98). This cleavage does not completely remove CD74 from the peptide-binding groove but instead shortens it into a class II-associated invariant chain peptide (CLIP) that remains bound (99). CLIP can then be exchanged for higher affinity peptides from either endogenous or exogenous source proteins that are degraded in the lysosome with the help of chaperones HLA-DM and HLA-DO (99). HLA-DM acts by opening the binding groove of HLA-II to allow for the exchange of CLIP with high-affinity peptides in the MIIC (100). The role of HLA-DO is still being elucidated, but some studies have suggested it is a competitive inhibitor of HLA-DM that prevents peptide loading before antigenic peptides are present (101). Importantly, not all HLA-II alleles are sensitive to HLA-DM (102), demonstrating that HLA-II allele-specific rules have to be considered in the relevant context of class II-processing chaperones to accurately model the rules of processing and presentation.
Fig. 2. Profiling the HLA-II processing and presentation pathway by LC-MS/MS. HLA-II heterodimers are formed in the endoplasmic reticulum when alpha and beta chains are paired and loaded with CD74 to prevent heterodimer dissociation. The MHC-II heterodimers are then trafficked to the MIIC complex, where CD74 is trimmed via proteases such as cathepsin S into CLIP peptides that act as placeholders to block peptide loading.
Concurrently, antigen source proteins enter the MIIC by both endocytosis (exogenous proteins-purple) and autophagy (endogenous proteins-green). Before trafficking to the MIIC, source proteins are digested by proteases in endosomal/lysosomal compartments. Once cleaved by proteases, they are transported to the MIIC and loaded onto the HLA-II heterodimers with the help of the HLA-DM and HLA-DO chaperones, which remove CLIP peptides from the binding groove and facilitate the binding of antigen-derived peptides. Loaded HLA-II heterodimers are transported to the cell surface where circulating CD4+ T cells can recognize MHC-II-bound peptides via their TCR. As shown, HLA-II heterodimers on APCs tend to present peptides from endogenous proteins (autophagy) and exogenous source proteins (endocytosis). The repertoire of HLA-II-bound peptides can be profiled directly using LC-MS/MS to determine peptide sequences and abundances to retrospectively learn the rules of endogenous peptide processing and presentation and identify disease-specific CD4+ T cell targets. APCs, antigen-presenting cells; CLIP, class II-associated invariant chain peptide; HLA-II, human leukocyte antigen class II; MIIC, MHC class II compartment; TCR, T cell receptor.
Peptide binding to HLA-II heterodimers is governed by processing and presentation machinery that is expressed at varying levels among APC cell types and disease states (103). Studying the impact that differences in antigen processing proteins have on HLA-II peptide presentation may reveal rules that can be applied to improve HLA-II epitope prediction models. For example, DCs are an important APC involved in cancer progression and treatment, as these APCs endocytose apoptotic tumor cells and tumor cell debris and present tumor antigens on HLA-II heterodimers to CD4+ T cells (2).
Profiling HLA-II peptides presented by different DC subsets that are represented at different stages of cancer progression and treatment may improve our understanding of the antigens driving CD4+ T cell responses. Alternatively, aberrant expression of HLA-II on ECs has been linked to HPV+ head and neck cancers, epithelial cancers, allergy, inflammatory bowel disease, and graft-vs-host disease (1, (104) (105) (106) (107) . Although expression of HLA-II in ECs has been shown to alter CD4+ and CD8+ activation in the gut (106), less is known about EC-specific rules of processing and presentation. Macrophages have also been implicated in some autoimmune diseases, such as multiple sclerosis, systemic lupus erythematosus, and rheumatoid arthritis, suggesting that HLA-II presentation may be involved (108) (109) (110) . Additional complexity in HLA-II presentation can be attributed to the varying expression levels of HLA-II processing and presentation proteins. Protein levels are dependent on the cell type, which is demonstrated by the differences in expression of CD74, cystatin, and cathepsin expression in different APCs (103) . It should be noted that studying some disease-related APC subsets in vivo is not possible today because their low frequencies in the blood are not compatible with HLA-II immunopurification input requirements. As such, careful design of in vitro model systems that express the appropriate APC gene sets and machinery is one way to overcome limited sample availability. In vitro model systems that can express APC-specific proteins at biologically relevant levels could inform how HLA-II presentation varies between these different cell types. Thus, it is crucial that future HLA-II peptidome studies consider both the disease relevant APC context and levels of antigen-processing machinery proteins to accurately learn the rules of HLA-II presentation. 
Currently, LC-MS/MS is being utilized to study HLA-II presentation across multiple diseases including cancer, autoimmunity, and infectious diseases (41, 111, 112) . For each disease, it is important to consider both the biological contexts and if putative HLA-II antigens are more likely to arise from an exogenous or endogenous source protein. In the context of HLA-II peptidomes in cancer, primary tumors, metastatic tumors, and numerous cancer cells lines have been profiled (16, 113, 114) . Although these data can identify cancer-specific HLA-II peptides on some tumors, not all tumor types express HLA-II, even when IFNγ is present in the tumor microenvironment (2) . Hence, studying how APCs, such as DCs and macrophages, uptake apoptotic tumor cells and present cancer-specific HLA-II peptides will help unravel the mechanisms underlying how HLA-II antigens activate CD4+ T cells (16) . In the context of autoimmunity, HLA-II peptidomics can assist in the identification of self-antigens that are driving aberrant immune responses (60) . In infectious disease, HLA-II peptidomics can not only identify endogenously presented viral-derived peptides but also may uncover which epitopes are driving B cell and antibody responses (67) . Close attention must be paid to the biological systems used in HLA-II infectious disease work because, similar to specific tumors, the infected cell may not express HLA-II. In this case, it may be more biologically representative to use an APC-feeding experiment in which a specific APC type is 'fed' the pathogen and presents the pathogen's antigens to immune cells (85) . Most importantly, the ability to predict disease-specific HLA-II peptides can be improved by studying the HLA-II peptidomes of disease-specific tissues, APCs, and in vitro model systems. 
Elucidating HLA-II and CD4+ T cell response pathways via genomics, transcriptomics, proteomics, and ribosome profiling has the potential to vastly improve our understanding of the molecular processes involved in disease pathology and treatment. Whole-exome sequencing, RNA-Seq, and HLA immunopeptidomics are currently being used to identify cancer neoantigens and confirm their presentation on HLA heterodimers (115). The identification of predicted neoantigens via epitope prediction pipelines that leverage genomics, transcriptomics, and HLA-peptidomics data significantly decreases the number of potential peptide antigens that are screened in immunogenicity assays when looking for novel vaccine or T cell targets. HLA epitope prediction workflows also make personalized neoantigen-based therapeutic approaches feasible within a timeline that is compatible with clinical settings (2, 52, 55). Currently, most neoantigen prediction is focused on HLA-I-presented peptides as CD8+ T cell targets. Given that several large-scale HLA-II peptidome profiling efforts have enabled HLA-II epitope prediction pipelines (14-16), and that CD4+ T cells play a crucial role in activating CD8+ T cells (2), it is likely that therapies aiming to elicit neoantigen-specific T cell responses will incorporate both HLA-I and HLA-II putative antigens. In addition, Kalaora et al. (116) recently identified bacteria-derived peptides presented on both HLA-I and HLA-II in multiple melanoma patients. These bacterial antigens elicited an immune response, suggesting that these nonself, tumor-specific antigens can be targeted with immunotherapeutic approaches. Identifying and targeting these tumor-specific bacterial peptide antigens may improve cancer therapies and warrants further study. A recent study from Laumont et al. (117) successfully detected HLA class I antigens derived from presumably "noncoding" regions of the genome from an Epstein-Barr virus-transformed B cell line 81.
Subsequent work from this group used a focused proteogenomic strategy in combination with RNA-Seq and identified several aberrantly expressed tumor-specific antigens from noncanonical sequences (118). Ribosome profiling (Ribo-seq) has further enabled the analysis of these noncanonical sequences by capturing and sequencing ribosome-protected mRNA fragments (119), thereby focusing on the subset of transcriptional sequences translated into protein. Several classes of translated alternative ORFs (altORFs) have been discerned from advances in Ribo-seq and other sequencing technologies: altORFs derived from the 5' and 3' untranslated regions, overlapping but out-of-frame altORFs in annotated protein-coding genes, long noncoding RNAs, pseudogenes, and other transcripts currently annotated as nonproteins (120-122). Ribo-seq has also facilitated the identification of HLA-I-presented peptides derived from altORFs and noncanonical ORFs in cancer cell lines and patient-derived xenografts that are not routinely included in HLA-peptidomics reference databases (123, 124). In the context of cancer, especially tumors of low mutational burden, altORFs are a source of putative neoantigens that expand the pool of immunotherapeutic targets (124, 125). Although HLA-I-presented altORFs and noncanonical ORFs have been the focus to date, HLA-II heterodimers may also present this class of peptides. Moreover, altORFs are not restricted to cancer, as they have been reported to be involved in the replication cycle of multiple viruses, including SARS-CoV-2 (112, 126). Thus, HLA-II peptidome profiling that includes reference exome-derived ORFs and noncanonical protein-coding products will deepen our understanding of endogenously processed and presented HLA-II antigens across multiple diseases.

Immune reactions to monoclonal antibody therapies used to treat autoimmune disease and cancer, such as anti-TNF (127) and PD-1/PD-L1 blockade (128), have been recorded in patients.
These events remain difficult to predict, and both immune-related adverse events (irAEs) and treatment failure may relate to the development of antidrug antibodies (ADAs) (129). The development of ADAs is far from rare, with as many as 62% of Crohn's disease patients treated with infliximab, an anti-TNF mAb, developing these antibodies (130). The high incidence of ADAs, paired with our limited understanding of how they arise, creates a need for prediction methods that identify individuals who are more likely to develop ADAs, experience treatment failure, and suffer from irAEs. Because HLA-II allele associations have been described for patients with adverse responses, HLA-II profiling of patients and model systems treated with protein biologics for cancer, autoimmune disorders, and infectious diseases may be required (131-134). MS-based HLA-II immunopeptidome analysis is one approach that can be used to assess the potential for immunological off-target effects by directly identifying aberrantly presented HLA-II peptides either induced by or derived from protein biologics. A majority of HLA peptidome-profiling efforts focus on studying epitope presentation by disease-specific cell lines and tissues. However, irAEs often impact other areas of the body, including the skin, gastrointestinal tract, and hepatic, endocrine, and neurologic systems (135). To better understand and potentially predict irAEs related to biotherapeutic treatments, large-scale HLA-II peptide-profiling efforts across APCs and cell types involved in off-target immune side effects could be leveraged. To accomplish this, reference protein databases that include the sequences of biotherapeutic monoclonal antibodies, as well as endogenous immunoglobulins, will need to be constructed to ensure that HLA-II peptides are accurately mapped to their source proteins (136, 137).
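As a minimal illustration of the database construction described above, the following Python sketch appends biotherapeutic sequences to a reference protein set and maps identified peptides back to their source proteins by exact substring matching. All sequences, protein names, and function names here are hypothetical placeholders chosen for illustration; they are not real antibody sequences, and an actual workflow would rely on a dedicated database search engine rather than this simplified lookup.

```python
from typing import Dict, List

# Toy reference proteome in FASTA format (hypothetical sequence, not a real protein).
REFERENCE_FASTA = """>toy_self_protein hypothetical example
MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ
"""

# Toy biotherapeutic chain (hypothetical sequence, not a real antibody).
TOY_BIOTHERAPEUTICS = {"toy_mAb_chain": "ACDEFGHIKLMNPQRSTVWYACDEFGHIK"}


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA text into {identifier: sequence}, using the first header token as the identifier."""
    records: Dict[str, str] = {}
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records[header] = "".join(chunks)
            header = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if header is not None:
        records[header] = "".join(chunks)
    return records


def build_search_database(reference: Dict[str, str],
                          biotherapeutics: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the reference database with tagged biotherapeutic entries added."""
    combined = dict(reference)
    for name, sequence in biotherapeutics.items():
        # The prefix makes drug-derived source proteins easy to recognize downstream.
        combined[f"BIOTHERAPEUTIC|{name}"] = sequence
    return combined


def map_peptide(peptide: str, database: Dict[str, str]) -> List[str]:
    """List every database entry whose sequence contains the peptide exactly."""
    return sorted(name for name, seq in database.items() if peptide in seq)


if __name__ == "__main__":
    database = build_search_database(parse_fasta(REFERENCE_FASTA), TOY_BIOTHERAPEUTICS)
    for peptide in ("GHIKLMNPQRSTVWY", "IAKQRQISFVKSHFS"):
        print(peptide, map_peptide(peptide, database))
```

In this toy setting, a peptide derived from the biotherapeutic chain has no source protein when only the reference entries are searched, which is the situation the expanded database is meant to avoid.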
Overall, the use of HLA-II peptidome profiling in combination with sequencing technologies that better characterize all possible endogenous protein products will inform our understanding of how biologics are modulating the immune response in the context of antigen presentation and help define possible biomarkers to identify patients with susceptibility to irAEs.

Although our current understanding of HLA-II biology is limited because of the complexity of APC HLA-II presentation and technological sensitivity, it has become clear that HLA-II antigen presentation pathways play a critical role in adaptive immune responses. This is demonstrated by the correlation of specific HLA-II allele expression with both the favorable response of cancer and infectious disease patients to CD4+ T cell therapies and the frequent occurrence of autoimmune diseases (12, 52, 55, 138). Continuing studies of the HLA-II presentation pathway with monoallelic cell lines and patient-derived, multiallelic MS datasets will advance our understanding of the biological mechanisms involved in HLA-II presentation. Although most studies consider only canonical HLA-DR, HLA-DP, and HLA-DQ alpha and beta chain pairing, there have been reports that mixed isotype pairing between DRB and DQA is possible (139, 140). As such, further investigations of mixed isotype pairing are required to determine its impact on human health and disease. In addition, longitudinal investigations of HLA-II repertoires of APCs in the blood of patients with risk HLA-II alleles and of patients before, during, and after disease manifestation and/or treatment may reveal whether changes in HLA-II epitope presentation are involved in disease progression. These data may also reveal specific HLA-II epitopes that are good candidates for disease biomarkers and immunotherapeutic targets.
As LC-MS/MS instrumentation becomes more sensitive and improvements to database searches and workflows are made to increase HLA-II peptide identification rates and incorporate all possible endogenous proteins, we expect to uncover mechanistic insights into how HLA-II presentation impacts disease progression and treatment. Machine learning algorithms for improved peptide identification from LC-MS/MS data, such as Prosit (42), will likely be used to decrease false identifications and improve data quality. A decreased false discovery rate is extremely valuable for HLA-II peptidomics, as many antigens are expressed at very low levels and may only be seen in a sample a handful of times. Finally, it is imperative that the relevant biological context of HLA-II presentation is incorporated into the experimental design to ensure that the correct research questions are being addressed. As we begin to understand more about the HLA-II antigen presentation pathway via the use of multiomic strategies in combination with HLA-II peptidomics, we envision that insights from these data will play a vital role in the development of future immunotherapeutic approaches that utilize HLA-II antigens.

Supplemental data - This article contains supplemental data (24).

Acknowledgments - The authors thank leaders in the field for their work and contributions and GibbsCluster (141) for access to their free resources. The authors also thank Marit van Buuren, Vikram Juneja, and Michael Rooney for useful conversations related to epitope prediction and the application of HLA-II epitopes to immunotherapies.

Funding and additional information - This work was supported in part by grants from the National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium Grants NIH/NCI U24-CA210986 and NIH/NCI U01 CA214125 (to S. A. C.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.
Conflict of interest - S. A. C. is a member of the scientific advisory boards of Kymera, PTM Biolabs, and Seer and a scientific advisor to Pfizer and Biogen. J. G. A. is a past employee and shareholder of Neon Therapeutics, Inc (now BioNTech US). S. K., S. S., K. R. C., S. A. C., and J. G. A. are named co-inventors on patent application(s) related to HLA peptide motif technology that is related to topics in this article filed by The Broad Institute.

Abbreviations - The abbreviations used are: ADAs, antidrug antibodies; altORFs, alternative ORFs; APCs, antigen-presenting cells; CD, celiac disease; CIITA, class II major histocompatibility complex transactivator; CLIP, class II-associated invariant chain peptide; DCs, dendritic cells; ECs, epithelial cells; GWASs, genome-wide association studies; HLA-II, human leukocyte antigen class II; irAEs, immune-related adverse events; MIIC, MHC class II compartment.

Cancer immunotherapy based on mutation-specific CD4+ T cells in a patient with epithelial cancer MHC-II neoantigens shape tumour immunity and response to immunotherapy Mutant MHC class II epitopes drive therapeutic immune responses to cancer Tumor-reactive CD4(+) T cells develop cytotoxic activity and eradicate large established melanoma after transfer into lymphopenic hosts Specific T helper cell requirement for optimal induction of cytotoxic T lymphocytes against major histocompatibility complex class II negative tumors Tumor-specific CD4+ melanoma tumor-infiltrating lymphocytes Revisiting the role of CD4+ T cells in cancer immunotherapy-new insights into old paradigms Past, present, and future of regulatory T cell therapy in transplantation and autoimmunity Cutting edge: CD4 + CD25 + regulatory T cells suppress antigen-specific autoreactive immune responses and central nervous system inflammation during active experimental autoimmune encephalomyelitis Regulatory CD4+ T cells and the control of autoimmune disease Therapeutic potential of TGF-β-induced CD4+ Foxp3+ 
regulatory T cells in autoimmune diseases CD4 T-cell immunotherapy for chronic viral infections and cancer NetMHCpan-4.1 and NetMHCIIpan-4.0: Improved predictions of MHC antigen presentation by concurrent motif deconvolution and integration of MS MHC eluted ligand data Robust prediction of HLA class II epitopes by deep motif deconvolution of immunopeptidomes Predicting HLA class II antigen presentation through integrated deep learning Defining HLA-II ligand processing and binding rules with mass spectrometry enhances cancer epitope prediction Footprints of antigen processing boost MHC class II natural ligand predictions High-throughput prediction of MHC class I and II neoantigens with MHCnuggets Improved peptide-MHC class II interaction prediction through integration of eluted ligand and peptide affinity data The major histocompatibility complex and its functions Structural requirements for pairing of alpha and beta chains in HLA-DR and HLA-DP molecules Increased endothelial expression of HLA-DQ and interleukin 1α in extra-articular rheumatoid arthritis. 
Results from immunohistochemical studies of skeletal muscle Increased epithelial expression of HLA-DQ and HLA-DP molecules in salivary glands from patients with Sjogren's syndrome compared with obstructive sialadenitis Sequence-based prediction of SARS-CoV-2 vaccine targets using a mass spectrometry-based bioinformatics predictor identifies immunogenic T cell epitopes Repertoirescale determination of class II MHC peptide binding via yeast display improves antigen prediction HLA class II specificity assessed by high-density peptide microarray interactions Improved prediction of MHC II antigen presentation through integration and motif deconvolution of mass spectrometry MHC eluted ligand data Characterization of peptides bound to the class I MHC molecule HLA-A2.1 by mass spectrometry MHC ligands and peptide motifs: First listing Allele-specific motifs revealed by sequencing of self-peptides eluted from MHC molecules HLA ligand atlas: A benign reference of HLA-presented peptides to improve T-cell-based cancer immunotherapy Dominant protection from HLA-linked autoimmunity by antigen-specific regulatory T cells Streamlined protocol for deep proteomic profiling of FAC-sorted cells and its application to freshly isolated murine immune cells A novel differential ion mobility device expands the depth of proteome coverage and the sensitivity of multiplex proteomic measurements Extending the comprehensiveness of immunopeptidome analyses using isobaric peptide labeling High-throughput and sensitive immunopeptidomics platform reveals profound interferonγ A large peptidome dataset improves HLA class I epitope prediction across most of the human population Quantitative predictions of peptide binding to any HLA-DR molecule of known sequence: NetMHCIIpan The immune epitope database (IEDB): 2018 update SYFPEITHI: database for MHC ligands and peptide motifs Antigen discovery and specification of immunodominance hierarchies for MHCII-restricted epitopes Prosit: Proteome-wide 
prediction of peptide tandem mass spectra by deep learning What has GWAS done for HLA and disease associations? Genome-wide association study confirming association of HLA-DP with protection against chronic hepatitis B and viral clearance in Japanese and Korean GWAS identifies novel SLE susceptibility genes and explains the association of the HLA region Polymorphisms in HLA class II genes are associated with susceptibility to Staphylococcus aureus infection in a white population HLA-DQA1*05 carriage associated with development of anti-drug antibodies to infliximab and adalimumab in patients with Crohn's disease Extended analysis identifies drug-specific association of 2 distinct HLA class II haplotypes for development of immunogenicity to adalimumab and infliximab Defining the role of the MHC in autoimmunity: A review and pooled analysis Linkage disequilibrium and haplotype blocks in the MHC vary in an HLA haplotype specific manner assessed mainly by DRB1 * 03 and DRB1 * 04 haplotypes CD8+ T cell immunity against a tumor/self-antigen is augmented by CD4+ T helper cells and hindered by naturally occurring T regulatory cells An immunogenic personal neoantigen vaccine for patients with melanoma CD4+ T-cell help in the tumor milieu is required for recruitment and cytolytic function of CD8+ T lymphocytes Targeting CD4+ T-helper cells improves the induction of antitumor responses in dendritic cellbased vaccination Personalized RNA mutanome vaccines mobilize poly-specific therapeutic immunity against cancer Intradermal vaccinations with RNA coding for TAA generate CD8+ and CD4+ immune responses and induce clinical benefit in vaccinated patients Response to programmed cell death-1 blockade in a murine melanoma syngeneic model requires costimulation, CD4, and CD8 T cells Adoptive transfer of tumor-infiltrating lymphocytes in melanoma: A viable treatment option Durable complete responses in heavily pretreated patients with metastatic melanoma using T-cell transfer 
immunotherapy Autoimmunity provoked by foreign antigens HLA-DR and -DQ phenotypes in inflammatory bowel disease: A meta-analysis collaboration>International Inflammatory Bowel Disease Genetics Consortium, Australia and New Zealand IBDGC, Australia and New Zealand IBDGC, Belgium IBD Genetics Consortium, et al. (2015) High-density mapping of the MHC identifies a shared role for HLA-DRB1*01:03 in inflammatory bowel diseases and heterozygous advantage in ulcerative colitis Genetics of the HLA region in the prediction of type 1 diabetes Hla class II antigens assoiated with lupus nephritis in Italian SLE patients Distribution of HLA class II alleles among Scandinavian patients with systemic lupus erythematosus (SLE): An increased risk of SLE among non Distinct pathways of antigen uptake and intracellular routing in CD4 and CD8 T cell activation B cell MHC class II signaling: A story of life and death HLA-DQ2.5 genes associated with celiac disease risk are preferentially expressed with respect to non-predisposing HLA genes: Implication for anti-gluten T cell response Efficient T cell-B cell collaboration guides autoantibody epitope bias and onset of celiac disease ) B cell tolerance and antibody production to the celiac disease autoantigen transglutaminase 2 The MHC-II peptidome of pancreatic islets identifies key features of autoimmune peptides Natural peptides selected by diabetogenic DQ8 and murine I-A(g7) molecules show common sequence specificity Association between HLA class I and class II alleles and the outcome of West Nile virus infection: An exploratory study HLA class I and class II associations in Dengue viral infections in a Sri Lankan population Immunoproteomic analysis of a Chikungunya poxvirus-based vaccine reveals high HLA class II immunoprevalence Antigenspecific CD4-and CD8-positive signatures in different phases of Mycobacterium tuberculosis infection Phenotype and kinetics of SARS-CoV-2-specific T cells in COVID-19 patients with acute respiratory 
distress syndrome Mapping the SARS-CoV-2 spike glycoproteinderived peptidome presented by HLA class II on dendritic cells Disruption of MHC class II-restricted antigen presentation by vaccinia virus Cytomegalovirus US2 destroys two components of the MHC class II pathway, preventing recognition by CD4 + T cells CD4+ T cell depletion in human immunodeficiency virus (HIV) infection: Role of apoptosis HIV-1 Nef impairs MHC class II antigen presentation and surface expression Identification of antigens presented by MHC for vaccines against tuberculosis Identification of broadly conserved cross-species protective Leishmania antigen and its responding CD4+ T cells The human leukocyte antigen class II immunopeptidome of the SARS-CoV-2 spike glycoprotein Regulation of MHC class II expression by interferon-gamma mediated by the transactivator gene CIITA CIITA is a transcriptional coactivator that is recruited to MHC class II promoters by multiple synergistic interactions with an enhanceosome complex Activation of the MHC class II transactivator CIITA by interferon-γ requires cooperative interaction between Stat1 and USF-1 Regulation of MHC class II gene expression by the class II transactivator Induction of CIITA and modification of in vivo HLA-DR promoter occupancy in normal thymic epithelial cells treated with IFN-gamma: Similarities and distinctions with respect to HLA-DR-constitutive B cells Macroautophagy substrates are loaded onto MHC class II of medullary thymic epithelial cells for central tolerance Generation of diversity in thymic epithelial cells IFNγ modulates the immunopeptidome of triple negative breast cancer cells by enhancing and diversifying antigen processing and presentation The multifaceted roles of the invariant chain CD74 -more than just a chaperone The expression of legumain, an asparaginyl endopeptidase that controls antigen processing, is reduced in endotoxin-tolerant monocytes Cathepsin S controls MHC class II-mediated antigen presentation by 
epithelial cells in vivo Critical role in ii degradation and CD4 T cell selection in the thymus Role for cathepsin F in invariant chain processing and major histocompatibility complex class II peptide loading by macrophages List of contributors Mechanisms of peptide repertoire selection by HLA-DM HLA-DO acts as a substrate mimic to inhibit HLA-DM by a competitive mechanism HLA-DP, HLA-DQ, and HLA-DR have different requirements for invariant chain and HLA-DM Variations in MHC class II antigen processing and presentation in health and disease Epithelial MHC class II expression and its role in antigen presentation in the gastrointestinal and respiratory tracts Renal proximal tubular epithelial cells exert immunomodulatory function by driving inflammatory CD4+ T cell responses Epithelia: Lymphocyte interactions in the gut High level expression of MHC-II in HPV+ head and neck cancers suggests that tumor epithelial cells serve an important role as accessory antigen presenting cells Macrophages in inflammatory multiple sclerosis lesions have an intermediate activation status The contribution of macrophages to systemic lupus erythematosus Macrophage heterogeneity in the context of rheumatoid arthritis Mass spectrometry profiling of HLA-associated peptidomes in mono-allelic cells enables more accurate epitope prediction SARS-CoV-2 infected cells present HLA-I peptides from canonical and out-of-frame ORFs Unexpected abundance of HLA class II presented peptides in primary renal cell carcinomas CIITA-transduced glioblastoma cells uncover a rich repertoire of clinically relevant tumor-associated HLA-II antigens Comprehensive analysis of cancerassociated somatic mutations in class I HLA genes Identification of bacteria-derived HLA-bound peptides in melanoma Global proteogenomic analysis of human MHC class I-associated peptides derived from non-canonical reading frames Noncoding regions are the main source of targetable tumorspecific antigens Genome-wide analysis in vivo of 
translation with nucleotide resolution using ribosome profiling A regression-based analysis of ribosome-profiling data reveals a conserved complexity to mammalian translation Many lncRNAs, 5'UTRs, and pseudogenes are translated and some are likely to express functional proteins Ribosome profiling reveals resemblance between long noncoding RNAs and 5' leaders of coding RNAs Integrated proteogenomic deep sequencing and analytics accurately identify non-canonical peptides in tumor immunopeptidomes Thousands of novel unannotated proteins expand the MHC I immunopeptidome in cancer Most non-canonical proteins uniquely populate the proteome or immunopeptidome The coding capacity of SARS-CoV-2 Immunogenicity and autoimmunity during anti-TNF therapy Distinctive germline expression of class I human leukocyte antigen (HLA) alleles and DRB1 heterozygosis predict the outcome of patients with non-small cell lung cancer receiving PD-1/PD-L1 immune checkpoint blockade Immunogenicity to biotherapeuticsthe role of anti-drug immune complexes Predictors of anti-TNF treatment failure in anti-TNF-naive patients with active luminal Crohn's disease: A prospective, multicentre, cohort study Monoclonal antibodies in cancer therapy Targeting B cells and plasma cells in autoimmune diseases A human monoclonal antibody blocking SARS-CoV-2 infection Enabling routine MHC-II-associated peptide proteomics for risk assessment of drug-induced immunogenicity Immune-related adverse events with immune checkpoint blockade: A comprehensive review Antigen presentation profiling reveals recognition of lymphoma immunoglobulin neoantigens Safety of the tau-directed monoclonal antibody BIIB092 in progressive supranuclear palsy: A randomised, placebo-controlled, multiple ascending dose phase 1b trial The epidemiology of autoimmune diseases Role of a novel human leukocyte antigen-DQA1*01:02;DRB1*15:01 mixed isotype heterodimer in the pathogenesis of "humanized" multiple sclerosis-like disease * A novel HLA class 
II molecule (DRα-sDQβ) created by mismatched isotype pairing GibbsCluster: Unsupervised clustering and alignment of peptide sequences