key: cord-0780918-g0veeh8m authors: Wilson, Eric A.; Hirneise, Gabrielle; Singharoy, Abhishek; Anderson, Karen S. title: TOTAL PREDICTED MHC-I EPITOPE LOAD IS INVERSELY ASSOCIATED WITH POPULATION MORTALITY FROM SARS-COV-2 date: 2021-02-25 journal: Cell Rep Med DOI: 10.1016/j.xcrm.2021.100221 sha: 670525c1518c8d9a6c98f9961848d05a56947d2d doc_id: 780918 cord_uid: g0veeh8m Polymorphisms in MHC-I protein sequences across human populations significantly impacts viral peptide binding capacity and thus alters T cell immunity to infection. In the present study, we assess the relationship between observed SARS-CoV-2 population mortality and the predicted viral binding capacities of 52 common MHC-I alleles. Potential SARS-CoV-2 MHC-I peptides are identified using a consensus MHC-I binding and presentation prediction algorithm, called EnsembleMHC. Starting with nearly 3.5 million candidates, we resolve a few hundred highly probable MHC-I peptides. By weighing individual MHC allele-specific SARS-CoV-2 binding capacity with population frequency in 23 countries, we discover a strong inverse correlation between predicted population SARS-CoV-2 peptide binding capacity and mortality rate. Our computations reveal that peptides derived from the structural proteins of the virus produce a stronger association with observed mortality rate, highlighting the importance of S, N, M, E proteins in driving productive immune responses. In December 2019, the novel coronavirus, SARS-CoV-2 was identified from a cluster of cases of pneumonia in Wuhan, China 1,2 . With over 73.1 million cases and over 1.6 million deaths, the viral spread has been declared a global pandemic by the World Health Organization 3 . Due to its high rate of transmission and unpredictable severity, there is an immediate need for information surrounding the adaptive immune response towards SARS-CoV-2. A robust T cell response is integral for the clearance of coronaviruses, and generation of lasting immunity 4 . The potential role of T cells for coronavirus clearance has been supported by the identification of immunogenic CD8 + T cell epitopes in the S (Spike), N (Nucleocapsid), M (Membrane), and E (Envelope) proteins 5 . Additionally, SARS-CoV specific CD8 + T cells have been shown to provide long lasting immunity with memory CD8 + T cells being detected up to 17 years post infection 4, 6, 7 . The specifics of the T cell response to SARS-CoV-2 is still evolving. However, a recent screening of SARS-CoV-2 peptides revealed a majority of the CD8 + T cell immune response is targeted towards viral structural proteins (N, M, S) 8 . A successful CD8 + T cell response is contingent on the efficient presentation of viral protein fragments by Major Histocompatibility Complex I (MHC-I) proteins. MHC-I molecules bind and present peptides derived from endogenous proteins on the cell surface for CD8 + T cell interrogation. The MHC-I protein is highly polymorphic, with amino acid substitutions within the peptide binding groove drastically The accurate assessment of differences in SARS-CoV-2 binding capacities across MHC-I allelic variants requires the isolation of MHC-I peptides with a high probability of being presented. EnsembleMHC provides the requisite precision through the use of allele and algorithm-specific score thresholds and peptide confidence assignment. MHC-I alleles substantially vary in both peptide binding repertoire size and median binding affinity 25 . The EnsembleMHC workflow addresses this inter-allele variation by identifying peptides based on MHC allele and algorithm-specific binding affinity thresholds. These thresholds were set by benchmarking each of the seven component algorithms against 52 single MHC allele peptide data sets 17 . Each data set consists of mass spectrometry-confirmed MHC-I peptides that have been naturally presented by a model cell line expressing one of the 52 select MHC-I alleles. These experimentally validated peptides, denoted target peptides, were supplemented with a 100-fold excess of decoy peptides. Decoys were generated by randomly sampling peptides that were not detected by mass spectrometry, but were derived from the same protein sources as a detected target peptide. Algorithm and allele-specific binding affinity thresholds were then identified through the independent application of each component algorithm to all MHC allele data sets. For every data set and algorithm combination, the target and decoy peptides were ranked by predicted binding affinity to the MHC allele defined by that data set. Then, an algorithm-specific binding affinity threshold was set to the minimum score needed to isolated the highest affinity peptides commensurate to 50% of the observed allele repertoire size (STAR methods, Figure S1A ). The observed allele repertoire size was defined as the total number of target peptides within a given single MHC allele data set. Therefore, if a data set had 1,000 target peptides, the top 500 highest affinity peptides would be selected, and the algorithm-specific threshold would be set to the predicted binding affinity of the 500 th peptide. This parameterization method resulted in the generation of a customized set J o u r n a l P r e -p r o o f of allele and algorithm-specific binding affinity thresholds in which an expected quantity of peptides can be recovered. Consensus MHC-I prediction typically require a method for combining outputs from each individual component algorithm into a composite score. This composite score is then used for peptide selection. EnsembleMHC identifies high-confidence peptides based on filtering by a quantity called peptide FDR (STAR methods Eq. 1). During the identification of allele and algorithm-specific binding affinity thresholds, the empirical false detection rate (FDR) of each algorithm was calculated. This calculation was based on the proportion of target to decoy peptides isolated by the algorithm specific binding affinity threshold. A peptide FDR is then assigned to each individual peptide by taking the product of the empirical FDRs of each algorithm that identified that peptide for the same MHC-I allele. Analysis of the parameterization process revealed that the overall performance of each included algorithms was comparable, and there was diversity in individual peptide calls by each algorithm, supporting an integrated approach to peptide confidence assessment (Figure S1B-D) . Peptide identification by EnsembleMHC was performed by selecting all peptides with a peptide FDR of less than or equal to 5% 26 . The efficacy of peptide FDR as a filtering metric was determined through the prediction of naturally presented MHC-I peptides derived from ten tumor samples 17 (Figure 1 ). Similar to the single MHC allele data sets, each tumor sample data set consisted of mass spectrometry-detected target peptides and a 100fold excess of decoy peptides. The relative performance of EnsembleMHC was assessed via comparison with individual component algorithms. Peptide identification by each algorithm was based on a restrictive or permissive binding affinity thresholds ( Figure 1A (inset table) ). For the component algorithms, the permissive and restrictive thresholds correspond to commonly used binding affinity cutoffs for the identification of weak and strong binders, respectively 27 . The performance of each algorithm on the ten data sets was evaluated through the calculation of the empirical precision, recall, and F1 score. The average precision and recall of each algorithm across all tumor samples demonstrated an inverse relationship ( Figure 1A) . In general, restrictive binding affinity thresholds produced higher precision at the cost of poorer recall. When comparing the precision of each algorithm at restrictive thresholds, J o u r n a l P r e -p r o o f EnsembleMHC demonstrated a 3.4-fold improvement over the median precision of individual component algorithms. EnsembleMHC also produced the highest F1 score with an average of 0.51 followed by mhcflurry-presentation with an F1 score of 0.45, both of which are 1.5-2 fold higher than the rest of the algorithms ( Figure 1B) . This result was shown to be robust across a range of peptide FDR cutoff thresholds ( Figure S1E) , alternative performance metrics ( Figure S1F ) and other consensus-based prediction algorithms ( Figure S1G-H) . Furthermore, EnsembleMHC demonstrated the ability to more efficiently prioritize peptides with experimentally established immunogenicity from the Hepatitis-C genome polyprotein, the Dengue virus genome polyprotein, and the HIV-1 POL-GAG protein ( Figure S1I ). Taken together, these results demonstrate the enhanced precision of EnsembleMHC over individual component algorithms when using common binding affinity thresholds. In summary, the EnsembleMHC workflow offers two desirable features. First, it determines allelespecific binding affinity thresholds for each algorithm at which a known quantity of peptides are expected to be successfully presented on the cell surface. Second, it assigns a confidence level to each peptide call made by each algorithm. Together, these traits enhance the ability to identify MHC-I peptides with a high probability of successful cell surface presentation. EnsembleMHC was used to identify MHC-I peptides for the SARS-CoV-2 virus ( Figure 1C) . The resulting identification of high-confidence SARS-CoV-2 peptides allows for the characterization of alleles that are enriched or depleted for predicted MHC-I peptides. The resulting distribution of allelespecific SARS-CoV-2 binding capacities will then be weighed by the normalized frequencies of the 52 alleles ( Figure S2 , STAR Methods Eq. 5-6) in 23 countries to determine the population-specific SARS-CoV-2 binding capacity or EnsembleMHC population score (STAR Methods Eq. 7). The potential impact of varying population SARS-CoV-2 binding capacities on disease outcome can then be assessed by correlating population SARS-CoV-2 mortality rates with EnsembleMHC population scores. The MHC-I peptide-allele distribution for SARS-CoV-2 structural proteins is especially disproportionate J o u r n a l P r e -p r o o f MHC-I peptides derived from the SARS-CoV-2 proteome were predicted and prioritized using EnsembleMHC. A total of 67,207 potential 8-14mer viral peptides were evaluated for each of the considered MHC-I alleles. After filtering the pool of candidate peptides at the 5% peptide FDR threshold, the number of potential peptides was reduced from 3.49 Million to 971 (658 unique peptides) ( Figure S3A -B, Table S1 ). Illustrated in Figure 2A , the viral peptide-MHC allele (or peptide-allele) distribution for high-confidence SARS-CoV-2 peptides was determined by assigning the identified peptides to their predicted MHC-I alleles. There was a median of 16 peptides per allele with a maximum of 47 peptides (HLA-A*24:02), a minimum of 3 peptides (HLA-A*02:05), and an interquartile range (IQR) of 16 peptides. Quality assurance of the predicted peptides was performed by computing the peptide length frequencies and binding motifs. The predicted peptides were found to adhere to expected MHC-I peptide lengths 28 with 78% of the peptides being 9 amino acids in length, 13% being 10 amino acids in length, and 8% of peptides accounting for the remaining lengths ( Figure S3C-D) . Similarly, logo plots generated from predicted peptides were found to closely reflect reference peptide binding motifs for considered alleles 29 ( Figure S3E) . Overall, the EnsembleMHC prediction platform demonstrated the ability to isolate a short list of potential peptides which adhere to expected MHC-I peptide characteristics. The high expression, relative conservation, and reduced search space of SARS-CoV-2 structural proteins (S, E, M, and N) make MHC-I binding peptides derived from these proteins high-value targets for CD8 + T cell-based vaccine development. Figure 2B describes the peptide-allele distribution for predicted MHC-I peptides originating from the four structural proteins. This analysis markedly reduces the number of considered peptides from 658 to 108 (Table ?? ). The median number of predicted SARS-CoV-2 structural peptides assigned to each MHC-I allele was found to be 2 with a maximum of 12 peptides (HLAB*53:01), a minimum of 0 (HLA-B*15:02, B*35:03,B*38:01,C*03:03,C*15:02), and a IQR of 3 peptides. Analysis of the molecular source of the identified SARS-CoV-2 structural protein peptides revealed that they originate from enriched regions that are highly conserved ( Figure S4 ). This indicates that such peptides would be ideal candidates for targeted therapies as they are unlikely to be disrupted by mutation, and several peptides can be targeted using minimal stretches of the source protein. Altogether, consideration of MHC-I peptides derived only from SARS-CoV-2 structural proteins reduces the number of potential peptides to a condensed set of high-value targets that is amenable to experimental validation. Both the peptide-allele distributions, namely the ones derived from the full SARS-CoV-2 proteome and those from the structural proteins, were found to significantly deviate from an even distribution of predicted peptides as apparent in figure 2AB and reflected in the Kolmogorov-Smirnov test p-values ( Figure S5 , full proteome = 5.673e-07 and structural proteins = 1.45e-02). These results support a potential allele-specific hierarchy for SARS-CoV-2 peptide presentation. To determine if the MHC-I binding capacity hierarchy was consistent between the full SARS-CoV-2 proteome and SARS-CoV-2 structural proteins, the relative changes in the observed peptide fraction (number of peptides assigned to an allele / total number of peptides) between the two protein sets was visualized ( Figure 2C ). Six alleles demonstrated changes greater than the median peptide fraction ( = 0.015) when comparing the two protein sets. The greatest decrease in peptide fraction was observed for A*25:01 (1.52 times the median peptide fraction), and the greatest increase was seen with B*53:01 (2.38 times the median peptide fraction). Furthermore, the resulting SARS-CoV-2 structural protein peptideallele distribution was found to be more variable than the distribution derived from the full SARS-CoV-2 proteome with a quartile coefficient of dispersion of 0.6 compared to 0.44, respectively. This indicates that peptides derived from SARS-CoV-2 structural proteins experience larger relative inter-allele binding capacity discrepancies than peptides derived from the full SARS-CoV-2 proteome. Together, these results indicate a potential MHC-I binding capacity hierarchy that is more pronounced for SARS-CoV-2 structural proteins. The documented importance of MHC-I peptides derived from SARS-CoV-2 structural proteins 8 , coupled with the observed MHC allele binding capacity hierarchy and the high immunogenicity rate of SARS-CoV-2 structural protein MHC-I peptides identified by EnsembleMHC (Figure S5D) , prompts a potential J o u r n a l P r e -p r o o f relationship between MHC-I genotype and infection outcome. However, due to the absence of MHC genotype data for SARS-CoV-2 patients, we assessed this relationship at the population-level by correlating predicted country-specific SARS-CoV-2 binding capacity (or EnsembleMHC population score) with observed SARS-CoV-2 mortality. EnsembleMHC population scores (EMP) were determined for 23 countries (Table S2 ) by weighing the individual binding capacities of 52 common MHC-I alleles by their normalized endemic expression 18 (STAR Methods, Figure S2 ). Every country in the cohort is assigned two separate EMP scores, one calculated with respect to the 108 unique SARS-CoV-2 structural protein peptides (structural protein EMP) and the other with respect to the 658 unique peptides derived from the full SARS-CoV-2 proteome (full proteome EMP). The EMP score corresponds to the average predicted SARS-CoV-2 binding capacity of a select population. Therefore, individuals in a country with a high EMP score would be expected, on average, to present more SARSCoV-2 peptides to CD8 + T cells than individuals from a country with a low EMP score. The resulting EMP scores were then correlated with observed SARS-CoV-2 mortality (deaths per million) as a function of time (JAN -APR 2020). Temporal variance in community spread within the cohort of countries was corrected by truncating the SARS-CoV-2 mortality data set for each country to start after a certain minimum death threshold was met. For example, if the minimum death threshold was 50, then day 0 would be when each country reported at least 50 deaths. The number of countries included in each correlation decreases as the number of days increases due to discrepancies in the length of time that each country met a given minimum death threshold (Table S3) . Therefore, the correlation between EMP score and SARS-CoV-2 mortality was only estimated at time points where there were at least eight countries. The eight country threshold was chosen because it is the minimum sample size needed to maintain sufficient power when detecting large effect sizes ( > 0.85). The strength of the relationship between EMP score and SARS-CoV-2 mortality was determined using Spearman's rank-order correlation (for details concerning the choice of statistical tests, please refer to STAR Methods). Accordingly, both EMP scores and SARS-CoV-2 mortality data were converted into ascending ranks with the lowest rank indicating the minimum value and the highest rank indicating the J o u r n a l P r e -p r o o f maximum value. For instance, a country with an EMP score rank of 1 and death per million rank of 23 would have the lowest predicted SARS-CoV-2 binding capacity and the highest level of SARS-CoV-2related mortality. Using the described paradigm, the structural protein EMP score and the full proteome EMP score were correlated with SARS-CoV-2-related deaths per million for 23 countries. Total predicted population SARS-CoV-2 binding capacity exhibited a strong inverse correlation with observed deaths per million. This relationship was found to be true for correlations based on the structural protein EMP ( Figure 3A ) and full proteome EMP ( Figure S5A ) scores with a mean effect size of -0.66 and -0.60, respectively. Significance testing of the correlations produced by both EMP scores revealed that the majority of reported correlations are statistically significant with 63% attaining a p-value of ≤ 0.05. Correlations based on the structural protein EMP score demonstrated a 24% higher proportion of statistically significant correlations compared to the full proteome EMP score (74% vs 51%). Furthermore, correlations for EMP scores based on structural proteins produced narrower 95% confidence intervals ( Figure S5B , Table S3 ). Due to relatively low statistical power of the obtained correlations ( Figure S6 ), the positive predictive value for each correlation (STAR Methods, Eq. 8) was calculated. The resulting proportions of correlations with a positive predictive value of 95% were similar to the observed significant p-value proportions with 62% of all measured correlations, 72% of structural protein EMP score correlations, and 52% full proteome EMP score correlations ( Figure S5B ). The similar proportions of significant p-values and PPVs supports that an overall true association is being captured. Furthermore, analysis of similar sized peptide sets sampled from the full SARS-CoV-2 proteome revealed that the observed distinction between the correlations produced by the two protein groups are unlikely to be due to differences in peptide set sizes (Figure S7A-B) . Finally, the reported correlations did not remain after randomizing the allele assignment of predicted peptides prior to peptide FDR filtering ( Figure S7C-D) , through the use of any individual algorithm ( Figure S7E ). This indicates that the observed relationship is contingent on the high-confidence peptideallele distribution identified by EnsembleMHC. Altogether, these data demonstrate that the MHC-I allele hierarchy characterized by EnsembleMHC is inversely associated with SARS-CoV-2 population J o u r n a l P r e -p r o o f mortality, and that the relationship becomes stronger when considering only the presentation of SARS-CoV-2 structural proteins. The ability to use structural protein EMP score to identify high and low risk populations was assessed using the median minimum death threshold (50 deaths) at evenly spaced time points (Figure 3A , squares). All correlations, with the exception of day 1, were found to be significant with an average effect size of -0.71 ( Figure 3B) . Next, the countries at each day were partitioned into a high or low group based on whether their assigned EMP score was higher or lower than the median observed EMP score ( Figure 3C ). The resulting groups demonstrated a statistically significant difference in the median deaths per million between countries with low structural protein EMP score and countries with high structural protein EMP scores. Additionally, it was observed that deaths per million increased much more rapidly in countries with low structural protein EMP scores. Taken together, these results indicate that structural protein EMP score may be useful for assessing population risk from SARS-CoV-2 infections. In summary, we make several important observations. First, there is a strong inverse correlation between predicted population SARS-CoV-2 binding capacity and observed deaths per million. This finding suggests that outcome to SARS-CoV-2 may be tied to total epitope load. Second, the correlation between predicted epitope load and population mortality is stronger for SARS-CoV-2 structural MHC-I peptides. This suggests that CD8 + T cell-mediated immune response maybe primarily driven by recognition of epitopes derived from these proteins, a finding supported by recent T cell epitope mapping of SARS-CoV-2 8 . Finally, the EnsembleMHC population score can separate countries within the considered cohort into high or low risk populations. (Table S2) . Overall, the structural protein EMP scores produced a significantly stronger association with population SARS-CoV-2 mortality compared to other 12 descriptors (Figure 4A) . While various effect size trends were observed, all additional covariates failed to produce statistically significant correlations. To determine if the modeling of SARS-CoV-2 mortality rate could be improved by the combination of single socioeconomic or health-related risk factors with structural protein EMP scores, a set of linear models consisting of either a single risk factor (single feature model) or that factor combined with structural protein EMP scores (combination model) were generated for every time point across each minimum death threshold (STAR Methods). Following model generation, the adjusted coefficient of determination (R 2 ) and significance level of each individual model was extracted and aggregated by dependent variable (Figure S8 ). Single feature models were characterized by low R 2 ( = 0.0262) while combination models showed significant improvement ( = .496). Similarly, combination models demonstrated a substantially higher proportion of statistical significance ( Figure S8B) . To determine the set of features that produce the best fitting model, all possible combinations of explanatory factors (risk factors and structural protein EMP score) were tested. Subsequently, the top ten performing models, ranked by adjusted R 2 value, were selected for analysis ( Figure 4B ). The identified models were found to be largely significant (average proportion of significant regressions = 72%) and produce strong fits to the data (average R 2 = 0.7). Analysis of the dependent variables included in the top performing models revealed that all models used structural protein EMP scores followed by deaths per million due to complications from COPD (90% of models). The median model size included 3 features with a maximum of 5 features and a minimum of 2 features. The model producing the best fit (median R 2 =0.791) consisted of structural protein EMP scores, gender demographics, number of deaths due to COPD complications, the proportion of the population over the age of 65, and proportion of the population that is overweight ( Figure 4B) . All J o u r n a l P r e -p r o o f together, these results further indicate the robustness of the structural protein EMP score as a population level risk descriptor. In the present study, we uncover evidence supporting an association between population SARS-CoV-2 infection outcome and MHC-I genotype. In line with related work highlighting the relationship between total epitope load with HIV viral control 32 , we arrive at a working model that MHC-I alleles presenting more unique SARS-CoV-2 epitopes will be associated with lower mortality due to a higher number of capacity MHC-I alleles may be better protected. The correlation was shown to be stronger when calculating the EnsembleMHC population scores with respect to only structural proteins, reinforcing their relevance to viral immunity. Finally, the molecular origin of the 108 predicted peptides specific to SARS-CoV-2 structural proteins revealed that they are derived from enriched regions with a minimal predicted impact from amino acid sequence polymorphisms. The utility of structural protein EnsembleMHC population scores were further supported by a multivariate analysis of additional SARS-CoV-2 risk factors. These results emphasized the relative robustness of structural protein EMP scores as a population risk assessment tool. Furthermore, a linear model based on the combination of structural protein EMP scores and select population-level risk factors was identified a potential candidate for a predictive model for pandemic population severity. As such, the incorporation of the structural protein EMP score in more sophisticated models will likely improve epidemiological modeling. In order to achieve the highest level of accuracy in MHC-I predictions, the most up-to-date versions of each component algorithm were used. However, this meant that several of the algorithms (MHCflurry, netMHCpan-EL-4.0 and MixMHCpred) were benchmarked against subsets of mass spectrometry data that were used in the original training of these MHC-I prediction models. While this could result in an unfair weight applied to these algorithms in peptide FDR calculation, the individual FDRs of MHCflurry, netMHCpan-EL-4.0 and MixMHCpred were comparable to algorithms without this advantage ( Figure S1C ). Furthermore, the peptide selection of SARS-CoV-2 peptides was shown to be highly cooperative within EnsembleMHC (Figure S3A) , and individual algorithms failed to replicate the strong observed correlations between population binding capacity and observed SARS-CoV-2 mortality (Figure S7E ). In the future, the presented model could be applied to predict individual T cell capacity to mount a robust SARS-CoV-2 immune response. Evolutionary divergence of patient MHC-I genotypes have shown to be predictive of response to immune checkpoint therapy in cancer and HIV 34, 35 . However, confirmation will require large data sets associating individual patient MHC-I genotype and outcome. Additionally, future use of EnsembleMHC to design personalized T cell vaccines will require broad experimental validation of high scoring peptides, since EnsembleMHC predicts MHC-I peptides with a high probability of antigen presentation as opposed to directly predicting peptide immunogenicity. While previous work has determined that a majority of successfully presented viral MHC-I peptides are immunogenic 36 , there is an expectation that some presented SARS-CoV-2 MHC-I peptides will fail to produce an immune response. This work demonstrated a strong association between a population-level metric, SARS-CoV-2 MHC-I peptide binding capacity, and SARS-CoV-2 mortality rate by country. Other risk factors for SARS-CoV-2 specific mortality have been reported, including comorbidities, healthcare infrastructure, age, and gender. These risk factors are predicted to have a significant impact on individual patient outcome, which is not evaluated in this study. Other genetic determinants of severity, such as ACE2 polymorphism, were not SARS-CoV-2 structural protein-based EnsembleMHC population scores were assigned to 23 countries (Table S2A) , and correlated with observed mortality rate (deaths per million). The correlation coefficient is presented as a function of time. Individual country mortality rate data were aligned by truncating each data set to start after a minimum threshold of deaths was observed in a given country (line color). The Spearman's rank correlation coefficient between structural protein EMP score and SARS-CoV-2 mortality rate was calculated at every day following day 0 for each of the minimum death thresholds. Due to the differing lengths of time series analysis at each minimum death threshold, the number of days were normalized to improve visualization. Thus, normalized day 0 represents the day when qualifying countries recorded at least the number of deaths indicated by the minimum death threshold, and Further information and requests should be directed to and will be fulfilled by the Lead Contact, Karen Anderson (Karen.Anderson.1@asu.edu). This study did not generate new unique reagents. All data and code generated during this study are available at EnsembleMHC-Covid Parameterization of EnsembleMHC using mass spectrometry data. EnsembleMHC is able to achieve high levels of precision in peptide selection through the use of allele and algorithm-specific binding affinity thresholds. These binding affinity thresholds were identified through the parameterization of each algorithm on high-quality mass spectrometry data sets 17 Fifty-two common MHC-I alleles were selected for parameterization based on the criteria that they were characterized in Sarkizova et al. data sets and that all 7 component algorithms could perform peptide binding affinity predictions for that allele. Each target peptide (observed in the MS data set) was paired with 100 length-matched randomly sampled decoy peptides (not observed in the MS data set) derived from the same source proteins. If a protein was less than 100 amino acids in length, then every potential peptide from that protein was extracted. Each of the seven algorithms were independently applied to each of the 52 allele data sets. For each allele data set, the minimum score threshold was determined for each algorithm that recovered 50% of the allele repertoire size (the total number of target peptides observed in the MS data set for that allele). Additionally, the expected accuracy of each algorithm was assessed by calculating the observed false J o u r n a l P r e -p r o o f detection rate (the fraction of identified peptides that were decoy peptides) using the identified algorithm and allele specific scoring threshold. The parameterization process was repeated 1000 times for each allele through bootstrap sampling of half of the peptides in each single MHC allele data set. The final FDR and score threshold for each algorithm at each allele was determined by taking the median value of both quantities reported during bootstrap sampling. Peptide confidence is assigned by calculating the peptide FDR . This quantity is defined as the product of the empirical FDRs of each individual algorithm that detected a given peptide. The peptide FDR is calculated , where N is the number of MHC-I binding and processing algorithms, ND represents an algorithm that did not detect a given peptide, and algorithm FDR represents the allele specific FDR of the Nth algorithm. The peptide FDR represents the joint probability that all MHC-I binding and processing algorithms that detected a particular peptide did so in error, and therefore returns a probability of false detection. Unless otherwise stated, EnsembleMHC selected peptides based on the criterion of a peptide FDR ≤ 5%. Peptide identification by each algorithm was based on restrictive or permissive binding affinities thresholds. These thresholds correspond to commonly used score cutoffs for the identification of strong binders (restrictive) or all binders (permissive) (0.5% (percentile rank) or 50nM (IC50 value) for strong J o u r n a l P r e -p r o o f binders, and 2% (percentile rank) or 500nM (IC50 value) for all binders). Due to the lack of recommend score thresholds for MHCflurry-presentation-1.6.0, the raw presentation score was converted to a percentile score using presentation scores produced by 100,000 randomly generated peptides. Polymorphism analysis of SARS-CoV-2 structural proteins were performed using 102,148 full length protein sequences obtained from the COVIDep database 41 . Solved structures for the E (5X29) and S (6VXX) proteins (http://www.rcsb.org/) 42 and predicted structures for the M and N proteins 43 were visualized using VMD 44 . The peptides identified by the EnsembleMHC workflow were used to assess the SARS-CoV-2 population binding capacity by weighing individual MHC allele SARS-CoV-2 binding capacities by regional expression (for a schematic representation see Figure S2 ). The selection of countries included in the EnsembleMHC population binding capacity assessment was based on several criteria regarding the underlying MHC-I allele data for that country (Figure S2) . The MHC-I allele frequency data used in our model was obtained from the Allele Frequency Net Database (AFND) 18 , and frequencies were aggregated by country. However, the currently available population-based MHC-I frequency data has specific limitations and variances, which we have addressed as follows: Normalization of MHC allele frequency data. The focus of this work was to uncover potential differences in SARS-CoV-2 MHC-I peptide presentation dynamics induced by the 52 selected alleles within a population. Accordingly, the MHC-I allele frequency data was carefully processed in order to maintain important differences in the expression of selected alleles, while minimizing the effect of confounding variables. The MHC-I allele frequency data for a given population was first filtered to the 52 selected alleles. These allele frequencies were then converted to the theoretical total number of copies of that allele within the population (allele count) following , where allele freq is the observed allele frequency in a population and n is the population sample size for which that allele frequency was measured. The allele count is then normalized with respect to the total allele count of selected 52 alleles within that population using the following relationship , where i is one of the 52 selected alleles. This normalization is required to overcome the potential bias towards hidden alleles (alleles that are either not well characterized or not supported by EnsembleMHC) as would be seen using alternative allele frequency accounting techniques (e.g. sampleweighted mean of selected allele frequencies or normalization with respect to all observed alleles within a population (Figure S6C) ). The SARS-CoV-2 binding capacity of these hidden alleles cannot be accurately determined using the EnsembleMHC workflow, and therefore important potential relationships would be obscured. EnsembleMHC population score. The predicted ability of a given population to present SARS-CoV-2 derived peptides was assessed by calculating the EnsembleMHC Population (EMP) score. After the MHC-I allele frequency data filtering steps, 23 countries were included in the analysis. The calculation of the EnsembleMHC population score is as follows (4) , where norm allele count is the observed normalized allele count for a given allele in a population, N norm allele count ≠ 0 is the number of the 52 select alleles detected in a given population (range 51-52 alleles), and peptide frac is the peptide fraction or the fraction of total predicted peptides expected to be presented by that allele within the total set of predicted peptides with a peptide FDR ≤ 5%. Death rate-presentation correlation. The correlation between the EMP score and the observed deaths per million within the cohort of selected countries was calculated as a function of time. SARS-Cov-2 data covering the time dependent global evolution of the SARS-CoV-2 pandemic was obtained from Johns Hopkins University Center for Systems Science and Engineering 45 covering the time frame of January 22nd to April 9 th 2020. The temporal variations in occurrence of community spread observed in different countries were accounted for by rescaling the time series data relative to when a certain minimum death threshold was met in a country. This analysis was performed for minimum death thresholds of 1-100 total deaths by day 0, and correlations were calculated at each day sequentially following day 0 until there were fewer than 8 countries remaining at that time point. The upper-limit of 100-deaths was chosen due to a steep decline in average statistical power observed with day 1 death thresholds greater than 100 deaths ( Figure S6E) . The time death correlation was computed using Spearman's rank correlation coefficient (two-sided). This method was chosen due to the small sample size and non-normality of the underlying data ( Figure S6D ). The reported correlations of EMP score and deaths per million using other correlation methods can be seen in supplemental Figure S6A . The low statistical power for some of the obtained correlations were addressed by calculating the Positive Predictive Value (PPV) of all correlations using the following equation 46 (5) ,where 1 is the statistical power of a given correlation, R is the pre-study odds, and is the significance level. A PPV value of 95% is analogous to a p value of ≤ 0.05. Due to an unknown pre-study odd (probability that probed effect is truly non-null), R was set to 1 in the reported correlations. The significance of partitioning high risk and low risk countries based on median EMP score was determined using Mann-Whitney U-test. Significance values were corrected for multiple tests using the Benjamini-Hochberg procedure 38 . Sub-sampling of peptides from the Full SARS-CoV-2 proteome. 108 unique peptides, derived from the Full SARS-CoV-2 proteome and passing the 5% peptide FDR filter, were randomly sampled. Then, the time series EMP score -death per million correlation analysis used to generate Figure 3 was applied to each sampled peptide set. The sub-sampling procedure was repeated for 1,000 iterations ( Figure S7A) . To quantitatively describe the similarity of the distributions, the Kullback-Leibler divergence (KLD), a measure of divergence between two probability distributions, was calculated for the correlation distribution of each sub-sample iteration relative to either the correlation distribution of the Full SARS-CoV-2 proteome or SARS-CoV-2 structural proteins ( Figure S7B) . Twelve potential SARS-CoV-2 risk factors (table S2) were selected for analysis. Country-specific data for each risk factor was obtained from the Global Health Observatory data repository provided by the World Health Organization (https://apps.who.int/gho/data/node.main). Countries were selected for analysis based on the criteria of having reported data in the WHO data sets and inclusion in the set of 23 countries for which EnsembleMHC population scores were assigned (Table S2A) . Data regarding the total number of noncommunicable disease-related deaths (Cardiovascular disease, Chronic obstructive pulmonary disease, and Diabetes mellitus) were converted to deaths per million. Correlation analysis of each additional factor was carried out in a similar manner to that of the EnsembleMHC population score. In short, Spearman's correlation coefficient between each individual factor and observed deaths per million was estimated as a function of time from when a specified minimum death threshold was met (Figure 4) . The significance level was set to p ≤ 0.05 and significant PPV was set to PPV ≥ 0.95 (eq 8). Linear models of SARS-CoV-2 mortality. For the single and combination models, individual linear models were constructed for each considered death threshold as a function of time (similar to the univariate correlation analysis). Each model consisted of 1 (a single socioeconomic or health-related risk factor) or 2 (a combination of 1 risk factor and structural protein EMP score) dependent variables and deaths per million as the independent variable. The adjusted R 2 value and statistical significance of the model (F-test) were then extracted from each individual model and aggregated by dependent variable (Figure S8A) . The best performing models were determined by assessing all possible combinations of factors including structural protein EMP score. This resulted in the consideration of 4,083 different linear models. The top performing models were then selected by ranking each model by median adjusted R 2 . Individual algorithms were assessed for ability to prioritize viral peptides with known immunogenicity by calculating the precision (experimentally validated peptides / putative non-immunogenic peptides) when Spearman's rank correlation. Mann-Whitney U test was used to test for significant testing of death rate stratification between countries with high and low EnsembleMHC score. The threshold for statistical significance was set to p values of ≤ 0.05 or positive predictive value of PPV ≥ 0.95. Where indicated, p value correction for multiple testing was accomplished using the Benjamini-Hochberg procedure. Excel Table legends Table S1 MHC-I peptides identified by Ensemble MHC, Related to figure 2. All predicted peptides with a peptide FDR ≤ 0.05. Table S3 EMP score correlation data, Related to figure 3. All data pertaining to the correlations between EMP score and deaths per million. This includes rho estimates, 95% CI, non-normalized day values, and sample size for each correlation. Coronavirus disease 2019 (COVID-19): a perspective from China Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia The origin, transmission and clinical therapies on coronavirus disease 2019 (COVID-19) outbreak-an update on the status T cell-mediated immune response to respiratory coronaviruses Understanding the T cell immune response in SARS coronavirus infection Memory T cell responses targeting the SARS coronavirus persist up to 11 years postinfection SARS-CoV-2-specific T cell immunity in cases of COVID-19 and SARS, and uninfected controls Targets of T cell responses to SARS-CoV-2 coronavirus in humans with COVID-19 disease and unexposed individuals The MHC locus and genetic susceptibility to autoimmune and infectious diseases Human-leukocyte antigen class I Cw 1502 and class II DR 0301 genotypes are associated with resistance to severe acute respiratory syndrome (SARS) infection Association of human-leukocyte-antigen class I (B* 0703) and class II (DRB1* 0301) genotypes with susceptibility and resistance to the development of severe acute respiratory syndrome Immunogenetics in SARS: a case-control study. Hong Kong medical journal= Xianggang yi xue za zhi 16 HLA studies in the context of coronavirus outbreaks Human leukocyte antigen susceptibility map for SARS-CoV-2 Systematically benchmarking peptide-MHC binding predictors: From synthetic to naturally processed epitopes A large peptidome dataset improves HLA class I epitope prediction across most of the human population Allele frequency net 2015 update: new features for HLA epitopes, KIR and disease and HLA adverse drug reaction associations MHCflurry 2.0: Improved Pan-Allele Prediction of MHC Class I-Presented Peptides by Incorporating Antigen Processing NetMHCpan-4.0: improved peptide-MHC class I interaction predictions integrating eluted ligand and peptide binding affinity data Gapped sequence alignment using artificial neural networks: application to the MHC class I system Deciphering HLA-I motifs across HLA peptidomes improves neo-antigen predictions and identifies allostery regulating HLA specificity The PickPocket method for predicting binding specificities for receptors based on receptor pocket similarities: application to MHC-peptide binding Pan-specific prediction of peptide-MHC class I complex stability, a correlate of T cell immunogenicity HLA class I alleles are associated with peptide-binding repertoires of different size, affinity, and immunogenicity. The Journal of Immunology 191 False discovery rate procedures Immunoinformatics: Predicting Peptide-MHC Binding The length distribution of class I-restricted T cell epitopes is determined by both peptide supply and MHC allele-specific binding preference The MHC motif viewer: a visualization tool for MHC binding motifs OpenSAFELY: factors associated with COVID-19 death in 17 million patients Risk factors for SARS-CoV-2 among patients in the Oxford Royal College of General Practitioners Research and Surveillance Centre primary care network: a cross-sectional study. The Lancet Infectious Diseases Broad and Gag-biased HIV-1 epitope repertoires are associated with lower viral loads Prediction of SARS-CoV-2 epitopes across 9360 HLA class I alleles. bioRxiv Evolutionary divergence of HLA class I genotype impacts efficacy of cancer immunotherapy HLA heterozygote advantage against HIV-1 is driven by quantitative and qualitative differences in HLA allele-specific peptide presentation Most viral peptides displayed by class I MHC on infected cells are immunogenic Comparative genetic analysis of the novel coronavirus (2019-nCoV/SARS-CoV-2) receptor ACE2 in different populations Controlling the false discovery rate: a practical and powerful approach to multiple testing COVID-19 Vaccine Candidates: Prediction and Validation of 174 SARS-CoV-2 Epitopes. bioRxiv A new coronavirus associated with human respiratory disease in China COVIDep: A web-based platform for real-time reporting of vaccine target recommendations for SARS-CoV-2 The protein data bank Protein structure and sequence re-analysis of 2019-nCoV genome refutes snakes as its intermediate host or the unique similarity between its spike protein insertions and HIV-1 VMD: visual molecular dynamics Power failure: why small sample size undermines the reliability of neuroscience The immune epitope database (IEDB): 2018 update. Nucleic acids research 47 NetCTLpan: pan-specific MHC class I pathway epitope predictions NetMHCcons: a consensus method for the major histocompatibility complex class I predictions J o u r n a l P r e -p r o o f Highlights • EnsembleMHC, a MHC-I presentation prediction algorithm, can predict SARS-CoV-2 epitopes• Countries show variation in the predicted MHC-I SARS-CoV-2 binding capacity • A population score combines the MHC-I binding capacity with MHC-I allele frequencies• The population score inversely correlates with observed deaths from SARS-CoV-2 Wilson et al. define a predicted MHC allele-specific hierarchy for the presentation of peptides derived from SARS-CoV-2 viral proteins. They find that a composite population-level metric combining predicted MHC allele SARS-CoV-2 binding capacity and endemic allele frequencies is inversely correlated with deaths per million.