key: cord-0278769-c6xbvwis
authors: Sepulveda, N.; Malato, J. T.; Sotzny, F.; Grabowska, A. D.; Fonseca, A.; Cordeiro, C.; Graca, L.; Biecek, P.; Behrends, U.; Mautner, J.; Westermeier, F.; Mattos Lacerda, E.; Scheibenbogen, C.
title: Revisiting IgG antibody reactivity to Epstein-Barr virus in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome and its potential application to disease diagnosis
date: 2022-04-25
journal: nan
DOI: 10.1101/2022.04.20.22273990
sha: d0c54c01891358884c1381042b9b9ad996373e27
doc_id: 278769
cord_uid: c6xbvwis

Infections by the Epstein-Barr virus (EBV) often occur at the disease onset of patients suffering from Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS). However, serological analyses of these infections remain inconclusive when comparing patients with healthy controls (HCs). In particular, it is unclear whether certain EBV-derived antigens eliciting antibody responses have biomarker potential for disease diagnosis. To address this question, we re-analysed previously published microarray data on the IgG antibody responses against 3,054 EBV-related antigens in 92 patients with ME/CFS and 50 HCs. This re-analysis consisted of constructing different regression models for binary outcomes with the ability to classify patients and HCs. In these models, we tested for a possible interaction of different antibodies with age and gender. When analyzing the whole data set, there were no antibody responses that could be used to distinguish patients from HCs. A similar finding was obtained when comparing patients with a noninfectious or unknown disease trigger to HCs. However, when data analysis was restricted to the comparison between HCs and patients with a putative infection at disease onset, we could identify stronger antibody responses against two candidate antigens (EBNA4_0529 and EBNA6_0070).
Using antibody responses to these two antigens together with age and gender, the final classification model had an estimated sensitivity and specificity of 0.833 and 0.720, respectively. This reliable case-control discrimination suggested the use of the antibody levels related to these candidate viral epitopes as biomarkers for disease diagnosis in this subgroup of patients. A bioinformatic analysis of these epitopes revealed a potential molecular mimicry with several human proteins. To confirm these promising findings, a follow-up study will be conducted in a separate cohort of patients.

Introduction

the disease and routine application of serological assays in the clinical practice. However, EBV antigens included in commercial kits are mostly markers of exposure to the infection and are unable to distinguish between patients with ME/CFS and healthy controls (28). This distinction can only be made when comparing a subset of clinically diagnosed ME/CFS patients with an EBV infection trigger to healthy controls (10). A serological evaluation of antibodies against less-studied EBV antigens did not identify any that could be used as a specific disease biomarker (29). However, this antibody evaluation was done using a limited number of EBV-derived antigens, and no subgroup analysis was performed. The lack of patient stratification in ME/CFS studies reduces the chance of reproducing the same findings in follow-up studies (27, 30). Therefore, it is still possible to identify alternative antigens whose antibody responses could be used as disease biomarkers for a subgroup of patients. Recently, we analyzed antibody responses against about 3,000 overlapping antigens derived from 14 EBV proteins (23). The aim of that original study was to extract an antibody signature against EBV in ME/CFS patients when compared to healthy controls.
In the present study, we extended the analysis of the obtained data with the specific objective of optimizing biomarker discovery. In particular, we compared patients with or without an infectious trigger at disease onset to healthy controls in order to discover EBV-derived antigens whose antibody responses could be used for ME/CFS diagnosis.

Ninety-two ME/CFS patients were recruited between 2011 and 2015 at the Charité outpatient clinic for immunodeficiencies at the Institute of Medical Immunology in the Charité Universitätsmedizin Berlin, Germany. An additional fifty individuals were recruited from the employees of the same clinic; they self-reported to be healthy and not to suffer from fatigue. However, neither clinical nor laboratory assessment was performed to confirm the healthy status of those individuals. ME/CFS patients and healthy controls were matched for gender and age (Table 1), with 50% of women and an overall average age of ~43 years. Fifty-four out of 92 patients (58.7%) reported an acute infection at their disease onset, whilst the remaining 38 patients (41.3%) reported a disease trigger other than infection, did not know their disease onset, or had missing information about the disease trigger. These two subgroups were also matched for age and gender (Table 1).

Data under analysis refer to signal intensities derived from IgG antibody responses to 3,054 EBV-associated peptides measured by a seroarray described in detail in the original study (23). These peptides covered the full length of the following proteins (Supplementary Table 1): BALF-2, BALF-5, BFRF-3, BLLF-1, BLLF-3, BLRF-2, BMRF-1, BZLF-1, EBNA-1, EBNA-3, EBNA-4, EBNA-6, LMP-1, and LMP-2. Each peptide was 15 amino acids (15-mer) in length and overlapped with the next one by 11 amino acids.
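The tiling scheme just described (15-mers sharing 11 residues, so consecutive peptides start 4 residues apart) can be illustrated with a short Python sketch. The function name `tile_peptides` and the toy sequence are hypothetical and are not taken from the original study or from the array design; how the real array handled residues left over at the end of a protein is not stated in the text and is not modelled here.

```python
def tile_peptides(protein: str, length: int = 15, overlap: int = 11) -> list[str]:
    """Split a protein sequence into overlapping peptides of fixed length."""
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    step = length - overlap  # 15 - 11 = 4 residues between consecutive starts
    last_start = len(protein) - length
    return [protein[i:i + length] for i in range(0, last_start + 1, step)]


# Hypothetical 27-residue sequence, used only to show the tiling pattern.
toy_protein = "ACDEFGHIKLMNPQRSTVWYACDEFGH"
peptides = tile_peptides(toy_protein)
for start, peptide in zip(range(0, len(toy_protein), 4), peptides):
    print(start, peptide)
```

Under these assumptions, the last 11 residues of each peptide are identical to the first 11 residues of the next one, which is why antibody signals for neighbouring peptides of the same protein are expected to be correlated.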
The amino-acid sequences of these peptides were representative of the following EBV strains: AG876 (West Africa, EBV type 2), B95-8 (USA, EBV type 1), GD1 (China, EBV type 1), Cao (China, EBV type 1), Raji (Nigeria, EBV type 1), and P3HR-1 (Nigeria, EBV type 2). These data are publicly available as a supplementary table of the original study (23).

We used the Chi-square test to compare the gender distribution between ME/CFS patients and healthy controls. The non-parametric Mann-Whitney test was used to compare the medians of the respective age distributions. The groups were considered matched for age and gender if the p-values of these tests were greater than the significance level of 0.05.

We first performed a multivariate analysis by (i) applying classical principal component analysis (PCA) and (ii) computing different correlation matrices using Spearman's correlation coefficient, which is invariant to monotonic changes in the scale of the data, is robust against the presence of outliers, and does not depend on the normality assumption. We then performed linear discriminant analyses (LDA) to determine the best linear combination of all antibody responses that could distinguish ME/CFS patients and their subgroups from healthy individuals. A similar analysis was done to compare the two subgroups of ME/CFS patients. The outcome of each LDA was the estimated classification probability for every individual. These classification probabilities were then analyzed by the respective receiver operating characteristic (ROC) curve, where 1-specificity and sensitivity are plotted against each other as a function of the cutoff of the underlying classification probability. After computing each ROC curve, we calculated the respective area under the ROC curve (AUC) and its 95% confidence interval to determine the accuracy of the classification irrespective of the cutoff used.
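As an illustration of how a ROC curve and its AUC follow from classification probabilities, here is a minimal Python sketch. It is not the authors' code (the study used the pROC package in R); the function names, scores and labels below are hypothetical, and the confidence interval of the AUC is deliberately left out.

```python
def roc_points(scores, labels):
    """Return (1 - specificity, sensitivity) pairs over all observed cutoffs."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for cutoff in sorted(set(scores), reverse=True):
        # An individual is classified as a patient when the score reaches the cutoff.
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc_trapezoid(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical classification probabilities (1 = patient, 0 = healthy control).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]
curve = roc_points(scores, labels)
print(auc_trapezoid(curve))
```

The resulting area equals the proportion of patient-control pairs in which the patient receives the higher score, which is why the AUC does not depend on any single cutoff.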
In general, an AUC=0.50 is indicative of a completely random classification of the individuals, while AUC=1.00 implies that the constructed classifier perfectly predicts the true class membership of each individual.

We performed further antibody-wide association analyses related to the following comparisons (or classification exercises): (i) healthy controls versus all ME/CFS patients; (ii) healthy controls versus ME/CFS patients with an infectious trigger; (iii) healthy controls versus ME/CFS patients with a noninfectious or unknown trigger; and (iv) ME/CFS patients with an infectious trigger versus the remaining ME/CFS patients. In each association analysis, we first estimated three regression models: the logistic model, the probit model, and the complementary log-log model. In these models, the disease status was the outcome variable, and age and gender were the respective covariates. To determine the best link function for the outcome variable, we selected the model with the lowest Akaike's information criterion (AIC). For the best link function ("the null model"), we estimated the respective ROC curve and its AUC as described above. We then fitted five different logistic models, including the main effects and all possible interaction terms among age, gender, and the antibody response under analysis: (i) a model with main effects only and no interaction terms; (ii) a model with an interaction term between age and the antibody response; (iii) a model with an interaction term between gender and the antibody response; (iv) a model with two interaction terms, one between age and the antibody response and one between gender and the antibody response; (v) a model with all possible two-way and three-way interactions among age, gender, and the antibody response. We compared each of these models with the null one using Wilks' likelihood ratio test, where low p-values provide evidence in favor of the models that include effects of an antibody response.
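The link functions, the AIC, and the likelihood ratio statistic mentioned in this paragraph can be written compactly as follows. This is a sketch in standard textbook notation; the symbols $p_i$, $\eta_i$, $\beta_j$, $k$ and $\hat{\ell}$ are introduced here for illustration and do not appear in the original study.

```latex
% p_i: probability that individual i is an ME/CFS patient (notation assumed here)
\begin{gather*}
\eta_i = \beta_0 + \beta_1\,\mathrm{age}_i + \beta_2\,\mathrm{gender}_i
  \quad\text{(linear predictor of the null model)}\\
\text{logit: } \log\frac{p_i}{1-p_i} = \eta_i, \qquad
\text{probit: } \Phi^{-1}(p_i) = \eta_i, \qquad
\text{cloglog: } \log\bigl(-\log(1-p_i)\bigr) = \eta_i\\
\mathrm{AIC} = 2k - 2\hat{\ell}\\
\Lambda = 2\bigl(\hat{\ell}_1 - \hat{\ell}_0\bigr) \sim \chi^2_{k_1 - k_0}
  \quad\text{(asymptotically, if the null model holds)}
\end{gather*}
```

Here $k$ is the number of estimated parameters and $\hat{\ell}$ the maximized log-likelihood; the subscripts 0 and 1 denote the null model and a model extended with antibody terms, so the degrees of freedom equal the number of antibody-related parameters added.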
We reported the minimum p-value obtained from these model comparisons. Finally, we adjusted the minimum p-values of each analysis using the Benjamini-Yekutieli procedure, which ensures a global false discovery rate (FDR) of 5% under the assumption of dependent tests (31). In this analysis, adjusted p-values <0.05 indicated statistically significant results. To filter out redundant antibody responses, we pooled all the significant antibody responses in a single logistic model. The main effects and interaction terms of these antibody responses were defined according to the most significant model obtained in the previous stage of analysis. We then performed a backward stepwise model selection. The resulting model was finally evaluated in terms of predictive performance using ROC analysis as described above.

medRxiv preprint doi: https://doi.org/10.1101/2022.04.20.22273990; this version posted April 25, 2022. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

The above analysis was first done for the whole data set irrespective of the ME/CFS subgroups. We repeated the same analysis to compare each subgroup of ME/CFS patients (with an infectious trigger, and with a noninfectious or unknown disease trigger) with the healthy controls. Finally, we repeated the analysis to compare the two subgroups of ME/CFS patients. The statistical analysis was performed in the R software version 4.0.3 with core functions and the following packages: MASS v7.3-56 to perform stepwise model selection (32), pROC v1.18.0 to estimate the ROC curve and the respective AUC (33), and OptimalCutpoints v1.1-5 to estimate the optimal cutoff and the associated sensitivity/specificity (34). The full reproducible code is freely available from NS or JM upon request.
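For readers unfamiliar with the Benjamini-Yekutieli adjustment used in this analysis, the following Python sketch shows how adjusted p-values can be obtained. It is an illustration only and not the authors' R code; the function name and the four raw p-values are hypothetical.

```python
def benjamini_yekutieli(pvalues):
    """Return Benjamini-Yekutieli adjusted p-values in the input order."""
    m = len(pvalues)
    harmonic = sum(1.0 / k for k in range(1, m + 1))  # penalty for dependence
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        candidate = pvalues[i] * m * harmonic / rank
        running_min = min(running_min, candidate)
        adjusted[i] = running_min
    return adjusted


# Hypothetical minimum p-values from four antibody-wise model comparisons.
raw = [0.001, 0.02, 0.03, 0.5]
adjusted = benjamini_yekutieli(raw)
significant = [p < 0.05 for p in adjusted]
print(adjusted, significant)
```

Compared with the Benjamini-Hochberg procedure, the extra harmonic-sum factor makes the adjustment more conservative, which is the price paid for validity under arbitrary dependence between tests. With 3,054 correlated antibody responses, this factor is substantial.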
All individuals gave written informed consent to participate in the study. The study was approved by the Ethics Committee of Charité Universitätsmedizin Berlin in accordance with the 1964 Declaration of Helsinki and its later amendments (23).

We first performed a PCA to discriminate patients with ME/CFS and their subgroups from healthy controls (Figures 1A, 1B, and 1C). A similar analysis was done for discriminating patients with an infectious trigger from patients with a noninfectious or unknown trigger (Figure 1D). The proportion of variance explained by the first principal component varied from 35.4% (Figure 1D) to 44.6% (Figure 1C), referring to the comparison between the two subgroups of ME/CFS patients and the comparison between healthy controls and patients with a noninfectious or unknown disease trigger, respectively. These high estimates associated with the first principal component suggested that the antibody levels were correlated with each other. This interpretation was confirmed by determining the distributions of Spearman's correlation coefficient between all possible pairs of antibodies using data from each study group (Supplementary Figure 1). In particular, the antibody levels were positively correlated with each other, with median correlation estimates of 0.56, 0.56, 0.40, and 0.48 for healthy controls, all ME/CFS patients, ME/CFS patients with an infectious disease trigger, and the remaining ME/CFS patients, respectively. Interestingly, the median correlation estimate was lower in ME/CFS patients with an infectious trigger than in the other study groups.
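The median pairwise Spearman correlation reported here can be computed as in the following Python sketch, which ranks each antibody's signals and then correlates the ranks. The function names and the three toy antibody profiles are hypothetical and only serve to show the computation; they are not data from the study.

```python
from itertools import combinations
from statistics import median


def ranks(values):
    """Average ranks (1-based), with ties sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def spearman(x, y):
    """Spearman's coefficient: Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Hypothetical signal intensities: three antibodies measured in five individuals.
signals = {
    "ab1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "ab2": [10.0, 40.0, 90.0, 160.0, 250.0],  # monotonic transform of ab1
    "ab3": [5.0, 3.0, 4.0, 1.0, 2.0],
}
pairwise = [spearman(signals[a], signals[b]) for a, b in combinations(signals, 2)]
print(median(pairwise))
```

Because only ranks enter the calculation, the toy profiles `ab1` and `ab2` are perfectly correlated even though their raw scales differ, which reflects the invariance to monotonic changes of scale noted in the statistical methods.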
The lower median correlation among ME/CFS patients with an infectious trigger suggested that the production of antibodies against the EBV-derived antigens could be reduced in these patients when compared to healthy controls or patients with a noninfectious or unknown disease trigger. The visualization of the first two components did not reveal a clear discrimination between healthy controls and ME/CFS patients (or their subgroups). To improve this analysis, we then performed different LDAs in search of a linear combination of the antibody measurements that could be used for disease diagnosis. The performance of the constructed classifiers, measured by the AUC, ranged from 0.86 (Figure 1C) to 0.91 (Figure 1D), referring to the classification of healthy controls and ME/CFS patients with a noninfectious or unknown disease trigger and to the classification of the two subgroups of ME/CFS patients, respectively. Therefore, the results of this analysis indicate that the obtained data could discriminate different study groups.

The next step of the analysis was to identify specific antibody responses that could be used to discriminate the different study groups. With this purpose, we first determined the best "null" model among the logistic, probit, and complementary log-log models. All of them included age, gender, and their interaction as covariates for each comparison between any two study groups (Supplementary Table 2). The best "null" models were the following: (i) complementary log-log: comparison between healthy controls and all the ME/CFS patients (AUC=0.574; 95% CI=(0.475;0.672)); (ii) probit: comparison between healthy controls and ME/CFS patients with an infectious trigger (AUC=0.606; 95% CI=(0.496;0.715)); (iii) complementary log-log: comparison between healthy controls and ME/CFS patients with a noninfectious or unknown trigger (AUC=0.556; 95% CI=(0.429;0.683)); and (iv) logit: comparison between the two subgroups of ME/CFS patients (AUC=0.596; 95% CI=(0.471;0.720)).
The 95% confidence intervals for the AUC of these null models included 0.50 and, therefore, the respective predicted classification was consistent with a random guess. Such a result was in agreement with the age and gender matching between different study groups and healthy controls (Table 1).

We performed further antibody-wide association analyses controlling for a global FDR of 5%. The comparison between healthy controls and all ME/CFS patients did not identify any significant antibody associations with the disease (Figure 2A). The top 5 antibodies, although not statistically significant, were EBNA6_0066, BLRF2_0005, EBNA4_0392, EBNA4_0497, and EBNA4_0529 (adjusted p-values = 0.181, 0.326, 0.326, 0.326, and 0.326, respectively). When the comparison was limited to healthy controls and ME/CFS patients with an infectious trigger, we identified three significant antibodies related to the following antigens (Figure 2B): EBNA6_0066, EBNA6_0070, and EBNA4_0529 (adjusted p-values = 0.005, 0.005, and 0.038, respectively). The first two antigens were shared between the AG876, B95-8, and GD1 strains, while the third one was derived from the B95-8 strain. We compared ME/CFS patients with a noninfectious or unknown disease trigger to healthy controls and found no significant differences in antibody responses (Figure 2C). The same finding was obtained when we compared the two subgroups of ME/CFS patients (Figure 2D). The top 5 antibodies related to these analyses can be found in Supplementary Table 3. We then analyzed in detail the impact of the antibody levels against the three candidate antigens on the classification of ME/CFS patients with an infectious trigger.
Antibody levels were increased in this subgroup of ME/CFS patients when compared to healthy controls (Figure 3A). The same evidence could not be found when comparing all ME/CFS patients to healthy controls (Figure 3A). Data related to EBNA4_0529, EBNA6_0066, and EBNA6_0070 were significantly correlated with each other (Spearman's correlation coefficients higher than 0.58; Figure 3B). The correlation between the levels of antibodies against EBNA6_0066 and EBNA6_0070 could be explained by the fact that these two peptides are 15-mers overlapping by 11 amino acids (23). In contrast, it was unclear why the levels of antibodies against EBNA4_0529 and EBNA6_0066 were highly correlated (Spearman's correlation coefficient = 0.79), considering that these antigens did not share a high sequence homology (Figure 3C). Given the high correlation between antibody levels related to these antigens, some statistical redundancy was expected when using their data for patient classification. This redundancy was confirmed when the three candidate antibodies were included as covariates in the same model. A stepwise variable selection procedure led to the exclusion of the antibody levels related to EBNA6_0066 from the final classification model. The final model included the main effects of antibodies to EBNA4_0529 and EBNA6_0070 and the two-way interactions of the latter with age and gender (Table 2). On the one hand, the log10-levels of antibodies related to EBNA4_0529 increased the probability of being a patient (coefficient estimate = 2.25, standard error = 1.09).
In particular, the odds of being a patient were estimated to increase ~9.5 (e^2.25) times per ten-fold increase in the levels of these antibodies (that is, per unit increase on the log10 scale). On the other hand, the effects of antibody levels related to EBNA6_0070 on the probability of an individual being an ME/CFS patient were less straightforward to ascertain (Figure 4A). In particular, women with high EBNA6_0070 antibody levels showed an increasing estimated probability of being a patient with increasing age. In contrast, the probability profile of being a patient was different in men. In that case, younger men with low EBNA6_0070 antibody levels or older men with high EBNA6_0070 antibody levels had a higher probability of being a patient. The AUC of the classification predicted by the final model was estimated at 0.835 with a 95% CI=(0.759;0.911) (Figure 4B). This estimate suggested that the combination of these two antibodies together with age and gender could be used for the diagnosis of patients with an infectious trigger. The optimal sensitivity and specificity were estimated at 0.833 and 0.720, respectively. Therefore, ME/CFS patients were better discriminated by this model than healthy controls. When the same classification model was applied to the whole cohort of ME/CFS patients, the AUC decreased to 0.731 with a 95% CI=(0.648;0.814). This could be explained by the cohort of patients with a noninfectious or unknown trigger, in which the performance of the classification model was close to a random guess (AUC=0.583; 95% CI=(0.461;0.705)).

This study, based on previously published data, aimed to discover EBV-derived antigens that could elicit distinct antibody responses in ME/CFS patients when compared to healthy controls. The key finding was the identification of two candidate antigens inducing increased antibody responses in ME/CFS patients with an infectious trigger.
The high sensitivity and specificity of our classification model including these antibodies suggest their potential for diagnosis of this subgroup of affected individuals. For ME/CFS patients without an infectious trigger, we could not find any antigens causing antibody responses that could be used for diagnostic purposes. This finding is in agreement with an extensive serological investigation of different herpesviruses in ME/CFS patients (29). This negative finding supports the hypothesis that EBV plays a role in the group of ME/CFS patients with an infectious trigger. In a subset of patients, infectious mononucleosis caused by primary EBV infection can be documented as a trigger (10). In many others, no infection with a specific pathogen could be associated with disease onset (5). A tempting hypothesis from our finding is that EBV reactivation, which can occur during other infections, may play a hitherto underestimated role in triggering ME/CFS. In line with this concept, a recent study showed that EBV reactivation during COVID-19 is a risk factor for Post COVID Syndrome, which also includes ME/CFS (35). Alternatively, the responses to the EBNA6 peptides may be due to cross-reactivity with other pathogens, as outlined below. Other findings of this study pointed to three key challenges associated with the discovery of a biomarker. Firstly, it is difficult to identify a disease-specific biomarker for all the ME/CFS patients. Thus, given the heterogeneous nature of ME/CFS, it is pivotal to stratify patients adequately (30), based on age, gender, and disease trigger, for biomarker discovery (27).
In this regard, the identification of antibody patterns specific to ME/CFS patients with an infectious trigger was in agreement with other studies where significant results could be found for subgroups of ME/CFS patients with infectious triggers (10, 36, 37). However, given the vast number of infectious agents associated with ME/CFS (5, 38), it is worth noting that this subgroup of patients could be further subdivided according to the nature of the causative infection. In this regard, the data about the infectious agents that could have initiated ME/CFS are either inconclusive or simply based on self-reported history in most patients, as demonstrated by the data from the United Kingdom ME/CFS Biobank, where only a minority of patients had their infection confirmed by a laboratory test (10). Secondly, the final classification model included non-trivial statistical interactions of antibodies against EBNA6_0070 with both age and gender. This finding implies that significant interactions between candidate biomarkers and confounding factors might be overlooked by analysts or, even when tested, are likely to be discarded because sample sizes are too small to detect them. The presence of these interactions might be yet another factor that contributes to the lack of reproducibility between biomarker studies on ME/CFS. A proposed strategy to overcome this limitation is to conduct more advanced statistical analyses, including the application of machine learning techniques that intrinsically consider the complexity of a large set of clinical and biological data, as demonstrated in drug discovery (39).
Thirdly, the interaction between the candidate antibodies against EBNA6_0070 and gender implied a remarkably distinct antibody signature between male and female patients. Again, this finding is in line with gender differences in immunity to viral infection (40). In particular, men typically have lower antibody responses when vaccinated and are more susceptible to infections than women (41). In this regard, our study suggested that the higher probability of younger men being ME/CFS patients is associated with lower levels of antibodies against the antigen EBNA6_0070. In contrast, female and male patients seemed to be at higher risk with higher antibody levels at increasing age, suggesting that at least a subset develops these antibody responses later in life. An implication of having a different antibody profile between men and women is that the analysis of each gender should be performed separately. At the same time, it is important to note that epidemiological data on ME/CFS suggest a disease ratio of approximately three women to one man (42-44). Therefore, if gender is an important stratification factor for biomarker discovery, studies should be designed towards a more balanced gender ratio. Similar sample sizes between male and female cohorts ensure comparable statistical power when analysing data from each sex separately.

Both EBNA4_0529 and EBNA6_0070 antigens are derived from proteins whose genetic expression typically occurs during the EBV type III latency. Therefore, the acquisition of the respective antibodies might have occurred during initial B-cell transformation and immortalization. These antibodies could also be acquired slowly over time, given that the type III latency pattern can be detected sporadically in lymphoid follicles where EBV-infected B cells can proliferate and mimic a germinal center reaction program (45).
We can hypothesize from our data that both male and female patients developing higher antibody responses against this antigen later in life are at increased risk of developing ME/CFS, suggesting that reactivation of EBV plays a role. In male patients, a subgroup with lower EBNA6 antibodies early in life is at risk of developing ME/CFS, too. Using the recent analytical framework of ME/CFS natural progression (46), antibodies against these antigens are more likely to be biomarkers of patients who have suffered from ME/CFS for more than two years rather than of those in the prodromal period or at early stages, in line with our findings. Based on that assumption, these antibodies seemed more appropriate for diagnosing putative patients with delayed disease diagnosis rather than early suspected cases. However, it is known that the delay of ME/CFS diagnosis is a recurrent problem in the clinic (8, 47). As such, we anticipate a higher utility of these antibodies when redeployed to real-world screening. Another practical implication of using these antibodies as biomarkers is the possibility of developing routine ELISA kits that can be standardized across different laboratories and easily scaled for large population screenings. Notwithstanding these promising practical expectations, it is important to emphasize that past studies also suggested potential disease biomarkers (27) and, therefore, it is imperative to replicate the findings of this study with different cohorts of patients. An interesting observation is that both EBNA6_0066 and EBNA6_0070 contain an arginine-rich sequence. Such a sequence has homologies with putative epitopes from several human proteins (48).
Such homologies suggest a potential molecular mimicry between the viral and human antigens. Molecular mimicry can trigger deleterious autoimmune responses as hypothesized for ME/CFS pathogenesis (38, 49). Molecular mimicry between human and microbial antigens has also been hypothesized for several autoimmune diseases (50), such as multiple sclerosis and rheumatoid arthritis, and for Post COVID syndrome, whose patients share similar symptoms with ME/CFS patients (19, 51-53). Interestingly, T cell clones recognizing such arginine-repeat sequences were isolated from a patient with multiple sclerosis, supporting our concept of epitope mimicry (48). Finally, arginine-repeat sequences are found in various other pathogens, including enteroviruses and human papillomavirus, which are also triggers of ME/CFS (5). Furthermore, we can hypothesize that peptides highly enriched in arginine residues might be particularly susceptible to citrullination, in which arginine residues are post-translationally converted to citrulline. These post-translational modifications occur during cell death under normal physiological conditions. However, under chronic inflammation, the accumulation of citrullinated (auto)antigens in inflamed sites might lead to deleterious autoimmune responses, thus promoting the onset of different autoimmune diseases (54). A potential cross-reactivity between microbial and citrullinated human antigens could also be a mechanism by which an autoimmune disease can be triggered. In rheumatoid arthritis, antibodies against EBNA-1 peptides were shown to cross-react with denatured collagen and keratin (55). However, in the present study, we could not find any antibodies against EBNA-1-derived peptides to be associated with ME/CFS. Interestingly, the serum levels of citrulline were reported to be elevated in ME/CFS patients when compared to healthy controls (56). However, another study could not confirm this finding, but instead provided evidence for
increased plasma levels of arginine residues (57). Another source of antigen modification is the process of generating new and more immunogenic epitopes from ubiquitous molecules upon oxidative and nitrosative stress. In ME/CFS, IgM antibodies against several of these neoepitopes, including NO-Arginine, were increased in patients (58). In all these possible scenarios, it is imperative to investigate the stability of this candidate biomarker antigen to post-translational modifications that could occur and eventually increase during the disease course.

In conclusion, this study identified two candidate antigens whose antibodies could be used to identify ME/CFS patients with an infectious trigger. To strengthen our findings, two other cohorts of patients are currently being studied, including the well-characterized ME/CFS patients with different disease triggers and healthy controls from the United Kingdom ME/CFS Biobank (10).

and -log10(adjusted p-values) > 1.30 were considered statistically significant.

Statistical analysis of the antibody levels related to EBNA4_0529, EBNA6_0066, and EBNA6_0070. A. Boxplots of the data per study group. B. Scatterplots and the respective Spearman's correlation coefficients (R) in the whole dataset. C. Amino acid sequences of EBNA4_0529, EBNA6_0066, and EBNA6_0070.
Analysis of the final classification model for predicting ME/CFS patients with infectious trigger when compared to healthy controls. A. Contour plots of the probability of being a ME/CFS patient as a function of age and EBNA6_0070 antibody levels, for men and women, respectively. The prediction values were calculated by fixing log10(EBNA4_0529) at the respective mean value. B. ROC curves and the respective AUC (95% confidence interval shown within brackets) when using the model to compare different groups of ME/CFS patients to healthy controls.
Table 1 Basic characteristics of ME/CFS patients and healthy controls and their statistical comparison, where p-values refer to the comparison between ME/CFS groups and healthy controls.
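The caption of the final classification model describes predicted probabilities evaluated over age and EBNA6_0070 antibody levels, separately for men and women, with log10(EBNA4_0529) held at its mean. The sketch below shows, in Python, how such a prediction grid and the sensitivity and specificity at a given cutoff could be computed. It assumes an additive logistic model without interaction terms, and every coefficient and data value is hypothetical; neither the link function nor the fitted estimates are given in this section.

```python
import math

# Hypothetical coefficients for illustration only (not the fitted values).
COEF = {
    "intercept": -6.0,
    "age": 0.02,
    "female": 0.5,
    "log10_ebna6_0070": 1.0,
    "log10_ebna4_0529": 0.8,
}

def predict_probability(age, female, log10_ebna6, log10_ebna4, coef=COEF):
    """Probability of being a patient under an additive logistic model."""
    eta = (coef["intercept"]
           + coef["age"] * age
           + coef["female"] * female
           + coef["log10_ebna6_0070"] * log10_ebna6
           + coef["log10_ebna4_0529"] * log10_ebna4)
    return 1.0 / (1.0 + math.exp(-eta))

def probability_grid(ages, log10_ebna6_levels, female, log10_ebna4_mean):
    """Rows index age, columns index log10(EBNA6_0070); EBNA4_0529 is fixed."""
    return [[predict_probability(a, female, x, log10_ebna4_mean)
             for x in log10_ebna6_levels] for a in ages]

def sensitivity_specificity(probs, labels, cutoff):
    """Classify as patient when prob >= cutoff; labels: 1 patient, 0 control."""
    tp = sum(1 for p, y in zip(probs, labels) if y == 1 and p >= cutoff)
    fn = sum(1 for p, y in zip(probs, labels) if y == 1 and p < cutoff)
    tn = sum(1 for p, y in zip(probs, labels) if y == 0 and p < cutoff)
    fp = sum(1 for p, y in zip(probs, labels) if y == 0 and p >= cutoff)
    return tp / (tp + fn), tn / (tn + fp)

# Grid for women, with log10(EBNA4_0529) fixed at a made-up mean of 3.0.
grid_women = probability_grid([20, 40, 60], [2.0, 3.0, 4.0],
                              female=1, log10_ebna4_mean=3.0)

# Toy predicted probabilities and true labels, for illustration only.
sens, spec = sensitivity_specificity([0.9, 0.8, 0.4, 0.3, 0.2],
                                     [1, 1, 1, 0, 0], cutoff=0.5)
```

Plotting `grid_women` as contours would give a panel analogous to the one described for women; repeating the call with `female=0` gives the panel for men. Sweeping the cutoff over the predicted probabilities and recording each sensitivity and specificity pair traces out a ROC curve.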
Epstein-Barr Virus and Systemic Autoimmune Diseases
Epstein-Barr virus-associated lymphomas
Longitudinal analysis reveals high prevalence of Epstein-Barr virus associated with multiple sclerosis
Chronic fatigue syndrome. A critical appraisal of the role of Epstein-Barr virus
Chronic viral infections in myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS)
Epstein-Barr Virus and the Origin of Myalgic Encephalomyelitis or Chronic Fatigue Syndrome
Myalgic encephalomyelitis/chronic fatigue syndrome: A comprehensive review
Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: Essentials of Diagnosis and Management
Serology Distinguishes Different Subgroups of Patients From the United Kingdom Myalgic Encephalomyelitis/Chronic Fatigue Syndrome Biobank
Epstein-Barr virus, and human herpesvirus-6 infections in patients with myalgic encephalomyelitis/chronic fatigue syndrome
Salivary DNA Loads for Human Herpesviruses 6 and 7 Are Correlated With Disease Phenotype in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome
Virus Induced Gene-2 Upregulation Identifies a Particular Subtype of Chronic Fatigue Syndrome/Myalgic Encephalomyelitis. Front Pediatr
Antibody to Epstein-Barr virus deoxyuridine triphosphate nucleotidohydrolase and deoxyribonucleotide polymerase in a chronic fatigue syndrome subset
Deficient EBV-specific B- and T-cell response in patients with Chronic Fatigue Syndrome
HLA-DR15 Molecules Jointly Shape an
Multiple Sclerosis/Chronic Fatigue Syndrome overlap: When two common disorders collide
Molecular mimicry in T cell-mediated autoimmunity: Viral peptides activate human T cell clones specific for myelin basic protein
Cerebrospinal fluid CD4+ T cells from a multiple sclerosis patient cross-recognize Epstein-Barr virus and myelin basic protein
EBNA1-specific T cells from patients with multiple sclerosis cross react with myelin antigens and co-produce IFN-γ and IL-2
Serological profiling of the EBV immune response in Chronic Fatigue Syndrome using a peptide microarray
Molecular mimicry between Anoctamin 2 and Epstein-Barr virus nuclear antigen 1 associates with multiple sclerosis risk
Impact of genetic variation on the molecular mimicry between Anoctamin-2 and Epstein-Barr virus nuclear antigen 1 in Multiple Sclerosis
Cellular immune function in myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS)
Antibodies to human herpesviruses in myalgic encephalomyelitis/chronic fatigue syndrome patients
Chronic fatigue syndrome: The need for subtypes
The control of the false discovery rate in multiple testing under dependency
Modern Applied Statistics with S. Fourth
pROC: An open-source package for R and S+ to analyze and compare ROC curves
Optimalcutpoints: An R package for selecting optimal cutpoints in diagnostic tests
Multiple early factors anticipate post-acute COVID-19 sequelae
Autoantibodies Against G-Protein Coupled Receptors, Immunological and Cardiovascular Parameters Identifies Distinct Patterns in Post-Infectious vs. Non-Infection-Triggered Myalgic Encephalomyelitis/Chro
Infection elicited autoimmunity and Myalgic encephalomyelitis/chronic fatigue syndrome: An explanatory model
Artificial intelligence to deep learning: machine intelligence approach for drug discovery
Sex Differences in Immunity to Viral Infections
The nonspecific and sex-differential effects of vaccines
Onset patterns and course of myalgic encephalomyelitis/chronic fatigue syndrome
Epidemiological characteristics of chronic fatigue syndrome/myalgic encephalomyelitis in Australian patients
Prevalence of myalgic encephalomyelitis/chronic fatigue syndrome (ME/CFS) in three regions of England: A repeated cross-sectional study in primary care
European Network on Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (EUROMENE): Expert Consensus on the Diagnosis, Service Provision, and Care of People with ME/CFS in Europe
Recognition of conserved amino acid motifs of common viruses and its role in autoimmunity
A potential antigenic mimicry between viral and human proteins linking Myalgic Encephalomyelitis/Chronic Fatigue Syndrome (ME/CFS) with autoimmunity: The case of HPV immunization
Molecular mimicry and autoimmunity
Fatigue and psychosocial variables in autoimmune rheumatic disease and chronic fatigue syndrome: A cross-sectional comparison
Illness perceptions and levels of disability in patients with chronic fatigue syndrome and rheumatoid arthritis
Insights from myalgic encephalomyelitis/chronic fatigue syndrome may help unravel the pathogenesis of postacute COVID-19 syndrome
An Overview of the Intrinsic Role of Citrullination in Autoimmune Disorders
Cross-reactivity between the EBNA-1 p107 peptide, collagen, and keratin: implications for the pathogenesis of rheumatoid arthritis
Levels of Nitric Oxide Synthase Product Citrulline Are Elevated in Sera of Chronic Fatigue Syndrome Patients
Metabolic features of chronic fatigue syndrome
Chronic fatigue syndrome is accompanied by an IgM-related immune response directed against neoepitopes formed by oxidative or nitrosative damage to lipids