key: cord-0717305-r91f2sri
authors: Trkulja, Vladimir; Kodvanj, Ivan; Homolak, Jan
title: Immunoglobulin G glycome and severity of COVID-19: more likely a quantification of bias than a true association. A comment on Petrović et al., “Composition of the immunoglobulin G glycome associates with the severity of COVID-19”
date: 2020-12-18
journal: Glycobiology
DOI: 10.1093/glycob/cwaa115
sha: 8de01d6ff65a8b5bd24a52a0f335910b25b075db
doc_id: 717305
cord_uid: r91f2sri

A recent manuscript (Petrović et al. 2020) suggested an association between certain aspects of the total immunoglobulin G (IgG) glycosylation pattern and severity of the disease in hospitalized COVID-19 patients. More specifically, the authors claimed that their data supported a conclusion about "cross-sectional association" (Petrović et al. 2020) between a higher percentage of bisecting N-acetylglucosamine (GlcNAc) in the total IgG N-glycome and less severe disease (or, in reverse, between a lower GlcNAc percentage and "severe", as opposed to "mild", disease). Comments on the biological plausibility or on the potential practical relevance of IgG glycome research in the COVID-19 setting are beyond our scope. We draw attention to methodological flaws of the manuscript in question (its clearly superior bioanalytics aside), due to which we consider the reported "effects" far more likely to represent a quantification of bias than a true association. We elaborate our view by addressing the potential "doors" through which bias could have been introduced (Altman 1994), i.e., design, analysis, reporting and interpretation. We then use the reported data (Petrović et al. 2020) to reconstruct information about uncertainty that was not reported, re-analyze the data to illustrate this uncertainty, and subject the reported effects to an analysis of sensitivity to unmeasured confounding.

Design.
Although the limitations of a cross-sectional design were acknowledged, the cross-sectional nature is not the main limitation of the study (Petrović et al. 2020). The main design flaws refer to:

1) Inconsistent definition of "severe" and "mild" COVID-19 disease. The study comprises three cohorts of hospitalized patients sampled at different locations (designated as Italy, Portugal and Spain) where different definitions of "severe" and "mild" were used. In the Italian cohort, "severe" referred to patients hospitalized for acute respiratory failure requiring mechanical (MV) or non-invasive (NIV) ventilation, while "mild" referred to patients not requiring oxygen support or requiring oxygen support by mask. People who are well oxygenated on room air and those requiring "oxygen support by mask" do not belong to the same category of disease severity. Moreover, people requiring "oxygen by mask" may differ considerably in disease severity: some will do well with only low-flow oxygen (e.g., up to 3-5 L/min), while others will require higher oxygen flow rates (e.g., 15 L/min), and both could be "critical" due to, e.g., hemodynamic instability and/or sepsis. High-flow oxygen-requiring and NIV-requiring patients are "closer" to each other than low-flow oxygen-requiring and high-flow oxygen-requiring patients, and NIV-requiring patients are closer to high-flow oxygen-requiring patients than to MV-requiring patients (see, e.g., the stratification for baseline severity in a large recent randomized trial, Beigel et al. 2020).

In the Portuguese cohort, "severe" (need for MV) and "moderate" (no need for MV and no need for intensive care unit, ICU, admission) patients were pooled into a single group, an obvious case of mixing patients with very different disease severity. "Mild" patients were defined as those without pneumonia. This is a completely different level of disease severity compared with the definition of "mild" in the Italian cohort.
In the Spanish cohort, only "severe" was defined: patients needing ICU admission, NIV or MV, or those who died during the index hospitalization. There was no definition of "mild", but one could deduce that all the others were considered "mild". This is again a case of mixing patients with very different disease severity, in both the "severe" and the "mild" subset. Overall, the composition of the "severe patient" subset varied greatly across the cohorts, and (by cohort) it embraced patients with clinically considerably different disease severity. The same applies to the "mild patient" subset.

2) Not accounting for confounding at the design level. A range of obvious potential confounders (factors that may be associated with both disease severity and the IgG glycome) were disregarded at the design level, including: (i) comorbidities and baseline treatments (e.g., malignancy, diabetes, autoimmune diseases, obesity, cardiovascular and respiratory comorbidity, co-infection, sepsis, immunosuppressants/immunomodulators), or treatments delivered specifically for various manifestations of COVID-19; (ii) patient sampling time-periods (e.g., whether or not "severe" and "mild" patients were sampled concurrently).

[...] confounders not only at the design level but also at the analysis level. Data were first analyzed by country, and the summary country data were then pooled by a meta-analysis. The description of both procedures is vague and the data presentation is to some extent confusing. Apparently, data by cohort were analyzed by logistic regression. As far as one can deduce, the dependent variable was "disease severity" (the probability of "severe" was modeled), while 6 quantitative variables [bisecting GlcNAc (depicted as "B"), agalactosylation ("G0"), one galactose ("G1"), two galactoses ("G2"), sialylation ("S") and fucosylation ("F")] were "effects".
They were not all entered simultaneously in the model; rather, the relationship between each of the 6 glycans and the outcome was assessed in a separate logistic regression, a situation in which any observed "association" might have been confounded by any of the other glycans not accounted for. Odds ratios were then pooled meta-analytically across the three countries. This was repeated six times, once for each individual glycan (B to F). The part reporting on the meta-analysis provides no description of the method, and measures of uncertainty and of heterogeneity were not reported. Finally, the authors reported "adjusted meta-analysis P-values", where by "adjusted" they meant false discovery rate (FDR)-adjusted P-values, but FDR is not a method to control the familywise error rate (FWER).

Re-analysis of data to illustrate uncertainty.
Bearing in mind all the unmeasured confounding, the data re-analysis is just as meaningless as the reported analysis (Petrović et al. 2020). It serves the sole purpose of illustrating the uncertainty (not reported by the authors). Figure 1 summarizes a simple meta-analysis of log(ORs) by individual "effect" (glycan): the uncertainty (best illustrated by the prediction intervals) is huge, and the claimed "significant association" (GlcNAc) is in this respect indistinguishable from the other "effects" for which no "significance" was claimed. Figure 2 summarizes an analysis that accounts for the correlation between "effects" from the same cohort: Bayesian (posterior densities, 95%CrI) (upper panel) and frequentist (lower panel). The latter is given with 99%CI (to account for multiplicity) and with raw and adjusted P-values to control the FWER. Even with all the disregarded confounding, no clear association between the level of GlcNAc and "severe disease" is apparent.

Sensitivity of the reported "significant association" to unmeasured confounding.
The claimed association between bisecting GlcNAc and "severe disease" (OR=0.712, 95%CI 0.579-0.876, Figure 1) was provided without accounting for a range of potential confounders. For example, diabetes (type 2) is a known risk factor for mortality (OR=1.90, 95%CI 1.37-2.64) and for severe disease (OR=2.75, 95%CI 2.09-3.62) in COVID-19 patients (Kumar et al. 2020). At the same time, diabetes is also strongly associated with IgG glycosylation patterns: specifically, the odds ratio for the association between bisecting GlcNAc and diabetes was reported to be OR=0.361 (Wu et al. 2020). Many (severe) COVID-19 patients are diabetics: the prevalence of diabetes among those who required ventilation or died varied between 9% and 25% in several cohorts (most commonly between 15% and 25%) (Kumar et al. 2020). It is reasonable to ask: assuming that some 15% of patients in the cohorts reported in the manuscript in question (Petrović et al. 2020) had diabetes, what strength of association between diabetes and "severe COVID-19", and between diabetes and bisecting GlcNAc, would completely explain away the reported association between GlcNAc and "severe disease" (i.e., "push" the reported OR to [...] GlcNAc IgG content. Hence, the reported "association" is susceptible to even small unmeasured bias. Overall, while IgG glycome analysis might turn out to be a fruitful area of COVID-19-related research, the reported "effect" (Petrović et al. 2020) is more likely to be a quantified bias than a true association.

[...] random-effects meta-analysis of "probability of being severe COVID-19 patient" for each "effect" (i.e., continuous independent variable: glycan). Data on log(OR), P-values and number of subjects per cohort (Petrović et al. 2020) were used to recover standard errors (based on N-3 d.f.). Considering that only three cohorts ("studies") contributed to the meta-analysis, 95%CIs for τ² and I² are provided, while point-estimates (since meaningless) are omitted.
Package meta in R was used (Schwarzer 2015).

The scandal of poor medical research
Remdesivir for the treatment of COVID-19 - final report
Advanced Bayesian multilevel modeling with R package brms
Hierarchical models
Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations
Is diabetes mellitus associated with mortality and severity of COVID-19? A meta-analysis
Composition of the immunoglobulin G glycome associates with the severity of COVID-19
Meta-analysis in R
Sensitivity analysis in observational research: introducing the E-Value
Advanced methods in meta-analysis: multivariate approach and meta-regression
Conducting meta-analysis in R with the metafor package
Moving to a world beyond
Multiple comparisons and multiple tests using SAS, 2nd edition
IgG glycosylation profile and glycan score are associated with type 2 diabetes in independent Chinese populations: a case-control study
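
As a supplementary illustration of the sensitivity analysis to unmeasured confounding discussed in the text, the following is a minimal Python sketch of the E-value calculation ("Sensitivity analysis in observational research: introducing the E-Value" in the list above), applied to the reported OR=0.712 (95%CI 0.579-0.876). It is not the code used for this comment. The function names and the `rare_outcome` switch are hypothetical, and the choice of the square-root approximation of a risk ratio from an odds ratio (commonly used when the outcome is not rare) is an assumption.

```python
import math


def or_to_rr(odds_ratio: float, rare_outcome: bool = False) -> float:
    """Approximate a risk ratio (RR) from an odds ratio (OR).

    Assumption: for a rare outcome the OR is taken as the RR; otherwise
    the square-root approximation RR ~ sqrt(OR) is applied.
    """
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return odds_ratio if rare_outcome else math.sqrt(odds_ratio)


def e_value(risk_ratio: float) -> float:
    """E-value for a risk ratio: RR + sqrt(RR * (RR - 1)).

    Protective estimates (RR < 1) are inverted first, so RR >= 1.
    """
    if risk_ratio <= 0:
        raise ValueError("risk ratio must be positive")
    rr = risk_ratio if risk_ratio >= 1.0 else 1.0 / risk_ratio
    return rr + math.sqrt(rr * (rr - 1.0))


def e_value_for_or(odds_ratio, ci_low, ci_high, rare_outcome=False):
    """Return (E-value for the estimate, E-value for the CI limit nearest 1)."""
    point = e_value(or_to_rr(odds_ratio, rare_outcome))
    if ci_low <= 1.0 <= ci_high:
        limit = 1.0  # the interval already includes the null value
    else:
        nearest = ci_high if odds_ratio < 1.0 else ci_low
        limit = e_value(or_to_rr(nearest, rare_outcome))
    return point, limit


if __name__ == "__main__":
    # Reported association between bisecting GlcNAc and "severe disease"
    point, limit = e_value_for_or(0.712, 0.579, 0.876)
    print(f"E-value (estimate): {point:.2f}; E-value (CI limit): {limit:.2f}")
```

Under these assumptions, the E-values are about 1.65 for the point estimate and about 1.34 for the upper confidence limit. In other words, an unmeasured confounder associated with both "severe disease" and bisecting GlcNAc by risk ratios of roughly 1.3 each would suffice to move the confidence interval to include the null.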