key: cord-0686745-p95rawea authors: Cozzi-Lepri, Alessandro; Smith, Colette; Mussini, Cristina title: Signals were broadly positive for months, but never definitive: the tocilizumab story date: 2021-11-09 journal: Clin Microbiol Infect DOI: 10.1016/j.cmi.2021.10.018 sha: 83179a33fa6e520eb8d7a8d572ba09617c735af5 doc_id: 686745 cord_uid: p95rawea BACKGROUND: Most COVID-19 treatment guidelines currently recommend tocilizumab in combination with dexamethasone in critically ill patients who are exhibiting rapid respiratory decompensation. OBJECTIVES: To produce a critical review and summary of the pathway which led to the repurposing of tocilizumab for COVID-19 treatment, from in vitro observations to guidelines recommendations. SOURCES: All studies evaluating the effectiveness of tocilizumab to treat COVID-19 disease published over July 2020-July 2021. CONTENT: Two large methodologically well conducted observational studies, the TESEO and the STOP COVID cohorts, showed a reduction in the risk of invasive mechanical ventilation or death in patients treated with tocilizumab as compared to standard of care in 2020. Concomitantly and up to February 2021 a number of small sample size randomized trials (RCTs) were showing discrepant results. These RCTs had a number of issues: small sample size, various designs and inclusion criteria and different dosages of tocilizumab used. The confidence interval of the meta-analytic estimate for the RCT results was consistent with the hypothesis of no efficacy of tocilizumab. In our opinion, this was mainly because the meta-analysis included small and heterogeneous studies. These results led to a delay in the inclusion of tocilizumab in guidelines which occurred only in the summer of 2021. IMPLICATIONS: Although observational studies are unable to control for unmeasured confounding, they can be put together quickly during a pandemic and promptly provide important information. 
The large sample size allows us to investigate effect measure modifiers and better target interventions. It is key that the effect size is somewhat large (RR>2), all sources of bias are properly accounted for and the direct evidence is weighed against these factors. It appears to us that for tocilizumab, had the results of carefully designed and analysed observational studies not been dismissed in 2020, many deaths could have been prevented over those months.

The standard of care of patients with severe COVID-19 pneumonia changed dramatically during 2020, whilst treating physicians balanced the urgent need to lower the mortality risk with quickly accumulating evidence on the efficacy of candidate clinical interventions. Randomised clinical trials (RCTs) are considered the gold standard for evaluating the efficacy and safety of medical interventions. Although the application of this rigorous strategy, which is valid in general, becomes even more relevant during a pandemic, it can be a challenge to rapidly organise and conduct RCTs. Conversely, collection of real-world data from routine clinical practice is relatively simple and, by using modern techniques of data capture, can be done almost in real time during a pandemic. In this scenario, the key question still holds: to what extent can we trust evidence from non-randomised studies, in particular observational studies using real-world data? By strictly applying the evidence-based medicine (EBM) hierarchy, only by conducting high-quality RCTs (using concealment of allocation, blinding, adequate power, no loss to follow-up, etc.) can we be certain that people receiving the intervention are as similar as possible to those who did not; thus, any difference in outcome can be ascribed to the intervention.
In contrast, if evidence of high effectiveness of a certain intervention has been obtained from observational studies alone, these treatments are typically not licensed for routine use, since the interventions are believed to be supported by relatively poor evidence. Indeed, results of observational studies are unlikely to trigger FDA emergency use authorisations (EUA). Observational studies are vulnerable to a number of factors, including the presence of confounding (both measured and unmeasured) and differential levels of clinical monitoring by intervention arm, which could bias the estimate of the effect of the intervention. Specifically, randomised studies are placed higher in the hierarchy of EBM because bias due to unmeasured confounding is minimised. Therefore, over the last two decades an increasing number of epidemiologists and statisticians have advocated for the use of propensity score adjustments and marginal structural models to at least properly control for all sources of measured confounding. If the objective is to provide an estimate of the causal effect of an intervention which is as close as possible to the estimate that could be obtained in a hypothetical randomised clinical trial, all other sources of bias being equal, marginal structural models represent the best available tool to date to make valid inference from observational data (1-3).

Tocilizumab is an IL-6 receptor antagonist. As elevated levels of IL-6 consistently predicted both severe prognosis and mortality, there was a strong rationale for using this compound in COVID-19 disease (4). The drug was originally indicated for the treatment of rheumatoid arthritis and more recently for the treatment of cytokine release syndrome secondary to chimeric antigen receptor T cell therapy (CAR-T) (5).
The fact that the cytokine storm described during severe COVID-19 could be considered similar to that occurring after CAR-T, together with the availability of data on tolerability of the drug in humans with rheumatoid arthritis and CAR-T, led clinicians, initially in China followed by Europe and the USA, to administer tocilizumab to treat COVID-19 patients (6). Real-world data were quickly put together in order to gather evidence on the effectiveness of the drug in reducing the risk of invasive mechanical ventilation and death (7, 8). These studies were conducted with very variable inclusion criteria, ranging from patients with very mild disease to patients who were admitted to the ICU. Among these, we want to highlight two studies: one in Italy (the TESEO cohort, our group) and one in the USA (the STOP-COVID cohort, based in 68 tertiary hospitals) (7, 8). Both studies included critically ill patients and used a marginal structural model with inverse probability weights to control for measured imbalances in the characteristics of patients who were treated with tocilizumab compared to those who were not.

The TESEO cohort is a large multicenter observational study of 544 patients with COVID-19 admitted to three tertiary hospitals in the Emilia-Romagna region during the first wave of the pandemic (7). The study showed high effectiveness of tocilizumab compared to standard of care for reducing the risk of both invasive mechanical ventilation and death. At day 14 after hospital admission, the proportion of patients with the composite outcome was 23% (95% CI 16-29%) for the tocilizumab group versus 37% (31-42%) for the standard of care group (log-rank p=0.0023). From fitting a marginal structural Cox regression model, the adjusted hazard ratio (aHR) was 0.53 (95% CI: 0.31-0.89, p=0.016), thus showing high effectiveness of tocilizumab in reducing the risk of these events.
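To make the weighting step concrete, the following is a minimal sketch in Python of inverse probability of treatment weighting with a single baseline confounder. It is not the marginal structural Cox model fitted in TESEO or STOP-COVID: the counts are hypothetical, the only confounder is an invented binary `severe` flag, and the outcome measure is a simple risk ratio rather than a hazard ratio. All names (`TOY_COUNTS`, `expand`, `propensity_by_stratum`, `iptw_weights`, `risk`) are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical aggregated data: (severe, treated, n_patients, n_events).
# Treated patients are deliberately more often severely ill.
TOY_COUNTS = [
    (1, 1, 60, 24),
    (1, 0, 40, 24),
    (0, 1, 20, 2),
    (0, 0, 80, 16),
]


def expand(counts):
    """Turn aggregated counts into one (severe, treated, event) row per patient."""
    rows = []
    for severe, treated, n, events in counts:
        rows += [(severe, treated, 1)] * events
        rows += [(severe, treated, 0)] * (n - events)
    return rows


def propensity_by_stratum(rows):
    """Estimate P(treated = 1 | severe) as the treated fraction in each stratum."""
    n = defaultdict(int)
    n_treated = defaultdict(int)
    for severe, treated, _ in rows:
        n[severe] += 1
        n_treated[severe] += treated
    return {s: n_treated[s] / n[s] for s in n}


def iptw_weights(rows):
    """Weight each patient by 1 / P(treatment actually received | severe)."""
    ps = propensity_by_stratum(rows)
    return [1 / ps[s] if t == 1 else 1 / (1 - ps[s]) for s, t, _ in rows]


def risk(rows, arm, weights=None):
    """(Weighted) proportion with the event among patients in the given arm."""
    num = den = 0.0
    for i, (_, treated, event) in enumerate(rows):
        if treated != arm:
            continue
        w = 1.0 if weights is None else weights[i]
        num += w * event
        den += w
    return num / den


rows = expand(TOY_COUNTS)
weights = iptw_weights(rows)
crude_rr = risk(rows, 1) / risk(rows, 0)
weighted_rr = risk(rows, 1, weights) / risk(rows, 0, weights)
print(f"crude risk ratio:    {crude_rr:.3f}")
print(f"weighted risk ratio: {weighted_rr:.3f}")
```

In this toy population the crude risk ratio (0.975) hides most of the benefit because treated patients are sicker, whereas the weighted risk ratio (0.625) is in line with the benefit seen within each severity stratum. Weighting of this kind can only balance measured characteristics; unmeasured confounding is left untouched.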
The effect of tocilizumab was even greater in people with a baseline PaO2/FiO2 value of less than 150 mmHg (aHR 0.19, 95% CI 0.08-0.44), suggesting that the drug might be particularly indicated for critically ill patients. The article was published in August 2020. Consistent findings came from a different geographical setting in the STOP COVID cohort, published in November 2020 (8). Here, 28-day mortality risks were 28% in the tocilizumab group vs. 37% in the non-tocilizumab group, with an aHR for death of 0.71 (95% CI: 0.56-0.92). Again, the effect was stronger in the subset of patients who were admitted to the ICU within 3 days of symptom onset (aHR=0.41, 95% CI: 0.23-0.74) and largely attenuated in those admitted to the ICU >3 days after symptom onset. TESEO and STOP COVID also shared the same type of imbalance between the two treatment strategies, with patients treated with tocilizumab typically having more comorbidities, a higher prevalence of hypoxemia and higher levels of inflammatory markers.

After August 2020, randomised studies started to be published. The trials were highly heterogeneous in inclusion criteria and design. For example, the CORIMUNO TOCI study, the RCT-TCZ-COVID-19 study and the Brazilian COVID-19 trial were not double blind and did not have placebo controls (9-11), and only REMAP CAP comprised mainly critically ill patients (12). Exact dosing of tocilizumab also varied by trial. In the RECOVERY, EMPACTA and REMAP-CAP trials (12-14), patients received one dose of tocilizumab that could be repeated after 12 or 24 hours on the basis of clinical evaluation, while in COVACTA just one dose was used and in RCT-TCZ-COVID-19 always two doses (10, 15). By the summer of 2020, because of the conflicting nature of the results of these trials, tocilizumab was excluded in many countries from the list of recommended regimens for the treatment of COVID-19 disease and its use was recommended against outside of clinical trials.
In early January 2021, the results of the REMAP CAP trial were published and there was a need to re-synthesise all the evidence coming from interventional studies (12). The best way to pool all the information together is to conduct a meta-analysis including all trials regardless of design features and inclusion criteria. This was indeed the approach used by the MRC Population Health Research Unit [https://twitter.com/rupert_pearse/status/1349424862876594179/photo/1]. Such a meta-analysis indicated an effect, albeit of a small magnitude, of tocilizumab in reducing the risk of 28-day mortality (meta-analytic OR=0.83, 95% CI: 0.66-1.04, p=0.11). As a consequence, the drug was seen more positively, although more definitive evidence was needed to recommend routine use, as reported in a commentary published at the time in the BMJ (16).

However, looking individually at some of the trials included in the meta-analysis, the study conducted by Salvarani et al. only included patients with mild disease severity, with a median PaO2/FiO2 ratio of 265 mmHg (10). This is clearly a population with no ongoing cytokine storm, for whom tocilizumab should not be indicated. Secondly, the sample size of the BACC Bay trial (17) was small and, because of the specific wording used in the conclusion paragraph of the abstract (i.e. 'tocilizumab was not effective for preventing intubation or death'), its inconclusive evidence has been widely interpreted as 'no effect of the drug'. Despite recent efforts to better teach statistical hypothesis testing and avoid p-value misconceptions, it is unfortunately not uncommon to confuse the concept of 'no statistical significance' with that of 'no effect' (18).
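The distinction between 'no statistical significance' and 'no effect' can be checked with simple arithmetic on the meta-analytic estimate quoted above (OR=0.83, 95% CI: 0.66-1.04). The Python sketch below assumes the interval was built as a Wald interval on the log odds ratio scale, which the source does not state, and recovers the standard error, z statistic and two-sided p-value from the reported numbers. The function names `normal_cdf` and `p_from_ratio_ci` are illustrative.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def p_from_ratio_ci(estimate, lower, upper, z_crit=1.959964):
    """Recover SE, z and two-sided p (H0: ratio = 1) from a ratio and its 95% CI.

    Assumes a symmetric (Wald) interval on the log scale.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(estimate) / se
    p = 2.0 * (1.0 - normal_cdf(abs(z)))
    return se, z, p


# Meta-analytic estimate for 28-day mortality as reported in the text.
or_hat, ci_low, ci_high = 0.83, 0.66, 1.04
se, z, p = p_from_ratio_ci(or_hat, ci_low, ci_high)

print(f"SE of log OR = {se:.3f}, z = {z:.2f}, two-sided p = {p:.2f}")
print(f"Change in odds compatible with the data: "
      f"{(ci_low - 1) * 100:.0f}% to +{(ci_high - 1) * 100:.0f}%")
```

Under that assumption the recovered p-value rounds to the reported 0.11, and the same interval is compatible with anything from a 34% reduction to a 4% increase in the odds of death. This is an inconclusive result, not evidence that the drug has no effect.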
Importantly, when restricting to trials with >150 participants per arm and predominantly including patients with more severely compromised respiratory function at entry (EMPACTA, CORIMUNO and REMAP CAP), there was strong evidence that tocilizumab reduced the risk of invasive mechanical ventilation or death by a remarkable 40% (15, 9, 12). In March 2021, the results of the largest trial conducted to date, the RECOVERY trial, were finally publicly reported and confirmed a reduction in mortality (RR=0.86, 95% CI: 0.77-0.96, p=0.007) in patients treated with dexamethasone plus tocilizumab compared to those treated with dexamethasone alone (13). On the basis of these results, several guidelines, including those of the Italian Society of Infectious Diseases, the UK, the NIH and the IDSA, were eventually modified to recommend the use of tocilizumab in critically ill patients (19-22).

Interestingly, along the lines of our arguments, Lawrence et al. recently proposed that meta-analyses should ideally be conducted using individual patient (IP) data rather than an assemblage of summary statistics. Indeed, most of the flaws identified in trials of ivermectin would have been immediately detected in an IP meta-analysis. Such an approach would also have led to a better assessment of the effect measure modifiers in these studies [23].

At this point, we should also note that other drugs such as hydroxychloroquine, azithromycin and ivermectin appeared to be promising to treat COVID-19 disease during the early phase of the pandemic. However, we do not think that the pathway from observational studies to RCTs for these drugs is comparable to the one we describe here for tocilizumab. Briefly, hydroxychloroquine showed promising reductions of SARS-CoV-2 replication in in vitro studies, but we are not aware of high-quality observational studies in humans. Indeed, one initial study lacked a control group and others were criticised as being affected by confounding and immortal-time bias [26, 27].
Similarly, the use of ivermectin was advocated following the results of a meta-analysis which was later retracted by the authors due to the inclusion of at least two studies which had been poorly designed, not peer reviewed or affected by clear bias [28-30].

Classical principles of EBM are not put under scrutiny here. High-quality evidence of the safety and efficacy of new interventions is vital. It is indeed reasonable to always treat results of observational studies cautiously because of the issue, among others, of unmeasured confounding. It is also reasonable not to rely on the results of single randomised trials in isolation, and to ensure that meta-analyses consider differences in study designs and patient populations before pooling the results (23). Nevertheless, there were a number of anomalies in this 'tocilizumab story' which seem to suggest that perhaps more caution is required when performing some of these steps, especially while we are working under the pressure of a devastating pandemic. First, the interpretation of some of the early small individual trials has been incorrect or misleading. In addition, the meta-analyses of these trials performed in January 2021 focussed on a mortality endpoint only. Although death is certainly a more solid and less subjective endpoint, this choice appears to be debatable because saturation of ICUs was one of the most critical issues, (24). This suggests that randomised and non-randomised studies should no longer be separated but instead classified together as study designs providing 'direct evidence' for the effectiveness of an intervention. According to Howick et al., the accumulation of 'direct' evidence demonstrating that the effect size is greater than the combined influence of plausible confounders and other potential bias is more important than the actual study design (experimental vs. observational) (24).
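One way to put a number on 'greater than the combined influence of plausible confounders' is the E-value proposed by VanderWeele and Ding. This technique is not used in the article and is shown here only as an illustration: for a risk ratio RR > 1 the E-value is RR + sqrt(RR × (RR − 1)), and protective estimates are inverted first. The Python sketch applies it to the RR = 2 threshold mentioned in the abstract; the function name `e_value` is an assumption, and a hazard ratio for a common outcome would need converting to an approximate risk ratio before being passed in.

```python
import math


def e_value(rr):
    """E-value for a risk ratio: the minimum strength of association (on the
    risk ratio scale) that an unmeasured confounder would need with both
    treatment and outcome to fully explain away the observed estimate."""
    if rr <= 0:
        raise ValueError("risk ratio must be positive")
    if rr < 1:
        rr = 1.0 / rr  # protective effects are inverted before applying the formula
    return rr + math.sqrt(rr * (rr - 1.0))


# Threshold effect size from the abstract (RR > 2), and the equivalent protective RR.
for rr in (2.0, 0.5):
    print(f"RR = {rr:.2f} -> E-value = {e_value(rr):.2f}")
```

For RR = 2 (or, equivalently, a protective RR of 0.5) the E-value is about 3.41: an unmeasured confounder would need to be associated with both treatment and outcome by a risk ratio of at least 3.41 each, over and above the measured covariates, to fully explain away the association.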
In this particular case, patients treated with tocilizumab in observational studies were on average more critically ill than those who did not receive the drug. Thus, if anything, the effect of tocilizumab could have even been underestimated in these studies. Importantly, our key point is that carefully designed and analysed observational studies can also play a key role in advancing our knowledge of treating COVID-19. In a recent study of COVID-19 treatment comparisons, Shepshelovich et al. showed a large discrepancy between the results of observational studies and trials, although 30% of the observational studies reported only crude mortality rates (univariable analyses without controlling for confounding) and only 55% used propensity adjustment analysis [25]. In addition, only eight non-randomised studies (8%) contained any type of adjustment for immortal-time bias.

Taking all these arguments together, in a scenario of particularly severe outcomes occurring on a rapid timescale, is it sensible to withdraw or recommend against the use of a promising intervention, despite the fact that it has been shown in well-conducted non-randomised studies to reduce mortality by as much as 40%? Indeed, many patients admitted during the second wave in 2020 and early 2021 could not access tocilizumab because of guideline recommendations. Crucially, many lives could have been saved by introducing tocilizumab in routine clinical practice as early as August 2020 (the date of publication of the TESEO study) or even in November 2020 (pre-print of the STOP COVID study), and this is certainly something which regulatory agencies, researchers, infectious disease clinicians and the community as a whole should be reflecting upon.
The authors wish to show their appreciation to Professor Miguel Hernan for inspiration and sharing his results from the STOP COVID collaboration cohort (https://www.lshtm.ac.uk/media/44276).

None of the authors declare conflicts of interest and no external funding was received for this work.

Cozzi-Lepri A: data analysis, data interpretation, writing and revising for intellectual content. Smith C: writing and revising for intellectual content. Mussini C: writing and revising for intellectual content.

Marginal structural models and causal inference in epidemiology
Do observational studies using propensity score methods agree with randomized trials? A systematic comparison of studies on acute coronary syndromes
INSIGHT START Study Group and the HIV-CAUSAL Collaboration. Effect Estimates in Randomized Trials and Observational Studies: Comparing Apples With Apples
IL-6-based mortality risk model for hospitalized patients with COVID-19
Tocilizumab for the treatment of chimeric antigen receptor T cell-induced cytokine release syndrome
Effective Treatment of Severe COVID-19 Patients with Tocilizumab
Tocilizumab in patients with severe COVID-19: a retrospective cohort study
STOP-COVID Investigators. Association Between Early Treatment With Tocilizumab and Mortality Among Critically Ill Patients With COVID-19
Effect of tocilizumab vs usual care in adults hospitalized with Covid-19 and moderate or severe pneumonia: a randomized clinical trial
Effect of Tocilizumab vs Standard Care on Clinical Worsening in Patients Hospitalized With COVID-19 Pneumonia: A Randomized Clinical Trial
Coalition covid-19 Brazil VI Investigators. Effect of tocilizumab on clinical outcomes at 15 days in patients with severe or critical coronavirus disease 2019: randomised controlled trial
Receptor Antagonists in Critically Ill Patients with Covid-19
Tocilizumab in patients admitted to hospital with COVID-19 (RECOVERY): a randomised, controlled, open-label, platform trial
Tocilizumab in Patients Hospitalized with Covid-19
Tocilizumab in Hospitalized Patients with Severe Covid-19 Pneumonia
Covid-19 controversies: the tocilizumab chapter
Efficacy of Tocilizumab in Patients Hospitalized with Covid-19
Statistical tests, P values, confidence intervals, and power: a guide to misinterpretations
Therapeutic strategies for severe COVID-19: a position paper from the Italian Society of Infectious and Tropical Diseases (SIMIT)
IDSA Guidelines on the Treatment and Management of Patients with COVID-19
Extending inferences from a randomized trial to a target population
The evolution of evidence hierarchies: what can Bradford Hill's 'guidelines for causation' contribute?
Concordance between the results of randomized and non-randomized interventional clinical trials assessing the efficacy of drugs for COVID-19: a cross-sectional study
Hydroxychloroquine and azithromycin as a treatment of COVID-19: results of an open-label non-randomized clinical trial
Treatment with hydroxychloroquine, azithromycin, and combination in patients hospitalized with COVID-19
Meta-analysis of randomized trials of ivermectin to treat SARS-CoV-2 infection
The lesson of ivermectin: meta-analyses based on summary data alone are inherently unreliable