key: cord-0991044-on2zflr5 authors: Mazhar, Faizan; Hadi, Muhammad Abdul; Kow, Chia Siang; Marran, Albaraa Mohammed N.; Merchant, Hamid A.; Hasan, Syed Shahzad title: Use of hydroxychloroquine and chloroquine in COVID-19: How good is the quality of randomized controlled trials? date: 2020-09-29 journal: Int J Infect Dis DOI: 10.1016/j.ijid.2020.09.1470 sha: fa49a79dd321aa1d615a890bccaff9ec5b279915 doc_id: 991044 cord_uid: on2zflr5 Objectives: We critically evaluated the quality of evidence and quality of harms reporting in clinical trials that recently evaluated the effectiveness of HCQ/CQ in COVID-19. Study Design and Setting: Scientific databases were systematically searched to identify relevant trials of HCQ/CQ in COVID-19 published until 10th September, 2020. The Cochrane risk-of-bias tools for randomized trials and non-randomized studies of interventions were used to assess the risk of bias of included studies. A 10-item Consolidated Standards of Reporting Trials (CONSORT) harms extension was used to assess the quality of harms reporting. Results: Sixteen trials, including fourteen randomized and two non-randomized trials, met the inclusion criteria. The results from included trials were conflicting, lacked effect estimates adjusted for confounders and baseline disease severity or comorbidities in many cases, and were based on fairly small cohorts of patients. None of the clinical trials met the CONSORT criteria in full for reporting harms data in clinical trials. None of the sixteen trials had an overall ‘low’ risk of bias, while four of the trials had a ‘high’, ‘critical’, or ‘serious’ risk of bias. Biases observed in these trials arose from the randomization process, potential deviation from intended interventions, outcome measurement, selective reporting, confounding, participant selection, and/or classification of interventions. Conclusion: In general, the quality of currently available evidence for the effectiveness of CQ/HCQ in COVID-19 is suboptimal.
The importance of a properly designed and reported clinical trial cannot be overemphasized amid the COVID-19 pandemic, and neglecting it could lead to poorer clinical and policy decisions, resulting in wastage of already stretched healthcare resources. Since its outbreak in December 2019, coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has affected more than seven million individuals across the world [1, 2]. The full spectrum of clinical manifestations of COVID-19 ranges from asymptomatic or mild, self-limiting respiratory tract illness to severe progressive pneumonia, multiorgan failure, and death [3]. The reported case fatality rate of COVID-19 is highly variable, ranging from about 0.06% to 19%, depending on countries, settings, and age groups [4]. The case fatality rate is higher for hospitalized COVID-19 patients, with population data from the UK suggesting a case fatality rate of 26% [5]. Many candidate drugs have been proposed for the treatment of COVID-19, but the antimalarial agents chloroquine (CQ) and hydroxychloroquine (HCQ) have attracted a lot of attention [6]. In vitro studies have suggested direct antiviral properties of these antimalarial agents through the inhibition of pH-dependent steps of viral replication, while other researchers have suggested anti-inflammatory effects mediated through inhibiting the production of tumor necrosis factor-alpha and interleukin 6, thus blocking the cascade of events leading to acute respiratory distress syndrome [7]. The use of antimalarials first attracted media attention in February 2020 after a news briefing by the Chinese government revealed that, according to several Chinese studies, CQ and HCQ seemed 'to have apparent efficacy and acceptable safety against COVID-19' [8].
A second boost of attention on antimalarials came after the publication of a non-randomized study with considerable methodological limitations, which claimed that a combination of HCQ and azithromycin achieved more rapid SARS-CoV-2 clearance in respiratory secretions of 20 patients [9]; the drug combination was subsequently touted numerous times in the media by the president of the United States as a potential cure for COVID-19 [10]. Evidence-based medicine is one of the cornerstones of high-quality clinical care. It is well established that the best evidence comes from well-designed and well-conducted randomized controlled trials. Promising signals from in vitro studies or uncontrolled data must be rigorously confirmed or refuted in high-quality randomized controlled trials. Ideally, efficacy-based trials, including proof-of-mechanism studies, should precede larger pragmatic effectiveness trials [11]. However, developing robust evidence through well-designed and well-conducted trials can be challenging, particularly during a pandemic [12], and thus there may be a temptation to lower the 'quality threshold' and overlook the limitations associated with study design, either in the wider interest of public health or to claim a 'breakthrough'. Such temptation must be resisted, because adopting ineffective and potentially unsafe interventions on the basis of studies with methodological flaws may only cause harm without a noteworthy benefit. This may eventually have a negative impact not only on the design of other clinical trials but also on the search for truly effective and safe interventions. In addition to robust trial design, transparent and accurate reporting of trial data is equally important, especially the reporting of harms data related to interventions.
Optimal collection and reporting of adverse events (AEs) during any clinical trial should not be overlooked, so that clinicians can make a comprehensive risk-benefit assessment for patients who may have underlying conditions that contraindicate the use of a particular drug, either relatively or absolutely. Previous studies have examined the methods for AE collection and presentation and highlighted inadequacies and inconsistencies in AE reporting in various published clinical trials [13-16]. In 2004, the Consolidated Standards of Reporting Trials (CONSORT) Group produced an extension to their guidelines for reporting trial results to include the reporting of harms, but these guidelines are poorly implemented in practice [17]. The number of clinical trials assessing various treatment strategies against COVID-19 continues to increase; although some are good, the quality of most of these trials remains questionable. It is, therefore, more important than ever to critically assess the quality of emerging evidence from clinical trials amid COVID-19. This review aims to systematically summarize and critically evaluate the quality of evidence from all clinical trials of CQ or HCQ for the treatment of COVID-19 and to evaluate the quality of AE assessment and reporting in these trials. The Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guideline was followed for the study design, search protocol, screening, and reporting. Articles were searched using a predefined search strategy and eligibility criteria. A systematic search was performed in the following electronic databases: PubMed, EMBASE, medRxiv (a preprint repository), Bibliovid, Google Scholar, and Dimensions, to retrieve eligible articles published between 1st December 2019 and 10th September 2020.
We also searched for other eligible studies by screening the reference lists of relevant articles, as well as for unpublished studies in the clinical trial database (ClinicalTrials.gov). The search strategy included all MeSH terms and free keywords found for COVID-19, SARS-CoV-2, and CQ/HCQ. The following search terms were also used: "anti-malarial", "antimalarials", "chloroquine* OR CQ OR Aralen", "hydroxychloroquine OR HCQ OR Plaquenil", and "COVID-19 OR 2019-nCoV OR SARS-COV-2 OR Wuhan virus OR coronavirus". Two authors (SSH and FM) independently screened the electronic databases and selected articles against the eligibility criteria. Discrepancies between them in the selection of articles for inclusion were resolved by discussion with a third author to achieve a consensus. A study was included if it (1) was a randomized or non-randomized controlled trial; and (2) reported the effects of CQ and/or HCQ compared to placebo and/or active comparator treatment(s) in COVID-19 patients. Studies were excluded if they were (1) observational studies, animal studies, reviews, case reports, or in vitro studies; or (2) duplicate publications. Two authors (SSH and FM) assessed the risk of bias of the studies included in the systematic review. Version 2 of the Cochrane risk-of-bias tool for randomized trials (RoB v2) [18], which is a standardized method for assessing potential bias in reports of randomized interventions, was used to assess the risk of bias in the included randomized trials. RoB v2 is structured into a fixed set of domains of bias, focusing on different aspects of trial design, conduct, and reporting. A proposed judgement about the risk of bias arising from each domain is generated by an algorithm, where the judgement can be 'Low' or 'High' risk of bias or can express 'Some concerns'. For non-randomized trials, bias was assessed using the Risk of Bias in Non-randomised Studies of Interventions (ROBINS-I) tool [19].
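As an illustration of how domain-level RoB v2 judgements can be combined into an overall judgement, the following minimal sketch applies the general rule described in the RoB 2 guidance as we understand it: overall 'Low' only if every domain is 'Low', overall 'High' if any domain is 'High', and otherwise 'Some concerns'. The function name, the `concerns_lower_confidence` flag (standing in for the reviewer's judgement that several 'Some concerns' domains together substantially lower confidence), and the example trial are assumptions made for illustration; this is not the reviewers' own procedure.

```python
from typing import Dict

# The five RoB 2 bias domains (labels paraphrased for this sketch).
ROB2_DOMAINS = (
    "randomization process",
    "deviations from intended interventions",
    "missing outcome data",
    "measurement of the outcome",
    "selection of the reported result",
)

LOW, SOME, HIGH = "Low", "Some concerns", "High"


def overall_rob2(domains: Dict[str, str],
                 concerns_lower_confidence: bool = False) -> str:
    """Combine per-domain judgements into an overall RoB 2 judgement.

    `concerns_lower_confidence` is a hypothetical flag representing the
    reviewer's judgement that multiple 'Some concerns' domains together
    substantially lower confidence in the result.
    """
    missing = set(ROB2_DOMAINS) - set(domains)
    if missing:
        raise ValueError(f"missing domain judgements: {sorted(missing)}")
    judgements = [domains[d] for d in ROB2_DOMAINS]
    if any(j not in (LOW, SOME, HIGH) for j in judgements):
        raise ValueError("judgements must be 'Low', 'Some concerns' or 'High'")

    if HIGH in judgements:
        return HIGH
    n_some = judgements.count(SOME)
    if n_some == 0:
        return LOW
    if n_some > 1 and concerns_lower_confidence:
        return HIGH
    return SOME


# Hypothetical trial: concerns about allocation concealment only.
example_trial = {d: LOW for d in ROB2_DOMAINS}
example_trial["randomization process"] = SOME
print(overall_rob2(example_trial))
```

In practice the overall judgement remains a reviewer decision; a rule like this only proposes a default that the assessors can override with justification.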
Similarly, ROBINS-I is structured into a fixed set of domains of bias and includes signalling questions that inform the risk-of-bias judgements; based on the answers to these questions, the judgement for each bias domain and for the overall risk of bias can be 'Low', 'Moderate', 'Serious', or 'Critical'. For this systematic review, we considered harms to be a continuum of all adverse treatment effects, including tolerability issues at the lower end and safety concerns at the upper end, a definition consistent with the 2004 CONSORT harms recommendations [17]. Data on harms reporting were assessed using a 10-item checklist, the 'CONSORT Extension for Harms' [17]. The 10-item checklist was adapted because a single CONSORT harms recommendation can contain multiple items of interest, and scoring multiple items within a single recommendation would have been difficult and misleading. Therefore, where appropriate, we split single CONSORT harms extension items into two or three items, resulting in a 19-item checklist. Each item of the 19-item checklist was scored individually and weighted with equal importance, in line with the CONSORT harms recommendations. Each item carried a score of '1' if it was adequately reported or '0' if it was inadequately reported or not reported at all. The total harm reporting score (THRS) was calculated by summing all the individual scores, with maximum and minimum scores of 19 and 0, respectively. All included trials were coded by the first author (FM) using the descriptors from the CONSORT Extension for Harms and subsequently cross-verified by the second author (SSH). Extracted data from all included studies were compiled into an electronic summary table.
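The scoring rule described above (19 equally weighted items, each scored 1 or 0, summed into the THRS) can be sketched as follows. This is a minimal illustration only: the function name and the placeholder item labels (`item_01` to `item_19`) are assumptions, and the example scores are hypothetical rather than taken from any included trial.

```python
from typing import Mapping

N_ITEMS = 19  # size of the expanded CONSORT harms checklist used in the review


def total_harms_reporting_score(item_scores: Mapping[str, int]) -> int:
    """Return the THRS: the sum of binary item scores.

    Each item is 1 if adequately reported, or 0 if inadequately
    reported or not reported at all. All items carry equal weight.
    """
    if len(item_scores) != N_ITEMS:
        raise ValueError(f"expected {N_ITEMS} items, got {len(item_scores)}")
    if any(score not in (0, 1) for score in item_scores.values()):
        raise ValueError("each item score must be 0 or 1")
    return sum(item_scores.values())


# Hypothetical trial that adequately reports the first six items only.
labels = [f"item_{i:02d}" for i in range(1, N_ITEMS + 1)]
hypothetical_trial = {label: int(i < 6) for i, label in enumerate(labels)}
print(total_harms_reporting_score(hypothetical_trial))
```

Validating the item count and the allowed values up front guards against a partially coded trial silently receiving a lower score.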
The following pertinent information was extracted: all-cause mortality, the requirement for mechanical ventilation, virologic clearance, radiological results, admission to the intensive care unit, confirmation of COVID-19 status, development of new symptoms, serious adverse events, and total adverse events. Further parameters of interest included the number of patients, number of controls, mean age, gender distribution, baseline disease severity, treatment option, treatment dosage, and treatment duration. We summarized the quantitative findings of individual studies and used a descriptive approach for the reporting of harms. The percentage of trials fulfilling each CONSORT Extension for Harms recommendation and the number of recommendations fulfilled by each trial were tabulated descriptively. In addition, after calculating the THRS for all included trials, we determined the median THRS along with its interquartile range. A total of 1,320 records were identified from the literature search. After screening, full texts were retrieved for 74 potentially eligible abstracts. After applying the eligibility criteria, 16 clinical trials were finally included [9, 20-34], and data were extracted from these trials (Figure 1). The 16 included trials comprised two non-randomized controlled trials [9, 23], two double-blind randomized controlled trials [26, 27], and twelve open-label randomized controlled trials [20-22, 24, 25, 28-34]. The sample size in these sixteen trials ranged from 22 to 4,716 participants. The characteristics of these trials are summarized in Table 1. Most trials (n=9) (Gautret et al. [9], Chen J et al. [20], Chen Z et al. [22], Esper et al. [23], Borba et al.
[25], Boulware et al. [26], Horby et al. [31], Abd-Elsalam et al. [32] and Mitjà and Ubals et al. [34]) provided no information on baseline disease severity, though this may not be important for the trials by Esper et al. [23], Boulware et al. [26] and Mitjà and Ubals et al. [34], since they evaluated HCQ as postexposure prophylaxis. Tang et al. [21] defined disease severity based on the Chinese guideline for the management of COVID-19. Chen C et al. [30] categorized patients into different disease severity levels based on chest radiographic findings. Mitjà and Corbacho-Monné et al. [29] enrolled patients with mild disease presenting with symptoms including fever, acute cough, shortness of breath, sudden olfactory or gustatory loss, or influenza-like illness. Skipper et al. [27] evaluated the severity of disease based on a 10-point visual analogue scale. Furtado et al. [33] included patients with severe disease based on the use of oxygen supplementation of more than 4 L/min flow, the use of a high-flow nasal cannula, the use of non-invasive positive-pressure ventilation, and the use of mechanical ventilation. Though Cavalcanti et al. [28] specified that they enrolled participants with mild-to-moderate disease, they gave no definition of these categories and no breakdown of the proportions of participants with mild or moderate disease. Huang et al. [24] recruited patients with moderate-to-severe disease without defining these categories, though they did provide the proportions of participants with moderate and severe disease. The trials by Tang et al. [21], Huang et al. [24], and Chen C et al. [30] were found to have baseline differences in disease severity between the treatment arm and the comparator arm. None of the trials statistically adjusted for disease severity at baseline. Four trials [9, 20, 21, 30] of oral HCQ used viral clearance as the outcome measure.
With a daily oral dose of HCQ 400 mg for 5 days, Chen J et al. [20] reported a negative conversion rate on day 7 in a pharyngeal swab of 86.7% in the HCQ group (n=15), compared to 93.0% in the control group (n=15). Tang et al. [21], who employed a loading plus maintenance dose regimen of HCQ, reported a probability of negative conversion by day 28 of 85.4% in the HCQ arm (n=75), compared to 81.3% in the standard of care arm (n=75). Chen C et al. [30], who also employed a loading plus maintenance dose regimen of HCQ, reported a 14-day negative conversion rate of 81.0% for the HCQ group (n=21) and 75.0% for the standard of care group (n=12). Gautret et al. [9] employed a daily oral dose of HCQ 600 mg and reported a higher viral clearance rate of 70% in the HCQ arm (n=20) compared to 12.5% in the control arm (n=16) at 6 days. Hospital admission was used as the outcome measure in three trials [23, 27, 29]. Esper et al. [23] administered HCQ with azithromycin (HCQ 800 mg on the first day and 400 mg for another 6 days, and azithromycin 500 mg once daily for five days; n=412) to patients with suspected COVID-19 and reported a lower hospitalization rate of 1.9%, compared to 5.4% in the control group who refused the trial drug (n=224). Mitjà and Corbacho-Monné et al. [29], in their trial among patients with mild symptoms of COVID-19, reported hospitalization rates of 5.9% in the HCQ arm (800 mg on day 1, followed by 400 mg once daily for another 6 days; n=136) and 7.1% in the control arm (n=157). Skipper et al. [27], who enrolled non-hospitalized adults with suspected or confirmed COVID-19, reported hospitalization rates of 1.8% and 4.7% for the HCQ group (800 mg once, followed by 600 mg in 6 to 8 hours, then 600 mg daily for another 4 days; n=212) and the placebo group (n=211), respectively. Death was included as an outcome in four trials [28, 31, 32, 33]. Cavalcanti et al.
[28] reported 15-day mortality rates of 1.7% with HCQ plus azithromycin (HCQ 400 mg twice daily plus azithromycin 500 mg once daily for 7 days), 3.1% with HCQ alone (HCQ 400 mg twice daily for 7 days), and 2.9% with control. Horby et al. [31] reported that patients randomized to hydroxychloroquine (n=1,561) had a 28-day all-cause mortality rate of 26.8%, while patients randomized to usual care (n=3,155) had a 28-day all-cause mortality rate of 25.0%. In the trial by Abd-Elsalam et al. [32], death at 28 days occurred in 6.1% of patients in the HCQ group (400 mg twice daily on day 1, followed by 200 mg twice daily for another 14 days; n=97) and 5.1% of patients in the control group (n=97). Furtado et al. [33] randomized patients to receive either HCQ (400 mg twice daily for 10 days; n=183) or HCQ plus azithromycin (HCQ 400 mg twice daily plus azithromycin 500 mg once daily for 10 days; n=214) and reported that death at 29 days occurred in 40% of patients in the HCQ group and 42% in the HCQ plus azithromycin group. The other outcome measure in HCQ trials was radiological lung clearance: Chen Z et al. [22] employed a dosing regimen of 400 mg daily and reported an improvement in radiological results on day 6 in 80.6% of patients in the HCQ arm (n=31) versus 54.8% of patients in the control arm (n=31). Boulware et al. [26] assessed the incidence of either laboratory-confirmed COVID-19 or illness compatible with COVID-19 within 14 days of administration of either HCQ (800 mg once, followed by 600 mg in 6 to 8 hours, then 600 mg daily for 4 additional days) or placebo in people who had been exposed to confirmed COVID-19 cases, and reported a lower incidence of 11.8% in the HCQ arm (n=414) compared to 14.3% in the placebo arm (n=407). Mitjà and Ubals et al.
[34] compared the incidence of laboratory-confirmed symptomatic COVID-19 within 14 days of administration of either HCQ (800 mg on day 1, followed by 400 mg once daily for six days; n=1,116) or usual care (n=1,198) in healthy contacts of COVID-19 index cases and observed that the incidence was lower in the HCQ arm (5.7%) than in the usual care arm (6.2%). For CQ trials, Huang et al. [24] compared CQ in a regimen of 500 mg orally twice daily for 10 days (n=10) with lopinavir/ritonavir in a regimen of 400 mg/100 mg for 10 days (n=12) and observed that all patients in the CQ arm achieved virologic clearance on day 14, while eleven out of twelve patients in the lopinavir/ritonavir arm achieved virologic clearance on day 14. In terms of lung clearance rate based on computed tomography imaging, 60% in the CQ group achieved radiological lung clearance by day 9, compared with 25% in the lopinavir/ritonavir group. Borba et al. [25] compared high (600 mg twice daily for 10 days; n=41) and low (450 mg twice daily on day 1 and once daily for 4 days; n=40) doses of CQ and reported a higher mortality rate (17.5%) with the high-dose regimen compared to the low-dose regimen (9.7%). Tang et al. [21], in their HCQ trial, noted a higher rate of any adverse events among HCQ recipients (n=21/70; 30.0%) compared to those who received standard of care (n=7/80; 8.8%). Similarly, Boulware et al. [26] reported a higher rate of any adverse events in HCQ users (140/414; 40.1%) relative to placebo (n=59/407; 16.8%). In both trials, the most common adverse event among the HCQ recipients was diarrhea (n=7/70; 10.0% in Tang et al. [21] and n=81/414; 19.6% in Boulware et al. [26]). This was also the case in a larger trial by Esper et al. [23], who reported diarrhea (n=68/412; 16.5%) as the most common adverse event among 412 patients who received HCQ. In both the trials by Mitjà et al.
[29, 34], there was also a higher rate of any adverse events in the HCQ arm compared to the control/placebo arm (n=121/169; 72.0% versus n=16/184; 8.7% in Mitjà and Corbacho-Monné et al. [29] and n=671/1197; 51.6% versus n=77/1300; 5.9% in Mitjà and Ubals et al. [34]). The most frequent adverse events reported among participants given HCQ in both the trials by Mitjà et al. [29, 34] were related to the gastrointestinal system (diarrhea, nausea, and abdominal pain), without the individual adverse events being presented separately. Similarly, Skipper et al. [27] reported more frequent adverse events in the HCQ group (n=92/212; 43.4% versus n=46/211; 21.8%), with upset stomach/nausea being the most frequently reported adverse event (n=66/212; 31.1%). Though more adverse events were also reported in the trial by Cavalcanti et al. [28] in patients who received HCQ plus azithromycin (n=94/239; 39.3%) or HCQ alone (n=67/199; 33.7%) than in those who received azithromycin alone (n=9/50; 18.0%) or none of the trial drugs (n=40/177; 22.6%), a QTc interval >480 msec within 7 days was the most frequent adverse event encountered in patients who received HCQ (n=17/116; 14.7% in the HCQ plus azithromycin group and n=13/89; 14.6% in the HCQ alone group). Chen Z et al. [22] reported two patients (3.2%) with mild adverse reactions in the HCQ group (n=31), where one patient developed a rash and one patient experienced a headache, while none experienced an adverse event in the control group. Furtado et al. [33] did not compare the proportion of patients with any adverse events between HCQ recipients and HCQ plus azithromycin recipients, though they reported a higher proportion of serious adverse events among HCQ plus azithromycin recipients (n=102/241; 42%) relative to HCQ recipients (n=75/198; 38%). Similarly, Chen C et al.
[30] did not compare the proportion of patients with adverse events between the two study arms, though they reported headache as the most frequent grade 1 and 2 HCQ-related adverse event. Huang et al. [24], in their CQ trial, observed that almost all of the patients (n=9/10) experienced CQ-related adverse events, with the most common events being vomiting (n=5) and diarrhea (n=5). Borba et al. [25], who compared high- and low-dose regimens of CQ, reported that a higher proportion of patients (in the safety population) who received high-dose CQ (n=7/37; 18.9%) experienced QT prolongation compared to those who received low-dose CQ (n=4/36; 11.1%). Among the trials [21-30, 33, 34] which presented harm data (n=12), all trials presented harm data in both text and tables except the trial by Chen C et al. [30], which presented harm data in text only. Nevertheless, only four [26-29] out of the twelve trials described the scale or any criteria employed to measure the severity of AEs. In all but four trials [26, 27, 33, 34], safety data were presented as frequencies only, without a statistical comparison of the occurrence of adverse events between the investigational and control arms. All but two trials, by Esper et al. [23] and Skipper et al. [27], attributed adverse events to HCQ. Nine trials [21, 22, 24, 27-30, 33, 34] reported both frequent and serious adverse events, while the other three trials [23, 25, 26] reported only adverse events selected by the investigators. Out of a maximum score of 19, the median THRS was 6.5 (interquartile range=5). Out of the total 19 CONSORT items, all but five trials [25, 27-29, 33] reported less than 50% of the items (THRS range: 1 to 9). Only two trials, by Borba et al. [25] and Cavalcanti et al. [28], reported more than 60% of the items (THRS=13).
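To illustrate the kind of between-arm comparison that most trials omitted when presenting harms as frequencies only, the following minimal sketch computes an unadjusted risk ratio with an approximate 95% confidence interval using the standard log-transformation (Katz) method. The counts are those given in the text for any adverse event in Tang et al. [21] (21/70 with HCQ versus 7/80 with standard of care); the function name is an assumption, and the calculation is illustrative rather than a re-analysis of that trial.

```python
import math
from typing import Tuple


def risk_ratio_ci(events_a: int, n_a: int,
                  events_b: int, n_b: int,
                  z: float = 1.96) -> Tuple[float, float, float]:
    """Unadjusted risk ratio (arm A vs arm B) with a log-method CI.

    Uses SE(log RR) = sqrt(1/a - 1/n_a + 1/b - 1/n_b); z = 1.96 gives an
    approximate 95% interval. Requires at least one event in each arm.
    """
    if events_a <= 0 or events_b <= 0:
        raise ValueError("log method needs at least one event per arm")
    if events_a > n_a or events_b > n_b:
        raise ValueError("events cannot exceed arm size")
    rr = (events_a / n_a) / (events_b / n_b)
    se_log_rr = math.sqrt(1 / events_a - 1 / n_a + 1 / events_b - 1 / n_b)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Any adverse event, HCQ vs standard of care, counts as reported in the text.
rr, lower, upper = risk_ratio_ci(21, 70, 7, 80)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With small event counts such as these, the interval is wide, and an exact method may be preferable; the sketch is meant only to show that a simple comparison is feasible from the frequencies the trials already report.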
The number (and percentages) of the RCTs fulfilling each of the CONSORT harms recommendations are presented in Table 2. Scoring for each recommendation can be found in Table S1. Nine trials [21, 22, 25-29, 33, 34] mentioned AEs in the title or abstract (CONSORT recommendation 1). However, only five trials [23, 25, 28, 32, 33] provided information on AEs in the introduction section (CONSORT recommendation 2). Though only five trials [26-29, 31] used a validated scale to measure the severity of AEs, half of the trials (n=8) [21, 23, 25, 26, 28-30, 33] defined the AEs (CONSORT recommendation 3) in their report. Only seven trials [20, 25-29, 34] described how AE-related data were collected (CONSORT recommendation 4 (4a)), but more than half (n=9) [20, 22, 23, 25-29, 34] of the trials described when AE data were collected (CONSORT recommendation 4 (4b)). Only two trials [25, 28] described methods of presenting and/or analyzing AEs (CONSORT recommendation 5). Less than half of the trials (n=7) [21, 22, 24-27, 34] described the number of withdrawals due to AEs in each arm (CONSORT recommendation 6 (6a)). More than half (n=10) [20, 24-30, 33, 34] of the trials provided denominators for AEs (CONSORT recommendation 7 (7a)). The majority of trials presented results for each arm separately (n=13) [20-23, 25-31, 33, 34] and presented a balanced discussion of both the safety and efficacy of the drug (n=11) [21, 24, 25, 27-34] (CONSORT recommendations 8 (8a) and 10 (10a)). The trials by Esper et al. [23] and Cavalcanti et al. [28] were the only two trials that described subgroup analyses and exploratory analyses for harms (CONSORT recommendation 9).
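The descriptive tabulation used here (the percentage of trials fulfilling each checklist item, and the median and interquartile range of per-trial totals) can be sketched as follows. The score matrix is a toy example with four hypothetical trials and four items, not the review's 19-item data, and the function names are assumptions. Note that the interquartile range depends on the quantile convention; this sketch uses the "inclusive" method of Python's `statistics.quantiles`, which may differ from the software used in the review.

```python
from statistics import median, quantiles
from typing import Dict, List, Tuple


def percent_fulfilling(scores: Dict[str, List[int]]) -> List[float]:
    """Percentage of trials scoring 1 on each checklist item."""
    rows = list(scores.values())
    n_items = len(rows[0])
    if any(len(row) != n_items for row in rows):
        raise ValueError("all trials must be scored on the same items")
    return [100.0 * sum(row[i] for row in rows) / len(rows)
            for i in range(n_items)]


def summarize_totals(scores: Dict[str, List[int]]) -> Tuple[float, float]:
    """Median and interquartile range of per-trial total scores."""
    totals = sorted(sum(row) for row in scores.values())
    q1, _, q3 = quantiles(totals, n=4, method="inclusive")
    return median(totals), q3 - q1


# Toy data: hypothetical trials scored on four hypothetical items.
toy_scores = {
    "trial_A": [1, 1, 0, 0],
    "trial_B": [1, 0, 0, 0],
    "trial_C": [1, 1, 1, 0],
    "trial_D": [0, 1, 1, 1],
}
print(percent_fulfilling(toy_scores))
print(summarize_totals(toy_scores))
```

Keeping the item-level matrix, rather than only the totals, makes it possible to report both views from the same coded data.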
The risk of bias analysis using the RoB v2 and ROBINS-I frameworks for the sixteen trials included in this review is summarized in Figure 2. Surprisingly, none of the sixteen clinical trials scored an overall 'low risk'; four trials scored a high [21, 26], critical [9], or serious [23] risk, and the remaining trials [20, 22, 24, 25, 27-34] were classified as having a moderate risk. Among the fourteen randomized controlled trials [20-22, 24-34] included in this review, which had their risks of bias assessed using RoB v2, all had at least some concerns of risk of bias in at least one of the bias domains. The most frequently affected domain across the fourteen randomized controlled trials was bias in the measurement of the outcome, for which eleven of the trials [21, 22, 25, 26, 28-34] had at least some concerns. Such concerns were due to the possibility that assessments of outcomes, either efficacy outcomes or the occurrence of adverse events, would be affected by the outcome assessors' knowledge of the intervention assignment owing to the open-label study design. The trial by Boulware et al. [26] had a particularly high risk of bias in the measurement of the outcome because 24% of those randomized to receive study treatment were excluded from the analysis of the occurrence of adverse events. Though similar proportions of participants were excluded from this analysis in the two groups, the reasons for exclusion differed between the two arms and are likely to be related to the outcome. Also, the trial by Boulware et al.
[26] had some concerns of bias due to missing outcome data, since there were critical differences between interventions in the proportion of participants with missing data (seventeen HCQ-treated patients discontinued treatment, compared with eight in the placebo group). The trial by Tang et al. [21] had an overall high risk of bias, with a particularly high risk of bias arising from the randomization process, since it was an open-label study and the authors did not assess baseline differences between intervention groups. Also, a high risk of bias was noted in the trial by The trial by Chen J et al. [20] was noted to have the least risk of bias, with some concerns over the randomization process since no information was provided about the concealment of the allocation sequence, while all the other domains had a low risk of bias. The trial by Borba et al. [25] had some concerns over the risk of bias in the selection of the reported results due to the early interruption of the high-dose arm, the unmasking of treatment allocation in some participants, and the missing data on radiological findings. Both the trials by Mitjà and Corbacho-Monné et al. [29] and Skipper et al. [27] had some concerns over the risk of bias due to deviations from the intended intervention. In addition, the trial by Mitjà and Corbacho-Monné et al. [29] had some concerns over the risk of bias in the selection of the reported result due to the unavailability of the study protocol and statistical analysis plan. The trial by Skipper et al. [27], on the other hand, also had some concerns over the risk of bias arising from the randomization process because 20% of the participants assigned to the intervention and control arms were also randomized in a separate trial. Moreover, with nearly 5% of participants with missing data for the mortality outcome and more than 10% of participants with missing data for the primary outcome in the trial by Skipper et al.
[27], there appear to be some concerns over the risk of bias due to missing outcome data. Though the trial by Chen C et al. [30] had a low risk of bias due to deviations from the intended intervention, there were some concerns over the risk of bias arising from the randomization process, due to differences between the two study arms at baseline in the proportions of participants with mild and moderate disease severity, and some concerns over the risk of bias in the selection of the reported result, due to the unavailability of the study protocol and statistical analysis plan. The risk of bias for the non-randomized controlled trials [9, 23] was assessed using the ROBINS-I tool. Across the seven domains of the risk of bias in the ROBINS-I tool, the trial by Gautret et al. [9] had at least a moderate risk of bias in all domains except bias in the measurement of outcomes (low risk). In particular, the trial by Gautret et al. [9] had a critical risk of bias due to missing data, since there were critical differences between the two intervention arms in the proportion of participants with missing data (six HCQ-treated patients were lost to follow-up, compared with none in the control group), and a serious risk of bias due to confounding, owing to baseline differences in participants' characteristics between the treatment and control arms. Also, the trial had a serious risk of bias due to deviations from This systematic review aimed to critically assess and summarize the quality of published clinical trials evaluating the effectiveness of CQ or HCQ in COVID-19. Overall, the sixteen trials [9, 20-34] included in our review recruited a total of 10,873 participants (5,036 patients in the HCQ/CQ group and 5,837 patients in the comparator group).
Potentially, a meta-analysis of these trials would have definitively answered the effectiveness and safety questions regarding CQ or HCQ in COVID-19, but given the methodological inadequacies in these trials, a meta-analysis is likely to produce misleading results. Since HCQ/CQ was publicized in the press as a potential cure for COVID-19 based on promising preliminary clinical experiences [35] and in vitro investigations [36], several trials have been designed across the world to test the effectiveness and safety of these antimalarial drugs for the treatment of COVID-19. However, we noticed that most of these trials recruited a fairly small number of participants and hence lacked statistical power. Upon critical review of the trial design and reporting, we noticed that the findings from these trials were far from conclusive: inferences were contradictory, and findings were considerably confounded by covariates such as comorbidities and/or use of co-interventions, along with a lack of adjustment for disease severity at baseline. Indeed, most trials provided no information on the baseline severity of recruited patients. Moreover, all trials were associated with a moderate to high risk of bias, including bias arising from the randomization process, potential deviation from intended interventions, outcome measurement, selective reporting, confounding, participant selection, and/or classification of interventions. Also, none of the trials included in this review met the CONSORT criteria in full for reporting harms, and a few trials presented no harms data.
Despite the history of safe use of HCQ/CQ, adverse events such as toxic retinopathy and, especially, QTc prolongation may have fatal consequences; therefore, inadequate reporting of adverse events in these trials can be misleading and pose serious risks to public safety, particularly with more widespread use during the COVID-19 pandemic. This is not surprising given that, amid the current pandemic, researchers may place a stronger emphasis on benefits than on risks in an attempt to save lives. It is, however, acknowledged that there are genuine methodological challenges in designing studies during the COVID-19 pandemic crisis and that there is a great sense of urgency for the containment of the pandemic. A recent editorial by Knottnerus et al. [12] has critically summarized these issues. Nevertheless, the quality of trial design and reporting is imperative for the adoption of trial findings in patient care. Clinicians are eagerly looking for definitive answers as to what works and what does not, and they may not have sufficient time amid the COVID-19 emergency to critically appraise every single trial. Complete and accurate reporting is, therefore, invaluable to inform policy makers and guide clinical decisions. Furthermore, the responsibility to ensure a greater balance between the reporting of benefits and harms lies with the authors and the journals publishing those trials. Whilst we acknowledge that the limited availability of space in journals often leads to selective outcome reporting, this should not be an excuse, since it can easily be overcome by reporting supplementary data. We hope that this review will encourage clinical researchers to better design, conduct, and report trials, and to uphold the principles of evidence-based medicine even amid a global health emergency like COVID-19. The strengths of our review lie in the comprehensive literature search.
We used multiple databases such as PubMed and EMBASE, a clinical trial registry, and a COVID-19-specific database (DIMENSIONS) to search for relevant clinical trials. Furthermore, we utilized standardized quality reporting tools and methods employed in the synthesis of evidence: for research synthesis, we followed standard PRISMA guidelines in the searching, selection, inclusion, and exclusion of studies, as well as in data extraction. We evaluated the harms reporting and risk of bias in the included trials using the standardized CONSORT harms recommendations and the RoB v2 and ROBINS-I tools, respectively. Although we used all possible terms, including free-text terms and MeSH terms, to search for relevant HCQ/CQ clinical trials, the sensitivity of our strategy is still unknown. In addition, as with most studies examining design and reporting methods, assessment is challenging, particularly when the reporting of these aspects is incomplete. An example of this is the reporting of baseline disease activity in the HCQ/CQ trials, where it can be difficult to determine the methodology used as it was often not defined.

There are implications of this work for the design and conduct of randomized controlled trials for the treatment of COVID-19. This review suggests that the basic requirements for designing and conducting randomized controlled trials should never be compromised and that a standardized protocol must be used and followed. It is also important that confounders are always identified and adjusted for. To uphold public trust in medical practice amid the COVID-19 pandemic, randomized controlled trials should be designed and reported more exhaustively, particularly when evaluating the effects of a specific treatment that could be potentially life-saving.
The authors should always report all adverse events experienced by patients during a trial to optimal quality standards (CONSORT harms recommendations), in which all adverse events are explicitly described (instead of a mere selection), and provide details of patients who dropped out due to adverse events. Definitions of baseline disease activity should always be provided (and appropriately adjusted for if there are differences) to determine whether a particular treatment is effective in the mild, moderate, severe, or critical stage of the disease.

Given the quality of evidence available, it is not possible to draw a meaningful conclusion on the effectiveness and safety of CQ or HCQ for the treatment of patients with COVID-19. The quality of evidence should be carefully considered when making clinical and policy decisions, particularly during a pandemic. The importance of designing and reporting trials properly cannot be overemphasised for the synthesis of clinical evidence, and its dismissal, entirely or partially, amid pandemic crises could not only lead to a waste of invaluable healthcare resources but may also risk precious lives.

The authors declare that they have no competing interests.

Director-General's remarks at the media briefing on
An interactive web-based dashboard to track COVID-19 in real time
Clinical features of patients infected with 2019 novel coronavirus in Wuhan, China
Global Covid-19 Case Fatality Rates [Internet] UK: Centre for Evidence-Based Medicine
Features of 20 133 UK patients in hospital with covid-19 using the ISARIC WHO Clinical Characterisation Protocol: prospective observational cohort study
What if hydroxychloroquine doesn't work? What if it does? Right now, we don't know
Effects of chloroquine on viral infections: an old drug against today's diseases?
Is it worth the wait? Should Chloroquine or Hydroxychloroquine be allowed for immediate use in CoViD-19?
Hydroxychloroquine and azithromycin as a treatment of COVID-19: results of an open-label non-randomized clinical trial
What do we know about Hydroxychloroquine?
Methodological challenges in studying the COVID-19 pandemic crisis
Reporting of safety data from randomised trials
Reporting of Adverse Effects in Clinical Trials Should Be Improved
Adverse event reporting in randomised controlled trials of neuropathic pain: considerations for future practice
Quality of reporting of harms in randomised controlled trials of pharmacological interventions for rheumatoid arthritis: a systematic review
Better reporting of harms in randomized trials: an extension of the CONSORT statement
The Cochrane Collaboration's tool for assessing risk of bias in randomised trials
ROBINS-I: a tool for assessing risk of bias in non-randomised studies of interventions
A pilot study of hydroxychloroquine in treatment of patients with common coronavirus disease-19 (COVID-19) (published online March 6, 2020)
Hydroxychloroquine in patients with mainly mild to moderate coronavirus disease 2019: open label, randomised controlled trial
Efficacy of hydroxychloroquine in patients with COVID-19: results of a randomized clinical trial
Empirical treatment with hydroxychloroquine and azithromycin for suspected cases of COVID-19 followed-up by telemedicine
Treating COVID-19 with Chloroquine
Effect of High vs Low Doses of Chloroquine Diphosphate as Adjunctive Therapy for Patients Hospitalized With Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) Infection: A Randomized Clinical Trial
A Randomized Trial of Hydroxychloroquine as Postexposure Prophylaxis for Covid-19
Hydroxychloroquine in Nonhospitalized Adults With Early COVID-19: A Randomized Trial
Hydroxychloroquine with or without Azithromycin in Mild-to-Moderate Covid-19
Hydroxychloroquine for Early Treatment of Adults with Mild Covid-19: A Randomized-Controlled Trial
A Multicenter, randomized, open-label, controlled trial to evaluate the efficacy and tolerability of hydroxychloroquine and a retrospective study in adult patients with mild to moderate Coronavirus disease 2019 (COVID-19)
Effect of Hydroxychloroquine in Hospitalized Patients with COVID-19: Preliminary results from a multi-centre, randomized, controlled trial
Hydroxychloroquine in the Treatment of COVID-19: A Multicenter Randomized Controlled Study
Azithromycin in addition to standard of care versus standard of care alone in the treatment of patients admitted to the hospital with severe COVID-19 in Brazil (COALITION II): a randomised clinical trial
A Cluster-Randomized Trial of Hydroxychloroquine as Prevention of Covid-19 Transmission and Disease
Hydroxychloroquine, a less toxic derivative of chloroquine, is effective in inhibiting SARS-CoV-2 infection in vitro
Breakthrough: Chloroquine phosphate has shown apparent efficacy in treatment of COVID-19 associated pneumonia in clinical studies

CQ: chloroquine; HCQ: hydroxychloroquine; NR: not reported; ICU: intensive care unit; SOC: standard of care; RCT: randomized controlled trial; COPD: chronic obstructive pulmonary disease; HIV: human immunodeficiency virus

The authors of this review received no funding.

10. Provide a balanced discussion of benefits and harms with emphasis on study limitations, generalisability and other sources of information on harms.
10a. If the discussion was balanced with regard to efficacy and AEs: 9 (64)
10b. Limitations of the study specifically in relation to AEs discussed
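
The count-and-percentage figure reported for checklist item 10a, 9 (64), can be reproduced with a short tally. The following is a minimal sketch, not the authors' method: the helper names `n_pct` and `tally_items` are hypothetical, and the denominator of 14 trials is an assumption (it is consistent with 9/14 ≈ 64% and with the fourteen randomized trials included in the review, but the row itself does not state it). The per-trial judgements in the example are placeholders, not extracted data.

```python
from typing import Dict, List


def n_pct(count: int, total: int) -> str:
    """Format a count as 'n (%)', with the percentage rounded to a whole number."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= count <= total:
        raise ValueError("count must lie between 0 and total")
    return f"{count} ({round(100 * count / total)})"


def tally_items(assessments: Dict[str, List[bool]]) -> Dict[str, str]:
    """For each checklist item, summarise how many trials met it as 'n (%)'."""
    return {item: n_pct(sum(met), len(met)) for item, met in assessments.items()}


# Placeholder judgements (assumed, not real data): 9 of 14 trials meet item 10a.
example = {"10a": [True] * 9 + [False] * 5}
print(tally_items(example))  # {'10a': '9 (64)'}
```

Keeping one boolean per trial and per item, rather than only the totals, would also make it straightforward to report which trials failed a given item.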