key: cord-1049511-tjx9erfy authors: Yanis, Roussel; Matthieu, Million; Eric, Chabriere; Jean-Christophe, Lagier; Raoult, Didier title: Be careful with Big Data: Re-analysis of Patient Characteristics and Outcomes of 11,721 Patients with COVID19 Hospitalized Across the United States date: 2020-10-22 journal: Clin Infect Dis DOI: 10.1093/cid/ciaa1618 sha: 79adc828a0b5fb563107261aa053fa42c0a9f94c doc_id: 1049511 cord_uid: tjx9erfy

12.2%). Seven percent of patients were intubated on the first day of hospitalization. While the mechanical ventilation subgroup of the remdesivir group is analyzed in the supplementary data, no such analysis is provided for the hydroxychloroquine group. In the hydroxychloroquine group, 93% of patients had pneumonia, versus 79% in the non-hydroxychloroquine group (Supplemental Table 3 in Fried et al. (1)). The risk of being trapped in a Simpson's paradox-like situation (3) is high.

This brings us to our main point of concern, which is the reliability of the data used in this article. Seven of the 11 authors are affiliated with a data collection company, namely Target Pharmasolutions. In this article, they provide little detail on how the data were collected. The authors state that the data come "from a commercial insurance claims database that requires a data sharing agreement and data license for access". They also specify that the data "were acquired from a commercially available source representing adults receiving inpatient care between February 15 and April 20, 2020 at 245 hospitals across 38 states in the US". The hospital names are not provided, nor is it stated whether these hospitals agreed to have their data used in such a study. The information available on the Target Pharmasolutions company website does not provide further details on the data collection mechanism.

There are some points that catch our attention.
For instance, we do not understand how 99.4% of patients treated with hydroxychloroquine were treated in urban hospitals, compared to 65% of untreated patients (Supplemental Table 3 in Fried et al. (1)), while patients are distributed in a more balanced manner between teaching and non-teaching hospitals, as well as between the most urbanized (Northeast) and less urbanized (Midwest) regions of the United States. Likewise, the mortality rate of 70.5% among patients under mechanical ventilation (Table 2 in

References

Epub ahead of print
Effectiveness of hydroxychloroquine was hiding in plain sight
The Interpretation of Interaction in contingency tables. Journal of the Royal Statistical Society, Série
RETRACTED: Hydroxychloroquine or chloroquine with or without a macrolide for treatment of COVID-19: a multinational registry analysis. Jun 5. Erratum in: Lancet
Cardiovascular Disease, Drug Therapy, and Mortality in Covid-19

The authors declare no competing interests. Funding sources had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; and preparation, review, or approval of the manuscript. Our Marseille group used widely available generic drugs distributed by many pharmaceutical companies.
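As an illustration of the Simpson's paradox-like risk raised above for the pneumonia imbalance, the short Python sketch below shows how a treated group can have lower mortality within each stratum and still show higher mortality when strata are pooled. All death and patient counts are hypothetical and were chosen only to produce the effect; they are not taken from Fried et al. Only the 93% versus 79% pneumonia split mirrors the figures quoted in this letter, and the names `counts`, `stratum_rate` and `pooled_rate` are ours.

```python
# HYPOTHETICAL counts, for illustration only (not data from Fried et al.).
# Each entry is (deaths, patients). The stratum sizes reproduce the
# 93% vs 79% pneumonia split quoted in the letter; the deaths are invented.
counts = {
    "pneumonia": {"treated": (1860, 9300), "untreated": (1738, 7900)},
    "no pneumonia": {"treated": (14, 700), "untreated": (63, 2100)},
}


def stratum_rate(stratum, group):
    """Mortality rate of one group within one stratum."""
    deaths, patients = counts[stratum][group]
    return deaths / patients


def pooled_rate(group):
    """Crude mortality rate of one group, ignoring the stratification."""
    deaths = sum(counts[s][group][0] for s in counts)
    patients = sum(counts[s][group][1] for s in counts)
    return deaths / patients


for stratum in counts:
    t = stratum_rate(stratum, "treated")
    u = stratum_rate(stratum, "untreated")
    print(f"{stratum}: treated {t:.2%}, untreated {u:.2%}")

print(f"pooled: treated {pooled_rate('treated'):.2%}, "
      f"untreated {pooled_rate('untreated'):.2%}")
```

With these invented numbers, the treated group has the lower mortality in both strata (20% versus 22% with pneumonia, 2% versus 3% without) but the higher crude mortality (18.74% versus 18.01%), simply because a larger share of treated patients belongs to the high-risk stratum. This is why an unstratified comparison of groups with unequal baseline severity can be misleading.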