key: cord-0772179-nk50q2dk
authors: Desai, Angel; Nouvellet, Pierre; Bhatia, Sangeeta; Cori, Anne; Lassmann, Britta
title: Data journalism and the COVID-19 pandemic: opportunities and challenges
date: 2021-09-20
journal: Lancet Digit Health
DOI: 10.1016/s2589-7500(21)00178-3
sha: 21a00f781840517a78f9c409303db1d095632a55
doc_id: 772179
cord_uid: nk50q2dk

Non-traditional disease surveillance tools, including news media reporting, have disseminated event-based information during past disease outbreaks [2]. The current global health crisis has highlighted the additional possibilities that so-called data journalism can offer. While the news media has traditionally reported on events of public health importance, media outlets over the course of the COVID-19 pandemic have also conducted data collation, including detailed summaries of case counts and deaths, data curation, and, in some instances, analysis (table).

Before COVID-19 was declared a Public Health Emergency of International Concern, news reports served as key data sources to further understand disease transmission and spread. Academic institutions and researchers assembled early epidemiological data scattered across various news articles to inform risk assessments, forecasts, and policy decisions [3]. The relative dearth of traditional public health data at the beginning of an epidemic is not a new phenomenon. During the 2014-15 west Africa Ebola outbreak, for example, early epidemiological data were often only available through local and international news media articles [4].

As the COVID-19 pandemic evolved, and in response to epidemiological data gaps, news media outlets began to collect and synthesise data for scenarios involving congregate settings such as schools, large public events, and household transmission. In some cases, media have actively reached out and solicited case counts from their readers, a strategy known as participatory surveillance, effectively recruiting the public back into public health (table).
News media outlets have also been among the first to systematically collect, aggregate, and analyse excess death counts. For example, the data behind the Financial Times tracker for COVID-19 excess deaths dates to April, 2020; the tracker is open access, and the code and methodology used to clean, analyse, and present the data are available on GitHub. The Economist and The New York Times have also provided their own analyses on excess deaths (table).

As the COVID-19 pandemic has shown, there is an urgent need for real-time data that can inform risk assessments to guide public health interventions. While traditional data collection remains the cornerstone of outbreak response, public health programmes and information technology infrastructure are chronically underfunded in many countries and are not always well positioned to collect contextual information in a flexible manner. This is of particular concern early in an outbreak, when traditional data sources might lag in reporting cases. Another key need for epidemic forecasting and risk assessments is data surrounding non-pharmaceutical interventions such as physical distancing, school closures, and lockdowns [5]. Interventions differ regionally and implementation timelines are often not readily disseminated. While non-traditional disease surveillance systems have begun to fill some of these gaps, more can be done.

Partnerships between academic research centres and news media should be considered, given their complementary strengths; indeed, collaboration between these entities might mitigate their respective weaknesses as well. While news media can rapidly aggregate and disseminate information, they might be unable to sustain these efforts following the course of an outbreak. Likewise, research centres might be able to continue collating and analysing data long after an outbreak has ended, but might be unable to collect relevant information in a timely manner early in an outbreak.
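To make the excess-death analyses mentioned above more concrete, the following is a minimal sketch of the underlying arithmetic: excess deaths are observed deaths minus an expected baseline. It assumes a simple baseline (the mean of the same week across prior years), which is an illustrative choice and not necessarily the method used by any of the trackers named here. All function names and numbers are hypothetical and are not real data.

```python
from statistics import mean


def expected_deaths(historical):
    """Baseline per week: mean deaths for that week across prior years.

    `historical` maps year -> list of weekly death counts (same length each).
    This simple average is an assumption made for illustration only.
    """
    years = list(historical.values())
    n_weeks = len(years[0])
    return [mean(year[week] for year in years) for week in range(n_weeks)]


def excess_deaths(observed, baseline):
    """Excess deaths per week: observed minus expected."""
    return [obs - exp for obs, exp in zip(observed, baseline)]


# Hypothetical illustrative counts for three weeks (not real data).
historical = {
    2017: [100, 110, 105],
    2018: [102, 108, 107],
    2019: [98, 112, 103],
}
observed_2020 = [120, 150, 140]

baseline = expected_deaths(historical)
excess = excess_deaths(observed_2020, baseline)
print(excess)
```

Published analyses typically refine the baseline, for example by adjusting for trends or reporting delays, so this sketch should be read only as the core subtraction on which such trackers build.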
It is also important to note that news media data and data visualisations, while informative, differ from peer-reviewed literature. Divergent incentives, intended audiences, and analytic methodologies can result in very different outputs and conclusions. Supporting collaborations between news media outlets that can provide an expedient data stream and academic institutions that can support targeted analyses could be an important step towards improving outbreak response timeliness in the future.

In response to the COVID-19 pandemic, several global epidemiological data collection and harmonisation efforts have been initiated to provide guidance on conforming case definitions, data formatting, and data sharing. As these efforts are further developed, data collected by media outlets could be integrated for use by researchers and policy makers, although regulatory issues surrounding data privacy will need to be addressed. Cross-collaborations between academic groups and the media should be encouraged and the role of the media in curating, analysing, and sharing epidemiological information that is otherwise hard to collect should be recognised. While these efforts should be considered complementary to traditional public health endeavours, the rapid dissemination of accurate, real-time information remains paramount in the face of current and future communicable disease outbreaks.

We declare no competing interests.

Copyright © 2021 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license.

USA (AD); School of Life Sciences MRC Centre for Global Infectious Disease Analysis, School of Public Health, Faculty of Medicine

1. Real-time epidemic forecasting: challenges and opportunities
2. Digital disease detection: harnessing the Web for public health surveillance
3. Report 1: estimating the potential total number of novel Coronavirus cases in Wuhan City, China. Version 2. Imperial College COVID-19 Response Group
4. Key data for outbreak evaluation: building on the Ebola experience
5. COVID-19 government response tracker