key: cord-0872362-elq4rrdi authors: Bisanzio, Donal; Kraemer, Moritz U G; Brewer, Thomas; Brownstein, John S; Reithinger, Richard title: Geolocated Twitter social media data to describe the geographic spread of SARS-CoV-2 date: 2020-07-23 journal: J Travel Med DOI: 10.1093/jtm/taaa120 sha: b0a15726a04af4b88a6e959ccc739ec8508a4975 doc_id: 872362 cord_uid: elq4rrdi

Openly available, geotagged Twitter data from 2013 to 2015 were used to estimate 2019–2020 human mobility patterns in and outside of China and to predict the spatiotemporal spread of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Outside of China, the countries with the highest numbers of visiting Twitter users were the USA, Japan, the UK, Germany and Turkey. Country-level Twitter user visits were highly correlated with reported cases.

As of 1 August 2020, 17,396,943 confirmed cases of coronavirus disease 2019 (COVID-19), including 675,060 deaths, had been reported in >225 countries since December 2019. 1 On 11 March 2020, the World Health Organization declared COVID-19 a pandemic. We show how geolocated Twitter data can be used to predict the global spatiotemporal spread of reported COVID-19 cases from China and to identify areas at high risk of secondary outbreaks. We applied an analytical approach previously used to study dengue transmission dynamics 2 ; an analysis during the early stages of COVID-19 predicted 74.1% of the locations where cases would occur. 3 Briefly, we used a convenience sample of openly available, geotagged Twitter data from 2013 to 2015 to estimate 2019–2020 human mobility patterns in and outside of China; at a global scale, mobility has been shown to be fairly stable over long periods of time.
4 Human mobility patterns were estimated by analyzing the Twitter data from users who had:

Several locations identified in our analyses, including London, Singapore, Tokyo and Bangkok, were also previously identified as possible locations for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spread in an analysis using 2019 International Air Transport Association data. 6 We used geolocated tweets instead of other data such as flights, census surveys, internet traffic and mobile phone activity, because these sources do not necessarily allow travellers' intermediate or final destinations to be identified (e.g. flight data capture only the flight route but not the cities visited; mobile phone data do not capture overseas trips). 6,7 Our analyses show that geolocated Twitter data can be used to describe the spread of a novel, highly transmissible agent such as SARS-CoV-2 and to identify areas at high risk of importation. This would allow public health authorities to develop appropriate response plans and to start sensitizing public health providers and the population to the impending risk of exposure to such an agent.

The authors declare no potential conflict of interest.

This study was conducted on institutional overhead funds. The opinions, results and conclusions reported in this paper are those of the authors and are independent from funding sources of the authors' respective institutions and employers.

1. WHO. Coronavirus disease 2019
2. Inferences about spatiotemporal variation in dengue virus transmission are sensitive to assumptions about human mobility: a case study using geolocated tweets from Lahore
3. Use of Twitter social media activity as a proxy for human mobility to predict the spatiotemporal spread of COVID-19 at global level
4. Unraveling daily human mobility motifs
5. An interactive web-based dashboard to track COVID-19 in real time
6. Potential for global spread of a novel coronavirus from China
7. Measuring mobility, disease connectivity and individual risk: a review of using mobile phone data and mHealth for travel medicine

Study design, data analysis, data interpretation, writing (Donal Bisanzio; Richard Reithinger); data collection, data interpretation, writing (Moritz Kraemer, Thomas Brewer, John Brownstein).
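
The comparison of country-level Twitter user visits with reported cases described in the text can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a hypothetical list of `(user_id, country)` records from geotagged tweets, counts the distinct users seen in China who also tweeted from each other country, and computes a rank correlation against reported case counts. The toy records, the toy case counts, the choice of Spearman's coefficient and all function names are assumptions made for illustration only; the user inclusion criteria actually applied in the study are not reproduced here.

```python
from collections import defaultdict


def visits_by_country(tweets, origin="CN"):
    """Count distinct users seen in `origin` who also tweeted from each other country."""
    countries_by_user = defaultdict(set)
    for user_id, country in tweets:
        countries_by_user[user_id].add(country)

    visits = defaultdict(int)
    for countries in countries_by_user.values():
        if origin in countries:
            for country in countries - {origin}:
                visits[country] += 1
    return dict(visits)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy input, invented for illustration (not data from the study).
tweets = [
    ("u1", "CN"), ("u1", "US"),
    ("u2", "CN"), ("u2", "US"), ("u2", "JP"),
    ("u3", "CN"), ("u3", "JP"),
    ("u4", "CN"), ("u4", "US"),
    ("u5", "GB"), ("u5", "US"),  # never seen in China, so not counted
    ("u6", "CN"), ("u6", "GB"),
]
reported_cases = {"US": 30, "JP": 20, "GB": 10}  # hypothetical case counts

visits = visits_by_country(tweets)
countries = sorted(set(visits) & set(reported_cases))
rho = spearman(
    [visits[c] for c in countries],
    [reported_cases[c] for c in countries],
)
print(visits, rho)
```

A rank correlation is used in this sketch because visit and case counts are typically heavily skewed across countries; whether the study used a rank-based or a linear coefficient is not stated in the text above.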