key: cord-347257-s0w95qdn
authors: Kraemer, Moritz U. G.; Yang, Chia-Hung; Gutierrez, Bernardo; Wu, Chieh-Hsi; Klein, Brennan; Pigott, David M.; du Plessis, Louis; Faria, Nuno R.; Li, Ruoran; Hanage, William P.; Brownstein, John S.; Layan, Maylis; Vespignani, Alessandro; Tian, Huaiyu; Dye, Christopher; Pybus, Oliver G.; Scarpino, Samuel V.
title: The effect of human mobility and control measures on the COVID-19 epidemic in China
date: 2020-03-25
journal: Science
DOI: 10.1126/science.abb4218
sha:
doc_id: 347257
cord_uid: s0w95qdn

The ongoing COVID-19 outbreak expanded rapidly throughout China. Major behavioral, clinical, and state interventions have been undertaken to mitigate the epidemic and prevent the persistence of the virus in human populations in China and worldwide. It remains unclear how these unprecedented interventions, including travel restrictions, affected COVID-19 spread in China. We use real-time mobility data from Wuhan and detailed case data including travel history to elucidate the role of case importation on transmission in cities across China and ascertain the impact of control measures. Early on, the spatial distribution of COVID-19 cases in China was explained well by human mobility data. Following the implementation of control measures, this correlation dropped and growth rates became negative in most locations, although shifts in the demographics of reported cases were still indicative of local chains of transmission outside Wuhan. This study shows that the drastic control measures implemented in China substantially mitigated the spread of COVID-19.

The incubation period is the time interval between infection and symptom onset. We assumed that cases travelling from Wuhan were exposed during their stay in Wuhan. We estimated the incubation period from 38 travelling cases returning from Wuhan with known dates of symptom onset, entry and exit.
The end of the exposure period was assumed to be the exit travel date, except if symptom onset occurred prior to the exit date (in which case exposure was assumed to have occurred prior to symptom onset). The start of the exposure period corresponded to the entry date. We assumed that the incubation period could not exceed 30 days. For each case, the minimum and maximum incubation periods were derived from the dates of entry, exit and symptom onset. We fitted a truncated gamma distribution (0 to 30 days) and estimated the mean and variance of the incubation period using Markov chain Monte Carlo (MCMC) in a Bayesian framework with an uninformative prior distribution. We derived the likelihood as follows:

$$L = \prod_{i} \left[ F(T \le t_{\max,i} + 1) - F(T \le t_{\min,i}) \right]$$

where $F$ is the cumulative distribution function of the truncated gamma distribution, and $t_{\min,i}$ and $t_{\max,i}$ are the minimum and maximum incubation periods of case $i$. A Metropolis-Hastings algorithm was implemented in R. Marginal posteriors were sampled from a chain of 5,000 steps after discarding a burn-in of 50 steps. Convergence was inspected visually.

Age and sex distributions are important in understanding risk of infection across populations. Assuming risk to be distributed relatively equally across a population, age and sex distributions should follow the underlying population structure as an outbreak evolves. Varying degrees of immunity and exposure may shift these distributions (30). To examine whether the ongoing outbreak shifted from an epidemic concentrated in Wuhan and among travelers from Wuhan to an epidemic that was self-sustained in provinces across China, we use age and sex data from different periods of the outbreak for individuals with reported travel history and with no known travel history. We define two periods of the outbreak. The "early" phase starts with the first reports in early December and ends a set number of days after the Wuhan shutdown. This was selected to be 8 days after the Wuhan shutdown, which conservatively corresponds to one incubation period + 1 SD (see above) after the shutdown. After that date (i.e.
1st Feb 2020; the "later" phase), we assume that most reported transmissions in provinces outside of Wuhan are the result of local transmission. We further divided our data into cases with known travel history to Wuhan and cases without. Then we produce the following summary statistics: We cannot exclude the possibility that shifts in distributions may be due to heightened awareness among the general population, which may have increased reporting of female cases later in the epidemic. Further, more work will be necessary to understand the differential risk of severe or symptomatic disease and to fully understand the age and sex distributions in this outbreak. For example, it is unclear why there are relatively few reports of cases <18 years old. However, as for other respiratory pathogens, symptomatic and severe infections were more concentrated in older populations. We do not intend to make any general statements about differential risk but were more interested in shifts in reported cases across multiple geographies in China.

We extract human mobility data from the Baidu Qianxi web platform, which presents daily population travels between cities or provinces tracked through the Baidu Huiyan system. The data do not represent numbers of individual travelers but rather an index of relative movements, constructed by Baidu's proprietary methods, that is correlated with human mobility (31) (http://qianxi.baidu.com/). In particular, two pieces of information are collected. First, we extract a series of migration scale indices for traveling out of Wuhan, from January 1st to February 10th, both in 2019 and 2020. Second, we obtain the proportion of human movement out of Wuhan that was bound for each of 31 provinces in China. These proportions are available for January 1st to February 10th, 2020. Based on these data, we had access to both changes in mobility volume and changes in mobility direction. See more detailed descriptions of the human movement data here: (32, 33). As of 2017, Baidu Inc.'s
mapping service had a 30% market share in China (34).

We reviewed the literature and online social media to understand the key timings of interventions and announcements that are relevant for disease transmission across China. We collated information about the type (e.g., announcement of outbreak, travel restrictions, isolation of patients, etc.), geographic location (e.g., city where available, province), and timing (specific date or date range).

Definitions of probable and confirmed COVID-19 cases have changed throughout the epidemic. We collected data from official sources describing the timing and specifics of the case definitions.

Probable: Need to satisfy (i) and (ii):
i. Clinical symptoms: (1) fever; (2) imaging showing pneumonia typical of the disease; (3) during early disease, total white cells normal or reduced, or lymph cell count reduced.
ii. Epidemiologic history: (1) within 2 weeks of symptom onset, Wuhan travel or resident history; or within 2 weeks of symptom onset, contact with persons from Wuhan who had fever with respiratory symptoms; or belong to a cluster.
Confirmed: Need to satisfy criteria for probable case and have a real-time quantitative polymerase chain reaction (RT-qPCR) positive result from sputum, nasopharyngeal swabs, lower respiratory tract secretions or other sample tissue, or genome sequencing highly similar with known SARS-CoV-2. available strains.

Probable: Need to satisfy (i) and any one epidemiologic history described in (ii):
i. Clinical symptoms: (1) fever; (2)

From January 27-February 5:
Probable: Need to satisfy any two of the symptoms described in (i) and any of the

January 2020 and 0 before (which represents one median incubation period from 22nd January 2020). Models were fit to province-level data. The three models were compared using differences in the Bayesian Information Criterion (BIC), where larger values indicate models with lower relative support, and a BIC difference >4 was considered the cutoff for substantial model improvement.
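The BIC comparison described above can be illustrated with a minimal Python sketch. It is not the authors' code (their analysis was carried out in R). The model names, log-likelihood values, parameter counts and number of observations below are hypothetical placeholders chosen only to show the arithmetic, and the helper names `bic` and `delta_bic` are assumptions introduced for this example. It uses the standard definition BIC = k ln(n) - 2 ln(L).

```python
import math


def bic(log_likelihood: float, n_params: int, n_obs: int) -> float:
    """Standard BIC: k * ln(n) - 2 * ln(L). Lower values mean more support."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


def delta_bic(bic_by_model: dict) -> dict:
    """Difference between each model's BIC and the lowest BIC in the set."""
    best = min(bic_by_model.values())
    return {name: value - best for name, value in bic_by_model.items()}


# Hypothetical fitted models (illustrative numbers, not results from the paper).
fits = {
    "model_a": {"log_lik": -120.0, "n_params": 2},
    "model_b": {"log_lik": -112.0, "n_params": 3},
    "model_c": {"log_lik": -111.5, "n_params": 4},
}
n_obs = 31  # hypothetical number of observations

bics = {
    name: bic(fit["log_lik"], fit["n_params"], n_obs)
    for name, fit in fits.items()
}
deltas = delta_bic(bics)

CUTOFF = 4.0  # cutoff for a substantial difference, as stated in the text
best_model = min(bics, key=bics.get)
clearly_worse = {name: d > CUTOFF for name, d in deltas.items()}

for name in sorted(deltas, key=deltas.get):
    print(f"{name}: BIC={bics[name]:.2f}, delta={deltas[name]:.2f}, "
          f"clearly worse={clearly_worse[name]}")
```

With these placeholder values, the extra parameter in `model_c` is not enough to beat `model_b`, while `model_a` falls more than 4 BIC units behind the best model.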
We performed a detailed sensitivity analysis on the availability of RT-qPCR tests, doubling time, and incubation periods. We obtained qualitatively similar results for Model 1 (Poisson GLM fit to daily case counts), Model 2 (negative binomial GLM fit to daily case counts), and Model 3 (log-linear regressions fit to cumulative cases); see Table S2. In addition, we provide a full time series analysis of the optimal lag structure for cases and mobility for each province. Additionally, although BIC is considered more conservative, model selection results were confirmed using AIC (see Fig. 4 and Table S2). Lastly, we validated our model selection results using elastic-net regression and n-fold cross validation as implemented in the R package GLMNET v. 2.0-18 (35, 36). To estimate the epidemic doubling time across each province, we fit a mixed effects Poisson GLM of daily case counts to days since the first case report in each province (fixed effect) and a random effect for each province on the slope and intercept, using the R package lme4 v. (37). All code and data are available here (38).

To ascertain whether earlier travel restrictions could have prevented the widespread increase in cases witnessed in late January, we constructed a simple forecasting model for COVID-19. Briefly, we forecast the cumulative number of cases in each Chinese province by simply doubling the number of cumulative cases reported six days prior. For dates prior to Jan 28th and after Feb 3rd, this naive forecast produces an accurate estimate of the cumulative number of cases in each province (Fig. S4). However, the cumulative number of cases reported on Jan 28th is poorly estimated using this model (Fig. S4). In order to accurately forecast the number of cases on Jan 28th, we must also include the relative amount of mobility out of Wuhan into various provinces in the regression model. In Fig.
S4, we show how a model including only movement from Wuhan on January 22nd, fit to the residuals from Fig. S4, is once again able to accurately forecast cumulative cases. This indicates that, to have any hope of controlling the spread of an epidemic, movement restrictions must be prompt.

Table S1.

COVID-19 control in China during mass population movements at New Year
China Novel Coronavirus Investigating and Research Team, A novel coronavirus from patients with pneumonia in China
The impact of transmission control measures during the first 50 days of the COVID-19 epidemic in China
Risk for transportation of 2019 novel coronavirus disease from Wuhan to other cities in China
Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: A modelling study
Quantifying the association between domestic travel and the exportation of novel coronavirus (2019-nCoV) cases from Wuhan, China in 2020: A correlational analysis
Coronavirus Disease 2019 (COVID-19) Situation Report
Epidemiological data from the COVID-19 outbreak, real-time case information
Middle East respiratory syndrome coronavirus: Quantification of the extent of the epidemic, surveillance biases, and transmissibility
Incubation periods of acute respiratory viral infections: A systematic review
Incubation period of 2019 novel coronavirus (2019-nCoV) infections among travellers from Wuhan, China
"Generalized linear models" in Statistical Models in
Pattern of early human-to-human transmission of Wuhan
Reporting, epidemic growth, and reproduction numbers for the 2019 novel coronavirus (2019-nCoV) epidemic
"Metapopulation dynamics of infectious diseases" in Ecology, Genetics and Evolution of Metapopulations
Multiscale, resurgent epidemics in a hierarchical metapopulation model
Characteristics of and important lessons from the coronavirus disease 2019 (COVID-19) outbreak in China: Summary of a report of 72,314 cases from the Chinese Center for Disease Control and Prevention
Novel Coronavirus Pneumonia Emergency Response Epidemiology Team, The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) in China
Temporally varying relative risks for infectious diseases: Implications for infectious disease control
Factors that make an infectious disease outbreak controllable
Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus
Code for: The effect of human mobility and control measures on the COVID-19 epidemic in China
Kraemer; Open COVID-19 Data Curation Group, Open access epidemiological data from the COVID-19 outbreak
A database of geopositioned Middle East Respiratory Syndrome Coronavirus occurrences
The chi-square test of independence
Handbook of Biological Statistics (Sparky House
Genomic and epidemiological monitoring of yellow fever virus transmission potential
Past and future spread of the arbovirus vectors Aedes aegypti and Aedes albopictus
The impact of traffic isolation in Wuhan on the spread of
Population movement, city closure and spatial transmission of the 2019-nCoV infection in China
Mobile Map App Research Report: Which of the Highest, the Baidu, and Tencent Is Strong?
Fitting linear mixed-effects models using lme4
Regularization paths for generalized linear models via coordinate descent
Open COVID-19 Data Working Group
Pseudo R-squared measures for Poisson regression models with over- or underdispersion
Regression models for count data in R