key: cord-0141622-2mdnhjo1 authors: Liu, Ji; Wang, Xiakai; Xiong, Haoyi; Huang, Jizhou; Huang, Siyu; An, Haozhe; Dou, Dejing; Wang, Haifeng title: An Investigation of Containment Measures Against the COVID-19 Pandemic in Mainland China date: 2020-07-16 journal: nan DOI: nan sha: 0373565c1bce0d7c6abc1c86cf3725c45ddf8e79 doc_id: 141622 cord_uid: 2mdnhjo1 As the recent COVID-19 outbreak rapidly expands all over the world, various containment measures have been carried out to fight against the COVID-19 pandemic. In Mainland China, the containment measures consist of three types, i.e., Wuhan travel ban, intra-city quarantine and isolation, and inter-city travel restriction. In order to carry out the measures, local economy and information acquisition play an important role. In this paper, we investigate the correlation of local economy and the information acquisition on the execution of containment measures to fight against the COVID-19 pandemic in Mainland China. First, we use a parsimonious model, i.e., SIR-X model, to estimate the parameters, which represent the execution of intra-city quarantine and isolation in major cities of Mainland China. In order to understand the execution of intra-city quarantine and isolation, we analyze the correlation between the representative parameters including local economy, mobility, and information acquisition. To this end, we collect the data of Gross Domestic Product (GDP), the inflows from Wuhan and outflows, and the COVID-19 related search frequency from a widely-used Web mapping service, i.e., Baidu Maps, and Web search engine, i.e., Baidu Search Engine, in Mainland China. Based on the analysis, we confirm the strong correlation between the local economy and the execution of information acquisition in major cities of Mainland China. 
We further evidence that, although the cities with high GDP per capita attract bigger inflows from Wuhan, people there are more likely to conduct the quarantine measure and to reduce travel to other cities. Finally, the correlation analysis using search data shows that well-informed individuals are likely to carry out containment measures.

Since December 2019, the novel coronavirus disease COVID-19 has been identified and the outbreak has expanded rapidly across a large number of countries, e.g., China [1], the United States [2], European countries [3], etc. In China, the number of confirmed cases increased from 571 on January 23, 2020 to 84,388 on May 1, 2020 and saturated around 84.5 thousand. COVID-19 has become a global emergency and is currently spreading throughout the whole world [4, 5]. In order to deal with the rapid outbreak of the COVID-19 pandemic in Mainland China, a range of containment measures have been put in place by Chinese authorities [6] [7] [8]. Similar containment measures have been adopted in major countries all over the world [9, 10]. In Mainland China, the containment measures consist of intra-city and inter-city measures. As intra-city measures, suspected and confirmed cases have been quarantined in hospitals or placed under monitored self-quarantine at home [11], which is denoted the "quarantine" measure in the paper. The authorities also encouraged citizens to stay at home, discouraged mass gatherings and closed schools [12]. In addition, the Wuhan city travel ban was adopted, i.e., all transport was prohibited in and out of Wuhan city from 10:00 a.m. on 23 January 2020, which incurred a significant reduction of the outflows from Wuhan as shown in Figure 1. As shown in Figure 2, the national spring vacation has been prolonged and inter-city travel has been discouraged to reduce massive human migration across cities in order to reduce infection.
Mobile applications, e.g., Baidu Migration 1 and search engines, e.g., Baidu 2 , can be easily used to achieve information acquisition for citizens to keep informed during the outbreak of COVID-19. There are a great number of studies [1, [13] [14] [15] [16] [17] [18] [19] that demonstrated the feasibility to leverage mobile applications for information acquisition. As a result, the history search records can reflect the information acquisition status and the history statistical migration data can be used to analyze the status of the COVID-19 outbreak. As well-informed individuals are likely to travel less when there is COVID-19 [20] , it is interesting to analyze the correlation between information acquisition and the execution of the containment measures. In addition, the correlation between the local economy and the information acquisition can be used to reveal how people in different economy situations react to the COVID-19 pandemic. In this work, we aim at using a parsimonious model, i.e., SIR-X model, and Markov Chain Monte Carlo (MCMC) [21] methods to estimate the parameters of the execution of intra-city containment measures in major cities of Mainland China. Then, we analyze the correlation among different random variables, i.e., information acquisition status (COVID-19-related search frequency), local economy situation (GDP per capita) and the parameters in the SIR-X model, in order to understand the relationship among economy, information acquisition and the execution of containment measures. More specifically, we would like to investigate following problems: • How to construct a model to capture the confirmed cases? This research issue has been studied [7, 11] at province scale using the migration scale index released by Baidu Migration Open Data, where the index is calculated based on the past historical statistical records from a widely-used Web mapping service, i.e., Baidu Maps 3 . 
We propose to construct an SIR-X model based on the confirmed cases at city scale using a Markov Chain Monte Carlo (MCMC) method. We provide solid results using exact figures for major Chinese cities.
• To what degree does the local economy affect the pandemic outbreaks of COVID-19 and the execution of containment measures in major cities of China? The impact of the COVID-19 pandemic on the economy has been studied [22, 23], while the correlations between the local economy and the outbreaks of the COVID-19 pandemic or the execution of containment measures have not been analyzed. We hypothesized that the outflows from Wuhan tend to go to the cities where GDP per capita is high. We further hypothesized that there would be more initial confirmed cases as more infected people arrived at these cities. In addition, we hypothesized that the people in the cities where GDP per capita is high tend to perform more information acquisition activities through voluntary COVID-19-related search in order to be well informed on the situation of the COVID-19 pandemic. We analyze the correlations based on the estimated parameters of SIR-X and the statistical data from Baidu Maps and provide solid results for major Chinese cities.
• To what degree does the information acquisition affect the execution of containment measures in major cities of China? A strong positive correlation between the pandemic outbreaks of COVID-19 and the information acquisition has been reported in [20], while the correlation between the information acquisition and the execution of containment measures has not been analyzed. As reported in [20] and as seen in the collective responses to emergencies [24] and panics [13], people voluntarily acquire information more frequently when the pandemic situation becomes worse in their cities. We hypothesized that well-informed people tend to apply the containment measures more strictly in order to avoid the risk of being infected and the risk of making the situation worse.
We analyze this correlation based on the estimated parameters of SIR-X and the statistical data from Baidu Maps and Baidu Search Engine, and provide solid results for major Chinese cities as well. Different from existing research [1, 6, 7, 11] , we particularly analyze the correlation between local economy strength and the COVID-19-related search frequency with the city population size (a controlling variable) removed in order to avoid the impact of the scale of city. Compared to the existing work [20, 22] , we analyze the correlations not only based on the data from Baidu Maps and Baidu Search Engine but also based on the combination of an SIR-X model and MCMC methods. In this section, we first present the existing models to capture COVID-19. Then, we propose using SIR-X and MCMC to construct the model. Afterwards, we present the comparison between the official number and the model fitting number of accumulated confirmed cases. Susceptible Infectious Recovered (SIR) model [25, 26] and Susceptible Exposed Infectious Recovered (SEIR) model [27] [28] [29] are largely adopted to characterize the outbreak of COVID-19 epidemic. However, the containment measures cannot be described in the standard SIR or SEIR model. A modified SEIR model [30] is proposed with the consideration of mobility while it is still not able to infer the execution of containment measures. A Long-Short-Term-Memory (LSTM) [30] model is proposed to project the number of accumulated confirmed cases, which is not able to describe the execution of containment measures either. In order to characterize the outbreak of COVID-19 epidemic with containment measures at city level, we exploit the SIR-X model [11] . The SIR-X model is a modified SIR model, which takes the containment measures into consideration. We have the same assumptions and use the same representative parameters as those in [11] . 
We assume that there are public containment efforts, e.g., staying at home and reducing interaction with other people, which are referred to as 'containment' and represented by a variable $\kappa_0$. In addition, we assume that infected individuals are quarantined, which is referred to as 'quarantine' and represented by a variable $\kappa$. We use $\alpha$ to represent the infection speed of an infected individual and $\beta^{-1}$ to represent the average time an infected individual remains infectious before recovery or removal. Then, following [11], the SIR-X model is expressed by the following differential equations:

\[
\begin{aligned}
\frac{dS}{dt} &= -\alpha S I - \kappa_0 S, \\
\frac{dI}{dt} &= \alpha S I - \beta I - \kappa_0 I - \kappa I, \\
\frac{dR}{dt} &= \beta I + \kappa_0 S, \\
\frac{dX}{dt} &= (\kappa + \kappa_0) I.
\end{aligned}
\tag{1}
\]

Instead of fixing the same parameters ($\alpha$ and $\beta$) for each province as in [11], we estimate the parameters using an MCMC [21] method, inspired by [31]. In the model, $I_0$ represents the number of initial infected individuals. The basic reproduction number $R_0$ represents the average number of secondary infections an infected individual will cause before he or she recovers or is removed [11]. As in [11], the reproduction number can be calculated as:

\[
R_0 = \frac{\alpha}{\beta + \kappa + \kappa_0}.
\]

We use $R_{0,\mathrm{free}}$ to represent the reproduction number without containment or quarantine measures. As high temperature and high humidity significantly reduce the transmission of COVID-19 [Wang2020], $R_{0,\mathrm{free}}$ and $\beta$ may be different for different cities because of the diversity of local environments. Thus, we use the MCMC [21] method to estimate the distribution of the parameters, i.e., $\alpha$, $\beta$, $\kappa$, $\kappa_0$ and $I_0$, while the other initial values are fixed at the beginning (January 23, 2020): $S_0$ is the population of the city, the initial value of $R$ is fixed at 0, and $X_0$ is the number of initial confirmed cases. Here $S_0$, $I_0$ and $X_0$ denote the initial values of $S$, $I$ and $X$. Specifically, we use the uniform distribution as the prior distribution of each parameter. Considering the nonlinearity of the SIR-X model, we adopt the Sequential Monte Carlo sampler to obtain the posterior distribution of the model's parameters, including $\alpha$ and $\beta$.
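As a minimal sketch of how such a model can be integrated numerically, the following Python code advances the SIR-X compartments with a fixed-step fourth-order Runge-Kutta scheme. It assumes the SIR-X equations in the form given in [11] (written out in the comments), treats S, I, R and X as fractions of the city population, and uses illustrative parameter values. The function names and the step size are hypothetical choices, not the authors' implementation.

```python
def sirx_rhs(state, alpha, beta, kappa, kappa0):
    """Right-hand side of the SIR-X equations, assumed in the form of [11]."""
    s, i, r, x = state
    ds = -alpha * s * i - kappa0 * s                          # infection + containment
    di = alpha * s * i - beta * i - kappa0 * i - kappa * i    # infected pool
    dr = beta * i + kappa0 * s                                # recovered / removed
    dx = (kappa + kappa0) * i                                 # quarantined (confirmed)
    return (ds, di, dr, dx)


def _shift(state, deriv, h):
    """Return state + h * deriv, component by component."""
    return tuple(s + h * d for s, d in zip(state, deriv))


def simulate_sirx(i0, x0, alpha, beta, kappa, kappa0, days, steps_per_day=20):
    """Integrate SIR-X with RK4 and return one state (S, I, R, X) per day."""
    state = (1.0 - i0 - x0, i0, 0.0, x0)  # population fractions (assumption)
    h = 1.0 / steps_per_day
    args = (alpha, beta, kappa, kappa0)
    trajectory = [state]
    for _ in range(days):
        for _ in range(steps_per_day):
            k1 = sirx_rhs(state, *args)
            k2 = sirx_rhs(_shift(state, k1, h / 2), *args)
            k3 = sirx_rhs(_shift(state, k2, h / 2), *args)
            k4 = sirx_rhs(_shift(state, k3, h), *args)
            state = tuple(
                s + h / 6.0 * (a + 2 * b + 2 * c + d)
                for s, a, b, c, d in zip(state, k1, k2, k3, k4)
            )
        trajectory.append(state)
    return trajectory


def reproduction_number(alpha, beta, kappa=0.0, kappa0=0.0):
    """R0 with measures; with kappa = kappa0 = 0 this is the unconstrained value."""
    return alpha / (beta + kappa + kappa0)


if __name__ == "__main__":
    # Illustrative values only; they are not fitted to any city.
    traj = simulate_sirx(i0=1e-5, x0=0.0, alpha=0.775, beta=0.125,
                         kappa=0.05, kappa0=0.03, days=60)
    print("final confirmed fraction X:", traj[-1][3])
```

Because the four right-hand sides sum to zero, S + I + R + X stays constant along the trajectory, which gives a quick sanity check on any implementation of the model.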
Finally, we take the expected value of each parameter to construct the model. The results of the MCMC method may not be stable, i.e., each execution may yield different results, unless a priori conditions are imposed. Thus, we introduce a priori conditions, i.e., $R_0 < R_{0,\mathrm{free}}$, $\kappa_0 < \kappa$, and the model-fitted number of accumulated confirmed cases should be equal to or greater than the official number of confirmed cases. $R_{0,\mathrm{free}}$ represents a maximum value of $R_0$. We set $R_{0,\mathrm{free}}$ to 6.2, which is in accordance with the result from [11] that $R_0$ should be between 1.4 and 3.3. During the fitting process, if the a priori conditions are not met, the fitting process is executed again until a limit is reached, e.g., 20 executions, in order to avoid infinite execution. We assume that the quarantine measure is applied more strictly to the infected individuals than to other citizens, i.e., $\kappa_0 < \kappa$. In order to use the SIR-X model, we need to assume that few travelers and symptomatic infected individuals travel into or from a city. As reported in [20], there is a strong correlation between the inflows from Wuhan and the confirmed cases in a city. We assume that few infected individuals travelled into major cities after January 23, 2020, as the Wuhan travel ban has been in place since January 23, 2020 and few people went to other cities from Wuhan, as shown in Figure 1. In addition, we assume that the number of infected individuals in the inflows from other cities can be ignored compared to the number of infected individuals among the local citizens in a city. With these two assumptions, we can use SIR-X and MCMC to estimate parameters for each major city in Mainland China based on the number of confirmed cases from January 23, 2020 to May 1, 2020. Figures 3a-3f illustrate the confirmed cases in several major cities of Mainland China.
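The retry rule with a priori conditions described above can be sketched as follows. This is only an illustration under stated assumptions: `fit_once` is a hypothetical callable standing in for one run of the sampler, the reproduction number is assumed to be $\alpha / (\beta + \kappa + \kappa_0)$ as in [11], and the case-count condition is checked on the final day only, which is one possible reading of the text.

```python
def meets_conditions(params, fitted_cases, official_cases, r0_free=6.2):
    """Check the a priori conditions on one fit (assumed reading of the text)."""
    r0 = params["alpha"] / (params["beta"] + params["kappa"] + params["kappa0"])
    return (
        r0 < r0_free                                 # R0 < R0,free
        and params["kappa0"] < params["kappa"]       # quarantine stricter than containment
        and fitted_cases[-1] >= official_cases[-1]   # final-day check (assumption)
    )


def fit_with_conditions(fit_once, official_cases, r0_free=6.2, max_tries=20):
    """Re-run a fit until the conditions hold or the execution limit is reached.

    fit_once is a hypothetical callable returning (params, fitted_cases),
    where params is a dict with keys alpha, beta, kappa and kappa0.
    Returns (params, tries), with params set to None if every attempt failed.
    """
    for attempt in range(1, max_tries + 1):
        params, fitted_cases = fit_once()
        if meets_conditions(params, fitted_cases, official_cases, r0_free):
            return params, attempt
    return None, max_tries
```

Capping the number of attempts mirrors the limit of 20 executions mentioned in the text, so a city whose data cannot satisfy the conditions does not block the whole pipeline.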
From figures, we can see that the combination of SIR-X and MCMC captures the number of confirmed cases in different cities very well, e.g., Beijing, Shanghai, Shenzhen, Wuhan and Chongqing. However, the model does not well characterize the number of confirmed cases in Guangzhou as there are many (127 5 ) infected individuals from other countries, which cannot be captured by the SIR-X model. In addition, we believe that the errors between the confirmed cases and fitted data are mainly due to the travelers from other countries in Beijing (174 confirmed cases from other countries) and Shanghai (326 confirmed cases from other countries). Then, we calculate the confirmed cases of different provinces by adding the number of confirmed cases in each affiliated city. Figures 4a -4f shows the number of confirmed cases in several provinces of Mainland China. We can see that the SIR-X well captures the aggregated cases at province scale. Furthermore, we use the same method to calculate the number of confirmed cases in Mainland China as shown in Figure 5 . Figure 6 shows that the combination of SIR-X model and MCMC captures well the number of confirmed cases at city scale. Besides the number of confirmed cases (May 1, 2020), we collected three datasets, i.e., GDP, mobility and COVID-19-related search frequency for major cities in Mainland China. The GDP dataset was collected from [32], which characterized local economic development in 2019. The mobility data was captured from Baidu Maps and the COVID-19-related search frequency (the ratio between COVID-19-related search volume from January to March 2020 and population in each city) data was gathered from a widely-used Web search engine. We analyze the correlation among local economy, mobility, search behaviors and the parameters estimated based on the combination of the SIR-X model and the MCMC method as presented in Section 2, for 238 cities in Mainland China (excluding Wuhan). 
We normalize inflow, outflow and search volume by the following formula:

\[
\mathrm{Normalize}(data) = \frac{data - data_{\min}}{data_{\max} - data_{\min}}. \tag{2}
\]

In this way, the data in the study are scaled proportionally into the range from 0 to 1. The results of our data-driven analysis are summarized in Table 1 and shown in Figure 7. In this section, we present the observations obtained from the analysis. We have evidenced the significant positive correlations between local economy and COVID-19-related search frequency for major Chinese cities (Result I in Table 1). In order to analyze the correlation between two random variables, we calculated the Pearson correlation coefficients [33] and conducted the Student's t-test (two-tailed) to test the significance (the same for the following analysis in the paper). The Pearson correlation between the local GDP per capita and the total COVID-19-related search volume (between January and March 2020) is $R^{***} = 52.5\%$ (N = 238 and p-value = $3.06 \times 10^{-18}$ < 0.0001) for each city. However, we considered that this correlation could be affected by the population of each city. We therefore tested the significance of the correlations between GDP per capita and COVID-19-related search frequency, where we evidenced the significance in the correlations as $R^{***} = 63.5\%$ (N = 238 and p-value = $2.91 \times 10^{-28}$ < 0.0001). In addition, in order to obviate the impact of the scale of the city, i.e., the impact of city population, we conducted partial correlation analysis [34, 35] between GDP per capita and the COVID-19-related search frequency with the effects of the city population size (a controlling variable) removed. In order to estimate the partial correlation of random variables X and Y with the random variable Z removed, we expressed the partial correlation coefficient in terms of the Pearson correlation coefficients as

\[
r_{XY \cdot Z} = \frac{r_{XY} - r_{XZ}\, r_{YZ}}{\sqrt{\left(1 - r_{XZ}^{2}\right)\left(1 - r_{YZ}^{2}\right)}},
\]

where $r_{XY}$, $r_{XZ}$ and $r_{YZ}$ are the pairwise Pearson correlation coefficients. We find a strong correlation with significance as well, such that $R^{***} = 57.0\%$ (N = 238 and p-value = $7.15 \times 10^{-22}$ < 0.0001).
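The statistics used in this analysis can be sketched in a few lines of Python. The code below is an illustration rather than the authors' pipeline: it implements min-max normalization, the Pearson correlation coefficient, the standard first-order partial correlation (X and Y with Z controlled), and the t statistic used for the significance test. All function names and the toy data are hypothetical.

```python
import math


def min_max_normalize(values):
    """Scale values proportionally into [0, 1]; assumes max(values) > min(values)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def partial_corr(x, y, z):
    """Correlation of x and y with the linear effect of z removed."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


def t_statistic(r, n, controls=0):
    """t statistic for a (partial) correlation r over n samples.

    The degrees of freedom are n - 2 - controls; the two-tailed p-value
    then follows from the Student's t distribution (not computed here).
    """
    dof = n - 2 - controls
    return r * math.sqrt(dof / (1 - r ** 2))


if __name__ == "__main__":
    # Toy data, purely illustrative.
    gdp = [1.0, 2.0, 3.0, 4.0]
    search = [2.0, 1.0, 4.0, 3.0]
    population = [1.0, -1.0, -1.0, 1.0]
    print(min_max_normalize(gdp))
    print(pearson(gdp, search), partial_corr(gdp, search, population))
    print(t_statistic(0.525, 238))
```

Min-max normalization is an affine rescaling, so it leaves Pearson coefficients unchanged; it only puts the variables on a common scale for presentation.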
Thus, we can conclude that no matter whether the scale of the city is big or small, the GDP per capita has a significant positive correlation with the COVID-19-related search frequency. Please see also Figure 8 for the visualization of the correlations. We have evidenced the significant positive correlation between the GDP per capita and inflows from Wuhan (Result II in Table 1). We hypothesized that cities with higher GDP per capita would attract larger inflows from Wuhan. Therefore, for every city in the study, we correlated the GDP per capita and the inflows from Wuhan, where we obtained a Pearson correlation coefficient of $R^{***} = 42.3\%$ (N = 238 and p-value = $9.88 \times 10^{-12}$ < 0.0001). In addition, in order to obviate the impact of the scale of the city, we correlated GDP per capita and the inflow rate from Wuhan, i.e., the ratio between inflows from Wuhan and the population. We found a strong positive correlation with significance as well, such that $R^{***} = 32.6\%$ (N = 238 and p-value = $2.67 \times 10^{-7}$ < 0.0001). Please see also in We have evidenced the significant positive correlation between the inflows and $I_0$ (Result II in Table 1) and the significant positive correlation between the inflow rate and $R_0$ (Result III in Table 1). We hypothesized that cities with larger inflows from Wuhan have more initial infected cases, i.e., $I_0$ in the SIR-X model. Thus, we correlated the inflows from Wuhan and $I_0$, where we found a strong positive correlation with significance, such that $R^{**} = 21.6\%$ (N = 238 and p-value = $8.11 \times 10^{-4}$ < 0.001). In addition, we analyzed the correlation between the inflow rate and $R_0$, where we found a strong positive correlation with significance, such that $R^{*} = 19.2\%$ (N = 238 and Table 1) and the positive correlation between $R_0$ and the confirmed case rate (Result III in Table 1). We hypothesized that cities with bigger $I_0$ finally have more confirmed cases.
To this end, we performed correlation using I 0 and the number of confirmed cases on May 1, 2020. We found a significant positive correlation, such that R * * = 22.5% with N = 238 and p-value= 4.67 × 10 −4 < 0.001. In addition, we hypothesized that cities with bigger R 0 have more confirmed case rate, i.e., the ratio between the confirmed cases and the population. We performed correlation between R 0 and the confirmed case rate, where we obtained a significant positive correlation, such that R * * * = 29.8% with N = 238 and p-value= 2.82 × 10 −6 < 0.0001. Please see also in Figures 13 and 14 for the visualization of the correlations. We obtained a strong positive correlation with significance between the number of confirmed cases and the COVID-19-related search frequency, such that R * * * = 41.5% (N = 238 and p-value= 2.45 × 10 −11 < 0.0001). Furthermore, we obtained a strong positive correlation with significance between confirmed case rate and the COVID-19-related search frequency, such that R * * = 21.4% (N = 238 and p-value= 9.01 × 10 −4 < 0.001). Similar results are also reported in [20] . We thus can conclude that for each major city in the study, the GDP per capita and the factors that incur infections, e.g., inflows from Wuhan, I 0 , R 0 , have significant positive correlation. This indicates that the rich cities attract more inflows from Wuhan, which caused infections and in order to fight against COVID-19, the citizens in the rich cities tend to perform more search activities in order to be well-informed. In this section, we analyze the correlation among local economy, information acquisition and containment measures. As the quarantine measure (see details in Section 2) is directly related to the number of confirmed cases, we analyze the correlation between κ and other factors (local economy and information acquisition). 
In addition, we are also interested in the realization of inter-city containment measures, i.e., the outflow recovery rate (the ratio between the outflows of 2020 and that of 2019). We have evidenced the significance of the positive correlation between the execution of the quarantine measure and GDP per capita for major Chinese cities in the study (Result IV in Table 1 ). Among all 238 cities in the correlation study, we hypothesized that people with high GDP per capita would try harder to realize the quarantine measure for infected individuals. Therefore, we correlated the GDP per capita and κ, where Pearson correlation coefficients are R * = 17.3% (N = 238 and p-value= 7.46 × 10 −3 < 0.01). Furthermore, we correlated the GDP per capita and the outflow recovery rate, where we obtained Pearson correlation coefficients of R * * * = −46.5% (N = 238 and p-value= 3.82 × 10 −14 < 0.0001). The correlation analysis result suggests that people with higher GDP per capita are more likely to apply the quarantine measure. Please see also in Figures 15 and 16 for the visualization of the correlations. We have evidenced the significance of the positive correlations between the realization of quarantine measure and the COVID-19-related search frequency. We performed correlation using the COVID-19-related search frequency and κ. We found a significant positive correlation, such that R * = 17.8% with N = 238 and p-value= 5.84 × 10 −3 < 0.01. Please see also in Figures 17 for the visualization of the correlations. We found negative correlations between the COVID-19-related search frequency and the outflow recovery rate are stronger with R * * * = −51.4% (N = 238 and p-value= 1.88 × 10 −17 < 0.0001) (similar result is also reported in [20] ). The correlation analysis result suggests that people with more per capita COVID-19-related search frequency are more likely to apply the containment measures, i.e., separate the infected individuals and small outflow recovery rate. 
We can conclude that for every city in the study the GDP per capita and the COVID-19-related search frequency have significant positive correlations with the realization of containment measures. We believe this is due to the will to avoid the risk of being infected and the natural response to fear and massive panic [13, 20]. In addition, people in the cities with higher GDP per capita tend to have a bigger capacity or more tolerance to apply the containment measures.

In this work, we first exploit the SIR-X model and an MCMC method to estimate the parameters related to the COVID-19 pandemic at city scale. Then, we examine the correlation between the local economy and the spread of the COVID-19 pandemic and the execution of containment measures in major cities of Mainland China. We conduct correlation analysis based on the mobility data and search data from Baidu Maps and Baidu Search Engine in Mainland China. Our analysis brings novel knowledge of the correlation among different factors related to the COVID-19 pandemic. The cities with higher GDP per capita attract bigger inflows from Wuhan, which cause more confirmed cases. However, the demand for information from individuals becomes higher, which prompts them to apply the containment measures. Furthermore, well-informed individuals are more likely to apply the intra- and inter-city containment measures, i.e., quarantining infected individuals and reducing travel to other cities. The implications of these correlations include that the better the local economy is and the more timely the information acquisition attained by residents, the better the containment measures are realized, which helps fight against the COVID-19 pandemic in major cities of Mainland China.
The effect of travel restrictions on the spread of the 2019 novel coronavirus (COVID-19) outbreak SARS-CoV-2 was already spreading in France in late The Novel Coronavirus Pneumonia Emergency Response Epidemiology Team. The epidemiological characteristics of an outbreak of 2019 novel coronavirus diseases (COVID-19) in China World Health Organization declares global emergency: A review of the 2019 novel coronavirus (COVID-19) The effect of human mobility and control measures on the COVID-19 epidemic in China An investigation of transmission control measures during the first 50 days of the COVID-19 epidemic in China Quantifying SARS-COV-2 transmission suggests epidemic control with digital contact tracing Staying at Home: Mobility Effects of COVID-19 COVID-19 and Italy: what next? The Lancet Effective containment explains subexponential growth in recent confirmed COVID-19 cases in China COVID-19 control in China during mass population movements at New Year Fear can be more harmful than the severe acute respiratory syndrome coronavirus 2 in controlling the corona virus disease 2019 epidemic Modified SEIR and AI prediction of the epidemics trend of COVID-19 in China under public health interventions The novel coronavirus outbreak: what can be learned from China in public reporting? COVID-19 control in China during mass population movements at New Year. The Lancet Challenges of SARS-COV-2 and lessons learnt from SARS in Guangdong Province Correlation between the migration scale index and the number of new confirmed Novel Coronavirus Pneumonia cases in China Correlation between travellers departing from Wuhan before the Spring Festival and subsequent spread of COVID-19 to all provinces in China Understanding the Collective Responses of Populations to the COVID-19 Pandemic in Mainland China Markov Chain Monte Carlo Quantifying the Economic Impact of COVID-19 in Mainland China Using Human Mobility Data What Will Be the Economic Impact of COVID-19 in the US? 
Rough Estimates of Disease Scenarios Quantifying information flow during emergencies Susceptible-Infected-Recovered (SIR) Dynamics of COVID-19 and Economic Impact A simple Stochastic SIR model for COVID 19 Infection Dynamics for Karnataka: Learning from Europe Epidemic analysis of COVID-19 in China by dynamical modeling The effectiveness of quarantine of Wuhan city against the Corona Virus Disease 2019 (COVID-19): A well-mixed SEIR model analysis Analysis of COVID-19 epidemic traced data and stochastic discrete transmission dynamic model Modified SEIR and AI prediction of the epidemics trend of COVID-19 in China under public health interventions Nowcasting and Forecasting the Potential Domestic and International Spread of the 2019-nCoV Outbreak Originating in Wuhan, China: A Modelling Study Pearson Correlation Coefficient. Noise Reduction in Speech Processing Partial correlation and conditional correlation as measures of conditional independence Partial correlation analysis: applications for financial markets

To protect Baidu users' data privacy, all experiments in this paper were carried out using anonymous data and secure data analytics provided by Baidu Data Federation Platform (Baidu FedCube). For data accesses and usages, please contact us via {fedcube, shubang}@baidu.com.

Author contributions statement
J. Liu formulated the research problems and drafted the manuscript. J. Liu, X. Wang and J. Huang collected data from Baidu, conducted the experiments, and performed data analysis. S. Huang and H. An collected the consensus data and carried out the data visualization. X. Wang, J. Huang, H. Xiong and D. Dou revised the whole paper. D. Dou and H. Wang proposed the research, coordinated the research efforts, and oversaw the whole research process.