key: cord-0002359-964tbh7d authors: Liu, Kui; Li, Li; Jiang, Tao; Chen, Bin; Jiang, Zhenggang; Wang, Zhengting; Chen, Yongdi; Jiang, Jianmin; Gu, Hua title: Chinese Public Attention to the Outbreak of Ebola in West Africa: Evidence from the Online Big Data Platform date: 2016-08-03 journal: Int J Environ Res Public Health DOI: 10.3390/ijerph13080780 sha: f8f5d842db3bc80e5c22011583b8e965abfe624c doc_id: 2359 cord_uid: 964tbh7d Objective: The outbreak of the Ebola epidemic in West Africa in 2014 exerted enormous global public reaction via the Internet and social media. This study aimed to investigate and evaluate the public reaction to Ebola in China and identify the primitive correlation between possible influence factors caused by the outbreak of Ebola in West Africa and Chinese public attention via Internet surveillance. Methods: Baidu Index (BDI) and Sina Micro Index (SMI) were collected from their official websites, and the disease-related data were recorded from the websites of the World Health Organization (WHO), U.S. Centers for Disease Control and Prevention (CDC), and U.S. National Ministries of Health. The average BDI of Internet users in different regions were calculated to identify the public reaction to the Ebola outbreak. Spearman’s rank correlation was used to check the relationship of epidemic trends with BDI and SMI. Additionally, spatio-temporal analysis and autocorrelation analysis were performed to detect the clustered areas with the high attention to the topic of “Ebola”. The related news reports were collected from authoritative websites to identify potential patterns. Results: The BDI and the SMI for “Ebola” showed a similar fluctuating trend with a correlation coefficient = 0.9 (p < 0.05). The average BDI in Beijing, Tibet, and Shanghai was higher than other cities. However, the disease-related indicators did not identify potential correlation with both indices above. A hotspot area was detected in Tibet by local autocorrelation analysis. 
The most likely cluster identified by spatiotemporal cluster analysis was in the northeast regions of China with the relative risk (RR) of 2.26 (p ≤ 0.01) from 30 July to 14 August in 2014. Qualitative analysis indicated that negative news could lead to a continuous increase of the public’s attention until the appearance of a positive news report. Conclusions: Confronted with the risk of cross-border transmission of the infectious disease, online surveillance might be used as an innovative approach to perform public communication and health education through examining the public’s reaction and attitude. The Ebola epidemic of West Africa had been viewed as a public health emergency of international concern by the World Health Organization (WHO) on 8 August 2014, attributed to its explosive course and high fatality [1] . Ebola virus disease (EVD), also known as Ebola hemorrhagic fever, was first identified in Yambuku and the surrounding areas in Zaire and South Sudan in 1976 [2, 3] . The largest public reaction, which might provide clues for government and health authorities to reform existing modes of health education. To understand the public reaction in China to the outbreak of EVD in West Africa, we carried out an innovative network digital epidemiologic study based on the online data retrieved from 20 July to 4 September in 2014, in which the epidemics had aroused significant attention and reaction in China. According to the Chinese keyword of "埃博拉 (Ebola)", the BDI and SMI were collected from the websites of Baidu Index and Sina Micro Index daily, respectively [19, 30] . All of the Ebola-related data, including the number of cases and deaths, were collected from the websites of the World Health Organization, Centers for Disease Control and Prevention and National Ministries of Health (USA), and netizen data were from the 33rd Statistical Report on Internet Development in China [31]. 
We initiated Internet surveillance of cyber citizens' reactions to Ebola in China from 20 July. The daily BDI was treated as a vital data source: it provides the weighted sum of search frequency for a keyword, based on its daily search volume on the Baidu website. We gathered the daily BDI with "埃博拉 (Ebola)" as the keyword in cities/provinces to examine the public response. Additionally, given the differing numbers of Internet users across locations, the average BDI was calculated to identify the mean attention of the netizens (1/100 million) in each province and some cities. We used the same strategy to investigate the blogs posted and forwarded daily on the topic of "Ebola" through the Sina Micro Index (SMI). The headline news reports concerning "Ebola" were also collected. These media events were retrieved from two sources: the headlines by the Baidu Index and Baidu News [32]. The former does not carry news headlines of the same topic every day, especially for topics with minor fluctuations of BDI. The latter not only provides the related media events but also sorts focused news by the topic word. Finally, the collected media events were abstracted and categorized as positive or negative news. Media events were classified as negative if they generated negative sentiments or attitudes towards the topic of Ebola, or as positive if they aroused optimistic sentiments or supportive attitudes towards "Ebola". Classification was done by two individuals, with a third person deciding when a discrepancy existed. We graphed the curves of the Ebola outbreak in West Africa to describe the severity of the epidemic. To explore the public reaction, the average public reaction of the BDI (average BDI) among Internet users from different regions was summarized by the mean or the median (P50); P50 was used when the data distribution did not pass the test of normality.
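The normalization described above can be sketched as follows. This is a minimal illustration, not the authors' code: the text does not give the exact formula, so dividing the daily BDI by the number of netizens expressed in units of 100 million is an assumption, and the function names and sample numbers are hypothetical.

```python
from statistics import mean, median


def average_bdi(daily_bdi, netizens):
    """Return the daily BDI per 100 million Internet users (assumed formula)."""
    scale = netizens / 100_000_000  # netizens in units of 100 million
    return [value / scale for value in daily_bdi]


def summarize(values, is_normal):
    """Use the mean for normally distributed data, otherwise the median (P50)."""
    return mean(values) if is_normal else median(values)


# Hypothetical inputs for one region; not taken from the study.
daily_bdi = [400, 1200, 9000, 5000, 2500]
netizens = 50_000_000

per_user = average_bdi(daily_bdi, netizens)
print(summarize(per_user, is_normal=False))
```

Whether a region's series counts as normal would be decided beforehand by a normality test; the sketch takes that decision as the `is_normal` flag rather than computing it.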
Spearman's rank correlation was employed to check the relationship of epidemic trends with BDI and SMI. The autocorrelation analysis included general autocorrelation analysis and local autocorrelation analysis. The general autocorrelation used the global Moran's Index. According to the value of Moran's Index, the result would be determined as a clustered distribution, dispersed distribution, or random distribution, respectively [33]. When the p value of the global Moran's Index was less than 0.05, the local autocorrelation analysis would be carried out by local Getis's Gi* to identify the potential hotspots. Additionally, Kulldorff's space-time scan statistic was used to recognize the cities/provinces with high attention to "Ebola" [34]. The parameters of the maximum spatial cluster size and maximum temporal cluster size used the default settings (50%). The log likelihood ratio (LLR) was calculated by comparing the real average BDI with the expected average BDI, and a Monte Carlo test (p < 0.05) was utilized to determine the most likely clustered regions. Spatial-temporal analysis was done using ArcGIS software (version 10.1, ESRI Inc., Redlands, CA, USA) and SaTScan software (version 9.1.1, Boston, MA, USA). All of the results were considered statistically significant if p < 0.05. The current Ebola outbreak started in the Guéckédou and Macenta districts of Guinea during December 2013 [35], and WHO proclaimed the EVD outbreak on 23 March 2014. As the situation deteriorated, and on the basis of all of the available evidence, Director-General Margaret Chan of WHO declared the epidemic to be a Public Health Emergency of International Concern. Figure 1 BDI and SMI were used as the indicators for the public attention to the Ebola outbreak, and correlation analysis was used to assess the consistency of the two indices (Figure 2). The result showed a positive correlation between BDI and SMI (Spearman's rank correlation coefficient = 0.9, p ≤ 0.05).
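To make the correlation check concrete, Spearman's coefficient can be computed as the Pearson correlation of the ranks of the two series. The sketch below is a plain-Python illustration under that standard definition; the helper names and the two short series are hypothetical and are not the study's data, and the p value is not computed here.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the block while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based ranks in the block
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical daily index values, for illustration only.
bdi = [10, 20, 50, 40, 30]
smi = [2, 1, 9, 5, 7]
print(round(spearman(bdi, smi), 3))
```

Working on ranks rather than raw values is what makes the measure suitable for indices such as BDI and SMI, whose scales differ and whose distributions are skewed by short peaks.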
The BDI for the keyword of "Ebola" increased sharply from 29 July, which peaked at 101,222 on 1 August, and its BDI declined with fluctuations, remaining at a high level above 50,000 between 2 and 9 August. It dropped again on 15 August and reached a lower peak at 79,939. It then steadily decreased to 28,480 with minor fluctuations on 20 August, along with a BDI of 58,360. After that, the BDI of "Ebola" stayed at a lower level between 10,000 and 20,000, but higher than that before 29 July ( Figure 2 ). The data scale ranged from 399 to 101,222, with a median of 25,421 during the study period. We further collected the daily BDI of different provinces and some municipalities in China between 20 July and 4 September. The overall trend of the BDI in available cities and provinces was similar.
Considering the diverse frequencies among Internet users in different areas, the numbers of netizens were gathered to investigate the average attention as indicated by the average BDI in separated regions ( Figure 3 ). The top five cities/provinces in terms of the average BDI were Beijing, Tibet, Shanghai, Tianjin, and Hainan (Table 1) . This map was created by the website of dituhui for free [36] . Correlation analysis was carried out to explore potential case-related indicators resulting in the fluctuation of public attention. The associated analyses were performed of the BDI and cumulative fatality rate, BDI and cumulative case, BDI and cumulative death case, BDI and new reported case, and BDI and new reported death case. The results showed no correlation between all case-related influencing indicators and the BDI (Spearman's rank correlation, p > 0.05). We also conducted the correlation analysis of the adjusted BDI and new reported case, and adjusted BDI and new reported death case, which took into account the time difference of America and China (the adjusted BDI being the mean BDI of two adjacent days). No correlation was identified between the adjusted BDI and case-related influencing indicators (Spearman's rank correlation, p > 0.05). These results are detailed in Figures 4A, and 5-7 . SMI based on the total microblogs posted and forwarded daily for the keyword "Ebola" on the Sina microblog were also collected. The SMI rapidly increased from 2153 on 29 July to its peak at 88,761 on 30 July 2014, declined to 14,510 with fluctuations on 7 August, and stayed above the primary level before 29 July. The SMI reached another peak at about 45,860 on 11 August, which was lower than the first one, and gradually declined from 31,186 on 12 August to 2056 on 2 September (Figure 2 ). The data scale ranged from 17 to 88,761 with a median of 7756 during the study period. 
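The adjusted index used for the time-difference check can be illustrated with a short sketch. The text defines it only as the mean of two adjacent days, so pairing each day with the following day is an assumption here, as are the function name and the sample values.

```python
def adjusted_index(daily_values):
    """Mean of each day and the next day (assumed pairing).

    The last day has no following day, so the result is one element shorter.
    """
    return [(today + tomorrow) / 2
            for today, tomorrow in zip(daily_values, daily_values[1:])]


# Hypothetical daily index values and newly reported cases; illustration only.
daily_index = [100, 300, 200, 400]
new_cases = [5, 8, 6, 9]

adjusted = adjusted_index(daily_index)
# Trim the case series so both series cover the same days before correlating.
aligned_cases = new_cases[:len(adjusted)]
print(adjusted, aligned_cases)
```

The adjusted series and the trimmed case series could then be passed to a rank correlation in the same way as the unadjusted index.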
To explore the potential case-related indicator resulting in the fluctuation of public attention, the correlation analysis was carried out. The associated analyses were performed of the SMI and cumulative fatality rate, SMI and cumulative case, SMI and cumulative death case, SMI and new reported case, and SMI and new reported death case. No correlation was found between case-related influencing indicators and the SMI (Spearman's rank correlation, p > 0.05). Correlated analyses were also conducted of the adjusted SMI and new reported case, and the adjusted SMI and new reported death case, which took into account the time difference of America and China (the adjusted SMI being the mean SMI of two adjacent days). No correlation was identified between the adjusted SMI and case-related influencing indicators (Spearman's rank correlation, p > 0.05). The results are detailed in Figures 4B and 7-9 . In the spatial clustering analysis, the general analysis implied that there was significant spatial clustering for the average BDI of "Ebola" in China. The global Moran's I Index = 0.23 (p < 0.01). A local spatial autocorrelation analysis was then performed to identify the hotspot through local Getis's Gi*. Results of the local autocorrelation analysis showed that the only hotspot to "Ebola" was Tibet. Furthermore, spatio-temporal clustering of public attention to "Ebola" in the study time was carried out. The most likely cluster was identified in the 13 regions of China from 30 July-14 August 2014. The LLR was 103,962.85 with the relative risk (RR) of 2.26 (p < 0.01). It included 13 cities/provinces, namely, Tianjin, Beijing, Hebei, Shandong, Shanxi, Liaoning, Inner Mongolia, Henan, Jiangsu, Anhui, Jilin, Shaanxi, and Shanghai. The details are shown in Figure 10 . As no direct correlation was detected between case-related influencing indicators and BDI/SMI, events possibly related to the fluctuation of public reaction were listed in Figure 11 . 
Our results suggested that a series of negative news reports might cause public concern and nervousness, and subsequently induced a raised public reaction as represented by the network retrieval behavior and the number of microblogs posted and forwarded. A case in point was the report concerning one woman who returned to Hong Kong from Africa with the symptoms of Ebola disease around 30 July in 2014, an event followed by the first peak of the BDI and SMI. Another event was the announcement made around 8 August by WHO, that the Ebola outbreak was identified as an international public health emergency along with the second peak of the BDI/SMI.
On the other hand, positive news reports also influenced public attention. When the WHO spokesman deemed that Chinese people did not need to panic for the epidemic of Ebola in West Africa, the BDI/SMI dropped in the next few days from their first peak. Later on, ruling out one suspected case in Hong Kong led to the decline after the second peak. These observations implied that negative news might increase public reaction while positive news might just do the opposite. This paper reported the use of BDI and SMI to identify the Chinese public's reaction to the Ebola outbreak in West Africa from 20 July to 4 September in 2014. Compared with common network tools, including content analysis, indices such as BDI and SMI to investigate public attention possessed unique merits. Firstly, these indicators could identify nearly all retrieved information to the specific keywords on the Big Data platform, while content analysis might only be implemented in limited samples. Additionally, BDI and SMI could mirror the public attention in a timely manner, whereas conventional methods might cause bias, and even be seriously affected by information deletion in websites. In our study, both indices consistently suggested the tremendous public concern to the Ebola event in China. Then, included in the study were the centralized tendency of BDI and average public attention to the Ebola outbreak as indicated by average BDI in different cities/provinces of China. The highest BDI was observed in Guangdong, the province with the largest number of Internet users in China. This might be partly attributed to the opportunities brought about by China's booming economy, inducing large numbers of West Africans coming to southeast coastal cities including Guangdong, which might lead to overreaction by local netizens. 
Additionally, the highest average attention to the Ebola outbreak was found in Beijing, the political center of China, along with comparable average BDI in Shanghai, the economic center of the country. The direct flights from these cities to West Africa might contribute to the increase of the average BDI. Interestingly, comparable public attention to Ebola was captured in Tibet, an underdeveloped region, which might be explained by the unique geographical location. Tibet has an underdeveloped transportation system, lower population density, and limited communication, all factors probably contributing to the more frequent web access to acquire information. Therefore, more attention should be paid in Tibet concerning public health education and rumor management. The spatio-temporal analysis had identified 13 clustered regions with higher average attention in China from 30 July to 14 August. During the same period, the daily BDI also indicated higher attention than other periods in China. Thus, we thought that the 15 days after the peak of the BDI was a critical period for infectious diseases with imported risk and that necessary health education intervention should be adopted in these clustered regions. Further analyses were performed to explore the correlation between case-related influencing indicators and BDI/SMI.
Contrary to our expectation, no existing statistical correlation was established between case-related influencing indicators and BDI/SMI, which was discrepant with our previous findings [18] and other studies [7, 37] . This might be attributable to the fact that Ebola epidemics did not occur in China. The public, justifiably, pays more attention to the disease-related data when the outbreak takes place in their location. Otherwise, the public focuses more on the news reports and the notices from the authorities. The assumption was partly testified by the observation that the peaks of BDI and SMI were usually accompanied with some negative news reports and the decline of the indices followed the positive reports. Additionally, the appearance of a suspected Ebola case in Hong Kong might serve as the vital reason for the first peak of BDI and SMI. Moreover, the particular geographic location and administrative position of Hong Kong should be considered as influencing factors of the public's attention. Previous public health experience, such as SARS, indicated that no individual country could single-handedly prevent and protect itself from public health threats. Thanks to the worldwide spread of diseases coupled with the easy access to network information, communicable diseases such as Ebola, did not only affect the locations of the outbreak but also could cause panic or even major public health events in non-endemic areas. In the public health field, increased Internet searching implied the tremendous need of Ebola-related preventive knowledge to the public. That is to say, high attention areas could be more susceptible to rumors or false online information if public health authorities did not address the emerging public concerns in a timely manner. To handle the demand warranted by this emerging situation, traditional epidemiological methods and public health education modes were obviously inadequate. 
In previous studies, online surveillance aided by opinion indices may have sensitively detected the emergence of serious infectious diseases at their initial stage via the Big Data platform, which could buy time for controlling outbreaks of these diseases and reducing the risk of transmission to humans [28, 38, 39]. That is to say, this modality, unlike the classic epidemiological questionnaires and telephone interviews used to gauge the public reaction after a disease outbreak, enables surveillance beforehand and saves health resources. More importantly, online surveillance can reflect public reactions to emergency public health events and disease outbreaks in a timely manner so that public health interventions can be implemented during epidemic crises to avoid deterioration. Additionally, compared to the classical health education mode aimed at high-risk populations in high morbidity regions, our results implied a need for a shift in health education methods to a public-attention-based mode, especially in non-epidemic areas, to identify the regions of high attention. This new mode should be based on the findings of opinion monitoring through public reaction indices like BDI and SMI. In the future, different interventions should be adopted for areas with different indices, with more public communication and education targeted at high-index areas. A majority of model studies of the two historical outbreaks of Ebola in the Democratic Republic of Congo and Uganda involved time series analysis [40] [41] [42]. Further investigation was carried out to assess the dynamics of Ebola in terms of the different transmission sources [43]. However, owing to the complicated influencing factors involved in network transmission dynamics, limited research time and medical informatics, an online surveillance model to identify the public concern about Ebola has not been established. Several limitations of this study should be mentioned.
(1) Considering the deleted Ebola-related blogs, the average SMI was not calculated in our analysis, and the average attention and spatio-temporal distribution characteristics as indicated by SMI in different regions were not verified; (2) The in-depth correlative analysis was not performed because of the absence of clinically characteristic data and detailed distribution information; (3) Our study only focused on Chinese websites and netizens from the Chinese mainland, and so could not depict the public reaction of Internet users of websites in English or other languages, for instance, Twitter and Facebook; (4) Our findings have not yet been directly applied to identify public health emergencies, nor have they been corroborated with other search engines; (5) Due to lack of data, we did not analyze the association of public attention with the migrant population from Africa to China or the direct flights from West Africa to China; (6) Although a limited adjustment was used in our study, the potential time lag was not considered; (7) Traditional media such as TV and newspapers might track well with Ebola case data, but these were not considered in this study. This digital epidemiologic study suggested that online surveillance reflected significant attention in the Chinese population to the Ebola outbreak, and that BDI and SMI were rapid and efficient in identifying and evaluating public reactions. We also identified the regions that paid significant attention to the outbreak. Additionally, compared to domestic outbreaks of epidemic diseases, EVD, which had not occurred in China, might affect the public reaction through positive and negative news reports.
In sum, confronted with the risk of cross-border transmission of the infectious disease, online surveillance based on Big Data platforms might be an innovative approach to purposefully perform public communication and health education, which was helpful to avoid the occurrence of public panics and dispel rumors. Ebola and Marburg haemorrhagic fever viruses: Major scientific advances, but a relatively minor public health threat for Africa The Epidemiology of Ebola Hemorrhagic Fever in Zaire World Health Organization. Ebola haemorrhagic fever in Sudan World Health Organization. WHO Statement on the Meeting of the International Health Regulations Emergency Committee regarding the Outbreak of Ebola hemorrhagic fever in the Republic of the Congo Ebola hemorrhagic fever outbreaks in Gabon, 1994-1997: Epidemiologic and health control issues Ebola hemorrhagic fever: Tandala, Zaire, 1977-1978 The reemergence of Ebola hemorrhagic fever, Democratic Republic of the Congo Multiple Ebola virus transmission events and rapid decline of central African wildlife A limited outbreak of Ebola haemorrhagic fever in Etoumbi Emergence of Zaire Ebola virus disease in Guinea Ebola virus disease: A review on epidemiology, symptoms, treatment and pathogenesis Clinical presentation of patients with Ebola virus disease in Conakry Ebola outbreak in Conakry, Guinea: Epidemiological, clinical, and outcome features WHO Ebola Response Team. Ebola virus disease in West Africa-The first 9 months of the epidemic and forward projections Importance of Internet surveillance in public health emergency control and prevention: Evidence from a digital epidemiologic study during avian influenza A H7N9 outbreaks Chinese Internet Users Search Behavior Study Pandemics in the age of Twitter: content analysis of Tweets during the 2009 H1N1 outbreak The use of Twitter to track levels of disease activity and public concern in the U.S. 
during the influenza A H1N1 pandemic Use of hangeul twitter to track and predict human influenza infection Using social media to predict and track disease outbreaks Chinese social media reaction to the MERS-CoV and avian influenza A (H7N9) outbreaks Scoping review on search queries and social media for disease surveillance: A chronology of innovation Predicting flu trends using twitter data Early detection of an epidemic erythromelalgia outbreak using Baidu search data Monitoring influenza epidemics in China with search query from Baidu Available online: //news.baidu.com/?tn=news Notes on continuous stochastic phenomena SaTScan User Guide for Version 9.0; Department of Ambulatory Care and Prevention The international Ebola emergency Correlation between reported human infection with avian influenza A H7N9 virus and cyber user awareness: What can we learn from digital epidemiology? Media and public reactions toward vaccination during the "hepatitis B vaccine crisis" in China Assessing cyber-user awareness of an emerging infectious disease: Evidence from human infections with avian influenza A H7N9 in Zhejiang, China A likelihood-based method for real-time estimation of the serial interval and reproductive number of an epidemic Understanding the dynamics of Ebola epidemics Estimation and inference of R0 of an infectious pathogen by a removal method Potential for large outbreaks of Ebola virus disease This study was partly supported by Zhejiang Provincial Medical and Health (2016151967