key: cord-1049302-xspxh0gp
authors: Symum, H.; Ali, K. M.
title: Monitoring COVID-19 related public Interest and population Health Literacy in South Asia: An Internet Search-Interest Based Model
date: 2020-08-26
journal: nan
DOI: 10.1101/2020.08.24.20180943
sha: 0b28559e65279a90edf305375c957ee6dedcdc25
doc_id: 1049302
cord_uid: xspxh0gp

Background: Information epidemiology based on internet search data can be used to model COVID-19 pandemic progression and monitor population health literacy. However, the applicability of internet searches to monitoring COVID-19 infections and public health awareness in South Asian countries is unclear.

Objectives: To assess the association of public interest and health literacy in COVID-19 with the actual number of infected cases for countries in South Asia.

Methods: Google Trends data from January to March 2020 were used to correlate public interest and health literacy with official data on COVID-19 cases using the relative search volume (RSV) index. Public interest in COVID-19 was retrieved as RSV indices using the search term "Coronavirus (Virus)". Similarly, an OR combination of the search terms "hand wash", "face mask", "hand sanitizer", "face shield", and "gloves" was used to retrieve RSV indices as a surrogate of population health literacy in COVID-19. Daily confirmed COVID-19 cases were obtained from the COVID-19 data repository managed by Johns Hopkins University. Country-level time-lag correlation analyses were performed for time lags between -30 and +30 days.

Results: COVID-19-related worldwide public interest reached a peak on March 16, 2020, right after the WHO declared the coronavirus outbreak a pandemic. COVID-19-related public interest reached its highest peak in South Asian countries a few days after each country reported its 100th confirmed case. There were significant positive correlations between COVID-19-related public interest and daily laboratory-confirmed cases in all countries except Nepal, Bhutan, and Sri Lanka.
The highest public interest in South Asian countries occurred on average 12 days before the local maximum of new confirmed cases. Similarly, web searches related to personal hygiene and COVID-19 preventive measures in South Asia correlated with the number of confirmed cases as well as with national restriction measures.

Conclusion: Public interest indicated by RSV indices can help to monitor the progression of an outbreak such as the current COVID-19 pandemic, particularly in countries that lack diagnostic and surveillance capacity, and thereby help distribute appropriate health information to the general public.

The coronavirus disease 2019 (COVID-19) pandemic represents an unprecedented global healthcare emergency, with more than 20 million laboratory-confirmed cases and 730,000 deaths between February and July 2020. 1 COVID-19, which is caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), was first reported in Wuhan, China, on December 31, 2019. 2 Since December 2019, the outbreak has continued to spread exponentially worldwide due to the high virulence and the prevalence of asymptomatic cases. 3, 4 By March 2020, COVID-19 had spread globally with more than 400,000 confirmed cases, which prompted the World Health Organization (WHO) to declare COVID-19 a global pandemic. 5 As of August 10, 2020, COVID-19 had affected more than 227 countries and territories with more than 7 million active cases, and the number was still rising exponentially. 1, 6 South Asia is one of the leading COVID-19 affected regions with more than 20 million (28.57% of global cases) confirmed cases as of August 2020. India, Pakistan, and Bangladesh were the most COVID-19 affected countries in the South Asia region with 1,695,988, 278,305, and 237,661 confirmed cases, respectively. 6 As availability and usage have increased worldwide over the last few years, the internet has become the main source of information, particularly for healthcare-related concerns.
7, 8 With over 4.5 billion active internet users around the world, millions of people search online for health-related queries, which makes Web search queries a valuable source of public health infoveillance. 9, 10 Understanding Web search trends can provide valuable information about public interest and awareness in health emergencies as a proxy for public health risk perception. 8, 11 Prior studies used internet search queries to model outbreaks of infectious disease (e.g., dengue and influenza), track substance use, and monitor public behavior. 12, 13 Subsequently, internet search queries have also been used to investigate public interest, health awareness, and mental health during the COVID-19 pandemic. 14, 15, 16 However, most of the studies so far have focused on China, the United States, and European countries. To our knowledge, no infodemiology study to date has explored the association of web queries with COVID-19 public interest and awareness in South Asian countries. With more than 880 million internet users in South Asia (18.87% of global users), tracking Web search queries can be a real-time health informatics tool to strengthen public health surveillance in health emergencies such as the COVID-19 pandemic. 9, 10, 17 Therefore, this study explored the potential use of internet search trends for monitoring public interest and preventive health awareness towards COVID-19 infections in South Asia.

medRxiv preprint doi: https://doi.org/10.1101/2020.08.24.20180943; this version posted August 26, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Daily data on confirmed COVID-19 cases were obtained from the data repository managed by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University.
6 Data were retrieved for worldwide total new cases as well as for the following individual countries: the United States (US), Bangladesh, Bhutan, India, Maldives, Nepal, Pakistan, and Sri Lanka, from January 22, 2020 to July 29, 2020.

The Google Trends analytics platform was used to explore internet user search activity related to COVID-19 public interest and population health literacy. Google Trends is an online tracking system of internet search volumes, which enables researchers to study trends and temporal patterns of popular search queries. 18 Google Trends determines the proportion of searches for user-specified terms among all web queries on the Google Search website and other affiliated Google sites for a given location and time. 19 Google Trends then normalizes this proportion by the highest query share of that term over the time series and reports search interest as a relative search volume (RSV) index. The retrieved RSV index values range from 0 to 100, with a value of 50 representing half the public interest of a value of 100. 20 RSV indices have previously been used to analyze public interest in various health-related issues as well as for passive health surveillance and disease monitoring. 13, 14, 21 The search term "Coronavirus (Virus)" was used to retrieve worldwide and nation-specific RSV indices in Google Trends as a representation of public interest in COVID-19 information. Also, a combination of the popular search phrases "hand wash", "face mask", "hand sanitizer", "face shield", and "gloves" was used to retrieve RSV indices as a surrogate of public interest in the practice of personal hygiene and other COVID-19 preventive measures. Changes in the temporal trends of public interest and the number of COVID-19 cases were analyzed graphically against nation-specific major events and policy initiatives (e.g., first coronavirus case, 1,000 cumulative deaths, nationwide lockdown).
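The RSV scaling described above can be illustrated with a minimal Python sketch. It assumes that raw daily counts for the term and for all queries are available, which Google Trends does not expose; the function name `relative_search_volume` and its inputs are hypothetical, and the sketch is not Google's actual procedure, which is not published in detail.

```python
def relative_search_volume(term_counts, total_counts):
    """Scale a term's daily query share so that the peak day equals 100.

    term_counts  -- hypothetical daily counts of searches for the term
    total_counts -- hypothetical daily counts of all searches
    """
    if len(term_counts) != len(total_counts):
        raise ValueError("both series must cover the same days")

    # Query share: proportion of all searches that were for the term.
    shares = [t / n if n > 0 else 0.0
              for t, n in zip(term_counts, total_counts)]

    # Normalize by the highest share in the series, then scale to 0-100.
    peak = max(shares, default=0.0)
    if peak == 0:
        return [0] * len(shares)
    return [round(100 * s / peak) for s in shares]


if __name__ == "__main__":
    # The middle day has the highest share, so it maps to 100.
    print(relative_search_volume([10, 50, 25], [1000, 1000, 1000]))
```

Because each series is scaled to its own peak, RSV values from separately retrieved time frames or countries are not directly comparable in absolute terms.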
The temporal association between public interest in COVID-19 and the number of new confirmed cases was analyzed using time-lag correlation measures. Lag correlation assesses whether increases in Google Trends (GT) data are correlated with subsequent increases in COVID-19 cases and by how many days the two series are offset. 22 The Pearson correlation coefficient was used to correlate public interest with COVID-19 cases for time lags between -30 and +30 days. Data from South Asian countries were also compared with those from the United States, one of the countries most affected by the COVID-19 pandemic, to investigate whether these search terms are applicable to other communities. For the general worldwide interest and confirmed new case data, the period was set from January 22 to March 17, 2020, while for the individual countries it was set from February 15 to March 17. For the South Asian countries' time-lag and correlation analysis, case data from February 22 to March 17 were used. Each country's data were examined individually, and no direct comparison was made between countries in COVID-19 data or RSV index data. For each time frame, a new Google Trends dataset was retrieved and matched with the official COVID-19 confirmed case data for further analysis. All statistical analyses were performed using RStudio, and a p-value less than 0.05 was considered significant. IRB approval was not required because this study did not involve human subjects.
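The time-lag correlation analysis can be sketched as follows. This is an illustrative Python version, not the authors' R code; the function names and the sign convention (a positive lag means that search interest leads cases by that many days) are assumptions made for the example, and significance testing is omitted.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient, or None if it is undefined."""
    n = len(x)
    if n != len(y) or n < 3:
        return None
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None  # a constant series has no defined correlation
    return sxy / sqrt(sxx * syy)


def lagged_correlations(rsv, cases, max_lag=30):
    """Correlate rsv[t] with cases[t + lag] for each lag in -max_lag..+max_lag.

    Assumed convention: lag > 0 means searches precede cases by `lag` days.
    """
    if len(rsv) != len(cases):
        raise ValueError("both series must cover the same days")
    n = len(rsv)
    max_lag = min(max_lag, n - 1)
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = rsv[:n - lag], cases[lag:]
        else:
            x, y = rsv[-lag:], cases[:n + lag]
        r = pearson(x, y)
        if r is not None:
            result[lag] = r
    return result


def best_lag(rsv, cases, max_lag=30):
    """Return (lag, r) for the lag with the highest correlation."""
    corr = lagged_correlations(rsv, cases, max_lag)
    return max(corr.items(), key=lambda kv: kv[1])


if __name__ == "__main__":
    # Toy data: the case curve repeats the search curve three days later.
    rsv = [0, 1, 4, 9, 6, 2, 1, 0, 0, 0]
    cases = [0, 0, 0, 0, 1, 4, 9, 6, 2, 1]
    print(best_lag(rsv, cases, max_lag=5))
```

In the toy data the correlation is highest at a lag of +3 days, which under the assumed convention would be read as search interest peaking three days before cases.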
Worldwide public interest in coronavirus started to increase on January 22, 2020 and reached its first peak on January 31, 2020, when the WHO declared COVID-19 a public health emergency of international concern (Figure 1). COVID-19-related worldwide searches then remained low for some time and increased continuously after word spread of the outbreak in Wuhan, China. COVID-19-related worldwide public interest continued to expand and reached a peak on March 16, 2020, as worldwide coronavirus cases and related deaths were reported and right after the WHO declared the coronavirus outbreak a pandemic. Worldwide COVID-19-related searches remained steadily high for one week and again reached a peak on March 22, 2020, after the massive spread of coronavirus in Europe and the US. There were two small peaks: a sharp increase on February 12, 2020, when China adjusted its reported coronavirus cases based on confirmatory laboratory tests, and another on April 12, 2020, due to cases around the globe. As of July 29, 2020, the daily number of confirmed coronavirus cases was still increasing almost every day and had not yet reached its highest point.

The numbers of reported confirmed COVID-19 cases in Bhutan, Afghanistan, Maldives, Nepal, and Sri Lanka were low compared to other South Asian countries. India, Pakistan, and Bangladesh were the three most COVID-19 affected South Asian countries and overtook China in terms of the number of coronavirus cases. COVID-19-related searches in India, Pakistan, and Bangladesh reached their initial peak right after the first confirmed case was reported in the respective country.
After the first peak, public interest in COVID-19-related information increased rapidly following reports of confirmed cases in mainland China and Europe. Public searches about the coronavirus in these countries reached their second major peak right after ... Table 1).

Close monitoring and continued evolution of enhanced communication strategies are urgently needed to provide general and vulnerable populations with actionable information for self-protection and clear guidance during an outbreak. 24 The application of electronic media, specifically internet data, in health care research, known as infodemiology, is a promising new field that provides unmatched opportunities for the management of health information generated by end users. 10 Using this unique potential, previous researchers were able to correlate internet searches with traditional surveillance data and could even predict outbreaks of infectious disease several days or weeks earlier. 11, 13 Recent COVID-19-related infodemiology studies modeled daily laboratory-confirmed or suspected cases and associated deaths with internet search queries in the US, China, Iran, and several European countries. 16, 17, 18, 19, 23, 25 Also, regression and machine learning models using Google search queries moderately predicted the incidence of COVID-19 in Iran. Similar attempts were previously made to predict earlier coronavirus-related outbreaks (e.g., SARS, MERS) and other infectious diseases (e.g., Zika, dengue). 11, 13, 26
Despite the studies above, there is still a lack of research on this theme in South Asia, and so far there is no similar COVID-19 infodemiology study in the countries of this region. According to the WHO, South Asia, as one of the most populous regions of the world, is considered highly vulnerable to any large-scale outbreak of an infectious disease. 27 Therefore, the use of infodemiology, particularly internet searches, provides unique opportunities to monitor real-time public interest and awareness, particularly in countries that lack diagnostic and surveillance capacity, and thereby disseminate evidence-based health information to the people. 28 In South Asia, the first imported coronavirus case was reported in Nepal on January 23, 2020. 6 From March 2020, the number of coronavirus cases began to increase rapidly in South Asian countries except for Bhutan (Supplemental Table 1) ... which is likely to be related to the number of local cases and lower virulence of COVID-19. As of March 15, 2020, the number of patients confirmed with COVID-19 in Bangladesh (5 cases) was lower than in India (110 cases) and Pakistan (31 cases). 6 Within two to three weeks after the peaks, internet searches continued to decline due to massive dissemination of information through local and national news reporting, video news reporting, and health expert reporting on social media. 16, 17 These findings suggest that internet searches can potentially help governments to define the proper timing of risk communication, improve the public's vigilance, and strengthen the publicity of precautionary measures when facing public health emergencies like COVID-19.

This study has some limitations that should be acknowledged. First, this study used a single search engine, Google, to retrieve population interest data from the South Asian countries.
Thus, there might be selection bias, since people who use other search engines are not included in this investigation. However, since more than 880 million people in South Asia use the internet and Google is the major search engine (more than 98% market share), Google search queries can be a strong tool to estimate public interest. 9, 33 Second, Google Trends reports search query results in the form of relative search values instead of absolute search volumes, which might have limited more in-depth and precise investigations. In addition, Google Trends excludes all search results with any typographical error in the query terms. Third, although the number of studies based on Google Trends is increasing, Google does not provide detailed information about the procedures by which it generates search data, and the study population responsible for the searches remains unclear. 13 Finally, search volumes can be influenced by the dissemination of information through the news media, and it is still unclear whether changes in online activity translate to changes in health behavior. This requires caution when analyzing results and making interpretations from the analyses.

This work did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
An interactive web-based dashboard to track COVID-19 in real time
A novel coronavirus outbreak of global health concern
COVID-19 infection: Origin, transmission, and characteristics of human coronaviruses
Presumed Asymptomatic Carrier Transmission of COVID-19
COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE)
Help seeking behavior and the Internet: A national survey
Trends of infodemiology studies: a scoping review
Global Digital Overview. DataReportal. Accessed
Infodemiology and infoveillance: Scoping review
The application of internet-based sources for public health surveillance (infoveillance): Systematic review
Google trends: A web-based tool for real-time surveillance of disease outbreaks
The Use of Google Trends in Health Care Research: A Systematic Review
Rapid Decline in Online Search Queries for Hip and Knee Arthroplasties Concurrent With the COVID-19 Pandemic
Analysis of dermatologic conditions in Turkey and Italy by using Google Trends analysis in the era of the COVID 19 pandemic
Google Trends: Opportunities and limitations in health and health policy research. Health Policy
Google Trends in Infodemiology and Infoveillance: Methodology Framework
Silver lining of COVID-19: Heightened global interest in pneumococcal and influenza vaccines
Assessing the methods, tools, and statistical approaches in Google trends research: Systematic review
Association of the COVID-19 pandemic with Internet Search Volumes: A Google TrendsTM Analysis
COVID-19: what is next for public health?
Predicting COVID-19 Incidence Through Analysis of Google Trends Data in Iran: Data Mining and Deep Learning Pilot Study
Scoping review on search queries and social media for disease surveillance: A chronology of innovation
Communicable diseases in the South-East Asia Region of the World Health Organization: Towards a more effective response