key: cord-0784744-alykt22u
authors: Rovetta, Alessandro; Castaldo, Lucia
title: The Impact of COVID-19 on Italian Web Users: A Quantitative Analysis of Regional Hygiene Interest and Emotional Response
date: 2020-09-29
journal: Cureus
DOI: 10.7759/cureus.10719
sha: 6f6dfde413daa8386e21f821d946c33f58e28de3
doc_id: 784744
cord_uid: alykt22u
Background: Between the end of February and the beginning of June 2020, Italy was one of the countries worst affected by the coronavirus disease 2019 (COVID-19) pandemic. During this period, Web interest in the novel coronavirus underwent a drastic surge.
Objective: The aim of this study was to quantitatively analyze the impact of COVID-19 on Web searches related to hygiene-preventive measures and emotional-psychological aspects, as well as to estimate the effectiveness and limits of online information during an epidemic. We looked for significant correlations between COVID-19 relative search volumes and cases per region to understand the interest of the average Italian Web user during international, national, and regional COVID-19 situations. Doing so makes it possible to draw inferences about the mental and physical health of the population.
Methods: We used the Google Trends tool, which returns normalized values called relative search volumes (RSVs), ranging from 0 to 100 according to the Web popularity of a group of queries. By comparing the RSVs in periods before and after the outbreak of the novel coronavirus in Italy, we derived the impact of COVID-19 on the activity of Italian netizens towards the novel coronavirus itself and, specifically, towards hygiene, prevention, and psychological well-being. Furthermore, we calculated Pearson's correlations ρ between all these queries and COVID-19 cases for each region. We chose a p-value (p) threshold of α=.1.
Results: The general Web interest in COVID-19 in Italy waned, as did the correlation with the official number of cases per region (p<.1 only until March 14). Web interest was similarly distributed across the regions (average search volume [ASV]=92, standard deviation [SD]=6). We found that all trends depend significantly on the number of COVID-19 cases at the national but not international or regional levels. Between February 20 and June 10, Web interest related to hygiene and prevention increased by 116% and 901%, respectively, compared to the period from January 1 to February 19, 2020 (95%CIs: [115.3, 116.3], [850.3, 952.2]). Significant correlations between regional cumulative Web searches and COVID-19 cases were found between February 26 and March 7 (ρ=.43, 95%CI: [.42, .44], p=.07). During the COVID-19 pandemic until June 10, 2020, national Web searches of the generic terms "fear" and "anxiety" grew by 8% and 21%, respectively (95%CIs: [8.0, 8.2], [20.4, 20.6]), compared to those of the period of January 1, 2018 - December 29, 2019. We found cyclically significant correlations between negative emotions related to the novel coronavirus and COVID-19 official data.
Conclusions: Italian netizens showed a marked interest in the COVID-19 pandemic only when this became a direct national problem. Web searches have rarely been correlated with the number of cases per region; we conclude that the danger was perceived similarly in all regions. The period of maximum effectiveness of online information in relation to this type of situation is limited to three to four days from a specific key event. We suggest that all government agencies focus their Web disclosure efforts over that time. We found cyclical correlations with Web searches related to negative feelings such as anxiety, depression, fear, and stress.
Therefore, to identify mental and physical health problems among the population, it suffices to observe slight variations in the trend of related Web queries.
Owing to the introduction of tools for the analysis of query popularity such as Google Trends, the Internet has become one of the most important platforms for obtaining valuable data on users' health and interests [1, 2]. Over time, an increasing number of authors have utilized Google Trends, and its usage techniques have been refined through tutorial articles [3]. Alongside the increasing number of users who use the Web every day to obtain information on any topic, there is a growing need to control the circulation of this information to prevent fake news from affecting public health and the economy [4]. For this reason, new branches of science called infodemiology and infoveillance have emerged [5]. Given the coronavirus disease 2019 (COVID-19) emergency, these two disciplines will provide useful data to researchers to evaluate the impact of the pandemic on people's lives and welfare, the spread of fake news and infodemic monikers, as well as people's attitudes towards an unprecedented international issue [6]. Between the end of February and the beginning of April, Italy was the country second most affected by COVID-19, both in terms of the total number of infected and deaths, becoming the epicenter of a shifting pandemic [7]. As evidenced in other studies, an enormous amount of information regarding the novel coronavirus has circulated in the country [8]. Although most of it was moderately infodemic, there was great interest in hygiene and prevention measures. However, to our knowledge, no research in the available scientific literature has quantitatively analyzed the impact of COVID-19 on psychology- or hygiene-related Web searches by looking for correlations with the number of cases per region in Italy.
Since, as shown in other studies, the impact of COVID-19 on the mental and physical health of the Italian population has been devastating, we believe it is of fundamental importance to study the relationship between the trend of health-related queries and the real need for assistance [9]. In fact, this could help scientists estimate the well-being of the population from the trend of Web searches.
We used the Google Trends tool to investigate Italian netizens' Web interest in the COVID-19 pandemic from February 20 to June 10, 2020. Data analysis was performed from June 11 to August 2, 2020. As explained in other studies of this type, Google Trends provides normalized values, called relative search volumes (RSVs), ranging from 0 to 100 in proportion to the popularity of the queries [3]. We searched for specific keywords that recorded high RSVs, using the "related queries" and "related topics" features together with the results of another study on COVID-19 infodemiology in Italy [8]. We focused on three categories of queries: generic news, hygiene, and those related to stress and anxiety (Table 1). After collecting the data, we looked for correlations with the official data on COVID-19 provided by the Italian Civil Protection Department regarding the number of infected, hospitalized, dead, healed, and tested [10]. The abbreviations used are summarized in Table 2. When the trends showed sufficient regularity, we looked for interpolating functions that represented them. We collected all the data day-by-day from February 24 to June 10, 2020. To obtain a graphical representation of regional Web search trends, we adopted the following strategy: starting from the Google Trends relative search volume of a specific keyword group for each region and the national relative search volume for the same keyword group, we introduced a variable x, imposed a condition on it, and solved for x. Then, we calculated a weighted relative search volume for each region i.
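The symbols of the regional weighting step are not given in this text, so the following Python sketch rests on an assumed reading rather than on the authors' stated formula: if r_i is the RSV of region i and R is the national RSV for the same keyword group, a single factor x is chosen so that the rescaled regional values sum to R, which gives x = R / Σ r_i and weighted values w_i = x · r_i. The function name, the argument names, and the sample numbers are hypothetical.

```python
def weighted_regional_rsv(regional_rsv, national_rsv):
    """Rescale regional RSVs so that they sum to the national RSV.

    ASSUMPTION: the constraint is sum_i(x * r_i) = R, hence x = R / sum_i(r_i).
    This is one plausible reading of the weighting step, not a confirmed formula.
    """
    total = sum(regional_rsv.values())
    if total == 0:
        raise ValueError("all regional RSVs are zero; the factor x is undefined")
    x = national_rsv / total  # common scaling factor shared by all regions
    return {region: x * r for region, r in regional_rsv.items()}


# Illustrative values only (not data from the study).
example = weighted_regional_rsv({"A": 100, "B": 50, "C": 50}, national_rsv=80)
print(example)
```

Under this reading the weighted values keep the regional proportions reported by Google Trends while being expressed on the scale of the national series, which makes maps from different days comparable.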
After that, we calculated Pearson's correlations between the daily RSVs and the number of active COVID-19 cases, total cases, new cases, isolations, recovered, new deceased, and total deceased, for all regions. We repeated the same procedure with cumulative values in the same time interval. The number of regions investigated was N=19. We chose α=.1 (10%) as the p-value (p) significance threshold. We also reported the thresholds for α=.05 (5%) and α=.01 (1%). We calculated the average search volume (ASV) values as the average values of RSVs over the time intervals specified in the results. For each ASV we reported a Gaussian 95% confidence interval based on the standard deviation (SD). We denote by ρ best the arithmetic mean of specific groups of correlations shown in the results. To estimate its confidence interval, we used the error propagation formula, with the standard deviations taken from the corresponding Gaussian distributions. We used Igor Pro 6.3.7 and Microsoft Excel 2019 software for interpolations and data processing. We calculated percentage discrepancies between the ASVs of the compared periods. We calculated all errors on the functions used through the propagation formula of standard errors [11]. For all interpolations and some data, we reported the best values and the relative standard deviations; in such cases, we have kept all possible significant figures provided by the software. The normality of each data group was verified both graphically and through requirements on the kurtosis k and the skewness s [12, 13].
The Web interest of Italian users towards COVID-19 was similarly distributed among all regions (ASV=92, SD=9) and dropped over time [Table 3, Figure 1]. Its trend, after the last peak on March 11 until June 10, 2020, is well represented by an exponential function. The arithmetic mean of all the average daily COD correlations between February 24 and June 10, 2020, is not statistically significant (ρ best =.17, 95%CI: [.15, .19], p=.48).
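The significance statements above combine a Pearson coefficient computed across N=19 regions with the threshold α=.1. The sketch below shows how such a p-value can be obtained using only the Python standard library. It assumes the usual two-sided Student's t test with N−2 degrees of freedom, which the text does not state explicitly, and it is not the authors' workflow (they report using Igor Pro and Excel). All function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p(r, n, steps=2000):
    """Two-sided p-value for H0: rho = 0 (ASSUMED test: t with n - 2 df).

    The central probability P(0 < T < t) is integrated with Simpson's rule;
    steps must be even.
    """
    if abs(r) >= 1:
        return 0.0
    df = n - 2
    t = abs(r) * math.sqrt(df / (1 - r * r))
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3
    return max(0.0, 1 - 2 * central)


ALPHA = 0.1      # significance threshold used in the study
N_REGIONS = 19   # number of regions investigated

p = two_sided_p(0.43, N_REGIONS)
print(f"p = {p:.3f}; significant at alpha = {ALPHA}: {p < ALPHA}")
```

Under these assumptions, a coefficient of .43 over 19 regions yields a p-value of about .07, which matches the value quoted for the cumulative hygiene-related searches and falls below α=.1 but above α=.05.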
We found significant positive daily COD correlations at the end of February with hospitalized, home isolated, and recovered patients, as well as with new, active, and total cases, and new and total deaths (ρ best =.45, 95%CI: [.44, .46], p=.05). After that, other substantial positive COD correlations occurred only two times [Figure 2]. We found significant negative daily COD correlations from March 8 onwards; namely, the correlation with the number of COVID-19 swab tests reached several significant negative peaks between May and June 2020 (ρ best =−.48, 95%CI: [−.51, −.45], p=.04). The correlations between cumulative COVID-19 Web searches and COD maintained significant values until around March 14, although these also showed a declining trend [Figure 3]. In fact, excluding the items "recovered", "Δ total infected", and "swabs", this correlation (February 24 - June 10, 2020) can be represented by an interpolating function.
Generic Web search interest related to disinfectants went from ASV=5. 14 Figure 4 ]. Considering daily Web searches, we found significant COD correlation values in very few isolated cases [Figure 5]. Finally, regarding cumulative Web searches, we highlighted significant positive COD correlations from February 26 to March 7, 2020, with all fields except "home isolation" and "swabs" (ρ best =.43, 95%CI: [.42, .44], p=.07) [Figure 6]. Changes in the distribution of interest were more pronounced than those for generic news (ASV=86, SD=9; ASV=85, SD=9) [Table 1].
Regarding the Web emotional response to COVID-19, we obtained the following results: nationally, comparing the average weekly RSVs of the two periods January 1, 2018-December 31, 2019, and January 1, 2020-June 10, 2020, we observed decreases of 3. novel coronavirus made up about 1.4% that related to COVID-19 (95%CI: [1.2, 1.5]). Regarding the latter, it was only possible to look for COD correlations with the cumulative Web interest because the data on the daily Web interest were too uncertain.
We found significant cyclically positive COD correlations throughout the investigated period [Figure 8]. To support this, Web interest in negative emotions has had extremely different distributions across regions (ASV=55, SD=41).
The quantitative investigation of Web users' interests in a moment of crisis is a priority for testing the effectiveness of online information and the perception of risk by the population. This is the first study to carry out such an analysis in Italy, one of the countries worst affected by the COVID-19 pandemic. Our results show that interest in the novel coronavirus exploded only when two conditions held at the same time: i) the presence of the virus in the nation, and ii) a significant number of cases. Even the state of emergency declared on January 31, 2020, due to a case of two Chinese tourists visiting Italy while infected with COVID-19, caused only a very moderate Web interest for a limited duration [14]. We also highlighted a disparity between online searches and the actual perception of the danger: in fact, as shown in other studies and reported by national and international newspapers, the first episodes of racism against Chinese people and the so-called "coronavirus psychosis" occurred in the same period [8, 15, 16]. Therefore, we believe it is plausible that the analysis of Web interest in Italy shows genuine interest only if the issue exceeds a certain severity threshold. We believe that the climate of uncertainty created by the press and by the many conflicting opinions of scientists negatively influenced the perception of the risk linked to COVID-19 in the Italian population. In particular, some compared it to influenza, while others rightly pointed out the dangers and critical issues, also related to the lack of reliable information [17, 18]. When the problem became national, we recorded two peaks in the Web queries (February 27, 2020 and around March 14, 2020).
Significant correlations between Web interest and regions with multiple cases rarely occurred in the initial phase, i.e., Web interest in the novel coronavirus was linked only to its presence on national soil. Since the aforementioned peaks had a maximum duration of about seven days, we believe that online information is effective only in that time frame. The regional Web interest in this topic was very similarly distributed. This leads us to propose two considerations: i) the online information on a problem in its initial phase can heavily influence the thinking of the Italian population; ergo, it is important that those who present information online are clear right away, and ii) the online information on a problem in its initial phase can have a deep impact on the problem's evolution.
Positive data emerged from Web interest in hygienic precautions, such as disinfectants and masks. In fact, the respective increases of approximately 115% and 901% compared to the periods preceding the virus signal a drastic change in netizens' habits to face the pandemic. We must weigh such a result against the fact that the mean value of these queries corresponds to approximately 13% of all novel coronavirus-related queries. The lack of long-term correlations with the number of cases and the low variance of the data suggest that the interest was similarly distributed among all regions; this helped to avoid the spread of the epidemic nationwide [Table 1]. Even in this case, we have seen two peaks in queries and then a waning interest. This does not mean that the actual interest in hygiene has decreased, since disinfectants, masks, and gloves have experienced a substantial increase in sales [19].
General interest in negative feelings such as stress and depression fell during the lockdown (around 3% and 16%, respectively). The RSVs of anxiety and fear increased by approximately 21% and 8%, respectively.
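The percentage changes discussed above compare the ASV of an earlier period with that of a later one. The exact formulas are not spelled out in this text, so the Python sketch below assumes standard definitions: a Gaussian 95% confidence interval of mean ± 1.96·SD/√n, a percentage change of 100·(b − a)/a, and first-order error propagation for independent means a and b. Function names and sample values are hypothetical, not data from the study.

```python
import math
from statistics import mean, stdev


def asv_with_ci(rsv, z=1.96):
    """Average search volume, its standard error, and a Gaussian 95% CI.

    ASSUMPTION: CI = mean +/- z * SD / sqrt(n), with the sample standard deviation.
    """
    m = mean(rsv)
    se = stdev(rsv) / math.sqrt(len(rsv))
    return m, se, (m - z * se, m + z * se)


def percent_change(before, after):
    """Percentage change between two ASVs with a propagated standard error.

    ASSUMPTION: delta = 100 * (b - a) / a and, for independent a and b,
    sigma_delta = 100 * (b / a) * sqrt((sigma_a / a)^2 + (sigma_b / b)^2).
    Both means must be non-zero.
    """
    a, sigma_a, _ = asv_with_ci(before)
    b, sigma_b, _ = asv_with_ci(after)
    delta = 100.0 * (b - a) / a
    sigma = 100.0 * (b / a) * math.sqrt((sigma_a / a) ** 2 + (sigma_b / b) ** 2)
    return delta, sigma


# Illustrative weekly RSVs (made-up numbers).
pre_outbreak = [10, 12, 8, 10]
post_outbreak = [20, 22, 18, 20]
delta, sigma = percent_change(pre_outbreak, post_outbreak)
print(f"change = {delta:.1f}% (standard error {sigma:.1f} percentage points)")
```

The propagated error shrinks as the number of RSV samples in each period grows, which is one way narrow intervals can arise around large percentage changes when many daily values are averaged.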
Therefore, it is plausible that there has been a shift in interest towards the latter. Direct associations between these symptoms of distress and the novel coronavirus made up about 1% of total COVID-19 queries. The data had a very high variance and high linear correlation values, i.e., the interest was much more pronounced in some regions, such as Emilia Romagna, Piedmont, Campania, Lombardy, and Puglia [Table 1]. The inconsistent trend of the correlation between cumulative Web searches and COVID-19 official data shown in Figure 8 suggests that the number of total searches was low, as it was easily influenced from day to day. The effects of the novel coronavirus and its lockdown on the mental health of the Italian population have been serious, as shown in other studies [9]. Therefore, even a small number of these types of queries can be a meaningful signal of mental disorders.
Web searches provide quantitative values only for users who use the Internet to obtain information on certain topics. All the queries investigated were collected from the Google search engine.
Italian netizens showed a marked interest in the COVID-19 pandemic only when this became a direct national problem. In general, Web searches have rarely been correlated with the number of cases per region; we conclude that the danger, once it arrived in the country, was perceived similarly in all regions. We can state that the period of maximum effectiveness of online information, in relation to this type of situation, is limited to three to four days from a specific key event. If such a scenario were to occur again, we suggest that all government agencies focus their Web disclosure efforts over that time. Despite this, we found cyclical correlations with Web searches related to negative feelings such as anxiety, depression, fear, and stress.
Therefore, to identify mental and physical health problems among the population, it suffices to observe slight variations in the trend of related Web queries.
This article was previously published as a preprint on the medRxiv database [20].
Human subjects: All authors have confirmed that this study did not involve human participants or tissue.
Animal subjects: All authors have confirmed that this study did not involve animal subjects or tissue.
In compliance with the ICMJE uniform disclosure form, all authors declare the following:
Payment/services info: All authors have declared that no financial support was received from any organization for the submitted work.
Financial relationships: All authors have declared that they have no financial relationships at present or within the previous three years with any organizations that might have an interest in the submitted work.
Other relationships: All authors have declared that there are no other relationships or activities that could appear to have influenced the submitted work.
The use of Google trends in health care research: a systematic review
The utility of "Google Trends" for epidemiological research: Lyme disease as an example
Assessing the methods, tools, and statistical approaches in Google trends research: systematic review
Our World in Data
Infodemiology and infoveillance: framework for an emerging set of public health informatics methods to analyze search, communication and publication behavior on the internet
Global infodemiology of COVID-19: analysis of Google web searches and Instagram hashtags
COVID-19-Related web search behaviors and infodemic attitudes in Italy: Infodemiological study
Effects of Covid-19 lockdown on mental health and sleep disturbances in Italy
Italian Civil Protection Department, COVID-19 updates
An Introduction to Error Analysis: The Study of Uncertainties in Physical Measurements
Problematic standard errors and confidence intervals for skewness and kurtosis
Analysis of Skewness and Kurtosis
UPDATE 2-Italy declares coronavirus emergency after first two cases confirmed
Outbreaks of Xenophobia in West as Coronavirus Spreads
La psicosi da Coronavirus in Italia è ingiustificata: il rischio contagio è improbabile
Lessons from the Italian Media's Coverage of the Coronavirus
Coronavirus: Roberto Burioni parla delle notizie, dei pericoli e dei possibili sviluppi
Fase 2, dai reagenti per i tamponi alle mascherine perché tutto è esaurito
The impact of COVID-19 on Italy web users: a quantitative analysis of regional hygiene interest and emotional response