title: Public Concern and the Financial Markets during the COVID-19 outbreak
authors: Costola, Michele; Iacopini, Matteo; Santagiustina, Carlo R.M.A.
date: 2020-05-14

We measure public concern during the outbreak of the COVID-19 disease using three data sources from Google Trends (YouTube, Google News, and Google Search). Our findings are three-fold. First, public concern in Italy is found to be a driver of the concerns in other countries. Second, we document that Google Trends data for Italy explain the stock index returns of France, Germany, Great Britain, the United States, and Spain better than their country-based indicators. Finally, we perform a time-varying analysis and find that the most severe impacts on the financial markets occur at each step of the Italian lock-down process.

Google Trends (GT) provides indices based on the relative web-search volumes of a specific topic over time. These indices can be retrieved for selected geographic areas or at the worldwide scale. The interpretation of GT indices is straightforward: the higher the value of a given GT index, the greater the public attention for that topic. In recent years, Google Trends data have been shown to have explanatory and forecasting power in several fields of economics and finance. In relation to financial markets, GT indices based on stock-related terms: (i) lead to higher portfolio diversification with respect to the market benchmark [9], and (ii) can be used as sentiment indicators [5] and early warning market signals [2, 4]. On the macroeconomic side, GT indices have been used to construct economic uncertainty indicators able to explain several macroeconomic variables [6, 3, 7]. GT data have also been successfully used for disease surveillance purposes for MERS [11], Chicken-Pox [1] and Flu [12].
In this paper, we retrieve GT country indices (GT-COVID-19) for the coronavirus topic from January 2020 to April 2020. We use them as proxies for country-level public concern and investigate their impact on the financial markets during the outbreak of the coronavirus disease. The COVID-19 pandemic is an interesting case to investigate since, due to its virulence and infectivity, it represents a major exogenous shock to the economic and financial system, which could not have been reasonably foreseen. In our analysis, we consider the six most impacted countries worldwide in terms of confirmed cases, as of 1 May 2020: the United States, Spain, Italy, Great Britain, Germany and France.

Our findings are three-fold. First, the GT-COVID-19 index for Italy is found to be a driver of the GT indices for all the countries considered. This lead-lag relationship is primarily due to the fact that Italy was the first European country to experience an outbreak of COVID-19 and the first since World War II to implement lock-down measures. In addition, several European governments introduced restrictions on travel to and from Italy during the first weeks of the outbreak. The ensuing spread of the coronavirus disease in other countries produced similar dynamics in new cases and in the lock-down measures implemented. Therefore, the delayed reaction of those indices is probably due to the sequence of the aforementioned events.

Second, we analyse whether GT-COVID-19 indices explain stock market returns. Given that an epidemic disease is by definition an adverse event, GT indices can be interpreted as a measure of coronavirus-related uncertainty and perceived risk. Our findings show that GT-COVID-19 indices help explain the dynamics of the stock market returns for Italy, Spain and Germany.
Interestingly, substituting country-specific GT indices with the Italian one magnifies the exposure of all the markets considered to COVID-related public concern, boosting the explained variance of stock index returns. This highlights that the pandemic outbreak in Italy may have played a role in shaping the general perception of the severity of the pandemic. In this respect, we finally perform a time-varying analysis to investigate the impact of the GT-COVID-19 indices on the financial markets over time. We identify the most severe impacts during each step of the Italian lock-down process and find almost no impact prior to the beginning of the pandemic in Europe, even though the presence of COVID-19 in China was already known by the end of December 2019.

The structure of the paper is as follows: Section 2 briefly illustrates the dataset and the data collection process, then Section 3 presents the main results. Finally, Section 4 concludes.

We collected social and financial data for a panel of $N = 6$ countries, namely Germany (DE), France (FR), United Kingdom (GB), United States (US), Italy (IT) and Spain (ES), for $T = 75$ working days, from 1 January 2020 to 14 April 2020. As for the financial data, we downloaded the closing price of each country's stock market index at daily frequency from Bloomberg and computed the log-returns.

Social data collection

The raw social data were systematically collected from Google using the gtrendsR package [10], which sends requests to the Google Trends website by setting the query parameters in the request URL (see Table 1). We downloaded daily relative search volumes per country, from 01/01/2020 to 14/04/2020, on Google Search (all), YouTube and Google News, matching the "coronavirus" topic. The time series were then rescaled to the [0, 1] interval. Topics are groups of terms that have equivalent meanings in different languages.
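The two preprocessing steps mentioned above (log-returns from closing prices, and rescaling of the GT series to [0, 1]) can be sketched as follows. This is a minimal illustration, not the authors' code: the text does not state which rescaling was used, so min-max rescaling is an assumption here (dividing the 0-100 Google Trends values by 100 would be another possibility), and the function names and numbers are hypothetical.

```python
import math


def log_returns(prices):
    """Daily log-returns r_t = ln(P_t / P_{t-1}) from a list of closing prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]


def rescale_unit_interval(series):
    """Min-max rescaling into [0, 1] (assumed method, not confirmed by the text)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled")
    return [(x - lo) / (hi - lo) for x in series]


# Made-up numbers for illustration only (not data from the paper).
closing = [100.0, 102.0, 99.0, 99.0]
gt_raw = [12, 40, 100, 55]

returns = log_returns(closing)              # one value fewer than the price series
gt_scaled = rescale_unit_interval(gt_raw)   # smallest value maps to 0, largest to 1
```

Note that the return series is one observation shorter than the price series, so the GT series has to be aligned to the same dates before any regression.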
Searching by topic, rather than by term(s), has several practical and theoretical advantages, including:

• No "priors" affecting results: with this method a topic is taken as given, i.e. there is no way to strategically "stretch" or "prune" it. Researchers, for their part, do not have to rely on their prior knowledge of the topic to choose the terms used to identify it. For example, one could erroneously specify a set of terms that are not representative of the topic under study;

• No need to "translate" the query terms: the topic's identifier is unique for all languages and all countries in the world. Therefore, with the same identifier one can extract data about a topic for any country, in whatever language. This is a major advantage for studies that cover countries with different languages and different degrees of usage of the various technical terms related to a topic (e.g. COVID-19 vs Coronavirus). It also makes it possible to account for searches made in a country in a language other than the official one(s). The data obtained are therefore also representative of searches made by linguistic minorities and foreigners;

• Overcoming the "number of terms" limit: Google Trends imposes a limit on the number of terms that can be used jointly to build a query, so when a topic is related to many terms, searching by terms becomes impractical.

Preliminary analysis

Figure 1 shows the time series of log-returns and the three groups of GT indices, subdivided by source. For each country $i \in \{DE, FR, GB, US, IT, ES\}$, we observe the log-return series, $y_{i,t}$, and three GT indices $GT_{j,i,t}$, where $j \in \{Y, N, S\}$ corresponds to YouTube, Google News and Google Search, respectively. The country-specific GT series have a common dynamic, whereas for each source there is cross-country heterogeneity. We remark that high values of any GT index correspond to intervals in which financial returns exhibit turbulent dynamics.
This is particularly evident throughout March. Moreover, we notice that the Italian GT indices (for each source) anticipate those of the other countries. To test for the presence of lead-lag relations, we perform a cross-correlation analysis of each series against Italy. Table 2 reports the lag at which the cross-correlation peaks, where negative values imply that the series for Italy is leading. Looking at the beginning of the period, we observe that neither the GT indices nor the financial series appear to react to epidemic-related events originating in China. Overall, these findings corroborate the interpretation that public concern has cross-country similarities, with Italy acting as the forerunner.

Table 2: lag at which the cross-correlation with the Italian series peaks (five columns, one per non-Italian country in the panel).

Source          Peak lag
YouTube         -3   -4   -6   -4   -3
Google News     -3   -5   -7   -5   -3
Google Search   -4   -6   -8   -6   -3

In this section we present the results of country-specific regressions, aiming to uncover a possible relation between social media data and the returns on each country's market index. We focus on YouTube data, that is $GT_{i,t} = GT_{Y,i,t}$; we obtained similar results using either Google News or Google Search data. We investigate the exposure of markets to public concern, as proxied by GT-COVID-19 indices, by estimating an AR(1)-X model for each country, in which the log-return $y_{i,t}$ is regressed on its first lag and on the index $GT_{i,t}$.

Robust standard errors are in parentheses; *, ** and *** denote statistical significance at 10%, 5% and 1%, respectively.

Table 3 highlights that the impact of the country-specific GT-COVID-19 index is significant and negative for all countries except France and Great Britain. This implies that markets respond negatively to public concern, as proxied by GT-COVID-19 indices. Following the insights from the lead-lag relations, we check the exposure of markets to public concern in Italy by estimating the model using $GT_{i,t} = GT_{IT,t}$ for all $i$.
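The AR(1)-X regression described above can be sketched in a few lines. This is only an illustration under stated assumptions: the equation is taken to have the standard form y_t = alpha + phi * y_{t-1} + beta * GT_t + e_t with a contemporaneous index (the names alpha and phi are illustrative labels, not symbols taken from the paper), estimation is by plain OLS, and the robust standard errors reported in the tables are not computed. The helper names and the data are hypothetical.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ar1x(y, gt):
    """OLS estimates (alpha, phi, beta) of y_t = alpha + phi*y_{t-1} + beta*gt_t + e_t."""
    rows = [[1.0, y[t - 1], gt[t]] for t in range(1, len(y))]
    target = y[1:]
    k = 3
    # Normal equations (X'X) b = X'y for the three regressors.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, target)) for i in range(k)]
    return tuple(solve_linear(xtx, xty))


# Synthetic, noise-free series built from known coefficients (not paper data).
gt = [0.0, 0.1, 0.5, 1.0, 0.7, 0.3, 0.9, 0.2]
y = [0.0]
for t in range(1, len(gt)):
    y.append(0.01 + 0.2 * y[-1] - 0.05 * gt[t])

alpha, phi, beta = fit_ar1x(y, gt)  # recovers the coefficients used to build y
```

Replacing the country's own index with the Italian one amounts to calling `fit_ar1x` with the same return series and the Italian GT series as the second argument.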
Interestingly, from Table 4, we observe that the Italian GT-COVID-19 index is a key explanatory variable for all country-level stock index returns, and its use remarkably increases the adjusted $R^2$ compared with using country-specific GT indices. In particular, the markets in FR and GB are insensitive to the public concern proxied by their own GT indices, but are negatively related to the Italian GT-COVID-19 index. This highlights that the severity of the outbreak as perceived from Italy represents a timely indicator of the destabilizing effect of the pandemic on financial markets. As a robustness check, we controlled for country-specific market implied volatility and for the growth rate of confirmed COVID-19 cases; our findings remain unchanged.

Robust standard errors are in parentheses; *, ** and *** denote statistical significance at 10%, 5% and 1%, respectively.

To further investigate the relation between the GT-COVID-19 indices and market returns, we estimate a time-varying parameter (TVP) model, in which the impact of the GT-COVID-19 index, $\beta_{i,t}$, is allowed to vary according to a latent AR(1) process with autoregressive coefficient $A_i$, and where the error terms $\epsilon_{i,t}$ and $\eta_{i,t}$ are mutually independent. Estimation of $\beta_{i,t}$ is performed using the Kalman smoother, without imposing stationarity restrictions on the autoregressive coefficients $A_i$.

Figure 2 plots the estimated paths of the impact of the GT index on the corresponding stock index returns. All country-specific coefficients share a common dynamic. The impact of all GT indices drops to negative values on the same days as key events related to COVID-19. In particular, the deepest troughs coincide with the enactment of the Italian lock-downs. Moreover, before the first confirmed case in Europe (23 January) all the coefficients are very close to zero. This signals that the GT indices had no explanatory power prior to the spread of the pandemic to Europe, even though the presence of COVID-19 in China had been known since the end of December 2019.
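To make the time-varying parameter idea concrete, the sketch below runs a scalar Kalman filter followed by a Rauch-Tung-Striebel smoothing pass. It is a simplified illustration under several assumptions, not the authors' implementation: the measurement equation keeps only the time-varying term, y_t = x_t * beta_t + eps_t (any intercept or lagged return is omitted), the state follows beta_t = a * beta_{t-1} + eta_t, and the variances q and r and the coefficient a are treated as known, whereas in practice they would be estimated. All names and numbers are hypothetical.

```python
def kalman_smooth_tvp(y, x, a, q, r, b0=0.0, p0=10.0):
    """Smoothed path of beta_t in y_t = x_t*beta_t + eps_t, beta_t = a*beta_{t-1} + eta_t.

    q and r are the (assumed known) variances of eta_t and eps_t;
    b0 and p0 are the prior mean and variance of the initial state.
    """
    n = len(y)
    b_pred, p_pred = [0.0] * n, [0.0] * n
    b_filt, p_filt = [0.0] * n, [0.0] * n
    b_prev, p_prev = b0, p0
    # Forward pass: Kalman filter.
    for t in range(n):
        b_pred[t] = a * b_prev                    # predicted state
        p_pred[t] = a * a * p_prev + q            # predicted state variance
        innovation = y[t] - x[t] * b_pred[t]      # one-step-ahead forecast error
        s = x[t] * x[t] * p_pred[t] + r           # innovation variance
        gain = p_pred[t] * x[t] / s               # Kalman gain
        b_filt[t] = b_pred[t] + gain * innovation
        p_filt[t] = (1.0 - gain * x[t]) * p_pred[t]
        b_prev, p_prev = b_filt[t], p_filt[t]
    # Backward pass: Rauch-Tung-Striebel smoother for the state mean.
    b_smooth = b_filt[:]
    for t in range(n - 2, -1, -1):
        j = p_filt[t] * a / p_pred[t + 1]
        b_smooth[t] = b_filt[t] + j * (b_smooth[t + 1] - b_pred[t + 1])
    return b_smooth


# Synthetic check: returns generated with a constant impact of -0.5 (not paper data).
x = [0.5, 0.8, 1.0, 0.6, 0.9]
y = [-0.5 * v for v in x]
beta_path = kalman_smooth_tvp(y, x, a=1.0, q=1e-4, r=1e-4)
```

Setting `a=1.0` gives a random-walk coefficient, which is allowed here because no stationarity restriction is imposed on the autoregressive coefficient.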
The main findings of the constant parameter analysis are robust to controlling for implied volatility and the number of confirmed cases (see Appendix B). In addition, we checked robustness to: (i) using a lagged version of the social index, and (ii) including daily and weekly lags of the social index, with and without controlling for the lagged financial return. Moreover, the results of the TVP analysis are robust to: (i) using a lagged version of the social index, (ii) controlling or not for the lagged financial return, and (iii) specifying more lags in the dynamics of the coefficients.

Recent literature has shown that Google Trends data can successfully explain the current and future patterns of the state of the economy, especially during unfavorable events [8, 13, 14]. Unlike recent financial crises, COVID-19 is an exogenous shock to the system that has affected several countries with different timings related to the spread of the disease. We have investigated the exposure of the stock index returns of Italy, France, Germany, Great Britain, the United States and Spain to GT-COVID-19 indices based on search-engine query volumes. Our findings show that most of these indices have significant explanatory power for stock market returns. Interestingly, the Italian GT-COVID-19 index acts as a forerunner and better explains other countries' market returns. Moreover, the greatest impact of the GT indices coincides with the different phases of the lock-down in Italy, despite public awareness of the contagion in China since January. The disruptive effect of COVID-19 on financial markets is well described by the public concern perceived in Italy, which was the first Western country to experience a virulent outbreak and, after China, to adopt drastic measures.

Google Trends provides rescaled values of relative search volumes per unit-of-time. The data-generation procedure can be summarized as follows:

1. Extract a random sample of queries corresponding to the searched geographical area, category and time-span;
2. For every unit-of-time (whose length/duration depends on the time-span) divide the count of the number of searches matching the query term(s)/topic parameter (q), by the total number of searches in the same unit-of-time;

We checked the robustness of our findings by estimating the constant parameter model with additional lags of the GT indices, aiming to capture daily and weekly effects. Moreover, we controlled for (i) the country-specific implied volatility ($IV_{i,t}$), and (ii) the country-specific growth rate of new COVID-19 cases ($\Delta\% CC_{i,t}$). In the latter case, we estimated the model augmented with $\Delta\% CC_{i,t}$ as an additional regressor. Additional robustness checks are available from the authors upon request.

Robust standard errors are in parentheses; *, ** and *** denote statistical significance at 10%, 5% and 1%, respectively.

References

[1] Digital epidemiology reveals global childhood disease seasonality and the effects of immunization
[2] Google searches and stock returns
[3] Google it up! A Google Trends-based uncertainty index for the United States and Australia
[4] Quantifying the semantics of search behavior before stock market moves
[5] In Search of Attention
[6] Google search-based metrics, policy-related uncertainty and macroeconomic conditions
[7] The predictive power of Google searches in forecasting US unemployment
[8] Collective attention and stock prices: Evidence from Google Trends data on Standard and Poor's 100
[9] Can Google Trends search queries contribute to risk diversification?
[10] gtrendsR: Perform and display Google Trends queries. R package version
[11] High correlation of Middle East respiratory syndrome spread with Google search and Twitter trends in Korea
[12] Accurate estimation of influenza epidemics using Google search data via ARGO
[13] Online big data-driven oil consumption forecasting with Google Trends
[14] Revisiting the use of Web search data for stock market movements