key: cord-0970458-1qt7vf59
authors: Chakraborty, Amartya; Bose, Sunanda
title: Around the world in 60 days: an exploratory study of impact of COVID-19 on online global news sentiment
date: 2020-10-21
journal: J Comput Soc Sci
DOI: 10.1007/s42001-020-00088-3
sha: dd77844deff7211974f67ddfa4793950c09f30b7
doc_id: 970458
cord_uid: 1qt7vf59

The world is going through an unprecedented crisis due to the COVID-19 outbreak, and people all over the world are forced to stay indoors for safety. In such a situation, the rise and fall of the number of affected cases or deaths has turned into a constant headline in most news channels. Consequently, there is a lack of positivity in the world-wide news published in different forms of media. Texts based on news articles, movie reviews, tweets, etc. are often analyzed by researchers, and mined for determining opinion or sentiment, using supervised and unsupervised methods. The proposed work takes up the challenge of mining a comprehensive set of online news texts, for determining the prevailing sentiment in the context of the ongoing pandemic, along with a statistical analysis of the relation between the actual effect of COVID-19 and online news sentiment. The amount and observed delay of impact of the ground truth situation on online news is determined on a global scale, as well as at country level. The authors conclude that at a global level, the news sentiment depends substantially on the number of new cases or deaths, while the effect varies for different countries, and is also dependent on regional socio-political factors.

We are in the midst of a global crisis, owing to the outbreak and spread of the COVID-19 virus, and the substantially damaging influence of this viral infection has forced the World Health Organization (WHO) to declare the ongoing situation a pandemic. As per the official statement of WHO, "COVID-19 is the infectious disease caused by the most recently discovered coronavirus. This new virus and disease were unknown before the outbreak began in Wuhan, China, in December 2019. COVID-19 is now a pandemic affecting many countries globally" [1]. As a precautionary or preventive response to this declared pandemic, countries all over the world have introduced restrictions on mobility and transportation, referred to as lockdowns. Consequently, citizens are being asked to stay indoors as a measure of safety from the infection. In this age of a multitude of news channels and popular virtual social frameworks aimed at better connectivity, a massive share of the time spent indoors is undoubtedly invested in engaging with such media. This is corroborated by a recent study [2], which has revealed that there has been about a 57% increase in news consumption via television or smartphones, due to constant indoor presence.

A primary preoccupation of people during this pandemic is the changing statistics of affected or deceased people world-wide, and needless to say, such articles form the crux of the news that the different media channels publish. This virus outbreak has also raised a plethora of other controversial issues, leading to continuing debates and discussions with consequences at both local and global levels. As a whole, it is apparent that only a limited number of news items are delivered on a positive note. The impact of negativity in the news is a long-standing concern, and has been addressed from time to time [3, 4], but the prevailing situation is predicted to leave a long-lasting and damaging impact on mental health and human psychology as a whole [5]. Meanwhile, the day-to-day statistics of deaths or count of affected patients due to the pandemic are expected to influence the news sentiment too. The authors have taken up this challenge of determining the news sentiment during a fixed period of study, as well as analyzing the influence of world-wide and countrywide statistics on the news sentiment during the selected duration.

The organization of the paper is as follows: "Literature review" section gives a brief description of the studied related works and motivations drawn for the current work; the details of each data corpus used in the work are provided in "Data description" section; "Data processing" section lists the techniques used for processing the comprehensive data corpora; the experiments and observations are discussed in "Experiment 1: sentiment analysis" section, "Experiment 2: statistical analysis" section, "Experiment 3: n-gram analysis" section, "Experiment 4: case studies" section; finally, the concluding remarks are offered in "Conclusion" section.

The challenge of opinion mining as an application field of data mining is well addressed, and there have been multiple works in this domain with a variety of solutions based on the increasing availability of growing datasets. A vast majority of these works are dedicated to the challenge of sentiment analysis in text collections of different types. Similar to challenges in other domains, the task of sentiment analysis can be approached either as a supervised classification problem, or as an unsupervised approach for sentiment identification [6].

The number of works that have addressed the problem of sentiment analysis with a supervised approach is greater than the number that have used unsupervised, exploratory techniques. For a supervised sentiment classification problem, the primary requirement is that the text corpus needs to be labeled, i.e., each text string in the whole data set needs to be annotated as belonging to a particular class: positive, negative, or neutral in this case. A study of the state-of-the-art works reveals that for previously annotated texts mostly based on Twitter data, blog posts, web logs, movie reviews, etc., the researchers have used some common machine learning techniques, namely Support Vector Machine, Naive Bayes [6] [7] [8] [9] [10] [11], or even Deep Convolutional Neural Networks [12, 13], etc. It is a general observation that such techniques are more effective in sentiment analysis tasks than the unsupervised approaches. Also, the overall performance of supervised algorithms in challenges of opinion mining is generally lower than that in other domains [10].

On the other hand, the task of analyzing sentiments is more challenging with the use of unsupervised learning techniques. Also, such techniques are often more suited for mining the sentiment from bulky sources of data. Identification of semantic orientation [14], comparative study and low performance of the SentiWordNet lexicon in sentiment analysis [9], development of novel emoji and linguistic content-based lexicons using an unsupervised approach [15, 16], and a sentiment polarity detection system using an unsupervised approach on Turkish movie reviews [17] are all interesting research works that use an unsupervised approach. The application of standard lexicons such as SentiWordNet [18], AFINN [19], etc. in unsupervised sentiment classification is widely studied and evaluated in different works [20] [21] [22]. These lexicon-based techniques are employed in solving interesting problems, such as analyzing the sentiment of the characters in Shakespeare's plays [23], opinion mining from clinical discharge summaries [24], development of bias-aware systems [25], etc. Other popular methods for sentiment identification include k-means [11, 26, 27], Latent Dirichlet Allocation (LDA) [28, 29], etc. In all such cases, it is seen that the inherent simplicity, lack of training, and lower computation requirement involved in unsupervised approaches make them easier to use on, and learn from, data corpora of substantially large size [30].

During a survey of state-of-the-art research using unsupervised lexicon based approach on text data, it is seen that most of these works are based on exploratory sentiment analysis and evaluation of classification techniques, used on different types of data. However, there is a relatively small amount of research that has worked with news data, and almost all such works are based on financial news and stock price prediction [31] [32] [33] [34] [35] , etc. Similarly, there are only a few works regarding the statistical effect of real-world events on the overall sentiment of global news, mostly related to the financial sector [36] [37] [38] , etc.

In this technologically developed era, people are engrossed in the news media, and agenda setting [39] has a crucial role to play in times of a crisis. Researchers have often determined the role played by mass media in determining or setting the agenda in response to a particular incident or event, and this is rapidly propagated among the audience [40]. Obviously, it entails a number of problems as well as lucrative opportunities for the media agencies, as explored in [41]. In a related context, the work by Kirk et al. [42] analyzes the agenda setting and media policies in response to a disaster. While the proposed work does not focus on these issues, the authors wish to highlight the underlying role of media in maintaining global public sentiment and mental health given the ongoing COVID-19-related crisis. The news media need to be responsible as well as alert to ensure the proper propagation of awareness and shaping of public sentiment, particularly involving second-level agenda setting [43, 44].

Given these observations and the ongoing pandemic, the authors were motivated to make the following research contributions:

- The current work determines the general sentiment of news articles during the ongoing pandemic with unsupervised and transfer learning-based approaches,
- This is the only work, to the authors' knowledge, that determines the implications of temporal statistics in a pandemic situation on news sentiment throughout the world during a fixed period of study. The current work statistically determines how, and after what amount of delay, the number of affected patients and the number of deaths due to COVID-19 impact the news sentiment in regional and world-wide news,
- The authors also analyze other relevant factors that contribute to the rise or fall of global news sentiment related to particular countries.

The proposed work uses data regarding the daily news articles published online globally, as well as the statistical details of day-to-day cases and deaths due to COVID-19 throughout the world. Accordingly, two comprehensive data sets have been used in this work, as described below:

- 

The unlabeled news data described in the previous section have been processed in this part of the work. All of the steps discussed below are performed for each day's data, to generate usable corpora for the experiments.

- Data merging: There are 11 files containing news snippets from each day, and these are initially merged to generate a single data repository per day. Thereafter, some steps are followed for processing, as described below.
- Removing numbers: Initially, the news text contained in the merged corpus for each day of the study is processed using regular expression-based operations.
- Removing stop words: A common approach is followed to remove the words that are not useful in the sentiment analysis process, but which make up a significant part of any text. Examples of such words are: and, for, is, the, to, at, in, etc.
- Stemming: As a last step of processing the news articles, stemming is applied to derive the root form of the inflected or derived words in each cleaned string. Such derived words are used to convey different grammatical concepts such as mood, tense, voice, etc. As a simple example, the words working, works, and worked all have the same stemmed form work.

Once all the above steps have been performed, the processed texts for the total duration of the current study in 60 processed files are merged as a single file containing over 6.34 million distinct news articles.
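The processing steps described above can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the stop-word list is a tiny sample, `naive_stem` is a hypothetical suffix stripper standing in for a real stemmer (such as Porter's), and lower-casing is an added assumption that the text does not mention.

```python
import re

# Tiny illustrative stop-word list (assumption); a complete list would be used in practice.
STOP_WORDS = {"and", "for", "is", "the", "to", "at", "in", "a", "of", "on"}

# Suffixes handled by the crude stand-in stemmer (hypothetical helper, not Porter's algorithm).
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word):
    """Strip one common suffix, keeping at least three characters of the word."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def clean_snippet(text):
    """Remove numbers and stop words, then stem the remaining tokens."""
    text = re.sub(r"\d+", " ", text.lower())   # regular expression-based number removal
    tokens = re.findall(r"[a-z]+", text)        # keep alphabetic tokens only
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return " ".join(naive_stem(t) for t in tokens)


def merge_daily_files(files):
    """Merge the snippets of one day's files into a single list of cleaned strings."""
    merged = []
    for snippets in files:
        merged.extend(clean_snippet(s) for s in snippets)
    return merged


day = merge_daily_files([["The 3 doctors worked in 2020"], ["He works and is working"]])
print(day)
```

A production pipeline would replace `naive_stem` with an established stemmer, since simple suffix stripping mishandles many English words.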

The merged news data corpus consisting of comprehensive, cleaned strings from the previous step is unlabeled in nature, i.e., the news articles are not originally assigned any particular sentiment label. For this reason, supervised machine learning and classification-based sentiment analysis is not directly possible on this data set.

For sentiment prediction, the cleaned text articles for each day are now scored using two different approaches, namely the AFINN lexicon [19] in an unsupervised learning approach, and a Naive Bayes [47]-based transfer learning approach in which the classifier has been trained on a popular movie reviews dataset [48]. A lexicon is a comprehensive collection of words, and AFINN is one such widely used lexicon consisting of over 3300 words, where each word has a corresponding sentiment score value. This polarity score lies between −5 and +5, and every string in our cleaned news text is now analyzed by applying the AFINN lexicon, to generate corresponding sentiment scores. As an example, the string It was a good memory is analyzed and scored word by word using AFINN, where the scores are 0, 0, 0, 3, and 0, respectively, to give a total score of +3. Evidently, the stop words have no role to play in such analysis, and thus, they have been removed during text processing in the previous section. The determined scores (using the AFINN lexicon) are now converted to sentiment categories. For this purpose, all texts with a score less than 0 are labeled negative, those with a score equal to 0 are neutral, and all remaining texts are annotated as positive. A notable observation is that such approaches consider only single-word constructs, or unigrams, for sentiment scoring. This is a prime weakness of such an approach, as it fails to capture the inherent essence of different multi-word constructs in English, and fails to recognize emotions and complexities of the language.
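The word-by-word scoring and the score-to-category rule can be illustrated with a short sketch. The lexicon below is a toy stand-in: only the entry for good (score 3) comes from the example in the text, and the remaining entries, as well as the function names, are illustrative assumptions rather than the actual AFINN resource.

```python
# Toy AFINN-style lexicon. Only "good": 3 is taken from the worked example;
# the other entries are illustrative assumptions.
TOY_LEXICON = {"good": 3, "bad": -3, "death": -2}


def lexicon_score(text, lexicon=TOY_LEXICON):
    """Sum the unigram scores; words missing from the lexicon contribute 0."""
    return sum(lexicon.get(word, 0) for word in text.lower().split())


def score_to_label(score):
    """Map a total score to a category: < 0 negative, 0 neutral, > 0 positive."""
    if score < 0:
        return "negative"
    if score == 0:
        return "neutral"
    return "positive"


score = lexicon_score("It was a good memory")
print(score, score_to_label(score))
```

Because each word is looked up independently, a phrase such as not good would still receive a positive score here, which is the unigram weakness noted above.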

In contrast, the trained Naive Bayes classifier uses its knowledge about sentiment polarity from the aforementioned movie reviews corpus, and correspondingly applies it to assign a sentiment category to each news article per day. Unlike AFINN, this supervised classification approach considers the complete text at a time and is more sensitive to emotions, inherent figures of speech and multi-word constructs in the language used. Also, this approach gives a different view of the studied corpus of news texts, and returns the sentiment category for each news article.
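The transfer learning idea (train on labeled reviews, then apply the model to unlabeled news) can be sketched with a small multinomial Naive Bayes classifier. This is only an assumption-laden illustration: the class name `TinyNaiveBayes`, the four training reviews, and the use of Laplace smoothing are all hypothetical, and the authors' actual classifier and features may differ.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of documents
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocab.update(words)
        return self

    def predict(self, text):
        # Words never seen in training are ignored.
        words = [w for w in text.lower().split() if w in self.vocab]
        total_docs = sum(self.doc_counts.values())
        best_label, best_logp = None, -math.inf
        for label in self.doc_counts:
            logp = math.log(self.doc_counts[label] / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for w in words:
                logp += math.log((self.word_counts[label][w] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Hypothetical stand-in for the movie reviews corpus.
reviews = [
    "a wonderful uplifting film",
    "great acting and a good story",
    "a terrible boring plot",
    "awful film with a sad tragic ending",
]
labels = ["positive", "positive", "negative", "negative"]

model = TinyNaiveBayes().fit(reviews, labels)
print(model.predict("tragic and terrible news"))
```

The classifier scores the whole text jointly, but vocabulary learned from movie reviews may transfer imperfectly to pandemic news, which is one reason the two scoring approaches can disagree.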

In this manner, for every piece of cleaned news text, we now have an overall sentiment score (for AFINN) and a sentiment category (for both AFINN and the Naive Bayes classifier), which is either positive, neutral, or negative for that string.

The news data corpus for different days do not consist of the same number of text articles, and also each news article has a different sentiment category predicted by AFINN and the trained Naive Bayes classifier. Therefore, there is a need for normalization, before any comparative study of news sentiment on different days is conducted. For this purpose, a negativity index for each day is calculated and is used as an indicator of the overall negative sentiment in news on that day. The index for the ith day is calculated as:

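$$\mathrm{neg}_i = \frac{\text{Number of articles of negative category}}{\text{Total number of news articles}} \tag{1}$$

Similarly, indices for positive sentiment and neutral type of news articles are determined using the equations:

$$\mathrm{pos}_i = \frac{\text{Number of articles of positive category}}{\text{Total number of news articles}} \tag{2}$$

$$\mathrm{neu}_i = \frac{\text{Number of articles of neutral category}}{\text{Total number of news articles}} \tag{3}$$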
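As a small illustration of this normalization, the daily indices can be computed from the list of per-article categories for one day. The function name `daily_indices` and the label strings are assumptions made for this sketch.

```python
from collections import Counter


def daily_indices(labels):
    """Return the share of negative, positive and neutral articles for one day."""
    total = len(labels)
    if total == 0:
        raise ValueError("no articles for this day")
    counts = Counter(labels)
    return {name: counts[name] / total for name in ("negative", "positive", "neutral")}


indices = daily_indices(["negative", "negative", "positive", "neutral"])
print(indices)
```

Since every article falls into exactly one category, the three indices for a day sum to 1, so days with different article counts become directly comparable.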

These index values are calculated for the comprehensive data on news articles for the duration of study. The overall spread of these sentiment indices, as determined by the analysis using unigram-based AFINN, is shown in Fig. 1, while Fig. 2 illustrates the same as analyzed by the Naive Bayes-based classifier. Notably, with the use of the latter, substantially large negativity (about 75%) and low positivity (about 21%) values are detected, whereas the neutrality decreases by more than 50% and is deemed almost irrelevant to the study at hand. Also, in both cases, it is obvious that any fall in negativity results in an increase in positive sentiment, and vice versa. Therefore, news of neutral sentiment plays a negligible role. A statistical study of the sentiment indices determined in both approaches reveals that negative sentiment has the highest mean, followed by positive sentiment. Also, these two sentiments show almost similar deviation during the studied duration, using both the scoring techniques. Finally, it is evident from both pairs of figures (Figs. 1 and 3, and Figs. 2 and 4) that the overall variation in sentiment patterns is more pronounced in the detection by the AFINN lexicon, in spite of its poor sentiment detection performance; the AFINN-based indices are therefore selected for the experiments in the next section.

Fig. 1 Illustration of the significance of the three sentiments in global news during the period of study, determined using AFINN lexicon. News with neutral sentiment has minimum presence, and positive news sentiment seems to be slowly catching up with the negativity

Fig. 2 Illustration of the significance of the three sentiments on global news during the period of study, determined using Naive Bayes. News with neutral sentiment has minimum presence, and there is a substantial gap between the positivity and negativity in news sentiment

The most commonly occurring words in the news articles with negative sentiment, for the complete duration of study, are illustrated in Fig. 5 . 

This is the next set of experiments where two separate sets of data are utilized, namely:

- the world-wide news-based negativity index values from the previous experiment, determined using the AFINN lexicon-based approach, as the variation of sentiment polarity is found to be greater in that case, and
- the number of new cases and number of deaths per million of the population.

These corpora are analyzed to determine the underlying relation between the variation of news sentiment and the ground reality of cases and deaths due to the COVID-19 pandemic.

To statistically determine the link between the news negativity and number of cases or number of deaths due to the pandemic, it is essential to determine the distribution of each of these variables. Figure 6 shows the respective distributions.

From the figures, it is noticed that all the three variables used in this work follow a near-normal or near-Gaussian [49] distribution. Therefore, it is feasible to directly determine the statistical relation between these variables.

Initially, an attempt has been made to visually determine the relation between the distributions of features from two different data corpora. In Fig. 7, the number of confirmed COVID-19 cases during the span of the study has been represented as bar plots. The negativity index values in global news have been plotted for the same duration as a line plot. It is seen that peaks in news negativity are quite often related to the rise in number of cases, as seen in the variations of both variables for different sets of days. Also, the decreasing step pattern in number of cases during days 14-19 and 21-26 is distinctively reflected in the news negativity plot too.

Fig. 5 This word-cloud highlights the specific words which are present in each day's most negative news articles. The relatively large size of words, such as death, fatality, case, coronavirus, died, infection, and hospitalized are representative of their frequencies of occurrence during the 60-day period of study

Similar to the previous case, Fig. 8 gives the number of daily deaths in bar stacks, while the line plot is the same as the previous figure. It is seen that there is not much similarity in trends between the two data during the first 20 days. In contrast, some similarity in the data patterns is evident in the duration of days 22-32, after which there is no visible similarity.

However, in both the above cases, it is observed that similar patterns in news occur at a delay of a few days. This can be attributed to the fact that day-to-day statistics do not get reported on the same day, and generally take at least a day or two to appear and make an impact on the global news sentiment. This observation leads to the need for determining the optimal time window, at which the trends in the corpora are most similar.

From the previous section, it is observed that the trends in news negativity are more or less affected by the variations in the number of cases and the number of deaths. Also, the impact of the trends in number of cases or deaths is visible at a delay of a few days. Therefore, it is necessary to statistically determine the exact delay at which the news sentiment reflects the reality of the situation.

The statistical measure of similarity in data for two variables can be determined by calculating their correlation coefficient. In this part of the experiment, the authors have experimentally determined the correlation coefficient r_n between the news sentiment and the number of cases or number of deaths, using a set of sliding windows on the news sentiment index values, where each such window is shifted n days ahead of the actual duration of the conducted study, for values of n = (0, 1, 2, 3, 4). That is, to re-create the most visibly aligned variations, a statistical study is done using the same set of values for the number of cases or deaths, along with values of the news negativity index considered during temporally shifted sets of 60 days each. In all cases, the correlation is calculated using the Pearson correlation coefficient [50] between two variables x and y, given by the formula:

$$r = \frac{\sum_{i}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i}(x_i - \bar{x})^2}\,\sqrt{\sum_{i}(y_i - \bar{y})^2}}$$

where $\bar{x}$ and $\bar{y}$ denote the means of x and y, respectively.

This coefficient value for any two variables lies between −1 and +1, where a positive value close to 1 indicates that both variables change simultaneously in the same direction, a negative correlation stands for two variables changing in opposite directions, and zero correlation denotes no linear similarity in the variables. In practice, any correlation value above 0.5 is treated as a moderately strong positive correlation. Using these concepts, along with the previous observations about the delay in impact of actual change in parameters on the news sentiment, the maximum positive correlation value over the considered shifts is used to derive the actual delay. A similar use of correlation is seen in the works by Fu et al. and Zhang et al. [36, 38].
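The sliding-window procedure can be sketched as below. The sketch assumes that the negativity series extends at least `max_shift` days beyond the statistics series, so that every shifted window has the same length; the function names and the toy data are illustrative, not taken from the study.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def lagged_correlations(stats, negativity, max_shift=4):
    """Correlate stats with the negativity window shifted n days ahead, n = 0..max_shift.

    stats has length T; negativity is assumed to have length >= T + max_shift.
    """
    window = len(stats)
    return {n: pearson(stats, negativity[n:n + window]) for n in range(max_shift + 1)}


def best_shift(stats, negativity, max_shift=4):
    """Return the shift with the maximum positive correlation."""
    r = lagged_correlations(stats, negativity, max_shift)
    return max(r, key=r.get)


# Toy data: the negativity series follows the statistics with a 2-day delay.
stats = [1, 3, 2, 5, 4, 6]
negativity = [0, 0] + [0.1 * v for v in stats] + [0, 0]
print(best_shift(stats, negativity))
```

Choosing the shift with the largest coefficient mirrors the selection of the optimal time window described above; with real data, several shifts may yield similar values, so the coefficients themselves should be inspected as well.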

From Table 1 , it is obvious that in general, there exists more correlation between the daily negative sentiment in news and number of COVID-19-related deaths, considering data world-wide, and that the positive correlation is maximum between these variables when the news negativity indices are considered using a 2-day shifted sliding window, i.e., it takes 2 days for the trends in the number of deaths, to have impact on the global news sentiment. Similarly, this shift is confirmed for the global number of cases at a delay of 3 days. This experiment validates the observation about a delay in the impact of number of confirmed patients and number of deaths, on the news sentiment, and also determines the delay in said impact on a global scale. 

In the final part of this experiment, the correlation values and optimal time-windows determined in the previous section are used for plotting time-shifted news sentiment curves along with the daily number of cases and number of deaths. Accordingly, the news sentiment about daily number of cases is considered at a shift of 3 days, while that concerned with daily death count is plotted at a shift of 2 days to get the ideally aligned plots. These are shown in Figs. 9 and 10, respectively. It can be seen from Fig. 9 that there are almost perfect matches in pattern in the duration of days 1, 12-19, 20-27, and 31 onwards, though due to differences in scale, the variations are not equally spaced. The visible resemblance in variations is also noted in Fig. 10 , especially in days 14-19, 22-26, the abrupt spikes in 30-31, 36-37. However, it is a general observation that the negativity in news prevails even when the global statistics in both cases and deaths are declining which can be attributed to other factors as determined in succeeding experiments. Therefore, it can be said that the negativity index, considering global news, is quite indicative of the changes in the number of new cases and deaths during the ongoing pandemic, while the declining statistics do not seem to have much effect on the overall negativity. 

An n-gram can be defined as a contiguous sequence of n words from a given sentence or text. In this part of the experiments, the authors have determined the 60 most common tri-grams that occur in the news during the period of study. This analysis highlights the events, topics, or persons that have been most widely publicized by the online global news in relation to the pandemic scenario. The tri-grams have been listed along with their corresponding weighted frequency (calculated using the tri-gram frequency and the total occurrence of the 60 most common tri-grams), as shown in Table 2.
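A rough sketch of the tri-gram extraction follows. The text does not spell out the weighting formula, so the sketch assumes that the weighted frequency of a tri-gram is its count divided by the total count of the k most common tri-grams; the function name `top_trigrams` is likewise hypothetical.

```python
from collections import Counter


def top_trigrams(texts, k=60):
    """Return the k most common tri-grams with their assumed weighted frequencies."""
    counts = Counter()
    for text in texts:
        words = text.split()
        # zip over three staggered views yields every run of three consecutive words
        counts.update(zip(words, words[1:], words[2:]))
    top = counts.most_common(k)
    total = sum(freq for _, freq in top)
    return [(" ".join(gram), freq / total) for gram, freq in top]


sample = ["tested positive coronavirus today", "tested positive coronavirus"]
result = top_trigrams(sample, k=2)
print(result)
```

Applied to the cleaned corpus (after stop-word removal), this kind of counting would surface phrases such as tested positive coronavirus, because function words no longer separate the content words.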

It is obvious from the table that most of the tri-grams concern the pandemic, with massive usage of phrases such as tested positive coronavirus, tested positive COVID, confirmed case COVID, etc. in the global news. The news agenda during the studied period of time revolves around this central theme, and involves daily COVID-19-related updates and awareness programs being broadcast, as deduced from the usage of phrases like personal protective equipment, confirmed case COVID, people tested positive, number COVID case/number coronavirus case, social distancing guideline, practice social distancing, etc. The crucial and commendable role played by the World Health Organization, the Centers for Disease Control and Prevention (CDC), Johns Hopkins University, and health care workers all over the globe in shaping the different challenges and aspects of this pandemic is also prominently noted from the table. A remarkable observation is that only three state leaders have made it to this list, namely the President of the United States of America (whose name is incidentally in the third most common tri-gram), and the Prime Ministers of the United Kingdom and India, which emphasizes the prominence they enjoy as world leaders in global news, even in these times of distress.

In this last part of the experiments, the observations about the delayed impact of globally changing count of affected patients and deaths on the news sentiment as seen in the previous section have been used to identify similar trends for some specific countries using the respective correlation values. The study is conducted for four countries ordered chronologically, based on when the first virus outbreak occurred in that area, and all articles mentioning country X have been extracted from online global news to perform the corresponding case study on country X. For this purpose, the authors have extracted all news articles corresponding to the countries in question, from the comprehensive global news corpus, for the whole period of the study. Also, in this experiment, z-score [51] technique has been used on both the variables, to normalize the values prior to visualization. The z-score is used to bring values of different variables on the same scale, and is calculated as:

$$z_i = \frac{x_i - \mu}{\sigma}$$

where $x_i$ denotes the current data element, $\mu$ denotes the mean of the variable, and $\sigma$ is its standard deviation. Using this method, the data for each variable are converted to have a mean of 0, so in the following graphical representations, all values below the mean will denote a decreasing trend and vice versa. A visual analysis of these images reveals how the observations are generally applicable throughout the data from different countries; that is, whether the global news sentiment about a country is actually affected by the daily trends in number of new cases or deaths. This is determined by the individual correlation of countrywise statistics with appropriately time-shifted global online news about that country.
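The z-score normalization can be illustrated in a few lines. The sketch assumes the population standard deviation (dividing by the number of values); the text does not state whether the population or sample form was used.

```python
import math


def z_scores(values):
    """Standardize values to mean 0 and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


daily_cases = [10.0, 20.0, 30.0]
print(z_scores(daily_cases))
```

After this transformation, a case-count series and a negativity-index series share a common scale, so they can be drawn on the same axes even though their raw magnitudes differ greatly.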

The scatter plots are generated for the four countries in question. In every set of two plots for each country, perfect or partial overlaps signify only discrete, temporal alignment of the variables, and cannot be treated as a measure of continued similarity in trend, which can be better determined from a set of parallelly distributed data values.

The current virus outbreak is believed to have originated in China much earlier, in the month of December 2019, and so the current duration of study has witnessed a sharply flattening curve in the number of cases, and successful, complete prevention of deaths. Among the 6.34 million news texts, only those that feature 'China' have been extracted along with the corresponding sentiment index values per day. The correlation coefficients determined by the sliding window approach are quite low and insignificant from a statistical point of view, as calculated and shown in Table 3. However, in the current context, such values are indicative of loosely positive similarity in trends. Remarkably, there seems to be an immediate impact of the number of daily deaths per million in China on the global news, whereas the number of cases per million takes quite some time. Along with this, the highly minimized and flattened death or infection rate is evident from Figs. 11 and 12.

Also, it is seen that in spite of the flattened curves for cases and deaths, the negativity index values are distinctly high, and show a decreasing trend only after the 40th day of our study. The corresponding correlation coefficient values indicate more parallelly aligned set of points as seen in the first figure for the number of cases, while the points are more dispersed around the flattened death curve in the second figure, in spite of multiple overlaps, shown as deep red.

Observations: The observed negativity, though generally aligned, could be due to various other issues, as evident from the global news related to China (shown in Table 4). For instance, the rise in negativity during days 14-15 of the study relates to news articles 1 and 3, while articles 3, 7, and 8 attest to the decline in negativity that follows. On a similar note, the high negativity around days 34-35 of the study can be attributed to articles 4 to 6, while the succeeding positivity is reinforced by articles like 9-11. Therefore, it is evident that the global news agenda related to China is mostly geared toward driving an overall negative image of the country and its actions during the ongoing pandemic.

The outbreak spread to the USA in late January, and a substantial part of the pandemic's effect on news is observable in this case. Similar to the previous case, the news texts that feature the USA are considered. Table 5 shows that the number of confirmed cases has more impact on negative sentiment in the news based on the USA, at a delay of 2 days, and a lower impact of the number of deaths at an overall delay of 4 days. The overall correlation is weakly positive for both pairs of variables. The spread of both the number of cases and deaths, in the case of the USA, resembles a bell curve for the current duration of study, with gradually increasing values up to day 30, and an opposite trend thereafter. Figures 13 and 14 show that towards the latter half of the studied duration, the overall number of confirmed cases and deaths follows a decreasing trend (more data points below the mean), whereas negative sentiment thrives and even increases.

Observations: Apart from the effect of COVID-19-related statistics, different media reports citing the anti-China sentiment of the President of the country and governmental decisions appear to have influenced the news sentiment as well. A set of such news articles has been provided in Table 6, while the prominence of the US President in global news is already established in Table 2. The high amount of negativity during the initial 10 days of the study may be an effect of articles 1-5, while the decreasing negativity since day 10 may be due to the event that articles 6 and 7 correspond to. Similarly, the positive sentiment at about day 50 is aligned with article 8, whereas the succeeding rapid rise in negativity (in spite of a drop in COVID-19 cases and deaths) could be attributed to events highlighted by articles 8-13. Similar to the observations regarding China, the agenda of global online news is driven more by different socio-political activities concerning the country.

Table 6 (excerpt): news articles related to the USA

More than 800,000 physicians across the country signed a letter urging President Donald Trump to keep social distancing practices in place after he said he wants to reopen businesses by Easter. "Significant COVID-19 transmission continues across the United States, and we need your leadership in supporting science-based recommendations on social distancing that can slow the virus," the letter, released by the Council of Medical Specialty Societies, said. "Our societies have closely adhered to these measures by moving our staff to fulltime telework and canceling in-person meetings (including annual meetings). These actions have helped to keep physicians and other health professionals in health care facilities, including hospitals, and reduce the risk of spreading COVID-19"

Health care workers say that they are being asked to reuse and ration disposable masks and gloves. A shortage of ventilators, crucial for treating serious COVID-19 cases, has also become critical, as has a lack of test kits to comply with the World Health Organization's exhortations to test as many people as possible. In the United States, a fierce political battle over ventilators has emerged, especially after President Donald Trump told state governors that they should find their own medical equipment if they think they can get it faster than the U.S. government

6. President Donald Trump signed into law the unprecedented $2 trillion economic stimulus package Friday, capping a week that saw markets yo-yo as recession concerns grew worldwide. Now that the package has been signed into action, attention turns to how quickly the U.S. Treasury and other departments can distribute checks to individual Americans and businesses grappling with the ongoing effects of COVID-19. It could prove to be a Herculean effort to flood the money into the economy quick enough to prevent more job losses and businesses going under

7. President Donald Trump signed an unprecedented $2.2 trillion economic rescue package into law after swift and near-unanimous action by Congress to support businesses, rush resources to overburdened health care providers, and help struggling families during the deepening coronavirus epidemic. Acting with unity and resolve unseen since the 9/11 attacks, Washington moved urgently to stem an economic free fall caused by widespread restrictions meant to slow the spread of the virus that have shuttered schools, closed businesses and brought American life in many places to a virtual standstill. "This will deliver urgently needed relief," Trump said as he signed the bill Friday in the Oval Office, flanked only by Republican lawmakers

President Donald Trump attacked the United Nations health body as a Chinese "puppet" on Monday and confirmed he is considering slashing or cancelling US support. "They're a puppet of China, they're China-centric to put it nicer," he said at the White House. Trump said the United States pays around $450 million annually to the World Health Organization, the largest contribution of any country. Plans are being crafted to slash this because "we're not treated right. They gave us a lot of bad advice," he said of the WHO

Italy is one of the countries worst affected by the COVID-19 outbreak. During our period of study, both the death count and the number of confirmed cases are seen to be gradually declining.
The global news articles which feature 'Italy' have been extracted, along with the corresponding sentiment category of each article, for this experiment. As in the previous experiments, a correlation study has been undertaken to assess the impact of death and infection statistics on news sentiment. This helps to determine the degree to which the news sentiment reflects the ground reality, by considering the series shifted one day at a time, up to 5 days. The results of the study for Italy are shown in Table 7. The maximum impact of the COVID-19 situation in Italy on global news is seen at a delay of 5 days, though the correlation remains high throughout. Accordingly, the aligned scatter plots are generated using the z-scored, normalized values, as shown in Figs. 15 and 16. As is evident from the table and figures, the correlation between deaths in Italy and the negativity index in global news is higher than that for the number of infected cases, although both variables show a comparatively strong correlation with the negative news sentiment. This can also be observed in the higher number of complete and partial overlaps, as well as in the gradually decreasing dispersion of the negativity in proportion to the values of confirmed cases or deaths, in Figs. 15 and 16.
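The shifted correlation described above can be sketched as follows. This is a minimal illustration and not the authors' code: the function names, the toy data, and the choice to pair the statistic on day t with the negativity index on day t + lag are assumptions made for the example, and Pearson's coefficient is assumed as the correlation measure.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(vx * vy)


def lagged_correlations(stats, negativity, max_lag=5):
    """Correlate stats on day t with negativity on day t + lag, lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        x = stats[: len(stats) - lag]   # drop the last `lag` days of the statistic
        y = negativity[lag:]            # drop the first `lag` days of the sentiment
        result[lag] = pearson(x, y)
    return result


# Hypothetical toy data: negativity echoes the case counts two days later.
cases = [1, 3, 2, 5, 4, 6, 8, 7, 9, 12]
negativity = [0, 0] + [c / 10 for c in cases[:-2]]

corr = lagged_correlations(cases, negativity)
best_lag = max(corr, key=corr.get)  # delay with the strongest positive correlation
```

On this toy series the strongest correlation falls at a delay of 2 days, because the negativity values were constructed as a two-day echo of the case counts. Real daily series are noisier, and the coefficients at different delays would typically be much closer to each other.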

Observations: Given the strong correlation, it can be inferred that, among the countries studied, COVID-19 statistics have the strongest effect on global news sentiment in the case of Italy. Nevertheless, a small set of relevant news articles is listed in Table 8.

Though the first confirmed COVID-19 case in India was noted at almost the same time as in Italy, the growing effect of the outbreak is quite clear in our studied time period. Both the number of affected cases and the number of deaths increase steadily during the period considered. The correlation coefficients determined by shifting the negativity index are shown in Table 9. Surprisingly, the correlations are all negative, indicating that the rising deaths and spread of COVID-19 in India have only a very weak effect on global news sentiment about India.

Given that the study intends to determine the similarity in trends of news sentiment and death or infection statistics, the least negative correlation coefficients are selected for visualizing the trends; these occur at a delay of 4 days in each case. Notably, these least negative values are statistically close to zero, indicating almost no correlation. The same is depicted in the scatter plots in Figs. 17 and 18, where the negativity index values are highly dispersed and even show a decreasing trend in the later half of the study, in spite of the steep climb of the actual statistics. As noted in the "Experiment 1: sentiment analysis" section, neutral news has a minimal role in the global scenario, and this role should be smaller still at the country level. A possible inference is that the negative sentiment in global news about 'India' is kept low so as to prevent panic among the huge population, or that the global news does not reflect only the COVID-19 statistics in the Indian context.
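The z-scored, aligned pairs behind such scatter plots can be sketched as below. This is an illustration under stated assumptions, not the authors' procedure: the function names and the toy numbers are hypothetical, the population standard deviation is assumed, and the series are assumed to be standardized after trimming to the chosen delay.

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("z-score is undefined for a constant series")
    return [(v - mu) / sigma for v in values]


def align_at_lag(stats, negativity, lag):
    """Pair z-scored stats on day t with z-scored negativity on day t + lag."""
    if len(stats) != len(negativity):
        raise ValueError("series must cover the same days")
    if not 0 <= lag < len(stats) - 1:
        raise ValueError("lag must leave at least two paired days")
    x = stats[: len(stats) - lag]   # statistic, trimmed at the end
    y = negativity[lag:]            # sentiment, trimmed at the start
    return list(zip(zscore(x), zscore(y)))


# Hypothetical toy data: deaths climb steadily while negativity drifts downwards.
deaths = [2, 4, 6, 8, 10, 12]
negativity = [0.5, 0.4, 0.6, 0.3, 0.5, 0.2]

pairs = align_at_lag(deaths, negativity, lag=4)  # (z_deaths, z_negativity) points
```

Each tuple in `pairs` is one point of an aligned scatter plot. Because both axes are in standard-deviation units, points below zero on either axis lie below that series' mean, which makes series with very different magnitudes directly comparable.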

Observations: The lack of a clear correlation suggests that the news agenda was influenced by many factors other than COVID-19 during our period of study. Table 10 highlights some of the problems that were initially a cause of the massive negativity in news sentiment, in spite of the low rate of COVID-19 infection. These cover several socio-economic aspects of Indian life during this crisis, and the analysis and discussion of such observations could in itself form a full-fledged study of the agenda-setting policies of online news media.

Table 10 (excerpt): news articles related to India

Mumbai Police Friday arrested three men for allegedly storing 5000 bottles of hand sanitiser, worth an estimated Rs 2.5 lakh, at a flat in Mahim and illegally selling them above their maximum retail prices. The crime branch raided the flat after it received information that 100 ml bottles of hand sanitiser were being sold for Rs 65, which was Rs 15 more than the MRP

2. The National Commission of Women NCW has received over 250 complaints, since the country-wide lockdown was imposed to control the spread of coronavirus out of which 69 were cases of domestic violence which it said has been increasing since then. Since the lockdown was imposed, a total of 257 complaints related to various offences against women were received out of which 69 complaints are related to domestic violence the data released by the NCW showed. NCW chairperson Rekha Sharma said the number of cases of domestic violence must be much higher, but the women are scared to complain due to constant presence of their abuser at home

Kumar said, his union is in touch with nearly 1000 families who need the rations urgently, having lost incomes for over a week now. For lakhs of migrant workers in Maharashtra, lack of clear information has continued to cause anxiety, especially after the Centre and state governments issued instructions Sunday to prevent them from attempting to return to their native places. Their biggest concern being accessible accommodation and food for the remainder of the 21-day lockdown period. "What's going to happen will be reminiscent of the Bengal famine"

5. "While no definite conclusion can be drawn, this is probably due to the circumspection on the part of victims in reporting such incidents due to the presence of the perpetrators in the house and the fear of further violence if such attempt to report were made known to the perpetrator", the commission had said. It had also said that the cases of molestation, sexual assault, rape, kidnapping, and stalking have decreased manifold presumably, since a large number of these incidents take place outside the domestic setting and by third parties. AICHLS in its plea has contended that incidents of domestic violence and child abuse have gripped not only India, but countries such as Australia UK and USA, and the reports suggest that countries are witnessing a horrific surge in domestic violence cases

6. The video showed around 40 migrant workers sitting on the roadside in full clothes, including women, while water jets were showered on them through fire tenders by men in white protective kit. In the video, one of the officials is heard asking the migrants to keep their eyes shut

7. After the lockdown announcement, the badli workers in West Bengal's jute mills are the worst affected out of the lot

8. "We may survive from corona but not hunger": Bengal's daily wage workers struggle for survival. In India, thousands of workers are lining up twice a day for bread and fried vegetables to keep hunger at bay

9. Job loss pay cuts worry Indians the most during lockdown: Survey. Every 1 in 5 Indians is now worried about losing his or her job as the coronavirus pandemic has shut industries and businesses in India, a new survey warned on Wednesday. According to the survey conducted by YouGov, an Internet-based market research and data analytics firm, some Indians worry about the economic impact of the virus such as losing their jobs (20%), getting a pay cut (16%), or not getting a bonus or increment this year (8%)

The proposed work addresses the challenge of identifying the general sentiment in globally published news articles as an effect of the ongoing pandemic, using both unsupervised and transfer learning-based approaches, on comprehensive data gathered over a fixed period of time. A statistical study is also undertaken to determine the impact of variations in the number of affected patients and deaths due to the COVID-19 virus on news sentiment at a global scale. The same study is repeated for selected countries, using the global news pertaining to the effect of COVID-19 in those countries and considering normalized values of all variables. The observations are substantiated by n-gram analysis that highlights the most prominent tri-grams, or three-word phrases, used in online news globally. The strongest correlation between news sentiment and COVID-19 statistics exists for Italy, which is almost similar to the observation considering news and statistics on a global scale. The authors have also utilized a set of relevant news articles to substantiate the observations during the case studies. The authors have determined that negativity is the predominant sentiment in global news, and that COVID-19-related real-world statistics, agenda setting by news agencies, and various social factors (such as job loss and migrant worker problems) and political factors (such as the continued tussle between the Presidents of the USA and China) drive the negativity in online news quite strongly, which could lead to long-standing effects on the mental health of the news audience. The results lead to relevant questions and, consequently, to a plethora of computational and social study-based research challenges. Such studies will be useful in determining the long-standing psychological effects of news sentiment on mental health in a pandemic situation, the representation of regional challenges in online global news, news media agenda setting, etc.
In future, the authors wish to extend this work by utilizing country-specific news data in their respective national official languages, which will aid in further fine-grained analysis.

Publisher's Note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

On behalf of all authors, the corresponding author states that there is no conflict of interest.