key: cord-1049735-tye5zf8p
authors: Rahmanti, Annisa Ristya; Chien, Chia-Hui; Nursetyo, Aldilas Achmad; Husnayain, Atina; Wiratama, Bayu Satria; Fuad, Anis; Yang, Hsuan-Chia; Li, Yu-Chuan Jack
title: Social media sentiment analysis to monitor the performance of vaccination coverage during the early phase of the national COVID-19 vaccine rollout
date: 2022-04-27
journal: Comput Methods Programs Biomed
DOI: 10.1016/j.cmpb.2022.106838
sha: b7c6f1c8645f273c1ba2d68f80f07676f7b15d45
doc_id: 1049735
cord_uid: tye5zf8p

BACKGROUND AND OBJECTIVE: Social media sentiment analysis based on Twitter data can facilitate real-time monitoring of COVID-19 vaccine-related concerns. Governments can thus adopt proactive measures to address misinformation and inappropriate behaviors surrounding the COVID-19 vaccine that threaten the success of the national vaccination campaign. This study aims to identify the correlation between COVID-19 vaccine sentiments expressed on Twitter and COVID-19 vaccination coverage, case increase rate, and case fatality rate in Indonesia. METHODS: We retrieved COVID-19 vaccine-related tweets posted by Indonesian Twitter users from October 15, 2020, to April 12, 2021, using the Drone Emprit Academic (DEA) platform. We collected the daily trend of COVID-19 vaccination coverage from the Ministry of Health (MoH) official website and the case increase and case fatality rates from the KawalCOVID19 database. We identified the public sentiments, emotions, word usage, and trend of all filtered tweets 90 days before and after the national vaccination rollout in Indonesia. RESULTS: Across a total of 555,892 COVID-19 vaccine-related tweets, negative sentiments outnumbered positive sentiments on 59 days (65.50%) of the first 90 days of the study period, with anticipation as the predominant emotion.
However, after the vaccination rollout, positive sentiments outnumbered negative sentiments on 56 days (62.20%), with growth in the emotion of trust, which is consistent with the positive appeal of recent news about COVID-19 vaccine safety and the government's proactive risk communication. In addition, there was a statistically significant trend in vaccination sentiment scores, which correlated strongly with the increase in vaccination coverage (r = 0.71, P<.0001 for both first and second doses) and with the decreases in the case increase rate (r = -0.70, P<.0001) and the case fatality rate (r = -0.74, P<.0001). CONCLUSIONS: Our results highlight the utility of social media sentiment analysis for informing government communication strategies that build public trust, which affects individual willingness to get vaccinated. This finding will be useful for countries seeking to identify and develop strategies to speed up the vaccination rate by monitoring netizens' dynamic reactions and expressions on social media, especially Twitter, using sentiment analysis.

In response to the prolonged COVID-19 pandemic, the World Health Organization (WHO) and its partners have had to speed up the development and deployment of vaccines to protect against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1] [2] [3]. Currently, eight COVID- Pfizer/BioNTech. [4] Various other vaccines are also being administered in many countries under their national regulatory authorities [1]. On January 13, 2021, Indonesia began its COVID-19 vaccination program, with the first phase targeting health workers and frontline workers [5]. The government has set a target of vaccinating 181.5 million people (70% of the total population) within 15 months to create herd immunity [5]. However, despite Indonesia having the second-highest number of COVID-19 cases and deaths in Asia, only 10.4 million people (3.7%) were fully vaccinated against COVID-19 [6].
Moreover, at the current pace, herd immunity may not be reached soon in Indonesia, which provides opportunities for the broader circulation of emerging variants [1]. The Indonesian government is working to increase the number of daily vaccination shots to reach the target. Beyond dealing with global vaccine suppliers and developing the national vaccine industry to meet the required stocks, all countries have been advised by WHO's Strategic Advisory Group of Experts (SAGE) to develop a strategy to increase vaccine uptake and address vaccine hesitancy [3]. Moreover, there has been a surge of COVID-19 vaccine misinformation and conspiracy theories about the efficacy and safety of the vaccines, which potentially reduces individual willingness to get vaccinated [7]. Therefore, it is critical to monitor the spread of misleading information about the COVID-19 vaccine and to mitigate its impact in order to reinforce public confidence and address vaccine hesitancy successfully. Social media has frequently been used to develop large-scale, real-time tracking tools for disease outbreaks and government epidemic control measures during the global pandemic [8, 9]. Social media has also become a crucial communication platform that disseminates both valid and misleading information faster than traditional news reporting [10, 11]. It can also be used effectively to understand public responses, which can potentially help nations influence public behavioral responses as part of mitigation strategies to combat the pandemic [12] [13] [14]. Among social media platforms, Twitter was the leading platform used by people to express their perceptions of, or behaviors toward, health information [9, [15] [16] [17]. Twitter studies are also a form of supply-based "infodemiological" study, which offers the opportunity to analyze people's needs through their health information-seeking behavior, including vaccine uptake [18].
A few studies using Twitter have been conducted to assess vaccine uptake for influenza, A(H1N1), and human papillomavirus (HPV) [19] [20] [21]. With 22.8 million active Twitter users in Indonesia [22], the platform enables timely monitoring of public reactions to the government's risk messages and of immediate actions to clarify misinformation [23]. Moreover, a recent survey of Indonesian Twitter users showed that 77.9% of adults in Indonesia (18-44 years old) used Twitter [24], which is notable because the population targeted by the government's COVID-19 vaccination intervention was the working-age population. We therefore used these advantages to analyze public sentiments on Twitter in response to the COVID-19 vaccine rollout, focusing on Indonesia. We hypothesized that public sentiment would correlate with national vaccination coverage, contributing to the decline of the COVID-19 case increase rate and case fatality rate.

The methodological framework for data collection and analysis is summarized in Figure 1. We divided our data collection into two steps. First, we collected Twitter conversations of Indonesian Twitter users using a Twitter monitoring and analytics platform called Drone Emprit Academic (DEA) [25]. DEA is a big data technology based on artificial intelligence, specifically machine learning and natural language processing, that is used for social media monitoring and analytics [26]. The search criteria included all tweets related to COVID-19, filtered with the selected keyword #vaksin (vaccine). All tweets were retrieved over 180 days, from October 15, 2020, to April 12, 2021, to observe the temporal dynamics of public reactions before and after the vaccination rollout in Indonesia [27]. Second, we obtained the data on COVID-19 vaccination coverage from the MoH official website [28].
Meanwhile, the COVID-19 case increase rate and case fatality rate were retrieved from the KawalCOVID19 crowdsourced database. The case increase rate was determined as the ratio of new cases to the total number of cases. The case fatality rate was calculated as the ratio of those who died from COVID-19 to the number of COVID-19 patients [29]. We obtained only 90 days of reporting data from the MoH official website and the KawalCOVID19 database, from January 13 to April 12, 2021, which corresponded to the early phase of the national COVID-19 vaccination rollout. We processed all filtered tweets by removing irrelevant attributes, including slang words/sarcasm, short terminology, and stop words. We analyzed the frequency, sentiments, and trends of all processed tweets (including mentions, retweets, replies, and favourites) using DEA Twitter analytics features similar to those used in the previous study [14]. In addition, we calculated the number of interactions of certain tweets (interaction rate) to understand public concern about, or engagement with, the COVID-19 vaccine in the selected period. The interaction rate was calculated as the total number of replies, retweets, and favourites a tweet receives divided by the total number of mentions. The interaction rate is high if the total number of mentions is lower than the total number of replies, retweets, and favourites [26]. The detailed interaction rate formula is as follows:

$$\text{Interaction rate} = \frac{\text{replies} + \text{retweets} + \text{favourites}}{\text{mentions}}$$

DEA sentiment analysis used Naïve Bayes (Adaptive Multiple Model) methods to classify a word as positive, negative, or neutral with an accuracy of 90.26% [30]. We calculated the vaccination sentiment score as the relative difference between the numbers of positive-sentiment and negative-sentiment tweets [21]. We extracted all hashtags from the tweet corpus database to create the hashtag list. A word cloud (also known as a tag cloud) was formed to visualize the high-frequency words for each sentiment category.
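The daily metrics described above can be sketched in a few lines of Python. This is a minimal illustration rather than the DEA implementation: the function names and input values are hypothetical, and the denominator of the sentiment score (positive plus negative tweets) is an assumption, since the text only describes the score as a "relative difference" in the sense of [21].

```python
def interaction_rate(replies, retweets, favourites, mentions):
    """Replies + retweets + favourites, divided by mentions (as described in the text)."""
    if mentions <= 0:
        raise ValueError("mentions must be positive")
    return (replies + retweets + favourites) / mentions


def sentiment_score(n_positive, n_negative):
    """Relative difference between positive and negative tweet counts.

    ASSUMPTION: normalised by the sum of positive and negative tweets,
    so the score lies in [-1, 1]. The source does not spell this out.
    """
    total = n_positive + n_negative
    if total == 0:
        return 0.0  # no polarised tweets that day; treated as neutral here
    return (n_positive - n_negative) / total


def case_increase_rate(new_cases, total_cases):
    """Ratio of new cases to the total number of cases."""
    return new_cases / total_cases


def case_fatality_rate(deaths, patients):
    """Ratio of COVID-19 deaths to the number of COVID-19 patients."""
    return deaths / patients


# Hypothetical daily counts, for illustration only.
print(interaction_rate(replies=10, retweets=20, favourites=30, mentions=40))
print(sentiment_score(n_positive=75, n_negative=25))
```

Under this reading, an interaction rate above 1 corresponds to the "high" case described in the text, and a positive sentiment score means positive tweets outnumbered negative ones on that day.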
We also performed emotion analysis based on Plutchik's Wheel of Emotions (joy, fear, anticipation, anger, disgust, sadness, surprise, and trust) to identify the predominant emotion of the tweets [26]. We performed trend analysis to observe the patterned relationship between the exposure variable (COVID-19 vaccine sentiment scores) and the outcome variables (vaccine coverage, case increase rate, and case fatality rate). We also assessed whether the outcome variables were normally distributed. We evaluated the trend of each pattern using the Jonckheere-Terpstra test [31], a rank-based nonparametric test that is suitable for non-normally distributed data. We then applied Spearman's rank correlation test to measure the correlation between sentiment scores and each outcome variable. All statistical tests were conducted in SAS software, Version 9.4 (SAS Institute Inc., Cary, NC, USA) [32] using a two-tailed test with a significance level α = 0.05. All data visualizations were produced using Tableau Public version 2021.1 [33].

We identified a total of 555,892 COVID-19 vaccine-related tweets over the 180-day study period. Most of the tweets, i.e., 374,180 (49%), were classified as neutral, while 140,798 (25%) tweets expressed negative sentiments and 140,624 (25%) expressed positive sentiments toward the COVID-19 vaccine. Figure 2 shows the daily distribution of tweets by positive and negative sentiment throughout the study period, with the key major events annotated in the figure. The sentiment distribution can be divided into two critical stages. During the first stage of the study, we observed a fluctuating trend in each sentiment, dominated mainly by negative sentiments. The negative sentiments outnumbered positive sentiments for 59 days (65.50%). However Figure 2B ). [5] The negative sentiments decreased immediately after the government decided to provide the COVID-19 vaccine free of charge (Figure 2C).
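As an aside on the statistical analysis described earlier, Spearman's rank correlation can be sketched as follows. The analysis in this study was run in SAS; the pure-Python version below is only an illustrative sketch with hypothetical function names and made-up data, and it omits the P-value computation. It ranks both series (tied values share the mean of their positions) and then computes the Pearson correlation of the ranks.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one series is constant")
    return cov / (var_x * var_y) ** 0.5


# Made-up daily values, for illustration only (not the study data).
sentiment = [-0.20, -0.05, 0.10, 0.15, 0.30]
coverage = [0.1, 0.4, 0.9, 1.5, 2.2]
print(spearman_rho(sentiment, coverage))
```

Because the statistic depends only on ranks, it captures any monotonic relationship and does not require normally distributed variables, which matches the rationale given for the rank-based tests.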
The positive sentiments gradually declined but increased immediately after the National Agency of Drug and Food Control (NADFC), known as BPOM, announced that Sinovac was safe, with an efficacy rate of 65.3% (Figure 2D and 2E) [27]. However, in January 2021, the negative sentiments escalated dramatically through January 13, when President Jokowi received the first vaccine shot (Figure 2F). This event became a massive topic of discussion on Twitter because the government decided to provide vaccine shots to celebrities and social media influencers ahead of healthcare workers. Yet, just a few hours after receiving the vaccine shot, a celebrity was spotted violating the health protocol, which triggered widespread criticism of the government's vaccine campaign [27]. The negative sentiments reached their peak with 6813 tweets (Figure 2F). However, we then observed a steady decline in negative sentiments until mid-February. The positive sentiments peaked with 7718 tweets (Figure 2I) after the second stage of the vaccination program began at the end of February. The NADFC also reported the safe use of AstraZeneca COVID-19 vaccines (Figure 2K). However, the negative sentiments also showed a slight increase amid public concern over the rising number of cases of the UK's new coronavirus variant (B.1.1.7) in Indonesia (Figure 2J). To identify the most frequently discussed topics in COVID-19 vaccine-related tweets, we visualized the most common hashtags (Supplementary Figure 1) and the sentiment word clouds (Figure 3A-3B). We found that the most prevalent hashtag was "#COVID-19" with 29,543 tweets, followed by "IndonesianPoliceCorpsEnforceVaccineDistribution" (Bahasa #PolriKawalDistribusiVaksin) with 11,075 tweets.
When we produced the word cloud visualizations, we eliminated all observed words related to COVID-19 (such as "COVID", "COVID-19", "corona", and "coronavirus") to identify the most meaningful topics in each sentiment group. As illustrated in Figure 3A, before the vaccination rollout, the word clouds for the "positive group" appear to be related to the government's actions to import the Sinovac COVID-19 vaccine. Interestingly, the word "vaccine" emerged as one of the most frequent words in the "negative group", likely because of PDIP politician Ribka Tjiptaning, who firmly stated her refusal of COVID-19 vaccination because the phase III clinical trial had not yet finished [27]. Meanwhile, after the vaccination rollout, the "positive group" word clouds appear to be related to the effectiveness of the COVID-19 vaccine. Finally, we observed some negative reference words reflecting people's concern about side effects following vaccination (Figure 3B). The result of the emotion analysis is presented in Figure 4A The trend analysis results (90 days after the national vaccination rollout) show a significant trend between the daily sentiment scores and the vaccine coverage (P<.0001). In addition, we observed a significant trend between vaccination sentiment scores and the case increase rate (P<.0001) and case fatality rate (P<.0001). We found a strong positive correlation between sentiment scores and the vaccination coverage (r

Indonesian people were reluctant to receive COVID-19 vaccine shots because they were concerned about the vaccine's safety and effectiveness. A study reported that the perceived safety and efficacy of vaccines were strongly associated with the intention to take the vaccine [34]. Moreover, given that Indonesia has the largest Muslim population globally, people are worried that the vaccine might not be halal (permissible in Islam). Lack of trust and fear of side effects were also reported among those who expressed vaccine hesitancy [7].
To gain public trust in the COVID-19 vaccine, the government announced the safety and efficacy of the Sinovac vaccine. The baseline effectiveness of the vaccine strongly influenced the acceptance of a COVID-19 vaccine [35]. The Indonesian Ulama Council had also declared the Sinovac vaccine halal. Surprisingly, however, we still observed predominantly negative sentiments during the first month of the vaccination period. We noticed that the government's top-down communication was slow to counter vaccine hesitancy and the emergence of the antivaccine movement in Indonesia. Moreover, the unpreparedness of the national health system, for example the unreliability of the MoH vaccine recipient data, left many people unregistered in the vaccination database, leading to public distrust [27]. To overcome this, at the end of January 2021, Indonesian Minister of Health Budi Gunadi Sadikin decided to use election voter data instead of MoH data to provide an updated vaccination database [27]. Furthermore, to speed up the COVID-19 vaccine rollout, the government also recruited celebrities and social media influencers among the priority groups for coronavirus vaccines. In a pandemic, conspiracy beliefs and trust in conventional media were reported to be vital determinants of vaccine acceptance [36]. The increasing level of positive vaccination sentiment, with trust as the dominant emotion, may reinforce the acceptance of the COVID-19 vaccine [37]. Meanwhile, the negative sentiments were mainly related to fear, as people worried about side effects after getting vaccinated. Our finding is consistent with a recent study of Australian Twitter users, for whom fear was the top negative emotion toward the COVID-19 vaccine [38]. In addition, negative sentiments arose from people's concerns about vaccine safety [39] and from vaccine hesitancy [34]. WHO had declared vaccine hesitancy one of the leading global health threats [40].
Thus, if fear leads people to refuse the vaccine, it is plausible that Indonesia will face a prolonged pandemic, potentially contributing to the emergence of new COVID-19 variants that may reduce the efficacy of existing vaccines and increase the risk of infections and deaths [1]. Our findings imply that the trend in public online COVID-19 vaccination sentiment is significantly correlated with the increasing trend in vaccination coverage and the decreasing trends in the COVID-19 case increase rate and case fatality rate. These results are similar to those of previous studies that reported a strong positive correlation between the vaccination sentiment score expressed on Twitter and vaccine uptake [19, 21, 39]. Future studies should consider performing social network analysis to explore the interactions between organizational and individual accounts whose most shared tweets or images affect public sentiment on vaccine uptake. Moreover, to speed up vaccination coverage, we need to continue exploring public sentiment on the COVID-19 vaccine over a longer period and analyzing the dynamics of public reactions, specifically toward particular vaccine brands, and their correlation with individual willingness to get vaccinated.

Our study has several limitations. First, our findings are limited to specific settings (e.g., a pandemic situation, a particular country) and datasets (e.g., young adults as the majority of Twitter users in Indonesia). This decreases the generalizability of the findings to the general population and the global situation. Second, we were unable to confirm the geographical location of each tweet, so we could not perform a clustering analysis to identify the sentiment distribution of COVID-19 vaccine-related tweets by province in Indonesia. Third, we analyzed public sentiments during the early phase of COVID-19 vaccination implementation in Indonesia (the first three months).
A longer observation period may capture more recent information on COVID-19 vaccination, which could affect the overall predominant emotions and the sentiment trend. Lastly, although it is permissible to reuse public datasets on Twitter, we are concerned about the ethical issues of acquiring individual opinions without obtaining users' informed consent. However, according to the ethical framework for using social media data [41], tweets can be reproduced in academic publications as long as users' anonymity is ensured and their sensitive personal information is protected. Thus, our use of social media data does not violate research ethics.

In conclusion, this study highlights the utility of social media sentiment analysis based on Twitter data for identifying the correlation between online public sentiments on the COVID-19 vaccine and vaccination coverage in Indonesia. Our study found that public sentiments in COVID-19 vaccine-related tweets gradually shifted from negative to positive due to the government's proactive risk communication and immediate actions to clarify misinformation surrounding the COVID-19 vaccine. At the beginning of the national vaccination rollout, negative sentiments dominated positive sentiments, with anticipation as the predominant emotion. However, the trend gradually changed to positive sentiments with growth in the emotion of trust, which is consistent with the positive appeal of recent news about COVID-19 vaccine safety and the government's proactive risk communication. The rising trend in the emotion of fear was quite concerning, as it can undermine people's willingness to receive the vaccine, putting national vaccination uptake at risk. We also observed a significant increase in the vaccination sentiment trend, which correlated strongly and positively with vaccination coverage and strongly and negatively with the case increase rate and case fatality rate during a pandemic situation.
The findings of this study suggest that social media sentiment analysis can facilitate real-time monitoring of online COVID-19 vaccine-related issues so that governments can take proactive measures to enhance vaccine uptake and address misinformation and inappropriate behaviors regarding COVID-19 vaccination. Our study supports the development of effective communication strategies through the collaborative efforts of public health officers, policymakers, and media sources to build public trust and disseminate credible information on COVID-19 vaccine-related issues, in order to improve national vaccine coverage in a pandemic situation. We believe that this finding will be helpful for countries seeking to identify and develop strategies to speed up the vaccination rate by monitoring netizens' dynamic reactions and expressions on social media, especially Twitter, using sentiment analysis.

The authors declare no conflicts of interest in this paper.

World Health Organization (WHO), WHO COVID-19 Situation Report, Coronavirus disease (COVID-19) Weekly Epidemiological Update
World Health Organization (WHO), WHO Coronavirus disease (COVID-19): Vaccine access and allocation
World Health Organization (WHO), WHO Concept for fair access and equitable allocation of COVID-19 health products
Status of COVID-19 Vaccines within WHO EUL/PQ evaluation process
World Health Organization (WHO), WHO Indonesia Coronavirus Disease 2019 (COVID-19) Situation
A global database of COVID-19 vaccinations
Ministry of Health (MoH) Indonesia, COVID-19 Vaccine Acceptance Survey in Indonesia, The Ministry of Health
Using Big Data to Monitor the Introduction and Spread of Chikungunya
What social media told us in the time of COVID-19: a scoping review
Adapting and Extending a Typology to Identify Vaccine Misinformation on Twitter
Addressing COVID-19 Misinformation on Social Media Preemptively and Responsively, Emerg Infect Dis
Dynamic Public Perceptions of the Coronavirus Disease Crisis, the Netherlands, 2020, Emerg Infect Dis
Public Perception of the COVID-19 Pandemic on Twitter: Sentiment Analysis and Topic Modeling Study
Social Media Data Analytics for Outbreak Risk Communication: Public Attention on the "New Normal" During the COVID-19 Pandemic in Indonesia
The impact of social media-based support groups on smoking relapse prevention in Saudi Arabia
Social media as a primary source of medical knowledge acquisition and dissemination
Twitter Communication During an Outbreak of Hepatitis A in San Diego
Trends of infodemiology studies: a scoping review
Social media use and influenza vaccine uptake among White and African American adults
Use of Deep Learning to Analyze Social Media Discussions About the Human Papillomavirus Vaccine
Assessing vaccination sentiments with online social media: implications for infectious disease dynamics and control
Indonesia: number of Twitter users
Assessment of Public Attention, Risk Perception, Emotional and Behavioural Responses to the COVID-19 Outbreak: Social Media Surveillance in China
Breakdown of social media* users by age and gender in Indonesia as of
Drone Emprit Academic: Software for social media monitoring and analytics
Jeroan Drone Emprit: NLP, Sentiment, Emotion, Bot, dan Demography Analysis
Office of Assistant to Deputy Cabinet Secretary for State Documents & Translation Republic of Indonesia, COVID-19 Vaccine News, CABINET SECRETARIAT OF THE REPUBLIC OF INDONESIA
Ministry of Health (MoH) Indonesia, National Dashboard for COVID-19 vaccination
National Dashboard for COVID-19 statistics
Automatic term and relation extraction for medical question answering system
Choosing Wisely: Using the Appropriate Statistical Test for Trend in SAS
Tableau Software LLC, Free Data Visualization Software: Public Tableau
COVID-19 Vaccine Hesitancy Worldwide: A Concise Systematic Review of Vaccine Acceptance Rates, Vaccines (Basel
Acceptance of a COVID-19 Vaccine in Southeast Asia: A Cross-Sectional Study in Indonesia
Conspiracy beliefs and trust as determinants of COVID-19 vaccine acceptance in Bali, Indonesia: Cross-sectional study
COVID-19 Vaccine-Related Discussion on Twitter: Topic Modeling and Sentiment Analysis
Tweet Topics and Sentiments Relating to COVID-19 Vaccination Among Australian Twitter Users: Machine Learning Analysis
Artificial Intelligence-Enabled Analysis of Public Attitudes on Facebook and Twitter Toward COVID-19 Vaccines in the United Kingdom and the United States: Observational Study
World Health Organization (WHO), Ten threats to global health in 2019
The Ethics of Using Social Media Data in Research: A New Framework, The Ethics of Online Research

The authors would like to acknowledge KawalCOVID19, MoH Indonesia, and Drone Emprit Academic, a.k.a. Media Kernel Indonesia, for providing the platform for our data collection and analysis.