key: cord-0729732-ir7zw0kh authors: Yoo, Myungeun; Jang, Chan Woong title: Physical rehabilitation on social media during COVID-19: topics and sentiments analysis of tweets date: 2021-10-12 journal: Ann Phys Rehabil Med DOI: 10.1016/j.rehab.2021.101589 sha: b6cc20c54d582872e133e3249197fa846d37e5ac doc_id: 729732 cord_uid: ir7zw0kh

Kingdom. In addition, younger people and people with higher incomes are more likely to use Twitter; use is not related to education or sex. [1] Coronavirus disease 2019 (COVID-19) has affected people's daily lives and, subsequently, the physical rehabilitation delivery system, because physical rehabilitation facilities have been closed entirely or closed to visitors. During the current COVID-19 pandemic, Twitter has been successfully used for monitoring the dynamics of public opinion and health behaviors. [2] [3] [4] To obtain insights into how people are managing the current situation and what they need, physical rehabilitation professionals and health regulators need to understand what people feel and think about physical rehabilitation. Thus, we explored the quantity of social media (Twitter) activity, the topics people discussed, and the sentiments expressed in relation to physical rehabilitation. On July 15, 2020, we used the program Hydrator to collect the content of specific tweets related to COVID-19 from the Twitter dataset of Chen et al. [5] We then obtained the full text and extra information (retweet counts and favorite counts) for each tweet posted from July 1, 2020, to July 14, 2020. Because of the rapidly evolving and spreading nature of COVID-19, we limited our study to the 2 weeks immediately before the search, when the most confirmed cases occurred in the United States. [6] Python 3.6.2 (Python Software Foundation) was used to preprocess and analyze the collected data. The preprocessing eliminated tweets written in non-English languages, duplicate tweets, and retweets.
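The filtering step just described (dropping non-English tweets, duplicates, and retweets) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the dictionary keys (`text`, `lang`, `is_retweet`), the `RT @` prefix check, and the function name `filter_tweets` are hypothetical, and duplicates are detected here by comparing text after trimming whitespace and lowercasing.

```python
def filter_tweets(tweets):
    """Keep English, non-retweet tweets and drop repeated texts.

    Each tweet is a dict with hypothetical keys 'text', 'lang', 'is_retweet'.
    """
    seen = set()
    kept = []
    for tweet in tweets:
        if tweet.get("lang") != "en":
            continue  # written in a non-English language
        if tweet.get("is_retweet") or tweet["text"].startswith("RT @"):
            continue  # retweet
        key = tweet["text"].strip().lower()
        if key in seen:
            continue  # same text already kept (ignoring case and outer spaces)
        seen.add(key)
        kept.append(tweet)
    return kept


# Made-up sample records for illustration only (not study data).
sample = [
    {"text": "Back at rehab today", "lang": "en", "is_retweet": False},
    {"text": "Back at rehab today", "lang": "en", "is_retweet": False},
    {"text": "RT @user: rehab news", "lang": "en", "is_retweet": True},
    {"text": "Rehabilitación en casa", "lang": "es", "is_retweet": False},
]
filtered = filter_tweets(sample)
print(len(filtered))
```

Only the first record survives in this sample: the second is a duplicate, the third a retweet, and the fourth is not in English.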
Then, we converted each tweet to lowercase and removed stop words (e.g., "is" or "at"), non-English letters, URLs (strings starting with "http://" or "pic.twitter.com/"), email addresses, and emoticons. [7] To analyze tweets referring to physical rehabilitation, we selected the following keywords based on Google Trends (http://www.google.com/trends/): "rehab," "rehabilitation," "pmr," "pm&r," "physical therapy," "occupational therapy," and "speech therapy," and extracted tweets containing these keywords. Additionally, to compare the quantity of social media activity and the sentiments, we used tweets mentioning drug treatment, the mainstay of acute care for coronavirus infection. Drug treatment-related tweets were extracted by using the keyword "drug" together with "remdesivir," "chloroquine," and "favipiravir," as representative candidate drugs for COVID-19. We performed lemmatization, that is, replacing each word with its normalized form. We then quantified public interest on Twitter in the keywords related to physical rehabilitation by using the total number of tweets, favorites, and retweets during the study period. Moreover, to identify the hidden topics from each tweet's text, we adopted the latent Dirichlet allocation algorithm (gensim Python package) with Mallet's implementation. [8, 9] This unsupervised machine-learning algorithm automatically extracts topics from a large volume of unstructured text. On the basis of the coherence score, that is, the degree of semantic similarity between highly contributing words in a topic, 10 topics were extracted for the 2-week study period. [10] The automatically extracted topics were then re-labelled by the authors according to the top 10 contributing words from each topic. TextBlob, a Python library for processing textual data, was used for sentiment assessment.
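As an illustration of the text cleaning and keyword extraction described above, the following minimal sketch uses only the Python standard library. The keyword lists are quoted from the text; everything else is an assumption made for the example: the tiny stop-word set, the regular expressions, the whole-word matching rule, and the function names `clean_tweet` and `mentions`. Lemmatization is omitted.

```python
import re

# Keyword lists quoted from the text.
REHAB_KEYWORDS = ["rehab", "rehabilitation", "pmr", "pm&r",
                  "physical therapy", "occupational therapy", "speech therapy"]
DRUG_KEYWORDS = ["drug", "remdesivir", "chloroquine", "favipiravir"]

# Tiny illustrative stop-word set (assumption; the study's list is not given).
STOP_WORDS = {"is", "at", "the", "a", "my"}

URL_RE = re.compile(r"(?:https?://|pic\.twitter\.com/)\S*")
EMAIL_RE = re.compile(r"\S+@\S+\.\S+")


def clean_tweet(text):
    """Lowercase, strip URLs/emails/non-letters, and drop stop words."""
    text = text.lower()
    text = URL_RE.sub(" ", text)
    text = EMAIL_RE.sub(" ", text)
    # Keep only a-z, '&' (needed for "pm&r"), and whitespace; this also
    # removes digits, punctuation, and emoji.
    text = re.sub(r"[^a-z&\s]", " ", text)
    return " ".join(w for w in text.split() if w not in STOP_WORDS)


def mentions(text, keywords):
    """True if any keyword occurs as a whole word or phrase (assumed rule)."""
    for keyword in keywords:
        pattern = r"(?<![a-z&])" + re.escape(keyword) + r"(?![a-z&])"
        if re.search(pattern, text):
            return True
    return False


cleaned = clean_tweet("My physical therapy is at home now! http://t.co/abc")
print(cleaned, mentions(cleaned, REHAB_KEYWORDS), mentions(cleaned, DRUG_KEYWORDS))
```

Whole-word matching is a design choice for this sketch: it keeps "drug" from matching "drugstore" but would also miss compound names that merely contain a keyword, so a real pipeline would need to decide which behavior is wanted.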
[11] TextBlob provides polarity scores for sentences ranging from -1.0 to 1.0, with -1.0 representing the most negative sentiment and +1.0 the most positive sentiment. If the polarity score was >0.05, the tweet was judged positive, and if the score was <-0.05, it was judged negative; otherwise, the tweet was judged neutral. We then examined the daily trend of polarity scores across tweets related to physical rehabilitation and drug treatment, respectively. Further analysis involved finding the contribution of each topic to the sentiment scores. Tweets were collected separately for each topic to obtain average scores. In total, 53,393,925 tweets were retrieved (30.7 GB), and 37,658,147 (71%) were in English. After data preprocessing, 8,989,043 (17%) tweets remained: 64,630 and 5,628 belonged to drug treatment and physical rehabilitation keyword tweets, respectively. Tweets referring to physical rehabilitation represented 0.06% of the preprocessed tweets, fewer than those referring to drug treatment (0.72%). The total favorite and retweet counts for these tweets were 113,829,754 and 30,329,903, respectively. The drug treatment tweets received 1,131,333 favorites and 473,436 retweets, and the physical rehabilitation tweets received 98,777 and 22,350, respectively. Almost half of the tweets related to physical rehabilitation were uploaded in the United States (44%), followed by the United Kingdom (10%), India (9%), Australia (5%), and Canada (4%). According to the uploader's classification, non-professionals, including independent users, non-physician staff, news media, and unknown origin, wrote 78% of tweets, whereas independent professionals, including physicians and therapists, wrote about 15% of tweets. The other 7% of those tweets were uploaded by official academic institutions, organizations, and hospitals.
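The polarity thresholds stated above (>0.05 positive, <-0.05 negative, otherwise neutral) and the daily averaging of scores can be sketched as follows. This is a minimal sketch, not the authors' implementation: the function names and the sample scores are made up for illustration. In practice the score for each tweet would come from a sentiment library, for example `TextBlob(text).sentiment.polarity`; here the scores are supplied directly so that the example runs with the standard library alone.

```python
from collections import Counter, defaultdict
from statistics import mean

POSITIVE_CUTOFF = 0.05   # thresholds as stated in the text
NEGATIVE_CUTOFF = -0.05


def label(polarity):
    """Map a polarity score in [-1.0, 1.0] to a sentiment label."""
    if polarity > POSITIVE_CUTOFF:
        return "positive"
    if polarity < NEGATIVE_CUTOFF:
        return "negative"
    return "neutral"


def daily_means(scored):
    """Average polarity per day; `scored` holds (date_string, polarity) pairs."""
    by_day = defaultdict(list)
    for day, polarity in scored:
        by_day[day].append(polarity)
    return {day: mean(values) for day, values in by_day.items()}


# Made-up scores for illustration only (not study data).
scored = [
    ("2020-07-01", 0.5),
    ("2020-07-01", -0.25),
    ("2020-07-02", 0.0),
    ("2020-07-02", 0.25),
]
labels = Counter(label(p) for _, p in scored)
means = daily_means(scored)
print(labels, means)
```

Note that a score exactly equal to 0.05 or -0.05 falls in the neutral band under the strict inequalities given in the text.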
For topic modeling, the top 5 topics by tweet volume are shown in Table 1, which also lists the words that contributed to each topic, from top to bottom in order of frequency. Among the topics, "schedule" was associated with the largest number of tweets, followed by "virus," "wear mask," "physical therapy," and "infection." According to the sentiment analysis, over the 2 weeks, 2,381 (42%) tweets were classified as containing a positive sentiment, 2,339 (42%) a neutral sentiment, and 908 (16%) a negative sentiment. The Figure illustrates that the daily mean polarity scores were higher for physical rehabilitation than drug treatment tweets on most days. The physical rehabilitation tweets had a positive polarity score of ≥ 0.05 every day for 2 weeks. As indicated in Table 1, the other topics were neutral, whereas "physical therapy" was the only positive topic, with an average polarity score of 0.115. Unexpectedly, few tweets were related to physical rehabilitation of COVID-19 patients. Moreover, the amount of public activity on social media, based on the number of tweets, retweets, and favorite counts, was smaller for physical rehabilitation than drug treatment tweets. Thus, people were more focused on COVID-19-related pharmacology during the 2-week period. Given the criticality of COVID-19 and that this study covered only the 2 weeks with the most confirmed cases, it is not surprising that public interest was higher for drug treatment, as part of acute care, than for physical rehabilitation. However, from the early phase of the global pandemic, many experts have emphasized the significance of rehabilitation after acute COVID-19 infection. [12] [13] [14] An analysis of just 2 weeks of tweets cannot be conclusive, but researchers should consider that there were few references to physical rehabilitation as compared with drug treatment.
The topic "schedule" was mainly related to tweets referring to the duration of treatment before family members or friends died of or survived COVID-19. These tweets were included in the study's search results because they mentioned physical rehabilitation during hospital treatment. The topics "virus" and "infection" were primarily linked to tweets about the negative effects of coronavirus infections on the body, including effects requiring physical rehabilitation. From the extracted topics "wear mask" and "physical therapy," their contributing words, and the positive polarity scores of physical rehabilitation tweets over the 2 weeks, we deduced that there were concerns regarding the spread of coronavirus within facilities but also expectations for reopening facilities after the lockdown. To date, physical rehabilitation centres have been closing and reopening according to the number of confirmed cases. Healthcare professionals should be aware of this mix of public concern and expectation and keep in mind that compliance with quarantine guidelines is necessary when providing rehabilitation treatment. One interesting finding is that the "physical therapy" topic was related to "home," which denotes home-visit or home-based programs. This finding may reflect people receiving personal rehabilitation treatment at home because of strict quarantine regulations. Although not among the most frequent words and topics, "tele-rehabilitation" appeared in tweets from some Twitter accounts. Overall, 51 of the physical rehabilitation-related tweets mentioned telemedicine, and several examples are shown in Table 2. This observation may reflect that a paradigm shift in physical rehabilitation is under way in the non-contact era and that the public is accepting this change. Some studies have already reported that innovative approaches to rehabilitation, such as virtual rehabilitation or tele-medicine, may be preferred to contact interactions.
[15, 16] Similarly, it is meaningful that sentiments regarding "physical therapy" were the most positive. These sentiments may express hope for the normalization of physical rehabilitation facilities or the expectation of new rehabilitation programs. Therefore, physical rehabilitation professionals should be aware of the need and hope for a transition from facility-based physical rehabilitation to an environment in which human contact is minimized and personalized programs can be performed. There are several limitations to this study. First, this study was performed at a specific phase of the pandemic. Given that the pandemic keeps changing with different phases, future studies at different times with different conditions are necessary. Second, we excluded non-English tweets. However, because more than 70% of the collected tweets were in English, we considered that analyzing only English tweets did not create a large bias. Third, only Twitter data were analyzed. Although the use of Twitter data for public health surveillance has proven reliable in previous studies, there may be a selection bias, with Twitter users not being representative of the general population. [1, 17-19] To mitigate this bias, we analyzed Twitter data from the period with the most confirmed COVID-19 cases in the United States, the country with the highest Twitter utilization. Fourth, the keywords used were limited. The word "rehabilitation" can be used with meanings other than the medical one, and some terms may be used more frequently in particular regions. We attempted to address this by searching for other keywords along with "rehabilitation" by using Google Trends, which has been found useful in various health care research. [20] For example, according to Google Trends, "pmr" and "physical therapy" are used more than "prm" and "physiotherapy" worldwide. To our knowledge, this is the first study to analyze social media data regarding physical rehabilitation in the COVID-19 pandemic.
In conclusion, physical rehabilitation topics in social media during

Table footnote: if the polarity score was >0.05, the topic was considered positive, and if <-0.05, negative; otherwise, the topic was considered neutral.

References:
Representativeness of social media in Great Britain: investigating Facebook
Public perception of the COVID-19 pandemic on Twitter: sentiment analysis and topic modeling study
Twitter sentiment analysis during COVID-19 outbreak. Available at SSRN 3572023
Topic detection and sentiment analysis in Twitter content related to COVID-19 from Brazil and the USA
Tracking social media discourse about the COVID-19 pandemic: development of a public coronavirus Twitter data set
Short text similarity measurement using context-aware weighted biterms
Latent Dirichlet allocation
Package 'mallet'. A wrapper around the Java machine learning tool MALLET
Exploring the space of topic coherence measures
Release 0.15
Considerations for postacute rehabilitation for survivors of COVID-19
The war on COVID-19 pandemic: role of rehabilitation professionals and hospitals
The role of physical and rehabilitation medicine in the COVID-19 pandemic: the clinician's view
Cardiac rehabilitation during quarantine in COVID-19 pandemic: challenges for center-based programs
Rehabilitation after critical illness in people with COVID-19 infection
Investigating public health surveillance using Twitter
You are what you tweet: analyzing Twitter for public health
Twitter as a tool for health research: a systematic review
The use of Google Trends in health care

We thank all persons, including Emily Chen, for providing Twitter ID data related to COVID-19.