key: cord-0980657-mofmyiyv
authors: Guntuku, Sharath Chandra; Sherman, Garrick; Stokes, Daniel C.; Agarwal, Anish K.; Seltzer, Emily; Merchant, Raina M.; Ungar, Lyle H.
title: Tracking Mental Health and Symptom Mentions on Twitter During COVID-19
date: 2020-07-07
journal: J Gen Intern Med
DOI: 10.1007/s11606-020-05988-8
sha: 63c04cbadc353482f3c062126c05779cb2e627ab
doc_id: 980657
cord_uid: mofmyiyv
The magnitude of the novel coronavirus (COVID-19) pandemic has led to considerable economic hardship, stress, anxiety, and concern about the future. Social media can provide a place to take the pulse of mental health in communities. Evaluating the changing use of language on social media can complement traditional survey-based approaches and provide new insights into the well-being of a country or region during a public health crisis. Social media could also enable early symptom discovery for diseases whose pathology is not completely known and is still evolving. [1] We therefore created a dashboard (https://bit.ly/penncovidmap) to monitor and analyze changes in the language expressed on Twitter over the course of the COVID-19 pandemic within the USA, with a specific focus on mental health and symptom mentions.
We are collecting two data sets of publicly accessible streaming data for the dashboard, each containing approximately 5 million tweets/day: (a) a random 1% sample of daily US tweets to infer overall mental health, from which we identify English-language tweets posted from within the USA on the previous day; and (b) tweets containing COVID-19-related keywords, obtained using a public keyword streaming API, to compute symptom mentions per state related to COVID-19. After geolocating all the tweets by mapping posts to states using a combination of location coordinate information and user location descriptions, we extract the relative frequency of single words and phrases (consisting of two or three consecutive words).
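The word and phrase extraction step described above can be illustrated with a minimal sketch. The tokenizer, the function names, and the choice to normalize each n-gram order by its own total count within a group of tweets (for example, one state on one day) are assumptions made here for illustration; the letter does not specify these details.

```python
import re
from collections import Counter


def tokenize(text):
    # Deliberately simple stand-in tokenizer (an assumption): lowercase the
    # text and keep runs of letters, digits, and apostrophes.
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    # Consecutive n-token phrases within one tweet; phrases never span tweets.
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def relative_frequencies(tweets, max_n=3):
    """Return {n: {ngram: count / total n-grams of order n}} for n = 1..max_n."""
    counts = {n: Counter() for n in range(1, max_n + 1)}
    for text in tweets:
        tokens = tokenize(text)
        for n in counts:
            counts[n].update(ngrams(tokens, n))
    freqs = {}
    for n, counter in counts.items():
        total = sum(counter.values())
        freqs[n] = {g: c / total for g, c in counter.items()} if total else {}
    return freqs
```

Calling `relative_frequencies` on the tweets assigned to one state yields one feature dictionary per n-gram order, which is the kind of input a pre-trained language model of sentiment or stress would consume.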
Based on the word and phrase frequencies, mental health estimates are computed on the random 1% sample by applying four pre-trained, data-driven machine learning models: overall sentiment (net positive language) [2], stress [3], anxiety [4], and loneliness expressions [5]. We calculated estimates for these four measures from the national declaration of emergency on March 13 to May 6, 2020, and compared them to the estimates from the same period in 2019, controlling for day-of-the-week and seasonality effects. We quantified the effect size using Cohen's d. Using the second Twitter sample, which contains COVID-19 keywords, we calculate the frequency of Twitter posts relating to different COVID-19 symptoms across states. The study was considered exempt under the University of Pennsylvania Institutional Review Board guidelines.
Comparing the mental health estimates across all the states in the duration after the declaration of emergency from March 13 to May 6, sentiment (Fig. 1a)
Symptom mentions in the COVID-19-related tweets capture emerging symptoms such as a change in smell/taste, body aches, and skin lesions (Fig. 2).
Language used in tweets can provide insight into changes in the mental health of communities during public health emergencies, when widespread polling may not be available. Stress, anxiety, and loneliness are increasingly divergent from 2019 levels. Early recognition of hotspots of declining mental health can lead to community-level interventions, for example by providing increased access to telepsychiatry services, supporting local community partners, and locally employing more paraprofessionals, such as community health workers. Trending symptom mentions may lead to early recognition of new symptoms, such as the recently noted skin findings associated with COVID-19. [6] Several symptoms were reported in COVID-19 tweets before they were added to the symptom list by the Centers for Disease Control and Prevention, and skin lesions have been discussed since March.
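The per-state symptom mention frequencies discussed above can be sketched as follows. The keyword lists, the function name, and the choice of denominator (the fraction of a state's COVID-19-keyword tweets that mention a symptom) are hypothetical and are not taken from the study, which does not describe its symptom lexicon.

```python
import re
from collections import defaultdict

# Hypothetical keyword lists for illustration only; not the study's lexicon.
SYMPTOM_KEYWORDS = {
    "loss of smell/taste": ["loss of smell", "loss of taste", "can't smell", "can't taste"],
    "body aches": ["body ache", "body aches"],
    "skin lesions": ["rash", "covid toes", "skin lesion"],
}

# Whole-word patterns, so that "rash" does not match inside "crash".
SYMPTOM_PATTERNS = {
    symptom: re.compile(r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b")
    for symptom, keywords in SYMPTOM_KEYWORDS.items()
}


def symptom_mention_rates(tweets):
    """tweets: iterable of (state, text) pairs that are already geolocated.

    Returns {state: {symptom: fraction of that state's tweets mentioning it}}.
    """
    totals = defaultdict(int)
    mentions = defaultdict(lambda: defaultdict(int))
    for state, text in tweets:
        totals[state] += 1
        lowered = text.lower()
        for symptom, pattern in SYMPTOM_PATTERNS.items():
            if pattern.search(lowered):
                mentions[state][symptom] += 1
    return {
        state: {s: mentions[state][s] / n for s in SYMPTOM_KEYWORDS}
        for state, n in totals.items()
    }
```

Tracking these rates by day rather than over the whole sample would show when a symptom starts trending in a given state.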
Syndromic surveillance could also enable early recognition of disease re-emergence or spread and more informed distribution of tests and equipment. [1] Limitations of this study include that Twitter users are not representative of all segments of the population and that the language-based estimates are computed on a random 1% data stream of tweets. Further, the lack of polling data means our estimates have not been validated during the assessment period. In future work, we intend to validate these models against gold-standard polling data.
In conclusion, real-time monitoring of location-specific social media posts can provide insight into emerging issues of public concern. Early recognition of local trends can lead to an informed distribution of resources, targeted public health interventions, and better preparedness in this and future public health emergencies.
1. Social Media- and Internet-Based Disease Surveillance for Public Health
2. Crowdsourcing a word-emotion association lexicon
3. Understanding and measuring psychological stress using social media
4. Towards assessing changes in degree of depression through Facebook
5. Studying expressions of loneliness in individuals using Twitter: an observational study
6. Chilblain-like lesions on feet and hands during the COVID-19 pandemic