key: cord-0694297-yvny0lap
authors: Bardier, Cortni; Yang, Joshua S.; Li, Jiawei; Mackey, Tim K.
title: Characterizing alternative and emerging tobacco product transition of use behavior on Twitter
date: 2021-08-09
journal: BMC Res Notes
DOI: 10.1186/s13104-021-05719-0
sha: 921c7f8bed169b15efc97a4727889ab8efb25dba
doc_id: 694297
cord_uid: yvny0lap

OBJECTIVE: The objective of this study was to develop an inductive coding approach specific to characterizing user-generated social media conversations about transition of use of different tobacco and alternative and emerging tobacco products (ATPs).

RESULTS: A total of 40,206 geocoded tweets from 2018 to 2019 were collected from the Twitter public API stream. Using data mining approaches, these tweets were then filtered for keywords associated with tobacco and ATP use behavior. This resulted in a subset of 5718 tweets, of which 657 were manually annotated and identified as associated with user-generated conversations about tobacco and ATP use behavior. The 657 tweets were coded into 9 parent codes: inquiry, interaction, observation, opinion, promote, reply, share knowledge, use characteristics, and transition of use behavior. The highest number of observations occurred under transition of use (43.38%, n = 285), followed by current use (39.27%, n = 258), opinions about use (7.00%, n = 46), and product promotion (5.63%, n = 37). The other codes each had fewer than ten tweets. Results provide early insights into how social media users discuss topics related to transition of use and their experiences with different and emerging tobacco product use behavior.

Social media is now a common source of health-related information [1]. This includes user-generated conversations about a variety of topics, with an emerging field focused on better understanding knowledge, attitudes, and behaviors related to tobacco, alternative and emerging tobacco products (ATPs), and electronic nicotine delivery systems (ENDS) [2, 3].
User-generated social media conversations can be assessed [4] to better understand how health behaviors are changing closer to real time [5]. This approach has certain advantages over traditional survey methodology, including faster identification of emerging trends [6]. However, methods to appropriately code social media content for specific health-related topics remain underdeveloped, particularly for characterizing transitions in behaviors that change over time.

Twitter is a microblogging social networking platform that allows users to post messages of up to 280 characters, which can then be retweeted, favorited, and shared across a network of online users [7]. Users can form online communities [8] by interacting with other users who share similar beliefs, interests, and opinions about topics. This includes users who initiate, use, and transition between different tobacco, ATP, and ENDS products [9, 10]. In fact, Twitter has specifically become a platform for sharing information about electronic cigarettes (e-cigarettes) [11-14], a nicotine delivery device that has been commercially available only in the past decade [15].

Bardier et al. BMC Res Notes (2021) 14:303

Studies have shown that online searches for electronic cigarettes have increased, which is evidence of the growing popularity of vaping behavior [16]. However, increased uptake of different types of e-cigarettes (e.g., Juul, heat-not-burn), particularly among youth and young adults, has not been without controversy [17]. Ongoing concerns about the long-term health impact of nicotine consumption [18], e-cigarette-related adverse events [19] (e.g., the 2019 outbreak of e-cigarette or vaping product use-associated lung injury [20-22]), and mixed evidence about the efficacy of ATPs as cessation devices continue to generate public health and patient safety concerns [23, 24].
These concerns are accentuated when trying to assess the interaction of use behavior between traditional combustible tobacco products (e.g., cigarettes, cigars) and ENDS [25]. The pathways of transition of tobacco and ATP use remain a relatively underdeveloped area of study [26]. Open questions include which products users initiate on, why they switch between products, and what unique health harms are related to dual use (i.e., simultaneous use of both combustible products and ATPs/ENDS). Hence, the objective of this study was to examine Twitter user conversations about transition of use associated with ENDS, with a focus on developing an inductive coding approach specific to characterizing transition of use knowledge, attitudes, and behaviors.

We conducted a retrospective observational social media study in two phases: (1) data collection; and (2) content analysis using an unsupervised machine learning and inductive coding approach. Inductive content analysis was used to identify and characterize posts relevant to tobacco and ATP use (i.e., "signal" tweets). It involved manual annotation by coders with training in tobacco and substance use behavior, and the results were used to generate a codebook of transition and behavior-related themes that could also be iterated on in future social media studies.

Data were first collected from the Twitter public streaming API with a filter to collect all geocoded tweets located in the United States, with no further language or demographic restrictions. The data collection period ran from 07/21/2018 to 07/21/2019. This initial dataset of geocoded tweets was then filtered for the keywords and hashtags "vape" and "vaping" in order to better isolate Twitter posts relevant to the study aims and for preliminary data analysis of ENDS behavior.
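The keyword filtering step described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the record layout (`text`, `posted`, `geocoded`) is hypothetical, and the choice of whole-word, case-insensitive matching with inclusive date bounds is an assumption, since the text does not say how matches were determined.

```python
import re
from datetime import date

# Assumption: whole-word, case-insensitive match on "vape"/"vaping",
# with or without a leading "#". The paper does not specify matching rules.
KEYWORD_PATTERN = re.compile(r"(?<!\w)#?(?:vape|vaping)\b", re.IGNORECASE)

# Collection window reported in the text (inclusive bounds are an assumption).
START, END = date(2018, 7, 21), date(2019, 7, 21)


def matches_keywords(text: str) -> bool:
    """Return True if the text contains vape/vaping as a keyword or hashtag."""
    return KEYWORD_PATTERN.search(text) is not None


def filter_tweets(tweets: list[dict]) -> list[dict]:
    """Keep geocoded tweets inside the collection window that match the keywords."""
    return [
        t for t in tweets
        if t["geocoded"] and START <= t["posted"] <= END and matches_keywords(t["text"])
    ]


# Hypothetical records for illustration only.
sample = [
    {"text": "Finally switched to #vaping", "posted": date(2018, 9, 1), "geocoded": True},
    {"text": "Local vapers meetup", "posted": date(2018, 9, 1), "geocoded": True},
    {"text": "my vape broke again", "posted": date(2020, 1, 1), "geocoded": True},
]
kept = filter_tweets(sample)
```

Under these assumptions, only the first sample record is kept: the second contains "vapers" rather than an exact keyword, and the third falls outside the collection window. A substring match would be a looser alternative that also captures forms such as "vapers" and "vaped".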
The collected data included the textual content of each tweet, user and account information, URLs, and the time and date of the post.

To identify themes in our full corpus of tweets, we used an unsupervised machine learning approach called the Biterm Topic Model (BTM), which is designed to detect patterns in data and summarize an entire corpus of tweets into distinct, highly correlated categories [27]. BTM sorts short text into highly prevalent themes without the need for predetermined coding or training and has previously been used to explore key public health topics [28-31]. For each topic, BTM generates the top 20 words that represent the topic cluster. We reviewed these topics and selected those representing clusters of Twitter conversations relevant to vaping and transition of use. In this way, the BTM output allowed us to identify "signal" topics and eliminate irrelevant ones. BTM topics were generated after applying the keyword filters and were included for further analysis if they were pertinent to vape and vaping behavior. Topics were excluded if their content was irrelevant or appeared to correlate with non-user-generated conversations (e.g., news tweets).

We then extracted all posts from the selected vaping BTM topics and manually coded the content of tweets in these topics to ensure relevance to user-generated tobacco and ENDS use behavior. Posts were excluded as signals if they were: (1) news related and not organically user-generated content; (2) not written in English; or (3) retweets, with each retweeted tweet counted only once. Original tweets, replies, and tweets containing photos or videos were all included so that contextual information could be assessed in addition to the text of the tweet. Transition of use was classified as switching from one tobacco or ATP/ENDS product to another.
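To make the Biterm Topic Model mentioned above more concrete, the sketch below shows only its input representation: a "biterm" is an unordered pair of words that co-occur in the same short text, and BTM fits topics over the biterms pooled from the whole corpus. The code does not perform topic inference, and the whitespace tokenization and example texts are illustrative assumptions rather than the authors' preprocessing.

```python
from collections import Counter
from itertools import combinations


def biterms(tokens: list[str]) -> list[tuple[str, str]]:
    """Return every unordered word pair co-occurring in one short text."""
    # Sorting each pair makes ("vape", "quit") and ("quit", "vape") identical.
    return [tuple(sorted(pair)) for pair in combinations(tokens, 2)]


def corpus_biterm_counts(docs: list[str]) -> Counter:
    """Pool biterms across all documents, as BTM does before fitting topics."""
    counts: Counter = Counter()
    for doc in docs:
        # Assumption: lowercase and split on whitespace; real pipelines
        # usually also strip punctuation, URLs, and stop words.
        counts.update(biterms(doc.lower().split()))
    return counts


# Hypothetical short texts for illustration only.
pairs = biterms(["quit", "smoking", "vape"])
counts = corpus_biterm_counts(["quit smoking vape", "vape quit"])
```

Pooling word pairs across the corpus is what lets this family of models cope with tweets, where a single document contains too few words for document-level co-occurrence statistics to be reliable.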
Tweets and any associated URLs/hyperlinks were aggregated into a table and imported into Atlas.ti qualitative software for content analysis [32]. A first iterative, inductive analysis of the data was conducted (JSY) to identify thematic areas and classify tweets into codes with code descriptions. Tweets were read to identify thematic areas in the dataset and then coded based on thematic areas of interest. Codes and code descriptions were developed and modified iteratively throughout the coding process. A second analysis of the dataset was undertaken to expand the codebook to include subcodes. Subcodes and subcode descriptions were created and modified iteratively during this second round of data coding. Once a coding scheme was developed, the data were coded, extracted, and reviewed by a second coder (CB) to assess the validity of the coding scheme. The final coding scheme and distribution of codes are presented in Fig. 1 and Table 1.

Data were collected from the Twitter public API stream and included publicly available tweets that were filtered for posts with geolocation/geotagged information. As the study did not involve human subjects, involved no interactions with online users, and only used publicly available data that was further de-identified for research purposes, ethics and IRB approval was not required, and Twitter users were not asked to consent to participation in this study [33]. Any user-identifiable information was removed from the study results.

A total of 40,206 tweets were collected after filtering for "vape" and "vaping" keywords/hashtags. After data filtering, we ran BTM on the keyword-filtered data to generate topic clusters and reviewed them for relevance to study aims. We chose 16 BTM clusters, which comprised a total of 5728 (14.25%) tweets selected based on word groupings relevant to vaping and ATP/ENDS behavior terms.
After manually annotating these tweets for characteristics relevant to tobacco and ATP/ENDS use and behavior, we removed all non-signal tweets, leaving 589 signal tweets related to transition of use that were further analyzed. The 589 signal posts were categorized into 10 tobacco/ATP/ENDS general use and behavior thematic codes, which are listed in Table 1.

Among codes related to transition of use (48.39%, n = 285), thirteen distinct tobacco/ATP/ENDS transition pathways were identified; the term "vaping" was used to describe both nicotine vaping and vaping of cannabis-based products. The transitions detected, as percentages of the 589 signal tweets, were cannabis to cannabis (0.51%, n = 3), cannabis to e-cigarettes (0.68%, n = 4), chewing tobacco to e-cigarettes (1.02%, n = 6), cigarettes to e-cigarettes (27.16%, n = 160), cigarettes to no product (1.36%, n = 8), cigarettes to vape cannabis (0.68%, n = 4), e-cigarette to cannabis (0.17%, n = 1), e-cigarette to cigarette (1.19%, n = 7), e-cigarette to e-cigarette (2.21%, n = 13), e-cigarette to no product (7.30%, n = 43), no product to e-cigarette (2.89%, n = 17), no product to vape cannabis (2.55%, n = 15), and unknown product to e-cigarette (0.68%, n = 4). There were also transitions among different ATP product types as well as cannabis product types, one of which was vaping a cannabis product.

Vaping use factors that were observed as influencing transition of use included self-reported addiction prompting use, reaction to adverse symptoms, cost of ATPs/ENDS, faulty or broken ATPs/ENDS, preference for flavors, losing or misplacing ATPs/ENDS, interest in polysubstance use, concern about reducing nicotine levels, stigma, and the alleged therapeutic effects of vaping, especially of cannabis.

Table 1 excerpt (subcode: vaping trick, such as blowing clouds): "I have been trying to nail this trick for a while and I finally succeed, I was so shocked. #vape#vapetricks"
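The share of each transition pathway follows directly from the reported counts and the 589 signal tweets. The short sketch below recomputes those shares as percentages; the counts are taken from the text, while the dictionary layout and function name are illustrative.

```python
# Counts per transition pathway as reported in the text.
TRANSITION_COUNTS = {
    "cannabis to cannabis": 3,
    "cannabis to e-cigarettes": 4,
    "chewing tobacco to e-cigarettes": 6,
    "cigarettes to e-cigarettes": 160,
    "cigarettes to no product": 8,
    "cigarettes to vape cannabis": 4,
    "e-cigarette to cannabis": 1,
    "e-cigarette to cigarette": 7,
    "e-cigarette to e-cigarette": 13,
    "e-cigarette to no product": 43,
    "no product to e-cigarette": 17,
    "no product to vape cannabis": 15,
    "unknown product to e-cigarette": 4,
}
SIGNAL_TWEETS = 589  # denominator reported in the text


def percent(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100 * count / total, 2)


shares = {name: percent(n, SIGNAL_TWEETS) for name, n in TRANSITION_COUNTS.items()}
total_transitions = sum(TRANSITION_COUNTS.values())

# Print pathways from most to least frequent.
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {share}% (n = {TRANSITION_COUNTS[name]})")
```

The thirteen counts sum to 285, which matches the reported total for transition of use, and 285 of 589 is 48.39%.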
This study explored user-generated conversations occurring on Twitter in relation to tobacco and ATP/ENDS use, with a specific focus on transition of use between these highly addictive products. We observed that this subset of Twitter users actively tweeted about their experience using tobacco and ATPs/ENDS, providing valuable information about a behavior that is influenced by a changing landscape of new and emerging nicotine products. The majority of tweets reviewed related to tobacco and ATP/ENDS use and behavior characteristics. These included users asking about tobacco/ATP/ENDS products and how to quit, observations of tobacco/ATP/ENDS use behavior, opinions about products and vaping (including claims that vaping is a healthier alternative to tobacco or has therapeutic benefits), sharing of knowledge about tobacco/ATP/ENDS products, and specific characteristics of use (e.g., addiction, adverse events, costs, flavoring, tricks).

Close to half of all conversations discussed transition of use behavior. In these conversations, users actively discussed the types of tobacco/ATP/ENDS products they used and switched between, and gave reasons for changing products. A wide variety of tobacco/ATP/ENDS products were mentioned, including combustible tobacco products (e.g., cigarettes), chewing tobacco, different types of e-cigarettes (Juul, vaping pens, etc.), and cannabis smoking products. Transition was observed between different products and within specific product classes (i.e., transitioning from one type of e-cigarette product to another), with some users (n = 32) self-reporting polytobacco and polysubstance behavior (e.g., smoking cigarettes and also vaping). Users expressed a range of sentiments about different products, including how products could act as substitutes for others, what products made them feel better, attempts to quit use of one product by switching to another, and issues related to cost and access.
Some users stated that cannabis vaping products helped them with cessation of nicotine addiction. Based on these preliminary results, Twitter appears to enable robust conversation and sharing of information related to tobacco and ATP/ENDS use. It can act as a digital forum in which smokers and vapers accumulate knowledge and share experiences, and it may potentially lead to behavior change associated with nicotine use and addiction.

The results of our study are exploratory in nature and were derived from a sample of general geolocated tweets over a one-year period, which were filtered for common vaping keywords and then analyzed using unsupervised machine learning. The results are not generalizable to overall trends in tobacco or ATP/ENDS behavior, but nevertheless provide important insights into conversations occurring among Twitter users specific to transition of tobacco and nicotine product use. Themes associated with transition of use primarily concerned users navigating quit attempts or reporting past trouble quitting, users who had relapsed to nicotine addiction, and users who had quit cigarettes but still vaped. These results provide early evidence that experiences in transition of use also present opportunities for more targeted cessation interventions, particularly in the context of increasing knowledge of known health harms related to tobacco use and nicotine addiction and exposure [34, 35].

Future work should include confirmatory studies to assess whether the themes observed for transition of use knowledge, attitudes, and behaviors hold true in other digital communities, and should use more structured research approaches to generalize findings. Future studies should also examine other platforms now popular among youth and young adults, such as Instagram, Snapchat, and TikTok.

This study was exploratory and meant to generate hypotheses for future research.
The study's limitations include use of a single platform and the possibility that Twitter user demographics do not reflect those of the general population of tobacco/ATP/ENDS users. The sample of tweets was also a convenience sample generated from geocoded tweets and hence may be subject to sampling bias, as it is estimated that only 1% of all tweets are geocoded [36, 37]. Future studies should use multiple Twitter APIs to generate a more representative Twitter dataset.

[1] Social media: a review and tutorial of applications in medicine and health care
[2] Effective uses of social media in public health and medicine: a systematic review of systematic reviews. OJPHI
[3] Public reactions to e-cigarette regulations on Twitter: a text mining analysis
[4] Classification of twitter users who tweet about E-cigarettes
[5] Vandelanotte C. Are health behavior change interventions that use online social networks effective? A systematic review
[6] E-cigarette surveillance with social media data: social bots, emerging topics, and trends
[7] A scoping review of the use of Twitter for public health research
[8] Building stronger online communities through the creation of facebook-integrated health applications
[9] Organizing online health content: developing hashtag collections for healthier internet-based people and communities
[10] Online tobacco websites and online communities-who uses them and do users quit smoking? The quit-primo and national dental practice-based research network Hi-Quit studies
[11] E-cigarette advocates on twitter: content analysis of vaping-related tweets
[12] E-cigarette promotion on twitter in Australia: content analysis of tweets
[13] E-cigarette social media messages: a text mining analysis of marketing and consumer conversations on twitter
[14] Using twitter data to gain insights into e-cigarette marketing and locations of use: an infoveillance study
[15] Vaping: the new wave of nicotine addiction
[16] Revisiting the rise of electronic nicotine delivery systems using search query surveillance
[17] The rise of e-cigarettes, pod mod devices, and JUUL among youth: factors influencing use, health implications, and downstream effects
[18] E-cigarettes: impact of E-liquid components and device characteristics on nicotine exposure
[19] EVALI and the pulmonary toxicity of electronic cigarettes: a review
[20] E-cigarette, or vaping, product use associated lung injury (EVALI): case series and diagnostic approach
[21] Vitamin E acetate in bronchoalveolar-lavage fluid associated with EVALI
[22] How did beliefs and perceptions about e-cigarettes change after national news coverage of the EVALI outbreak?
[23] Electronic cigarettes as smoking cessation tool: are we there?
[24] Vaping Versus Smoking: A Quest for Efficacy and Safety of E-cigarette
[25] Cigarette and e-cigarette dual use and risk of cardiopulmonary symptoms in the Health eHeart Study
[26] A narrative review evaluating the safety and efficacy of e-cigarettes as a newly marketed smoking cessation tool
[27] A biterm topic model for short texts
[28] Exploring Trends of Nonmedical use of Prescription Drugs and Polydrug Abuse in the Twittersphere Using Unsupervised Machine Learning
[29] Detection of self-reported experiences with corruption on twitter using unsupervised machine learning
[30] Characterizing twitter user topics and communication network dynamics of the "liberate" movement during COVID-19 using unsupervised machine learning and social network analysis
[31] Application of unsupervised machine learning to identify and characterize hydroxychloroquine misinformation on twitter
[32] Basic Content Analysis. 2455 Teller Road, Thousand Oaks California 91320 United States of America
[33] Data Mining on Facebook: A Free Space for Researchers or an IRB Nightmare?
[34] Nursing interventions for smoking cessation in hospitalized patients: a systematic review
[35] What are the respiratory effects of e-cigarettes?
[36] A survey of location inference techniques on Twitter
[37] Twitter geolocation: a hybrid approach
Authors' contributions: CB, JY, JL, and TKM jointly conceived the study, drafted the study, conducted data collection and analysis, and wrote and agreed to the final version of this manuscript. All authors read and approved the final manuscript.

This research was supported by the Tobacco-Related Disease Research Program (Awards #T29IP0465 and #T29IP0384).

The de-identified data that support the findings of this study are available upon request to corresponding author tmackey@ucsd.edu and certain data will be available freely from the website www.ghpolicy.org.

Ethics approval and consent to participate: Not applicable/Not required for this study. All information collected from this study was from the public domain and the study did not involve any interaction with users. Any user identifiable information was removed from the study results.

Not applicable.