key: cord-0496953-bqgs4zeg
authors: Lokala, Usha; Lamy, Francois; Dastidar, Triyasha Ghosh; Roy, Kaushik; Daniulaityte, Raminta; Parthasarathy, Srinivasan; Sheth, Amit
title: eDarkTrends: Harnessing Social Media Trends in Substance use disorders for Opioid Listings on Cryptomarket
date: 2021-03-29
journal: nan
DOI: nan
sha: a6d44723c8a8a0629ccff78b03c497629dfbd329
doc_id: 496953
cord_uid: bqgs4zeg

Opioid and substance misuse is rampant in the United States today, a phenomenon known as the opioid crisis. The relationship between substance use and mental health has been extensively studied, with one possible relationship being that substance misuse causes poor mental health. However, the lack of evidence on the relationship has resulted in opioids being largely inaccessible through legal means. This study analyzes substance misuse posts on social media alongside the opioids being sold through cryptomarket listings. We use the Drug Abuse Ontology, state-of-the-art deep learning, and BERT-based models to generate sentiment and emotion labels for social media posts. This lets us study user perception on social media through questions such as: which synthetic opioids people are optimistic, neutral, or negative about; which drugs induce fear and sorrow; which drugs people love or are thankful for; which drugs people think negatively about; and which opioids cause little to no sentimental reaction. We also perform topic analysis associated with the generated sentiments and emotions to understand which topics correlate with people's responses to various drugs. Our findings can help shape policy by isolating opioid use cases where timely intervention may be required to prevent adverse consequences, overdose-related deaths, and a worsening of the epidemic.

North America is facing the worst opioid epidemic in its history.
This epidemic started with the mass diversion of pharmaceutical opioids (e.g., Oxycodone, Hydromorphone), resulting from strong marketing advocacy of the potential benefits of opioids. The increase in opioid use disorder prevalence and in pharmaceutical opioid-related overdose deaths led to stricter distribution of pharmaceutical opioids, which unintentionally caused a dramatic increase in heroin usage among pharmaceutical opioid users (National Institute on Drug Abuse). The epidemic entered its third wave when novel synthetic opioids (e.g., fentanyl, U-47,700, carfentanil) emerged on the drug market. Several recent studies and reports point to the role of cryptomarkets in the distribution of emerging Novel Psychoactive Substances (NPS) (Aldridge & Décary-Hétu 2016; National Academies of Sciences, Engineering, and Medicine et al. 2017). The importance of cryptomarkets has been further heightened by the mental health and anxiety spillover from the ongoing COVID-19 pandemic: recent results from the Global Drug Survey suggest that the percentage of participants who purchase drugs through cryptomarkets has tripled since 2014, reaching 15 percent of the 2020 respondents (GDS).

In this study, we assess social media data from active opioid users to understand the behaviors associated with opioid usage and to identify the types of feelings expressed. We employ deep learning models to perform sentiment and emotion analysis of social media data using the drug entities derived from cryptomarkets. We implemented LSTM, CNN, and BERT-based models for sentiment and emotion classification of social media data. We also performed topic analysis using TF-IDF to extract frequently discussed opioid-related topics in social media.

Concerning dark web data, three cryptomarkets (Dream Market, Tochka, and WallStreet Market) were periodically crawled between March 2018 and January 2019.
Over 70,000 opioid-related listings were collected using the dedicated crawler (Kumar et al. 2020). The raw HTML files collected were parsed and processed using Named Entity Recognition (NER) to extract substance names, product weight, product price, shipment information, availability, and administration route, as shown in Table 1.

We collected 290,458 opioid-related posts from six subreddits using custom-built crawlers. These posts were further processed to extract the data used for social media sentiment analysis. The subreddit corpus is spread over different drug categories such as Heroin (136,745) (865). To complete the social media emotion analysis, we also collected 21,563 posts from Twitter using the Twitter API. We applied TF-IDF over unigrams, bigrams, and trigrams to identify topics in each subreddit, as shown in Table 2.

We used a pre-trained NER deep learning (NER DL) approach (Chiu & Nichols 2016) on cryptomarket data to identify drug entities. The approach uses a hybrid bidirectional LSTM and CNN architecture, which eliminates the need for most feature engineering. The entities are then matched to a superclass using the Drug Abuse Ontology (DAO) (Cameron et al. 2013), which acts as a domain-specific resource with all superclasses related to the entities. We identified 90 drug entities, which we then broadly classified into eight categories by mapping each entity to a super drug class in DAO. The eight broad categories considered are Heroin, Synthetic Heroin, Pharmaceutical Fentanyl, Non-Pharmaceutical Fentanyl, Fentanyl, Oxycodone, Kratom, and Opium.

For sentiment analysis, we classified subreddit posts into Positive, Negative, and Neutral categories. We used TextBlob to generate a sentiment label for each subreddit post. The TextBlob-generated labels served as the training set for state-of-the-art deep learning (SOTA DL) models: CNN, LSTM, and the BERT language model. The highest accuracy achieved is 93.6% with the LSTM model.
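The TF-IDF topic step described above can be illustrated with a minimal sketch. It assumes that each subreddit is treated as one document and that a smoothed IDF is used; the paper does not state either choice, and the function names (`tokenize`, `ngrams`, `top_terms_per_group`) and the sample posts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n_min=1, n_max=3):
    """Return contiguous n-grams (space-joined) for n_min..n_max."""
    out = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            out.append(" ".join(tokens[i:i + n]))
    return out


def top_terms_per_group(groups, k=5):
    """Rank n-grams per group by TF-IDF.

    groups: dict mapping a group name (e.g. a subreddit) to a list of posts.
    Assumption: each group is one document; n-grams never cross post borders.
    """
    counts = {
        name: Counter(g for post in posts for g in ngrams(tokenize(post)))
        for name, posts in groups.items()
    }
    n_docs = len(counts)
    doc_freq = Counter()
    for c in counts.values():
        doc_freq.update(c.keys())

    result = {}
    for name, c in counts.items():
        total = sum(c.values()) or 1
        scores = {
            term: (freq / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, freq in c.items()
        }
        # Sort by descending score, then alphabetically to break ties.
        ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
        result[name] = [term for term, _ in ranked[:k]]
    return result


if __name__ == "__main__":
    sample = {
        "heroin": ["dope is heroin", "dope sick today"],
        "kratom": ["kratom tea helps", "kratom tea today"],
    }
    for name, terms in top_terms_per_group(sample, k=3).items():
        print(name, terms)
```

Terms that occur in every group (here "today") receive the lowest IDF weight, so group-specific slang and brand names rise to the top of each list.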
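The weak-labeling step, in which a lexicon-based polarity score is turned into Positive, Negative, or Neutral training labels, might look roughly like the sketch below. The threshold, the toy lexicon, and all function names are assumptions made for illustration; the paper does not report how TextBlob scores were mapped to classes. With TextBlob installed, `lambda t: TextBlob(t).sentiment.polarity` could be passed as the scorer in place of `toy_polarity`.

```python
from collections import Counter

# Hypothetical toy lexicon standing in for a real polarity scorer.
TOY_LEXICON = {"love": 1.0, "helps": 0.5, "hate": -1.0, "sick": -0.5}


def toy_polarity(text):
    """Average lexicon score of matched words; 0.0 when nothing matches."""
    hits = [TOY_LEXICON[w] for w in text.lower().split() if w in TOY_LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def polarity_to_label(polarity, threshold=0.0):
    """Map a polarity score in [-1, 1] to a sentiment class.

    The threshold is an assumed parameter, not a value from the paper.
    """
    if polarity > threshold:
        return "Positive"
    if polarity < -threshold:
        return "Negative"
    return "Neutral"


def weak_label(posts, scorer, threshold=0.0):
    """Attach a weak sentiment label to every post using the given scorer."""
    return [(post, polarity_to_label(scorer(post), threshold)) for post in posts]


def label_distribution(labeled):
    """Return the share of each sentiment class among labeled posts."""
    counts = Counter(label for _, label in labeled)
    total = sum(counts.values()) or 1
    return {
        label: counts[label] / total
        for label in ("Positive", "Negative", "Neutral")
    }


if __name__ == "__main__":
    posts = ["kratom helps me", "dope sick again", "picked up today"]
    labeled = weak_label(posts, toy_polarity)
    print(labeled)
    print(label_distribution(labeled))
```

The resulting (post, label) pairs play the role of the training set for the downstream classifiers, and the label shares correspond to the kind of per-category statistics a sentiment table would report.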
Table 3 reports the sentiment label statistics for subreddit posts, obtained by sampling 800 random data points from each drug category.

We did not use subreddit data for emotion analysis because subreddit posts do not carry self-tagged emotions. Instead, we crawled Twitter, where emotions are present as hashtags. We limited our crawl to the seven emotions used by Wang et al. (Wang et al. 2012). We extracted labeled training data by crawling tweets with each emotion's hashtag: Joy, Sadness, Anger, Love, Fear, Thankfulness, and Surprise. We then trained deep learning models (LSTM, CNN, and fine-tuned BERT-based models) and generated emotion labels for drug-related Twitter data. The highest accuracy achieved is 91.2% for the LSTM model.

Kratom, Heroin, Fentanyl, Morphine, Cocaine, Methadone, Suboxone, and Oxycodone are the drugs commonly discussed across the six subreddits (Table 2). For example, in the Research Chemicals (RC) subreddit, it is interesting to find that many posts discuss Pyrovalerone, a psychoactive drug with stimulant effects. Another term found is 'Quaalude,' a brand name for 'Methaqualone,' a sedative and hypnotic medication. The RC subreddit mostly discusses psychoactive and psychedelic drugs, while DrugNerds discusses alkaloids (Kaserer et al. 2020). Interestingly, DrugNerds also discusses Naloxone, which can treat opioid overdose. 'Dope,' a slang term for heroin, is identified in the Heroin subreddit. Several brand names of medications for anxiety, pain, seizures, and insomnia, as well as sedatives, are discussed in the Suboxone subreddit. Gabapentin is a seizure and pain medication discussed in most of the subreddits. The Opiates Recovery subreddit focuses more on withdrawal symptoms and mental health disorders, with terms such as 'cold turkey.'
'Cold turkey,' in the context of substance misuse, means quitting a substance abruptly, which carries significant risks if the drug being discontinued is a benzodiazepine or opiate (Just et al. 2016; Landry et al. 1992). The results show that we can derive slang terms, brand names, novel drugs, mental health symptoms, and medications from social media.

From the results in Table 3, the highest positive sentiment is found for Pharmaceutical Fentanyl, the highest negative sentiment for Fentanyl, and the highest neutral sentiment for Kratom. 'Love' is the top detected emotion for Kratom, as people use it for self-medication. The emotions in Twitter data for Fentanyl, Heroin, and Oxycodone are visualized in Figure 1 across seven emotions: Joy, Sadness, Anger, Love, Fear, Thankfulness, and Surprise. The top three emotions for each drug are presented in Table 3. The results of the three deep learning approaches for sentiment and emotion analysis (F-measure, precision, and recall) are reported in Table 4.

Crawling cryptomarkets poses a significant challenge to applying data science and machine learning to the study of the opioid epidemic because the crawling process is restricted (Kumar et al. 2020). To identify the best strategies to reduce opioid misuse, we need a better understanding of how cryptomarket drug sales impact consumption and how they are reflected in social media discussions (Kamdar et al. 2019). Since our social media data is based on eight broad drug categories, we hope to further refine our categories by consulting a domain expert. We have also identified directions for future research. We plan to expand this work to extract mental health symptoms from drug-related social media data in order to connect drugs with mental health problems, for example, the association between cannabis and depression (Roy et al. 2021; Yadav et al. 2021).
We also plan to build an Opioid Drug Social Media Knowledge Graph with all these data points (drug, sentiment, emotion, mental health symptom, location) and compare it against the work on 'Knowledge Graph-based Approach For Exploring The U.S. Opioid Epidemic' (Kamdar et al. 2019). Potential areas of application include identifying risk factors for addiction and mental health from subreddit data (Gaur et al. 2018), identifying drug trends based on location, and predicting opioid overdoses. We would also like to include DEA drug seizure data in our preliminary data collection process to be aware of related social media discussions.

Hidden wholesale: The drug diffusing capacity of online drug cryptomarkets
PREDOSE: a semantic web platform for drug abuse epidemiology using social media
Named entity recognition with bidirectional LSTM-CNNs
"Let me tell you about your mental health!": Contextualized classification of reddit posts to DSM-5 for web-based intervention
Dependence on prescription opioids: prevention, diagnosis and treatment
A knowledge graph-based approach for exploring the U.S. opioid epidemic
Identification and characterization of plant-derived alkaloids, corydine and corydaline, as novel mu opioid receptor agonists
Unsupervised multi-view learning for sybil account detection
Listed for sale: Analyzing data on fentanyl, fentanyl analogs and other novel synthetic opioids on one cryptomarket
Benzodiazepine dependence and withdrawal: identification and medical management
DAO: An ontology for substance use epidemiology on social media and dark web
Board on Health Sciences Policy, Committee on Pain Management and Regulatory Strategies to Address Prescription Opioid Abuse
"Is depression related to cannabis?": A knowledge-infused model for entity and relation extraction with limited supervision
Harnessing Twitter "big data" for automatic emotion identification
"When they say weed causes depression, but it's your fav antidepressant": Knowledge-aware attention framework for relationship extraction

We acknowledge partial support from the National Institute on Drug Abuse (NIDA) Grant Number: R21DA04451: "eDarkTrends: Monitoring Cryptomarkets to Identify Emerging Trends of Illicit Synthetic Opioids Use" and NSF Award Number: 1761969: "Spokes: MEDIUM: MIDWEST: Collaborative: Community-Driven Data Engineering for Substance Abuse Prevention in the Rural Midwest". All findings and opinions are those of the authors and not of the sponsors.