key: cord-0879881-a68rtakm
authors: Brezulianu, Adrian; Burlacu, Alexandru; Popa, Iolanda Valentina; Arif, Muhammad; Geman, Oana
title: “Not by Our Feeling, But by Other's Seeing”: Sentiment Analysis Technique in Cardiology—An Exploratory Review
date: 2022-04-11
journal: Front Public Health
DOI: 10.3389/fpubh.2022.880207
sha: 906584ea744699ba5703724f9b26458410bf070a
doc_id: 879881
cord_uid: a68rtakm

Sentiment Analysis (SA) is a novel branch of Natural Language Processing (NLP) that measures emotions or attitudes behind a written text. The first applications of SA in healthcare were the detection of disease-related emotional polarities in social media. It is now possible to extract more complex attitudes (rank attitudes from 1 to 5, assign appraisal values, apply multiple text classifiers) or feelings through NLP techniques, with clear benefits in cardiology, as emotions have been shown to be genuine risk factors for the development of cardiovascular diseases (CVD). Our narrative review aimed to summarize the current directions of SA in cardiology and raise the awareness of cardiologists about the potential of this novel domain. This paper introduces the readers to basic concepts surrounding medical SA and the need for SA in cardiovascular healthcare. Our synthesis of the current literature supports SA's clinical potential in CVD. However, many other clinical utilities, such as the assessment of emotional consequences of illness, the patient-physician relationship, and physician intuitions in CVD, are not yet explored. These issues constitute future research directions, along with proposing detailed regulations, popularizing health social media among elders, developing insightful definitions of emotional polarity, and investing research into the development of powerful SA algorithms.

Sentiment Analysis (SA) or "opinion mining" is a novel branch of Natural Language Processing (NLP) that measures emotions or attitudes behind a written text. At the most basic level, SA tools classify pieces of text as having positive, negative, or neutral emotions, although current technologies support much more complex analysis of emotions in the written text (1) .

Several artificial intelligence (AI)/machine learning (ML) technologies and other types of computational techniques have been proposed and have proven beneficial in improving diagnostic accuracy and treatment efficacy (2) . The existence of intuition, and the documentation of its importance in patient management, argues for enriching medical AI/ML and other computational methods with the ability to detect and assess emotions, in order to attain higher performance in solving health problems. Using SA to examine doctors' written notes on intensive-care-unit patients, one study showed that doctors' intuitions ("gut feelings") were an essential factor in determining the disease management for each patient (3) .

To further explore the effectiveness of SA in medical context, a review investigating the SA methods used for examining emotions in healthcare tweets has been published (4) . However, no review has focused, so far, on evaluating the utility of SA in cardiology. Cardiovascular diseases (CVD) arouse a particular interest as they are the deadliest diseases in the world (5) . Recent studies suggest that SA could be extremely useful in cardiology, especially in the context of extensive use of telemedicine due to the COVID-19 pandemic (6) .

The increasing use of social platforms may be the foundation for developing SA-based models applied to various fields in cardiology. A study on drug safety showed that adding SA features improves the performance of state-of-the-art methods to identify adverse drug reactions (ADR). These models used a corpus of posts from Twitter and other online forums. SA features significantly increased the F-measure of adverse reaction detection (for 81 drugs, including cardiovascular medication) from 72.14 to 73.22% in the Twitter corpus of posts. The improvement of ADR detection by SA became possible due to the rapidly growing popularity of social media and health forums (7, 8) .

Our narrative review aims to: (1) summarize the current directions of SA in cardiology and the results achieved so far in a systematic manner, (2) raise the awareness of cardiologists about the potential of this novel and promising domain that will soon become a practical reality, and (3) open new perspectives regarding the dialogue between AI specialists and cardiologists. Given the small number of studies so far, a systematic methodology is not suitable; the purpose of this review is therefore purely narrative.

We searched PubMed/Medline and Google Scholar for studies in English addressing the topic of SA in cardiology, from inception to February 2022. The following search string was used: ("Sentiment analysis" OR "Emotions recognition" OR "Sentiment recognition") AND ("Heart failure" OR "Cardiac insufficiency" OR "Coronary Artery Disease" OR "CAD" OR "Coronary syndrome" OR "Coronary" OR "Stable angina" OR "Angina pectoris" OR "Ischemic heart disease" OR "IHD" OR "Ischemic" OR "Ischemia" OR "Myocardial infarction" OR "Infarction" OR "Atrial fibrillation" OR "AF" OR "Stroke" OR "Arrhythmia" OR "Heart rate" OR "Pulse" OR "Sudden death" OR "Sudden cardiac death" OR "Cardiovascular prevention").

We reviewed an initial number of 550 studies, and after excluding the duplicates, 498 studies remained. After excluding the studies irrelevant to our objectives, we selected 11 papers that address SA methods focusing on cardiovascular diseases. Papers were included regardless of whether they constituted original research, reviews, opinions, or reports; any type of study was considered eligible for inclusion. Three researchers reached agreement on the studies selected. All included studies are illustrated in Table 1 .

The main directions of research regarding SA in cardiology identified from the retained studies are: the identification of emotional risk factors for CVD, the detection of positive/negative attitudes of CV patients toward their disease and its clinical implications, the detection of cardiac arrhythmia, the triage of CV patients, spotting feedback from patients and newspapers regarding drugs, therapeutic procedures, or medical devices and the integration of SA modules in new technological concepts for monitoring CV patients. Each of these topics is discussed below in an attempt to synthesize the current literature on SA in cardiovascular diseases, right after introducing the reader to basic concepts regarding medical SA and justifying how SA can contribute to increasing quality in cardiovascular healthcare.

Medical SA is the field of study that analyzes patients' and doctors' opinions, sentiments, attitudes, and emotions toward various clinical contexts (treatment side-effects, medical diagnosis concerns, emotional consequences of illness, emotional context during the onset or evolution of a specific disease, patient-physician relationship, physician attitudes in clinical notes) expressed in written text (20) . While traditional AI deals with facts and logical, objective data analysis, sentiment research deals with opinions, that is, with correctly identifying subjective emotional communication.

Several medical entities associable with sentiments have been defined: health status (improved/worsened, good/bad), medical condition (improved/worsened), diagnosis (certain/uncertain/preliminary), medical procedure (positive/negative outcome), medication (helpful/useless/adverse events) (20) .

Sentiment classification comprises two broad categories: lexicon-based and ML/NLP-based classifications (21) . Classifiers built upon sentiment lexicons (i.e., collections of polar or opinion words, each associated with its sentiment polarity, positive or negative) are lexicon-based (or rule-based) classifiers. Sentiment lexicons are produced manually or semiautomatically (22) and are regularly stored as dictionaries. Conversely, ML/NLP-based classifiers are built using training datasets or annotated data collections.
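The lexicon-based approach described above can be illustrated with a minimal sketch. The lexicon below is a hypothetical toy dictionary invented for illustration (it is not taken from any of the cited works), and the scoring rule, a plain sum of word polarities, is only one of many possible rules.

```python
import re

# Hypothetical toy lexicon: word -> polarity (+1 positive, -1 negative).
# Real sentiment lexicons contain thousands of entries.
TOY_LEXICON = {
    "improved": 1, "helpful": 1, "good": 1, "better": 1,
    "worsened": -1, "useless": -1, "bad": -1, "pain": -1,
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lexicon_polarity(text, lexicon=TOY_LEXICON):
    """Sum the lexicon scores of all tokens and map the total to a label."""
    score = sum(lexicon.get(token, 0) for token in tokenize(text))
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


print(lexicon_polarity("The new medication was helpful and my condition improved"))
print(lexicon_polarity("The drug was useless and the pain worsened"))
print(lexicon_polarity("I have an appointment on Monday"))
```

An ML/NLP-based classifier would replace the hand-written dictionary with parameters learned from annotated examples.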

Some of the most popular types of SA are: fine-grained SA, emotion detection, aspect-based SA and multilingual SA (23) . Fine-grained SA considers an expanded number of polarity categories (e.g., very positive / positive / neutral / negative / very negative). Emotion detection uses lexicons or ML/NLP systems to detect sentiments. Aspect-based SA highlights which particular aspects or features people are mentioning in a positive, neutral, or negative way (23) . Multilingual SA techniques have been developed in order to analyze data in different languages (24) . The introductory concepts in medical SA are summarized in

CVD and emotional disorders seem to influence each other in a bidirectional manner (25) . The impact of coronary artery disease (CAD) in triggering emotional disorders is well documented. Moreover, emotional disturbances have the power to induce or worsen existing CAD (25) . Emotional disturbances were reported as potent cardiovascular (CV) risk factors (25) .

There is a need to integrate the assessment of emotional status into cardiovascular risk prediction, a desideratum long considered unattainable due to the high degree of subjectivity of human sentiments and to insufficient technical developments (25) . Considerable progress in technology and AI provides the opportunity to start developing strategies for building solutions capable of systematically assessing emotions. Since patients express their thoughts and feelings more openly online than ever before (26) , SA is becoming an essential tool to monitor and understand these sentiments and provide predictive models (27) .

The increasing popularity of social platforms and discussion forums, which enable the collection of vast amounts of written text and opinions, creates favorable ground for testing novel SA methods. A descriptive study identified 4.9 million Tweets about CVD having common topics such as risk factors, awareness, and management of CVD (28) . Using the vast amount of Twitter data on CVD, a study predicted county-level heart disease mortality based on the assessment of psychological language in Twitter posts (9) . Given the reported superiority of SA over standard predictive models (29) , SA integration could lead to higher performance models and more complex CV predictions.

An obstacle to further improvement of AI/ML models is that 90% of the world's data is unstructured (30) . While exam results are reported in a structured way, observations, intuitions, opinions, and experiences are communicated in an unstructured manner throughout clinical records, notes or online. Unstructured data is extremely time-consuming to analyze and is unusable by standard AI/ML solutions. This represents a missed opportunity for understanding patients' experience in an increasingly "connected" world. SA, with its ability to systematically review unstructured data, is therefore well placed to overcome old limitations and produce higher-quality results.

In addition to these major benefits, several other advantages have been highlighted in the literature. Firstly, data mining and SA may be used to explore the issues surrounding controversial research subjects, guideline changes or new recommendations in CV medicine (31) .

Secondly, according to the Information Strategy for the National Health Service (NHS) in England, SA may be valuable for patients in facilitating choice of hospitals (32) by predicting, from free-text, "a reasonably accurate assessment of patients' opinion about different performance aspects of a hospital" (27) .

Thirdly, online public testimonies carry classical indicators (such as self-reported quality of life indicators during and post treatment) and other relevant indicators (such as attitudes toward political legislation, loss of healthcare coverage, raising support, spreading awareness) that are difficult to capture by conventional means of self-reporting. Social listening can provide valuable feedback from patients and can help healthcare professionals and regulators to personalize and improve treatment regimens and improve public health surveillance strategies (33) .

Poor emotion regulation was associated with CV risk in several studies (34, 35) . Moreover, educational programs aimed at improving emotion regulation strategies among patients in cardiac rehabilitation proved to be feasible (36) .

Whereas emotion regulation is a teachable skill that may play a role in preventing CVD, emotions must first be systematically recognized and documented before deciding whether the intervention of educational programs is appropriate. Several models of emotional recognition, capable of ensuring a systematic evaluation of sentiments, have been proposed based on eye-tracking (37-39), heart sound signals (40) , cardiovascular response in daily life using the k-Nearest-Neighbor classifier (41), heart rate data collected from wearable devices (42), or even computational evaluation of facial expressions (43) .

The Twitter platform was used on a large scale to assess psychological language as a risk factor for atherosclerotic CAD by applying SA (9) . Hostility and chronic stress are known risk factors for CVD (44) . All language patterns highlighting negative psychological traits (anger, negative-relationship, negative-emotion, and disengagement) emerged as significant risk factors [partial rs = 0.06, 95% confidence interval, or CI = (0.00, 0.11), to 0.12, 95% CI = (0.07, 0.17)] for CAD mortality, while the use of engagement words [r = −0.09, 95% CI = (−0.14, −0.04)] and positive-emotion words [partial r = −0.05, 95% CI = (−0.00, −0.11)] appeared significantly protective. Surprisingly, a regression model "based only on Twitter language" predicts CAD mortality significantly better than a model with 10 common demographic, socioeconomic, and health risk factors (e.g., smoking, diabetes, hypertension, and obesity) (9) .

An SA/ML methodology has been proposed to identify the relevant myocardial infarction (MI) emotional risk factors using Twitter data (10) . Although the idea may seem far-fetched, the authors explore the possibility of screening tweets for MI risk factors as a tool to be used in preventive medicine. However, no results are yet provided.

Acute MI has been repeatedly studied as an important consequence of stressful social disasters and social stress [e.g., the influence on MI of the death of a loved one (45) , earthquakes (46) , or war (47) ]. In this context, a Korean team used the SA/ML algorithm of Semantria Lexalytics and managed to demonstrate the effects of the Sewol Ferry Disaster on social stress by examining data from the top social media platforms used worldwide (YouTube, Twitter, and Facebook) (45) .

Another way to harness the capabilities of SA was to analyze tweets in an attempt to find the relation between the sentiment of tweets and CVD in the counties along Interstate 20 in Texas (11) . The sentiment of tweets from each region was determined by five classifiers (Naïve Bayes, Multinomial Naïve Bayes, Bernoulli Naïve Bayes, Support Vector, and Linear Support Vector) and was compared with the regional CVD rates. The Positive-to-Population rate is related to the CVD data map, while Negative-to-Population rates have inverse relations to the CVD data map. This descriptive study highlights the potential of SA in epidemiological research, generating causal hypotheses and finding trends of diseases.
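To make the notion of an ML-based tweet classifier more concrete, the following is a minimal from-scratch sketch of one of the listed families, Multinomial Naïve Bayes with add-one (Laplace) smoothing. It is not the implementation used in (11), whose tooling and features are not described here; the training tweets are invented purely for illustration.

```python
import math
from collections import Counter, defaultdict


def train_multinomial_nb(labeled_texts):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_texts:
        tokens = text.lower().split()
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def predict(text, model):
    """Return the class with the highest log posterior (add-one smoothing)."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    vocab_size = len(vocabulary)
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocabulary:
                continue  # ignore words never seen during training
            likelihood = (word_counts[label][token] + 1) / (total_words + vocab_size)
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training examples, for illustration only.
TRAINING = [
    ("feeling great after my morning walk", "positive"),
    ("happy with my healthy dinner", "positive"),
    ("so stressed and angry today", "negative"),
    ("tired of this awful traffic", "negative"),
]

model = train_multinomial_nb(TRAINING)
print(predict("happy and feeling great", model))
print(predict("angry and stressed", model))
```

With region-tagged tweets, the share of tweets predicted positive or negative per region could then be set against regional CVD rates, which is the kind of comparison the study describes.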

Patients' attitude toward their disease may be an important driver of treatment adherence and a significant buffer of the impact of illness (48) . However, the prevalence of positive attitudes toward the disease may often be low (49) . SA can contribute systematized analysis and information on subjective attitudes, which traditional healthcare management cannot provide, and thereby help fine-tune preventive strategies.

A method using SA was built to understand the public health impact of CVD and patients' attitudes toward the disease in order to develop personalized therapeutic strategies depending on patients' adherence (12) . Less than half of the tweets on CVD (45%) were found to be positive, which raises awareness of the importance of positive behavior change.

Behavioral change interventions could drastically reduce overall mortality from CVD (13) . Behavior Change Techniques Taxonomy version 1 (BCTTv1) was applied to a sample of fitness and nutrition mobile apps, and SA was used to identify which apps that support behavior change elicit a positive response from the users (13) . StepsApp Pedometer had the highest percentage of positive tags while VeryFitPro had the lowest.

Various types of SA tools have been used to examine the impact and improvement in diseases such as CVD, as SA contributes to designing strategies to improve patients' understanding and behavior (50) .

Computers can be trained to learn as humans do. Behadada et al. (14) showed that computers, like students, can learn from experts, textual data (scientific articles) and experience (experimental data). The authors introduced a novel method to define fuzzy partition rules semiautomatically in order to provide a powerful and accurate insight into the detection of cardiac arrhythmia. Fuzzy logic allows merging three completely different sources of knowledge by learning to define and integrate rule bases. The expert is invited to express his/her knowledge through linguistic (expert) rules. The rules induced from data are called induced rules. Finally, the automated extraction of fuzzy partition rules from PubMed articles identifies relevant arrhythmia insights and intuitions (mood described by text fragments) through grammar-based text extraction and SA. All extracted rules are merged into a unique knowledge base, resulting in the definition of a common universe for the different knowledge domains. The evaluation carried out showed an accuracy rate of 93% and a high level of interpretability of 0.646 for the detection of cardiac arrhythmia.

Compared to the traditional ML solutions, besides an excellent accuracy, the approach proposed by Behadada et al. (14) comes with the major advantage of a high interpretability, as the computer is able to highlight all knowledge rules that led to a certain result.

In a cardiovascular secondary prevention setting, the feasibility of using an ML program to triage and classify incoming SMS text messaging replies as requiring health professional review or not, was assessed and reported (15) . The SMS messaging programs are a cost-efficient way for patients monitored in secondary prevention centers to regularly report their health status. However, the additional staff required to monitor and moderate the patients' SMS text messaging replies may negatively impact the cost-effectiveness of the SMS-based system. In order to reduce these costs, Lowres et al. (15) proposed five ML models (Naïve Bayes, OneVsRest, Random Forest Decision Trees, Gradient Boosted Trees, and Multilayer Perceptron) and an ensemble model for the automatic triaging of SMS replies. The Multilayer Perceptron model achieved the highest accuracy (AUC 0.86; 4.85% false negatives; and 4.63% false positives). After future validations against larger datasets, the authors are optimistic that the ML solution will significantly reduce staff workload.

Reviews of medications and medical devices are important for improving their quality, safety, adherence and use (51) . Side effects may influence patients' adherence; pharmacovigilance is thus a key strategy to improve adherence (51) .

The "SentiScrap" system applies SA through a hybrid approach (supervised machine learning and use of semantics through a tagged corpus) to identify opinions, comments, and polarity of the drugs prescribed for chronic-degenerative diseases (including hypertension medication), available in blogs and specialized websites in the Spanish language (16) . Such a solution is of great help to health specialists as it reduces the time and effort to systematically search for patients' opinions, comments and experiences regarding the use of drugs, facilitating clinical decision making.

Medical device reviews were also considered for sentiment assessment. A lexicon-based SA was performed to pool together patients' experiences (fears, opinions, thoughts) from MyLVAD.com regarding their implanted left ventricular assist device (LVAD) (17) . The results of the analysis indicate a dominant positive sentiment {a net sentiment ratio [(number of positive words − number of negative words)/(number of total words)] of 2.1%} and a common use of the word "infection" (208 mentions) compared to other words denoting complications such as "stroke" (29 mentions), "bleeding" (30 mentions), and "thrombosis" or "clot" (32 mentions). This type of analysis might help to elucidate hidden, subjective segments of patients' health which factor into the objective measures of health.
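The net sentiment ratio defined above is straightforward to compute once a lexicon is fixed. The sketch below follows that formula directly; the positive and negative word lists and the sample post are hypothetical and are not the lexicon or data used in (17).

```python
import re

# Hypothetical word lists, for illustration only.
POSITIVE_WORDS = {"grateful", "better", "hope", "strong"}
NEGATIVE_WORDS = {"infection", "fear", "bleeding"}


def net_sentiment_ratio(text):
    """(positive words - negative words) / total words, or 0.0 for empty text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        return 0.0
    positives = sum(token in POSITIVE_WORDS for token in tokens)
    negatives = sum(token in NEGATIVE_WORDS for token in tokens)
    return (positives - negatives) / len(tokens)


post = "I feel better and grateful every day and I hope the infection stays away"
print(f"{net_sentiment_ratio(post):.1%}")
```

Because the denominator counts every word, including the many neutral ones, even clearly positive text yields a small ratio, which helps put a value such as 2.1% in context.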

Biotricity Inc. is a medical diagnostic and consumer healthcare tech company that is a leading producer of remote medical monitoring devices. Biotricity's main targets are chronic CVDs. According to the Sentiment Analysis of the analytics firm InfoTrie Financial Solutions, Biotricity has been trending positively in the media, achieving a news buzz score of 10 out of 10, with a market sentiment score of 4.0 (18) .

A smart conceptual framework for monitoring patients with CVD or diabetes was proposed (19) . The concept represents a hybrid healthcare system designed to merge distinct emerging computing techniques such as data mining, ML, Internet of Things (IoT), bio-sensors, SA, chatbots, contextual entity search, and granular computing. Bio-sensors and IoT are used for the continuous monitoring of the patient's health parameters and emergency notifications. SA is intended to mine social media and other online resources in order to keep the patient and the healthcare professional up to date with the latest information on CVD and diabetes. Data mining and ML are used for patient classification, diagnosis, and health predictions. This hybrid AI and smart framework may provide an effective and economical solution to CV and diabetes patients by minimizing various implicit and explicit medical expenses, optimizing the use of vital medical resources and manpower, and further enhancing patient care.

SA is indeed a promising field that can add valuable insights to the traditional and objective measures of health and contribute to clinical decision making. However, SA is among the hardest tasks in NLP, as analyzing sentiments accurately is difficult even for humans.

Context and meaning play a crucial role in interpreting emotions. For instance, this Twitter post: "Safe to say she may have been shocked to hear that the research does not suggest that high colesterol is a risk factor for heart disease" was automatically classed as negative by an automated SA algorithm, due to the potentially negative concepts such as "shocked", "high cholesterol", "risk factor" and "heart disease" (52) . However, the actual meaning is positive as the author is referring to the positive fact that the cited research does not incriminate negative associations. A major and mandatory challenge to SA techniques is to be able to integrate context (such as cultural, medical, political, legal, economic) and meaning. Moreover, in some cases, it is necessary to know much more than emotional polarity. For real life impact, SA algorithms should be equipped with the ability to categorize and organize subjective information, detect irony and sarcasm, comparisons, and emojis.
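The misclassification described above can be reproduced with a small sketch. The lexicon is a hypothetical toy dictionary, and the negation rule (flip the polarity of every word after a negation cue until the next punctuation mark) is a crude, commonly used heuristic shown only for illustration; it is not the algorithm evaluated in (52) and does not address cultural or medical context, irony, or sarcasm.

```python
import re

# Hypothetical toy lexicon; real lexicons are far larger.
LEXICON = {"safe": 1, "shocked": -1, "risk": -1, "disease": -1}
NEGATION_CUES = {"not", "no", "never"}
PUNCTUATION = set(".,;:!?")

# Tweet text as quoted in the paragraph above (original spelling kept).
TWEET = ("Safe to say she may have been shocked to hear that the research "
         "does not suggest that high colesterol is a risk factor for heart disease")


def naive_score(text):
    """Sum word polarities while ignoring context entirely."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return sum(LEXICON.get(token, 0) for token in tokens)


def negation_aware_score(text):
    """Flip the polarity of words after a negation cue, up to the next punctuation mark."""
    score = 0
    negated = False
    for token in re.findall(r"[a-z]+|[.,;:!?]", text.lower()):
        if token in NEGATION_CUES:
            negated = True
        elif token in PUNCTUATION:
            negated = False
        else:
            polarity = LEXICON.get(token, 0)
            score += -polarity if negated else polarity
    return score


print(naive_score(TWEET))           # context-free word counting
print(negation_aware_score(TWEET))  # with the crude negation rule
```

Under these assumptions the context-free score is negative while the negation-aware score is positive, which mirrors the gap between the automated label and the intended meaning.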

In general, the measure of how well human annotators can agree on the same labels (inter-annotator agreement) is low when it comes to SA (53) . Since machines learn from the data they are fed, SA models might not be as accurate as other types of classifiers. This challenge may be overcome by developing more rigorous definitions of emotional polarity and neutrality.
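Inter-annotator agreement can be quantified in several ways. The sketch below uses Cohen's kappa for two annotators, a widely used chance-corrected statistic; whether (53) relies on this particular measure is not stated here, and the label sequences are invented for illustration.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty item set")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement if both annotators labeled independently
    # according to their own label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Invented annotations of eight posts by two hypothetical annotators.
annotator_1 = ["pos", "pos", "neg", "neu", "neg", "pos", "neu", "neg"]
annotator_2 = ["pos", "neu", "neg", "neu", "pos", "pos", "neg", "neg"]

print(round(cohen_kappa(annotator_1, annotator_2), 3))
```

A kappa near 1 indicates near-perfect agreement and a value near 0 indicates agreement no better than chance, so a low kappa on a training corpus bounds how reliable the learned labels can be.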

Another aspect to consider is that only 10% of individuals between the ages of 50 and 64 use social media sites such as Twitter (54) . This limitation remains worthy of consideration until social media platforms become more popular among older patients.

One further obstacle is represented by the ethical implications of utilizing publicly available online data from social media platforms for research purposes (55) . Current regulations do not yet fully consider this aspect, although it is probably just a matter of time until the ethical implications are rigorously addressed and clarified.

This paper introduced the readers to basic concepts surrounding medical SA and justified how SA can contribute to increasing quality in cardiovascular healthcare, emphasizing the need to invest more research into this new, promising and challenging domain. Our synthesis of the current literature on SA in CVDs proves its clinical potential. It also shows that the domain is only at the beginning. Many other clinical utilities, such as the assessment of emotional consequences of illness, patient-physician relationship, physician intuitions in CVD are not yet explored. These remain important research directions for the future, along with proposing detailed regulations for ethical implications, popularizing health social media and online expression among elders, developing more insightful definitions of emotional polarity and neutrality, and investing research into the discovery of powerful SA algorithms that are able to integrate global context and meaning.

ABu and ABr: conceptualization. ABu and IP: methodology and writing-review and editing. ABr, ABu, IP, MA, and OG: resources and writing-original draft preparation. ABr, OG, and MA: supervision. All authors have read and agreed to the published version of the manuscript.

Systematic reviews in sentiment analysis: a tertiary study

Artificial intelligence in medicine: today and tomorrow

How is the doctor feeling? ICU provider sentiment is associated with diagnostic imaging utilization

Sentiment analysis of health care tweets: review of the methods used

Available online at

Telemedicine outpatient cardiovascular care during the COVID-19 pandemic

Analysis of the effect of sentiment analysis on extracting adverse drug reactions from tweets and forum posts

Pharmacovigilance from social media: mining adverse drug reaction mentions using sequence labeling with word embedding cluster features

Psychological language on twitter predicts county-level heart disease mortality

Sentiment analysis on myocardial infarction using tweets data

A preliminary investigation with twitter to augment cvd exposome research

Semantic analysis of cardiovascular disease sentiment in online social media

Designing positive behavior change experiences: a systematic review and sentiment analysis based on online user reviews of fitness and nutrition mobile applications

Big data-based extraction of fuzzy partition rules for heart arrhythmia detection: a semiautomated approach

Use of a machine learning program to correctly triage incoming text messaging replies from a cardiovascular text-based secondary prevention program: feasibility study

Sentiment analysis approach for drug reviews in

Computational sentiment analysis of an online left ventricular assist device support forum: positivity predominates

Available online at

An advanced conceptual diagnostic healthcare framework for diabetes and cardiovascular disorders

Sentiment analysis in medical settings: new opportunities and challenges

Sentiment lexicons and non-english languages: a survey

Multilingual sentiment analysis: from formal to informal and scarce resource languages

Lakshman Naik R. A Study of Sentiment Analysis: Concepts, Techniques, and Challenges

A machine learning approach to sentiment analysis in multilingual web texts

The impact of emotions on coronary heart disease risk

Sharing feelings online: studying emotional wellbeing via automated text analysis of facebook posts

Use of sentiment analysis for capturing patient experience from free-text comments posted online

Twitter as a potential data source for cardiovascular disease research

Sentiment analysis algorithms and applications: a survey

Is co-infection with influenza virus a protective factor of COVID-19?

The impact and implications of twitter for cardiovascular medicine

The Power of Information: Putting All of Us in Control of the Health and Care Information We Need. London: Department of Health

Applications in Sentiment Analysis and Machine Learning for Identifying Public Health Variables Across Social Media. The University of Vermont and State Agricultural College

Emotion regulation moderates the association between chronic stress and cardiovascular disease risk in humans: a crosssectional study

The relationship between worry and anger rumination with adjustment problems to heart disease: The mediating role of difficulties in emotion regulation

Feasibility of an emotion regulation intervention for patients in cardiac rehabilitation

Emotion recognition using eyetracking: taxonomy, review and current challenges

Eye-Tracking Analysis for Emotion Recognition

Eye movements during emotion recognition in faces

Heart sound signals can be used for emotion recognition

Emotion Recognition Through Cardiovascular Response in Daily Life Using KNN Classifier

Wearable emotion recognition using heart rate data from a smart bracelet

Emotion recognition using facial expressions

Psychological risk factors and cardiovascular disease: is it all in your head

Risk of acute myocardial infarction after the death of a significant person in one's life: the determinants of myocardial infarction onset study

Hanshin-Awaji earthquake as a trigger for acute myocardial infarction

Routine angiographic follow-up versus clinical follow-up after percutaneous coronary intervention in acute myocardial infarction yonsei

A study of drug attitude and medication adherence and its relationship with the impact of illness among the mentally ill

Knowledge and attitude of patients with diabetes mellitus in primary health care. Escola Anna Nery

Impact of sentiment analysis tools to improve patients' life in critical diseases

Medication reviews

Does Automated Sentiment Analysis Work for Studying Healthcare Conversation? Creation Knowledge

Inter-annotator agreement in sentiment analysis: machine learning perspective

The Demographics of Social Media Users

The impact and implications of twitter for cardiovascular medicine

ABr and IP were employed by GreenSoft Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.