key: cord-0991990-0k18e4iz
authors: Burton, Joshua C.; Regala, Samantha; Williams, Deonte; Desai, Aditi; He, Han; Aalami, Oliver; Mariano, Edward R.; Stafford, Randall S.; Mudumbai, Seshadri C.
title: A Comparative Utility Score for Digital Health Tools
date: 2022-05-05
journal: J Med Syst
DOI: 10.1007/s10916-022-01821-3
sha: e0c657689e7eba5b81ba1daa035163d08d3ef734
doc_id: 991990
cord_uid: 0k18e4iz

Digital health tools (DHT) are increasingly poised to change healthcare delivery given the Coronavirus Disease 2019 (COVID-19) pandemic and the drive to telehealth. Establishing the potential utility of a given DHT could aid in identifying how it could be best used and further opportunities for healthcare improvement. We propose a metric, a Utility Factor Score, which quantifies the benefits of a DHT by explicitly defining adherence and linking it directly to satisfaction and health goals met. To provide data for how the comparative utility score can or should work, we illustrate in detail the application of our metrics across four DHTs with two simulated users. The Utility Factor Score can potentially facilitate integration of DHTs into various healthcare settings and should be evaluated within a clinical study.

The ongoing Coronavirus Disease 2019 (COVID-19) pandemic has focused intense attention on the role of telehealth in medicine and healthcare in general [1, 2]. Both primary care as well as specialist physicians and other healthcare professionals have had to adopt virtual visits, remote monitoring, and mobile technology to provide care while promoting social and physical distancing. Within telehealth, the term "digital health tool" (DHT) is broad in scope, referring to wearable accelerometers and activity trackers as well as mobile applications [3]. These DHTs have also been categorized within electronic health/ehealth or mobile health/mhealth [4, 5].
The role of DHTs extends to physical and mental health issues ranging from diabetes to depression and from fall risk to mindfulness [4, 6, 7]. They can also be classified as active, where the user inputs data; passive, where the DHT collects data; or hybrid, a mixture of passive and active [2]. The range of options within current DHTs provides an opportunity to custom fit user needs and comfort levels. The presence of DHTs in daily life and the funding involved in their development therefore suggest that these tools will only continue to grow in number, specificity, and scope. The following categories may be useful for initial classification [8]:

Attempts to answer the questions of which DHTs work and why one should be chosen over another have been scant. Current approaches draw from theoretical models of technology and/or psychology and have primarily focused on predicting adoption or use. The Technology Acceptance Model (TAM) assumes that the decision to use a specific object is based on its perceived ease of use as well as its perceived usefulness [9-11]. Since its inception, TAM has undergone several iterations, most recently the Unified Theory of Acceptance and Use of Technology, which improved predictive ability by including social variables [11]. However, several recent studies modified the original TAM to explore acceptance of DHTs and related this to intention to use rather than actual use [12, 13]. Another defined use minimally (at least once), although the focus of this study was not solely on use [11]. Critiques of TAM include the interchangeable use of terms such as acceptance, adoption, and use, and the fact that direct use (particularly consistent use) is not typically measured [12]. Moreover, because TAM was developed to measure acceptance in environments where technology use is mandatory, it is unclear how this may affect the model's utility for DHTs.

Another predictive model of use is the Theory of Planned Behavior (TPB) [13].
TPB includes attitude toward a behavior (e.g., using a DHT), the subjective norm of a behavior, and one's perceived ability to attain the desired outcomes. For example, in the context of a mindfulness application, one study successfully predicted use through TPB [12, 14]. Furthermore, an application based on TPB and developed to manage non-specific back pain produced greater reductions in back pain than were seen in control groups [15]. Another study showed that application usage was strongly correlated with attitudes and perceived behavioral control, but less so with social influences [16]. TPB can add predictive value regarding use, but it is currently unclear how this approach can be practically deployed.

While iterations of TAM and TPB are informative, the most glaring gap in our understanding of the effectiveness of DHTs is the lack of a metric that combines multiple, key aspects of the DHT. For example, use of a DHT is inextricably linked to the attainment of a health goal. Some reports have suggested that wearable technology use declines or ceases after six months, with similar estimates for smartphone apps [17, 18]. The current literature evaluating wearable devices indicates little benefit of the devices on chronic disease health outcomes [19]. Any attempt to quantify the utility of a DHT must consider how much and what type of use is necessary to achieve a goal.

Another gap is related to satisfaction. Most phones come preprogrammed with health-oriented applications (e.g., Apple Health, Samsung Health). If someone is satisfied with a product, they may be more likely to recommend it to others, which may indicate a higher likelihood of the product becoming socially acceptable (a component of both TPB and TAM).

We propose a formula that includes objective and subjective data to inform the effectiveness of a DHT (Fig. 1).
Combining objective (adherence) and subjective (satisfaction) data could provide the foundation of a metric to apprise consumers and researchers of the relative usefulness of a particular tool. Healthcare practitioners and consumers could then access these scores in selecting the best DHT to meet their unique health needs. We define this metric as a DHT's Utility Factor score, which can then be applied to multiple DHTs. In general, the utility score should be a function of the goal or outcomes achieved, the user's satisfaction with the DHT, and the adherence to the DHT. We suggest the following equation as a candidate approach to calculate this score:

Here, U represents the utility factor score (between 0 and 100) of a particular DHT, that is, its ability to provide a benefit to a patient with respect to the goal in mind (e.g., weight loss). U is a function of the absolute value of a health goal that is met (e.g., amount of weight lost), G; satisfaction with the DHT, as indicated by recommending the product to another person, S; and adherence, A. Here, A can be viewed as a measure that incorporates important aspects of how individuals use a particular DHT. Specifically, adherence is defined as use density, U_d (based on the number of uses in a day, D_i, and the longest number of consecutive days used, D_c), plus use duration, U_m, the number of months in which the device was used at least once a week. A is represented below:

A = U_d + U_m

Finally, we define x, y, and z as variables representing the relative weight each factor contributes to the overall utility factor score. While the weights can be calibrated to different situations and to healthcare practitioner or consumer preferences, we suggest the following general approach to weighting the inputs of goals, adherence, and satisfaction:

The attainment of health goals should be weighted heaviest. If a health goal is not met, then the utility factor score (i.e., the benefit of a DHT) should be zero because the DHT does not help.
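The scoring equation itself is not reproduced in this text, so the following Python sketch is only one possible reading of the description, not the authors' confirmed formula. It assumes a multiplicative form, U = xG · yA · z(S + 1), chosen because the text states that U should be zero when no goal is met and that satisfaction enters as z(S + 1). It further assumes that use density is the product D_i × D_c and that S is binary (1 if the user would recommend the DHT, 0 otherwise). The class and function names are hypothetical, and the raw value is not rescaled to 0-100 here.

```python
from dataclasses import dataclass


@dataclass
class DhtUse:
    """Inputs for one user and one DHT, named after the symbols in the text."""
    goal: float                 # G: absolute value of the health goal met (e.g., pounds lost)
    satisfied: int              # S: ASSUMED binary, 1 = would recommend the DHT, 0 = would not
    uses_per_day: float         # D_i: number of uses in a day
    longest_streak_days: float  # D_c: longest number of consecutive days used
    months_used: float          # U_m: months with at least one use per week


def adherence(u: DhtUse) -> float:
    # ASSUMPTION: use density U_d = D_i * D_c (the text only says U_d depends on both).
    use_density = u.uses_per_day * u.longest_streak_days
    # A = U_d + U_m, as described in the text.
    return use_density + u.months_used


def utility(u: DhtUse, x: float = 1.0, y: float = 1.0, z: float = 1.0) -> float:
    # ASSUMPTION: multiplicative form, so an unmet goal (G = 0) gives U = 0,
    # while dissatisfaction (S = 0) reduces but does not eliminate the score.
    return (x * abs(u.goal)) * (y * adherence(u)) * (z * (u.satisfied + 1))


# Hypothetical inputs, not taken from the paper's tables.
example = DhtUse(goal=2.0, satisfied=1, uses_per_day=1.0,
                 longest_streak_days=10.0, months_used=3.0)
print(utility(example))  # 52.0 under the assumptions above
```

With all weights set to 1, the same user reporting dissatisfaction (S = 0) would score 26.0 instead of 52.0, and any user with G = 0 would score 0, which matches the qualitative behavior the text asks of the score.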
Adherence should be weighted greater than satisfaction because use puts a burden on the consumer and often requires input, a major issue given that adherence rates can be low. Here, it should be noted that what maintains adherence is beyond the scope of this paper. We are more focused on identifying a metric to inform consumers of the potential for a DHT to provide some benefit, i.e., its utility. Finally, satisfaction is given the distinction of z(S + 1) because if using a DHT allows the user to meet a health goal, dissatisfaction should not completely eliminate its utility.

To help compare similar DHTs, which might occur when accessing a DHT in an App Store for example, we suggest the following approach:

1. First, normalize U scores to 0-100 using min-max normalization:

New value = (value − min) / (max − min) × 100

After U values are collected, both a minimum and a maximum are identified. Each value is then normalized with the above formula to a range from 0 to 100.

2. Averages for a given category could then be provided by the App Store.

Overall, these metrics can help healthcare practitioners and consumers select a DHT to achieve individual goals across a spectrum of specific tools and identify which specific individuals will define the Goals met and Adherence metrics. We also suggest that the goal or "outcome" should primarily drive the weighting approach, with healthcare practitioner or consumer preferences secondary. For example, a DHT that aims to help reduce anxiety (like Headspace) may have more frequent adherence needs than a DHT that aims to help reduce weight, for which use might be daily. This would therefore require a change in the adherence weighting.

Because we consider the weight of each variable's contribution, different scores could be considered a "good" score for DHTs themselves and the specific goals attained. For example, one score could be considered "good" for a phone application aimed at reducing anxiety whereas a "good" score for a wearable designed to aid in weight loss might be different. These scores could be obtained from users during regular visits to their healthcare practitioners and then averaged across different tools and goals. The flexibility of these metrics allows for clarity over the large number of DHTs available for consumers.

Fig. 1 Utility Factor Score. Adherence is defined as the amount of use per digital health tool (DHT). Health goals defined as type of health goal met. Satisfaction defined as whether a user would recommend a specific DHT

To provide data for how the comparative utility score can or should work, we illustrate in detail the application of our metrics across four DHTs with two simulated users. Our rationale in selecting these DHTs was to identify a group that were commonly available; widely used; and potentially initiated in either consumer or clinical settings for a variety of purposes:

For the purpose of demonstration and based on the clinical experience of the authors, we:

(A) developed two simulated users and their histories. To assist illustration and comparison, the goals of both persons were set up to be identical. The goals were to lose weight, sleep better, lower blood sugar, and manage anxiety.
• User 1: 68-year-old white male. Presenting concerns: Type II diabetes, high blood pressure, sleep issues.
• User 2: 38-year-old Hispanic female. Presenting concerns: Type I diabetes, anxiety, low physical activity.

(B) identified a range of inputs for goals (G), satisfaction (S), and adherence (A), where adherence comprises use density, U_d (number of uses in a day, D_i, and longest number of consecutive days used, D_c), plus use duration, U_m, the number of months in which the device was used at least once a week. Goals were set up to be quantitative (e.g., weight loss in pounds) and identified initially, followed by inputs for satisfaction and adherence. To assist in interpretation of our examples, weights (e.g.
x, y, and z) for goals, adherence, and satisfaction were set equal to 1.

(C) entered the inputs into an Excel (Excel 16.5, Microsoft, Redmond, WA) spreadsheet for scenario calculation.

The data inputs and calculations for the two simulated users are provided in Tables 1 and 2; utility scores for the two users for each of the DHTs are graphed in Fig. 2. Across all the DHTs, the relative role of the various factors, and in particular the impact of goals in changing the overall utility score, is evident. For example, User 2 had a markedly lower utility score (0.5) for the mySgr DHT than User 1 (23.6): User 2 achieved an average of 1 h per day with blood sugar in the normal range (70 to 99 mg/dL) vs. User 1's 10 h. Similar comparisons may be made for Headspace and the role of the goal of number of hours slept on average per day (User 2's utility score of 48.6 vs. User 1's utility score of 16.7).

In establishing these scores, a patient/user should have sufficient experience with a given DHT; we suggest that 2-4 weeks of routine use would allow the time necessary to become comfortable with the device and to troubleshoot a routine. We also suggest several approaches to store user metrics for a particular DHT, including 1) an application itself, where one would peruse a list of DHTs identified by the particular goal in mind; 2) the Utility Factor (UF) as a score that is advertised with an application (e.g., in the Apple App Store or Google Play Store); and 3) a tool associated with a facility or a health care system (e.g., the Veterans Health Administration). De-identified user profile data or population statistics should also be available to end-users as they interpret UF scores (such as in an App Store); for example, of the patients who provided feedback about Headspace, how many self-identified as having anxiety.

A potential limitation of our simulated datasets is the proposed equation for calculating utility scores.
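The min-max normalization proposed earlier for comparing similar DHTs (for example, within an App Store category) can be sketched in Python as follows. The app names and raw scores are hypothetical placeholders, not values from the paper, and the handling of the case where all scores are equal is an assumption, since the text does not address it.

```python
from statistics import mean


def min_max_normalize(scores: dict[str, float]) -> dict[str, float]:
    """Rescale raw utility scores to 0-100: (value - min) / (max - min) * 100."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # ASSUMPTION: the text does not cover identical scores; return 0.0 for all
        # to avoid dividing by zero.
        return {name: 0.0 for name in scores}
    return {name: (value - lo) / (hi - lo) * 100 for name, value in scores.items()}


# Hypothetical raw U scores for three DHTs in the same category.
raw = {"app_a": 12.0, "app_b": 48.0, "app_c": 30.0}
normalized = min_max_normalize(raw)

# Step 2 in the text: a category average that an App Store could display.
category_average = mean(normalized.values())

print(normalized)        # app_a -> 0.0, app_b -> 100.0, app_c -> 50.0
print(category_average)  # 50.0
```

Because min-max scaling depends on the lowest and highest scores in the comparison set, a DHT's normalized score would change whenever the set of DHTs in its category changes; this is a property of the technique rather than something discussed in the text.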
Alternative formulas can also be proposed, but we nevertheless suggest that goals, satisfaction, and adherence be included. Another potential limitation is in how we defined goals (e.g., weight loss in pounds). Other definitions could be equally valuable, but our aim was to highlight the importance of defining goals quantitatively. Given potential concerns about honesty in how a user may enter data, one method to address this is to include language asking users at the start of the feedback process to commit to providing complete and accurate information [24].

DHT use is on the rise, but research is still in the nascent stages. Few studies distinguish between those who use DHTs once or twice a month and those who use them daily, but this difference may be critically important [21]. Moreover, there are no studies that compare the types of health goals (e.g., reducing anxiety symptoms or HbA1c levels) attained when using a DHT. And, despite the popularity of DHTs, few studies assess how satisfaction relates to health goals. We propose defining use and linking it directly to the types of health goals met and to adherence and satisfaction elements through the creation of a novel Utility Factor score that should quantify a DHT's ability to provide benefit to consumers and inform them. Our simulations suggest that sustained attention should be paid to the goals or outcomes achieved, rather than simply satisfaction and adherence, when evaluating DHTs.

One advantage of our approach is its applicability to different types of DHTs with different health goals. The generalizability of defining utility as a function of health goals met, satisfaction, and adherence allows consumers and healthcare practitioners to select and deploy from a wide array of DHTs to address their needs. In summary, the Utility Factor Score can potentially facilitate integration of DHTs into various healthcare settings and should be evaluated within a clinical study.
A clinical study would help elucidate many of the "nuts and bolts" details associated with use of the scoring algorithm, namely how to define G, how heavily to weight x, y, and z, and who keeps the results data.

Authors contribution All authors have made substantial contributions to the conception and design of the work; drafted the work or substantively revised it; approved the submitted version; and agreed both to be personally accountable for the author's own contributions and to ensure that questions related to the accuracy or integrity of any part of the work, even ones in which the author was not personally involved, are appropriately investigated, resolved, and the resolution documented in the literature.

Conflicts of interest None of the authors have any conflicts of interest to declare.

Funding for this study was derived from institutional funding.

Fig. 2 Simulated Users 1 and 2. Utility factor scores are on the y-axis and digital health tool (DHT) type is on the x-axis

Digital mental health and covid-19: Using technology today to accelerate the curve on access and quality tomorrow
Telehealth in the Context of COVID-19: Changing Perspectives in Australia, the United Kingdom, and the United States
Evaluating digital health interventions: Key questions and approaches
The history and future of digital health in the field of behavioral medicine
Top-funded digital health companies and their impact on high-burden, high-cost conditions
Many mobile health apps target high-need, high-cost populations, but gaps remain
The promises and pitfalls of leveraging mobile health technology for pain care
2022) Digital Health
Testing the Technology Acceptance Model: HIV case managers' intention to use a continuity of care record with context-specific links
Development of an instrument to measure technology acceptance among homecare patients with heart disease
The technology acceptance model: Its past and its future in health care
The theory of Planned Behaviour: Reactions and reflections
Theory-based predictors of mindfulness meditation mobile app usage: A survey and cohort study
Mobile-web app to self-manage low back pain: Randomized controlled trial
The fitness of apps: a theory-based examination of mobile fitness app usage over 5 months
Determinants for sustained use of an activity tracker: Observational study
Diet and physical activity apps: Perceived effectiveness by app users
Clinical review of user engagement with mental health smartphone apps: Evidence, theory and improvements
Is there a benefit to patients using wearable devices such as Fitbit or health apps on mobiles? A systematic review
Questions on honest responding