key: cord-0694926-ruvs5guh
title: End users' rating of a mHealth app prototype for paediatric speech pathology clinical assessment
authors: Alabdulkarim, Lamya
date: 2021-04-24
journal: Saudi J Biol Sci
DOI: 10.1016/j.sjbs.2021.04.046
sha: c84603c620280da8277c9ad3eec7b94aefaa8956
doc_id: 694926
cord_uid: ruvs5guh

BACKGROUND: Previous studies have reported the efficacy of implementing information technology in clinical evaluation, but no research has addressed the development of mobile applications for the clinical evaluation and diagnosis of paediatric language disorders. PURPOSE: This study investigates the usability of a clinical assessment mobile application (app) prototype, the "Paediatric Arabic Language Test" (PALT), for diagnosing language disorders among paediatric patients. METHODS: Using the Lewis Computer System Usability Questionnaire (CSUQ) with a 5-point Likert scale, data were collected and scored on the usability of the app prototype, which was developed for two mobile platforms (iPhone and iPad) on a common operating system, iOS. A sample of 77 potential end users rated the usability of the app prototype, which they used between 2017 and 2019. RESULTS: The average CSUQ rating for the app prototype was 75.68; 53.2% of end users strongly agreed that the information and its organization were clear and easy to understand, and 75% were very satisfied (p < 0.0001). Even after the scale was subjected to three standardization methods, 68% of the studied items loaded on factor 1. CONCLUSION: The prototype design was judged to be usable. Users reported a user interface that allows effective operation. Differences in the factor loading may be explained by cultural factors, task type, and field context.

Mobile applications (apps) have for some time been integrated into health care and have become a key resource facilitating health care service delivery.
This impact has grown even faster during the COVID-19 pandemic because of apps' inherent accessibility. There are 43,285 downloadable health-related apps (mHealth apps) on Google Play and 45,478 on the Apple App Store for iOS, with an estimated mobile health market of 37 billion US dollars (Puri-Mirza, 2021). However, more than 90% of mHealth app developers focus on communication and medication rather than diagnosis, which is of equal importance in clinical procedures in health care. Recent recommendations indicate that digital technology could play a supportive role in enhancing diagnosis and health care (Fact sheets, 2020). A gap in this area, as evidenced by a review of the literature in the field, is therefore of some concern. Communication disorders are a notable health concern, with a worldwide prevalence ranging from 5% to 10% and expected to increase in upcoming years (Overby and Baft-Neff, 2017). Immediate referral to a speech therapist is necessary to reduce acute communication impairment and limit long-term impacts, including decreased quality of life, reduced academic and vocational opportunities, and chronic behavioural, social, and emotional problems (Taylor, Armfield, Dodrill, and Smith, 2014). In speech pathology, most previous studies on the application of mHealth have focused on home care management and intervention. Systematic reviews of thousands of apps on Google Play and the Apple App Store have found few of good quality (Furlong, Morris, Serry, and Erickson, 2018), many of them prone to non-evidence-based assessment (McKean and Bloch, 2019); none were designed for paediatric clinical assessment of disorders in Arabic speakers. Saudi Arabia has more mobile phone users than any other country in the world (Technology TMoCaI, 2020).
Though no statistics on Arabic health apps were found, observation of app e-stores revealed that most focus on telehealth consultations, appointment booking, and patient record viewing. To date, apps have not been implemented in the diagnosis of speech and language pathology in Saudi Arabia or the region. A diagnostic session in speech pathology, especially for language disorders, is a highly specialized process that consumes considerable time and effort. Enhancing the delivery of diagnosis allows clinicians to obtain an evidence-based clinical profile of a child's communication disorder, thereby providing valid information for planning interventions (Edwards and Dukhovny, 2017). Involving potential users' experience in health app design has become a gold standard in mHealth and is a significant factor in an app's ease of use (van de Kar and Hengst, 2009). It has been established that applying user-centered design during the prototype phase increases usability and hence return on investment (Marcus, 2005). The aim of this study was to validate the usability of an mHealth app prototype for the clinical assessment of language disorders in a paediatric population using the Lewis CSUQ. To investigate end users' rating of usability, this study employed the Computer System Usability Questionnaire (CSUQ) developed for field testing (Lewis, 1995). On the basis of a review of the literature and of international commercial-grade standardized clinical assessments, a clinical language assessment framework was developed: the Paediatric Arabic Language Test (PALT). Using the relational database FileMaker Pro©, an app prototype was created to be easy to use in design and function (Claris FileMaker Pro). Fig. 1 displays an overview of the PALT design. The prototype included major functional components with multimedia content. Each page had an interface design that enables the user to access information, and the interface design was consistent across items.
The prototype had 8 sub-tests and a total of 97 culturally and developmentally appropriate items. A manual was compiled for the standard test protocol. Data were collected for two kinds of mobile platforms (iPhone and iPad) across the general class of operating systems (iOS), chosen for the graphic display of the screen and the excellent user interface available after downloading and launching the app prototype. For item construct relevance to the proposed objectives of the test and the prototype design, face and content validity were reviewed and validated by an expert researcher, two clinicians, and a cohort of 108 student clinicians. The PALT prototype was uploaded and made available via Dropbox™ to be downloaded by end users. This was a retrospective, cross-sectional study. End users were recruited from the speech pathology and audiology program at the Department of Rehabilitation Sciences, King Saud University, and from speech pathology clinics in Riyadh city. Initially, end users received a short orientation session on how to download and use the prototype. Each end user used the prototype for a period of 8 weeks. End users then tested 583 Saudi children (274 females and 260 males) between 3 and 8 years of age in governmental and private pre-school and school settings as well as outpatient clinics. Two trained co-examiners tested each child at the child's school setting, as shown in Fig. 2. Data were collected under 100% supervision by an American Speech-Language-Hearing Association (ASHA)-licensed, experienced bilingual speech-language pathology faculty member. Each child participated voluntarily and could withdraw at any time. Prior to data collection, the study was approved by the King Saud University Research Ethics Committee and reviewed and approved by the Administration and Planning Department at the Ministry of Education. Consent for children to participate was obtained from parents. Field clinical assessment occurred where service delivery is usually administered to this population.
The average testing session lasted 45 min. At the end of testing, end users rated the usability of the prototype using the CSUQ. Approximately 89% of participants used iPhone XS-series, XR, or 11 devices; the remaining 11% had a 9.7″ or 10.2″ iPad. The Computer System Usability Questionnaire (CSUQ) is a standardized, generalizable, and widely applied questionnaire for assessing users' satisfaction with system usability in field research (Lewis, 1995). Responses are rated on a 7-point scale from 1 ("strongly disagree") to 7 ("strongly agree"). In our study, however, we applied a classical 5-point Likert scale for the following reasons: (1) the population sample is familiar with the 5-point Likert scale (1 = Strongly Disagree to 5 = Strongly Agree); (2) the literature suggests that a five-point scale appears to be less confusing and to increase response rate and response quality (Buttle, 1996); and (3) previous studies found that a 5-point scale is easier to read and more readily comprehensible to respondents (Dawes, 2008; Krosnick and Presser, 2010). Data were compiled and tested using IBM SPSS v22.0. Statistical analysis used two-tailed tests with an alpha error of 0.05; a 95% confidence interval (p-value 0.05) was set to determine statistical significance. Frequencies and percentages were used to describe the distribution of items and the scale. To determine the factors and their related items, an exploratory factor analysis (EFA) using principal components analysis (PCA) with Varimax rotation was run. The threshold for a factor loading to be considered significant was set at 0.60, as recommended for this sample size (Hair, Babin, Anderson, and Black, 2018). The sample size exceeded the recommended minimum for factor analysis (Lewis, 2014; Fischer and Milfont, 2010). A total of 77 end users completed the questionnaire. General descriptive data on participants are listed below (Table 1).
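The factor-extraction steps described above were run in SPSS; as a rough illustration only, the same pipeline (PCA on the item correlation matrix, Varimax rotation, flagging loadings at the 0.60 threshold) can be sketched in Python, with simulated ratings standing in for the study data:

```python
import numpy as np

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Orthogonal Varimax rotation: maximizes the variance of the
    squared loadings within each factor so items load cleanly."""
    p, k = loadings.shape
    rotation = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        d_old = d
        basis = loadings @ rotation
        u, s, vt = np.linalg.svd(
            loadings.T @ (basis ** 3 - (gamma / p) * basis
                          @ np.diag(np.sum(basis ** 2, axis=0))))
        rotation = u @ vt
        d = np.sum(s)
        if d_old != 0 and d / d_old < 1 + tol:
            break
    return loadings @ rotation

# Simulated 5-point ratings: 77 respondents x 19 CSUQ items (not the study data)
rng = np.random.default_rng(0)
ratings = rng.integers(1, 6, size=(77, 19)).astype(float)

corr = np.corrcoef(ratings, rowvar=False)           # item correlation matrix
eigvals, eigvecs = np.linalg.eigh(corr)
top = np.argsort(eigvals)[::-1][:2]                 # retain two components
loadings = eigvecs[:, top] * np.sqrt(eigvals[top])  # unrotated loadings
rotated = varimax(loadings)
flagged = np.abs(rotated) >= 0.60                   # loading threshold for N = 77
```

Because the rotation is orthogonal, each item's communality (row sum of squared loadings) is unchanged; only how the variance is distributed across the two factors differs.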
Fig. 3 displays a normal distribution of the total CSUQ scores for the PALT prototype, where the highest number of users (N = 12) had scores between 75 and 80. For all 19 items except item 9, most users agreed with the presented statements. More than half of the end users strongly agreed that the information and its organization were clear and easy to understand (item 11: 42 (54.5%); item 15: 41 (53.2%); item 13: 40 (51.9%)) (Table 2). These findings align with the reported item means (3.09 to 4.32). When the total CSUQ scale was categorized, 75% of users were very satisfied. The overall Cronbach's alpha was 0.957, indicating a high level of internal consistency and hence reliability (Table 3). Regarding construct validity, EFA with orthogonal Varimax rotation yielded a two-factor structure (p < 0.0001). Factor 1 (F1) received 73.7% (14 items) of the item loadings at the 0.60 threshold (in bold), whereas only 4 items loaded on Factor 2 (F2); question 10 did not meet the threshold (Table 4). A mean of 75.68 ± 14.18 out of 95 suggests a high satisfaction level (Table 5). Table 6 summarizes a comparison of factor loadings per item between the current study and Lewis' standardization data. Ten items (1-10; 53%) were comparable to Lewis' CSUQ. Items 11-15 loaded on factor 1 in the current study but on factor 2 in Lewis'. Items 16-18 loaded on factor 2 here compared with factor 3 in Lewis'. Item 19 loaded on factor 1 in this study but had no available comparison in Lewis' study. Several variables may have influenced the direction of the item-factor associations, including the use of a 5-point Likert scale, cross-cultural differences, and clinical assessment field testing. To statistically reduce the possible effect of the 5-point Likert scale, three conversion methods were applied.
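The internal-consistency statistic reported here, Cronbach's alpha, is k/(k−1) · (1 − Σ item variances / variance of total scores) for a k-item scale. A minimal sketch on a small hypothetical ratings matrix (respondents × items, not the study data):

```python
def cronbach_alpha(ratings):
    """Cronbach's alpha for a respondents-by-items rating matrix:
    alpha = k/(k-1) * (1 - sum(item variances) / variance(total scores))."""
    k = len(ratings[0])

    def var(xs):  # population variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    item_vars = [var([row[i] for row in ratings]) for i in range(k)]
    total_var = var([sum(row) for row in ratings])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

# Hypothetical 5-point ratings from four respondents on three items
data = [[5, 4, 5], [4, 4, 4], [3, 2, 3], [5, 5, 5]]
print(round(cronbach_alpha(data), 3))  # -> 0.965
```

Values approaching 1, such as the 0.957 reported for the full 19-item scale, indicate that the items vary together and measure a common construct.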
The z-standardization score transforms the absolute rating values (1 = Strongly Disagree to 5 = Strongly Agree) into relative scores that reflect a response's rank relative to all responses. The second method was the percent of maximum possible (POMP) (Fischer and Milfont, 2010). An advantage of POMP is that it maintains the multivariate and covariance metric of the rating scores: scores on the 5-point scale are converted to a metric from 0 (minimum possible) to 100 (maximum possible) (Moeller, 2015). The third method was the IBM transformation for converting a 5-point Likert scale to a 7-point Likert scale (IBM, 2020). PCA with Varimax rotation was run on the converted questionnaire scores. Across all methods, items loaded on two factors with the same values as in Table 4. Although the first factor included the learnability items suggested by previous work (questions 1-8), it also contained questions that do not reflect learnability (questions 9 and 11-15) (Table 7). Rating usability at the prototype phase of an mHealth app appears useful and cost effective, as it informs healthcare practitioners about the app's feasibility and possible improvements prior to the commercial stage (Lewis, 1995). Further, applying a usability scale cross-culturally yielded results different from the original validation data. Although the data in the current study showed overall high satisfaction, the scale yielded a different construct structure and was largely unidimensional. Finally, the scale results did not differ when the three conversion methods were applied, confirming that the divergent factor structure was not an artifact of the rating scale. The high rating was expected given that the design was carefully constructed using evidence-based theory and clinical practice.
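All three conversions are simple transformations of the raw ratings; a minimal sketch, assuming the IBM 5-to-7 transformation is the standard linear endpoint-to-endpoint rescaling:

```python
import statistics

def z_standardize(scores):
    """Transform absolute ratings into relative z-scores reflecting each
    response's standing relative to all responses."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)
    return [(s - mean) / sd for s in scores]

def pomp(score, minimum=1, maximum=5):
    """Percent of maximum possible: map a rating onto a 0-100 metric
    while preserving the covariance structure of the raw scores."""
    return (score - minimum) / (maximum - minimum) * 100

def rescale_5_to_7(score):
    """Linearly rescale a 5-point Likert rating onto a 7-point scale
    (assumed form of the IBM transformation: endpoints map to endpoints)."""
    return (score - 1) / (5 - 1) * (7 - 1) + 1

ratings = [5, 4, 4, 3, 5]
print([pomp(r) for r in ratings])            # -> [100.0, 75.0, 75.0, 50.0, 100.0]
print([rescale_5_to_7(r) for r in ratings])  # -> [7.0, 5.5, 5.5, 4.0, 7.0]
```

Because each conversion is a linear (or rank-preserving) transformation, the correlation matrix fed into the PCA is essentially unchanged, which is why all three methods reproduced the same two-factor loading pattern.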
The feedback on the usability of the prototype for clinical assessment in speech pathology indicated that end users agreed on its effectiveness and efficiency and were satisfied with the prototype for its intended purpose of diagnostic assessment. An implication of the findings is that usability scales may depend on the cross-cultural population (Lewis, 2014; Marcus, 2007). The division of the items might indicate that end users evaluated the app prototype along the constructs of "usability benefit" (easy to use, highly integrated into the clinical assessment of the target area, interactive, transparent) and "usability limitations" (automated error recovery). A further factor to consider is the field-research context in which the scale was applied: in this study, usability was rated for a diagnostic mHealth app prototype in a non-clinical setting. The results support that the design had a user interface allowing effective operation and information processing that aids the user in conducting clinical assessment in non-laboratory field settings. Clinicians were able to learn to use the system easily with minimal training. The study provides evidence of the commercial potential and value of the app prototype for clinical assessment and outcomes. The development of diagnostic apps may help relieve financial and access difficulties by enabling evaluation not only in local schools but also at home, utilizing commercially available devices with internet access.
Fig. 3. Distribution of the overall CSUQ scores for the PALT prototype.
Percentages, means and SD of the CSUQ ratings for the PALT prototype (N = 77).
Cultural obstacles, such as the assumption that mHealth usability studies are limited to research contexts, were addressed in this study by demonstrating that applying mHealth usability studies in a real-world framework, outside a specialist research department, is feasible (Sutherland et al., 2017; May and Erickson, 2014). One limitation of this study relates to the measurement scale: a 5-point Likert scale was used instead of a 7-point Likert scale. However, the statistical conversions showed identical results. It is hypothesized that the high usability reflects the theoretical and clinical experience the developers brought to the PALT design, which relied on evidence-based research and practice specific to the Saudi Arabic population and culture. Digital integration into paediatric clinical language assessment could help make specialist speech pathology assessments more accessible, informative, and time efficient. Future studies should implement other validated instruments to compare more usability dimensions. The positively high usability rating of the prototype should encourage practitioners to evaluate their designs at an early phase for cost effectiveness and improved quality; of particular note for clinical practitioners is the finding that good usability can be achieved as early as the prototype. The results of this study supported a two-factor representation of the CSUQ instead of a three-factor structure, indicating the importance of cultural and contextual adaptation of usability scales. We hope these results may aid in the creation of a "gold standard" for clinical assessment tools in speech pathology, as no pre-existing tools are available. The data that support the findings of this study are available from the corresponding author upon reasonable request as a Microsoft Excel© datasheet.
The study was approved by the committee of scientific research ethics at King Saud University (CAMS 129-36/37) and by the Ministry of Education, Planning Department (No. 37972368). Consent to participate was obtained from the end users via the online questionnaire after the objective of the study and its confidentiality were explained.

Table 3. Categorizing percentages of the total CSUQ scale scores.
Not satisfied (19-31): N = 0 (0%)
Neutral (32-62): N = 19 (25%)
Very satisfied (63-95): N = 57 (75%)

Table 4. Varimax-rotated factor pattern for the principal factor analysis of the CSUQ for PALT.

References
Buttle (1996). SERVQUAL: review, critique, research agenda.
Dawes (2008). Do data characteristics change according to the number of scale points used? An experiment using 5-point, 7-point and 10-point scales.
Edwards and Dukhovny (2017). Technology Training in Speech-Language Pathology: A Focus on Tablets and Apps. Perspectives of the ASHA Special Interest Groups, 2:33.
Asha.org (2020). Fact Sheets.
Fischer and Milfont (2010). Standardization in psychological research.
Furlong, Morris, Serry, and Erickson (2018). Mobile apps for treatment of speech disorders in children: an evidence-based analysis of quality and efficacy.
Hair, Babin, Anderson, and Black (2018). Multivariate Data Analysis.
Krosnick and Presser (2010). Question and Questionnaire Design.
Lewis (2014). Usability: Lessons Learned and Yet to Be Learned.
IBM (2020). Transforming different Likert scales to a common scale.
Lewis (1995). IBM computer usability satisfaction questionnaires: Psychometric evaluation and instructions for use.
Marcus (2005). User Interface Design's Return on Investment: Examples and Statistics.
Marcus (2007). Global and intercultural user interface design.
May and Erickson (2014). Telehealth: Why not?
McKean and Bloch (2019). The application of technology in speech and language therapy.
Moeller (2015). A word on standardization in longitudinal studies: don't.
Overby and Baft-Neff (2017). Perceptions of telepractice pedagogy in speech-language pathology: A quantitative analysis.
Puri-Mirza (2021). Smartphone market in MENA.
Sutherland et al. (2017). Telehealth language assessments using consumer grade equipment in rural and urban settings: Feasible, reliable and well tolerated.
Taylor, Armfield, Dodrill, and Smith (2014). A review of the efficacy and effectiveness of using telehealth for paediatric speech and language assessment.
van de Kar and Hengst (2009). Involving users early on in the design process: Closing the gap between mobile information services and their users.

The author would like to thank all study participants. This research project was supported by a grant from the "Research Centre of the Female Scientific and Medical Colleges," Deanship of Scientific Research, King Saud University. The author declares no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.