Evaluating Medical Devices Remotely: Current Methods and Potential Innovations

McLaughlin, Anne Collins; DeLucia, Patricia R.; Drews, Frank A.; Vaughn-Cooke, Monifa; Kumar, Anil; Nesbitt, Robert R.; Cluff, Kevin

Human Factors, published online 2020-09-22. DOI: 10.1177/0018720820953644

OBJECTIVE: We present examples of laboratory and remote studies, with a focus on studies appropriate for medical device design and evaluation. From this review and description of extant options for remote testing, we provide methods and tools to achieve research goals remotely.

BACKGROUND: The FDA mandates human factors evaluation of medical devices. Studies show similarities and differences in results collected in laboratories compared to data collected remotely in non-laboratory settings. Remote studies show promise, though many of these are behavioral studies related to cognitive or experimental psychology. Remote usability studies are rare but increasing, as technologies allow for synchronous and asynchronous data collection.

METHOD: We reviewed methods of remote evaluation of medical devices, from testing labels and instructions to usability testing and simulated use. Each method was coded for the attributes (e.g., supported media) that need consideration in usability studies.

RESULTS: We present examples of how published usability studies of medical devices could be moved to remote data collection. We also present novel systems for creating such tests, such as the use of 3D-printed or virtual prototypes. Finally, we advise on targeted participant recruitment.

CONCLUSION: Remote testing will bring opportunities and challenges to the field of medical device testing. Current methods are adequate for most purposes, excepting the validation of Class III devices.

APPLICATION: The tools we provide enable the remote evaluation of medical devices. Evaluations have specific research goals, and our framework of attributes helps to select or combine tools for valid testing of medical devices.

Due to the COVID-19 pandemic and the need for social distancing to reduce the spread of the coronavirus, laboratory research has decreased in a wide range of disciplines (Servick et al., 2020), with termination of studies that involve in-person data collection from human participants (Clay, 2020). This affects not only academic institutions but also industries that develop medical devices and must provide human factors validation to receive U.S. Food and Drug Administration (FDA) approval. One alternative is to conduct human factors testing remotely. We present an overview of the technologies and best practices for remote evaluations of medical devices, from observational studies to usability tests to controlled behavioral experiments.

We combined searches of the literature using the Summon database, a multidisciplinary unified search engine of databases and journals, with our own knowledge of tools used by user researchers in industry. Many of the tools most used in industry did not appear in the published literature, but we believed it was important to detail their features to best help those needing to user-test medical devices remotely. Because remote testing is a cutting-edge field, we limited our literature search to the last 15 years and emphasized work from the last 5 years.
We evaluated the match of these methods to FDA guidelines for medical device evaluation, the attributes of devices that can be tested, and other considerations such as cost and whether the platform was well established. We focused on options that required little-to-no knowledge of programming or system administration.

The FDA outlines the expectations for a human factors evaluation of a medical device according to device class (FDA, 2016). Class I devices are considered low risk, for example, a surgical tool. Class II devices have some risk in their use, for example, pregnancy test kits or infusion pumps. Class III devices are considered high risk, as they often sustain life, such as ventilators and pacemakers. Only 10% of medical devices are Class III (FDA, 2016). Because most remote testing will be formative, it can apply to all classes of devices. However, for summative assessment, remote testing will be most difficult for Class III devices and, at times, impossible.

The data collected for medical device usability can vary from qualitative and contextual information gathered during formative testing to safety-related use errors in summative testing. The most commonly needed data include signs of difficulty, close calls, and use errors. Reference to instructions for use, need for assistance, and unsolicited comments are also often desired (Wiklund et al., 2016). Many tools and techniques transfer well to remote use, such as surveys, interviews, and expert evaluations. Others are more challenging, such as simulated use and recruiting representative users for validation testing. Items to be tested vary as well, from the usability of instructions and warnings to the operation of physical devices.

The methods reviewed here are most useful for testing Class I and Class II medical devices, and for formative evaluation of Class III devices for premarket review processes (FDA, 2016). Remote summative testing is more of a challenge and has not been addressed in the published literature. Summative testing focuses on safety-related use errors with the actual device, meaning that a production-level device must be in the hands of the user (often a three-dimensional object). When collecting data during a summative test, use errors must be recorded and cannot be missed. Further, a comprehensive set of representative tasks must be carried out by representative users under the conditions that would be expected in the field. As summative testing is essentially required for Class II and Class III medical devices, whether a remote test is possible will depend on the device and testing needs. An exception is the summative usability testing of electronic health records (EHRs), which do not require specialized equipment to be sent to the participant. Though EHRs are not regulated as medical devices by the FDA (21st Century Cures Act of 2016), usability testing is needed to meet the safety-enhanced design requirement of the Office of the National Coordinator for Health Information Technology (Office of the National Coordinator for Health Information Technology [ONC], 2015).

Although usability testing performance is often measured on the order of minutes, we note that delays due to network connectivity issues could be a limitation if performance must be measured on the order of seconds. Thus, network connection would be a limiting factor for many summative tests, at the very least making some participant data unusable. However, it is unlikely to affect performance measurements for formative tests. That said, a poor connection, with dropped audio and video throughout, is a barrier to communication and heightens frustration, making even some formative tests or interviews unusable. As mentioned in our later section regarding recruitment, users at home with lower socioeconomic status may be the most adversely affected, either through limited bandwidth or through many persons in a home needing to use the same internet connection.
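One common mitigation is to timestamp responses on the participant's machine and transmit data only after the task, so that network jitter delays delivery rather than distorting the measurements. The following is a minimal TypeScript sketch of this pattern, assuming a hypothetical `/api/results` collection endpoint and a page element with id `stimulus`; it is an illustration of the approach, not any particular platform's implementation.

```ts
// Capture response times with the browser's local high-resolution clock and
// upload all results once at the end, so network delay affects only delivery.
// '/api/results' is a placeholder endpoint; 'stimulus' is a page element id.
type TrialResult = { stimulus: string; onsetMs: number; rtMs: number };

const results: TrialResult[] = [];

function runTrial(stimulus: string): Promise<void> {
  return new Promise((resolve) => {
    const el = document.getElementById('stimulus')!;
    el.textContent = stimulus;
    const onsetMs = performance.now(); // local clock, unaffected by the network
    const onKey = () => {
      results.push({ stimulus, onsetMs, rtMs: performance.now() - onsetMs });
      document.removeEventListener('keydown', onKey);
      resolve();
    };
    document.addEventListener('keydown', onKey);
  });
}

async function uploadResults(): Promise<void> {
  // A flaky connection can delay this upload but cannot distort the measured times.
  await fetch('/api/results', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(results),
  });
}
```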
Remote testing has the advantage of collecting data from large, diverse populations, quickly and at low cost (Woods et al., 2015). However, the sine qua non of a remote test is whether online results replicate those from a controlled laboratory environment. The implication is that the same psychological processes were activated in the two testing environments despite their physical differences (psychological fidelity; Kantowitz, 1988). The quality of results from online studies of cognitive performance is often comparable to that of laboratory studies (Woods et al., 2015; see also Germine et al., 2012). Replicated results included the Forward Digit Span task, Flanker task, and Face Memory task. The most challenging results to replicate are those with short display presentations, such as masked priming tasks. Other concerns included stimulus timing (onset and duration), response time measurement, lack of control over participant equipment (e.g., visual size, luminance, resolution, color; auditory volume), the participant environment, duplicate or random responders, and ethical concerns about maintaining participant anonymity and privacy.

Results of a problem-solving laboratory study that compared three learning conditions were replicated in an online format, but with a higher participant dropout rate and lower performance accuracy in the online condition (Dandurand et al., 2008). Similarly, comparable performance data were obtained for online and laboratory administrations of an interruption task that was time-sensitive, long in duration, and required sustained concentration (Gould et al., 2015). This study showed that online tests can replicate results of laboratory conditions for tasks that are more complex and longer in duration than those typically examined in comparisons of laboratory and online tests.

Although controlled laboratory experiments are considered the gold standard, they are potentially limited by a lack of external validity, which is important to consider for medical device use. For example, some usability problems are unlikely to appear in a laboratory or highly controlled setting, such as sociological issues or working conditions not anticipated by the study designer (Wiklund et al., 2016). This is one reason we included review of tools that offer contextual and qualitative information on use (Table 1).

Results of usability testing of a regional hospital website in Switzerland were compared between laboratory and two remote testing conditions, including asynchronous and synchronous administrations (Sauer et al., 2019). Task completion rate, time, and efficiency did not differ across the three testing conditions, nor were there differences in perceived usability, perceived workload, or affect.
When the usability of a computer-simulated smartphone was measured with laboratory and asynchronous remote formats, the difference in task completion time and efficiency between testing conditions was not significant when the usability of the smartphone was good (Sauer et al., 2019). When usability was poor, task completion time and click frequency were higher in the laboratory. Perceived usability ratings were higher in the lab, and workload did not differ statistically between testing conditions, regardless of the quality of the smartphone's usability.

Other laboratory-to-online comparisons included the usability of email software across a conventional lab test, a remote synchronous test, and a remote asynchronous test. Findings showed few differences in performance results (e.g., task completion time), but more usability issues were identified in the conventional lab and remote synchronous testing conditions (Andreasen et al., 2007). Similar results were found when comparing synchronous lab and asynchronous remote testing using critical incident reporting, forum discussions, and longitudinal reporting in user diaries (Bruun et al., 2009). Evaluation of a shopping website using a think-aloud protocol had similar results when conducted in a laboratory or online, though the sample size was small (Thompson et al., 2004). Descriptive results suggested that remote users took more time and made more errors, but identified more usability issues than in-person lab participants.

In summary, remote testing obtained results comparable to laboratory settings. However, published comparisons were few and limited to tasks without specialized hardware or software (e.g., vibrotactile devices, motion sensors). No studies were found comparing laboratory and remote testing of medical devices.

A "HUMAN FACTORS TOOLBOX" FOR REMOTE USABILITY TESTING OF MEDICAL DEVICES

We collected potential remote usability tools, from those appropriate for scientific study and use of inferential statistics to those intended to gather qualitative data from a small number of participants. Medical devices are often physical and three-dimensional, with moving parts critical to their operation, and may require other equipment to be used (e.g., a patient simulator). Prototypes are often expensive and difficult to create or repair. Because of the scarcity of remote medical device studies, we included usability testing of products similar to medical devices. Pros and cons of each tool are provided, with considerations for the collection of performance and observational data (Table 1).

Remote testing relies on software that can host the surveys and stimuli, and enable communication. Some platforms were for specialized use, such as eye tracking, while others purported to provide everything from participant recruitment to study-building to analysis and reports. Because of the particular attributes of medical devices, we have organized these platforms into categories: (1) those appropriate for the evaluation of 2D or "flat" stimuli: web interfaces, labels, instructions; and (2) those appropriate for the evaluation of 3D stimuli: physical devices and packaging. In each of these categories, we review how the platform has been used or validated in the psychological literature.

FLAT INTERFACE EVALUATION

Flat interfaces include websites, warnings, and labels. The available platforms varied greatly in terms of price, features, functionality, and need for technical knowledge ("Behavioral Experiment Hosting Platforms" in Table 1). Many were free to use but usually required more programming knowledge and online resources, such as web servers or installing the open-source software from a repository. Sauter et al. (2020) provided a review of extant solutions for online behavioral studies requiring high experimental control. Overall, these platforms were for collecting scientific data. They emphasized timing accuracy and supported the typical protocol of "display stimuli -> collect response." Although they mimicked well-established measures, they have not all been validated to show that remote results were the same as those collected in a laboratory. They often differed in the inputs for the measures (e.g., allowing use of a mobile device rather than a keyboard) or in other ways that changed the outcome (e.g., screen brightness). The published comparisons of online and laboratory studies are promising in this regard, but we recommend caution in assuming a validated cognitive test will replicate on these platforms.
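The "display stimuli -> collect response" protocol and its timing constraints can be made concrete: browsers repaint only once per display refresh (about every 16.7 ms at 60 Hz), so brief displays such as masked primes are typically scheduled in whole frames, with actual paint times logged so out-of-spec trials can be excluded. Below is a hedged TypeScript sketch of frame-counted presentation, not tied to any particular platform.

```ts
// Present a stimulus for a whole number of display refresh intervals
// (e.g., 2 frames is roughly 33 ms on a 60 Hz display), logging actual paint
// timestamps so trials with missed frames can be excluded during analysis.
function presentForFrames(
  el: HTMLElement,
  text: string,
  frames: number
): Promise<{ onsetMs: number; offsetMs: number }> {
  return new Promise((resolve) => {
    let shown = 0; // refresh intervals the stimulus has been visible
    let onsetMs = 0;
    el.textContent = text;
    const step = (now: number) => {
      if (shown === 0) onsetMs = now; // stimulus paints on this frame
      shown += 1;
      if (shown > frames) {
        el.textContent = ''; // blank or mask paints on this frame
        resolve({ onsetMs, offsetMs: now });
      } else {
        requestAnimationFrame(step);
      }
    };
    requestAnimationFrame(step);
  });
}

// Usage: a 2-frame masked prime; compare offsetMs - onsetMs to the nominal duration.
// const { onsetMs, offsetMs } = await presentForFrames(stimulusEl, '#####', 2);
```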
Many medical devices need to be assessed with simulated-use methods in a 3D environment. There are several options for evaluation, and the choice depends on the type of measures needed. The options are (1) display a virtual 3D prototype on a flat screen that can be manipulated via an interface (e.g., using the mouse to rotate the prototype to view the other side), (2) display a virtual 3D prototype on a flat screen using virtual or augmented reality (VR/AR), where the user can have limited interactions with the device, or (3) send a prototype or device to the user and record interactions via teleconferencing software or a contextual video diary. All these methods except the last are most suited for formative evaluation and iterative design (the "design verification" stage). The third option may fulfill simulated use at the "design validation" stage (Mejía-Gutiérrez & Carvajal-Arango, 2017).

The ubiquity of mobile devices with recording capability makes it possible for users to provide contextual information on their needs and use of medical devices ("General Purpose/Qualitative Emphasis" in Table 1). On the low-cost end, users can take photos or make videos using their own mobile devices. These can be prompted by questions about their environment, such as "Show us how you manage your medications in the morning" or "Please make a video showing how you test your blood sugar using your current device." The constraint of relying on a user's smartphone concerns data access: it may be challenging for users to (1) understand how to send video files and (2) store or transfer large video files on their own device. Also, populations of interest may not own or be comfortable with smartphones. Commercial tools have been developed to aid user researchers in collecting these data. For example, indeemo (indeemo.com) offers a platform for remote, asynchronous ethnography: users are invited to download the app, prompts for audio, photo, video, or diary entries are automated, and the data are accessible to the researcher. Difficulty in recruiting some populations still applies.

The literature on remote 3D testing was sparse and often limited to novel computing solutions not easily available and accessible. We were unable to find any studies of remote evaluations of 3D medical devices, perhaps because, thus far, such tests have not been necessary.
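As an illustration of option (1) above, a virtual 3D prototype can be embedded in an ordinary web page with an open-source library such as three.js, letting a remote participant rotate and zoom the model with a mouse. The following is a minimal sketch, assuming a hypothetical device-prototype.glb model exported from CAD; it is one possible setup, not a prescribed toolchain.

```ts
// Minimal rotatable 3D prototype viewer using three.js (threejs.org).
// 'device-prototype.glb' is a hypothetical model file exported from CAD.
import * as THREE from 'three';
import { OrbitControls } from 'three/examples/jsm/controls/OrbitControls.js';
import { GLTFLoader } from 'three/examples/jsm/loaders/GLTFLoader.js';

const scene = new THREE.Scene();
const camera = new THREE.PerspectiveCamera(45, window.innerWidth / window.innerHeight, 0.1, 100);
camera.position.set(0, 0.5, 2);

const renderer = new THREE.WebGLRenderer({ antialias: true });
renderer.setSize(window.innerWidth, window.innerHeight);
document.body.appendChild(renderer.domElement);

scene.add(new THREE.AmbientLight(0xffffff, 0.6));
const light = new THREE.DirectionalLight(0xffffff, 0.8);
light.position.set(1, 2, 3);
scene.add(light);

// Mouse drag rotates the prototype; the scroll wheel zooms.
const controls = new OrbitControls(camera, renderer.domElement);
new GLTFLoader().load('device-prototype.glb', (gltf) => scene.add(gltf.scene));

function animate(): void {
  requestAnimationFrame(animate);
  controls.update();
  renderer.render(scene, camera);
}
animate();
```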
We did find evidence of testing done on other 3D devices, such as usability testing of a 3D mobile phone prototype online that showed the benefit of remote data collection (Figure 1; Kuutti et al., 2001). Kuutti et al. recommended training on use of the 3D viewer before exposure to the product. The evaluation of a camera interface in 3D (Kanai et al., 2009) provided similar conclusions regarding the benefits and limitations of a 3D virtual prototype usability test.

Mixed reality (MR) was used for prototype testing, though not remotely. In MR, a physical object is altered with virtual attributes. For example, in a study on the usability of a projector system, an abstracted physical form was created: a plastic block (Figure 2; Faust et al., 2019). When a fiducial marker was added to the form, participants saw an AR image in the place of the block, where the block now appeared to be a fully functioning projector. MR thus allowed for physical interaction: the plastic block could be touched or lifted by a participant. Virtual buttons were shown on the block and participants could touch them to complete tasks with the projector. Performance and subjective assessments were similar when compared to the same tasks with a real projector, making MR a promising option for 3D remote testing.

Figure 2. From Faust et al. (2019). The left image shows the plastic model of the projector with no AR overlay. The right image shows the same model with an AR overlay making it appear like a real projector, with a user interface appearing on the surface of the model. Buttons on the AR interface could be pressed and outcomes observed on the projection screen as though the plastic model were a functioning projector. AR, augmented reality.

Some researchers developed head-mounted virtual and AR displays using smartphones so that users could see objects in 3D, but these were not easily available (Rakkolainen et al., 2016). Commercially available options included Google Cardboard (https://arvr.google.com/cardboard/), where a phone can be placed inside the cardboard viewer and held to the eyes to create an immersive virtual environment. Studies comparing in-lab VR systems to Google Cardboard systems found similar results (Mottelson & Hornbaek, 2017). Researchers can create virtual prototypes situated in a VR environment for remote testing. However, interactions with prototypes in VR are limited, making this method better for showing a design and collecting subjective data rather than performance data. No usability studies were found that employed this method for remote or in-person data collection.

The mail system has been utilized in some user experience testing (Diamantidis et al., 2015). The product being tested was electronic and shown online (a medication inquiry system); however, the inputs to the test were pill bottles that were mailed to participants. This study was performed with participants low in health literacy. Two interfaces were tested, one on a mobile phone via text and the other on a personal digital assistant (PDA) such as an iPod Touch. Participants entered information from the physical pill bottles into the electronic systems. Similarly, 3D prototypes can be printed at low cost. Some services specialize in printing for the medical industry (e.g., stratasys.com). These prototypes can be mailed to users and paired with testing via videoconference, or users can film themselves while carrying out the tasks. Data can include think-alouds and can also provide insights on tactile interactions.
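For participants filming themselves, one lower-friction alternative to manual file transfer is recording directly in the browser with the standard MediaRecorder API, so the video uploads automatically when the session ends. A sketch follows, with the upload endpoint as a placeholder; browser support for specific recording formats varies, so this is illustrative rather than production-ready.

```ts
// Record the participant's webcam while they perform tasks with a mailed
// prototype, then upload the video in one request. The URL is a placeholder.
async function recordSession(durationMs: number): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({ video: true, audio: true });
  const recorder = new MediaRecorder(stream, { mimeType: 'video/webm' });
  const chunks: Blob[] = [];

  recorder.ondataavailable = (e) => chunks.push(e.data);
  recorder.onstop = async () => {
    stream.getTracks().forEach((t) => t.stop()); // release camera and microphone
    const video = new Blob(chunks, { type: 'video/webm' });
    await fetch('/api/session-video', { method: 'POST', body: video });
  };

  recorder.start();
  setTimeout(() => recorder.stop(), durationMs); // or stop from a "Finish task" button
}
```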
For both flat and 3D interfaces and devices, eye tracking is used by researchers studying medical devices (Koester et al., 2017). Multiple online options exist, making data easy to collect provided the remote user has a webcam. A 2014 study showed similar results between webcam and traditional eye tracking for "reasonably" sized images in the focal area, and the technology has likely been improved and refined in the past 6 years (Burton et al., 2014). Unfortunately, the tracking is limited to the display, meaning that the medical device or interface must be shown in two dimensions. One of the earliest efforts took place in 2011, when the teleconferencing program Skype was paired with an eye-tracking program to collect website usability data (Chynał & Szymański, 2011). Since then, remote eye tracking has expanded rapidly, with commercial as well as academic or open-source versions (Table 1). Measures provided usually include videos of the gaze paths, heatmaps, and (less frequently) dwell time in areas of interest (AOIs).

Although online eye tracking is a viable remote testing tool, use of wearable eye trackers will likely remain complicated. The cost of mobile systems, the difficulty of shipping them to enough participants (and receiving them back), sanitization during the pandemic, and the challenges for a participant to calibrate and record mean their use would likely be reserved for testing devices with already highly trained and motivated experts (e.g., surgeons).
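As an example of how lightweight webcam gaze collection can be, the open-source WebGazer library (cited in the references) streams gaze estimates to a callback, and dwell time in an AOI can be approximated by hit-testing samples against an element's bounding box. The sketch below assumes the library is loaded via a script tag and a roughly constant sampling interval; the interval value is an assumption, not a library guarantee.

```ts
// Webcam eye tracking with WebGazer (webgazer.cs.brown.edu), loaded via <script>.
// Gaze samples are hit-tested against an area of interest (AOI) to estimate dwell.
declare const webgazer: any; // global provided by webgazer.js

const samples: Array<{ x: number; y: number; t: number }> = [];

webgazer
  .setGazeListener((data: { x: number; y: number } | null, elapsedTime: number) => {
    if (data) samples.push({ x: data.x, y: data.y, t: elapsedTime });
  })
  .begin();

// Approximate dwell time (ms) in an AOI, assuming a roughly constant sample interval.
function dwellTimeMs(aoi: DOMRect, intervalMs: number): number {
  const hits = samples.filter(
    (s) => s.x >= aoi.left && s.x <= aoi.right && s.y >= aoi.top && s.y <= aoi.bottom
  );
  return hits.length * intervalMs;
}

// Usage, with a hypothetical element id:
// dwellTimeMs(document.getElementById('warning-label')!.getBoundingClientRect(), 33);
```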
RECRUITMENT

The FDA encourages medical device manufacturers to include test participants who are "representative of the range of characteristics within their user group," with each group representing distinct user populations who will "perform different tasks or will have different knowledge, experience or expertise that could affect their interactions with elements of the user interface" (FDA, 2016). One advantage of remote usability testing is that individuals who cannot participate in laboratory testing due to high-risk conditions preventing them from leaving their home can still be included. Because of the importance of recruiting representative users for patient-facing devices, efforts put into finding and including these individuals should help to uncover usability issues that might otherwise have been missed. It is also easier for stakeholders to observe test sessions from distant geographic locations when testing is done remotely, and to include more geographically diverse participant samples (Wiklund et al., 2016).

Adhering to this guideline is critical for patient-facing devices, whose user population consists of highly heterogeneous chronic disease patients, dominated by high-risk characteristics such as limited health literacy (Poureslami et al., 2017) and limited technological competence (Kruse et al., 2018). Also, many patient-facing devices are used primarily "remotely" for disease self-management (e.g., a glucometer) and must facilitate treatment in cases where direct physician supervision is not feasible (Greenwood et al., 2017). Unfortunately, patient recruitment and proportional representation in the design process is typically difficult and expensive due to population heterogeneity and recruitment barriers (Marquard & Zayas-Cabán, 2012). These barriers may be exacerbated when moving studies online.

Lower levels of trust in the medical system are well documented, particularly among marginalized and socioeconomically disadvantaged populations (Benkert et al., 2019), who comprise a large portion of the chronic disease population. This impacts participation rates, which may decrease further when studies are conducted in an unfamiliar online format. In addition, recruitment efforts from a company or organization with whom participants are not familiar may fail. However, actively including trusted parties in the recruitment process (e.g., primary care providers) may help to alleviate existing trust issues. The single most important factor affecting accrual is whether the patient's healthcare provider recommends that the patient participate in a particular study (Albrecht et al., 2008).

Health literacy refers to skills such as reading, writing, numeracy, communication, and the use of electronic technology (Güner & Ekmekci, 2019) that are necessary to make appropriate health decisions and navigate the healthcare system. To ensure representation of major user groups as required by the FDA, it is recommended that patients be stratified based on expected health literacy, often assessed via an Agency for Healthcare Research and Quality (AHRQ) health literacy survey tool (Agency for Healthcare Research and Quality, 2020), such as the Short Assessment of Health Literacy (Lee et al., 2010) or the Rapid Estimate of Adult Literacy in Medicine (Arozullah et al., 2007). Alternatively, patients with Medicare, Medicaid, or no insurance have been shown to have lower health literacy levels (National Center for Education Statistics, 2006); these groups can therefore be recruited to target the low health literacy strata.

Transferring usability studies that traditionally involved in-person interaction to an online format means the patient is responsible for adhering to study protocols, at times outside the supervision of the study moderator; this will be more challenging than in-person studies. Presenting literacy-level-appropriate information that is linguistically and idiomatically aligned with the patient's needs is critical (Lopez et al., 2018). It is widely recommended that printed information not exceed a 7th or 8th grade reading level (Asiedu et al., 2020). These recommendations become even more critical in the context of remote usability studies. Other recommendations are to avoid medical jargon, break instructions down into smaller and more manageable concrete steps, and assess comprehension (Hersh et al., 2015). Instructional videos, in comparison to textual information, have also been shown to be effective communication tools that increase memory retention and patient satisfaction (Güner & Ekmekci, 2019; Sharma et al., 2018). As important as these recommendations are for in-person studies, they will be even more critical for remote studies. Synchronous data collection, leveraging videoconference and screen-sharing technologies, would be preferred for participants with lower health literacy.
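One way to operationalize the reading-level recommendation when preparing study materials is to screen instructions with the Flesch-Kincaid grade-level formula, 0.39 x (words/sentences) + 11.8 x (syllables/words) - 15.59. The rough sketch below uses a crude vowel-group syllable heuristic; a validated readability tool should be preferred for formal use.

```ts
// Rough Flesch-Kincaid grade-level check for study instructions.
// The syllable counter is a crude vowel-group heuristic, adequate only for screening.
function countSyllables(word: string): number {
  const groups = word.toLowerCase().match(/[aeiouy]+/g);
  return Math.max(1, groups ? groups.length : 1);
}

function fleschKincaidGrade(text: string): number {
  const sentences = Math.max(1, (text.match(/[.!?]+/g) || []).length);
  const words = text.split(/\s+/).filter((w) => /[a-z]/i.test(w));
  const wordCount = Math.max(1, words.length);
  const syllables = words.reduce((sum, w) => sum + countSyllables(w), 0);
  return 0.39 * (wordCount / sentences) + 11.8 * (syllables / wordCount) - 15.59;
}

// Flag instructions above the recommended 7th-8th grade level.
const grade = fleschKincaidGrade('Press the plunger until you hear a click. Hold for five seconds.');
if (grade > 8) console.warn(`Instructions read at grade ${grade.toFixed(1)}; consider simplifying.`);
```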
It is recommended that a high proportion of persons with limitations or low language proficiency be recruited for formative medical device usability studies, as this will increase the number of use errors uncovered and increase the accessibility of the final product (Wiklund et al., 2016). In some cases, it may be easier to recruit these users and users with lower socioeconomic status remotely, as they do not need to find transportation or child care, or use vacation time, to attend a session.

However, connectivity and internet access will remain a challenge for remote testing. While the use of digital technologies and internet access has become more widespread, a disparity exists between young adults, who predominantly use these tools, and older adults, who dominate the chronic disease population (Madrigal & Escoffery, 2019). An additional barrier to online testing of patient-facing devices is limited access to online resources and limited competence using technology. For example, chronic disease patients have low rates of online health information technology use despite its widespread availability (Ali et al., 2018), with substantial impact on the usability and acceptability of online testing platforms. Prior studies have shown that access to the internet and digital technologies affects a patient's willingness to use online services (Estacio et al., 2017). Given the existing barriers associated with technological competence in this older adult population, the move to remote testing, where technology is the sole platform for interaction, is expected to exacerbate these barriers. To mitigate issues with basic interactions (e.g., web navigation) or software installation, experimenters should provide resources for phone support prior to any online usability study.
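A brief pre-session connection check can surface such problems before a moderator's time is committed, for example by timing a few small requests and flagging high latency so staff can offer phone support or reschedule. The sketch below assumes a hypothetical small file (ping.txt) hosted on the study server and an arbitrary 300 ms threshold.

```ts
// Pre-session connection check: estimate round-trip latency with repeated small
// requests. '/ping.txt' is a placeholder for any small resource on the study server.
async function medianLatencyMs(trials = 5): Promise<number> {
  const times: number[] = [];
  for (let i = 0; i < trials; i++) {
    const start = performance.now();
    await fetch(`/ping.txt?nocache=${Date.now()}`, { cache: 'no-store' });
    times.push(performance.now() - start);
  }
  times.sort((a, b) => a - b);
  return times[Math.floor(times.length / 2)];
}

// Warn the participant (and log for staff) if the connection looks too slow for video.
medianLatencyMs().then((ms) => {
  if (ms > 300) console.warn(`Median latency ${ms.toFixed(0)} ms; consider phone support or rescheduling.`);
});
```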
Last, as with in-person testing, there is an art to remote testing. Camera position, video and audio clarity, and good moderation are critical for detecting participant reactions. Some of the reviewed solutions offer automated affect detection using facial analysis (e.g., EyeSee), but this applies only to tests in which the participant looks at stimuli on the display rather than interacting with a physical object. Helping the participant to set up the test with the best lighting and angles possible for synchronous tests, and providing clear instructions for asynchronous tests, will be needed to fully witness participant interactions.

We have summarized a variety of tools to conduct remote usability evaluations of medical devices and outlined important challenges. We provided tools appropriate for various research goals and ideas for extending other usability methods to remote use. We conclude that remote evaluation of medical devices is possible but challenging. Though some studies of cognitive and usability tasks suggest that results of remote tests are comparable to those of laboratory tests, such studies covered a limited range of tasks. Though a few researchers have attempted evaluations of 3D devices, both virtually and physically, the literature is not strong enough for firm conclusions on the comparison between remote and in-person testing. Fortunately, in many cases, the type of data desired from usability tests (subjective assessments, qualitative impressions, learning) can be collected remotely while maintaining data quality comparable to lab-based testing. Last, aside from the technological hurdles, remote evaluations will require dedicated resources and attention toward recruiting representative users, who may be the most challenging to test online.

Acknowledging the challenges of moving studies to remote testing, we have created examples of remote testing using four medical device studies taken from the literature. Figure 3 provides four examples of how these published medical device evaluation studies might be moved to an online format. The choices of platform and software are based on the research goals of each study and the data needed to support those goals. Then, the remotely presented stimuli are created to display on the chosen data collection platform. These studies were chosen to show the variety of research that may be moved online, from perceptual experiments, to mobile device interfaces, to 3D devices, and finally eye-tracking usability methods.

Figure 3. Studies on the left were reimagined as online, and tools that could provide the same or similar data are given. The types of stimuli that would be input into these tools are shown on the right.

Remote usability testing is an emerging field that has the potential to increase the efficiency of data collection. In addition, it has the potential to allow access to user groups that are difficult to recruit, if the correct precautions are taken. It is promising that initial work demonstrates equivalence between lab-based and remote testing, and that with the emergence of new approaches, remote testing can expand beyond subjective usability assessments.

KEY POINTS

• Remote evaluation of medical devices will be necessary if the field is to progress during the restrictions of a pandemic.
• Many solutions are available, from those specialized for controlled experiments to those collecting qualitative data from a small number of participants.
• Novel attributes of some remote testing platforms include the ability to assess teams of participants, eye tracking, and enabling evaluation of 3D devices.
• Recruiting remote users with appropriate demographics to meet FDA obligations is expected to be more difficult than in-person testing.
• We are cautiously optimistic that the tools for remote testing are at a point where medical devices can be design verified, with some able to be fully validated.

ACKNOWLEDGMENTS

We are grateful to Richelle Huang for assistance with the literature review. All authors are members or partners of the hfMEDIC consortium (hfMEDIC.org), a partnership of academic and industry researchers dedicated to developing safer and more effective medical devices through human-centered design.

ORCID iDs

Anne Collins McLaughlin https://orcid.org/0000-0002-1744-085X
Patricia R. DeLucia https://orcid.org/0000-0002-1735-9154

Date received: May 20, 2020
Date accepted: July 9, 2020

BIOGRAPHIES

Anne Collins McLaughlin is currently a professor in the Department of Psychology at North Carolina State University in Raleigh, NC. She earned her PhD from Georgia Tech in 2007. Her research interests include the study of individual differences in cognition, particularly those that tend to change with age.

Patricia R. DeLucia is currently a professor in the Department of Psychological Sciences at Rice University in Houston, TX. She earned her PhD in psychology in 1989 from Columbia University. Her research interests include the human factors of health care (minimally invasive surgery, telehealth, medication administration, patient safety, and medical device design).

Frank A. Drews is currently a professor in the Department of Psychology at the University of Utah. He earned his PhD in psychology in 1999 from the Technical University of …

Monifa Vaughn-Cooke is currently an assistant professor in the Mechanical Engineering Department at the University of Maryland, College Park. She earned her PhD in industrial engineering in 2012 from The Pennsylvania State University. Her expertise is in the area of human factors and healthcare, with a focus on improving human performance for medical device interaction.

Anil Kumar is currently an associate professor in the Industrial and Systems Engineering Department at San Jose State University. He earned his PhD in industrial engineering from Western Michigan University in 2007. His areas of specialty include product design and development (medical products and healthcare), ergonomics, human factors, work measurement and analysis, and safety.

Robert R. Nesbitt is currently the director of Human-Centered Design and Human Factors at AbbVie, Chicago, Illinois. He has worked across a number of domains in industry, from Deere & Co. to Eli Lilly. In his role at AbbVie, he focuses on early-stage ethnographic or in-context understanding of patients' and users' needs, particularly for the design of combination medical products.

Kevin Cluff is principal consultant at BioWork Engineering, specializing in research and human factors for late-stage combination products. Prior to BioWork, Kevin was a principal research engineer for 16 years at a major biopharmaceutical company where he was responsible for HF studies and documentation of FDA submissions.

REFERENCES

Health literacy measurement tools (revised)
Influence of clinical communication on patients' decision making on participation in clinical trials
Focus section health IT usability: Applying a task-technology fit model to adapt an electronic patient portal for patient work
What happened to remote usability testing? An empirical study of three methods
Gorilla in our midst: An online behavioral experiment builder
Development and validation of a short-form, rapid estimate of adult literacy in medicine
An assessment of patient perspectives on pharmacogenomics educational materials
The disruptive effects of pain on n-back task performance in a large general population sample
Improvement of design of a surgical interface using an eye tracking device
Envisioning consumers: How videography can contribute to marketing knowledge
Ubiquitous yet unclear: A systematic review of medical mistrust
Evaluating varied label designs for use with medical devices: Optimized labels outperform existing labels in the correct selection of devices and time to select
Human factors validation study of 3 mg sumatriptan autoinjector, for migraine patients
Let your users do the testing: A comparison of three remote asynchronous usability testing methods [Conference session]
A comparison of the performance of webcam vs. infrared eye tracking technology
Remote usability testing using eye tracking [IFIP Conference on Human-Computer Interaction]
Conducting research during the COVID-19 pandemic
Comparing online and lab methods in a problem-solving experiment
Face processing skills predict faithfulness of portraits drawn by novices
Remote usability testing and satisfaction with a mobile health medication inquiry system in CKD
The digital divide: Examining socio-demographic factors associated with health literacy, access and use of the internet to seek health information
Mixed prototypes for the evaluation of usability and user experience: Simulating an interactive electronic device
LabVanced: A unified JavaScript framework for online studies [Conference session]
Is the web as good as the lab? Comparable performance from web and lab in cognitive/perceptual experiments
Home is where the lab is: A comparison of online and lab data from a time-sensitive study of interruption
A systematic review of reviews evaluating technology-enabled diabetes self-management education and support
Identifying learning style through eye tracking technology in adaptive learning systems
A survey study evaluating and comparing the health literacy knowledge and communication skills used by nurses and physicians
Health literacy in primary care practice
PLATT: A flexible platform for experimental research on team performance in complex environments
3D digital prototyping and usability enhancement of information appliances based on UsiXML
Laboratory simulation of maintenance activity [Conference session]
Testing the effectiveness of the Internet-based instrument PsyToolkit: A comparison between web-based (PsyToolkit) and lab-based (E-Prime 3.0) measurements of response choice and response time in a complex psycholinguistic task
The use of eye-tracking in usability testing of medical devices
Evaluating barriers to adopting telemedicine worldwide: A systematic review
Virtual prototypes in usability testing [Conference session]
"Just Another Tool for Online Studies" (JATOS): An easy solution for setup and management of web servers supporting online studies
Short Assessment of Health Literacy - Spanish and English: A comparable test of health literacy for Spanish and English speakers
Comparing four online symptom checking tools: Preliminary results [Conference session]
Depression screening and education: An examination of mental health literacy and stigma in a sample of Hispanic women
Electronic health behaviors among US adults with chronic disease: Cross-sectional survey
Commercial off-the-shelf consumer health informatics interventions: Recommendations for their design, evaluation and redesign
OpenSesame: An open-source, graphical experiment builder for the social sciences
Design verification through virtual prototyping techniques based on systems engineering
Virtual reality studies outside the laboratory [Conference session]
The health literacy of America's adults: Results from the 2003 National Assessment of Adult Literacy (U.S. Department of Education)
Office of the National Coordinator for Health Information Technology (ONC)
WebGazer: Scalable webcam eye tracking using user interactions [Conference session]
The color nutrition information paradox: Effects of suggested sugar content on food cue reactivity in healthy young women
Health literacy and chronic disease management: Drawing from expert knowledge to set an agenda
Casual immersive viewing with smartphones [Conference session]
Seamless recording of glucometer measurements among older experienced diabetic patients: A study of perception and usability
Extra-laboratorial usability tests: An empirical comparison of remote and classical field testing with lab testing
Building, hosting and recruiting: A brief introduction to running behavioral experiments online
Updated: Labs go quiet as researchers brace for long-term coronavirus disruptions
A prospective, randomized, single-blinded trial for improving health outcomes in rhinology by the use of personalized video recordings
Here, there, anywhere: Remote usability testing that works [Conference session]
Tatool: A Java-based open-source programming framework for psychological studies
Usability testing of medical devices (2nd ed.)
Conducting perception research over the internet: A tutorial review