title: Do People Trust Robots that Learn in the Home?
authors: Moorman, Nina; Gombolay, Matthew
date: 2022-04-08

It is not scalable for assistive robots to have all functionalities pre-programmed prior to user introduction. Instead, it is more realistic for agents to perform supplemental on-site learning. This opportunity to learn user and environment particularities is especially helpful for care robots that assist with individualized caregiving activities in residential or nursing-home environments. Many assistive robots, ranging in complexity from Roomba to Pepper, already conduct some of their learning in the home, observable to the user. We lack an understanding of how witnessing this learning impacts the user. Thus, we propose to assess end-user attitudes towards the concept of embodied robots that conduct some learning in the home, as compared to robots that are delivered fully capable. In this virtual, between-subjects study, we recruit end users (care-givers and care-receivers) from nursing homes and investigate user trust in three different domains: navigation, manipulation, and preparation. Informed by the first study, in which we identify agent learning as a key factor in determining trust, we propose a second study to explore how to modulate that trust. This second, in-person study investigates the effectiveness of apologies, explanations of robot failure, and transparency of learning at improving trust in embodied learning robots.

Care robots currently perform a wide variety of tasks that require some degree of learning at home. Some of these tasks require observation-based learning. For instance, robotic agents such as Roomba [16] and Moxi [37] learn the layout of their environment through observation and exploration. Other robots observe users directly to classify and track user behavior; robotic agents such as PHAROS use this functionality to monitor user well-being and daily exercise [29]. Finally, agents such as the PARO robot perform interaction-based learning of user preferences via straightforward physical feedback [38]. These forms of supplemental learning enable the agent to adapt to its environment and user(s), and offer care-givers and care-receivers the option of being directly involved in the agent's learning to specify preferences or teach the agent new tasks. The customization afforded by this learning is important for ensuring users receive effective, individualized care.

Though in situ learning is already common practice for care robots, we lack an understanding of how observing this learning in the home affects user trust. This work aims to develop a better understanding of how users will respond to embodied learning agents in the home. To do so, we conduct a user study that evaluates user trust in agents that exhibit different levels of at-home learning. We compare trust in three different types of agents. The control condition is a fully pre-programmed agent that does not demonstrate its learning to the user. We additionally choose two types of learning agents: high user involvement and low user involvement. For the high-involvement condition, the agent learns from demonstration (LfD) via kinesthetic teaching. We choose reinforcement learning (RL) to represent the low-involvement condition, in which the user does not interact with the agent during learning.
Through surveys and behavioral metrics, we determine the effect of agent learning on user trust. Informed by this first study, we propose a second user study in which we determine which techniques are most effective in repairing trust in the learning agent. We choose to compare the following three techniques [8], [36]:
• An apology provided directly after the trust violation.
• Transparency of agent learning through a high-level narration of what is learned.
• An explanation of what caused the error, without acknowledging fault, provided directly after the trust violation.
Informed by the results of these two user studies, we develop guidelines for the design of care robot systems that operate in residential or nursing-home environments. To summarize, in our work we propose the following:
1) We study how user trust in a fully pre-engineered agent compares to user trust in an agent that learns in the home.
2) We investigate how to best perform trust repair with respect to embodied agents that learn in the home.
3) We develop guidelines to inform the design of assistive robotic systems deployed in residential environments.

Care robots are defined by their function as robots that support care-givers and/or care-receivers [44]. These robots often operate in residential environments, where they perform a variety of assistive tasks and promote extended independent living [11], [18], [27], [47]. Care robot roles generally fall under physical assistance or medical assistance. Physical assistance includes tasks such as navigation, fall prevention, object manipulation, and household chores. A few notable examples of care robots that perform physical assistance tasks include Moxi [37], Hobbit [13], Relay [14], Care-O-Bot 4 [20], RAMCIP [23], and Lio [32]. Medical assistance includes tasks such as health monitoring, medicine delivery, and the exertion of a social presence for coaching or social interaction [7], [43], [45]. Examples of care robots that perform medical assistance include Nao [39], Pepper [40], and PHAROS [29]. Regardless of whether the care agent performs physical assistance, medical assistance, or both, learning in the home affords the agent an opportunity to observe and adapt to individual user needs and preferences. We wish to understand how this learning impacts users' acceptance of such agents.

Acceptability of care robots depends not only on the benefits the agent can bestow upon the user, but also on the user's perception of and attitudes towards the robot [6]. One of the most important attitudes with respect to acceptability is user trust [25], [46]. Trust is defined as a user's attitude that the agent will help them achieve a goal, specifically in a situation of uncertainty or vulnerability [21], [42]. Prior work in human-automation (HA) trust has categorized trust, based on the extent of interaction with the user, into dispositional, situational, and learned trust [15]. These represent baseline trust in automation [31], trust with respect to a particular interaction [17], and trust developed through a series of interactions [8], respectively. These measures of HA trust can be useful in measuring human-robot (HR) trust, but they do not account for the impact of robot embodiment on trust. In human-robot interactions, it is more meaningful to study trust dependent on agent-specific factors (performance-based) separately from trust dependent on interaction-specific factors (relation-based) [26].
Performance-based trust encompasses robot performance and the user's awareness of the robot's abilities, while relation-based trust considers social factors between the robot and society, including robot appearance, adherence to social norms, and morality. Recent work develops surveys that directly measure these types of HR trust [28]. In effective human-robot interactions, a user's trust in an agent is moderated to avoid over- or under-reliance [41]. This trust calibration can be accomplished via the following techniques: transparency, explanations, trust repair, and trust damping [8]. Trust repair refers to the rebuilding of trust after a trust violation. Common trust repair techniques include apologizing, committing to change, accepting or shifting the blame, denying the violation or gaslighting the user, qualifying or downgrading the severity of the trust violation, and increasing agent transparency [2], [8], [36]. Prior work in human-human trust repair has studied how the type of trust violation affects the difficulty of trust repair in individual and group settings [19]. Similarly, prior work has evaluated the effectiveness of trust repair techniques for human-robot interactions in high-risk, time-critical situations [36]. Our second user study is similar to [36] in that we evaluate how effective different forms of trust repair are in a particular HRI domain. Our work differs from [36] in that we focus on mitigating deficient trust specifically in learning agents that operate in residential environments.

In the context of in-home care, for the agent to be successfully relied upon and adopted, we must understand whether care-givers and care-receivers, as well as their families and other involved individuals, would accept the involvement of care robots [24]. Thus, the target population of this work encompasses both care-givers and care-receivers, where care-receivers are defined to be members of the geriatric population, i.e., adults aged 65 and older. We conduct Study 1 remotely due to the particularly high health risk that in-person user studies impose on elderly adults. As our target population is unlikely to be well represented by a study hosted on Amazon Mechanical Turk, for our first user study we work in conjunction with several nursing homes to recruit participants. We recruit 60 participants: 30 care-givers and 30 care-receivers. As the first user study is a between-subjects study with three conditions, this number of participants allows us to counterbalance participants such that 10 care-givers and 10 care-receivers are assigned to each condition (see the assignment sketch below). Inclusion criteria for participants are the ability to speak and read English and the ability to operate a computer and browse the internet independently. Remote studies may not fully capture the impact of the agent's embodiment on trust; thus, we conduct Study 2 in person. As our second user study is conducted in person, imposing a higher health risk on our target population, we recruit 50 participants from both nursing homes and the general population. As this study is a within-subjects user study, all participants experience each condition. We account for the inclusion of participants outside of our target population in our multivariate regression analysis.

First, we wish to establish whether there is a difference between user trust in fully pre-programmed agents and user trust in agents that conduct some learning in the home.
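Before detailing the study design, we illustrate the counterbalanced assignment described above with a minimal Python sketch; the participant identifiers, condition labels, and random seeds are hypothetical placeholders rather than part of our protocol.

```python
import random

# Hypothetical participant pools for Study 1 (placeholder IDs).
CONDITIONS = ["control", "low_involvement_rl", "high_involvement_lfd"]
care_givers = [f"CG{i:02d}" for i in range(1, 31)]
care_receivers = [f"CR{i:02d}" for i in range(1, 31)]

def counterbalance(participants, conditions, seed):
    """Shuffle a pool and split it evenly across the conditions."""
    rng = random.Random(seed)
    shuffled = rng.sample(participants, len(participants))
    per_condition = len(participants) // len(conditions)
    return {cond: shuffled[i * per_condition:(i + 1) * per_condition]
            for i, cond in enumerate(conditions)}

# Assign each pool separately so every condition receives
# 10 care-givers and 10 care-receivers (20 participants total).
giver_groups = counterbalance(care_givers, CONDITIONS, seed=0)
receiver_groups = counterbalance(care_receivers, CONDITIONS, seed=1)
assignment = {cond: giver_groups[cond] + receiver_groups[cond]
              for cond in CONDITIONS}

for cond, group in assignment.items():
    print(cond, len(group))  # -> 20 participants per condition
```

Stratifying the shuffle by participant type in this way keeps the care-giver/care-receiver ratio identical across conditions, which is what allows the between-subjects comparison to attribute trust differences to the learning manipulation rather than to the population mix.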
In this first, between-subjects study, we isolate learning as the independent variable by keeping performance constant. The only difference in participant experience between the agent conditions is the addition of a video showing the agent learning, shown prior to observing agent performance in the learning-agent conditions.

Research Question 1: Is trust in an agent that is pre-programmed different from trust in an agent that additionally learns in the home?

Given that many care robots already perform learning in residential settings, we investigate how to best mitigate deficient trust in embodied learning agents.

Research Question 2: How can we best modulate trust in agents that learn in the home?

The list of surveys administered via Qualtrics for the virtual study is available in Table I. Note that the surveys we employ remain the same for both user studies.

Table I: Surveys administered in each study phase.
Mid-Study: Situational Trust [17], Performance-based Trust [28], Risk [12]
Post-Study: User Assumptions, Extent of User Adoption

In the post-study questionnaire, we ask two hand-crafted questions. First, we ask the participant which tasks, from a list of tasks both inside and outside the distribution of tasks observed in the study, they would trust the agent to do. This question measures the extent of user adoption. Second, we ask an open-ended question about the participant's understanding of the agent's learning and general perception of agent competence. This question collects qualitative information about user assumptions regarding the agent's learning and competence. We evaluate a user's dispositional trust, situational trust, and performance-based trust through surveys [17], [28], [31]. We do not evaluate relation-based trust, as we do not alter how the agent adheres to social norms in this study. We also study user trust through behavioral measures of reliance on and compliance with the agent, via users' average intervention rates while observing the agent's behavior on the test task of each domain (sketched below).

The virtual user study, designed to address RQ1, will be a between-subjects experiment, where participants are assigned to one of three conditions:
• Control: agent is fully pre-programmed.
• Low-Involvement: partially pre-programmed agent supplemented with RL.
• High-Involvement: partially pre-programmed agent supplemented with LfD via kinesthetic teaching.
Note that in the low-involvement condition we do not explain how the agent knows the outcome of each trial-and-error attempt, but the reward function is implied to be obtained from the environment.

The design of the first user study is as follows.
1) Introduction to the agent.
2) Pre-study questionnaire.
3) For each of the three domains:
   a) In the learning conditions, the participant observes a video of the agent learning the train task in the home.
   b) The participant observes a video of the agent's final performance on the train task. This video is the same for all conditions.
   c) The participant then observes 10 video clips of the agent executing the test task. For each of these 10 clips, they are told that they can intervene by pressing a red STOP button if they no longer trust the agent to succeed or feel the agent will fail. Seven of the ten video clips are failed attempts at the task by the agent, and three are successes.
   d) Mid-study questionnaire.
4) Post-study questionnaire.

The in-person user study, designed to address RQ2, will be a within-subjects experiment in the domain of highest effect size from the first user study. All participants in this study will interact with the learning agent from Study 1 that had the lowest average user trust.
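Both studies log the red-STOP interventions described above. As a sketch of how the behavioral trust measure could be computed from such a log, the Python snippet below derives per-participant intervention rates; the data layout and column names are hypothetical and are only meant to illustrate the metric.

```python
import pandas as pd

# Hypothetical intervention log: one row per participant per test clip,
# recording whether the clip showed a failure and whether STOP was pressed.
log = pd.DataFrame({
    "participant":  [1] * 10 + [2] * 10,
    "clip_failed":  ([True] * 7 + [False] * 3) * 2,
    "pressed_stop": [True, True, False, True, False, True, True, False, False, False,
                     False, True, False, False, False, True, False, False, True, False],
})

def behavioral_trust_measures(df: pd.DataFrame) -> pd.DataFrame:
    """Per-participant intervention rate, split by clip outcome."""
    grouped = df.groupby("participant")
    return pd.DataFrame({
        # Overall willingness to intervene across the ten clips.
        "intervention_rate": grouped["pressed_stop"].mean(),
        # Fraction of the seven failure clips that were stopped (appropriate distrust).
        "stop_rate_on_failures": grouped.apply(
            lambda g: g.loc[g["clip_failed"], "pressed_stop"].mean()),
        # Fraction of the three success clips that were stopped (over-intervention).
        "stop_rate_on_successes": grouped.apply(
            lambda g: g.loc[~g["clip_failed"], "pressed_stop"].mean()),
    })

print(behavioral_trust_measures(log))
```

Differences in these rates across conditions serve as the behavioral complement to the survey-based trust measures, with stops on failure clips reflecting calibrated distrust and stops on success clips reflecting under-reliance.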
In this second, within-subjects user study, each participant experiences all four of the following conditions:
• Control: no technique employed to improve trust in the embodied learning agent.
• Transparency: transparency of robot learning, through a high-level narration of what is being learned.
• Explanation: explanation of what caused the error, without acknowledging fault, provided directly after the trust violation.
• Apology: apology provided directly after the trust violation.
The explanations and transparency provided in these techniques are hand-crafted rather than autonomously generated.

The design of the second user study is as follows.
1) Introduction to the learning agent. The participant is shown a video of a human un-boxing the agent, narrated with a summary of the agent's pre-engineered abilities.
2) Pre-study questionnaire.
3) For each of the four conditions (control, transparency, explanation, apology):
   a) The participant observes a video of the agent learning the train task in the home. In the transparency condition, this video is narrated with high-level descriptions of what is being learned.
   b) The participant observes a video of the agent's final performance on the train task. This video is the same for all conditions.
   c) The participant then observes 10 video clips of the agent executing the test task. For each of these 10 clips, they are told that they can intervene by pressing a red STOP button if they no longer trust the agent to succeed or feel the agent will fail. Seven of the ten video clips are failed attempts at the task by the agent, and three are successes. After each of these 10 video clips, if the video was of an unsuccessful attempt at the test task, the user receives an apology or explanation in the respective conditions.

In both user studies, participants are shown videos of agent learning as well as agent performance on the train and test tasks. We employ AI2-THOR's iTHOR environment [22] and the ManipulaTHOR environment [10] to create these videos with a Wizard-of-Oz policy using the LoCoBot agent. We choose to manipulate the agent using a Wizard-of-Oz policy rather than actual learned policies, as we are isolating agent learning as the independent variable and wish to control for performance and behavioral differences between agents trained with different policies.

As seen in Table II, we choose three tasks for the first user study, one for each of three domains: a navigation domain, a manipulation domain, and a preparation domain. We pick these tasks as there is a demonstrated need for care robots that can assist with them [11], [30], [33], [35]. Recall that we intend to measure trust in each domain. As trust is defined in situations of uncertainty or risk, we intentionally inject risk into the test task of each domain. Accordingly, the navigation train task entails navigating between rooms, while the test task has the agent physically guiding a user between different rooms, all the while avoiding obstacles. The manipulation task consists of pouring liquids into cups without spilling; we increase risk in the test task by changing the liquid from lukewarm to scalding water. Finally, in the preparation task the agent learns to dispense and distribute multivitamins to the user. The riskier test task is then to dispense and distribute pain medicine to the user. For the second user study, we recreate the domain of highest effect size from the first user study in a laboratory setting using a LoCoBot robot.
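As an illustration of how such stimulus videos could be produced, the sketch below drives the LoCoBot agent in AI2-THOR's iTHOR environment with a hand-scripted (Wizard-of-Oz) action sequence and writes the rendered frames to a clip. The scene name, action list, frame rate, and output path are placeholders, and the agentMode and action names reflect recent ai2thor releases and may differ across versions.

```python
from ai2thor.controller import Controller
import imageio.v2 as imageio

# Launch iTHOR with the LoCoBot agent (assumed agentMode; placeholder scene).
controller = Controller(agentMode="locobot", scene="FloorPlan1",
                        gridSize=0.25, width=640, height=480)

# Hand-scripted actions standing in for "learned" navigation behavior,
# mirroring the Wizard-of-Oz policy described above.
scripted_actions = [
    {"action": "MoveAhead"},
    {"action": "RotateRight", "degrees": 30},
    {"action": "MoveAhead"},
    {"action": "RotateLeft", "degrees": 30},
    {"action": "MoveAhead"},
]

frames = []
for step in scripted_actions:
    event = controller.step(**step)
    frames.append(event.frame)  # RGB frame rendered after each action

controller.stop()

# Assemble the frames into a short stimulus clip (requires imageio-ffmpeg).
imageio.mimsave("navigation_test_clip.mp4", frames, fps=2)
```

Because the same scripted sequence can be replayed for every condition, agent performance in the resulting videos is identical regardless of the learning method being depicted, which is what keeps performance constant while learning is manipulated.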
When demonstrating agent learning, we employ a Wizard-of-Oz policy and portray learned agent behavior and performance as identical irrespective of the type of learning, though in practice different learning approaches may engender different agent behavior. We do so to isolate learning as the independent variable. We assume that the resulting user trust will still be pertinent to more realistic engineer-taught and learning agents. Additionally, as is often done in model-based diagnosis work [3], we assume full knowledge of the causes of errors in order to provide explanations. Finally, though the embodied agent in the first user study is represented virtually, we assume the results can still motivate our second, in-person user study by establishing whether trust in embodied learning agents is deficient.

One limitation of our work is that our research is cross-sectional. While we recruit individuals from our target population for the initial, virtual study, due to COVID concerns we are unable to recruit solely from this target population for our secondary, in-person study. We also choose to hand-craft the explanations rather than generate them autonomously, as the autonomous generation of effective explanations remains an open problem; we choose not to focus on this research question, as we wish to inform the design of care agents beyond current technical limitations. Another limitation of our work is that, in our first study, we measure trust based upon a subject's ability to imagine being vulnerable to a system, as the risk in a virtual environment is inherently hypothetical.

Future work includes recruiting our target population, carrying out the two user studies, analyzing the resulting data, and developing guidelines to inform the design of care robots that learn in residential settings.

In this work, we propose a user study to evaluate user trust in embodied agents that conduct some of their learning in the home. Informed by these findings, we propose a second user study to investigate how we can best improve deficient trust in embodied learning robots. We then develop guidelines that inform the design of care robots deployed in the home. Our goal is to assess care-giver and care-receiver attitudes towards the concept of embodied care robots that learn in the home, as compared to robots that are delivered fully capable.

References
College Majors and Occupational Choices
Toward an Understanding of Trust Repair in Human-Robot Interaction: Current Research and Future Directions
A Tale of Two Suggestions: Action and Diagnosis Recommendations for Responding to Robot Failure
Measurement Instruments for the Anthropomorphism, Animacy, Likeability, Perceived Intelligence, and Perceived Safety of Robots
Integrating trust and personal values into the Technology Acceptance Model: The case of e-government services adoption
Psychological Implications of Domestic Assistive Technology for the Elderly
GiraffPlus: Combining social interaction and long term monitoring for promoting independent living
Towards a Theory of Longitudinal Trust Calibration in Human-Robot Teams
The Mini-IPIP Scales: Tiny-yet-Effective Measures of the Big Five Factors of Personality
ManipulaTHOR: A Framework for Visual Object Manipulation
Assistive robots to improve the independent living of older persons: results from a needs study
How Safe Is Safe Enough? A Psychometric Study of Attitudes Toward Technological Risks and Benefits
Hobbit, a care robot supporting independent living at home: First prototype and lessons learned
Autonomous Service Robot For Hospitals - Savioke Relay
Trust in Automation: Integrating Empirical Evidence on Factors That Influence Trust
Robot Vacuum and Mop
Foundations for an Empirically Determined Scale of Trust in Automated Systems
Socially Assistive Robots: A Comprehensive Approach to Extending Independent Living
Repairing trust with individuals vs. groups. Organizational Behavior and Human Decision Processes
Let me Introduce Myself: I am Care-O-Bot 4, a Gentleman Robot
Measurement of Trust in Automation: A Narrative Review and Reference Guide
AI2-THOR: An Interactive 3D Environment for Visual AI
RAMCIP: Towards a Robotic Assistant to Support Elderly with Mild Cognitive Impairments at Home
A Survey of Robots in Healthcare
Trust in socially assistive robots: Considerations for use in rehabilitation
Chapter 2 - Trust: Recent concepts and evaluations in human-robot interaction
Reframing Assistive Robots to Promote Successful Aging
A multidimensional conception and measure of human-robot trust
PHAROS 2.0 - A PHysical Assistant RObot System Improved
A concept of needs-oriented design and evaluation of assistive robots based on ICF
I Trust It, but I Don't Know Why: Effects of Implicit Attitudes Toward Automation on Trust in an Automated System
Lio - A Personal Robot Assistant for Human-Robot Interaction and Care Applications
Towards robotic assistants in nursing homes: Challenges and results
Correlates of Computer Anxiety in College Students
Timing is Key for Robot Trust Repair
SoftBank Robotics. NAO the humanoid and programmable robot
Pepper the humanoid and programmable robot
Evaluating Trust and Safety in HRI: Practical Issues and Ethical Challenges
What Does it Mean to Trust a Robot? Steps Toward a Multidimensional Measure of Trust
A Holistic Approach to Behavior Adaptation for Socially Assistive Robots
Designing robots for care: care centered value-sensitive design
Assistive Robots for the Elderly: Innovative Tools to Gather Health Relevant Data
You Want Me to Trust a ROBOT? The Development of a Human-Robot Interaction Trust Scale
A Robot of My Own: Participatory Design of Socially Assistive Robots for Independently Living Older Adults Diagnosed with Depression