Title: Understanding Rapport over Multiple Sessions with a Social, Teachable Robot
Authors: Xiaoyi Tian, Nichola Lubold, Leah Friedman, Erin Walker
Date: 2020-06-10
Journal: Artificial Intelligence in Education
DOI: 10.1007/978-3-030-52240-7_58

Abstract: Social robots have been shown to be effective educational tools. Rapport, or interpersonal closeness, can lead to better human-robot interactions and positive learning outcomes. Prior research has investigated the effects of social robots on student rapport and learning in a single session, but little is known about how individuals build rapport with a robot over multiple sessions. We report on a case study in which 7 middle school students explained mathematics concepts to an intelligent teachable robot named Emma over five sessions. We modeled learners' rapport-building linguistic strategies to understand whether the ways middle school students build rapport with the robot over time follow the same trends as human conversation, and how individual differences might mediate the rapport between human and robot.

Intelligent social robots have been shown to have positive effects on learning and motivational outcomes [5, 12, 15], in part because of the socio-emotional support they provide [9, 11, 12]. One mechanism that may contribute to these positive effects is rapport, or a feeling of connection, that social robots engender with their human collaborators. However, over time the nature of the relationship between the human and robot might shift (as human-human relationships do), and the importance of rapport may change [19]. Most research on human-robot rapport has been done in single-session studies [9, 13, 14], and has rarely investigated how learners develop and maintain rapport with a robot.
Understanding how children build and maintain relationships during multiple encounters would help maintain engagement and personalize long-term learning experiences. A widely accepted human-human rapport framework comes from Tickle-Degnen and Rosenthal's three-factor theory [19], which includes mutual attention, positivity, and coordination. People start building rapport by expressing mutual attentiveness and interest toward one another. High positivity plays a role in generating a feeling of mutual friendliness and warmth, but in the initial stage of an interaction there may be less coordination (the degree to which interlocutors are "in sync" with one another). Over the long term, positivity decreases, coordination increases, and mutual attentiveness remains stable. It is not clear that the same phenomena can be observed in human-robot settings. Thus, our study aims to understand how students verbally build and maintain rapport with a robot over multiple sessions. We conducted an exploratory analysis of 7 middle school students interacting with a social teachable robot over 5 sessions. Our research question was: How do students differ from each other, and from early to late interaction stages, in the way that they build rapport with a teachable robot?

For this study, middle school students taught a Nao robot named Emma how to solve mathematics problems using spoken language [10]. Students sat at a desk with a Surface Pro tablet in front of them; Emma stood on the desk to the right of the participant. Table 1 shows an example exchange between Emma and a learner on a ratio and proportions problem. More details on the system design can be found in [10]. Over multiple sessions, Emma mimicked Tickle-Degnen and Rosenthal's [19] model of rapport as follows. To implement coordination, we utilized an acoustic-prosodic entrainment module [9], which transforms Emma's utterances to converge to the user's pitch; entrainment increased over the five sessions.
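The entrainment module itself is described in [9]; as a minimal illustration of the idea, the sketch below shifts the robot's target pitch toward the user's measured pitch by a session-dependent weight. The function name and the linear convergence schedule are assumptions for illustration, not the published implementation.

```python
def converged_pitch(robot_pitch_hz: float, user_pitch_hz: float,
                    session: int, n_sessions: int = 5) -> float:
    """Shift the robot's target pitch toward the user's pitch.

    The convergence weight grows linearly with the session index, so
    entrainment increases across sessions: weight 0 would leave the
    robot's default pitch untouched, weight 1 would copy the user's
    pitch exactly.
    """
    weight = session / n_sessions
    return robot_pitch_hz + weight * (user_pitch_hz - robot_pitch_hz)

# Hypothetical values: robot default 220 Hz, user speaking around 180 Hz.
print(converged_pitch(220.0, 180.0, session=1))  # mild convergence, near 220 Hz
print(converged_pitch(220.0, 180.0, session=5))  # full convergence to 180.0 Hz
```

A real entrainment module would of course operate on estimated pitch contours rather than a single scalar; this only illustrates the convergence schedule.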
Emma exhibited higher positivity in the initial sessions through greater politeness and more enthusiastic language (e.g., "Great! Thank you for teaching me") than in later sessions. We operationalized attention as gaze behavior and did not change Emma's default gaze behavior across the sessions.

Participants were 7 middle school students (4 female, 3 male) with a mean age of 12.7. Each participant interacted with Emma for five 30-min sessions over several weeks. We grouped sessions 1, 2, and 3 as the early interaction stage, and sessions 4 and 5 as the late stage. Participants solved 4-6 problems during each study session, resulting in 186 independent problems in the corpus, with 10.06 user utterances per problem on average. Two coders manually coded conversational strategies indicating behavioral rapport in each utterance of the human-robot tutoring dialogue (Cohen's kappa for all codes was higher than 0.8). The strategies consisted of off-topic chat, inclusive pronouns (e.g., use of "we" vs. "I"), use of Emma's name, praise, apology, referring to past experience, asking a question, responding to Emma's prompt, and adherence to social norms, drawn from both human-human and human-robot rapport studies [1, 3, 7, 8, 17, 20]. Examples of the codes can be found in Table 1. To supplement our manual codes, we incorporated automatic linguistic feature detection using the LIWC2015 [16] summary language variables (analytical thinking, clout, authenticity, and emotional tone).

We used Independent Component Analysis (ICA) to map strategies to three rapport factors, revealing how particular behaviors are used to express and manage the underlying rapport-building constructs with the robot [2, 4]. Linguistic strategies that loaded strongly on factor 1 were inclusive language, name usage, apology, and clout; we interpret this factor as attentiveness. Name usage, praise, authenticity, and emotional tone loaded on factor 2, which we interpret as positivity.
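The ICA mapping from per-problem strategy features to three latent factors can be sketched with scikit-learn's FastICA. The feature names and random data below are placeholders standing in for the study's coded corpus, not its actual inputs.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)

# Rows: 186 participant-problem pairs. Columns: per-problem frequencies of
# the coded strategies plus LIWC summary variables (placeholder names).
features = ["inclusive_pronoun", "name_usage", "apology", "clout",
            "praise", "authenticity", "emotional_tone",
            "chat", "adhere_norm", "respond", "ask_question"]
X = rng.random((186, len(features)))   # stand-in for the real feature matrix

ica = FastICA(n_components=3, random_state=0)
S = ica.fit_transform(X)   # source matrix: one score per (pair, factor)
A = ica.mixing_            # loadings: shape (n_features, 3)

# Loadings show which strategies express each latent factor; the study
# interprets the components as attentiveness, positivity, and coordination.
for k in range(3):
    top = sorted(zip(features, A[:, k]), key=lambda t: -abs(t[1]))[:4]
    print(f"factor {k}:", [name for name, _ in top])
```

The source matrix `S` is what the per-stage rapport scores are averaged from; with random input the loadings here are of course meaningless, only the shapes and workflow carry over.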
Markers of off-task coordination (chat and adherence to norms) loaded strongly positively on factor 3, while markers of on-task coordination (responsiveness and asking questions) loaded strongly negatively. We nevertheless interpret this factor as coordination, with on-task and off-task coordination interestingly being negatively related. Finally, asking a question and referring to past experience loaded evenly across all factors.

Our next step was to understand how rapport varies from the early to the late stage of interaction. From our ICA model, we computed the source matrix of the 3 extracted rapport components for each of the 186 participant-problem pairs, and aggregated the mean rapport score across all problems in each interaction stage. The results are shown in Fig. 1. Attentiveness went up for the majority of participants. Positivity varied from participant to participant, with some individuals who started from a high score seeing dramatic increases (p2 and p9). Coordination decreased in four out of seven learners, with large variations (e.g., p5). It is worth noting that a decrease in the coordination score meant more on-task behaviors, and thus coordination was the only factor to align with human-human rapport theory.

Based on learners' rapport trends and their degree of variation, we clustered them into groups. The flat cluster, including p1, p3, p6, and p7, tended to adopt one favorable strategy at the beginning of the interaction and stick with it over the course of the sessions. These users did not perceive their interaction mode changing over time ("At first I already feel how I was with Emma, I just kept that going with the routine." -p6). P2 and p9 were grouped as an increasing cluster, and p5 was a single decreasing case. These users had not interacted with a robot or AI system before but had different expectations and perceptions of Emma. For example, p2 believed Emma was friendlier than the robots he had seen in movies.
Over the sessions, p2 praised Emma more (frequency from 2% to 8%) and his apologizing behavior disappeared. Similarly, p9 stopped asking questions in later sessions. The disappearance of these social strategies suggests the students started to adhere to a "personal norm" (no apologies, no questions) rather than sociocultural norms [18]. This is a sign that the relationship between the "increasing" participants and Emma had moved toward a more friend-like dyad [20]. On the other hand, p5, the decreasing case, had low expectations of Emma's intelligence and socialness ("She's a robot...I don't think she would have background information or whatever.").

It is important to note that participants' behaviors from session to session are due not only to their rapport states but also to contextual factors such as energy or mood. For example, in session 3, p6 seldom offered further elaboration beyond saying "Yes", seeming bored with the problems or upset on that study day. In session 5, he was very engaged in the task and was wordier and more responsive ("Yes, but we also can convert it into a decimal, which is 0.125.").

Our goal was to investigate how middle school learners manage rapport with robots over multiple tutoring sessions. We demonstrated that changes in rapport from early to late interaction stages in human-robot tutoring did not follow the same trends as human rapport theory [19]. The variation between individuals in positivity and the increase of attentiveness over time suggest that users may shift how they express rapport as their expectations of Emma change. This corresponded to the contrast between users with flat rapport trends, who tended to stick with the same linguistic strategies, and users with either increasing or decreasing rapport trends, who articulated evolving perceptions of Emma.
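The flat/increasing/decreasing grouping could be approximated by a simple heuristic on each learner's early-to-late change in factor score. The threshold below is an assumption chosen for illustration, not the clustering procedure used in the study.

```python
def trend_cluster(early_mean: float, late_mean: float,
                  flat_band: float = 0.25) -> str:
    """Label a learner's rapport trajectory by the early-to-late change.

    `flat_band` is an illustrative threshold: absolute changes smaller
    than it count as flat; larger changes are labeled by their sign.
    """
    delta = late_mean - early_mean
    if abs(delta) < flat_band:
        return "flat"
    return "increasing" if delta > 0 else "decreasing"

# Hypothetical per-stage mean factor scores for three learners.
print(trend_cluster(0.10, 0.15))   # flat
print(trend_cluster(0.20, 0.90))   # increasing
print(trend_cluster(0.50, -0.40))  # decreasing
```

With only seven learners and visible session-to-session noise, any such rule is fragile; the study's grouping also weighed the degree of variation, which this sketch ignores.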
Given the cross-session variability of individuals, and the fact that the majority of rapport studies focus on "instant" rapport [6], it is critical to conduct multiple-session studies to understand more about human-robot rapport dynamics. This work is a first step towards personalizing rapport-based learning experiences over long-term human-robot interactions.

Acknowledgements. This work is supported by the National Robotics Initiative and the National Science Foundation, grant #CISE-IIS-1637809. We would like to thank Mesut Erhan Unal for his guidance in the ICA modeling process.

References
1. Rapport building in student group work
2. Probabilistic independent component analysis for functional magnetic resonance imaging
3. Identification of social acts in dialogue
4. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis
5. Can children catch curiosity from a social robot?
6. Virtual rapport
7. Rapport-building behaviors used by retail employees
8. Using nonconscious behavioral mimicry to create affiliation and rapport
9. Automated pitch convergence improves learning in a social, teachable robot for middle school mathematics
10. Comfort with robots influences rapport with a social, entraining teachable robot
11. Facial features for affective state detection in learning environments
12. Supporting interest in science learning with a social robot
13. "Oh dear Stacy!" Social interaction, elaboration, and learning with teachable agents
14. Rudeness and rapport: insults and learning gains in peer tutoring
15. Growing growth mindset with a social robot peer
16. The development and psychometric properties of LIWC2015
17. Investigating people's rapport building and hindering behaviors when working with a collaborative robot
18. (Im)politeness, face and perceptions of rapport: unpackaging their bases and interrelationships
19. The nature of rapport and its nonverbal correlates
20. Towards a dyadic computational model of rapport management for human-virtual agent interaction