Conversational AI Systems for Social Good: Opportunities and Challenges

Peng Qi, Jing Huang, Youzheng Wu, Xiaodong He, Bowen Zhou

Date: 2021-05-13

Abstract: Conversational artificial intelligence (ConvAI) systems have attracted much academic and commercial attention recently, making significant progress on both fronts. However, little existing work discusses how these systems can be developed and deployed for social good in real-world applications, with comprehensive case studies and analyses of pros and cons. In this paper, we briefly review the progress the community has made towards better ConvAI systems and reflect on how existing technologies can help advance social good initiatives from various angles that are unique to ConvAI, or have not yet become common knowledge in the community. We further discuss the challenges ahead for ConvAI systems to better help us achieve these goals and highlight the risks involved in their development and deployment in the real world.

Conversational artificial intelligence (ConvAI) systems, or dialogue systems, are increasingly prevalent in our everyday lives, taking the form of smart assistants for home and handheld devices (e.g., Amazon's Alexa, Apple's Siri, Google Assistant) or friendly helpers with telephone banking or online customer service, to name a few. Contemporaneously, there has been a renewed interest in ConvAI from the research community, evidenced by the enthusiasm around the ConvAI challenge 1 and the growing number of publications in the ACL Anthology that mention ConvAI-themed keywords in their titles (see Figure 1). Aside from the recent technical advances, we recognize two important reasons for their popularity.
First, ConvAI systems provide a natural language user interface (LUI) for their underlying applications, requiring little to no prior training on the users' part. In an ideal world, ConvAI technology would help us build LUIs that allow users to convey their needs as easily as they would with other people. Second, with the help of increasingly robust automatic speech recognition (ASR) and speech synthesis (or text-to-speech, TTS) systems, ConvAI systems are far easier and less expensive to access than many other technical solutions. One could gain access to a ConvAI system as long as they can access a telephone, without needing Internet access, smart devices, or even a digital cellular network.

Copyright © 2022

Figure 1: Statistics of papers in the ACL Anthology 2 that mention "dialog(ue)", "chat", or "conversation(al)" explicitly in the title over the past five decades (as of August 2021). This demonstrates a renewed interest in ConvAI from the ACL community since the turn of the millennium, followed by a sharp growth in the past five years.

Both reasons also make ConvAI a great candidate technology for social good, because these properties help the underlying technical solution reach a much broader population at virtually no additional cost to society, and with no delay in implementation for the communities to be served. However, despite these inherent advantages and the growing interest in ConvAI, there has been, to the best of our knowledge, little existing work that discusses how these systems and technologies can be developed and used for social good. In this paper, we aim to better understand how conversational AI technologies and systems can be developed and deployed for social good. To this end, we begin by briefly reviewing the community's efforts and progress on ConvAI over the past few decades.
We then explore how existing technology can be, or has been, applied to various scenarios to advance the United Nations' Sustainable Development Goals (SDGs) for social good. We illustrate specific use cases with concrete examples, with a focus on application scenarios that are far from common knowledge. Finally, we conclude with reflections on the potential challenges and risks of developing and deploying new ConvAI technologies in the real world, focusing on their societal impact and how one might mitigate the risks.

Communicating with computers in human languages has been a long-standing goal of computer science since its early days. From intuitively named commands to more recent conversational AI systems, effective use of computer systems has been made increasingly accessible to more users. For our discussion, we adopt a deliberately broad definition of the term "Conversational AI": any system or technology that allows human users to interact with computers via natural language. This definition encompasses most language-related interactive systems, including those based on audible speech interactions (e.g., Carnegie Mellon University's Let's Go! system for bus information; Raux et al. 2005).

Many early ConvAI systems give one of the conversationalists exclusive agency to drive the conversation. System-initiative ConvAI agents typically follow a predefined dialogue plan and offer users limited options. As a result, these systems tend to have an easier time understanding users as long as they are cooperative (e.g., automated banking services). On the other hand, user-initiative systems focus on responding or reacting to user requests. These systems can perform specific tasks at the user's request (e.g., Winograd's (1971) SHRDLU, which manipulates geometric shapes in a block world following natural language instructions in a dialogue, or question answering systems) or keep the user company and soothed (Folk 2021).
Recent work has placed more emphasis on mixed-initiative dialogues, where interlocutors take turns directing the conversation. This is also reflected in many ConvAI systems that serve a large number of users, such as XiaoIce (Zhou et al. 2020) and Alexa Prize socialbots (Gabriel et al. 2020).

The community has also witnessed an evolution in the format of conversations these systems are prepared to engage in. Task-oriented conversations remain a popular problem to tackle, as they help human users achieve concrete goals, such as booking a restaurant or turning on the light (e.g., conversations in the MultiWOZ dataset; Budzianowski et al. 2018). Similar are experiment-based dialogues, where users perform experiments within an environment with the help of a ConvAI system (e.g., SHRDLU or data analytics LUIs). These conversations often take place in a relatively closed world and involve well-framed problems. On the other end of the spectrum, chitchat ConvAI systems (or chatbots) engage and entertain the user without a predefined agenda, goal, or even limit on topics. More recent research has also explored practical conversations across the spectrum of goal specificity, domain openness, and levels of grounding. Some take the familiar form of a ConvAI system assisting human users (e.g., knowledge-based ConvAI, including question answering systems; Choi et al. 2018), while others focus more on conversational skills featured in collaborative or competitive peer conversations (He et al. 2017).

Before we move on to exploring how ConvAI systems can be applied for social good, we must first answer the following questions: once we have built a ConvAI system, what effects can we expect it to have on the world around us? What properties make it more appealing than, say, its human counterparts?

One of the most basic properties that makes ConvAI systems potentially more appealing than human operators is that they are more readily available and scalable.
This makes it possible for ConvAI services to be accessible outside typical working hours, and allows them to cater to more users at the same time at virtually no marginal cost. As a result, ConvAI systems can be perfect alternatives to help us gather information from, or disseminate information to, a large population for social benefit, with minimal requirements on the technological infrastructure. What is more, ConvAI systems can potentially be more easily personalized than human operators in the future when serving a large population (e.g., adjusting the volume and/or rate of speech if the user has difficulty hearing, or remembering a user's profile, with consent, for a more personal connection). With data properly handled, they can also be better at preserving the privacy of the individuals using them. This makes them desirable alternatives to human interlocutors, especially when the user perceives risks of personal repercussions or social stigma.

In this section, we focus on discussing and illustrating how existing ConvAI technologies and systems can be applied for social good. The section is organized around how ConvAI systems can help us approach the UN's Sustainable Development Goals (SDGs). For each SDG, we focus on providing examples that are salient but sometimes lesser-known to illustrate how today's ConvAI technology can be applied; however, we acknowledge that this is far from a comprehensive review. In an effort to avoid the pitfall of technological solutionism, we focus on the SDGs on which we see ConvAI technology having a more direct impact. For each application, we also analyze the limitations and boundaries of these ConvAI systems, and offer cautionary notes about areas of future development and potential risks when applicable.

As the world is swept by the COVID-19 pandemic and the quarantine measures in response, health and mental well-being are increasingly becoming central issues on people's minds.
Many in the technology community wonder what we can do to help improve the situation, if anything.

Smart assistants can help inform the public. The smart voice assistants in many homes are one way through which ConvAI technologies can help inform the public about practical guidelines in this global health crisis. ConvAI agents can act as a source of answers to frequently asked questions regarding the pandemic, such as "What are the common symptoms of COVID-19?" and "What are some best practices to prevent the spread of COVID-19?" (Eddy 2020).

Notes. This user-initiative, knowledge-based interface is only possible when we are confident that the scope of questions can be relatively restricted, and that they can be answered from trusted sources. That being said, it remains a technical challenge for current QA systems to know when to abstain because questions are out of scope. Thus, such systems should always consider redirecting users to healthcare providers for further assistance and more accurate information.

Figure 2: Two illustrative examples of a pandemic survey over the phone using a ConvAI system. Besides gathering information for the survey (a), the system can be equipped with relatively simple natural language understanding and/or emotion detection tools to help offer appeasing messages and helpful advice and information (b).

More effective and equitable public health measures via telephone. Effective and equitable public health measures should be able to reach a much broader population than just those who have access to stable Internet connections and smart devices. Telephone surveys are an irreplaceable means for understanding the spread and effect of the pandemic (World Bank 2021). Timely updates covering a critical mass of the population are crucial to the control and monitoring of a rapidly developing pandemic, but collecting them is also tedious and labor-intensive.
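The abstention behavior recommended in the smart-assistant notes above can be made concrete with a minimal sketch: the system only answers when the best match against a trusted FAQ bank clears a similarity threshold, and otherwise redirects the caller to a healthcare provider. The FAQ entries, the token-overlap similarity measure, and the threshold below are illustrative assumptions, not the design of any deployed system:

```python
# Minimal FAQ-style QA with abstention: answer only when the best
# match in a trusted question bank is confident enough; otherwise
# redirect the user to a healthcare provider.

FAQ = {
    "what are the common symptoms of covid-19":
        "Common symptoms include fever, cough, and fatigue.",
    "how can i prevent the spread of covid-19":
        "Wash hands often, wear a mask, and keep physical distance.",
}

REDIRECT = "I'm not sure; please contact your healthcare provider."


def jaccard(a, b):
    """Token-overlap similarity between two questions."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)


def answer(question, threshold=0.5):
    """Return a trusted answer, or abstain on out-of-scope questions."""
    best_q = max(FAQ, key=lambda q: jaccard(question, q))
    if jaccard(question, best_q) >= threshold:
        return FAQ[best_q]
    return REDIRECT  # abstain rather than guess
```

A production system would replace the toy similarity with a calibrated retrieval or QA model, but the control flow (answer only above a confidence threshold, redirect otherwise) is the point of the sketch.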
One potential way that ConvAI systems can help improve this (and similar future) situation is by conducting automated surveys over the phone. Since surveys are usually designed to be well-structured, with a relatively one-sided flow of information, a ConvAI system can engage in system-initiative conversations with a large number of users and collect their responses relatively easily (see Figure 2(a) for an example). With the help of ASR technologies, such systems can further be made to interact via natural language (instead of the keypad), making them accessible to an even broader demographic.

Notes. While accurate ASR systems require a significant amount of computational resources to serve, it is relatively easy to transmit the speech signal for centralized processing. However, to truly realize the democratizing potential of telephone access, ASR systems do need to be robust to low-bandwidth audio, noises and disruptions caused by unreliable cellular coverage, and underrepresented accents.

Public health is not just about physical health. Aside from physical health, mental well-being is also of great importance, though public awareness, discovery, and intervention still lag behind (World Health Organization 2013). With the right set of tools, ConvAI systems have great potential to help us uncover potential mental health issues (stress, anxiety, and depression, among others), and even administer the proper intervention. As the pandemic exerts great mental stress on the public (World Health Organization 2020), one need look no further than our COVID-19 survey example to find a potential candidate (see Figure 2(b)).

Notes. Combining the capabilities in the previous two examples, a mixed-initiative conversation can help build rapport by actively listening to and addressing the user's concerns. However, this does result in a significantly larger set of possible system states, and therefore requires careful auditing before deployment.
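The kind of system-initiative survey with lightweight emotion detection sketched in Figure 2 can be approximated with a scripted question list plus a simple keyword-based distress check. The questions, keyword list, and appeasing message below are illustrative placeholders, not the actual survey in the figure:

```python
# System-initiative survey (Figure 2a) with a simple keyword-based
# distress check (Figure 2b): distressed answers trigger an appeasing
# message before the survey moves on. All wording is illustrative.

QUESTIONS = [
    "Have you had a fever in the past 14 days? (yes/no)",
    "How many people live in your household?",
]
DISTRESS_WORDS = {"worried", "scared", "anxious"}
APPEASE = ("I understand this is a stressful time. "
           "Here are some resources that may help.")


def detect_distress(answer):
    """Flag an answer containing any distress keyword."""
    return bool(DISTRESS_WORDS & set(answer.lower().split()))


def run_survey(get_user_input, say):
    """Drive the survey; the two callbacks abstract the phone channel."""
    responses = []
    for question in QUESTIONS:
        answer = get_user_input(question)
        if detect_distress(answer):
            say(APPEASE)  # offer an appeasing message, then continue
        responses.append((question, answer))
    return responses
```

In a real deployment, `get_user_input` would wrap ASR over the phone line and `say` would wrap TTS; per-question validation with re-asking, as well as a far more careful (and clinically audited) distress detector, would be layered on top. The small, enumerable state space is what makes such a system comparatively easy to audit.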
Safeguard mental health via early detection. Many people's lives have been affected by mental health issues even without the pandemic. For instance, suicide was identified as the second leading cause of death in 2016 for young adults between 15 and 29 years of age (World Health Organization 2019). While chatbots have been developed for counseling and/or remedying certain mental health issues (Han et al. 2013; Fadhil and AbuRa'ed 2019; Simonite 2020), they require active participants to make a difference. On the other hand, ConvAI systems are already widely deployed to offer customers help in task-oriented dialogues on online platforms like e-commerce marketplaces. With the wide reach of these online platforms, equipping ConvAI systems with a relatively simple detector of suicidal tendencies can enable early intervention before precious lives are lost (e.g., XinhuaNet 2020, where one such system alerted help in time to prevent a suicide attempt via drug purchase).

Notes. It is crucial for widely deployed ConvAI systems to consider what worst-case scenarios these automated systems can facilitate just by doing the exact job they are designed to do. We should also note that this is an example where the effect of ConvAI systems is not reflected directly in contributing to the dialogue. Instead of engaging mentally at-risk users directly, ConvAI systems should always alert professionals for further help.

Help instructors be more effective. Making effective use of computers to help human instructors better achieve their educational goals has been the pursuit of the computer-assisted instruction (CAI) community for decades (Graesser et al. 1999; Yacef 2002; Olney et al. 2012). Combined with domain knowledge and teaching strategies inspired by human instructors, these conversational agents can potentially distribute the experience of quality one-on-one tutoring more broadly.
This can be especially helpful in communities that experience a shortage of qualified educators due to economic and/or geographic reasons, or where in-person learning is limited and engaging learners via one-to-one instruction becomes more difficult (Becker et al. 2020). While teaching dialogues are often mixed-initiative, which is more challenging, we could pursue relatively closed subsets of the conversation space, for instance by offering experiment-based help for solving specific multi-step problems. With direct contact with learners, ConvAI systems can also be equipped to detect learning disabilities (Håvik et al. 2018). This unique advantage will not only allow the ConvAI systems themselves to adapt, but also potentially inform human instructors so they can better cater to each learner's unique learning needs.

Notes. In many cases, ConvAI needs to work together with other forms of user interface to provide an effective extension to the learning experience, e.g., real or simulated experiments to help students explore physics and chemistry. We also note that the diagnosis of and intervention for learning disabilities should be left to education professionals, with ConvAI in a role of aiding early flagging or providing approved assistance during intervention.

Available beyond the classroom. Besides improving the bandwidth and teaching style of traditional classroom learning, ConvAI systems can also engage learners outside the physical or virtual classroom. Having exclusive access to a virtual learning assistant can potentially help eliminate the effect of imbalances in the allocation of limited teaching resources (e.g., opportunities to ask questions in class). Not bound to limited office hours, ConvAI systems can act as a bridge that helps alleviate the pressures of non-anonymity, peers, and time in teacher-student interactions.
Moreover, these agents can be personalized to fit each learner's habits and better help each individual tackle long-term goals, such as building one's vocabulary or preparing for a test, with helpful reminders, check-ins, and interactive exercises.

Notes. While some of the functionality mentioned above does not necessitate ConvAI, ConvAI systems could potentially provide an intuitive, unified, and interactive language user interface that lowers the barrier to making effective use of it.

Our society evolves at breakneck speed, especially when one reflects on the technological advances of recent decades. In the meantime, social inequalities in opportunities also increase as different countries, communities, or individuals benefit from these advances differently. ConvAI is well-positioned to reduce some of these inequalities.

Equitable policies begin with equitable access to governments. One of the essential needs that ConvAI systems are well-equipped to help address is enabling each community to easily express its needs to local officials to inform policy-making. Similar to the telephone survey we described above, a system-initiative ConvAI agent over the phone can help gather community feedback while preserving the privacy or identity of callers (Androutsopoulou et al. 2019). This allows users to share their opinions on pressing issues (e.g., housing, public safety) freely, without worrying about stigma or repercussions. ASR transcripts can further be triaged and clustered with NLP tools for government officials to process more efficiently, further expanding the bandwidth of public opinion intake.

Notes. While privacy preservation can be achieved technically, stakeholders should operate with transparency to ensure it is known to and trusted by the public, in order to achieve the desired outcome. Abuse under anonymity could also be a potential risk to mitigate.

Disability and quality of life should not be mutually exclusive.
Aside from accessing the public discourse, it is also important that everyone is able to enjoy their private life, especially the benefits brought by technological development. People with disabilities are too often left behind by the conveniences of modern life. NLP has great potential for improving quality of life for the visually and hearing impaired through assistive technologies (e.g., audio description for movies (Rohrbach et al. 2017) and closed captioning via ASR). However, many existing assistive language technologies provide one-sided interfaces where people only receive information from the system. We argue that ConvAI technologies can be applied in many of these applications to bring an interactive experience that further extends their potential. For instance, augmented with knowledge-based visual question answering (Antol et al. 2015) and visual dialogue (Das et al. 2017), movie description systems can further help the visually impaired explore scenes in creative art, providing a fuller viewing experience.

Notes. Despite the technical plausibility of interactive assistive technology, it is nevertheless essential to involve potential users to understand whether and how they desire the technology to take shape in innovations like this.

(2019)) could be helpful, e.g., "Have you heard of Save the Children?" or "Are you familiar with the organization?".

Notes. One salient risk with persuasive technology is its use to persuade people to do harm, or for economic or political advantage. We urge that the use of these technologies be regulated, and that their motivation, target, beneficiary, etc. be disclosed for transparency.

Social inequalities within and between countries are often rooted in the socioeconomic statuses of the population. It is imperative that we consider technological solutions that not only reduce existing inequalities, but also eradicate their sources when possible.
Perhaps one of the most important and sustainable approaches to ending poverty is providing decent jobs with reasonable pay, where the oft-neglected process of data collection can potentially help.

ConvAI can positively affect people aside from directly serving them. Despite our focus so far on applications of ConvAI for social good, one aspect that is commonly neglected is how these systems are built. Data is critical to modern (Conv)AI systems, and the process of acquiring this data often involves paid annotation teams working in a controlled setting. The nature of ConvAI makes it a great candidate for redistributing the payout of data collection to the disadvantaged, because: (1) natural conversation data is in great demand, is easy to generate without too much training (e.g., using a Wizard-of-Oz approach; see Kelley 1984; Budzianowski et al. 2018), and can usually be collected in a safe environment; 3 (2) ConvAI data collection can be conducted via telephone or text messaging (SMS) if necessary, which are much more widely available; and (3) as a result of this wider reach, the resulting data is also more diverse, which will help ConvAI systems better adapt to different data variations, including those that would naturally occur in some of the aforementioned scenarios where ConvAI can be applied (over the phone or SMS).

Notes. Data collection should only be conducted with full knowledge of and consent from participants, with privacy protection measures where applicable. Given prevalent income inequality, good pay in economically disadvantaged countries/regions can be economical for affluent ones, where researchers and practitioners are typically based. This is no excuse for unethical, exploitative low pay, however (Gray and Suri 2019).

In the previous section, we explored various opportunities for current ConvAI technologies to contribute to social good.
In this section, we turn to potential challenges we need to tackle to better realize this goal, and articulate some of the salient risks that lie in the further development and deployment of ConvAI technology in the real world.

3 Here, we focus on the raw speech/text data, and acknowledge that they usually require separate post-processing.

Before diving into issues that are more unique to ConvAI, we briefly review some of the key challenges and considerations that ConvAI systems share with other machine learning (ML) or natural language processing (NLP) technologies when applied for social good.

Problem-centric, not tech-centric. One of the common pitfalls is technology-centered solutionism. Tuomi (2018) summarized its origins and risks aptly in a European Union Science for Policy report on AI's impact on education: "In the stage of technology push, technology experts possess scarce knowledge ... [which] often dominates and overrides other types of knowledge ... this can become a problem as technologists easily transfer their own experiences and beliefs about learning to their designs." Although technologists do bring fresh perspectives, it is crucial to remember that when developing technology to solve problems, domain experts are more knowledgeable of the mechanisms, causes, and nuances regarding the subject matter. Green (2019) also warns us against the dangers of underdefined or short-sighted metrics of social good, which are sometimes the result of a lack of communication between technologists and subject matter experts. By extension, it is also important to assess whether ConvAI (or other approaches familiar to the technologists) is significantly better than alternative investments. While requiring simpler equipment to deliver, language interfaces are not always the most efficient if graphical user interfaces are applicable (e.g., for providing directions in a building).

Avoid Amplifying Bias.
It is also important to avoid exacerbating existing social biases or inequalities. Today, these are commonly rooted in the kind of data available for many ML applications, from facial recognition (Garvie and Frankle 2016) to automatic speech recognition (Koenecke et al. 2020), to what is increasingly recognized in virtually any NLP application: the lack of linguistic diversity (Joshi et al. 2020). It is important to carefully design and curate the data used to build these systems, especially in a social good setting, to avoid making a bad situation worse.

In this section, we focus on technical challenges and research frontiers that are more closely (but not necessarily exclusively) related to ConvAI, and share our view on the path forward and the main obstacles therein.

Know thy interlocutor. One of the fundamental goals of ConvAI is to facilitate communication; it is therefore important to consider whether the system is communicating in a manner that is clear and effective for the interlocutor. This is particularly important considering the increased diversity of the populations we set out to reach in applications for social good. One aspect to factor in when designing ConvAI systems to communicate with people from diverse backgrounds is how they would perceive the same message or mode of communication, as miscommunication can sometimes easily slip detection and lead to catastrophic outcomes (Wikipedia contributors 2021). We urge researchers to consider using guidelines like Datasheets (Gebru et al. 2018) to help calibrate the collection and use of data, and, when possible, to involve members of the communities the system is intended for in the process.

Go beyond words. Besides developing a better understanding of the users the system is designed to interact with, we can also make ConvAI systems communicate more effectively with the help of modalities other than text and speech.
As interactive technologies like high-quality computer graphics, robust computer vision, haptic devices, augmented and virtual reality (AR/VR), and embodied technology become increasingly available to the public, there are also great opportunities for computer scientists from these different areas to join forces in the quest towards better interactive computer systems (e.g., BAAI 2020). As we have discussed, one salient application that will benefit from multimodal interactions is education. These technologies will help broaden our horizons, and potentially enable interactive conversational systems for social good that are beyond our current imagination. However, the obstacles to further development in this area, we believe, include data sharing and interdisciplinary collaboration.

Faster, Higher, Greener. As we pack more features into ConvAI systems, it is always useful to remember that, at the end of the day, these systems are built to interact with humans in real time. Thus, besides communicating effectively, a reasonable response time is of great importance, in contrast to offline ML systems. On the other hand, the great opportunity for ConvAI to help in various settings and grounded situations should also encourage us to investigate better methods for transferring between different settings with higher data efficiency. For instance, once we have built a well-designed pandemic survey system for COVID-19, it should ideally help with surveying other infectious diseases without significant additional effort in data collection, training, or system configuration. Both of these goals are not only practical and desirable for ConvAI developers and social good stakeholders, but potentially also helpful in reducing the computational cost and climate impact of these systems while they serve to address societal problems (SDGs #12 Responsible Consumption and Production & #13 Climate Action).
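The COVID-to-other-diseases transfer described above is easiest when the dialogue logic is separated from a per-disease configuration, so that a new deployment is (ideally) just a new config rather than new data collection or training. A hypothetical sketch, with made-up configs and question templates:

```python
# Separating the survey engine from a per-disease configuration:
# moving from a COVID-19 survey to an influenza survey only swaps the
# config, not the dialogue logic. All entries here are hypothetical.

CONFIGS = {
    "covid-19": {
        "greeting": "This is an automated COVID-19 health survey.",
        "symptoms": ["fever", "dry cough", "loss of smell"],
    },
    "influenza": {
        "greeting": "This is an automated influenza health survey.",
        "symptoms": ["fever", "sore throat", "muscle aches"],
    },
}


def build_survey(disease):
    """Instantiate the same dialogue plan from a disease-specific config."""
    cfg = CONFIGS[disease]
    prompts = [cfg["greeting"]]
    prompts += [
        f"Have you experienced {symptom} in the past 14 days? (yes/no)"
        for symptom in cfg["symptoms"]
    ]
    return prompts
```

Components that do require learned models (e.g., ASR or intent understanding) would not transfer quite this cheaply, which is exactly where the data-efficient transfer methods discussed above come in.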
We believe that the rise of multi-domain dialogue datasets (e.g., MultiWOZ; Budzianowski et al. 2018) is a great first step in this area.

It takes two... or more. Although so far in this paper we have largely restricted our attention to conversations between one human user and one ConvAI system, this is by no means the only form of conversation that naturally occurs, including in the context of social good. One common scenario in which multi-party conversations naturally occur is peer support groups (SDG #3), where ConvAI can potentially serve as a coordinator between multiple human speakers to help guide the conversation (Nordberg et al. 2019). With the growing popularity of online forums, automatic moderation of online discussions has also received ample research attention in recent years (Delort, Arunasalam, and Paris 2011). We believe there is still a large space of exploration for what multi-party ConvAI systems can help us achieve in building healthier online communities large and small.

Now that we have discussed the technical challenges facing ConvAI and its application for social good in the near future, we turn to social challenges that are unlikely to be addressed solely by technical advances, and/or that require wider societal awareness and collective effort to mitigate.

Data sourcing and quality. The source of training and evaluation data is one of the most important aspects to consider when building and deploying modern AI systems. This is because data not only affects the behavior of these systems, but also often functions as a standard by which different candidates are compared and evaluated before "better" systems are chosen and deployed. As we discussed in the section on shared challenges, it is important that we do not exacerbate existing social biases when building these systems, and one of the most fundamental ways to alleviate this bias is to collect data that is representative of the target audience of our system.
(SDG #10). Aside from the representation of the population, what is often neglected is the representation of the actual users of the system. In the ConvAI research literature, many datasets are collected from paid crowdworkers rather than the actual users the systems are intended to serve, which might not faithfully represent user behavior (de Vries, Bahdanau, and Manning 2020). Furthermore, as most datasets are collected via a Wizard-of-Oz approach, even if we were able to simulate the intended users with crowdworkers, they would be conversing with a "perfect" system that does not make the kinds of mistakes that a trained ConvAI system would. This divide is crucial to recognize and mitigate in future work, especially for real-world applications.

Another challenge that threatens data quality for ConvAI concerns systems that adapt on the fly to data collected while they are serving. On one hand, overly relying on usage data could potentially obfuscate or even exacerbate the impact of blind spots in the system, as humans tend to adjust their language use to accommodate the perceived properties of the interlocutor (Giles and Coupland 1991). If a component in a ConvAI system does not meet a user's initial expectation, instead of trying the same thing over and over again, they are much more likely to accommodate the system by either enunciating or avoiding the component entirely if possible. It is thus important to look beyond the survivorship bias that is usually present in such usage data. On the other hand, systems made publicly available are more vulnerable to data poisoning attacks or adversarial data manipulation (Nelson et al. 2008; Rubinstein et al. 2009). Without sufficient defense and audit mechanisms, a well-intended system can easily and quickly be polluted through its vulnerabilities and turned against the very people it set out to serve (e.g., Microsoft's Tay bot, which was misled by unseen, unhandled malicious data; Vincent 2016).

Data sharing.
For the sustained development of ConvAI and its adoption for social good, a coordinated effort between academia and industry is also essential. Despite its abundance of talent and research freedom, academia usually lacks the means to obtain substantial amounts of data for ConvAI, especially data that reflects the "realistic data distribution". In contrast, industry has broader access to real-world data and problems, yet usually has limited research staff to tackle them all. While their interests somewhat diverge, with academia favoring academically interesting, abstract, and open research projects and industry favoring proprietary technology and practical solutions to problems at hand, we would like to highlight their shared interest in empirically sound technology that satisfies a wide range of real-world needs. We therefore advocate for closer collaboration between the two on how to collect, convert, anonymize, and share conversational data (especially data less proprietary in nature), and on repeatable evaluation methods, so that the broader research community can be exposed to unique problems rooted in real-world issues with the potential for great positive impact. It should be stressed, however, that user privacy and consent are integral to any effort to share user-related data.

Anthropomorphism and aggression from human users.
As ConvAI systems are deployed in the wild, they are commonly perceived by users as having human-like traits or characteristics, which is especially salient for video-/voice-based systems. At the same time, it is not uncommon for them to meet aggression from human users, and managing user frustration is an important topic. Beyond helping users achieve their goals, however, we should also watch out for how these interactions can affect our interactions with other humans.
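As a toy illustration of what managing user frustration might look like operationally, consider a simple policy layer that detects hostile input and de-escalates while self-identifying as a machine, rather than mirroring the user's tone. This is only a sketch under our own assumptions; the keyword list, function names, and canned response are hypothetical, and a deployed system would use a learned abuse or sentiment classifier instead of keyword matching.

```python
# Illustrative sketch of a de-escalation layer for user aggression.
# The keyword heuristic stands in for a learned abuse/sentiment classifier.
HOSTILE_MARKERS = {"stupid", "useless", "hate", "shut up"}

def deescalate_or_pass(user_utterance, default_response):
    """Route hostile utterances to a calm, self-identifying reply instead
    of the normal dialogue policy's response."""
    text = user_utterance.lower()
    if any(marker in text for marker in HOSTILE_MARKERS):
        return ("I'm sorry this has been frustrating. I'm an automated "
                "assistant and I may have misunderstood; would you like "
                "to try rephrasing, or to talk to a human agent?")
    return default_response

reply = deescalate_or_pass("This bot is useless!",
                           "Here is your account balance.")
```

The design choice worth noting is that the de-escalation path both acknowledges the frustration and offers an exit to a human, rather than attempting to continue the failing interaction.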
Most ConvAI systems on the market today project a female persona (exclusively or by default), which is even more pronounced when they interact using speech (Tay, Jung, and Park 2014; UNESCO and EQUALS Skills Coalition 2019). As these systems inevitably fail to fulfill our requests from time to time, they can become the target of frustration, which may also inadvertently affect how the new generation, growing up around ConvAI systems, interacts with them (Elgan 2018; Rudgard 2018). Setting aside the debate about child education, it is important that we better understand whether this phenomenon mirrors actual human interactions, and how it would contribute to social interactions in the long term. If these devices help instill or reinforce gender stereotypes and gender dynamics, it is the responsibility of the community to raise public awareness and to advocate for means to mitigate this effect (SDG #5 Gender equality).

Misuse for deception.
Technological advances are almost always double-edged swords. As NLP systems become more advanced, it will inevitably become easier to make them appear more human-like. For instance, GPT-3 (Brown et al. 2020) was used to post on behalf of a Reddit account, and because of its coherent and convincing posts, it went undetected for weeks (Heaven 2020). The interactive nature of ConvAI systems poses an even higher risk, as their misuse can enable more elaborate and convincing deception schemes such as spreading mis/disinformation or scamming. Seemingly benign technologies that allow ConvAI systems to assume a coherent identity (Li et al. 2016) can potentially help someone impersonate someone else online, akin to how generative adversarial networks (Goodfellow et al. 2014) have led to convincing deepfakes. This would not only jeopardize the causes for social good that we have advocated in this paper, but could also cause significant damage to our society if left unchecked.
We urge the community to reflect on these potential harms, to work with governments to regulate the use of these technologies in legitimate businesses, and to educate the public about their risks.

Transparency and trustworthiness.
When ConvAI systems interact with the general population, they face the choice of whether to declare their identity as a computer system or robot. This act of transparency is not always designed into systems, for various reasons (e.g., Google Duplex when it was initially revealed; Google 2018), and as a result can sometimes have unintended consequences (Garun 2019) despite Duplex's efforts to self-identify (Fagan 2018). Although the lack of transparency in this case was not deception for malicious purposes, the restaurants' perplexed reactions to Duplex clearly indicate that it was not inconsequential. Another aspect related to transparency is trustworthiness, which is essential for any AI system, especially ConvAI systems that interact with humans directly. To this end, we argue that ConvAI systems deployed in the real world should be able to recognize the limits of their capabilities and communicate those limits clearly. This can be a crucial step in earning trust, especially when the system has misstepped as a result of misidentifying user intent or sentiment, or even mischaracterizing the user profile in an attempt to personalize the experience.

In this paper, we have introduced how conversational artificial intelligence (ConvAI) systems can be applied to causes for social good today and in the near future, and summarized some of the major technical and social challenges the community still needs to tackle down the line. We acknowledge that the opportunities and risks included in this paper are by no means comprehensive, but rather our attempt to summarize and introduce what in our opinion are typical and atypical scenarios where ConvAI can be impactful or concerning to the community.
We refer the reader to Ruane, Birhane, and Ventresque (2019) for a more comprehensive view of the social and ethical considerations of ConvAI. This paper is part of a broader conversation around the impact of AI systems in the real world, including the very definition of social good (Green 2019), transparency and interpretability (Doshi-Velez and Kim 2017), and algorithmic fairness (Corbett-Davies et al. 2017), among others. We hope that our paper can serve as a starting point for the community to brainstorm the various opportunities and potential risks in applying ConvAI to promote social good.

References
Transforming the communication between citizens and government through AI-guided chatbots. Government Information
VQA: Visual question answering
BAAI. 2020. BAAI-JD Multimodal Dialog Challenge
Remote Learning During COVID-19: Examining School Practices, Service Continuation, and Difficulties for Adolescents With and Without Attention-Deficit/Hyperactivity Disorder
MultiWOZ - A Large-Scale Multi-Domain Wizard-of-Oz Dataset for Task-Oriented Dialogue Modelling
QuAC: Question Answering in Context
Algorithmic decision making and the cost of fairness
Visual dialog
Towards ecologically valid research on language user interfaces
Automatic Moderation of Online Discussion Sites
OlloBot - Towards A Text-Based Arabic Health Conversational Agent: Evaluation and Results
Folk, D. 2021. Is a good bot better than a mediocre human?: chatbots as alternative sources of social connection
Further Advances in Open Domain Dialog Systems in the Third Alexa Prize Socialbot Grand Challenge
One year later, restaurants are still confused by Google Duplex
Facial-recognition software might have a racial bias problem. The Atlantic
Datasheets for datasets
Accommodating language
Google Duplex: An AI System for Accomplishing Real-World Tasks Over the Phone
AutoTutor: A simulation of a human tutor
Ghost work: How to stop Silicon Valley from building a new global underclass
"Good" isn't good enough
Counseling Dialog System with 5W1H Extraction
A Conversational Interface for Self-screening for ADHD in Adults
Learning Symmetric Collaborative Dialogue Agents with Dynamic Knowledge Graph Embeddings
Decoupling Strategy and Generation in Negotiation Dialogues
The State and Fate of Linguistic Diversity and Inclusion in the NLP World
An iterative design methodology for user-friendly natural language office information applications
Racial disparities in automated speech recognition
A Persona-Based Neural Conversation Model
Designing Chatbots for Guiding Online Peer Support Conversations for Adults with ADHD
Guru: A computer tutor that models expert human tutors
Let's Go Public! Taking a spoken dialog system to the real world
Movie description
Conversational AI: Social and Ethical Considerations
Antidote: understanding and defending against poisoning of anomaly detectors
'Alexa generation' may be learning bad manners from talking to digital assistants
When stereotypes meet robots: the double-edged sword of robot gender and personality in human-robot interaction
UNESCO and EQUALS Skills Coalition. 2019. I'd blush if I could: closing gender divides in digital skills through education
Persuasion for Good: Towards a Personalized Persuasive Dialogue System for Social Good
World Health Organization. 2020. COVID-19 disrupting mental health services in most countries, WHO survey
Intelligent teaching assistant systems
The design and implementation of XiaoIce, an empathetic social chatbot