title: Connecting With Patients: The Rapid Rise of Voice Right Now
authors: Isbitski, David; Fishman, Elliot K.; Rowe, Steven P.
date: 2020-07-17
journal: J Am Coll Radiol
DOI: 10.1016/j.jacr.2020.06.017

Until 5 years ago, I was a developer, and I have never thought that computers should be hard to use. When I was here previously, I discussed some of the things people can do with voice-enabled technology and how health care might be influenced [1]. Even in the short time since then, interval advances have led to even more widespread adoption of voice-enabled devices and technologies, including Amazon Alexa and Echo, Google Assistant, Microsoft Cortana, Apple Siri, and others. What is happening, far beyond anything that happened with the telephone or the Internet, is that voice is becoming routine throughout the day. No other technology has become such a part of our daily routines. Technology has often been divisive, but voice technology can now help bring us together. Some of these technologies can be incorporated across multiple devices, and access to a computer is no longer necessary to tap into tremendous knowledge and capability through the use of voice. Anyone who can speak a sentence can interact with voice-enabled devices.

The artificial intelligence algorithms in such devices have traditionally broken sentences down into units the algorithms can understand in order to provide appropriate responses. As an example, the sentence "Alexa, ask for my heart rate from Healthkeeper" includes a wake word ("Alexa"), a launch word ("ask"), an utterance ("my heart rate"), and an invocation name ("Healthkeeper"). Now, the algorithms work from much more conversational sentences, such as "How did my heart rate do yesterday?" From those increasingly conversational sentence structures, a user can access a tremendous fund of knowledge. Users specify actions and provide seed data, and developers work with an application programming interface to create responses and simulated dialogue; the resultant models lead to the development of a recurrent neural network that can be deployed in the voice-enabled device. From there, the device undergoes self-learning, and user interactions provide feedback on live data, thereby constantly improving the neural network.

One of the emerging abilities of voice-enabled devices is anticipating latent goals. If you say, "I want a night out," the device can figure out that you will need transportation, food, and a movie and then assist you in arranging all of those. Furthermore, the ability of devices to visually recognize objects is improving, such that you can hold something up to the device and ask it to identify the object; this could profoundly help the visually impaired. Also, users are increasingly being put in charge of the neural network "black box" by being empowered to ask, "Why did you do that?" or "What did you hear?" The complexity of conversations with voice-enabled devices continues to increase, from single-turn instructions leading to simple goals, through multiturn conversations leading to complex goals, to multisession reasoning leading to ambiguous goals (Fig. 1). Even medical devices can be voice enabled, and the ultimate goal is to make voice technology as widely available as possible. Hundreds of thousands of developers work with these devices to build skills that allow the devices to talk about specific topics.
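To make the skill-building process described above more concrete, the following is a minimal sketch of a custom skill handler written with the Alexa Skills Kit SDK for Python (ask-sdk-core). The "Healthkeeper" skill, the intent name "GetHeartRateIntent," and the hard-coded heart rate value are hypothetical illustrations used only for this sketch; a real skill would query the user's connected health data source.

```python
# Minimal sketch of a custom Alexa skill handler using the ASK SDK for Python.
# The "Healthkeeper" skill, GetHeartRateIntent, and the hard-coded heart rate
# are hypothetical; a real skill would pull data from the user's health service.
from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.utils import is_intent_name
from ask_sdk_core.handler_input import HandlerInput
from ask_sdk_model import Response


class GetHeartRateIntentHandler(AbstractRequestHandler):
    """Respond to utterances such as 'How did my heart rate do yesterday?'"""

    def can_handle(self, handler_input: HandlerInput) -> bool:
        # True when the natural-language model resolved the request to this intent.
        return is_intent_name("GetHeartRateIntent")(handler_input)

    def handle(self, handler_input: HandlerInput) -> Response:
        average_heart_rate = 72  # placeholder value for illustration only
        speech = (
            f"Your average heart rate yesterday was {average_heart_rate} "
            "beats per minute."
        )
        return handler_input.response_builder.speak(speech).response


# Register the handler and expose the skill as an AWS Lambda entry point.
sb = SkillBuilder()
sb.add_request_handler(GetHeartRateIntentHandler())
lambda_handler = sb.lambda_handler()
```

On the voice-service side, a corresponding interaction model would map sample utterances such as "my heart rate" or "how did my heart rate do yesterday" to GetHeartRateIntent, so that the spoken sentence is resolved into an intent before it ever reaches code like this.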
Among the most common topics for skills are health and lifestyle. Additionally, users are no longer restricted to a single voice from these devices; for example, the devices can learn to synthesize your voice. Imagine a patient being able to access a device and hear your voice even when you are unable to be there. Voice can also be used to help patients with medication routines, and such devices can serve as a safety net for the elderly who live alone. With the limitations of senior health care shown so clearly by the nursing home coronavirus tragedy, the future will surely involve more people living in their own homes rather than in large facilities. These changes can be supported by technology such as voice.

Particularly during the coronavirus pandemic, there are many opportunities for us to use voice to improve our response to the public health crisis. For example, you can ask voice-enabled devices what to do if you think you have coronavirus, how to make a face mask, for tips on cleaning, or to call a coronavirus helpline. You can train such devices to give you curated good news, and voice-enabled devices often have key information available in many different languages. When working from home, people can teach their voice-enabled devices about topics, discuss those topics with the devices, and then publish and share the new skills the devices have acquired. Reminders about when to take medications and when to go to a doctor's appointment are easily incorporated; a brief sketch of how such a reminder might be created appears below. Beyond providing information about the pandemic, voice-enabled devices can help us place video calls and connect with friends and relatives we are not able to see in person right now. There are voice platforms that allow kids to access kid-friendly content, giving their parents and caretakers potential breaks. Providing a voice, and connecting us with others, can help combat the profound feelings of loneliness that many have felt during this crisis.

People are asking their devices about an increasing variety of things, indicating that more people are using voice-enabled technologies and that they are doing so more frequently and for longer periods of time. With the ongoing fear of the pandemic and the conflicting data regarding possible spread from surfaces, voice commands can decrease risk by letting us bypass common touch points, from elevator buttons to door knobs to credit card processing machines. Beyond the individual, the increasing presence of voice-enabled devices affects the data-gathering and research approaches to this pandemic as well as to future public health crises. These devices can facilitate the sharing and gathering of information, provide near-instantaneous updates, and facilitate the pooling of data for use by public health experts and artificial intelligence algorithms.

The combination of rapidly advancing voice-enabled technology and the social changes brought on by the coronavirus has driven the adoption of voice as a potentially transformative way for patients to obtain information and communicate with their health care providers. Because of the changes that have already occurred, we can expect that we will never go back entirely to how things were and that voice will be an increasingly important influence on health care.
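As a concrete illustration of the medication reminders mentioned above, the following is a minimal sketch of an intent handler that schedules a spoken reminder through the reminder management service exposed by the ASK SDK for Python. The intent name, reminder text, and 8-hour offset are hypothetical, and a real skill would require the user to grant the reminders permission in the Alexa app.

```python
# Minimal sketch: scheduling a medication reminder from within a custom Alexa
# skill via the ASK SDK for Python. The intent name, spoken text, and 8-hour
# offset are hypothetical choices for this illustration.
from ask_sdk_core.skill_builder import CustomSkillBuilder
from ask_sdk_core.api_client import DefaultApiClient
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.utils import is_intent_name
from ask_sdk_core.handler_input import HandlerInput
from ask_sdk_model import Response
from ask_sdk_model.services import ServiceException
from ask_sdk_model.services.reminder_management import (
    ReminderRequest, Trigger, TriggerType, AlertInfo, SpokenInfo, SpokenText,
    PushNotification, PushNotificationStatus)


class SetMedicationReminderIntentHandler(AbstractRequestHandler):
    """Create a one-time spoken reminder to take a medication."""

    def can_handle(self, handler_input: HandlerInput) -> bool:
        return is_intent_name("SetMedicationReminderIntent")(handler_input)

    def handle(self, handler_input: HandlerInput) -> Response:
        reminder_client = (
            handler_input.service_client_factory.get_reminder_management_service())
        reminder = ReminderRequest(
            trigger=Trigger(
                object_type=TriggerType.SCHEDULED_RELATIVE,
                offset_in_seconds=8 * 60 * 60),  # remind in 8 hours
            alert_info=AlertInfo(
                spoken_info=SpokenInfo(
                    content=[SpokenText(locale="en-US",
                                        text="Time to take your medication.")])),
            push_notification=PushNotification(
                status=PushNotificationStatus.ENABLED))
        try:
            reminder_client.create_reminder(reminder_request=reminder)
            speech = "OK, I will remind you to take your medication in 8 hours."
        except ServiceException:
            # Most commonly raised when the user has not granted the
            # reminders permission to this skill in the Alexa app.
            speech = "Please enable reminder permissions for this skill first."
        return handler_input.response_builder.speak(speech).response


# The reminders service needs an API client, so use CustomSkillBuilder.
sb = CustomSkillBuilder(api_client=DefaultApiClient())
sb.add_request_handler(SetMedicationReminderIntentHandler())
lambda_handler = sb.lambda_handler()
```

In this sketch the reminder is relative (8 hours from the request); the same interface also supports absolute and recurring triggers, which would be the natural fit for a daily medication schedule.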
To date, the focus of artificial intelligence in radiology has been on the use of algorithms to enhance image interpretation and uncover imaging biomarkers. However, artificial intelligence will have profound impacts across radiology practices, and the rise of voice-enabled devices is an indication of that. We can expect that patient preparation, explanations of studies, and the consenting process will be well handled by voice-enabled devices with artificial intelligence algorithms. Radiology, and medicine more broadly, needs to find creative ways to apply voice to existing and emerging technologies and workflows. For example, wearable technologies such as voice-enabled glasses could allow a physician to record important aspects of a patient history or physical examination for later review. The same technology could be used to provide guided tours and to keep patients and visitors from getting lost in the hospital. The practices that emerge from the coronavirus pandemic in strong positions will be those that find ways to leverage artificial intelligence, and voice-enabled technologies can play a large role in that.

Our day-to-day work in our offices will also change. Voice-enabled technologies can finally help us to realize the "paperless" office. Our phone calls, dictations, and communications with colleagues can all be done in a contactless way using voice.

The authors state that they have no conflict of interest related to the material discussed in this article.

REFERENCE
1. Learning to talk again in a voice-first world.

ADDITIONAL RESOURCES
Additional resources can be found online at

The Russell H. Morgan Department of Radiology and Radiological Science