key: cord-1000455-ibcgjlgt
authors: MURDAUGH, Kristen; HAUSKNECHT, Josipa BAINAC; HERBST, Christian T.
title: In-Person or Virtual? – Assessing the Impact of COVID-19 on the Teaching Habits of Voice Pedagogues
date: 2020-10-13
journal: J Voice
DOI: 10.1016/j.jvoice.2020.08.027
sha: a0c2db2c2454962199c8bbba90efa8cc8aa1ed0d
doc_id: 1000455
cord_uid: ibcgjlgt

The social distancing measures implemented world-wide in the wake of the novel Coronavirus (COVID-19) crisis have forced voice pedagogues to alter their teaching habits, likely shifting from customary in-person teaching to virtual teaching. An online survey, distributed world-wide in April/May 2020, investigated how singing voice pedagogues were impacted by the COVID-19 crisis. The collected responses from 387 survey participants suggest that, overall, voice teachers were only moderately satisfied with having to teach virtually, indicating that virtual voice teaching is not a sufficient replacement for in-person teaching. The participants indicated that during virtual teaching the singing voice can be assessed relatively well through features which provide both acoustic and visual cues. In contrast, depending on the technology used, it may be harder to judge those aspects of the singing voice that are solely defined acoustically, such as dynamic range and spectral composition. This may be explained by limitations imposed by "out of the box" technology for online communication, which is typically optimized for speech instead of singing. This calls for better information on technological solutions for virtual voice teaching.

Since its emergence in December of 2019, the novel Coronavirus (COVID-19) has heavily impacted both the global economy and society as a whole [19].
With early modeling studies indicating the need for intense control measures to limit the spread and severity of COVID-19 as well as to flatten the epidemic curve to reduce pressure on the healthcare system, social distancing measures were put in place around the world, having pronounced effects on professional life [17]. Social distancing likely required many voice pedagogues of universities, as well as private and commercial studios, to change their teaching habits, shifting from customary in-person teaching to virtual teaching. (In the context of this manuscript, customary in-person teaching refers to a teacher and student being in the same physical space, whereas virtual teaching refers to all teaching which does not take place in a shared physical space, but is rather facilitated by internet or telephone connectivity, using computers, mobile phones, or tablets as communication devices.) The shift from customary in-person to virtual teaching was particularly vital as societal lockdowns were implemented. Addressing these fundamental changes of the professional landscape, this study aimed to document how voice teachers world-wide modified their teaching habits during the COVID-19-related lockdowns, as well as how content they were with the newly arising teaching situations and the technologies used for virtual teaching. One particular section of questions targeted the individual attributes of the singing voice, and how well these can, in the opinion of the voice teachers, be assessed and evaluated through virtual teaching using video-conferencing tools like Zoom or Skype. This study's data were acquired by means of an online survey. This "Virtual Voice Teaching Survey", available as a PDF in the supplementary materials, was constructed on the SurveyMonkey platform (SurveyMonkey Europe UC, Dublin, Ireland).
It had the following structure: (a) Welcome (survey introduction and purpose; data protection declaration); (b) Demographic Data; (c) "Your Voice Teaching" (selecting the primary type, focus, style, and size of a pedagogue's studio); (d) Virtual Voice Teaching Technology (assessed with continuous sliders); (e) Assessment Criteria of the Singing Voice (assessed with continuous sliders); and (f) Closing (anonymous ID creation for future survey participation, optional final comment(s), statement of gratitude). Every question without displayed answer options had a continuous sliding scale (visual analog scale, VAS) for assessment, with minima and maxima at 0 and 100, respectively. Instructions and notes were provided where relevant (see supplementary material). For example, the following note preceded the Virtual Voice Teaching Technology page: "the term 'virtual voice teaching/lesson' describes all measures that do not take place in the customary in-person setting (i.e., student and teacher in the same room)". One goal of this study was to investigate to what extent voice teachers can assess the individual attributes of the singing voice through virtual teaching using communication technology. This required a list of terms to assess the singing voice. Surprisingly, despite scientifically informed suggestions for voice assessment terminology [8], there does not yet seem to be an agreement as to which voice features are assessable in the singing voice via customary in-person teaching, let alone virtual teaching. An examination of the voice assessment terms listed in the voice literature [21, 14, 5, 4] reveals inconsistencies, particularly with authors from differing backgrounds (i.e., vocal pedagogy, voice therapy, acoustics) using differing vocabularies to describe voice quality [8]. For these reasons, special care had to be taken to arrive at a comprehensive and representative list of voice assessment terms used in the "Assessment of the Singing Voice" section of the survey.
Aiming to reflect the current standards of teaching institutions, this list was compiled as follows: the websites of North American English-speaking, post-secondary academic music institutions were reviewed. In particular, the institutions' websites were searched for voice evaluation rubrics, voice handbooks, and voice curricula containing voice assessment terms. Taking into consideration all singing styles, a cumulative table of 186 terms, found among 26 sources, was collected. In order to avoid the more obvious redundancies, all terms with a common initial word usage were semantically grouped, with the common word becoming the synonym for all others. An overview of the grouped terms is provided in the supplementary materials (note that only the terms which would otherwise result in redundancies due to their common initial word usage were grouped; non-redundant terms were considered "as is"). Grouping was performed algorithmically with a custom script written in Python by author CTH, reducing the list to 109 terms. The 15 most used terms were identified (see Fig. 1) and listed in the survey. All grouped terms were listed in parentheses next to the group synonym. Possible redundancies were noted, and participants were asked to use their discretion to evaluate each term. Ethical clearance for the survey was given by an ad-hoc committee of the Mozarteum University's rectorate. The survey was distributed world-wide by email to voice organizations, music institutions, and colleagues of the authors. It was also posted in online voice teacher forums. Responses were collected for 1 month, beginning April 16, 2020. The results were first inspected with SurveyMonkey's analysis tools (question summaries, insights and data trends, individual responses). Data were subsequently analyzed using Python scripts written by CTH. After one month of survey data collection, a total of 509 survey responses were recorded.
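The term-grouping step described in the Methods can be illustrated with a minimal sketch. It is not the authors' script, which is not reproduced in this article. It assumes that "common initial word usage" means a shared, case-insensitive first word, and that "most used" means the number of collected terms falling into a group. The function names and the sample terms are hypothetical.

```python
from collections import Counter, defaultdict


def first_word(term: str) -> str:
    """Return the lower-cased first word of a term; it serves as the group label."""
    return term.strip().lower().split()[0]


def group_terms(terms):
    """Map each group label to the sorted, de-duplicated terms that share it."""
    groups = defaultdict(set)
    for term in terms:
        if term.strip():
            # Normalize case and whitespace so duplicates collapse into one entry.
            groups[first_word(term)].add(" ".join(term.lower().split()))
    return {label: sorted(members) for label, members in groups.items()}


def most_used(terms, n):
    """Rank group labels by how many collected terms fall into each group."""
    counts = Counter(first_word(t) for t in terms if t.strip())
    return counts.most_common(n)


# Hypothetical sample; these are NOT the terms collected for the survey.
sample_terms = [
    "Tone", "Tone quality", "Tone color", "tone quality",
    "Diction", "Rhythm", "Breath support", "Breath management",
]

groups = group_terms(sample_terms)
top_two = most_used(sample_terms, 2)
print(groups)
print(top_two)
```

A first-word rule like this only merges the more obvious redundancies: related terms with different first words (for example "Rhythm" and "Rhythmic accuracy") would remain separate, which is consistent with the remaining redundancies noted for the survey participants.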
387 participants fully completed the survey, resulting in a 76.0% survey completion rate. The average response time was 6:35 [min:sec], with 5th and 95th percentiles at 3:37 and 27:08, respectively. 47.64% of participants used a computer to complete the survey, 41.88% used a mobile phone, 10.21% used a tablet, and 0.26% used an alternate device. At the time of taking the survey, the participants had primarily taught virtually for an average of 34 days, with 5th and 95th percentiles at 5 days and 60 days. 29 participants (7.5%) indicated that they primarily taught virtually prior to the COVID-19 crisis. Analyzing the demographic data (see Fig. 2), the prototypical survey participant was a 50-59 year old female native English speaker, who teaches more than 30 students per month in a classical singing-focused academic or private studio, located in North America or Europe, with Zoom being the primarily used virtual teaching technology. As shown in Fig. 3, voice teaching during the COVID-19 crisis is seen as important, with the majority of teaching having been converted to a virtual medium. As for the virtual medium used, online meetings with video and audio greatly surpassed the use of phone calls with only audio, while pre-recorded sound clips were also of some importance. Overall, teachers were moderately satisfied with teaching virtually and believed their students to be slightly more satisfied. However, there was a general consensus that virtual voice lessons cannot replace customary in-person voice lessons, though the participants' opinions differed greatly with regard to this question (see the wide distribution of values in the "technology replacement" category of Fig. 3). On average, each survey participant (387 completed surveys) was using nearly two online technology platforms (668 responses collected; see Fig. 4). Zoom was selected as the most prevalent virtual teaching technology, being used by almost half of the survey participants.
Runners-up were Apple's FaceTime and Skype, each being utilized by about a fourth of the survey participants. In assessing the 15 provided voice assessment terms, participants' opinions differed greatly, as is shown in Fig. 5. Keeping the assessment boundaries in mind (0 = This criterion cannot be assessed at all in the context of virtual singing teaching; 100 = This criterion can be fully assessed in the context of virtual singing teaching), every voice assessment term, with the exception of Rhythm and Interpretation, received assessments from 0 to 100, with averages and medians often differing greatly (hinting at a non-normal distribution of the responses). With Memorization receiving an average rating of 81, Rhythm 76, and Diction 74, they were the three terms deemed most assessable in the context of virtual singing teaching. Most notably, the terms Dynamics, Resonance, and Tone received the worst ratings, suggesting that overall the survey participants faced the greatest difficulties when assessing these voice features during virtual teaching. Interestingly, these three features are solely based on acoustic input and not reliant on visual information. In order to better understand whether the possibility for assessment of these purely acoustic voice features (i.e., Dynamics, Resonance, and Tone) is potentially linked to the use of certain virtual technologies, they were further examined in relation to the participants' primarily used technologies. The respective results are shown in Fig. 6. Surprisingly, Messenger consistently had the best performance with regard to these three categories. However, Messenger was only used by 15 survey participants, with the typical user being a 40-49 year old female native English speaker who teaches 11-30 students per month in a classically or CCM singing-focused academic or private studio, located in North America.
Notably, this participant profile significantly differs from that of the prototypical survey participant described above. Similar to the survey data, final survey comments differed greatly. Many comments touched on the inability of technology to fully capture the singing experience, but also noted technology's ability to aid singers in developing their stage presence and online performance skills. Others remarked on the need of teachers and students for better technology and more technological knowledge, particularly in regard to latency and data security. Several participants even commented on the joy of the challenge: learning to rely on more than visual cues for teaching, and instead honing the ear for acoustic variations. Overall, participants expressed gratitude that teaching may continue, but voiced discontent with the technology used to do so and the fatigue that using it may cause. During COVID-19-related measures of social distancing, virtual voice teaching has become a central aspect of voice pedagogues' work. With the reality that the situation may continue for some time, it is important to understand what technology is currently being used and how satisfied teachers and students are with the available technology as well as the situation as a whole. Overall, most voice teachers contributing to this survey seem to be only moderately satisfied with having to teach virtually, indicating that virtual teaching is not currently viewed as a sufficient replacement for in-person teaching. This notion may be comparable to online teaching and learning, i.e., pedagogical concepts that have existed for more than 60 years [15], with significant research and advancements in the last 20 years [16]. Recent studies suggest that students prefer, and perceive themselves to learn better through, online learning [6]. Yet, there still remains a divide between those who embrace and those who resist moving from in-person to virtual instruction.
With the online teacher's role including pedagogical, social, managerial, and technical tasks, teaching online inevitably requires a specific set of skills which teachers may need to learn before they feel comfortable with virtual instruction [12]. However, with the rapid shift from in-person to virtual instruction due to COVID-19, many voice teachers are left without this specific set of skills and potentially without the financial means to foster such skills. The fact that, on average, each survey participant utilized almost two technological solutions (recall Fig. 4) may suggest that, at least at this early stage of forced virtual teaching, voice pedagogues are still exploring options and technological solutions that would work for them. This hints at some potential parallels to the field of speech language pathology (SLP): In SLP, virtual telehealth services were also initially resisted due to various infrastructure shortcomings such as privacy and state licensure laws and technological deficiencies, but through a combination of synchronous and asynchronous technologies and years of technological research and developments in the SLP telehealth domain, clinical assessments and treatments can now be delivered virtually on a level similar to that of conventional in-person care [11, 13]. Relying on computers, web-cameras, microphones, and internet access, with a focus on technologies which do not compromise the acoustic integrity of the transmitted voice signals, virtual SLP care became more equitable and accessible, particularly when asynchronous technologies were implemented as solutions to synchronous variabilities [11]. With this history in mind, SLP telehealth research may serve as a guiding source for voice teachers as they explore technological options and solutions in the COVID-19 crisis.
One central finding of this study was that voice teachers had greater difficulties rating most aspects of the voice that are solely grounded in acoustic cues, particularly dynamic range ("Dynamics") and spectral aspects ("Resonance", "Tone"). In contrast, other voice attributes which can be judged visually typically received better ratings. Surprisingly, this assessment differs as a function of the technology used. While the majority of participants use Zoom, Apple FaceTime, and Skype, they report only average satisfaction with the acoustic features of these platforms, whereas users of Messenger express a higher satisfaction with the purely acoustic features. Formal and rigorous evaluation of these tools is required to identify whether qualitative differences exist among them. In support of this, a preliminary report provided by Howell et al. [9] showed that each of four examined video conferencing platforms (Zoom, Microsoft Teams, VoiceLessonsApp, and FaceTime) negatively affects the transmitted voice signal to varying but considerable degrees, highlighting the need for online communication systems that are better suited for the singing voice, considering issues like dynamic range, spectral composition and transmission lag. That latter attribute (lag) points to another inherent issue, i.e., internet broadband speed and latency. Owing to the pandemic-induced rise in virtual activity, networks have been severely impacted, with standard deviations of latency being ~3-6 times higher than those prior to the pandemic [1]. Interestingly, higher delays occur in the evening hours rather than the daytime working hours. This could have implications for the virtual voice teaching field, influencing the scheduling of lessons for optimal network quality.
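To make the scheduling implication concrete, the following minimal sketch shows how a teacher or researcher might compare latency variability across hours of the day. It is an illustration only, not the method of the study cited as [1]. The input format (pairs of hour and round-trip time in milliseconds), the function names, and the sample values are all hypothetical assumptions.

```python
from statistics import mean, stdev


def latency_stats_by_hour(samples):
    """Return {hour: (mean_ms, stdev_ms)} for hours with at least two samples."""
    by_hour = {}
    for hour, rtt_ms in samples:
        by_hour.setdefault(hour, []).append(rtt_ms)
    # The sample standard deviation needs at least two measurements per hour.
    return {h: (mean(v), stdev(v)) for h, v in by_hour.items() if len(v) >= 2}


def most_stable_hours(samples, n=1):
    """Return the n hours with the lowest latency standard deviation."""
    stats = latency_stats_by_hour(samples)
    return sorted(stats, key=lambda h: stats[h][1])[:n]


# Hypothetical measurements (hour of day, round-trip time in ms); not real data.
samples = [
    (10, 40.0), (10, 42.0), (10, 41.0),
    (20, 40.0), (20, 95.0), (20, 60.0),
]

stats = latency_stats_by_hour(samples)
best = most_stable_hours(samples)
print(stats)
print(best)
```

Ranking by standard deviation rather than by mean latency reflects the quantity highlighted above (variability of latency); a different criterion could be substituted where average delay matters more.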
Another aspect is that of sound input (microphones; see [10] for an excellent discussion) and sound output (loudspeakers/earphones), both of which are key components in how acoustic features can be assessed in a virtual voice lesson. High-quality sound recording and playback requires high-quality equipment, which may be unaffordable to some. Furthermore, choosing and setting up the proper equipment (see [20] for a recommendable, if slightly outdated, tutorial) may pose technical challenges. While it may seem obvious that, for purposes of virtual voice teaching, the entire technological voice transmission chain (sound recording, transmission, and playback) needs to truthfully and realistically represent the entire frequency and dynamic range of the singing rather than the speaking voice, this criterion is not always met. Rather, some components are designed with the improved transmission of the speaking voice in mind, implicitly and automatically "improving" the signal (which, in fact, may introduce unwanted distortions to the transmitted singing voice signal). For instance, some microphones may under certain circumstances add a "bass boost" or a boost to the singers' formant region between 2 and 4 kHz [10], and some loudspeakers and earphones may alter the spectral or dynamic composition of the sound via a feature called active noise control (ANC) in an attempt to equalize and attenuate ambient sound [3]. This study has been designed as an ad hoc response to the emerging COVID-19 crisis. As such, it may suffer from a number of potential limitations: For practical reasons, only voice assessment resources from North American English-speaking universities have been considered. Research shows that using a common language (such as English) rather than native languages in cross-national questionnaires can obscure cultural differences in the resulting data [7].
However, using a common language for cross-national research is also recommended to avoid translation errors and resulting data discrepancies [18]. Another issue might come from a potential limitation to the list of voice assessment terms (recall Fig. 1), since the data pool was narrowed down to available online resources from 26 North American universities. Given the surprising number of different voice assessment terms found in these 26 sources (186 individual terms, reduced to 109 terms through semantic grouping; see Methods), a larger formal study is warranted, collecting and reviewing voice assessment terms utilized in English-speaking tertiary education institutions world-wide. Finally, the survey results may have been influenced by the choice to use continuous (VAS-like) sliders over discrete scales (Likert scale) for term assessment. Studies comparing continuous (VAS-like) sliders with discrete evaluation scales, such as the Likert scale, show positives and negatives for both options [2]. While discrete evaluation scales may evoke data inconsistencies due to the effect of response-order, continuous sliders may suffer from functionality issues leading to low response rates. That latter aspect may explain this survey's relatively low completion rate of 76%. The data from this survey suggest that singing voice teachers world-wide have been forced to convert their teaching activity to virtual teaching with little time to prepare, particularly for the technological aspects of that transition. Therefore, there may be a great potential for improvements of current teaching situations. There is a need for (a) clearer information on the technical capabilities of systems/setups; (b) recommendation(s) backed up by a consensus group of "tech-savvy voice teachers"; and (c) the means to provide this information and the required technical know-how to the international community of voice teachers.
[1] Impact of the COVID-19 pandemic on the Internet latency: a large-scale study
[2] Evidence-Based Survey Design: The Use of Continuous Rating Scales in Surveys
[3] Acoustic Equalization for Headphones Using a Fixed Feed-forward Filter
[4] Characterisation of Voice Quality in Western Lyrical Singing: from Teachers' Judgements to Acoustic Descriptions
[5] Glossaire de l'Atelier du Chanteur
[6] Perceptions of Distance Learning: A Comparison of Online and Traditional Learning
[7] Does the Use of English-language Questionnaires in Crossnational Research Obscure National Differences?
[8] Towards a Common Terminology to Describe Voice Quality in Western Lyrical Singing: Contribution of a Multidisciplinary Research Group
[9] Preliminary Report: Comparing the Audio Quality of Classical Music Lessons Over Zoom, Microsoft Teams, VoiceLessonsApp, and Apple FaceTime
[10] Guidelines for Selecting Microphones for Human Voice Production Research
[11] Telehealth Technology Applications in Speech-Language Pathology
[12] Towards Best Practices in Online Learning and Teaching in Higher Education
[13] Telehealth: Voice Therapy Using Telecommunications Technology
[14] The structure of singing: system and art in vocal technique
[15] A history of E-Learning: Echoes of the pioneers
[16] Lessons from the Virtual Classroom: The Realities of Online Teaching
[17] The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study
[18] Problems of Translation in Cross-Cultural Research
[19] World Health Organization declares global emergency: A review of the 2019 novel coronavirus (COVID-19)
[20] Recommendations for the Creation of a Voice Acoustics Laboratory
[21] Singing: the mechanism and the technic

This publication was supported by a research grant received from Land Salzburg. We thank all survey participants for their time and willingness to share their opinions.