key: cord-0837967-y04vq0ou
authors: Asiaee, Maral; Vahedian-azimi, Amir; Atashi, Seyed Shahab; Keramatfar, Abdalsamad; Nourbakhsh, Mandana
title: Voice quality evaluation in patients with COVID-19: An acoustic analysis
date: 2020-10-01
journal: J Voice
DOI: 10.1016/j.jvoice.2020.09.024
sha: aa7c6738458669b2d0424b98662eeb852600c9e2
doc_id: 837967
cord_uid: y04vq0ou

OBJECTIVES: With the COVID-19 outbreak around the globe and its potential effect on infected patients’ voice, this study set out to evaluate and compare the acoustic parameters of voice between healthy and infected people in an objective manner. METHODS: Voice samples of 64 COVID-19 patients and 70 healthy Persian speakers who produced a sustained vowel /a/ were evaluated. Between-group comparisons of the data were performed using the two-way ANOVA and Wilcoxon's rank-sum test. RESULTS: The results revealed significant differences in CPP, HNR, H1H2, F0SD, jitter, shimmer and MPT values between COVID-19 patients and the healthy participants. There were also significant differences between the male and female participants in all the acoustic parameters, except jitter, shimmer and MPT. No interaction was observed between gender and health status in any of the acoustic parameters. CONCLUSION: The statistical analysis of the data revealed significant differences between the experimental and control groups in this study. Changes in the acoustic parameters of voice are caused by insufficient airflow and by increased aperiodicity, irregularity, signal perturbation and level of noise, which are the consequences of pulmonary and laryngological involvement in patients with COVID-19.

In early March 2020, the World Health Organization (WHO) declared COVID-19 a pandemic. Since then, with more than 17,000,000 confirmed cases worldwide and counting, coronavirus has become one of the most challenging issues the world faces. As indicated by Rothan and Byrareddy 1 , COVID-19 principally affects the respiratory system, and people infected with the disease may experience pneumonia and acute respiratory distress syndrome. Symptoms of acute upper respiratory tract infections, such as cough, sore throat, rhinorrhea and sneezing, as well as digestive symptoms, like vomiting, are associated with mild cases of this disease, and symptoms like pneumonia and acute respiratory distress are primarily seen in severe and critical cases, respectively. 2 "Respiratory tract infections affect the same system and structure that are used for voice production". 3 Although the primary function of the respiratory tract is to provide oxygen to the body, its secondary and equally important function is to provide energy and produce phonation for the purpose of speech communication.

Voice production is a 3-stage process: respiration, phonation and resonance. In the respiratory stage, the force that is needed for generating sound is provided by the air expelled from the lungs. A person who has contracted the coronavirus not only may experience shortness of breath but may also have difficulty exhaling, which results in a lack of energy to produce sound and hence a disruption in the speech production cycle. In the phonation stage, the subglottal pressure must reach a certain point to set the vocal folds into vibration. If the first stage of the speech mechanism is disrupted, the phonation function of the larynx will be impaired accordingly. Other symptoms of coronavirus, such as recurrent dry coughs, may cause changes in the vocal folds and will consequently give rise to modifications in the acoustic cues related to voice quality. As shown in a study using self-assessment questionnaires, 28.6% of those infected with COVID-19 showed symptoms of dysphonia. 4 Nonetheless, relying only on self-assessed results is subjective and prone to error.

The acoustic analysis of voice quality is of great interest among phoneticians and voice clinicians due to its noninvasive nature, low costs and ease of application. 5, 6, 7 In 2001, the European Laryngological Society (ELS) put forward a basic protocol for assessment of voice-related diseases in which they recommended using the acoustic analysis of speech as a diagnostic tool. 7 In the ELS protocol, fundamental frequency (F0), perturbation measures of pitch (jitter) and amplitude (shimmer) as well as harmonic-to-noise ratio (HNR) were noted as relevant parameters when evaluating voice quality. 7 Along with these parameters, which are the most frequently-used ones 6, 8, 9 , cepstral peak prominence (CPP) 10, 11, 9, 12, 13 , harmonic amplitude measures 14, 15 and the aerodynamic parameter of voicing, i.e. maximum phonation time (MPT) 16, 17 are among the most-studied characteristics of voice. In addition to all the aforementioned parameters, this study also employed fundamental frequency variation (F0 standard deviation) and number of voice breaks (NVB) for accomplishing the study objective.

Fundamental frequency is the rate at which the vocal folds vibrate per second and is expressed in Hertz (Hz). Changes in mass, length and tension of the vocal folds modify the fundamental frequency. 18, 19, 20 Variation in fundamental frequency is either normal (e.g. difference between F0 in children, women and men) or may occur because of vocal fold pathologies. 21 Asymmetrical changes in the mass and tension of the vocal folds caused by a laryngeal pathology such as tumors or paralysis lead to a deviant vibration and consequently change the fundamental frequency. 22 F0SD depicts the amount of variation in the frequency of vocal fold vibration. 23 Jitter and shimmer are defined as the cycle-to-cycle variation in frequency and amplitude during phonation, respectively. 21 Normally, in healthy speakers, vibration of the vocal folds shows a low level of jitter, and higher levels of jitter are observed in pathological voices. 24 Variation in shimmer is mostly detected when there is a mass lesion in the vocal folds such as edema, polyps or carcinomas. 25 Values above 1.04% for jitter and above 3.81% for shimmer in adult speakers are considered pathological. 26 HNR, also known as signal-to-noise ratio, depicts the degree of periodicity in a signal. 27 It is an estimate of the ratio between the energy in the harmonics of the voice signal and the noise energy in the signal. 21 HNR values are usually higher in normal voice than in pathological voice, since normal voices are more sonorant than pathological ones. HNR values below 7 dB are stated to be pathological. 28, 29 CPP is another measure that corresponds to the degree of regularity and periodicity of the voice signal. 27 Derived via linear regression, CPP gauges "the relative amplitude of the cepstral peak prominence in relation to the expected amplitude". 14, 30 Dysphonic voices show lower CPP values in comparison to normal voices, which exhibit a higher level of CPP. 10
The difference between the amplitudes of the first and second harmonics (H1-H2) is indicative of the degree of glottal adduction in different voices. 31 As this parameter reflects changes in the open quotient, the higher the value of (H1-H2), the greater the open quotient. The (H1-H2) measure is associated with breathiness in the voice. Breathy voices show higher (H1-H2) values, whereas strained voices exhibit lower (H1-H2) values. 32 In Praat 33 , the number of voice breaks (NVB) is defined as "the number of distances between consecutive pulses that are longer than 1.25 divided by the pitch floor". It is believed that normal voices show a lower number of voice breaks than pathological voices.
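The perturbation measures and the voice-break count described above can be illustrated with a minimal sketch. It assumes that the glottal periods, peak amplitudes and pulse times have already been extracted; the function names, the example values and the 75 Hz pitch floor are illustrative assumptions and are not taken from the study's Praat script. Local jitter and shimmer are computed here as the mean absolute difference between consecutive cycles divided by the mean value, which is the usual definition of the "local" variants.

```python
def relative_perturbation(values):
    """Mean absolute difference between consecutive cycles, divided by the mean value."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def local_jitter(periods_s):
    """Cycle-to-cycle variation of the glottal period (as a fraction, not a percentage)."""
    return relative_perturbation(periods_s)


def local_shimmer(amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (as a fraction, not a percentage)."""
    return relative_perturbation(amplitudes)


def count_voice_breaks(pulse_times_s, pitch_floor_hz):
    """Count inter-pulse distances longer than 1.25 divided by the pitch floor."""
    threshold_s = 1.25 / pitch_floor_hz
    gaps = (b - a for a, b in zip(pulse_times_s, pulse_times_s[1:]))
    return sum(1 for gap in gaps if gap > threshold_s)


# Made-up illustrative values, not data from the study.
example_periods = [0.0100, 0.0101, 0.0099, 0.0100]
example_pulses = [0.000, 0.010, 0.020, 0.050, 0.060]

jitter_pct = 100 * local_jitter(example_periods)
breaks = count_voice_breaks(example_pulses, pitch_floor_hz=75.0)  # 75 Hz is an assumed floor
```

With a 75 Hz floor the break threshold is about 16.7 ms, so only the 30 ms gap in the example pulse train counts as a voice break.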

MPT is an aerodynamic parameter that describes the maximum duration for which a vowel can be sustained continuously and is expressed in seconds. 16 Generally, phonation times of less than 10 seconds are considered abnormal. 34 Since COVID-19 is mostly considered a respiratory disease, and many of its symptoms are associated with infections of the larynx and the lungs, the acoustic parameters that represent the functions of these organs were chosen for analysis. Differences in the acoustic parameters of voice between patients and healthy participants could serve as a diagnostic tool that reflects the involvement of the larynx and the other respiratory organs; hence, this study compares COVID-19 patients with healthy individuals to evaluate the effect of this disease on the noted acoustic parameters without resorting to invasive methods.
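The reference cut-offs quoted in the preceding paragraphs (jitter above 1.04%, shimmer above 3.81%, HNR below 7 dB, MPT below 10 seconds) can be gathered into a small lookup, sketched below. The dictionary keys and the function name are hypothetical, and the cut-offs are literature values cited in the text rather than diagnostic criteria established by this study.

```python
# Cut-offs as quoted in the text: (direction, value). ">" means values above
# the cut-off are flagged, "<" means values below it are flagged.
REFERENCE_CUTOFFS = {
    "jitter_pct": (">", 1.04),
    "shimmer_pct": (">", 3.81),
    "hnr_db": ("<", 7.0),
    "mpt_s": ("<", 10.0),
}


def flag_out_of_range(measures):
    """Return {parameter: True/False} for every measured parameter with a known cut-off."""
    flags = {}
    for name, (direction, cutoff) in REFERENCE_CUTOFFS.items():
        if name not in measures:
            continue
        value = measures[name]
        flags[name] = value > cutoff if direction == ">" else value < cutoff
    return flags


# Made-up example values, not data from the study.
example_flags = flag_out_of_range(
    {"jitter_pct": 1.5, "shimmer_pct": 2.0, "hnr_db": 12.0, "mpt_s": 6.0}
)
```

Such flags only summarize where a measurement falls relative to the cited norms; they depend on the analysis software and settings with which the norms were derived.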

The National Institute for Medical Research Development (NIMAD) Ethics Committee in Iran approved the study protocol under the code IR.NIMAD.REC.1399.056. All the participants had given their informed consent to use their speech samples for research purposes.

The present study was an observational case-control study.

A simple random sampling method was employed to choose the healthy participants. Patients were chosen from among those hospitalized at the Baqiyatallah hospital. Diagnosis of participants with COVID-19 was carried out using the World Health Organization (WHO) interim guidance. 35 Upon admission, a chest computed tomographic (CT) scan and a swab test were performed for each patient. Since the scan results are readily available (compared with the swab test, which takes at least 24 hours), the initial diagnosis was made based on the CT results; moreover, a positive result on a reverse-transcriptase polymerase chain reaction (RT-PCR) assay of a specimen obtained with a nasopharyngeal swab confirmed COVID-19. Therefore, all patients in this study were positive according to both methods.

At the initial stage of sampling, data from 100 healthy participants and 100 individuals with COVID-19 were collected. All these participants completed a questionnaire. The questionnaire contained questions about the participants' gender, age, health background (including any history of asthma, COPD, laryngitis and chronic bronchitis), smoking habits, any history of substance abuse, whether they had recently travelled during the COVID-19 pandemic and whether they had been in contact with someone who had tested positive for COVID-19.

The inclusion criteria for the healthy participants were being a non-smoker with no drug addiction, not having travelled during the pandemic, having had no contact with a person who had contracted COVID-19, and having no prior voice disorder or respiratory disease of any kind. The inclusion criteria for patients were the same as for the healthy participants, except for the criteria on travelling and contact with a COVID-19 patient.

The final number of participants who met the inclusion criteria was 147. Participants were then divided into an experimental and a control group. In the experimental group, speech samples of 77 speakers who had the disease were initially collected, but, after the first recording session, 13 patients were either transferred to the ICU or passed away. The final number of participants whose data were used in the experimental group was thus 64 (38 male, 26 female). Their age ranged between 16 and 77 (mean= 52.3 years, SD= 12.89).

The control group comprised 70 (33 male and 37 female) healthy Persian speakers who were aged 18 to 70 (mean = 42.35 years, SD = 10.01).

All recordings were obtained using a ZOOM H5 handy recorder with a sampling rate of 44100 Hz and 16-bit quantization. During the recordings, the recorder was held at a distance of 20 cm and at a 45° angle from the participants' mouth. Before starting the main recording sessions, the examiners demonstrated the task individually for each participant. To minimize the effect of intonational changes and any irregularity caused by coarticulation, only a prolonged vowel was used. 36 All the participants were asked to produce the vowel /a/ for as long as they could, at their comfortable pitch and with a flat tone and a constant amplitude. For the MPT assessment, the participants were asked to take a deep breath before producing the /a/ sound.

Two recording sessions were carried out for each participant. In the control group, the recordings were made in two different sessions. In the experimental group, the interval between recordings was two days: the first recording was made on the day the participants were hospitalized and the second one two days later.

Voice recordings were made by two hospital nurses who were trained in speech data collection. The nurses took all safety measures while recording; they wore a face mask and face shield, disposable gloves and suits. The recorder was also sterilized with alcohol pads before and after each recording session.

Two different methods were used for extracting the acoustic parameters related to voice quality. A Praat 33 script was used to extract the local values of jitter, shimmer, MPT and the number of voice breaks. The measurements were performed using the default settings in Praat. Fundamental frequency, CPP, HNR and (H1-H2) were automatically extracted using VoiceSauce 37 , a free program for voice analysis. F0 measurement was done using the default algorithm of VoiceSauce, i.e. STRAIGHT. 38 By default, VoiceSauce detects F0 at 1-ms intervals and computes the harmonic spectra magnitudes pitch-synchronously over a three-cycle window; in this study, the interval was changed to 5 ms. In VoiceSauce, CPP is calculated using the algorithm proposed by Hillenbrand et al. 14 HNR values are computed with de Krom's algorithm over a variable window of five pitch periods. 39 HNR05, HNR15, HNR25 and HNR35 measure HNR from 0-500 Hz, 0-1500 Hz, 0-2500 Hz and 0-3500 Hz, respectively. This study used HNR35 (henceforth HNR). Finally, H1*-H2* was used for the (H1-H2) assessment; this is H1-H2 corrected for the effect of formants based on the algorithm proposed by Iseli et al. 40 and implemented in VoiceSauce.
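To make the H1-H2 measure concrete, the following sketch estimates the uncorrected difference between the first two harmonic amplitudes of a synthetic signal. It is not the VoiceSauce implementation: it applies no formant correction (so it yields H1-H2, not H1*-H2*), it assumes F0 is already known, and it projects the signal directly onto the two harmonic frequencies over a whole number of cycles. The function names, the 8000 Hz sampling rate and the test signal are assumptions chosen for illustration.

```python
import cmath
import math


def harmonic_amplitude(signal, sample_rate_hz, freq_hz):
    """Amplitude of the sinusoidal component at freq_hz (single-frequency DFT projection)."""
    acc = sum(
        x * cmath.exp(-2j * math.pi * freq_hz * k / sample_rate_hz)
        for k, x in enumerate(signal)
    )
    return 2.0 * abs(acc) / len(signal)


def h1_minus_h2_db(signal, sample_rate_hz, f0_hz):
    """Uncorrected H1-H2: level difference in dB between the first two harmonics."""
    h1 = harmonic_amplitude(signal, sample_rate_hz, f0_hz)
    h2 = harmonic_amplitude(signal, sample_rate_hz, 2.0 * f0_hz)
    return 20.0 * math.log10(h1 / h2)


# Synthetic test signal: 20 full cycles of a 200 Hz tone whose second harmonic
# has half the amplitude of the first.
SAMPLE_RATE_HZ = 8000
F0_HZ = 200.0
synthetic = [
    1.0 * math.sin(2 * math.pi * F0_HZ * k / SAMPLE_RATE_HZ)
    + 0.5 * math.sin(2 * math.pi * 2 * F0_HZ * k / SAMPLE_RATE_HZ)
    for k in range(800)
]

h1h2 = h1_minus_h2_db(synthetic, SAMPLE_RATE_HZ, F0_HZ)
```

Because the second harmonic has half the amplitude of the first, the result is 20·log10(2), roughly 6 dB. On real speech the harmonic amplitudes are shaped by the vocal tract, which is why a formant-corrected measure was used in the study.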

Data were analyzed in R (R Core Team 2020) version 4.0.0. 41 Due to the large number of tokens extracted from the acoustic analysis of the parameters in VoiceSauce (F0, CPP, HNR and H1-H2), the average of the results for each parameter and each participant was first calculated. Two values were thus obtained for each parameter, each representing the value of that parameter in a repetition for each participant. Another averaging process was then carried out on the results obtained from all the parameters' repeated recordings; thus, for each participant and each parameter, only one value was obtained.
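The two-stage averaging described above can be sketched as follows. The study's analysis was done in R; this Python version is only an illustration, and the function name and example values are hypothetical. Frame-level values are first averaged within each recording, and the per-recording means are then averaged, so that each recording carries equal weight regardless of its length.

```python
from statistics import mean


def participant_value(recordings):
    """Average frame-level values within each recording, then average across recordings.

    `recordings` is a list with one inner list of frame-level values per
    recording session (two sessions per participant in the study).
    """
    per_recording = [mean(frames) for frames in recordings]
    return mean(per_recording)


# Made-up example: two recordings of different length for one participant.
example_recordings = [[10.0, 12.0, 14.0], [16.0, 18.0]]
example_value = participant_value(example_recordings)
```

In this example the per-recording means are 12.0 and 17.0, giving 14.5, whereas pooling all five frames would give 14.0; the two approaches differ whenever the recordings contain different numbers of frames.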

A two-way ANOVA was performed to evaluate the effects of participants' health status (healthy/infected) and gender (male/female) on the different acoustic parameters studied in this research. The data collected on the F0, CPP, HNR and H1H2 parameters were normally distributed according to the Shapiro-Wilk test of normality (p > 0.05). To evaluate the homogeneity of variances, Levene's test was run. The assumption of homogeneity of variances was met (p > 0.05). Since the jitter, shimmer, MPT and F0SD data were not normally distributed, the nonparametric Wilcoxon rank-sum test was used for these parameters. As for the number of voice breaks, only a descriptive analysis is reported.

Table 1 summarizes the total number of voice breaks (NVB) in the female and male participants grouped by their health status and across the two repetitions. Based on the data in Table 1, there was no voice break in 77% of the voice samples from the healthy female participants; for the healthy male participants, the rate was 71.2%. In the healthy groups, at least one voice break was thus observed in 23% of the women's and 28.8% of the men's phonations. The percentage of samples without voice breaks dropped to 48.1% for the female and 51.3% for the male patients in the COVID-19 group; that is, 51.9% of the data from the women and 48.7% of the data from the men in the patients' group showed at least one voice break during the production of the sustained vowel /a/. Overall, voice breaks occurred more often in the patients than in the healthy participants.

Table 2 presents the mean, standard deviation and range of all the acoustic parameters in the healthy and infected male participants. Table 3 presents the mean, standard deviation and range of all the acoustic parameters in the healthy and infected female participants. According to Cohen, 42, 43 r values greater than 0.5 indicate a large effect.
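For the nonparametric comparisons, an effect size r can be derived from the rank-sum statistic. The sketch below assumes the common convention r = |Z| / sqrt(N), with Z taken from the normal approximation of the Wilcoxon rank-sum statistic; the text does not state which formula was used, so this is an assumption. The version here applies neither a tie correction to the variance nor a continuity correction, so its Z can differ slightly from the output of statistical packages. Function names and example data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1  # average of positions i..j, 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def rank_sum_effect_size(group_a, group_b):
    """Return (Z, r) for the rank-sum test using the normal approximation."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    w = sum(ranks[:n1])                       # rank sum of the first group
    expected_w = n1 * (n1 + n2 + 1) / 2       # mean of W under the null hypothesis
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - expected_w) / sd_w
    r = abs(z) / math.sqrt(n1 + n2)
    return z, r


# Made-up, fully separated groups (not data from the study).
z_example, r_example = rank_sum_effect_size([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
```

For these fully separated groups, r is well above 0.5, which would count as a large effect under the threshold quoted above.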

The aim of this study was to investigate whether acoustic parameters of voice differ significantly between COVID-19 patients and healthy participants. Fundamental frequency (F0) and its variation (F0SD), perturbation measures of frequency and amplitude (i.e. jitter and shimmer), harmonics-to-noise ratio (HNR), difference between the first two harmonic amplitudes (H1-H2), maximum phonation time (MPT) and cepstral peak prominence (CPP) were thus measured. These parameters can delineate different aspects of vocal apparatus dysfunction in voice production, including irregularity and aperiodicity in vocal fold vibration, airflow insufficiency, increased noise and signal perturbations. 44, 45, 23 Except for F0, all the acoustic parameters were significantly different between the experimental and control groups.

The results obtained in this study showed a notable difference in fundamental frequency variation (F0SD) between the healthy and infected participants, which could stem from tremor and insufficient control over laryngeal muscles in the experimental group. 23, 46 An increase in jitter and shimmer was also observed in both female and male patients. The uneven weighting of the vocal folds, which occurs due to inflammation or degeneration of the vocal fold tissues 47 as a result of recurrent dry coughs, could explain the higher values of jitter and shimmer in the experimental group. The present study also revealed a decrease in both HNR and CPP values in COVID-19 patients. A decline in these parameters is an indication of increased spectral noise in patients' voices, which consequently led to breathier voice in the experimental group. 12, 48 Moreover, many studies have shown that an increase in H1H2 could be considered one of the acoustic indicators of breathiness 49, 50 , especially in pathological voices. 51 The considerable increase in the H1H2 parameter observed here is in line with these findings. Air leakage and incomplete vocal fold closure, which may result from trauma to the vocal folds 52 during recurrent coughing, may have contributed to the lowered values of HNR and CPP and to breathiness in patients with COVID-19. During coughing, the mechanical forces of contact pressure are remarkably larger than in normal phonation 53 , which may cause vocal fold injuries. Vomiting, as another symptom of coronavirus, can also give rise to injuries of the vocal folds because of the mechanical force of the gag reflex 54 and the acidity of the gastric content, which rises up to the throat and irritates the tissues. 54 Previous studies have shown more aperiodicity in pathological voices with an increase in the number of voice breaks. 23, 16
According to the present findings, voice breaks are rare in healthy participants, whereas their incidence increased in the experimental group. This finding also confirms the voice dysfunction and the possible injury to the vocal folds. As shown by the results, the MPT is significantly below the normative data in the experimental group. The phonation duration is strongly correlated with lung volume. As noted earlier, this disease has certain effects on the lungs, which accordingly cause airflow insufficiency for sustaining voice. Moreover, the inadequate closure of the vocal folds in a pathological larynx generally reduces MPT due to the leakage of air through the rima glottidis. 55, 44, 16

This study revealed significantly higher values of F0SD, jitter, shimmer, H1H2 and voice break numbers in COVID-19 patients in comparison with the control group. The values of HNR, CPP, and MPT were significantly lower in the experimental group. Changes in MPT demonstrated that these patients suffer mostly from airflow insufficiency due to the involvement of the lungs, which was not far from expectation. The other changes demonstrated some laryngological involvement in patients, since they showed higher aperiodicity, irregularity and signal perturbation and also increased levels of noise in the patients' voice in comparison with the control group.

The epidemiology and pathogenesis of coronavirus disease ( COVID-19 ) outbreak

COVID-19 pathophysiology : A review

Respiratory Tract Infections and Voice Quality in 4-Year-old Children in the STEPS Study

Features of Mild-to-Moderate COVID-19 Patients with Dysphonia

Acoustic Discrimination of Pathological Voice

Reliable jitter and shimmer measurements in voice clinics: The relevance of vowel, gender, vocal intensity, and fundamental frequency effects in a typical clinical task

A basic protocol for functional assessment of voice pathology, especially for investigating the efficacy of (phonosurgical) treatments and evaluating new assessment techniques: Guideline elaborated by the Committee on Phoniatrics of the European Laryngolo

Multi-Dimensional Voice Program (MDVP) vs Praat for Assessing Euphonic Subjects: A Preliminary Study on the Gender-discriminating Power of Acoustic Analysis Software

Cepstral peak prominence: A more reliable measure of dysphonia

A Comparison of Cepstral Peak Prominence Measures From Two Acoustic Analysis Programs

Spectral/Cepstral Acoustic Measures Differentiate Hypofunctional from Normal Speakers

Cepstral peak prominence : A comprehensive analysis

Spectral-Cepstral Estimation of Dysphonia Severity : External Validation

Acoustic correlates of breathy vocal quality

Acoustic properties of different kinds of creaky voice. ICPhS

Acoustic Voice Analysis and Maximum Phonation Time in Relation to Voice Handicap Index Score and Larynx Disease

Correlation between the Voice Handicap Index and voice measurements in four groups of patients with dysphonia. Otolaryngol -Head Neck Surg

Some laryngeal correlates of vocal pitch

Principles of Voice Production

Essentials of Anatomy & Physiology for Communication Disorders. Delmar Cengage Learning

Sataloff's Comprehensive Textbook of

Acoustic Characteristics of Normal and Pathological Voices

Objective Acoustic Quantification of Phonatory Dysfunction in Huntington's Disease

Speech and Voice Science

Analise acústica da voz de indivíduos na terceira idade

Acoustic model and evaluation of pathological voice production

Acoustic Analysis of Voice: A Tutorial

Accurate short-term analysis of the fundamental frequency and the harmonics-to-noise ratio of a sampled sound

Vocal Acoustic Analysis -Jitter, Shimmer and HNR Parameters

Acoustic Correlates of Breathy Vocal Quality : Dysphonic Voices and Continuous Speech

Glottal characteristics of male speakers: Acoustic correlates and comparison with female data

Voice Disorders

Praat: doing phonetics by computer

Diagnosis of voice disorders

Clinical Management of Severe Acute Respiratory Infection When Novel Coronavirus ( 2019-NCoV) Infection Is Suspected: Interim Guidance

Acoustic waveform perturbations and voice disorders

VOICESAUCE: A program for voice analysis

Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds

A Cepstrum-Based Technique for Determining a Harmonics-to-Noise Ratio in Speech Signals

Age, sex, and vowel dependencies of acoustic measures related to the voice source

R: A Language and Environment for Statistical Computing

Statistical Power Analysis

Statistical Power Analysis for the Behavioral Sciences

Von Leden H. Phonation and respiration. Function study in normal subjects

Breathiness and Phonation Length

Phonatory function of neurologically impaired patients

Voice Disorders: Scope of Theory and Practice. Boston: Allyn and Bacon

Objective voice quality analysis before and after onset of unilateral vocal fold paralysis

Analysis , Synthesis , and Perception of Voice Quality Variations among

Acoustic analysis and perception of breathy vowels

Some spectral correlates of pathological breathy and rough voice quality for different types of vowel fragments

Subjective and Objective Evaluation of Voice Quality in Patients With Asthma

Endolaryngeal contact pressures. J Voice

Laryngopharyngeal Reflux : Larynx on Fire

Computerized acoustic voice analysis and subjective scaled evaluation of the voice can avoid the need for laryngoscopy after thyroid surgery