key: cord-0684114-9ysye8h5 authors: Maor, Elad; Tsur, Nir; Barkai, Galia; Meister, Ido; Makmel, Shmuel; Friedman, Eli; Aronovich, Daniel; Mevorach, Dana; Lerman, Amir; Zimlichman, Eyal; Bachar, Gideon title: Non-invasive Vocal Biomarker is Associated with Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) Infection date: 2021-05-14 journal: Mayo Clin Proc Innov Qual Outcomes DOI: 10.1016/j.mayocpiqo.2021.05.007 sha: a87aeb52e58705aff7f12c020259dcf521d5f8c5 doc_id: 684114 cord_uid: 9ysye8h5 Objective To investigate the association of voice analysis with SARS-CoV-2 infection. Patients and methods A vocal biomarker, a unitless scalar with a value between 0-1, was developed based on 434 voice samples. The biomarker training was followed by a prospective, multi-center, observational study. All subjects were tested for SARS-CoV-2, had their voice recorded to a smartphone application and gave their informed consent to participate in the study. The association of SARS-CoV-2 infection with the vocal biomarker was evaluated. Results Final study population included 80 subjects with a median age of 29 [23-36], of whom 68% were men. Forty patients were positive for SARS-CoV-2. Infected patients were 12 times more likely to report at least one symptom (odds ratio 11.8, p<.001). The vocal biomarker was significantly higher among infected patients (0.11 [0.06-0.17] vs. 0.19 [0.12-0.3], p=.001). The area under the receiver operating characteristic curve (AUC) evaluating the association of the vocal biomarker with SARS-CoV-2 status was 72%. With a biomarker threshold of 0.115, the results translated to a sensitivity and specificity of 85% [95% CI: 70-94%] and 53% [95% CI: 36-69%], respectively. When added to a self-reported symptom classifier, the AUC significantly improved from 0.775 to 0.85. Conclusion Voice analysis is associated with SARS-CoV-2 status and holds the potential to improve the accuracy of self-reported symptom-based screening tools. 
This pilot study suggests a possible role for vocal biomarkers in screening for SARS-CoV-2 infected subjects.

During the current coronavirus disease (COVID-19) pandemic and in the absence of any pharmaceutical intervention, the contemporary strategy against disease spread is social distancing. 1 Strict global confinement restrictions have been associated with a decline in new cases in many countries, turning attention to a possible "second wave" of the disease. The suppression of new waves of viral infection depends on the scope and efficiency of testing strategies. Molecular and serologic testing are the gold standard methods both for identifying infected people and for gathering information on people who have recovered. However, there is a clinical need for remote, non-invasive and transparent methods to screen large populations in specific scenarios such as airports and public transportation hubs. There is also a clinical need for a self-administered pre-screening tool, available to the general public, that will significantly improve classification and the effectiveness of the existing PCR testing regime.

Voice signal analysis and voice recognition are used extensively for commercial purposes. Amazon, Google, Samsung and other companies use the technology to allow customers to talk to, activate and search their devices for content, and it is estimated that one in six Americans owns a voice-activated device. 2 Recent data suggest that voice analysis can be used to develop vocal biomarkers that are associated with disease states. Examples include Parkinson's disease, coronary artery disease, pulmonary hypertension and chronic obstructive pulmonary disease. [3] [4] [5] We have recently shown how vocal biomarkers can identify congestive heart failure patients at risk for hospital re-admission and mortality. 6 Lungs play a critical role in voice production, and voice may be affected by interstitial fluid and pulmonary edema.
Thus, it is biologically plausible that severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection could be detected by voice signal analysis. The purpose of the current pilot study was to investigate the association between voice analysis and SARS-CoV-2 infection in a prospective multi-center clinical registry.

This is an observational, prospective, multi-center clinical registry of audio recordings and clinical data of SARS-CoV-2 infected patients and negative controls. The analysis of the collected data was conducted retrospectively and did not affect disease management. Data were collected in parallel at three sites in Israel during the first pandemic wave: Sheba Medical Center, Rabin Medical Center and the Israeli Army Medical Corps. Patients were recruited while quarantined in a center for mild COVID-19 patients at one of those three sites. Negative subjects in this study were subjects who underwent a clinically indicated SARS-CoV-2 PCR test and were found to be negative. Data collection was approved by the ethical committees of all sites, and participants signed an informed consent form before any data were collected. Patients were included in the study if they (1) were able to comply with the study protocol, operate a smartphone application and had basic computer skills and (2) signed informed consent. Exclusion criteria included age under 18, pregnancy, inability to sign informed consent documents or any speech or voice impairment. Demographic data, medical history and SARS-CoV-2 test history were documented in pre-specified case report forms (CRFs). The participants were divided into two groups based on their polymerase chain reaction (PCR) results. Group A: Patients with laboratory-confirmed SARS-CoV-2 infection, hospitalized or in quarantine outside of hospital (n=40). Group B: Participants who underwent a clinically indicated PCR test and were PCR-confirmed negative (n=40).
Participants were instructed to download the Vocalis Health © Research mobile application to record themselves and to document their symptoms at the time of the recording. Participants were instructed to record while holding the phone in their hand or placing it on a table in front of them at a distance that allowed them to read text on the screen. In addition, participants documented the following symptoms: shortness of breath (not at all/light/mild/severe), cough (not at all/light/mild/severe), runny nose (not at all/light/mild/severe), fever above 37.8 °C (yes/no/did not check) and decreased sense of smell (yes/no/don't know). Data collected on the patient's smartphone using the study-dedicated Vocalis Health Research mobile application were uploaded and stored encoded in a secured cloud hosted at Amazon Web Services (AWS). The transmission of data from the mobile app to the secured cloud was conducted automatically using the secured and encoded SSL 3/TLS 1.3 standard.

as existing online audio recordings. Mean age of the training cohort was 37±16.3 years and 59% were male. The feature extraction process was based on previously described transfer learning and adaptation methods, which are appropriate for small training databases. 7 All recordings were re-sampled to 16 kHz and a Mel spectrogram was calculated using the Librosa library in Python 8 ; the window size was 1024 samples, the hop size was 512 samples and the number of Mel coefficients was set to 128. For each recording, 10 seconds of continuous speech were converted to a Mel spectrogram. Each Mel spectrogram was passed through a VGGish convolutional neural network (CNN) architecture which was pre-trained on Vocalis internal databases. 9 This process can be viewed as feature extraction, converting 10 seconds of continuous speech recording to a 512-dimensional feature vector ( Figure 1 ).
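As a worked check of the spectrogram settings described above, the following sketch computes the time resolution and the expected Mel spectrogram shape for one 10-second segment. It uses only the parameters stated in the text. The helper name `frame_count` is hypothetical, and the assumption that frames are centered with padding (the default behavior of Librosa's short-time Fourier transform) is ours and is not confirmed by the source.

```python
def frame_count(n_samples: int, hop: int, n_fft: int, center: bool = True) -> int:
    """Number of short-time Fourier transform frames for a signal.

    center=True assumes the signal is padded by n_fft // 2 on both sides
    (an assumption about the analysis settings, not stated in the text),
    so the frame count depends only on the hop size.
    """
    if center:
        return 1 + n_samples // hop
    return 1 + (n_samples - n_fft) // hop


# Parameters taken from the text.
SAMPLE_RATE = 16_000  # Hz, after re-sampling
N_FFT = 1024          # window size in samples
HOP = 512             # hop size in samples
N_MELS = 128          # number of Mel coefficients
DURATION_S = 10       # seconds of continuous speech per spectrogram

n_samples = SAMPLE_RATE * DURATION_S
window_ms = 1000 * N_FFT / SAMPLE_RATE   # duration of one analysis window
hop_ms = 1000 * HOP / SAMPLE_RATE        # time step between frames
n_frames = frame_count(n_samples, HOP, N_FFT)
spectrogram_shape = (N_MELS, n_frames)

print(f"window: {window_ms} ms, hop: {hop_ms} ms, shape: {spectrogram_shape}")
```

Under these assumptions each 10-second segment becomes a 128 x 313 array built from 64 ms windows that overlap by 50%. Without centered padding the same settings would give 311 frames.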
A 10-fold cross-validation procedure was conducted on the training cohort patients to evaluate two classification models (random forest and support vector machine, scikit-learn implementation in Python 10 ) at different regularization levels via a grid search. The best model was selected using the area under the receiver operating characteristic curve (AUC), averaged over the 10 folds. The resulting model is described as the vocal biomarker, a scalar between 0 and 1, which is a non-linear combination of the 512 features mentioned above. The best model on the training cohort achieved an AUC of 0.78±0.08, using a support vector machine model with a nonlinear kernel (radial basis function) and a regularization constant C=1. Because some of the participants contributed more than one voice recording, data leakage between folds had to be excluded. Thus, we randomized the labels (positive/negative) between the participants and reached an AUC of 0.5±0.03, which is equal to that of a random classifier. This randomization validated that there was no data leakage between the folds in the 10-fold cross-validation process. The biomarker that achieved the highest AUC in the 10-fold cross-validation procedure was tested on the study population.

The final study population included 80 subjects, of whom 40 (50%) were positive for SARS-CoV-2 infection. Median age of the study population was 29 [23-36] years and 54 (68%) were men. The overall rate of comorbidities in the study cohort was relatively low: there were 8 (10%) active or past smokers, 3 (4%) patients with asthma, 3 (4%) patients with hypertension, 2 (3%) patients with diabetes mellitus and 2 (3%) patients with neurological disease. SARS-CoV-2 infected patients were more likely to have a history of tobacco use and had a body mass index similar to that of PCR-negative subjects. Baseline characteristics of the study population, with a comparison between the two study groups, are summarized in Table 1 .
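The leakage concern raised for the cross-validation procedure can be illustrated with a minimal sketch. It assumes one common safeguard, keeping all recordings of a participant in the same fold, and shows a participant-level label shuffle of the kind used as the sanity check; the source does not state that folds were built this way. The function names, participant identifiers and labels are hypothetical and do not come from the study data.

```python
import random


def auc(scores, labels):
    """AUC as the probability that a random positive recording scores
    higher than a random negative one (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def grouped_folds(participant_ids, n_folds, seed=0):
    """Assign every recording of a participant to the same fold."""
    unique = sorted(set(participant_ids))
    random.Random(seed).shuffle(unique)
    fold_of = {pid: i % n_folds for i, pid in enumerate(unique)}
    return [fold_of[pid] for pid in participant_ids]


def permute_labels(participant_ids, labels, seed=0):
    """Shuffle labels across participants, keeping one label per participant."""
    label_of = dict(zip(participant_ids, labels))
    pids = sorted(label_of)
    shuffled = [label_of[p] for p in pids]
    random.Random(seed).shuffle(shuffled)
    new_label = dict(zip(pids, shuffled))
    return [new_label[pid] for pid in participant_ids]


# Hypothetical toy data: some participants contribute two recordings.
participants = ["p1", "p1", "p2", "p3", "p3", "p4", "p5", "p6"]
labels = [1, 1, 0, 1, 1, 0, 0, 1]

folds = grouped_folds(participants, n_folds=3)
shuffled_labels = permute_labels(participants, labels)
print(folds, shuffled_labels)
```

With real features, a model trained on shuffled labels and evaluated with grouped folds would be expected to reach an AUC near 0.5, which is the behavior the randomization check in the text looks for.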
Self-reported clinical symptoms of both study groups are displayed and compared in Table 2.

SARS-CoV-2 status and the vocal biomarker

Vocal biomarker association with signs and symptoms

SARS-CoV-2 positive patients were more likely to report fever, shortness of breath, cough and runny nose (Table 2) . Binary logistic regression demonstrated that, compared to SARS-CoV-2 negative patients, SARS-CoV-2 positive patients were 12 times more likely to report at least one symptom (OR 11.77, 95% CI 4-35, p<.001), 15 times more likely to report cough (95% CI 3-73, p<.001) and 8 times more likely to report shortness of breath (95% CI 1-70, p=.05). When patients reporting at least one symptom were classified as positive, this symptom-based classifier reached an AUC of 0.775, which is consistent with previous reports on the accuracy and AUC of symptom-based screening tools. 11

In recent years, with the widespread use of wearable devices and smartphones, there has been growing interest in remote voice analysis as a complementary non-invasive telemedicine tool. Machine learning algorithms have helped to identify an association between voice and several disease states including coronary artery disease, pulmonary hypertension and, among patients with congestive heart failure, the risk of re-admission and/or death. [4] [5] [6] Voice is just one example of the many digital biomarkers that have emerged in recent years due to advances in artificial intelligence and machine learning algorithms, coupled with high-quality big data electronic registries. 12 A recent example is the use of deep learning to detect coronary artery disease based on facial photos with an AUC of 0.73. 13 In the specific case of vocal biomarkers, a correlation between respiratory viral infection and changes in voice analysis is physiologically plausible. Voice is created by three major components: the lungs, the larynx and the articulators (eg, the tongue, the palate, and the mouth muscles).
14 The current study extends and supports preliminary evidence linking the abnormal lung state of pulmonary congestion with changes in voice. In a preliminary study by Murton and colleagues, patients with decompensated heart failure who were successfully treated in hospital demonstrated a higher proportion of automatically identified creaky voice, increased fundamental frequency, and decreased cepstral peak prominence variation, suggesting that speech biomarkers can be early indicators of heart failure. The authors suggested a role for vocal cord and lung edema in the changes in voice analysis before and after treatment. 15 This study was followed by a larger, more recent study by Amir and colleagues, who showed in 40 patients how voice changes between "wet" and "dry" states of patients with acute heart failure. 16 Moreover, our previous study demonstrated that voice biomarkers are associated with pulmonary hypertension. 6

This is the first study to document a relationship between a vocal biomarker and SARS-CoV-2 respiratory infection. Vocal signal analysis is a noninvasive biomarker that holds the potential to assist in remote screening of large populations that could increase the effectiveness of current PCR testing strategies, both in conjunction with social distancing restrictions and after

Maor serves as a consultant for Vocalis Health.

GERD: Gastroesophageal reflux disease; CKD: Chronic kidney disease; † defined as treated with pulmonary medications.

Feature extraction process using transfer learning and adaptation methods. The process produces a 512-feature vector for each x seconds of a given recording. The process is agnostic to the chosen recording length, due to a global average pooling layer at the end of the fine-tuned VGGish network.
Interventions to mitigate early spread of SARS-CoV-2 in Singapore: a modelling study. The Lancet Infectious Diseases
Automatic detection of neurological disordered voices using mel cepstral coefficients and neural networks. In: 2013 IEEE Point-of-Care Healthcare Technologies (PHT)
Voice Signal Characteristics Are Independently Associated With Coronary Artery Disease
Non-invasive vocal biomarker is associated with pulmonary hypertension
Vocal Biomarker Is Associated With Hospitalization and Mortality Among Heart Failure Patients
Knowledge Transfer from Weakly Labeled Audio Using Convolutional Neural Network for Sound Events and Scenes
Audio and Music Signal Analysis in Python
Rethinking Model Scaling for Convolutional Neural Networks
Scikit-learn: Machine Learning in Python
Real-time tracking of self-reported symptoms to predict potential COVID-19
Artificial Intelligence in Cardiology: Present and Future
Feasibility of using deep learning to detect coronary artery disease based on facial photo
Hey Goglexiri, Do I Have Coronary Artery Disease?
Acoustic speech analysis of patients with decompensated heart failure: A pilot study
Speech analysis to evaluate acute heart failure patient clinical status
Temperature screening has negligible value for control of COVID-19