key: cord-0706959-nldnu0s3 authors: Kobat, Mehmet Ali; Kivrak, Tarik; Barua, Prabal Datta; Tuncer, Turker; Dogan, Sengul; Tan, Ru-San; Ciaccio, Edward J.; Acharya, U. Rajendra title: Automated COVID-19 and Heart Failure Detection Using DNA Pattern Technique with Cough Sounds date: 2021-10-22 journal: Diagnostics (Basel) DOI: 10.3390/diagnostics11111962 sha: 50b0b0e8fa7e8b143cd1b1c901254a8a7a5731be doc_id: 706959 cord_uid: nldnu0s3

COVID-19 and heart failure (HF) are common disorders and, although they share some similar symptoms, they require different treatments. Accurate diagnosis of these disorders is crucial for disease management, including patient isolation to curb the spread of COVID-19. In this work, we aim to develop a computer-aided diagnostic system that can accurately differentiate three classes (normal, COVID-19, and HF) using cough sounds. A novel handcrafted model based on deoxyribonucleic acid (DNA) patterns was used to automatically classify COVID-19 vs. healthy (Case 1), HF vs. healthy (Case 2), and COVID-19 vs. HF vs. healthy (Case 3). The model was developed using cough sounds collected with mobile phones from 241 COVID-19 patients, 244 HF patients, and 247 healthy subjects. To the best of our knowledge, this is the first work to automatically classify healthy subjects, HF, and COVID-19 patients using cough sound signals. Our proposed model comprises a graph-based local feature generator (DNA pattern), an iterative maximum relevance minimum redundancy (ImRMR) feature selector, and a k-nearest neighbor (kNN) classifier. The proposed model attained accuracies of 99.38%, 100.0%, and 99.49% for Case 1, Case 2, and Case 3, respectively. The developed system is completely automated and economical, and can be utilized to accurately detect COVID-19 and HF using cough sounds.

The COVID-19 pandemic is continuing to the present time despite recent vaccination efforts. Experts advise people to continue to wear masks, implement sanitization procedures, and avoid crowds [1, 2]. Curfews still exist in many countries. COVID-19 has disrupted normal life and has strained national health resources, even more so at the beginning of the pandemic [3]. A new normal is necessary to limit its spread [4], and people are often living in isolation according to quarantine rules [5, 6].
To the best of our knowledge, this is the first work to automatically classify healthy subjects, HF, and COVID-19 patients using cough sound signals. The major contributions of this study include:
• A new local feature generator based on graph theory and the chemical structure of the nucleotide basic units of the DNA molecule, which we labelled the DNA pattern.
• A new prospectively acquired dataset comprising cough sounds recorded from healthy subjects, COVID-19 patients, and HF patients using basic smartphone microphones, which we divided into standardized one-second sound segments for analysis.
• Three distinct clinically relevant classification problems: Case 1, COVID-19 vs. healthy; Case 2, HF vs. healthy; and Case 3, COVID-19 vs. HF vs. healthy.
• A DNA pattern- and ImRMR-based model combined with the standard kNN classifier that attained excellent results, with greater than 99% accuracy for every Case.
Here, we review selected publications on computer-aided diagnostic systems for HF and COVID-19 detection using biomedical signals and imaging readouts, respectively.
Masetic and Subasi [20] developed an electrocardiogram (ECG) method based on the autoregressive Burg method and a random forest classifier, tested it on the MIT-BIH arrhythmia [21], PTB diagnostic ECG [22], and BIDMC congestive HF datasets [21, 23], and reported a 100.0% accuracy rate for HF diagnosis. Tripathy et al. [24] processed ECGs from the MIT-BIH arrhythmia [25] and BIDMC congestive HF datasets [21, 23] using a high-pass filter and applied the Stockwell transform for time-frequency analysis to extract entropy features; using hybrid classifiers with a mean metric, a 98.78% accuracy rate was reported for congestive HF detection. Porumb et al. [26] developed a convolutional neural network (CNN) model to diagnose congestive HF on single raw ECG heartbeats, and reported 100.0% accuracy after analyzing 490,505 individual ECG heartbeat signals. Abbas et al. [27] tested a DeTraC (Decompose, Transfer and Compose) CNN model on a combined chest X-ray image dataset [28, 29], and reported a 93.10% accuracy rate for COVID-19 diagnosis. Jaiswal et al. [30] used a DenseNet201-based image classification model to analyze computed tomographic (CT) chest images [31], and attained a 96.25% accuracy rate for discriminating between COVID-19 (+) vs. COVID-19 (−) status. Singh et al. [32] applied a CNN model to CT chest images and attained a 93.50% accuracy rate for a binary classification of images into infected (+) vs. infected (−). Horry et al. [33] used a transfer learning-based method that analyzed X-ray, CT, and ultrasound images from four different datasets (COVID-19 image data collection [34], NIH chest X-ray [35], Covid-CT [36], and POCOVID [37]) and, for each imaging modality, calculated the performance metrics of different analysis models, including VGG16 [38], VGG19 [38], Xception [39], InceptionResNetV2 [40], InceptionV3 [41], NASNetLarge [42], DenseNet121 [43], and ResNet50V2 [44]. For instance, the F1-score values for VGG19 were 87.00%, 99.00%, and 78.00% for X-ray, ultrasound, and CT, respectively. Zebin and Rezvy [45] applied a CNN method to analyze chest X-ray images for classification into COVID-19, normal, and pneumonia classes, as well as for monitoring of disease progression; they reported 90.00%, 96.80%, and 94.30% accuracy rates for the VGG-16, EfficientNetB0 [46], and ResNet50 models, respectively.

Using various mobile phones, cough sounds were recorded from 247 healthy subjects as well as 241 COVID-19 and 244 HF patients who attended Firat University Hospital, and stored in m4a (719), mp3 (3), or ogg (10) formats. Ethical approval for the study was obtained from the Firat University Ethics Committee. These recordings were of different durations and had to be subdivided into standardized one-second sound segments for analysis. There were 696 (32%), 906 (42%), and 554 (26%) sound segments from healthy subjects, COVID-19 patients, and HF patients, respectively, out of a total of 2156 segments.

The model comprised a graph-based local feature generator, an iterative feature selector, and a classification component. The feature generator used graphical depictions of the chemical structures of the nucleotide basic units of the DNA molecule, purine and pyrimidine, to generate features from cough sounds. The optimal number of features was selected using ImRMR, and classification of the chosen features was performed using the standard kNN classifier. A schematic of this model is shown in Figure 1. The pseudocode of the model is given in Algorithm 1.
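The paper does not describe the segmentation tooling; the minimal Python sketch below (the published model itself was implemented in MATLAB) only illustrates how variable-length recordings can be cut into non-overlapping one-second segments. The file name, the use of librosa for decoding, and the 16 kHz sampling rate are assumptions for illustration, not details reported in the paper.

```python
import numpy as np
import librosa  # assumed here for decoding m4a/mp3/ogg; requires an ffmpeg/audioread backend

SAMPLE_RATE = 16_000  # assumed; the paper does not report the sampling rate


def split_into_segments(path, sr=SAMPLE_RATE):
    """Decode a cough recording and cut it into non-overlapping one-second segments."""
    signal, _ = librosa.load(path, sr=sr, mono=True)
    n_full = len(signal) // sr                     # keep only complete one-second windows
    return [signal[i * sr:(i + 1) * sr] for i in range(n_full)]


# Hypothetical usage: "cough_001.m4a" is a placeholder file name.
# segments = split_into_segments("cough_001.m4a")
# print(len(segments), "one-second segments")
```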
A new DNA pattern-based local feature generator was proposed. There have been several graph-based feature extraction models in the literature [18, 47], and molecular structure graphs used in deep learning models and graph networks have attained high classification performance [48, 49]. In this study, we used the aromatic heterocyclic chemical structures of the nucleotide basic units of the DNA molecule, purine, with its fused six- and five-membered ring conformation, and pyrimidine, with its six-membered ring, to generate features from cough sound signal segments. Each purine nucleotide unit (adenine, guanine) on one DNA strand is hydrogen-bonded to the corresponding pyrimidine nucleotide unit (thymine, cytosine) of the second DNA strand (base pairing) to collectively form the DNA double helix, which is the basis of our genetic code. The chemical structures of purines and pyrimidines are topologically distinctive and can be represented as directed cyclic graphs (Figure 2). These graphs are utilized as the pattern of a histogram-based local feature generator. As can be seen in Figure 2, there are 25 edges in these two graphs, and these edges define the parameters of the generated binary features.

Figure 2. Directed cyclic graphical representations of purine (fused six- and five-membered ring conformation) and pyrimidine (six-membered ring). Individual directed paths are constructed using red arrows, which are enumerated. The initial and final points of each arrow represent the first and second parameters of the signum function for bit generation, respectively. With both structures combined, 25 bits (the total number of directed paths) can be generated using 5 × 7 and 6 × 5 sized matrices (see text).

A schematic of the proposed DNA pattern-based feature generation is shown in Figure 3. The steps of the proposed DNA pattern-based feature generation are as follows:
Step 1: Divide the cough sound into overlapping blocks with a size of 35.
Step 2: Create the first matrix, with a size of 5 × 7, using vector-to-matrix transformation.
Step 3: Use the purine pattern and the signum function to generate 14 bits. The definition of the signum function is given in Equation (1):
γ(f, s) = 0 if f − s < 0; γ(f, s) = 1 if f − s ≥ 0 (1)
where γ(., .) is the signum function, and f and s are its first and second parameters, respectively.
Step 4: Divide the cough sound into overlapping blocks of size 30.
Step 5: Create a second matrix, with dimension 6 × 5, using vector-to-matrix transformation.
Step 6: Use the pyrimidine pattern and the signum function to generate 11 bits.
Step 7: Merge the generated bits (25 bits in total) from Steps 3 and 6.
Step 8: Divide these bits into left, middle, and right groups. From Equations (2)-(4), the left, middle, and right bit groups contain 8, 9, and 8 bits, respectively.
Step 9: Create three map signals, m1, m2, and m3 (the first, second, and third map sounds for feature generation), using the generated bit groups. Histograms of these map sounds are extracted to obtain feature vectors. From Equations (5)-(9), these signals are coded with 8, 9, and 8 bits, respectively.
Step 10: Extract histograms of m1, m2, and m3. The lengths of the histograms of m1, m2, and m3 are calculated as 2^8, 2^9, and 2^8, respectively.
Step 11: Merge the extracted histograms to obtain the feature vector of the DNA pattern:
fv(a) = h1(a), a ∈ {1, 2, ..., 256}; fv(g + 256) = h2(g), g ∈ {1, 2, ..., 512}; fv(a + 768) = h3(a) (10)
where fv defines a feature vector of length 1024, and h1, h2, and h3 are the histograms extracted from the m1, m2, and m3 map signals, respectively.
The eleven steps above define the DNA pattern-based feature generation; 1024 features are generated from each sound segment by deploying these steps.
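As an illustration of Steps 1-11, a minimal Python sketch of the DNA pattern feature generator is given below (the published implementation was in MATLAB). The 25 directed edges are defined graphically in Figure 2 and are not enumerated in the text, so the edge lists below are placeholders; the assumption that both block matrices are anchored at the same sliding-window position with a stride of one sample is also an assumption made only for this sketch.

```python
import numpy as np

# Placeholder edge lists. The 14 purine and 11 pyrimidine directed paths are defined
# graphically in Figure 2 of the paper and are NOT given in the text, so the
# (start, end) index pairs below are illustrative only.
PURINE_EDGES = [((r, c), (r, c + 1)) for r in range(5) for c in range(6)][:14]      # on the 5 x 7 matrix
PYRIMIDINE_EDGES = [((r, c), (r + 1, c)) for r in range(5) for c in range(5)][:11]  # on the 6 x 5 matrix


def signum_bit(f, s):
    """Equation (1): gamma(f, s) = 0 if f - s < 0, else 1."""
    return 0 if f - s < 0 else 1


def dna_pattern_features(x):
    """Return the 1024-dimensional DNA-pattern histogram feature vector of one
    cough segment (a sketch of Steps 1-11; window handling is assumed)."""
    n_blocks = len(x) - 35 + 1                     # assumed stride of one sample
    m1 = np.zeros(n_blocks, dtype=np.int64)
    m2 = np.zeros(n_blocks, dtype=np.int64)
    m3 = np.zeros(n_blocks, dtype=np.int64)
    for i in range(n_blocks):
        mat_a = x[i:i + 35].reshape(5, 7)          # Step 2: 5 x 7 matrix (purine)
        mat_b = x[i:i + 30].reshape(6, 5)          # Step 5: 6 x 5 matrix (pyrimidine), same anchor assumed
        bits = [signum_bit(mat_a[s], mat_a[e]) for s, e in PURINE_EDGES]       # Step 3: 14 bits
        bits += [signum_bit(mat_b[s], mat_b[e]) for s, e in PYRIMIDINE_EDGES]  # Step 6: 11 bits
        left, middle, right = bits[:8], bits[8:17], bits[17:25]                # Step 8: 8 / 9 / 8 bits
        m1[i] = int("".join(map(str, left)), 2)    # Step 9: 8-bit map value
        m2[i] = int("".join(map(str, middle)), 2)  #          9-bit map value
        m3[i] = int("".join(map(str, right)), 2)   #          8-bit map value
    h1 = np.bincount(m1, minlength=256)            # Step 10: histograms of length 2^8, 2^9, 2^8
    h2 = np.bincount(m2, minlength=512)
    h3 = np.bincount(m3, minlength=256)
    return np.concatenate([h1, h2, h3]).astype(float)  # Step 11: 256 + 512 + 256 = 1024 features
```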
For automatic selection of the optimal number of generated features, we proposed an iterative version of the maximum relevance minimum redundancy (mRMR) selector [50], ImRMR, that incorporates an error calculator based on the kNN classifier. A schematic of the ImRMR selector is shown in Figure 4.

Figure 4. Steps involved in the selection of an optimal number of features using the ImRMR selector.

By deploying ImRMR, each of the 1024 features extracted by the DNA pattern is selected iteratively, and the kNN classifier is employed to calculate the resultant error rate of each selected feature vector. The steps of ImRMR are detailed below.
Step 1: Apply mRMR and calculate 1024 index (id) values.
Step 2: Select features using the id values calculated in Step 1; here, sf_i represents the ith selected feature vector and k is the number of observations. Feature selection is thus performed iteratively.
Step 3: Calculate the loss value of each selected feature vector using the kNN classifier with 10-fold cross-validation. In Equation (12), µ and kNN(.) represent the error value and the kNN classifier, respectively.
Step 4: Find the minimum loss value.
Step 5: Select the optimal feature vector (last) using the index (ind) of the minimum error value:
last(k, j) = fv(k, id(j)), j ∈ {1, 2, ..., ind}, k ∈ {1, 2, ..., 2156}
A standard distance classifier (kNN) [19] was utilized both for selecting the optimal number of features (it functioned as the error value generator, see Section 2.2.2) and for calculating the classification results. The parameters of the kNN were: k, one; distance, Spearman; distance weight, equal; and standardize, true. Ten-fold cross-validation was chosen as the validation technique.
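A minimal Python sketch of the ImRMR selection loop and the reported kNN configuration (k = 1, Spearman distance, standardization, 10-fold cross-validation) is given below. The mRMR ranking is approximated with a mutual-information relevance term and a correlation redundancy term rather than the exact implementation of [50], so the ranking and the exhaustive subset loop are illustrative only.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def mrmr_ranking(X, y):
    """Greedy relevance-minus-redundancy ranking (a simplified stand-in for mRMR [50]):
    relevance = mutual information with the label; redundancy = mean absolute
    correlation with the already selected features."""
    relevance = mutual_info_classif(X, y, random_state=0)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected, remaining = [], list(range(X.shape[1]))
    while remaining:
        if selected:
            redundancy = corr[np.ix_(remaining, selected)].mean(axis=1)
        else:
            redundancy = np.zeros(len(remaining))
        best = remaining[int(np.argmax(relevance[remaining] - redundancy))]
        selected.append(best)
        remaining.remove(best)
    return np.array(selected)


def spearman_distance(u, v):
    """Distance derived from the Spearman correlation (the kNN distance used in the paper)."""
    return 1.0 - spearmanr(u, v)[0]


def imrmr_select(X, y):
    """Iterative mRMR: evaluate every top-j feature subset with a standardized 1-NN
    classifier under 10-fold cross-validation and keep the subset with the lowest error."""
    order = mrmr_ranking(X, y)
    knn = make_pipeline(StandardScaler(),
                        KNeighborsClassifier(n_neighbors=1, metric=spearman_distance))
    errors = []
    for j in range(1, X.shape[1] + 1):             # exhaustive loop as described; coarser grids are cheaper
        acc = cross_val_score(knn, X[:, order[:j]], y, cv=10).mean()
        errors.append(1.0 - acc)
    best = int(np.argmin(errors)) + 1
    return order[:best]                            # indices of the selected features
```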
The MATLAB (2020b) coding environment was used to develop the proposed DNA pattern- and ImRMR-based cough sound classification model. The configuration of the computer used was as follows: operating system, Windows 10.1 Professional; RAM, 48 gigabytes; CPU, Intel i9 9900 with a 3.60 GHz clock frequency. Neither a graphics processor nor parallel processing was used to develop the model.

To evaluate the proposed model comprehensively, three distinct clinically relevant classification problems were defined based on the collected cough sound dataset: Case 1, COVID-19 vs. healthy; Case 2, HF vs. healthy; and Case 3, COVID-19 vs. HF vs. healthy. Standard performance metrics, including accuracy, sensitivity, precision, F1-score, and geometric mean [51], were evaluated (see Table 1), and confusion matrices were constructed (Figure 5) for all Cases. High classification accuracy rates of 99.38%, 100%, and 99.49% were attained for Case 1, Case 2, and Case 3, respectively, with low rates of classification error.

The time burden (computational complexity) of the presented model was denoted using big O notation. The time complexity of the DNA pattern-based local feature generator is O(n), where n is the length of the cough sound segment analyzed. ImRMR uses both kNN and mRMR and constitutes the most complex phase of the model; its time burden scales with the iteration number, the length of the feature vector, and the number of observations. In the classification phase, kNN is used, and the associated time complexity scales with the number of observations.

Cough sound-based COVID-19 detection is an emerging field of research for both clinicians and machine learning experts. The prevalence and incidence of HF were on the increase even before the onset of the COVID-19 pandemic, and HF care is now often affected by a lack of access to routine medical care. The clinical presentations of COVID-19 and HF can overlap, which underscores the need for computer-aided diagnostic tools to support clinicians in triage and management. Both conditions can induce cough symptoms. Therefore, we collected cough sounds from COVID-19 and HF patients, as well as healthy subjects, to test the performance of our proposed DNA pattern- and ImRMR-based model. Our proposed model addresses three clinically relevant classification problems: COVID-19 vs. healthy; HF vs. healthy; and COVID-19 vs. HF vs. healthy. The model generates 1024 features from each one-second cough sound segment, and an iterative feature selector is employed to select the most discriminative features.

We compared the results obtained using the ImRMR, iterative neighborhood component analysis (INCA), iterative ReliefF (IRF), and iterative Chi2 (IChi2) feature selectors. The plots of error rates versus the number of features selected by these selectors for Case 3 are shown in Figure 6. It can be noted from Figure 6 that the numbers of features corresponding to the lowest error rates for Case 3 classification using IChi2, INCA, IRF, and ImRMR are 226, 802, 701, and 895, respectively. The minimum error rate is 0.0051 for ImRMR and 0.006 for the IChi2, INCA, and IRF selectors. Application of ImRMR to Case 1 and Case 2 yielded minimum error rates of 0.0062 and 0 for 198 and 50 selected features, respectively (Figure 7). Overall, the model attained 99.38%, 100%, and 99.49% accuracy rates for Case 1, Case 2, and Case 3 classifications, respectively.
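The performance metrics reported above (accuracy, sensitivity, precision, F1-score, and geometric mean) can be derived from the per-class confusion matrices; a short Python sketch follows. Macro-averaging across classes and the definition of the geometric mean as the square root of sensitivity times specificity are assumptions made here, as the paper does not state either choice.

```python
import numpy as np
from sklearn.metrics import confusion_matrix


def per_class_metrics(y_true, y_pred, labels):
    """Accuracy plus macro-averaged sensitivity, specificity, precision, F1-score and
    geometric mean computed from the confusion matrix (averaging scheme assumed)."""
    cm = confusion_matrix(y_true, y_pred, labels=labels).astype(float)
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp
    fp = cm.sum(axis=0) - tp
    tn = cm.sum() - tp - fn - fp
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": tp.sum() / cm.sum(),
        "sensitivity": sensitivity.mean(),
        "specificity": specificity.mean(),
        "precision": precision.mean(),
        "f1_score": f1.mean(),
        # geometric mean taken as sqrt(sensitivity * specificity), one common definition
        "geometric_mean": np.sqrt(sensitivity * specificity).mean(),
    }
```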
The standard kNN classifier is employed for calculating the error rate during the feature selection phase (see Section 2.2.3) as well as for obtaining the classification results. In addition to kNN [19], we used decision tree (DT) [52], linear discriminant (LD) [53], naïve Bayes (NB) [54], support vector machine (SVM) [55], bagged tree (BT) [56], and subspace discriminant (SD) [57] classifiers for the classification tasks using the 1024 features. It can be noted from Figure 8 that the best results are obtained using the kNN classifier. Therefore, kNN was selected both as the classifier and as the error/loss value generator in the feature selection phase.

The performance parameters (%) obtained for automated COVID-19 detection using cough sound signals are depicted in Table 2. For Case 3, for example, the precision, F1-score, and geometric mean were 99.35%, 99.47%, and 99.59%, respectively (abbreviations in Table 2: AUC, area under the ROC curve; Acc, accuracy; Sen, sensitivity; Spe, specificity; Pre, precision; F1, F1-score; Gm, geometric mean; Rec, recall).
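A sketch of this classifier comparison on the 1024-dimensional feature matrix is given below, using scikit-learn counterparts of the MATLAB learners named above; the "bagged tree" and "subspace discriminant" ensembles are approximated with bagging over decision trees and over LDA with random feature subsets, and the hyperparameters are illustrative rather than those used in the paper.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(X, y):
    """10-fold cross-validated accuracy of several standard classifiers on the
    DNA-pattern feature matrix X (n_segments x 1024)."""
    models = {
        "DT": DecisionTreeClassifier(random_state=0),
        "LD": LinearDiscriminantAnalysis(),
        "NB": GaussianNB(),
        "SVM": SVC(),
        "kNN": KNeighborsClassifier(n_neighbors=1),
        # bagged decision trees as a stand-in for MATLAB's "bagged tree"
        "BT": BaggingClassifier(DecisionTreeClassifier(random_state=0),
                                n_estimators=30, random_state=0),
        # bagged LDA over random feature subsets as a stand-in for "subspace discriminant"
        "SD": BaggingClassifier(LinearDiscriminantAnalysis(),
                                max_features=0.5, n_estimators=30, random_state=0),
    }
    return {name: cross_val_score(model, X, y, cv=10).mean() for name, model in models.items()}
```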
The benefits and limitations of our proposed DNA pattern-based method are given below. The benefits are as follows:
• A new cough sound dataset was collected from healthy subjects and from COVID-19 and HF patients.
• A novel histogram-based feature generator inspired by DNA patterns was presented. To the best of our knowledge, this is the first work to automatically classify healthy subjects, HF, and COVID-19 patients using cough sound signals.
• The proposed DNA pattern- and ImRMR-based model attained greater than 99% accuracy for all (binary and multiclass) defined classification problems.
• The automated cough sound-based model is accurate, economical, rapid, and computationally lightweight.
The limitations of this work are as follows:
• The system should be validated with a larger dataset prior to clinical application.
• Only a three-class system was used (normal, COVID-19, and HF).
We have presented a histogram-based, hand-modeled feature generation function using the DNA molecular pattern. New-generation deep learning models based on molecular shapes can be further studied to improve model performance. A snapshot of cloud-based cough detection via a mobile application is presented in Figure 9.

This paper presents a new automated COVID-19 and HF detection model using cough sounds. The model extracts subtle features from a cough sound signal using a histogram-based feature generator based on the chemical structure of the DNA molecule. The proposed DNA patterns used for feature bit generation, combined with ImRMR and the kNN classifier, yielded accuracies of 99.38%, 100%, and 99.49% for COVID-19 vs. healthy, HF vs. healthy, and COVID-19 vs. HF vs. healthy, respectively. The model is accurate, economical, and computationally lightweight. In the future, we intend to detect asthma in addition to the three classes currently used for cough sound signal analysis.

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: The data presented in this study are available on request from the corresponding author. The data are not publicly available due to restrictions of the Ethics Committee.

References (entry titles as extracted, in citation order):
A deep learning based classification for COVID-19 detection using chest X-ray images
A Guide to Novel Coronavirus (COVID-19) Infection Control for Businesses
The New (Ab) Normal: Reshaping Business and Supply Chain Strategy Beyond COVID-19
COVID-19 imaging: What we know now and what remains unknown
A tale of two pandemics: How will COVID-19 and global trends in physical inactivity and sedentary behavior affect one another?
Antivirus-built environment: Lessons learned from COVID-19 pandemic. Sustain
Unleashing the power of disruptive and emerging technologies amid COVID 2019: A detailed review
COVID-19 myocarditis and prospective heart failure burden
A comparative study of existing machine learning approaches for parkinson's disease detection
Performance comparison of machine learning techniques in identifying dementia from open access clinical datasets
CNN-based transfer learning-BiLSTM network: A novel approach for COVID-19 infection detection
An efficient approach for classifying chest X-ray images using different embedder with different activation functions in CNN
A fuzzy inference-fuzzy analytic hierarchy process-based clinical decision support system for diagnosis of heart diseases
Diagnosing COVID-19 pneumonia from X-ray and CT images using deep learning and transfer learning algorithms
Occupancy-driven energy-efficient buildings using audio processing with background sound cancellation
Improving the generalization ability of deep neural networks for cross-domain visual recognition
Nanoparticle synthesis assisted by machine learning
Application of Petersen graph pattern technique for automated detection of heart valve diseases with PCG signals
kNN-IS: An Iterative Spark-based design of the k-Nearest Neighbors classifier for big data
Congestive heart failure detection using random forest classifier
Components of a new research resource for complex physiologic signals
The PTB Diagnostic ECG Database
Survival of patients with severe congestive heart failure treated with oral milrinone
Automated detection of congestive heart failure from electrocardiogram signal using Stockwell transform and hybrid classification scheme
The impact of the MIT-BIH arrhythmia database
A convolutional neural network approach to detect congestive heart failure
Classification of COVID-19 in chest X-ray images using DeTraC deep convolutional neural network
Lung segmentation in chest radiographs using anatomical atlases with nonrigid registration
Automatic tuberculosis screening using chest radiographs
Classification of the COVID-19 infected patients using DenseNet201 based deep transfer learning
Classification of COVID-19 patients from chest CT images using multi-objective differential evolution-based convolutional neural networks
COVID-19 detection through transfer learning using multimodal imaging data
COVID-19 image data collection: Prospective predictions are the future
Chestx-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases
A ct scan dataset about COVID-19. arXiv 2020
Automatic detection of COVID-19 from a new lung ultrasound imaging dataset (POCUS). arXiv 2020
Very deep convolutional networks for large-scale image recognition
Deep learning with depthwise separable convolutions. arXiv 2017
Inception-v4, inception-resnet and the impact of residual connections on learning
Rethinking the inception architecture for computer vision
Learning transferable architectures for scalable image recognition. arXiv
Deep residual learning for image recognition
COVID-19 detection and disease progression visualization: Deep learning on chest X-rays for classification and coarse localization
Cross-validation metrics for evaluating classification performance on imbalanced data
Graph based feature extraction and hybrid classification approach for facial expression recognition
CGNet: A graph-knowledge embedded convolutional neural network for detection of pneumonia
Graph neural network: Current state of Art, challenges and applications
mr2PSO: A maximum relevance minimum redundancy feature selection method based on swarm intelligence for support vector machine classification
Automated arrhythmia detection using novel hexadecimal local pattern and multilevel wavelet transform with ECG signals
A survey of decision tree classifier methodology
Empirical performance analysis of linear discriminant classifiers
An empirical study of the naive Bayes classifier
The nature of Statistical Learning Theory
A novel fault classification scheme for series capacitor compensated transmission line based on bagged tree ensemble classifier
Ensemble of subspace discriminant classifiers for schistosomal liver fibrosis staging in mice microscopic images
Exploring automatic diagnosis of COVID-19 from crowdsourced respiratory sound data
A Real-time Robot-based Auxiliary System for Risk Evaluation of COVID-19 Infection
Uncertainty-Aware COVID-19 Detection from Imbalanced Sound Data. arXiv 2021
COVID-19 detection system using recurrent neural networks
COVID-19 Cough Classification using Machine Learning and Global Smartphone Recordings. arXiv 2020
Coswara-A Database of Breathing, Cough, and Voice Sounds for COVID-19 Diagnosis. arXiv 2020
The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 cough, COVID-19 speech, escalation & primates. arXiv 2021
Exploring Automatic COVID-19 Diagnosis via voice and symptoms from Crowdsourced Data
Generic Deep Learning Based Cough Analysis System from Clinically Validated Samples for Point-of-Need COVID-19 Test and Severity Levels
An Artificially Intelligent Mobile Application to Detect Asymptomatic COVID-19 Patients using Cough and Breathing Sounds. arXiv 2021
The COUGHVID crowdsourcing dataset: A corpus for the study of large-scale cough analysis algorithms
Diagnosis of COVID-19 and Non-COVID-19 Patients by Classifying Only a Single Cough Sound
Global Applicability of Crowdsourced and Clinical Datasets for AI Detection of COVID-19 from Cough. arXiv 2020
Novel coronavirus cough database: Nococoda
Robust Detection of COVID-19 in Cough Sounds: Using Recurrence Dynamics and Variable Markov Model

We gratefully acknowledge the Ethics Committee of Firat University for data transcription. The authors declare no conflict of interest.