Distortion measure of spectrograms for classification of respiratory diseases
Jeremy Levy, Alexander Naitsat, Yehoshua Y. Zeevi
June 4, 2021

Abstract: A new method for the classification of respiratory diseases is presented. The method is based on a novel class of features, extracted from pulmonary sounds, by parameterizing their spectrograms that are represented as surfaces, and by utilizing geometrical distortions defined with reference to these surfaces. This method yields a set of highly descriptive features for the analysis of pulmonary sound recordings. Furthermore, by combining these features with Mel-frequency cepstral coefficients, we introduce a powerful model for the automatic diagnosis of common respiratory pathologies. Compared with baseline methods, our model achieves superior results for binary and multi-class classifications of common respiratory diseases. Our new approach to the classification of one-dimensional signals is applicable to other signals in the context of their representations in combined spaces or manifolds.

Machine learning has begun to have an impact on various facets of medical signal and image analysis [1], [2]. Computational means for lung sound analysis have been the subject of various studies using machine learning [3] and other related approaches [4], aiming at detecting respiratory pathologies. Nevertheless, the shortage of well-trained physicians who specialize in auscultation of pulmonary diseases is alarming. This is especially the case in certain parts of the world, such as Africa, where the spread of obstructive lung diseases is already beyond control [5]. Since advanced means of mass screening by medical imaging are not an option in such places, let alone that expert radiologists are not available there, there is an urgent need for inexpensive, yet reliable methods for automatic diagnosis of lung diseases by means of an electronic stethoscope.

In view of the above outstanding medical problem, it is of utmost importance to develop new algorithms for the classification of lung sounds and to advance a fully automatic pipeline for the diagnosis of common respiratory pathologies. The physics of sounds and the inherently nonstationary nature of lung sounds lend themselves to analysis in the combined time-frequency space, i.e. the spectrogram. We carry this approach one step further by considering the spectrogram as a surface and by exploiting its geometric properties to generate novel features that are powerful in the classification of lung diseases by means of machine learning.

There exist several deep learning and model-based methods for automatic classification of lung pathologies based on their fingerprints that are hidden in pulmonary sounds. For instance, the recent method of [6] implements a deep transfer-learning-based multi-class classifier for diagnosis of COVID-19, using cough recordings. Chambres et al. [7] employ the algorithm of the Essentia library [8] for extracting sound features from cough recordings. This system was trained on the dataset of the ICBHI 2017 challenge [9] and it uses a boosted decision tree algorithm to classify sounds like crackles and wheezes. Kandaswamy et al. [10] developed a method of analyzing lung sound signals using the wavelet transform, combined with an Artificial Neural Network (ANN) for the classification.
They obtained an accuracy of 94.56% on the validation set and 91.33% on the test set. The dataset was composed of 126 recordings, categorized into inspiratory wheezes, fine crackles, stridor, squawk, and rhonchus, apart from normal vesicular sounds. A similar method with an ANN layer was used in [11] for the analysis of cough sounds recorded from pediatric patients. Sankur et al. [12] proposed an auto-regressive (AR) model combined with a KNN classifier for the task of diagnosing lung sounds. They obtained a Correct Classification (CC) rate of 93.75% on the test set. Their dataset comprised recordings of asthma, chronic bronchitis, emphysema, pneumonia, pleurisy and bronchiectasis pathologies.

The Respiratory Sound Database (https://www.kaggle.com/vbookshelf/respiratory-sound-database) was used for the analysis. A total of 918 lung sound recordings from 126 patients were used. This database incorporates 7 different pathologies: URTI, Asthma, COPD, LRTI, Bronchiectasis, Pneumonia, Bronchiolitis, as well as healthy recordings. The histogram depicted in Figure 1 presents the distribution of the labels among the cases included in the database. Due to the very low occurrence of the Asthma and LRTI pathologies, the corresponding recordings were excluded.

One of the major problems that one must overcome in the analysis of lung sounds is the low S/N level. Sounds generated by instruments and other ambient activities significantly degrade the quality of the lung sound signal. It is therefore crucial to improve the S/N level without distorting the stethoscope's signal. Our algorithm employs the Savitzky-Golay filter [13] for denoising lung sounds. The purpose of this filter is to smooth the signal and increase the SNR without altering the signal. This filter has been widely used in the field of time series analysis [14], especially for lung sound analysis [15]. The filter fits a polynomial to each signal frame, using the least squares method. The central point of the window is replaced with the value of the polynomial, producing a smoother signal. Denote a polynomial of degree N by

p(x) = Σ_{k=0}^{N} a_k x^k;

then the aim of the Savitzky-Golay filter is to minimize the following error:

ε = Σ_{m=−M}^{M} ( p(m) − s[m] )^2,

where 2M + 1 is the width of the window and s[m] are the signal samples in the current frame. A large value of M will produce a smoother signal, but may suppress some important variations in the signal, whereas a low value of M may "overfit" the data. This is, in a way, an uncertainty principle. Similarly, N, which specifies the degree of the polynomial, produces a smooth signal for low values, while a high value of N may "overfit" the data. By experimenting with various combinations of these filter parameters, we converged on the values N = 3, M = 11, which yielded the best results. Figure 2 shows an example of a filtered signal, together with the corresponding raw data.
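To make the denoising step concrete, the following is a minimal sketch of this filtering stage using SciPy's savgol_filter, with the window and polynomial order taken from the values reported above (window length 2M + 1 = 23 samples, degree N = 3); the file name and the mono mixdown are our assumptions.

```python
# A minimal sketch of the Savitzky-Golay denoising stage (assumed parameters:
# polynomial degree N = 3, half-window M = 11, i.e. window length 2M + 1 = 23).
import numpy as np
from scipy.io import wavfile
from scipy.signal import savgol_filter

M, N = 11, 3                                 # half-window and degree from the paper

fs, audio = wavfile.read("lung_sound.wav")   # hypothetical recording file
audio = audio.astype(np.float64)
if audio.ndim > 1:                           # mixdown if the recording is multi-channel
    audio = audio.mean(axis=1)

denoised = savgol_filter(audio, window_length=2 * M + 1, polyorder=N)
```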
Our algorithm for feature extraction can be decomposed into two main parts: a signal processing part and a geometry processing part. In the signal processing part, we preprocess the data and compute spectrogram images of the denoised audio signals. In the geometry processing part, the original signal processing problem is converted into another problem, in which we analyze geometric representations of time-varying signals. The first goal of the geometry processing part is to represent spectrogram images by discrete surfaces, and the second goal is to reduce the problem of signal processing to the problem of analyzing the shapes of the obtained surfaces.

In this process, each sound signal is first identified with its spectrogram surface, and then the measure of dissimilarity between two signals is estimated as the effort it takes to deform one spectrogram surface onto the other. By using various metrics for estimating this effort, we are able to compute different geometrical quantities associated with spectrogram surfaces. We refer to these quantities as geometrical distortions, or distortion measures. In our method, geometrical distortions are used as descriptive features for analyzing the original sound recordings. Based on these features, we train an automatic system for the classification of respiratory diseases from sound recordings.

Summarizing the above, our algorithm for extracting features consists of the following steps:

1) First, we use the FFT algorithm to compute the spectrogram of each lung sound recording. Each recorded signal is represented as a spectrogram surface z(x, y), where x is the time in seconds, y is the normalized frequency and z is the corresponding power per frequency (see Figure 3).

2) We next triangulate the spectrogram surfaces z(x, y) using the Delaunay triangulation algorithm. That is, we take n samples V = {(x_1, y_1, z_1), ..., (x_n, y_n, z_n)} of each spectrogram surface and compute a set T of triangles that connect the vertices of V. The mesh of triangles obtained by the triangulation is denoted by (V, T). It provides a discrete representation of the original sound signal (see the supplemental material for details).

3) We then flatten each triangle mesh into the plane by computing a locally injective, distortion-minimizing parametrization.

4) Finally, we estimate the geometrical distortions induced by the flattening and aggregate them into global features (see Figure 4).

Apparently, the first two stages are aimed at computing a discrete geometric representation of audio signals, whereas the goal of the last two stages is to compare these discrete representations. The details of the above algorithm stages are presented in the next sections.

Assume that S is a manifold surface, embedded in R^3, and that V is a finite set of points (vertices) sampled on S. Then, a common way to discretize S is to divide the surface into a finite set T of triangles such that: (i) the vertices of the obtained triangles belong to V; (ii) for any pair of non-disjoint triangles t_1, t_2 ∈ T, the intersection t_1 ∩ t_2 is either a common edge of t_1 and t_2 or a common vertex of these triangles. We refer to the pair (V, T) as the triangle mesh of the surface S. In our case, each input signal I is represented by a spectrogram surface S = S(I) that can be written in the following parametric form:

S = { (x, y, z(x, y)) | x ∈ X, y ∈ Y },

where X and Y are the time and frequency ranges of the audio signal I. We divide X and Y into a number of uniformly distributed points x_1, ..., x_N and y_1, ..., y_N, and the vertex set V of S is defined by V = { (x_i, y_j, z(x_i, y_j)) | i, j = 1, 2, ..., N }. We tested the above sampling method and found empirically that it works well for N = 150. However, in some scenarios, adaptive sampling can potentially yield even better results; see Section VI for a discussion of more advanced sampling schemes. The triangle set T of S is constructed by the standard algorithm for Delaunay triangulation [16], which maximizes the minimum angle over all the angles of the triangles in T. This triangulation algorithm avoids the generation of slim triangles, whose appearance may lead to numerical issues in the feature extraction stage. After representing the data by triangular meshes, we proceed to the next step of analyzing the geometric properties of these meshes.
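As an illustration of steps 1 and 2, here is a minimal sketch (not the authors' exact pipeline) that computes a spectrogram with SciPy, resamples it on an N × N grid with N = 150 as reported above, and triangulates the planar (x, y) samples with a Delaunay algorithm; the dB scaling and the nearest-neighbor resampling are our assumptions.

```python
# A minimal sketch of the spectrogram-surface construction (assumptions:
# uniform N x N sampling with N = 150, 2D Delaunay triangulation of (x, y)).
import numpy as np
from scipy.signal import spectrogram
from scipy.spatial import Delaunay

def spectrogram_mesh(audio, fs, N=150):
    # Step 1: time-frequency representation z(x, y) of the signal.
    f, t, Sxx = spectrogram(audio, fs=fs)
    z = 10 * np.log10(Sxx + 1e-12)        # power in dB (assumed scaling)

    # Uniformly resample the surface on an N x N grid of (time, frequency).
    xi = np.linspace(t[0], t[-1], N)
    yi = np.linspace(0.0, 1.0, N)         # normalized frequency axis
    ti = np.searchsorted(t, xi).clip(max=len(t) - 1)
    fi = np.searchsorted(f / f[-1], yi).clip(max=len(f) - 1)
    X, Y = np.meshgrid(xi, yi, indexing="ij")
    Z = z[np.ix_(fi, ti)].T               # vertex heights z(x_i, y_j)

    # Step 2: Delaunay triangulation of the planar (x, y) samples.
    V = np.column_stack([X.ravel(), Y.ravel(), Z.ravel()])
    tri = Delaunay(V[:, :2])              # triangle set T over the vertices
    return V, tri.simplices
```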
Given two meshes of spectrogram surfaces, we wish to define a metric suitable for quantifying geometrical (dis)similarities between these meshes. In computer vision, such metrics are often referred to as shape descriptors. There exist many approaches to computing shape descriptors for a collection of 3D objects. These approaches can be divided qualitatively into the following categories:

1) Spectral methods. In spectral approaches, shape descriptors are derived from discrete representations of the Laplace-Beltrami operator, defined on surfaces [17]. Cotangent weights are most commonly used for approximating Laplace-Beltrami operators over meshes. By using cotangent weights, the action of the Laplace operator on a mesh M can be represented by a sparse Laplacian matrix L = L(M). In such a case, the spectral descriptors of M = (V, T) are often defined as the n largest eigenvalues of L, for a constant number n < |V| [18] (a minimal code sketch of this construction is given below, after this overview).

2) Metric methods. These methods represent each mesh M by a matrix G of pairwise distances between the vertices of M; usually, these are Euclidean or geodesic distances. In metric approaches, a dissimilarity measure between two meshes M_1 and M_2 is defined as a function of the distance matrices G_1 and G_2 of these two meshes. For example, metric descriptors of triangulated surfaces can be obtained by solving the problem of Generalized Multi-Dimensional Scaling (GMDS) [19], or by solving other related problems that involve computations of geodesic distances [20], [21].

3) Deformation-based methods. In deformation-based methods, the distance between two shapes S_1 and S_2 is estimated by computing an optimal deformation f_12 of S_1 onto S_2 and by measuring the changes in various geometric features induced by f_12. There exist many criteria for defining a map's optimality. Most of these criteria are targeted at preserving the injectivity of the map and avoiding visual distortions, as much as possible. Note that for a large collection {S_1, ..., S_m} of shapes it may be very expensive to compute optimal deformations f_ij for each pair 1 ≤ i < j ≤ m. Therefore, instead of matching all pairs of shapes, a more practical approach is to compute an optimal mapping f_i of each shape S_i into a simple target domain. Depending on the dimensionality of an object and its topology, a simple target domain could be the plane, a sphere [22], the unit circle [23], a cube [24], or a ball [25].

Our model employs deformation-based descriptors for measuring similarities between triangle meshes. Note that all the meshes that constitute a peak surface of spectrograms have the topology of a planar disc. Therefore, a natural candidate for the optimal deformation of such a mesh M is a length-distortion-minimizing mapping of M into the plane. We refer to this mapping process as surface flattening, for short. In our model, surface flattening algorithms are used for computing deformation-based descriptors of spectrographic shapes. If f is a flattening of a mesh M, then we select the shape descriptors of M to be the geometrical distortions that measure how Euclidean lengths are deformed under f. In this way, each mesh M can be associated with a signature vector (E_1, ..., E_n), where the numbers E_i are various estimates of the metric deviations induced by flattening M into the plane. In the sequel, we address in detail the surface flattening and the distortion estimation processes.
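For concreteness, the following is a minimal sketch of the spectral descriptor idea from category 1. For brevity it uses a uniform-weight graph Laplacian rather than the cotangent-weight Laplace-Beltrami discretization used in the cited works, so it illustrates the construction rather than reproducing those methods.

```python
# A minimal sketch of a spectral shape descriptor: the n largest eigenvalues
# of a mesh Laplacian (uniform weights here, instead of cotangent weights).
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def spectral_descriptor(vertices, triangles, n=20):
    m = len(vertices)
    # Build the vertex adjacency from the triangle set.
    rows, cols = [], []
    for a, b, c in triangles:
        for i, j in ((a, b), (b, c), (c, a)):
            rows += [i, j]
            cols += [j, i]
    W = sp.coo_matrix((np.ones(len(rows)), (rows, cols)), shape=(m, m)).tocsr()
    W.data[:] = 1.0                       # collapse duplicate edges to weight 1
    L = sp.diags(np.asarray(W.sum(axis=1)).ravel()) - W   # graph Laplacian
    # n largest eigenvalues of L, used as the shape signature.
    vals = eigsh(L, k=n, which="LM", return_eigenvectors=False)
    return np.sort(vals)[::-1]
```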
The problem of flattening triangular meshes into the plane, also referred to as the parametrization problem, constitutes one of the central issues in geometry processing. Consequently, there exist many algorithms for flattening triangulated surfaces. These algorithms are aimed at computing a locally-injective parametrization that minimizes distortions of fundamental geometric quantities, such as angles and lengths. Surface parametrization tasks can be reduced to the following optimization problem:

f* = argmin_f E(f)   subject to   det(df_t) > 0 for every t ∈ T,   (2)

where f* is a piecewise affine mapping of a mesh (V, T) that minimizes the chosen distortion criterion E under the following constraints: for each mesh triangle t, the component of f* on t is an orientation-preserving map. These constraints are expressed by the signs of the determinants of the Jacobian matrices df_t, t ∈ T. Negative Jacobian determinants yield inverted triangles in the image of f. Therefore, satisfying the orientation constraints is a necessary condition for inducing a one-to-one parametrization of surface meshes.

We adopt the recently-proposed Adaptive Block Coordinate Descent (ABCD) algorithm [28], combined with the Tutte embedding method [29], to solve the optimization problem (2) and thereby the parametrization problem. In particular, we initialize the parametrization problem (2) by mapping triangular meshes onto a circle via the method of [29]. We then employ the ABCD algorithm to induce a locally injective parametrization characterized by minimal length distortions. Note that, since (2) is a non-convex problem, solving it with different initial maps may lead to distinct local minima. Therefore, choosing an appropriate initialization method is crucial for an adequate approximation of the global minimizer f*. We tested a number of different initialization schemes and found that using a convex combination mapping of meshes [29] onto a planar disc yields the best results. Note that the algorithm of [29] is actually a variant of the classical Tutte embedding algorithm that is widely used in shape processing applications. This method guarantees a bijective mapping onto convex planar domains and has a low computational cost. Figure 3 demonstrates this initialization scheme and the related process of distortion minimization.

We proceed to discuss the process of feature extraction. It includes a local sub-step, in which features of individual triangles are extracted, and a global sub-step, in which the local features are summed over large subsets of mesh triangles. If M = (V, T) is a triangle mesh and f is a simplicial mapping of M, then the local distortion induced by f on a triangle t is defined to be a function E(σ_1, σ_2) of the singular values σ_1(df_t) and σ_2(df_t) of the Jacobian df_t. The Jacobian singular values uniquely define the shape of a triangulated surface, up to rotation and sliding of mesh triangles. Generally speaking, local distortions estimate how extensively the shape of t is distorted under f. These measures are instrumental in many applications in computer vision, including shape classification and shape analysis [25], [30]. In our algorithm, geometric distortions are used as measures of dissimilarity of triangulated surfaces. Note that for a dense triangulation, feeding the singular values {σ_i(df_t) | t ∈ T, i = 1, 2} to a deep learning model preserves all the information contained in the pixels of the spectrogram.
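The per-triangle singular values can be computed directly from the vertex positions of a triangle before and after the mapping. Below is a minimal sketch under the assumption that the source triangle lives in R^3 and its image lies in the plane; the source triangle is first expressed in a local 2D orthonormal frame, so that the Jacobian of the affine map on the triangle is a 2 × 2 matrix.

```python
# A minimal sketch: singular values of the Jacobian of a simplicial map on one
# triangle. p0, p1, p2 are the 3D source vertices; q0, q1, q2 their 2D images.
import numpy as np

def jacobian_singular_values(p0, p1, p2, q0, q1, q2):
    # Local orthonormal frame (u, v) in the plane of the source triangle.
    e1, e2 = p1 - p0, p2 - p0
    u = e1 / np.linalg.norm(e1)
    n = np.cross(e1, e2)
    v = np.cross(n / np.linalg.norm(n), u)

    # Source and target edge matrices in 2D coordinates.
    S = np.array([[e1 @ u, e2 @ u],
                  [e1 @ v, e2 @ v]])         # source edges in the local frame
    Q = np.column_stack([q1 - q0, q2 - q0])  # image edges in the plane

    J = Q @ np.linalg.inv(S)                 # Jacobian of the affine map on t
    return np.linalg.svd(J, compute_uv=False)  # (sigma_1, sigma_2)
```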
Our algorithm employs several distortion measures. These distortions belong to the following major classes of geometric measures.

Isometric distortions. These measures estimate distortions of the Euclidean length. We use the following isometric distortions:

• ARAP (As-Rigid-As-Possible) energy [31]: E_ARAP(σ_1, σ_2) = (σ_1 − 1)^2 + (σ_2 − 1)^2;

• Symmetric Dirichlet energy [32]: E_SD(σ_1, σ_2) = σ_1^2 + σ_2^2 + σ_1^{−2} + σ_2^{−2};

• Quasi-isometric (QI) dilatation [33], [34]: E_QI(σ_1, σ_2) = max{σ_1, σ_2^{−1}}.

Conformal distortions. These distortions estimate how far f is from being an angle-preserving mapping. Our algorithm uses the following estimate of conformal distortion:

• MIPS energy [36], [37]: E_MIPS(σ_1, σ_2) = σ_1/σ_2 + σ_2/σ_1.   (5)

MIPS (Most Isometric Parametrizations) is widely used for optimizing conformal distortions over triangular domains [37].

Area distortions. These distortions estimate the dilatation and compression of triangle areas induced by f. We use the following measure of area distortion:

• Unsigned area distortion [38]: E_AD(σ_1, σ_2) = max{ |σ_1 σ_2|, |σ_1 σ_2|^{−1} }.

Scale distortions. These distortions assess the degree to which mesh triangles are scaled by f. Scale distortions are closely related to discrete harmonic mappings [39] and to stretch-minimization mappings [40]. We use the following scale distortion:

• Conformal factor [30].

Note that conformal factors are closely related to conformal distortions such as the quasi-conformal dilatation and the MIPS energy. Indeed, according to the uniformization theorem [41], any disc-topology surface S can be mapped into the plane by a conformal map f_S. The map f_S can be described by its conformal factors, up to a composition of f_S with a rigid transformation. For this reason, the conformal factor has been used in [30] as a geometric signature for a collection of 3D surfaces. All these distortion measures are rotation-invariant, since they are functions of the signed singular values of the Jacobian.

This work aims to show that the dimensionality of the data can be considerably reduced by employing weighted sums of local distortions over different subsets of T. The obtained quantities are referred to as global distortions. Let f be a simplicial map of the mesh (V, T), let E be a local distortion, and let T_0 be a subset of T. The global distortion of f, computed with respect to E over T_0, is then defined as follows:

D_{T_0}(f, E) = ( Σ_{t ∈ T_0} area(t) · E(σ_1(df_t), σ_2(df_t)) ) / ( Σ_{t ∈ T_0} area(t) ),   (9)

where df_t is the Jacobian of f on t, σ_1(df_t) and σ_2(df_t) are the Jacobian singular values and area(t) denotes the area of the triangle t.

In many cases, the values of local distortions are distributed non-uniformly over the mesh triangles. As demonstrated by Figures 4 and 5, a small number of highly distorted triangles may have more impact on the global distortion D(f, E) than the rest of the mesh triangles. Therefore, in order to extract more information from each distortion measure, one can divide the triangle set into a number of disjoint subsets. We employ this approach to extract more features for each distortion measure E(σ_1, σ_2) and to compensate for the adverse effects of a non-uniform distribution of distortions. In particular, we divide the triangles into two subsets according to triangle frequency.

Fig. 4. Flattening surfaces and measuring the resultant geometrical distortions, attained as shape descriptors. The process is visualized for the MIPS distortion energy, defined by (5). We first use the ABCD algorithm [28], initialized with Tutte embedding, to map the triangulated surfaces into the plane. We then compute the Jacobian singular values σ_1(t) and σ_2(t) over each mesh triangle t. Finally, for each distortion measure E(σ_1, σ_2) and mapping f we compute the two quantities E_1(f) and E_2(f), defined according to (9) and (10).
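The following sketch collects the closed-form distortion measures listed above into code, together with the area-weighted aggregation of (9); the normalization by total area is our reading of the "weighted sum" in the text.

```python
# A minimal sketch of the local distortion measures and the global
# area-weighted aggregation of equation (9). s1, s2 are arrays of Jacobian
# singular values with s1 >= s2 > 0 (orientation-preserving triangles).
import numpy as np

def E_arap(s1, s2):   return (s1 - 1.0)**2 + (s2 - 1.0)**2
def E_sd(s1, s2):     return s1**2 + s2**2 + s1**-2 + s2**-2
def E_qi(s1, s2):     return np.maximum(s1, 1.0 / s2)
def E_mips(s1, s2):   return s1 / s2 + s2 / s1
def E_area(s1, s2):   return np.maximum(s1 * s2, 1.0 / (s1 * s2))

def global_distortion(E, s1, s2, areas):
    """Area-weighted average of the local distortion E over a triangle subset."""
    w = areas / areas.sum()
    return float(np.sum(w * E(s1, s2)))
```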
The frequency of a triangle t is defined as the frequency of the center of gravity of t. The median over the frequencies of all the triangles of the surface is computed. Then, the global distortion D_1(f, E) is computed for all the triangles with a frequency lower than the median (t ∈ T_1), and a second global distortion D_2(f, E) is computed for the triangles with a frequency greater than the median (t ∈ T_2). In this way, each distortion measure E contributes the following two features: D_1(f, E) and D_2(f, E). We denote these features by E_1(f) and E_2(f), for short. That is,

E_i(f) = D_i(f, E),  i = 1, 2,   (10)

where D_i is defined according to (9). To summarize, we measure global distortions over the two subsets of triangles and use the obtained quantities as shape descriptors of spectrogram surfaces. This approach has the following advantages over the previously available distortion-based models for shape analysis [30], [25]:

2) The overall number of features is further increased by dividing the distortions into low and high frequencies.

3) The method operates on triangular meshes instead of tetrahedral meshes. Compared with the volumetric method of [25], the feature extraction in our algorithm has a lower computational cost.

A high-level overview of the proposed model is summarized in Figure 6.

A baseline (i.e., a reference model) has to be created for comparison with the proposed model. A different approach was selected for this purpose, based on a set of handcrafted features. Twelve Mel Frequency Cepstrum Coefficients (MFCCs) were extracted from the audio files. The MFCC is the most widely used feature extraction method in automatic speech recognition [42]. In the feature extraction phase, 6 statistical parameters were extracted from each of the 12 MFCC coefficients: the mean, standard deviation, minimum, maximum, mean of the absolute difference, and standard deviation of the absolute difference; altogether, 72 features.
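A minimal sketch of this baseline feature extraction is given below, using librosa's MFCC implementation; the library choice and the default frame parameters are our assumptions, since the paper only specifies the 12 coefficients and the 6 statistics.

```python
# A minimal sketch of the MFCC baseline: 12 coefficients, 6 statistics each,
# i.e. 72 features per recording. librosa is an assumed implementation choice.
import numpy as np
import librosa

def mfcc_features(audio, fs, n_mfcc=12):
    mfcc = librosa.feature.mfcc(y=audio, sr=fs, n_mfcc=n_mfcc)  # (12, frames)
    d = np.abs(np.diff(mfcc, axis=1))        # absolute frame-to-frame difference
    stats = [mfcc.mean(axis=1), mfcc.std(axis=1),
             mfcc.min(axis=1),  mfcc.max(axis=1),
             d.mean(axis=1),    d.std(axis=1)]
    return np.concatenate(stats)             # 6 stats x 12 coefficients = 72
```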
The reference model was applied with all the classifiers, at the same training cost as in the case of the proposed model. The models that we use for comparison were trained and tested on the same train/test subdivision of the data. The proposed model was first applied with each distortion measure separately, to assess the efficiency of each of them. Combinations of the distortion measures were then tested. Finally, the MFCC-based model was combined with the proposed model (based on distortion measures). For each recording, 88 features were computed: the 16 features based on distortion measures (2 for each distortion), and the 72 features based on the MFCC coefficients. As the number of features increased significantly, a feature selection step was applied, based on the ranking of features determined by the Random Forest classifier. Altogether, 45 features were selected.

A total of two classification tasks were conducted: a multi-class classification, with the 5 pathologies and the healthy recordings, and a binary classification, for each of the 5 pathologies against the class of healthy recordings. The dataset was subdivided into an 80% training set and a 20% test set. For each task, several classifiers were evaluated: Logistic Regression (LR), Support Vector Machine (SVM), Random Forest (RF), K-Nearest Neighbors (KNN), and AdaBoost (AB). For all of these models we used the 16 engineered features. For each model, hyper-parameters such as the number of estimators or the number of neighbors were optimized using 5-fold cross-validation. A large random grid of hyper-parameters was searched (see the Supplements). In the case of the multi-class classification, the performance measure used for optimization was the accuracy, whereas for the binary tasks the area under the receiver operating characteristic curve (AUROC) was used. A weight was assigned to each class, inversely proportional to the class frequency in the training set. For each iteration of the cross-validation, the training examples were divided into training and validation sets by stratifying among patients, which means that all recordings from the same patient are always in the same set. All the models were trained and tested on the same data split. That is, for all the models, the database was split into the same training and test subsets.

The following metrics were used for the performance evaluation:

Accuracy = (TP + TN) / (P + N),   Sensitivity = TP / P,   Specificity = TN / N,

where TP, TN, FP, FN are, respectively, the numbers of True Positives, True Negatives, False Positives and False Negatives; P is the number of positive samples and N is the number of negative samples. The area under the ROC curve, AUROC, is also computed.

The results of the MFCC baseline model are summarized in Table I, the results obtained by each classifier are summarized in Table II, and the results of the binary classification appear in Table III. For clarity, we report only the best results of the binary classifications in Table III. The ranking of the features according to their importance, as determined by the Random Forest classifier, is depicted in Figure 7. The proposed model obtains a better AUROC than the baseline models for almost all the binary tasks. For differentiation of the pneumonia pathology from the rest of the diseases, the proposed model yields a lower AUROC value than the baseline model (0.87 vs 0.90).

After comparing the proposed model with the MFCC-based model, a combination of the models was implemented as follows: for each recording, the feature vectors from the 2 models were concatenated, leading to a total of 88 features. Figure 8 presents the ranking of the features, for each of the 5 binary tasks and the multi-class task of identifying the 5 pathologies. Although there are 16 distortion-measure features and 72 MFCC features, for most of the pathologies the occurrence of the distortion-measure features is relatively high. In particular, there are six distortion measures among the ten most highly-ranked features for the Bronchiectasis and URTI pathologies. Likewise, distortion measures appear among the four most highly-ranked features used in the classification of the Bronchiolitis and COPD diseases. However, if our task is to identify a Pneumonia lung sound, then only a single distortion measure appears in the feature ranking list. Indeed, for this pathology the MFCC-based model outperformed the proposed hybrid model.

Table IV presents the results of the combined model for the binary tasks. This hybrid model achieves an AUROC of 1.00 for the Bronchiectasis, Bronchiolitis and COPD diseases. To summarize, the combination of both approaches with the feature selection step improved the results. Finally, Figure 9 presents the accuracy of the combined model on the test set for the multi-class classification, as a function of the number of features selected. The most accurate classification results are achieved with 40 features. The accuracy decreases with a higher number of features, as the model begins to overfit the data.
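A minimal sketch of the patient-stratified cross-validation and class weighting described above, using scikit-learn, is shown below; the Random Forest estimator and the binary-task scoring are illustrative choices, and GroupKFold enforces that recordings of one patient never span both the training and the validation fold.

```python
# A minimal sketch of the evaluation protocol: patient-grouped 5-fold CV with
# class weights inversely proportional to class frequency.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GroupKFold
from sklearn.metrics import roc_auc_score

def cross_validate(X, y, patient_ids, n_splits=5):
    aucs = []
    for train_idx, val_idx in GroupKFold(n_splits).split(X, y, groups=patient_ids):
        clf = RandomForestClassifier(class_weight="balanced", random_state=0)
        clf.fit(X[train_idx], y[train_idx])
        scores = clf.predict_proba(X[val_idx])[:, 1]   # binary task assumed
        aucs.append(roc_auc_score(y[val_idx], scores))
    return float(np.mean(aucs))
```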
The successful penetration of innovative machine learning methodologies into certain disciplines of medical diagnostics, such as radiology, has increased the interest in achieving an automatic process for the analysis of lung sounds. Such a process can be used for a second opinion in the diagnosis of pulmonary diseases, it can be useful for monitoring patients in critical care units, and it may even substitute for medical experts in mass screening, for example for malaria, in parts of the world where there is a severe deficiency in medical expertise and far from sufficient manpower. With these goals in mind, a new approach has been proposed in this work, based on the geometric properties of the spectrographic representation of lung sounds. This model outperformed the MFCC-based model in four out of the five pathologies for which we had access to sufficient data. Further, as we have shown, the two approaches can be combined by concatenating feature vectors, or by linear regression between the models' output probabilities. These approaches can even be incorporated into a deep learning framework, which has emerged as the most widely used approach in the field of machine learning. Such a hybrid paradigm has recently attracted the attention of the AI community [43]. It should be noted that our model matches the performance of the existing deep learning methods, while requiring far fewer data samples for training. Furthermore, compared with purely data-driven approaches, our handcrafted features allow a better theoretical understanding of the sound classification problem. In particular, by analyzing the properties of simplicial maps we can identify distortion-preserving transformations of audio signals in the time-frequency domain (see Figure 10 for an illustration).

With the available, rather limited dataset, we cannot predict what the optimal subset of features to select would be in the combined method that incorporates MFCCs and distortion measures. Indeed, the subset of features varies from disease to disease. A larger dataset is needed to assess the globally optimal subset of features.

Our novel approach to classification, based on distortion measures, highlights an interesting direction for future research: considering higher-dimensional distortion measures. In particular, a spectrogram surface can be represented by the tetrahedral volume enclosed by that surface and the plane z = 0. In such a case, sound spectrograms can be characterized by 3D distortions induced by mapping tetrahedral meshes into canonical domains. Although the tetrahedral approach has a higher computational cost, it can potentially yield more accurate results, because volumetric distortions detect both changes made to the boundary surface and changes made to the interior volume. It should also be interesting to combine our model with various other types of shape descriptors, such as the metric and spectral geometrical features listed in Section III-E. We also plan to examine more methods for discretizing the spectrogram images. In particular, a curvature-based method [44] can be used for a more accurate sampling of spectrogram images and for building triangle meshes with an optimal number of vertices. Finally, we stress that our approach to the classification of one-dimensional signals is also applicable to higher-dimensional signals.
Distortion measures can be extended in a straightforward manner to R^n and to piecewise linear manifolds embedded in R^n, for any n ≥ 2. Indeed, if f : R^n → R^n is a simplicial map and s is an n-dimensional simplex, then the local distortion of f over s can be expressed by a function E(σ_1, σ_2, ..., σ_n), where σ_i denotes the i-th singular value of the Jacobian matrix df_s ∈ R^{n×n}. In this way, our distortion-based analysis of surfaces extends to m-manifolds embedded in R^n and to their discrete representations, for any 2 ≤ m ≤ n. For instance, consider a simultaneous recording of different time-varying signals, such as pulmonary sounds, heart rate, oxygen saturation, body plethysmography, etc. Instead of computing spectrograms for each signal separately, one can represent an n-channel data stream by a 2-manifold embedded in R^n. The obtained manifold can be discretized using the sampling method of [44] and a Delaunay-based algorithm for triangulation. The extension of our approach to manifolds thereby allows a more general analysis of multichannel biomedical data, collected from various devices. We therefore envision other applications of the proposed distortion-based model in related fields of biomedical signal processing, medical imaging and voice recognition.

For the Random Forest classifier, the grid focused on:
• Number of estimators: 100, 110, 120, 150, 200, 250, 300.
• Number of features to consider at every split: either all features, or the square root of the overall number of features.
• Maximum number of levels in a tree: from 10 to 110, with a step of 10.
• Minimum number of samples required to split a node: 2, 5, 10.
• Minimum number of samples required at each leaf node: 1, 2, 4.
• Enable/disable bootstrap.

For the SVM classifier, the grid contained:
• Regularization parameter C (the strength of the regularization is inversely proportional to C): an exponential search was used, from 10^{-9} to 10^{13}.
• A Radial Basis Function (RBF) kernel was used.
• Degree of the kernel function (relevant only for a polynomial kernel): 1, 2, 3, 4, 5, 10.
• Value of gamma: an exponential search was used, from 10^{-9} to 10^{13}.
• Whether to use the shrinking heuristic.

Regarding the KNN classifier, the random grid had the following features:
• Number of neighbors to use for k-neighbors queries: small numbers were tried, such as 3, 5, 7, 11, and then numbers from 41 to 131, with a step of 10.
• Weight function used in prediction: either uniform, or distance (meaning points are weighted by the inverse of their distance).
• The algorithm used: ball tree, kd-tree, or brute force.
• The power parameter p of the Minkowski metric: 1 for the Manhattan distance, 2 for the Euclidean distance.

In the case of the Logistic Regression model:
• The norm used in the penalization: L1, L2, or elastic net.
• Tolerance for the stopping criteria: an exponential search was used, from 10^{-9} to 10^{13}.
• Inverse of the regularization strength, C: an exponential search was used, from 10^{-9} to 10^{13}.
• The algorithm used in the optimization problem: liblinear, lbfgs, sag, saga or newton-cg.

AdaBoost combines a series of weak learners, with the aim of creating an improved classifier. The weak learners vote for the final prediction label, and majority voting is performed. The grid search for the AdaBoost classifier (a sketch of such a randomized search is given after this list) consists of:
• Number of estimators: a linear search was used, from 50 to 200 with a step of 10.
• Learning rate: an exponential search was used, from 10^{-9} to 10.
• Base estimator: RandomForest, DecisionTree, SVM, LogisticRegression.
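As referenced above, the following is a minimal sketch of such a randomized search over one of the grids (here, the Random Forest grid), using scikit-learn's RandomizedSearchCV; the number of sampled configurations is our assumption.

```python
# A minimal sketch of the randomized hyper-parameter search over the Random
# Forest grid described above, using scikit-learn's RandomizedSearchCV.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

param_grid = {
    "n_estimators": [100, 110, 120, 150, 200, 250, 300],
    "max_features": [None, "sqrt"],          # all features or their square root
    "max_depth": list(range(10, 111, 10)),
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 4],
    "bootstrap": [True, False],
}

search = RandomizedSearchCV(
    RandomForestClassifier(class_weight="balanced", random_state=0),
    param_distributions=param_grid,
    n_iter=50,                  # number of sampled configurations (assumed)
    scoring="roc_auc",          # AUROC used for the binary tasks
    cv=5,
    random_state=0,
)
# search.fit(X_train, y_train)  # X_train, y_train: feature matrix and labels
```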
A Delaunay triangulation of a finite set P ⊂ R^3 is defined as a collection T of triangles such that:
• conv(P) = ∪_{T ∈ T} T;
• P = ∪_{T ∈ T} V(T), where V(T) denotes the vertex set of a triangle T;
• for every distinct pair T, U ∈ T, the intersection T ∩ U is either a common vertex, a common edge, or empty.

For a finite set P ⊂ R^3, the Delaunay triangulation is unique if the points are in general position, which means that no 5 points are cospherical. The assumption of a unique Delaunay triangulation has been made in this work.

References
[1] A survey on deep learning in medical image analysis.
[2] Artificial intelligence outperforms pulmonologists in the interpretation of pulmonary function tests.
[3] Application of semi-supervised deep learning to lung sound analysis.
[4] A multiresolution analysis for detection of abnormal lung sounds.
[5] Prevalence of obstructive lung disease in an African country using definitions from different international guidelines: a community based cross-sectional survey.
[6] AI4COVID-19: AI enabled preliminary diagnosis for COVID-19 from cough samples via an app.
[7] Automatic detection of patient with respiratory diseases using lung sound analysis.
[8] Essentia: An audio analysis library for music information retrieval. International Society for Music Information Retrieval (ISMIR).
[9] A respiratory sound database for the development of automated classification.
[10] Neural classification of lung sounds using wavelet coefficients.
[11] Cough sound analysis for pneumonia and asthma classification in pediatric population.
[12] Comparison of AR-based algorithms for respiratory sounds classification.
[13] Smoothing and differentiation of data by simplified least squares procedures.
[14] A simple method for reconstructing a high-quality NDVI time-series data set based on the Savitzky-Golay filter.
[15] Savitzky-Golay filter for denoising lung sound.
[16] Geometry and topology for mesh generation.
[17] Laplace-Beltrami spectra as 'shape-DNA' of surfaces and solids.
[18] Laplace-Beltrami eigenfunctions for deformation invariant shape representation.
[19] Generalized multidimensional scaling: a framework for isometry-invariant partial surface matching.
[20] Geodesic object representation and recognition.
[21] On bending invariant signatures for surfaces.
[22] Map-based exploration of intrinsic shape differences and variability.
[23] Algorithms to automatically quantify the geometric similarity of anatomical surfaces.
[24] Volumetric mapping of genus zero objects via mass preservation.
[25] Geometric approach to detecting volumetric changes in medical images.
[26] Mesh parameterization: Theory and practice.
[27] On inversion-free mapping and distortion minimization.
[28] Adaptive block coordinate descent for distortion optimization.
[29] One-to-one piecewise linear mappings over triangulations.
[30] Characterizing shape using conformal factors.
[31] As-rigid-as-possible surface modeling.
[32] Bijective parameterization with free boundaries.
[33] Geometric approach to estimation of volumetric distortions.
[34] Bounded-distortion piecewise mesh parameterization.
[35] Computing quasi-conformal maps in 3D with applications to geometric modeling and imaging.
[36] MIPS: An efficient global parametrization method.
[37] Computing locally injective mappings by advanced MIPS.
[38] An adaptable surface parameterization method.
[39] Reversible harmonic maps between discrete surfaces.
[40] Signal-specialized parameterization.
[41] The uniformization theorem.
[42] Invited paper: Automatic speech recognition: History, methods and challenges.
[43] Model-based and data-driven strategies in medical image computing.
[44] Sampling and reconstruction of surfaces and higher dimensional manifolds.