key: cord-0790132-z8i2qcov authors: Hsiao, Wei-Ting; Kan, Yao-Chiang; Kuo, Chin-Chi; Kuo, Yu-Chieh; Chai, Sin-Kuo; Lin, Hsueh-Chun title: Hybrid-Pattern Recognition Modeling with Arrhythmia Signal Processing for Ubiquitous Health Management date: 2022-01-17 journal: Sensors (Basel) DOI: 10.3390/s22020689 sha: 44c3d44ae3089ce50a574167719810583ea50da2 doc_id: 790132 cord_uid: z8i2qcov We established a web-based ubiquitous health management (UHM) system, “ECG4UHM”, for processing ECG signals with AI-enabled models to recognize hybrid arrhythmia patterns, including atrial premature atrial complex (APC), atrial fibrillation (AFib), ventricular premature complex (VPC), and ventricular tachycardia (VT), versus normal sinus rhythm (NSR). The analytical model coupled machine learning methods, such as multiple layer perceptron (MLP), random forest (RF), support vector machine (SVM), and naive Bayes (NB), to process the hybrid patterns of four arrhythmia symptoms for AI computation. The data pre-processing used Hilbert–Huang transform (HHT) with empirical mode decomposition to calculate ECGs’ intrinsic mode functions (IMFs). The area centroids of the IMFs’ marginal Hilbert spectrum were suggested as the HHT-based features. We engaged the MATLAB(TM) compiler and runtime server in the ECG4UHM to build the recognition modules for driving AI computation to identify the arrhythmia symptoms. The modeling extracted the crucial data sets from the MIT-BIH arrhythmia open database. The validated models, including the premature pattern (i.e., APC–VPC) and the fibril-rapid pattern (i.e., AFib–VT) against NSR, could reach the best area under the curve (AUC) of the receiver operating characteristic (ROC) of approximately 0.99. The models for all hybrid patterns, without VPC versus AFib and VT, achieved an average accuracy of approximately 90%. With the prediction test, the respective AUCs of the NSR and APC versus the AFib, VPC, and VT were 0.94 and 0.93 for the RF and SVM on average. The average accuracy and the AUC of the MLP, RF, and SVM models for APC–VT reached the value of 0.98. The self-developed system with AI computation modeling can be the backend of the intelligent social-health system that can recognize hybrid arrhythmia patterns in the UHM and home-isolated cares. In the artificial intelligence (AI) era, ubiquitous health management (UHM) and care services have been made to be compliant with smart medical techniques. With communication computing technology, physiological signals can be routinely tracked by wearable sensors at home for clinical diagnosis of chronic diseases (e.g., arrhythmia) and even home-isolated cares (e.g., [1, 2] . Medical informatics with AI computation can be targeted in a recognition model to efficiently identify clinical data characteristics for health risk assessment and prevention [3] [4] [5] . The AI-enabled machine learning (ML) model typically consists of featuring and data training, which pre-processes the labeled samples (e.g., specific symptoms in the electrocardiogram (ECG) of arrhythmia) and explores the of the EMD is equivalent to FFT [28] . The IMF can further constitute the Hilbert spectrum (HS) including the instantaneous frequency and energy in the time-frequency domain. The HHT has been widely used for wavelet analysis of the electroencephalogram and ECG in biomedical engineering and informatics [29] [30] [31] [32] . In this study, we developed a prototype of a UHM system (UHMS) and conducted an AI computation with multiclass recognition models composed of the ML methods (e.g., MLP, RF, SVM, and NB) and the HHT-based features to identify the hybrid patterns of four arrhythmia symptoms (APC, AFib, VPC, and VT) versus normal sinus rhythm (NSR). The system was constructed using the Java and MATLAB TM languages. The MIT-BIH arrhythmia open database was used in this study to assess multiclass recognition modeling in the proposed UHMS. As shown in Figure 1 , the AI-compliant computation processes four analytical slices including data pre-processing, featuring, labeling, and machine learning behind the UHMS. The self-developed modules, "ECGHHT" and "MMLCA", were built using MATLAB TM (license no. 40697750) to progress the featuring and machine learning processes, respectively, for achieving the recognition scope. The computation processes request manual evaluation at the data training stage but achieve automatic processing at the final prediction stage. As training data, the input signals were automatically transformed to the HHT-based features by the ECGHHT module in the slices of data pre-processing and featuring; then, they needed manual evaluation with the MMLCA module in the slices of labeling and machine learning before prediction. The computation skips the slices in manual evaluation and approaches completely automatic mode for prediction. The MIT-BIH arrhythmia database's resource web site [18] presents the ECG data with annotations at the interest time points for various symptoms. We adopted the symptoms, including the normal status (NSR) and the abnormal conditions in the atrium (APC and AFib) and the ventricle (VPC and VT), for sampling in our model. The waveforms of Figure 1 . The AI-compliant modeling process with HHT-based featuring and coupling ML analysis for multiclass recognition of arrhythmia ECG data. The MIT-BIH arrhythmia database's resource web site [18] presents the ECG data with annotations at the interest time points for various symptoms. We adopted the symptoms, including the normal status (NSR) and the abnormal conditions in the atrium (APC and AFib) and the ventricle (VPC and VT), for sampling in our model. The waveforms of the arrhythmia described above are shown in Figure 2a The NSR is the rhythm originating from the sinus node and is observed in a healthy human heart. The NSR presents the heart rate in its normal range, and the P waves are regular on ECGs with conformable rates. The APCs perform premature heartbeats originating in the atria to cause contractions for heart palpitations or unusual heartbeats. The beats of APCs are irregularly timed too fast or slow in an atrial cycle. The AFib is the common arrhythmia that shows the RR intervals following no repetitive pattern and no distinct P waves in an atrial cycle length with a PP interval less than 200 milliseconds. The VPCs are caused by ventricular myocardium situations including patients without structural heart disease. The VT displays a rate greater than 120 beats per minute and at least three vast QRS complexes in a row. The symptom can be classified as sustained versus non-sustained based on whether it lasts more or less than 30 s, respectively. The NSR is the rhythm originating from the sinus node and is observed in a healthy human heart. The NSR presents the heart rate in its normal range, and the P waves are regular on ECGs with conformable rates. The APCs perform premature heartbeats originating in the atria to cause contractions for heart palpitations or unusual heartbeats. The beats of APCs are irregularly timed too fast or slow in an atrial cycle. The AFib is the common arrhythmia that shows the RR intervals following no repetitive pattern and no distinct P waves in an atrial cycle length with a PP interval less than 200 milliseconds. The VPCs are caused by ventricular myocardium situations including patients without structural heart disease. The VT displays a rate greater than 120 beats per minute and at least three vast QRS complexes in a row. The symptom can be classified as sustained versus non-sustained based on whether it lasts more or less than 30 s, respectively. We noticed the waveform of an AFib would rapidly unite the APC's waveforms and is similar to a VT that combines with the VPCs. This study, therefore, defined the premature pattern (i.e., APC and VPC) and the fibril-rapid pattern (i.e., AFib and VT) for analysis. In clinical practices, the APC and VPC often appear in ordinary people due to the fact of anxiety and, in such cases, it is not regarded as a severe disease [33] . All points of interest of the specified arrhythmia symptoms annotated on the MIT-BIH website were referred to as the official samples. Their symptom ID labeled the specimens as NSR, APC, AFib, VPC, and VT (i.e., 1, 2, 3, 4, and 5, respectively). The featuring frame, which can contain a complete waveform for different symptoms within a unified period, was applied to screen the samples. For example, the APC and VPC can be mixed with the NSR in a 3 s frame. According to the MIT-BIH website's analytical resource for all 48 patients, the 30 min recording was adopted from each patient's ECG, and several points of interest were annotated for the arrhythmia symptoms (i.e., one point denoted a symptom around the timestamp as shown in Figure 2 ). One patient's recording could have some but not all of the five arrhythmia symptoms. We then screened the recording of five patients (#100, #105, #202, #203, and #205), who were observed with many interesting symptoms related to our study. For example, patient #202's recording was annotated for NSR, APC, AFib, and VPC with 2, 3, 7, and 4 points of interest, respectively; we also observed an additional 507, 14, 278, and 14 points (i.e., we acquired 509, 17, 285, and 18 samples of NSR, APC, AFib, and VPC from patient #202). Finally, we extracted 4190 samples including 176 samples from all patients' points of interest for NSR (35) , APC (11), AFib (23), VPC (24) , and VT (83) and 4014 samples self-observed for NSR (2964), APC (97), AFib (649), VPC (272), and VT (32); the total amount of NSR, APC, AFib, VPC, and VT samples were 2999, 108, 672, 296, and 115, respectively. Each sample covered a 3 s frame including a complete waveform of the symptom for a point of interest, and the signals in a frame could be transformed to the features set in a record through HHT processing. However, the amounts of different symptoms were still insufficient and unbalanced in the observed samples. We then designed three sample data sets below for analysis. With the resource limitations, we designed crucial, simulative, and test data sets in the modeling. We arbitrarily adopted 300 equivalently labeled data (i.e., 60 records per symptom) for the crucial data set (A) and 3890 remaining records (i.e., 2939, 48, 612, 236, and 55 records for NSR, APC, AFib, VPC, and VT, respectively) for the test data set (B). In addition, we created a simulative data set (C) that had 5000 equivalently labeled data (i.e., 1000 records for each symptom) for comparison. That is, we separated the observed samples into data set (A) and (B) for data training and testing, respectively; then, we created data set (C) with the same amount of the five symptoms, which were produced by normal distribution due to the mean (µ) and standard deviation (σ) of all observed data for the simulation. In this study, data set (C), independent of the extracted samples but containing an abundant quantity of every arrhythmia feature, supplemented the sufficient and balanced data compared to data set (A) for machine learning. In other words, we applied the feature's mean and standard deviation, as shown in Table 1 , and employed MATLAB's "normrnd" function that can generate a random data set in normal distribution to create the simulative data set. The data set can randomly fill enough data into the same range as crucial feature's distribution. We created several simulative data sets for analysis with the randomized process and adopted the appropriate one for demonstration. The data sets (A) and (C) were taken for training data with 5-fold cross-validation in the modeling. This means that 80% and 20% of the data sets were separated for training and validation in each fold to deliver the performance information for modeling evaluation. We tried the cross-validation processes several times to evaluate the proper model, while the training and validating data sets were independent of each other every time. The samples, including the training and testing data points, were applied for the ML models of the MLP, RF, SVM, and NB. We established the module, ECGHHT, to enhance the HHT functions in the featuring process. The HHT drives the EMD process to produce the Hilbert spectrum (HS) for deriving the features in a time-frequency domain. The EMD assumes that the signals are empirically composed of several IMFs that present waveforms with a sum of peaks and troughs equal to one or more of the number of zero-crossing points. In addition, the average of the upper and lower envelopes due to the waveforms should be close to zero. Each EMD cycle, comprising six steps in the "data pre-processing block", as shown in Figure 1 , can obtain an IMF component in the HHT. The steps were repeated until the residue function was monotonic. The IMF can be expressed by Equation (1) that contains a sine wave function ω(t) with a decomposable mode function s j (t). The IMF can be converted into the HS by a sparse matrix with the Hilbert transform as shown in Equation (2) . The spectrum entails the instantaneous frequency and energy for nonlinear phase change. On the HS's time-frequency plane, the amplitude of the frequency can be integrated along the time axis to approach the marginal Hilbert spectrum (MHS) as shown in Equation (3) . The MHS represents the total energy corresponding to the frequency. Therefore, ECG signals can be decomposed by various IMFs that may include the specific symptoms corresponding to arrhythmia. Figure 3a ,b compare the IMFs of the NSR and the VPC signals, respectively, decomposed with the HHT process. The instantaneous frequency and energy, as shown in Figure 3c ,d, may present the HRV with the waveform of IMF1 (i.e., the first-order IMF). The MHS, as per the examples shown in Figure 3e ,f, performs the valid regions and the centroids (i.e., xand y-coordinates with respect to frequency and power) of various IMFs in a range of frequency distribution. For this data pre-processing step, the loss-pass filter was employed to filter the noise of the signals for the HHT. Past studies approved a filter at 16 Hz that well-exhibited characteristics of the NSR and arrhythmias [34, 35] . Referring to the literature's approach for the Butterworth filter, the cutoff frequency with a low-pass order between 4 and 8 was one of the suggestive parameters to remove ECG's noise [36] . We then suggested a 5th-order low-pass digital Butterworth filter with a normalized cutoff frequency of 16 Hz for the data sampled at 360 Hz. The parameter adopted in the modeling must be identical for the processing in recognition. In practice, the proposed parameters should be adjusted once the clinical data applied for training are different from the samples in this study. The spectral characteristics of IMFs are available for different signal patterns to explore the features in which the instantaneous frequency and energy of the IMFs perform a stable variation for the normal rhythm with respect to the arrhythmias. Meanwhile, the MHS-area centroids (i.e., the mean frequency and the energy per second (or power) in a featuringframe period) are distributed in the different ranges for various symptoms. We then considered the MHS-area centroids of the first three orders of IMFs as the features for data training in the ML process. In addition, we referred to the mean and standard deviation of the features in the limited data source (i.e., data sets (A) and (B)) to produce sufficient and balanced trainable data (i.e., data set (C)) for evaluating the ML model's efficacy in different data set groups. The ML models were bundled into the recognition module, "MMLCA", for coupling the same HHT-based features in analysis. The MMLCA was expandable to compose comprehensive ML methods, such as the MLP, RF, SVM, and NB, which were well known in biomedical data analysis. We, therefore, adopted them in the modules to recognize the hybrid symptom patterns. For comparison, all modules can be applied to screen the ECG data, including the four arrhythmias. The methodology details can be referred to in the literature [7] [8] [9] [10] [11] and their brief highlights are addressed below. The ML models were bundled into the recognition module, "MMLCA", for coupling the same HHT-based features in analysis. The MMLCA was expandable to compose comprehensive ML methods, such as the MLP, RF, SVM, and NB, which were well known in biomedical data analysis. We, therefore, adopted them in the modules to recognize the hybrid symptom patterns. For comparison, all modules can be applied to screen the ECG data, including the four arrhythmias. The methodology details can be referred to in the literature [7] [8] [9] [10] [11] and their brief highlights are addressed below. The MLP presents a feedforward neural network to transfer information among the input, hidden, and output layers. The MLP involves enough neuron nodes in a hidden layer for mapping the input-output features of the patterns. The nodes in the hidden and output layers can activate the nonlinear functions to separate the initially heterogeneous data from the linear perceptron. The functions can optimize weight values for bias evaluation, such as backpropagation, regularization, or conjugate gradient descending. Inside The MLP presents a feedforward neural network to transfer information among the input, hidden, and output layers. The MLP involves enough neuron nodes in a hidden layer for mapping the input-output features of the patterns. The nodes in the hidden and output layers can activate the nonlinear functions to separate the initially heterogeneous data from the linear perceptron. The functions can optimize weight values for bias evaluation, such as backpropagation, regularization, or conjugate gradient descending. Inside the MLP, the learning mechanism can adjust the nodes' correlation to enhance data training with self-adaptive weighting factors. The RF ensembles abundant decision trees that enable non-parametric supervised learning with conditional classification rules for classification. The bagging and boosting algorithms are usually applied for the ensemble analysis according to various sub-samples to enhance the model's predictive efficacy. The SVM in analysis follows three principles: (a) separating dimensions in the feature space, (b) calculating projection through the inner product of the kernel functions, and (c) optimizing data separation to obtain the support vector. The categorical features could approach good efficacy with analyses due to past research [37] . The method looks for the support vector with an optimized boundary in a feature space due to the kernel function that minimizes the errors between different dimensions. For nonlinear projection, the kernel function is applied to reduce the features' dimensions through two eigenvectors, such as the linear, polynomial, or Gauss radial basis functions (RBFs), to identify the dimension's similarity [38, 39] . In this study, a RBF was employed to classify the four arrhythmia symptoms and NSR. The NB drives the independence assumptions conditionally between feature pairs to calculate the probability of features based on Bayes' theorem. The NB classification can fast estimate the targets with few training data since the parameters can be linearly functioned with the relative variables. We employed the MATLAB™ modules "patternnet", "fitcensemble", "fitcecoc", and "fitcnb" to wrap up the MLP, RF, SVM, and NB models, respectively, in the MMLCA. The parameters optimized for the ML models are suggestive in Table 2 . For enabling the multiclass recognition capability, we compared the coding design modes "one-versusone" (OVO) and "one-versus-all" (OVA), which were built in the modules to reduce the multiclass problem to a series of binary problems. We then explored the best option with the OVO mode, which takes a binary pair for positive and negative classes but ignores the rest for the approach. In this study, the recognition model drove the ECGHHT module to extract the HHTbased features and used the MMLCA module to couple the above four ML methods for training the features of the hybrid arrhythmia symptoms. The UHMS infrastructure, as shown in Figure 4 , integrates the Java-based web platform "ECG4UHM" with the MATLAB TM runtime server (MRS) to enable AI computation with the proposed ML models for cloud applications in real-time arrhythmia recognition. The modeling and recognition tiers include the dashed and solid arrow lines, respectively, for the manual data training and automatic recognition processes. The platform was constructed by Java development kit 1.8.0 and MATLAB TM 2020b. The infrastructure comprises the modeling, recognition, and management tiers to manage the MRS functions and uHealth services. The UHMS infrastructure, as shown in Figure 4 , integrates the Java-based web platform "ECG4UHM" with the MATLAB TM runtime server (MRS) to enable AI computation with the proposed ML models for cloud applications in real-time arrhythmia recognition. The modeling and recognition tiers include the dashed and solid arrow lines, respectively, for the manual data training and automatic recognition processes. The platform was constructed by Java development kit 1.8.0 and MATLAB TM 2020b. The infrastructure comprises the modeling, recognition, and management tiers to manage the MRS functions and uHealth services. The infrastructure exhibits four AI computation slices, as shown in Figure 1 , to generate hybrid patterns based on recognition modeling in the MRS. The runtime server can achieve three procedures in the modeling tier: (i) transform ECG features, (ii) specify recognition patterns, and (iii) identify arrhythmia labels. We engaged the MATLAB™ compiler (named MMC Apps) to compile the MMLCA's functions, which progress the ML models such as MLP, RF, SVM, and NB for coupling features analysis, such as Javabased classes, and compress them in a Java-class library (named "jar" file). The jar files are stored in ECG4UHM's pattern library (named "jarlib"). Referring to the MRS architecture, we created two jars named "HHTFeature" and "TracePlot" to incorporate the recognition model with the "javabuilder" that builds Java classes of the modules for AI computation in the MRS. The HHTFeature conducted the ECGHHT and MMLCA modules to The infrastructure exhibits four AI computation slices, as shown in Figure 1 , to generate hybrid patterns based on recognition modeling in the MRS. The runtime server can achieve three procedures in the modeling tier: (i) transform ECG features, (ii) specify recognition patterns, and (iii) identify arrhythmia labels. We engaged the MATLAB™ compiler (named MMC Apps) to compile the MMLCA's functions, which progress the ML models such as MLP, RF, SVM, and NB for coupling features analysis, such as Java-based classes, and compress them in a Java-class library (named "jar" file). The jar files are stored in ECG4UHM's pattern library (named "jarlib"). Referring to the MRS architecture, we created two jars named "HHTFeature" and "TracePlot" to incorporate the recognition model with the "javabuilder" that builds Java classes of the modules for AI computation in the MRS. The HHTFeature conducted the ECGHHT and MMLCA modules to accomplish procedures (i) and (ii), and the TracePlot was in charge of the procedure (iii) to export recognition results in the traceable diagrams. In the ECG4UHM platform, we designed the pattern repository including "data_log", "patt_bank", "out_log", and "fig_log" to save input features, suggestive patterns, recognition labels, and traceable diagrams, respectively. The uploaded ECG data sets are suggested to unique filename with an identified number and timestamp for managing the personal traceable diagram. The ECG4UHM saves the ECG data set in the data_log when identifying the symptoms in the recognition tier. With the Java-class library, the "HHT" class transforms the ECG data into the HHT-based features. The "Pattern-Recognition" class then activates the pattern recognition process by evaluating the suitable models in patt_bank for the diverse arrhythmia symptoms and restores the results in out_log for the suggestion. The "Data-Convert" class finally converts the output labels to Java-compliant data format in fig_log and displays the traceable diagrams on the web interface. The interface supports cloud computing for modeling the pattern and tracking the ECG data behind arrhythmia recognition. Data exchange between the management and recognition tiers assists the users to access their health data ubiquitously. The UHMS's management tier was designed to manage arrhythmia status and daily reports for home-isolated healthcare via uHealth services. The traceable diagram can be retrieved from the ECG4UHM platform and be referred to physicians for clinical diagnoses. More trainable ECG samples can be cyclically uploaded to the modeling tier to improve the recognition patterns for more arrhythmia symptoms. Still, the web page can present information for health education. In advance, the designed UHM functionality can be an important portal of the intelligent social-health system. Following the modeling, the ML models due to the crucial data set for the NSR versus both the premature pattern (i.e., APC and VPC) and the fibril-rapid pattern (i.e., AFib and VT) were approved by cross-validation. Then, we applied the models to train data sets (A) and (C) and evaluated their performance by testing data set (B). The range of feature values in the training data set should cover that in the testing data set. The models were finally applied to scan some ECG segments that reported the arrhythmia periods on the official website for evaluating the feasibility. The HHT process retrieved the features regarding the symptoms of APC, AFib, VPC, VT, and NSR from the sample data sets. From inspecting the features due to the three IMFs of all observed data, the VT's mean powers were higher than VPC and similar to those observed for AFib versus APC. In contrast, the frequencies of VT and AFib were mainly lower than VPC and APC, respectively. The mean (µ) and standard deviation (σ) of frequencies and powers are shown in Table 1 . The NSR usually presented the lowest power and the highest frequency for the three IMFs with respect to other arrhythmia symptoms. The tendency of the orders attained the approach in the past study [40] . In general, the MHS-area centroids of the orders after IMF3 were similar for all rhythms. Regarding the values of µ and σ, the IMF1 had the lower deviation, while the IMF2 and IMF3 had the larger deviation. We then employed a statistical test to verify significance, as shown in Table 1 , for various symptoms. The results revealed that the IMF1's features presented significant difference (i.e., p < 0.001) for all patterns excluding AFib-VPC; the IMF2's frequency was not significant for the patterns APC-VPC, APC-VT, AFib-VPC, AFib-VT, and VPC-VT; while the IMF3's frequency and power were not significant respectively for the pattern sets {APC-VPC, AFib-VPC, AFib-VT, and VPC-VT} and {AFib-VPC, AFib-VT, and VPC-VT}. We used the features of data set (A) to train the proposed ML models with fivefold cross-validation, and then the trained models were employed to test data set (B) for evaluation. The same process was applied for data set (C) derived from data set (A) to evaluate the models' performance due to the simulative data training designs. By comparing the ML models' effectiveness in coupling analysis, we suggested the parameters shown in Table 2 for the proposed models. With the five-symptom set {NSR, APC, AFib, VPC, and VT}, recognition for each symptom versus the rest (i.e., one-versus-rest mode, OVR) was evaluated by the sensitivity, specificity, and area under the curve (AUC) of receiver operating characteristics (ROC) in the modeling. The sensitivity and 1-specificity constructed the ROC curve for the possible cut-point values [41] . The formulations, as shown in Equation (4), comprised the number of true-positive (TP), true-negative (TN), false-positive (FP), and false-negative (FN) data for calculating the accuracy, sensitivity, and specificity due to the ROC to perform the model's effectiveness. We computed the average accuracy, sensitivity, and specificity of the model's ROC for evaluation. We compared NSR versus APC and VPC (i.e., premature pattern) in data set (A) trained by the RF model with cross-validation, and the best ROC curves are shown in Figure 5a . The best accuracy and AUC reached 0.99, which can be referred to in the past study [20] . If the NSR mixed with AFib and VT (i.e., the fibril-rapid pattern), the AUC, shown in Figure 5b , was approximately 0.99 on average. The evaluation approved the reliability of the suggested models. the models' performance due to the simulative data training designs. By comparing the ML models' effectiveness in coupling analysis, we suggested the parameters shown in Table 2 for the proposed models. With the five-symptom set {NSR, APC, AFib, VPC, and VT}, recognition for each symptom versus the rest (i.e., one-versusrest mode, OVR) was evaluated by the sensitivity, specificity, and area under the curve (AUC) of receiver operating characteristics (ROC) in the modeling. The sensitivity and 1specificity constructed the ROC curve for the possible cut-point values [41] . The formulations, as shown in Equation (4), comprised the number of true-positive (TP), true-negative (TN), false-positive (FP), and false-negative (FN) data for calculating the accuracy, sensitivity, and specificity due to the ROC to perform the model's effectiveness. We computed the average accuracy, sensitivity, and specificity of the model's ROC for evaluation. We compared NSR versus APC and VPC (i.e., premature pattern) in data set (A) trained by the RF model with cross-validation, and the best ROC curves are shown in Figure 5a . The best accuracy and AUC reached 0.99, which can be referred to in the past study [20] . If the NSR mixed with AFib and VT (i.e., the fibril-rapid pattern), the AUC, shown in Figure 5b , was approximately 0.99 on average. The evaluation approved the reliability of the suggested models. Then, we mixed all the patterns for coupling analysis of the proposed ML models. The models were trained by data set A and tested by data set B, and then the recognition results were shown in Table 3 . The average accuracy of the four ML models for all hybrid patterns was approximately 86% but reached 90% if excluding VPC versus AFib and VT. The recognition efficacy decreased in comparison with the previous evaluations. Most Figure 5 . The one-versus-rest ROC curves in the RF model for the NSR versus the premature and fibril-rapid pattern due to the best performance fold in cross-validation by training data set A: (a) NSR versus premature pattern (i.e., APC-VPC); (b) NSR versus fibril-rapid pattern (AFib-VT). Then, we mixed all the patterns for coupling analysis of the proposed ML models. The models were trained by data set A and tested by data set B, and then the recognition results were shown in Table 3 . The average accuracy of the four ML models for all hybrid patterns was approximately 86% but reached 90% if excluding VPC versus AFib and VT. The recognition efficacy decreased in comparison with the previous evaluations. Most models distinguished the NSR versus the premature and fibril-rapid patterns with acceptable AUCs (from 0.747 to 0.942). All models validated with the pattern AFib-VT achieved an AUC average of 0.83, but with the AFib-VPC and VPC-VT, the result was less than 0.7. The sensitivities of hybrid patterns, except for VPC-AFib, VPC-VT, and AFib-VT, were over 0.9 for most models. On average, the respective AUCs of the NSR and APC versus AFib, VPC, and VT were 0.94 and 0.93 for the RF and SVM. The MLP, RF, SVM, and NB models' best sensitivities reached the value of 1 for the APC-VT pattern, while their accuracies and AUCs were {0.964, 1, 0.984, and 0.898} and {0.96, 1, 0.98, 0.89}, respectively. The ensemble analysis of RF obtained good sensitivity, specificity, and AUC for the patterns of NPC and APC versus AFib, VPC, and VT in comparison with the other models. The model performed an excellent sensitivity and specificity of 0.986 and 0.96, respectively, for the NSR-VT, while 0.927 and 0.901 for the NSR-VPC. Figure 6a -d display the OVR ROC curves for all models; the RF and SVM models performed better AUCs over 0.85 on average than other models. Table 3 . Evaluation of the ML models for the patterns of hybrid symptoms due to the training the data set (A) and testing the data set (B), which the pattern labeled by two symptoms (e.g., NSR-APC means normal sinus rhythm versus atrial premature atrial complex) presents their ROC's results due to the four models (i.e., multiple layer perceptron, random forest, support vector machine, and naive Bayes models) for comparison. We further evaluated the test results by training the simulative data (i.e., data set (C)) for the proposed models. As shown in Table 4 , the average AUCs of all models for both the APC-VPC and AFib-VT patterns versus the NSR could reach approximately 0.84 and 0.92, respectively. The ML models obtained the best AUCs at approximately 0.9 on average for recognizing the patterns of APC-NSR versus AFib-VT. The average accuracy of the MLP, RF, and SVM models for the APC-VT pattern reached approximately 0.983 and 0.977 for both the crucial and simulative data sets, while the respective AUCs were 0.98 and 0.973. However, similar to data set (A), the recognitions among AFib-VPC and VPC-VT were ambiguous. This approach showed the acceptable efficacy of the simulative data We further evaluated the test results by training the simulative data (i.e., data set (C)) for the proposed models. As shown in Table 4 , the average AUCs of all models for both the APC-VPC and AFib-VT patterns versus the NSR could reach approximately 0.84 and 0.92, respectively. The ML models obtained the best AUCs at approximately 0.9 on average for recognizing the patterns of APC-NSR versus AFib-VT. The average accuracy of the MLP, RF, and SVM models for the APC-VT pattern reached approximately 0.983 and 0.977 for both the crucial and simulative data sets, while the respective AUCs were 0.98 and 0.973. However, similar to data set (A), the recognitions among AFib-VPC and VPC-VT were ambiguous. This approach showed the acceptable efficacy of the simulative data compared with the observed data. This implied that the mean and standard deviation of few reliable data could produce sufficient and balanced training data for the ML model. Table 4 . Evaluation of the ML models for the patterns of hybrid symptoms due to the training the data set (C) and testing the data set (B), which the pattern labeled by two symptoms (e.g., NSR-APC means normal sinus rhythm versus atrial premature atrial complex) presents their ROC's results due to the four models (i.e., multiple layer perceptron, random forest, support vector machine, and naïve Bayes models) for comparison. With the above evaluation, we can suggest the proper models to identify arrhythmia patterns in the UHMS. The proposed modeling was applied to practical cases through the ECGHHT and MMLCA modules. We selected the cases reported to have APC, AFib, VPC, or VT from the official website and screened their records by a 3 s frame with a 1 s overlap during 30 s of an arrhythmia period. The MHS-based features were extracted from each frame by ECGHHT. Then, the labels of APC, AFib, VPC, VT, and NSR could be recognized by MMLCA. Both modules were activated by the HHTFeature built in the ECG4UHM to enable AI computation. The multiclass recognitions of the arrhythmia symptom patterns regarding the original ECG are displayed in Figure 7a -d. Particularly, the NB and RF algorithms recognized the hybrid patterns, including the VPC mixing AFib and VT, as shown in Figure 7b ,d, respectively. The traceable diagrams display the ECG signals on the time axis, mapping to the symptom labels highlighted in the associated frames. The modules approved the feasibility of AI computation for the ECG4UHM. With the above evaluation, we can suggest the proper models to identify arrhythmia patterns in the UHMS. The proposed modeling was applied to practical cases through the ECGHHT and MMLCA modules. We selected the cases reported to have APC, AFib, VPC, or VT from the official website and screened their records by a 3 s frame with a 1 s overlap during 30 s of an arrhythmia period. The MHS-based features were extracted from each frame by ECGHHT. Then, the labels of APC, AFib, VPC, VT, and NSR could be recognized by MMLCA. Both modules were activated by the HHTFeature built in the ECG4UHM to enable AI computation. The multiclass recognitions of the arrhythmia symptom patterns regarding the original ECG are displayed in Figure 7a -d. Particularly, the NB and RF algorithms recognized the hybrid patterns, including the VPC mixing AFib and VT, as shown in Figure 7b ,d, respectively. The traceable diagrams display the ECG signals on the time axis, mapping to the symptom labels highlighted in the associated frames. The modules approved the feasibility of AI computation for the ECG4UHM. Furthermore, we practiced the ECG4UHM's online recognition process. As shown in Figure 8a , the screen snapshot presents the interface for inputting the parameters to transform the uploaded ECG data set into the HHT-based features in the file named by the prefix "feat". The models available for the various symptoms were suggested for choice Furthermore, we practiced the ECG4UHM's online recognition process. As shown in Figure 8a , the screen snapshot presents the interface for inputting the parameters to transform the uploaded ECG data set into the HHT-based features in the file named by the prefix "feat". The models available for the various symptoms were suggested for choice as shown in Figure 8b . In addition, the user can compare the comprehensive models to recognize the hybrid arrhythmia patterns according to the ECG data set and the relevant feature set as shown in Figure 8c . The recognized result as shown in Figure 8d was traceable with symptom annotations with the AI computation. All traceable diagrams were saved in the fig_log for advanced management and review as shown in Figure 8e . The implementation approved the feasibility of the UHMS prototype for extensive applications of the uHealth services. recognize the hybrid arrhythmia patterns according to the ECG data set and the relevant feature set as shown in Figure 8c . The recognized result as shown in Figure 8d was traceable with symptom annotations with the AI computation. All traceable diagrams were saved in the fig_log for advanced management and review as shown in Figure 8e . The implementation approved the feasibility of the UHMS prototype for extensive applications of the uHealth services. This study suggested four ML models for the UHMS to recognize hybrid arrhythmia ECG patterns. The HHT-based features, including frequency and power due to the MHS of the first three-order IMFs, were coupled with the ML algorithms for analysis. The development of the ECG4UHM detailed the AI computation process, and the prototype revealed the advantages of combining simulative arrhythmia ECGs in the expandable ML modeling. We discuss the findings and limitations below. Past studies of HRV applied the spectrum to classify arrhythmias and normal heart rhythm but rarely recognized the symptoms with various pattern waveforms simultaneously [42, 43] . Therefore, we discovered the representative features from the HHT-based IMFs after decomposing the ECG signals. (1) Candidate features can be extracted from the first three-order IMFs. In the analysis, we used the low-pass filter to remove noises from the ECG signals before the EMD process. The past study employed this process with the SVM to achieve good performance for recognizing the APC and VPC [20] . As inspecting the decomposed IMFs of the AFib and VT samples, the IMF1 showed the major features of the frequency and power due to the MHS-area centroid in a significant range. At the same time, IMF2 and IMF3 contributed minor characteristics with a dispersed distribution. The MHS-area centroids for the latter-order IMFs were similar for all symptoms. Therefore, the first three-order IMFs, which used to include the hybrid patterns' recognizable features, are suggested for the candidate features. (2) Symptomatic waveforms should be wholly involved in the featuring frame. The timestamps of the arrhythmia symptoms' interest points were annotated on the official web page [18] , but they mixed with the NSR or other symptoms in a featuring frame. The impure waveforms of the symptoms may affect the features of the observed databases This study suggested four ML models for the UHMS to recognize hybrid arrhythmia ECG patterns. The HHT-based features, including frequency and power due to the MHS of the first three-order IMFs, were coupled with the ML algorithms for analysis. The development of the ECG4UHM detailed the AI computation process, and the prototype revealed the advantages of combining simulative arrhythmia ECGs in the expandable ML modeling. We discuss the findings and limitations below. Past studies of HRV applied the spectrum to classify arrhythmias and normal heart rhythm but rarely recognized the symptoms with various pattern waveforms simultaneously [42, 43] . Therefore, we discovered the representative features from the HHT-based IMFs after decomposing the ECG signals. (1) Candidate features can be extracted from the first three-order IMFs. In the analysis, we used the low-pass filter to remove noises from the ECG signals before the EMD process. The past study employed this process with the SVM to achieve good performance for recognizing the APC and VPC [20] . As inspecting the decomposed IMFs of the AFib and VT samples, the IMF1 showed the major features of the frequency and power due to the MHS-area centroid in a significant range. At the same time, IMF2 and IMF3 contributed minor characteristics with a dispersed distribution. The MHS-area centroids for the latter-order IMFs were similar for all symptoms. Therefore, the first three-order IMFs, which used to include the hybrid patterns' recognizable features, are suggested for the candidate features. (2) Symptomatic waveforms should be wholly involved in the featuring frame. The timestamps of the arrhythmia symptoms' interest points were annotated on the official web page [18] , but they mixed with the NSR or other symptoms in a featuring frame. The impure waveforms of the symptoms may affect the features of the observed databases in this study. Therefore, the appropriate sample should involve the complete waveform of the specific symptom in the frame, which allows some NSRs to fill up the frame size and reduce the bias. (3) HHT-based data pre-processing can imply innovative features in the ML model. With EMD in the HHT, the features (i.e., MHS-area centroids for various IMFs) of the multiclass symptoms due to the limited samples were observed to scatter in a separable distribution. The simulative features can be supplemented following the mean and deviation of the observed samples for data training. The evaluation showed similar performances for AFib and VPC corresponding to NSR using the simulative data set compared to the crucial data set. However, the VPC versus AFib and VT did not reach acceptable results, since they lacked enough samples with reliable means and standard deviations for simulation. Various models with ensemble analysis can be pipelined in a suitable pattern for the diverse symptoms to achieve a good recognition in practice. Proper feature pre-processing with validation can avoid unbalanced or insufficient samples and improve training efficacy before constructing the reliable recognition model. This approach revealed the requirement of conventional AI-based analysis in which the significant features can enhance the various machine learning methods to reach good results in classification [44] [45] [46] . (4) The coupling ML models can customize the UHMS to recognize hybrid arrhythmia patterns. The model's parameters are adjustable for the specific feature set. The current prototype could recognize four arrhythmia symptoms for application and suggest the suitable ML models with respect to the hybrid patterns for the frequency-domain features. The user can determine the most possible symptom based on the models' suggestion. In advance, we suggested the features in the HS for recognizing more arrhythmia diseases. The HS offers the instantaneous energy and frequency in the time-frequency domain, which also implies the time-dependent characteristics of the HRV symptoms with noticeable phase changes in the wave period. The previous study selected the features in both domains to avoid ambiguous identification for the similar waveform of ventricular arrhythmia [47] . The HS can be reflected as more features when the MHS-area centroids are not apparent than other symptoms [48] . Some features of arrhythmia symptoms obtained at the current stage were still insufficient for classification. There were only 108, 672, 296, and 115 of the APC, AFib, VPC, and VT, respectively, observed in the sample data sets by limited manpower, which affected the number of equivalent labels of the crucial data set for cross validation as training the data sets. According to the big data concept, the machine learning model would require more training data for a better approach. We increased the supplementary data (i.e., data set (C)) for simulation and comparison. The simulative data were randomly produced by MATLAB's function and surely located in the same range as the observed sample's distribution (i.e., data sets (A) and (B)). Three data sets were ensured in the same distribution but independent for training, validation, and testing in machine learning. In practice, the arrhythmia patients could report their experienced symptoms (e.g., chest pain, racing heartbeats, and shortness of breath) to the clinicians. The AI system developed at the current stage could not identify these self-reported conditions but only helped classify arrhythmia due to the ECG patterns. From the aspect of ML modeling, the sample data set could involve the measured ECG features and the reported symptoms in the clinical records for the advanced data training process (e.g., let AFib's features contain the label of chest pain). The improved model could be more feasible in clinical application. The modeling's effectiveness can be improved if more useful features and eligible samples of the various symptoms are classified. Currently, mobile ECG apps are restricted in some countries. Therefore, the prototype design in this study contributed to a portal of sensing application to connect the ECG sensors with biomedical signal processing for the uHealth services. In practice, the multiclass discrimination usually contains unbalanced and highdimensional data clusters. The previous studies suggested the concept of complexity reduction in the SVM and NN-based modeling to improve convergence efficiency [49, 50] . The simulation implied that the multiple labels of the APC, AFib, VPC, and VT versus NSR could be reduced to the pairs of binary classes to achieve the model with reliable sensitivity and specificity. Some prior works merged the similar-property symptoms in a class, e.g., {supraventricular tachycardia, atrial tachycardia, sinus tachycardia} and {atrial flutter, atrial fibrillation} to increase the model's recognition capability [21] . In addition, the standard deviations of the HHT-based feature distribution, as shown in Table 1 , revealed the overlaps between the features of the various symptoms, which could cause ambiguous separation between the multiple classes and lead to difficulty in discrimination. We then verified the statistical significance in various features and compared their p-values with the overlapped level. The statistical test performed p-value less than 0.001 for each feature with respect to the five symptoms in the OVR mode. We further applied the test of multiple categories for each feature corresponding to the symptom patterns in the OVO mode and showed all p-values in Table 1 . The patterns with p-values greater than 0.05 (i.e., non-significant difference) implied poor recognitions evaluated by ROC data as shown in Table 3 . The test could approve the proper model including the feature patterns with significant difference. Therefore, we experienced that the physicians' knowledge was necessary in clinics to exclude the outliers and uncertain data to ensure the cleanup sample in the modeling. For diagnosing assistance, the multiclass recognition modeling at the current stage can distinguish hybrid patterns of the arrhythmia symptoms against the normal heart rhythms. Many health service developers keep in customizing a user-friendly interface for commercial application. The customization process due to the wearable device's requirement is a major challenge for popular home-use solutions. The ML model's development is still undergoing to improve the UHM services with appropriate interfaces for advanced implementation. In the next phase, we will consider the deep learning methods with pre-processing of diverse arrhythmia ECG signals in the time-frequency domain through HHT to improve the developed programs' predictability. This study proposed a prototype of the self-developed UHMS with the ML modeling for AI computation to recognize arrhythmia ECG in uHealth services. The studied arrhythmia symptoms mixed the premature pattern (i.e., APC and VPC) and the fibrilrapid (i.e., AFib and VT) at the current stage. The MIT-BIH arrhythmia open database was applied for the modeling. We observed the limited data points from the MIT-BIH website's records for the training and testing processes. The modeling retrieved the features from the IMF through the EMD process based on the HHT algorithm. The HHT-based features included the frequency and power following the MHS-area centroids of the first three-order IMFs. Four ML models, including MLP, RF, SVM, and NB, were trained for coupling features analysis. The ML models' efficacy achieved the good AUCs of ROC by separately recognizing the premature and fibril-rapid patterns of multiple symptoms. All models could reach the perfect sensitivity for the pattern APC-VT but recognizing the AFib-VPC and VPC-VT requires improvement. The suggestive models were engaged to the UHMS platform "ECG4UHM" as the backend for tracking the arrhythmia symptoms in the real-time UHM services. The development can be integrated with the intelligent social-health system for the home-isolated cares in the future. The authors appreciate Brady Ling, Dublin High, U.S.A., for his hard work and effort in helping as an English volunteer editorial reviewer. The authors declare no conflict of interest. Methods of usability testing in the development of eHealth applications: A scoping review Development and Validation of an Arterial Pressure-Based Cardiac Output Algorithm Using a Convolutional Neural Network: Retrospective Study Based on Prospective Registry Data UPHSM: Ubiquitous personal health surveillance and management system via WSN agent on open source smartphone System architecture of a wireless body area sensor network for ubiquitous health monitoring AIoT-Enabled Rehabilitation Recognition System-Exemplified by Hybrid Lower-Limb Exercises Approaches of artificial intelligence in biomedical image processing: A leading tool between computer vision & biological vision Supervised Machine Learning: A Review of Classification Techniques An Enhanced Electrocardiogram Biometric Authentication System Using Machine Learning Epilepsy seizure detection using eigen-system spectral estimation and Multiple Layer Perceptron neural network Support vector machine-based arrhythmia classification using reduced features of heart rate variability signal Random Forests ECG-based heartbeat classification for arrhythmia detection: A survey Effects of ECG Data Length on Heart Rate Variability among Young Healthy Adults Invisible ECG for High Throughput Screening in eSports Efficiently Updating ECG-Based Biometric Authentication Based on Incremental Learning Modified limb lead ECG system effects on electrocardiographic wave amplitudes and frontal plane axis in sinus rhythm subjects A dynamical model for generating synthetic electrocardiogram signals The impact of the MIT-BIH Arrhythmia Database Components of a New Research Resource for Complex Physiologic Signals ECG beat classification using empirical mode decomposition and mixture of features Accurate deep neural network model to detect cardiac arrhythmia on more than 10,000 individual subject ECG records An Overview of Heart Rate Variability Metrics and Norms. Front. Public Health Spectral analysis of heart rate variability: Interchangeability between autoregressive analysis and fast Fourier transform Arrhythmia Diagnosis by Using Level-Crossing ECG Sampling and Sub-Bands Features Extraction for Mobile Healthcare A Mobile Device System for Early Warning of ECG Anomalies Nujum Navaz, A. ECG Monitoring Systems: Review, Architecture, Processes, and Key Challenges Hilbert-Huang Transform and Its Applications On the computational complexity of the empirical mode decomposition algorithm Naet-Alib, A. QRS complex detection using Empirical Mode Decomposition An Accurate QRS Complex and P Wave Detection in ECG Signals Using Complete Ensemble Empirical Mode Decomposition with Adaptive Noise Approach Transform for analysis of heart rate variability in cardiac health Quantification of Respiratory Sinus Arrhythmia Using Hilbert-Huang Transform Evaluation and management of premature ventricular complexes Prediction of Sudden Cardiac Death in Implantable Cardioverter Defibrillators: A Review and Comparative Study of Heart Rate Variability Features Spectrogram analysis of electrocardiogram with Normal Sinus Rhythm, Arrhythmia and Atrial Fibrillation Comparative Study on the Effect of Order and Cut off Frequency of Butterworth Low Pass Filter for Removal of Noise in ECG Signal Integer programming models for feature selection: New extensions and a randomized solution algorithm Least Squares Support Vector Machine Classifiers Support vector machines with genetic fuzzy feature transformation for biomedical data classification Cardiac Arrhythmia Classification Using Autoregressive Modeling Defining an Optimal Cut-Point Value in ROC Analysis: An Alternative Approach Computer aided diagnosis for atrial fibrillation based on new artificial adaptive systems Beyond ventricular fibrillation analysis: Comprehensive waveform analysis for all cardiac rhythms occurring during resuscitation Classification of Imbalanced Data: A Review A Fast and Accurate Recognition of ECG Signals Based on ELM-LRF and BLSTM Algorithm Application of irregular and unbalanced data to predict diabetic nephropathy using visualization and feature selection methods A novel algorithm for ventricular arrhythmia classification using a fuzzy logic approach Automated Method for Discrimination of Arrhythmias Using Time, Frequency, and Nonlinear Features of Electrocardiogram Signals High dimensional data classification and feature selection using support vector machines Fuzzy modeling of high-dimensional systems: Complexity reduction and interpretability improvement