key: cord-0955841-or72w40w
authors: Agrawal, Amulya; Chauhan, Aniket; Shetty, Manu Kumar; P, Girish M.; Gupta, Mohit D.; Gupta, Anubha
title: ECG-iCOVIDNet: Interpretable AI model to identify changes in the ECG signals of post-COVID subjects
date: 2022-04-30
journal: Comput Biol Med
DOI: 10.1016/j.compbiomed.2022.105540
sha: f8a0a070164cc7842a762951add229b002901bb2
doc_id: 955841
cord_uid: or72w40w

OBJECTIVE: Studies have shown that many COVID-19 survivors develop sub-clinical to clinical heart damage, even if they did not have underlying heart disease before COVID-19. Since the electrocardiogram (ECG) is a reliable technique for cardiovascular disease diagnosis, this study analyzes the 12-lead ECG recordings of healthy and post-COVID (COVID-recovered) subjects to ascertain ECG changes after COVID-19. METHOD: We propose a shallow 1-D convolutional neural network (CNN) deep learning architecture, namely ECG-iCOVIDNet, to distinguish the ECG data of post-COVID subjects from those of healthy subjects. Further, we employed the ShAP technique to interpret the ECG segments that are highlighted by the CNN model for the classification of ECG recordings into healthy and post-COVID subjects. RESULTS: ECG data of 427 healthy and 105 post-COVID subjects were analyzed. Results show that the proposed ECG-iCOVIDNet model could classify the ECG recordings of healthy and post-COVID subjects better than the state-of-the-art deep learning models. The proposed model yields an F1-score of 100%. CONCLUSION: So far, we have not come across any other study with an in-depth ECG signal analysis of COVID-recovered subjects. In this study, it is shown that the shallow ECG-iCOVIDNet CNN model performed well in distinguishing ECG signals of COVID-recovered subjects from those of healthy subjects. In line with the literature, this study confirms changes in the ECG signals of COVID-recovered patients that could be captured by the proposed CNN model. Successful deployment of such systems can help doctors identify changes in the ECG of post-COVID subjects in time, which can save many lives.

The first case of COVID-19 was registered in Wuhan, China. COVID-19 is a contagious viral disease that spreads rapidly through respiratory material (cough, sneeze) present in the exhaled air of infected people. Reverse Transcription-quantitative PCR (RT-qPCR) is the gold standard test for the diagnosis of COVID-19 (Yamayoshi et al., 2020). The disease crossed geographical boundaries with devastating effects in a short period; today the entire world is fighting this pandemic, and COVID-19 has caused immense social and economic losses throughout the world. Based on global statistics, until August 12, 2021, more than 205 million people had suffered from this infection, and 4.3 million people had lost their battle to COVID-19 (https://www.worldometers.info/coronavirus). Researchers have been trying to develop time-series models to predict these statistics in order to support agencies with appropriate policy decisions.

The infection impacts the respiratory tract and causes pneumonia, fever, cough, and loss of taste and smell (Majumder & Minko, 2021). Although medical science is focused on developing effective medication and preventive therapy such as vaccines, there is no effective therapy available for COVID-19. Early diagnosis, patient isolation, and supportive therapy are the primary modes of management of COVID-19.

Recently, some studies have indicated cardiac problems in patients recovered from COVID-19 (Gasecka et al., 2021; Hall et al., 2021). It has been observed that, even after recovering from COVID-19, survivors develop sub-clinical to clinical heart damage, even though they did not have underlying heart disease before COVID-19 (Lang et al., 2020). Type 1 heart attack, which is caused by blockage of the heart arteries by a blood clot, is rarely reported during or after COVID-19 infection. Type 2 heart attacks, caused by stress or low oxygen levels, are most commonly reported in subjects with COVID-19 (Cameli et al., 2021). Blood reports taken during COVID-19 have shown that some people have elevated levels of a substance called "troponin" in their blood, along with ECG changes and chest pain. Elevated troponin levels are a sign of damaged heart tissue, and this can cause a heart attack (Mahajan & Jarolim, 2011).

Electrocardiogram (ECG) is used to identify cardiac abnormalities. A 12-lead ECG is generated using six unipolar chest leads (V1 to V6), three bipolar limb leads (I, II, and III), and three unipolar limb leads (aVR, aVL, and aVF) placed on specific locations of the body surface. Each ECG wave consists of P, Q, R, S, and T waves. To diagnose or detect heart abnormalities, cardiologists analyze the ECG recordings of the subjects, which is time-consuming. Thus, methods are required to analyze and interpret the variability in the ECG signals of post-COVID subjects.

Several deep learning models have been developed to diagnose or predict a disease at an early stage using signals generated by the human body, such as EEG, ECG, and non-invasive images (Kaplan et al., 2021; Baygin et al., 2021; Attallah, 2022; Aggarwal et al., 2022). Antczak (2020) trained an Inception network, generated synthetic ECG data from a time-domain Wasserstein GAN, and trained a denoising encoder to perform ECG denoising. Ullah et al. (2020) first pre-processed the ECG data to remove drift noise and then transformed them into two-dimensional spectrogram images. These spectrograms were fed to a 2D convolutional neural network (CNN), which extracts and represents prominent features and classifies the ECG recordings into eight major cardiovascular diseases. Li et al. (2018) suggest transforming the ECG recordings into two-dimensional spectrogram images. These transformed images carry the patient's heartbeat morphology and the temporal relation between two adjacent heartbeats. The images are input to a 2D CNN that performs classification using information fusion techniques. Jun et al. (2018) suggest that there is no need to pre-process the ECG signals manually because they can be directly converted into two-dimensional gray-scale images by plotting. These images are input to a two-dimensional neural network with an architecture similar to VGGNet. Zhang et al. (2020) proposed a novel deep learning technique for multi-class arrhythmia classification using a spatio-temporal attention-based convolutional recurrent neural network. The feed-forward CNN extracts only the local features of the ECG. Spatial attention-based pooling extracts the more significant channels. All the local features are then combined to form the global features, which are learned using a bi-directional gated recurrent unit (GRU). Avanzato & Beritelli (2020) used a convolutional neural network (CNN) to diagnose cardiovascular diseases using the subjects' ECG data. Zhang et al. (2021) proposed a deep learning architecture in which the first two stacked convolutional layers extract the ECG morphology patterns and feed them to an RNN. The use of transfer learning in training the RNN is reported to give outstanding accuracy and an optimal global solution for classifying abnormal ECGs into different cardiovascular diseases. Borra et al. (2020) performed several experiments on decoding ECG signals using deep learning techniques on the standard PTBXL dataset. They applied Inception Time, ResNet, and XResNet models to classify the ECG abnormalities into 27 different categories and reported that the Inception Time model performed best with the 12-lead ECG data. Jo et al. (2021) presented an explainable artificial intelligence mechanism to detect irregularities in the heartbeat pattern, atrial fibrillation, and the absence of P-waves using 8-second ECGs of subjects from multiple hospitals. The explainability of multi-labeled data was found to be helpful in validating the deep learning models. One module detects irregular heartbeats, and the other detects the absence of P waves. Another recent study has built a CNN-based interpretable AI model for cardiac disorders using ECG wave analysis on the PTBXL dataset (Anand et al., 2022). Similarly, interesting machine learning and deep learning studies have been conducted recently to detect stress in COVID healthcare workers using ECG signal analysis (Gupta et al., 2021c,a). Thus, we observe that CNN-based DL models are increasingly being used for ECG analysis.

Heart Rate Variability (HRV) indicates variation in the consecutive heartbeats of ECG signals. The maximum upward deflection of a normal QRS complex is called the R wave peak in the ECG, and the duration between two adjacent R wave peaks is termed the R-R interval. The time period between the adjacent QRS complexes is termed the N-N (normal-normal) interval. HRV is the measurement of the variability of these N-N intervals. Some recent studies have indicated a change in HRV in COVID-recovered subjects (Adler et al., 2021; Kunal et al., 2021; Shah et al., 2022). This indicates that tracking the heart status of post-COVID subjects can help in providing timely assistance to these subjects for better survival.

In general, the analysis of HRV involves preprocessing of the ECG data, including noise removal (Agrawal & Gupta, 2013a,b; Singh et al., 2019), feature extraction, and normalization. Traditionally, ECG signals are analyzed using time-domain and frequency-domain features, say HRV features, extracted from the one-dimensional waveforms of different leads. The manual examination of ECG signals requires expertise in the field and is a time-consuming process. Recent advances in AI can help hospitals in Delhi, India.
2. Several traditional and deep learning models are trained and evaluated on the ECG data of healthy and post-COVID subjects.
3. Two shallow convolutional neural network architectures are proposed for the classification task. The first model, ECG-iCOVIDNet, works only on the raw ECG data, while the second model, ECG-HiCOVIDNet, carries out the late fusion of the HRV features with the latent space embedding of the CNN features extracted from the raw ECG data. In general, traditional ML methods are used on HRV features, while DL methods utilize only raw ECG waveforms. In this paper, we have designed a DL architecture, ECG-HiCOVIDNet, that works on both the raw ECG signals and the HRV features. To the best of our knowledge, this is one of the first studies that carries out the late fusion of HRV features in a DL model for ECG data analysis. Both the proposed models are shown to outperform the standard state-of-the-art CNN models on the ECG data.
4. The ShAP technique is used to evaluate interpretability at the patient and the population level. At the patient level, the segments of the ECG wave contributing to the classification are highlighted. The lead-wise contribution to the classification is identified.
5. To the best of our knowledge, this is one of the first studies to analyze the raw ECG signals of COVID-recovered patients for detecting cardiac abnormalities.

Data were collected by the Department of Cardiology, G.B. Pant Hospital, Delhi, India, and Lok Nayak Hospital, Delhi, India. COVID-19 patients who had recovered (30-60 days after the date of infection) were initially screened against the eligibility criteria. Patients with preexisting cardiac conditions or pathological conditions before COVID-19 infection were excluded from the study. After screening, 117 subjects were eligible for the study. For each subject, a 12-lead, 60-second ECG was recorded at 500 Hz during supine paced breathing using VESTA 301i. Similarly, ECG data of 430 healthy subjects, recorded in the study of Gupta et al. (2021b) at the same hospitals using the same machines, were used as the control group data. We removed 12 post-COVID and 3 healthy samples because their ECG data were very noisy. Finally, the ECG data of 105 post-COVID subjects (labelled as class '1') and 427 healthy subjects (labelled as class '0') were included for analysis in the study.

The dataset is divided into five folds, and one classifier is trained per fold. Each time a new classifier is trained, one fold is used as the test set and the remaining four folds are used for training the model. For each classifier, the data of the four training folds is further divided into 80% training data and 20% validation data. The distribution of samples into the training, validation, and test sets is shown in Table-1.
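The splitting scheme described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `five_fold_splits`, the random seed, and the plain (non-stratified) shuffling are assumptions, since the source does not state how samples were assigned to folds.

```python
import random

def five_fold_splits(n_samples, n_folds=5, val_fraction=0.2, seed=0):
    """Yield (train, val, test) index lists, one triple per fold."""
    rng = random.Random(seed)
    indices = list(range(n_samples))
    rng.shuffle(indices)
    # Deal the shuffled indices into n_folds nearly equal, disjoint folds.
    folds = [indices[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        # The other four folds are pooled, then split 80/20 into train/validation.
        remaining = [i for j, fold in enumerate(folds) if j != k for i in fold]
        rng.shuffle(remaining)
        n_val = int(round(val_fraction * len(remaining)))
        val, train = remaining[:n_val], remaining[n_val:]
        yield train, val, test

# 532 subjects in total (427 healthy + 105 post-COVID).
for fold, (train, val, test) in enumerate(five_fold_splits(532), start=1):
    print(fold, len(train), len(val), len(test))
```

Because every test fold is disjoint from the pooled training folds, no test sample is seen while training the corresponding classifier. With a class ratio of roughly 4:1, a stratified split would be a natural alternative.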

Heart rate features were extracted using the HRV-analysis module (Tarvainen et al., 2014). This tool removes outliers and ectopic beats from a signal using Malik's rule (Acar et al., 2000). The following time-domain HRV features were extracted: mean heart rate (Mean-HR), standard deviation of heart rate (STD-HR), mean of NN intervals (Mean-NNI), where the R peak of the ECG is also called the N point, median of the successive difference between NN intervals (Median-NNI), range of NN intervals (Range-NNI), PNNI-20 (percentage of successive NN interval differences greater than 20 ms), PNNI-50 (percentage of successive NN interval differences greater than 50 ms), and standard deviation of the NN intervals (STD-NNI). HRV features derived from the NN intervals included RMSSD (root mean square of successive differences between NN intervals), CVNNI (coefficient of variation, equal to the standard deviation of the NN intervals divided by the mean NN interval), and CVSD (coefficient of variation of successive differences, equal to RMSSD divided by the mean NN interval). Frequency-domain HRV features included high frequency (HF), low frequency (LF), very low frequency (VLF), HFNU (normalized high-frequency power), LFNU (normalized low-frequency power), and LF/HF (ratio of low-frequency to high-frequency power).

Journal Pre-proof

3. Methods
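To make the time-domain HRV features listed for the dataset concrete, the sketch below computes several of them from a list of NN intervals. It is an illustration under stated assumptions, not the HRV-analysis tool itself: the function name `time_domain_hrv`, the use of the sample standard deviation, and the example intervals are all hypothetical, and outlier or ectopic-beat removal is assumed to have been done beforehand.

```python
import math
import statistics

def time_domain_hrv(nn_ms):
    """Compute a few time-domain HRV features from NN intervals in milliseconds."""
    # Successive differences between adjacent NN intervals.
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]
    mean_nni = statistics.fmean(nn_ms)
    std_nni = statistics.stdev(nn_ms)  # sample standard deviation (assumption)
    rmssd = math.sqrt(statistics.fmean([d * d for d in diffs]))
    heart_rates = [60000.0 / nn for nn in nn_ms]  # beats per minute
    return {
        "Mean-NNI": mean_nni,
        "STD-NNI": std_nni,
        "Range-NNI": max(nn_ms) - min(nn_ms),
        "Mean-HR": statistics.fmean(heart_rates),
        "STD-HR": statistics.stdev(heart_rates),
        "RMSSD": rmssd,
        "PNNI-20": 100.0 * sum(abs(d) > 20 for d in diffs) / len(diffs),
        "PNNI-50": 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs),
        "CVNNI": std_nni / mean_nni,
        "CVSD": rmssd / mean_nni,
    }

# Hypothetical NN intervals (ms), for illustration only.
features = time_domain_hrv([800, 810, 790, 850, 820])
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```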

In this subsection, we present the existing state-of-the-art DL models, the traditional ML classifiers, and the proposed architectures that were trained and tested using five-fold cross-validation on the dataset described above. In general, traditional ML methods are used on HRV features, while DL methods are used only on raw ECG waveforms. Hence, we used the existing state-of-the-art DL models on the raw ECG data, the traditional ML classifiers on the HRV features, and our proposed DL 2019) proposed a spatio-temporal CNN model for ECG data analysis that considers the temporal aspect of the ECG signal of the leads along time using eight temporal layers and the spatial aspect across the leads using one spatial layer. These temporal and spatial layers are followed by two fully connected dense layers with sigmoid activation function. We implemented this architecture and named it ST-CNN-8.

ResNet50: The ResNet architecture was introduced by He et al. (2016) for the image recognition problem. ResNet50 is a 50-layer deep architecture. Such networks have a general architecture of convolution, pooling, activation, and fully-connected layers stacked one over the other. Although this stacking allows better features to be learned, a deeper architecture can still show degradation owing to multiple reasons, including the problem of vanishing or exploding gradients. Thus, each layer of ResNet learns a residual function, via the use of skip connections, instead of fitting a desired underlying function directly. These skip connections mitigate the problem of vanishing gradients and enable the model to learn an identity function. This ensures good performance in the deeper layers as well. Thus, ResNet provides better performance, in general. Here, the ResNet50 model is adapted for one-dimensional inputs. We have used the publicly available implementation by Kotikalapudi (2017).

SENet: The Squeeze-and-Excitation Network or SENet (Hu et al., 2018) explicitly models the interdependencies between channels and adaptively assigns importance to them according to their relevance. SENet applies global average pooling to generate channel-wise statistics, squeezing the global spatial information into a channel vector of size equal to the number of convolutional channels. This squeezed vector is passed through a attention modules are stacked between the residual units for modeling an attention-aware network that can learn attention-aware features. We have used the publicly available implementation by Sourajit2110 (2018).

Logistic Regression: Logistic regression is a supervised learning method used for binary classification. It predicts the probability of the input sample belonging to each class by fitting an "S"-shaped logistic function to the data. The output probability indicates the likelihood of a subject belonging to the "post-COVID" class.

SVM: Support vector machine (SVM) is a very popular traditional supervised machine learning classifier. In SVM, a data sample is plotted as a point in an n-dimensional space (where n is the number of features). These data points are divided into different classes by finding the hyperplane that maximizes the distance of the nearest data point of each class (on opposite sides of this plane) from the hyperplane. Thus, it is also called the maximum-margin classifier. We utilized an SVM classifier with the RBF kernel.

Decision Tree: A decision tree is a supervised tree-structured classifier that takes decisions by asking binary questions. Based on the answers (yes/no), it splits the branches of the tree. Features are present at the nodes, and the branches represent the decision rules. The outcome is represented by the leaf nodes. This is also one of the most efficient traditional machine learning methods. We utilized the Gini criterion in the decision tree.
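As a small illustration of the Gini criterion mentioned above, the sketch below scores one candidate binary question on a single feature. The function names and the toy feature values are hypothetical; a full decision tree would repeat this search over all features and thresholds at every node.

```python
from collections import Counter

def gini_impurity(labels):
    """Gini impurity of a set of class labels: 1 - sum of squared class fractions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def split_gini(values, labels, threshold):
    """Weighted Gini impurity after asking the binary question `value <= threshold?`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return (len(left) * gini_impurity(left) + len(right) * gini_impurity(right)) / n

# Toy example: one hypothetical HRV feature and binary class labels.
values = [30, 35, 50, 60]
labels = [1, 1, 0, 0]
print(gini_impurity(labels))           # impurity before the split
print(split_gini(values, labels, 40))  # impurity after splitting at 40
```

A split is preferred when it lowers the weighted impurity; here the threshold separates the two classes completely.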

Convolutional Neural Network without HRV: ECG-iCOVIDNet (Fig. 1)

The architecture of the proposed ECG-iCOVIDNet model comprises three convolutional blocks stacked sequentially. Each convolutional block comprises a 1D-convolutional layer with ReLU activation followed by a batch-normalization (BN) layer. The BN layer is used to deal with the internal covariate shift problem. After the BN layer of the first convolutional block, a dropout layer is also used. The dropout layer discards some nodes randomly from a layer by removing all their connections, and helps in preventing overfitting of the model. The third convolutional block is followed by a global average-pool layer that produces the final feature set of the raw ECG data. These features are also called the latent space embedding and are fed as input to a fully-connected (FC) layer of 50 nodes. The FC layer also uses ReLU activation and is followed by an output layer with a single node with sigmoid activation. This layer outputs the probability value that is used to determine the class of an input data sample as healthy or post-COVID. For the classification, the raw ECG data of a subject is fed as input to the proposed network. A block diagram of ECG-iCOVIDNet is shown in Fig. 1.
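The data flow through the architecture described above can be sketched as an inference-time forward pass in NumPy. This is not the authors' Keras implementation: the filter counts of the first two blocks (8 and 16), the kernel size (5), the random weights, and the shortened 1000-sample input are assumptions chosen for illustration; only the three-block layout, the 32-dimensional pooled embedding, the 50-node FC layer, and the single sigmoid output come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def conv1d(x, w, b):
    """Valid 1D convolution, stride 1. x: (T, C_in), w: (K, C_in, C_out)."""
    k = w.shape[0]
    t_out = x.shape[0] - k + 1
    out = np.zeros((t_out, w.shape[2]))
    for i in range(k):
        out += x[i:i + t_out] @ w[i]
    return out + b

def relu(x):
    return np.maximum(x, 0.0)

def batch_norm(x, gamma, beta, mean, var, eps=1e-3):
    """Batch normalization at inference time, using stored statistics."""
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_leads=12, filters=(8, 16, 32), kernel_size=5, fc_units=50):
    """Random, untrained parameters; filter counts and kernel size are assumptions."""
    params = {"blocks": []}
    c_in = n_leads
    for c_out in filters:
        params["blocks"].append({
            "w": rng.normal(0.0, 0.1, (kernel_size, c_in, c_out)),
            "b": np.zeros(c_out),
            "gamma": np.ones(c_out), "beta": np.zeros(c_out),
            "mean": np.zeros(c_out), "var": np.ones(c_out),
        })
        c_in = c_out
    params["fc_w"] = rng.normal(0.0, 0.1, (c_in, fc_units))
    params["fc_b"] = np.zeros(fc_units)
    params["out_w"] = rng.normal(0.0, 0.1, (fc_units, 1))
    params["out_b"] = np.zeros(1)
    return params

def forward(ecg, params):
    """ecg: (n_samples, n_leads). Returns (latent embedding, post-COVID probability)."""
    x = ecg
    for block in params["blocks"]:
        x = relu(conv1d(x, block["w"], block["b"]))
        x = batch_norm(x, block["gamma"], block["beta"], block["mean"], block["var"])
        # Dropout after the first block is active only during training, so it is omitted.
    embedding = x.mean(axis=0)  # global average pooling over time
    hidden = relu(embedding @ params["fc_w"] + params["fc_b"])
    prob = sigmoid(hidden @ params["out_w"] + params["out_b"])
    return embedding, float(prob[0])

# Synthetic 12-lead input, shortened to 1000 samples for speed.
ecg = rng.normal(size=(1000, 12))
embedding, prob = forward(ecg, init_params())
print(embedding.shape, prob)
```

Global average pooling makes the embedding size independent of the recording length, which is why the same sketch would also accept the full 30,000-sample (60 s at 500 Hz) recordings.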

For evaluating the proposed model, we have used six evaluation metrics: accuracy, precision, recall, AUC, F1-score, and Matthews correlation coefficient (MCC). These evaluation metrics are derived from true positive (TP), false positive (FP), true negative (TN), and false negative (FN). Here, a sample is defined as TP if it is class '1' (post-COVID) and also predicted by the model as class '1' label; a sample is defined as FP if it is class '0' (healthy) and predicted as class '1' label; a sample is defined as TN if it is class '0' (healthy) and predicted as class '0' label; and a sample is defined as FN if it is class '1' (post-COVID) and predicted as class '0' label. A brief description of these evaluation metrics is given as below:
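The count-based metrics named above follow their standard definitions, sketched below. The function name and the example counts are hypothetical. AUC is not included because it depends on the ranking of the predicted probabilities rather than on the four counts alone.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Standard metrics from confusion-matrix counts (class '1' = post-COVID)."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}

# Hypothetical counts for one test fold, for illustration only.
print(classification_metrics(tp=20, fp=1, tn=85, fn=1))
```

MCC is informative for an imbalanced dataset such as this one (427 healthy versus 105 post-COVID subjects), because it uses all four counts.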

All the models were trained on the dataset explained above using five-fold cross-validation. Google Colab, a cloud-based Jupyter notebook environment, was utilized. The data split is provided in Table-1, which describes the number of subjects in the training, validation, and test phases for each fold's classifier. It was ensured that no test set sample was shown during the training phase. A GPU was used as the hardware accelerator. The Keras API, which runs on top of the TensorFlow framework, was used to implement the models.

Traditional ML methods are trained and tested on the HRV features, the existing state-of-the-art DL models and the proposed ECG-iCOVIDNet are trained and tested on the raw ECG data, and the proposed ECG-HiCOVIDNet utilizes both the raw ECG data and the HRV features. In both ECG-iCOVIDNet and ECG-HiCOVIDNet, a dropout rate of 0.5 was used, and the models were trained for 100 epochs for each of the five splits of the data. The models use the Adam optimizer, which combines ideas from the AdaGrad and RMSProp algorithms and generally performs well in practice. Binary cross-entropy is used as the loss function.
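For reference, the binary cross-entropy loss used for training can be written out as below. This is a plain-Python sketch of the standard definition; the function name and the clipping constant `eps` (added to avoid taking the logarithm of zero) are assumptions and are not taken from the source.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep the logarithms finite
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Confident correct predictions give a small loss; confident wrong ones a large loss.
print(binary_cross_entropy([1, 0], [0.9, 0.1]))
print(binary_cross_entropy([1, 0], [0.1, 0.9]))
```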

The parameter settings for the different DL models are described next. ResNet-50 was trained with a learning rate of 0.0001 for 100 epochs using the Adam optimizer with the binary cross-entropy loss function. ST-CNN-8 was trained with a learning rate of 0.0005 for 100 epochs using the Adam optimizer with the binary cross-entropy loss function. A batch size of 64 was chosen. We also used a dropout rate of 0.05. Both SENet and Attention-56 were trained with a learning rate of 0.0005 for 100 epochs using the Adam optimizer with the binary cross-entropy loss function. A batch size of 64 and a dropout rate of 0.2 were chosen.

Results of all the models described above are shown in Table-2. The table contains all the evaluation metrics described above, namely accuracy, precision, recall, F1-score, AUC, and MCC, calculated on the test fold for each fold's classifier and compiled for all five folds. Results show that our proposed architecture yields the best performance on the test data, with 100% accuracy and an F1-score and AUC of 1. To visually demonstrate the ability to distinguish between the healthy and post-COVID classes, t-SNE plots are shown in Figs. 3 and 4. These plots demonstrate that initially the input data is not separable into two classes. The samples from different classes start forming clusters as we move from the first convolution block to the last convolution block. Eventually, the data gets segregated into two different classes, as seen from the t-SNE plots of the data after the flatten layer. Thus, we can infer that both the proposed models have the ability to separate healthy and post-COVID samples.

The deep learning models show improvement over the traditional models. The SVM approach, applied only on the HRV features of the ECG dataset, shows the best performance among the traditional ML models with an accuracy of 80.26%. Logistic regression tries to classify linearly separable data, but performs poorly with only 69.81% accuracy on the test data. The attention-based model, namely Attention-56, scored an accuracy of 98.07% and, hence, performed better than the ST-CNN-8 and SENet models. The ST-CNN-8 model yielded an accuracy of 97.93% and demonstrated an improvement over the SENet model. The reason behind the improvement could be the use of spatial and temporal layers that could exploit the information of all the channels as well as the information present across channels.

The proposed approach of concatenating the CNN features with the HRV features in the ECG-HiCOVIDNet model demonstrated better performance than the attention-based model. The ECG-HiCOVIDNet model scored an accuracy of 99.28%. The ResNet model with 50 layers achieved a higher accuracy of 99.81% on the test data. The proposed ECG-iCOVIDNet model yielded the best results with an accuracy of 100% on the test data. It scored an AUC of 100% and an F1-score of 1. It outperformed the traditional ML models and the state-of-the-art DL models. The Global Average Pooling (GAP) layer after the third convolutional block outputs the average of each feature map and reduces the vector size to 32 before the dense layers. This also reduced the total trainable parameters to nearly 9,500 in all the five classifiers for the five folds. In the ECG-HiCOVIDNet model, the GAP layer likewise outputs the average of each feature map as a vector of size 32. These 32 features, when concatenated with 43 HRV features, result in a total of 75 features after the concatenation layer. The total trainable parameters increased to nearly 66,500 in all the five classifiers for the five folds. The proposed ECG-HiCOVIDNet model demonstrated an improvement of 19.02% over the traditional models and a 1.14% improvement over the Attention-56 model, while the ECG-iCOVIDNet model displayed an improvement of 19.74% over the traditional models and 0.19% over the ResNet-50 model.
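The late fusion described for ECG-HiCOVIDNet, in which the 32-dimensional GAP embedding is concatenated with 43 HRV features, can be sketched as follows. The function name, the synthetic inputs, and the feature-map length of 200 are assumptions; any scaling of the HRV features before fusion is not described in the text and is therefore left out.

```python
import numpy as np

def late_fusion(feature_maps, hrv_features):
    """Concatenate the GAP embedding of the CNN feature maps with the HRV features.

    feature_maps: (T, 32) output of the last convolutional block.
    hrv_features: (43,) HRV feature vector of the same subject.
    """
    embedding = feature_maps.mean(axis=0)  # global average pooling: (T, 32) -> (32,)
    return np.concatenate([embedding, hrv_features])

rng = np.random.default_rng(0)
fused = late_fusion(rng.normal(size=(200, 32)), rng.normal(size=43))
print(fused.shape)  # (75,)
```

The fused 75-dimensional vector is what the dense layers of the second model would receive in place of the 32-dimensional embedding alone.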

Although we have seen that the CNN-based architecture, ECG-iCOVIDNet, performed best for ECG data classification, it is difficult for humans to understand the features learned by DL models due to their complex architecture and non-linear behaviour. CNNs are considered "black boxes" due to the lack of interpretability. ShAP ECGs corresponding to these important features are highlighted in red, while the features with lesser importance are shown in blue. The analysis is done at three levels as described below.

At the level of a single patient, it is important to identify the features from the data of that patient that help in classifying the patient into a particular class. ShAP applied on the CNN model accepts raw ECG data and generates an output of the same size as the input, with a ShAP value for each position of the input ECG data. A ShAP value of S > 0 indicates a positive contribution of the corresponding input position towards the classification of that patient into the predicted class. The top 500 ShAP values of each lead are used to highlight the most contributing features in red, as shown in Figs. 5 and 6.
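The patient-level highlighting step can be sketched as selecting, for each lead, the positions with the 500 largest ShAP values. The sketch assumes the ShAP values are already available as an array of the same shape as the ECG input; the function name `top_k_mask` and the synthetic values are hypothetical, and ranking by signed value (rather than absolute value) is an assumption.

```python
import numpy as np

def top_k_mask(shap_values, k=500):
    """Boolean mask marking the k largest ShAP values in each lead.

    shap_values: (n_samples, n_leads), one value per input position.
    """
    n_samples = shap_values.shape[0]
    k = min(k, n_samples)
    mask = np.zeros(shap_values.shape, dtype=bool)
    # argpartition moves the indices of the k largest entries of each column to the end.
    top_idx = np.argpartition(shap_values, n_samples - k, axis=0)[n_samples - k:]
    np.put_along_axis(mask, top_idx, True, axis=0)
    return mask

# Synthetic ShAP values for one 60 s, 500 Hz, 12-lead recording.
rng = np.random.default_rng(0)
shap_values = rng.normal(size=(30000, 12))
mask = top_k_mask(shap_values)
print(mask.sum(axis=0))  # number of highlighted positions per lead
```

Positions where the mask is true would be drawn in red on the ECG trace, and the remaining positions in blue.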

To compute the lead-wise importance for each class, we added the ShAP values of one lead for all the patients of each class separately. This is the total contribution of each lead in each class. Next, we averaged this lead-wise contribution for each class. The average contribution of each lead is also calculated. This process is documented in Algorithm-1. Figure 7 shows the impact of all the 12 leads on the two classes and also the average contribution of each lead. It is observed that in the prediction of the "post-COVID" class, lead aVL has the highest contribution, whereas in the prediction of the "healthy" class, lead aVR has the highest contribution. The average contribution of lead aVL is the highest amongst all the leads, followed by lead aVR. This shows that lead aVL contributes the most to the classification of healthy versus post-COVID. Indeed, this lead is known to have good predictive capability.
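One possible reading of the lead-wise aggregation described above is sketched below. It is an interpretation, not a transcription of Algorithm-1: summing signed ShAP values over time, dividing by the number of patients in each class, averaging the two classes with equal weight, and the lead ordering in `LEADS` are all assumptions, and the input values are synthetic.

```python
import numpy as np

# Assumed lead ordering; the ordering used in the original data is not stated.
LEADS = ["I", "II", "III", "aVR", "aVL", "aVF", "V1", "V2", "V3", "V4", "V5", "V6"]

def lead_importance(shap_values, labels):
    """Average lead-wise ShAP contribution per class and across classes.

    shap_values: (n_patients, n_samples, n_leads); labels: (n_patients,) of 0/1.
    """
    per_patient = shap_values.sum(axis=1)  # total contribution of each lead per patient
    per_class = {}
    for cls in (0, 1):
        members = per_patient[labels == cls]
        per_class[cls] = members.sum(axis=0) / len(members)  # average over the class
    average = (per_class[0] + per_class[1]) / 2.0
    return per_class, average

# Synthetic example: four patients, ten samples each.
labels = np.array([0, 0, 1, 1])
shap_values = np.zeros((4, 10, 12))
shap_values[labels == 0, :, 3] = 0.2  # class 0 driven by the lead at index 3
shap_values[labels == 1, :, 4] = 0.5  # class 1 driven by the lead at index 4
per_class, average = lead_importance(shap_values, labels)
print(LEADS[int(np.argmax(per_class[0]))], LEADS[int(np.argmax(per_class[1]))])
print(LEADS[int(np.argmax(average))])
```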

We also worked towards the explainability/interpretability of the ECG-iCOVIDNet classifier. In our study, subjects had Left Ventricular Diastolic Dysfunction (LVDD), which is measured in terms of Global Longitudinal Strain (GLS) during ECHO. A GLS value of less than 16% indicates LVDD and, correspondingly, a change should be observed in the ECG as a slurred S wave. These slurred S wave changes were indeed highlighted by our ECG-iCOVIDNet AI model via ShAP analysis in such patients, as shown in Fig. 6. Similarly, the ECG data of subjects having an ejection fraction of less than 45% show a notched or wider P wave. These changes were also detected by the ShAP interpretable AI model and were highlighted in red, as shown in Fig. 5.


The 12-lead ECG is the most common screening test to check for heart diseases. However, most of the time, underlying heart disease cannot be seen with ECG and requires higher diagnostic methods such as ECHO and CT scan. Our ECG-iCOVIDNet model is able to predict the underlying heart problem in post-COVID subjects using only ECG data with 100% accuracy. In this paper, we presented the results of various models trained using the ECG data of healthy and post-COVID subjects. It is evident that deep learning models perform better than the traditional ML models on this ECG dataset. We observed that the spatio-temporal CNN (ST-CNN-8), Attention-56, and ResNet were good architectures for the classification of the samples. The proposed ECG-iCOVIDNet and ECG-HiCOVIDNet models use convolution blocks with a global average pool layer to learn features from the 12-lead raw ECG data and demonstrate outstanding performance. This observation is also aligned with the visual inferences drawn from the t-SNE plots. The ECG-HiCOVIDNet model, which uses HRV features derived from ECG waves along with the raw ECG data, yields 99.28% accuracy, while the ECG-iCOVIDNet model without HRV features scores 100% accuracy. This shows that ECG-iCOVIDNet could extract relevant features from the raw ECG waveforms and, hence, the addition of HRV features derived from the raw ECG waveforms did not yield any advantage, again affirming the good performance of the trained model.

Secondly, we also worked towards the explainability/interpretability of the best classifier developed. This can help medical teams to trust the decisions made by the ECG-iCOVIDNet model. Our model highlighted the important abnormal segments of the ECG that help to distinguish between classes using ShAP at the patient level and the population level. Recently, the non-invasive ECG findings of P-wave dispersion (Pd) and wide/notched P wave have been shown to be signs of various pathological conditions in the literature, where Pd is calculated from the shortest and the longest P-wave durations recorded from ECG waves (Okutucu et al., 2016). Identification of Pd by the human eye is nearly impossible. Here, our model highlighted a wide/notched P wave in lead I of the ECG of post-COVID subjects in ShAP (Fig. 5). Left ventricular dysfunction is seen in some COVID-recovered subjects without previously diagnosed heart disease. A slurred S wave is a sign of left ventricular dysfunction (Tudoran et al., 2021; Takamatsu et al., 2008). This is highlighted in lead II of the ECG of post-COVID subjects (Fig. 6). This explainable AI model, with interpretable ShAP figures showing the abnormal segments in ECG waves, yields relevant medical results that would help doctors in primary and secondary healthcare centers to trust this AI model, which can help to diagnose post-COVID heart abnormalities using the ECG.

Our AI model, with visualization of the abnormal ECG segments, helps cardiologists in finding any underlying heart irregularities with less human error, especially in overloaded healthcare setups in low/middle-income countries such as India. Therefore, the proposed deep neural network architecture can be easily deployed in clinical setups, where the entire nation is struggling with a high burden of heart issues after suffering from COVID-19. Moreover, explainable AI models developed on 12-lead ECG data and its HRV features can help non-cardiologists diagnose the issues faster and in time, which can improve the efficiency of primary and secondary healthcare services in diagnosing heart pathology early and accurately.

This work can help doctors to screen post-COVID patients who come for follow-up care for heart issues using ECG, without costlier investigations such as ECHO, especially doctors at primary and secondary healthcare centers where no cardiologist is available. Furthermore, the model can be used to analyze the ongoing changes in post-COVID patients and treat them before these changes turn into major heart problems. Last but not least, such a study can also be used to identify and develop a broader understanding of the cardiac abnormalities due to coronavirus.

This study has certain limitations. First of all, although an elaborate effort was made to collect the ECG data of post-COVID subjects, the dataset is small as of now. Further, there are many interpretability methods that can be employed to draw inferences on the decisions made by the AI model. We employed SHAP analysis, which is one of the most widely used methods. Here, models are able to learn the abnormalities that exist in the patient's data, but a clinical interpretation may not be easy to generate for every such abnormality. In the future, it would be interesting to conduct a benchmarking study of the available interpretability methods on ECG applications to determine which method(s) work best on ECG datasets. A good question to answer is: is there an interpretability method that works best on ECG data in general? Researchers can carry out such a benchmarking study using multiple ECG datasets and multiple interpretability methods. Further, a web app can be built and deployed for use by COVID healthcare workers and doctors, where they can upload the ECG in a tabular format as input and obtain a result on whether the person is normal or has previously suffered from COVID. The app can also be used to indicate the ECG wave regions where the changes are observed. This work can also be extended to study the nature of COVID and its action on the human heart.
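As an illustration of the tabular upload step envisaged for such a web app, the sketch below parses a CSV file with one column per lead and one row per sample. The column names in `LEADS`, the function name `parse_ecg_table`, and the file layout are all assumptions made for this example, since the study does not specify an input format. The classifier and the ShAP visualization are outside the scope of this sketch.

```python
import csv
import io
from typing import Dict, List

# Assumed column headers for a 12-lead recording (hypothetical layout).
LEADS = ["I", "II", "III", "aVR", "aVL", "aVF",
         "V1", "V2", "V3", "V4", "V5", "V6"]


def parse_ecg_table(text: str) -> Dict[str, List[float]]:
    """Parse CSV text into a mapping from lead name to its list of samples."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [lead for lead in LEADS if lead not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing lead columns: {missing}")
    signals: Dict[str, List[float]] = {lead: [] for lead in LEADS}
    # Data rows start at line 2 of the file, after the header.
    for row_number, row in enumerate(reader, start=2):
        for lead in LEADS:
            try:
                signals[lead].append(float(row[lead]))
            except (TypeError, ValueError):
                raise ValueError(
                    f"non-numeric value in lead {lead}, row {row_number}"
                ) from None
    if not signals[LEADS[0]]:
        raise ValueError("file contains no samples")
    return signals


# Tiny illustrative upload: a header row plus two samples per lead.
sample_csv = ",".join(LEADS) + "\n" \
    + ",".join(["0.1"] * 12) + "\n" \
    + ",".join(["0.2"] * 12) + "\n"
ecg = parse_ecg_table(sample_csv)
print(len(ecg), "leads,", len(ecg["II"]), "samples per lead")
```

Validating the upload before inference keeps malformed files from reaching the model, which matters when the app is used by non-specialists.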

Automatic ectopic beat elimination in short-term heart rate variability measurement. Computer methods and programs in biomedicine

Heart rate variability is reduced 3-and 6-months after hospitalization for COVID-19 infection

COVID-19 image classification using deep learning: Advances, challenges and opportunities

Fractal and EMD based removal of baseline wander and powerline interference from ECG signals

Removal of baseline wander in ECG using the statistical properties of fractional Brownian motion

Explainable AI decision model for ECG data of cardiac disorders

A generative adversarial approach to ECG synthesis and denoising

ECG-BiCoNet: An ECG-based pipeline for COVID-19 diagnosis using bi-layers of deep features integration

Age and sex estimation using artificial intelligence from standard 12-lead ECGs

Automatic ECG diagnosis using convolutional neural network

Automated ASD detection using hybrid deep lightweight features extracted from EEG signals

On the application of convolutional neural networks for 12-lead ECG multi-label classification using datasets from multiple centers

COVID-19 and acute coronary syndromes: Current data and future implications

PrimePatNet87: Prime pattern and tunable q-factor wavelet transform techniques for automated accurate EEG emotion recognition

Post-COVID-19 heart syndrome. Cardiology journal

An interpretable DL model for stress detection using ECG in COVID-19 healthcare workers

Design and rationale of an intelligent algorithm to detect Burnout in Healthcare workers in COVID era using artificiaL intelligence: The BRUCEE-LI study

COVID 19-related burnout among healthcare workers in India and ECG based predictive machine learning model: Insights from the BRUCEE-Li study

Identifying patients at risk of post-discharge complications related to COVID-19 infection

Deep residual learning for image recognition

Squeeze-and-excitation networks

Explainable artificial intelligence to detect atrial fibrillation using electrocardiogram

ECG Arrhythmia classification using a 2-D convolutional neural network

Feed-forward LPQNet based automatic alzheimer's disease detection model

Heart rate variability in post-COVID-19 recovered subjects using machine learning

A current review of COVID-19 for the cardiovascular specialist

Deep convolutional neural network based ECG classification system using information fusion and one-hot encoding techniques

How to interpret elevated cardiac troponin levels? Circulation

Recent developments on therapeutic and diagnostic approaches for COVID-19

P-wave dispersion: What we know till now?

Heart rate variability as a marker of cardiovascular dysautonomia in post-COVID-19 syndrome using artificial intelligence

Generalized SIR (GSIR) epidemic model: An improved framework for the predictive monitoring of COVID-19 pandemic

An improved data driven dynamic SIRD model for predictive monitoring of COVID-19

Baseline wander and power-line interference removal from ECG signals using Fourier decomposition method

Residual-Attention-Convolutional-Neural-Network

Right bundle branch block and impaired left ventricular function as evidence of a left ventricular conduction delay

Kubios HRV-heart rate variability analysis software. Computer methods and programs in biomedicine

Alterations of left ventricular function persisting during post-acute COVID-19 in subjects without previously diagnosed cardiovascular pathology

Classification of Arrhythmia by using deep learning with 2-D ECG spectral image representation

Residual attention network for image classification

Interpretation of electrocardiogram (ECG) rhythm by combined CNN and BiLSTM

Comparison of rapid antigen tests for COVID-19

Interpretable deep learning for automatic diagnosis of 12-lead electrocardiogram

ECG-based multi-class arrhythmia detection using spatio-temporal attention-based convolutional recurrent neural network

We would like to thank the Centre of Excellence in Healthcare, IIIT-Delhi, India for providing the financial support to carry out this research work.