title: Spatio-temporal hybrid neural networks reduce erroneous human "judgement calls" in the diagnosis of Takotsubo syndrome
authors: Zaman, Fahim; Ponnapureddy, Rakesh; Wang, Yi Grace; Chang, Amanda; Cadaret, Linda M; Abdelhamid, Ahmed; Roy, Shubha D; Makan, Majesh; Zhou, Ruihai; Jayanna, Manju B; Gnall, Eric; Dai, Xuming; Singh, Avneet; Zheng, Jingsheng; Boppana, Venkata S; Wang, Feng; Singh, Pahul; Wu, Xiaodong; Liu, Kan
date: 2021-09-04
journal: EClinicalMedicine
DOI: 10.1016/j.eclinm.2021.101115

BACKGROUND: We investigated whether deep learning (DL) neural networks can reduce erroneous human "judgement calls" on bedside echocardiograms and help distinguish Takotsubo syndrome (TTS) from anterior wall ST segment elevation myocardial infarction (STEMI).

METHODS: We developed a single-channel (DCNN[2D SCI]), a multi-channel (DCNN[2D MCI]), and a 3-dimensional (DCNN[2D+t]) deep convolutional neural network, and a recurrent neural network (RNN), based on 17,280 still-frame images and 540 videos from 2-dimensional echocardiograms in a 10-year (1 January 2008 to 1 January 2018) retrospective cohort at the University of Iowa (UI) and eight other medical centers. Echocardiograms from 450 UI patients were randomly divided into training and testing sets for internal training, testing, and model construction. Echocardiograms of 90 patients from the other medical centers were used for external validation to evaluate model generalizability. A total of 49 board-certified human readers performed human-side classification on the same echocardiography dataset to compare diagnostic performance and support data visualization.
FINDINGS: The DCNN(2D SCI), DCNN(2D MCI), DCNN(2D+t), and RNN models built on the UI dataset for TTS versus STEMI prediction showed mean diagnostic accuracies of 73%, 75%, 80%, and 75%, respectively, and mean diagnostic accuracies of 74%, 74%, 77%, and 73%, respectively, on external validation. The DCNN(2D+t) (area under the curve [AUC] 0·787 vs. 0·699, P = 0·015) and RNN models (AUC 0·774 vs. 0·699, P = 0·033) outperformed human readers in differentiating TTS and STEMI by reducing erroneous human judgement calls on TTS.

INTERPRETATION: Spatio-temporal hybrid DL neural networks reduce erroneous human "judgement calls" in distinguishing TTS from anterior wall STEMI based on bedside echocardiographic videos.

FUNDING: University of Iowa Obermann Center for Advanced Studies Interdisciplinary Research Grant, and Institute for Clinical and Translational Science Grant. National Institutes of Health Award (1R01EB025018-01).

Despite its distinct pathogenesis, [1,2] Takotsubo syndrome (TTS) can mimic the clinical and electrocardiographic (ECG) features of acute myocardial infarction (AMI), including anterior wall ST segment elevation myocardial infarction (STEMI). Current guidelines advocate the use of coronary angiography to direct differential diagnosis and treatment. [3] Because a substantial portion of TTS cases are triggered by bleeding disorders, particularly of the central nervous system, frontline clinicians often face a dilemma, as anticoagulation (for cardiac catheterization) or thrombolysis can cause adverse, potentially lethal consequences. Meanwhile, misdiagnosing TTS as STEMI can lead to harmful pharmacological or device-based treatment and worsen hemodynamic compromise. [4,5] During the COVID-19 pandemic, TTS was increasingly found in patients with ECG features of STEMI.
[6] For provider protection and capacity leverage, the updated guideline requires point-of-care ultrasound (POCUS) or bedside echocardiography to triage STEMI patients with suspected COVID-19 infection before cardiac catheterization. [7] TTS-induced myocardial contractile dysfunction usually extends beyond a single (culprit) coronary artery territory. Nonetheless, tethering of nonischemic myocardium adjacent to ischemic or infarcted myocardium often causes two-dimensional (2D) echocardiographic analyses of regional myocardial contractile dysfunction to overestimate the actual ischemic region size. Coronary artery anatomic variations also create practical difficulties in distinguishing TTS from anterior wall STEMI based on regional wall motion characteristics in bedside echocardiograms. In daily practice, if clinical characteristics, biomarkers, and ECGs are inadequate for a definitive diagnosis, we often have to rely on echocardiography readers' "judgement calls" to support urgent decision-making. In the present study, we investigated whether deep learning (DL) neural networks could reduce erroneous "judgement calls" in the differential diagnosis of TTS and STEMI based on bedside echocardiographic images and videos, and the role of DL in supporting triage and management of cardiovascular emergencies.

We trained three deep convolutional neural networks (DCNN) and one recurrent neural network (RNN) on an echocardiographic database from 540 patients in a 10-year retrospective cohort (1 January 2008 to 1 January 2018) at the University of Iowa (UI) and eight other university-affiliated or regional medical centers (Washington University in St Louis, University of North Carolina, State University of New York, Weill Cornell Medical College, Kansas University, Lankenau Medical Center, Northwest Health Medical Center, and Providence Regional Medical Center) in the United States. An overview of the study design and datasets is illustrated in Fig. 1.
The research protocols and waiver of informed consent were approved by the human subjects committee of the UI institutional review board. We obtained clinical, laboratory (Table 1), ECG, angiographic, and echocardiographic imaging data of the studied patients and followed updated diagnostic criteria for STEMI [8] and TTS. [3] The differentiation between anterior wall STEMI and TTS was confirmed in all cases by coronary angiography (CAG). Cardiac catheterization with selective CAG, left ventriculography (LVG), and percutaneous coronary intervention were performed using standard techniques according to the updated European Society of Cardiology/American College of Cardiology guidelines. [8] Based on coronary artery anatomy, all ventricular segments were divided into culprit and non-culprit artery territories. Two interventional cardiologists, blinded to clinical findings, independently evaluated CAG and LVG images.

Transthoracic echocardiography (TTE) was performed using standard 2D echocardiographic techniques following the guidelines of the American Society of Echocardiography. [9] All images were stored digitally for playback and subsequent offline analysis. The 2D grayscale images were acquired in the standard apical views, and the standard apical 4-chamber left ventricular (LV) focused view images and videos were used for subsequent studies. Pixel data from the picture archiving and communication systems were preprocessed into numeric arrays, and the data were stored at a resolution of 800 × 600 pixels. If necessary, they were rescaled through bilinear interpolation. In STEMI patients with CAG-proven significant stenosis (>70%) of the left anterior descending artery (LAD), transthoracic echocardiograms were performed within 24 h of STEMI.
Patients were excluded if they had primary valvular disorders, significant pulmonary hypertension, atrial fibrillation, anomalous LAD origin, or no wall motion abnormality on left ventriculography (LVG) and transthoracic echocardiography. Based on anatomic features, Gensini score, culprit artery location, and dominant/major side-branch circulations, segments were divided into culprit or non-culprit artery-supplied areas according to the standard 17-segment LV model and previous publications. [10,11]

Research in context

Evidence before this study: Echocardiography plays a vital role in the triage and management of cardiovascular emergencies. A PubMed search for all types of papers in all languages up to May 28, 2021 with the search terms "echocardiography" (All Fields) AND "diagnosis" (All Fields) AND "deep learning" (All Fields) yielded 37 results, which have focused on investigating cardiac pathology to help the differential diagnosis of chronic cardiovascular disorders. The literature lacks studies that apply deep learning (DL) to real-time imaging for diagnosis or triage of acute cardiovascular disorders. Meanwhile, most reported DL prediction models were developed from still-frame echocardiographic images with increased data yield and improved classifications, but showed variable performance in finding advanced diagnostic markers.

Added value of this study: We show that spatio-temporal hybrid DL neural networks reduce erroneous human "judgement calls" in distinguishing Takotsubo syndrome (TTS) from anterior wall ST segment elevation myocardial infarction based on bedside echocardiograms. Effective spatio-temporal modeling of real-time imaging can help triage cardiovascular emergencies and resolve time-sensitive diagnostic dilemmas. Our study also demonstrates the potential of DL neural networks to reduce reliance on an individual physician's subjective diagnosis of rare cardiac diseases from images.
Implications of all the available evidence: Integrating effective spatio-temporal DL modeling into real-time cardiovascular imaging studies will increase the clinical relevance of AI in assisting non-expert imaging readers with urgently needed triage and management decisions in acute cardiovascular disorders.

We developed an ROI selection algorithm (Supplemental methods and Fig. 2A and 2B) to define the regions of interest (ROIs) in the echocardiograms as the input to the DL models, which removes artifacts and labels in the videos and also reduces the computational demands. Three DCNN models and one RNN model were implemented. The first two DCNN models (Fig. 2.C.a) were both based on a VGG network, [12] which consisted of nine convolution layers with a 3 × 3 kernel size for each convolution layer. The convolution stride was fixed to one voxel, and spatial padding was applied to each convolution layer input so that the spatial resolution was preserved after convolution. Max-pooling was performed over a 2 × 2 window to down-sample the feature maps by a factor of two. We started with 16 feature maps in the first convolution layer, and the number was doubled after every two convolution layers. At the end of these convolution layers, all feature maps were flattened and a dense layer with two channels (one for each class) was added with a soft-max activation to produce a probability prediction for each class. All hidden layers were equipped with rectified linear units (ReLU). [13] In our first DCNN model, we labeled each grayscale frame from all echocardiograms as a separate individual case and used it as the input to a single-channel DCNN model (DCNN[2D SCI]). We implemented the second model with exactly the same architecture as the first, but fed all frames from a single echocardiogram as a whole into a multi-channel DCNN model (DCNN[2D MCI]). The third model was also based on a VGG network with the same structure as the first two.
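As a rough illustration of the layer arithmetic described above, the following Python sketch tracks the (channels, height, width) of the feature maps through the nine convolution layers. The 128 × 128 input size and the placement of one pooling step after every second convolution layer are assumptions made for illustration, not details taken from the study.

```python
# Sketch of the feature-map bookkeeping in the described VGG-style DCNN.
# Assumptions (not stated in the paper): a 128x128 single-channel input,
# and one 2x2 max-pool (halving the spatial size) after every two convs.

def vgg_feature_shapes(h=128, w=128, n_conv=9, first_maps=16):
    """Track (channels, height, width) through the convolution stack.

    3x3 convolutions with 'same' padding preserve spatial size;
    channels start at `first_maps` and double after every two convs.
    """
    shapes = []
    channels = first_maps
    for layer in range(1, n_conv + 1):
        shapes.append((channels, h, w))       # shape after this conv layer
        if layer % 2 == 0:                    # assumed pool after every 2 convs
            h, w = h // 2, w // 2             # 2x2 max-pool halves each axis
            channels *= 2                     # doubling schedule from the text
    return shapes

shapes = vgg_feature_shapes()
flat = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
# (256, 8, 8) -> 16,384 flattened features feeding the two-way soft-max layer
print(shapes[-1], flat)
```

Under these assumptions, the nine-layer stack ends at 256 feature maps of 8 × 8 pixels before the flatten and two-channel soft-max head.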
However, instead of using 2D frames as input, we used the echocardiogram videos as a 3-dimensional input. Hence, in this model, all convolution and pooling layers were equipped with 3 × 3 × 3 and 2 × 2 × 2 kernels, respectively. The other network parameters were the same as those in the first two models. We denoted this model DCNN(2D+t). The fourth model was a recurrent neural network (RNN) with four long short-term memory (LSTM) layers stacked consecutively. The last LSTM layer connected to a dense layer with 32 neurons, followed by a soft-max layer with two neurons for class prediction. We flattened all echocardiogram frames of a video and used them as input to the recurrent neural network. The DCNN and RNN architectures/algorithms are detailed in Fig. 2C and the Supplemental methods.

Model training and validation were based on an image database consisting of 17,280 still-frame images and 540 videos from apical 4-chamber view 2D echocardiograms of 540 patients at the University of Iowa (UI) and eight other medical centers. Internal training and validation were performed in two stepwise stages: control versus disease and TTS versus STEMI. We used ten-fold cross-validation for training and validation on 14,400 still-frame images and 450 videos from the echocardiograms of UI patients (150 control, 140 TTS, and 160 STEMI). The dataset was randomly divided into ten subsets with the same ratio among the classes as the original dataset to maintain the class balance. Each model was validated on each subset of the data after being trained on the remaining nine subsets. The performance of the model was the average of all ten validation scores. In addition, each model was trained with augmented data (detailed in Supplemental methods).
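The stratified ten-fold split described above can be sketched in a few lines of Python; the round-robin assignment and synthetic patient IDs below are our own illustrative choices, not the study's actual partitioning code.

```python
import random

# Sketch of a stratified ten-fold split: 450 UI studies (150 control,
# 140 TTS, 160 STEMI) divided into ten subsets that preserve the
# original class ratio. Patient IDs here are synthetic placeholders.

def stratified_folds(labels_by_class, k=10, seed=0):
    """Return k folds, each a list of (patient_id, label) pairs."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label, ids in labels_by_class.items():
        ids = list(ids)
        rng.shuffle(ids)                       # random order within each class
        for i, pid in enumerate(ids):
            folds[i % k].append((pid, label))  # deal out round-robin
    return folds

cohort = {
    "control": range(150),
    "TTS":     range(150, 290),   # 140 patients
    "STEMI":   range(290, 450),   # 160 patients
}
folds = stratified_folds(cohort)
# each fold holds 45 studies: 15 control, 14 TTS, 16 STEMI
```

Validation then cycles through the folds: each fold serves once as the held-out set while a model trained on the other nine is scored, and the ten scores are averaged.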
The models for the TTS vs STEMI classification task were also tested for generalizability on an external dataset consisting of 2880 still-frame images and 90 videos from the echocardiograms of 90 patients with either TTS or STEMI at the eight external centers (Fig. 1).

We used Qualtrics® software (October 2020 version) to create video image surveys: de-identified echocardiographic videos of the standard apical 4-chamber view from 300 individual patients (160 STEMI and 140 TTS) were used for the image surveys. The survey was anonymous and distributed through electronic links to all 57 participants. The only additional information we requested was participants' clinical specialty and training/working time. A total of 49 readers eventually completed all 300 video readings for the human-side classification. They included 30 board-certified cardiologists (8 interventional board-certified cardiologists and 22 National Board of Echocardiography board-certified general cardiologists), 11 senior American Registry for Diagnostic Medical Sonography board-certified cardiology sonographers, and 8 frontline care (emergency and critical care) physicians with more than three years' experience of POCUS training (Acknowledgement list). The readers were blinded to any additional clinical, laboratory, ECG, echocardiography, angiography, or ventriculography data.

We evaluated and compared the image survey results with the cross-validated results of the DCNN(2D+t) and RNN models. We combined all 49 human outputs by majority voting and defined them as the human (voting) results. The correctness of the human results was defined as the percentage of human readers who made the same diagnosis as the coronary angiography. Conversely, the correctness of the DCNN and RNN models was the estimated probability that the model made the same prediction as the coronary angiography.
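The reader-aggregation scheme above can be sketched as follows; the 49 votes are synthetic stand-ins, since the actual survey responses are in the study data.

```python
from collections import Counter

# Sketch of the "human (voting)" aggregation: per echocardiogram, the
# majority label over all 49 readers; correctness is the fraction of
# readers agreeing with the angiographic label. Votes below are synthetic.

def majority_vote(votes):
    """Most common label among readers for one echocardiogram."""
    return Counter(votes).most_common(1)[0][0]

def reader_correctness(votes, truth):
    """Fraction of readers whose call matches the angiographic diagnosis."""
    return sum(v == truth for v in votes) / len(votes)

votes = ["TTS"] * 30 + ["STEMI"] * 19            # 49 synthetic reader calls
print(majority_vote(votes))                       # -> TTS
print(round(reader_correctness(votes, "TTS"), 3)) # -> 0.612
```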
We examined and reported the data distributions and confusion matrices of the accuracy of the human results in comparison with those of the DCNN(2D+t) and RNN results, and visualized the results using principal component analysis (PCA). We evaluated the performance of the DL neural networks and the human readers using receiver operating characteristic (ROC) curve analysis and confusion matrices with respect to the coronary angiography results. Pairwise comparisons of the area under the ROC curve (AUC) were carried out according to the DeLong method, [14] while pairwise comparisons of the confusion matrices were performed with Fisher's exact test. All statistical analysis was performed using the open-source software Python 3.7.4 with the SciPy package. Statistical significance was defined as a P value <0·05. Drs. FZ, RP, KL, and XDW have full access to the study data. The funders had no role in the study design; collection, analysis, and interpretation of data; writing of the report; or the decision to submit the article for publication.

The demographic, clinical, and basic echocardiography assessment data of STEMI and TTS patients are summarized in Table 1. To assess the quality of disease prediction, we used the interpretability method of Gradient-weighted Class Activation Mapping (Grad-CAM), [15] which aims to unfold the activations of the network layers in a deep neural network. In the heatmap, a brighter point indicates that the corresponding pixel in the input image plays a more important role in class prediction. Fig. 3 shows weighted Grad-CAM heatmaps overlaid on randomly chosen samples from each of the classes for the TTS versus MI classification, and the five best-weighted activation maps for each of the randomly chosen TTS or MI samples. The color range of each heatmap runs from dark blue to dark red, where dark blue marks the least important and dark red the most important pixels for model prediction.
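The Grad-CAM weighting behind these heatmaps can be sketched with NumPy: channel weights are the spatial average of the class-score gradients, and the map is the ReLU-rectified weighted sum of activation maps. The activations and gradients below are random stand-ins for a real network's last convolution layer, not outputs of the study's models.

```python
import numpy as np

# Sketch of Grad-CAM for one input: alpha_k = spatial mean of the class-score
# gradients per channel; heatmap = ReLU(sum_k alpha_k * A_k), normalized.

def grad_cam(activations, gradients):
    """activations, gradients: arrays of shape (K, H, W) for one input."""
    alpha = gradients.mean(axis=(1, 2))             # (K,) channel weights
    cam = np.tensordot(alpha, activations, axes=1)  # weighted sum -> (H, W)
    cam = np.maximum(cam, 0.0)                      # ReLU keeps positive evidence
    if cam.max() > 0:
        cam = cam / cam.max()                       # scale to [0, 1] for display
    return cam

rng = np.random.default_rng(0)
A = rng.random((256, 8, 8))        # stand-in activation maps
dA = rng.normal(size=(256, 8, 8))  # stand-in gradients of the class score
heatmap = grad_cam(A, dA)          # (8, 8) map, upsampled onto the frame for display
```

In practice the low-resolution map is bilinearly upsampled to the frame size and rendered with the blue-to-red colormap described above.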
The class-specific sensitivity, specificity, PPV, and F1 scores for the different models are shown in Table 2. Briefly, the DCNN(2D SCI), DCNN(2D MCI), DCNN(2D+t), and RNN models for the control vs disease prediction showed mean accuracies of 78%, 83%, 92%, and 81%, respectively. The DCNN(2D SCI), DCNN(2D MCI), DCNN(2D+t), and RNN models for TTS vs STEMI prediction showed mean accuracies of 73%, 75%, 80%, and 75%, respectively, and mean accuracies of 74%, 74%, 77%, and 73%, respectively, on the external validation.

The numbers of correct human readings (consistent with coronary angiographic results) on STEMI showed a (left-skewed) normal distribution pattern. In contrast, the correct human readings on TTS were rather random (Fig. 4). The confusion matrices showed that the DCNN(2D+t) (81·4% correctness) and RNN (70·7% correctness) outperformed human readers (54·3% correctness) in diagnosing TTS (P = 0·000002 and 0·006, respectively, Fisher's exact test), while their performances (77·5% and 78·8% correctness) were comparable to that of the human readers (79·4%) on STEMI (P = 0·786 and 1·0, respectively, Fisher's exact test) (Fig. 4). The AUC analysis showed that the DCNN(2D+t) (0·787 vs. 0·699, P = 0·015) and RNN models (0·774 vs. 0·699, P = 0·033) consistently outperformed human readers in differentiating TTS and STEMI (Fig. 5). In PCA, the DCNN(2D+t) result appeared to be the closest to the coronary angiography results, followed by the RNN and the human results (Fig. 5).

The present study shows that spatio-temporal hybrid DL neural networks can reduce erroneous human "judgement calls" in distinguishing TTS from anterior wall STEMI based on bedside echocardiographic videos, and demonstrates the potential of DL to assist in frontline triage and management of cardiovascular emergencies.
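The Fisher's-exact comparison on TTS can be reproduced approximately with SciPy. The counts below are reconstructed from the reported correctness percentages on the 140 TTS studies (81·4% ≈ 114/140 for DCNN(2D+t); 54·3% ≈ 76/140 for human readers); the study's exact contingency tables are those underlying Fig. 4, so this is an estimate, not the original analysis.

```python
from scipy.stats import fisher_exact

# Approximate 2x2 table reconstructed from the reported TTS correctness:
# rows = method, columns = (correct, incorrect) out of 140 TTS studies.
table = [[114, 140 - 114],   # DCNN(2D+t): ~81.4% correct
         [76,  140 - 76]]    # human readers: ~54.3% correct

odds_ratio, p_value = fisher_exact(table)
print(p_value)  # very small, consistent with the reported P = 0.000002
```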
Although echocardiographic videos hold comprehensive imaging information allowing wide-ranging measurements of overt and covert ventricular function, human assessment often subconsciously limits the sampling of spatio-temporal information due to time restrictions. While human brains are wired to bias toward specific areas of the images based on personal knowledge and experience, DL neural networks speedily analyze every individual pixel, with the potential to objectively identify delicate features and uncover predictive information that may be lost on human readers. [16,17] This lays the foundation for using spatio-temporal convolutions in the classification of echocardiographic studies to support real-time differential diagnosis during acute cardiovascular disorders.

DL neural networks have been increasingly used to investigate cardiac pathology based on still-frame echocardiographic images. Although increased image data yield helps binary classification and computable decision boundaries, their accuracy in identifying "advanced" markers appears variable. [18–20] The myocardial contractile and relaxing process is highly heterogeneous and time-dependent. Classifying individual image frames in isolation may limit the perception of temporal features during or between cardiac cycles. Instead, the composed spatio-temporal information within or between consecutive static images likely empowers DL neural models to recognize subtle changes in myocardial contractility and function. [21–23] In the present study, the DCNN(2D+t) model based on echocardiographic videos outperformed the DCNN(2D) models based on static image frames. From the heatmap data visualization, much of the benefit appeared to come from improved discrimination between certain pairs of views that the DCNN(2D) models found challenging.
In such cases, the temporal arm's saliency map showed intense focal activation over the basal LV and right ventricular (RV) segments in echocardiographic videos of TTS and STEMI patients (Fig. 3). The DCNN(2D MCI) model showed slightly improved performance over DCNN(2D SCI), even with its very limited capacity to exploit spatial feature changes between image frames. The true interpretation emerged only when a temporal imaging sequence was integrated. Both DCNN(2D+t) and RNN models explore temporal motion features in an echocardiographic video. By leveraging spatial and temporal information from multiple image frames across an echocardiographic video, the DCNN(2D+t) and RNN models can detect subtle functional/motion changes through a cumulative evaluation of the continuous movement of the heart, and are therefore likely more sensitive than the DCNN(2D) models.

[Fig. 5 legend: Correctness for the human result is the percentage of human readers who made the same diagnosis as the coronary angiography; correctness for the DCNN and RNN models is the estimated probability that the model made the same prediction as the coronary angiography. Left: ROC curves of the overall human, DCNN(2D+t), and RNN results, together with the ten best human results; the P value for the AUC of DCNN(2D+t) versus the overall human result is 0·015, and that of RNN versus the human result is 0·033; the blue circles represent the 10 human readers with the best performance. Right: visualization of the DCNN(2D+t)/RNN and individual human results with principal component analysis; each point represents the comprehensive diagnoses on all 300 echocardiograms from patients (with TTS or STEMI) given by one human reader or one model, projected onto a two-dimensional space; the human, angiography, DCNN(2D+t), and RNN results are shown by different shapes, and reader specialties by different colors.]
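The PCA projection used to compare readers and models can be sketched as follows: each reader (or model) is a 300-dimensional vector of per-study calls, reduced to two dimensions via the singular value decomposition. The reader votes below are synthetic, generated under an assumed per-reader error rate.

```python
import numpy as np

# Sketch of projecting reader/model diagnosis vectors to 2D with PCA.
# Each row = one reader; columns = 300 studies (1 = TTS call, 0 = STEMI call).

def pca_2d(X):
    """Project rows of X onto the first two principal components."""
    Xc = X - X.mean(axis=0)                        # center each study column
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T                           # (n_readers, 2) coordinates

rng = np.random.default_rng(1)
truth = rng.integers(0, 2, 300)                    # stand-in angiographic labels
# 49 synthetic readers, each flipping the true label at a reader-specific rate
readers = np.array([np.where(rng.random(300) < err, 1 - truth, truth)
                    for err in rng.uniform(0.1, 0.5, 49)])
coords = pca_2d(readers.astype(float))             # 2D points, one per reader
```

Plotting these coordinates (with the angiography and model vectors projected the same way) reproduces the kind of scatter shown in the right panel of Fig. 5.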
[21,23] The visualization data support the theory that the DCNN(2D+t) network's ability to discriminate between such classes may be due in part to its ability to track the movement of cardiac structures (basal LV and RV walls) throughout their simultaneous multi-dimensional motion, thereby increasing effective data resolution and capturing "invisible" spatio-temporal imaging information. With its limited receptive field, DCNN(2D+t) is able to "see" only a limited number of frames concurrently through the 3D convolution operations, which enables learning of relatively regional ventricular motions. [21,22]

In the present study, DL networks significantly reduced erroneous human "judgement calls" on TTS. In contrast to the normal distribution pattern of the numbers of readers making the correct diagnosis on STEMI, the distribution of the numbers of readers who correctly identified TTS appears random and arbitrary (Fig. 4), and did not reflect the readers' varied training backgrounds and experience. The readers' presumption of a low prevalence of TTS from their previous experience may generate judgement bias. In real-life practice, this bias may be further amplified by a fear of missing the diagnosis of a possibly life-threatening anterior wall STEMI.

Imaging training and research on rare diseases usually rely on establishing national/international registries to build a large-scale image database. Due to a paucity of automated resources for processing raw images and no consistent reporting of data quality measures, this practice requires close collaboration among multiple medical centers in sharing data and study model settings. [2] For many such diseases, including TTS, there is currently no publicly available image database to enrich readers' training and education. Without a well-accepted consensus, reading physicians sometimes have to rely on instinct and personal experience to make judgement calls for the diagnosis.
The inherent subjectivity likely results in inter-observer variation and error. Since DL image processing and analysis can reduce the variability in human readers' performance due to varied training backgrounds and experience, our results and further data visualization may help uncover possible caveats in the imaging training pathways of human readers, which would contribute to tailoring and standardizing future imaging training strategies. [24,25]

The present study has several limitations. 1. Both STEMI and TTS are dynamic and time-variant processes, and contractile dysfunction patterns likely vary at different time points, which makes differential diagnosis even more challenging with a single-time-point echocardiogram. Enriching the training echocardiographic datasets with TTS/STEMI imaging features at different evolving stages may help further improve the diagnostic accuracy of DL neural networks. [26,27] 2. The RNN model, which implements the concept of memory by introducing backward feedback links between layers, is capable of learning temporal context across video frames and thus captures more global motion features; however, our current experiments based on limited views of echocardiograms show that the RNN model is inferior to DCNN(2D+t), which may be due to the RNN's relatively weak capability for learning spatial features. An interesting direction for future work would be to unify the strengths of DCNNs and RNNs in a single neural network to explore the spatio-temporal information more deeply and further improve diagnostic accuracy. 3. The present study was designed to differentiate typical (apical) TTS and anterior STEMI, and does not apply to atypical (inverse and biventricular) TTS. The diagnostic value of CAG may be limited since TTS can be present even in patients with significant coronary artery disease.
[3] Due to the retrospective nature of the present study, we were unable to apply cardiovascular magnetic resonance imaging in most patients to define STEMI vs. TTS, which would help further differentiate atypical STEMI and TTS phenotypes. For example, contractility patterns comparable to the typical apical TTS phenotype (the "Takotsubo effect") have been increasingly reported and recognized in patients with STEMI, [10,11,28,29] but were not excluded from our training database of echocardiograms from the past 10 years. New training datasets with more delicate phenotyping of STEMI may help further refine the prediction models. 4. Human echocardiography interpretation usually relies on comprehensive echocardiographic techniques beyond 2D studies, such as Doppler and myocardial strain. The clinical context and information also contribute to the final (differential) diagnosis. Therefore, the present study was not designed to compare the real-life (differential) diagnostic accuracy of AI and human readers. Instead, we aimed to determine the possible added value of DL neural networks in assisting non-expert imaging readers with urgently needed disease triage and management decisions during cardiovascular emergencies.

During the COVID-19 pandemic, the utilization of comprehensive TTE has largely been replaced by focused bedside echocardiography and POCUS to limit exposure and viral transmission. [7] Meanwhile, maintaining diagnostic accuracy and goal-directed therapy in patients with cardiac injuries based on focused echocardiography and POCUS becomes a compelling challenge. Our study serves as proof of concept that DL can streamline and empower currently available bedside imaging tools to effectively and efficiently support real-time triage and management of cardiovascular emergencies, particularly in rural areas or during a global healthcare crisis.
Any additional information on methods, research results, extended data, and statements of data availability are available in our submitted supplemental methods and results, and are also available online (UI SharePoint/OneDrive shared folder) with all original echocardiographic images and videos.

This work was funded by the Obermann Center for Advanced Studies Interdisciplinary Research Grant and Institute for Clinical and Translational Science Grant to KL and XDW, and a National Institutes of Health Award (1R01EB025018-01) to YG Wang. YG Wang declares a National Institutes of Health Award (1R01EB025018-01) and a California State University Dominguez Hills RSCA award to support a 3-unit course release in Fall 2021. All other authors have nothing to declare.

References
1. Pathophysiology of Takotsubo Syndrome: JACC State-of-the-Art Review.
2. Clinical Features and Outcomes of Takotsubo (Stress) Cardiomyopathy.
3. International Expert Consensus Document on Takotsubo Syndrome (Part I): Clinical Characteristics, Diagnostic Criteria, and Pathophysiology.
4. Comatose 62-Year-Old Woman Following Cardiopulmonary Resuscitation.
5. Echocardiographic correlates of acute heart failure, cardiogenic shock, and in-hospital mortality in tako-tsubo cardiomyopathy.
6. Incidence of Stress Cardiomyopathy During the Coronavirus Disease 2019 Pandemic.
7. Management of acute myocardial infarction during the COVID-19 pandemic: A Consensus Statement from the Society for Cardiovascular Angiography and Interventions (SCAI), the American College of Cardiology (ACC), and the American College of Emergency Physicians (ACEP).
8. ACC/AHA/SCAI focused update on primary percutaneous coronary intervention for patients with ST-elevation myocardial infarction: An update of the 2011 ACCF/AHA/SCAI guideline for percutaneous coronary intervention and the 2013 ACCF/AHA guideline for the management of ST-elevation myocardial infarction: A report of the American College of Cardiology/American Heart Association Task Force on Clinical Practice Guidelines and the Society for Cardiovascular Angiography and Interventions.
9. Recommendations for cardiac chamber quantification by echocardiography in adults: an update from the American Society of Echocardiography and the European Association of Cardiovascular Imaging.
10. "Takotsubo effect" in patients with ST segment elevation myocardial infarction.
11. Discrepant myocardial microvascular perfusion and mechanics after acute myocardial infarction: Characterization of the "Tako-tsubo effect" with real-time myocardial perfusion contrast echocardiography.
12. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv preprint server.
13. ImageNet classification with deep convolutional neural networks.
14. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach.
15. Visual Explanations from Deep Networks via Gradient-Based Localization.
16. Prediction of cardiovascular risk factors from retinal fundus photographs via deep learning.
17. Optimal surface segmentation with convex priors in irregularly sampled space.
18. Fast and accurate view classification of echocardiograms using deep learning.
19. Deep learning interpretation of echocardiograms.
20. Fully Automated Echocardiogram Interpretation in Clinical Practice.
21. Improving ultrasound video classification: an evaluation of novel deep learning methods in echocardiography.
22. Automated Recognition of Regional Wall Motion Abnormalities Through Deep Neural Network Interpretation of Transthoracic Echocardiography.
23. Video-based AI for beat-to-beat assessment of cardiac function.
24. Big Data and Machine Learning in Health Care.
25. Predicting the Future - Big Data, Machine Learning, and Clinical Medicine.
26. Proposed Requirements for Cardiovascular Imaging-Related Machine Learning Evaluation (PRIME): A Checklist: Reviewed by the American College of Cardiology Healthcare Innovation Council.
27. Machine Learning Assessment of Left Ventricular Diastolic Function Based on Electrocardiographic Features.
28. Coexistence of acute takotsubo syndrome and acute coronary syndrome.
29. MicroRNA-33 and SIRT1 influence the coronary thrombus burden in hyperglycemic STEMI patients.

We are grateful to the following colleagues who volunteered to participate in our image survey: University

Supplementary material associated with this article can be found in the online version at doi:10.1016/j.eclinm.2021.101115.