title: Using StyleGAN for Visual Interpretability of Deep Learning Models on Medical Images
authors: Schutte, Kathryn; Moindrot, Olivier; Hérent, Paul; Schiratti, Jean-Baptiste; Jégou, Simon
date: 2021-01-19

As AI-based medical devices are becoming more common in imaging fields like radiology and histology, interpretability of the underlying predictive models is crucial to expand their use in clinical practice. Existing heatmap-based interpretability methods such as GradCAM only highlight the location of predictive features but do not explain how they contribute to the prediction. In this paper, we propose a new interpretability method that can be used to understand the predictions of any black-box model on images, by showing how the input image would be modified in order to produce different predictions. A StyleGAN is trained on medical images to provide a mapping between latent vectors and images. Our method identifies the optimal direction in the latent space to create a change in the model prediction. By shifting the latent representation of an input image along this direction, we can produce a series of new synthetic images with changed predictions. We validate our approach on histology and radiology images, and demonstrate its ability to provide meaningful explanations that are more informative than GradCAM heatmaps. Our method reveals the patterns learned by the model, which allows clinicians to build trust in the model's predictions, discover new biomarkers and eventually reveal potential biases.

As of September 2020, the FDA had approved 64 AI-based medical devices (Benjamens et al., 2020), and for the first time the Centers for Medicare & Medicaid Services (CMS) approved the reimbursement of a deep-learning-powered stroke detector for brain CT scans (Viz.ai, 2020). The advances of deep learning in computer vision (Krizhevsky et al., 2012) are especially promising in medical imaging fields such as radiology (Ardila et al., 2019), histology (Coudray et al., 2018), dermatology (Esteva et al., 2017) and ophthalmology (Gulshan et al., 2016). While many deep learning techniques provide state-of-the-art predictive performance, interpretable deep learning models are necessary for regulatory approval, as their ability to explain their predictions can reveal potential biases and failure modes, as seen in the case reported in (Oakden-Rayner, 2017). Additionally, interpretable models provide new opportunities for biomedical investigation, as evidenced in (Courtiol et al., 2019). Finally, such models are better able to make inroads with medical experts, as their explainability helps build confidence in their utility (Holzinger et al., 2019).

As illustrated by the COVID-19 crisis (Li et al., 2020; Wang et al., 2020), the go-to method for model interpretation in the medical imaging field is GradCAM (Selvaraju et al., 2017), which produces a coarse heatmap based on gradient intensity to identify which areas of the input image are responsible for the prediction. However, these heatmaps only highlight the location of predictive features and do not explain how those features contribute to the prediction. In an image where the information is diffuse, the heatmap cannot highlight any specific region, so GradCAM alone is not sufficient to interpret the model's predictions.
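As background on the heatmap baseline used throughout the paper, the following is a minimal sketch of the GradCAM computation for a generic PyTorch image classifier. It is not the authors' implementation; the use of a torchvision ResNet50 and of layer4[-1] as the target layer are assumptions made purely for illustration.

import torch
import torch.nn.functional as F
from torchvision import models

# Stand-in classifier (assumption): any CNN with a final conv block would do.
model = models.resnet50(weights=None)
model.eval()

activations, gradients = {}, {}

def forward_hook(module, inputs, output):
    activations["value"] = output.detach()

def backward_hook(module, grad_input, grad_output):
    gradients["value"] = grad_output[0].detach()

target_layer = model.layer4[-1]  # last residual block (assumed target layer)
target_layer.register_forward_hook(forward_hook)
target_layer.register_full_backward_hook(backward_hook)

def gradcam(x, class_idx):
    """Return a heatmap (H, W) in [0, 1] for one input image x of shape (1, 3, H, W)."""
    scores = model(x)
    model.zero_grad()
    scores[0, class_idx].backward()
    acts = activations["value"]                               # (1, C, h, w)
    grads = gradients["value"]                                 # (1, C, h, w)
    weights = grads.mean(dim=(2, 3), keepdim=True)             # global-average-pooled gradients
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))    # weighted sum of activation maps
    cam = F.interpolate(cam, size=x.shape[2:], mode="bilinear", align_corners=False)
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)   # normalize for display
    return cam[0, 0]

The resulting coarse map says where the evidence is, which is exactly the limitation the paper addresses: it gives no indication of how the highlighted regions would need to change to alter the prediction.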
In this paper, we propose a new interpretability method that generates small synthetic transformations of the original image that would lead to different model predictions. We train a generative model, StyleGAN (Karras et al., 2019, 2020), and find the minimal modification in the latent space that changes the model prediction, which ensures that the generated images remain as close as possible to the original image. Seah et al. (2019) explore a similar idea by using an older GAN algorithm to create heatmaps highlighting features of congestive heart failure, but their method cannot be applied to an arbitrary black-box model. Fetty et al. (2020) manipulate three attributes of the StyleGAN latent space in order to enlarge datasets with synthetic images. We validate our interpretability method on two different imaging modalities and demonstrate its ability to provide meaningful explanations of the predictions, as well as its potential to discover new biomarkers.

We propose to create StyleGAN-generated visualizations that explain the predictions of a deep neural network in an interpretable manner. Let f be a classifier (e.g. a fully convolutional neural network) trained on a dataset D = {(x_i, y_i)} ⊂ X × Y, where X denotes a set of 2D images and Y a finite set of labels. Our method consists of three steps. First, the images in X are used to train a StyleGAN2 (Karras et al., 2020), an improved GAN whose generator G : W → X has a linearly disentangled intermediate latent space W ⊂ R^512. The generator G is used to generate a set of synthetic images (G(w_i)), where the w_i are sampled in the latent space W. Then, we train (using a mean squared error loss) a ResNet50 (He et al., 2016) encoder E : X → W on the synthetic dataset (w_i, G(w_i)) to retrieve the latent representation w_i from a generated image G(w_i). Finally, a logistic regression classifier f̂(w_i) = σ(α^T w_i + β) is trained on the latent space W to predict the estimated labels ỹ_i = f(G(w_i)) associated with each latent vector w_i ∈ W. Given a new input image x ∈ X, our method translates the latent vector w = E(x) along the direction α. New images can then be generated from the shifted latent representation via G(w + λα), with a lower or higher prediction depending on the value of λ ∈ R.
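To make the three steps above concrete, here is a minimal PyTorch sketch of the pipeline. It is not the authors' code: the generator G, the black-box classifier f, all function names and hyperparameters are assumptions made for illustration, and latents are sampled from a standard normal for simplicity (the paper samples in StyleGAN2's intermediate space W via its mapping network).

import torch
import torch.nn as nn
from torchvision import models
from sklearn.linear_model import LogisticRegression

LATENT_DIM = 512

def make_encoder():
    """ResNet50 regressing the 512-d latent vector from an image (step 2)."""
    encoder = models.resnet50(weights=None)
    encoder.fc = nn.Linear(encoder.fc.in_features, LATENT_DIM)
    return encoder

def train_encoder(G, encoder, n_iters=10_000, batch_size=16, lr=1e-4, device="cuda"):
    """Fit E on synthetic pairs (w, G(w)) with a mean squared error loss."""
    opt = torch.optim.Adam(encoder.parameters(), lr=lr)
    mse = nn.MSELoss()
    encoder.to(device).train()
    for _ in range(n_iters):
        w = torch.randn(batch_size, LATENT_DIM, device=device)  # sampled latents (simplified)
        with torch.no_grad():
            x = G(w)                                            # synthetic images
        loss = mse(encoder(x), w)                               # recover w from G(w)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return encoder

def fit_latent_direction(G, f, n_samples=5_000, batch_size=32, device="cuda"):
    """Logistic regression f̂(w) = σ(αᵀw + β) on labels given by the black box f (step 3)."""
    ws, ys = [], []
    with torch.no_grad():
        for _ in range(n_samples // batch_size):
            w = torch.randn(batch_size, LATENT_DIM, device=device)
            y = (f(G(w)) > 0.5).long()                          # labels ỹ assigned by f
            ws.append(w.cpu())
            ys.append(y.cpu())
    clf = LogisticRegression(max_iter=1000).fit(torch.cat(ws).numpy(),
                                                torch.cat(ys).numpy().ravel())
    return torch.tensor(clf.coef_[0], dtype=torch.float32)      # direction α

def traverse(G, encoder, alpha, x, lambdas=(-2.0, -1.0, 0.0, 1.0, 2.0)):
    """Shift the latent code of a real image along α and decode each shift."""
    encoder.eval()
    with torch.no_grad():
        w = encoder(x)                                          # w = E(x)
        return [G(w + lam * alpha.to(w.device)) for lam in lambdas]

Sweeping λ over a small range and decoding G(w + λα) at each value yields an image series of the kind shown in Figure 1, with the model prediction increasing or decreasing along the series.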
We first demonstrate our method by explaining the predictions of an osteoarthritis severity predictor on X-ray images. The dataset on which the predictor was trained consists of 20,123 X-rays of patients suffering from knee osteoarthritis, collected by the Osteoarthritis Initiative (OAI) (Nevitt et al., 2006). Each patient has one to eight 12-month follow-up X-rays, as well as associated clinical data, including the Kellgren and Lawrence (KL) grade (Kohn et al., 2016). The KL grade describes the degree of osteoarthritis severity and ranges from 0 to 4: grades 0 and 1 mean no or doubtful osteoarthritis, while grades 2 to 4 mean mild to severe osteoarthritis. The image classifier f is a ResNet50 trained on the multi-class prediction task. To fit this multi-class setting to our method, we transform it into a binary classification task by pooling grades 0 and 1 versus grades 2 to 4. The predictor f obtains 89% test AUC on this binary task, while f̂ obtains 80% test AUC on the latent space.

Three radiologists evaluated the quality of the StyleGAN generator with a Turing test. They reached 58% accuracy on average, showing that synthetic and real X-rays are almost indistinguishable.

Figure 1: Our method applied to knee osteoarthritis severity prediction on an X-ray image. The input image is gradually modified to increase the osteoarthritis severity. The GradCAM heatmap is computed on the input image to compare both interpretability methods. X-rays of the patient's later visits are displayed to visually assess the clinical relevance of our method.

In Figure 1, our interpretability method is applied to a real X-ray image. The GradCAM heatmap provides topographical information by showing that the osteoarthritis features are located in the lateral femorotibial space. Our method goes beyond topographical information by showing the gradual emergence of the different osteoarthritis features as the KL grade increases, such as joint space narrowing (red arrow) and osteophytes (blue arrow). By comparing the synthetic evolution of the image to the real evolution of the patient at 12, 24 and 72 months after baseline, we observe that the direction found in the latent space corresponds to a biologically plausible osteoarthritis progression.

We apply the same method to histology images to explain the predictions of a metastasis detector on Camelyon16 (Bejnordi et al., 2017). The dataset contains 224,166 patch images from breast cancer lymph node whole-slide images, each with a binary label indicating the presence of tumor cells. The image classifier f is a ResNet50 trained on this dataset, obtaining 92% test AUC, while the latent predictor f̂ reaches 95% test AUC.

Figure 2: Our method applied to tumor probability prediction on two histology tiles of metastatic lymph nodes. The input image is gradually modified to increase (on patch A) or decrease (on patch B) the tumor probability. The GradCAM heatmap is computed on the input images to compare both interpretability methods.

Figure 2 shows our interpretability method on two images: patch B contains tumoral cells while patch A does not. The GradCAM heatmaps are not relevant here because the informative features are spread over the entire image. On the contrary, our approach reveals clinically relevant features. On patch A, it shows the appearance of tumor cells (blue arrow) and the disappearance of lymphocytes (red arrow) as the tumor probability increases, and the reverse on patch B. We can also see that the encoder-decoder model is not able to perfectly reconstruct histology images, as opposed to knee X-rays. A possible explanation is that the StyleGAN model does not generate images that are under-represented in the training set. This issue is more visible in this use case because there is more variability in the histology images than in the knee X-ray images. Recently, Yu et al. (2020) proposed to overcome this data coverage challenge by harmonizing adversarial training with reconstructive generation. (A simple way to quantify this reconstruction gap is sketched after the concluding paragraph below.)

In this study we explored the potential of StyleGANs to explain the predictions of black-box models on medical images. Although heatmap-based methods dominate the interpretability field, they only highlight the localization of predictive features in the image. Our method provides an intuitive way for medical researchers to understand where the predictive features are located in the image and how they impact the prediction, by showing modified views of the input image that would produce different predictions. This method shows how the model learned to solve the prediction task, which allows clinicians to build trust in the model's predictions, discover new biomarkers and eventually reveal potential biases. In both experiments, our method showed that the models learned clinically relevant features.
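As referenced in the discussion of reconstruction quality above, the gap between the two modalities (knee X-rays reconstructed faithfully, histology patches less so) could be probed with a simple check. This is an illustrative sketch only, reusing the generator G and encoder E from the earlier pipeline sketch; the paper does not report such a metric, and the threshold below is purely hypothetical.

import torch

def reconstruction_error(G, E, images):
    """Per-image MSE between real images and their GAN reconstructions G(E(x))."""
    E.eval()
    with torch.no_grad():
        w = E(images)            # project real images into the latent space
        recon = G(w)             # decode back to image space
        per_image = ((recon - images) ** 2).flatten(1).mean(dim=1)
    return per_image

# Example usage: flag inputs whose reconstructions look unreliable before interpreting them.
# errors = reconstruction_error(G, E, batch_of_patches)
# suspect = errors > 2 * errors.median()   # heuristic threshold, purely illustrative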
References
End-to-end lung cancer screening with three-dimensional deep learning on low-dose chest computed tomography
Artificial intelligence augmentation of radiologist performance in distinguishing COVID-19 from pneumonia of other origin at chest CT
Diagnostic assessment of deep learning algorithms for detection of lymph node metastases in women with breast cancer
The state of artificial intelligence-based FDA-approved medical devices and algorithms: an online database
Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning
Deep learning-based classification of mesothelioma improves prediction of patient outcome
Dermatologist-level classification of skin cancer with deep neural networks
Latent space manipulation for high-resolution medical image synthesis via the StyleGAN
Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs
Deep residual learning for image recognition
Causability and explainability of artificial intelligence in medicine
A style-based generator architecture for generative adversarial networks
Analyzing and improving the image quality of StyleGAN
Classifications in brief: Kellgren-Lawrence classification of osteoarthritis
ImageNet classification with deep convolutional neural networks
Using artificial intelligence to detect COVID-19 and community-acquired pneumonia based on pulmonary CT: evaluation of the diagnostic accuracy
The Osteoarthritis Initiative: protocol for the cohort study
Exploring the ChestXray14 dataset: problems
Chest radiographs in congestive heart failure: visualizing neural network learning
Grad-CAM: visual explanations from deep networks via gradient-based localization
Contrastive cross-site learning with redesigned net for COVID-19 CT classification
Inclusive GAN: improving data and minority coverage in generative models

We thank Eric W. Tramel for his valuable feedback on the manuscript. We thank the three radiologists Eric Pessis, François Legoux and Thibaut Emorine for their participation in the Turing test.