Visual Explanations for Convolutional Neural Networks via Latent Traversal of Generative Adversarial Networks

Amil Dravid, Aggelos K. Katsaggelos
2021-10-29

Abstract

Lack of explainability in artificial intelligence, specifically in deep neural networks, remains a bottleneck for implementing models in practice. Popular techniques such as Gradient-weighted Class Activation Mapping (Grad-CAM) provide a coarse map of salient features in an image, which rarely tells the whole story of what a convolutional neural network (CNN) has learned. Using COVID-19 chest X-rays, we present a method for interpreting what a CNN has learned by utilizing Generative Adversarial Networks (GANs). Our GAN framework disentangles lung structure from COVID-19 features. Using this GAN, we can visualize the transition of a pair of COVID-negative lungs in a chest radiograph to a COVID-positive pair by interpolating in the latent space of the GAN, which provides a fine-grained visualization of how the CNN responds to varying features within the lungs.

Interpreting CNNs has gained significant relevance with the surge of deep learning-enabled COVID detection models. However, many of these models have been found to be biased, relying on shortcuts that validation and visualization techniques such as Grad-CAM fail to expose (DeGrave, Janizek, and Lee 2021; Selvaraju et al. 2017). Generative Adversarial Networks (GANs) show promise for the task of feature visualization, as they have gained considerable popularity for generating photo-realistic images (Goodfellow et al. 2014). A GAN consists of a discriminator and a generator that are trained in tandem. The generator learns to "fool" the discriminator by trying to replicate the distribution of true data examples: it maps points from a low-dimensional manifold known as the latent space, represented as vectors of randomly sampled numbers, to images. The discriminator determines whether its input is "true" data from the actual distribution or synthesized data from the generator. After training, one can observe how one image morphs into another by linearly interpolating between the two images' corresponding latent vectors. This provides the basis for our proposed method of feature visualization.

Figure 1: The generator (G) takes in a structural latent vector z_1 and a class latent vector z_2 to produce fake chest X-rays, which are fed into the discriminator along with real samples x. The classifier (C) provides feedback for generating class-discriminable images.

Our method first relies on a pre-trained classifier that we wish to visualize. We specifically use a VGG16 model trained to approximately 75% accuracy on a private COVID chest X-ray dataset of 128x128 grayscale images. The GAN framework is inspired by the Auxiliary-Classifier GAN (Odena, Olah, and Shlens 2017), except that we decouple the classifier from the discriminator and employ a different latent vector scheme (see Figure 1). The generator carries out supervised disentanglement by taking in a latent vector z_1 that corresponds to lung structure and a class information vector z_2. The vector z_1 is sampled from a spherical normal distribution. The z_2 sampling scheme relies on the intuition that COVID manifestations are not deterministic: the same pair of healthy lungs will retain their lung structure even with COVID, but COVID features can present in many ways within the lungs. Thus, when the class is COVID-negative (class = 0), z_2 is a vector of zeros; otherwise (class = 1), it is drawn from the spherical normal distribution to represent a continuous manifold of COVID features.
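To make the latent scheme and the decoupled training loop concrete, the following is a minimal PyTorch-style sketch of the z_2 sampling rule and one training step of the objective given in Eq. (1) below. The module interfaces G(z1, z2), D(x), and C(x), the latent dimensions Z1_DIM and Z2_DIM, and the optimizer setup are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

# Hypothetical latent dimensions; the paper does not specify them.
Z1_DIM, Z2_DIM = 100, 100

def sample_latents(labels):
    """Sample the structural vector z1 and the class vector z2.

    z1 ~ N(0, I) encodes lung structure for every sample. z2 is the
    zero vector for COVID-negative samples (y = 0) and is drawn from
    N(0, I) for COVID-positive samples (y = 1); multiplying by the
    0/1 label implements both cases at once.
    """
    batch = labels.shape[0]
    z1 = torch.randn(batch, Z1_DIM)
    z2 = torch.randn(batch, Z2_DIM) * labels.float().view(-1, 1)
    return z1, z2

def training_step(G, D, C, x_real, labels, opt_g, opt_d):
    """One step of the min-max game, with the classifier C frozen.

    Assumes D ends in a sigmoid (outputs a real/fake probability),
    C outputs two-class logits, and labels is a LongTensor of 0/1.
    """
    z1, z2 = sample_latents(labels)
    x_fake = G(z1, z2)
    real_t = torch.ones(labels.shape[0], 1)
    fake_t = torch.zeros(labels.shape[0], 1)

    # Discriminator update: real vs. generated chest X-rays.
    opt_d.zero_grad()
    d_loss = (F.binary_cross_entropy(D(x_real), real_t)
              + F.binary_cross_entropy(D(x_fake.detach()), fake_t))
    d_loss.backward()
    opt_d.step()

    # Generator update: fool D, and produce images that the frozen
    # classifier assigns to the intended class (third term of Eq. 1).
    opt_g.zero_grad()
    g_loss = (F.binary_cross_entropy(D(x_fake), real_t)
              + F.cross_entropy(C(x_fake), labels))
    g_loss.backward()
    opt_g.step()
```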
During training, the following objective is optimized:

$$\min_{G}\max_{D}\;\mathbb{E}_{x\sim p_{\mathrm{data}}(x)}\left[\log D(x)\right]+\mathbb{E}_{z_1,y}\left[\log\left(1-D(G(z_1,z_2))\right)\right]-\mathbb{E}_{z_1,y}\left[\log p_c\left(y\mid G(z_1,z_2)\right)\right]\qquad(1)$$

The first two terms correspond to the typical min-max game between the generator G and the discriminator D, where x denotes data observations, z_1 is the structural latent vector, and y is the class encoded in the z_2 vector. The third term corresponds to the generator learning to produce images that the classifier C can correctly classify as COVID-negative or COVID-positive. In this formulation, the generator is trained against the discriminator to produce high-fidelity images while receiving feedback from the frozen classifier to incorporate class-specific features. It has been shown that minimizing this third term roughly approximates the KL divergence between the classifier's learned distribution p_c(y|x) and the generator's p_g(y|x) (Gong et al. 2019). Thus, the generator provides a representation of what the classifier has learned.

After training, the generator can be leveraged to explain the classifier. Given a COVID-positive image x, the latent vectors can be reconstructed by optimizing:

$$z_1^{*},z_2^{*}=\arg\min_{z_1,z_2}\;\alpha\left\|G(z_1,z_2)-x\right\|_2^2+\beta\,\mathcal{L}_{\mathrm{BCE}}\left(p_c\left(y\mid G(z_1,z_2)\right),\,p_c\left(y\mid x\right)\right)\qquad(2)$$

The latent vectors z_1 and z_2 are found via gradient descent. The objective minimizes the mean-squared error between the generated image and the ground truth, in addition to the binary cross-entropy between the classifier's outputs on the two images; these terms are balanced with the constant coefficients α and β. After z_1 and z_2 are found, we can exploit the sampling scheme for z_2, setting it to the zero vector to convert the COVID-positive lungs to COVID-negative. Finally, we can traverse the latent space to visualize how the classifier's output changes with the pathology within the lungs: we interpolate through the latent vector z_2 in steps n at a rate λ while keeping the lung structure constant with z_1, examining the outputs of G(z_1, \vec{0} + nλz_2) for n = 1, 2, ...

After training the generator for 1000 epochs, we evaluate how well z_2 maps to COVID features. We generate four samples from the same lung structure z_1: one COVID-negative lung with z_2 set to the zero vector and three COVID-positive lungs with z_2 drawn randomly from N(0, I). This is repeated 1000 times, and all samples are fed into the classifier. The classifier's predictions match the class fed into the generator with 91.15% ± 0.09 accuracy. Given that random guessing would yield 50%, the z_2 sampling scheme appears to incorporate COVID features as perceived by the classifier. When interpolating over the z_2 latent space between pairs of COVID-negative and COVID-positive lungs with the same z_1, the classifier's softmax probability for COVID-positive increases monotonically as z_2 moves away from the zero vector, which suggests that the z_2 latent space is structured such that the zero vector corresponds to the mean of a highly dense COVID-negative probability region. This can be exploited for feature visualization. After reconstructing the COVID-positive image and its negative pair with high confidence (Figure 2a), we can observe the softmax probabilities over the outputs as we morph the negative image into a positive one (Figure 2b).

Figure 2: (a) Reconstruction of a COVID-positive image and its COVID-negative pair. (b) Softmax probabilities along the interpolation from the negative to the positive image. (c) Pixel-wise difference between the last and first images in the latent interpolation, highlighting the regions that change as the pair of lungs turns COVID-positive; the most active region is highlighted, with Grad-CAM shown to the right for comparison.
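The reconstruction and traversal steps can be sketched as follows, reusing Z1_DIM and Z2_DIM from the sketch above. The number of optimization steps, the learning rate, the balancing coefficients alpha and beta, and the traversal schedule (n_steps, lam) are hypothetical choices; the paper states only that the two reconstruction terms are balanced with constant coefficients.

```python
import torch
import torch.nn.functional as F

def reconstruct_latents(G, C, x, steps=500, lr=0.01, alpha=1.0, beta=1.0):
    """Recover z1, z2 for a real image x by gradient descent on Eq. (2):
    MSE between G(z1, z2) and x, plus the binary cross-entropy between
    the classifier's outputs on the generated and real images."""
    z1 = torch.randn(1, Z1_DIM, requires_grad=True)
    z2 = torch.randn(1, Z2_DIM, requires_grad=True)
    opt = torch.optim.Adam([z1, z2], lr=lr)
    with torch.no_grad():
        target = torch.softmax(C(x), dim=1)  # classifier's output on x
    for _ in range(steps):
        opt.zero_grad()
        x_hat = G(z1, z2)
        probs = torch.softmax(C(x_hat), dim=1)
        loss = (alpha * F.mse_loss(x_hat, x)
                + beta * F.binary_cross_entropy(probs, target))
        loss.backward()
        opt.step()
    return z1.detach(), z2.detach()

def traverse(G, C, z1, z2, n_steps=10, lam=0.1):
    """Morph the COVID-negative reconstruction toward the positive one by
    evaluating G(z1, n * lam * z2) for n = 0, 1, 2, ..., holding the lung
    structure z1 fixed, and recording the classifier's COVID-positive
    probability at each step."""
    frames, probs = [], []
    with torch.no_grad():
        for n in range(n_steps + 1):
            x_n = G(z1, n * lam * z2)
            frames.append(x_n)
            probs.append(torch.softmax(C(x_n), dim=1)[0, 1].item())
    return frames, probs
```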
Thus, the images across the decision boundary can be observed as the classifier's prediction changes. Compared to Grad-CAM (Figure 2c), traversing the latent space provides more fine-grained feature visualization and holds more explanatory power.

References

DeGrave, A. J.; Janizek, J. D.; and Lee, S.-I. 2021. AI for radiographic COVID-19 detection selects shortcuts over signal. Nature Machine Intelligence.

Gong, M.; Xu, Y.; Li, C.; Zhang, K.; and Batmanghelich, K. 2019. Twin Auxiliary Classifiers GAN. In Advances in Neural Information Processing Systems.

Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; and Bengio, Y. 2014. Generative Adversarial Nets. In Advances in Neural Information Processing Systems.

Odena, A.; Olah, C.; and Shlens, J. 2017. Conditional Image Synthesis with Auxiliary Classifier GANs. In Proceedings of the 34th International Conference on Machine Learning.

Selvaraju, R. R.; Cogswell, M.; Das, A.; Vedantam, R.; Parikh, D.; and Batra, D. 2017. Grad-CAM: Visual Explanations from Deep Networks via Gradient-Based Localization. In Proceedings of the IEEE International Conference on Computer Vision.