title: Anomaly Detection in Medical Imaging -- A Mini Review
authors: Tschuchnig, Maximilian E.; Gadermayr, Michael
date: 2021-08-25
DOI: 10.1007/978-3-658-36295-9_5

The increasing digitization of medical imaging enables machine learning based improvements in detecting, visualizing and segmenting lesions, easing the workload for medical experts. However, supervised machine learning requires reliable labelled data, which is often difficult or impossible to collect, or at least time consuming and thereby costly. Therefore, methods requiring only partly labelled data (semi-supervised) or no labelling at all (unsupervised methods) have been applied more regularly. Anomaly detection is one possible methodology that is able to leverage semi-supervised and unsupervised methods to handle medical imaging tasks like classification and segmentation. This paper uses a semi-exhaustive literature review of relevant anomaly detection papers in medical imaging to cluster them by application, highlight important results, establish lessons learned and give further advice on how to approach anomaly detection in medical imaging. The qualitative analysis is based on Google Scholar and 4 different search terms, resulting in 120 analysed papers. The main results showed that the current research is mostly motivated by reducing the need for labelled data. Also, the successful and substantial amount of research in the brain MRI domain shows the potential for applications in further domains like OCT and chest X-ray.

The increasing digitization of medical imaging enables the collection of data and machine learning (ML) based approaches to aid medical experts. One powerful part of ML comes from supervised methods, which use both data and corresponding labels in e.g. segmentation or classification models.
However, since the collection of annotations (labels) is often time consuming and thereby costly [1], and in many cases a confident ground truth cannot be obtained at all, the usability of supervised methods is reduced. Due to this, semi-supervised and unsupervised methods are applied. This is often achieved through anomaly detection.

Definitions: Pathologies in medical images can often be described as a rare deviation from a norm, i.e. from a non-anomalous (in the case of medical imaging mostly healthy) sample. This fits the definition of outliers (or anomalies) in the data, motivating the application of anomaly detection [2]. In this publication, the terms anomaly detection and outlier detection are used interchangeably. Strictly speaking, outliers are sometimes defined as valid but out of order datapoints, while anomalies also include further differences (e.g. different image capture modalities). Therefore outliers can be defined as a subset of anomalies.

Anomaly detection can be separated into 3 classes: point, collective and contextual anomalies. Point anomaly detection is the task of recognizing a single anomalous point in a larger dataset. Most anomaly detection models handle point anomalies. Collective anomalies may not be identified as anomalies when viewed as single points, but form an anomaly when viewed as a set. Contextual anomalies can only be recognized as anomalies if context is added.

There are also 3 different anomaly detection setups: supervised, semi-supervised and unsupervised anomaly detection. Supervised anomaly detection is comparable with classification using a very unbalanced dataset. Semi-supervised anomaly detection trains a model on only one class, typically the normal (in our case healthy) class, and then applies the model to both healthy and pathological data, reporting the corresponding scores.
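The semi-supervised setup described above can be illustrated with a minimal sketch. It assumes feature vectors have already been extracted from the images and uses a per-feature z-score as a deliberately simple scoring model; the synthetic data and all function and variable names are illustrative assumptions, not taken from any of the reviewed papers.

```python
import numpy as np

def fit_healthy_stats(healthy):
    """Estimate per-feature mean and standard deviation from healthy samples only."""
    mean = healthy.mean(axis=0)
    std = healthy.std(axis=0) + 1e-8  # small constant avoids division by zero
    return mean, std

def anomaly_score(samples, mean, std):
    """Score each sample by its largest absolute z-score over all features."""
    return np.abs((samples - mean) / std).max(axis=1)

# Synthetic stand-in for extracted image features (assumption for illustration).
rng = np.random.default_rng(0)
healthy_train = rng.normal(0.0, 1.0, size=(500, 8))
healthy_test = rng.normal(0.0, 1.0, size=(100, 8))
pathological_test = rng.normal(0.0, 1.0, size=(100, 8))
pathological_test[:, 0] += 6.0  # one feature deviates strongly from the norm

# Training sees the healthy class only; scoring is applied to both classes.
mean, std = fit_healthy_stats(healthy_train)
healthy_scores = anomaly_score(healthy_test, mean, std)
pathological_scores = anomaly_score(pathological_test, mean, std)
```

The point of the sketch is the data flow: no pathological sample and no label is used while fitting, and the decision threshold on the scores is left to the application.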
Unsupervised anomaly detection uses both normal and anomalous data, does not make use of labels at all and works purely on intrinsic properties of the dataset (using distances or densities) [3]. In the anomaly detection literature, the terms semi-supervised and unsupervised anomaly detection (UAD) are often confused, with UAD repeatedly applied to both semi-supervised and unsupervised methods. We believe that the separation into semi-supervised (healthy data being clearly defined) and unsupervised (no definition of labels at all) makes sense and advise using this terminology, as also pointed out by [3].

Deviation based anomaly detection: Anomaly detection using medical image data, e.g. computed tomography (CT) scans, is typically performed using either convolutional neural network (CNN) based feature extractors followed by one-class (OC) classifiers, or deviation based methods like autoencoders (AEs) [4]-[6] or, more recently, generative adversarial network (GAN) [7]-[9] based methods. Both AEs and GANs use convolutional kernels, however their application in the sense of deviation based anomaly detection is fundamentally different from that of CNN based feature extractors. In order to generate deviation based scores from an AE, the encoder of the encoder-decoder based neural network typically encodes a sample image into a lower dimensional latent space, also called a bottleneck. The decoder uses this latent space representation to recreate the sample, and a deviation between the sample and the reconstruction can be calculated. During training, this deviation is backpropagated to update the network. In an anomaly detection setting, the AE is trained using healthy data to encode and decode features of healthy samples, leading to a higher deviation for non-healthy samples, assuming that there is a difference between the learned healthy and the lesioned latent space [10]. GANs can also be used to facilitate a deviation based score.
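The deviation based scoring described above (encode into a bottleneck, decode, measure the deviation between sample and reconstruction) can be sketched as follows. To keep the example short and runnable, it assumes a linear encoder and decoder fitted in closed form by PCA instead of a convolutional AE trained by backpropagation; this swap, the synthetic data and all names are assumptions made for illustration only.

```python
import numpy as np

def fit_linear_autoencoder(healthy, latent_dim):
    """Fit a linear encoder/decoder (PCA) on healthy samples only."""
    mean = healthy.mean(axis=0)
    # Rows of vt are the principal directions of the centred healthy data.
    _, _, vt = np.linalg.svd(healthy - mean, full_matrices=False)
    return mean, vt[:latent_dim]

def encode(x, mean, components):
    """Project samples into the low dimensional latent space (bottleneck)."""
    return (x - mean) @ components.T

def decode(z, mean, components):
    """Reconstruct samples from their latent representation."""
    return z @ components + mean

def reconstruction_deviation(x, mean, components):
    """Mean squared deviation between each sample and its reconstruction."""
    recon = decode(encode(x, mean, components), mean, components)
    return np.mean((x - recon) ** 2, axis=1)

# Synthetic data: healthy samples lie close to a 2-dimensional subspace.
rng = np.random.default_rng(0)
basis = rng.normal(size=(2, 16))

def sample_healthy(n):
    return rng.normal(size=(n, 2)) @ basis + 0.05 * rng.normal(size=(n, 16))

healthy_train = sample_healthy(500)
healthy_test = sample_healthy(100)
# Anomalous samples carry structure the healthy subspace cannot represent.
anomalous_test = sample_healthy(100) + rng.normal(size=(100, 16))

mean, components = fit_linear_autoencoder(healthy_train, latent_dim=2)
healthy_dev = reconstruction_deviation(healthy_test, mean, components)
anomalous_dev = reconstruction_deviation(anomalous_test, mean, components)
```

Because the bottleneck is fitted to healthy data only, healthy test samples are reconstructed almost perfectly, while the anomalous samples leave a large residual. The same scoring step applies when the reconstruction comes from a trained neural network.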
In addition to training a generator and a discriminator in an adversarial setup, an additional encoder needs to be trained, mapping the generated image back to the latent space (the input of the generator) [7]. By doing this, any input image can be mapped to the latent space and reconstructed into an image using the generator. This results in a reconstruction which can be used to compute a reconstruction loss.

Additionally, there are conventional methods for anomaly detection, using e.g. z-score thresholds [11], [12], boxplots [13] or methods built on the ideas of principal component analysis (PCA) [14], [15]. OC support vector machines (SVMs) [16] are among the best known semi-supervised anomaly detection methods. In principle they apply the ideas of SVMs (using hyperplanes to separate two classes using support vectors with the aim of generating the largest possible margin) to an OC problem. One possibility to achieve this is to model the smallest possible hypersphere that encompasses all support vectors.

Contribution: This paper's contribution is the analysis of the current state of anomaly detection in medical imaging. Using this analysis, we show lessons learned and give an outlook for future applications and research targets.

The method used was a semi-exhaustive literature review based on Randolph [17]. The formulated problem was the evaluation of anomaly detection in medical imaging. For data collection, the search engine Google Scholar was used. In order to obtain meaningful results, the search terms "anomaly detection in medical imaging", "unsupervised anomaly detection in medical imaging", "outlier detection in medical imaging" and "unsupervised outlier detection in medical imaging" were chosen. The following criteria for exclusion were applied to these results. Only the first 3 pages of results (sorted by relevance, 10 articles per page) were used.
Further, the exclusion criteria duplicate, context of medical imaging (in abstract, title or conclusion), peer review and date were identified. Since the search terms were similar, duplicates had to be removed. Papers without a clear focus on medical imaging in the abstract, title or conclusion were also removed. A further criterion was to only include peer-reviewed research items. This mainly led to the exclusion of preprints. The timeframe was set to exclude papers published before the resurgence of deep learning (AlexNet [18]) while still including papers published after the U-Net was proposed [19], resulting in a timeframe of January 2015 to July 2021. This led to a reduction from 120 to 49 papers. Since these papers also included 4 survey papers, the final number of application based research papers was 45. The survey papers were used as a qualitative comparison to our extracted lessons learned. Next, the papers were manually clustered with respect to their imaging method and the following information was extracted: Aim, Applied Method and Results. From these clusters, lessons learned were extracted, which are reported in section III.

The semi-exhaustive literature review resulted in 45 research items, of which a further 6 were removed because they did not contain applications in medical imaging (medical imaging was only stated as an example in the abstract) or were not available, leaving 39 papers. The resulting papers were further clustered into 5 categories (corresponding to Tab. I-V) by their imaging methods. Tab. I shows papers applying anomaly detection to ocular medical images, with retinal fundus images and optical coherence tomography (OCT). Tab. II focuses on papers with applications in the center body region, with chest X-rays and mammography. Tab. III summarizes application papers using CT and functional magnetic resonance imaging (fMRI). Tab. IV displays papers applying ML to brain magnetic resonance imaging (MRI) datasets. Tab. V shows mixed applications from the domains of breast ultrasound, chest radiographs, histology and fundus images as well as multi-spectral imaging (MSI).

Overall, these tables show a narrow field of application, with 15 (38.46%) of all selected papers working on MRI scans of the brain. A further 6 papers use fMRI and CT scans of the brain, increasing the share of applications based on brain image data to 53.85%. Further clusters could be observed using chest X-rays and mammography, as well as ocular imaging techniques, especially OCT. Of note, although medical imaging includes methods like histology, only 1 paper [20] applied anomaly detection to such data. A further result is the relevance of deviation based methods, with 27 papers (69.23%) applying some form or adaptation, mostly using AEs or GANs [7]-[9], [20]-[43].

Investigating MRIs, 7 [37], [39], [40], [42]-[45] of the 15 publications using brain MRI data focus explicitly on tumours or metastases, showing the usefulness of anomaly detection and segmentation of tumours in brains using MRI. Most other brain MRI based methods handle the more general task of lesion classification or segmentation, with only two focusing specifically on cerebral small vessel diseases [41], [46]. A further cluster uses X-ray for the detection of pneumonia [23], [47] or lung diseases like COVID-19 [15]. Several advancements have also been made in OCT segmentation of retina lesions [7], [8], [25], [26], [48], with one publication performing a visual Turing test with 2 experts, who were unable to recognize differences in the correctly reconstructed data [8]. Breast cancer and pathology detection was also improved using anomaly detection [28], [29], [49].

One result of this analysis is that anomaly detection can be motivated by the lack of available labelled training data, which was stated in 19 publications.
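The paper counts and shares quoted above can be re-derived with a few lines of arithmetic. The sketch below only uses numbers stated in the text; the variable names are illustrative.

```python
# Search protocol: 4 search terms, first 3 result pages, 10 articles per page.
initial_papers = 4 * 3 * 10

# Filtering steps as reported in the text.
after_exclusion_criteria = 49       # duplicates, context, peer review, date
survey_papers = 4
application_papers = after_exclusion_criteria - survey_papers
removed_after_reading = 6           # no medical imaging application or not available
selected_papers = application_papers - removed_after_reading

def share(count, total):
    """Percentage of `count` in `total`, rounded to two decimals."""
    return round(100 * count / total, 2)

brain_mri_share = share(15, selected_papers)       # brain MRI papers
brain_total_share = share(15 + 6, selected_papers) # plus brain fMRI and CT papers
deviation_share = share(27, selected_papers)       # deviation based methods

print(initial_papers, application_papers, selected_papers)
print(brain_mri_share, brain_total_share, deviation_share)
```

With 39 selected papers as the denominator, the three shares come out as 38.46, 53.85 and 69.23, matching the percentages reported above.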
The reported results of these papers show that these semi- and unsupervised approaches successfully completed their tasks [7], [8], [14], [20], [22], [24]-[26], [28], [29], [31], [33], [36], [38], [40], [41], [44], [48], [49]. However, some papers also show semi-supervised methods outperforming fully supervised methods. These outperforming methods are based on classical feature extraction followed by multiple-instance learning (MIL) based models [49], on adaptations to GANs [27] (using skip-connections and weight-sharing subnetworks) and on the adaptation of AEs to the SegAE model [32] (using pairs of T1-w, T2-w and FLAIR data for improved anomaly detection). For this improvement in comparison to fully supervised models, Khosla et al. [32] reason that fully supervised methods systematically either under- or overestimate lesion volumes (when segmenting lesions), while their proposed method was reported to be free of this bias. Zhang et al. [47] and Kim et al. both show interesting approaches, applying conventional feature extractors (CNN and edge detection) followed by OC classifiers (fully connected neural networks and recurrent neural networks). By using these methods both papers reach relatively high scores, but still lower scores than their CT based baselines.

Tab. I (ocular imaging), extracted rows:
Ref. | Imaging Method | Aim | Applied Method
[7] | OCT | new anomaly detection method | AnoGAN
[8] | OCT | new anomaly detection method | f-AnoGAN
[25] | OCT | segmentation (retina lesions) | Bayesian U-Net. Epistemic uncertainty estimations and post processing
[26] | OCT and chest X-ray | new anomaly detection method | encoder-decoder with additional GAN discriminators

Tab. III (CT and fMRI), extracted rows:
Ref. | Imaging Method | Aim | Applied Method
[32] | brain rs-fMRI | | AE and frame prediction (conv-LSTM)
[50] | brain fMRI | anomaly detection using constraint programming (cognitive impairment) | Constraint programming using 3 constraints

A further finding is the obvious bias in the number of research items regarding OCT, chest X-ray, mammography and brain MRI.
An investigation of the datasets used shows a strong dataset and community driven effect. For all of the above mentioned image categories, datasets are publicly available. Further, a community driven effect can be observed, with new models being compared against older ones evaluated on the same dataset.

In addition to medical image based application papers, several authors proposed improvements to the general anomaly detection pipeline. 3 papers showed an improvement of subsequent methods by removing anomalies from the data or reducing complexity in the data [11], [12], [14]. Constraint programming was also applied successfully by Kuo et al. [50], showing a further approach to anomaly detection. CycleGAN is also shown to work for transforming images into a space with reduced image artefacts [31]. Heer et al. [38] showed issues with the general idea of anomaly detection and the treatment of anomalies as out-of-distribution (OOD) data, remarking on a blind spot of deviation based methods. They state that denoting anomalies as OOD is dangerous, since non-anomalous data from different sensors or image modalities may also be detected as OOD although this data is not anomalous. In their paper they further present a method based on prior knowledge to disentangle lesion based OOD from non-lesion based counterparts.

Tab. IV (brain MRI):
Ref. | Imaging Method | Aim | Applied Method
[33] | brain MRI | segmentation (brain lesions) | SegAE
[34] | brain MRI | anomaly detection (epilepsy) | siamese network, stacked cAE, Wasserstein AE
[35] | brain MRI | improvements to AE based methods (glioma) | VAE + LG (and several baselines)
[46] | brain MRI | segmentation (cerebral small vessel disease) | PHI-Syn [51] (image synthesis) and Gaussian mixture models used by oc-SVM
[44] | brain MRI | segmentation (brain lesions) | hidden Markov models
[36] | brain MRI | anomaly detection (brain lesion) | siamese, stacked cAE for latent representations in oc-SVM
[45] | brain MRI | segmentation (brain tumor) | DistGP-Seg. Incorporating DistGP into CNN
[37] | brain MRI | anomaly detection (MS and cancer) | spatial AE with skip connections
[38] | brain MRI | awareness for OOD | VAE. Scores: l1, Kullback-Leibler divergence, Watanabe-Akaike information criterion score, Density of States Estimation
[39] | brain MRI | improvements to CycleGAN (brain tumor) | SteGANomaly
[9] | brain MRI | anomaly segmentation (brain lesions) | AnoVAEGAN
[40] | brain MRI | anomaly localization (brain tumor) | VAE with additional KL divergence term in backpropagation
[41] | brain MRI | anomaly detection (brain infarct) | GANomaly
[42] | brain MRI | new anomaly detection method (brain metastases) | (Wasserstein based) MaDGAN using self attention (paired)
[13] | brain MRI (DTI) | quality assurance of segmentation (brain lesions) | non-parametric (box-plots); supervised classification models
[43] | brain MRI | new anomaly detection method (tumor) | GMVAE

Tab. V (mixed applications):
Ref. | Imaging Method | Aim | Applied Method
[22] | breast ultrasound | anomaly detection (normal, benign, malignant in breasts) | bidirectional GAN
[20] | histology images | image synthesis (tumor) | DCGAN & WGAN
[24] | fundus image | anomaly localization (glaucoma) | adversarial attention guided VAE
[11] | MSI | outlier removal to improve burn detection | z-score based outlier detection to improve SVM and KNN
[12] | MSI | outlier removal to improve burn detection | z-score based outlier detection to improve SVM and KNN

In this paper we analysed the current state of research in anomaly detection using medical image data and extracted lessons learned. To accomplish this, a semi-exhaustive literature review was performed, resulting in 120 papers, from which 44 were further investigated (after filters were applied). This resulted in 4 major clusters of image domains, with the brain MRI domain comprising 38.46% of all papers.
One takeaway is that especially in the brain MRI domain, both lesion and tumour classification as well as segmentation have been successfully implemented multiple times. It is shown that AE and GAN based methods as well as Gaussian mixture models, hidden Markov models and CNNs with specific feature extractors can work in this anomaly detection setup. This was further shown to be the case with chest X-ray, mammography and OCT data. Extrapolating from these results, first approaches in similar domains, using anomaly detection for tasks in the domains of e.g. CT scans of the skeleton or spine, seem promising and should be investigated.

Also, an investigation of the suitability for histological data would be of high interest, since histological data was very under-represented (1 publication). However, there are multiple differences between CT/MRI and histology. In histology it is not sufficient to detect a large object (e.g. a tumor) which is indicated by different intensity values. It would rather be important to learn the shape and interaction of nuclei and cells, which is expected to be a more challenging task that relies more on high frequency information, a reported weak point of several proposed deviation based methods. Further, histology images have an extremely high resolution, leading to issues with current GAN or AE based anomaly detection. Stepec et al. [20] show one way to circumvent these issues successfully using patch extraction and MIL.

As reported in the results, some semi-supervised anomaly detection models achieved scores higher than or similar to those of their fully supervised alternatives. One interpretation is that, especially regarding segmentation, human labelled segmentation masks with rough edges may introduce bias. This is however still unclear and should be investigated further.
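The patch based workaround for very high resolution images mentioned above can be sketched in a generic form. The example below is not the method of [20]: it assumes non-overlapping square patches, a placeholder patch scorer (deviation of the patch mean intensity from a healthy reference) and MIL-style max pooling, in which an image counts as anomalous as soon as one of its patches does. All names and the synthetic images are hypothetical.

```python
import numpy as np

def extract_patches(image, patch_size, stride):
    """Cut a 2D image into square patches on a regular grid."""
    patches = []
    for top in range(0, image.shape[0] - patch_size + 1, stride):
        for left in range(0, image.shape[1] - patch_size + 1, stride):
            patches.append(image[top:top + patch_size, left:left + patch_size])
    return np.stack(patches)

def patch_scores(patches, healthy_mean):
    """Placeholder scorer: deviation of each patch mean from a healthy reference.

    In practice this would be replaced by a learned model, e.g. the
    reconstruction deviation of an AE or GAN applied per patch.
    """
    return np.abs(patches.mean(axis=(1, 2)) - healthy_mean)

def image_level_score(scores):
    """MIL-style max pooling: the most anomalous patch decides for the image."""
    return float(np.max(scores))

# Synthetic stand-ins for a healthy image and one with a small local anomaly.
rng = np.random.default_rng(0)
healthy_image = rng.normal(0.5, 0.05, size=(64, 64))
anomalous_image = rng.normal(0.5, 0.05, size=(64, 64))
anomalous_image[16:24, 16:24] += 0.5  # small bright region inside one patch

healthy_mean = healthy_image.mean()
patches = extract_patches(anomalous_image, patch_size=16, stride=16)
healthy_score = image_level_score(
    patch_scores(extract_patches(healthy_image, 16, 16), healthy_mean))
anomalous_score = image_level_score(patch_scores(patches, healthy_mean))
```

Max pooling is only one possible aggregation; it keeps a small local anomaly from being averaged away over the many healthy patches of a large image.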
Another useful takeaway is that not only improvements to state of the art (SOTA) models are needed; simpler models or cheaper image modalities can also be a major improvement even if the SOTA scores cannot be reached, e.g. when replacing CT with X-ray based methods. One example was shown by Zhang et al. [47], who used X-ray images and reached relatively high scores. Although their method did not outperform the CT based baselines, it is still of high significance, since it reached similar levels using X-rays, which require a lower radiation dose and are a more widely available imaging method.

As stated by [52], we also recognized the generation of free and comparable datasets as a high priority to facilitate further research. The fast growing brain MRI community showed that open datasets are an important asset to boost research. Therefore the development and open distribution of such datasets should be pursued for different medical image domains. In order to facilitate anomaly detection research, a semi-supervised dataset (only including a small amount of annotations) should be developed.

A disadvantage reported for several deep learning based approaches [30], [44] was that results were still unstable and more research was needed before a clinical application could be performed. This was not always the case [22], but there are still doubts about the clinical applicability of deep learning based anomaly detection methods. Large clinical application studies would be needed to show their suitability.

Conclusion: In this paper we investigated the current state of research in medical image based anomaly detection and generated lessons learned. The lessons learned can be converted into the following future targets: expansion of the currently very narrow domain of application, development of freely accessible datasets, investigation of the OOD blind spot and improvements of working approaches like constraints on the AE bottleneck.
[1] A comparative evaluation of outlier detection algorithms: Experiments and analyses
[2] Procedures for detecting outlying observations in samples
[3] A comparative evaluation of unsupervised anomaly detection algorithms for multivariate data
[4] Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion
[5] Learning sparse representation with variational auto-encoder for anomaly detection
[6] Unsupervised pathology detection in medical images using conditional variational autoencoders
[7] Unsupervised anomaly detection with generative adversarial networks to guide marker discovery
[8] f-AnoGAN: Fast unsupervised anomaly detection with generative adversarial networks
[9] Deep autoencoding models for unsupervised anomaly segmentation in brain MR images
[10] Memorizing normality to detect anomaly: Memory-augmented deep autoencoder for unsupervised anomaly detection
[11] Outlier detection and removal improves accuracy of machine learning approach to multispectral burn diagnostic imaging
[12] Burn injury diagnostic imaging device's accuracy improved by outlier detection and removal
[13] Quality assurance using outlier detection on an automatic segmentation method for the cerebellar peduncles
[14] PCA leverage: outlier detection for high-dimensional functional magnetic resonance imaging data
[15] Chest X-ray outlier detection model using dimension reduction and edge detection
[16] Uniform object generation for optimizing one-class classifiers
[17] A guide to writing the dissertation literature review
[18] 2012 alexnet
[19] U-net: Convolutional networks for biomedical image segmentation
[20] Image synthesis as a pretext for unsupervised histopathological diagnosis
[21] A primitive study on unsupervised anomaly detection with an autoencoder in emergency head CT volumes
[22] Efficient anomaly detection with generative adversarial network for breast ultrasound imaging
[23] Unsupervised deep anomaly detection in chest radiographs
[24] Attention guided anomaly localization in images
[25] Exploiting epistemic uncertainty of anatomy segmentation for anomaly detection in retinal OCT
[26] Anomaly detection for medical images using self-supervised and translation-consistent features
[27] DeScarGAN: Disease-specific anomaly detection with weak supervision
[28] Unsupervised clustering of mammograms for outlier detection and breast density estimation
[29] Anomaly detection for medical images based on a one-class classification
[30] Unsupervised lesion detection in brain CT using Bayesian convolutional autoencoders
[31] Unsupervised medical image translation using Cycle-MedGAN
[32] Detecting abnormalities in resting-state dynamics: An unsupervised learning approach
[33] Unsupervised brain lesion segmentation from MRI using a convolutional autoencoder
[34] Unsupervised feature learning for outlier detection with stacked convolutional autoencoders, siamese networks and Wasserstein autoencoders: application to epilepsy detection
[35] Unsupervised lesion detection with locally Gaussian approximation
[36] Regularized siamese neural network for unsupervised outlier detection on brain multiparametric magnetic resonance imaging: application to epilepsy lesion screening
[37] Modeling healthy anatomy with artificial intelligence for unsupervised anomaly detection in brain MRI
[38] The OOD blind spot of unsupervised anomaly detection
[39] SteGANomaly: Inhibiting CycleGAN steganography for unsupervised anomaly detection in brain MRI
[40] Unsupervised anomaly localization using variational auto-encoders
[41] An anomaly detection approach to identify chronic brain infarcts on MRI
[42] MADGAN: unsupervised medical anomaly detection GAN using multiple adjacent brain MRI slice reconstruction
[43] Unsupervised lesion detection via image restoration with a normative prior
[44] Automatic outlier detection using hidden Markov model for cerebellar lobule segmentation
[45] Distributional Gaussian process layers for outlier detection in image segmentation
[46] Brain lesion segmentation through image synthesis and outlier detection
[47] Viral pneumonia screening on chest X-rays using confidence-aware anomaly detection
[48] Towards practical unsupervised anomaly detection on retinal images
[49] Multiple-instance learning for anomaly detection in digital mammography
[50] A framework for outlier description using constraint programming
[51] Pseudo-healthy image synthesis for white matter lesion segmentation
[52] Autoencoders for unsupervised anomaly segmentation in brain MR images: a comparative study
[53] Deep learning in medical imaging