key: cord-0233823-tur9knr4 authors: Hussein, Ramy; Zhao, Moss; Shin, David; Guo, Jia; Chen, Kevin T.; Armindo, Rui D.; Davidzon, Guido; Moseley, Michael; Zaharchuk, Greg title: Multi-task Deep Learning for Cerebrovascular Disease Classification and MRI-to-PET Translation date: 2022-02-12 journal: nan DOI: nan sha: 7e058a6730ca24b979e9cff014e9cb958ce61265 doc_id: 233823 cord_uid: tur9knr4 Accurate quantification of cerebral blood flow (CBF) is essential for the diagnosis and assessment of cerebrovascular diseases such as Moyamoya, carotid stenosis, aneurysms, and stroke. Positron emission tomography (PET) is currently regarded as the gold standard for the measurement of CBF in the human brain. PET imaging, however, is not widely available because of its prohibitive costs, use of ionizing radiation, and logistical challenges, which require a co-localized cyclotron to deliver the 2 min half-life Oxygen-15 radioisotope. Magnetic resonance imaging (MRI), in contrast, is more readily available and does not involve ionizing radiation. In this study, we propose a multi-task learning framework for brain MRI-to-PET translation and disease diagnosis. The proposed framework comprises two prime networks: (1) an attention-based 3D encoder-decoder convolutional neural network (CNN) that synthesizes high-quality PET CBF maps from multi-contrast MRI images, and (2) a multi-scale 3D CNN that identifies the brain disease corresponding to the input MRI images. Our multi-task framework yields promising results on the task of MRI-to-PET translation, achieving an average structural similarity index (SSIM) of 0.94 and peak signal-to-noise ratio (PSNR) of 38dB on a cohort of 120 subjects. In addition, we show that integrating multiple MRI modalities can improve the clinical diagnosis of brain diseases. Cerebrovascular disease refers to conditions that affect the blood vessels and flow of blood in the brain. 
Cerebrovascular diseases are a leading cause of death or serious long-term disability in the world. In 2019, the Centers for Disease Control and Prevention reported more than 150,000 cerebrovascular-related deaths in the United States [1], making it the fifth most common cause of death. The prompt diagnosis of cerebrovascular diseases is key to faster and more effective treatment to reduce morbidity and mortality. Cerebral blood flow (CBF) is a measure of the blood supply to a given region of the brain in a given period of time, and has conventional units of ml/100 g/min. Accurate quantification of CBF is essential for the diagnosis and assessment of cerebrovascular disorders (e.g., Moyamoya, ischemic stroke) and of neurodegenerative diseases in which blood supply to specific regions of the brain is impaired, resulting clinically in different types of dementia [2]. Positron emission tomography (PET) with radiolabeled water (15O-water, 2 min half-life) is considered the gold standard for measuring CBF in humans [3]. PET scans, however, are relatively expensive and not widely available. 15O-water PET can only be performed at sites where the radioactive tracer is produced nearby and can be injected into the bloodstream quickly. Magnetic resonance imaging (MRI), on the other hand, is less expensive, more widely available, and does not involve ionizing radiation. This study aims to improve the clinical utility of MRI-derived CBF measurements by turning brain MRI into PET CBF maps and accurately diagnosing cerebrovascular diseases.
Deep learning models, in particular deep convolutional neural networks (CNNs), have opened the door for numerous imaging applications in neuroradiology [4], [5]. For instance, convolutional encoder-decoder architectures have markedly improved the state of the art in neuroimage segmentation [6]-[9] and, more importantly, brain image-to-image translation [10]-[13].
Moreover, common deep CNN architectures such as GoogleNet, VGG, ResNet, PyramidNet, and SENet have dramatically boosted the diagnostic and prognostic performance in classifying cerebrovascular and neurodegenerative diseases [14], [15]. The unceasing improvement in the performance of emerging neural network architectures will certainly open the door for new commercially available tools for neuroradiologists.
Recently, multi-task neural networks have shown superior performance to individual single-task neural network architectures on different medical imaging applications [16], [17]. This type of neural network simultaneously integrates different pieces of information from diverse tasks to improve the overall performance of the network, and it leads to better generalization under real-life conditions [18].
In this study, we developed a multi-task CNN architecture for classifying cerebrovascular diseases and synthesizing high-quality PET images from multi-contrast MRI. The proposed joint dual-task model comprises two branches. The first branch adopts a 3D convolutional encoder-decoder network with attention mechanisms to predict the gold standard 15O-water PET CBF maps from the combination of structural MRI and arterial spin labeling (ASL) MRI perfusion images, without using radiotracers. The second branch comprises a multi-scale 3D convolutional network that integrates multi-parametric MRI images to distinguish between healthy controls and people with cerebrovascular diseases. Results show that the proposed multi-task deep learning model can efficiently improve the MRI-to-PET translation performance and the diagnostic accuracy for identifying cerebrovascular disorders.
In the past three years, several deep convolutional neural networks have been introduced to predict PET CBF maps from structural and perfusion MRI images [11], [19], [20]. In [11], Guo et al.
adopted a deep CNN (dCNN) to generate synthetic 15O-water PET CBF images from multi-parametric MRI inputs including ASL. The dCNN notably improved CBF image quality compared to ASL, achieving an average structural similarity index (SSIM) of 0.85. In [19], Shin et al. studied the possibility of synthesizing different brain PET tracers (specifically AV45, AV1451, and fluorodeoxyglucose) solely from T1-weighted MRI images using generative adversarial networks (GANs). This method achieved limited PET prediction results, with an average SSIM of 0.26-0.38. In [20], a residual-based attention-guided CNN was introduced for translating 2D ASL and T1-weighted images to PET-like images, achieving an average SSIM of 0.85. Other CNN-based image-to-image translation methods, including encoder-decoder networks and GANs, have been used for MRI-to-computed tomography (CT) translation [21], CT-to-MRI translation [22], PET-to-CT translation [23], CT-to-PET translation [24], and also for translating between different MRI neuroimaging modalities [25].
For the detection of cerebrovascular diseases, several deep learning works on MRI and PET neuroimages have emerged in the past few years. In [26], a six-layer convolutional network was proposed to identify Moyamoya disease (MMD) in plain skull radiograph images. The proposed network attained classification accuracy rates of 91% and 75.9% for the institutional test set and external validation set, respectively. To improve the diagnostic accuracy of MMD, the authors of [27] proposed a convolutional-recurrent network architecture that combines a 3D CNN and a gated recurrent unit. This hybrid model was able to learn the spatio-temporal features from digital subtraction angiography (DSA) images, showing an average MMD detection accuracy of 97.88%. Deep learning was also used in [28] to automatically detect cerebral aneurysms from MR angiography images, yielding recognition sensitivity rates between 91% and 93%.
Further, a deep CNN was adopted for classifying ischemic stroke onset time based on perfusion MRI images [14]. The proposed network was able to determine whether the time since stroke (TSS) onset is less than 4.5 hours with a sensitivity of 0.78 and a negative predictive value of 0.61. In [15], Dawud et al. used a pre-trained CNN and transfer learning to identify brain hemorrhage in CT images, reporting high classification accuracy rates of 90.65-93.48%.
Recently, multi-task deep learning algorithms have been receiving attention in computer-aided medical applications. In [16], a multi-task neural network framework was introduced to identify COVID-19 patients and segment abnormal lesions on chest CT images. This model showed impressive results, with an AUC score greater than 97% for the diagnosis task and a Dice score of 0.88 for the segmentation task. Similarly, Von et al. in [17] developed a multi-task deep learning model for the classification and segmentation of primary bone tumors on musculoskeletal radiographs. This model was able to distinguish between malignant and benign tumors with an average classification accuracy of 80.2% and to segment the bone lesions with an average Dice coefficient of 0.60. Multi-task deep learning was also adopted for segmentation and prognosis in head and neck cancer [29]. The proposed multi-task deep UNet model was applied to FDG-PET/CT images to predict patient prognosis and learn the segmentation of head and neck tumor volumes.
This section describes the dataset and the proposed multi-task network architecture used for simultaneous MRI-to-PET translation and disease classification. This is a retrospective, HIPAA-compliant study approved by the Institutional Review Board of Stanford University in accordance with the ethical standards of the Declaration of Helsinki for medical research involving human subjects. Written informed consent was obtained from all participants prior to the study.
Data were acquired from 120 subjects (60 healthy controls (HC) and 60 cerebrovascular disease patients) on a 3T PET/MRI hybrid system (SIGNA, GE Healthcare, Waukesha, WI, USA) using an 8-channel head coil. The patient dataset comprised 52 patients with Moyamoya disease, 4 patients with intracranial atherosclerotic steno-occlusive disease (ICSD), and 4 patients with stroke. The MRI scans included T1-weighted (T1w), T2-weighted fluid-attenuated inversion recovery (T2w-FLAIR), multi-delay pseudo-continuous ASL (PCASL), from which proton density (PD) images are also available, and quantified CBF/arterial transit time (ATT) maps derived from ASL. For all ASL scans, a proton density image and a coil sensitivity map were acquired with a saturation recovery acquisition using TR = 2000 ms. For the multi-post labeling delay (PLD) PCASL sequence, crushing gradients (Venc = 4 cm/s) were adopted to exclude the signal in the arterial component before the 3D spiral readout. All MRI images were co-registered and normalized to the Montreal Neurological Institute (MNI) brain template and resized to 96×96×64 voxels.
Quantitative gold standard PET CBF was determined using 15O-water injection and the image-derived arterial input function kinetic model described in [30]. Sixty-two participants underwent at least two simultaneous PET and MRI scans, at baseline and 20 minutes after intravenous administration of acetazolamide (a vasodilator that increases the blood flow into the brain). The remaining 58 participants underwent three separate simultaneous PET and MRI scans, two at baseline and one 20 minutes after acetazolamide administration. Acetazolamide was injected during the scan at a dose of 15 mg/kg of body weight, with a maximum dose of 1000 mg. Eight MRI sequences (T1w, T2w-FLAIR, PD, ATT, single-delay ASL, mean of multi-delay ASL, single-PLD CBF, and multi-PLD CBF) were used as inputs to the model, and one 15O-water PET CBF map served as the ground truth.
The total number of scans, before and after acetazolamide administration, is 332. Both input MRI images and output PET images were normalized so that they had a mean intensity of 1 in the whole brain. Data augmentation was also used to enlarge the dataset. The augmentation included flipping, shifting, and rotating the input and output images, resulting in an eight-fold increase in the dataset size.
The proposed 3D multi-task convolutional neural network architecture is depicted in Figure 1. The network consists of different branches (sub-networks) that use multi-contrast MRI as inputs and are trained simultaneously. In particular, the network incorporates two major branches for improving the clinical utility of MRI-derived CBF measurements: (1) a PET Synthesis Branch, used to transform the structural and ASL perfusion MRI images into PET CBF maps, and (2) a Diagnosis Branch, used to classify healthy controls and patients with MMD, ICSD, and stroke. It is worth highlighting that MRI-to-PET translation is the prime task of this study, while the classification task is added to refine the extracted feature representations and, hence, improve the quality of the synthetic PET images.
PET Synthesis Branch: A 3D convolutional encoder-decoder network with attention mechanisms was adopted to integrate spatial data across multiple MRI image types for synthesizing PET CBF maps, as shown in Figure 1. Both the encoder and decoder modules use 3D CNNs, where the encoder compresses the input MRI images into a more condensed representation, and the decoder uses this representation to output PET-like images. Since different MRI sequences and spatial patterns affect the quality of the synthesized PET images differently, the attention mechanism, shown in Figure 2, is embedded into the encoder-decoder network to concurrently search for the relevant aspects of the input at the channel and spatial levels for a fine-grained quality prediction.
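The intensity normalization described earlier in this section (a mean intensity of 1 in the whole brain) can be illustrated with a minimal sketch. It assumes that the volume has been flattened into a list of voxel values and that a boolean brain mask of the same length is available; the function name `normalize_to_unit_brain_mean` and the mask are hypothetical and are not taken from the study's code.

```python
def normalize_to_unit_brain_mean(voxels, brain_mask):
    """Scale a flattened volume so that the mean over brain voxels is 1.

    voxels     -- flat sequence of intensities (assumed layout, hypothetical)
    brain_mask -- flat sequence of booleans, True for voxels inside the brain
    """
    brain_values = [v for v, inside in zip(voxels, brain_mask) if inside]
    if not brain_values:
        raise ValueError("brain mask selects no voxels")
    brain_mean = sum(brain_values) / len(brain_values)
    if brain_mean == 0:
        raise ValueError("mean brain intensity is zero; cannot normalize")
    # Every voxel (inside and outside the mask) is divided by the brain mean.
    return [v / brain_mean for v in voxels]


# Toy example: two brain voxels and one background voxel.
scaled = normalize_to_unit_brain_mean([2.0, 4.0, 100.0], [True, True, False])
```

Applying the same scaling to the MRI inputs and the PET target puts all volumes on a comparable intensity scale before training.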
The encoder-decoder network is trained with a customized loss function, the MRI-to-PET translation loss $L_{trans}$ (Equation 1), which combines the mean squared error (MSE), mean absolute error (MAE), structural similarity index (SSIM), and peak signal-to-noise ratio (PSNR) using the weights $w_1$, $w_2$, $w_3$, and $w_4$, each of which takes a value between 0 and 1. The four terms are defined as:
$$MSE = \frac{1}{mnp} \sum_{i=1}^{m} \sum_{j=1}^{n} \sum_{k=1}^{p} \left( x_{ijk} - y_{ijk} \right)^2 \tag{2}$$
$$MAE = \frac{1}{mnp} \sum_{i=1}^{m} \sum_{j=1}^{n} \sum_{k=1}^{p} \left| x_{ijk} - y_{ijk} \right| \tag{3}$$
where $x$ and $y$ refer to the true and predicted PET images, and $m$, $n$, and $p$ are the dimensions of the 3D PET images.
$$SSIM = \frac{(2 \mu_x \mu_y + c_1)(2 \sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)} \tag{4}$$
where $\mu_x$ is the mean of $x$, $\mu_y$ is the mean of $y$, $\sigma_x^2$ is the variance of $x$, $\sigma_y^2$ is the variance of $y$, $\sigma_{xy}$ is the covariance of $x$ and $y$, $c_1 = (0.01 c_{max})^2$ and $c_2 = (0.03 c_{max})^2$ are two constants that stabilize the division when the denominator is weak, and $c_{max}$ denotes the maximum intensity value of the image.
$$PSNR = 10 \cdot \log_{10} \left( \frac{c_{max}}{MSE} \right) \tag{5}$$
Diagnosis Branch: A multi-scale 3D convolutional neural network is adopted to distinguish between healthy controls and patients with MMD, ICSD, and stroke. The multi-parametric MRI images are fed into three parallel paths of 3D convolution layers with different kernel sizes, which allows the network to learn local features (through smaller convolutions) and more abstract features (through larger convolutions) (see Figure 1). The extracted multi-scale feature representations are then concatenated into a single feature tensor that forms the input of the next few layers. The aggregated output is flattened and passed to two fully connected layers, and the softmax function is then used to compute the label probabilities and the prediction. The categorical cross-entropy function is used as the loss function for the cerebrovascular disease classification task, and is defined as:
$$L_{class} = -\frac{1}{N} \sum_{i=1}^{N} \sum_{c=1}^{C} \mathbb{1}_{y_i \in C_c} \log P_{model}\left[ y_i \in C_c \right] \tag{6}$$
where $L_{class}$ is the classification loss, $N$ is the number of observations, $C$ is the number of classes, $\mathbb{1}_{y_i \in C_c}$ is an indicator function (0 or 1) of observation $i$ belonging to class $c$, and $P_{model}[y_i \in C_c]$ is the probability that observation $i$ belongs to class $c$.
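To make the individual loss terms concrete, the following minimal sketch implements the standard definitions of the mean squared error, the mean absolute error, a single-window SSIM built from the image statistics listed above, and the categorical cross-entropy. It is an illustration only: it assumes that each 3D volume has been flattened into a list of voxel values and that class labels are integer indices, and it computes SSIM over the whole volume rather than over local windows, which may differ from what was done in the study. All function names are hypothetical.

```python
import math


def mse(x, y):
    """Mean squared error between two flattened volumes of equal length."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def mae(x, y):
    """Mean absolute error between two flattened volumes of equal length."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


def global_ssim(x, y, c_max):
    """SSIM from whole-volume statistics (assumption: no sliding window)."""
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * c_max) ** 2  # stabilizing constants from the text
    c2 = (0.03 * c_max) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov_xy + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


def categorical_cross_entropy(true_labels, probabilities, eps=1e-12):
    """Average negative log-probability assigned to the true class.

    true_labels   -- one integer class index per observation
    probabilities -- one list of class probabilities per observation
    eps           -- lower clamp to avoid log(0); an implementation choice
    """
    total = 0.0
    for label, probs in zip(true_labels, probabilities):
        total -= math.log(max(probs[label], eps))
    return total / len(true_labels)


# Toy volumes (made-up values, not study data).
true_pet = [1.0, 2.0, 3.0, 4.0]
pred_pet = [1.0, 2.0, 3.0, 6.0]
errors = (mse(true_pet, pred_pet), mae(true_pet, pred_pet))
similarity = global_ssim(true_pet, pred_pet, c_max=max(true_pet))
```

Because SSIM is a similarity (higher is better) while MSE and MAE are errors (lower is better), any weighted combination of these terms has to account for their opposite directions.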
The global loss function ($L_{global}$, Equation 7) for the joint MRI-to-PET translation and disease classification tasks is computed from the translation loss of Equation 1 and the classification loss of Equation 6. In our experiments, the Nesterov Adam optimizer [31], an improved variant of the Adam optimization algorithm, was used with a learning rate of 0.0002 and a batch size of 4. The proposed multi-task neural network was trained with the global loss function for 150 epochs, with early stopping (patience of 20 epochs).
The proposed multi-task network was trained and tested using four-fold cross-validation. The dataset was divided into four subgroups, each of which includes PET and MRI images from 15 healthy control participants (with 40 scans), 13 patients with Moyamoya disease (with 38 scans), one ICSD patient (with 3 scans), and one stroke patient (with 2 scans). For each fold, the scans from three of the four subgroups were used for training, from which 10% were randomly selected for validation. The fourth subgroup was then used for testing the performance of the trained multi-task network. To avoid data leakage, we ensured that a single subject's scans (either baseline or post-acetazolamide) never appeared in both the training and test sets.
The performance of the proposed multi-task network on the MRI-to-PET translation task was quantitatively evaluated using SSIM, normalized root-mean-square error (NRMSE), and PSNR. For the disease classification task, the performance metrics were accuracy (Acc), sensitivity (Sens), specificity (Spec), precision (Prec), false-positive rate (FPR), false-negative rate (FNR), and Matthews correlation coefficient (MCC).
MRI-to-PET Translation Results: Compared to a previous PET CBF prediction method that achieved an average SSIM of 0.85 and NRMSE of 0.209 [11], our model yielded improved PET prediction performance, achieving an average SSIM of 0.94, an NRMSE of 0.038, and a PSNR of 38.8 dB.
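The leakage-free split described above, in which all scans of a subject stay in the same fold, can be sketched as follows. This is a simplified illustration rather than the procedure used in the study: it balances folds by scan count with a greedy rule and does not stratify by diagnosis, whereas the folds described above were also balanced across diagnostic groups. The function name and the subject identifiers are hypothetical.

```python
from collections import defaultdict


def subject_level_folds(scan_subjects, n_folds=4):
    """Assign scan indices to folds so that no subject spans two folds.

    scan_subjects -- one subject identifier per scan, in scan order
    Returns a list of n_folds lists of scan indices.
    """
    scans_by_subject = defaultdict(list)
    for scan_index, subject in enumerate(scan_subjects):
        scans_by_subject[subject].append(scan_index)

    folds = [[] for _ in range(n_folds)]
    # Greedy balancing: place subjects with the most scans first,
    # always into the fold that currently holds the fewest scans.
    ordered = sorted(scans_by_subject,
                     key=lambda s: (-len(scans_by_subject[s]), s))
    for subject in ordered:
        smallest = min(range(n_folds), key=lambda f: len(folds[f]))
        folds[smallest].extend(scans_by_subject[subject])
    return folds


# Toy example: four subjects with one to three scans each.
subjects = ["s1", "s1", "s2", "s2", "s2", "s3", "s4", "s4"]
folds = subject_level_folds(subjects, n_folds=4)
for test_fold in range(4):
    test_idx = folds[test_fold]
    train_idx = [i for f in range(4) if f != test_fold for i in folds[f]]
    # No subject may appear on both sides of the split.
    assert not ({subjects[i] for i in test_idx} &
                {subjects[i] for i in train_idx})
```

Splitting at the subject level matters here because baseline and post-acetazolamide scans of the same person are highly correlated, so a scan-level split would overstate test performance.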
Figure 3 shows the input MRI volumes, predicted (synthesized) PET CBF maps, gold standard PET CBF measurements, and corresponding absolute error maps (magnified) for healthy controls and patients with cerebrovascular diseases. Results indicate that a 3D convolutional encoder-decoder network integrating multi-contrast information from brain structural MRI and ASL perfusion images can efficiently synthesize high-quality PET CBF maps without the need for radiotracers. They also show how well-designed loss functions and attention mechanisms can improve the PET CBF prediction results. Using grid search, the weights (w1, w2, w3, and w4) of L_trans (the loss function of the MRI-to-PET translation task) were set to 0.15, 0.15, 0.60, and 0.20, respectively. The highest weight went to the SSIM loss because SSIM measures the perceptual difference between two images.
To assess the clinical significance of the proposed PET prediction algorithm, a set of paired comparison analyses was conducted. The Bland-Altman plot in Figure 4 delineates the agreement between the mean CBF of the true and predicted PET CBF maps. It shows a small bias: the true gold standard PET CBF measurements in the whole brain are 4.6 ml/100g/min higher than the synthetic PET CBF maps produced by our encoder-decoder network, with a 95% confidence interval from -4.4 to +13.5 ml/100g/min. Figure 5 shows the histogram, density plots, and joint plot of the mean CBFs of the true and synthetic PET images, which exhibit high levels of agreement and correlation (Pearson's correlation coefficient = 0.97).
Cerebrovascular Disease Classification Results: This section describes how 3D multi-scale convolutional networks, as part of the proposed multi-task deep learning model, can differentiate between healthy controls and patients with Moyamoya disease, intracranial steno-occlusive disease, and stroke.
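The agreement analysis between true and synthetic mean CBF reported in the translation results above can be reproduced in outline with a short sketch. It assumes that the reported interval corresponds to the usual Bland-Altman limits of agreement (bias ± 1.96 standard deviations of the paired differences) and that differences are taken as true minus predicted, as the sign of the reported bias suggests. The CBF values below are made up for illustration and are not study data.

```python
import math


def bland_altman(reference, predicted):
    """Return (bias, lower limit, upper limit) of the paired differences.

    Differences are reference - predicted; the limits are bias +/- 1.96 SD
    (an assumption about how the reported interval was obtained).
    """
    diffs = [r - p for r, p in zip(reference, predicted)]
    n = len(diffs)
    bias = sum(diffs) / n
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# Hypothetical whole-brain mean CBF values in ml/100g/min.
true_cbf = [52.0, 61.0, 47.0, 70.0, 58.0]
synthetic_cbf = [49.0, 55.0, 45.0, 63.0, 54.0]
bias, lower, upper = bland_altman(true_cbf, synthetic_cbf)
r = pearson_r(true_cbf, synthetic_cbf)
```

A positive bias in this convention means that the synthetic maps underestimate the gold standard on average, which is the direction reported in the text.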
Figure 6 depicts the confusion matrix of the proposed multi-task network for one of the four subgroups of data. One can observe that 38 of 40 healthy control scans are classified correctly, and only 2 scans are misclassified as Moyamoya patients (both scans were from the same participant). For Moyamoya disease, only two scans are misidentified as healthy controls and one as intracranial steno-occlusive disease. The remaining 35 scans are identified correctly. The classification of ICSD and stroke was more challenging because of the limited number of participants with these diseases. We used data augmentation to generate variants of the multi-contrast MRI images and corresponding PET CBF measurements, which helped expand the training dataset and improve the classification performance of the model and its ability to generalize in clinical settings. The test set included one ICSD patient with three scans (two scans at baseline and one scan after acetazolamide administration). The trained network identified the two baseline scans correctly, whereas the post-acetazolamide scan was identified as stroke. It is worth clarifying that ICSD occurs when blood flow to the brain is restricted by narrowed arteries or plaque buildup, and without adequate treatment, ICSD patients may develop mini-strokes or strokes [32]. This may explain why a post-acetazolamide scan of an ICSD patient could be classified as a stroke. Lastly, the model was tested on a stroke patient with 2 scans (one pre-acetazolamide and one post-acetazolamide), and both scans were classified correctly. Table I reports the performance metrics of the multi-task network for the classification of healthy controls and patients with Moyamoya disease, ICSD, and stroke. An average classification accuracy, sensitivity, and specificity of 96.38%, 88.44%, and 97.10% were achieved, respectively.
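The per-class counts described above are enough to rebuild the confusion matrix and recompute the classification metrics. The sketch below assumes that the reported averages are macro-averages of one-vs-rest metrics over the four classes; this is an inference, since the averaging scheme is not stated. Under that assumption, the sketch yields about 96.4% accuracy, 88.4% sensitivity, 97.1% specificity, and an MCC of about 0.81, in line with the averages quoted above. The function names are hypothetical.

```python
import math

CLASSES = ["HC", "MMD", "ICSD", "Stroke"]
# Rows are true classes, columns are predicted classes, using the
# scan counts given in the text for one of the four subgroups.
CONFUSION = [
    [38, 2, 0, 0],   # healthy controls
    [2, 35, 1, 0],   # Moyamoya disease
    [0, 0, 2, 1],    # ICSD
    [0, 0, 0, 2],    # stroke
]


def one_vs_rest_counts(matrix, k):
    """Return (tp, fp, fn, tn) for class k against all other classes."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp
    fp = sum(row[k] for row in matrix) - tp
    tn = total - tp - fn - fp
    return tp, fp, fn, tn


def macro_metrics(matrix):
    """Macro-average one-vs-rest metrics (assumed averaging scheme)."""
    names = ["acc", "sens", "spec", "prec", "fpr", "fnr", "mcc"]
    totals = dict.fromkeys(names, 0.0)
    for k in range(len(matrix)):
        tp, fp, fn, tn = one_vs_rest_counts(matrix, k)
        totals["acc"] += (tp + tn) / (tp + fp + fn + tn)
        totals["sens"] += tp / (tp + fn)
        totals["spec"] += tn / (tn + fp)
        totals["prec"] += tp / (tp + fp)
        totals["fpr"] += fp / (fp + tn)
        totals["fnr"] += fn / (fn + tp)
        denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
        totals["mcc"] += (tp * tn - fp * fn) / denom
    return {name: value / len(matrix) for name, value in totals.items()}


metrics = macro_metrics(CONFUSION)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Macro-averaging gives the two rare classes (ICSD and stroke) the same weight as the two common ones, which is why the average sensitivity is noticeably lower than the average accuracy.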
The FPR, FNR, and MCC were also evaluated for this imbalanced classification problem, with values of 0.028, 0.115, and 0.812, respectively. This gives us insight into how multi-task deep learning and the sharing of feature representations between two tasks can help maintain reliable performance even with limited data.
Although the proposed model produces high-quality synthetic PET images, neuroradiologists still need to spend substantial time examining 3D images to identify the area(s) of the brain affected by cerebrovascular disease. In future work, we plan to add a third branch to automatically localize the brain regions with abnormally low cerebral blood flow. An expansion of this approach to other types of brain diseases (such as certain types of dementia) could also be clinically valuable. Adequate quantification of PET from MRI has great potential for increasing the accessibility of cerebrovascular disease assessment for underserved populations and underprivileged communities.
In this study, we proposed a multi-task deep learning model that allows for accurate and simultaneous brain MRI-to-PET translation and classification of cerebrovascular diseases. The network consists of two branches that cooperatively use structural MRI and ASL perfusion images as inputs. The first branch adopted an attention-guided 3D convolutional encoder-decoder network that efficiently synthesizes high-quality PET CBF maps from multi-contrast MRI while eliminating the need for radioactive tracers. The second branch used a multi-scale convolutional neural network to extract distinguishable imaging biomarkers and thus differentiate between healthy controls and patients with cerebrovascular diseases. The proposed multi-task learning approach was found to achieve superior performance to existing medical image-to-image translation and classification techniques.
[1] National vital statistics system - mortality data
[2] Quantification of serial cerebral blood flow in acute stroke using arterial spin labeling
[3] Brain perfusion imaging under acetazolamide challenge for detection of impaired cerebrovascular reserve capacity: positive findings with 15O-water PET in patients with negative 99mTc-HMPAO SPECT findings
[4] Deep learning in neuroradiology
[5] Artificial intelligence and deep learning in neuroradiology: exploring the new frontier
[6] Deep learning based segmentation of brain tissue from diffusion MRI
[7] Hemorrhagic stroke lesion segmentation using a 3D U-Net with squeeze-and-excitation blocks
[8] Evaluating nnU-Net for early ischemic change segmentation on non-contrast computed tomography in patients with acute ischemic stroke
[9] Surface-based hippocampal subfield segmentation
[10] Ultra-low-dose 18F-florbetaben amyloid PET imaging using deep learning with multi-contrast MRI inputs
[11] Predicting 15O-water PET cerebral blood flow maps from multi-contrast MRI using a deep convolutional neural network with evaluation of training cohort bias
[12] MedGAN: medical image translation using GANs
[13] MRI cross-modality image-to-image translation
[14] A machine learning approach for classifying ischemic stroke onset time from imaging
[15] Application of deep learning in neuroradiology: brain haemorrhage classification using transfer learning
[16] Multi-task deep learning based CT imaging analysis for COVID-19 pneumonia: classification and segmentation
[17] Multitask deep learning for segmentation and classification of primary bone tumors on radiographs
[18] An overview of multi-task learning in deep neural networks
[19] GANBERT: generative adversarial networks with bidirectional encoder representations from transformers for MRI to PET synthesis
[20] ASL to PET translation by a semi-supervised residual-based attention-guided convolutional neural network
[21] Attention-aware discrimination for MR-to-CT image translation using cycle-consistent generative adversarial networks
[22] Deep CT to MR synthesis using paired and unpaired data
[23] Unsupervised medical image translation using Cycle-MedGAN
[24] Cross-modality synthesis from CT to PET using FCN and GAN networks for improved automated lesion detection
[25] Image synthesis in multi-contrast MRI with conditional generative adversarial networks
[26] Machine learning for detecting Moyamoya disease in plain skull radiography using a convolutional neural network
[27] Learning spatiotemporal features of DSA using 3D CNN and BiConvGRU for Moyamoya disease detection
[28] Deep learning for MR angiography: automated detection of cerebral aneurysms
[29] Multi-task deep segmentation and radiomics for automatic prognosis in head and neck cancer
[30] Image-derived input function estimation on a TOF-enabled PET/MR for cerebral blood flow mapping
[31] Incorporating Nesterov momentum into Adam
[32] Detection of intracranial atherosclerotic steno-occlusive disease with 3D time-of-flight magnetic resonance angiography with sensitivity encoding at 3T