key: cord-0919504-47onovax
authors: Moore, Michael M.; Iyer, Ramesh S.; Sarwani, Nabeel I.; Sze, Raymond W.
title: Artificial intelligence development in pediatric body magnetic resonance imaging: best ideas to adapt from adults
date: 2021-04-13
journal: Pediatr Radiol
DOI: 10.1007/s00247-021-05072-1
sha: ce474953e813ee7ea27765cbba874143f1b610b2
doc_id: 919504
cord_uid: 47onovax

Emerging manifestations of artificial intelligence (AI) have featured prominently in virtually all industries and facets of our lives. Within the radiology literature, AI has shown great promise in improving and augmenting radiologist workflow. In pediatric imaging, while the greatest AI inroads have been made in musculoskeletal radiographs, there are certainly opportunities within thoracoabdominal MRI for AI to add significant value. In this paper, we briefly review non-interpretive and interpretive data science, with emphasis on potential avenues for advancement in pediatric body MRI based on similar work in adults. The discussion focuses on MRI image optimization, abdominal organ segmentation, and osseous lesion detection encountered during body MRI in children.

The field of data science and its applications to radiology are immense. Overall, data science encompasses areas ranging from traditional databases to business analytics to artificial intelligence. For radiologists, data science can be divided into non-interpretive and interpretive tasks. Non-interpretive tasks tend to focus more often on operational efficiency and practice management. We provide an example of non-interpretive data from our data science team, demonstrating the dramatic changes in radiology exam volumes that occurred during the coronavirus disease 2019 (COVID-19) pandemic at a single academic medical center (Fig. 1). These data empowered departmental scenario planning and staff modeling (e.g., shifting radiologist work location and the distribution of clinical assignments) to adjust for the variance.
When discussing artificial intelligence (AI) in radiology, however, the focus is primarily on interpretive tasks, as it is throughout the remaining portions of this manuscript. More specifically, the focus has been deep learning with convolutional neural networks (CNNs). A CNN is a type of machine learning model whose main features include an architecture comprising multiple layers and image-based input data. Machine learning is the broader subfield of AI that encompasses deep learning as well as previously developed technologies such as support vector machines (SVMs). Further background discussion of machine learning and deep learning CNNs is beyond the scope of this discussion, so we have included resources for readers [1-4]. The goal of this manuscript is to provide a review of pertinent literature to assist in understanding future development of AI for pediatric body MRI.

As alluded to in a previous publication, several challenges that are particularly salient to body MRI in children deserve mention [2]. Large datasets with a variety of data inputs are necessary for robust deep learning development. A 2019 CNN review of radiology literature indicated that 49% of studies had 101-1,000 cases, and 20% had 1,001-10,000 cases [1]. Datasets of this size for pediatric body-MRI-specific applications are understandably limited. A cooperative approach to such studies might therefore be needed, with careful attention to the use of findable, accessible, interoperable and reusable (FAIR) datasets as well as defined, transparent information-sharing agreements and intellectual property governance [5, 6]. Another particularly difficult challenge for pediatric radiology is the need for accurate data labeling to augment the quality of datasets, combined with a relatively small pool of pediatric radiologists.
While one potential solution is utilization of natural language processing to extract pre-existing inhomogeneous labels incorporated into radiology reports, oversight and verification of accuracy by experienced radiologists remains critical. Incorrectly labeled data learned in the training set would corrupt the test set and degrade the accuracy of the deep learning model [2]. These limitations to CNN development in children, particularly the relatively low numbers of cases in pediatric radiology, are widely recognized [7]. There might also be, unfortunately, insufficient economic incentives to justify prioritization of AI development for children. Therefore, strategies must be developed, including potentially adapting algorithms from adults, to overcome these obstacles. Conversely, novel ideas and strategies from pediatric imagers could augment overall AI radiology development. Finally, an additional approach that might be particularly well suited to children is Society for Pediatric Radiology (SPR)-facilitated multi-institutional collaboration. In this model, the development and investment burden could be shared among quaternary institutions with robust research infrastructure, which would focus on algorithm training, and separate smaller children's hospitals, which would focus on algorithm testing. This structure would also have the benefit of ensuring generalizability of algorithm performance.

The advent of AI within several areas of pediatric radiology is progressing rapidly. As of September 2020, the areas with greatest development in peer-reviewed literature were musculoskeletal radiograph classification and segmentation. Although widespread deployment into clinical pediatric radiology practice remains nascent, these areas have the best potential to initially impact pediatric radiology workflow.
More specifically, having algorithms perform repetitive, time-intensive activities unburdens radiologists, increasing efficiency and allowing additional time for cognitively demanding responsibilities and interpersonal collaboration with colleagues [7]. Five examples developed to date include assessment of pediatric bone age, Risser stage, elbow fractures, wrist fractures and leg-length discrepancy. Bone age is the most developed area, having been featured in the imaging literature and in the Radiological Society of North America (RSNA) Bone Age Machine Learning Challenge. Based on a dataset of 14,236 radiographs, 105 submissions were uploaded, with top results giving a mean absolute difference (MAD) of just over 4 months compared to the reference standard [8]. For Risser stage classification, in 1,830 radiographs performed for adolescent idiopathic scoliosis, the CNN performed comparably to or slightly better than six expert graders, with 78.0% versus 74.5% accuracy, respectively. Additionally, the kappa coefficient for this CNN was 0.72, which exceeded the 0.65 for the human graders (the kappa coefficient measures inter-rater agreement; 0.72 indicates substantial agreement) [9].

Fig. 1 Non-interpretive data from an academic medical center data science team demonstrate radiology exam volumes (per day, per accession number) that occurred during the coronavirus disease 2019 (COVID-19) pandemic. The data source was radiologists' dictated reports, including both children and adults. CT computed tomography, DEXA dual energy X-ray absorptiometry, MR magnetic resonance, NM nuclear medicine, PET positron emission tomography, US ultrasound, XR radiography

A model to determine wrist fractures was built using a training dataset of 7,356 studies, annotated by radiologists and subsequently tested on 524 emergency department studies. A substantial minority of included studies was from pediatric patients (training set was 1,362 children and 5,153 adults).
A subgroup analysis of pediatric patients showed a sensitivity of 92.7% and 93.5%, and a specificity of 76.2% and 86.4% on anteroposterior and lateral radiographs, respectively [10]. In a pediatric-specific elbow study for classification, the accuracy was 88%, with an area under the curve (AUC) of 0.95 for the model. The model performed best for supracondylar and lateral condylar fractures but was less successful in cases of elbow effusion without fracture, and with proximal radius and ulna fractures [11]. Leg-length discrepancy (LLD) studies are a common, cognitively simple yet tedious and labor-intensive portion of radiographic workload for pediatric radiologists. A deep learning algorithm for automated segmentation of bilateral lower extremities was developed using 179 LLD studies randomly divided into training, validation and testing sets; the algorithm showed high correlation with the radiology reports (full leg-length discrepancy r=0.92; mean absolute error 0.51 cm). One notable feature of this study was reporting of the dramatic difference between mean calculation time for the deep learning method versus mean time for a radiologist to manually perform the calculation (1 s vs. 96 s) [12].

The results of these pediatric musculoskeletal-focused studies are promising, particularly for triaging fracture detection in settings where pediatric radiology expertise is limited (e.g., general radiology overnight) or in developing countries where radiologist availability is restricted. Additionally, although body MRI studies are fundamentally more complex because of the variety of sequences, acquisition planes and organ systems involved, these radiograph classification studies do set the stage for AI image interpretation development in children. Areas of greatest potential impact to body MRI include image acquisition and sequence optimization, which are necessary tasks preceding image interpretation.
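The performance figures quoted for the radiograph studies above (sensitivity, specificity, AUC and the kappa coefficient) all come from comparing model output with a reference standard. The following is a minimal Python sketch of how such metrics can be computed; it is not taken from any of the cited studies. The names `ConfusionCounts`, `sensitivity`, `specificity`, `auc` and `cohen_kappa` are hypothetical, the unweighted two-rater Cohen's kappa shown here is only one common variant (the cited Risser study may have used a different one), and every number in the usage lines is invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Hashable, Sequence


@dataclass(frozen=True)
class ConfusionCounts:
    """Model output tallied against a reference standard (hypothetical container)."""
    tp: int  # finding present, model positive
    fp: int  # finding absent, model positive
    tn: int  # finding absent, model negative
    fn: int  # finding present, model negative


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of truly positive cases flagged by the model (needs tp + fn > 0)."""
    return c.tp / (c.tp + c.fn)


def specificity(c: ConfusionCounts) -> float:
    """Fraction of truly negative cases cleared by the model (needs tn + fp > 0)."""
    return c.tn / (c.tn + c.fp)


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Area under the ROC curve via its rank interpretation.

    Equals the probability that a randomly chosen positive case receives a
    higher score than a randomly chosen negative case; ties count as half.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def cohen_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Unweighted Cohen's kappa: agreement between two raters beyond chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must label the same non-empty set of cases")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label for every case
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Invented counts and scores, for illustration only.
    counts = ConfusionCounts(tp=90, fp=20, tn=80, fn=10)
    print("sensitivity:", sensitivity(counts))
    print("specificity:", specificity(counts))
    print("AUC:", auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))
    print("kappa:", cohen_kappa([0, 0, 1, 1], [0, 0, 1, 0]))
```

The pairwise AUC loop is quadratic in the number of cases, which is acceptable for a sketch; a production pipeline would more likely rely on an established statistics library.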
Because image acquisition is not isolated to children or adults, advances in either population should theoretically be generalizable to the other cohort. While developments often flow from adults to children because of the generally greater numbers of adult patients and greater financial and resource allocation, opportunities might also exist for pediatric-specific needs to drive innovation.

Multiple examples of improving MRI image quality are emerging within the literature, including at least one study with significant pediatric population representation. Chen et al. [13], in an investigation of 157 patients with a mean age of 11 years, utilized a deep learning variational network to improve MRI image reconstruction speed and quality on single-shot fast spin-echo sequences. Their results indicated better sharpness and signal-to-noise ratio (SNR) and a decrease in average reconstruction time per section from 5.60 s to 0.19 s [13]. Another study, by Lv et al. [14], focused on decreasing respiratory motion artifact on free-breathing abdominal MRI in adult volunteers. The authors compared a respiratory motion correction technique using a CNN-based approach for image registration to both non-motion correction and local affine registration methods [14]. The results indicated that the CNN achieved the highest SNR and vessel sharpness while also significantly reducing registration time compared to the other two approaches [14].

Not only can CNNs be utilized to increase signal-to-noise ratio and image sharpness, but they might also be employed to screen for non-diagnostic images, which could help by alerting the MRI technologists to optimize or repeat sequence acquisitions without unnecessarily interrupting pediatric radiologist workflow. For evaluation of T2-weighted liver MRI, Esses et al. [15] showed that a CNN algorithm to screen for non-diagnostic images had negative predictive values (NPVs) of 94% and 86% with respect to two human radiologists.
This high NPV could facilitate potential detection of non-diagnostic sequences for MRI technologist quality review [15].

Examples of improved image quality applicable to body MRI might also be derived from cardiac and knee MRI techniques. Masutani et al. [16] demonstrated that deep learning CNNs (specifically single- and multiple-frame SRNet and UNet architectures) can infer high-frequency spatial details from low-resolution inputs (super-resolution) for cardiac MRI. AI techniques in cardiac imaging can be applied to dynamic as well as static cardiac MR acquisitions, something that would be of great value given the higher heart rates encountered in the pediatric age group [17]. Recently, an interchangeability study of knee MRI demonstrated that a deep learning variational network enabled a 3.5-fold acceleration in image acquisition compared to fully sampled data acquisition, while all six radiologist readers judged the image quality of the accelerated sequences to be better [18]. The study also showed that interchangeability of sequences resulted in discordant clinical opinions of no greater than 4% for any feature [18].

Beyond the direct scope of body MRI, CNNs might also improve filtered back-projection in pediatric CT. MacDougall et al. [19] showed a 31% reduction in image noise when CNNs were utilized for abdominal CT examinations in 11 children. Improvements to pediatric abdominal CT might indirectly benefit body MRI quality in children by improving magnet access in situations where low-dose CT might be a suitable alternative.

Taken together, this body of literature illustrates that CNN development for MRI sequence optimization, including reducing acquisition time, increasing SNR and image sharpness, and identifying non-diagnostic MR images, is both feasible and potentially deployable on the scanner at the time of acquisition.
Figure 2 demonstrates an artistic rendering combining these concepts, which we hope to see developed by the collaborative efforts of electrical and computer science engineers, MRI physicists and radiologists for clinical use.

Segmentation identifies the voxels composing an anatomical structure of interest, most commonly an organ, portion of an organ, or pathology such as a tumor. While several segmentation tasks in body MRI are only geared toward adults (e.g., prostate), several other undertakings have potential benefits to children. Wang et al. [20] published a study assessing feasibility of a CNN with a U-Net architecture for segmenting the liver on both MRI and CT. For the MRI portion, the Dice scores were 0.95 for T1-weighted MRI and 0.92 for T2*-weighted MRI, and the 95% limits of agreement between T1-weighted MRI volume compared to manual segmentation were −358 mL to 180 mL [20]. The Dice score is a commonly utilized validation tool for AI image segmentation algorithms that these authors defined as "the volume of overlap between segmentations from the CNN and from the manual labeling divided by the averaged segmentation volume between the two methods" [20]. The authors concluded that liver segmentation for MRI utilizing a CNN is feasible and generalizable across multiple modalities and imaging techniques [20].

Another potential opportunity for automated segmentation for pediatric radiologists is calculating renal volumes. Two manuscripts, both in adults, focused on deep learning for segmenting kidney volumes in the setting of autosomal-dominant polycystic kidney disease on CT and MRI, respectively. Sharma et al. [21] utilized a training dataset of 165 CT examinations and a testing set of 79 CT examinations and found a Dice coefficient of 0.86 comparing automated deep learning versus manual segmentation. A subsequent MRI-focused manuscript that compared two semantic CNNs demonstrated accuracy greater than 85% for both models [22].
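The Dice definition quoted above from Wang et al. [20], overlap volume divided by the average of the two segmentation volumes, is equivalent to 2|A∩B| / (|A| + |B|) for two voxel sets A and B. As a minimal sketch, assuming segmentations are supplied as flattened binary masks of equal length (the function name `dice_score` and the toy masks are hypothetical and not drawn from the cited work), it could be computed as follows.

```python
from typing import Sequence


def dice_score(pred_mask: Sequence[int], ref_mask: Sequence[int]) -> float:
    """Dice score between two flattened binary masks (nonzero = inside the organ).

    Computed as overlap volume divided by the mean of the two segmentation
    volumes, counted in voxels.
    """
    if len(pred_mask) != len(ref_mask):
        raise ValueError("masks must cover the same voxel grid")
    overlap = sum(1 for p, r in zip(pred_mask, ref_mask) if p and r)
    pred_volume = sum(1 for p in pred_mask if p)
    ref_volume = sum(1 for r in ref_mask if r)
    mean_volume = (pred_volume + ref_volume) / 2
    if mean_volume == 0:
        # Assumption: two empty segmentations are treated as perfect agreement.
        return 1.0
    return overlap / mean_volume


if __name__ == "__main__":
    # Toy six-voxel example: CNN output versus manual labeling (invented data).
    cnn_mask = [1, 1, 1, 0, 0, 0]
    manual_mask = [0, 1, 1, 1, 0, 0]
    print("Dice:", dice_score(cnn_mask, manual_mask))
```

Because each voxel has a fixed physical volume, counting voxels gives the same ratio as using volumes in milliliters, provided both masks share the same grid.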
Based on these body MRI studies in adults, it is reasonable to conclude that segmentation tasks on pediatric abdominopelvic MRI examinations will eventually be assisted by CNNs, potentially relieving pediatric radiologists and imaging support staff of another time-intensive repetitive task. Literature on the role of CNNs in segmentation is further supported by work in the realm of cardiac MR [23]. Figure 3 demonstrates an artistic rendering for MRI liver segmentation utilizing a CNN, which we are optimistic will eventually be commercially available for future practice. For children, segmentation of liver volumes is very helpful, particularly in surgical planning for hepatoblastoma resection with partial hepatectomy. Segmentation of other common pediatric tumor volumes, including Wilms tumor, is valuable for assessing volumetric treatment response to chemotherapy.

Fig. 2 Magnetic resonance image optimization. Artistic rendering demonstrates the opportunity for convolutional neural networks (CNNs) to decrease motion and increase image sharpness on abdominopelvic MRI in children. The first (left) portion shows motion artifact resulting in blurriness on a coronal T2-weighted image. The CNN schematic (middle) shows white spheres representing nodes and blue lines representing connections between the nodal layers. The final (right) portion shows an image with decreased motion artifact. Underlying images are conventionally acquired coronal half-Fourier acquisition single-shot turbo spin-echo images in a 17-year-old girl experiencing claustrophobia at the time of acquisition. Image created by Devon Stuart, MA, CMI, in conjunction with Michael Moore, MD

Once the liver is segmented, another potential area of development for AI is a comprehensive MRI-based liver analysis for the occurrence of liver diseases, especially those related to lifestyle, because there is overlap between the adult and pediatric populations. He et al.
[24] have shown that an SVM model applied to clinical and T2 radiomic data can predict liver stiffness (discussed later), potentially obviating the need for formal MR elastography to assess liver stiffness in cases where the predicted likelihood of underlying liver disease is low. Liver fibrosis staging has also been investigated using CNNs and gadoxetic acid-enhanced liver MRI, with significant correlation with pathological data [25]. Both of these could allow for the noninvasive assessment and subsequent monitoring of diffuse liver disease processes, such as viral hepatitis infections and steatohepatitis.

Presently, a paucity of detection-focused body MRI literature is directly applicable to children. Literature from adults, though, shows that detection and characterization of liver lesions on MRI are developing. Zhen et al. [26] used data from 1,210 patients with liver tumors to train a CNN that was then tested on 210 independent patients. These liver lesions were placed into seven total categories, including (with AUC performance listed in parentheses) hemangioma (0.94), focal nodular hyperplasia (0.98), hepatocellular carcinoma (0.92) and metastatic malignancy (0.89) [26]. An additional CNN model within this study distinguished malignant from benign liver pathology with an AUC of 0.95 (95% confidence interval [CI] 0.91 to 0.98) [26]. Another CNN liver lesion study, by Hamm et al. [27], was trained with 434 lesions and tested with 60 lesions in adults, representing six separate liver lesion types. Overall, the test set performance on previously unseen cases showed sensitivity of 90% and specificity of 98% (compared to human radiologists of 82% and 96%, respectively) [27].

Although AI characterization of abdominal histopathology in children remains to be developed, pediatric radiomic MRI AI is emerging. For prediction of liver stiffness in children, He et al.
[24] built an AI model using support vector machines with both clinical and T2-weighted MRI radiomic data, with 27 and 105 features, respectively. The algorithm was trained on data from 225 internal patients, then independently tested on 84 patients. For the external validation portion, the SVM achieved an AUC of 0.80 (compared to 0.84 on internal data). These studies help confirm AI's potential to assist pediatric radiologists beyond detection and into additional tissue characterization.

Pediatric radiologists might eventually leverage CNN models for detection of abnormalities beyond the direct focus of thoracoabdominal MRI, specifically osseous and whole-body MRI findings. While much of the deep learning literature in musculoskeletal imaging [28] is focused on detection of fractures, cartilaginous abnormalities and meniscal tears, there is a potential AI growth opportunity for detecting pediatric skeletal abnormalities such as metastases, diffuse marrow replacement, hematopoietic marrow conversion and multifocal osteomyelitis. A study of metastatic disease demonstrated detection of all spinal metastases in 26 adults utilizing a deep Siamese neural network with a rate of 0.21 false-positives per case with use of an aggregation strategy [29]. Although less directly applicable to children, a deep learning model differentiated lung cancer spinal metastatic disease from other types of metastatic disease with 0.81 accuracy on dynamic contrast-enhanced MRI, further confirming the potential utility of deep learning for osseous disease [30]. While CNN detection of additional sites of osseous metastatic disease remains to be developed, this could be very helpful in directing pediatric radiologists to areas beyond the primary examination focus. Figure 4 demonstrates an artistic rendering of a model for MRI osseous metastasis to the right femur that requires significant long-term development.
Emerging AI might also assist in predicting treatment response, although therapy response literature based on body MRI presently remains sparse. One MRI-based example is the use of machine learning models to predict treatment response to transarterial chemotherapy for hepatocellular carcinoma. Specifically, logistic regression and random forest-type models utilizing both clinical patient data and baseline preprocedural MRI were developed, although accuracy remains only fair at 78% [31]. Development by pediatric radiologists of AI for predicting outcomes using a combination of clinical and MRI data is an additional area for research to help advance care for the children we treat.

Based on studies and ideas from the radiology literature in both adults and children, advances in the utilization of artificial intelligence in pediatric body MRI are anticipated. Development is most likely to be focused in areas including MRI image optimization, lesion detection and solid organ segmentation in the abdomen, including liver and kidneys. As obstacles in imaging children are conquered, AI is expected to facilitate improved efficiency by augmenting pediatric radiologists.

Fig. 4 Magnetic resonance imaging osseous lesion detection. Artistic rendering demonstrates the opportunity for convolutional neural networks (CNNs) to detect osseous lesions on abdominopelvic MRI beyond the primary focus of abdominal organs and bowel. Proximal right femoral metastasis is highlighted with the marrow component shaded green and periosteal component shaded orange. Bottom, the neurons represent a biological neural network, which is often likened to a CNN. Underlying axial T2-W turbo spin-echo image is from a 14-year-old boy undergoing abdominopelvic MRI for metastatic disease evaluation.
Image created by Devon Stuart, MA, CMI, in conjunction with Michael Moore, MD

1. Convolutional neural networks for radiologic images: a radiologist's guide
2. Machine learning concepts, concerns and opportunities for a pediatric radiologist
3. Machine learning for medical imaging
4. Deep learning: a primer for radiologists
5. Medical image data and datasets in the era of machine learning - whitepaper from the 2016 C-MIMI meeting dataset session
6. Machine learning in whole-body MRI: experiences and challenges from an applied study using multicentre data
7. Three reasons why artificial intelligence might be the radiologist's best friend
8. The RSNA Pediatric Bone Age Machine Learning Challenge
9. Risser stage: convolutional neural networks for automatic Risser stage assessment
10. Convolutional neural networks for automated fracture detection and localization on wrist radiographs
11. Elbow fractures: binomial classification of pediatric elbow fractures using a deep learning multiview approach emulating radiologist decision making
12. Deep learning measurement of leg length discrepancy in children based on radiographs
13. Variable-density single-shot fast spin-echo MRI with deep learning reconstruction by using variational networks
14. Respiratory motion correction for free-breathing 3D abdominal MRI using CNN-based image registration: a feasibility study
15. Automated image quality evaluation of T2-weighted liver MRI utilizing deep learning architecture
16. Deep learning single-frame and multiframe super-resolution for cardiac MRI
17. From compressed-sensing to artificial intelligence-based cardiac MRI reconstruction
18. Using deep learning to accelerate knee MRI at 3T: results of an interchangeability study
19. Improving low-dose pediatric abdominal CT by using convolutional neural networks
20. Automated CT and MRI liver segmentation and biometry using a generalized convolutional neural network
21. Automatic segmentation of kidneys using deep learning for total kidney volume quantification in autosomal dominant polycystic kidney disease
22. A comparison between two semantic deep learning frameworks for the autosomal dominant polycystic kidney disease segmentation based on magnetic resonance images
23. Deep learning for cardiac image segmentation: a review
24. Machine learning prediction of liver stiffness using clinical and T2-weighted MRI radiomic data
25. Liver fibrosis: deep convolutional neural network for staging by using gadoxetic acid-enhanced hepatobiliary phase MR images
26. Deep learning for accurate diagnosis of liver tumor based on magnetic resonance imaging and clinical data
27. Deep learning for liver tumor diagnosis part I: development of a convolutional neural network classifier for multi-phasic MRI
28. Current applications and future directions of deep learning in musculoskeletal radiology
29. A multi-resolution approach for spinal metastasis detection using deep Siamese neural networks
30. Differentiation of spinal metastases originated from lung and other cancers using radiomics and deep learning based on DCE-MRI
31. Predicting treatment response to intra-arterial therapies for hepatocellular carcinoma with the use of supervised machine learning

Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.