key: cord-0463945-hyyd9ezw
authors: Kang, John; Thompson, Reid; Aneja, Sanjay; Lehman, Constance; Trister, Andrew; Zou, James; Obcemea, Ceferino; Naqa, Issam El
title: NCI Workshop on Artificial Intelligence in Radiation Oncology: Training the Next Generation
date: 2019-10-18
journal: nan
DOI: nan
sha: ef253005cec99cdf41fa1cd576a39aac200b05f2
doc_id: 463945
cord_uid: hyyd9ezw

Artificial intelligence (AI) is about to touch every aspect of radiotherapy from consultation, treatment planning, quality assurance, therapy delivery, to outcomes reporting. There is an urgent need to train radiation oncologists and medical physicists in data science to help shepherd artificial intelligence solutions into clinical practice. Poorly-trained personnel may do more harm than good when attempting to apply rapidly-developing and complex technologies. As the amount of AI research expands in our field, the radiation oncology community needs to discuss how to educate future generations in this area. In this perspective article by the Training and Education Working Group from the National Cancer Institute (NCI) Workshop on AI in Radiation Oncology (Shady Grove, MD, April 4-5, 2019), we provide and discuss Action Points relevant for future trainees interested in radiation oncology artificial intelligence: (1) creating AI awareness and proper conduct; (2) developing a dedicated didactic curriculum; (3) creating a publicly available database of training resources; and (4) accelerating the translation process of AI into clinical practice.

Artificial intelligence (AI) is a longstanding field of study that has attempted to emulate and augment human intelligence. In the last several years, AI has been reinvigorated by advances in computer technology and machine learning (ML) algorithms, which aim to teach computers to learn patterns and rules by using previous examples. ML builds on experiences from computer science, statistics, neuroscience, and control theory, among many other disciplines.
ML has benefited from the recent availability of large datasets and developments in computer hardware and software 1 for solving large-scale optimization problems. Most notably, deep learning (DL) techniques have demonstrated significant successes in computer vision and language processing. The umbrella term "informatics" includes practical applications of any of the above areas of study; for example, bioinformatics for biology and clinical informatics for clinical practice. The term "data science" refers to the general study of data analysis, which has recently focused on ML methods. A schematic of the relationships between common terminologies is shown in Figure 1.

Figure 1: Schematic of how artificial intelligence (AI), machine learning (ML), and deep learning (DL) relate to each other. Closely associated application areas such as "data analytics" and "big data" exist both within and outside of these realms.

Many fields such as finance, manufacturing, and advertising have already incorporated AI into their workflows to improve efficiency and perform supra-human tasks. While AI has been adopted more slowly in the clinic due to multiple competing factors, including a lack of training, the perception of and engagement with AI in medicine have been improving. The American Medical Association (AMA) adopted a policy in June 2019 to integrate training in AI augmentation 2. The National Institutes of Health (NIH) Big Data to Knowledge (BD2K) Initiative has established several Centers of Excellence in Data Science, and is focused on enhancing nationwide training infrastructure in biomedical data science as well as data sharing 3. Radiation oncology holds significant promise for AI-powered tasks (described in several perspectives and reviews [4] [5] [6] [7] [8] [9]), not just for optimizing workflows or diagnosis, but also for more rewarding tasks such as prognostic prediction and personalized treatment recommendations.
AI applications in radiation oncology span the domains of both medical physicists and radiation oncologists. Some applications, such as autosegmentation and automated treatment planning, will be human-verifiable as correct. Others, such as adaptive treatment planning, risk prognostication, and decision support, may require careful consideration of model development and validation. As applied research in these applications grows in radiation oncology, a commensurate growth in education is necessary to be able to build and validate trustworthy AI models that can be applied to the clinic.

Formal surveys of trainees in both radiation oncology and radiology reveal that the majority are interested in additional training in AI and informatics 10, 11. The American Society for Radiation Oncology (ASTRO) sent surveys in 2017 to radiation oncology chairs and trainees to assess their perception of training and research opportunities in genomics, bioinformatics, and immunology 11. Among the three areas, bioinformatics received the most enthusiasm: 76% believed that bioinformatics training would "definitely or probably" advance their career. In addition, 67% expressed interest in a formal bioinformatics training course and 88% of chairs reported they would "probably or definitely" send faculty or trainees to such a course, reflecting an unmet need in training opportunities. We believe this survey is an accurate reflection of the current interests and needs of our field 12. As a community that will be using AI in the daily workflow, we need to have a basic understanding of AI/ML methodology and its applications in relevant fields like imaging and genomics. In recognition of the need for radiation oncologists with specialized training, Oregon Health & Science University 13 and MD Anderson Cancer Center 14 have created fellow and/or resident tracks specifically for radiation oncology trainees.

In radiology, the sentiment towards AI appears to be similar to that in radiation oncology 15.
A single-institution survey of a radiology department revealed concerns about job security but also enthusiasm to learn about AI/ML 10. This survey showed that 97% of trainees were planning to learn AI/ML as relevant to their job (vs. 77% of attending radiologists). This enthusiasm was tempered by the fact that, given their current knowledge of AI, 50% of trainees were "unsure of or definitely would have changed" their decision to pursue radiology. Despite these misgivings, 74% were willing to help create or train an ML algorithm to do some of the tasks of a radiologist.

National radiology societies have been responsive to these sentiments. The American College of Radiology (ACR) Data Science Institute (https://www.acrdsi.org/) recently launched the ACR AI-LAB™ to allow radiologists to create, validate and use models for their specific local clinical needs 16. The Radiological Society of North America (RSNA) and the Society for Imaging Informatics in Medicine (SIIM) co-sponsor the National Imaging Informatics Course, held twice a year (in its 3rd year); the majority of residency programs have participated 17. The RSNA annual meeting hosts several AI refresher courses and coding challenges in the new "AI Pavilion" with residents encouraged to participate 18. The SIIM Resident, Fellow, Doctoral Candidate, Student (RFDS) society hosts monthly journal clubs and promotes mentoring opportunities 19.

In this critical review, we propose an overall action plan for radiotherapy-specific AI training that comprises the Action Points (APs) outlined in Table 1. We cover each AP in detail in this perspective article.

There are undeniably swirling issues regarding the application of AI in radiation oncology and medicine in general. In the last decade or so, we have seen several examples of ethical concerns and biases magnified by AI.
When there are biases in the training data (e.g., certain populations or scenarios are overrepresented), an algorithm that models correlations could propagate or even amplify these biases, leading to undesirable outcomes in deployment 20. This is particularly problematic as AI is sometimes viewed as being "objective" without consideration for the causal data generation process, which is often unknown. The European Union (EU) has recently released a seven-point action plan towards so-called "trustworthy" AI. This plan focuses on the ethical aspects of AI and includes: human agency and oversight; robustness and safety; privacy and data governance; transparency; diversity, non-discrimination and fairness; societal and environmental well-being; and accountability 21. The Food and Drug Administration (FDA) has taken similar steps towards regulation of AI applications in medicine 22.

A key component of improving awareness is to be transparent and to clearly document where and when an AI algorithm is used in any part of the clinical workflow. In cases where AI is applied, researchers and physicians should also clarify whether the AI is a machine learning system (the more recent type of AI, which is trained on large datasets and tends to be more of a black box) or an older rule-based system. Machine learning and rule-based AI behave differently and are subject to different regulations.

There are currently no educational guidelines for AI training of radiation oncology or medical physics residents. Serendipitously, there is an active discussion within the field about revising the radiation oncology resident training curriculum. While an in-depth discussion of all the factors at play is beyond the scope of this article, we refer readers to a pair of editorials by Amdur and Lee 23 and Wallner et al. 24. In Spring 2019, the Accreditation Council for Graduate Medical Education put forth a draft of proposed changes 25 to the residency curriculum.
The draft revisions are notable for mandating intradepartmental conferences in several new areas, including clinical informatics. We agree that any updated training curriculum should address training in informatics. Given the rapid pace of advancements and increased interdisciplinary education requirements, it may be necessary to have residents elect "sub-specializations" so that they do not dilute a given area of interest too much. In Action Point 2, we propose a high-level overview of a curriculum draft for trainees in medical physics and radiation oncology to adequately grasp the basic principles of AI. These principles, listed below, are generalizable to medicine as a whole, but have particular significance for interventional and informatics-heavy specialties such as radiation oncology.

There is increasing concern that AI models influenced by bias will further perpetuate healthcare disparities for patients. Bias is often retained in AI models because the training data fail to represent the entire population equally. Because AI algorithms do not have a concept of "fairness", surveillance of inherent bias in AI is typically left to those who designed the system. As noted by the EU/FDA in Action Point 1, proper application of AI should aim to promote positive social change and enhance sustainability and ecological responsibility. Particularly in medicine, rules and regulations should be put in place to ensure responsibility and accountability for AI systems, their users and their appropriate utilization. In the computer science and machine learning communities, there have been increasing efforts to improve the teaching of ethics and human-centered AI in coursework (https://stanfordcs181.github.io/) 20, 21, 26. A complementary area of work is to develop methods to audit AI systems in order to identify potential systematic or cultural biases.
Trainees must develop an appreciation for these critical complexities and potential limitations of AI.

Data structures and algorithms form the foundation of AI applications. Unfortunately, quantitative analysis and critical data appraisal are not emphasized in medical or post-graduate training, although there are examples of deeper statistical training in our field to ensure appropriate clinical trial design. As a growing number of ML techniques are published in medical journals, it is incumbent upon editors and readers alike to have some basic facility with the techniques. A working knowledge of basic statistical concepts such as hypothesis testing, confidence intervals, and basic performance metrics will need to be built before moving on to data structures and model-agnostic techniques such as data cleaning 27, cross validation, model fitting, the bias-variance trade-off, and advanced performance metrics such as the widely used but poorly understood receiver operating characteristic (ROC) curve 28. To de-mystify many of these topics, there are existing high-quality online courses made broadly available, which will be further discussed in Action Point 2 and Action Point 3.

For proper clinical application of AI tools, physicians should be able to assess the validity of the data and the model-generation process. So-called "black box" models have such internal complexity that they are conceptualized as inputs mapped to outputs without any intent to understand how the mapping occurs. Several ML methods, including deep learning (DL) and most ensemble methods, fall into this categorization. While black box AI models can have excellent performance during training and internal validation, they often encounter problems generalizing when widely deployed. Understanding why a problem occurred can be difficult with "black box" models and is currently a very active area of AI research [29] [30] [31]. One way to demonstrate data and model interpretability is through "use cases."
In medical research, there are well-known examples of the potential dangers of black box models related to confounding 32, 33. Fortunately, researchers were able to catch these issues before deploying their models, but this may not always be the case for complex datasets with nonobvious confounders. There is an ongoing discussion on the necessity of AI interpretability by the FDA 22, 34 and the informatics community 35. All parties would agree that elevating the knowledge base of clinicians and physicists will certainly enable more innovation regardless of final regulatory plans.

For trainees interested in applying data science to clinical practice, these opportunities should be encouraged and promoted. While medical physics and radiation oncology AI curricula have significant overlap, each will necessarily focus on separate domains. In medical physics, instruction may cover methods for autosegmentation, automated/adaptive treatment planning, and quality assurance. Radiation oncology trainees may be more interested in prognostic predictions and clinical decision support. In the future, as AI takes more of an augmented intelligence role, there should be instruction for physicians on how to decide whether to accept, interrogate or reject recommendations. For example, physicians may need to determine whether there is sufficient rationale to accept an automatically-generated plan or treatment recommendation using clinical and dosimetric information.

Several radiation oncology departments have AI researchers who could contribute to a training curriculum. These courses should be jointly taught to both physicists and physicians. We anticipate that common courses and collaboration between trainees in medical physics and radiation oncology will improve translation of AI methods into the clinic.
For departments without access to sufficient resources, online education using so-called MOOCs ("massive open online courses;" a misnomer as they are not necessarily massive or open) and workshop models (see Action Point 4) may be more educational to trainees than co-opting faculty without training in AI/ML. For advanced practitioners, we will discuss data science hackathons and crowdsourcing in Action Point 3.

A significant impediment to validated AI in medicine is the tribalism surrounding clinical data ownership 36 that is partially attributable to regulations involving protected health information (PHI). Unlike academic medicine, academic AI research has a strong open-access culture where pre-print archiving of publications is the norm 37 and datasets are published simultaneously with papers to invite validation. Notably, patients are generally supportive of the sharing of their data and would likely embrace scientific reuse of their data to improve the lives of future patients 38, though we recognize that there are many regulatory limitations to widespread data sharing of this sort. Finding a path for controlled data sharing amongst trusted parties, or more broadly with deidentification schemes, could be an important first step in improving the accuracy of AI algorithms. In this curriculum, we hope to emphasize the efforts in medicine and oncology to promote data sharing and discuss the obstacles to it.

The NCI is keen on improving data sharing protocols and resources. In 2018, the NCI Office of Data Sharing (ODS) was created and has a current focus on pediatric tumors 39. The longstanding Cancer Imaging Archive (TCIA) 40 allows the sharing of anonymized imaging datasets as well as corresponding clinical and genomic data 41. A radiology initiative includes the ACR Imaging Network (ACRIN) for clinical trial protocol and dataset sharing 42.
The American Society of Clinical Oncology (ASCO) provides another strong example of centralized data sharing in the CancerLinQ project (https://cancerlinq.org/). Radiation oncology currently lacks a specialty-specific centralized platform for data request and sharing.

One approach to overcome the medicolegal and PHI issues of data transfer is distributed or federated learning. In this approach, analysis is performed locally and models (e.g., feature weights) are transferred instead of data 43; this decentralized approach has shown equivalent performance to that using central pooling of data 44, 45. Such innovative approaches for anonymization could be part of a training curriculum to help overcome barriers to data sharing.

Recent years have seen signs of a cultural shift in radiation oncology towards open collaboration and data sharing, along with formalization of key principles in data sharing, namely that data should be FAIR: findable, accessible, interoperable, and reusable. These FAIR guiding principles for scientific data management and stewardship 46 are of utmost importance, and should be discussed with and endorsed for all trainees. In line with these principles, several radiation oncology academic centers and cooperative groups have contributed datasets to the TCIA [47] [48] [49] [50]. Open access journals with a focus on radiation oncology include BMC Radiation Oncology, the Frontiers in Oncology section on radiation oncology, and Advances in Radiation Oncology, which was launched by ASTRO in 2015 51.

Several coordinating efforts present opportunities to pool ideas and data to promote collaboration, increase power for discovery, and avoid redundancy. Within imaging, these efforts include the aforementioned ACR Data Science Institute for AI in medical imaging, which aims to identify clinically-impactful use cases in radiation oncology, such as autosegmentation and MRI-derived synthetic CT scans 52.
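As an aside for trainees, the distributed (federated) learning idea described earlier in this section can be illustrated with a small sketch. The code below is a toy under stated assumptions, not the method used in the cited studies: the three "sites", their synthetic data, the linear model, and the helper names `local_fit`, `federated_average`, `make_site`, and `train_federated` are all hypothetical. Each site fits the model on its own rows and only the fitted weights leave the site; a coordinator then averages the weights, weighted by site size.

```python
"""Toy federated averaging: sites share model weights, never patient rows.

All data are synthetic and all function names are illustrative assumptions.
"""
import random


def local_fit(data, w, b, lr=0.05, epochs=50):
    """Gradient descent for y ~ w*x + b on one site's private (x, y) pairs."""
    n = len(data)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 * sum((w * x + b - y) * x for x, y in data) / n
        grad_b = 2.0 * sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


def federated_average(site_models, site_sizes):
    """Combine per-site (w, b) pairs, weighting each site by its sample count."""
    total = sum(site_sizes)
    w = sum(m[0] * n for m, n in zip(site_models, site_sizes)) / total
    b = sum(m[1] * n for m, n in zip(site_models, site_sizes)) / total
    return w, b


def make_site(n, seed):
    """Synthetic stand-in for one hospital's data: y = 2x + 1 plus noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        rows.append((x, 2.0 * x + 1.0 + rng.gauss(0.0, 0.1)))
    return rows


def train_federated(sites, rounds=20):
    """Each round: broadcast global weights, fit locally, average the results."""
    w, b = 0.0, 0.0
    for _ in range(rounds):
        local_models = [local_fit(rows, w, b) for rows in sites]
        w, b = federated_average(local_models, [len(rows) for rows in sites])
    return w, b


sites = [make_site(40, seed=1), make_site(60, seed=2), make_site(100, seed=3)]
w, b = train_federated(sites)
print(f"global model after federated training: w={w:.3f}, b={b:.3f}")
```

Only the `(w, b)` pairs cross site boundaries in this sketch; the rows returned by `make_site` are read solely inside `local_fit`. A real deployment would also need secure transport, governance agreements, and checks that the shared weights do not leak patient information, none of which is modeled here.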
Within genomics, the Radiogenomics Consortium is a transatlantic cooperative effort pooling American and European cohorts to find genomic markers for toxicity to radiotherapy 53, 54. Several groups within the consortium are interested in creating ML models to predict toxicity response in radiotherapy [55] [56] [57] [58].

Through the proposed curriculum draft of Action Point 2, we hope to build a core of trainees for the next generation who are able to understand and apply data science fundamentals while also understanding ethical considerations and data sharing principles.

Directly building off the curriculum discussed in Action Point 2, the third Action Point relates to the creation of a centralized, publicly accessible repository of resources to guide trainees. These resources could include seminal white papers, video lectures, code samples, and contacts for potential collaborations. To reach the widest potential audience, we favor storage at open access websites such as GitHub and YouTube. As discussed in Action Point 2, a formal curriculum can be facilitated and standardized using MOOCs, which would consist of video lectures and interactive coding exercises 59. MOOCs have become very influential in open education as they can be tailored for various experience levels and are self-paced. Due to economies of scale, they can be widely disseminated for reasonable costs. For example, the MOOCs on Coursera run on the higher end of cost and charge around $40/month for classes that last around a month with around 10-12 hours of coursework a week 60. MOOCs could be adopted from existing courses or centrally created in collaboration with organizations such as the ARRO Education Committee and Radiation Oncology Education Collaborative Study Group (ROECSG).

As interest grows, trainees will likely want to be involved in practical research projects.
Given that AI expertise is not evenly distributed, both intra- and inter-institutional collaborations can be fostered. In this respect, trainees can provide a valuable service by annotating data for research. At the same time, they would also benefit from the service of others by receiving annotated data for model and skills development. A model for this can be seen in eContour (https://www.econtour.org), a free web-based contouring atlas. In a randomized trial, eContour improved nasopharynx contours and anatomy knowledge compared to traditional resources 61. A next phase of eContour may involve enabling user-generated contours and segmentations for technology development and research, with a prototype initially pilot-tested at the American College of Radiation Oncology Annual Meeting in 2017 (http://www.econtour.org/acro). Researchers are invited to use the platform to collect contours from large numbers of users from diverse practice locations, though they must provide funding to support website programming and content administration. For residents, funding opportunities are available through professional organizations, as discussed in Action Point 4.

Another promising venue for trainees interested in skills development is through public competitions. Past challenges in radiation oncology have leveraged collaborations between academic centers, international societies such as MICCAI (The Medical Image Computing and Computer Assisted Intervention Society) and commercial sites such as TopCoder.com (Wipro, Bengaluru, India) and Kaggle.com (Google, San Francisco, CA). These public crowd-sourcing challenges have included two radiomics challenges (to predict human papillomavirus status or local control) in oropharyngeal cancer after radiotherapy 62 and a lung tumor segmentation challenge 63.
In the data science competition space as a whole, there has been enthusiasm for healthcare challenges, with the last three Kaggle Data Science Bowls (https://datasciencebowl.com/) focused on heart ejection fraction determination (2016), lung cancer screening (2017), and cellular nuclei detection (2018).

Developing and maintaining the resources described in Action Point 3 will require accelerated learning by particularly motivated trainees, who will need institutional infrastructure and funding mechanisms to be successful. While MOOCs provide consistency and quality of education, for accelerated training, the radiation oncology community could adopt the intensive weeklong workshop model that is widely used by oncology organizations 12 67 and ad hoc workshops by the NIH/NCI 68, 69 have provided forums for data science practitioners to discuss results and ideas, but have not yet focused on education.

Another avenue to gain expertise could be through the American Board of Radiology's B. Leonard Holman Research Pathway. This pathway is an established research fellowship during residency that provides 18 to 21 protected months of research without lengthening clinical training time. This protected time could be used to gain expertise in data science, which could include AI fellowships in collaboration with data science departments or industry. Several companies, including Google, Microsoft, NVIDIA, and Facebook, offer 1-year AI "residencies" for specific areas such as deep learning.

There are several seed grant opportunities for residents and fellows. These include 1-year grants of $25-50k by ASTRO (physicians and physicists), RSNA, and ASCO. There are also additional funding opportunities not specific to trainees by the Radiation Oncology Institute 70 and radiotherapy companies 71, 72. For new faculty, NIH K08/K23 awards can provide mentored research time and salary support 73.
NIH R25 grants for developing informatics tools for cancer are another promising avenue for multi-year funding opportunities 74.

AI is becoming a transformative force in medicine but there are dangers in blindly trusting trained models and raw data without understanding their governance. Just as radiation oncology trainees should have an understanding of radiobiology and physics to treat patients, we believe that some level of competency in AI is necessary to safely and effectively utilize it in the clinical setting. In this perspective article from the 2019 NCI Workshop on AI in Radiation Oncology: Training and Education Working Group, we have discussed AI awareness and proper conduct (Action Point 1), what an AI curriculum might include (Action Point 2), how to create and contribute to educational resources (Action Point 3), and what support from institutions and funding agencies is required (Action Point 4). We hope that this paper will spark further discussion on the future of trainee education in radiation oncology.

Cramming more components onto integrated circuits
AMA adopt policy, integrate augmented intelligence in physician training
The National Institutes of Health's Big Data to Knowledge (BD2K) initiative: capitalizing on biomedical big data
Predicting outcomes in radiation oncology - multifactorial decision support systems
Machine Learning Approaches for Predicting Radiation Therapy Outcomes: A Clinician's Perspective
Big Data and machine learning in radiation oncology: State of the art and future prospects
El Naqa I. Big Data Analytics for Prostate Radiotherapy
Artificial intelligence in radiation oncology: A specialty-wide disruptive transformation?
The Role of Artificial Intelligence in Diagnostic Radiology: A Survey at a Single Radiology Residency Training Program
Assessing the Training and Research Environment for Genomics, Bioinformatics, and Immunology in Radiation Oncology
Career Enrichment Opportunities at the Scientific Frontier in Radiation Oncology
Fellowship in Clinical Informatics: Radiation Oncology Track | OHSU
Fellow and Resident Radiation Oncology iNtensive Training in Imaging and Informatics to Empower Research Careers (FRONTI2ER)
Artificial intelligence in radiology: friend or foe? Where are we now and where are we heading?
American College of Radiology Launches ACR AILAB to Engage Radiologists in AI Model Development
National Imaging Informatics Curriculum and Course
RSNA Launches Artificial Intelligence Initiatives
Society for Imaging Informatics in Medicine. SIIM.org
AI can be sexist and racist - it's time to make it fair
Building trust in human-centric AI. FUTURIUM - European Commission
Food & Drug Administration. Clinical and Patient Decision Support Software Draft Guidance
A Call for Change in the ABR Initial Certification Examination in Radiation Oncology
The American Board of Radiology Initial Certification in Radiation Oncology: Moving Forward Through Collaboration
Program Requirements for Graduate Medical Education in Radiation Oncology: Summary and Impact of Focused Requirement Revisions
Tidy data
The meaning and use of the area under a receiver operating characteristic (ROC) curve
Intelligible Models for HealthCare: Predicting Pneumonia Risk and Hospital 30-day Readmission
Interpretation of Neural Networks is Fragile
Making intelligence intelligible with Dr. Rich Caruana. Microsoft Research
Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: A cross-sectional study
The Challenge of Regulating Clinical Decision Support Software After 21(st) Century Cures
AMIA Urges More Work on FDA's Decision Support Guidance | AMIA
NEJM Editor Flip Flops On Data Sharing After Social Media Firestorm. CardioBrief
Patient-Led Data Sharing - A New Paradigm for Electronic Health Data
NCI's Office of Data Sharing: Setting a "Gold" Standard for Childhood Cancer
The Cancer Imaging Archive (TCIA) - A growing archive of medical images of cancer. The Cancer Imaging Archive (TCIA) Initiative - The Cancer Imaging Archive (TCIA) Public Access - Cancer Imaging Archive Wiki
Distributed learning: Developing a predictive model based on data from multiple hospitals without data leaving the hospital - A real life proof of concept
Developing and Validating a Survival Prediction Model for NSCLC Patients Through Distributed Learning Across 3 Countries
Distributed deep learning networks among institutions for medical imaging
The FAIR Guiding Principles for scientific data management and stewardship
Data From NSCLC-Radiomics
Data from NSCLC-Cetuximab
Data from Head and Neck Cancer CT Atlas
Imaging and clinical data archive for head and neck squamous cell carcinoma patients treated with radiotherapy
ASTRO's Advances in Radiation Oncology: Success to date and future plans
Artificial Intelligence in Radiation Oncology Imaging
Establishment of a radiogenomics consortium
Establishment of a Radiogenomics Consortium
Computational methods using genome-wide association studies to predict radiotherapy complications and to identify correlative molecular processes
Machine Learning on a Genome-wide Association Study to Predict Late Genitourinary Toxicity After Prostate Radiation Therapy
Genomics models in radiotherapy: from mechanistic to machine learning
Why Jupyter is data scientists' computational notebook of choice
Applied Data Science with Python Specialization
Multi-institutional Randomized Trial Testing the Utility of an Interactive Three-dimensional Contouring Atlas Among Radiation Oncology Residents
Machine Learning Applications in Head and Neck Radiation Oncology: Lessons From Open-Source Radiomics Challenges
Use of Crowd Innovation to Develop an Artificial Intelligence-Based Solution for Radiation Therapy Targeting
Treatment data and technical process challenges for practical big data efforts in radiation oncology
State of Data Science in Radiation Oncology Workshop. In: NIH National Cancer Institute
National Cancer Institute NCI Budget Fact Book - Research Career "K" Awards - National Cancer Institute
ITCR: innovative algorithms (R21 Clinical Trial Optional)

The authors thank Dr. Erin Gillespie for information on eContour.org. This work has been supported by grants and contracts as follows: JK by the Radiological Society of North America Research & Education Foundation Resident Research Grant #RR1843. JZ is supported by grants from the Chan-Zuckerberg Initiative and the National Institute on Aging P30AG059307. IEN by NIH grants R37-CA222215 and R01-CA233487. SA is supported by the American Cancer Society, National Science Foundation, and the Agency for Healthcare Research and Quality. The contents do not represent the views of the U.S. Department of Veterans Affairs or the United States Government.