key: cord-0741928-sf7yd361
authors: Unberath, Mathias; Ghobadi, Kimia; Levin, Scott; Hinson, Jeremiah; Hager, Gregory D.
title: Artificial Intelligence‐Based Clinical Decision Support for COVID‐19–Where Art Thou?
date: 2020-06-23
journal: Adv Intell Syst
DOI: 10.1002/aisy.202000104
sha: cb6ec7c6077d928706836e91626dc2a4cee375da
doc_id: 741928
cord_uid: sf7yd361

The COVID‐19 crisis has brought about new clinical questions, new workflows, and accelerated distributed healthcare needs. Although artificial intelligence (AI)‐based clinical decision support seemed to have matured, the application of AI‐based tools for COVID‐19 has been limited to date. In this perspective piece, the opportunities and requirements for AI‐based clinical decision support systems are identified and challenges that impact “AI readiness” for rapidly emergent healthcare challenges are highlighted.

To all front line health workers 1 

Prior to January 2020, the artificial intelligence and machine learning (AI/ML) for healthcare community had many reasons to be pleased with the recent progress of their field. Learningbased algorithms had been shown to accurately forecast the onset of septic shock, [1] ML-based pattern recognition methods classified skin lesions with dermatologist-level accuracy, [2] diagnostic AI systems successfully identified diabetic retinopathy during routine primary care visits, [3] AI-based breast cancer screening outperformed radiologists by a fairly large margin, [4] ML-driven triaging tools improved outcome differentiation beyond the emergency severity index, [5] AI-enabled assistance systems simplified interventional workflows, [6] and algorithm-driven organizational studies enabled redesign of infusion centers. [7] Many would have argued that, after nearly 60 years on the test bench, [8] AI in healthcare had finally reached a level of maturity, performance, and reliability that was compatible with the unforgiving requirements imposed by clinical practice.

Today, only a few months later, this rather sunny outlook has become overcast. The world's healthcare systems are facing the outbreak of a novel respiratory disease, COVID-19. As of May 29, 2020 more than 5 800 000 cases of COVID-19-caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) infection-have been reported across 210 countries, leading to more than 360 000 deaths. [9] Although a rapidly emerging infectious disease escalating to a global pandemic represents a rare worst-case scenario, it provides an opportunity to observe the resilience of each and every link in the healthcare chain. Though heavily stressed, most links of this chain withstood the stress test of this first wave of the pandemic, whereas the highly acclaimed AI/ML link appeared to have given in. This breakdown, however, cannot be attributed to a lack of opportunity-on the contrary: Several articles in popular and scientific press describe immediate opportunities for AI-assisted tools to have a positive impact on patient outcomes. These applications range from improved diagnosis, triaging and prognostication [10, 11] to personalized treatment decision support [12] to automated monitoring tools. [13] These AI-based clinical decision support (CDS) are the central focus of this perspective piece.

Why, then, are AI-assisted CDS tools seemingly limited in contributing to the fight against COVID-19?

Although the COVID-19 crisis has brought about new clinical questions, new workflows, and accelerated distributed healthcare needs, the core principles of successful CDS are unwavering [14] indeed they are arguably more important than ever. Foremost, an AI-based CDS must address a clinical decision where variability and uncertainty exists, as achieving reproducibility constitutes a core challenge in the course of the diagnosis or treatment continuum. [15] Many opportunities and unknowns exist in both identifying patients with SARS-CoV-2 and performing optimal treatment while containing transmission.

Clinical needs and opportunities for AI-driven CDS are evolving with the pandemic. Since early on, especially in areas with limited diagnostic testing capacity (including most of the USA), rapid identification of patients with COVID-19 has been a major challenge. New signs and symptoms of the illness are still being described, and guidance on which patients should be considered persons under investigation (PUIs) for COVID-19 or tested for SARS-CoV-2 infection is continuously evolving. [16] Failure to rapidly identify and isolate patients with COVID-19 in the hospital setting creates opportunity for nosocomial infection spread, placing patients and healthcare workers at excess risk. Screening and isolation procedures that lack specificity drive overuse of limited personal protective equipment (PPE) and delays to care for non-COVID-related illness. To date, our approach to screening for COVID-19 has been largely reactive. AI-driven algorithms with capacity to identify aberrant clinical presentations or to quickly identify nonobvious patterns in patients with the disease could shift this approach to a more proactive one. Beyond screening, identification of individual patient factors (e.g., age and comorbidities), symptoms and clinical findings or measures (e.g., vital signs, laboratory and imaging results) that are predictive of clinical deterioration is needed. AI-driven CDS could be used to target earlier intervention or guide disposition (e.g., discharge vs hospitalization) decisionmaking based on these predictions. As the pandemic continues to unfold, clinical needs will also change. For example, AI-driven CDS to support optimal resource allocation, load balancing, or treatment selection may become the predominant need.

Although the application of AI-based CDS for COVID-19 has been limited to date, this is somewhat understandable. AI is highly dependent on sufficiently diverse representative data; AI/ML algorithms cannot be trained and adequately validated without creating large data sets that reflect the clinical use case, and such data will always be limited in the early phases of an emerging infectious disease. Nonetheless, the need and opportunity for improvement in clinical care through AI will continue to increase over the coming months-and those who are well-positioned to accrue and capitalize on data will be able to seize this opportunity.

In addition to capturing data, it is equally important to understand the clinical use case-effective development and deployment of any CDS, including that driven by AI, requires a deep understanding of both the problems of focus and the environment in which it is encountered. For instance, an AI-driven prediction model is effective only when its prediction can be linked to a clear action (e.g., altered clinical decision-making that prevents or better prepares for the predicted outcome) and is embedded seamlessly in the relevant clinical workflow. Rapid identification of the most pressing clinical needs, and understanding which of the many targets for AI that will yield improvements in clinical care and effective development of CDS requires multidisciplinary (clinician-engineer) collaboration. Teams with existing collaborations of this nature, such as in our own Malone Center for Engineering in Healthcare and Center for Data Science in Emergency Medicine, will be best positioned to rapidly respond to and address needs as they emerge. [15] 

As already noted, AI algorithms cannot be trained and validated without sufficiently large quantities of data, but at the same time not all data that may be available is necessarily useful. AI-based CDS tools must be designed with a target population in mind that is determined by the clinical question to be answered. To obtain unbiased estimates of the CDS system's performance in deployment, careful study design is an essential prerequisite. When curating datasets for model development and validation, the in-and exclusion criteria for any sample must be well-defined and precisely aligned with the patient cohort who would later be subject to algorithmic analysis. Collecting data in this manner is time-consuming and requires care and effort, which is especially challenging during a rapidly evolving pandemic situation. It is not surprising that many of the recent articles that describe AI-based CDS tools rely on convenience sampling of data, i.e., dataset curation strategies that prioritize easily available data, e.g., public repositories such as challenges or The Cancer Imaging Archive, [17] without considering the ultimate target population. From an engineering perspective, such studies can confirm the validity of the technical approach, but they cannot provide evidence to support immediate clinical deployment.

Consequently, while there are early reports on the use of AI-based prediction models for detection and prognosis of COVID-19, the reported results have a high risk of bias and are probably overly optimistic. [18] It is paramount that these promising technical feasibility studies are followed up with carefully designed validation studies to demonstrate the model's usefulness in the target population. The challenges that this creates are not new. They have already been recognized more than two decades ago by Kassirer [19] who concludes: "Typically, new models were tested, found to exhibit interesting characteristics, and then abandoned when their developers found that the next steps were far harder than designing the first model". As highlighted in Section 2, CDS tools are urgently needed and careful reporting of the study and dataset design, e.g., following transparent reporting of a multivariable prediction model for individual prognosis or diagnosis TRIPOD, [20] is critical to interpret the clinical implications of the proposed CDS tool.

Irrespective of the sampling strategy, data collection for AI-based CDS development involves the aggregation and preconditioning of large amounts of medical records. Information in patient medical records is considered highly sensitive and private information that must be protected-these standards are governed by the well-known Health Insurance Portability and Accountability Act (HIPAA) privacy act. Because research using identifiable protected health information qualifies as human subjects research, any such study must be approved by the respective Institutional Review Board (IRB), which are established to protect the welfare, rights, and privacy of human subjects. Although research on fully anonymized patient data does not qualify as human subjects research and is thus exempt, this determination cannot usually be made by the study team but is reserved for the IRB.

A major challenge for all data-driven investigations is the growing evidence that the technical feasibility of fully automated anonymization [21, 22] is questionable. This in turn complicates a straightforward "exempt" declaration, suggesting that new study protocols will have to undergo full IRB consideration. If the IRB is operating at capacity, it can take multiple weeks before a final decision is rendered, and if positive, data collection can begin. Furthermore, in light of the COVID-19 state of emergency, many new studies are being submitted for review and IRBs must prioritize protocols that promise direct and immediate benefits to participants, such as controlled trials of emerging treatments. [23, 24] www.advancedsciencenews.com www.advintellsyst.com

This circumstance may further delay the approval of studies that have low immediate benefit to the subjects, which is the case for most AI algorithm development. Consequently, even if predictor and outcome variables are known immediately after onset, thus enabling multidisciplinary teams to define value-creating AI use cases, the hands-on work toward this goal will likely be delayed substantially due to IRB and data trust related processes. This circumstance seems to be supported by observing the delay between the onset of the epidemic in late December 2019 and the number of publications describing ML algorithms for the automated interpretation of imaging data that only surged in late April 2020. Althoughe the authors unanimously advocate for the importance of IRB and data trust review of studies to protect human subjects, it is becoming increasingly clear that current practices and processes do not easily support the rapid development and deployment of AI technology. To promote AI readiness, we should consider the implementation of IRB subcommittees dedicated to AI algorithm development with specialized study protocol templates that allow for rapid turnaround review of data science-related projects. If indeed the implementation of AI and evidence-based personalized medicine in routine clinical care is an institutional priority, then clear guidelines and preapproved mechanisms should be created that empower teams to move quickly and have impact while, by design, complying with all ethical standards.

Combating COVID-19 has been challenging both due to the large number of asymptomatic patients who can spread the virus and the highly variable impact of SARS-CoV-2 on different patients. The questions that need answering range from identifying and testing patients, therapeutic treatments, to protecting healthcare workers, and, perhaps more challenging, predicting and responding to the healthcare need which triggered the "flatten the curve" response and the widespread shutdowns to keep the healthcare systems afloat. Although reducing physical distance is an effective way to curb the sudden impact on healthcare systems, it comes at a large economical and societal cost. Alternative approaches, however, require a thorough understanding of the impacts of COVID-19 on healthcare systems operations, diagnoses, and treatments, as well as devising smart testing and tracing responses. This approach requires proactive actions that are informed through a robust data infrastructure. It is overly simplistic to assume that a global pandemic could be controlled without proper use of the growing amount of data and knowledge that is generated by the minute. At the same time, large and fast-paced data and knowledge aggregations cannot be processed without the help of AI, which in return, is empowered by a reliable data infrastructure that is accessible to researchers in a timely manner. This access should be significantly expedited by establishing robust pathways for data sharing and usage, including the standardization of data formats.

To enable such an infrastructure, multiple data components are needed. Consider for instance, a dataset on hospital operations that summarizes basic information such as employees shift work or the number of utilized resources, including PPEs-As trivial as this data sounds, it has proven a challenge to gather this data in a reliable manner, and lack of such data is ever more felt when a crisis as the current pandemic hits and the need for fast load-balancing and resource allocation rise seemingly overnight. Another important component of any data infrastructure is the availability of clinical data in a standard form that can be shared in an anonymized manner across healthcare systems. This data is particularly important in informing diagnosis and response to therapy-one of the early challenges that clinicians faced during the COVID-19 pandemic. Although some systems exist that capture similar data, they are often incomplete and siloed to the particular organization that generated the data, with accompanying challenges including IRB approvals. Rarely can such datasets be shared on a large scale across multiple systems, although there are noteworthy exceptions. [25] An important differential characteristic of an ideal data infrastructure and a key point that enables healthcare systems to respond quickly and proactively, is that such data infrastructure should be devised with data sciences and AI in mind, as opposed to considering it as an afterthought to what organically happens in healthcare environments. An infrastructure that is multidisciplinary and is informed by not only the clinicians and medical researchers but also data scientists, engineers, managerial sciences, and public health experts is the one that has a better chance to stand the test of time.

As the pandemic progresses and as more data is acquired, becomes available for research, and is shared, it is likely that AI's contribution to combating the COVID-19 crisis will grow. In nonhuman subjects research tasks, such as language processing or computational biology, AI has already started to make an impact by extracting key findings from the ever increasing body of literature on COVID-19 [26] or by trying to understand the protein structure of SARS-CoV-2 to drive drug discovery. [27] When considering AI/ML for healthcare, we note a number of recent developments that seek to address some of the roadblocks identified herein. Commendable examples include Food and Drug Administration's request for feedback on their proposed regulatory framework for AI/ML-based software as medical device, [28] and the Office for Human Research Protections' Exploratory Workshop Series on privacy and health research, which discusses potential operational solutions to the challenges IRBs face when reviewing "big data research". [29] Specifically for CDS, AI's notable absence in the early phases of this initial surge will likely not be the whole story. There are pockets of examples [30, 31] where AI is being used to support COVID-19-related patient care right now, and many more examples will follow.

Once the dust of this first wave of the pandemic has settled, we, as a community, should spend some time to identify and analyze the organizational, institutional, or regulatory hurdles that this healthcare crisis has highlighted, as well as the solution paths that emerged to bring AI-based CDS systems for COVID-19 to the bedside. These insights should contribute to an open discussion of how we can improve on the AI readiness of current practices and protocols. After all, who knows when the next severe test comes forth and what it will look like. Let's be prepared!

Proc. IEEE 2019

IEEE Spectrum-AI Can Help Hospitals Triage COVID-19 Patients

Center for Disease Control and Prevention: Coronavirus Disease

COVID-19 Research Database

Kaggle -COVID-19 Kaggle Community Contributions

US Food and Drug Administration: Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD)-Discussion Paper and Request for Feedback

Office for Human Research Protection Exploratory Workshop-Privacy and Health Research in a

Thirona-Artificial intelligence to Screen for COVID-19 on CT-and X-ray Images

Business Insider-OSF HealthCare Partners with StoCastic to Use New AI System to Move Coronavirus Triage Out of the Hospital

All authors have contributed equally to this manuscript. This article was supported by institutional funds provided by the Malone Center for Engineering in Healthcare.