title: Artificial intelligence extension of the OSCAR-IB criteria
authors: Petzold, Axel; Albrecht, Philipp; Balcer, Laura; Bekkers, Erik; Brandt, Alexander U.; Calabresi, Peter A.; Deborah, Orla Galvin; Graves, Jennifer S.; Green, Ari; Keane, Pearse A.; Nij Bijvank, Jenny A.; Sander, Josemir W.; Paul, Friedemann; Saidha, Shiv; Villoslada, Pablo; Wagner, Siegfried K.; Yeh, E. Ann
journal: Ann Clin Transl Neurol
date: 2021-05-19
DOI: 10.1002/acn3.51320

Artificial intelligence (AI)-based diagnostic algorithms have achieved ambitious aims through automated image pattern recognition. For neurological disorders, this includes neurodegeneration and inflammation. A scalable imaging technology for big data in neurology is optical coherence tomography (OCT). We highlight that the OCT changes observed in the retina, as a window to the brain, are small, requiring rigorous quality control pipelines. There are existing tools for this purpose. Firstly, there are human-led, validated consensus quality control criteria for OCT (OSCAR-IB). Secondly, these criteria are embedded in OCT reporting guidelines (APOSTEL). The use of the described annotation of failed OCT scans advances machine learning. This is illustrated through the present review of the advantages and disadvantages of AI-based applications to OCT data. The neurological conditions reviewed here for the use of big data include Alzheimer disease, stroke, multiple sclerosis (MS), Parkinson disease, and epilepsy. It is noted that while big data is relevant for AI, ownership is complex. For this reason, we also reached out to involve representatives from patient organizations and the public domain, in addition to clinical and research centers. The evidence reviewed can be grouped in a five-point expansion of the OSCAR-IB criteria to embrace AI (OSCAR-AI). The review concludes with specific recommendations on how this can be achieved practically and in compliance with existing guidelines.

Sophisticated artificial intelligence (AI)-based algorithms not only enable discrete layer segmentation of retinal optical coherence tomography (OCT) scans, but can also identify novel retinal features. A prominent recent example is automated clinic referral via deep learning based on retinal layer analysis. 1 The current OSCAR-IB quality control (QC) criteria take algorithmic quality components into account (the A-criterion), but the criteria are not adapted to AI-based algorithms. 2 Given the recent success and rapid developments in this field, it is timely to build on the OSCAR-IB QC criteria to address the challenges of AI and big data specifically.

To this purpose, it is critical to acknowledge that accuracy is paramount to the interpretation of retinal OCT in neurological disease. Judgments are highly dependent on quantitative data of individual retinal layers. Key components are thickness, degree of change, and alteration of the topography. The retinal layer thickness changes seen in neurological disorders are much more subtle 3-7 than the pathologies seen in the ophthalmologic diseases now successfully detected by AI-based methods. 1, 8, 9 For neurodegenerative diseases, relevant annual retinal layer atrophy rates are just above the axial image resolution of contemporary spectral-domain and swept-source OCT techniques. 5 For this reason, image QC is paramount. Over the past decade, big OCT data have accumulated in neurodegenerative and neuroinflammatory diseases.
These data are attractive for the development of AI strategies. The expectation is to improve the accuracy of OCT-based quantification and of diagnostic sensitivity and specificity, and to discover novel surrogates for monitoring disease progression as well as outcome metrics for clinical trials. Fully automated AI-based strategies are transferable from highly specialized services to primary care. The test throughput is also scalable to include, for example, high street opticians. Both will aid the logistics of patient care through local centers.

In 2012, we proposed the first consensus OCT QC criteria, OSCAR-IB. 2 The name served as a mnemonic for seven distinct QC criteria: (i) Obvious errors, (ii) Signal strength, (iii) Centration of scan, (iv) Algorithm failure, (v) Retinal pathology, (vi) Illumination, and (vii) Beam placement. This was followed by international validation 10 and endorsement in reporting guidelines. 11 The OSCAR-IB QC criteria were developed in a multiple sclerosis (MS) network and have since been broadly accepted. This success is at least in part due to the demand for clarity and transparency, and to the practicality of "getting simple things right". A similar approach is warranted for the role of AI in relation to OCT data in neurodegenerative diseases.

AI-based strategies are at risk of propagating systematic errors, imbalance, and bias due to subtle differences in data acquisition or postprocessing. There is justified concern about the lack of QC standards for big data. 12, 13 The rise of big data is in part driven by the hope of improving P4 medicine: predictive, preventive, personalized, and participatory care. 14 For example, integrative prediction models encompassing as many as 63 variables have been proposed to enable personalized predictions of individuals' outcomes and guide treatment decisions in myeloproliferative neoplasms. 15 The implications that AI-driven approaches will have for individuals can easily be influenced by bespoke sources of bias fed into the model, as discussed below.

Importantly, minimal variation in image acquisition can cause substantial errors in the quantitative data. 16, 17 Strategies based on AI are excellent at recognizing changes between images, but do not necessarily know how the human OCT operators have acquired an image. This can mislead the AI-based strategy, with a downstream effect of possible misdiagnosis, mismanagement, and harm. The risk of such a situation occurring increases with the rapidly rising numbers of OCT scans to be evaluated. It may introduce systematic errors if imbalances exist between populations and centers, for example, due to service capacity issues or automation of our health-care systems. The possible medico-legal ramifications are also evident.

To date, OCT data in MS and related disorders are the most consistent, as most reports adhered to the OSCAR-IB QC criteria and followed the APOSTEL reporting guidelines. 2, 11 Results are more heterogeneous for other neurological diseases because of a lack of standardization. There is evidence for an early publication bias in Alzheimer disease (AD) reports. 6 Subsequent data were not supportive of the earlier enthusiasm. Few of the reports on AD followed a rigorous QC approach. This similarly applies to reports of OCT in Parkinson disease (PD), 18, 19 amyotrophic lateral sclerosis (ALS), 20-23 stroke, 24 epilepsy, 4, 25 and schizophrenia. 26

There are critical successes for the use of AI in neurological disease.
Examples include urgent triaging of individuals from brain imaging to neurosurgery, 27 earlier diagnosis of AD, 28 identification of suitable candidates for epilepsy surgery, 29 and regulation of adaptive deep brain stimulation in movement disorders. 30 Imaging-based trial outcome measures in neurology include almost all neurodegenerative, neurovascular, and neuroinflammatory conditions alongside tumors. 31 Imaging data have become multimodal. This adds to the complexity and the time needed for human reading and reporting. Likewise, histological data can now be used for machine and deep learning. 32

The review committee

The ophthalmological community has driven advances in AI-based analysis of retinal OCT. 1 The committee for this review has been expanded (Table 1). We included representatives from neurological sub-specialties who have used retinal OCT data for diagnostic and prognostic purposes, as well as for treatment trial outcome measures. We have also engaged with experts in the fields of AI, bio-engineering, and bio-statistics, and with two not-for-profit organizations (www.ern-eye.eu and www.imsvisual.org).

The importance of patient involvement as a key stakeholder has been recognized 33 and has contributed to the development of conceptual models. 34 On a day-to-day practical level, experience has demonstrated that individuals tolerate retinal OCT well. It is noninvasive, noncontact, quick, and provides instant feedback. The possibility of displaying images directly to individuals and discussing changes has given them more confidence in, and insight into, their care. 35 This good partnership has helped in working together to build trust, supporting treatment decisions, and making OCT scans available for research. There is a need to maintain this mutual trust at a time when the immense amount of data accumulated now permits AI-inspired projects on big data. None of this is possible without patient participation, consent, and feedback.

A key concern of patients and their advocates is that their data will be misused. Individuals have a higher level of confidence in not-for-profit stakeholders than in government or private companies. 36 Because very large training datasets are required for optimal performance, most current clinical AI systems have been developed using routinely collected data which have been anonymized. Anonymization of medical images presents specific challenges, however, particularly for images of individually unique structures such as the neurosensory retina. 37 Even when carefully anonymized, there is at least a theoretical risk of re-identification for such images, either now or with some future technology. 38 Therefore, we recommend a multistep approach to addressing data protection and privacy. Firstly, retinal OCT scans should be anonymized according to current national and international standards. 39 This includes removal of any imaging meta-data such as patient names, dates of birth, or medical record numbers, obscuration of hospital visit dates, plus careful consideration of any associated clinical meta-data (e.g., merging of categories/classes if they contain only a limited number of examples). 40 Secondly, a range of additional safeguards should be put in place. Technical safeguards include the requirement to store data in trusted research environments with access controls and audit logs; contractual safeguards include prohibitions against linkage or attempted re-identification of data.
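A minimal sketch of the first of these steps (removal of direct identifiers, obscuring of visit dates, and merging of sparsely populated clinical categories) is shown below. The field names and the minimum category count are hypothetical and not taken from any specific standard; the national and international anonymization standards cited above remain authoritative.

```python
"""Minimal sketch of meta-data anonymization before data sharing.

All field names and thresholds are hypothetical; real OCT exports differ by
vendor, and the applicable anonymization standards must be checked per study.
"""
from collections import Counter

# Direct identifiers that are removed outright (hypothetical field names).
DIRECT_IDENTIFIERS = {"patient_name", "date_of_birth", "medical_record_number"}


def anonymize_scan_metadata(meta: dict, date_offset_days: int) -> dict:
    """Return a copy of the scan meta-data with direct identifiers removed
    and visit dates obscured by a per-patient offset."""
    clean = {k: v for k, v in meta.items() if k not in DIRECT_IDENTIFIERS}
    if "visit_date_ordinal" in clean:
        # Shifting all of one patient's visits by the same offset preserves
        # the intervals between scans while obscuring the true calendar date.
        clean["visit_date_ordinal"] += date_offset_days
    return clean


def merge_rare_categories(labels: list, min_count: int = 10) -> list:
    """Merge clinical meta-data categories with few examples into 'other',
    reducing the re-identification risk posed by very small classes."""
    counts = Counter(labels)
    return [lab if counts[lab] >= min_count else "other" for lab in labels]


if __name__ == "__main__":
    scan = {
        "patient_name": "REDACTED EXAMPLE",
        "date_of_birth": "1970-01-01",
        "medical_record_number": "0000000",
        "visit_date_ordinal": 737900,  # ordinal day number of the visit
        "device": "spectral-domain OCT",
    }
    print(anonymize_scan_metadata(scan, date_offset_days=-123))
    print(merge_rare_categories(["MS", "MS", "NMOSD", "MS", "MOGAD"], min_count=2))
```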
Importantly, every attempt should be made to minimize the data shared to that required for the clinical or research purpose; this is a fundamental principle of many data protection regulations, including the European General Data Protection Regulation (GDPR). Finally, and perhaps most importantly, it is vital to engage in patient and public engagement and involvement at the earliest possible stage. This includes making patients aware that their data are being used for research, publishing study protocols, and giving patients the opportunity to opt out. By adopting a cautious and engaged approach such as this, we believe it is possible to reduce any data protection risks while maximizing the potential for future patient benefit. In the future, a range of technical solutions, including federated learning and homomorphic encryption, should help to further mitigate these risks. 41

We reviewed three databases, PubMed, Web of Science, and Google Scholar, between 1 January 1963 and 23 April 2020 without language restriction. We chose the English version of a manuscript if the same group had published similar data in Dutch, French, German, Italian, or Spanish. The search terms used were "optical coherence tomography" or "OCT" combined with "artificial intelligence", "machine learning", "deep learning", "multiple sclerosis", "optic neuritis", "dementia", "Alzheimer", "Parkinson", "motor neuron disease", "amyotrophic lateral sclerosis", "stroke", "cerebrovascular accident", "schizophrenia", and "patient voice". We also reviewed articles included in three previously conducted systematic reviews. 3, 5, 8

Methods

Firstly, we reviewed the original OSCAR-IB criteria to clarify which of the QC failures require an individual to be re-assessed or to be excluded if, for example, post hoc homogenization approaches fail. Having to recall a patient because of a failed test is undesirable, problematic, and expensive. Secondly, we reviewed approaches to rectifying QC failures by image postprocessing. Thirdly, we examined the outcome of our AI-based methods for irregularities, identical to the approach taken in the original OSCAR-IB report. 2 The terminology explicitly related to AI is summarized in Table 2.

Firstly, there is a clear and justified fear of the misuse of big data. 37 Secondly, the patient-physician relationship must be supported to provide an optimal experience. Thirdly, demonstration of the capability of the AI strategy enhances the ability to produce high-quality and relevant effectiveness research. Fourthly, it promotes accountability. Fifthly, it provides grounding for the production of reproducible studies. Together, the definition of QC for AI can be summarized by five pillars, which were named individually or in combination in the literature reviewed (Figure 1). The mnemonic RASCO stands for Reproducibility (R), Accountability for decisions made (A), being Supportive of the patient-physician relationship (S), Capability ranging from machine learning (ML)-supported OCT quality control assessment to time- and resource-efficient decision-making (C), and Openness with and trust in public opinion (O); the latter is pertinent given the personal data protection issues discussed above. 42

The utility of AI in medical applications is more dependent on data quality than on quantity. The new research field of big data has contributed considerably to the advancement of medical science through the analysis of large datasets.
Until recently, it had not been easy to accumulate enough data to create a large data repository, and analyses were too complicated or lacked statistical and computing power. A critical area of weakness of big data can be the granularity and quality of the source data entered. In essence, the quality of the outputs or results of AI-based assessments should not be expected to exceed the underlying quality of the data being analyzed (input data). This underpins the importance of maintaining the highest standards of quality, even in the AI space. As we are at the dawn of AI for OCT research, one of our aims is to facilitate the generation of the high-quality data needed for future research in the field.

Each OCT scan should prospectively be labeled as passing or failing QC. There are several reasons why a scan may fail QC. Each failed scan should be annotated with a complete list of reasons. An efficient way is to use the capital letters of the OSCAR-IB criteria. 10 To avoid potential bias from eliminating scans from the sickest individuals, who may have difficulties with the test, such exclusions need to be explicitly noted. Retinal and systemic co-morbidities require careful clinical evaluation with more in-depth ophthalmic phenotyping than hitherto done in most neurological studies. 43 Annotated QC failures fall into two categories: (1) where the error can be rectified by image postprocessing, and (2) where the error requires recalling the patient and repeating the test.

Human-led OCT image QC is a time-consuming task, so it is desirable for this to be performed using AI strategies. We suggest making use of the above-described annotation of failed scans for ML of the OSCAR-IB criteria (Figure 2). This will enable the training of future AI algorithms to separate good from insufficient quality OCT scans. The next application step, within the pool of scans designated as being of inadequate quality, will be to identify those scans which may be subjected to post-acquisition correction approaches, thereby making them high quality and enabling their safe and accurate utilization. This is a crucial step as it allows for AI training in auto-correction. Scans which failed OSCAR-IB and are not correctable must be excluded from any further AI steps. Taken together, this leaves a staged approach to QC in AI: (1) automated AI OCT QC rating using validated OCT QC criteria 2, 10 ; (2) where possible, AI QC correction during image postprocessing, or if not possible, patient recall for repeat acquisition; and (3) a final step of AI-based image analysis. This will typically make use of pattern recognition and be the key step forward for the primary research questions.

Table 2 (excerpt). Unsupervised ML: tries to discover previously undetected patterns in a dataset. Over-fitting: can be a problem with ML, a source of over-enthusiastic reporting and a reason for lack of reproducibility.
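To illustrate, a minimal sketch of this staged approach is given below, using the capital letters of the OSCAR-IB criteria as annotation labels. The function names, the data structure, and in particular the assumption about which failure types are correctable by postprocessing are hypothetical and serve illustration only.

```python
"""Minimal sketch of the staged QC approach described above: (1) automated
OSCAR-IB-style QC rating, (2) postprocessing correction or patient recall,
(3) image analysis only on scans that passed QC."""
from dataclasses import dataclass, field

# Capital letters of the OSCAR-IB criteria, used here as annotation labels.
OSCAR_IB = {
    "O": "obvious errors",
    "S": "signal strength",
    "C": "centration of scan",
    "A": "algorithm failure",
    "R": "retinal pathology",
    "I": "illumination",
    "B": "beam placement",
}

# Hypothetical assumption: which failure types a postprocessing step can repair.
CORRECTABLE = {"A", "I"}


@dataclass
class Scan:
    scan_id: str
    qc_failures: list = field(default_factory=list)  # e.g. ["S", "B"]


def stage_qc(scans):
    """Sort scans into pass / correct-and-reuse / recall groups."""
    passed, correctable, recall = [], [], []
    for scan in scans:
        invalid = [c for c in scan.qc_failures if c not in OSCAR_IB]
        if invalid:
            raise ValueError(f"unknown QC label(s) {invalid} on {scan.scan_id}")
        if not scan.qc_failures:
            passed.append(scan)                       # stage 3: analysis
        elif all(c in CORRECTABLE for c in scan.qc_failures):
            correctable.append(scan)                  # stage 2: postprocessing
        else:
            recall.append(scan)                       # stage 2: repeat acquisition
    return passed, correctable, recall


if __name__ == "__main__":
    cohort = [Scan("001"), Scan("002", ["A"]), Scan("003", ["S", "B"])]
    ok, fix, redo = stage_qc(cohort)
    print([s.scan_id for s in ok], [s.scan_id for s in fix], [s.scan_id for s in redo])
```

In such a scheme, only scans in the "pass" group, together with successfully corrected scans, would feed into the final AI-based image analysis step.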
A limitation to keep in mind is that we have not identified reports on the vulnerability of algorithms to misclassification due to the use of different OCT devices or software versions. Even seemingly small updates have the potential to cause significant differences which, if left unnoticed, can bias results. 44

The definition of ground truth is disease-specific. It should be stated explicitly how the ground truth was defined. At a minimum, for AD and other neurodegenerative dementias, epilepsy, MS, optic neuritis (ON), neuromyelitis optica spectrum disorder (NMOSD), and PD, adherence to consensus investigation protocols and diagnostic criteria will be required. As diagnostic criteria in most neurological diseases are regularly updated, this needs to be taken into account.

The descriptive statistics reviewed were mostly based on binary classifiers, such as a disease being present "yes/no". These models should include a comment on proportional bias. 45 This is needed to interrogate how much the AI-based prediction agrees with the ground truth. The definition of an acceptable ground truth needs to include the level of evidence on which it was based. For binary and multiclassifier models, the degree of inter-rater agreement should be stated to permit judging how stable the ground truth is. Graphs can be presented in a way that allows judgment of the degree of over-fitting and underestimation, relevant when comparing differences between AI and ground truth. Many studies used Bland-Altman plots 46 or analyzed the performance of AI against ground truth based on a receiver operating characteristic (ROC) curve-based area under the curve (AUC). This gives comparative estimates of sensitivity, specificity, and the positive predictive value (PPV) as a measure of overall accuracy. This is particularly relevant for relatively rare diseases. It was recommended to complement area under the ROC curve (AUROC) values with precision-recall (precision is the PPV and recall is the sensitivity in the AI literature) curves (AUPRC). 8 This was found to be of relevance for unbalanced datasets (substantially more subjects in one of the groups being compared). Reporting of the calculation of cut-off values included the use of independent cohorts, a graphical ROC-based approach, the Youden index, k-fold cross-validation, or hold-out validation approaches to obtain accurate estimations of AI-based cut-off performance.

Figure 2. The capability of AI to contribute to interpreting OCT images depends on the optimization of each step in the decision tree. The first step relates to the quality of the raw data. Validated QC criteria for OCT images have been summarized as OSCAR-IB. 2 The ground truth of whether or not an OCT passes QC is based on human assessment. The seven OSCAR-IB criteria for QC rejection by a human assessor can directly be used to train AI. Annotation of corrupted OCT scans permits two outcomes: (1) image postprocessing and repair of artifacts, or (2) complete rejection and (if feasible) recall of the patient for an OCT rescan. Only a dataset that has passed OCT image QC should be used for further AI interpretation.

We did not yet find consistent reports of the inclusion of power calculations in studies, which are relevant for randomized controlled trials using AI-based outcome measures. 47 It is recommended that sample size estimates be performed before developing an algorithm and repeated after study completion. The gain in power, meaning a more robust statistical result, is just as informative for future research as the potential cost savings from optimizing numbers. Lastly, the standardized effect size likely to come from AI was recommended to be aligned with distribution- and anchor-based approaches and with health economics, to inform clinical trials on what will be a realistic difference. 47, 48
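As a minimal illustration of the metric reporting described above, the sketch below computes AUROC, AUPRC, and a Youden-index cut-off for a synthetic, unbalanced binary classification example (scikit-learn is assumed to be available). In practice, and as noted above, cut-offs should be derived in independent or training data and confirmed in held-out data.

```python
"""Minimal sketch: AUROC complemented by the precision-recall area (AUPRC)
for an unbalanced dataset, plus a Youden-index cut-off. The scores are
synthetic and used purely for illustration."""
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score, roc_curve

rng = np.random.default_rng(0)

# Synthetic, unbalanced example: ~5% "disease present", scores from a classifier.
y_true = rng.binomial(1, 0.05, size=2000)
y_score = np.clip(0.3 * y_true + rng.normal(0.35, 0.15, size=2000), 0, 1)

auroc = roc_auc_score(y_true, y_score)
auprc = average_precision_score(y_true, y_score)   # precision-recall area

# Youden index J = sensitivity + specificity - 1, maximised over thresholds.
fpr, tpr, thresholds = roc_curve(y_true, y_score)
best = np.argmax(tpr - fpr)

print(f"AUROC = {auroc:.3f}")
print(f"AUPRC = {auprc:.3f}  (more informative when classes are unbalanced)")
print(f"Youden-optimal cut-off = {thresholds[best]:.3f} "
      f"(sensitivity {tpr[best]:.2f}, specificity {1 - fpr[best]:.2f})")
```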
Cohort description

On review, cohort descriptions mostly conformed to contemporary standards on demographic characteristics. Cohort descriptions are relevant for AI and will also greatly limit or determine the usability of the system. This reinforces the need to build on successful initiatives such as the established Consolidated Standards of Reporting Trials of Electronic and Mobile Health Applications and online TeleHealth (CONSORT-EHEALTH) 49 and the CONSORT-AI guidelines. 50 Documentation of developmental changes is also relevant throughout pediatric care and at the transition to adult care.

A novel source of potential bias relates to the disease diagnostic criteria used. For many of the conditions of interest to retinal OCT, successive diagnostic criteria have been published over the past decades. While generally aimed at improving practicability, sensitivity, and specificity, this bears the risk that cohorts which are supposed to have the same disease can be quite different in their composition. For example, successive diagnostic criteria for MS, AD, and PD have profoundly reshaped the patient base for clinical research over time. Contemporary cohorts tend to be milder than historical cohorts. 51 Clinical trial populations are different from those of observational studies. The co-morbidity burden is relevant. Relevant items for the pooling of big data are: reporting of the exact diagnostic criteria, a detailed listing of all inclusion and exclusion criteria, recruitment, referrals, and the capability of individuals to comply with the examinations. Minimization of the risk of systematic bias will ensure that validation of AI in other cohorts will be comparable.

For all AI algorithm development efforts, the data used for this purpose should be clearly described for discovery and replication analyses. To avoid obtaining a distorted or biased view of performance, data that are used for validation (e.g., to assess performance) should not have been used during algorithm development. There is a real risk of over-fitting the AI models. It was recommended to perform a validation of the algorithm with the aid of a comparable out-of-sample population. Each AI classification scheme should be rated as to whether or not an external validation was performed. This can be supported by publishing details on the building blocks of the AI. Precise and meaningful definitions at the functional and performance levels are relevant. This entails a detailed description of the AI architecture and hyper-parameters, as well as details on how the available data were used to train such systems, preferably via open access code repositories.

One of the challenges found with AI, at the regulatory as well as the clinical level, is the fact that neural networks can continue to learn from data and improve their performance. For this reason, it was suggested to define in advance which type of learning is allowed without requiring renewed validation or approval, and without incurring clinical risks.

It was reported that AI might improve on human performance in terms of accuracy and speed. 52 For this "machine versus human" approach, reporting included data on sensitivity, specificity, and positive and negative predictive values, including the 95% confidence intervals (CI) and the numbers on which the calculations were based. These data permit answering the question of whether AI can outperform humans, not only as seen with chess and Go, 53, 54 but also for the classification of retinal appearances. 55 There is a second, equally important question to be answered: how can AI be used to enhance human performance? 56 Therefore, it was recommended to test whether there is a synergistic effect when the AI and human approaches are combined. This is typically referred to as human-AI symbiosis. 57

The relevance of potential clinical downstream effects has been recognized. 58-60 The big opportunities are to reduce the burden on physicians and to help with service capacity issues. It was recommended to indicate whether an algorithm is useful for clinical practice. This requires testing the algorithm in routine clinical care. There were different levels at which algorithms added information: at an individual level, at a cohort level, or for screening purposes. There can be important consequences for daily clinical care and health systems. Concerns reported related to misdiagnosis and practicability. This has implications for disease classification. 61
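A minimal sketch of the "machine versus human" metric reporting described above is given below: sensitivity, specificity, PPV, and NPV with 95% confidence intervals computed from a confusion matrix. The Wilson score interval is used here as one common choice for proportion confidence intervals, and the counts are invented purely for illustration.

```python
"""Minimal sketch: diagnostic metrics with 95% confidence intervals (Wilson
score intervals) from a confusion matrix on a held-out test set."""
from math import sqrt


def wilson_ci(successes: int, n: int, z: float = 1.96):
    """95% Wilson score interval for a proportion."""
    if n == 0:
        return (float("nan"), float("nan"))
    p = successes / n
    denom = 1 + z ** 2 / n
    centre = (p + z ** 2 / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z ** 2 / (4 * n ** 2)) / denom
    return centre - half, centre + half


def report(tp: int, fp: int, fn: int, tn: int):
    """Print sensitivity, specificity, PPV, and NPV with 95% CIs and
    the numbers on which each calculation was based."""
    metrics = {
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "PPV": (tp, tp + fp),
        "NPV": (tn, tn + fn),
    }
    for name, (k, n) in metrics.items():
        lo, hi = wilson_ci(k, n)
        print(f"{name:11s} {k}/{n} = {k / n:.2f} (95% CI {lo:.2f}-{hi:.2f})")


if __name__ == "__main__":
    # Hypothetical counts for an AI classifier; real counts come from held-out data.
    report(tp=85, fp=20, fn=15, tn=380)
```

Reporting the same table for the human readers, and for the combined human-AI decision, would allow the synergy question raised above to be addressed with the same metrics.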
Guidelines

At the time of the review, the following guidelines were relevant: APOSTEL, TRIPOD-AI, CONSORT-AI, 50 SPIRIT-AI, and STROBE. 62 They are regularly updated, and the latest information can be found on the EQUATOR network website at www.equator-network.org/.

Open access and data sharing were found to be essential for accountability and reproducibility. The classified sample dataset is just as valuable as the developed algorithm. Datasets can potentially be used by other groups to facilitate even greater improvements. Accordingly, data availability may accelerate development in the field. Algorithms can also be transferable and code can be shared.

On review, there is a need to understand on what basis an algorithm came to a particular conclusion. There are a few careful predictions one can make regarding the "black box" for neurodegeneration based on anatomy and progression pattern.

Firstly, anatomically, each area in the retina is connected by axons with a corresponding area in the brain because of the hard-wired retino-cortical projections. 3, 5 The location of damage to the brain will determine the location of expected OCT changes in a defined area of the retina, a "region of interest" (ROI). It has been shown that an ROI-based approach to quantification of inner retinal layer atrophy is superior to the occasionally performed sector analysis 5-7 or to the generally adopted global averaging approach, because the latter can mask small areas of atrophy. 63, 64

Secondly, the progression pattern is determined by the location and size of a lesion damaging the retino-cortical projections. 64-66 The speed of progression is highest, and the area of inner retinal atrophy most extensive, with direct retrograde axonal degeneration as seen with optic nerve damage. More distal brain damage will still cause localized atrophy in the retina by a mechanism called retrograde trans-synaptic axonal degeneration. 65, 67 On sequential OCT imaging, the time course of atrophy is shorter with small brain lesions compared to larger brain lesions. 64 It can be anticipated that a smoldering, slowly enlarging brain lesion will continue to drive the expansion of OCT-detectable retinal atrophy. 68

Thirdly, inflammatory activity in demyelinating disease has been related to a transient increase of the inner nuclear layer (INL) volume. 69-73 Part of this INL thickening is related to the development of microcystic macular edema (MME). 69, 70, 74 Vitreous traction had been implicated, but is not required for the development of MME. 75 In most (>80%) cases, MME is a transient phenomenon. 74 In the remainder, it remains static over the years 74, 76 and is considered by some to represent a retrograde maculopathy 77 due to axonotmesis in the anterior visual pathways, as known from experimental models. 78

Fourthly, there are qualitative observations on the OCT images which have not yet been translated into automated forms of quantification. One example is the presence of hyper-reflective spots. 74
There are two types of these hyper-reflective spots on OCT. One is static and particularly visible at the upper and lower borders of the INL. With the advent of OCT angiography (OCTA) and adaptive optics, it has become clear that these represent reflectivity changes from the inner retinal vasculature. 79 There is at least one other type of hyper-reflective spot, noticed on serial OCT images, which migrates vertically through the retina.

Fifthly, the vitreous has specific OCT signal characteristics which can be reliably quantified from the raw image data. 80, 81 The technique is useful in neurological diseases affecting younger adults, in whom the vitreous body still adheres to the retina, such as the majority of people with MS. 82 The evaluation of the raw OCT data, rather than analysis of an already postprocessed screen image, is required because of the signal changes.

Sixthly, advanced image shape analyses now permit quantitative data on qualitative characteristics of the optic disc. The technique has proved valuable in idiopathic intracranial hypertension 83, 84 and possibly also in idiopathic moyamoya angiopathy. 85 Similarly, the presence of peripapillary hyper-reflective ovoid mass-like structures (PHOMS) is a novel OCT finding, 86 undetectable on conventional funduscopic examination. Likewise, shape analysis of the fovea has become possible. 87, 88

Seventhly, functional assessment of individual retinal layers by OCT is possible using, for example, dark adaptation. 79 One can anticipate that, with the availability of OCTA, the retinal equivalent of a blood-oxygen-level-dependent (BOLD) signal for the brain will emerge. 89 Increased, localized retinal metabolic activity will demand an increased oxygen supply and cause elevated perfusion of the microvasculature. 79 Pioneering data on OCTA in MS imply that there is a need for AI-supported QC to exclude artifacts. 90-92 This will be relevant for reliable quantitative OCTA data on the retinal microvasculature, which may help to differentiate between disease entities such as MS and NMOSD. 93

Eighthly, inter-eye differences of individual retinal layers are an attractive and highly sensitive method to screen for optic neuritis and MS. 43, 94-101 Expanding on these findings, there is a field for AI-based analyses of patterns of retinal asymmetry in MS. 43

Lastly, reflectivity changes of individual layers can be interrogated to estimate tissue properties indirectly. 102, 103

Based on the above combination of numerous quantitative and qualitative changes in retinal (neural and nonneural tissue) architecture in neurological disease, there are promising avenues for a supervised ML approach to the analysis and interpretation of OCT data. Equally, for researchers who prefer to follow an unsupervised ML approach, the committee recommends checking whether findings may be explainable, at least in part, by the above summary of anatomically, biologically, and pathologically plausible observations.
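As one concrete example of these quantitative features, the sketch below computes an inter-eye absolute and percentage difference for a single retinal layer thickness (point eight above). Defining the percentage difference relative to the inter-eye mean is an assumption made for illustration; validated inter-eye difference thresholds from the cited literature should be used in practice.

```python
"""Minimal sketch of the inter-eye difference screening idea (point eight).
The definition of the inter-eye percentage difference used here is an
illustrative assumption, not the published threshold definition."""


def inter_eye_difference(right_um: float, left_um: float):
    """Return (inter-eye absolute difference in micrometres,
    inter-eye percentage difference in percent) for one retinal layer."""
    iead = abs(right_um - left_um)
    mean = (right_um + left_um) / 2.0
    iepd = 100.0 * iead / mean if mean > 0 else float("nan")
    return iead, iepd


if __name__ == "__main__":
    # Hypothetical peripapillary RNFL thicknesses (micrometres) for one person.
    iead, iepd = inter_eye_difference(right_um=98.0, left_um=86.0)
    print(f"IEAD = {iead:.1f} um, IEPD = {iepd:.1f} %")
```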
In summary, we reviewed several levels of AI-based OCT research in neurology. The main points arising from this review are summarized in Table 3 and are based on the five pillars (RASCO). The practical conclusions from the multiple levels of evidence reviewed and the summary table may prove helpful for future research in the field.

Acknowledgments

We (AP and PAK) acknowledge a proportion of our financial support from the National Institute for Health Research. We are grateful to all consortium members, who have contributed on many levels to the conception and development of this manuscript over the past years. Members of the IMSVISUAL and ERN-EYE Consortium include: Orhan Aktas, Jack Antel, Nasrin Asgari, and others (the full list is provided in Text S1).

References

1. Clinically applicable deep learning for diagnosis and referral in retinal disease
2. The OSCAR-IB consensus criteria for retinal OCT quality assessment
3. Optical coherence tomography in multiple sclerosis: a systematic review and meta-analysis
4. Reduction of retinal nerve fiber layer thickness in vigabatrin-exposed patients: a meta-analysis
5. Retinal layer segmentation in multiple sclerosis: a systematic review and meta-analysis
6. Spectral-domain OCT measurements in Alzheimer's disease
7. Retinal layers in Parkinson's disease: a meta-analysis of spectral-domain optical coherence tomography studies
8. A comparison of deep learning performance against health-care professionals in detecting diseases from medical imaging: a systematic review and meta-analysis
9. Artificial intelligence to detect papilledema from ocular fundus photographs
10. Quality control for retinal OCT in multiple sclerosis: validation of the OSCAR-IB criteria
11. The APOSTEL recommendations for reporting quantitative optical coherence tomography studies
12. Expression of concern: Hydroxychloroquine or chloroquine with or without a macrolide for treatment of COVID-19: a multinational registry analysis
13. Cardiovascular disease, drug therapy, and mortality in Covid-19
14. A personal view on systems medicine and the emergence of proactive P4 medicine: predictive, preventive, personalized and participatory
15. Classification and personalized prognosis in myeloproliferative neoplasms
16. A simple sign for recognizing off-axis OCT measurement beam placement in the context of multicentre studies
17. Reliability of intra-retinal layer thickness estimates
18. Optical coherence tomography in Parkinsonian syndromes
19. The neuro-ophthalmological assessment in Parkinson's disease
20. Optical coherence tomography does not support optic nerve involvement in amyotrophic lateral sclerosis
21. Subtle retinal pathology in amyotrophic lateral sclerosis
22. Retinal involvement in amyotrophic lateral sclerosis: a study with optical coherence tomography and diffusion tensor imaging
23. Macular sublayer thinning and association with pulmonary function tests in
24. Eyes and stroke: the visual aspects of cerebrovascular disease
25. Retinal nerve fiber layer thickness in vigabatrin-exposed patients
26. Schizophrenia and the eye
27. Deep learning algorithms for detection of critical findings in head CT scans: a retrospective study
28. TADPOLE Challenge: Accurate Alzheimer's disease prediction through crowdsourced forecasting of future data. Predictive Intelligence in Medicine: Second International Workshop
29. Machine learning for predicting epileptic seizures using EEG signals: a review
30. Toward electrophysiology-based intelligent adaptive deep brain stimulation for movement disorders
31. MRI signatures of brain age and disease over the lifespan based on a deep brain network and 14 468 individuals worldwide
32. Defining and predicting transdiagnostic categories of neurodegenerative disease
33. "The patient is speaking": discovering the patient voice in ophthalmology
34. Stakeholder participation in comparative effectiveness research: defining a framework for effective engagement
35. vision loss from atypical optic neuritis: patient and physician perspectives
36. Share and protect our health data: an evidence based approach to rare disease patients' perspectives on data sharing and data protection - quantitative survey and recommendations
37. Protecting data privacy in the age of AI-enabled ophthalmology
38. Ethics of using and sharing clinical imaging data for artificial intelligence: a proposed framework
39. Information Commissioner's Office. Anonymisation: managing data protection risk code of practice
40. Preparing raw clinical data for publication: guidance for journal editors, authors, and peer reviewers
41. Secure, privacy-preserving and federated machine learning in medical imaging
42. UK government using confidential patient data in coronavirus
43. Retinal asymmetry in multiple sclerosis
44. Software updates of OCT segmentation algorithms influence longitudinal assessment of retinal atrophy
45. Reporting on deep learning algorithms in health care
46. Applying the right statistics: analyses of measurement studies
47. DELTA guidance on choosing the target difference and undertaking and reporting the sample size calculation for a randomised controlled trial
48. Assessing methods to specify the target difference for a randomised controlled trial: DELTA (Difference ELicitation in TriAls) review
49. Improving and Standardizing Evaluation Reports of Web-based and Mobile Health Interventions
50. Reporting guidelines for clinical trials evaluating artificial intelligence interventions are needed
51. Changes in the risk of reaching multiple sclerosis disability milestones in recent decades: a nationwide population-based cohort study in Sweden
52. Human versus machine in medicine: can scientific literature answer the question?
53. Mastering the game of Go with deep neural networks and tree search
54. What Google's winning Go algorithm will do next
55. Optic disc classification by deep learning versus expert neuro-ophthalmologists
56. Artificial intelligence and human trust in healthcare: focus on clinicians
57. Symbiosis between Humans and Artificial Intelligence
58. The frequency of diagnostic errors in outpatient care: estimations from three large observational studies involving US adult populations
59. Evaluation and accurate diagnoses of pediatric diseases using artificial intelligence
60. Is the future of medical diagnosis in computer algorithms?
61. Machine Learning for Clinical Decision-Making: Challenges and Opportunities
62. STrengthening the REporting of Genetic Association Studies (STREGA): an extension of the STROBE statement
63. Evaluation of a region-of-interest approach for detecting progressive glaucomatous macular damage on optical coherence tomography
64. Patterns of retrograde axonal degeneration in the visual system
65. The time course of retrograde trans-synaptic degeneration following occipital lobe damage in humans
66. Progression of anterograde trans-synaptic degeneration in the human retina is modulated by axonal convergence and divergence
67. Retrograde trans-synaptic retinal ganglion cell loss identified by optical coherence tomography
68. Optic radiation damage in multiple sclerosis is associated with visual dysfunction and retinal thinning - an ultrahigh-field MR pilot study
69. Microcystic macular oedema in multiple sclerosis is associated with disease severity
70. Microcystic macular oedema, thickness of the inner nuclear layer of the retina, and disease characteristics in multiple sclerosis: a retrospective study
71. Microcystic inner nuclear layer abnormalities and neuromyelitis optica
72. Retinal inner nuclear layer volume reflects response to immunotherapy in multiple sclerosis
73. Retinal inner nuclear layer volume reflects inflammatory disease activity in multiple sclerosis; a longitudinal OCT study
74. The clinical spectrum of microcystic macular edema
75. Dynamic formation of macular microcysts independent of vitreous traction changes
76. Microcystic macular oedema confirmed, but not specific for multiple sclerosis
77. Microcystic macular edema: retrograde maculopathy caused by optic neuropathy
78. Trans-synaptic retrograde degeneration in the visual system of primates
79. Anterior visual system imaging to investigate energy failure in multiple sclerosis
80. Automated analysis of vitreous inflammation using spectral-domain optical coherence tomography
81. Optimizing OCT acquisition parameters for assessments of vitreous haze for application in uveitis
82. Objective quantification of vitreous haze on optical coherence tomography scans: no evidence for relationship between uveitis and inflammation in multiple sclerosis
83. Optic nerve head quantification in idiopathic intracranial hypertension by spectral domain OCT
84. Optical coherence tomography for the diagnosis and monitoring of idiopathic intracranial hypertension
85. Retinal pathology in idiopathic moyamoya angiopathy detected by optical coherence tomography
86. The Optic Disc Drusen Studies Consortium recommendations for diagnosis of optic disc drusen using optical coherence tomography
87. CuBe: parametric modeling of 3D foveal shape using cubic Bézier
88. Altered fovea in AQP4-IgG-seropositive neuromyelitis optica spectrum disorders
89. Quantitative relations between transient BOLD responses, cortical energetics, and impulse firing in different cortical regions
90. Optical coherence tomography angiography indicates associations of the retinal vascular network and disease activity in multiple sclerosis
91. Alterations in the retinal vasculature occur in multiple sclerosis and exhibit novel correlations with disability and visual function measures
92. Image artifacts in optical coherence tomography angiography among patients with multiple sclerosis
93. Optical coherence tomography angiography (OCTA) in multiple sclerosis and neuromyelitis optica spectrum disorder
94. The investigation of acute optic neuritis: a review and proposed protocol
95. Diagnostic accuracy of optical coherence tomography Inter-Eye Percentage Difference (IEPD) for optic neuritis in multiple sclerosis
96. Optimal intereye difference thresholds by optical coherence tomography in multiple sclerosis: an international study
97. Optical coherence tomography is highly sensitive in detecting prior optic neuritis
98. Optimization of spectral domain optical coherence tomography and visual evoked potentials to identify unilateral optic neuritis
99. Asymptomatic optic nerve lesions
100. Optical coherence tomography for detection of asymptomatic optic nerve lesions in clinically isolated syndrome
101. Optical coherence tomography: a useful tool for identifying subclinical optic neuropathy in diagnosing multiple sclerosis
102. Adaptive optics imaging of the human retina
103. Altered ellipsoid zone reflectivity and deep capillary plexus rarefaction correlate with progression in Best disease

Conflicts of Interest

A. Petzold is part of the steering committee of the ANGI network, which is sponsored by ZEISS, and of the steering committee of the OCTiMS study, which is sponsored by Novartis, and reports speaker fees from Heidelberg Engineering. P. Albrecht reports consulting fees, research grants, and nonfinancial support from Allergan, Biogen, Celgene, Ipsen, Merck Serono, Merz Pharmaceuticals, Novartis, and Roche, and consulting fees and nonfinancial support from Bayer Healthcare and Sanofi-Aventis/Genzyme, outside the submitted work. L. Balcer reports personal fees from Biogen; she is editor in chief of the Journal of Neuro-Ophthalmology. E. Bekkers has nothing to disclose. A. Brandt is cofounder and shareholder of the startups Motognosis and Nocturne. He is named as inventor on several patent applications describing MS serum biomarkers, perceptive visual computing, and retinal image analysis. R. Bremel has served as a consultant for Biogen, EMD Serono, Genzyme/Sanofi, Genentech/Roche, Novartis, and Viela Bio. He receives ongoing research support directed to his institution from Biogen, Genentech, and Novartis. P.A. Calabresi has received consulting fees for serving on scientific advisory boards for Biogen and Disarm Therapeutics, and is PI on grants to Johns Hopkins from Biogen, Genentech, and Annexon. O. Galvin has nothing to disclose. J.S. Graves has grant/contract research support from the National MS Society, Biogen, and Octave Biosciences. She serves on a steering committee for a trial supported by Novartis. She has received honoraria for a nonpromotional, educational activity for Sanofi-Genzyme. She has received speaker fees from Alexion and BMS and served on an advisory board for Genentech. A. Green reports grants and other support from Inception Biosciences; grants from the National Multiple Sclerosis Society and from the US National Institutes of Health; additional support from MedImmune, Mylan, Sandoz, Dr Reddy, Amneal, Momenta, Synthon, and JAMA Neurology, outside the submitted work; and that the Multiple Sclerosis Center, Department of Neurology, University of California San Francisco has received grant support from Novartis for participating in the OCTIMS study. P.A. Keane is supported by a Clinician Scientist award (CS-2014-14-023) from the National Institute for Health Research. J. Nij Bijvank has nothing to disclose. J.W. Sander has been consulted by and received research grants and fees for lectures from Eisai, UCB, Zogenix, and GW Pharmaceuticals, outside the submitted work. F. Paul receives funding from the Deutsche Forschungsgemeinschaft, the Bundesministerium für Bildung und Forschung, and the Guthy Jackson Charitable Foundation. FC has received consulting fees from Clene, EMD Serono, and PRIME, and is participating as a site investigator in the Novartis-funded OCTIMS study. S. Saidha has received consulting fees from Medical Logix for the development of CME programs in neurology and has served on scientific advisory boards for Biogen-Idec, Genzyme, Genentech Corporation, EMD Serono, and Celgene. He was the site investigator of a trial sponsored by Med-Day Pharmaceuticals, is the PI of investigator-initiated studies funded by Genentech Corporation and Biogen Idec, and received support from the Race to Erase MS foundation. He has received equity compensation for consulting from JuneBrain LLC, a retinal imaging device developer. P. Villoslada has received an honorarium from Heidelberg Engineering in 2014, has received unrestricted research grants from Novartis (including for the OCTIMS study), Biogen, Genzyme, and Roche, and has participated

None.

Supporting Information

Additional supporting information may be found online in the Supporting Information section at the end of the article.
Text S1. List of committee members: IMSVISUAL and ERN-EYE.