key: cord-0263829-fbz5ejt9
authors: Kundu, P. K. G.; Salehi, S. S. M.; Cahn, B. A.; Mazurek, M. H.; Yuen, M. Y.; Welch, E. B.; Gordon-Kundu, B. S.; Schlemper, J.; Sze, G.; Kimberly, W. T.; Rothberg, J.; Sheth, K. N.
title: Point-of-Care MRI with Artificial Intelligence to Measure Midline Shift in Acute Stroke Follow-Up
date: 2022-01-24
journal: nan
DOI: 10.1101/2022.01.22.22269697
sha: 7f01535a38a3e57345ed7bdaaf4745e8d9f99b1d
doc_id: 263829
cord_uid: fbz5ejt9

Background and Purpose: In stroke, timely treatment is vital for preserving neurologic function. However, decision-making in neurocritical care is hindered by limited accessibility of neuroimaging and radiological interpretation. We evaluated an artificial intelligence (AI) system for use in conjunction with bedside portable point-of-care (POC)-MRI to automatically measure midline shift (MLS), a quantitative biomarker of stroke severity. Materials and Methods: POC-MRI (0.064 T) was acquired in a patient cohort (n=94) in the Neurosciences Intensive Care Unit (NICU) of an academic medical center in the follow-up window during treatment for ischemic stroke (IS) and hemorrhagic stroke (HS). A deep-learning architecture was applied to produce AI estimates of midline shift (MLS-AI). Neuroradiologist annotations for MLS were compared to MLS-AI using non-inferiority testing. Regression analysis was used to evaluate associations between MLS-AI and stroke severity (NIHSS) and functional disability (mRS) at imaging time and discharge, and the predictive value of MLS-AI versus clinical outcome was evaluated. Results: MLS-AI was non-inferior to neuroradiologist estimates of MLS (p<1e-5). MLS-AI measurements were associated with stroke severity (NIHSS) near the time of imaging in all patients (p<0.005) and within the IS subgroup (p=0.005). In multivariate analysis, larger MLS-AI at the time of imaging was associated with significantly worse outcome at the time of discharge in all patients and in the IS subgroup (p<0.05). POC-MRI with MLS-AI >1.5 mm was positively predictive of poor discharge outcome in all patients (PPV=70%) and specifically in patients with IS (PPV=77%).

Stroke continues to be a leading cause of death and disability in people of all ages worldwide (GBD 2015 Neurological Disorders Collaborator Group, 2017; Dewan et al., 2018). The monitoring of imaging biomarkers, such as midline shift (MLS), that signify neurological damage after stroke is one of the most prominent challenges in neurointensive care.

Conventional imaging techniques used to monitor these biomarkers, such as computed tomography (CT) and magnetic resonance imaging (MRI), typically require transportation of patients from the intensive care unit to a different location within the hospital, which greatly increases the risk of complications (Jia et al., 2016; Parmentier-Decrucq et al., 2013; Smith et al., 1990).

Point-of-care (POC) versions of conventional imaging modalities, such as transcranial Doppler (TCD) ultrasound (Blanco and Abdo-Cuza, 2018; Lau and Arntfield, 2017), computed tomography (POC-CT) (LaRovere et al., 2012; Peace et al., 2010), and low-field POC magnetic resonance imaging (POC-MRI) (Cooley et al., 2021; Sheth et al., 2020; Turpin et al., 2020), are emerging as potential solutions to increase the availability of imaging at the patient bedside, potentially revolutionizing neurocritical care workflows. However, each of these modalities has its own limitations. TCD is operator-dependent and limited by the size and location of the acoustic windows, the regions where the skull is thin enough for ultrasound to penetrate (Naqvi et al., 2013). POC-CT has inherently low soft-tissue contrast and exposes patients to ionizing radiation (Rumboldt et al., 2009). While the image resolution and number of sequences available are currently limited in POC-MRI relative to conventional, high-field MRI, POC-MRI overcomes the limitations of TCD and POC-CT by offering whole-brain images with excellent soft-tissue contrast that are acquired without ionizing radiation and are not operator-dependent. A preliminary study of neurointensive care patients with neurologic symptoms owing to severe COVID-19 and stroke pathology demonstrated the sensitivity of POC-MRI to neuropathophysiology (Sheth et al., 2020).

One of the benefits of bringing imaging to the patient's bedside in a neurocritical care unit is the speed with which images can be acquired; however, this benefit may be negated if the clinicians must wait for an official read or analysis of the imaging biomarkers from another department.

For example, MLS measurements typically require manual definition of anatomical landmarks or evaluation of images in separate software packages (Liao et al., 2018). Artificial intelligence (AI) provides a means for capturing expert knowledge in an automated image assessment algorithm that has been trained using input from experts and can be incorporated directly into the image workflow.

medRxiv preprint doi: https://doi.org/10.1101/2022.01.22.22269697; this version posted January 24, 2022. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.

AI serves to augment decision making by automating the interpretation of data, thereby complementing or supplementing human evaluation (Hainc et al., 2017). Deep learning (DL), a sub-type of AI, has been used in medical imaging for pathology detection and classification, as well as image filtering (Serag et al., 2019; Taghanaki et al., 2021; Vieira et al., 2017). In supervised DL, artificial neural network models are trained by minimizing the error of predicting 'ground truth' features of interest in training data (LeCun et al., 2015). The combination of POC-MRI and automated biomarker assessment holds great potential for improving the workflow in the neurocritical care setting.

The primary aim of our study was to compare the performance of a supervised DL algorithm trained to automatically measure MLS to that of manual assessment by expert neuroradiologists from POC-MRI data acquired in a cohort of stroke patients in neurocritical care. Additionally, the relationship between the automated MLS measures and clinical outcomes for the patients was assessed.

Our study was conducted under an institutional review board (IRB) protocol approved by the Yale Human Research Protection Program, and written informed consent was obtained from all participants or their legally authorized representatives prior to any research activities.

Between July 2018 and March 2020, all patients who were admitted for stroke to the Neurosciences Intensive Care Unit (NICU) at the Yale New Haven Hospital and had visible brain pathology on conventional neuroimaging (CT or high-field MRI) were screened for the study. Inclusion criteria included age ≥ 18 years, admission to the NICU for stroke, and visible brain pathology on standard of care imaging. Exclusion criteria included code status (i.e., patients that were not clinically stable), isolation requirements (e.g., due to MRSA, C. diff, or E. coli), or the presence of at least one of the following MRI contraindications: cardiac pacemakers or defibrillators, intravenous medication pumps, insulin pumps, deep brain stimulators, vagus nerve stimulators, cochlear implants, pregnancy, and cardiorespiratory instability.

Patient age, primary diagnosis, stroke severity, and discharge outcomes were recorded, if available. Primary diagnoses were either ischemic stroke (IS) or hemorrhagic stroke (HS), and the HS group was subdivided into intraparenchymal hemorrhage (IPH) and subarachnoid hemorrhage (SAH). Stroke severity was measured by the NIH Stroke Scale (NIHSS) and recorded at the time of POC-MRI imaging. Outcomes were recorded at the time of discharge and at 90-day follow-up, if available, using the modified Rankin Scale (mRS; 0-6), which captures the degree of disability or dependence in daily activities of patients with stroke or other causes of neurological disability, where an mRS score of 6 indicates death (Ostwaldt et al., 2018; Quinn et al., 2009; Ropper, 1986; Sulter et al., 1999).

Patient imaging was performed during the follow-up period after initial treatment for acute stroke. Images were acquired at bedside in the Neuro ICU using an FDA-cleared, ultra-low magnetic field (64 mT) portable POC-MRI system (Swoop™, Mk 1.2 RC6.3-7.2 software; Hyperfine, Inc., Guilford, CT, USA) with an 8-channel head coil and a biplanar 3-axis gradient system with peak amplitudes of 26 mT/m (Z-axis) and 25 mT/m (X- and Y-axes). Patients were positioned in the head coil inside the imaging area of the portable POC-MRI while in standard hospital beds (Figure 1a). Ongoing standard of care treatment (i.e., ventilation, intravenous infusions, and telemetry) continued during the imaging exam, and radiofrequency interference cancellation was enabled on the POC-MRI system (Rearick et al., 2017). T2-weighted POC (T2WPOC) images were acquired: repetition time (TR) = 4000 ms; echo time (TE) = 228 ms; inversion time (TI) = 1400 ms; 1.5 × 1.5 × 5 mm³ resolution; 36 slices; and an approximately 5 min scan duration.

All image data were deidentified as part of the IRB-approved protocol, and no personally identifiable information (PII) was accessible in this study. Imaging data were uploaded to a cloud picture archiving and communication system (PACS) for further analysis.

2.3. Additional data sets

2.3.1. Training data sets

Two additional data sets were used to train the MLS-AI models. First, high-field T2W (T2WHF) images publicly available from the Human Connectome Project (n=528) were adapted to match the T2WPOC image resolution and noise content (Van Essen et al., 2013). The T2WHF images were acquired at 3.0 T with the following parameters: TR = 3200 ms; TE = 565 ms; 0.7 × 0.7 × 0.7 mm³ resolution; and a scan duration of 8 min 24 s. Second, low-field T2W (T2WLF) images from the Hyperfine image archival system (n=86) were used. These de-identified images were acquired using the POC-MRI system (Swoop, Mk 1.2 RC6.3-7.2 software; Hyperfine, Inc., Guilford, CT, USA) at a variety of sites and represent a variety of unknown pathologies.

For use in model evaluation only, low-field T2W images from healthy controls (n=10; T2WLFHC) were extracted from the Hyperfine image archival system. These images were acquired under a protocol approved by the New England IRB, and written informed consent was obtained from each participant prior to imaging. Participants were adults, aged 18 years old or older, with a body habitus compatible for scanning inside the POC-MRI. Exclusion criteria included contraindications for MRI and pregnancy. Imaging was performed at Hyperfine, Inc. (Guilford, CT) with a POC-MRI system (Swoop, Mk 1.2 RC6.3-7.2 software; Hyperfine, Inc.; Guilford, CT, USA).

Three independent neuroradiologists (3-5 years of experience each) annotated each image volume included in this study (T2WPOC, T2WLF, T2WHF, and T2WLFHC) using ITK-SNAP (Yushkevich et al., 2016).

MLS-AI estimates were derived from each patient data set (i.e., T2WPOC) with a commercially available AI system (BrainInsight, Hyperfine Research Inc, Guilford CT) using an end-to-end [...] (Ronneberger et al., 2015).

To produce models representative of the image quality of the acquired stroke data set, a two-fold cross-validation experiment, involving model training and evaluation, was performed using the T2WPOC, T2WLF, and T2WHF image volumes and their corresponding annotations. Each independent fold was composed of half of the T2WPOC data (randomly split) and all the T2WLF and T2WHF data sets. The images and annotations in each fold were augmented to further increase the variation present in the data set through random geometric distortions (see Supplemental Information) and then used to train an MLS-AI model. The data sets within each fold were further subdivided, with 80% used for model training and 20% for model validation.
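
The fold construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's actual pipeline: the function name `make_folds`, the case identifiers, the random seed, and the choice to apply the 80%/20% split separately within each image source are all hypothetical, and augmentation is omitted.

```python
import random


def make_folds(poc_ids, lf_ids, hf_ids, seed=0, train_fraction=0.8):
    """Build two folds: each holds half of the POC cases plus all LF and HF cases.

    Assumption: the 80/20 training/validation split is applied per image source.
    """
    rng = random.Random(seed)
    shuffled = list(poc_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    poc_halves = [shuffled[:half], shuffled[half:]]

    folds = []
    for poc_half in poc_halves:
        fold = {"train": [], "val": []}
        # Split each source on its own so every source appears in both subsets.
        for source in (poc_half, lf_ids, hf_ids):
            items = list(source)
            rng.shuffle(items)
            n_train = round(train_fraction * len(items))
            fold["train"] += items[:n_train]
            fold["val"] += items[n_train:]
        folds.append(fold)
    return folds
```

Each fold's model never sees the other half of the POC cases during training, which is what allows those held-out cases to be used for evaluation.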

Training was conducted in steps using batches of training data, with each training step followed by a validation step. The validation step was used to determine if a model updated with a batch of training data was more predictive of ground truth in the independent validation data, in which [...]

To establish a background distribution of MLS-AI on POC-MRI from healthy controls, MLS-AI was also calculated using the T2WLFHC data as input and the average of the models generated above.

To estimate the accuracy of the trained MLS-AI model against MLS estimates from human annotators for each image volume, the absolute difference of the MLS-AI estimate from the average human MLS estimate of all three annotators was computed. To accommodate variations in brain size, the difference was then normalized to the length of the brain midline. This measure of accuracy was calculated during the validation phase of training and for the final evaluation of the POC data.
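
As a worked form of the accuracy measure just described, the sketch below computes the absolute difference between the MLS-AI estimate and the mean of the annotators' estimates, divided by the midline length. The function and parameter names are hypothetical, and all quantities are assumed to be in millimeters.

```python
def normalized_mls_error(mls_ai_mm, annotator_mls_mm, midline_length_mm):
    """Absolute MLS-AI error relative to the mean human estimate, per unit midline length."""
    if midline_length_mm <= 0:
        raise ValueError("midline length must be positive")
    mean_human = sum(annotator_mls_mm) / len(annotator_mls_mm)
    return abs(mls_ai_mm - mean_human) / midline_length_mm
```

Dividing by the midline length makes the measure dimensionless, so errors from small and large brains can be pooled.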

2.7. Intra-rater and inter-method annotation discrepancy

Intra-rater landmark location discrepancy and the discrepancy between landmark locations defined by humans and MLS-AI were calculated as the mean absolute error (MAE) measured in millimeters between the landmark annotations and ground truth locations: MAEHuman and MAEMLS-AI, respectively. (See the Supplemental Information for additional details.) Since comparison to "true" values of radiologic measures is intractable, ground truth locations for each of the three landmarks were established on a relative basis. For the human annotations, ground truth was defined as the mean of the landmark locations from the other two annotators. For MLS-AI annotations (i.e., the locations of the peak probability), ground truth was defined as the mean of the landmark locations from three human annotators.
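
The relative ground-truth scheme above can be sketched as follows. This is an illustration only: the function names are hypothetical, landmark locations are assumed to be coordinate tuples in millimeters, and the error for a single landmark is assumed to be the Euclidean distance to its ground-truth location (the exact definition is given in the Supplemental Information, which is not reproduced here).

```python
import math


def mean_point(points):
    """Component-wise mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def mae_human(annotations):
    """annotations[k][j] is the location (mm) of landmark j from annotator k.

    Each annotator is scored against the mean of the other annotators.
    """
    errors = []
    for k, own in enumerate(annotations):
        others = [a for i, a in enumerate(annotations) if i != k]
        for j, point in enumerate(own):
            truth = mean_point([o[j] for o in others])
            errors.append(math.dist(point, truth))
    return sum(errors) / len(errors)


def mae_model(model_landmarks, annotations):
    """Score model landmarks against the mean of all human annotators."""
    errors = []
    for j, point in enumerate(model_landmarks):
        truth = mean_point([a[j] for a in annotations])
        errors.append(math.dist(point, truth))
    return sum(errors) / len(errors)
```

Note that the two ground truths differ by construction: a human is compared with two annotators, while the model is compared with all three.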

To determine a threshold for dichotomizing MLS-AI to produce a qualitative marker for the presence or absence of MLS, values of MLS-AI at the minimum dimension of image resolution were evaluated for predictive value versus clinical outcome. A logistic regression was performed to evaluate associations between qualitatively worse outcome (mRS>3) and quantitative measurements of MLS-AI (Supplementary Figure 2) . Values of 1.0 mm, 1.5 mm (the minimum image resolution), and 2.0 mm were evaluated for predictive value.

A paired t-test was performed to compare MAEMLS-AI and MAEHuman. A two-tailed Student's t-test was used to compare the means of the distributions of MLS-AI across study groups. The Mann-Whitney U-test was performed to compare rank-sum differences in MLS-AI across study groups. A hypothesis test for non-inferiority of MLS-AI to human annotators was conducted (Walker and Nowacki, 2011). The test compared the MLS-AI model discrepancy (MAEMLS-AI) to the average annotator discrepancy (MAEHuman), as a fraction of average clinical annotator discrepancy, upper bounded by a clinically acceptable relative error, δ=0.2. Noninferiority was established by showing that MAEMLS-AI was significantly less than (1 + δ) MAEHuman at the α=0.05 significance level (see Supplemental Information for further details).
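
One way to test the noninferiority criterion above is a paired bootstrap over cases, sketched below. This is not necessarily the procedure used in the study (which is detailed in the Supplemental Information): the function name, the number of resamples, the seed, and the add-one p-value estimate are assumptions made for illustration, and the inputs are assumed to be per-case errors in millimeters with a nonzero human total.

```python
import random


def noninferiority_bootstrap(err_ai, err_human, delta=0.2, n_boot=2000, seed=0):
    """One-sided bootstrap test of H0: MAE_AI >= (1 + delta) * MAE_Human."""
    if len(err_ai) != len(err_human):
        raise ValueError("per-case errors must be paired")
    rng = random.Random(seed)
    n = len(err_ai)
    margin = 1.0 + delta
    observed_ratio = (sum(err_ai) / n) / (sum(err_human) / n)

    exceed = 0
    for _ in range(n_boot):
        # Resample cases with replacement, keeping AI and human errors paired.
        idx = [rng.randrange(n) for _ in range(n)]
        ratio = sum(err_ai[i] for i in idx) / sum(err_human[i] for i in idx)
        if ratio >= margin:
            exceed += 1

    # Add-one estimate so the reported p-value is never exactly zero.
    p_value = (exceed + 1) / (n_boot + 1)
    return observed_ratio, p_value
```

Noninferiority would be concluded when the returned p-value falls below the chosen α, that is, when resampled ratios rarely reach the margin 1 + δ.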

The relationship between MLS-AI estimates and stroke severity (NIHSS score) and disability at discharge (mRS score) was evaluated using linear regressions for all patients and in IS and HS, separately. A multivariate regression was performed controlling for patient age, as well. mRS was modeled as a dependent variable of MLS-AI. Additionally, we evaluated the qualitative effect of MLS-AI on patient outcome at discharge for IS and HS (binary logistic regression; mRS>3). In a sub-sample of patients with available follow-up clinical data, the relationship between MLS-AI and mRS scores at 90 days post-discharge was examined with linear regression.
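
For the single-predictor case, the linear regression above and a leave-one-out resampling of its slope can be sketched in plain Python. This is a simplified illustration rather than the study's analysis code: the function names are hypothetical, the inputs are assumed to be plain lists, the age covariate and the logistic model are omitted, and the nearest-rank percentile rule is an assumption.

```python
import math


def ols_slope_intercept(x, y):
    """Least-squares fit of y = intercept + slope * x for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def leave_one_out_slopes(x, y):
    """Refit the regression n times, each time omitting one patient."""
    slopes = []
    for i in range(len(x)):
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        slopes.append(ols_slope_intercept(x_rest, y_rest)[0])
    return slopes


def percentile_interval(values, lower=2.5, upper=97.5):
    """Nearest-rank percentile interval of a list of estimates."""
    ordered = sorted(values)

    def pick(q):
        rank = max(1, math.ceil(q / 100 * len(ordered)))
        return ordered[rank - 1]

    return pick(lower), pick(upper)
```

Here `x` would hold the MLS-AI estimates (mm) and `y` the NIHSS or mRS scores; the interval summarizes how much the slope changes when any single patient is left out.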

Ordinary and logistic regression analyses were conducted using statsmodels in Python (Seabold and Perktold, 2010). Regression analysis produced the regression coefficient, p-value (p), and 95% confidence interval (CI; two-tailed), with the 95% percentile CIs of the estimated effect size determined by leave-one-out cross-validation. P-values less than 0.05 were considered statistically significant.

3. Results

A total of 94 patients were scanned with POC-MRI. The average patient was 62 years old and exhibited moderate stroke severity at the time of imaging (mean NIHSS=5) and moderate disability at the time of discharge (mean mRS=3; able to walk independently). Patients with data on age, diagnosis, severity, and disability (n=71) were grouped by primary diagnosis and stroke category: IS (n=38) and HS (n=33 total) with IPH (n=18) and SAH (n=15). Patient age, diagnosis, and stroke severity and disability scores are summarized in Table 1. No adverse events related to POC-MRI were reported.

For each fold, the included image data sets were split 80%/20% for training/validation, respectively: T2WPOC (38/9), T2WLF (66/20), and T2WHF adapted to POC-MRI resolution and noise content (470/58). The accuracy of the MLS-AI estimates was 20.7 ± 9.5% in validation during training and 19.3 ± 9.2% in evaluation.

There was no significant difference between MAEMLS-AI and MAEHuman (0.80±0.76 mm and 0.82±0.88 mm respectively; p=0.79). The disagreement of MLS-AI with the average human expert annotation of individual landmarks was 1.15 mm, while the average discrepancy of the individual human annotators amongst each other was 1.39 mm (annotator discrepancies were 1.32, 1.44, and 1.41 mm for annotators 1, 2, and 3, respectively).

The ratio of the discrepancy of MLS-AI to that of the human annotators was 0.83 (bootstrapped confidence interval at α=10⁻⁵: 0.75, 0.92). Based on noninferiority hypothesis testing, the discrepancy of MLS-AI estimates was noninferior to that of human annotators (p<10⁻⁵; Figure 3).

The mean MLS-AI estimate was 1.37±0.14 mm in all stroke patients. The mean MLS-AI estimates for IS and HS, including IPH and SAH, were not significantly different (1.33±0.18 mm and 1.42±0.18 mm, respectively; p=0.73). See Figure 4. The mean MLS-AI estimate for the ten healthy controls was 1.01±0.41 mm, which was not significantly different from patients (p=0.07). The largest MLS-AI estimates observed in patients and in healthy subjects were 5.50 and 1.96 mm, respectively.

MLS-AI was significantly associated with NIHSS in all patients (p<0.005) and in the IS subgroup (p=0.005) but not in the HS subgroup (p=0.13) (Figure 5b). Patient age was found to be significantly associated with NIHSS (p<0.005). In a multivariate analysis controlling for patient age, NIHSS was associated with MLS-AI in all patients and in IS (both p<0.05) but not in HS (p=0.23) (statistical summary in Supplementary Table 1).

In the univariate analysis, mRS was significantly correlated with MLS-AI in all patients and in the IS subgroup (both p<0.05; Figure 6) but not in HS (p=0.84). Patient age was not significantly associated with mRS, and in a multiple regression model factoring age, MLS-AI remained a significant predictor of mRS in all patients (p=0.04). For the subset of patients with follow-up data (n=26; IS:9, HS:15), a significant association of MLS-AI with 90-day mRS was observed (p<0.05; β=0.687, CI: [0.083, 1.292]).

Larger MLS-AI measurements were significantly associated with worse outcome at discharge in all patients (p<0.05; OR=1.66 [CI: 1.01, 2.72]). Patient sub-groups did not show significant associations between MLS-AI and discharge outcome, with or without correcting for patient age (see Supplemental Table 1 for statistical values). 

3.6. Qualitative threshold for MLS

Qualitatively worse disability was significantly associated with MLS-AI >1.5 mm (Mann-Whitney U-test, p<0.05). The positive predictive value (PPV) of MLS-AI >1.5 mm for significant disability at discharge (mRS>3) was 70% for the entire sample and 77% in IS. The negative predictive value (MLS-AI<1.5 mm) was marginal in both samples (51% and 46%, respectively).
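
The predictive values reported above follow from a 2 × 2 table of dichotomized MLS-AI against dichotomized outcome. A minimal sketch is given below, assuming paired lists of MLS-AI estimates (mm) and discharge mRS scores; the function name and the handling of empty cells are hypothetical choices.

```python
def predictive_values(mls_ai_mm, mrs_scores, mls_threshold=1.5, mrs_threshold=3):
    """Return (PPV, NPV) of MLS-AI > mls_threshold for mRS > mrs_threshold."""
    tp = fp = tn = fn = 0
    for mls, mrs in zip(mls_ai_mm, mrs_scores):
        test_positive = mls > mls_threshold
        poor_outcome = mrs > mrs_threshold
        if test_positive and poor_outcome:
            tp += 1
        elif test_positive:
            fp += 1
        elif poor_outcome:
            fn += 1
        else:
            tn += 1
    # A predictive value is undefined when no case falls on that side of the threshold.
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    npv = tn / (tn + fn) if (tn + fn) else float("nan")
    return ppv, npv
```

Calling the function with thresholds of 1.0, 1.5, and 2.0 mm would reproduce the kind of comparison used to select the dichotomization threshold.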

The results of our study demonstrated that MLS-AI estimates are not inferior to manual MLS measurements made by expert neuroradiologists. In addition, MLS-AI was associated with neurologic status (NIHSS) at the time of imaging and disability at discharge (mRS), before and after controlling for patient age. Furthermore, in a sub-sample of patients with follow-up data, MLS-AI was predictive of disability at 90 days post-discharge.

In practice, the Hyperfine POC-MRI transmits images to the cloud for processing, where MLS-AI models automatically evaluate new images and produce MLS-AI estimates. This approach makes MLS-AI measurement available wherever an internet connection is available, including in the developing world. The noninferiority of the MLS-AI estimates compared to expert neuroradiologist measurements suggests that clinical workflows using MLS-AI estimates could expedite evaluation of POC-MRI. Thus, the integration of AI and POC-MRI could not only save minutes to hours from image acquisition to initial interpretation, but it could also decrease healthcare costs associated with stroke, as well as increase the accessibility and utility of brain imaging.

The association of MLS-AI with outcome was observed in all patients, and specifically in IS, where swelling or edema resulting from neuronal death causes lateral shifts of midline brain structures, leading to functional disability (Adams et al., 1999; Yoo et al., 2013). Both IS and HS groups exhibited comparable distributions of MLS-AI, indicating that IS-specific associations of MLS-AI were not biased by an interaction of diagnosis and MLS-AI effect size. Importantly, no clinical outcome data were used in DL model training, suggesting that the associations with outcomes were unbiased. IS accounts for approximately 87% of stroke occurrences (Ballarin and Tymianski, 2018; Beal, 2010). The stronger association of MLS-AI with IS outcomes suggests that this approach is sensitive to brain edema. Our findings confirm that the training strategy used here rendered a model that was robust to the conditions of bedside imaging in a neurointensive care setting, suggesting a role for POC-MRI and AI in detecting stroke pathophysiology in a general stroke patient group, and in IS specifically.

Larger MLS is known to be associated with neurological deterioration and early mortality in ischemic stroke (Pullicino et al., 1997; Qureshi et al., 2009; Sandoval and Witt, 2008; Sheth et al., 2020; Wijdicks et al., 2014; Yoo et al., 2013) . While the qualitative determination of significant MLS from CT and MRI has been cited as the displacement of midline structures by as much as 12 mm, shifts as small as 2 mm are associated with functional deficits (Ropper, 1986) .

Although the inclusion of additional predictors (e.g., gender, ethnicity, NIHSS) in models for functional outcome has been shown in stroke with larger volumes of infarction (i.e., MLS=8-22 mm), previous studies have mainly used univariate models when associating smaller MLS with functional outcomes (Battey et al., 2014; Ropper, 1986). We found that POC-MRI measures of MLS-AI greater than 1.5 mm, the voxel (i.e., volumetric pixel) size of the T2W imaging sequence used in this study, were predictive of functional outcomes. As patients in the present study were clinically stable and scanned with POC-MRI as follow-up to initial treatment with appropriate intervention (i.e., thrombolytics in IS), MLS was smaller than would be observed in a typical acute stroke population. We hypothesize that MLS-AI may also be sensitive in larger strokes and when used immediately after injury.

Our study did have limitations. Motion and other artifacts corrupted 24% of the images acquired according to the clinical protocol, associated with the inclusion of lower quality images from early iterations of the POC-MRI software. Improvements in image quality and model performance are expected to improve sensitivity to MLS, and thus the prediction of outcomes.

Furthermore, the predictive value for outcomes based on MLS-AI of 1.5 mm may be biased by factors such as partial volume effects, which require further study. Lastly, studying patients with more severe pathologies (i.e., larger MLS) is needed for the further validation of this approach (Battey et al., 2014) . Future studies could include more patient follow-up data to demonstrate the relationship between MLS-AI and long-term outcomes and evaluate MLS as a dynamic process through serial imaging using POC-MRI.

Deep learning models used in brain imaging tasks such as segmentation and noise reduction include deep CNNs, generative adversarial networks, and autoencoders such as U-Net (Çiçek et al., 2016; Ronneberger et al., 2015). ResUNet, an evolution of the U-Net and ResNet architectures (He et al., 2016; Ronneberger et al., 2015), leverages the U-Net design and has been shown to be effective in a variety of 3D image evaluation tasks (Fu et al., 2020; Wolny et al., 2020). Like all MRI, POC-MRI is susceptible to artifacts from motion and interference, where image quality can affect detection tasks. However, our results showed that ResUNet was effective in detecting anatomical landmarks in POC-MRI images while accommodating variance due to errors related to imaging in the open environment. Additionally, the MLS-AI localization accuracy metrics indicated that the MLS-AI model was not overfit and would be able to accurately identify MLS landmarks in novel data.

Limitations of two-fold cross-validation include susceptibility to selection bias and the need for data selection from the same population, in terms of subjects and data quality. These effects were controlled in this study by producing a set of different models trained with random reshuffling of data folds, and final MLS-AI estimates produced as an average of the estimates of the individual models. Using this approach, independent MLS-AI models provided unbiased MLS-AI estimates for each POC-MRI image acquired in this study.

Our study used an integrated approach to diagnostic brain imaging by combining portable point-of-care MRI with AI. We demonstrated the feasibility of using a POC-MRI exam to derive an automated imaging measure reflective of cerebral edema, MLS-AI, with validation against standard scores for clinical outcome, including neurologic status (NIHSS) and discharge functional outcome (mRS). The detection of MLS using POC-MRI acquired at the bedside represents a new opportunity to safely inform treatment planning throughout the progression of stroke using automated imaging methods. Further study may lead to the integration of AI and other POC-MRI-based automated imaging measures into new neurocritical care workflows to improve patient outcomes by decreasing time to treatment and increasing the accessibility of lifesaving brain imaging techniques, not only in stroke, but also in other critical brain injuries.

Baseline NIH Stroke Scale score strongly predicts outcome after stroke: A report of the Trial of Org 10172 in Acute Stroke Treatment (TOAST)

Discovery and development of NA-1 for the treatment of acute ischemic stroke

Brain edema predicts outcome after non-lacunar ischemic stroke

Gender and stroke symptoms: a review of the current literature

Transcranial Doppler ultrasound in neurocritical care

3D U-Net: Learning Dense Volumetric Segmentation from Sparse Annotation

A portable scanner for magnetic resonance imaging of the brain

Estimating the global incidence of traumatic brain injury

Global, regional, and national burden of neurological disorders during 1990-2015: a systematic analysis for the Global Burden of Disease Study

The Bright, Artificial Intelligence-Augmented Future of Neuroimaging Reading

Deep Residual Learning for Image Recognition

High incidence of adverse events during intrahospital transport of critically ill patients and new related risk factors: a prospective, multicenter study in China

Point-of-care transcranial Doppler by intensivists

Deep learning

Brain Midline Shift Measurement and Its Automation: A Review of Techniques and Algorithms

Transcranial Doppler Ultrasound: A Review of the Physical Principles and Major Applications in Critical Care

Comparative Analysis of Markers of Mass Effect after Ischemic Stroke

Adverse events during intrahospital transport of critically ill patients: incidence and risk factors

The use of a portable head CT scanner in the intensive care unit

Mass effect and death from severe acute stroke

Reliability of the modified Rankin Scale: a systematic review

Intracerebral haemorrhage

Noise suppression methods and apparatus

U-Net: Convolutional Networks for Biomedical Image Segmentation

Lateral displacement of the brain and level of consciousness in patients with an acute hemispheral mass

Review of Portable CT with Assessment of a Dedicated Head CT Scanner

Blood-brain barrier tight junction permeability and ischemic stroke

Statsmodels: Econometric and Statistical Modeling with Python

Translational AI and Deep Learning in Diagnostic Pathology

Assessment of Brain Injury Using Portable, Low-Field Magnetic Resonance Imaging at the Bedside of Critically Ill Patients

Mishaps during transport from the intensive care unit

Use of the Barthel index and modified Rankin scale in acute stroke trials

Deep semantic segmentation of natural and medical images: a review

Portable Magnetic Resonance Imaging for ICU Patients

The WU-Minn Human Connectome Project: an overview

Using deep learning to investigate the neuroimaging correlates of psychiatric and neurological disorders: Methods and applications

Understanding Equivalence and Noninferiority Testing

Recommendations for the management of cerebral and cerebellar infarction with swelling: a statement for healthcare professionals from the American Heart Association/American Stroke Association

Accurate and versatile 3D segmentation of plant tissues at cellular resolution

Validating imaging biomarkers of cerebral edema in patients with severe ischemic stroke

ITK-SNAP: An interactive tool for semiautomatic segmentation of multi-modality biomedical images

This study was funded by an American Heart Association Collaborative Science Award 17CSA3355004. Additional research funding and the POC-MRI prototype were provided by Hyperfine, Inc. KNS is supported by the NIH (U24NS107136, U24NS107215, R01NR018335, R01NS110721, R03NS112859, U01NS106513, 1U01NS106513-01A1) and the American Heart Association (18TPA34170180). The authors would like to thank the volunteers for their participation and Dr. Lori Arlinghaus for assistance with manuscript preparation.