key: cord-0725688-jmcnzcll
authors: Goswami, Neha; He, Yuchen R.; Deng, Yu-Heng; Oh, Chamteut; Sobh, Nahil; Valera, Enrique; Bashir, Rashid; Ismail, Nahed; Kong, Hyun J.; Nguyen, Thanh H.; Best-Popescu, Catherine; Popescu, Gabriel
title: Rapid SARS-CoV-2 Detection and Classification Using Phase Imaging with Computational Specificity
date: 2020-12-15
journal: bioRxiv
DOI: 10.1101/2020.12.14.422601
sha: 049323eba64ed9578c908e6d4c47515fe71db051
doc_id: 725688
cord_uid: jmcnzcll

Efforts to mitigate the COVID-19 crisis revealed that fast, accurate, and scalable testing is crucial for curbing the current impact and that of future pandemics. We propose an optical method for directly imaging unlabeled viral particles and using deep learning for detection and classification. An ultrasensitive interferometric method was used to image four virus types with nanoscale optical pathlength sensitivity. Pairing these data with fluorescence images for ground truth, we trained semantic segmentation models based on U-Net, a particular type of convolutional neural network. The trained network was applied to classify the viruses from the interferometric images alone, which simultaneously contained SARS-CoV-2, H1N1 (influenza-A), HAdV (adenovirus), and ZIKV (Zika). Remarkably, due to the nanoscale sensitivity in the input data, the neural network was able to identify SARS-CoV-2 vs. the other viruses with 96% accuracy. The inference time for each image is 60 ms on a common graphics processing unit. This approach of directly imaging unlabeled viral particles may provide an extremely fast test, of less than a minute per patient. As the imaging instrument operates on regular glass slides, we envision this method as potentially testing on patient breath condensates. The necessary high throughput can be achieved by translating concepts from digital pathology, where a microscope can scan hundreds of slides automatically.
One Sentence Summary: This work proposes a rapid (<1 min.), label-free testing method for SARS-CoV-2 detection, using quantitative phase imaging and deep learning.

COVID-19 is an infectious disease caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), which reached pandemic proportions in 2020 (1). The global impact of the disease on healthcare systems and its socio-economic ramifications are severe and, likely, long-lasting (2). Prompt response and public health measures have proven effective in limiting the spread of the virus, decreasing the number of active cases, and, ultimately, the mortality rate (3). Fast, accurate, and scalable testing has been recognized unanimously as crucial for mitigating the impact of COVID-19 and future pandemics (4).

Diagnostic test accuracy is characterized by the sensitivity, defined as the probability of a positive result in a diseased patient, and the specificity, given by the probability of a negative result in a healthy patient. Furthermore, the negative predictive value represents the chance that an individual with a negative test is disease-free and, conversely, the positive predictive value is the chance that a person with a positive test is infected. In addition to these accuracy metrics, throughput and cost are important for deploying testing at scale.

Recently, Weissleder et al. reviewed the current status of COVID-19 diagnostic tests (4). Briefly, nucleic acid tests (NATs) rely on the viral RNA being amplified via polymerase chain reaction (PCR) and are the most broadly used in the clinic today. NATs have been implemented on automated instruments and provide a result in several hours. Their accuracy may vary, with false negative rates reported on the order of 30% (4, 5). Serological tests assess the patient's response to the viral infection through proteins such as immunoglobulin G.
The efficacy of these tests relies on prior knowledge about the patient's immune status as well as potential previous exposures to other virus types. The accuracy of serological tests is very high when performed ~20 days after the infection or first symptoms, but these tests may lead to high false negative rates for early patients and false positives for patients previously exposed to other viruses (4). Common antigen tests can be performed using nasopharyngeal swabs and yield results in less than one hour. These tests operate by detecting proteins associated with the SARS-CoV-2 virus (nucleocapsid or spike proteins) using lateral flow or enzyme-linked immunosorbent assay (ELISA) tests.

Recently, accelerated efforts have been devoted to developing alternative testing procedures. These alternative detection schemes involve the use of plasmonic biosensors (6-8), fluorescence imaging of labelled virus particles and detection through machine learning (9), and microfluidic immunoassays coupled with fluorescence detection (10), among others. While these approaches represent advances in SARS-CoV-2 detection methodologies, they still require either labelling or the addition of foreign particles/solutions for the detection of SARS-CoV-2.

Here, we present a new approach for SARS-CoV-2 detection, which relies on direct, label-free imaging of viral particles. We employed spatial light interference microscopy (SLIM), a highly sensitive interferometric method, to image viruses deposited on a glass slide. Although individual viruses are below the diffraction limit of the microscope, the optical path length information retrieved by SLIM reveals the nanoscale distribution of the refractive index associated with the individual and aggregated viral particles. We paired these data with deep learning algorithms, specifically optimized for viral particle detection and classification.
Using fluorescence markers for specific virus tagging, we retrieved "ground truth" data by imaging the same field of view with both SLIM and epi-fluorescence. To emulate a more realistic application environment, we synthesized datasets where different virus types were "digitally mixed" onto the same SLIM image for deep learning development and evaluation. Thus, in addition to SARS-CoV-2, we imaged H1N1, HAdV and ZIKV. While a situation where a patient is exposed simultaneously to these four viruses is highly unlikely, we wanted to test it as a challenging task for our method and evaluate the specificity of our deep learning model. Following the training process, we tested the convolutional neural network (CNN) on unseen samples, classifying one virus type vs. the rest. Our results indicated a 96% area under the receiver operating characteristic curve for SARS-CoV-2, 99% for H1N1, 92% for HAdV and 91% for ZIKV. This pre-clinical study demonstrates that sensitive imaging of unlabeled particles, paired with artificial intelligence (AI) can provide the foundation for a rapid, high-throughput, scalable test. The fact that the assay can be performed on the specimen placed on a glass slide allows for simple and fast sample collection, via, e.g., breath condensates. The image acquisition and inference take 100 ms total, which means that the entire test, including specimen collection, can be performed within a minute. Throughput can be scaled-up by borrowing engineering concepts from whole slide scanners in digital pathology, where hundreds of slides can be automatically fed into the imaging instrument. As the specimen requires minimum preparation and the instrument can be made portable, in principle, the technology can be deployed as a point-of-care solution. The paper is structured as follows. First, we present the workflow for multimodal imaging and ground truth data acquisition. 
Next, we describe the SLIM imaging system and its sensitivity to the nanoscale ultrastructure of viral particles. We show 3D tomograms of the four virus types to illustrate the subtle texture differences that the instrument captures and that the AI tools exploit for classification. We describe the convolutional neural network, which is a version of U-Net optimized for this problem. Finally, we present the accuracy of classifying the four virus types. We end with a discussion of the next steps necessary to implement this technology as a reliable clinical testing solution.

Figure 1 depicts the workflow of our approach (see Fig. S1 and Supplementary Section S1 for details on sample preparation). We tagged the deactivated virus samples with Rhodamine B isothiocyanate as detailed in Materials and Methods. The staining was followed by dialysis to remove unbound fluorophores. The sample was deposited on a glass slide, fixed with EtOH, and air-dried (Fig. 1A). The slide was imaged using multimodal SLIM and epi-fluorescence, overlaid for the same field of view (Fig. 1B). The resulting images were processed to extract pairs of images associated with individual particles (Fig. 1C). A U-Net convolutional neural network was trained using these data, with the fluorescence images acting as ground truth. The U-Net output provides a semantic segmentation map, i.e., an image that classifies and labels the various virus types (Fig. 1D).

protocol, viruses were deactivated, stained with Rhodamine B isothiocyanate and dialyzed for 2 days to reduce fluorescence background, and then placed on a slide, fixed with 90% EtOH, and air-dried. B. We added a SLIM module to a traditional phase contrast microscope for quantitative phase information. C. SLIM and fluorescence images were registered; single 48 x 48 regions were cropped from the image and segmented to provide labels for multiclass classification. D.
We synthesized a new dataset by randomly placing the cropped virus particles onto a background image acquired during the same experiment. A deep neural network was trained with this dataset to perform virus particle classification. Given a SLIM image, the model outputs a class label for each pixel in the image.

A key element in our approach is the spatial light interference microscope described in Fig. 2A. SLIM belongs to the family of quantitative phase imaging (QPI) instruments (11), which have found broad applications in biomedicine (12-23) due to their ability to image unlabeled, highly transparent structures. SLIM is implemented as an add-on module to an existing phase contrast microscope and, in essence, rigorously controls the phase shift between the incident and scattered fields emerging from the specimen (24, 25). We used a Nikon Eclipse Ti inverted microscope outfitted with a SLIM module (CellVista SLIM Pro, Phi Optics, Inc.), which allows for fully automated data acquisition. The microscope objective pupil is relayed onto the surface of a phase-only spatial light modulator (SLM), such that the phase shift between the incident and scattered light is controlled precisely (Fig. 2A). We record four intensity frames associated with individual phase shifts, applied in increments of π/2, as shown in Fig. 2B. The four intensity images are combined as described in (25, 26) to decouple the amplitudes of the incident and scattered fields from the phase information and obtain a quantitative phase map associated with the specimen (Fig. 2B). Because the interfering fields in SLIM propagate along a common path, the phase measurement is highly stable, to within a fraction of a nanometer in pathlength (25). Due to the white light illumination associated with the phase contrast microscope, the SLIM images are free of speckles, which translates into sub-nanometer spatial pathlength sensitivity (25).
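As a minimal sketch of how four phase-shifted frames can be combined, the Python snippet below applies the textbook four-step phase-shifting formula to simulated intensities. It assumes that each frame follows I_k = A + B cos(Δφ + kπ/2), with k = 0, 1, 2, 3, and it recovers only the phase difference Δφ between the incident and scattered fields. The full SLIM reconstruction described in (25, 26) involves further steps that are not reproduced here. The function names, the sign convention, and the simulated background and modulation values are illustrative assumptions, not the instrument's software.

```python
import numpy as np


def simulate_frames(delta_phi, background=1.0, modulation=0.5):
    """Simulate four intensity frames with added phase shifts of 0, pi/2, pi, 3*pi/2.

    Assumed model (not taken from the paper): I_k = A + B * cos(delta_phi + k*pi/2).
    """
    shifts = np.arange(4) * np.pi / 2
    return [background + modulation * np.cos(delta_phi + s) for s in shifts]


def phase_difference_from_four_frames(frames):
    """Recover delta_phi from four frames using the four-step formula.

    With the model above:
      I_3 - I_1 = 2B * sin(delta_phi)
      I_0 - I_2 = 2B * cos(delta_phi)
    so delta_phi = atan2(I_3 - I_1, I_0 - I_2), independent of A and B.
    """
    i0, i1, i2, i3 = (np.asarray(f, dtype=float) for f in frames)
    return np.arctan2(i3 - i1, i0 - i2)


# Hypothetical 5 x 5 phase-difference map, in radians.
true_phase = np.linspace(-1.0, 1.0, 25).reshape(5, 5)
recovered = phase_difference_from_four_frames(simulate_frames(true_phase))
```

Because the background A and the modulation B cancel in the two differences, the recovered map does not depend on the illumination level, which is one reason phase-shifting schemes of this kind are attractive for quantitative measurements.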
These attributes make SLIM ideal for the challenging task of imaging viral particles on a glass slide. Figure 2C illustrates the significant boost in contrast present in SLIM compared to traditional phase contrast microscopy.

SARS-CoV-2, H1N1, HAdV and ZIKV were separately stained as illustrated in Fig. S1 (see Methods Section and Supplementary Section S1 for more details) with Rhodamine B isothiocyanate, which has an emission at 595 nm. We performed dual-channel phase-fluorescence imaging on the samples. Figure 3 illustrates the imaging results for SARS-CoV-2, with SLIM

The resolution of our imaging system is approximately 335 nm (illumination at 550 nm, 100x/1.45 objective, condenser NA 0.55). Following Rayleigh's resolution criterion, two objects separated by less than the width of the point spread function (PSF) cannot be fully resolved. The individual virus particles used in this study have an average diameter of less than 150 nm, which makes them sub-diffraction objects for optical imaging. In order to push the resolution beyond the diffraction limit, we performed a deconvolution with the microscope's PSF (Supplementary Section S3). To estimate the PSF, we identified the smallest spot in the images via a MATLAB script. Using this PSF, the images were deblurred by employing the iterative Richardson-Lucy algorithm with total variation regularization (see Supplementary Section S3 for more details) (27, 28). Figure S5 illustrates the deconvolution results for the four virus classes. The deconvolution is able to produce deblurred images with clumps separated into smaller groups. It should be noted that the size of the deconvolved particles does not necessarily match the actual size of the virus particles, as the decoupling of the PSF and the virus is still not perfect. Nevertheless, we can successfully separate clumps into individual viruses, which the neural network is likely to pick up for classification.
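To illustrate the kind of iterative deblurring described above, the following Python sketch implements the plain Richardson-Lucy update. It is a simplified stand-in under several assumptions: it omits the total variation regularization used in the paper, it operates on a non-negative real-valued image rather than on the complex field, it uses circular (FFT-based) boundary handling, and it uses a synthetic Gaussian PSF instead of one estimated from the smallest spot in the data. All function names and numerical values are hypothetical.

```python
import numpy as np


def psf_to_otf(psf, shape):
    """Zero-pad a small PSF to `shape`, centre it at the origin, return its FFT."""
    padded = np.zeros(shape, dtype=float)
    ph, pw = psf.shape
    padded[:ph, :pw] = psf / psf.sum()  # normalise so total signal is preserved
    padded = np.roll(padded, (-(ph // 2), -(pw // 2)), axis=(0, 1))
    return np.fft.rfft2(padded)


def apply_otf(image, otf):
    """Circular convolution of `image` with the PSF represented by `otf`."""
    return np.fft.irfft2(np.fft.rfft2(image) * otf, s=image.shape)


def richardson_lucy(image, psf, iterations=100, eps=1e-12):
    """Plain Richardson-Lucy deconvolution (no total variation term)."""
    image = np.asarray(image, dtype=float)
    otf = psf_to_otf(psf, image.shape)
    estimate = np.full(image.shape, image.mean())  # flat, positive starting guess
    for _ in range(iterations):
        ratio = image / (apply_otf(estimate, otf) + eps)
        # conj(otf) corresponds to correlating with the PSF (the adjoint blur).
        estimate = estimate * apply_otf(ratio, np.conj(otf))
    return estimate


# Synthetic example: two point-like particles blurred by a Gaussian PSF.
ax = np.arange(-4, 5)
xx, yy = np.meshgrid(ax, ax)
psf = np.exp(-(xx ** 2 + yy ** 2) / (2 * 1.5 ** 2))
truth = np.zeros((32, 32))
truth[12, 16] = 1.0
truth[20, 16] = 1.0
blurred = np.clip(apply_otf(truth, psf_to_otf(psf, truth.shape)), 0.0, None)
restored = richardson_lucy(blurred, psf)
```

In this toy case the restored peaks are narrower and taller than the blurred ones while the total signal is preserved, which mirrors the qualitative behavior reported for Fig. S5. The recovered spot sizes still depend on the PSF estimate and the number of iterations, consistent with the caveat stated in the text.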
One advantage of SLIM over fluorescence is the inherent ability to measure not only shape descriptors, such as diameter, orientation, and circularity, but also the phase information associated with the sample, which can then be used to extract biophysical information, such as cell dry mass density. From the SLIM images, we extracted the total dry mass and surface dry mass density for each measured particle (see Supplementary Section S3 for details). We observed shifts in the dry mass density for different virus classes, as shown in Fig. S6A. Figure S6(B-D) demonstrates the statistical significance of the dry mass density differences between SARS-CoV-2 and H1N1, HAdV, and ZIKV, with p-values of 1.35e-12, 8.84e-6, and 1.23e-5, respectively, obtained by applying the Kruskal-Wallis test (in MATLAB). These results indicate that dry mass density, which is incorporated in the SLIM data, is a marker that helps the machine learning algorithm to detect SARS-CoV-2.

To get a better understanding of the viral particles, we performed a tomographic reconstruction of diffraction-limited SLIM data, using the Amira (Thermo Scientific) software (see Supplementary Section S4 for details). The results are shown in Fig. 4, where volumetric reconstructions of the particle cores (Fig. 4(A-D)) and surface reconstructions (Fig. 4(E-H)) for each particle are illustrated. These reconstructions provide insight into structural dissimilarities that exist even in the diffraction-limited SLIM images. Surface irregularities can be seen for SARS-CoV-2 in Fig. 4(A, E). Figure 4(B, F) shows the H1N1 particle, which again has an irregular surface, but of a different texture. Figure 4(C, G) shows a clump of at least two HAdV particles, with a hexagonal boundary visible in the lower portion of Fig. 4G. ZIKV (Fig. 4(D, H)) is significantly smoother compared to SARS-CoV-2.
The structural signatures present in these reconstructions agree with the TEM images showing irregular surface morphology for SARS-CoV-2 (29, 30) and H1N1 (31), a hexagonal cross-section for HAdV (32), and a comparatively smoother surface for ZIKV (33, 34). These reconstructions suggest that signatures of structural information still exist in the diffraction-limited SLIM images, due to the nanoscale pathlength sensitivity of SLIM. These subtle features help the machine learning algorithm to successfully classify these particles.

We formulated the virus detection task as a semantic segmentation problem: given an input SLIM image containing several virus particles, our model predicts a probability distribution for each pixel, denoting the chance of this pixel belonging to each of the 5 classes: background, SARS-CoV-2, H1N1, HAdV, and ZIKV. An argmax operation turns the model output into a class label for each pixel. As all our raw SLIM images were of pure-culture virus particles, we synthesized a new dataset via "digital mixing" for machine learning development and evaluation (see Supplementary Section S5 for details). The deep neural network we used was adapted from the U-Net (Fig. 5A and Fig. S7A) (35). Our model was trained using the digitally mixed SLIM images as input and the corresponding segmentation maps as ground truth (Fig. 5(B-C) and Fig. S7(B-C)).

We divided the machine learning task into two steps. Two types of datasets were prepared based on two data curation strategies. The first dataset was semiautomatic, with manual cropping followed by automatic segmentation, a fixed concentration of viruses per digitally mixed image, and placement of virus particles on a grid with an artificial phase background. The second dataset was fully automatic, with automatic segmentation followed by automatic cropping, a varying (but balanced) concentration of viruses per digitally mixed image, and random placement of virus particles on a blank image for digital mixing.
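The "digital mixing" used for the second dataset can be sketched as follows. This Python snippet is only an illustration under stated assumptions: it pastes cropped particle patches at random, non-overlapping positions on a blank canvas and builds the matching per-pixel label map. The synthetic disk-shaped crops, the 256 x 256 canvas size, the retry limit, and the function names are hypothetical. The 48 x 48 crop size and the five classes follow the text.

```python
import numpy as np

# Class indices follow the five classes listed in the text.
CLASS_NAMES = ("background", "SARS-CoV-2", "H1N1", "HAdV", "ZIKV")


def digitally_mix(crops, canvas_shape=(256, 256), rng=None, max_tries=100):
    """Place cropped particles at random positions on a blank image.

    crops: list of (phase_patch, mask_patch, class_id) tuples, where
           mask_patch is a boolean array marking the particle pixels.
    Returns the mixed image and the per-pixel label map (0 = background).
    A particle is skipped if no overlap-free position is found in max_tries.
    """
    rng = np.random.default_rng(rng)
    image = np.zeros(canvas_shape, dtype=float)
    labels = np.zeros(canvas_shape, dtype=np.uint8)
    for phase_patch, mask_patch, class_id in crops:
        h, w = phase_patch.shape
        for _ in range(max_tries):
            top = rng.integers(0, canvas_shape[0] - h + 1)
            left = rng.integers(0, canvas_shape[1] - w + 1)
            window = labels[top:top + h, left:left + w]  # view into labels
            if not window[mask_patch].any():  # no overlap with earlier particles
                image[top:top + h, left:left + w][mask_patch] = phase_patch[mask_patch]
                window[mask_patch] = class_id
                break
    return image, labels


def make_disk_crop(class_id, size=48, radius=5, phase=0.1):
    """Synthetic stand-in for a segmented 48 x 48 particle crop."""
    yy, xx = np.mgrid[:size, :size]
    mask = (yy - size // 2) ** 2 + (xx - size // 2) ** 2 <= radius ** 2
    return phase * class_id * mask, mask, class_id


crops = [make_disk_crop(k) for k in range(1, 5)]
image, labels = digitally_mix(crops, canvas_shape=(256, 256), rng=0)
```

The returned `labels` array plays the role of the segmentation ground truth, and `image` plays the role of the network input. A trained model's per-pixel probabilities would be reduced to the same kind of label map through an argmax over the class axis.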
Every digitally mixed image has five particles per class. We kept 500 particles out as the test dataset, and trained the neural network on the remaining particles (see Supplementary Section S5). During evaluation, we noticed that our model sometimes predicted more than one label per particle. To solve this issue, we used a post-processing strategy to enforce particle-level consistency in our model prediction (see Fig. S8 and Supplementary Section S5 for details on the post-processing method). After the post-processing, we achieved the following area under the Sections S2 and S5 for more details of the procedure). We randomly selected around 1000 images for training and kept the remaining 564 SLIM images as the test dataset to evaluate our model. As with the first dataset, we enforced instance-level consistency on our model prediction via the same post-processing step (see Supplementary Section S5 and Fig. S8). Figure 5D shows the predictions after post-processing. Quantitative results for this dataset are shown in Figure 6, where Fig. 6A shows the one-versus-all receiver operating characteristic (ROC) curve and Fig. 6B shows the complete confusion matrix to better illustrate our model's sensitivity. The AUC for all four virus classes is above 91%. We anticipate that, in clinical situations, the most challenging issue will be to detect the SARS-CoV-2 class alone, or, occasionally, distinguish it We also plotted the precision and recall for SARS-CoV-2 on every image in the second test dataset as a histogram (Fig. S10). The majority of the detections have precision/recall values nearing unity. The learning curves for both our models (for the first and second datasets) are shown in Fig. S11. The loss on the validation dataset and on the training dataset converged properly, indicating that our models did not overfit or underfit.
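The post-processing step that enforces particle-level consistency can be sketched in Python as shown below. This is a minimal illustration, assuming 4-connected components and assuming that each component is assigned the virus class with the highest mean probability. The paper states only that the probability distribution was averaged within each connected component, so the final class-selection rule, the small flood-fill labelling routine, and the toy probabilities are assumptions made for this example.

```python
from collections import deque

import numpy as np


def label_components(foreground):
    """4-connected component labelling. Returns (label array, component count)."""
    labels = np.zeros(foreground.shape, dtype=int)
    rows, cols = foreground.shape
    current = 0
    for seed in zip(*np.nonzero(foreground)):
        if labels[seed]:
            continue  # pixel already belongs to a labelled component
        current += 1
        labels[seed] = current
        queue = deque([seed])
        while queue:
            r, c = queue.popleft()
            for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if (0 <= nr < rows and 0 <= nc < cols
                        and foreground[nr, nc] and not labels[nr, nc]):
                    labels[nr, nc] = current
                    queue.append((nr, nc))
    return labels, current


def enforce_particle_consistency(probs):
    """probs: (H, W, C) per-pixel class probabilities, class 0 = background.

    Every connected foreground region receives a single virus class, chosen
    from the probability distribution averaged over that region.
    """
    pixel_labels = probs.argmax(axis=-1)
    components, count = label_components(pixel_labels != 0)
    consistent = np.zeros_like(pixel_labels)
    for k in range(1, count + 1):
        region = components == k
        mean_probs = probs[region].mean(axis=0)
        consistent[region] = 1 + mean_probs[1:].argmax()  # skip background
    return consistent


# Toy prediction: one 3 x 3 particle, mostly class 1, with one pixel voting class 2.
probs = np.zeros((6, 6, 5))
probs[..., 0] = 1.0
probs[1:4, 1:4] = [0.1, 0.7, 0.2, 0.0, 0.0]
probs[2, 2] = [0.1, 0.3, 0.6, 0.0, 0.0]
raw_labels = probs.argmax(axis=-1)
clean_labels = enforce_particle_consistency(probs)
```

In the toy example the raw argmax assigns the centre pixel to a different class than its neighbours, while the post-processed map assigns the whole particle a single label. A single label per particle is what makes instance-level metrics, such as a per-particle confusion matrix, well defined.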
We presented a method for detection and classification of SARS-CoV-2 in the presence of other viruses, by using interferometric imaging and AI. Our results indicate that highly sensitive phase imaging is capable of providing subtle structural specificity of the viral particles, which, in turn, allows for their accurate classification. There are two main components that help our model detect and classify viruses with high accuracy. First, the specific texture of the dry mass density can report on the differences in the refractive index caused by the specific protein compositions of the virus. Second, the nanostructure signatures of individual viruses, e.g., irregularities on the surface of SARS-CoV-2 and H1N1, hexagonal shapes in HAdV, and the smoother surface of ZIKV, are subtle features in the SLIM images that are exploited by the neural network. The most likely combination of multiple viruses is SARS-CoV-2 and H1N1, a situation which can pose a challenge for accurate testing. However, our model proved to be successful in detecting and differentiating SARS-CoV-2 and H1N1, with a one-versus-all AUC of 96% and 99%, respectively.

Pending successful clinical testing of this approach, we anticipate that the instrument can be implemented into a portable device controlled by a laptop. As the inference per field of view takes 60 ms, it is likely that the test per specimen, sampling several fields of view, will complete in a few seconds. Due to the lack of labels or other reagents, the test itself is bound to be inexpensive. Finally, to scale up throughput, we envision translating automatic slide-scanning engineering concepts from digital pathology devices.

The stained virus sample was dropped on a glass slide, fixed with 90% ethyl alcohol, and air-dried (more information in Supplementary Section S1).

We performed deconvolution using the Richardson-Lucy iterative algorithm with Total Variation (TV) regularization (27, 28). We first converted the phase map obtained from SLIM to a complex field.
This complex field was then used as an input to the algorithm. We derived an initial estimate for the PSF from the images themselves, by choosing the smallest spot in the images.

Utilizing the properties obtained from segmentation (area, integrated phase values, centroid, etc.), we carried out quantitative analysis on single virus particles using MATLAB (see Supplementary Section S3).

We produced tomographic reconstructions using Amira software (Thermo Scientific). We cropped out single particles from the whole image and upsampled them by a factor of 10 with bilinear interpolation to remove pixelation. We then used Volren and Isosurface rendering to reconstruct volume and surface tomograms (see Supplementary Section S4) for each virus type.

For both the first (manual selection, with background) and second (automatic selection, without background) datasets, we prepared digitally mixed images to train and test our network. We found that, in some cases, our model inferred more than one label for different parts of the same particle. To enforce instance-level consistency in our model prediction, we performed a post-processing step via connected component analysis to ensure that all pixels in each individual particle are predicted as one class. After this post-processing step (see Supplementary Section S5), our model's performance was summarized in a confusion matrix on over 10,000 virus particles from the test dataset for the second dataset.

Competing interests: G.P. and C.B-P. have financial interests in Phi Optics Inc., a company that manufactures quantitative phase imaging instruments for biomedical applications.

Data and materials availability: All the data needed to reproduce the results can be obtained from the corresponding author upon reasonable request.

Section S1. Sample preparation
Section S2. Image acquisition and processing: registration, cropping and segmentation ground truth label. D. model inference.

Figure S8.
Post-processing to enforce particle-level consistency. A. To ensure all pixels in one virus particle have the same predicted label, we performed connected component analysis and averaged the probability distribution within each connected component. Left column: raw probability prediction; right column: probability distribution after post-processing. B. After post-processing, the predicted segmentation map no longer had different labels within one particle region. This enabled us to compute, on an instance level, the performance of our model.

showed a good convergence between the validation loss and training loss of our models, indicating that our models did not underfit or overfit. E represents categorical cross-entropy Loss

Mental health and the Covid-19 pandemic
Mitigating the wider health effects of covid-19 pandemic response
The emergence of SARS-CoV-2 in Europe and North America
COVID-19 diagnostics in context
Correlation of Chest CT and RT-PCR Testing for Coronavirus Disease 2019 (COVID-19) in China: A Report of 1014 Cases
Selective Naked-Eye Detection of SARS-CoV-2 Mediated by N Gene Targeted Antisense Oligonucleotide Capped Plasmonic Nanoparticles
P-FAB: a fiber-optic biosensor device for rapid detection of COVID-19
Promising near-infrared plasmonic biosensor employed for specific detection of SARS-CoV-2 and its spike glycoprotein
Virus detection and identification in minutes using single-particle imaging and deep learning. medRxiv
Microfluidic immunoassays for sensitive and simultaneous detection of IgG/IgM/antigen of SARS-CoV-2 within 15 min
Quantitative phase imaging of cells and tissues
Quantitative phase imaging in biomedicine
Optical properties of acute kidney injury measured by quantitative phase imaging
Zinc's Effect on the Differentiation of Porcine Adipose-derived Stem Cells into Osteoblasts
Quantitative phase imaging of stromal prognostic markers in pancreatic ductal adenocarcinoma
Imaging collagen properties in the uterosacral ligaments of women with pelvic organ prolapse using spatial light interference microscopy (SLIM)
Quantitative phase imaging reveals matrix stiffness-dependent growth and migration of cancer cells
Topography and refractometry of sperm cells using spatial light interference microscopy
SLIM microscopy allows for visualization of DNA-containing liposomes designed for sperm-mediated gene transfer in cattle
Tomographic flow cytometry by digital holography
Label-free optical quantification of structural alterations in Alzheimer's disease
Optical Phase Measurements of Disorder Strength Link Microstructure to Cell Stiffness
Holographic virtual staining of individual biological cells
White-light diffraction tomography of unlabelled live cells
Spatial light interference microscopy (SLIM)
Wolf phase tomography (WPT) of transparent structures using partially coherent illumination
Richardson-Lucy algorithm with total variation regularization for 3D confocal microscope deconvolution
DeconvolutionLab2: An open-source software for deconvolution microscopy
Electron microscopic image of a negatively stained particle of SARS-CoV-2, causative agent of COVID-19. Note the prominent spikes from which the coronavirus gets its name for "corona
Transmission electron microscopy imaging of SARS-CoV-2. The Indian journal of medical research
CDC H1N1 Flu | Images of the H1N1 Influenza Virus. Cdc.gov (2020)
The adenovirus major core protein VII is dispensable for virion assembly but is essential for lytic infection
Zika virus-like particle (VLP) based vaccine
Zika virus replication and cytopathic effects in liver cells
International Conference on Medical image computing and computer-assisted intervention
Adam: A method for stochastic optimization
Structures and distributions of SARS-CoV-2 spike proteins on intact virions
Adenovirus replication cycle disruption from exposure to polychromatic ultraviolet irradiation
Transmission electron microscopy and the molecular structure of icosahedral viruses
Virus Pathogen Database and Analysis Resource (ViPR) -Flaviviridae -. Viprbrc.org (2020)
Zika virus structure, maturation, and receptors
Microscopy analysis of Zika virus morphogenesis in mammalian cells
The 3.8 Å resolution cryo-EM structure of Zika virus
Inactivation methods for whole influenza vaccine production
Irradiation by a Combination of Different Peak-Wavelength Ultraviolet-Light Emitting Diodes Enhances the Inactivation of Influenza A Viruses
Aerosol susceptibility of influenza virus to UV-C light
Structure and accessibility of HA trimers on intact 2009 H1N1 pandemic influenza virus to stem region-specific neutralizing antibodies
Reproductive outcomes predicted by phase imaging with computational specificity of spermatozoon ultrastructure
Multiscale assay of unlabeled neurite dynamics using phase imaging with computational specificity (PICS)
PICS: Phase Imaging with Computational Specificity
Batch normalization: Accelerating deep network training by reducing internal covariate shift
Proceedings of the IEEE conference on computer vision and pattern recognition
Tensorflow: Large-scale machine learning on heterogeneous distributed systems
Fossil charcoal particle identification and classification by two convolutional neural networks
Evaluation of deep learning strategies for nucleus segmentation in fluorescence images
Label-free cell viability assay using phase imaging with computational specificity. bioRxiv
Accurate, large minibatch sgd: Training imagenet in 1 hour
Sgdr: Stochastic gradient descent with warm restarts
Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition
scikit-image: image processing in Python
Scikit-learn: Machine learning in Python.

the Infected Cell Lysate, Gamma-Irradiated (NR-50547)) was obtained through BEI Resources, NIAID, NIH: Zika Virus, PRVABC59, Infected Cell Lysate, Gamma-Irradiated, NR-50547. Authors would also like to thank Dr ) for SLIM control and dual channel imaging.

Funding: This work is supported by National Institutes of Health (R01GM129709, R01CA238191) National Science Foundation NSF-DMR 2004719 (awarded to H.J.K.). R.B. and E.V. acknowledge the support of NSF Rapid Response Research (RAPID) grant (Award 2028431), and the support of Jump Applied Research through Community Health through Engineering and Simulation (ARCHES) endowment through the Health Care

carried out virus deactivation. H.J.K. and Y-H.D. selected chemical reagent appropriate for staining and carried out staining process on all four virus samples for fluorescence detections. N.G. prepared samples on slides, conducted imaging experiments, data analysis, tomographic reconstructions, deconvolution and data curation for machine learning

Heat-inactivated SARS-CoV-2 (ATCC® VR-1986HK ™) was deposited by the Centers for Disease Control and Prevention and obtained through BEI Resources, NIAID,