key: cord-0786085-ocs07658 authors: Wood, Bayden Robert; Kochan, Kamila; Bedolla, Diana E.; Salazar-Quiroz, Natalia; Grimley, Samantha; Perez-Guaita, David; Baker, Matthew J.; Vongsvivut, Jitraporn; Tobin, Mark; Bambery, Keith; Christensen, Dale; Pasricha, Shivani; Eden, Anthony K.; Mclean, Aaron; Roy, Supti; Roberts, Jason; Druce, Julian; Williamson, Deborah A.; McAuley, Julie; Catton, Mike; Purcell, Damian; Godfrey, Dale; Heruad, Philip title: Infrared based saliva screening test for COVID‐19 date: 2021-05-27 journal: Angew Chem Int Ed Engl DOI: 10.1002/anie.202104453 sha: 9d8eda98fc49edf8fc4b59dced4703592e164370 doc_id: 786085 cord_uid: ocs07658 Severe acute respiratory syndrome coronavirus 2 (SARS‐CoV‐2) has resulted in an unprecedented need for diagnostic testing that is critical in controlling the spread of COVID‐19. We propose a portable infrared spectrometer with purpose‐built transflection accessory for rapid point‐of‐care detection of COVID‐19 markers in saliva. Initially, purified virion particles were characterized with Raman spectroscopy, synchrotron infrared (IR) and AFM‐IR. A data set comprising 171 transflection infrared spectra from 29 patients testing positive for SARS‐CoV‐2 by RT‐qPCR and 28 testing negative, was modeled using Monte Carlo Double Cross Validation with 50 randomized test and model sets. The sensitivity was 93 % (27/29) and specificity of 82 % (23/28) that included positive samples on the limit of detection for RT‐qPCR. Here, we demonstrate a proof‐of‐concept high throughput infrared COVID‐19 test that is rapid, inexpensive, portable and utilizes sample self‐collection thus minimizing the risk to healthcare workers and ideally suited to mass screening. Coronavirus disease 2019 (COVID-19) is a highly transmissible respiratory disease caused by the novel severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). 
This virus has infected over 160 million people, resulting in more than 3 million deaths and surpassing malaria infections as the world's most devastating infectious disease. [1] The high mortality rate and tremendous socio-economic cost of the virus have necessitated mass testing to detect infected individuals, who can then be isolated to reduce transmission. The gold standard for detection of SARS-CoV-2 is the reverse transcription polymerase chain reaction (RT-PCR) assay using material collected from nasopharyngeal swabs. [2] Whilst the assay has been effective in reducing transmission rates, the turnaround time for the result can be several days. The assay itself takes ~4 hours and requires relatively expensive instrumentation and consumables, including PCR primers, enzymes and other reagents, which can be difficult to implement in under-resourced health care systems. Collection of swab samples for RT-PCR requires skilled health workers and close face-to-face contact with patients, hence necessitating personal protective equipment (PPE), which is expensive and may be in short supply. [3] Nasal swab collection is uncomfortable, sometimes even causing bleeding, and is likely to be unacceptable to some workers where regular surveillance sampling is required. [4] While saliva testing with RT-qPCR is being explored [5], this does not overcome the limitations associated with expense and time. Considering these many drawbacks, the implementation of new diagnostic approaches that can be performed rapidly, affordably, and with minimal risk to patients and health care workers is highly desirable. Infrared spectroscopy is a routine analytical technique mainly used to identify molecular functional groups in various materials. Infrared light interacting with the intrinsic vibrational modes of molecules generates a spectrum that represents a unique chemical fingerprint of the sample. 
The development of portable Attenuated Total Reflection Fourier Transform Infrared (ATR-FTIR) spectrometers has seen a plethora of applications in the field of biomedical diagnostics [6]. More recently, the technology has been applied to diagnose Plasmodium sp. in blood [7] and viruses including hepatitis B and hepatitis C in serum [8]. The technique relies on detecting both the molecular phenotype of the pathogen and the host immune response. Advanced machine learning techniques can then be applied to predict diagnostic outcomes based on the chemical differences between infected and uninfected samples. Recently, Barauna et al. [9] applied ATR-FTIR to analyze saliva collected from COVID-19 diagnosed patients using pharyngeal swabs. In their measurements, the swab was placed onto the ATR spectrometer and the spectrum of both the saliva and the swab was recorded. Interestingly, their spectra of purified virus did not show the characteristic amide modes of proteins that are expected given the dominant S-proteins that characterize SARS-CoV-2 virions, nor did they show the characteristic RNA bands that would also be expected from a large RNA virus. [10] ATR spectroscopy has a number of disadvantages when it comes to high throughput screening. First, the sample has to be dried onto the internal reflection element (IRE) prior to the measurement to maximise the absorbance. Second, the residue must be cleaned off the IRE, which increases the potential for transmission through aerosols and via surfaces, posing a danger to the operator. More recently, disposable silicon ATR substrates have been developed and deployed to detect brain cancer in serum samples. [11] These substrates negate the drying and cleaning steps but are prohibitively expensive for routine COVID-19 testing at present. In pursuit of a cheaper approach to high throughput screening, we propose utilizing infrared reflective slides and a modified reflection accessory to obtain high quality spectra of saliva. 
Hitherto, most infrared diagnostics have focused on blood components and only few studies have investigated less invasively collected samples such as saliva [12] . Saliva is emerging as an attractive medium for point-of-care diagnosis of COVID-19 in the current pandemic. The SARS-CoV-2 virus has a preferential tropism to human airway epithelial cells that express the cellular receptor angiotensin-converting enzyme 2 (ACE2) [13] . ACE2 was found to be higher in salivary glands compared to the lungs, indicating salivary glands could be a potential target for SARS-CoV-2 [14] . Wylie et al. [15] reported that SARS-CoV-2 RNA copies were higher in saliva (5.58 mean log copies per millilitre) compared to nasopharyngeal swabs (4.93 mean log copies per millilitre) from 70 COVID-19 positive patients. Besides SARS-CoV-2 virions, saliva is known to have many other COVID-19 biomarkers, including ACE2, adenosine deaminase, immunoglobulin G, immunoglobulin M, RNA and secretory immunoglobulin A [16] . The number of potential biomarkers and complexity of the spectra requires a machine modelling approach, which has already been employed to data mine blood clinical parameters for COVID-19 detection [17] and detecting SARS-CoV-2 in nasal swabs using matrix-assisted laser desorption/ionization-mass spectrometry (MALDI-MS). [18] From a practical perspective, saliva can be self-collected by dribbling into a sterile container reducing time and cost associated with the specimen collection and minimising nosocomial transmission of the disease [13b, 19] . The infrared technique is eminently suited to the analysis of saliva samples because it requires no additional reagents or consumables, is very rapid (less than 5 minutes to record spectra from three replicate samples and do the computation), the sample dries to a homogenous deposit and the data can be directly transferred to a machine learning model for diagnosis. 
A comprehensive schematic diagram of the COVID-19 diagnostic approach is presented in Figure 1 (A-G). In developing a new infrared based test, the first aim was to record a high-quality spectrum of the purified SARS-CoV-2 virus and assign the important marker bands. The SARS-CoV-2 virions were purified from the supernatant of infected Vero cells deactivated by fixation with 4 % formalin. [20] Aliquots from the purified stock solution were deposited and dried onto a BaF2 window. The presence of virions was confirmed by means of Transmission Electron Microscopy (TEM). TEM images clearly show spherical particles approximately 120 nm in diameter with multiple spikes forming the solar crown structure characteristic of coronaviruses (Figure 2 A, B) [21]. This is consistent with previously reported TEM images of SARS-CoV. [22] Atomic Force Microscopy (AFM) confirmed the presence of spherical particles of approximately 120 nm in diameter, notably aggregated together in large clusters (Figure 2 C, D). Synchrotron-FTIR spectra were collected from the individual clusters and the mean and second derivative spectrum calculated (Figure 2E). Numerous bands are observed, reflecting the main chemical constituents of the virion particles (Table S1). These include protein bands at 1657, 1547 and 1517 cm⁻¹ (from spike, envelope, membrane and nucleocapsid proteins) along with lipid bands at 1740, 1464, 1382 and 1341 cm⁻¹ (from the lipid bilayer surrounding the nucleocapsid). In particular, multiple bands associated with RNA are present, including 1690, 1235, 1124, 1089, 996, 967 and 934 cm⁻¹. [23] A similar spectral profile was observed using nanoscale IR spectroscopy (Figure S1), which enables single-virion spectra to be recorded at a vertical resolution of approximately 100 nm. It is important to note that the spectra will have minor contributions from fixative and residual media 
attached to the virus particles. In particular, bands at 1235 cm⁻¹ and 1092 cm⁻¹ were assigned to residual paraformaldehyde, which polymerizes when 4 % formalin dries (Figure S2). Raman spectra of SARS-CoV-2 virions compared to purified RNA show nucleic acid markers at 1242, 1110, 782, 723 and 670 cm⁻¹ and an RNA-specific marker band at 813 cm⁻¹ (Figure 2F). Raman spectroscopy confirms the purity of the virion extraction and shows the very strong RNA bands that characterize this RNA-rich virus. The bands between 800 and 700 cm⁻¹ are unique identifiers for RNA, as evidenced by the corresponding RNA spectrum, which closely matches many bands in the virion particle spectrum. The lack of bands from media or fixative confirms the purity of the virions. The relative contribution of RNA bands to the Raman spectrum is very high, likely reflecting the very high content of RNA in SARS-CoV-2, which has among the largest genomes of RNA-containing viruses. In pursuit of a high throughput tool for point-of-care diagnosis, we optimized a PerkinElmer Spectrum Two infrared spectrometer to perform transflection measurements using a modified reflection accessory for infrared reflective slides. A top plate was 3D printed to enable an aluminum slide holder to be slid along the mount so that the background spectrum and each sample deposit could be measured successively. To improve the throughput, the slides were mounted as low as possible in the top plate and the mirrors of the reflection accessory were adjusted so that the 8 mm beam matched the size of the deposit to maximize absorbance. The larger infrared beam spot enables more sample, and consequently more virions, to be detected in the saliva compared to ATR-FTIR and Raman spectroscopy. Raman has the advantage of being able to measure aqueous samples. However, the small diameter of the laser beam (~1 µm) results in far fewer virions being detected compared to the infrared based approaches. 
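As a rough worked illustration of the beam-size argument, the ratio of illuminated areas can be estimated from the two diameters quoted above. This is only a sketch under stated assumptions: both spots are treated as circular, the deposit is assumed to be uniform, and differences in sampling depth between the techniques are ignored. The symbols A and d (spot area and spot diameter) are introduced here for illustration and are not taken from the original text.

```latex
\[
\frac{A_{\mathrm{IR}}}{A_{\mathrm{Raman}}}
  = \left(\frac{d_{\mathrm{IR}}}{d_{\mathrm{Raman}}}\right)^{2}
  = \left(\frac{8\times10^{-3}\,\mathrm{m}}{1\times10^{-6}\,\mathrm{m}}\right)^{2}
  = 6.4\times10^{7}
\]
```

Under these simplifying assumptions, the 8 mm infrared beam interrogates an area roughly seven orders of magnitude larger than a ~1 µm Raman laser spot.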
Close-up photographs of the accessory are found in the Supporting Information (Figure S3). The transflection approach resulted in greater absorbance and less noise than ATR (Figure 3) because more sample is interrogated by the infrared beam, which improves the ability to detect virions. Other advantages are that samples can be batch dried, viral transmission is minimized by self-collection of saliva with no swab contribution, the instrument does not need to be cleaned between measurements, and the slides can be easily stored for future analysis, making the transflection approach more conducive to point-of-site testing and mass screening than ATR-FTIR. To test directly the application of the infrared approach as a diagnostic for COVID-19 infection, we recorded 171 infrared spectra of triplicate dried saliva deposits from 57 human donors presenting to the Royal Melbourne Hospital with COVID-19 like symptoms (Table S2). Of these, 29 patients tested positive for SARS-CoV-2 by RT-qPCR and 28 tested negative. Figure 4A shows the averaged spectra (4000-800 cm⁻¹) for the SARS-CoV-2 positive and negative saliva samples along with the corresponding second derivative spectra. Some of the spectra show a small contribution from Viral Transport Medium (VTM), which has bands at 1578 cm⁻¹, 1408 cm⁻¹ and 1078 cm⁻¹ (Figure S4). In addition to signatures from infectious agents, infrared spectra acquired from saliva will contain information from components including mucins, proline-rich proteins, cystatins, histatins, statherins, amylases, carbonic anhydrases, salivary peroxidases, lipids, carbohydrates and inorganic compounds including nitrate and thiocyanate. [24] Saliva spectra have a distinctive band from thiocyanate at 2059 cm⁻¹ and strong bands from proteins at 3288 cm⁻¹, 1658 cm⁻¹ and 1549 cm⁻¹ assigned to the amide A, amide I and amide II modes, respectively. 
The protein bands on average appear more intense in the positive samples compared to the negative samples. In general, the protein level of saliva is relatively low compared to other biofluids like serum. [24a] Hence, the significant difference in protein levels between infected and uninfected samples would likely be from viral proteins associated with the virus or antibodies produced by the host. On average, the thiocyanate band appears slightly more intense in the negative samples. Interestingly, thiocyanate (SCN⁻) is converted by salivary peroxidases to hypothiocyanite (OSCN⁻), a potent antibacterial agent in the mouth; however, these levels will also change in response to SCN⁻ intake from food and smoking [24b, 24c]. The bands at 1240 cm⁻¹ and 1078 cm⁻¹ are assigned to the phosphodiester groups associated with nucleic acids [23] and are more intense in the positive saliva samples, indicating a contribution from viral RNA to these bands. However, the 1078 cm⁻¹ band could also have minor contributions from VTM. We first applied Principal Component Analysis (PCA) to investigate general trends in the spectral data. The modelling was performed in the phosphodiester region (1300-800 cm⁻¹) on vector normalized second derivative spectra, where each spectrum is the average of the three replicates (Source Data 1 and External Modelling S1). The 1300-800 cm⁻¹ region was chosen because this is where many important RNA and glycoprotein marker bands are located and there is less interference from VTM. The PC1 vs. PC8 scores plot shows a general separation of infected and uninfected saliva sample spectra along PC1, with most negatives clustered in the lower left quadrant (Figure 4B). The corresponding PC1 vector indicated a number of RNA bands, highlighted in orange (1237, 1121, 1078 and 940 cm⁻¹), distinguishing the positive from the negative samples (Figure 4C). 
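The preprocessing described above (replicate averaging, second derivative, restriction to 1300-800 cm⁻¹, vector normalization) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, a simple central difference stands in for whatever derivative filter was actually used (smoothing derivative filters such as Savitzky-Golay are common in practice), and the order of the steps is an assumption.

```python
from math import sqrt
from statistics import fmean


def average_replicates(replicates):
    """Average replicate spectra (equal-length lists) point by point."""
    return [fmean(values) for values in zip(*replicates)]


def second_derivative(spectrum):
    """Central-difference second derivative on an evenly spaced grid.

    Returns interior points only (two fewer values than the input),
    expressed per grid step squared. Assumption: no smoothing is applied.
    """
    return [
        spectrum[i - 1] - 2.0 * spectrum[i] + spectrum[i + 1]
        for i in range(1, len(spectrum) - 1)
    ]


def select_region(wavenumbers, values, low=800.0, high=1300.0):
    """Keep only the values whose wavenumber lies in [low, high] cm^-1."""
    return [v for w, v in zip(wavenumbers, values) if low <= w <= high]


def vector_normalize(values):
    """Scale a spectrum to unit Euclidean length."""
    norm = sqrt(sum(v * v for v in values))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero spectrum")
    return [v / norm for v in values]


def preprocess(wavenumbers, replicates):
    """Average replicates, differentiate, crop to the region, normalize."""
    mean_spectrum = average_replicates(replicates)
    d2 = second_derivative(mean_spectrum)
    inner_wavenumbers = wavenumbers[1:-1]  # align with the interior points
    return vector_normalize(select_region(inner_wavenumbers, d2))
```

Each patient's three replicate spectra would be passed to `preprocess` together with the shared wavenumber axis, and the resulting vectors stacked into the matrix that is handed to PCA.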
It should be noted that other components, including glycoproteins, would also contribute to bands in this region of the spectrum, but the correlation with the averaged second derivative spectrum of virion particles shown in Figure 2E indicates that the major contribution to the infected saliva spectra is from RNA. Partial Least Squares-Discriminant Analysis (PLS-DA) was employed to create discrimination models to predict the positivity of the sample using first derivative spectra from the patient cohort (Source Data 2 and External Modelling S2). The methodology was evaluated using Monte Carlo Double Cross Validation (MCDCV) with 50 randomized test (30 %) and model (70 %) sets (Figure 5). The test spectra were averaged, and no replicates of test samples were included in the model set, which avoids over-optimistic modelling, the so-called technical replicate trap. Randomizing both the test and model sets provides a robust and unbiased way of testing the model's performance. Furthermore, keeping the number of variables small by using only the 1300-900 cm⁻¹ region further increases the robustness of the modelling, because the more variables in the model (for example, those reflecting the variable amount of viral transport medium in the sample), the greater the chance of finding spurious correlations. Figure 4D shows the Receiver Operating Characteristic (ROC) curve, with an area under the curve of 0.90, which is excellent for a small sample cohort. The MCDCV modelling approach achieved a sensitivity of 93 % (27/29) and a specificity of 82 % (23/28) based on the selection of a threshold of 0.6, which was optimized to reduce the number of false negatives (Figure 4E). Here we report on a new transflection infrared based saliva test for COVID-19. The results showed a 93 % sensitivity and 82 % specificity using the MCDCV modelling approach. Furthermore, we identified specific SARS-CoV-2 infrared and Raman marker bands from highly purified virions. 
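The two ideas central to the validation above, splitting at the patient level so that replicates never straddle the model and test sets, and converting model scores into sensitivity and specificity at a chosen threshold, can be illustrated with a short Python sketch. It is a hedged illustration rather than the published pipeline: the function names are hypothetical, only the outer Monte Carlo split is shown (the inner loop of double cross validation and the PLS-DA model itself are omitted), and the convention that a score at or above the threshold counts as positive is an assumption.

```python
import random


def patient_level_split(patient_ids, test_fraction=0.3, rng=None):
    """Randomly assign whole patients to a model set or a test set.

    Splitting by patient rather than by spectrum keeps all replicates of
    one patient on the same side of the split.
    """
    rng = rng or random.Random()
    shuffled = sorted(set(patient_ids))
    rng.shuffle(shuffled)
    n_test = max(1, round(test_fraction * len(shuffled)))
    return set(shuffled[n_test:]), set(shuffled[:n_test])


def monte_carlo_splits(patient_ids, n_iterations=50, test_fraction=0.3, seed=0):
    """Generate repeated random model/test splits (outer loop only)."""
    rng = random.Random(seed)
    return [
        patient_level_split(patient_ids, test_fraction, rng)
        for _ in range(n_iterations)
    ]


def sensitivity_specificity(scores, labels, threshold=0.6):
    """Compute (sensitivity, specificity) from scores and true labels.

    labels: True for RT-qPCR positive, False for negative.
    Assumption: score >= threshold is called positive.
    """
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    return tp / (tp + fn), tn / (tn + fp)
```

With the counts reported above, 27 of 29 positives and 23 of 28 negatives classified correctly, the function returns 27/29 ≈ 0.93 and 23/28 ≈ 0.82, matching the quoted percentages.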
A larger patient cohort is required to improve the sensitivity and specificity and to determine whether the technique can distinguish SARS-CoV-2 from other respiratory viruses, including influenza and other coronaviruses. RNA RT-qPCR remains the "gold standard" diagnostic technique for SARS-CoV-2 infection. However, there is an urgent need for a point-of-care screening technique that could potentially triage patients for specific RT-qPCR testing. Such a tool would be extremely useful in the current pandemic, enabling onsite screening at airports, sporting venues, universities and schools. An infrared based saliva test is logistically easier to perform, rapid, and minimizes the risk of transmission to health workers. Furthermore, self-collection of saliva would reduce patient discomfort and improve community participation rates in testing. Mr. Finlay Shanks for instrumental support. Dr Emanuele Pedersoli for his assistance in graphic scripting. Funding: Part of this research was undertaken on the IRM beamline at Australian Synchrotron (Victoria, Australia), part of the Australian Nuclear Science and Technology Organisation (ANSTO). We acknowledge the support of the beamtime (Proposal ID. The datasets and model codes are available in the Zenodo repository (https://zenodo.org/record/4156646).