key: cord-0891687-5924kgqr
authors: Chivte, Prajkta; LaCasse, Zane; Seethi, Venkata Devesh R.; Bharti, Pratool; Bland, Joshua; Kadkol, Shrihari S.; Gaillard, Elizabeth R.
title: MALDI-ToF Protein Profiling as a Potential Rapid Diagnostic Platform for COVID-19
date: 2021-09-09
journal: Journal of mass spectrometry and advances in the clinical lab
DOI: 10.1016/j.jmsacl.2021.09.001
sha: c4e64ee2ff39de4212fcaffd99914d47ddb2365a
doc_id: 891687
cord_uid: 5924kgqr

More than a year after the COVID-19 pandemic was declared, the need still exists for accurate, rapid, inexpensive and non-invasive diagnostic methods that yield high specificity and sensitivity towards the current and newly emerging SARS-CoV-2 strains. Compared to the nasopharyngeal (NP) swabs, several studies have established saliva as a more amenable specimen type for early detection of SARS-CoV-2. Considering the limitations and high demand for COVID-19 testing, we employed MALDI-ToF mass spectrometry in the analysis of 60 gargle samples from human donors and compared the resultant spectra against COVID-19 status. Several standards, including isolated human serum immunoglobulins, and controls, such as pre-COVID-19 saliva and heat inactivated SARS-CoV-2 virus, were simultaneously analyzed to provide a relative view of the saliva and viral proteome as they would appear in this workflow. Five potential biomarker peaks were established that demonstrated high concordance with COVID-19 positive individuals. Overall, the agreement of these results with RT-qPCR testing on NP swabs was ≥90% for the studied cohort, which consisted of young and largely asymptomatic student athletes. From a clinical standpoint, the results from this pilot study suggest that MALDI-ToF could be used to develop a relatively rapid and inexpensive COVID-19 assay.

Coronavirus disease 2019 (COVID-19), a highly transmissible disease caused by the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) virus, was discovered in the Hubei Province of China in December 2019 and was declared a pandemic by the World Health Organization (WHO) in March 2020. Globally, there have been more than 189 million confirmed cases of COVID-19, including over 4 million deaths, as of July 2021 [1]. Clinical manifestations of COVID-19 predominantly include fever, dry cough, muscle pain, anosmia and fatigue [2]. Infections may also result in a number of medical complications, which worsen in patients of advanced age or with co-morbidities. In contrast to many other viral infections, 48% of persons who test positive for COVID-19 are asymptomatic [3]. Additionally, a report by Johannson et al. [4] concluded that at least 50% of new infections originate from contact with asymptomatic carriers. These reports are difficult to interpret, however, because not all persons who are asymptomatic at the time of testing remain asymptomatic throughout the course of infection; the percentage of persons who remain asymptomatic is reported to range from 20% to 50% [5], [6].

Facing the ongoing pandemic caused by SARS-CoV-2, early diagnosis of COVID-19 is of great importance to control community outbreaks. Thus, it is crucial to design rapid, sensitive, and accurate diagnostic tests to address this public health crisis. Broadly, there are two major categories of diagnostic tests to detect SARS-CoV-2: i) Detection of the viral RNA genome, and ii) detection of viral proteins and antibody response against the virus [7] .

The first test type reports whether the virus is present at an abundance above the analytical sensitivity of the assay. The second test type detects the presence of antibodies (mostly IgM and IgG) against SARS-CoV-2 within an individual who has been exposed to the virus. Studies have demonstrated that the strong antibody responses against the spike (S) and the nucleocapsid (N) proteins are of high diagnostic utility [8] .

SARS-CoV-2 is an enveloped virus that encodes at least 29 proteins in its RNA genome, four of which are structural proteins: spike (S), membrane (M), envelope (E) and nucleocapsid (N) proteins. The S protein is responsible for binding to the cellular surface receptor angiotensin-converting enzyme 2 (ACE2) through the receptor binding domain (RBD), an essential step for membrane fusion. Activation of the S protein requires cleavage between the S1 and S2 subunits by a furin-like protease and a subsequent conformational change [9].

The current gold standard test for COVID-19 is of the first type described above: molecular testing for the presence of SARS-CoV-2 RNA by quantitative reverse transcription polymerase chain reaction (RT-qPCR). The sensitivity and specificity of RT-qPCR can be high, depending on the primer/probe sets used. False positive results, although uncommon, can occur because of contamination due to the template amplification nature of RT-qPCR, high viral loads in samples, and widespread prevalence of infection. False negative results are generally related to sampling issues, nucleic acid degradation, testing in the very early phase soon after exposure or in the late convalescent phase of the infection, or infections with variants that contain mutations in the primer/probe binding sites. RT-qPCR tests are usually carried out using specimens collected via a nasopharyngeal (NP) or oropharyngeal swab or saliva [7], [10], [11].

However, SARS-CoV-2 and the biomarkers of COVID-19 can be found in multiple other specimen types. These include tracheal aspirate, sputum, whole blood, plasma, and serum [8] . Of particular interest in finding alternatives to NP swabs, it has been reported that standardized saliva collection can be adopted to detect SARS-CoV-2 infection, as saliva droplets are a main vehicle of viral transmission and the salivary glands are reported to be a target of infection as well as a reservoir for the virus [12] , [13] . In fact, approval for using saliva as a primary test material for SARS-CoV-2 was first given to Rutgers' RUCDR Infinite Biologics and collaborators under emergency use authorization (EUA) by the Food and Drug Administration (FDA) [12] . Similar saliva-based tests were authorized for the Yale School of Public Health and implemented by the University of Illinois Urbana Champaign [14] , [15] .

The reliability of viral detection in saliva has been demonstrated with SARS-CoV, Zika and Ebola [16]. Saliva as a specimen for diagnosis provides benefits like rapid, non-invasive collection, cost-effectiveness and patient acceptance. Perhaps of highest diagnostic importance, however, the SARS-CoV-2 load in saliva is reported to be maximum within the first week of symptom onset, providing an avenue for early diagnosis of COVID-19 [17].

In addition to the type of specimen collected for a diagnostic test, it is also crucial to decide upon the type of biomolecule to be detected. Molecular diagnosis of COVID-19 primarily depends on detection of viral RNA. However, RNA is extremely sensitive to degradation by ribonucleases and its extraction process is time-consuming, expensive and demands trained personnel as improper storage and extraction can be a major factor in contributing to false negative results of COVID-19 [11] . Additionally, SARS-CoV-2 has mutated over time [18] . This might prove to be a major challenge for tests that operate by detection of viral RNA by RT-qPCR. Mutations in the primer or probe binding sites may decrease or severely limit the ability of the assay to detect SARS-CoV-2. As such, multiplex RT-qPCR with two or three primer/probe sets may be needed to avoid false negative results. Moreover, a constant and long-term surveillance for mutations in the primer/probe binding sites is needed to ensure adequate test performance.

On the other hand, compared to nucleic acids as an analyte, proteins are more stable and are present in higher amounts in the virus. Although proteins cannot be directly amplified like nucleic acids, this considerably reduces the chances of producing false positive results due to contamination. Furthermore, several different proteins can be detected in the same analysis, which increases the number of diagnostic markers and improves the conclusions drawn from a test. For instance, various classes of proteins, like antibodies, cytokines and viral proteins, can be collectively analyzed using mass spectrometry and other proteomic techniques for their characterization and quantitation [11], [19]. Beyond merely detecting a single target of interest, such as a specific SARS-CoV-2 protein, proteomics combined with informatics can derive actionable information from large data sets and, thus, help in establishing biomarkers for this disease [20]. Such holistic information is necessary to understand the pathogenesis of SARS-CoV-2 infection and may lead to novel therapies not only against the virus, but also to help modify the host response to the virus.

Developing a diagnostic test that overcomes the limitations of nucleic acid testing would assist in fulfilling the worldwide demand of reliable, rapid, accurate and cost-effective testing. Iles et al. have published a preliminary report on utilizing water gargle samples to monitor COVID-19 associated proteins using MALDI-ToF mass spectrometry (MALDI-ToF MS) [21] . The current study proceeds in a similar direction and utilizes saliva to test for the presence of SARS-CoV-2 through a simple gargle procedure followed by MALDI-ToF MS analysis. However, this study also compares the mass spectral analysis to the RT-qPCR status of the individuals from NP swabs. The method described in this article potentially detects both viral proteins and the antibodies produced against them. For the 60 saliva samples that were analyzed, the area under the curve (AUC) of potential viral protein and host protein peaks was used to detect the presence of SARS-CoV-2 and the host immune response against the virus. The values for AUC of the biomarkers and viral proteins were compared to RT-qPCR results from NP swabs sampled in parallel.

This work was approved by the Institutional Review Board and Institutional Biosafety Committee of Northern Illinois University (NIU) (August 12, 2020) as well as the Institutional Review Board of the University of Illinois at Chicago (February 11, 2021). Informed consent was obtained with the signature of the volunteers. Personal identification was not associated with any sample and collected information was limited to demographics, symptoms, and RT-qPCR results. Sample handling and processing followed all biosafety level guidelines.

Samples were collected at a drive-thru gateway testing program for student athletes organized by NIU Athletics at the Yordon Center in DeKalb, Illinois between the months of August and September of 2020. Student athletes received a NP swab sample collection that was subjected to RT-qPCR testing through the DeKalb County Health Department. NP swabs were collected from individuals and analyzed for SARS-CoV-2 by RT-qPCR with the Abbott RealTime SARS-CoV-2 Assay (EUA). The analysis was done by the University of Illinois Hospital laboratory in Chicago. Results were reported as Detected, Not detected or Inconclusive. Students who consented to participate in the research study were asked to provide a water gargle sample at the same time as the NP swab sample collection. Subjects were asked to gargle 10 mL of bottled spring water for 30 seconds, which was then deposited in a 50 mL conical centrifuge tube. Samples were stored at -20 °C until processing and analysis. A total of 550 gargle samples were collected to ensure a substantial pool of COVID-19 positive samples were included. For the current analysis, 60 (30 COVID-19 positives + 30 COVID-19 negatives) of the above total samples were processed and analyzed.

The gargle samples were prepared and analyzed following the method reported by Iles et al. [21] with minor modifications. Gargle samples were thawed and transferred to a 30 mL disposable polypropylene beaker (Fisher Scientific, Waltham, MA). Approximately 5 mL of each sample was filtered through a 0.45 µm polyethersulfone membrane filter (Fisher Scientific, Waltham, MA) with the filtrate being collected in the original 50 mL tube. Next, acetone precipitation was conducted on the filtrate by adding 5 mL of chilled acetone (Sigma-Aldrich, St. Louis, MO) to each tube. Samples were centrifuged in a Beckman Coulter Avanti J-E series centrifuge with a JA-20 rotor at 16,000 x g for 30 minutes at 4 °C. The supernatant was discarded, the rim of the tube patted dry, and the pellet resuspended using 100 µL of 1 M dithiothreitol (DTT) (Sigma-Aldrich, St. Louis, MO) in LBSD-X buffer (MAPSciences, Bedford, UK), a proprietary membrane dissolution and viral envelope protein solubilization buffer. This buffer (reconstitution buffer) was prepared fresh daily by adding appropriate amounts of DTT solution before each analysis. Finally, to recover as much pelleted material as possible, the 100 µL of reconstitution buffer was washed down the sides of the tube thoroughly multiple times. Upon complete reconstitution, the samples were gently vortexed and incubated at room temperature for 15 minutes.

SARS-CoV-2 viral load was quantitated by RT-qPCR in saliva samples by using S gene primers and probe, as reported previously [22]. The remaining aliquot was frozen at -20 °C (or kept on dry ice during transportation) until MALDI-ToF analysis. Then, 500 µL of the specimen was added to 1.5 mL LC-MS grade H2O (OmniSolv, Sigma-Aldrich, St. Louis, MO) and this was subjected to 3X v/v chilled acetone (6 mL) precipitation and incubated overnight at -20 °C. The sample was then processed and analyzed following the gargle sample protocol.

Human serum antibodies IgA (I4036), IgG (I4506) and IgM (I8260) (Sigma-Aldrich, St. Louis, MO) were reduced with 1 M DTT for 10 minutes to a final concentration of 3 pmol of each antibody on the plate. Human salivary α-amylase (A1031, Sigma-Aldrich, St. Louis, MO) was prepared at 100 pmol/µL in LC-MS grade H2O, 1 µL of which was spotted. Preparation of positive and negative controls for MALDI-ToF analysis closely followed that of gargle samples. The negative control, consisting of pooled human saliva (pre-COVID-19) collected before November 2019 (Lee Biosolutions, Maryland Heights, MO), was prepared by spiking 500 µL of the thawed stock into 10 mL of water in a 50 mL tube. From here, the control was filtered and processed following the gargle sample procedure. The positive control (heat inactivated SARS-CoV-2) was obtained from BEI Resources (NR-52286). Briefly, this standard was a heat inactivated, clarified, and diluted cell lysate and supernatant from Vero E6 cells infected with SARS-CoV-2. This sample (225 µL) was treated with 4X v/v of chilled acetone (900 µL) and incubated overnight at -20 °C followed by centrifugation at 10,000 x g for 15 minutes at 4 °C. The pellet was completely dissolved in 25 µL of the reconstitution buffer.

For the assay, a sandwich method of matrix-sample-matrix spotting was employed. Sinapinic acid (Sigma-Aldrich, St. Louis, MO) was used as the matrix and was prepared at 20 mg/mL in a 50:50 LC-MS grade H2O to acetonitrile (Oakwood Chemical, Estill, SC) solution containing 0.1% trifluoroacetic acid (Sigma-Aldrich, St. Louis, MO). First, 1 µL of sinapinic acid matrix was spotted in three wells of a 384 well stainless steel MALDI-MS sample plate (Shimadzu, Kyoto, Japan). After this had air-dried, 1 µL of sample, control, or standard was spotted in each well, immediately followed by 1 µL of sinapinic acid matrix. The matrix was prepared fresh every 7 days and stored at 4 °C between analyses.

Spectral acquisition was performed using a Shimadzu AXIMA Performance MALDI-ToF MS (Shimadzu Kratos Analytical, Manchester, UK) equipped with a nitrogen laser set at 337.1 nm with a pulse width of 3 ns and maximum repetition rate of 60 Hz.

The AXIMA Performance mass spectrometer was operated with the Shimadzu Biotech Launchpad Software (version 2.9.4) and was run in positive-ion linear detection mode. The laser power and repetition rate were set at 100 µJ/pulse and 50 Hz, respectively. Spectra were acquired by summing 5,000 spectra (250 profiles by 20 shots) in a range of 2,000-200,000 m/z per sample by shooting in a raster pattern over the target well. The ion gate was set to blank values below 1,500 m/z. Pulsed extraction was set to 50,000 m/z.
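As a rough check on acquisition time, the shot count and repetition rate stated above can be combined. This is only a back-of-the-envelope estimate: it counts laser firing time alone and assumes the 50 Hz rate is sustained throughout, ignoring stage movement between raster positions and any software overhead.

```latex
N_{\text{shots}} = 250\ \text{profiles} \times 20\ \text{shots/profile} = 5000,
\qquad
t_{\text{laser}} \approx \frac{N_{\text{shots}}}{50\ \text{Hz}} = \frac{5000}{50\ \text{s}^{-1}} = 100\ \text{s per well}
```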

The instrument was calibrated daily using the (M+H)+ and (M+2H)2+ peaks of ProteoMass Apomyoglobin MALDI-MS Standard (Sigma-Aldrich, St. Louis, MO), prepared at 100 pmol/µL in LC-MS grade H2O and spotted as described above. Signal intensities of the calibrant were recorded throughout the entire analysis to track inter-day instrument performance. Calibration was accepted if the mass deviation was less than 500 mDa.
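The acceptance criterion above can be expressed as a small check. The following is a minimal sketch, not the instrument software's procedure: the function names are hypothetical, the neutral mass of the calibrant is passed in by the caller rather than taken from the vendor certificate, and the deviation is assumed to be measured on the m/z axis.

```python
PROTON_MASS = 1.007276  # Da, mass of a proton


def expected_mz(neutral_mass, charge):
    """m/z of the (M + zH)z+ ion for a given neutral mass (Da) and charge z."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def calibration_ok(observed_mz, neutral_mass, charge, tol_da=0.5):
    """Return True if the observed peak lies within tol_da (500 mDa) of the expected m/z."""
    return abs(observed_mz - expected_mz(neutral_mass, charge)) < tol_da


if __name__ == "__main__":
    # Illustrative neutral mass only (assumption); use the value supplied with the standard.
    m = 16951.5
    print(calibration_ok(16952.8, m, charge=1))
    print(calibration_ok(8476.9, m, charge=2))
```

Passing the neutral mass as a parameter keeps the check independent of any particular calibrant lot.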

The Shimadzu Biotech Launchpad was used to export a text file for each gargle sample that included the spectrum of mass-to-charge values ranging from 2,000 m/z to 200,000 m/z alongside respective ion count intensities. In preprocessing, seven subranges (shown in Table 1) were identified where m/z values were presumably indicative of host immune proteins or viral proteins. The intermittent values in each of the seven subranges indicated the presence of similar protein masses, hence ion counts in each subrange were coalesced together through integration of points to calculate the AUC, which produced seven features for each data sample. AUC was computed by leveraging the composite Simpson's rule for each spectral range [23]. Simpson's rule is a numerical integration method that divides the integral range into shorter subintervals (indicated by three consecutive data points) and uses a quadratic polynomial on each subinterval to approximate the curve. If we define the width between each point as $h$, and the start and end of each range as $x_0$ and $x_n$, respectively, where $n$ is the number of widths between the points $x_0, \dots, x_n$, the width ($h$) can be calculated as shown in equation (1).

$h = \frac{x_n - x_0}{n}$ (1)

One limitation of this approach is the requirement that $h$ be uniform between each consecutive point. In the mass spectrometric data, m/z values were recorded to two decimal places and were unevenly spaced, with different widths between consecutive points. To make the widths uniform, each protein mass was rounded to the nearest integer and the mean of ion counts was taken at each integer value. This made the widths uniform with $h = 1$. If the points on the curve $x_0, x_1, x_2, \dots, x_n$ have corresponding values $f(x_0), f(x_1), f(x_2), \dots, f(x_n)$, respectively, then the area for the first subinterval $A_1$ can be calculated using equation (2).

$A_1 = \frac{h}{3}\left[f(x_0) + 4f(x_1) + f(x_2)\right]$ (2)

Summing the areas for all subintervals (i.e., $A_1, A_2, A_3, \dots, A_{n/2}$) gives the composite function to calculate AUC as shown in equation (3).

$\mathrm{AUC} = \sum_{i=1}^{n/2} A_i = \frac{h}{3}\left[f(x_0) + 4\sum_{i=1}^{n/2} f(x_{2i-1}) + 2\sum_{i=1}^{n/2-1} f(x_{2i}) + f(x_n)\right]$ (3)

Since each curve was not smooth, lesser error could be achieved with Simpson's rule for integration as compared to the trapezoidal rule, which approximates the integral by forming trapezoids with straight lines in each subinterval [24].
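The preprocessing and integration steps described above can be sketched in Python. This is an illustrative sketch rather than the authors' code: the function names are hypothetical, integer m/z bins with no data points are assumed to contribute zero intensity, and a subrange with an odd number of unit widths is handled by applying a single trapezoid to the final width, a choice the text does not specify.

```python
import math
from collections import defaultdict


def bin_to_integer_mz(mz_values, intensities):
    """Round each m/z to the nearest integer and average the ion counts per integer."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for mz, inten in zip(mz_values, intensities):
        key = math.floor(mz + 0.5)  # round half up, avoiding Python's banker's rounding
        sums[key] += inten
        counts[key] += 1
    return {k: sums[k] / counts[k] for k in sums}


def simpson_auc(binned, lo, hi):
    """Composite Simpson's rule with h = 1 over the integer m/z values lo..hi."""
    y = [binned.get(m, 0.0) for m in range(lo, hi + 1)]  # empty bins -> 0 (assumption)
    n = len(y) - 1  # number of unit widths
    if n < 2:
        raise ValueError("need at least three points for Simpson's rule")
    tail = 0.0
    if n % 2 == 1:
        # Odd number of widths: integrate the last one with a trapezoid (assumption).
        tail = 0.5 * (y[-2] + y[-1])
        y = y[:-1]
        n -= 1
    odd = sum(y[1:n:2])   # f(x_1), f(x_3), ..., f(x_{n-1})
    even = sum(y[2:n:2])  # f(x_2), f(x_4), ..., f(x_{n-2})
    return (y[0] + 4.0 * odd + 2.0 * even + y[n]) / 3.0 + tail
```

Calling `simpson_auc` once per subrange of Table 1 on the binned spectrum would yield the seven features per sample.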

[Table 1: Range (m/z)]

Recently, studies utilizing MALDI-MS as a potential method for the detection of SARS-CoV-2 have been reported [25], [26]. However, these studies involved NP swab sampling, were limited in the mass-to-charge range of the collected mass spectra (less than 20,000 m/z) and lacked association of peaks to any molecular identity. In another study, published recently by Iles et al. [21], MALDI-MS was employed to detect both viral and human proteins from samples of gargled water. Though this study had a non-invasive sample collection and a wider m/z range, no direct comparison was made with RT-qPCR results in clinical samples.

Herein, we report the analysis of 60 water gargle samples using a MALDI-ToF methodology compared to the RT-qPCR status of individuals done on NP swabs. We, however, did not directly analyze the gargle samples by RT-qPCR, and instead relied on the results from NP swabs to classify an individual as positive or negative for SARS-CoV-2. By comparing the mass spectra in known COVID-19 positive and COVID-19 negative individuals, clear distinctions were observed in the range of 20,000 to 200,000 m/z, a range not analyzed in previous COVID-19 MS studies. Furthermore, the AUC of putative host and viral protein peaks was used to correlate the MALDI-ToF profiles with the RT-qPCR status.

A baseline spectrum of peaks was determined in saliva collected prior to the emergence of COVID-19 (Section 2.3). This spectrum served as a control against which any changes in COVID-19 positive individuals could be compared. Figure 1 presents the profile for a representative COVID-19 negative sample that is nearly identical in terms of m/z peaks to the spectrum for the negative control. Of particular note are the peaks consistent between the two profiles, as salivary proteins would be present in both diseased and healthy individuals. The identity of these proteins is suggested later (Section 3.3).

Distinct differences in MALDI-ToF spectra were observed in individuals who were COVID-19 negative or positive in NP swabs ( Figure 2 ). Overall, gargle samples from COVID-19 positive individuals showed higher intensities along with additional peaks in the spectrum. Peaks around 23,000 m/z, 28,000 m/z and 56,000 m/z were present in both positive and negative individuals. Interestingly, these peaks showed a higher intensity in the COVID-19 positive cases (Figure 2A ). Additional peaks that were unique to COVID-19 positive individuals were present between the ranges of 33,000 m/z to 51,000 m/z and 65,000 m/z to 120,000 m/z ( Figure 2B and 2C).

It is important to note that although the peaks found within these ranges could be used to reasonably separate COVID-19 positive from negative individuals, variation in peak intensities was apparent between COVID-19 positive spectra. We suspect that this may be due to sampling at different time points in the course of the infection and varying host immune response.

Included among the viral and immune proteins discussed is a peak located at 11,150 m/z used as a quality control feature that is most likely cystatin A, a resident protein of saliva [19] . The presence of this peak was used as an internal control to deem a sample as successfully gargled and appropriate to be included for analysis.

The mass spectra of gargle specimens contain signals from host salivary proteins and viral proteins in COVID-19 positive individuals. Hence, it is crucial to identify peaks that confirm the presence of SARS-CoV-2 in COVID-19 positive individuals along with markers of an immune response against the virus.

To begin, we analyzed the mass spectra of pre-COVID-19 saliva and a gargle sample from an individual who tested negative in an NP swab by RT-qPCR (Figure 1 ) to establish a baseline spectrum against which spectra from COVID-19 positive individuals could be compared. The most prominent peak in these spectra was observed near 56,000 m/z. Two smaller peaks are also present near 23,300 m/z and 28,000 m/z. Based on a previous study, we suspected that the peaks near 56,000 m/z and 23,000 m/z most likely represent IgA heavy chain and Ig light chains, respectively [21] .

To support this notion, we analyzed three human serum immunoglobulins (IgA, IgG and IgM) under reducing conditions using the same protocol as in gargle samples (Figure 3 ). The heavy chains for IgA, IgG and IgM were detected at 57,372 m/z, 51,142 m/z and 71,678 m/z, respectively, while the light chains for these antibodies were found between 23,000 m/z and 24,000 m/z. These results suggest that the peaks around 56,000 m/z and 23,000 m/z in pre-COVID-19 saliva and the gargle samples from a COVID-19 negative individual are indeed IgA heavy chain and Ig light chains. This is not surprising, given basal levels of secretory IgA and light chains are nearly always detected in saliva [17] . Various other peaks were observed throughout the spectra of the immunoglobulin standards, which may represent combinations of heavy and light chains, dimers of heavy chains, and multiply-charged ions.

Interestingly, the IgA heavy chain and Ig light chain peaks were almost twice as intense in COVID-19 positive individuals when compared to COVID-19 negative individuals. We suspect that this may be due to a robust immune response against SARS-CoV-2 upon exposure to the virus. Along the same lines, another feature to note is the existence of a peak at 70,603 m/z in COVID-19 positive individuals that could correspond to the heavy chain peak of IgM at 71,678 m/z. This suggests an early response to infection in COVID-19 positive individuals, as it has been reported that viral-specific IgM antibodies are produced first, followed by IgA and IgG [27].

Additionally, human α-amylase is found in two forms in saliva with molecular weights of around 56 kDa (unglycosylated) and 62 kDa (glycosylated) [28], [29]. MALDI-MS data have also been reported for human salivary α-amylase and a Y151M mutant, both having a parent ion peak near 56 kDa [30], [31]. Because these masses fall within the region discussed above regarding heavy chains of IgA, samples of α-amylase were also analyzed by MALDI-MS. The observed spectra exhibit a peak at 56,600 m/z with a shoulder at 58,000 m/z (Figure S1); therefore, α-amylase overlaps with the IgA heavy chain peak in the gargle sample spectra.

Apart from the IgM heavy chain peak near 71,000 m/z, additional peaks were observed between 65,000 m/z and 120,000 m/z in gargle samples from COVID-19 positive individuals. Peaks were prominent in the ranges of 66,400 to 68,100 m/z and 78,600 to 80,500 m/z (Figure 2C). Comparing the UniProt database (UniProtKB P0DTC2) and the work of Iles et al., the peak at 79,837 m/z most likely represents a signal for the S1 fragment of the SARS-CoV-2 spike protein; a peak with a similar m/z was also observed in the positive control spectrum from SARS-CoV-2 (Figure 4). Furthermore, the peaks described by Iles et al. [21] as viral envelope proteins (VEPs) were also visible in COVID-19 positive profiles (Figure 2B).

Compared to COVID-19 negative individuals, a signal was consistently observed in the range of 66,000 m/z to 68,000 m/z in gargle samples from COVID-19 positive individuals. By comparing numerous positive spectra, it appears that there are at least two distinct species with closely overlapping m/z envelopes. This peak could represent the S2 fragment of the SARS-CoV-2 S protein, which is predicted by UniProt (UniProtKB P0DTC2) to have a mass of 64.5 kDa (unglycosylated) with additional mass arising from extensive glycosylation [32]. Alternatively, this peak may arise from a fragment of IgA. As such, this signal cannot be unequivocally identified at this time. It should be noted that, in the mass spectrum of viral isolates from cell culture, an intense peak from bovine serum albumin occurs at 66,600 m/z; this unfortunately would suppress the signal from S2 (if present) into the baseline for the control samples of SARS-CoV-2 (Figure 4).

Panel A shows the entire mass range collected and Panel B shows the major peak signals beyond 70,000 m/z.

To estimate the sensitivity of the MALDI-ToF protocol, we tested a saliva sample that contained a very low SARS-CoV-2 viral load by RT-qPCR, as described previously [22]. The Ct value of this sample was 36.09 and, as such, the viral load was less than the quantifiable limit of the assay (600 copies/mL). An aliquot of this saliva sample was processed using the protocol for saliva samples in this study (Section 2.3) and the MALDI-ToF mass spectrum was acquired using the same parameters as gargle samples. The observed signal in the mass spectrum for the potential biomarker peak found between 78,600 and 80,500 m/z was approximately 3 times the baseline noise level (S/N = 3), a commonly accepted value for finding the limit of detection in MALDI-ToF methods (Figure S2). Based on the S/N ratio, we interpreted this sample as being positive for SARS-CoV-2. This single result suggests that the MALDI-ToF protocol may be as sensitive as RT-qPCR for detecting SARS-CoV-2 in specimens that contain very low viral loads. More studies are necessary to determine the exact limit of detection of the MALDI-ToF protocol.
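A signal-to-noise estimate of the kind described above can be sketched as follows. The text does not state how signal and noise were measured, so this sketch makes its own assumptions: signal is the maximum intensity inside the peak window minus the median of a peak-free baseline window, and noise is the standard deviation of that baseline window. The function name and both window arguments are hypothetical.

```python
import statistics


def signal_to_noise(mz, intensity, peak_window, noise_window):
    """Estimate S/N for a peak window relative to a peak-free baseline window.

    peak_window and noise_window are (low, high) m/z bounds, inclusive.
    """
    peak = [i for m, i in zip(mz, intensity) if peak_window[0] <= m <= peak_window[1]]
    noise = [i for m, i in zip(mz, intensity) if noise_window[0] <= m <= noise_window[1]]
    if not peak or len(noise) < 2:
        raise ValueError("windows must contain data points")
    baseline = statistics.median(noise)  # baseline level (assumed definition)
    sigma = statistics.pstdev(noise)     # noise level (assumed definition)
    if sigma == 0:
        raise ValueError("baseline window has zero variance")
    return (max(peak) - baseline) / sigma
```

Under these assumptions, a value of about 3 or greater for the 78,600 to 80,500 m/z window would correspond to the detection criterion quoted in the text.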

In order to compare the results of the NP RT-qPCR samples and the protein profiles collected with gargle samples via MALDI-ToF, the AUC was calculated under the seven peak ranges of interest listed in Table 1 for each sample. As stated above, the peak located at 11,150 m/z is most likely to be associated with cystatin A, a resident protein of saliva [19]. The presence of this peak was used as an internal control. Table S1 includes the AUC of this peak along with the six other features for all specimens, each titled as the reported RT-qPCR result on NP swabs. This table (Table S1) was used to sort the samples by AUC values for a given feature from largest to smallest. Sorting in this way showed a reasonable separation between COVID-19 positive and COVID-19 negative specimens for five out of the seven features, including the peaks of S1, S2/immune protein, immunoglobulin heavy/amylase, immunoglobulin heavy doubly-charged, and the biomarker near 112,000 m/z as seen in Figure 5.

To establish a cutoff threshold for each potential biomarker as it compares to the COVID-19 status, the AUC values for each range were sorted from high to low, as described above, and compared to the disease status. Cutoffs were chosen to yield the best separation of negative/positive samples. The samples with AUC values above the threshold were assigned MALDI-ToF positive and values below were assigned negative. The cutoff values for a given feature chosen under this criterion are reported in Table 2, along with the percent agreement between RT-qPCR results and our analysis. For all five potential biomarkers mentioned in Table 2, we achieved 90% or higher agreement with the RT-qPCR results. Although the identities of the proteins are yet to be confirmed, these m/z ranges exhibit a clear relationship with the COVID-19 status. We refined these observations further with ROC (Receiver Operating Characteristic) curve analysis, a commonly used technique in clinical studies to determine sensitivities and specificities at different threshold values of an analyte [33]. The true positive rate (sensitivity) on the Y-axis was plotted against the false positive rate (100-specificity) on the X-axis at each of the observed AUCs for every peak, shown in Figure 6. The analysis demonstrates the ability of the MALDI-ToF assay to discriminate between positive/negative COVID-19 status while providing a quantitative optimization of sensitivity/specificity for each potential biomarker, as summarized in Table 3.

Panel 6A plots data for the internal quality control biomarker tentatively assigned as cystatin A. The data points lie closely along the diagonal line with the AUC indicating no discrimination between COVID-19 negative and positive samples, as predicted. Panels 6C-G utilize data for the potential biomarkers S1, S2/immune protein, immunoglobulin heavy/amylase, immunoglobulin heavy doubly-charged, and the potential biomarker near 112,000 m/z. These five biomarkers show highly sensitive and specific discrimination between the COVID-19 positive and negative individuals with AUCs ≥ 0.933 and sensitivities and specificities of 93.33-100% and 90-93.33%, respectively (Table 3). The closer the AUC is to 1 for ROC curve analysis, the better the model for predicting the disease state. More detailed results of the ROC curve analysis for each potential biomarker can be found in Table S2. Panel 6B displays the graph for the immunoglobulin light chains m/z region, which shows a much weaker discrimination ability; this phenomenon was also observed anecdotally from visual examination of the mass spectra. In summary, the ROC analysis indicates that five peaks in the MALDI-ToF spectrum are capable of highly sensitive and specific discrimination of COVID-19 status in individuals.

It is interesting to note that 89% of the positive samples were from donors who were asymptomatic. This is not surprising given that the student athletes in this study are generally young, healthy individuals. This cohort has been an elusive group to track, particularly early in the pandemic when only symptomatic persons were eligible for RT-qPCR testing. Towards the beginning of the pandemic, the COVID-19 positive, asymptomatic group was reported to be as contagious as symptomatic persons [34] and had been suggested to be disproportionately responsible for spreading the virus due to a lack of symptoms. This could be of concern in a setting such as a university where communal housing, dining and recreational facilities are predominant. Recent reports, however, cast doubt on a linear relationship between viral loads, symptoms and transmissibility, i.e., how much of an RT-qPCR detected viral load is replicating virus that can be transmitted to other persons is unclear. Transmissibility will need to be confirmed by viral cultures [5], [35].
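The cutoff selection and ROC analysis described earlier in this section can be illustrated with a short sketch. It is not the authors' analysis pipeline: the function names are hypothetical, candidate cutoffs are assumed to be midpoints between adjacent observed feature values, and the ROC area is computed through the equivalent rank (Mann-Whitney) formulation. To avoid confusion with the ROC area, the spectral AUC of a peak range is called a feature value here.

```python
def agreement_metrics(feature_values, pcr_positive, cutoff):
    """Sensitivity, specificity and overall agreement with RT-qPCR at a given cutoff."""
    calls = [v > cutoff for v in feature_values]  # above the cutoff -> MALDI-ToF positive
    tp = sum(c and p for c, p in zip(calls, pcr_positive))
    tn = sum((not c) and (not p) for c, p in zip(calls, pcr_positive))
    fp = sum(c and (not p) for c, p in zip(calls, pcr_positive))
    fn = sum((not c) and p for c, p in zip(calls, pcr_positive))
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(calls)


def best_cutoff(feature_values, pcr_positive):
    """Pick the midpoint cutoff giving the highest agreement (first one wins on ties)."""
    levels = sorted(set(feature_values))
    candidates = [(a + b) / 2 for a, b in zip(levels, levels[1:])]
    return max(candidates,
               key=lambda c: agreement_metrics(feature_values, pcr_positive, c)[2])


def roc_auc(feature_values, pcr_positive):
    """Area under the ROC curve: the fraction of positive/negative pairs ranked correctly."""
    pos = [v for v, p in zip(feature_values, pcr_positive) if p]
    neg = [v for v, p in zip(feature_values, pcr_positive) if not p]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)
    return wins / (len(pos) * len(neg))
```

Choosing the cutoff on the same samples used to report agreement tends to give optimistic estimates, which is one reason validation on a larger, independent cohort matters.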

A highly sensitive and specific saliva/gargle test was developed for diagnosing COVID-19 infection using MALDI-ToF MS. The method described in this study is relatively rapid and inexpensive and is sensitive enough to detect SARS-CoV-2 infection in samples with very low viral loads. Its advantages over other tests include non-invasive sampling and the ability to observe both viral proteins and the host response. Comparison of COVID-19 status (assessed by RT-qPCR of NP swabs) with MALDI-ToF analysis over a wide range of 2,000-200,000 m/z identified five potential biomarkers of COVID-19 infection. These results need to be validated on a larger cohort of samples. It is also crucial to identify the potential biomarker peaks in order to understand the relationship of the proteins with the course of the disease. Preliminary studies to identify the peaks were performed by examining human immunoglobulins and heat inactivated SARS-CoV-2 virus with the reported assay and comparing the peak masses of these controls to those observed for gargle samples. Further confirmatory analysis, such as sequencing of the proteins that occur in the saliva proteome of healthy and SARS-CoV-2-infected individuals, is required. Verification of the identity of the human immune biomarkers may provide a useful tool for monitoring the immune response under various conditions and stressors. While discrimination of COVID-19 status was based on the AUC of the separate potential biomarkers, we aim to develop machine learning models that use multiple peaks in tandem to achieve unbiased, higher accuracies for clinical diagnosis.

P. Chivte, Z. LaCasse, V. Seethi, J. Bland, S. S. Kadkol, P. Bharti, and E.R. Gaillard have no competing interests to declare.

The protocol "Use of MALDI TOF mass spectrometry to analyze SARS-CoV-2 viral proteins" was approved by the Northern Illinois University Institutional Review Board on August 12, 2020. The protocol "Methods to detect SARS-CoV-2 in saliva without nucleic acid extraction" was approved by the Institutional Review Board of the University of Illinois at Chicago on February 11, 2021.

World Health Organization

Symptoms of Coronavirus

Asymptomatic SARS-CoV-2 Carriers: A Systematic Review and Meta-Analysis

SARS-CoV-2 Transmission From People Without COVID-19 Symptoms

Asymptomatic transmission of COVID-19: What we know, and what we don't

Occurrence and transmission potential of asymptomatic and presymptomatic SARS-CoV-2 infections: A living systematic review and meta-analysis

The interpretation of SARS-CoV-2 diagnostic tests

Diagnostics for SARS-CoV-2 detection: A comprehensive review of the FDA-EUA COVID-19 testing landscape

Molecular Architecture of the SARS-CoV-2 Virus

Laboratory diagnosis of SARS-CoV-2 - A review of current methods

Molecular Diagnosis of COVID-19: Challenges and Research Needs

SARS-CoV-2 identification and IgA antibodies in saliva: One sample two tests approach for diagnosis

SARS-CoV-2 infection of the oral cavity and saliva

SalivaDirect: A simplified and flexible platform to enhance SARS-CoV-2 testing capacity

Saliva-based molecular testing for SARS-CoV-2 that bypasses RNA extraction

Saliva is a reliable tool to detect SARS-CoV-2

COVID-19 salivary signature: diagnostic and research opportunities

The Impact of mutations in SARS-CoV-2 spike on viral infectivity and antigenicity

Mass spectrometric identification of SARS-CoV-2 proteins from gargle solution samples of COVID-19 patients

Proteomics and informatics for understanding phases and identifying biomarkers in COVID-19 disease

Development of a clinical MALDI-ToF mass spectrometry assay for SARS-CoV-2: Rational design and multidisciplinary team work

Development and validation of viral load assays to quantitate SARS-CoV-2

Effect of the integration method on the accuracy and computational efficiency of free energy calculations using thermodynamic integration

Area under a curve: trapezoidal and Simpson's rules

Detection of SARS-CoV-2 in nasal swabs using MALDI-MS

A combined approach of MALDI-TOF mass spectrometry and multivariate analysis as a potential tool for the detection of SARS-CoV-2 virus in nasopharyngeal swabs

Immune response to SARS-CoV-2 and mechanisms of immunopathological changes in COVID-19

Structure of human salivary α-amylase crystallized in a C-centered monoclinic space group

Electrophoretic characterization of posttranslational modifications of human parotid salivary alpha-amylase

Transglycosylations catalysed by Y151M mutant of human salivary α-amylase (HSA)

MS characterization of multiple forms of alpha-amylase in human saliva

Characterization of the SARS-CoV-2 S protein: Biophysical, biochemical, structural, and antigenic analysis

The use of receiver operating characteristic curves in biomedical informatics

A systematic review of asymptomatic infections with COVID-19

SARS-CoV-2 detection, viral load and infectivity over the course of an infection

The authors gratefully acknowledge instrumentation support from Shimadzu Scientific Instruments, Inc. We thank Dr. Ray Iles, MAPSciences, for the kind gift of LBSD-X buffer and Northern Illinois University Athletics (Yordon Center) for assisting us with the sample collection. The positive control reagent (Figure 4) was deposited by the Centers for Disease Control and Prevention and obtained through BEI Resources, NIAID, NIH: SARS-Related Coronavirus 2, Isolate USA-WA1/2020, Heat Inactivated, NR-52286.