key: cord-0871809-ln414qsl
authors: Tan, Shaun S.; Saw, Sharon; Chew, Ka Lip; Huak, Chan Yiong; Khoo, Candy; Pajarillaga, Anastacia; Wang, Weixuan; Tambyah, Paul; Ong, Lizhen; Jureen, Roland; Sethi, Sunil K.
title: Head-to-head evaluation on diagnostic accuracies of six SARS-CoV-2 serological assays
date: 2020-09-30
journal: Pathology
DOI: 10.1016/j.pathol.2020.09.007
sha: 19a5447104000342ac3e58491341ea27b96a77cf
doc_id: 871809
cord_uid: ln414qsl

In this study, we evaluated and compared six SARS-CoV-2 serology kits: the Abbott SARS-CoV-2 IgG assay, Beckman Access SARS-CoV-2 IgG assay, OCD Vitros Anti-SARS-CoV-2 Total antibody assay, Roche Elecsys Anti-SARS-CoV-2 assay, Siemens SARS-CoV-2 Total assay, and cPass surrogate viral neutralising antibody assay. A total of 336 non-duplicated residual serum samples were used, obtained from PCR-confirmed COVID-19 patients (n=173) and from negative controls (n=163) collected before December 2019, prior to the COVID-19 pandemic. These were concurrently analysed on the different immunoassay platforms and correlated with clinical characteristics. All assays had specificity ranging from 99.3% to 100.0%. Overall sensitivities across all days of symptoms, in descending order, were OCD (49.1%, 95% CI 41.8–56.5%), cPass (44.8%, 95% CI 37.5–52.3%), Roche (41.6%, 95% CI 34.5–49.0%), Siemens (39.9%, 95% CI 32.9–47.3%), Abbott (39.8%, 95% CI 32.9–47.3%) and Beckman (39.6%, 95% CI 32.5–47.3%). Testing after at least 14 days from symptom onset is required to achieve AUCs greater than 0.80. OCD and cPass performed the best in terms of sensitivity for >21 days of symptoms with 93.3% (95% CI, 73.5–99.2%) and 96.7% (95% CI, 82.8–99.9%), respectively. Both also shared the greatest concordance, kappa 0.963 (95% CI 0.885–1.0), p<0.001, and had the lowest false negative rates. Serology results should be interpreted with caution in certain cases. False negatives were observed in a small number of individuals with COVID-19 who were on immunosuppressive therapy, were pauci-symptomatic, or had received antiretroviral therapy. In conclusion, all assays exhibited excellent specificity, and total antibody assays with spike protein configurations generally outperformed nucleocapsid configurations and IgG assays in terms of diagnostic sensitivity.

A cluster of unexplained viral pneumonia cases first identified in Wuhan in December 2019 has now progressed to the global pandemic known as coronavirus disease 2019 (COVID-19). Clinical features include a spectrum ranging from asymptomatic infection, mild acute respiratory illness, fever, and diarrhoea, to multi-organ failure requiring intensive care support. 1 The presence of co-morbidities such as diabetes mellitus, hypertension, and cardiovascular disease confers poorer clinical outcomes and increased mortality. 2 Laboratory markers such as elevated white blood cell count, neutrophil count, lactate dehydrogenase, alanine aminotransferase, aspartate aminotransferase, total bilirubin, creatinine, troponins, D-dimer and procalcitonin are associated with poorer prognosis for COVID-19 patients. 3 Timely laboratory testing is crucial for clinical management and for public health efforts to limit transmission. Currently, the gold standard for diagnosing COVID-19 is detection of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) RNA via real time reverse transcription polymerase chain reaction (rRT-PCR), targets of which may include a combination of nucleocapsid (N), envelope (E), RNA-dependent RNA polymerase (RdRp), and open reading frame (orf1a and orf1b) genes. 4 SARS-CoV-2 serological assays have both clinical and epidemiological uses. At the individual level, serological tests may be used to support clinical diagnosis by determining recent or previous infection, thereby contributing to quarantine measures, or by determining the immunological status in vaccinated individuals with a view to booster requirements. In addition, there have been suggestions regarding the utility of 'immune passports', with seropositive status conferring potential immunity and allowing the individual to return to work or be certified fit to travel. 5
At a public health level, population screening with serology offers insights into epidemiology and immunity of the population. Also, seroprevalence data may allow government officials to more effectively implement intervention strategies such as lockdowns or targeted physical distancing measures. The advent of commercially available SARS-CoV-2 serological assays allows for more ubiquitous testing amongst laboratories. Nevertheless, there are currently limited head-to-head comparisons of test performance between the available testing platforms. The aim of our study is to compare six commercial serological tests with various assay design configurations directed against separate SARS-CoV-2 antigens, including a viral neutralisation antibody assay, and to inform users as to their clinical and interpretive differences.

Ethics approval for our study was given by the National Healthcare Group Domain Specific Review Board (NHG ROAM Reference Number: 2020/00337 and 2020/00407).

In our institution, we had access to a number of instruments used in the latest SARS-CoV-2 serological assays. For purposes of this study, we used the Abbott Architect i4000SR for the Abbott SARS-CoV-2 IgG (Abbott Diagnostics, USA), Roche Cobas E411 for testing the Elecsys Total Anti-SARS-CoV-2 antibody (Roche Diagnostics, Switzerland), UniCel DxI 800 for the Access SARS-CoV-2 IgG antibody assay (Beckman Coulter, USA), ADVIA Centaur XPT for Siemens SARS-CoV-2 Total (COV2T) assay (Siemens Healthineers, Germany), VITROS 3600 for the Vitros OCD Anti-SARS-CoV-2 Total antibody (Ortho Clinical Diagnostics, USA), and the Evolis Complete System (Bio-Rad, USA) for the cPass SARS-CoV-2 surrogate viral neutralising antibody detection ELISA kit (Genscript, USA). A summary of the analysers used, the assay configuration and interpretation of the quantitative values is provided in Table 1. These assays will subsequently be referred to as Abbott, Beckman, cPass, OCD, Roche, and Siemens assays.

We collected a total of 336 non-duplicated residual serum samples for the study that were obtained from COVID-19 confirmed patients (n=173) and negative controls (n=163) obtained pre-December 2019 before the COVID-19 pandemic. We prospectively selected samples between 30 March 2020 and 15 June 2020 from COVID-19 patients in our institution. When samples were collected from patients, we recorded the number of days following onset of symptoms. All had a minimum of one real time reverse transcription polymerase chain reaction (rRT-PCR) respiratory sample positive on our Cobas 6800 SARS-CoV-2 assay (Roche Diagnostics, Switzerland), or the A*Star Fortitude kit (Accelerate Technologies, Singapore) with the cycle threshold (CT) value being lower than cut-off. This was the gold standard to which our results were compared. Samples were collected in serum separator tubes (Becton Dickinson, USA), centrifuged at 3000 rpm for 8 min and, after clinical testing, residual sera were collected in accordance with previously described laboratory protocols for COVID-19 sample handling. 6 Days of symptoms were recorded based on first day of onset of COVID-19 symptoms, as documented by managing clinicians. Patients who were asymptomatic at the time of PCR testing were excluded. Archived serum samples taken from patients prior to December 2019 representing COVID-19 naivety were used as negative controls.
These included healthy blood donors as well as patients with and without other positive serological tests: anti-extractable nuclear antigen antibodies (9); anti-glomerular basement membrane antibodies (4); anti-smooth muscle antibody (3); hepatitis A IgM (3); Epstein-Barr virus IgM (3); anti-intrinsic factor IgG (5); cytomegalovirus IgM (4); cytomegalovirus IgG (3); syphilis treponemal antibody (5); Epstein-Barr virus IgA (7); Leptospira IgM (3); hepatitis C antibody (9); hepatitis B surface antigen (7); hepatitis B e antigen (2); anti-double stranded DNA IgG (3); rubella IgM (4); antinuclear antibodies (3); hepatitis A IgG (3); dengue virus IgG (1); varicella zoster IgM (1); human immunodeficiency virus (8); and varicella zoster virus IgG (6). Prior to testing patients' sera, calibration was performed and quality controls were passed as per manufacturers' instructions.

Statistical analyses were performed using the Statistical Package for Social Sciences (SPSS) 25.0 (IBM, USA) with statistical significance determined at p<0.05. Sensitivity and specificity were expressed as percentages with their respective 95% confidence interval (CI). The relationship between the six immunoassays and their qualitative agreement were assessed using the inter-rater Gwet's AC1 kappa concordance to account for the potential paradox effect with Cohen's kappa. 7 Receiver operating characteristic (ROC) curves and area under the curve (AUC) values were analysed to determine the discriminative ability of the tests. Quantitative values of the signal range in both reactive and non-reactive samples were presented.
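The paradox effect mentioned above can be illustrated with a short sketch. It compares Cohen's kappa with Gwet's AC1 for two binary assays using the standard two-rater formulas; the function names and the 2x2 counts are hypothetical and are not taken from the study data.

```python
def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 table.

    a = both reactive, b = only assay 1 reactive,
    c = only assay 2 reactive, d = both non-reactive.
    """
    n = a + b + c + d
    p_obs = (a + d) / n
    p1, p2 = (a + b) / n, (a + c) / n          # reactive rate of each assay
    p_exp = p1 * p2 + (1 - p1) * (1 - p2)      # chance agreement (Cohen)
    return (p_obs - p_exp) / (1 - p_exp)


def gwet_ac1(a, b, c, d):
    """Gwet's AC1 for the same 2x2 table."""
    n = a + b + c + d
    p_obs = (a + d) / n
    pi = ((a + b) / n + (a + c) / n) / 2       # mean reactive rate
    p_exp = 2 * pi * (1 - pi)                  # chance agreement (Gwet)
    return (p_obs - p_exp) / (1 - p_exp)


# Hypothetical counts: 96% raw agreement, but almost all samples non-reactive.
table = dict(a=1, b=2, c=2, d=95)
print(round(cohen_kappa(**table), 2))  # 0.31
print(round(gwet_ac1(**table), 2))     # 0.96
```

With heavily skewed prevalence, Cohen's chance-agreement term approaches the observed agreement and kappa collapses despite 96% raw agreement, whereas AC1 stays close to the raw agreement. The sketch does not guard against the degenerate case where chance agreement equals 1.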

Specificities of all six assays were excellent. Five of the assays (Abbott, Roche, OCD, Siemens, cPass) had 100% specificity with no cross-reactivity in the negative control panel. Beckman had one serum sample which was positive (signal value 1.69) from a healthy volunteer. This was repeated and remained positive, thus giving a specificity of 99.3% (95% CI 95.7-99.9%). In general, the quantitative signals in this negative cohort, except for Beckman, were distinctly lower than the threshold cut-off for each assay (Supplementary Table 1, Appendix A).

We compared the qualitative results generated by each of the six assays. The overall sensitivity across all days of symptoms in descending ranking were: OCD (49.1%, 95% CI 41.8-56.5%), cPass (44.8%, 95% CI 37.5-52.3%), Roche (41.6%, 95% CI 34.5-49.0%), Siemens (39.9%, 95% CI 32.9-47.3%), Abbott (39.8%, 95% CI 32.9-47.3%) and Beckman (39.6%, 95% CI 32.5-47.3%). OCD and cPass performed the best in terms of sensitivity for >21 days group with 93.3% (95% CI, 73.5-99.2%) and 96.7% (95% CI, 82.8-99.9%), respectively. In the 14-20 days of symptoms group, OCD and Roche had equal highest sensitivity (88.5%, 95% CI 69.8-97.6%). For the 7-13 days of symptoms group, we found that OCD had the highest sensitivity (56.8%, 95% CI 39.5-72.9%), outperforming Beckman (31.4%, 95% CI 16.9-49.3%), p=0.019, and Siemens (32.4%, 95% CI 18-49.8%), p=0.035.
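As an illustration of how a sensitivity and its 95% CI can be computed, the sketch below uses an exact binomial (Clopper-Pearson) interval. This is an assumption: the paper does not state which interval method SPSS was configured to use, and the count of 29 reactive results out of 30 samples is inferred from the reported 96.7% rather than taken from the published tables. The function names are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson (exact) confidence interval for k successes in n trials."""

    def solve(cdf_at, target):
        # The binomial CDF decreases as p increases, so bisect on p.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if cdf_at(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k | p) = alpha/2, i.e. P(X <= k-1 | p) = 1 - alpha/2.
    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    # Upper bound: P(X <= k | p) = alpha/2.
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# Assumed counts: 29 reactive results among 30 PCR-confirmed samples.
k, n = 29, 30
low, high = exact_ci(k, n)
print(f"sensitivity {k / n:.1%}, 95% CI {low:.1%} to {high:.1%}")
```

Under these assumptions the interval comes out at roughly 82.8% to 99.9%, which agrees with the figures reported for cPass in the >21 days group. The wide intervals in the later time strata mainly reflect the small number of samples per group.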

The discriminative ability of the binary/qualitative serology tests indicated that testing after at least 14 days from symptom onset is required to achieve AUCs of more than 0.80. Earlier testing was suboptimal with AUCs between 0.662 and 0.784 when testing was performed between days 7 and 13, and less than 0.60 when testing between days 1 and 6. Less than 10% of samples were positive when testing earlier than 7 days after symptom onset by any assay, suggesting that these assays have poor diagnostic capability in the early phase of symptoms for COVID-19. Comprehensive data are presented in Table 2, and the ROC curves for binary discrimination are shown in Fig. 1.
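For a test reduced to a single reactive/non-reactive call, the ROC curve has only one operating point, and the area under it simplifies to (sensitivity + specificity) / 2. The sketch below, with hypothetical function names, computes that area by the trapezoid rule and applies it to sensitivities quoted in the text, assuming 100% specificity for the assay in question.

```python
def trapezoid_auc(points):
    """Area under a piecewise-linear ROC curve.

    points: (false positive rate, true positive rate) pairs sorted by FPR.
    """
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def binary_auc(sensitivity, specificity):
    """AUC of a binary test: ROC through (0, 0), (1 - spec, sens), (1, 1)."""
    roc = [(0.0, 0.0), (1.0 - specificity, sensitivity), (1.0, 1.0)]
    return trapezoid_auc(roc)


# Sensitivities quoted for OCD; specificity assumed to be 100%.
print(round(binary_auc(0.568, 1.0), 3))  # 7-13 days  -> 0.784
print(round(binary_auc(0.885, 1.0), 4))  # 14-20 days -> 0.9425
```

Read this way, an AUC above 0.80 at near-perfect specificity corresponds to a sensitivity above roughly 60%, which in this cohort was only reached from day 14 onwards.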

Quantitative signal values may be useful for estimating SARS-CoV-2 antibody titres in patients. For our cohort of patients we documented the minimum to maximum ranges of the reactive and non-reactive cases' signal values for each assay. We then demonstrated that the mean quantitative signals generally increased for COVID-19 patients across all assays, in tandem with increasing number of days from symptom onset (Fig. 2). One-way ANOVA showed that mean signals differed between the days-of-symptoms groups (p<0.001) for five assays. The Siemens assay was not analysed because of the number of cases recorded as >10 and <0.05 (i.e., beyond the analytical measuring range), which made mean signal calculations impractical. All five assays showed a significant difference (p<0.05) in mean quantitative values between >14 days and <14 days using Tukey's post hoc honest significant difference (HSD) test, in line with our ROC data described above (Supplementary Tables 2 and 3, Appendix A).

We were interested in concordance between the assays, particularly when compared against a viral neutralisation antibody assay. Kappa concordance, categorised by days of symptoms and plotted against the qualitative result, is summarised in Table 3. Data showed that for >21 days of symptoms, cPass and OCD had the greatest concordance between the assays with kappa 0.963 (95% CI 0.885-1.0), p<0.001. In the 14-20 days group, cPass agreed most with Abbott, with kappa 1.0 (95% CI 0.868-1.0), p<0.001, followed by both OCD and Roche which had similar kappa of 0.950 (95% CI 0.843-1.0), p<0.001. cPass also correlated well with Siemens for all days, kappa 0.942 (95% CI 0.909-0.974), p<0.001.

We determined the false negative rate (FNR) of each assay, using the remaining five assays as a composite reference standard (Table 4). The lower the percentage, the better the performance of the assay in question. For samples taken >21 days after symptom onset, cPass had the lowest FNR (0%, 95% CI 0.0-11.6%) followed by OCD (3.3%, 95% CI 0.08-17.2%). There was a significant difference of 13.3% (95% CI 1.1-25.4%), p=0.039, between Abbott and cPass. For 14-20 days, OCD and Roche had equal lowest FNR (3.8%, 95% CI 0.10-19.6%). OCD again demonstrated the lowest FNR in the 7-13 days group (8.1%, 95% CI 1.7-21.9%), and this was statistically significant when compared against Siemens (32.4%, 95% CI 18-49.8%), p=0.009 as well as Beckman (31.4%, 95% CI 15.9-47.0%), p=0.010.
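The composite reference approach can be sketched as follows. The paper does not spell out the composite rule or the denominator, so two assumptions are made here and should be read as such: a sample counts as reference-positive when at least one of the other assays is reactive, and the FNR is expressed over all samples in the time stratum. The assay names and results below are hypothetical.

```python
def false_negative_rate(samples, index_assay, min_positive=1):
    """FNR of one assay against a composite of the remaining assays.

    samples: list of dicts mapping assay name -> True (reactive) / False.
    min_positive: how many other assays must be reactive for the composite
                  reference to count as positive (assumed rule).
    """
    false_negatives = 0
    for sample in samples:
        others = [res for name, res in sample.items() if name != index_assay]
        reference_positive = sum(others) >= min_positive
        if reference_positive and not sample[index_assay]:
            false_negatives += 1
    # Assumed denominator: every sample in the stratum.
    return false_negatives / len(samples)


# Hypothetical stratum of 30 samples tested on three assays.
all_reactive = {"assay_a": True, "assay_b": True, "assay_c": True}
a_missed = {"assay_a": False, "assay_b": True, "assay_c": True}
none_reactive = {"assay_a": False, "assay_b": False, "assay_c": False}
stratum = [all_reactive] * 25 + [a_missed] * 4 + [none_reactive]

print(round(false_negative_rate(stratum, "assay_a"), 3))  # 0.133
print(round(false_negative_rate(stratum, "assay_b"), 3))  # 0.0
```

Under this rule, a sample that is non-reactive on every assay does not count as a false negative for any of them, which is why an assay's FNR against the composite can be lower than one minus its sensitivity against PCR.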

The clinical characteristics and signal values were also reviewed for COVID-19 patients who had non-reactive results even in samples taken at least 14 days after symptom onset. In the 14-20 days group, the poorest performer was Siemens with five non-reactives as compared to the best performer, OCD, with three non-reactives. Patient 1 had end-stage renal disease as a result of IgA nephropathy, had previously undergone renal transplant and was on mycophenolate (Myfortic) immunosuppressant at the time of serology testing. Patients 2 and 4 had significant lymphopenia, required intubation with a course of intensive care unit stay, and were both administered lopinavir-ritonavir (Kaletra). Patient 2 was given hydrocortisone as well. The other two patients in this group were pauci-symptomatic. In the >21 days group, the poorest performer, Abbott, had five non-reactives compared to only one non-reactive in the best performer, cPass. All of these five patients were pauci-symptomatic and presented only with a mild acute respiratory illness, dry cough or sore throat. These data are summarised in Table 5.

SARS-CoV-2 is an enveloped RNA virus with four structural proteins known as the spike (S), envelope (E), membrane (M) and nucleocapsid (N) proteins. The S and N proteins in particular have been the main targets of serological assays, as previous studies have indicated that they are the most immunogenic antigens. 8, 9 Walls and colleagues have shown that the coronavirus S protein mediates viral entry into host cells. The S protein consists of two functional subunits, namely the S1 subunit which binds to the host cell receptor and the S2 subunit which leads to fusion of viral and cellular membranes. 10 The S1 subunit comprises the receptor binding domain (RBD) which attaches to the human angiotensin-converting-enzyme 2 (ACE-2), initiating entry into the host cell. 11 The N protein, on the other hand, has been described to be highly immunogenic and expressed in large quantities during acute infection of SARS-CoV-2. 12 Western blot studies also indicate that the N protein is a potent viral antigen for SARS-CoV-2 on the basis of IgG, IgA and IgM antibodies directed against N protein in COVID-19 positive sera. 13 To our knowledge, our study is the first to investigate and compare immunoassays with different design configurations against the S1 subunit, RBD protein, and N antigen, including a surrogate viral neutralising antibody (NAb) assay. NAbs typically target the S1-RBD, inhibiting the binding of RBDs to their receptors and hence preventing S2-mediated membrane fusion or entry into the host cell. As a result, viral infection of the host is prevented and presence of such antibodies may confer a status of true immunity. 14 Our results show that specificity for all six assays is excellent, and the one false reactive case in Beckman could reflect cross-reactivity with a common coronavirus in the healthy volunteer. 
In light of the limited sensitivity in the early stages of illness, antibody detection is not sufficiently reliable to act as a routine diagnostic test for COVID-19. This is supported by our previous work and current AUC results, which suggest that serological testing should generally be considered only 14 days after onset of symptoms. 15, 16 If testing is performed prior to this, users should consider the high possibility of false negative results. In this study, the best performer in terms of sensitivity across all days and also at >14 days of symptoms was OCD followed closely by cPass. This is unsurprising because both assays are designed against the S1 spike protein and they also correlated well based on kappa concordance. OCD in particular is a total antibody assay directed at IgA, IgM and IgG, hence its diagnostic accuracy outperformed assays raised against just IgG or IgG/IgM. As a mucosal targeting virus, SARS-CoV-2 generates secretory IgA, inducing strong mucosal immunity in the host. Yu and colleagues also showed that IgA antibody produced the earliest seroconversion amongst all antibodies, and was significantly greater than that of IgM in patients with both severe and non-severe illness. 17 It would be valuable for vendors to consider providing quantitative signals for the individual components of a total antibody assay in the diagnostic evaluation of COVID-19. Perhaps disappointingly, Beckman's RBD assay design did not correlate well with cPass but did have a better correlation with Abbott's assay, which could be due to both of these assays only targeting IgG. Our cPass NAb results exhibited the best sensitivity in the cohort of patients >21 days, where only one case, who was pauci-symptomatic, demonstrated negative neutralising antibodies. 
These results corroborate findings from Brouwer's group, who showed potent neutralising antibody production in a convalescent cohort of patients, which served as a marker of immunity. 18 Interestingly, quantitative values for cPass ranged from a negative inhibition percentage of -81.1% to +91.1%. The negative values may be explained in part by statistical variation around 'zero' inhibition, or by true biological enhancements where non-specific antibodies cross-react and bind to the receptors, decreasing binding avidity and hence inhibition. 19 Our data provide important considerations for serology testing, and caveats for clinical use. Some patients who had been immunocompromised, given steroids, or given some antivirals (two of our patients received a protease inhibitor antiretroviral) may exhibit a down-regulated immune response which manifests as a negative result in some of the immunoassays. Furthermore, pauci-symptomatic patients who were COVID-19 positive on rRT-PCR more frequently failed to have a positive serology on multiple immunoassays, even during the convalescent phase. This suggests that patients with mild symptoms may not display a strong immune response or antibody production, and clinicians need to interpret such cases with caution. Our findings are supported by the study by Long et al., which showed that the virus-specific IgG levels in the asymptomatic group were significantly lower compared to symptomatic COVID-19 patients. 20 A strength of our study is the use of the same cohort of unique, non-duplicate COVID-19 patients' sera to compare performance head-to-head. Certain validation studies use the same patient but collect their sera at different time points for the calculation of sensitivity. This potentially leads to a positive skew in reporting data, as a patient who seroconverts early is likely to continue to exhibit reactive serology at later time points. 
A recent study by Public Health England showed the comparison between Abbott, Diasorin, Roche and Siemens, although this only accounted for convalescent patients (≥20 days of symptoms). 21 Our study has similar specificity characteristics as described, but further categorises our patient cohort into four different time points, from <7 days to ≥21 days, which provides more detailed information about practical testing and their respective AUCs. Certain limitations in our study were that we did not have archived sera from patients with previous coronaviruses such as SARS-CoV-1 and MERS-CoV to test for cross-reactivity. As this study was approved only for residual sera, low volume samples were insufficient for certain assays to be performed, and these patients were excluded from the final data analysis. Lastly, our COVID-19 positive patients diagnosed via rRT-PCR were all assumed to be true positives only.

Future directions should revolve around testing of rRT-PCR together with serology which will improve overall diagnostic accuracy for COVID-19. The use of CT-values from PCR and quantitative signals generated by serological assays may provide utility regarding infective status and immunity of the patient. Quantitative signal values and their interpretations should be provided by vendors, as reports have shown that greater titres of SARS-CoV-2 antibodies in the early phase of COVID-19 translate to more deleterious outcomes for the patient. 22 In such scenarios, serology may provide insight into prognosis and appropriate clinical management. Importantly, it has been suggested that neutralising antibodies directed against SARS-CoV-2 S and RBD proteins confer immunity upon the host and these have been the focus of several vaccine trials. 14 Neutralising antibodies potentially trigger immunopathogenic and pro-inflammatory events in the host, hence serological testing may serve as a useful indicator whether immunisation has been acquired. 23 

Taken together, our report shows excellent specificity for all assays although sensitivities were poorer when compared to the manufacturers' claims. Spike protein designed immunoassays generally outperformed nucleocapsid target immunoassays. Clinicians should interpret negative serology results with caution if the patient is immunocompromised, given antiretrovirals, pauci-symptomatic or has had symptoms for less than 14 days. 

1. A novel coronavirus from patients with pneumonia in China

2. Comorbidity and its impact on 1590 patients with COVID-19 in China: a nationwide analysis

3. Laboratory abnormalities in patients with COVID-2019 infection

4. Guidelines for laboratory diagnosis of coronavirus disease 2019 (COVID-19) in Korea

5. Serology for SARS-CoV-2: apprehensions, opportunities, and the path forward

6. Practical laboratory considerations amidst the COVID-19 outbreak: early experience from Singapore

7. High agreement and high prevalence: the paradox of Cohen's kappa

8. Serological analysis of New York City COVID19 convalescent plasma donors

9. Characterization of the receptor-binding domain (RBD) of 2019 novel coronavirus: implication for development of RBD protein as a viral attachment inhibitor and vaccine

10. Structure, function, and antigenicity of the SARS-CoV-2 spike glycoprotein

11. Structural basis of receptor recognition by SARS-CoV-2

12. The nucleocapsid protein of SARS-CoV-2: a target for vaccine development

13. Biochemical characterization of SARS-CoV-2 nucleocapsid protein

14. Neutralizing Antibodies against SARS-CoV-2 and Other Human Coronaviruses

15. Clinical evaluation of serological IgG antibody response on the Abbott Architect for established SARS-CoV-2 infection

16. Comparative clinical evaluation of the Roche Elecsys and Abbott SARS-CoV-2 serology assays for COVID-19

17. Distinct features of SARS-CoV-2-specific IgA response in COVID-19 patients

18. Potent neutralizing antibodies from COVID-19 patients define multiple targets of vulnerability

19. Statistical approaches to analyzing HIV-1 neutralizing antibody assay data

20. Antibody responses to SARS-CoV-2 in patients with COVID-19

21. Evaluation of sensitivity and specificity of 4 commercially available SARS-CoV-2 antibody immunoassays

22. The role of SARS-CoV-2 antibodies in COVID-19: Healing in most, harm at times

23. Antibodies to SARS-CoV-2 and their potential for therapeutic passive immunization

Values are kappa (95% CI)

We would like to thank Temasek Holdings Pte Ltd for sponsoring the