key: cord-0811531-4iq24vu0
authors: Prince, Harry E.; Givens, Tara S.; Lapé-Nixon, Mary; Clarke, Nigel J.; Schwab, Dale A.; Batterman, Hollis J.; Jones, Robert S.; Meyer, William A.; Kapoor, Hema; Rowland, Charles M.; Haji-Sheikhi, Farnoosh; Marlowe, Elizabeth M.
title: Detection of SARS-CoV-2 IgG Targeting Nucleocapsid or Spike Protein by Four High-Throughput Immunoassays Authorized for Emergency Use
date: 2020-10-21
journal: J Clin Microbiol
DOI: 10.1128/jcm.01742-20
sha: b4ef7f0f77a8996b4f2463bb8bb6aef736cc0e97
doc_id: 811531
cord_uid: 4iq24vu0

A total of 1,200 serum samples that were tested for SARS-CoV-2 IgG antibody using the Abbott Architect immunoassay targeting the nucleocapsid protein were run in 3 SARS-CoV-2 IgG immunoassays targeting spike proteins (DiaSorin Liaison, Ortho Vitros, and Euroimmun). Consensus-positive and consensus-negative interpretations were defined as qualitative agreement in at least 3 of the 4 assays. Agreement of the 4 individual assays with a consensus-negative interpretation (n = 610) ranged from 96.7% to 100%, and agreement with a consensus-positive interpretation (n = 584) ranged from 94.3% to 100%. Laboratory-developed inhibition assays were utilized to evaluate 49 consensus-negative samples that were positive in only one assay; true-positive reactivity was confirmed in only 2 of these 49 (4%) samples. These findings demonstrate very high levels of agreement among 4 SARS-CoV-2 IgG assays authorized for emergency use, regardless of antigen target or assay format. Although false-positive reactivity was identified, its occurrence was rare (no more than 1.7% of samples for a given assay).

zation (EUA) from the U.S. Food and Drug Administration (https://www.fda.gov/medical-devices/emergency-situations-medical-devices/eua-authorized-serology-test-performance). However, the extent of concordance among the various EUA assays is still being defined; in particular, it is unclear how assays measuring antibodies to the SARS-CoV-2 spike protein compare to assays measuring antibodies to the SARS-CoV-2 nucleocapsid protein (3). Such information is crucial to assisting clinicians and laboratory scientists in evaluating and interpreting results generated by different laboratories that utilize multiple EUA platforms. Here, we compare results obtained using 4 high-throughput SARS-CoV-2 IgG EUA immunoassays; samples positive or negative in an assay measuring IgG to viral nucleocapsid were tested in 3 assays measuring IgG recognizing viral spike protein subunits S1 or both S1 and S2 (S1+S2).

Samples. Serum samples with positive (n = 600) or negative (n = 600) SARS-CoV-2 IgG results from the Abbott chemiluminescent microparticle immunoassay (CMIA) utilizing nucleocapsid protein as the antigen target (referred to here as the nucleocapsid-based assay) were selected for further analysis in 3 SARS-CoV-2 IgG assays utilizing spike protein(s) as the antigen target (referred to here as spike-based assays). The 600 deidentified Abbott-positive samples were tested in the Abbott assay at Quest Diagnostics Nichols Institute, Chantilly, VA, and consisted of consecutive positive samples with enough remaining volume for further testing. After testing in the Abbott assay, these samples were frozen and shipped on dry ice to Quest Diagnostics Infectious Disease (QDID), San Juan Capistrano, CA; there they were thawed and tested in the other 3 SARS-CoV-2 IgG assays within 24 h of thawing. The 600 deidentified Abbott-negative samples were tested in the Abbott assay at QDID and consisted of consecutive negative samples with enough remaining volume for further testing; the samples were refrigerated a maximum of 48 h before being tested in the other 3 assays. The median age of patients contributing the Abbott-positive samples was 47 years, and 48% were male; the median age of the patients contributing the Abbott-negative samples was 52 years, and 42% were male. Clinical findings and SARS-CoV-2 RNA results were not available for any of the 1,200 patients.

Immunoassays. In addition to the Abbott CMIA, three SARS-CoV-2 IgG immunoassays (2 chemiluminescent immunoassays [CIAs] and 1 enzyme-linked immunosorbent assay [ELISA]) were evaluated (Table 1). All 4 assays were performed following the instructions for use supplied by the respective manufacturers. For the 3 chemiluminescent assays, undiluted serum samples are placed inside the respective instruments (Table 1), and all assay steps are performed inside the instrument. In contrast, for the Euroimmun ELISA, samples are diluted 1:101 in kit-supplied sample diluent, and the diluted sample is added to reaction wells. Results were classified as positive or negative following the interpretive criteria supplied in the manufacturer's instructions, with one exception: samples with a Euroimmun equivocal index (0.8 to 1.0) were interpreted as positive for purposes of this study.

Inhibition assays. Three assays (Abbott, DiaSorin, and Euroimmun) were modified to distinguish true-positive (TP) from false-positive (FP) reactivity based on inhibition of reactivity by soluble SARS-CoV-2 proteins. Preliminary experiments using a sample positive in all 4 assays (presumed to be TP) established the optimal concentrations of inhibitory proteins required for each assay. For the Abbott inhibition assay, serum samples were diluted 1:2 with phosphate-buffered saline (PBS), PBS containing soluble recombinant S1 protein (6 μg/ml; GenScript, Piscataway, NJ), or PBS containing soluble nucleocapsid protein (6 μg/ml; GenScript); thus, the final concentration of S1 protein or nucleocapsid protein in analyzed samples was 3 μg/ml. Similarly, for the DiaSorin inhibition assay, sera were diluted 1:2 with PBS, PBS containing S1 protein (6 μg/ml) and S2 protein (6 μg/ml; RayBiotech, Peachtree Corners, GA), or PBS containing nucleocapsid (12 μg/ml, to match the total S1+S2 concentration of 12 μg/ml). Thus, the final concentrations in analyzed samples for the DiaSorin inhibition assay were 3 μg/ml each for S1 and S2 and 6 μg/ml for nucleocapsid. For the Euroimmun inhibition assay, samples were diluted 1:101 in sample diluent, sample diluent containing S1 protein (3 μg/ml), or sample diluent containing nucleocapsid protein (3 μg/ml). For each inhibition assay, the samples tested included the samples positive only by the comparable routine assay, plus a minimum of 9 samples that were positive in all 4 assays.
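The dilution arithmetic behind the final inhibitor concentrations can be checked with a few lines of Python. This is only an illustrative sketch: the function name `final_concentration` is hypothetical, and it assumes that a 1:2 dilution halves the concentration of the inhibitor solution, which is what the stated stock and final concentrations imply. Values are unit-agnostic and carry whatever unit the stock concentration is given in.

```python
def final_concentration(stock_conc: float, dilution_factor: float = 2.0) -> float:
    """Concentration of the inhibitor after dilution.

    Assumption: a 1:2 dilution means the inhibitor solution ends up at
    half its stock concentration in the analyzed sample.
    """
    if dilution_factor <= 0:
        raise ValueError("dilution_factor must be positive")
    return stock_conc / dilution_factor


# Stock concentrations quoted in the text (same unit as in the text).
abbott_final = final_concentration(6)        # S1 or nucleocapsid, Abbott assay
diasorin_s1_final = final_concentration(6)   # S1, DiaSorin assay
diasorin_s2_final = final_concentration(6)   # S2, DiaSorin assay
diasorin_nc_final = final_concentration(12)  # nucleocapsid, DiaSorin assay

print(abbott_final, diasorin_s1_final + diasorin_s2_final, diasorin_nc_final)
```

In the DiaSorin arm the nucleocapsid control matches the combined S1 and S2 concentration after dilution, so the spike and nucleocapsid arms carry the same total protein load.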

Inhibition was calculated as (PBS or diluent index − nucleocapsid or spike index)/PBS or diluent index and expressed as a percentage. The cutoff for distinguishing TP from FP reactivity was defined as the mean percent inhibition value minus 2 standard deviations for the samples positive in all 4 assays that were tested in the inhibition assay; values below this cutoff were considered FP. Cutoff values were 29% for the Abbott inhibition assay, 42% for the DiaSorin inhibition assay, and 61% for the Euroimmun inhibition assay.

Analysis and statistics. The various reactivity patterns across all 4 SARS-CoV-2 IgG assays were combined into 3 interpretation groups. Consensus-negative was defined as a negative result in at least 3 of 4 assays, and consensus-positive was defined as a positive result in at least 3 of 4 assays; the nonconsensus group included samples positive in 2 of 4 assays. All samples were classified into consensus-negative, consensus-positive, or nonconsensus categories based on the cutoffs listed in Table 1. For each of the 4 assays, positive agreement and negative agreement (and 95% confidence intervals) were calculated to compare the assay's performance to positive consensus or negative consensus. Samples with a nonconsensus interpretation were not included in positive-agreement/negative-agreement calculations, since there was no majority positive or negative reactivity (i.e., positive in 2 assays and negative in 2 assays). Statistical analyses were conducted using R version 3.6.3, R Foundation for Statistical Computing (https://www.R-project.org).

Table 2

Agreement of individual assay results with consensus-negative or -positive interpretations. Table 3 shows the agreement of results for the 4 individual assays with a consensus-negative or consensus-positive interpretation. High levels of agreement were observed; agreement with a consensus-negative interpretation ranged from 96.7% to 100%, and agreement with a consensus-positive interpretation ranged from 94.3% to 100%. Note that the Ortho assay exhibited the highest agreement (100%) with both the consensus-negative and consensus-positive interpretations.
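The consensus rule and the agreement calculation can be sketched as follows. This is a minimal illustration, not the authors' analysis (which was run in R): the function names and toy data are hypothetical, and the Wilson score interval is an assumption, since the text does not state which 95% confidence interval method was used.

```python
import math
from typing import List, Sequence, Tuple


def consensus(results: Sequence[bool]) -> str:
    """Classify one sample from its 4 qualitative results (True = positive)."""
    if len(results) != 4:
        raise ValueError("expected results from exactly 4 assays")
    positives = sum(results)
    if positives >= 3:
        return "consensus-positive"
    if positives <= 1:
        return "consensus-negative"
    return "nonconsensus"  # positive in 2 assays, negative in 2


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval (assumed method; not specified in the text)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


def agreement(samples: List[Sequence[bool]], assay: int, group: str):
    """Agreement of one assay with a consensus group; nonconsensus is excluded."""
    target = group == "consensus-positive"
    in_group = [s for s in samples if consensus(s) == group]
    if not in_group:
        raise ValueError(f"no samples in group {group!r}")
    agree = sum(1 for s in in_group if s[assay] == target)
    return agree / len(in_group), wilson_interval(agree, len(in_group))


# Hypothetical toy data: one tuple of 4 assay results per sample.
samples = [
    (True, True, True, True),
    (True, True, True, False),
    (False, False, False, False),
    (True, False, False, False),
    (True, True, False, False),  # nonconsensus, excluded from agreement
]
print(agreement(samples, 0, "consensus-negative"))
```

Because a sample positive in exactly one assay still counts as consensus-negative, that assay's result lowers its own negative agreement, which is how single-assay reactivity shows up in the agreement figures.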

Results of inhibition assays. Within the consensus-negative group were 49 samples (4% of all samples tested) that were positive in only one assay (14 for Abbott, 15 for DiaSorin, and 20 for Euroimmun; no samples were positive by Ortho only). These samples, plus a minimum of 9 samples that were positive in all 4 assays (referred to here as TP samples), were tested in platform-specific inhibition assays designed to discriminate FP from TP reactivity. The results are shown in Fig. 1. Of 14 samples positive only by Abbott, 12 exhibited FP reactivity (inhibition of <29%); all 9 evaluable samples positive only by DiaSorin exhibited FP reactivity (inhibition of <42%), and all 16 evaluable samples positive only by Euroimmun exhibited FP reactivity (inhibition of <61%); 6 DiaSorin-positive only samples and 4 Euroimmun-positive only samples could not be evaluated using the inhibition assay because they were negative when repeated in the routine assay as part of the inhibition protocol. Inhibition by the SARS-CoV-2 protein that was not the target protein in the comparable routine assay (i.e., S1 in the Abbott inhibition assay, nucleocapsid in the DiaSorin inhibition assay, and nucleocapsid in the Euroimmun inhibition assay) was ≤4%, ≤10%, and ≤16%, respectively, for all samples (data not shown). Taken together, these findings demonstrate that, based on inhibition assay results, only 2 of 49 (4%) samples positive in just one assay exhibited TP reactivity (both samples were positive only in the Abbott assay); 96% of samples initially positive in only one of the 4 assays exhibited FP reactivity.
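The percent-inhibition calculation and the TP/FP cutoff described in the methods can be illustrated with a short Python sketch. The function names and index values are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not say which form was used. The reported cutoffs (29%, 42%, and 61%) are taken from the text.

```python
from statistics import mean, stdev
from typing import Sequence


def percent_inhibition(uninhibited_index: float, inhibited_index: float) -> float:
    """(PBS or diluent index - inhibitor index) / PBS or diluent index, as a percentage."""
    if uninhibited_index <= 0:
        raise ValueError("uninhibited index must be positive")
    return 100.0 * (uninhibited_index - inhibited_index) / uninhibited_index


def tp_fp_cutoff(tp_inhibitions: Sequence[float]) -> float:
    """Mean minus 2 SD of percent inhibition for samples positive in all 4 assays.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return mean(tp_inhibitions) - 2 * stdev(tp_inhibitions)


def classify(inhibition: float, cutoff: float) -> str:
    """Values below the assay-specific cutoff are considered false positive."""
    return "FP" if inhibition < cutoff else "TP"


# Cutoffs reported in the text, in percent.
REPORTED_CUTOFFS = {"Abbott": 29.0, "DiaSorin": 42.0, "Euroimmun": 61.0}

# Hypothetical example: index 4.0 without inhibitor, 1.0 with inhibitor.
inhibition = percent_inhibition(4.0, 1.0)
print(inhibition, classify(inhibition, REPORTED_CUTOFFS["Abbott"]))
```

Deriving the cutoff from samples positive in all 4 assays ties the threshold to the spread of inhibition seen in presumed true positives on each platform, which is why the three assays have different cutoffs.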

The appropriate interpretation of results from SARS-CoV-2 IgG assays depends on a clear understanding of their performance characteristics and limitations. Robust IgG responses to both the spike protein found on the surface of virus particles and the nucleocapsid protein found inside the virus particle occur following SARS-CoV-2 infection (4-7). However, there are conflicting reports as to the relative sensitivities of spike-based versus nucleocapsid-based IgG assays, particularly during the first 14 days after disease onset. Liu et al. (using ELISAs) and Tang et al. (using the Euroimmun ELISA and the Abbott CIA) found that spike-based IgG assays were slightly more sensitive during this time frame (8, 9), whereas Burbelo et al. (using luciferase immunoprecipitation systems) found that nucleocapsid-based assays were more sensitive (10). By 2 weeks after symptom onset, however, comparable sensitivities are observed (6, 7, 11-14).

Based on these published contradictory results for sensitivity (8-10), plus additional studies presenting contradictory data on the correlation of virus-neutralizing activity with spike antibody levels versus nucleocapsid antibody levels (15, 16), we sought to better understand how spike-based SARS-CoV-2 IgG assays perform compared to nucleocapsid-based assays. We thus selected samples previously tested using the Abbott assay targeting nucleocapsid protein to conduct our comparison study.

Our results show that, when consensus interpretations were defined based on agreement in at least 3 of the 4 assays evaluated, individual assay agreements with the consensus interpretation were very high, ranging from 94% to 100%, regardless of antigen target and assay platform. These findings thus demonstrate highly comparable performance among these 4 SARS-CoV-2 IgG assays. Our findings are consistent with those of other investigators (17-20), who demonstrated good correlation when comparing multiple assays using panels of up to approximately 500 samples. To our knowledge, ours is the first study to assess IgG reactivity in these 4 high-throughput assays using well over 1,000 samples.

Fig. 1 caption (partial): Values for the Abbott assay represent percent inhibition by nucleocapsid, values for the DiaSorin assay represent percent inhibition by S1+S2 proteins, and values for the Euroimmun assay represent percent inhibition by S1 protein. The horizontal lines indicate the assay-specific inhibition value below which reactivity is considered FP.

A small number of samples within the consensus-negative group (4% of all samples) were positive in only one of the 4 assays. This pattern may represent increased assay sensitivity, decreased specificity, or a combination of the two. To discriminate among these possibilities, inhibition assays were developed and performed. The results showed that all evaluable samples positive only in the DiaSorin or Euroimmun assay exhibited FP reactivity; similarly, 12 of 14 samples positive only in the Abbott assay showed FP reactivity. Thus, the vast majority (96%) of samples that tested positive in only 1 of the 4 assays represented decreased assay specificity, rather than increased assay sensitivity. Of note, 13 of the 20 samples with FP reactivity in the Euroimmun assay exhibited index values within the manufacturer's equivocal range (interpreted as positive for the purpose of this evaluation); however, 12 other Euroimmun-equivocal samples fell within the consensus-positive group (also positive by Abbott and Ortho), indicating that some Euroimmun-equivocal samples show TP reactivity. These findings suggest that it may be appropriate to test Euroimmun-equivocal samples using a different SARS-CoV-2 IgG assay, in order to distinguish TP from FP reactivity.

Other investigators (17) evaluating the Euroimmun SARS-CoV-2 IgG assay interpreted Euroimmun-equivocal results as negative rather than as positive, prompting us to question how using that interpretation option might alter our findings. Had samples with an index in the Euroimmun equivocal range been interpreted as negative, the consensus-negative group would increase by 1 sample (from 610 to 611), and the consensus-positive group would decrease by 11 samples (from 584 to 573); most notably, the size of the nonconsensus group would nearly triple, increasing from 6 to 16 samples. The impact of these shifts on consensus-negative percent agreement and consensus-positive percent agreement would be minimal; the Euroimmun consensus-negative agreement would increase from 97% to 99%, and the DiaSorin consensus-positive agreement would increase from 94% to 96%. Thus, when a binary (positive/negative) interpretation was used for Euroimmun assay results, either interpretation of Euroimmun-equivocal results led to the same conclusion: all 4 SARS-CoV-2 IgG assays evaluated were highly comparable, exhibiting >90% agreement with consensus results.
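The effect of the equivocal-range interpretation on consensus group sizes can be explored with a small sketch like the one below. It is illustrative only: the function names and toy samples are hypothetical, the equivocal range (0.8 to 1.0) is the one quoted in the text, and the handling of values exactly at the range boundaries is an assumption.

```python
from collections import Counter
from typing import List, Sequence, Tuple

EQUIVOCAL_LOW, EQUIVOCAL_HIGH = 0.8, 1.0  # Euroimmun equivocal range quoted in the text


def euroimmun_positive(index: float, equivocal_as_positive: bool = True) -> bool:
    """Binary Euroimmun call; boundary handling is an assumption."""
    if index > EQUIVOCAL_HIGH:
        return True
    if index >= EQUIVOCAL_LOW:
        return equivocal_as_positive
    return False


def group(n_positive: int) -> str:
    """Consensus group from the number of positive results among 4 assays."""
    if n_positive >= 3:
        return "consensus-positive"
    if n_positive <= 1:
        return "consensus-negative"
    return "nonconsensus"


def tally(samples: List[Tuple[Sequence[bool], float]], equivocal_as_positive: bool) -> Counter:
    """Count samples per consensus group under one equivocal interpretation."""
    counts: Counter = Counter()
    for other_results, euro_index in samples:
        n_pos = sum(other_results) + euroimmun_positive(euro_index, equivocal_as_positive)
        counts[group(n_pos)] += 1
    return counts


# Hypothetical toy data: (Abbott, DiaSorin, Ortho) results plus a Euroimmun index.
toy = [
    ((True, True, False), 0.9),
    ((False, False, False), 0.9),
    ((True, True, True), 2.5),
    ((True, False, False), 0.85),
]
as_positive = tally(toy, equivocal_as_positive=True)
as_negative = tally(toy, equivocal_as_positive=False)
print(as_positive, as_negative)
```

Only samples whose Euroimmun index falls in the equivocal range can change group, and only when the other 3 assays are split, which is consistent with the small shifts in group sizes reported above.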

A recognized limitation of our study is the lack of clinical information for the patients whose serum samples were evaluated. Information on viral RNA results, disease onset, and clinical course would likely help determine if the 2 patients positive only by Abbott but with TP reactivity simply mounted an antibody response to nucleocapsid earlier than a spike-based response, or if these patients for some reason failed to mount an antibody response to spike protein. Likewise, the same questions apply to the 2 patients positive in all 3 spike-based assays but negative in the nucleocapsid-based assay (see Table 2, consensus-positive results). Of note, Kohmer et al. (20) also identified a small number of samples with IgG recognizing only nucleocapsid or spike protein and speculated that there may be individual differences in the antibody response to SARS-CoV-2 proteins. Last, we assume that samples negative in all 4 assays were from uninfected persons and that samples positive in all 4 assays were from infected patients, but without information on viral RNA results, time since infection, and clinical course, we cannot know with certainty.

An additional limitation to the study is possible sample selection bias. As indicated, samples were selected based on the qualitative result in the Abbott SARS-CoV-2 IgG assay. Had we selected samples based on results in a spike-based assay, the levels of agreement among assays might have been different. However, the large number of samples included in the study provided the needed statistical power to demonstrate that the 4 assays yield comparable results.

Last, although data for the inhibition assays showed a clear distinction between TP and FP reactivity, it is unclear why the inhibition values in the Abbott inhibition assay were noticeably lower than the inhibition values in the DiaSorin and Euroimmun assays. The most obvious difference is the target antigen of the assay; the Abbott assay targets nucleocapsid, whereas the DiaSorin and Euroimmun assays target spike protein(s). For all 3 inhibition assays, the soluble inhibitory protein employed was not sourced from the assay manufacturer but rather purchased from a different vendor. Thus, subtle differences in glycosylation and tertiary structure may account for differences in the ability of the soluble protein to inhibit binding of antibodies to the immobilized antigen. Further studies are needed to characterize the relationship between soluble antigen structure and inhibitory activity in SARS-CoV-2 IgG inhibition assays.

In summary, our findings show these 4 SARS-CoV-2 IgG EUA assays exhibit excellent agreement, regardless of the target antigen used or assay format (CMIA/CIA versus ELISA). While variability can occur with any assay, health professionals receiving SARS-CoV-2 IgG results obtained using any of these 4 assays, whether from different laboratories or different platforms within a given laboratory, can be assured that the results are comparable and likely equivalent as a diagnostic adjunct to NAAT.

Evaluation of transport media and specimen transport conditions for the detection of SARS-CoV-2 using real time reverse transcription PCR

Current status of epidemiology, diagnosis, therapeutics, and vaccines for novel coronavirus disease 2019 (COVID-19)

The COVID-19 Serology Studies workshop: recommendations and challenges

Severe acute respiratory syndrome coronavirus 2-specific antibody responses in coronavirus disease patients

Detection of IgM and IgG antibodies in patients with coronavirus disease 2019

Detection of serum IgM and IgG for COVID-19 diagnosis

Antibody detection and dynamic characteristics in patients with COVID-19

Evaluation of nucleocapsid and spike protein-based enzyme-linked immunosorbent assays for detecting antibodies against SARS-CoV-2

Clinical performance of two SARS-CoV-2 serologic assays

Sensitivity in detection of antibodies to nucleocapsid and spike proteins of severe acute respiratory syndrome coronavirus 2 in patients with coronavirus disease 2019

Kinetics of SARS-CoV-2 specific IgM and IgG responses in COVID-19 patients

Profile of immunoglobulin G and IgM antibodies against severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2)

Evaluation of the EUROIMMUN anti-SARS-CoV-2 ELISA assay for detection of IgA and IgG antibodies

Performance characteristics of the Abbott Architect SARS-CoV-2 IgG assay and seroprevalence in

Temporal profiles of viral load in posterior oropharyngeal saliva samples and serum antibody responses during infection by SARS-CoV-2: an observational cohort study

Detection of SARS-CoV-2-specific humoral and cellular immunity in COVID-19 convalescent individuals

Performance characteristics of four high-throughput immunoassays for detection of IgG antibodies against SARS-CoV-2

Validation of a chemiluminescent assay for specific SARS-CoV-2 antibody

Performance of six SARS-CoV-2 immunoassays in comparison with microneutralisation

Clinical evaluation of six high-throughput SARS-CoV-2 IgG antibody assays

This research received no specific grant from any funding agency in the public, commercial, or not-for-profit sector. We thank the Infectious Disease Serology Department staff members at Quest Diagnostics Chantilly and Quest Diagnostics Infectious Disease for expert technical assistance. We thank Ron Kagan for his assistance in retrieving demographic data for the study.