title: Evaluation of 32 rapid tests for detection of antibodies against SARS-CoV-2
authors: Tollånes, Mette C.; Jenum, Pål A.; Kierkegaard, Helene; Abildsnes, Eirik; Magne Bævre-Jensen, Roar; Breivik, Anne C.; Sandberg, Sverre
date: 2021-04-27
journal: Clin Chim Acta
DOI: 10.1016/j.cca.2021.04.016

Aims: To evaluate the analytical performance of 32 rapid tests for detection of antibodies against the coronavirus SARS-CoV-2.

Materials and methods: We used a total of 262 serum samples (197 pre-pandemic and 65 convalescent COVID-19) and three criteria to evaluate the rapid tests under standardized and optimal conditions: (i) immunoglobulin G (IgG) specificity "good" if the lower limit of the 95% confidence interval was ≥97.0%, "acceptable" if the point estimate was ≥97.0%, otherwise "not acceptable"; (ii) IgG sensitivity "good" if the point estimate was ≥90.0%, "acceptable" if ≥85.0%, otherwise "not acceptable"; (iii) user-friendliness "not acceptable" if the test was complicated to perform or the result difficult to read, otherwise "good". We also included partial evaluations of three automated immunoassay systems.

Results: Sensitivity and specificity varied considerably; IgG specificity ranged between 90.9% (85.9-94.2) and 100% (97.7-100.0), and IgG sensitivity between 53.8% (41.9-65.4) and 98.5% (91.0-100.0). Combining our evaluation criteria, none of the 28 rapid tests that detected IgG had an overall performance considered "good", seven tests were considered "acceptable", and 21 tests were considered "not acceptable". Four tests detected only total antibodies and were not given an overall evaluation. The IgG sensitivity and/or specificity of the automated immunoassays did not exceed those of many of the rapid tests.

Conclusion: When prevalence is low, the most important analytical property is a test's IgG specificity, which must be high to minimize false positive results. Of the 32 rapid tests, none had a performance classified as "good", but seven were classified as "acceptable".

The current worldwide coronavirus disease 2019 (COVID-19) pandemic, caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) [1], has led to a surge in research regarding testing, prevention, and treatment. Until safe and efficient vaccines are available for everyone, the World Health Organization maintains that accurate and efficient testing is among the key elements in the strategy to limit virus spread [2]. The current gold standard for detecting present infection is real-time reverse transcription PCR (RT-PCR), which detects viral RNA directly in a sample from the upper or lower airways and is a relatively personnel-, time- and resource-demanding procedure. In late March 2021, the Norwegian Institute of Public Health estimated that approximately 2.5% of the Norwegian population had been infected with SARS-CoV-2 [3]. To what extent SARS-CoV-2 infection leads to transient or long-lasting immunity is debated [4,5]. Simple and inexpensive antibody-detecting tests could potentially be of value in some situations, for instance to confirm past infection or for epidemiological surveillance [6-9]. For these purposes, a test's ability to reliably detect immunoglobulin G (IgG) is considered most important [6,7,10]. Many inexpensive antibody-detecting tests designed for point-of-care use are currently available for professional use.
Although the number of published manufacturer-independent evaluations is growing [9,11-24], data on test performance are not always available. The aim of the present study was to evaluate the analytical performance and user-friendliness of several rapid tests to aid health care professionals in their choice of antibody-detecting test, particularly in a low prevalence, point-of-care setting. We also included three automated immunoassay systems from hospital laboratories for partial evaluations.

The study was a collaboration between Kristiansand Municipality, Norway, Vestre Viken Hospital Trust, Norway, Lillebaelt Hospital, Denmark, and the Norwegian Organization for Quality Improvement of Laboratory Examinations (Noklus).

To evaluate the analytical specificities of the rapid tests, we used 99 pre-pandemic serum samples from the Vejle biobank, primarily set up to study the etiologies of diabetes and its comorbidities [25]. We also used 98 pre-pandemic clinical samples left over from routine analyses at Vestre Viken Hospital Trust, with no information on indication or suspected diagnosis. To evaluate the analytical sensitivities of the rapid tests and automated immunoassays, we used 65 serum samples from 65 convalescent participants who had previously been confirmed infected with SARS-CoV-2 by RT-PCR. All 65 had been treated in the community and had not required hospitalization.

The selection of rapid tests (Table 1 and Supplementary table) consisted of the tests that manufacturers or suppliers could send to Noklus by the set deadlines. The first 17 rapid tests were evaluated during June 2020 (65 samples from convalescents and 99 pre-pandemic samples) and September 2020 (98 pre-pandemic samples), and the final 15 rapid tests during October 2020 (all samples). To avoid unnecessary freeze-thaw cycles, the samples were kept in small aliquots at -80°C and thawed only immediately prior to analysis.

All rapid tests were lateral flow immunochromatographic assays, except number 32, which was based on a microfluidic system. Three tests required a reader (tests 8, 31 and 32), while the rest were read visually. Most tests had separate fields for detection of immunoglobulin M (IgM) and IgG on the same cartridge, although four tests detected only total antibodies (numbers 14, 21, 30, and 32). All rapid tests were performed under standardized and optimal conditions in accordance with the manufacturers' instructions at Noklus' headquarters. A faint band was considered a positive result. Results were read independently by two biomedical laboratory scientists (BLS), and in cases of discordant results, a third acted as arbitrator. Test interpretation was not blinded to reference standard status.

Due to the limited volume of some serum samples, not all samples were analyzed on all three automated immunoassay systems. Serum samples from the 65 previously RT-PCR positive participants were analyzed on two different platforms for qualitative detection of SARS-CoV-2 IgG: the DS2® Automated ELISA Processing System (DYNEX Technologies, Inc., 14340 Sullyfield Circle, Chantilly, VA 20151, USA) was used with the "EDI™ Novel Coronavirus COVID-19 IgG ELISA kit" (Epitope Diagnostics, Inc., 7110 Carroll Road, San Diego, CA 92121, USA), and the Alinity i (Abbott, Abbott Park, Illinois, USA) was used with the kit "SARS-CoV-2 IgG", ref 06R90 (Abbott Ireland, Diagnostics Division, Finisklin Business Park, Sligo, Ireland).
Forty-six of the samples were further analyzed on an iFlash 1800 immunoassay analyzer (Shenzhen YHLO Biotech Co., Ltd., China) with the kit "SARS-CoV-2 IgG", ref C86095G. The 98 pre-pandemic serum samples from Vestre Viken Hospital Trust were analyzed on the DS2 platform, and the 99 pre-pandemic serum samples from the Vejle biobank were analyzed on the iFlash platform. Only IgG results are reported.

Stata IC/16.1 (StataCorp LLC) was used for statistical analyses. IgG and IgM specificities were calculated separately (where possible) from analyses of pre-pandemic sera and defined as the proportion of samples testing negative for SARS-CoV-2 antibodies. IgG and IgM sensitivities were defined as the proportions of recovered COVID-19 participants who had detectable IgG and IgM antibodies, respectively. We computed 95% confidence intervals (CIs) for the sensitivities and specificities using the Agresti-Coull method [26] (an illustrative sketch of these calculations is given at the end of this section). The automated immunoassays all report an equivocal range around the cut-off value, but for the purpose of calculating sensitivity and specificity, we used the laboratory-reported cut-offs to classify borderline results as positive or negative (iFlash 1800: 12 AU, DYNEX DS2: 1.0 S/CO, and Alinity i: 1.4 S/C).

For any test, there is usually a trade-off between sensitivity and specificity, and the most important properties of a test will vary with the clinical situation [27]. When the prevalence of past and present COVID-19 is low, the most important property of a test is a very high specificity in order to minimize the risk of false positive results; both ≥99% [28] and ≥97% [10] have been suggested as cut-offs (a worked example of the effect of low prevalence on predictive values is also given at the end of this section). To facilitate the choice of antibody-detecting rapid test in a low prevalence, point-of-care setting, we suggest the following criteria to classify rapid test performance as "good", "acceptable" or "not acceptable":

1. IgG specificity:
   - "good" if the lower limit of the 95% CI of the point estimate is ≥97.0%
   - "acceptable" if the point estimate is ≥97.0% (while the lower limit of the 95% CI is <97.0%)
   - otherwise "not acceptable"
2. IgG sensitivity:
   - "good" if the point estimate is ≥90.0%
   - "acceptable" if the point estimate is ≥85.0% and <90.0%
   - otherwise "not acceptable"
3. User-friendliness (for a point-of-care setting):
   - "not acceptable" if the test is complicated to perform or the result is difficult to read
   - otherwise "good"

To receive an overall evaluation of "good", all three performance characteristics had to be classified as "good". If one was classified as "not acceptable", the overall evaluation was "not acceptable". Otherwise, the performance was considered "acceptable". Tests that detected total antibodies, and not IgG specifically, were not given an overall evaluation.

The project was considered a method evaluation study and was therefore exempt from ethical board approval in Norway. Recovered COVID-19 participants gave written informed consent to participate. In Denmark, the use of residual material, such as separated plasma/serum from anonymous healthy persons, for technical quality control is not restricted. The project was approved by the data protection officers in Kristiansand Municipality, at Vestre Viken Hospital Trust, and at Noklus. Suppliers provided their tests free of charge to Noklus and did not pay for the evaluation. By sending the tests, they consented to having the results published.
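The statistical calculations were performed in Stata; the sketch below is our own illustrative Python equivalent (not the authors' code) of the Agresti-Coull interval and of the IgG classification criteria listed above. The counts in the example at the bottom are hypothetical and serve only to show the mechanics.

```python
"""Illustrative sketch (not the authors' code): sensitivity/specificity point
estimates with Agresti-Coull 95% confidence intervals, and the IgG
classification criteria described in the text."""
import math

Z = 1.96  # standard normal quantile for a 95% confidence interval


def agresti_coull_ci(successes: int, n: int, z: float = Z) -> tuple:
    """Agresti-Coull interval for a binomial proportion [26]."""
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    half_width = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)


def classify_igg_specificity(negatives: int, n_prepandemic: int) -> str:
    """Specificity = proportion of pre-pandemic sera testing antibody negative."""
    point = negatives / n_prepandemic
    lower, _ = agresti_coull_ci(negatives, n_prepandemic)
    if lower >= 0.97:
        return "good"
    return "acceptable" if point >= 0.97 else "not acceptable"


def classify_igg_sensitivity(positives: int, n_convalescent: int) -> str:
    """Sensitivity = proportion of convalescent sera testing IgG positive."""
    point = positives / n_convalescent
    if point >= 0.90:
        return "good"
    return "acceptable" if point >= 0.85 else "not acceptable"


# Hypothetical counts, for illustration only:
# 195/197 pre-pandemic sera negative -> specificity 99.0%, lower 95% CI limit ~96.1% -> "acceptable"
# 57/65 convalescent sera IgG positive -> sensitivity ~87.7% -> "acceptable"
print(classify_igg_specificity(195, 197))
print(classify_igg_sensitivity(57, 65))
```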
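The rationale for emphasizing specificity at low prevalence can be made concrete with a worked example. The exact sensitivity of test 2 is reported in Table 2; the 90% used below is an assumed, purely illustrative value:

```latex
\mathrm{PPV} = \frac{\mathrm{sens}\cdot\mathrm{prev}}{\mathrm{sens}\cdot\mathrm{prev} + (1-\mathrm{spec})(1-\mathrm{prev})}
             = \frac{0.90 \times 0.02}{0.90 \times 0.02 + 0.01 \times 0.98} \approx 0.65
```

That is, even with a specificity of 99%, roughly one in three positive results would be expected to be false positives at a prevalence of 2%, consistent with the predictive values reported in Table 4.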
When blood samples were drawn from the 65 recovered COVID-19 participants, the number of days since their onset of symptoms was between 37 and 89 (median 67 days). The participants were between 15 and 75 years old (median age 53), and 38 (58%) were women. Participants reported having had varying degrees of symptoms during COVID-19, though none had required hospitalization.

The analytical performance of the rapid tests varied considerably (Table 2 and Supplementary figure). Twenty-one rapid tests had an IgG specificity above 97%. IgM specificity was generally equal to or lower than IgG specificity (Table 2). We were only able to evaluate the analytical specificity of two of the three automated immunoassays due to the small available volumes of pre-pandemic samples. The results, however, suggest that the analytical specificities of the iFlash and DYNEX DS2 systems were not necessarily superior to those of several of the rapid tests (Table 3).

Analytical sensitivity also varied considerably. Eleven rapid tests had point estimates of IgG sensitivity above 90%, while 14 tests had point estimates of IgG sensitivity below 85%. The IgG sensitivities of the included automated immunoassays did not exceed those of several of the rapid tests (Table 3).

We calculated predictive values at various prevalences for three rapid tests at different ends of the performance spectrum (Table 4). Although high specificity improves the positive predictive value (PPV) at lower prevalences, even a test with a specificity of 99% (test 2) would have a PPV below 70% at a prevalence of 2%.

There was great variability in the number of rapid tests that were positive in samples from each of the 65 recovered COVID-19 participants (Supplementary figure 1, panel B). For instance, 18 of the 65 participants tested positive for IgG antibodies on all 28 IgG-detecting rapid tests. While none of the tests had 100% sensitivity, no participant tested negative on all rapid tests either.

The majority of the tests were considered easy to perform and interpret. For three rapid tests, more than 10% of test results had to be interpreted by more than two BLS to reach consensus (Table 1). Nine tests were judged "not acceptable" by the user-friendliness evaluation criterion for a point-of-care setting.

Combining our evaluation criteria of IgG specificity, IgG sensitivity and user-friendliness, no test's overall performance was considered "good", but tests 2, 3, 4, 7, 12, 15, and 16 were considered "acceptable" for use in a low prevalence, point-of-care setting.

We evaluated 32 antibody-detecting rapid tests for SARS-CoV-2 using criteria of IgG specificity, IgG sensitivity, and user-friendliness. We found great variability in analytical performance. Emphasizing the test properties considered most important in a low prevalence, point-of-care setting, no test was considered "good", but seven tests were given an overall evaluation of "acceptable".

Strengths of our study include the large number of rapid tests evaluated under identical and optimal conditions, allowing direct comparisons of test properties. The use of serum samples predating the emergence of SARS-CoV-2 allowed us to evaluate analytical specificities. Also, since previous studies have shown that the amount of antibodies produced is associated with the severity of COVID-19 [14,29], the analytical properties of the tests will depend on the population in which they are used [28]. To evaluate the rapid tests in a community setting, we used sera from recovered, community-treated COVID-19 patients (all confirmed RT-PCR positive) who had not been hospitalized for COVID-19.
In addition, the community-treated recovered participants all had more than a month between onset of symptoms and blood sampling, allowing everyone ample time to develop antibodies [30,31].

Our study also has a number of limitations. We used a reasonably large number of serum samples for the evaluation, both from recovered COVID-19 participants (n=65) and from before the pandemic (n=197); however, a larger number of serum samples would have made the evaluation even more robust. We did not have access to sera with other known antibodies, or sera from patients with known previous infection with non-SARS coronaviruses or other pathogens, to further challenge the tests for cross-reactivity. Also, we did not have sufficient volumes of the pre-pandemic sera to allow full evaluations of the automated immunoassays; their results were included mainly to give an indication of their performance compared with that of the rapid tests. Further, by performing the evaluation under optimal conditions, not by the intended users, and not blinded to reference standard status, both preanalytical and analytical errors were minimized, so performance could be poorer in real life. In addition, even though all manufacturers state that whole blood (capillary or venous), serum, and plasma are equally suitable test materials, evaluating the rapid tests using whole blood rather than serum would have more closely mimicked real-life use. It is reassuring that studies have reported that the performance of most rapid antibody tests with whole blood was comparable to that with serum or plasma [22,32]. Finally, we were only able to investigate the analytical sensitivity and specificity, and not the real-life diagnostic accuracy, of the rapid tests; test performance could therefore be poorer in a diagnostic setting. An ideal, prospectively designed study would involve repeated testing with RT-PCR and antibody tests in a large group of people over a prolonged period of time. Still, we believe this study provides valuable and relevant information for health care professionals faced with a choice between many antibody-detecting rapid tests.

Several previous evaluations of rapid tests have now been published. Some are of limited quality because they use sera from a small number of pre-pandemic or COVID-19 participants [11,12,17,18,20,21,23,24]. Many use sera from hospitalized COVID-19 patients [11,13,15-18,20,21], or do not state whether participants had been hospitalized [12,23], which limits knowledge about the tests' usefulness in a community setting. Some studies report results for IgM and IgG combined [11,16,17,20], which does not allow conclusions about past infection. There is a very large number of antibody-detecting rapid tests available, and we found only a few studies that included some of the tests we evaluated; in these cases, the study designs were not considered similar enough to allow meaningful comparisons of test performance across studies. The Cochrane review published in June 2020 also noted the lack of high-quality diagnostic accuracy studies evaluating SARS-CoV-2 antibody tests in general, and point-of-care rapid tests in particular [9].

Tests 14, 21, 30, and 32 detected total antibodies and not IgG antibodies specifically. In our study, IgM specificity was generally equal to or lower than IgG specificity, implying a higher risk of false positive results when relying on IgM results.
Thus, in a patient with a positive total-antibody test, the possibility of an isolated IgM response, which could be due to unspecific cross-reactivity, cannot be ruled out without supplementary testing. Furthermore, it may be considered a disadvantage that IgM and IgG tests often come in the same cartridge. Past infection is confirmed with IgG alone [10], but the IgM result may be misinterpreted and cause confusion [15]. At worst, unspecific cross-reactivity may give the false impression that the patient has some protection against future infection with SARS-CoV-2, which may affect behavior and increase the risk of future COVID-19 and spread of SARS-CoV-2. It is noteworthy, however, that test 32, despite the risk of unspecific interference, demonstrated high specificity in combination with a very high sensitivity in our study population.

Our user-friendliness criterion was primarily designed for a point-of-care setting (e.g., primary health care, health centers, nursing homes). Test 31 is intended for use in laboratories of moderate complexity and was for that reason judged "not acceptable" under the user-friendliness criterion. However, this does not imply that the test is unsuitable for use in a laboratory facility of moderate complexity.

It has been reported that not everyone infected with SARS-CoV-2 will develop detectable antibodies [30,33]. The various antibody-detecting tests target antibodies against different SARS-CoV-2 antigens, and for the rapid tests, the target antigen is rarely declared. It is an interesting observation that while none of our rapid tests detected antibodies in all the recovered COVID-19 participants, none of the participants tested negative on all the rapid tests either. The differences probably reflect that people infected with SARS-CoV-2 produce both different amounts and different types of antibodies. Our results further suggest that by combining several antigens, or tests targeting several antigens, sensitivity may be increased. However, this could also increase the risk of false positive results, thereby lowering specificity.
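A simple worked calculation illustrates this trade-off. Assume, purely for illustration, two tests (or antigen targets) each with 90% sensitivity and 98% specificity, combined so that a positive result on either is read as positive, and assume statistical independence between them (which will not hold exactly in practice):

```latex
\mathrm{sens}_{\mathrm{comb}} = 1 - (1 - 0.90)^2 = 0.99, \qquad
\mathrm{spec}_{\mathrm{comb}} = 0.98^2 \approx 0.96
```

Sensitivity increases, but specificity, and with it the positive predictive value at low prevalence, decreases.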
Clinical use of antibody-detecting rapid tests is currently debated. In a hospital setting where RT-PCR is negative, antibody status may be helpful for the clinician [9]. Confirming past infection in a community setting has also been suggested, as have seroprevalence studies for epidemiological surveillance [9]. For tests using the spike protein of SARS-CoV-2 as the antigen, confirmation of vaccination status is a possible upcoming indication that we have not investigated. Since we evaluated the rapid tests in a population that had had varying degrees of symptoms during COVID-19, we do not know how the tests would perform in those who have had very few or no symptoms, which would be relevant in a seroprevalence study. Currently in Norway, where only an estimated 2.5% of the population have ever been infected with SARS-CoV-2, we suggest that confirmation of past COVID-19 infection in a community setting may be the most appropriate area of use for an antibody-detecting rapid test. What is considered the most important property of a rapid test will vary with the clinical setting, but in this situation, avoiding false positive results is important. For this reason, we have emphasized IgG specificity as the most important test property. However, as a larger proportion of the population becomes vaccinated, this type of use will probably become gradually less relevant.

The appropriate future use of antibody-detecting rapid tests in the clinical pathway of SARS-CoV-2 infection is currently uncertain. When an antibody-detecting rapid test is used in a low prevalence setting, the most important consideration should be the test's IgG specificity, which must be very high to minimize the risk of false positive results. Taking IgG sensitivity and user-friendliness into consideration as well, none of the 32 rapid tests evaluated had a performance classified as "good", but seven tests were classified as "acceptable" for use in a low prevalence, point-of-care setting.

Abbreviations: IgG, immunoglobulin G; IgM, immunoglobulin M; RT-PCR, reverse transcription polymerase chain reaction.

Funding: This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Abbreviations (tables): SARS-CoV-2, severe acute respiratory syndrome coronavirus 2; COVID-19, coronavirus disease 2019; IgG, immunoglobulin G; IgM, immunoglobulin M; Noklus, Norwegian Organization for Quality Improvement of Laboratory Examinations.

Table notes: (a) Rapid tests were classified according to three performance characteristics: (i) IgG specificity, (ii) IgG sensitivity, and (iii) user-friendliness; see text for details. No test was classified as "good" in all three areas; hence no test received an overall evaluation of "good". When at least one characteristic was classified as "not acceptable", so was the overall evaluation. Otherwise, performance was considered "acceptable". Tests are sorted according to overall evaluation, user-friendliness, and IgG specificity. (b) Specificities calculated from the 99 serum samples from the Vejle biobank.

References:
A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster
Public Health Surveillance for COVID-19, Interim guidance, World Health Organization
Norwegian Institute of Public Health COVID-19 modelling team, Situational awareness and forecasting for Norway
COVID-19 and the Path to Immunity
Change in Antibodies to SARS-CoV-2 Over 60 Days Among Health Care Personnel
Medicines and Healthcare products Regulatory Agency, Target Product Profile: Antibody tests to help determine if people have immunity to SARS-CoV
Laboratory diagnosis of severe acute respiratory syndrome coronavirus 2
Antibody tests for identification of current and past infection with SARS-CoV-2
The Role of Antibody Testing for SARS-CoV-2: Is There One?
Evaluation of nine commercial SARS-CoV-2 immunoassays
Evaluation of a COVID-19 IgM and IgG rapid test; an efficient tool for assessment of past exposure to SARS-CoV-2
Evaluation of SARS-CoV-2 serology assays reveals a range of test performance
Evaluation of eleven rapid tests for detection of antibodies against SARS-CoV-2
Diagnostic performance of seven rapid IgG/IgM antibody tests and the Euroimmun IgA/IgG ELISA in COVID-19 patients, Clinical Microbiology and Infection
Multicenter evaluation of two chemiluminescence and three lateral flow immunoassays for the diagnosis of COVID-19 and assessment of antibody dynamic responses to SARS-CoV-2 in Taiwan
Evaluation of two automated and three rapid lateral flow immunoassays for the detection of anti-SARS-CoV-2 antibodies
Four point-of-care lateral flow immunoassays for diagnosis of COVID-19 and for assessing dynamics of antibody responses to SARS-CoV-2
Evaluation of the performance of SARS-CoV-2 serological tools and their positioning in COVID-19 diagnostic strategies
Evaluation of Six Commercial Mid- to High-Volume Antibody and Six Point-of-Care Lateral Flow Assays for Detection of SARS-CoV-2 Antibodies
Retrospective clinical evaluation of 4 lateral flow assays for the detection of SARS-CoV-2 IgG
Clinical and laboratory evaluation of SARS-CoV-2 lateral flow assays for use in a national COVID-19 seroprevalence survey
Lateral Flow Assays for Rapid Point of Care Testing
Evaluation of diagnostic accuracy of 10 serological assays for detection of SARS-CoV-2 antibodies, European Journal of Clinical Microbiology & Infectious Diseases
Vejle Diabetes Biobank - a resource for studies of the etiologies of diabetes and its comorbidities
Interval estimation for a binomial proportion
Setting minimum clinical performance specifications for tests based on disease prevalence and minimum acceptable positive and negative predictive values: Practical considerations applied to COVID-19 testing
Antibody Profiles in Mild and Severe Cases of COVID-19
Antibody responses to SARS-CoV-2 in patients with COVID-19
Antibody Responses to SARS-CoV-2 in Patients With Novel Coronavirus Disease
Comparison of the diagnostic performance with whole blood and plasma of four rapid antibody tests for SARS-CoV-2
Profiling Early Humoral Response to Diagnose Novel Coronavirus Disease (COVID-19)