key: cord-0862364-75qbmnnr authors: Kostoulas, Polychronis; Eusebi, Paolo; Hartnack, Sonja title: Diagnostic accuracy estimates for COVID-19 RT-PCR and Lateral flow immunoassay tests with Bayesian latent class models date: 2021-03-31 journal: Am J Epidemiol DOI: 10.1093/aje/kwab093 sha: 7609e0c5f17192c13fc098905b720fd14deff6a6 doc_id: 862364 cord_uid: 75qbmnnr
The objective was to estimate the diagnostic accuracy of real-time polymerase chain reaction (RT-PCR) and lateral flow immunoassay (LFIA) tests for COVID-19, depending on the time post symptom onset. Based on the cross-classified results of RT-PCR and LFIA, we used Bayesian latent class models (BLCMs), which do not require a gold standard for the evaluation of diagnostics. Data were extracted from studies that evaluated LFIA (IgG and/or IgM) assays using RT-PCR as the reference method. Se RT-PCR was 0.68 (95% probability intervals: 0.63; 0.73). Se IgG/M was 0.32 (0.23; 0.41) for the first week and increased steadily. It was 0.75 (0.67; 0.83) and 0.93 (0.88; 0.97) for the second and third week post symptom onset, respectively. Both tests had a high to absolute Sp, with higher point median estimates for Sp RT-PCR and narrower probability intervals: Sp RT-PCR was 0.99 (0.98; 1.00) and Sp IgG/M was 0.97 (0.92; 1.00), 0.98 (0.95; 1.00) and 0.98 (0.94; 1.00) for the first, second and third week post symptom onset. The diagnostic accuracy of LFIA varies with time post symptom onset. BLCMs provide a valid and efficient alternative for evaluating the rapidly evolving diagnostics for COVID-19, under various clinical settings and different risk profiles.
Over the past few months, there has been a need for rapid development of diagnostic tests that will efficiently detect SARS-CoV-2 infection. Real-time reverse transcriptase polymerase chain reaction (RT-PCR) tests, which detect the RNA of SARS-CoV-2, are considered the reference (1) for a COVID-19 diagnosis.
In addition, the development of serological assays detecting SARS-CoV-2-specific IgM and/or IgG started immediately and is ongoing (2), with a large portion of them being lateral flow immunoassays (LFIA). These immunoassays are evaluated using RT-PCR as a gold standard (3) (4) (5). However, it is known that RT-PCR is less than 100% sensitive (6), while false positive results can also occur (7). Thus, if a new diagnostic test is evaluated assuming RT-PCR as a perfect reference standard, although it is not, the evaluation of the new test may be biased. In the absence of a gold standard, Bayesian latent class models (BLCMs), which do not require a priori knowledge of the infection status, are a valid alternative to classical test evaluation. In a BLCM setting, none of the tests is considered as a reference method, and the sensitivity (Se) and specificity (Sp) of each test are estimated from the analysis of the cross-classified results of two or more tests in one or more populations. Latent models for diagnostic accuracy studies were introduced with the two-test, two-population model (8), which is often referred to as the Hui and Walter paradigm. The first thorough discussion on the applicability of these methods in diagnostic accuracy studies was given by Walter and Irwig (9), and their implementation within a Bayesian framework has been evolving for over 20 years (10) (11) (12). A meta-analytic alternative for the evaluation of diagnostics from multiple studies in the absence of a reference test has been proposed and can be used if a sufficiently large number of studies is available (13). Recently, guidelines for the application and sound reporting of BLCMs in diagnostic accuracy studies, the STARD-BLCM statement, have been proposed (14, 15).
Hence, Se/Sp estimates for both RT-PCR and IgG/M were obtained and for each of the first three weeks after the onset of symptoms. In this study we followed the STARD-BLCM guidelines (Web Table 1) (15).
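The introduction's point that scoring a new test against RT-PCR as if it were perfect may bias the evaluation can be illustrated numerically. The following is a minimal sketch, not part of the original analysis: it assumes the two tests are conditionally independent given the true infection status, the function name `apparent_accuracy` is hypothetical, and the prevalence of 0.5 is an arbitrary illustrative value. The Se and Sp inputs are the median estimates reported in the abstract (RT-PCR, and LFIA in the third week).

```python
def apparent_accuracy(prev, se_ref, sp_ref, se_new, sp_new):
    """Se and Sp a new test appears to have when scored against an imperfect reference.

    Assumes conditional independence of the two tests given the true infection status.
    """
    # Probability that both the reference and the new test are positive
    ref_pos_new_pos = prev * se_ref * se_new + (1 - prev) * (1 - sp_ref) * (1 - sp_new)
    # Probability that the reference is positive
    ref_pos = prev * se_ref + (1 - prev) * (1 - sp_ref)
    # Probability that both the reference and the new test are negative
    ref_neg_new_neg = prev * (1 - se_ref) * (1 - se_new) + (1 - prev) * sp_ref * sp_new
    ref_neg = 1 - ref_pos
    return ref_pos_new_pos / ref_pos, ref_neg_new_neg / ref_neg


# Reference: RT-PCR (Se 0.68, Sp 0.99); new test: LFIA in week three (Se 0.93, Sp 0.98).
# The prevalence of 0.5 is an assumption made only for this illustration.
apparent_se, apparent_sp = apparent_accuracy(
    prev=0.5, se_ref=0.68, sp_ref=0.99, se_new=0.93, sp_new=0.98
)
print(round(apparent_se, 3), round(apparent_sp, 3))
```

Under these assumed inputs the apparent specificity of the new test drops to roughly 0.76 even though its true value is 0.98, because infected individuals missed by the reference are counted as false positives of the new test.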
A flow chart for the selection process is in Web Figure 1. We conducted the literature search using PubMed, medRxiv and bioRxiv without any language restrictions. The search strategy and results for each database are in Web Table 2. The following search terms were used: ("SARS-CoV-2" OR "SARS-CoV-2" OR "Coronavirus disease 2019" OR "COVID-19") AND ("IgM" OR "IgG" OR "antibodies" OR "antibody" OR "serological" OR "serologic" OR "serology" OR "serum" OR "lateral flow"). Initially, 448 non-duplicated records were screened, and 28 full-text resources were scrutinized. Finally, four studies (18) (19) (20) (21) were identified that fulfilled criteria (i) to (iv) and from which cross-classified results could be extracted (patient characteristics, study design, and diagnostic tests of these studies are summarized in Web to vary between weeks post symptom onset.
Briefly, we assume that for each of the i populations (in our case the four different studies) the cross-classified results of the two tests follow an independent multinomial sampling distribution, with the multinomial cell probabilities expressed in terms of the Se and Sp of the two tests and the prevalence p_i in each population. Within a fully Bayesian estimation framework, Beta distributions, Be(a, b), are used as priors for the parameters of interest: the Se and Sp of each test and the prevalence p_i in each population. Our model assumed that RT-PCR and LFIA are conditionally independent, an assumption which is expected to be valid because the two tests are based on different biological principles (10). Nevertheless, to account for the unlikely, yet existent, possibility of conditional dependence between RT-PCR and LFIA, we also considered a model that captures conditional dependences through cdp and cdn, the conditional covariances between the Ses and the Sps, respectively.
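For reference, the two-test latent class model described above is commonly written as follows. This is a sketch of the standard Hui and Walter parameterisation with covariance terms added, not a transcription of the authors' own equations. The notation is assumed: subscript 1 denotes RT-PCR and subscript 2 denotes LFIA, y_i holds the four cross-classified counts in population i, n_i is the number tested, and the second and third cell indices give the RT-PCR and LFIA result (1 = positive, 2 = negative).

```latex
% Assumed notation: test 1 = RT-PCR, test 2 = LFIA; p_i = prevalence in population i
\begin{align*}
y_i \mid n_i &\sim \mathrm{Multinomial}\bigl(n_i,\,(p_{i11},\, p_{i12},\, p_{i21},\, p_{i22})\bigr) \\
p_{i11} &= p_i\,\bigl(Se_1 Se_2 + cdp\bigr) + (1-p_i)\,\bigl((1-Sp_1)(1-Sp_2) + cdn\bigr) \\
p_{i12} &= p_i\,\bigl(Se_1 (1-Se_2) - cdp\bigr) + (1-p_i)\,\bigl((1-Sp_1)\,Sp_2 - cdn\bigr) \\
p_{i21} &= p_i\,\bigl((1-Se_1)\,Se_2 - cdp\bigr) + (1-p_i)\,\bigl(Sp_1 (1-Sp_2) - cdn\bigr) \\
p_{i22} &= p_i\,\bigl((1-Se_1)(1-Se_2) + cdp\bigr) + (1-p_i)\,\bigl(Sp_1 Sp_2 + cdn\bigr)
\end{align*}
```

Setting cdp = cdn = 0 recovers the conditional independence model. In either case the four cell probabilities sum to one, because the covariance terms cancel within the infected and the non-infected component.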
Uniform priors were specified for cdp and cdn, with their limits determined by the magnitude of the Se and Sp values (22). We have a two-test, four-subpopulation model, which is fully identifiable because the number of parameters to be estimated is eight (i.e. the Se and Sp of each test and the prevalence of SARS-CoV-2 infection in each population) for the independence model and ten (i.e. with the two additional cdp and cdn parameters) for the dependence model, while the degrees of freedom available from the data are twelve (three from each of the four 2 × 2 tables). In all alternative prior combinations, a non-informative, uniform beta prior distribution, Be(1, 1), over the range from 0 to 1, was adopted for the , and the prevalence of SARS-CoV-2 infection in each population . (24), autocorrelation checks and visual inspection of the trace plots and summary statistics were used as recommended (25). Parameter estimates were based on analytical summaries of 60,000 iterations of three chains after a burn-in adaptation phase of 10,000 iterations. All checks suggested that convergence occurred and autocorrelations dropped off quickly (Web Figure 2). Models were run in the freeware program JAGS (26) through R (27) using the rjags package (26). Priors were generated with the PriorGen package (28). The code is available at https://github.com/paoloeusebi/BLCM-Covid19.
A total of 448 studies were initially identified as studies on the evaluation of COVID-19 diagnostics, and 28 of them provided access to full data that could be extracted. Table 4 ).
We used BLCMs to estimate the diagnostic accuracy of RT-PCR and LFIA tests for SARS-CoV-2 infection depending on the time from the onset of symptoms. BLCMs do not require the presence of a reference test and thus allow for the simultaneous Se and Sp estimation of both tests. They provide a valid and efficient alternative to classical test evaluation (8, 15). Importantly, the degrees of freedom provided by the data (i.e. 12) exceeded the number of parameters that had to be estimated (i.e.
8 and 10 for the conditional independence and dependence model, respectively), satisfying a necessary condition for identifiability. Further, sensitivity analysis revealed that under alternative prior specifications our results were similar (Web Table 4 that can be followed by a decline in viral load (35). The latter observation, of an increasing positive detection rate for IgG and/or IgM alongside a steady and potentially slight decrease in SARS-CoV-2 viral load, has also been reported elsewhere (36, 37). Se IgG/M is higher than Se RT-PCR after the second week, which is also in line with recent evidence that the sensitivity of antibody assays overtook that of the RNA test on day 8 after the onset of symptoms (38). Further, other authors also found a steep increase for antibodies, particularly in the second week, accompanied by a slight decrease in the probability of detection with nasopharyngeal swab, bronchoalveolar or sputum PCR over the first three weeks after symptom onset (39). The Sp RT-PCR estimate was close to unity, but false positive results can occur (7). There is scarcity of Sp estimates for RT-PCR methods because they are considered as the Finally, Sp IgG/M was also close to perfect, with median estimates that were consistently lower than those for Sp RT-PCR, although not statistically different. False positive results can be due to cross-reactions, which have been observed in diagnostic evaluation studies that were based on a reference standard from healthy individuals or individuals with diseases unrelated to SARS-CoV-2 infection (43). Cross-reactivity between SARS-CoV-2 IgM assays and the rheumatoid factor IgM (RF-IgM) has also been observed (44). for Se/Sp estimates that will be specific to different risk profiles and will allow for the interpretation of test outcomes according to the relevant epidemiological situation in each case.
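To illustrate how Se/Sp estimates feed into the interpretation of a test outcome under different epidemiological situations, the following minimal sketch computes positive and negative predictive values with Bayes' theorem. It is not part of the original analysis: the function name `predictive_values` is hypothetical and the two prevalence values are arbitrary illustrative assumptions. The LFIA inputs are the medians reported in the abstract for the first, second and third week post symptom onset.

```python
def predictive_values(prev, se, sp):
    """Return (PPV, NPV) for a test with sensitivity se and specificity sp at prevalence prev."""
    true_pos = prev * se
    false_pos = (1 - prev) * (1 - sp)
    true_neg = (1 - prev) * sp
    false_neg = prev * (1 - se)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Median (Se, Sp) for LFIA (IgG/M) by week post symptom onset, taken from the abstract
lfia_by_week = {1: (0.32, 0.97), 2: (0.75, 0.98), 3: (0.93, 0.98)}

# Assumed prevalences among tested individuals (illustrative only, not from the study)
for prev in (0.05, 0.30):
    for week, (se, sp) in lfia_by_week.items():
        ppv, npv = predictive_values(prev, se, sp)
        print(f"prevalence={prev:.2f} week={week} PPV={ppv:.2f} NPV={npv:.2f}")
```

The same Se and Sp give quite different predictive values as the assumed prevalence changes, which is why estimates specific to a risk profile matter for interpreting an individual result.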
All data, models, code and priors necessary to reproduce the results of this article are available in the main text or as supplementary files.
This work was funded by COST Action CA18208.
Diagnostic Testing for Severe Acute Respiratory Syndrome-Related Coronavirus-2: A Narrative Review
Developing antibody tests for SARS-CoV-2. The Lancet
Clinical significance of IgM and IgG test for diagnosis of highly suspected COVID-19 infection
Clinical meanings of rapid serological assay in patients tested for SARS-Co2
Development and Clinical Application of A Rapid IgM-IgG Combined Antibody Test for SARS-CoV-2 Infection Diagnosis
False-negative of RT-PCR and prolonged nucleic acid conversion in COVID-19: Rather than recurrence
Positive RT-PCR Test Results in Patients Recovered from COVID-19
Estimating the Error Rates of Diagnostic Tests
Estimation of test error rates, disease prevalence and relative risk from misclassified data: a review
Estimation of diagnostic-test sensitivity and specificity through Bayesian modeling
Estimation of diagnostic test accuracy without full verification: A review of latent class methods
Sample size estimation to substantiate freedom from disease for clustered binary data with a specific risk profile
Bayesian Meta-Analysis of the Accuracy of a Test for Tuberculous Pleuritis in the Absence of a Gold Standard Reference
STARD-BLCM: Standards for the Reporting of Diagnostic accuracy studies that use Bayesian Latent Class Models
Reporting guidelines for diagnostic accuracy studies that use Bayesian latent class models (STARD-BLCM)
The STARD statement for reporting studies of diagnostic accuracy: explanation and elaboration
Rapid diagnosis of SARS-CoV-2 infection by detecting IgG and IgM antibodies with an immunochromatographic device: a prospective single-center study
Diagnostic Indexes of a Rapid IgG/IgM Combined Antibody Test for SARS-CoV-2. medRxiv
Serological immunochromatographic approach in diagnosis with SARS-CoV-2 infected COVID-19 patients
Test performance evaluation of SARS-CoV-2 serological assays
Bayesian approaches to modeling the conditional dependence between multiple diagnostic tests
[Practical Markov Chain Monte Carlo]: comment: one long run with diagnostics: implementation strategies for Markov Chain Monte Carlo. Statistical science
Inference from iterative simulation using multiple sequences
CODA: Convergence diagnosis and output analysis software for Gibbs sampling output. Version 0.3. MRC Biostatistics Unit, Cambridge University
rjags: Bayesian graphical models using MCMC. R package version
R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing
PriorGen: Generates Prior Distributions for Proportions. R package version 1
Diagnosis of SARS-CoV-2 Infection based on CT scan vs. RT-PCR: Reflecting on Experience from MERS-CoV
Temporal profiles of viral load in posterior oropharyngeal saliva samples and serum antibody responses during infection by SARS-CoV-2: an observational cohort study. The Lancet Infectious Diseases
Antibody testing for COVID-19: A report from the National COVID Scientific Advisory Panel
Antibody responses to SARS-CoV-2 in COVID-19 patients: the perspective application of serological tests in clinical practice
Serological detection of 2019-nCoV respond to the epidemic: A useful complement to nucleic acid testing
Clinical infectious diseases : an official publication of the Infectious Diseases Society of America
Serology characteristics of SARS-CoV-2 infection since exposure and post symptom onset. The European respiratory journal
Diagnostic value and dynamic variance of serum antibody in coronavirus disease 2019
Clinical infectious diseases : an official publication of the Infectious Diseases Society of America
Evaluation the auxiliary diagnosis value of antibodies assays for detection of novel coronavirus (SARS-Cov-2) causing an outbreak of pneumonia (COVID-19)
Interpreting Diagnostic Tests for SARS-CoV-2
Real-time RT-PCR in COVID-19 detection: issues affecting the results. 2020; Expert review of molecular diagnostics
Correlation of Chest CT and RT-PCR Testing in Coronavirus Disease 2019 (COVID-19) in China: A Report of 1014 Cases
Performance of radiologists in differentiating COVID-19 from viral pneumonia on chest CT
Evaluation of nine commercial SARS-CoV-2 immunoassays
A method to prevent SARS-CoV-2 IgM false positives in gold immunochromatography and enzyme-linked immunosorbent assays