key: cord-0814484-l2wxq1qm authors: Sherrill-Mix, S. title: Dynamics of RT-qPCR SARS-CoV-2 Detection Rates Prior to and After Symptom Onset date: 2020-07-11 journal: nan DOI: 10.1101/2020.07.09.20149245 sha: f3d7b788e64a6796a3e1fea10a214d7f1637fd22 doc_id: 814484 cord_uid: l2wxq1qm

Effective RT-qPCR testing for SARS-CoV-2 is essential for treatment, surveillance and control of the COVID-19 pandemic. A recent meta-analysis suggested that testing prior to the onset of symptoms is likely to miss the majority of infected individuals. These findings cast severe doubt on the effectiveness of mass screening efforts intended to detect SARS-CoV-2 prior to the onset of symptoms and decrease community transmission from pre-/asymptomatic individuals. However, alternative analyses and additional data described herein refine these estimates and suggest that many SARS-CoV-2 infections could potentially be detected prior to symptom onset.

In a recently published study, Kucirka et al. 1 collected data from seven studies of RT-qPCR testing of patients infected with SARS-CoV-2 5-11 and concluded that detection of SARS-CoV-2 was very difficult prior to symptom onset and effectively impossible earlier than three days prior to symptom onset. However, the polynomial logistic regression model used for this analysis has the potential to underestimate uncertainty in time periods with sparse data, and unfortunately only three tests from a single patient were available prior to the onset of symptoms. To account for this uncertainty more adequately, we developed a Bayesian autoregressive moving average state space model (B-ARMA-SSM) and used Markov chain Monte Carlo sampling to estimate posterior probabilities for parameters of interest from the data collected by Kucirka et al. 1

For the day of and days following symptom onset, the detection rates estimated by the B-ARMA-SSM and the polynomial model of Kucirka et al. 
1 closely resembled each other ( Figure 1 ). Both models estimated a peak in detection rates 3-4 days after symptom onset followed by a progressive decline in the probability of a positive detection. However in the days prior to symptom onset, the two models differed markedly in their estimates of detection rates and in their confidence in these estimates. In spite of the data only containing three tests in this time period, the polynomial model estimated very low detection rates, ostensibly precisely and with little potential for error. In contrast, the B-ARMA-SSM estimated higher probabilities of presymptomatic detection while displaying a much larger amount of uncertainty in its estimates ( Figure 1 ). For example, at 4 days prior to symptom onset the B-ARMA-SSM estimates a detection rate of 35% (95% credible interval: 6-76%) while the polynomial model estimates a detection rate of 0% (95% CrI: 0-0%). Intuitively, the larger confidence intervals seem more appropriate given the limited testing available but without further data it is unclear which of these estimates is more likely to reflect reality. Fortunately, additional data has become available allowing a test of the predictive powers of these models. We identified 7 additional studies 2,3,12-16 by following the PubMed search strategy described previously 1 . These new data contained 381 results from qPCR testing of 124 patients, including 21 patients measured prior to symptom onset. Combined with the previous seven studies, these data contained the results of 1619 RT-qPCR tests. Fitting the B-ARMA-SSM to this combined data produced an updated estimate with remarkable similarity to its previous predictions ( Figure 2 ). In addition, the estimates fell well within the credible intervals previously predicted by the model suggesting that the B-ARMA-SSM was able to adequately account for uncertainty in the data. In contrast, the new data did not seem to agree with the predictions made by the polynomial model. 
For example, the polynomial model predicted a 0% probability of detection earlier than 3 days prior to symptom onset, yet 8 out of 13 tests administered between 4 and 7 days prior to symptom onset were positive (Figure 2).

Overall, the B-ARMA-SSM estimated that in the combined data RT-qPCR detection rates averaged 80% (95% CrI: 65-89%) on the day of symptom onset, increased to 86% (95% CrI: 76-92%) at three days after symptoms and then fell steadily further into infection. Similarly, detection was estimated as increasingly unlikely the earlier a patient was sampled prior to the onset of symptoms. For example, at 4 days prior to symptom onset, the average detection rate was estimated at 56% (95% CrI: 30-79%) and at 8 days prior to symptom onset, rates were estimated at 21% (95% CrI: 5-55%). Thus, detection of SARS-CoV-2 in presymptomatic individuals must be approached cautiously but is not nearly as difficult as previously suggested.

The relatively high false negative rates in SARS-CoV-2 RT-qPCR testing predicted here and previously 1 are worrisome. Of course, what counts as a "false negative" depends greatly on the purpose of testing. If tests are intended to provide documentation of previous SARS-CoV-2 infection, then negative tests late in infection do indeed represent false negatives. However, with the rapid development of serological testing, the more likely use for RT-qPCR testing is to detect positive cases prior to the onset of symptoms and to monitor viral clearance after symptom onset. In these cases, many "false negatives" are likely to be due to the reduction of viral load to undetectable, and potentially less transmissible 2,4,15,17,18, levels rather than a failure of PCR testing. Thus at later time points, the false negative rate is likely overestimated.
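As a rough, back-of-the-envelope check on the counts quoted above (8 positives out of 13 tests taken 4 to 7 days before symptom onset), the sketch below pools those tests and computes a Beta posterior for the detection probability. This is not the B-ARMA-SSM: pooling the four days, ignoring study effects, and the uniform Beta(1, 1) prior are all simplifying assumptions made here purely for illustration, and the helper names are hypothetical.

```python
from math import comb


def beta_cdf(x, a, b):
    """CDF of Beta(a, b) at x for positive integers a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(comb(n, i) * x**i * (1 - x) ** (n - i) for i in range(a, n + 1))


def beta_quantile(q, a, b, tol=1e-10):
    """Quantile of Beta(a, b) found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Counts quoted in the text: 8 positives out of 13 tests, 4-7 days before onset.
positives, tests = 8, 13

# Assumption: uniform Beta(1, 1) prior, so the posterior is Beta(9, 6).
a = 1 + positives
b = 1 + (tests - positives)

post_mean = a / (a + b)
ci_low = beta_quantile(0.025, a, b)
ci_high = beta_quantile(0.975, a, b)

print(f"posterior mean: {post_mean:.2f}")
print(f"95% credible interval: {ci_low:.2f} to {ci_high:.2f}")
```

Even under this crude pooling, a detection probability near 0% lies well below the lower bound of the interval, which is the qualitative point made in the text. The interval is not comparable in detail to the model's day-specific estimates.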
A similar ambiguity about what counts as a false negative arises in interpreting rates prior to symptom onset, where it is unclear whether a negative result is due to a missed detection or instead because the patient has not yet been infected or has not yet reached significant levels of viral load.

In interpreting these data, it is also important to consider the limitations inherent to opportunistically collected data. The collected studies varied in techniques, assays and patient populations, and some studies were estimated to have significantly higher or lower detection rates. For example, the van Kampen et al. 15 study was estimated to have the highest predicted detection rate with an estimated 97% (95% CrI: 94-95%) probability of detection in infected patients at three days after symptom onset. In addition, these analyses ignore censoring in the patient data. If a patient tests negative several times in a row, then their doctor will often not continue to test them, which would bias late-stage data towards patients with prolonged disease. However, patient death or loss to follow-up due to more extreme disease progression could have an opposing effect. Further data from prospective sampling of at-risk patients is essential to further characterize this critical period.

Population-wide screening efforts are being implemented in an attempt to reduce spread from asymptomatic and presymptomatic individuals infected with SARS-CoV-2. An understanding of detection rates is critical to these efforts. The preliminary analyses presented here suggest that, although testing will certainly miss some infections, the detection of many presymptomatic individuals is indeed possible and offers the potential to greatly reduce the spread of SARS-CoV-2 19.

Data from seven studies 5-11 analyzed previously was obtained from Kucirka et al. 1. Additional data was collected from seven additional studies reporting testing in patients who developed symptoms, were tested multiple times and tested positive at least once 2,3,12-16.
Where raw data were unavailable, data were digitized from published plots. Data from Arons et al. 2 was provided by personal communication. Only nasopharyngeal swabs were used from Xiao et al. 12 and tests detecting ≥ 1 PCR target were counted as positives. Only upper respiratory tract samples were used from van Kampen et al. 15. The data from a preprint by Kujawski et al. 8 in the initial data was replaced by data from the final publication by the COVID-19 Investigation Team 20.

In order to estimate the longitudinal progression of detection rates in SARS-CoV-2 patients, we developed a Bayesian autoregressive moving average state space model (B-ARMA-SSM). This model attempts to reduce potential misinterpretations in data-sparse time periods by assuming simply that the detection rate on a given day should on average resemble that of the previous day and that increases or decreases in detection rate on a previous day will on average tend to continue. These probabilities are not directly observed but are inferred from the counts of positive and negative tests. To anchor the model, detection at 20 days prior to symptom onset is given a low prior probability. Differential detection rates in the individual studies are modeled as a constant multiplicative change in odds ratio drawn from a normal distribution of potential study offsets. The observed counts were thus modeled as binomial, $x_{j,t} \sim \mathrm{Binomial}(n_{j,t}, p_{j,t})$, where $x_{j,t}$ is the number of positives out of $n_{j,t}$ total tests from study $j$ at $t$ days after symptom onset, with estimated binomial probability $p_{j,t}$ and study effect $B_j$. The posterior probabilities of the Bayesian model were estimated using Markov chain Monte Carlo sampling as implemented in Stan 21 and analysis and plotting were performed in R 22.
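The structure described above can be illustrated with a small forward simulation. This is only a sketch of the generative idea, not the Stan implementation and not an inference procedure. A latent log-odds of detection drifts from day to day, with changes that tend to persist. Each study shifts that log-odds by a constant offset, which is the same as a constant multiplicative change in odds. Test results are then binomial draws. The function name, the damping coefficient `phi`, the noise scales, the anchor value and the day range are all assumptions chosen for illustration; the priors and exact ARMA terms used in the paper are not reproduced here.

```python
import math
import random


def inv_logit(z):
    """Map log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate_detection(days=range(-20, 11), n_studies=3, tests_per_day=20,
                       anchor_logit=-4.0, phi=0.8, sigma_change=0.15,
                       sigma_study=0.5, seed=1):
    """Simulate binomial test counts from a drifting latent detection rate.

    All numeric defaults are illustrative assumptions, not fitted values.
    """
    rng = random.Random(seed)
    # Study effects: constant shifts in log-odds drawn from a normal distribution.
    study_offsets = [rng.gauss(0.0, sigma_study) for _ in range(n_studies)]

    state = anchor_logit  # low detection probability on the anchor day
    change = 0.0          # most recent day-to-day change in log-odds
    rows = []
    for i, day in enumerate(days):
        if i > 0:
            # Today's change resembles yesterday's (damped by phi) plus noise,
            # and today's state is yesterday's state plus that change.
            change = phi * change + rng.gauss(0.0, sigma_change)
            state += change
        for j, b_j in enumerate(study_offsets):
            logit_p = state + b_j
            p = inv_logit(logit_p)
            # Binomial draw: number of positive tests out of tests_per_day.
            x = sum(rng.random() < p for _ in range(tests_per_day))
            rows.append({"study": j, "day": day, "n": tests_per_day,
                         "x": x, "p": p, "logit": logit_p})
    return study_offsets, rows


offsets, rows = simulate_detection()
for r in rows[:3]:
    print(r["study"], r["day"], r["x"], r["n"], round(r["p"], 3))
```

In a fitted model the direction is reversed: the counts are observed and the latent states and study offsets are inferred, for example by MCMC as described in the text.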
All data and code are available at https://github.com/sherrillmix/covidRTPCR and will be archived at Zenodo prior to final publication.

This preprint (which was not certified by peer review) was posted July 11, 2020 and is made available under a CC-BY-NC 4.0 International license. The copyright holder is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. https://doi.org/10.1101/2020.07.09.20149245

Variation in false-negative rate of reverse transcriptase polymerase chain reaction-based SARS-CoV-2 tests by time since exposure
Presymptomatic SARS-CoV-2 infections and transmission in a skilled nursing facility
Viral kinetics of SARS-CoV-2 in asymptomatic carriers and presymptomatic patients
Cluster of coronavirus disease 2019 (COVID-19) in the French Alps
Virological assessment of hospitalized patients with COVID-2019
Myoung Don Oh, and Korea National Committee for Clinical Management of COVID-19. Clinical course and outcomes of patients with severe acute respiratory syndrome coronavirus 2 infection: a preliminary report of the first 28 patients from the Korean cohort study on COVID-19
Antibody responses to SARS-CoV-2 in patients of novel coronavirus disease
A preliminary study on serological assay for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in 238 admitted hospital patients
Profiling early humoral response to diagnose novel coronavirus disease (COVID-19)
Profile of RT-PCR for SARS-CoV-2: a preliminary study from 56 COVID-19 patients
Assessment of sensitivity and specificity of patient-collected lower nasal specimens for sudden acute respiratory syndrome coronavirus 2 testing
SARS-CoV-2 shedding and seroconversion among passengers quarantined after disembarking a cruise ship: a case series
Shedding of infectious virus in hospitalized patients with coronavirus disease-2019 (COVID-19): duration and key determinants. medRxiv
Prolonged virus shedding even after seroconversion in a patient with COVID-19
Predicting infectious SARS-CoV-2 from diagnostic samples
Pathogenesis and transmission of SARS-CoV-2 in golden hamsters
The implications of silent transmission for the control of COVID-19 outbreaks
Clinical and virologic characteristics of the first 12 patients with coronavirus disease 2019 (COVID-19) in the United States
Stan: A probabilistic programming language
R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing