key: cord-0883779-gcyvawdz
authors: Borremans, Benny; Gamble, Amandine; Prager, KC; Helman, Sarah K; McClain, Abby M; Cox, Caitlin; Savage, Van; Lloyd-Smith, James O
title: Quantifying antibody kinetics and RNA detection during early-phase SARS-CoV-2 infection by time since symptom onset
date: 2020-09-07
journal: eLife
DOI: 10.7554/elife.60122
sha: b284d9d483f81b0dfd87d45625b3d515c71e581d
doc_id: 883779
cord_uid: gcyvawdz

Understanding and mitigating SARS-CoV-2 transmission hinges on antibody and viral RNA data that inform exposure and shedding, but extensive variation in assays, study group demographics and laboratory protocols across published studies confounds inference of true biological patterns. Our meta-analysis leverages 3214 datapoints from 516 individuals in 21 studies to reveal that seroconversion of both IgG and IgM occurs around 12 days post-symptom onset (range 1–40), with extensive individual variation that is not significantly associated with disease severity. IgG and IgM detection probabilities increase from roughly 10% at symptom onset to 98–100% by day 22, after which IgM wanes while IgG remains reliably detectable. RNA detection probability decreases from roughly 90% to zero by day 30, and is highest in feces and lower respiratory tract samples. Our findings provide a coherent evidence base for interpreting clinical diagnostics, and for the mathematical models and serological surveys that underpin public health policies.

Since its emergence in December 2019, the SARS-CoV-2 pandemic has been the subject of intense research assessing all facets of the pathogen and its rapid global spread. Serology, the measurement of serum antibodies, provides crucial data for understanding key aspects of infection and epidemiology. At the level of populations, serologic data can provide insights into virus spread by enabling estimation of the overall attack rate, and seroprevalence estimates can elucidate the potential for herd immunity (Stringhini et al., 2020; Bryant et al., 2020). In addition, these estimates are essential for developing accurate mathematical models of virus transmission dynamics, which provide the foundation for policies to reopen societies (Krsak et al., 2020; Angulo et al., 2020; Kissler et al., 2020). At the level of individuals, the presence and concentration of antibodies against SARS-CoV-2 are indicators of past exposure, providing insights over a much wider temporal window than other metrics. When considered jointly with PCR testing to detect viral RNA, antibodies substantially improve the probability of detecting present and past infections (Prager et al., 2019). This improvement is highly valuable because RNA detection is typically limited to a relatively brief period of infection, and because PCR sensitivity varies considerably with infection severity and biological sample type (Azkur et al., 2020; Yongchen et al., 2020). Assessment of the levels of different antibody types (e.g. IgG, IgM) may even be used to infer approximately when individuals became infected (Azkur et al., 2020; Chang et al., 2005; Du et al., 2020; Borremans et al., 2016), while detection of neutralizing antibodies may indicate protection from reinfection (Ni et al., 2020).

These applications of serologic data depend critically on knowing when different antibodies against the pathogen become detectable (seroconversion time), how their concentrations change over time (antibody level kinetics) and how long they last (antibody decay). When these key factors are known, serologic data become a powerful tool for inferring infection attack rate and transmission dynamics in the population (Bryant et al., 2020; Winter and Hegde, 2020). Five months into the pandemic, a remarkable number of serologic studies on the initial immune response against SARS-CoV-2 had been published. These studies were conducted in different laboratories, used different assays and sampling methods, and were performed on different patient groups that showed different clinical manifestations of SARS-CoV-2 infection (Whitman et al., 2020; Lassaunière et al., 2020; Kontou et al., 2020).

This extensive variation arising from many sources creates substantial challenges for integrating existing data into one coherent picture of antibody kinetics and viral RNA detection following SARS-CoV-2 infection. In 21 studies reporting the kinetics of anti-SARS-CoV-2 antibodies, we found the use of 8 different antibody assays, 10 different target antigens, and 9 different reported antibody level units (studies are listed in the Materials and methods section). Additionally, the temporal resolution at which studies collect data is highly variable: while some studies report antibody measurements for specific days, many bin results into periods of multiple days or even weeks. Integrated analysis of such diverse data is challenging, and requires statistical methods specifically developed for this purpose. Yet this type of integration is essential to capitalize on the limited and precious data available, to assess to what degree antibody and RNA detection patterns are affected by assay type and target antigen choice, and to establish consensus patterns. For example, a properly integrated analysis would better enable us to test whether antibody patterns depend on disease severity (Tan et al., 2020).

In this study, we quantified IgG and IgM antibody kinetics and RNA detection probability during SARS-CoV-2 infection (up to 60 days post-symptom onset) by aggregating data from published sources. We formally characterized IgG and IgM seroconversion times, detection probabilities over time and antibody level kinetics using methods tailored to accommodate the diverse ways in which data have been collected and reported. We investigated how these variables are affected by disease severity, assay type and targeted antigen, and how patterns differ between IgG and IgM. We also assessed how antibody level kinetics relate to the probability of detecting viral RNA in various biological samples. We estimated mean values as well as observed variation of all variables in order to provide the complete picture required to interpret serological and RNA testing data, inform mitigation strategies and parameterize mathematical models of pathogen transmission while accounting for variability. This formal integration approach enabled us to leverage 3214 data points from 516 individuals with symptoms ranging from asymptomatic to critical, published in 21 studies, resulting in a quantitative synthesis of diverse data on anti-SARS-CoV-2 antibody patterns and RNA detection during the early phase of infection.

We extracted data from 21 preprints and peer-reviewed articles reporting data on SARS-CoV-2 RNA or IgG, IgM or neutralizing antibodies against the virus in humans (see Materials and methods). When available, disease severity information was classified into three groups: asymptomatic/subclinical (n = 11 individuals), mild/moderate (n = 166), and severe/critical (n = 58). Unfortunately, the sample size for the asymptomatic group was too low for quantitative analyses. For 359 individuals, insufficient data were available for disease severity categorization, and these individuals were therefore excluded from analyses of the impact of disease severity. Published results were variously reported as exact days, intervals up to 22 days, or mean times for multiple individuals, while test results were reported as values for one individual or mean values for multiple individuals. Data after 30 days post-symptom onset were particularly underrepresented, but were included because in aggregate they provide key insights. When reporting enzyme-linked immunosorbent assay (ELISA) results in the main text, IgG results are shown for assays targeting the nucleoprotein (NP) antigen (ELISA-NP), and IgM results are shown for assays targeting the Spike antigen (ELISA-Spike; whole or subunit), as these assays are most often used for the two antibody types (Sethuraman et al., 2020). Results for other assays and antigens are shown in Figure 1-figure supplements 1 and 2.

Stepwise bootstrapping was used to estimate seroconversion times, using 270 data points from 99 individuals for IgG and 240 data points from 71 individuals for IgM. Mean IgG seroconversion time is 13.3 days post-symptom onset when using ELISA-NP and 12.6 for IgM using ELISA-Spike (Figure 1a). These results do not differ significantly (t = 0.22, df = 7.7, p = 0.84) and are similar for magnetic chemiluminescence enzyme immunoassay (MCLIA; Figure 1b). Variation in seroconversion times is substantial regardless of assay, for both IgG (sd = 5.7) and IgM (sd = 5.8).

Disease severity does not significantly affect seroconversion time, for IgM or for IgG (Figure 1c,d). Mean IgM seroconversion time for mild/moderate cases is 12.3 days post-symptom onset vs 13.2 for severe/critical cases (t = -0.2, df = 23.5, p = 0.83). Mean IgG seroconversion time for mild/moderate cases is 12.9 days post-symptom onset, vs 15.5 for severe/critical cases (t = -0.96, df = 14.8, p = 0.35). A detailed overview of seroconversion time results including means and standard deviations is provided in Figure 1-figure supplements 3-5 and Figure 1-source data 1.

While estimates of seroconversion time provide information about the first moment at which antibodies can be detected, changes in detection probability over time provide useful information about the proportion of individuals that has detectable antibodies, and hence the expected test sensitivity at the population scale. Sample sizes for these analyses (see Materials and methods) are 8053 data points for IgG and 7935 for IgM, with daily mean sample sizes of 224 and 220, respectively. The probability of detecting IgG (ELISA-NP) increases over time, reaching a maximum around 25-27 days post-symptom onset, at which point between 98% and 100% of individuals test positive ( Figure 2a) . Detection probability remains at this maximum level for the remainder of the days available in the studies existing at the time of writing (up to 60 days for ELISA-Spike, 

Sample sizes for observed and interpolated data are 7443 and 1793 for upper and lower respiratory samples and 1179 for fecal samples, with mean daily sample sizes of 226, 72 and 39, respectively. The probability of detecting viral RNA in respiratory and fecal samples is high (80-100%) at symptom onset and is consistently highest for lower respiratory tract samples (Figure 2b). Detection probability decreases rapidly at rates dependent on sample type, and most rapidly for upper respiratory tract samples, but the proportion of positive samples approaches zero around 30 days post-symptom onset for each sample type. Raw RNA detection probability data are provided in Figure 2-source data 3-5.

Antibody kinetics were analyzed by fitting a Gompertz growth rate function using Bayesian MCMC. While all subsets of the data were fit well by this model, we found some differences in antibody level kinetics depending on antibody, assay and antigen (Figure 3A). Full model fitting results for each assay can be found in Appendix 1.

Peak antibody level is reached around days 14-20 post-symptom onset, and the timing depends on antigen: both IgG and IgM peak levels are reached earlier when measured using ELISA NP than when using ELISA Spike (ELISA NP mean = 14.3 days, 95% CrI 12.0-16.1; ELISA Spike mean = 20.0 days, 95% CrI 17.6-22.4; 95% CrI for the difference = 2.7 to 9.2). The peak timing does not differ significantly between IgG and IgM when both are measured using ELISA Spike (IgG mean = 20.4 days, 95% CrI 16.8-24.1; IgM mean = 19.1 days, 95% CrI 15.6-22.4; 95% CrI for the difference = -6.4 to 3.5), nor when using ELISA NP (IgG mean = 15.2 days, 95% CrI 12.8-17.2; IgM mean = 12.2 days, 95% CrI 7.8-16.2; 95% CrI for the difference = -1.8 to 7.8). All estimates and pairwise statistics, including those for antibody levels measured using MCLIA, are shown in Figure 3-source data 1-2.

Antibody growth rates measured using ELISA NP tend to be higher than those measured using ELISA Spike (ELISA NP mean = 0.55/day, 95% CrI 0.48-0.64; ELISA Spike mean = 0.39/day, 95% CrI 0.34-0.44; 95% CrI for the difference = 0.07 to 0.26). The rate of increase for IgM does not differ significantly from that of IgG when both are measured using ELISA Spike (IgG mean = 0.39/day, 95% CrI 0.32-0.46; IgM mean = 0.41/day, 95% CrI 0.34-0.49; 95% CrI for the difference = -0.08 to 0.13), nor when both are measured using ELISA NP (IgG mean = 0.53/day, 95% CrI 0.45-0.61; IgM mean = 0.68/day, 95% CrI 0.42-1.03; 95% CrI for the difference = -0.50 to 0.14). All estimates and pairwise statistics, including those for antibody levels measured using MCLIA, are shown in Figure 3-source data 3-4.

Disease severity does not significantly affect the time at which peak levels are reached for IgG (Figure 3B; mild mean = 14.0 days, 95% CrI 10.8-17.1; severe mean = 15.9 days, 95% CrI 10.7-20.6; 95% CrI for the difference = -8.0 to 4.2). However, for IgM, peak antibody levels are reached approximately 7.0 days earlier for mild cases than for severe cases (Figure 3C; mild mean = 15.6 days, 95% CrI 12.8-19.0; severe mean = 22.7 days, 95% CrI 18.5-26.6; 95% CrI for the difference = -12.2 to -1.8). Corresponding patterns are observed for antibody growth rate, which does not differ between mild and severe cases for IgG (mild mean = 0.58/day, 95% CrI 0.45-0.72; severe mean = 0.51/day, 95% CrI 0.36-0.69; 95% CrI for the difference = -0.16 to 0.28), but does for IgM, with levels increasing more rapidly for mild cases (mild mean = 0.51/day, 95% CrI 0.42-0.60; severe mean = 0.34/day, 95% CrI 0.28-0.42; 95% CrI for the difference = 0.05 to 0.28).

Figure 2 (caption). Error bars indicate binomial exact 95% confidence intervals of the mean, based on daily sample size. Note that error bars after day 30 tend to be large, due to the limited available data. IgG and IgM values are those detected using any assay/antigen. After day 25, results are pooled into 3-day periods in order to improve estimates. The online version of this article includes the following source data and figure supplement(s) for Figure 2: Source data 1. IgG (ELISA-NP) detection probability. Source data 2. IgM (ELISA-Spike) detection probability. Source data 3. RNA, upper respiratory tract detection probability. Source data 4. RNA, lower respiratory tract detection probability. Source data 5. RNA, feces detection probability.

By leveraging and integrating multiple data sources on key aspects of the antibody response against SARS-CoV-2, we were able to produce quantitative estimates of the mean and variation of seroconversion timing, antibody level kinetics, and the changes in antibody and RNA detection probabilities. These results provide critical reference information for serological surveys, assay sensitivity and risk of false-negative results, transmission models and herd immunity assessments. By combining data from 21 different studies using different assays, antigens, protocols and patient groups, we were able to quantify the means and, crucially, the extent of variation of important serologic and RNA detection parameters. Together, these antibody and RNA detection probability patterns provide an essential evidence base for informing sampling designs (Table 1). Figure 4 provides an overview of the key patterns.

Seroconversion time is highly variable between individuals, with a mean around 12-13 days post-symptom onset. We find that IgG and IgM can be detected as early as 0 days post-symptom onset in 10-20% of patients, which indicates that seroconversion can happen at, and likely before, the onset of detectable symptoms. To our knowledge, seroconversion prior to symptom onset has not been reported, which is likely due to the fact that such cases are typically not under investigation using serologic assays. By integrating a wide range of data sources, we detect greater variation in seroconversion timing than previously observed, and importantly, it was possible to quantify the distributions around the mean seroconversion times (Zhao et al., 2020; Haveri et al., 2020).

Patterns of IgM and IgG detection align with immunological expectations, as IgM antibodies are typically present during the early phase of the immune response, while IgG antibodies remain detectable for much longer periods (Xiao et al., 2020). We detected IgG and IgM antibodies in nearly all (98-100%) individuals by days 22-23 post-symptom onset, consistent with recent findings (Kraay et al., 2020). While IgG detection remains at this level for at least the range of times in the dataset (60 days for ELISA-Spike), the proportion of IgM-positive samples decreases after roughly 28 days post-symptom onset, reaching around 65% by day 60. In other words, a growing proportion of individuals loses detectable IgM from day 30 onwards. We also detect a robust effect of viral antigen, where responses against NP rise faster than those against Spike, for both IgM and IgG. The quantification of changes in detection probability over time is relevant for clinical testing and assay choice and will determine test sensitivity (Sethuraman et al., 2020).

Figure 3 source data. Source data 1. Peak antibody level time posterior means and 95% credible intervals (CrI). Source data 2. Peak antibody level time pairwise posterior differences. Source data 3. Growth rate posterior means and 95% credible intervals (CrI). Source data 4. Growth rate pairwise posterior differences.

It has been postulated that disease severity and humoral immunity against SARS-CoV-2 are correlated, but results so far have been inconclusive (Okba et al., 2020). Here, we did not detect any significant effects of disease severity on antibody patterns, with the single exception that we estimated a lower rate of IgM increase in severe/critical cases relative to mild/moderate cases. Regarding seroconversion times, an earlier study analyzed 28 cases to find that IgM seroconversion times appeared to be the same for severe and non-severe cases, but their analysis of 45 cases showed that IgG seroconversion was earlier for severe cases. Similarly, earlier seroconversion in severe cases has been observed for SARS-CoV-1 (Lee et al., 2006), but this result was not consistent across studies (Chan et al., 2005). Our findings do not support the idea that severe cases seroconvert faster. Indeed, the only significant effect of severity in our analyses is that the inferred growth rate of IgM levels is slower for severe/critical cases. It is not clear whether this reflects a relevant biological difference, considering that all other parameters do not differ among disease severity categories. The consensus patterns from our meta-analysis suggest that any interaction between disease severity and antibody response must be subtle and sensitive to other sources of variation, explaining the inconsistencies seen across studies. Note that the IgG seroconversion histogram for severe/critical cases (Figure 1d) appears bimodal, with fewer datapoints between 13 and 18 days post-symptom onset. This could either be a consequence of an underrepresentation of these times in the different studies or a signal of a true underlying pattern, but unfortunately the data to distinguish between these two hypotheses are not currently available.

Given the finding that disease severity does not have major effects on early-phase antibody patterns, and assuming no cryptic relationship between severity and the factors governing protective immunity, mild cases could be substantial contributors to the development of herd immunity. This finding may also be important for vaccine efficacy; however, it is not yet known whether the presence of IgG or IgM correlates with protective immunity (Altmann et al., 2020), although we do observe a similar pattern for neutralizing antibody detection (Figure 2a).

The extensive individual variation in antibody patterns, which is a common phenomenon across many viral infections (Pacis et al., 2014), may affect the accuracy of transmission models. For example, if seroconversion times reflect the actual end of infectiousness and onset of immunity (i.e. the transition from Infectious to Removed in SEIR-type models; Li et al., 2020), the observed range of 0 to 40 days post-symptom onset may need to be represented in the infectious period distribution. It is important to carefully consider how this variation may affect model conclusions, and whether it should be taken into account explicitly (Wearing et al., 2005), especially given the heavy reliance of policy-makers on COVID-19 transmission models (Kissler et al., 2020).

Table 1 (rows as recovered from the extracted text).

Has an individual been exposed in the past?
IgG; 25-60(+) days post-symptom onset.
IgG persistence: possibly 1-2 years based on other human coronaviruses (Chang et al., 2005).
Assess transmission risk to others; contact tracing (Giordano et al., 2020); parameterization of transmission models (Kucharski et al., 2020).

How recently was an individual exposed?
IgM, IgG; >25 days post-symptom onset.
IgG indicates exposure, which is more likely to be recent if IgM is also present, and longer ago if IgM is absent.
Recent exposure is more likely correlated with transmission risk, and is a useful measure for prioritizing contact tracing, notably for asymptomatic cases (Okba et al., 2020).

We observed clear patterns of RNA detection that have several important implications, particularly for sampling designs. First, it is clear that the probability of detecting RNA is highly dependent on sample type, consistent with previous observations (Memish et al., 2014). Lower respiratory tract samples have the highest probability of testing positive for SARS-CoV-2 RNA, particularly after about 15 days post-symptom onset. During the first 8 days, 100% of lower respiratory tract samples tested positive for RNA. While detection probabilities for fecal and upper respiratory tract samples are nearly this high at symptom onset, they decrease much more rapidly, with the lowest average detection probabilities for upper respiratory samples. Nevertheless, it appears that by 30 days post-symptom onset detection probability approaches zero for all sample types, although it is important to note that the dataset did not include lower respiratory samples beyond day 29, which means that the true detection endpoint in lower respiratory samples could not be determined. These results match those from multiple studies (Sethuraman et al., 2020; Guo et al., 2020). When interpreting results on RNA detection, it is important to note that the presence of RNA does not necessarily imply the presence of live virus (Theel et al., 2020; Wölfel et al., 2020).

One potential caveat for any analysis of data reported as time since symptom onset is that variation in the incubation period (time between infection and symptom onset) can affect the estimated timing of antibody kinetics and RNA detection. The mean incubation period is estimated to be around 7-8 days, with a standard deviation of 4.4. The clear antibody and RNA detection patterns we observe here suggest that the effect of this variation does not obscure broad patterns, but relative results may be affected if the incubation period differs between certain groups of individuals. This could indeed be the case for disease severity, as mild cases are estimated to have a longer incubation period (8.3 days) than severe cases (6.5 days).

In summary, this study provides an up-to-date, comprehensive reference of key antibody and RNA detection parameters, including estimates of variation that can be used to inform serological surveys and transmission models (Table 1). As more data on SARS-CoV-2 become available, parameters can be updated through the use of the algorithms made available in the accompanying R code.

We considered preprints and peer-reviewed articles reporting the presence (positive or negative) or levels for IgG, IgM or neutralizing antibodies against SARS-CoV-2 or SARS-related CoV RaTG13 measured by enzyme-linked immunosorbent assay (ELISA), magnetic chemiluminescence enzyme immunoassay (MCLIA), lateral flow immunoassay (LFIA) or plaque reduction neutralization test (PRNT). In addition, we considered studies reporting PCR data from various biological samples, based on various PCR protocols. To be included in the study, we required that data were associated with information about time since symptom onset at the moment of sample collection. The search terms 'SARS-CoV-2' and 'COVID-19' were used in combination with the following search terms: serolog*, antibod*, IgG, IgM, RNA, shedding. This strategy was used in the databases Google Scholar, Pubmed and medRxiv. This resulted in about 850 candidate articles and preprints. Within these results, a first selection of candidate articles was performed by assessing the titles, in order to filter articles containing new data (i.e. excluding reviews, opinion articles, modeling studies, etc.). This narrowed down the list of candidate articles to 37, which were screened in detail. The final selection step required articles and preprints to show raw data in tables or figures and include data on time post-symptom onset. A selection process flowchart is shown in Figure 5 . We included articles available up to May 1 2020 that contained data that could be used for the analyses in this study. 
This resulted in a final subset containing 19 peer-reviewed articles and two preprints (Yongchen et al., 2020; Du et al., 2020; Wölfel et al., 2020; Okba et al., 2020; Zhao et al., 2020; Haveri et al., 2020; Xiao et al., 2020; Jiang et al., 2020; Lee et al., 2020; Liu et al., 2020a; Long et al., 2020; Lou et al., 2020; Thevarajan et al., 2020; Xiang et al., 2020; Zhang et al., 2020a; Zhou et al., 2020; Young et al., 2020; Zhang et al., 2020b; Liu et al., 2020b; Zhang et al., 2020c; Adams et al., 2020; Zou et al., 2020). Note that initial article selection sample sizes are approximate due to the way in which Google Scholar reports the number of results. It was crucial for these searches to use Google Scholar in order to find preprints that are not included in databases such as Web of Science. Figure 5-source data 1 provides an overview of all articles that were included for analysis, with key features noted. Analyses were done in parallel for a dataset excluding data from preprints, which did not change any qualitative results (not shown). The supplementary R code (Source code 1) includes the option to generate all results with or without data from preprints. Data were extracted from published material, and were digitized from figures when necessary using WebPlotDigitizer (Rohatgi, 2019). All data are available as Source data 1.

Disease severity information was classified into three groups: asymptomatic/subclinical, mild/moderate, and severe/critical. Individuals were assigned a classification of asymptomatic/subclinical (N = 11) if they were referred to as 'healthy', 'having no symptoms related to COVID-19', or 'asymptomatic'. Inclusion criteria for classification as mild/moderate or severe/critical are based on definitions from the Centers for Disease Control and Prevention (Centers for Disease Control and Prevention, 2019), the Chinese National Health Commission (Released by National Health Commission & National Administration of Traditional Chinese Medicine on March 3, 2020, 2020), and the World Health Organization (World Health Organization, 2020). When disease severity was not specified in the manuscript, patients who did not require supplemental oxygen therapy or transfer to the intensive care unit (ICU) were classified as mild/moderate, while those who did were classified as severe/critical.
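The fallback rule described above can be written as a small function. This is only an illustrative sketch in Python (the study's own code is in R); the function name, argument names and the exact string matching are assumptions made for this example, and the sketch does not encode the CDC, National Health Commission or WHO definitions.

```python
from typing import Optional

# Phrases the text lists as indicating an asymptomatic/subclinical individual.
ASYMPTOMATIC_PHRASES = (
    "healthy",
    "having no symptoms related to covid-19",
    "asymptomatic",
)


def classify_severity(description: Optional[str],
                      needed_oxygen: bool,
                      icu_transfer: bool) -> str:
    """Hypothetical helper: assign one of the three severity groups.

    `description` is how the source study refers to the individual (if at all).
    The oxygen/ICU rule is the fallback used when severity is not specified.
    """
    if description and description.strip().lower() in ASYMPTOMATIC_PHRASES:
        return "asymptomatic/subclinical"
    if needed_oxygen or icu_transfer:
        return "severe/critical"
    return "mild/moderate"
```

For example, an individual with no description who required supplemental oxygen would be placed in the severe/critical group.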

A major goal of this study is to estimate the means and variation of IgG and IgM seroconversion times (time between symptom onset and first antibody detection) for different assays, antigens, and disease severity. We developed a stepwise weighted bootstrapping procedure to do this using data on seroconversion times that have been reported in a diverse number of ways (from exact days to periods up to 22 days, and as raw results for one individual or means for groups of individuals). Our approach ensures that the best data (i.e. high-resolution data in the form of one specific seroconversion time for one individual) have the most influence on estimates of the means and standard deviations (sd) of seroconversion times.

The stepwise weighted bootstrapping procedure integrated all types of data that contain useful information about the timing of seroconversion of different antibodies in day(s) post-symptom onset. At each step, a distribution of observed possible seroconversion times was bootstrapped 50,000 times from repeated random sampling of individual seroconversion times from the dataset. At the end of each bootstrapping step n, a normal distribution was fitted to the obtained distribution of possible seroconversion times. This distribution was then used as prior information to weight sampling probability during the weighted bootstrapping procedure at step n + 1 (Sms and Young, 2003) .

The first step used only the best available resolution of seroconversion data (i.e. reported for exact days, as opposed to a range of days) to bootstrap a distribution of observed possible seroconversion times. Each following step added all data whose reported seroconversion time range was the next-widest one observed in the data (up to the maximum time range present in the dataset). For example, if a number of results were reported not as an exact time but as a period spanning 3 days (reported as such or as part of a time series), the data included in step 2 consist of results reported as exact days, and results reported as 3-day ranges. Bootstrapping in this case was again done through repeated random sampling of an individual. When that individual had a result reported as an exact time, that time was stored as a bootstrap sample. When that individual had a result reported as a time range, a time within that range was sampled, but importantly, the times within that range did not have the same probability of being sampled. This probability was determined by the normal distribution that was estimated after the preceding bootstrapping step. This ensured that the best available data had the largest contribution to the analysis, and that data of lower resolution were used while taking into account the information contained in the higher resolution data. This stepwise procedure continued until data of all resolutions (i.e. including the largest reported seroconversion time periods) were bootstrapped.
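The stepwise procedure can be sketched as follows. This is a minimal Python illustration of the idea, not the authors' R implementation: the record format (one `(first_day, last_day)` tuple per individual, in whole days), the function names, and the uniform fallback used when no prior is available are all assumptions made for this example.

```python
import math
import random
import statistics


def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def stepwise_weighted_bootstrap(records, n_boot=50_000, seed=0):
    """Return (mean, sd) of seroconversion time after the final step.

    records: list of (first_day, last_day) per individual; first_day == last_day
    means the seroconversion day was reported exactly.
    """
    rng = random.Random(seed)
    widths = sorted({hi - lo for lo, hi in records})  # one step per resolution
    prior = None  # (mean, sd) of the normal fitted at the previous step
    for width in widths:
        eligible = [(lo, hi) for lo, hi in records if hi - lo <= width]
        draws = []
        for _ in range(n_boot):
            lo, hi = rng.choice(eligible)  # sample one individual
            if lo == hi:
                draws.append(lo)  # exact day: store as is
                continue
            days = list(range(lo, hi + 1))
            weights = None  # assumption: uniform if no usable prior exists
            if prior is not None and prior[1] > 0:
                weights = [normal_pdf(d, *prior) for d in days]
                if sum(weights) == 0:  # range far in the tail: fall back
                    weights = None
            draws.append(rng.choices(days, weights=weights)[0])
        # Fit a normal distribution (by moments) to this step's draws.
        prior = (statistics.fmean(draws), statistics.pstdev(draws))
    return prior
```

With records such as `[(10, 10), (12, 12), (14, 14), (9, 15)]`, the first step uses only the three exact days, and the second step samples within the 9-15 day range with weights taken from the normal distribution fitted in the first step.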

Seroconversion times were sometimes reported as a mean time (± error) instead of an exact time or time period. In these cases, the standard deviation of time around the mean was calculated (using reported sample size and standard error), and a random time was drawn from this normal distribution. Some studies report seroconversion times for groups of individuals simultaneously. In this case, each individual group member was treated as a separate individual that can be sampled randomly. Data from cumulative seroconversion curves were incorporated by assigning the seroconversion time at which the curve increases to the number of individuals being reported to seroconvert at that time. In the bootstrapping procedure, each of these individuals could then be sampled in the same way as any other individual. Aside from increasing sample size (and hence the confidence in the estimates) and the density of the histogram/distribution, there were no significant differences between distributions estimated using different maximum time periods (Figure 1-figure supplement 5).
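Two of the conversions described above are simple enough to illustrate directly. The sketch below (Python, with hypothetical function names) recovers a standard deviation from a reported standard error using the standard relation sd = SE × √n, draws a time from the corresponding normal distribution, and expands a cumulative seroconversion curve into one seroconversion time per individual.

```python
import math
import random


def sd_from_se(se, n):
    """Standard deviation implied by a standard error of the mean and sample size."""
    return se * math.sqrt(n)


def draw_from_reported_mean(mean, se, n, rng):
    """Draw one seroconversion time from Normal(mean, sd) for a mean ± SE report."""
    return rng.gauss(mean, sd_from_se(se, n))


def expand_cumulative_curve(days, cumulative_counts):
    """Turn a cumulative curve into a list with one seroconversion day per individual.

    days: sampling days in increasing order.
    cumulative_counts: number of individuals seroconverted by each day.
    """
    times = []
    previous = 0
    for day, total in zip(days, cumulative_counts):
        times.extend([day] * (total - previous))  # newly seroconverted at `day`
        previous = total
    return times
```

For instance, a curve reporting 2, 2 and 5 seroconverted individuals on days 5, 8 and 12 expands to two individuals at day 5 and three at day 12.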

The probability of detecting SARS-CoV-2 specific IgG or IgM in plasma or serum samples was estimated by integrating data on whether an individual tested positive or negative on a given day post-symptom onset. Data containing information on detection probability on a given day are reported in diverse ways, using different resolutions of sample size (from one individual to results reported for groups) or time (results reported on specific days or as a range of days). Additionally, time series data from individuals sampled multiple times contain information about detection probability for times between measurements. These diverse data sources were integrated using different rules. When antibody levels were reported, the cut-off provided in the studies was used to determine the negative or positive status of samples. Individual results for a specific day were included as reported. When time was reported as a period, the midpoint time was used. When a proportion of positive samples was reported together with a sample size, the number of positive and negative samples were calculated and used as independent samples. When two samples that are part of a longitudinal time series showed the same result, the individual was assumed to have the same result for all times within the interval. When such samples had different results, the (interpolated) samples in the early half of the interval were assigned the same result as the first sample, and those in the later half were assigned the same result as the second sample. This procedure resulted in a dataset where each day post-symptom onset has a number of positive and negative observed samples that could be used to estimate a daily detection probability. Binomial exact confidence intervals of the means were calculated and shown.
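The interpolation rule and the daily estimate can be sketched as follows. This Python example is an illustration under stated assumptions rather than the authors' code: the data layout (one sorted list of `(day, is_positive)` tuples per individual) and all function names are hypothetical, the exact midpoint day is assigned to the later half, and the exact binomial interval is taken to be the Clopper-Pearson interval, solved here by bisection to avoid external dependencies.

```python
from collections import defaultdict
from math import comb


def interpolate_series(series):
    """Fill the days between consecutive samples of one individual."""
    filled = dict(series)
    for (d0, r0), (d1, r1) in zip(series, series[1:]):
        midpoint = (d0 + d1) / 2
        for day in range(d0 + 1, d1):
            # Same results: the whole gap takes that result. Different results:
            # early half follows the first sample, later half the second.
            filled[day] = r0 if day < midpoint else r1
    return filled


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_decreasing(f, target, iters=60):
    """Bisection for f(p) = target, with f decreasing in p on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact binomial confidence interval for k positives out of n samples."""
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


def daily_detection(all_series):
    """Map each day to (detection probability, lower bound, upper bound)."""
    counts = defaultdict(lambda: [0, 0])  # day -> [positives, total]
    for series in all_series:
        for day, positive in interpolate_series(series).items():
            counts[day][0] += int(positive)
            counts[day][1] += 1
    return {day: (pos / n,) + clopper_pearson(pos, n)
            for day, (pos, n) in sorted(counts.items())}
```

Group-level reports (a proportion with a sample size) can be fed into the same function by expanding them into single-sample series for the relevant day.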

The probability of detecting RNA in upper and lower respiratory samples, and in fecal samples, was estimated using the same procedure used for IgG and IgM, but without the assumption that days in the interval between two samples of a time series have the same result, that is, not including any interpolated samples. This was based on the fact that RNA detection has been observed to be highly variable (Wölfel et al., 2020; Kucirka et al., 2020). Respiratory sample types were classified as upper (saliva, naso- or oropharyngeal) or lower (sputum, tracheal aspirate, bronchoalveolar lavage) respiratory tract samples. As RT-PCR protocols based on different target sequences resulted in similar sensitivities (Sethuraman et al., 2020), all data were pooled for our analysis of detection probability.
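For both antibody and RNA data, each day ends up with a count of positive and negative samples. A daily detection probability with an exact binomial interval can be sketched as follows, assuming that "binomial exact" refers to the Clopper-Pearson interval, which is its conventional meaning. The analysis itself used the R package binom; the bisection solver and function names here are illustrative only.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def _solve(func, target, increasing):
    """Bisection on [0, 1] for func(p) = target, with func monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if (func(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k positives out of n samples."""
    # Lower bound: P(X >= k | p) = alpha/2, increasing in p.
    lower = 0.0 if k == 0 else _solve(
        lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2, True)
    # Upper bound: P(X <= k | p) = alpha/2, decreasing in p.
    upper = 1.0 if k == n else _solve(
        lambda p: binom_cdf(k, n, p), alpha / 2, False)
    return lower, upper

# Hypothetical day with 5 positive samples out of 10.
estimate = 5 / 10
lower, upper = exact_ci(5, 10)
```

The interval is wide when few samples are available for a day, which is the situation at the extremes of the observation window.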

To characterize the kinetics of antibody levels, we fit models to all individuals for whom longitudinal data were available (i.e. at least three samples are available, one of which has to be positive). Our goal was to estimate the rate of increase, and the timing and magnitude of the peak antibody level. Assays, antigens and reporting units differed extensively between studies, so antibody levels were normalized by dividing the level of each sample in a study by the maximum value observed in that study. This allowed us to compare antibody level kinetic patterns between different studies. Antibody level normalization using scaling to a mean of zero and standard deviation of one resulted in the same patterns (results not shown). All time-series are shown in Appendix 1.
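The per-study normalization can be illustrated with a short sketch. The study labels and values are hypothetical, and the sketch assumes that every study has a positive maximum level.

```python
def normalize_by_study_max(levels_by_study):
    """Divide each antibody level by the maximum observed in the same study.

    Assumption: each study has at least one sample and a positive maximum.
    """
    normalized = {}
    for study, levels in levels_by_study.items():
        study_max = max(levels)
        normalized[study] = [level / study_max for level in levels]
    return normalized

# Hypothetical studies reporting levels in different units.
raw = {"study_A": [1.0, 2.0, 4.0], "study_B": [10.0, 50.0]}
scaled = normalize_by_study_max(raw)
```

After scaling, the highest value in every study equals one, so curves from assays with different reporting units share a common range.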

As there were no (or very limited) data available for the later phase of kinetics, when antibody levels decay from their peak, we focused on the early phase of antibody increase up to peak level. These early-phase dynamics follow a standard growth rate pattern, for which well-described functions are available. Of these functions, a three-parameter Gompertz function, $y(t) = a e^{-b e^{-ct}}$, was an excellent candidate, as its three parameters correspond to clinically significant measures of antibody level (y) dynamics over time (t). The asymptote (a) corresponds to the peak level, displacement (b) corresponds to the seroconversion time, and growth rate (c) corresponds to the antibody level increase rate. Antibody levels (y and a) were log-transformed.
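The behavior of the Gompertz function can be explored with the sketch below, which evaluates the curve and solves it for the time at which a given fraction of the asymptote is reached. The parameter values are hypothetical, and the sketch works on whatever scale the levels are fitted on; how the log-transformation interacts with such a fraction is not specified here and is left as an assumption.

```python
import math

def gompertz(t, a, b, c):
    """Three-parameter Gompertz curve y(t) = a * exp(-b * exp(-c * t))."""
    return a * math.exp(-b * math.exp(-c * t))

def time_to_fraction_of_asymptote(b, c, fraction):
    """Time at which y(t) = fraction * a.

    From exp(-b * exp(-c t)) = fraction it follows that
    t = ln(b / -ln(fraction)) / c, for 0 < fraction < 1.
    """
    return math.log(b / -math.log(fraction)) / c

# Hypothetical parameter values, for illustration only.
a, b, c = 1.0, 20.0, 0.3
t_inflection = math.log(b) / c          # steepest increase, where y = a / e
t_95 = time_to_fraction_of_asymptote(b, c, 0.95)
```

Larger values of b shift the curve to later times, while larger values of c compress the rise into a shorter period.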

We fit this function to the observed time series of normalized antibody levels by Bayesian Markov Chain Monte Carlo (MCMC) inference, using R-JAGS (Plummer, 2019). All parameters were fit separately for each individual, with the assumption that they arise from the same population-level distribution, which was implemented as a hierarchical Bayesian model with hyperpriors for each parameter.

Prior distributions:

Peak titer mean ~ Uniform(min = 0, max = 5)
Peak titer standard deviation ~ Gamma(shape = 1, rate = 1)
Displacement mean ~ Normal(mean = 100, sd = 10)
Displacement standard deviation ~ Uniform(min = 0, max = 200)
Growth rate mean ~ Uniform(min = 0, max = 5)
Growth rate standard deviation ~ Uniform(min = 0, max = 100)

Posterior means of the parameters were used for further analyses and for plotting. Data were combined into subsets depending on the measure of interest (assay, targeted antigen, disease severity). Six parallel chains with different starting values were run for 70,000 iterations, of which the first 20,000 were discarded (burn-in). Peak antibody level timing of an individual time series was approximated as the time at which the level reaches 95% of the maximum level (a). Results of parameters estimated using MCMC inference were reported as posterior means with 95% credible intervals (CrI). Statistical differences between estimated parameters were assessed by constructing the posterior distribution of the differences between the MCMC samples of the respective parameters (which were independent since they were estimated from different datasets), where the difference was considered significant when zero was not included in the 95% CrI.
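The comparison of two parameters through the posterior distribution of their difference can be sketched as follows. The samples are simulated stand-ins for MCMC output, the function names are hypothetical, and an equal-tailed interval is assumed. Pairing the draws index by index is reasonable here only because the two sets of samples are independent.

```python
import math
import random

def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac

def difference_cri(samples_a, samples_b, level=0.95):
    """Equal-tailed credible interval of the difference a - b."""
    diffs = sorted(x - y for x, y in zip(samples_a, samples_b))
    tail = (1 - level) / 2
    return quantile(diffs, tail), quantile(diffs, 1 - tail)

def excludes_zero(interval):
    """True when zero lies outside the interval."""
    low, high = interval
    return low > 0 or high < 0

# Simulated posterior samples (stand-ins for MCMC draws from two datasets).
rng = random.Random(42)
growth_a = [rng.gauss(2.0, 0.1) for _ in range(4000)]
growth_b = [rng.gauss(1.0, 0.1) for _ in range(4000)]
growth_c = [rng.gauss(1.0, 0.5) for _ in range(4000)]
clearly_different = difference_cri(growth_a, growth_b)
overlapping = difference_cri(growth_b, growth_c)
```

In this simulated case the first interval lies entirely above zero, whereas the second spans zero and would not be considered a significant difference.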

All data preparation, cleaning, analysis and plotting were done in R version 3.6.1 (R Development Core Team, 2019) using the packages ggplot2 (Wickham, 2016), dplyr (Wickham et al., 2019), readxl (Wickham and Bryan, 2019a), patchwork (Pedersen, 2019), binom (Dorai-Raj, 2014), tidyr (Wickham and Henry, 2019b) and ggridges (Wilke, 2020). Welch two-sample t-tests were used to test for differences between estimated distributions. All code used to fit models and produce results is provided in Source code 1.

Antibody testing for COVID-19: a report from the national COVID scientific advisory panel

What policy makers need to know about COVID-19 protective immunity

Reopening society and the need for Real-Time assessment of COVID-19 at the community level

Immune response to SARS-CoV-2 and mechanisms of immunopathological changes in COVID-19

Estimating time of infection using prior serological and individual information can greatly improve incidence estimation of human and wildlife infections

Serology for SARS-CoV-2: apprehensions, opportunities, and the path forward

Interim Clinical Guidance for Management of Patients with Confirmed Coronavirus Disease (COVID-19): National Center for Immunization and Respiratory Diseases (NCIRD)

Serological responses in patients with severe acute respiratory syndrome coronavirus infection and cross-reactivity with human coronaviruses 229E, OC43, and NL63

Longitudinal analysis of severe acute respiratory syndrome (SARS) coronavirus-specific antibody in SARS patients

binom: Binomial confidence intervals for several parameterizations

Detection of antibodies against SARS-CoV-2 in patients with COVID-19

Modelling the COVID-19 epidemic and implementation of population-wide interventions in Italy

Long-Term persistence of IgG antibodies in SARS-CoV infected healthcare workers

Serological and molecular findings during SARS-CoV-2 infection: the first case study in Finland

A systematic review of antibody mediated immunity to coronaviruses: antibody kinetics, correlates of protection, and association of antibody responses with severity of disease

Global profiling of SARS-CoV-2 specific IgG/ IgM responses of convalescents using a proteome microarray

Projecting the transmission dynamics of SARS-CoV-2 through the postpandemic period

Antibody tests in detecting SARS-CoV-2 infection: a meta-analysis. medRxiv

Modeling serological testing to inform relaxation of social distancing for COVID-19 control

COVID-19 serosurveillance may facilitate Return-to-Work decisions

Centre for Mathematical Modelling of Infectious Diseases COVID-19 working group. 2020. Early dynamics of transmission and control of COVID-19: a mathematical modelling study

Variation in false negative rate of RT-PCR based SARS-CoV-2 tests by time since exposure

Evaluation of nine commercial SARS-CoV-2 immunoassays

Anti-SARS-CoV IgG response in relation to disease severity of severe acute respiratory syndrome

A case of COVID-19 and pneumonia returning from Macau in Taiwan: clinical course and anti-SARS-CoV-2 IgG dynamic

Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2)

Antibody testing will enhance the power and accuracy of COVID-19-prevention trials

Evaluation of nucleocapsid and spike Protein-Based Enzyme-Linked immunosorbent assays for detecting antibodies against SARS-CoV-2

A preliminary study on serological assay for severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in 238 admitted hospital patients

Antibody responses to SARS-CoV-2 in COVID-19 patients: the perspective application of serological tests in clinical practice

Serology characteristics of SARS-CoV-2 infection since exposure and post symptom onset

Epidemiological parameters of coronavirus disease 2019: a pooled analysis of publicly reported individual data of 1155 cases from seven countries

Respiratory tract samples, viral load, and genome fraction yield in patients with Middle East respiratory syndrome

Characterization of anti-viral immunity in recovered individuals infected by SARS-CoV-2. medRxiv

Severe acute respiratory syndrome coronavirus 2-Specific antibody responses in coronavirus disease patients

When genetics meets epigenetics: deciphering the mechanisms controlling inter-individual variation in immune responses to infection

patchwork: The composer of plots

Mapping the Host-Pathogen space to link longitudinal and Cross-sectional biomarker data: Leptospira infection in California sea lions (Zalophus californianus) as a case study

R: A language and environment for statistical computing. 2.6.2. Vienna, Austria, R Foundation for Statistical Computing

Interpreting diagnostic tests for SARS-CoV-2

Prepivoting by weighted bootstrap iteration

Repeated seroprevalence of anti-SARS-CoV-2 IgG antibodies in a population-based sample

Viral kinetics and antibody responses in patients with COVID-19. medRxiv

The role of antibody testing for SARS-CoV-2: is there one

Breadth of concomitant immune responses prior to patient recovery: a case report of non-severe COVID-19

Temporal profiles of viral load in posterior oropharyngeal saliva samples and serum antibody responses during infection by SARS-CoV-2: an observational cohort study

Appropriate models for the management of infectious diseases

Modeling shield immunity to reduce COVID-19 epidemic spread

Test performance evaluation of SARS-CoV-2 serological assays

ggplot2: Elegant Graphics for Data Analysis

dplyr: A grammar of data manipulation

readxl: Read excel files

The important role of serology for COVID-19 control

Virological assessment of hospitalized patients with COVID-2019

Clinical management of severe acute respiratory infection (SARI) when COVID-19 disease is suspected: interim guidance

Antibody detection and dynamic characteristics in patients with COVID-19

Profile of specific antibodies to SARS-CoV-2: the first report

Different longitudinal patterns of nucleic acid and serology testing results based on disease severity of COVID-19 patients

Epidemiologic features and clinical course of patients infected with SARS-CoV-2 in Singapore

Molecular and serological investigation of 2019-nCoV infected patients: implication of multiple shedding routes

Virus shedding patterns in nasopharyngeal and fecal specimens of COVID-19 patients

Anti-SARS-CoV-2 virus antibody levels in convalescent plasma of six donors who have recovered from COVID-19

Antibody responses to SARS-CoV-2 in patients of novel coronavirus disease 2019

A pneumonia outbreak associated with a new coronavirus of probable bat origin

SARS-CoV-2 viral load in upper respiratory specimens of infected patients

The figures below show the antibody data used for the different antibody/assay/antigen datasets. The IDs shown in the figure legend correspond to the individual identification provided in the accompanying spreadsheet. Antibody levels by days post-symptom onset are shown for each individual (top panel). Posterior growth rate, displacement and peak antibody level are shown (middle panels), with posterior mean (bold red line) and 95% credible intervals (dashed lines). Observed antibody levels (bottom-left panel) and fitted functions (bottom-middle panel) are shown for each individual, in addition to the overall mean (black dashed line). Finally, the posterior mean antibody level and 100 randomly selected posterior fits within the 95% credible interval are shown (bottom-right panel). Chain convergence was assessed using the Gelman-Rubin diagnostic, which was one for all chains.

Appendix 1-figure 9. IgM mild/moderate cases fitted antibody kinetics.

Appendix 1-figure 10. IgM severe/critical cases fitted antibody kinetics.