key: cord-0765560-0wy38xi8
authors: Gao, Fei; Bannick, Marlena S. (Vaccine and Infectious Disease Division, Fred Hutchinson Cancer Research Center, Seattle, WA; Public Health Sciences Division; Department of Biostatistics, University of Washington)
title: Statistical Considerations for Cross-Sectional HIV Incidence Estimation Based on Recency Test
date: 2021-06-03
journal: Statistics in medicine
DOI: 10.1002/sim.9296
sha: 291e0aac51e87a110d05944e31cd5f28054a6622
doc_id: 765560
cord_uid: 0wy38xi8

Longitudinal cohorts to determine the incidence of HIV infection are logistically challenging, so researchers have sought alternative strategies. Recency test methods use biomarker profiles of HIV-infected subjects in a cross-sectional sample to infer whether they are "recently" infected and to estimate incidence in the population. Two main estimators have been used in practice: one that assumes a recency test is perfectly specific, and another that allows for false-recent results. To date, these commonly used estimators have not been rigorously studied with respect to their assumptions and statistical properties. In this paper, we present a theoretical framework with which to understand these estimators and interrogate their assumptions, and perform a simulation study to assess the performance of these estimators under realistic HIV epidemiological dynamics. We conclude with recommendations for the use of these estimators in practice and a discussion of future methodological developments to improve HIV incidence estimation via recency test.

Determination of the incidence rate of HIV is critical for HIV surveillance and for evaluating the effectiveness of HIV prevention efforts. The current gold standard is longitudinal follow-up and repeated testing of a cohort of participants drawn from the population of interest, such that the incidence can be estimated by the ratio of the number of new cases to the total follow-up time.
This approach is theoretically simple, but it may present issues for HIV surveillance (Brookmeyer, 2010) . For example, high follow-up rates in large representative samples may be difficult to obtain, the cost of such studies is usually high, and there may be differences in HIV risk behaviors among persons who do and do not participate in cohort studies. In addition, there may be retention bias during follow-up or alteration of HIV risk by repeated HIV counseling and testing (Hawthorne effect) (Sherr et al., 2007) . An important alternative approach that avoids longitudinal follow-up and repeated testing is cross-sectional incidence estimation. This approach utilizes a biomarker-based algorithm to determine which infections in a cross-sectional sample drawn from the population of interest were acquired "recently". It was first proposed by Brookmeyer and Quinn (1995) , where subjects with negative HIV-antibody test and positive HIV-1 p24 antigen test were classified as recently infected. This recency test is infeasible in practice since the short p24 antigen-positive pre-seroconversion period requires testing a large number of individuals to estimate incidence with precision. Later, a number of serological assays that measure the antibody response to HIV infection were proposed, for example the detuned assay (Janssen et al., 1998) , the BED capture EIA (Parekh et al., 2002) , and the avidity assay (Suligoi et al., 2002; Duong et al., 2015) . Genetic diversity of HIV has also been used as a biomarker to indicate HIV recency (Kouyos et al., 2011; Yang et al., 2012; Cousins et al., 2011) . To improve the performance of the recency algorithm, others have proposed multiassay algorithms that make use of multiple assays and biomarkers to indicate recency Laeyendecker et al., 2013; Laeyendecker et al., 2018) . A number of statistical approaches have been proposed to determine HIV incidence based on a cross-sectional sample. 
They make use of recency test results in a cross-sectional sample as well as the classification characteristics of the recency test. Based on the understanding that incidence can be viewed as the expected number of new infections per uninfected person per unit time, Kaplan and Brookmeyer (1999) proposed the "snapshot estimator"

$$\hat\lambda = \frac{N_{rec}}{N_{neg}\,\hat\mu}, \qquad (1)$$

where N_rec and N_neg are the numbers of test-recent HIV-positive subjects and HIV-negative subjects, respectively, from the cross-sectional sample, and $\hat\mu$ is an estimate of the "mean window period" of the recency test (the average duration of infection among the subjects classified as recently infected; we will discuss a formal definition of this parameter in Section 2.2). The snapshot estimator was suggested to be unbiased when the incidence is constant over time, and it has been adopted in a number of applications in HIV incidence estimation (Eshleman et al., 2013; Rehle et al., 2015; Solomon et al., 2016). The snapshot estimator implicitly assumes that the mean window period is finite, such that a long-infected subject would have a zero probability of being classified as recent. However, for many recency tests, a small proportion of long-infected persons may be falsely classified as "recent". A number of methods have been proposed to address such false-recency (McDougal et al., 2006; Hargrove et al., 2008; Kassanjee et al., 2012). One widely adopted approach is the "adjusted estimator" from Kassanjee et al. (2012), where an infection duration cutoff T* is defined to delineate between "recent" and "long" infected subjects. Based on this cutoff, the adjusted estimator uses two characteristics of a recency test that are closely related to the sensitivity and specificity of a classification procedure: mean duration of infection (MDRI) Ω_{T*} and false-recent rate (FRR) β_{T*}.
MDRI is similar to the mean window period in that it captures the average duration of infection among those who are "truly recent" and classified as recently infected, and FRR is the probability that a randomly selected long-infected subject is misclassified as recent (these parameters will be further discussed in Section 2.2). For a recency test, the adjusted estimator is given by

$$\hat\lambda = \frac{N_{rec} - N_{pos}\,\hat\beta_{T^*}}{N_{neg}\,(\hat\Omega_{T^*} - \hat\beta_{T^*} T^*)}, \qquad (2)$$

where N_pos is the number of HIV-positive subjects in the cross-sectional sample, and $\hat\Omega_{T^*}$ and $\hat\beta_{T^*}$ are estimates of MDRI and FRR for the recency test, respectively. Since it accounts for recency tests that produce false-recent results, the adjusted estimator is thought to be more flexible and theoretically more robust than the snapshot estimator. It has also been widely adopted in applications of HIV incidence estimation (Maman et al., 2016; Moyo et al., 2018).

Even though the cross-sectional incidence estimators have been widely utilized in practice, their statistical properties, especially those of the adjusted estimator, have not been well studied or understood. Specifically, the key parameters (MDRI, FRR) may not be well characterized, and the assumptions under which the estimators are unbiased for the incidence in a target population have not been rigorously studied. In this paper, we formulate a theoretical framework for assessing HIV recency, formally establish the assumptions for cross-sectional incidence estimation based on the snapshot and adjusted estimators, and evaluate the bias of the estimators when the assumptions fail to hold. We evaluate the numerical performance of the estimators under various simulated settings with different HIV epidemic trajectories and recency tests with different properties, and provide some recommendations for using the estimators in practice.
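As a concrete illustration of the two estimators introduced above, the following minimal sketch computes both point estimates from cross-sectional counts. It assumes the standard published forms, N_rec / (N_neg µ) for the snapshot estimator and (N_rec − β N_pos) / (N_neg (Ω − β T*)) for the adjusted estimator. The function names and the example counts are hypothetical, and variance estimation is omitted.

```python
def snapshot_estimator(n_rec, n_neg, mu):
    """Snapshot incidence estimate: N_rec / (N_neg * mu).

    mu is the estimated mean window period, in the same time unit
    in which the incidence rate is to be expressed (here: years).
    """
    if n_neg <= 0 or mu <= 0:
        raise ValueError("n_neg and mu must be positive")
    return n_rec / (n_neg * mu)


def adjusted_estimator(n_rec, n_pos, n_neg, omega, beta, t_star):
    """Adjusted estimate: (N_rec - beta * N_pos) / (N_neg * (omega - beta * T*)).

    omega is the estimated MDRI, beta the estimated FRR, and t_star the
    cutoff separating "recent" from "long" infections (years).
    """
    denom = n_neg * (omega - beta * t_star)
    if denom <= 0:
        raise ValueError("omega - beta * t_star must be positive")
    return (n_rec - beta * n_pos) / denom


# Hypothetical counts from a cross-sectional sample of 5000 people.
n_neg, n_pos, n_rec = 3500, 1500, 45

# Illustrative test properties, converted from days to years.
mu = 101 / 365.25          # mean window period
omega = 98 / 365.25        # MDRI
beta, t_star = 0.014, 2.0  # FRR and cutoff in years

lam_snapshot = snapshot_estimator(n_rec, n_neg, mu)
lam_adjusted = adjusted_estimator(n_rec, n_pos, n_neg, omega, beta, t_star)
print(f"snapshot: {lam_snapshot:.4f}  adjusted: {lam_adjusted:.4f}")
```

With β = 0 and Ω = µ the adjusted estimator reduces to the snapshot estimator, which is a quick sanity check on any implementation.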
2 Theoretical Model

Let T be the (calendar) HIV infection time of a subject and let A(t) be an indicator of eligibility at time t, i.e., whether this subject would be eligible to be included in a survey for HIV incidence (and prevalence) of the target population at time t. This eligibility indicator A(t) can be based on a collection of (possibly time-dependent) individual covariates and a population of interest. For example, in a cross-sectional population survey, a minimum requirement for eligibility is being alive at the time the survey is conducted. Another example is A(t) = I(MSM, Age 18-50 at time t), which uses two characteristics (an indicator of being a member of the men who have sex with men population (MSM), and an indicator of being aged 18-50) to define the eligible population at time t. At any calendar time t, the prevalence in this target population is given by p(t) = Pr(T ≤ t|A(t) = 1), and the incidence in this population is given by

$$\lambda(t) = \lim_{\delta \to 0^+} \frac{\Pr(t \le T < t + \delta \mid T \ge t, A(t) = 1)}{\delta}, \qquad (3)$$

i.e., it is the rate of instantaneous HIV infection for an eligible HIV-negative subject at time t. With slight abuse of notation, we write λ(t) = Pr(T = t|T ≥ t, A(t) = 1). Note that both p(t) and λ(t) concern the distribution of infection time T in a restricted population defined by A(t), such that they are conditional quantities given A(t) = 1. The main goal of cross-sectional incidence estimation is to estimate λ(t) based on a finite-sized cross-sectional sample collected at time t satisfying A(t) = 1.

Suppose that we collect a random sample from the eligible population at time t (all subjects in the sample satisfy A(t) = 1). For each subject in that sample, we first assess their HIV status, and if it is positive, we apply a subsequent HIV recency test. We assume that HIV status can be determined without any misclassification (e.g., using an RNA-based diagnostic), while HIV recency may not be, with details described below.
Recall that we use N neg , N pos , and N rec to denote the numbers of subjects who are HIV-negative, HIV-positive, and HIV test-recent, respectively. The probabilities associated with those subjects in the cross-sectional sample are: • The probability of HIV-negative: Pr(T > t|A(t) = 1) = 1 − p(t). • The probability of HIV-positive: Pr(T ≤ t|A(t) = 1) = p(t). -The probability of HIV-positive and subsequently classified as recently infected based on the recency test: P rec (t) = Pr(M ∈ R, T ≤ t|A(t) = 1). The variable M denotes the biomarker values of the HIV recency test and R is a region for those values that classifies a subject as HIV recent. One example of a recently proposed HIV recency test is based on the combination of three biomarkers: LAg Avidity assay, BioRad Avidity assay, and viral load. For this test, the test-recent region R is defined as LAg Avidity OD n < 2.8, BioRad Avidity OD n < 95%, and viral load > 400 copies/ml (Laeyendecker et al., 2018) . At time t when the cross-sectional sample is taken, the probability of test-recent, i.e., M ∈ R, shall depend on the true infection duration t − T . In particular, we define the duration-specific test-recent probability φ(u, t) = Pr(M ∈ R|T = t − u, A(t) = 1), for infection duration u ≥ 0. Since the recency test is always applied to an HIV-positive subject that is eligible at time t, φ(u, t) is a probability conditional on A(t) = 1. We assume that φ(u, t) depends only on u, the infection duration, and does not depend on t. That is, the calendar time when the test is taken is irrelevant to test accuracy given a fixed infection duration, and we denote the quantity as φ(u). Summary measures of the duration-specific test-recent probability function φ(u) are suggested in literature to describe recency test properties and are used as parameters in cross-sectional incidence estimation. 
For example, the mean window period (µ) used in the snapshot estimator can be defined as an integral of φ(·) (as long as the integral is finite), i.e.,

$$\mu = \int_0^\infty \phi(u)\,du.$$

The mean duration of infection (MDRI, Ω_{T*}) in the adjusted estimator is defined as a truncated integral of φ(·) from 0 to T*, i.e.,

$$\Omega_{T^*} = \int_0^{T^*} \phi(u)\,du.$$

The false-recent rate (FRR, β_{T*}) is defined as the probability that a randomly chosen person from the population of long-infected subjects (i.e., with an infection duration longer than T*) will be classified as "recently" infected by the recency test (Kassanjee et al., 2012). Let G(u) be the distribution of infection durations among these long-infected subjects. Then β_{T*} can be written as

$$\beta_{T^*} = \int_{T^*}^\infty \phi(u)\,dG(u).$$

Remark 1 In Kassanjee et al. (2012), the MDRI is defined as

$$\Omega_{T^*} = \int_0^{T^*} P_R(u)\,du,$$

where P_R(u) is the probability of a person infected u time units ago still being alive and "recent" (in this setting being alive is the only eligibility criterion). That is,

$$P_R(u) = \Pr(M \in R, A(t) = 1 \mid T = t - u, A(t - u) = 1).$$

This definition is different from ours in that φ(u) conditions on A(t) = 1 but P_R(u) involves the probability of A(t) = 1 conditioning on A(t − u) = 1. Determining the MDRI of a recency test typically involves sampling eligible individuals with known infection duration (to some reasonable approximation). In order to be sampled at time t, they need to be eligible at the current time t (A(t) = 1), instead of eligible at the time of infection (A(t − u) = 1). Therefore, our definition of MDRI based on φ(u) is the one that is aligned with the sampling strategy of studies that are conducted in practice.

Based on our notation, the test-recent probability is given by

$$P_{rec}(t) = p(t)\int_0^\infty \phi(u)\,\Pr(T = t - u \mid T \le t, A(t) = 1)\,du.$$

That is, the test-recent probability is a weighted version of the duration-specific test-recent probabilities, where the weight is related to the distribution of the infection time for those infected and eligible at time t. The probability Pr(T = t − u|T ≤ t, A(t) = 1) is not directly linked to λ(t), the quantity of interest.
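The three summary measures described above are all integrals of φ, so they can be approximated numerically once a curve is specified. The sketch below is only illustrative: the exponential-decay φ, its scale `M`, the truncation point `UPPER` for the infinite integral, and the choice of a uniform G on [T*, UPPER] are all assumptions made for the example, not properties of any real assay.

```python
import math


def integrate(f, a, b, n=20000):
    """Composite trapezoidal rule for the integral of f over [a, b]."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for k in range(1, n):
        total += f(a + k * h)
    return total * h


M = 0.3  # hypothetical decay scale of the test-recent curve, in years


def phi(u):
    """Hypothetical duration-specific test-recent probability."""
    return math.exp(-u / M)


T_STAR = 2.0   # cutoff between "recent" and "long" infections (years)
UPPER = 12.0   # assumed truncation point for the infinite integral (years)

# Mean window period: integral of phi over all durations (truncated at UPPER).
mu = integrate(phi, 0.0, UPPER)

# MDRI: integral of phi from 0 to T*.
omega = integrate(phi, 0.0, T_STAR)

# FRR: average of phi over long-infected durations, here assuming that
# G is uniform on [T*, UPPER]. A different G would give a different FRR
# unless phi is constant beyond T*.
frr = integrate(phi, T_STAR, UPPER) / (UPPER - T_STAR)

print(f"mu = {mu:.4f} years, MDRI = {omega:.4f} years, FRR = {frr:.2e}")
```

Because this φ decays to essentially zero, µ and Ω_{T*} nearly coincide and the FRR is tiny; a curve with a non-zero tail would separate the three quantities.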
Some further assumptions are needed to construct this linkage such that the estimation of λ(t) based on the cross-sectional sample is valid. Suppose that the prevalence function of HIV, p(t), is continuous over time. We introduce the following set of assumptions for cross-sectional incidence estimation.

Assumption A.1 φ(u) = 0 for all u greater than some finite value.

Let τ be the upper bound of u such that φ(u) is positive, i.e., τ = max{u : φ(u) > 0}.

Assumption A.2 Pr(T = t − u|T ≤ t, A(t) = 1) = λ(t)(1 − p(t))/p(t) for u ∈ [0, τ].

Assumption A.1 indicates that the tail of φ(u) goes to zero when u is large, indicating zero test-recent probability for a subject infected long enough. It ensures that the mean window period µ is finite, which is a key requirement for the validity of the snapshot estimator. Assumption A.2 suggests that the infection time is uniformly distributed in [t − τ, t] for an infected eligible subject at time t. Note that this is not necessarily equivalent to a constant incidence in [t − τ, t], and we will discuss this in detail in Section 2.4. Given Assumptions A.1-A.2, the test-recent probability can be written as

$$P_{rec}(t) = p(t)\int_0^\tau \phi(u)\,\frac{\lambda(t)(1 - p(t))}{p(t)}\,du = \lambda(t)\,(1 - p(t))\,\mu.$$

By replacing the parameters with their estimators, an estimator for λ(t) can be formulated as

$$\hat\lambda = \frac{\hat P_{rec}}{(1 - \hat p)\,\hat\mu} = \frac{N_{rec}}{N_{neg}\,\hat\mu}.$$

This estimator is indeed the snapshot estimator (Kaplan and Brookmeyer, 1999).

Some alternative assumptions may be considered for the adjusted estimator (Kassanjee et al., 2012).

Assumption B.1 φ(u) = β_{T*} for all u ≥ T*.

Assumption B.2 Pr(T = t − u|T ≤ t, A(t) = 1) = λ(t)(1 − p(t))/p(t) for u ∈ [0, T*].

Assumption B.1 allows a non-zero test-recent probability for a long-infected subject; however, it restricts this probability to be constant. Otherwise, the false-recent rate would depend on G(·), the distribution of infection durations with respect to which the false-recent rate is evaluated. Assumption B.2 suggests that the infection time is uniformly distributed in [t − T*, t] for an infected eligible subject at time t. Since T* is usually smaller than τ, Assumption B.2 is less restrictive than Assumption A.2: the uniform distribution requirement on infection times applies to a shorter time span in the past.
Given Assumptions B.1 and B.2, the probability of test-recent can be written as

$$P_{rec}(t) = p(t)\left[\int_0^{T^*} \phi(u)\,\frac{\lambda(t)(1 - p(t))}{p(t)}\,du + \beta_{T^*}\left(1 - \frac{\lambda(t)(1 - p(t))\,T^*}{p(t)}\right)\right] = \lambda(t)\,(1 - p(t))\,(\Omega_{T^*} - \beta_{T^*} T^*) + \beta_{T^*}\,p(t).$$

Then, an estimator for λ(t) can be formulated as

$$\hat\lambda = \frac{N_{rec} - N_{pos}\,\hat\beta_{T^*}}{N_{neg}\,(\hat\Omega_{T^*} - \hat\beta_{T^*} T^*)},$$

which is the adjusted estimator (Kassanjee et al., 2012). Assumption B.1 requires a constant φ(u) for u ≥ T*, such that the false-recent rate β_{T*} no longer depends on the distribution of the long-infected population G(·). Then, an unbiased estimate of β_{T*} can be obtained by taking the average test-recent rate among an arbitrary sample of long-infected subjects. In practice, φ(u) may be non-constant for u > T*. In that case, the summary FRR β_{T*} depends on the distribution G(·) and is context-specific. For example, it may depend on the demographic and epidemiological history of the population (Kassanjee et al., 2016). An estimate $\hat\beta_{T^*}$ then depends on the distribution of the long-infected subjects from which it is estimated. In practice, researchers usually prefer a recency test with a small FRR (< 2%), so that φ(u) can be viewed as approximately constant for u > T*.

Remark 2 In the case when Assumption B.1 is violated, use of the adjusted estimator may still be appropriate if FRR is evaluated among a population similar to the long-infected subjects in the cross-sectional sample. In that case, β_{T*} is the average of φ(u) over the infection durations of the long-infected subjects in the cross-sectional sample, so that

$$\int_{T^*}^\infty \phi(u)\,\Pr(T = t - u \mid T \le t, A(t) = 1)\,du = \beta_{T^*}\,\Pr(T \le t - T^* \mid T \le t, A(t) = 1) = \beta_{T^*}\left(1 - \frac{\lambda(t)(1 - p(t))\,T^*}{p(t)}\right),$$

where the last equality follows from Assumption B.2. Then, the derivations for P_rec(t) that lead to the adjusted estimator still hold. Therefore, we may still appropriately use the adjusted estimator if the distribution of the long-infected subjects in the cross-sectional sample is the same as that in the external study where β_{T*} is estimated.

In the derivations for both estimators, one key assumption is that

$$\Pr(T = t - u \mid T \le t, A(t) = 1) = \frac{\lambda(t)(1 - p(t))}{p(t)}, \qquad u \in [0, c],$$

where c = τ for the snapshot estimator and c = T* for the adjusted estimator. A similar assumption was also suggested in Mahiane et al. (2014) in describing the sensitivity and specificity of recency biomarkers.
Write λ*_t(s) = Pr(T = s|T ≥ s, A(t) = 1) and p*_t(s) = Pr(T ≤ s|A(t) = 1) as the incidence and prevalence at time s restricted to the eligible population at time t. Note that λ*_t(s) differs from the λ(s) defined in (3), since they are restricted to the populations that are eligible at different times. Obviously we have λ*_t(t) = λ(t) and p*_t(t) = p(t). The key quantity Pr(T = s|T ≤ t, A(t) = 1) can be written as

$$\Pr(T = s \mid T \le t, A(t) = 1) = \frac{\Pr(T = s \mid A(t) = 1)}{\Pr(T \le t \mid A(t) = 1)} = \frac{\lambda^*_t(s)\,(1 - p^*_t(s))}{p(t)}.$$

To connect λ*_t(s) and p*_t(s) with the observed incidence λ(t) and prevalence p(t), we make the following assumption.

Assumption C For s ∈ [t − c, t] and for all t, the restricted incidence is equal to the unrestricted (or observed) incidence, i.e., λ*_t(s) = λ(s), and the restricted prevalence is equal to the unrestricted prevalence, i.e., p*_t(s) = p(s).

This assumption would approximately hold when c is small. Specifically, if A(t) is defined by characteristics such that only a small proportion of the subjects move in and out of the eligible population in a time span of c, then the eligible population remains approximately the same, i.e., A(s) ≈ A(t) for s ∈ [t − c, t]. When A(t) is defined by covariate values such as membership of a particular population (e.g., MSM), this assumption requires that most of the subjects who were part of this population at time s are also part of this population at time t. Based on Assumption C,

$$\Pr(T = s \mid T \le t, A(t) = 1) = \frac{\lambda(s)\,(1 - p(s))}{p(t)}, \qquad s \in [t - c, t],$$

so the key assumption holds when λ(s)(1 − p(s)) does not vary over s ∈ [t − c, t]. Specifically, we consider the following assumption.

Assumption D For s ∈ [t − c, t] and for all t, the incidence and prevalence are constant, i.e., λ(s) = λ(t) and p(s) = p(t).

To summarize these assumptions, the consistency of the snapshot estimator and adjusted estimator is given in the following theorems.

Theorem 1 Suppose that Assumptions C and D hold for c = τ. Then, Assumption A.2 holds. If we further assume Assumption A.1, then the snapshot estimator $\hat\lambda$ is unbiased for estimating λ(t).

Theorem 2 Suppose that Assumptions C and D hold for c = T*. Then, Assumption B.2 holds. If we further assume Assumption B.1, then the adjusted estimator $\hat\lambda$ is unbiased for estimating λ(t).
We have given results on consistency of the snapshot and adjusted estimators with Theorems 1 and 2. The main epidemiological requirement is Assumption D, i.e., incidence and prevalence are constant over a period of time. In this section, we explore the expected bias when Assumption D fails to hold (but all other assumptions hold). Specifically, we assess the bias associated with non-constant incidence λ(t) but constant prevalence. By Assumption C,

$$\Pr(T = t - u \mid T \le t, A(t) = 1) = \frac{\lambda(t - u)\,(1 - p)}{p},$$

where p is the constant prevalence. We first consider the bias of the snapshot estimator. Given Assumption A.1, the expected value of the snapshot estimator is given by

$$E(\hat\lambda) \approx \frac{\int_0^\tau \lambda(t - u)\,\phi(u)\,du}{\mu},$$

which is a weighted version of the incidence over [t − τ, t]. If the incidence is linear in time over this period, i.e., λ(t − u) = λ(t) + ρu for u ∈ [0, τ], then

$$E(\hat\lambda) \approx \lambda(t) + \rho\,\omega = \lambda(t - \omega),$$

where $\omega = \int_0^\tau u\,\phi(u)\,du / \mu$ is the mean shadow time defined by Kaplan and Brookmeyer (1999), indicating that the cross-sectional sample is "casting a shadow" back in time. That is, when the incidence is linearly changing in time and the prevalence is constant, the snapshot estimator estimates the incidence rate ω time units ago. The estimation bias is given by ρω. For example, if incidence is decreasing, i.e., ρ > 0, the underlying incidence that produced an infection u > 0 time units ago was higher than the current incidence, so the estimate $\hat\lambda$ will have positive bias.

Similarly, for the adjusted estimator, we evaluate the expected value under Assumption B.1, which is given by

$$E(\hat\lambda) \approx \frac{\int_0^{T^*} \lambda(t - u)\,(\phi(u) - \beta_{T^*})\,du}{\Omega_{T^*} - \beta_{T^*} T^*} = \lambda(t) + \rho\,\omega_{T^*}, \qquad \omega_{T^*} = \frac{\int_0^{T^*} u\,(\phi(u) - \beta_{T^*})\,du}{\Omega_{T^*} - \beta_{T^*} T^*},$$

where the second expression holds under linearly changing incidence. The quantity ω_{T*} can be viewed as a "mean shadow time" for the adjusted estimator with a recency test that satisfies Assumption B.1. The estimation bias is ρω_{T*}.

Thus far, we have provided a framework with precisely defined assumptions through which to understand both the snapshot and adjusted estimators. To our knowledge, rigorous derivation of the estimators and their assumptions has not been done by others. In the following section, we evaluate how these estimators perform empirically under realistic epidemiological scenarios and with realistic recency test algorithms.
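To make the mean shadow time concrete, the sketch below evaluates it in closed form for a test-recent curve that is the survival function of a Gamma random variable, using the shape and rate that the simulation section later assigns to test 1A. For a survival function S of a non-negative variable X, the integral of S is E[X] and the integral of uS(u) is E[X²]/2, so µ = α/β and ω = (α + 1)/(2β). The integrals are taken over [0, ∞) rather than [0, τ], which is an approximation, and the slope `rho` is a hypothetical value chosen only so that the resulting bias has a realistic order of magnitude.

```python
ALPHA = 0.352        # Gamma shape (value used for test 1A in the simulations)
RATE = 1.273         # Gamma rate, per year
DAYS_PER_YEAR = 365.25


def gamma_window_and_shadow(alpha, rate):
    """Mean window period and mean shadow time when phi is a Gamma survival curve.

    mu    = integral of phi(u) du     = E[X]       = alpha / rate
    omega = integral of u*phi(u) du / mu = E[X^2] / (2 * E[X])
    """
    mu = alpha / rate
    second_moment = alpha * (alpha + 1.0) / rate**2
    omega = (second_moment / 2.0) / mu  # simplifies to (alpha + 1) / (2 * rate)
    return mu, omega


mu, omega = gamma_window_and_shadow(ALPHA, RATE)

# Hypothetical linear trend: incidence u years ago was lambda(t) + rho * u.
rho = 0.0028

# Approximate bias of the snapshot estimator under this linear trend.
snapshot_bias = rho * omega

print(f"mean window period: {mu * DAYS_PER_YEAR:.0f} days")
print(f"mean shadow time:   {omega * DAYS_PER_YEAR:.0f} days")
print(f"expected bias:      {snapshot_bias:.5f} per person-year")
```

The shadow time is roughly twice the window period here because a Gamma distribution with shape below one is strongly right-skewed: a few long-lived test-recent results pull the shadow back in time.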
To evaluate the numerical performance of the estimators under various settings, we conducted simulation studies. Throughout the simulations, we assume that Assumption C on approximation of the eligible population always hold. We also assume that prevalence is constant over time. We consider different settings of HIV epidemics and recency tests, where Assumptions A.1, B.1, and D hold or not. We calculate the snapshot and adjusted estimators using (1) and (2), with variance estimators calculated based on Appendix A of Gao et al. (2020) . Importantly, these variance estimators accounts for variability in estimating µ, Ω T * and β T * from an external study. We generate practical settings by mimicking the epidemiological dynamics of HIV in a population of men who have sex with men (MSM) attending Silom Community Clinic in Bangkok, Thailand (Pattanasin et al., 2020) . Particularly, we set the prevalence to be constant and as the mean prevalence in 2011-2018 in that population, and generate settings with different incidences by modeling the HIV incidence in 2011-2018 by either a linear model or log-linear model, to reflect a linearly decreasing or exponentially decreasing incidence. Based on the estimates from the Bangkok MSM data, we consider the following settings corresponding to different trends in HIV incidence. Assumption D is satisfied when the incidence is constant, and it is violated in the linear and exponential settings. We would like to estimate the incidence at time t such that the "true" incidence is 0.032 across all settings. In our simulations, we will assess how violating this assumption affects the performance of the snapshot and adjusted estimators. We sought to assess the performance of the two estimators with a variety of recency tests with different characteristics. The properties of those simulated recency tests mimic two tests in Brookmeyer et al. (2013) and Laeyendecker et al. 
(2018), with modifications that allow us to assess the performance of the estimators under diverse conditions. For the snapshot estimator, we set τ = 12 years. For the adjusted estimator, we always considered T* = 2, i.e., any person with an infection acquired longer than 2 years ago is a "long-infected" case.

We first consider a set of recency tests with a relatively short mean window period and a short shadow period, mimicking a recency test in Brookmeyer et al. (2013) that classifies a subject as recent if their BED capture enzyme immunoassay (BED-CEIA) ≤ 1.5, their Bio-Rad Avidity (BRAI) (Bio-Rad Laboratories, Mississauga, ON) < 40, and their viral load > 400 copies/ml. This test has a mean window period of 101 days and a shadow period of 194 days. We generated four different recency tests that mimic this test:

(1A) φ_1A(t) = 1 − F_Gamma(t; α = 0.352, β = 1.273), where F_Gamma(·; α, β) is the cumulative distribution function of a Gamma random variable with shape α and rate β. Assumption A.1 (approximately) holds for this test, with mean window period 101 days and mean shadow 194 days. Assumption B.1 fails to hold since φ_1A(t) is non-constant for t ≥ 2. The MDRI is 98 days and the test-recent probability at t = 2 is 1.4%.

(1B) φ_1B(t) = φ_1A(t)I(t ≤ 2) + 0.014I(t > 2). This test modifies test 1A by carrying forward the 1.4% test-recent probability at t = 2, such that Assumption B.1 for the adjusted estimator holds with MDRI = 98 days and FRR = 1.4%. Assumption A.1 for the snapshot estimator no longer holds, such that the mean window period is infinite.

(1C) φ_1C(t) = φ_1B(t) + f_N(t; 7, 1)/8, where f_N(t; µ, σ) is the density function of a normal random variable with mean µ and standard deviation σ. This test further modifies test 1B by adding a normally distributed spike centered at 7 years, such that Assumption B.1 on constant false-recent rate no longer holds. This test, similar to that depicted in the figure of epidemiological and test-recent dynamics in Kassanjee et al. (2012), represents a setting in which individuals who have been on antiretroviral therapy for years may have biomarker profiles similar to those who have been recently infected, and thus the false-recent rate among those individuals is relatively higher.

(1D) φ_1D(t) = φ_1B(t) + F_N(t; 10, 2)/10, where F_N(t; µ, σ) is the cumulative distribution function of a normal random variable with mean µ and standard deviation σ. This test modifies test 1B by steadily increasing the false-recent rate starting around 6 years, reaching 9.8% at 12 years. The high false-recent rate is motivated by the BED assay (Parekh et al., 2002), which has been shown to have an FRR of up to 15% in some populations (Mastro et al., 2010).

(2B) φ_2B(t) = φ_2A(t)I(t ≤ 3.17) + 0.020I(t > 3.17). This recency test has a constant 2% test-recent probability when t ≥ 3.17. Unlike test 1B, Assumption B.1 for the adjusted estimator is violated since the test-recent rate is non-constant between years 2 and 3.17, as depicted in Figure 1 by the shaded grey region.

(2C) φ_2C(t) = φ_2B(t) + f_N(t; 7, 1)/8. A similar normally distributed spike centered at 7 years was added to test 2B.

(2D) φ_2D(t) = φ_2B(t) + F_N(t; 10, 2)/10. Similar to φ_1D, the FRR increases up to 10.4% at time 12.

The duration-specific test-recent probabilities of all six recency tests are depicted in Figure 1. The shaded grey area highlights that Assumption B.1 is violated for test 2B, even though it holds for test 1B: there is a non-constant fraction of the long-infected subjects (infection duration > 2) who test recent.

The data simulation procedure consists of two parts. The first part requires simulating data to mimic an external study based on which we estimate the properties of a particular recency test, including mean window period, MDRI and FRR. The second part includes simulation of cross-sectional samples from a population with given HIV epidemiological dynamics.
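The test-recent curves for tests 1A to 1D described above can be sketched with the standard library alone. This is an illustrative reading of the definitions, not the authors' code: the Gamma distribution function is computed with a power series for the regularized incomplete gamma function (a helper written for this example), and test 1B is taken to carry forward the value of φ_1A at t = 2, which is about 1.4%.

```python
import math

ALPHA, RATE = 0.352, 1.273   # Gamma shape and rate (per year) for test 1A
T_STAR = 2.0                 # cutoff for "long-infected", in years


def gamma_cdf(x, a):
    """Regularized lower incomplete gamma P(a, x), via its power series.

    P(a, x) = x^a * exp(-x) * sum_{n>=0} x^n / Gamma(a + n + 1).
    Adequate for the moderate x used here; not a general-purpose routine.
    """
    if x <= 0.0:
        return 0.0
    term = 1.0 / math.gamma(a + 1.0)
    total = term
    n = 1
    while term > 1e-17 * total:
        term *= x / (a + n)
        total += term
        n += 1
    return total * math.exp(a * math.log(x) - x)


def normal_pdf(t, mean, sd):
    z = (t - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def normal_cdf(t, mean, sd):
    return 0.5 * (1.0 + math.erf((t - mean) / (sd * math.sqrt(2.0))))


def phi_1a(t):
    """Gamma survival curve: tail goes to zero."""
    return max(0.0, 1.0 - gamma_cdf(RATE * t, ALPHA))


def phi_1b(t):
    """Test 1A with its value at T* carried forward as a constant tail."""
    return phi_1a(t) if t <= T_STAR else phi_1a(T_STAR)


def phi_1c(t):
    """Test 1B plus a normal-density spike centered at 7 years."""
    return phi_1b(t) + normal_pdf(t, 7.0, 1.0) / 8.0


def phi_1d(t):
    """Test 1B plus a steadily rising tail (scaled normal CDF)."""
    return phi_1b(t) + normal_cdf(t, 10.0, 2.0) / 10.0


for t in (0.5, 2.0, 7.0, 12.0):
    print(t, [round(f(t), 4) for f in (phi_1a, phi_1b, phi_1c, phi_1d)])
```

Evaluating the curves at 2 and 12 years gives a quick check against the values quoted in the text (about 1.4% for test 1A at 2 years and about 9.8% for test 1D at 12 years).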
These separate data simulation procedures are outlined in Sections 3.3.1 and 3.3.2, respectively.

Here we outline the process of simulating recency test results for samples (with known infection durations) in an external study and estimating recency test parameters based on such simulated data. We simulate infection durations in the external study similar to those in Duong et al. (2015), with the detailed procedure described in Appendix S1.1. Then, given a duration-specific test-recent probability φ(·), we generate a test-recent indicator by ∆_ij ∼ Bernoulli(φ(u_ij)), where u_ij is the simulated infection duration for sample j = 1, . . . , n_i of subject i = 1, . . . , m in the external study. Based on the observed data {(u_ij, ∆_ij) : i = 1, . . . , m; j = 1, . . . , n_i} in the external study, we estimate the function φ(t) using generalized estimating equations (Liang and Zeger, 1986), with an exchangeable correlation structure accounting for within-subject correlation. The marginal model uses a logit link and a cubic polynomial for the linear predictor by assuming

$$\mathrm{logit}\{\phi(u)\} = \gamma_0 + \gamma_1 u + \gamma_2 u^2 + \gamma_3 u^3.$$

Then, an estimate for φ(u) can be constructed by $\hat\phi(u) = \mathrm{expit}(\hat\gamma_0 + \hat\gamma_1 u + \hat\gamma_2 u^2 + \hat\gamma_3 u^3)$, where $\hat\gamma = (\hat\gamma_0, \hat\gamma_1, \hat\gamma_2, \hat\gamma_3)$ is the parameter estimate. We use robust standard errors for variance estimation. We then calculate the mean window period and MDRI by numerically integrating the estimated test-recent function $\hat\phi$. In particular, the mean window period involves integrating to infinity; in the simulation, however, we set the upper bound of the integration to the maximum duration observed in the simulated sample (approximately 8 years). We estimate the variance of $\hat\Omega_{T^*}$ and $\hat\mu$ using the delta method and the robust variance-covariance matrix of $\hat\gamma$. We estimate FRR by evaluating the average test-recent probability among a number of long-infected subjects.
In particular, we consider 1500 long-infected subjects with duration of infection uniformly distributed between T* = 2 and τ = 12 years, similar to other studies (Kassanjee et al., 2016).

For the cross-sectional sample, we generate the HIV status of each subject and, for each HIV-positive subject i, their infection time T_i based on the epidemic parameters (p, λ(t)), with details given in Appendix S1.2. Given T_i, we generate a recency test indicator ∆_i ∼ Bernoulli(φ(t − T_i)). Finally, we calculate $N_{rec} = \sum_{i=1}^{N_{pos}} \Delta_i$.

For each simulation replicate, we first generate observations of an external study to obtain the estimates $\hat\mu$, $\hat\Omega_{T^*}$ and $\hat\beta_{T^*}$. Then, we generate an independent cross-sectional sample {N_rec, N_pos, N_neg}. We calculate the snapshot estimator by Equation (1) and the adjusted estimator by Equation (2). We estimate their variances based on the formulas in Appendix A of Gao et al. (2020). The code to reproduce these simulations, and instructions to use functions for estimating incidence based on cross-sectional data, are available.

Table 1 shows the simulation results in settings with constant, linear, and exponential incidence trends and recency tests 1A-D and 2A-D, with fixed cross-sectional sample size N = 5000. Across all settings, the "true" value for incidence is λ = 0.032. Each entry is based on 5,000 simulations.

Recency assays 1A and 2A satisfy Assumption A.1 for the snapshot estimator. In the constant incidence setting where Assumption D further holds, the empirical bias is small. In the settings where Assumption D fails to hold (linear or exponential incidence), there is empirical bias associated with the snapshot estimator, and the empirical bias is close to the expected bias calculated based on the formula in Section 2.5 (0.15×10^−2 and 0.12×10^−2 for assay 1A in the linear and exponential settings, respectively; 0.20×10^−2 and 0.23×10^−2 for assay 2A in the linear and exponential settings, respectively).
For the recency tests that violate Assumption A.1 (assays 1B-D, 2B-D), the empirical bias is larger, and it increases if the constant incidence assumption is also violated. The coverage probabilities are close to the nominal level with recency tests 1A and 2A when incidence is constant. There is under-coverage when the incidence is non-constant or when Assumption A.1 fails to hold (recency tests 1B-D, 2B-D).

Table 1: Summary statistics (×10^−2) for the simulation studies with different settings over 5000 simulations each. For each epidemiological setting and recency test, we show the empirical median bias (Bias), the empirical standard error (SE), the average standard error estimate (SEE), and the empirical coverage probability of the 95% confidence intervals (Cov). We note whether the assumptions are satisfied for the snapshot and adjusted estimator in the Asm. column.

For the adjusted estimator, when incidence is constant, the empirical bias for recency tests 1A-C and 2A-C is small, while the empirical bias for recency tests 1D and 2D is relatively large. When the constant incidence assumption is violated, the empirical bias gets larger, and the empirical bias for assay 1B, for which Assumption B.1 holds, matches the theoretically calculated expected bias (0.10×10^−2 and 0.09×10^−2 for the linear and exponential settings, respectively). Across all settings, the standard error estimate is close to the empirical standard error, indicating reasonable estimation of variability. Notably, the empirical standard errors for the adjusted estimator are always larger than those for the snapshot estimator. The coverage probabilities are close to the nominal level with recency tests 1A and 1B, even in non-constant incidence settings. The coverage probabilities in the constant incidence setting are always close to the nominal level. Similar to the snapshot estimator, there is under-coverage when the incidence is non-constant for some recency tests.
The coverage probabilities are closer to the nominal level than those of the snapshot estimator in most non-constant incidence settings, mainly due to the larger associated variability.

Table 1 shows that recency tests 1C and 2C give minimal bias and nominal coverage in the constant incidence setting. This seems to suggest that the adjusted estimator may sometimes be robust to violation of the recency test requirement (Assumption B.1) when the incidence in the population is constant. However, it is the result of a specific choice of distribution of infection duration in the external study, based on which we estimate $\beta_{T^*}$. The infection times of the long-infected subjects were generated uniformly from 2 to 12 years, such that the estimated $\beta_{T^*}$ reflects the average false-recent rate in $[2, 12]$, similar to the average false-recent rate in a constant incidence setting.

To evaluate sensitivity to this distributional assumption, we consider two different distributions of infection durations for long-infected subjects in the external study in estimating $\beta_{T^*}$. The first distribution is similar to that of subjects who were infected more than two years ago in Duong et al. (2015), where the range of infection duration is $[2, 8.25]$ years and the distribution is clearly not uniform (see Figure 2). The second distribution truncates the infection durations of Duong et al. (2015) at 5 years, such that the range of infection duration is $[2, 5]$ years. Simulation results are shown in Table 2. Different sampling schemes for the long-infected subjects provide different estimates for the incidence with recency tests 1C and 2C. In particular, if the distribution of long-infected subjects fails to recover the major trend of the $\phi$ function (e.g., the spike at 7 years for assays 1C and 2C), the estimator is biased and the coverage is poor.
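The dependence of the estimated false-recent rate on the sampled infection durations can be illustrated numerically. The sketch below is not taken from the simulation code: the shape of $\phi$ beyond $T^*$ (a flat 1% level with a bump of height 5% within half a year of 7 years) is a hypothetical stand-in for a test with a spike near 7 years, and both function names are invented for the example. It averages $\phi$ over uniform duration distributions on $[2, 12]$ and on $[2, 5]$.

```python
def phi_spike(u, base=0.01, height=0.05, center=7.0, half_width=0.5):
    """Hypothetical test-recent probability for a long-infected subject
    with infection duration u (years): flat, plus a bump near `center`."""
    return base + (height if abs(u - center) <= half_width else 0.0)


def mean_phi_uniform(phi, lo, hi, n=10_000):
    """Midpoint-rule average of phi over a Uniform(lo, hi) duration
    distribution, i.e. the false-recent rate that such an external
    study would target."""
    step = (hi - lo) / n
    return sum(phi(lo + (i + 0.5) * step) for i in range(n)) / n


frr_wide = mean_phi_uniform(phi_spike, 2.0, 12.0)   # durations cover the bump
frr_narrow = mean_phi_uniform(phi_spike, 2.0, 5.0)  # durations miss the bump
print(frr_wide, frr_narrow)
```

Under these hypothetical values the wide range yields an average of about 0.015 and the narrow range about 0.010, so the estimated FRR depends on which durations the external study happened to sample.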
This suggests caution in using the adjusted estimator when Assumption B.1 fails to hold, and that one should be sure that the distribution of long-infected subjects in the external study matches that in the cross-sectional sample.

In this manuscript, we considered a unified statistical framework to assess the assumptions for cross-sectional incidence estimation based on the snapshot and Kassanjee's adjusted estimators. We established two key assumptions: the incidence and prevalence in the population of interest are constant over the period of time preceding a cross-sectional sample; and the duration-specific test-recent probability function $\phi(t)$ goes to zero for the snapshot estimator or is constant in the tail for the adjusted estimator. We derived the theoretical biases of the estimators when the constant incidence assumption fails to hold. To empirically assess the biases, we conducted simulation studies under various scenarios with different epidemiological settings and different recency test properties. The estimators perform well when their corresponding assumptions hold. When the constant incidence assumption is violated, the numerical bias is commensurate with the theoretically calculated bias. The adjusted estimator is more robust when the assumptions about the recency test properties (Assumption A.1 or B.1) are violated, though in exchange the variability of the adjusted estimator is always larger than that of the snapshot estimator. This robustness to mis-specification makes the adjusted estimator more flexible in settings where the properties of a specific recency test are not precisely known.

There are important differences between the snapshot and adjusted estimators with respect to their requirements. The snapshot estimator requires a finite positive range for $\phi(t)$ (Assumption A.1). In other words, if someone was infected a sufficiently long time ago, the recency test is perfectly specific.
In contrast, the adjusted estimator requires a constant $\phi(t)$ in the tail (Assumption B.1). In other words, past a certain point, the false-recent test probability is unrelated to infection duration. As illustrated by assay 2A in the simulation studies, Assumption B.1 is not necessarily less restrictive than Assumption A.1, and indeed the snapshot estimator performs better for this specific assay. In practice, to obtain the best performance, the researcher should understand the properties of the recency test before choosing which of the two estimators to apply.

An additional consideration when using the adjusted estimator is that its performance may be affected by the distribution of the long-infected subjects that are used to estimate FRR. As suggested in Remark 2, the bias from the adjusted estimator may be minimal if the distributions of the long-infected subjects in the cross-sectional sample and in the external evaluation study are the same. However, if the range of the infection duration of the long-infected subjects fails to recover a non-constant region of the $\phi$ function of the recency test, or if the distribution of infection duration differs substantially from that of long-infected subjects in the cross-sectional sample, biases from estimating FRR lead to bias and under-coverage in the adjusted estimator.

In order to use the adjusted estimator, researchers need to specify a fixed $T^*$ beyond which subjects are regarded as long-infected. In practice, $T^*$ is set at 1 or 2 years, and a proper test-recent region $R$ is then chosen to yield a recency test with the desired properties (e.g., large MDRI, FRR < 2.0%) defined upon this choice of $T^*$. The choice of $T^*$ would affect the performance of the adjusted estimator through its impact on MDRI/FRR and through the fact that the adjusted estimator estimates a weighted average incidence in a range of $T^*$ prior to the cross-sectional time.
In this manuscript, we did not assess the impact of $T^*$, since it would involve extensive modeling of the biomarker values at different infection durations. We wish to explore the effect of different choices of $T^*$ in future research.

Finally, the performance of the adjusted estimator is sensitive to Assumption B.1, which requires a constant tail of the duration-specific test-recent probability $\phi(t)$ and is affected by the test-recent region $R$ of the recency test. In particular, the test-recent region $R$ is usually chosen to guarantee this assumption, leading to a potentially small MDRI and suboptimal power for the adjusted estimator. An alternative strategy is to directly model the infection duration and construct an incidence estimator based on a predicted infection duration given recency assay readings. Since it uses the recency assay readings, such a strategy is similar to making use of "recency" results from multiple test-recent regions, so that more power may be gained. This alternative strategy is currently under investigation.

S1 Simulation Procedure

S1.1 Data Simulation to Mimic Duong et al. (2015)

In Section 3.3.1 we describe the process for estimating the mean window period and MDRI (and possibly FRR) from an external study. To get the external study data, we construct a data generating process based on the data source in Duong et al. (2015). The purpose of Duong et al. (2015) was to estimate the MDRI for one particular recency test. We re-purpose the data source to estimate characteristics of our recency tests 1A-D and 2A-D for the simulation studies. In Duong et al. (2015), the authors gathered data from individuals with known seroconversion times (or equivalently infection duration) and with measurements taken longitudinally, sometimes over the course of many years. In total, we have 2077 longitudinal measurements on 175 individuals.
There were additional individuals in the dataset, but we focused only on the ones with optimal panel data as described in Duong et al. (2015). The individuals had different HIV-1 subtypes including A, B, C, D, and AE, and were from varied areas of the globe including the Netherlands, Thailand, Ethiopia, Kenya, China, and Trinidad. We grouped all geographic areas and HIV-1 subtypes together for simplicity. The grey histogram in Figure 2 shows the empirical distribution of infection durations in the Duong et al. (2015) dataset pooled over all geographic and subtype cohorts.

Our aim is to create a data generation process that mimics this empirical distribution but preserves the longitudinal aspect of the data (as opposed to re-sampling independently from this histogram). For each individual in the original dataset, we calculated the gap times between their longitudinal measurements. The average number of days between longitudinal measurements is a function of the sample number, i.e., the first few samples taken on an individual are typically close together in time, and after that there is a longer time between samples (see Figure 3). Based on this observation, we fit a piece-wise log-linear GEE model with a knot at sample number 5. The model fit is shown in Figure 3.

Pattanasin et al. (2020): constant incidence, linearly decreasing incidence, and exponentially decreasing incidence. We consider all three of these for the simulation settings. Right: Distribution of 10,000 past infection times that each setting implies. Note that when we move from constant to linear or exponential incidence, the assumption of uniformly distributed infection times is violated. Our simulation study assesses how violating this assumption affects the performance of the snapshot and adjusted estimators.
Measuring the HIV/AIDS epidemic: approaches and challenges
Estimation of HIV incidence using multiple biomarkers
Estimation of current human immunodeficiency virus incidence rates from a cross-sectional survey using early diagnostic tests
Use of a high resolution melting (HRM) assay to compare gag, pol, and env diversity in adults with different stages of HIV infection
Recalibration of the limiting antigen avidity EIA to determine mean duration of recent infection in divergent HIV-1 subtypes
Use of a multifaceted approach to analyze HIV incidence in a cohort study of women in the United States: HIV Prevention Trials Network 064 study
Sample size calculation for active-arm trial with counterfactual incidence based on recency assay
Improved HIV-1 incidence estimates using the BED capture enzyme immunoassay
New testing strategy to detect early HIV-1 infection for use in incidence estimates and for clinical and prevention purposes
Snapshot estimators of recent HIV incidence rates
A new general biomarker-based incidence estimator
Viral load criteria and threshold optimization to improve HIV incidence assay characteristics - a CEPHIA analysis
Performance of a limiting-antigen avidity enzyme immunoassay for cross-sectional estimation of HIV incidence in the United States
Ambiguous nucleotide calls from population-based sequencing of HIV-1 are a marker for viral diversity and the age of infection
HIV incidence determination in the United States: a multiassay approach
Identification and validation of a multi-assay algorithm for cross-sectional HIV incidence estimation in populations with subtype C infection
Longitudinal data analysis using generalized linear models
Mixture models for calibrating the BED for HIV incidence testing
Closer to 90-90-90. The cascade of care after 10 years of ART scale-up in rural Malawi: a population study
Estimating HIV incidence in populations using tests for recent infection: issues, challenges and the way forward
Comparison of HIV type 1 incidence observed during longitudinal follow-up with incidence estimated by cross-sectional analysis using the BED capture enzyme immunoassay
Cross-sectional estimates revealed high HIV incidence in Botswana rural communities in the era of successful ART scale-up in 2013-2015
Quantitative detection of increasing HIV type 1 antibodies after seroconversion: a simple assay for detecting recent HIV infection and estimating incidence
Recent declines in HIV infections at Silom Community Clinic Bangkok, Thailand corresponding to HIV prevention scale up: an open cohort assessment
A comparison of South African national HIV incidence estimates: a critical appraisal of different methods
Voluntary counselling and testing: uptake, impact on sexual behaviour, and HIV incidence in a rural Zimbabwean cohort
Community viral load, antiretroviral therapy coverage, and HIV incidence in India: a cross-sectional, comparative study
Precision and accuracy of a procedure for detecting recent human immunodeficiency virus infections by calculating the antibody avidity index by an automated immunoassay-based method
A new pattern-based method for identifying recent HIV-1 infections from the viral env sequence

31 instead of 0.29) and the results did not substantively change

distributions for each. To do this, we impose one additional assumption.

In addition, Assumption E states that equivalent incidence among eligible populations at different times holds for a range of $s$ closer to $t$, while the incidence $\lambda^*_t(s) = \Pr(T = s \mid T \ge s, A(t))$ is zero when $s$ is far away from $t$. It leads to Assumption C if $c \le c_t$, where $c = \tau$ for the snapshot estimator and $c = T^*$ for the adjusted estimator.
This assumption may not generally hold in practice, since we would expect a continuous $\lambda^*_t(s)$, whereas in this case there may be a discontinuity in incidence at time $t - c_t$. However, it is useful for generating infection time distributions under which Assumption C holds. In particular, the closed-form solution for $T$ in a number of cases is shown below.

[Table: incidence function $\lambda(s)$ and the corresponding closed-form expression for the infection time]

Figure 4 shows the incidence functions we consider for simulations based on the Bangkok MSM data, and the corresponding infection time distributions based on the above derivations. As a sensitivity check, we slightly varied the parameters derived from the Bangkok data (i.e., prevalence