key: cord-0121502-0tkg57em
authors: Karmakar, Bikram; Small, Dylan
title: Inference for a test-negative case-control study with added controls
date: 2020-05-14
sha: fab80aecf7f0aa3973475be4ab64addbcfed3d44
doc_id: 121502
cord_uid: 0tkg57em

Test-negative designs with added controls have recently been proposed to study COVID-19. An individual is test-positive or test-negative according to whether they took a test for a disease and tested positive or tested negative. Adding a control group to a comparison of test-positives vs test-negatives is useful because the additional comparison of test-positives vs controls can have potential biases different from those of the first comparison. A Bonferroni correction ensures the necessary type-I error control for these two comparisons done simultaneously. We propose two new methods for inference that have better interpretability and higher statistical power for these designs. These methods add a third comparison that is essentially independent of the first comparison, but our proposed second method often pays much less for these three comparisons than a Bonferroni correction would pay for the two comparisons.

keywords: Case-control studies; Closed testing; Confidence intervals; Potential biases; Second control group.

Test-negative studies compare exposures in cases who take a test for a particular disease and test positive vs. controls who also take the test but test negative.[5,7] A test-negative study with added controls (TNSWAC) supplements these with controls who did not take the test. TNSWACs have been used to study antibiotic resistance [4] and have been proposed to study COVID-19.[8] The standard inference approach has been to present two exposure rate comparisons,
(i) test-positives to test-negatives
(ii) test-positives to controls
To control the familywise Type I error rate for multiple comparisons at level α (e.g., α = 0.05), the Bonferroni inequality can be used and each comparison done at level α/2. Here we propose different inference strategies that can provide greater interpretability and power.

A valuable feature of TNSWACs is that comparisons (i) and (ii) may have different potential biases.[8] Evidence is strengthened when diverse approaches with diverse potential biases produce similar results.[6,3] However, comparisons (i) and (ii) are dependent (see Figure 1A) and might tend to agree just because of this dependence. It is important to distinguish new evidence from the same evidence repeated twice.[3] To this end, it is useful to supplement comparisons (i)-(ii) with comparison
(iii) test-positives pooled with test-negatives to controls,
which is essentially independent of (i) (see Figure 1B and supplement) and may suffer from different potential biases than (i).[1] For example, it has been hypothesized that smoking protects against Covid-19.
[2] Comparison (i) may be biased because test-negatives may have some other infection (e.g., the flu) for which smoking increases risk, and comparisons (ii) and (iii) may be biased because test-takers tend to be "health seeking."[7] Finding evidence of smoking being protective in all of comparisons (i)-(iii) would strengthen the evidence compared to just comparisons (i)-(ii), in part because the latter comparisons are dependent.

The following are two procedures that consider comparisons (i)-(iii) and control the familywise error rate for multiple comparisons at α (proof/code in supplement). The first procedure is [...]
(2) Test the null of no exposure effect in either comparison (i) and/or comparison (iii), $H_{0(i)} \cup H_{0(iii)}$, at level λ. This could be done by Fisher's combination method since comparisons (i) and (iii) are essentially independent under the null.[1]
(3) [...] (2), then test $H_{0(i)}$ and $H_{0(iii)}$ each at level λ.
(4) If λ = α/2 and both $H_{0(i)}$ and $H_{0(iii)}$ were rejected in (3), then test $H_{0(ii)}$ at level α and reject if p-value ≤ α.

For example, suppose the p-values for $H_{0(i)}$, $H_{0(ii)}$ and $H_{0(iii)}$ were .04, .03 and .04, respectively; then the standard procedure would not reject any null hypothesis, whereas the second procedure would reject all nulls (note: the p-value for $H_{0(i)} \cup H_{0(iii)}$ using Fisher's combination test is .012). Confidence intervals for magnitudes of effect can be formed using both procedures; see the supplement.

Figure 1C compares the power of the two proposed procedures and the standard procedure in a simulation. Both proposed procedures increase power over the standard procedure in the simulated setting, with the second procedure providing more power.

ology. In Causal thinking in the health sciences: concepts and strategies of epidemiology. 1973.
[7] Jan P Vandenbroucke and Neil Pearce. Test-negative designs: Differences and commonalities with other case-control studies with other patient controls. Epidemiology, 30(6)

Method 1.
Note first that $H_{0(iii)}$ is false when and only when one of $H_{0(i)}$ or $H_{0(ii)}$ is false. Since Method 1 can reject $H_{0(iii)}$ in step (2) only when both $H_{0(i)}$ and $H_{0(ii)}$ are rejected at [...]

To show familywise error rate control, consider now the different possibilities of the three hypotheses being true or false separately.

(a) When all three hypotheses are true, the familywise error rate is
$$\mathrm{pr}(R_{(i),(ii),(iii)}) \le \mathrm{pr}(R_{(i),(ii)}) \le \mathrm{pr}(R_{(i)}) + \mathrm{pr}(R_{(ii)}) = \mathrm{pr}(P_{(i)} \le \alpha/2) + \mathrm{pr}(P_{(ii)} \le \alpha/2) \le \alpha/2 + \alpha/2 = \alpha.$$
[...]
Hence, the familywise error rate is always controlled.

Method 2.
First we expand the notation $R_S$ to denote the event that [...] at least one of the nulls is false is the same as at least one of $H_{0(i)}$, $H_{0(iii)}$ is false, and only when both $H_{0(i)}$ and $H_{0(iii)}$ are true will we have $H_{0(i)\wedge(iii)}$ true. Now we use the result that $P_{(i)}$ and $P_{(iii)}$ are essentially independent and that $P_{(i)\wedge(iii)}$, Fisher's combination of these two p-values, is a valid p-value under $H_{0(i)\wedge(iii)}$ [1] (see footnote).

Footnote: Two analyses are essentially independent if the joint distribution of the p-values from these analyses is stochastically larger than the uniform distribution on the unit square. Here, (i) and (iii) are nearly independent since we can show $\mathrm{pr}(P_{(i)} \le p, P_{(iii)} \le q) \le pq$ for all $0 \le p, q \le 1$. With larger sample size this inequality becomes sharper, and asymptotically they are independent.

Consider again the different combinations of the three hypotheses being true or false. We can reduce some effort in this enumeration by noting that $H_{0(iii)}$ is false when and only when one of $H_{0(i)}$ or $H_{0(ii)}$ is false.

(a) When all three of $H_{0(i)}$, $H_{0(ii)}$ and $H_{0(iii)}$ are true, the familywise error rate is
$$\mathrm{pr}(R_{(i),(ii),(iii)}) \le \mathrm{pr}(R_{(ii)} \text{ at level } \alpha/2 \text{ in step (1) or } R_{(i)\wedge(iii)} \text{ at level } \alpha/2 \text{ in step (2)})$$
$$\le \mathrm{pr}(R_{(ii)} \text{ at level } \alpha/2) + \mathrm{pr}(R_{(i)\wedge(iii)} \text{ at level } \alpha/2) = \mathrm{pr}(P_{(ii)} \le \alpha/2) + \mathrm{pr}(P_{(i)\wedge(iii)} \le \alpha/2) \le \alpha/2 + \alpha/2 = \alpha.$$
(b) When $H_{0(i)}$ is true but $H_{0(ii)}$ is false, hence $H_{0(iii)}$ is false, the familywise error rate is
$$\mathrm{pr}(R_{(i)}) = \mathrm{pr}(R_{(i)} \text{ at level } \alpha/2 \text{ or at level } \alpha \text{ in step (3), according to whether } \lambda = \alpha/2 \text{ or } \lambda = \alpha) \le \mathrm{pr}(P_{(i)} \le \alpha) \le \alpha.$$

(c) Finally, when $H_{0(ii)}$ is true but $H_{0(i)}$ is false, hence $H_{0(iii)}$ is false, the familywise error rate is
$$\mathrm{pr}(R_{(ii)}) = \mathrm{pr}(R_{(ii)} \text{ at level } \alpha/2 \text{ or at level } \alpha \text{ in step (3), according to whether } \lambda = \alpha/2 \text{ or } \lambda = \alpha) \le \mathrm{pr}(P_{(ii)} \le \alpha) \le \alpha.$$

Hence, the familywise error rate is always controlled.

2 Confidence sets for the magnitude of effects

Notation: We can create confidence sets for the effects of the exposure using the methods discussed in the letter. Some new notation is needed. In the following, a subscript P is for test-positives, N for test-negatives, and C for the added controls. Also, n with an appropriate subscript denotes the count of a particular group of individuals. For example, $n_{P1}$ denotes the number of exposed test-positives, $n_{C0}$ the number of unexposed controls, and $n_{PN1}$ the number of exposed test-positives or test-negatives.

Data tables: The collected data can be tabulated in three tables corresponding to the three comparisons (i), (ii) and (iii).

Comparison (iii)
                             Exposed       Unexposed
Test-positive or negative    $n_{PN1}$     $n_{PN0}$
Control                      $n_{C1}$      $n_{C0}$
Total                        $n_{PNC1}$    $n_{PNC0}$

A p-value for any one of the three comparisons can be calculated from the corresponding table, e.g., using Fisher's exact test. For example, $P_{(ii)}$ is the p-value calculated from the 2-by-2 table with the numbers $n_{P1}$, $n_{C1}$, $n_{P0}$ and $n_{C0}$.

We have three effects of interest for the exposure: one between test-positives and test-negatives, one between test-positives and controls, and one between test-negatives and controls. We denote these effects as $\theta_{P,N}$, $\theta_{P,C}$ and $\theta_{N,C}$, which are defined below. These are called attributable effects.
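As an illustration of how the p-values for the three comparisons might be computed from such 2-by-2 tables, and how two of them can be merged by Fisher's combination method, here is a minimal Python sketch using only the standard library. It is not the authors' supplementary code. The function names, the choice of a two-sided Fisher's exact test, and the counts in the usage lines are all assumptions made for illustration.

```python
from math import comb, log


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Assumption: 'two-sided' means summing the probabilities of all tables
    (with the same margins) that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x.
        return comb(row1, x) * comb(row2, col1 - x) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Small relative tolerance guards against floating-point ties.
    tail = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, tail)


def fisher_combination(p1, p2):
    """Fisher's combination of two (essentially) independent p-values.

    -2 * (ln p1 + ln p2) is compared with a chi-square distribution with
    4 degrees of freedom, whose upper tail has the closed form
    x * (1 - ln x) with x = p1 * p2. Requires p1, p2 > 0.
    """
    x = p1 * p2
    return x * (1.0 - log(x))


# Hypothetical counts (exposed, unexposed) for each group; not real data.
n_p1, n_p0 = 12, 88    # test-positives
n_n1, n_n0 = 30, 70    # test-negatives
n_c1, n_c0 = 45, 105   # added controls

p_i = fisher_exact_two_sided(n_p1, n_p0, n_n1, n_n0)                # comparison (i)
p_ii = fisher_exact_two_sided(n_p1, n_p0, n_c1, n_c0)               # comparison (ii)
p_iii = fisher_exact_two_sided(n_p1 + n_n1, n_p0 + n_n0, n_c1, n_c0)  # comparison (iii)
print(p_i, p_ii, p_iii, fisher_combination(p_i, p_iii))

# The letter's worked example: combining .04 and .04.
print(round(fisher_combination(0.04, 0.04), 3))  # 0.012
```

The last line reproduces the value .012 quoted in the letter for the Fisher combination of two p-values equal to .04. Whether the original analysis used one-sided or two-sided exact tests is not stated in the text, so the two-sided choice here should be treated as a placeholder.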
The effect $\theta_{P,N}$ is a ratio. Its numerator is the number of individuals who became test-positive because of the exposure but would have been test-negative in its absence, minus the number of individuals who became test-negative because of the exposure but would have been test-positive in its absence. Its denominator is the number of exposed test-positives or test-negatives. Notice that $\theta_{P,N}$ is a number between -1 and 1; $\theta_{P,N} = 0$ if the exposure did not move anyone from test-negative to test-positive or the reverse. If $\theta_{P,N}$ is positive, there are individuals for whom the exposure caused them to become test-positive. Similarly, if $\theta_{P,N}$ is negative, there are individuals for whom the exposure caused them to become test-negative. In summary, $\theta_{P,N}$ is the net effect of the exposure on becoming test-positive rather than test-negative for exposed tested individuals.

The second effect $\theta_{P,C}$ is defined similarly. By our definition, $\theta_{P,C}$ is the net effect of the exposure for test-positives versus controls, relative to all exposed individuals who are either test-positive or control. We have $\theta_{P,C} = 0$ if the exposure did not make any change in who became test-positive rather than control or the reverse.

Finally, we define a third attributable effect $\theta_{N,C}$ in the same way to denote the net effect of [...]

[1] Evidence factors in a case-control study with application to the effect of flexible sigmoidoscopy screening on colorectal cancer
[2] Low incidence of daily active tobacco smoking in patients with symptomatic COVID-19 infection
[3] Evidence factors in observational studies
[4] Risk factors for extended-spectrum β-lactamase-producing Escherichia coli urinary tract infection in the community in Denmark: a case-control study
[5] Theoretical basis of the test-negative study design for assessment of influenza vaccine effectiveness
[6] Causal thinking in the health sciences: concepts and strategies of epidemiology