key: cord-0859526-j2fxkjst
authors: Mohanan, M.; Malani, A.; Krishnan, K.; Acharya, A.
title: Prevalence Of COVID-19 In Rural Versus Urban Areas in a Low-Income Country: Findings from a State-Wide Study in Karnataka, India
date: 2020-11-04
journal: nan
DOI: 10.1101/2020.11.02.20224782
sha: 36479cc0e9de77231733dcb0b0f5dfafcce940fe
doc_id: 859526
cord_uid: j2fxkjst

Although the vast majority of confirmed cases of COVID-19 are in low- and middle-income countries, there are relatively few published studies on the epidemiology of SARS-CoV-2 in these countries. The few there are focus on disease prevalence in urban areas. We conducted state-wide surveillance for COVID-19 in both rural and urban areas of Karnataka between June 15 and August 29, 2020. We tested for both viral RNA and antibodies targeting the receptor binding domain (RBD). Adjusted seroprevalence across Karnataka was 46.7% (95% CI: 43.3-50.0), including 44.1% (95% CI: 40.0-48.2) in rural and 53.8% (95% CI: 48.4-59.2) in urban areas. The proportion of those testing positive on RT-PCR ranged from 1.5 to 7.7% in rural areas and 4.0 to 10.5% in urban areas, suggesting a rapidly growing epidemic. The relatively high prevalence in rural areas is consistent with the higher level of mobility measured in rural areas, perhaps because of agricultural activity. Overall seroprevalence in the state implies that by August at least 31.5 million residents had been infected, nearly an order of magnitude larger than confirmed cases.

There are few published studies on the epidemiology of SARS-CoV-2 in low- and middle-income countries, which contain the vast majority of confirmed cases. India has the second-highest number of reported cases, but most seroprevalence estimates have come from urban centers.

Urban areas, because of higher population densities, are thought to be more vulnerable to COVID-19. However, rural areas received millions of migrant workers fleeing cities, and agriculture was an essential activity exempt from lockdown.

Selection of Primary Sampling Units: The broadest stratum for the survey is a "homogenous region" (HR), an area comprising neighboring districts in a state that have similar agro-climatic conditions, similar urbanization rates, and similar female literacy. Table 1 provides the districts in the 5 CPHS homogenous regions in Karnataka.

Within each HR, CPHS samples from 2 strata: urban and rural areas. Within urban areas, which are the towns from the 2011 Indian Census, CPHS samples from 4 strata defined by town size. Within each town stratum, CPHS selects at least one town via simple random selection.

Within the rural strata, CPHS picks a random subset of villages via simple random selection. In Karnataka, CPHS has picked 31-51 villages per HR and 3-4 towns per HR. There were some cases where we were not able to execute the survey in a census enumeration block (CEB) or village because of the lockdown restrictions and imposition of containment areas. Before going to that CEB in a town or that village, we replaced it in the sample. In the case of CEBs, we randomly selected another CEB from the CPHS sample in the same town. In the case of villages, we selected the village nearest to the inaccessible village.
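The two replacement rules described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the identifiers for CEBs and villages, the exclusion of units already in the study sample, and the use of planar coordinates with Euclidean distance to define "nearest" are all hypothetical and are not taken from the study protocol.

```python
import math
import random


def replace_ceb(inaccessible, town_cebs, already_sampled, rng):
    """Randomly pick another CEB from the CPHS sample in the same town.

    `town_cebs` is a hypothetical list of CPHS-sampled CEB ids in one town.
    Skipping CEBs already in the study sample is an assumption.
    """
    candidates = [
        ceb for ceb in town_cebs
        if ceb != inaccessible and ceb not in already_sampled
    ]
    if not candidates:
        return None  # no replacement available in this town
    return rng.choice(candidates)


def nearest_village(inaccessible, coords, excluded=()):
    """Return the village closest to the inaccessible one.

    `coords` maps village id -> (x, y) in arbitrary planar units (assumed).
    """
    x0, y0 = coords[inaccessible]
    best, best_dist = None, math.inf
    for village, (x, y) in coords.items():
        if village == inaccessible or village in excluded:
            continue
        dist = math.hypot(x - x0, y - y0)
        if dist < best_dist:
            best, best_dist = village, dist
    return best


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the draw is reproducible
    print(replace_ceb("ceb_a", ["ceb_a", "ceb_b", "ceb_c"], {"ceb_b"}, rng))
    print(nearest_village("v1", {"v1": (0, 0), "v2": (3, 4), "v3": (1, 1)}))
```

Seeding the random generator is a design choice made here so that a replacement draw can be audited later; the source does not say how the draws were made.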

Although there are 30 districts in Karnataka, our random selection of one-quarter of towns or villages from the CPHS sample of CEBs resulted in administrative units not being selected in 10 districts. Hence our study sample includes 20 districts.

We did not conduct power calculations when selecting our sample. First, our main constraint was how many households from the CPHS frame we were allowed to include in our serological study. The owner of the database, CMIE, was concerned that requesting biosamples might cause their panel households to refuse to participate in CPHS going forward. Weighing the value of this study against that risk, they were only willing to sacrifice roughly one-quarter of the sample.

medRxiv preprint doi: https://doi.org/10.1101/2020.11.02.20224782; this version posted November 4, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Second, we did not have strong priors on the fraction of the population that had previously been exposed. The fraction of the population that previously had a confirmed case was small when we started the study on June 15, 2020. That fraction was an implausible estimate of positive proportion on an ELISA test. A more realistic, higher number would require a larger sample size to estimate, given the standard sample size formula for binary outcomes.

However, we were able to calculate the minimum detectable effect (MDE) given the sample size we were allocated. This is summarized for different prior estimates of positive proportion in Figure 1. Our calculation of MDE comes from the usual sample size formula for a two-sided test and binary outcomes:

$$\mathrm{MDE} = \left(z_{\alpha/2} + z_{\beta}\right)\sqrt{\frac{p(1-p)}{n}}$$

where one seeks 95% confidence ($z_{\alpha/2} = 1.96$) and 80% power ($z_{\beta} = 0.84$), $p$ is a prior estimate of positive proportion, and $n$ is the sample size. We ultimately chose to estimate positive proportion in 10 strata, which implies our MDE would have been 8.08 percentage points if (a) our belief was that there was no difference between positive proportion in urban and rural areas and (b) our prior estimate for positive proportion in each stratum turned out to be the correct estimate for the state.
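As a rough numerical check of the MDE calculation, the sketch below assumes the usual one-sample form MDE = (z_{α/2} + z_β) · sqrt(p(1 − p)/n). The per-stratum sample size of 300 and the prior of 0.5 are illustrative assumptions, not values stated in the text; with them the formula gives about 8.08 percentage points, which is the figure quoted above.

```python
import math

Z_ALPHA_2 = 1.96  # two-sided 95% confidence, as stated in the text
Z_BETA = 0.84     # 80% power, as stated in the text


def minimum_detectable_effect(p, n):
    """MDE for a binary outcome with prior proportion p and sample size n."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    if n <= 0:
        raise ValueError("n must be positive")
    return (Z_ALPHA_2 + Z_BETA) * math.sqrt(p * (1.0 - p) / n)


if __name__ == "__main__":
    # Assumed inputs: 300 tests per stratum and several candidate priors.
    for prior in (0.1, 0.3, 0.5):
        mde = minimum_detectable_effect(prior, 300)
        print(f"prior={prior:.1f}  MDE={100 * mde:.2f} percentage points")
```

Because p(1 − p) peaks at p = 0.5, that prior gives the largest (most conservative) MDE for a given sample size.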

Figure 1 shows what the MDE would be if the study examined 10 strata (5 homogenous regions × urban or rural areas) or 2 strata (urban versus rural areas).

Because prevalence changes over time, we estimate prevalence in each stratum (homogenous region × urban status) within a 2-3 week window. Karnataka is the 6th largest Indian state by area (191,791 km²). The study team collected data in different parts of the state from June 15 to August 29, 2020, to complete sampling across the entire state. The median date on which we visited each stratum is depicted on the x-axis in Figures 4 and 5 in the results section.

At each of the 2912 households in the KSS sample, we first sought consent from anyone at home to complete a health survey that asked about demographics, comorbidities, travel and contact history, and COVID symptoms. Multiple individuals at each household were allowed to complete this survey.

At each of the households where at least 1 person consented to complete the health survey, we asked 1 person to consent to a 5 ml blood draw (via venipuncture using an EDTA vacutainer) and a nasopharyngeal swab. The blood was refrigerated until it was delivered to Xcyton lab. The swab was placed in viral transport medium (VTM) and refrigerated until delivered to Aster Labs. The VTM we used was the Covisafe™ kit (manufactured by Mapmygenome) for collecting oropharyngeal swabs, which makes it possible to transport the swabs at ambient temperatures if refrigeration fails.

Before visiting each household, we endeavored to call the household to increase the probability that we would arrive when they were at home. Despite this effort, we lost 580 households (20.0% of 2912 households in the KSS sample frame) because they were not home when we arrived (Figure 2).

When visiting a household, we sought consent to conduct our health survey. 425 households (14.6% of all 2912 households in the KSS frame or 18.2% of 2332 available households) declined consent to participate. In total, 1907 households had at least 1 person consent to complete a health survey (demographic and health questionnaire). This implies a response rate of 65.5% of the 2912 households in the KSS frame and 81.8% of households that were available. We note that these consent rates are comparable to other studies of COVID. 1

Across the households that consented to the health survey, there were 1363 persons who consented to provide both blood and swab. In 2 households, one person provided blood and another provided a swab. In 11 (22) households, the one person who consented to a biosample provided only blood (swab). In total, 1374 households (72.1% of the 1907 households where someone consented to a health survey) had a person that provided blood and 1385 households (72.6% of 1907) had a member that provided a swab.
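The response and consent rates above follow from simple ratios of the reported counts. The short sketch below recomputes them; the counts are taken from the text, while the variable names and the rounding to one decimal place are choices made for this illustration.

```python
def pct(part, whole):
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100.0 * part / whole, 1)


# Counts reported in the text.
FRAME = 2912       # households in the KSS sample frame
NOT_HOME = 580     # households not at home when visited
DECLINED = 425     # households that declined the health survey
BLOOD = 1374       # households with a blood sample
SWAB = 1385        # households with a swab sample

available = FRAME - NOT_HOME     # households that could be approached
surveyed = available - DECLINED  # households completing a health survey

rates = {
    "declined_of_frame": pct(DECLINED, FRAME),
    "declined_of_available": pct(DECLINED, available),
    "surveyed_of_frame": pct(surveyed, FRAME),
    "surveyed_of_available": pct(surveyed, available),
    "blood_of_surveyed": pct(BLOOD, surveyed),
    "swab_of_surveyed": pct(SWAB, surveyed),
}

if __name__ == "__main__":
    print(f"available={available}, surveyed={surveyed}")
    for name, value in rates.items():
        print(f"{name}: {value}%")
```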

We were unable to obtain lab results for all samples. Among 1374 blood samples, we were unable to obtain results for 170 samples (12.3% of blood samples). Of these, 12 were lost because labels were unreadable by the lab and 158 had inadequate blood drawn to extract serum. Among swab samples, we were unable to obtain lab results for 46 samples (3.3% of swab samples). Of these, 12 were due to bad labels and the rest were due to failure in the VTM. (See Table 2 for details on sample composition by consent and lab specimen availability, by location and date.)

Science and Technology Institute, India. This ELISA test is positive if the IgG score, defined as the ratio of the titer in a sample to that in a negative control, is greater than 1.5. This test has sensitivity of 84.7% (95% CI: 80.6-88.1%) and specificity of 100% (95% CI: 97.4-100%) 2.

Aster Labs conducted RT-PCR tests targeting the N gene using the ARGENE® SARS-COV-2 R-GENE® assay from Biomerieux SA 1. This test received Emergency Use Authorization from the FDA in May 2020 2. We code borderline results on this test as negative results. This test has sensitivity of 100% (95% CI: 87.7-100%) and specificity of 100% (95% CI: 98.1-100%) 3.

Our primary outcome is the proportion of positive results on ELISA tests for each of 10 homogenous region x urban/rural strata. Our secondary outcomes are:

• Adjusted proportion of positive results on RT-PCR tests for each of 10 homogenous region x urban/rural strata,
• Adjusted proportion of positive results on ELISA tests for the 5 homogenous regions,
• Adjusted proportion of positive results on ELISA tests for urban or rural areas,
• Adjusted seroprevalence based on the ELISA test accounting for the imperfect accuracy of those tests.

Finally, we estimate the adjusted seroprevalence using the Rogan-Gladen 4 correction for imperfect accuracy of tests after calculating adjusted proportions. We employ adjusted proportions to calculate the variance of adjusted seroprevalence and then employ normal approximations to estimate Wald confidence intervals for that prevalence. We employ Microsoft Excel 365 and Stata 16 to perform statistical analyses.

3 Details on methods at https://consumerpyramidsdx.cmie.com/kommon/bin/sr.php?kall=wkb.
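The Rogan-Gladen correction divides the observed positive proportion, shifted by the false-positive rate, by (sensitivity + specificity − 1). The sketch below is a minimal illustration, not the authors' Stata or Excel code. The Wald interval treats sensitivity and specificity as known constants and takes the standard error of the adjusted proportion as an input; both are simplifying assumptions, so the interval is not expected to reproduce the published one.

```python
def rogan_gladen(raw_proportion, sensitivity, specificity):
    """Correct an observed positive proportion for test inaccuracy."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0.0:
        raise ValueError("test must be better than chance")
    estimate = (raw_proportion + specificity - 1.0) / denom
    return min(1.0, max(0.0, estimate))  # keep the estimate inside [0, 1]


def wald_interval(raw_proportion, raw_se, sensitivity, specificity, z=1.96):
    """Wald CI for the corrected prevalence.

    Assumes sensitivity and specificity are fixed, so the standard error
    of the raw proportion is simply rescaled by the same denominator.
    """
    denom = sensitivity + specificity - 1.0
    estimate = rogan_gladen(raw_proportion, sensitivity, specificity)
    se = raw_se / denom
    return max(0.0, estimate - z * se), min(1.0, estimate + z * se)


if __name__ == "__main__":
    # Statewide adjusted proportion and ELISA accuracy as reported in the text.
    prevalence = rogan_gladen(0.396, 0.847, 1.0)
    print(f"corrected seroprevalence: {100 * prevalence:.2f}%")
    # raw_se=0.02 is a hypothetical value used only to show the call.
    print(wald_interval(0.396, 0.02, 0.847, 1.0))
```

With the statewide adjusted proportion of 39.6%, sensitivity of 84.7%, and specificity of 100%, the point estimate is 0.396 / 0.847, which is in line with the 46.7% seroprevalence reported in the abstract.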

Serology tests: The adjusted proportions of positive IgG tests ranged from 22.8% to 53.1% across rural areas and from 30.9% to 76.8% across urban areas (Figure 3, details in Table e1). Overall rural, urban, and statewide adjusted proportions were 37.4% (95% CI: 32.9-41.8%), 45.6% (95% CI: 38.1-53.1%), and 39.6% (95% CI: 35.7-43.4%), respectively. Mysore region had a higher adjusted proportion (50.1%, 95% CI: 44.7-55.4%, difference p<0.001). (Table e2).

The correlation between ELISA and RT-PCR tests is 0.04 (p=0.14).

received over 2 million returning workers) in May 2020 showed that a large share of workers arriving from parts of the country where the epidemic was raging tested positive on RT-PCR 5. A second contributing factor was that, while urban areas experienced severe lockdowns, rural areas experienced fewer restrictions on mobility (Figure 5) because agricultural activity was deemed an essential sector.

There are several policy implications of our findings. With nearly half the population in the state having been infected with COVID as of August 2020, stringent suppression policies across the general population will impose significant costs on those who have already been infected. In the short run, most individuals who have already been exposed are likely to be resistant to repeated infection. Until there is further evidence on how fast antibodies (such as those to the RBD of the spike protein we test for) decline over time and whether T-cell immunity provides protection even after antibodies decline, it is difficult to make inferences about long-term immunity. However, even in the short run, there is a strong case for adopting frequent testing in exchange for permitting productive economic activity in the state.

That being said, it should be acknowledged that the populations who were exposed in the first half of the epidemic might be significantly different from those who remain uninfected. For example, individuals who are elderly, have chronic conditions, or are otherwise at higher risk may have taken precautions to avoid infection. A total relaxation could lead to a spike in infections among such at-risk populations, leading to further spikes in severe cases or mortality that would create a large burden for the healthcare system. Therefore, as the government considers relaxing restrictions on economic activity, it is critical to continue efforts to promote mask wearing and hand washing and to communicate the significance of COVID complications to individuals who are at risk.

Our findings underscore the need for larger scale studies across India that can provide estimates of seroprevalence at smaller levels of granularity and also study what happens to antibodies and T-cell immunity over time.

anu@mapmygenome.in

This study was approved by the Government of India (the Prime Minister's Principal Scientific Advisor's Office) and the Government of Karnataka. The study protocols were also approved by IRB / IEC committees at three institutions:

• Karesa (ECR/308/Indt/KA/2018, Approval date June 3, 2020), for Anu Acharya (Mapmygenome),
• Duke University (Protocol 2020-0553),
• University of Chicago (IRB20-1484).

Because Mohanan (Duke) and Malani (University of Chicago) only received de-identified data, the research was determined to be exempt from IRB review at their institutions.

1. SARS-CoV-2 antibody prevalence in Brazil: results from two successive nationwide serological household surveys. The Lancet Global Health.

2. Comparative evaluation of SARS-CoV-2 IgG assays in India.

3. Rapid assessment of Biomerieux ARGENE® SARS-COV-2 R-GENE® real-time detection kit. 2020.

4. Estimating prevalence from the results of a screening test.

5. Prevalence of SARS-CoV-2 among workers returning to Bihar gives snapshot of COVID across India. medRxiv.

Appendix: Proportion positive is fraction of tests where the score, defined as IgG response in sample divided by IgG response in known negative sample, is greater than cutoff value. So, e.g., cutoff = 1.5 means test is positive if IgG response in sample is 50% or more greater than in the control sample. Weights are designed to make samples representative of an area (urban or rural) within a region. The weights account for lack of lab results. Bias-corrected bootstrap confidence intervals, based on 1000 replications per estimated proportion, are presented.
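The appendix describes three steps: classifying a test by its IgG score, computing a weighted proportion, and attaching a bias-corrected bootstrap interval. The sketch below illustrates them with the Python standard library. It is not the authors' code: the function names, the resampling of individual observations with their weights, and the handling of ties are assumptions made for this example.

```python
import random
from statistics import NormalDist


def is_positive(sample_igg, control_igg, cutoff=1.5):
    """Positive if the sample-to-control IgG ratio exceeds the cutoff."""
    return sample_igg / control_igg > cutoff


def weighted_proportion(positives, weights):
    """Weighted share of positive results (positives are 0/1 flags)."""
    total = sum(weights)
    return sum(w for y, w in zip(positives, weights) if y) / total


def bc_bootstrap_ci(positives, weights, reps=1000, level=0.95, seed=0):
    """Bias-corrected (BC) bootstrap interval for a weighted proportion."""
    rng = random.Random(seed)
    n = len(positives)
    estimate = weighted_proportion(positives, weights)
    boots = []
    for _ in range(reps):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        boots.append(weighted_proportion([positives[i] for i in idx],
                                         [weights[i] for i in idx]))
    boots.sort()
    # Share of replicates below the estimate; ties count as one half.
    below = sum(b < estimate for b in boots)
    equal = sum(b == estimate for b in boots)
    frac = (below + 0.5 * equal) / reps
    frac = min(max(frac, 0.5 / reps), 1.0 - 0.5 / reps)  # avoid infinite z0
    normal = NormalDist()
    z0 = normal.inv_cdf(frac)  # bias-correction constant
    alpha = 1.0 - level
    lo_p = normal.cdf(2.0 * z0 + normal.inv_cdf(alpha / 2.0))
    hi_p = normal.cdf(2.0 * z0 + normal.inv_cdf(1.0 - alpha / 2.0))
    lo = boots[min(reps - 1, int(lo_p * reps))]
    hi = boots[min(reps - 1, int(hi_p * reps))]
    return estimate, lo, hi


if __name__ == "__main__":
    # Hypothetical data: 40 positives out of 100 tests with equal weights.
    flags = [1] * 40 + [0] * 60
    print(bc_bootstrap_ci(flags, [1.0] * 100))
```

The default of 1000 replications mirrors the number stated in the appendix; the fixed seed is only there to make the illustration reproducible.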