key: cord-0811425-qgz4zftg authors: McCreesh, N.; Mohlamonyane, M.; Edwards, A.; Olivier, S.; Dikgale, K.; Dayi, N.; Gareta, D.; Wood, R.; Grant, A. D.; White, R. G.; Middelkoop, K. title: Improving estimates of social contact patterns for the airborne transmission of respiratory pathogens date: 2022-02-07 journal: nan DOI: 10.1101/2022.02.06.22270386 sha: 35f33045a87a914748960408b35c14ac4eb4b4ee doc_id: 811425 cord_uid: qgz4zftg Background. Data on social contact patterns are widely used to parameterise age-mixing matrices in mathematical models of infectious diseases designed to help understand transmission patterns or estimate intervention impacts. Despite this, little attention is given to how social contact data are collected and analysed, or how the types of contact most relevant for transmission may vary between different infections. In particular, the majority of studies focus on close contacts only - people spoken to face-to-face. This may be appropriate for infections spread primarily by droplet transmission, but it neglects the larger numbers of 'shared air' casual contacts who may be at risk from airborne transmission of pathogens such as Mycobacterium tuberculosis, measles, and SARS-CoV-2. Methods. We conducted social contact surveys in communities in two provinces of South Africa in 2019 (KwaZulu-Natal and Western Cape). In line with most studies, we collected data on people spoken to (close contacts). We also collected data on places visited and people present, allowing casual contact patterns to be estimated. Using these data, we estimated age mixing patterns relevant for i) droplet and ii) non-saturating airborne transmission. We also estimated a third category of pattern relevant for the transmission of iii) Mycobacterium tuberculosis (Mtb), an airborne infection where saturation of household contacts plays an important role in transmission dynamics. Results. Estimated contact patterns by age did not vary greatly between the three transmission routes/infections, in either setting. In both communities, relative to other adult age groups, overall contact intensities were lower in 50+ year olds when considering contact relevant for non-saturating airborne transmission or the transmission of Mycobacterium tuberculosis than when considering contact relevant for droplet transmission. Conclusions. Our findings provide some reassurance that the widespread use of close contact data to parameterise age-mixing matrices for transmission models of airborne infections may not be resulting in major inaccuracies. The contribution of older age groups to transmission may be over-estimated, however. There is a need for future social contact surveys to collect data on casual contacts, to investigate whether our findings can be generalised to a wider range of settings, and to improve model predictions for infections with substantial airborne transmission. Mathematical models of infectious disease transmission are widely used to help inform infectious disease policy, estimate the potential impact of interventions, and to provide insight into disease dynamics and natural history. Many models incorporate patterns of mixing between different sections of the population, most commonly between different age groups. Simulated mixing patterns can have a large effect on model dynamics (1) , and it is therefore important that realistic mixing patterns are simulated. Mixing patterns are frequently informed by social contact data: empirical data collected from respondents on the people that they had contact with over a set period of time (2) . The majority of social contact data collection has focused on 'close' contacts, with most studies using a definition of contacts that required a two way face-to-face conversation of at least three words, close proximity (e.g. within 2m), and/or physical contact (2) . It is plausible that these types of contact approximate reasonably well the types of contact that are relevant for the transmission of infections that are transmitted primarily through direct contact and droplets. For obligate, preferential, or opportunistic airborne infections such as measles, Mtb, and SAR-CoV-2, however, it is likely that this definition excludes many potentially effective contacts. This is because transmission of airborne infections can occur between anybody 'sharing air' in inadequately ventilated indoor spaces, regardless of whether conversation occurs, and over distances of more than 2m (3) . For airborne infections, estimates of 'casual contact' time may therefore be more appropriate, calculated as the time spent in indoor locations multiplied by the number of other people present. Tuberculosis also differs from the majority of respiratory infections in the long durations of time for which people are potentially infectiouswith an estimated 9-36 months between disease development and diagnosis/notification across 11 high burden countries (4) . This is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted February 7, 2022. ; https://doi.org/10.1101/2022.02.06.22270386 doi: medRxiv preprint means that transmission to repeated contacts can partially saturate (even allowing for reinfection), making the relationship between contact time and infection risk non-linear (5) . This effect is most pronounced for contact between household members (5) . Household membership and repeated contacts are rarely explicitly simulated in mathematical models, and therefore the effects of contact saturation need to be incorporated into the mixing matrices used to parameterise the models. In this paper, we describe methods for estimating age mixing patterns relevant for nonsaturating airborne transmission and Mtb, using a novel weighted approach to incorporate the effects of household contact saturation into our estimates for Mtb. We generate estimates of age mixing using data on close and casual contacts from two communities in South Africa, and compare the estimated mixing patterns with those typically used in mathematical modelling studiesgenerated using close contact numbers, and more suitable for the droplet transmission. Social contact data were collected in two study communities in South Africa: one in KwaZulu-Natal Province, and one in Western Cape Province. Both communities have high rates of unemployment, and high prevalences of HIV and incidences of tuberculosis compared to the provinces as a whole. The study community in KwaZulu-Natal consisted of a population of 46,000, living in the predominantly rural and peri-urban areas in the catchment areas of two primary care clinics, and within a demographic surveillance area (DSA). The study community in Cape Town was a peri-urban community of 27,000 people, and was an established research site with biennial censuses. . CC-BY 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The KwaZulu-Natal data were collected between March and December 2019. 3093 adults (aged 18+ years) were sampled at random, stratified by residential area (~350 households per area) and with probability proportional to the number of eligible people in each area, based on the most recent DSA census conducted prior to area entry. Up to three attempts were made to contact sampled individuals. The Western Cape data were collected in May to October 2019. In total, 1530 adults (aged 15+ years) were selected using age-and sex-stratified random sampling, based on a census conducted in the study population in February and March 2019. Up to five attempts were made to contact selected individuals. Each attempt was made on different days of the week (including at weekends), at different times of day to maximise the chance of making contact. For both surveys, interviews were conducted face-to-face at the respondents' homes, using interview administered questionnaires on tablet computers. Interviews were conducted in isiZulu in KwaZulu Natal, and in English or isiXhosa in Western Cape. Respondents were asked about their movements on a randomly assigned day in the preceding week in KwaZulu-Natal, and on the day before the interview in Western Cape. To allow casual contact time (defined as time spent 'sharing air' indoors or on transport) to be estimated, respondents were asked to list the places they had visited (including their own home) and transport they had used. For each location, questions asked included: • What type of location was it? • How long did you spend there? • How many people were there, halfway through the time you were there? Respondents were also asked about their close contacts, defined as people that the respondent had a face-to-face conversation with. Respondents were first asked to make a list of all their . CC-BY 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted February 7, 2022. ; https://doi.org/10.1101/2022.02.06.22270386 doi: medRxiv preprint 6 contacts, with help from the interviewer. Respondents were then asked questions about a random 10 contacts, or all of their contacts if they reported fewer than 10. Questions included: • Is this contact a member of your household? • How old do you think they are? • How much time did you spend with them in total? Respondents' basic demographic information were also collected. For the KwaZulu-Natal community, data on household size and residency (urban, peri-urban, rural) were obtained from the most recent DSA census. All other data were collected directly from the respondents. Ethical approval for the data collection in KwaZulu-Natal was granted by the Biomedical Close contact numbers and times were estimated using data on people with whom the respondents reported having a face-to-face conversation. 95% plausible intervals for the age mixing matrices were generated using bootstrapping. Household and non-household age- is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted February 7, 2022. ; https://doi.org/10.1101/2022.02.06.22270386 doi: medRxiv preprint 7 mixing matrices were generated from sampled contacts who were and were not reported to be members of the respondents' own households respectively. Casual contact time in a location was estimated as the duration of time the respondent reported spending there, multiplied by the reported number of people present. Central estimates for casual contact time age-mixing matrices were generated using the method outlined in McCreesh et al (6) . In brief, as data were collected on numbers of total people and children present in indoor locations only, and not the ages of adults, the age distribution of adult casual contacts needed to be estimated. We therefore assumed that the age distribution of adult contacts in each location type matched the age distribution of respondents who reported visiting location of that time, weighted by the duration of time they reported spending in that location time, and weighted to the sampled population age and sex distribution. Household and non-household contact matrices were generated using data on contact time in respondents' own homes and all other locations respectively. Plausible ranges were generated using bootstrapping. The age mixing matrices were adjusted to be symmetrical, using the study community age structures. Data on adult contact numbers and time with children were used to estimate child contact numbers and time with adults. To allow comparison between the two study communities, the lowest respondent age group was set at 15-19 years for both surveys. As 15-17 year olds were not interviewed in KwaZulu-Natal, we assumed that contact patterns in 18-19 years olds were representative of contact patterns in all 15-19 year olds. Further details of the analysis methods are given in the appendix. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted February 7, 2022. ; https://doi.org/10.1101/2022.02.06.22270386 doi: medRxiv preprint Generating age-mixing matrices for droplet and non-saturating airborne transmission, and Mycobacteria tuberculosis Figure 1 summarises the data used to generate the age mixing matrices for droplet and nonsaturating airborne transmission, and Mtb. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted February 7, 2022. ; https://doi.org/10.1101/2022.02.06.22270386 doi: medRxiv preprint 9 contact time matrices. Close contact time was used for household estimates as 1) it was considered plausible that the majority of contact between household members is likely to meet the definition of 'close contact', and 2) it allowed the age structures of households to be more accurately reflected in the age-mixing matrices. Age-mixing matrices relevant for Mtb were set equal to the sum of the household close contact number matrices and the non-household casual contact time matrices, weighted to reflect empirical estimates of the proportion of tuberculosis that results from household transmission. For each pair of bootstrapped household and non-household matrices, a proportion of 'contact' that should occur in households was sampled from a uniform distribution between 8-16% (with 12% used for the central estimate) (5) . A weighted average was then generated, given the desired proportion of overall 'contact' by adults occurring in households. To allow direct comparisons to be made between the different age mixing matrices, the matrices for non-saturating airborne transmission and Mtb were adjusted to give the same mean contact intensity between adults as the matrices for droplet transmission. Further details of the analysis methods are given in the appendix. Of the 3093 people sampled in KwaZulu-Natal, 1723 (56%) were successfully contacted, 299 (10%) were dead or reported to have out-migrated, and 1071 (35%) could not be contacted. Of those successfully contacted, 1704 (99%) completed an interview. Of the 1530 people sampled in Western Cape, 1214 (93%) were successfully contacted, 117 (8%) had moved or died, 193 (13%) had had incorrect information listed in the census, and 6 . CC-BY 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted February 7, 2022. ; https://doi.org/10.1101/2022.02.06.22270386 doi: medRxiv preprint were uncontactable. Of the 1214 people contacted, 77 (6%) refused to be interviewed and 14 were ineligible (due to disability or lack of English and isiXhosa). Of 1123 people interviewed, technical issues meant that data from 8 interviews were lost, leaving 1115 (92%) completed interviews. Table 1 shows the characteristics of the respondents and the target population. For both populations, the recruited sample was a reasonable match to the target population in terms of sex, age, and residence type (urban, peri-urban, or rural). Unemployment was high in both populations, with only 23% of respondents in KwaZulu-Natal and 55% in Western Cape reporting full-time, part-time, or casual employment. Household sizes were large in KwaZulu-Natal, with 48% living in a household of eight or more, and smaller in Western Cape, with 79% living in a household of four or fewer. significantly higher for women than for men in both communities, however the differences were generally not large. There was a tendency for casual contact time to decrease slightly with age in both communities, and close contact numbers and time were substantially higher in 15-19 year olds than in older age groups in Western Cape. Close contact numbers and time, and casual contact time, increased with increasing household size in both settings, driven by increases in contact with household members. Differences by household size were larger in Western Cape than in KwaZulu-Natal, most likely reflecting the fact that household sizes were self-reported in Western Cape, but obtained through linking to community census data in KwaZulu-Natal. Contact between household members made up a higher proportion of total contact in KwaZulu-Natal than in Western Cape, for all types of contact. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint Estimated contact patterns by age did not vary greatly between the three transmission routes/infections, in either setting. Age mixing patterns were less assortative, however, in the non-saturating airborne and Mtb matrices compared to the droplet matrices, in both settings. The exception to this was contact between 15-19 year olds in KwaZulu-Natal, which was more intense in the non-saturating airborne and Mtb matrices than the droplet matrices. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint Contact numbers between child 'respondents' and contacts in each age group were estimated from data on contact between adult respondents and child contacts. Ranges shown are bootstrapped 95% plausible ranges. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted February 7, 2022. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted February 7, 2022. ; https://doi.org/10.1101/2022.02.06.22270386 doi: medRxiv preprint 15 intensities for droplet infections respectively. Numbers shown in graph a are the mean number of contacts respondents in each age group have with contacts in each age group per day. Numbers shown in graph b are the rate of contact between each individual in the population per day, expressed as rates ×10 5 . 'Numbers' and 'rates' in graphs c, d, f, and g are standardised so that the mean overall contact intensity by reported by adult respondents is the same as the mean number of overall close contacts reported by adult respondents (graph a). Contact numbers between child 'respondents' and contacts in each age group were estimated from data on contact between adult respondents and child contacts. Ranges shown are bootstrapped 95% plausible ranges. The majority of data used to generate age-mixing matrices used in transmission models is close contact datadata on contacts involving a face-to-face conversation. For infections where airborne transmission is common, close contact data is likely to miss many potential effective contacts. Using data from two provinces in South Africa, we created improved estimates of age-mixing patterns for airborne infections, using a wider 'casual contact' definition that incorporated anybody 'sharing space' indoors. We also demonstrated a novel method for generating age-mixing matrices relevant for the transmission of Mtb, an airborne infection where long disease durations and the saturation of household contacts play an important role in transmission dynamics. Finally, we estimated age-mixing patterns from close contact data using the most commonly used method, which we have labelled as relevant for droplet transmission. In our settings, contact patterns did not vary greatly between contacts relevant for droplet infections and those relevant for non-saturating airborne transmission or Mtb. Using close contact data in models of the transmission of Mtb or other . CC-BY 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted February 7, 2022. ; https://doi.org/10.1101/2022.02.06.22270386 doi: medRxiv preprint airborne infections in our study communities, however, may mean that the importance of adults aged 50+ to transmission is overestimated. Very few data are available on casual contact patterns. Previous studies in the same community in Western Cape have found greater drops in casual contact time than in close contact numbers in older age groups (6) , and decreases in indoor casual contact numbers with age (7) . Another study in the same community found high levels of age-assortative mixing with respect to casual contact time in schools and workplaces (8) . More data are needed on casual contact patterns, and age-mixing patterns in particular, in order to determine whether the findings of this study are generalisable to other settings, and to improve the predictions from mathematical models of the transmission of Mtb and other airborne infections. Our approaches to generating the separate droplet and airborne transmission matrices are necessarily simplifications, and many infections will not fit neatly into these two categories. There is considerable uncertainty about the role of different transmission routes to the spread of many infections, and both airborne and droplet transmission are thought to play a role in the transmission of some respiratory infections, including SARS-CoV-2(9). For these infections, an intermediate matrix may be preferable. There are two main differences between our droplet and airborne/Mtb age-mixing matrices. The first is the type of non-household contacts considered: close (face-to-face conversation) or casual (sharing space indoors) respectively. The second is that the airborne and (nonhousehold component of the) Mtb matrices are based on contact time, rather than unique contact numbers. The primary reason for using contact time for casual contacts is that respondents are unlikely to be able to estimate unique casual contact numbers for many locations they visit, necessitating the use of contact time, or assumptions about the rate of turnover of unique people in a location. For our droplet transmission matrices, we chose to . CC-BY 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted February 7, 2022. ; https://doi.org/10.1101/2022.02.06.22270386 doi: medRxiv preprint use unique contact numbers in a 24-hour period, as that is the most commonly used method, and therefore allows comparisons to be made with what is typically done. It should be noted, however, that both the choice of a 24-hour time period, and the lack of any weighting or restrictions by contact duration or other measures of closeness, are relatively arbitrary choices. Robust evidence as to the types of contact most relevant to transmission are limited for respiratory infections. A number of studies have compared the fit to data on varicella, parvovirus B19, or influenza A seroprevalence by age of models parametrised using contact patterns generated from close contact data in a range of different ways (10) (11) (12) . Overall, these studies suggest that analysis methods that give greater weight to more intimate contacts may be preferable in some circumstances. This could be achieved by restricting what counts as a contact to those involving physical touch and/or a minimum contact duration, or by using contact time rather than contact numbers. It has been suggested that approaches based on contact numbers may be more suitable for more highly transmissible infections, where only a short duration of contact is needed for transmission, whereas approaches based on contact time may be more suitable for less transmissible infections, where repeated or longer contacts are needed (13) . Fewer studies have considered expanding the pool of contacts beyond close contacts only, to also include casual contacts. One study however, that had paired individual-level contact data and pandemic influenza A serological data, found that models that included a variable for number of locations visited were strongly supported over those that only included variables for age and close contact numbers (14) . This suggests that airborne transmission may play a role in the spread of influenza A, and/or that the standard close contact definition misses a significant proportion of contacts at risk of droplet transmission. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint Other factors may also influence airborne and Mtb transmission risk, which are not accounted for in the analyses. Ventilation rates play a large role in determining airborne infection risk (15) , and weighting by ventilation rates would improve our airborne and Mtb matrices. Unfortunately, few data on ventilation rates by location type are available, and they show large amounts of variation between locations, and between the same location on different days (16) . Saturation of contacts may occur for infections other than Mtb, particularly highly transmissible pathogens such as measles virus. An approach based on casual contact numbers may be preferable for these infections, but would be highly dependent on assumptions made about how unique contact numbers are related to estimates of cross-sectional numbers of people present. There are a number of limitations when using casual contact data to estimate mixing patterns. Firstly, estimates of contact time in places where large numbers of people are present are likely to be less reliable. This is because people's estimates of the number of people present are likely to be poor, and because the assumption that there is a risk of transmission between all people present in the space may not be true in larger spaces. In our main analysis, when estimating contact time, we cap the number of people at risk of transmission at 100. In our sensitivity analyses, we show that using a cap of 20 people, or not capping the numbers of people, has a moderate impact on casual contact time age mixing matrices (see Appendix). Conducting similar sensitivity analyses may be necessary when using age mixing matrices calculated using casual contact time in mathematical models A second limitation is that the approach we use to determining the ages of adults present in locations other than respondents' own homes is indirect, and relies on the assumption that the age distribution of adults present in a location type reflects the duration of time respondents of different ages reported spending in that location type. This may not always be a reasonable assumption, if different age groups tend to visit different locations of the same type (or at . CC-BY 4.0 International license It is made available under a perpetuity. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted February 7, 2022. ; https://doi.org/10.1101/2022.02.06.22270386 doi: medRxiv preprint different times), or substantial mixing occurs with people from outside the study setting. These issues are discussed further in McCreesh et al 2019 (6) . An additional limitation of our estimates for KwaZulu-Natal only is that we did not recruit 15-17 years olds, and instead assumed in the analysis that contact by 18-19 year olds was representative of contact by all [15] [16] [17] [18] [19] year olds. This is unlikely to be true, with contacts by 15-17 and 18-19 year olds differing greatly in Western Cape (see Appendix, Figure S7 ). For this reason, our estimates for [15] [16] [17] [18] [19] year olds should be treated with caution for KwaZulu-Natal. To conclude, overall our estimated age-mixing matrices for droplet transmission, nonsaturating airborne transmission, and Mtb were not substantially different from each other for either setting. This provides some reassurance that the widespread use of close contact data to parameterise age-mixing matrices for transmission models of airborne infections may not be resulting in major inaccuracies. Some differences were observed however, particularly in the oldest age group, and our data were from two South African settings only. We recommend that future social contact surveys should collect data on casual contacts as well as close contacts, to determine whether the similarity between different types of contact pattern is true across other settings. We would also urge mathematical modellers to consider whether unique close contact numbers in a 24-hour period are the most appropriate contacts for the infection and scenario they are simulating, and to consider performing sensitivity analyses when there is uncertainty as to the most appropriate contact definition. is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprint The copyright holder for this this version posted February 7, 2022. ; https://doi.org/10.1101/2022.02.06.22270386 doi: medRxiv preprint Projecting social contact matrices in 152 countries using contact surveys and demographic data A systematic review of social contact surveys to inform transmission models of close contact infections. bioRxiv Community-based outbreaks of tuberculosis. Archives of Internal Medicine Estimated durations of asymptomatic, symptomatic, and care-seeking phases of tuberculosis disease An explanation for the low proportion of tuberculosis that results from transmission between household and known social contacts Estimating age-mixing patterns relevant for the transmission of airborne infections Indoor social networks in a South African township: potential contribution of location to tuberculosis transmission Integrating social contact and environmental data in evaluating tuberculosis transmission in a South African township Ten scientific reasons in support of airborne transmission of SARS-CoV-2. The lancet Using empirical social contact data to model person to person infectious disease transmission: an illustration for varicella What types of contacts are important for the spread of infections? Using contact survey data to explore European mixing patterns The Contribution of Social Behaviour to the Transmission of Influenza A in a Human Population Little Italy: An Agent-Based Approach to the Estimation of Contact Patterns-Fitting Predicted Matrices to Serological Data Social contacts and the locations in which they occur as risk factors for influenza infection Role of ventilation in airborne transmission of infectious agents in the built environment-a multidisciplinary systematic review. Indoor air Measuring ventilation and modelling M. tuberculosis transmission in indoor congregate settings, rural KwaZulu-Natal is the author/funder, who has granted medRxiv a license to display the preprint in (which was not certified by peer review) preprintThe copyright holder for this this version posted February 7, 2022. ; https://doi.org/10.1101/2022.02.06.22270386 doi: medRxiv preprint