key: cord-0330146-vq8f62yh
authors: Franco, N.; Coletti, P.; Willem, L.; Angeli, L.; Lajot, A.; Steven, A.; Beutels, P.; Faes, C.; Hens, N.
title: Inferring age-specific differences in susceptibility to and infectiousness upon SARS-CoV-2 infection based on Belgian social contact data
date: 2021-10-14
journal: nan
DOI: 10.1101/2021.10.10.21264753
sha: 6139e80cbcbd5330520ca2e961935aad854cd0c3
doc_id: 330146
cord_uid: vq8f62yh

Several important aspects related to SARS-CoV-2 transmission are not well known due to a lack of appropriate data. However, mathematical and computational tools can be used to extract part of this information from the available data, such as hidden age-related characteristics. In this paper, we investigate age-specific differences in susceptibility to and infectiousness upon contracting SARS-CoV-2 infection. More specifically, we use panel-based social contact data from diary-based surveys conducted in Belgium combined with the next generation principle to infer the relative incidence, and we compare this to real-life incidence data. Comparing these two allows for the estimation of age-specific transmission parameters. Our analysis implies the susceptibility in children to be around half of the susceptibility in adults, and even lower for very young children (preschoolers). However, the probability of adults and the elderly contracting the infection decreases throughout the vaccination campaign, thereby modifying the picture over time.

Since the start of the pandemic of COVID-19, a new respiratory disease caused by the SARS-CoV-2 coronavirus, many mathematical and statistical approaches have been considered to identify transmission dynamics and characteristics of the virus. Some of those characteristics are still not completely known due to the lack of appropriate data. However, these characteristics are necessary in order to correctly inform public health policies as well as to develop more advanced scientific tools like mathematical and computational models. Concerning COVID-19, as for most infectious diseases, it quickly became apparent that some of the disease characteristics are strongly age-dependent [1]. In particular, the susceptibility to SARS-CoV-2 infection as well as the infectiousness upon infection may be lower for children than for adults and the elderly. Knowledge of such a difference could have an important impact on public health strategies in terms of prioritization of vaccination or the choice of targeted non-pharmaceutical interventions.

We propose a method to estimate those heterogeneous differences concerning SARS-CoV-2 transmission, focusing on relative susceptibility and infectiousness. Our approach is based on the method used by [2] to estimate susceptibility profiles for influenza A/H1N1. However, we have refined it to include a larger number of age categories and applied the methodology to SARS-CoV-2 transmission using a numerical approach. The method relies on the analysis of social contact data in order to derive an estimate of the relative incidence which is compared to real incidence data.

Social contact surveys [3] coupled with the next generation principle [4, 5] have been used for years to estimate key epidemiological parameters such as the basic (and effective) reproduction number (i.e., the average number of new infections caused by a typical infected individual during their entire infectious period in a (fully) susceptible population), relative incidence or differences in susceptibility [2]. The first large-scale social contact study, POLYMOD [6], collected social contact patterns for eight European countries between May 2005 and September 2006. In 2020-2021, social contact data have been collected in the so-called CoMix survey [7-10], initially in the United Kingdom, the Netherlands and Belgium and afterwards extended to other European countries. CoMix collected timely social contact information during the COVID-19 pandemic.

Social contact data can be used as a proxy to model SARS-CoV-2 transmission using the so-called social contact hypothesis [11], which implies that the age-specific number of infectious contacts is proportional to the self-reported age-specific number of social contacts by a proportionality factor. This proportionality factor, often denoted by q, assumes that the probability of transmission is homogeneous across the different age classes. In the current paper, we aim to disentangle and quantify the heterogeneous components of this proportionality factor further [12], elucidating information on relative age-specific susceptibility and infectiousness. These estimates could serve to inform heterogeneous COVID-19 mathematical models relying on social contact data, such as mechanistic models [13-16]. Social contact data are also used in [17, 18] to derive heterogeneous contributions to SARS-CoV-2 transmission using an approach based on the reproduction number. We go one step further using an approach based on the relative incidence derived from the next generation principle.

More specifically, we use the CoMix social contact data combined with daily incidence data on the number of new confirmed COVID-19 cases in Belgium over the period December 2020 to May 2021 to estimate the proportionality factor and its heterogeneous unmeasured components. We disentangle potential sources of heterogeneity in the acquisition of SARS-CoV-2 infection especially focusing on the comparison between children (infant, primary and secondary school) and adults. We also estimate the time evolution of the transmission parameters for different adult age classes throughout the vaccination campaign as carried out in Belgium showing an evolution of the proportionality factors over time. Then we present an illustration of the utility of heterogeneous proportionality factors by comparing the reproduction number estimated from the CoMix social contact data to the ones estimated from incidence of cases and hospitalisations, respectively.

In the remainder of this paper, we will talk about susceptibility or infectiousness when considering immunological aspects of disease transmission, while we will add the prefix q whenever quantities can carry additional effects related to susceptible and infectious individuals in order to avoid any ambiguity.

If we denote by $w = (w_j)$ a vector representing the relative incidence within age class $j$ and by $M^T$ a matrix containing the social contact data, then we have the following system:

$$\tilde{q}\,\mathrm{diag}(a_i)\,M^T\,\mathrm{diag}(h_j)\,w = R_t\,w,$$

where $R_t$ represents the reproduction number. The core matrix of this system, $K = \tilde{q}\,\mathrm{diag}(a_i)\,M^T\,\mathrm{diag}(h_j)$, is called the next generation matrix and gives the number of new infections in a successive generation. Details concerning the construction of this matrix and the next generation principle can be found in Section Materials and methods.

Using our method, we compare the social contact matrix $M^T$ extracted from the CoMix social contact survey in Belgium [7] and its derived relative incidence $w$ to the incidence obtained from real-life data in Belgium coming from PCR positive tests [19]. This method allows for an estimation of either the relative q-susceptibility $(a_i)$ or relative q-infectiousness $(h_j)$ by age class, while assuming that the other set of parameters is known from the literature (i.e., holding one of the two vectors fixed). The chosen age groups are [0, 6) years, [6, 12) years, [12, 18) years, [18, 30) years and subsequent 10-year age classes up to 80+ years in order to account for the Belgian educational system. The period of observation goes from 22 December 2020 to 26 May 2021 (and to 15 June 2021 for confirmed cases data). A detailed description of considered data, literature assumptions and fitting procedure is presented in Section Materials and methods.

The estimated relative q-susceptibility for the whole period is presented in Fig 1 and implies that very young children in age group [0, 6) are about 0.182 (95% percentile bootstrap-based CI: 0.146-0.230) times as susceptible compared to the first adult age class [18, 30) with relative susceptibility equal to 1 (95% CI: 0.829-1.252). Primary school students aged [6, 12) have a relative susceptibility of 0.550 (95% CI: 0.427-0.629) and secondary school students aged [12, 18) a susceptibility of 0.603 (95% CI: 0.536-0.700). This shows an increasing q-susceptibility by increasing age up to [18, 30), after which the relative q-susceptibility tends to decrease slightly. Note, however, that this q-susceptibility captures more than differences in clinical susceptibility to infection. The rather low relative q-susceptibility in the [12, 18) age class could therefore be influenced by (compliance to) non-pharmaceutical interventions, over and above the age-specific contact frequencies.

Due to the method, these numbers only have a relative interpretation and the proposed representation in Fig 1 is under the assumption of a mean susceptibility of 1 for the first adult class [18, 30).

The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC 4.0 International license.

The comparison of the relative incidence as estimated based on the positive PCR test data and the CoMix social contact data is presented in Fig 2. The social contact data are presented by waves starting with wave 12 on 22 December 2020 and an inter-survey wave interval of two weeks for subsequent waves (cf. details in Table A of S1 Appendix). The nationally collected data are represented in blue and the estimates coming from social contact data in two colors: in green, the initial estimate with a homogeneous proportionality factor (i.e., with a i = 1 and h j = 1 for all i, j) and in red, the estimate using heterogeneous q-susceptibility and infectiousness as presented in Fig 1. We clearly observe that estimates of the relative incidence under the homogeneous proportionality factor assumption (green) are very different from the empirical estimates (blue), especially for the young age groups. The relative incidence among adult age classes is estimated relatively well up to a constant, but the relative incidence for children coming from the homogeneous social contacts approach is clearly overestimated, except perhaps during times of school closure (see, e.g., wave 19). This finding provides a clear indication that SARS-CoV-2 transmission is less important in children as compared to adults.

The result of estimating the q-infectiousness for the whole period is depicted in Fig 3. The estimates also show a potentially important heterogeneity concerning the proportionality factor on the infectiousness side. However, this reverse exercise provides less accurate results, with very large confidence intervals and some bootstrap estimates reaching zero, both being problematic when dealing with relative values. Those effects are the result of a lack of constraints for q-infectiousness. Indeed, while it is impossible to reach zero susceptibility for a specific age class when having at the same time non-zero incidence, it is technically allowed for age-specific infectiousness to be zero as the observed incidence could result from transmission from other age classes.

Exact estimates of the components of the q-susceptibility (a i ) and q-infectiousness (h i ) are provided in Tables D and H of S1 Appendix. Additional estimates under the assumption of homogeneity regarding infectiousness or susceptibility (i.e., estimating (a i ) under h j = 1, ∀j or estimating (h j ) under a i = 1, ∀i) are also presented in Figs B and J of S1 Appendix together with estimated values and the effect on the relative incidence. These additional estimates provide qualitatively similar results.

Since proportionality factors capture several effects, they also capture time-dependent effects such as the reduction in susceptibility and infectiousness as a result of the vaccination campaign. In order to account for such a time evolution, we also performed the previous analysis using groups of two consecutive CoMix waves instead of the full period. The decision to consider two CoMix waves (28 days) together is motivated by the fact that a sufficiently long non-holiday period is required, as social contacts in children are of importance and the heterogeneity of the transmission concerning adult classes is partially constrained by infections reported in children. Note that the gradual introduction of the alpha variant of concern might also interfere. Estimates of the time-dependent q-susceptibility relying on the same (time-invariant) assumption with regard to the infectiousness vector are presented in Fig 4. A normalisation of the relative values was performed over the different waves such that the average of the estimated factors for the age classes [0, 6), [6, 12) and [12, 18) is assumed constant. This choice is motivated by the fact that the vaccination campaign did not include children during the entire study period, hence proportionality factors regarding susceptibility can be expected to be stable for these age classes. Thus, the results provide an estimate of the evolution of adults' proportionality factors under the assumption of an on average constant factor for children [0, 18). As indicated previously concerning the estimation during the complete time period, a decreasing q-susceptibility is observed through adult age classes (see Fig 1), with the oldest age class (80+) being the least susceptible among all adults aged 18 years or older. This is a priori in contrast with usual assumptions regarding age-specific susceptibility to SARS-CoV-2 infection.
However, in Fig 4, we clearly observe the highest relative q-susceptibility in the 80+ age class for the earliest waves as compared to all other age groups, or at least an equal q-susceptibility by considering the lower side of the confidence interval. Moreover, the q-susceptibility in the oldest age class decreases rapidly over time towards the lowest relative q-susceptibility equal to 0.446 (95% CI: 0.266-0.660) among the adult age groups. In general, the estimated q-susceptibility is almost similar across the different adult classes during the first period with the exception of the oldest class 80+ with an estimated relative q-susceptibility of 1.844 (95% CI: 0.920-3.127). Overall, q-susceptibility estimates of other age classes tend to decrease over time, albeit at a slower pace and to a lesser extent. This is in line with the implementation of the vaccination policy in Belgium, giving vaccination priority to residents of nursing homes (CoMix waves 13-16, see schematic timeline in Fig A of S1 Appendix) and the elderly in the general population (CoMix waves 18-21), while going gradually down from old to young throughout the study period.
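The normalisation across groups of waves described above can be illustrated with a short Python sketch. It is not the authors' code: the function name, the toy numbers and the choice of the first period as the reference level for the children's average are all assumptions made for illustration.

```python
def normalise_periods(estimates, child_idx=(0, 1, 2)):
    """Rescale each period's relative factors so that the average over the
    child age classes is the same in every period.

    Assumption: the first period provides the reference level.
    """
    def child_mean(vec):
        return sum(vec[i] for i in child_idx) / len(child_idx)

    target = child_mean(estimates[0])
    return [[x * target / child_mean(vec) for x in vec] for vec in estimates]


# Hypothetical relative factors for two periods (3 child classes, 2 adult classes).
raw = [
    [0.2, 0.5, 0.5, 1.0, 1.8],
    [0.4, 1.0, 1.0, 1.6, 1.2],
]
normalised = normalise_periods(raw)
```

Because each vector is only defined up to a multiplicative constant, this rescaling changes no within-period ratio; it only makes the adult factors comparable between periods.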

Exact values of the estimates in Fig 4 are provided in Table E of S1 Appendix as well as time-dependent q-susceptibility and q-infectiousness under the various constraints mentioned above.

In order to check the utility and validity of the use of a heterogeneous proportionality factor, we illustrate its application by determining the reproduction number R t , or more specifically the variation of the reproduction number over time, and comparing this evolution with R t directly estimated from confirmed cases/hospitalizations data.

In Fig 5, the variation of the reproduction number computed from the CoMix data is compared to the reproduction number computed either from the number of cases [20] (panel a) or from hospitalizations [19] (panel b). Clearly, specific choices of q-susceptibility and q-infectiousness affect the computation and in Fig 5 we report results for the homogeneous and heterogeneous scenarios of Fig 2. A homogeneous assumption for q-infectiousness and q-susceptibility leads to a poor agreement with the reproduction number estimated from both confirmed cases and hospitalizations and is also characterized by a larger uncertainty, while the use of the estimated heterogeneous proportionality factor agrees more with reality.

Fig 5. In green: estimated Rt under the assumption of a homogeneous proportionality factor. In red: estimated Rt with the estimated age-varying q-susceptibility under the assumption on age-varying infectiousness as in Figs 1 and 2. Dots represent means and bars represent 95% percentile (nonparametric) bootstrap-based confidence intervals.

We have demonstrated in this paper that social contact data can be used to inform transmission parameters and to estimate age-specific characteristics of SARS-CoV-2 transmission. More specifically, the next generation approach enables us to disentangle age-specific differences in transmission rates while relying on temporal changes in social contact behaviour measured using consecutive waves of a social contact panel study. Clearly, SARS-CoV-2 transmission is partly influenced by age-specific differences in contact behaviour, but importantly, additional age-specific factors related to susceptibility and infectiousness, in a broad sense, are quintessential to account for. We have shown that such factors imply a smaller susceptibility for children as compared to adults, with the estimated susceptibility in children being around half of the susceptibility in adults, and even less for very young children (Fig 1) . This result is in accordance with a very recent result obtained using CoMix social contact data in England but using a calibration on the reproduction number instead of the next generation approach [17] . With respect to that, we assessed the impact of assuming homogeneous transmission parameters on the reproduction number, showing how (age-)heterogeneous parameters are necessary to correctly align the reproduction number from the CoMix data and the reproduction number estimated from infections or hospitalizations. Moreover, our method is able to estimate time-varying transmission parameters and it shows a gradual decrease in susceptibility of adults in line with the progression of the Belgian vaccination campaign (Fig 4) . This decrease implies a progressive change in the dynamics of the epidemic with largely unvaccinated childhood age groups gradually becoming more important drivers of SARS-CoV-2 transmission than predominantly vaccinated adult age groups.

However, our method suffers from several limitations. A potential bias which needs to be acknowledged is the use of PCR data, which correspond to the observed relative incidence and do not necessarily correspond to the true relative incidence, as each age class is not necessarily tested in the same way, even if we discard periods of strong variation in testing policy. Indeed, even in the absence of a change in testing policy, age-specific differences in symptomatology, disease severity and the probability of developing symptoms upon infection lead to different shares of symptomatic and asymptomatic cases being detected. Although we can infer q-susceptibility and q-infectiousness from the observed PCR test data, we cannot further disentangle both components by estimating the aforementioned quantities simultaneously. Comparing the two separate approaches, the estimation of the relative q-susceptibility seems most informative, since proportionality factors are better constrained by the data. More specifically, the estimated q-susceptibility was identifiable when fitting to real incidence data, while q-infectiousness estimates reached zero for certain age classes, which seems an artefact of the methodology. Another limitation of the proposed method is that a further decomposition of q-susceptibility (or q-infectiousness) into clinical susceptibility (infectiousness) and other external factors relevant for transmission between susceptible and infectious persons is difficult, at least without availability of relevant additional data thereon. Nonetheless, an assessment and quantification of the (relative) q-susceptibility, q-infectiousness and the corresponding relative incidence provides useful insights into heterogeneous SARS-CoV-2 transmission dynamics.

Our study is based on Belgian social contact data collected within the CoMix survey [7, 8] during the epidemic period between December 2020 and May 2021. These data are processed by means of the online Socrates tool [8, 21, 22]. Participants were asked to fill in a contact diary including all contacts made during a specific day, reporting the type of contact, location, and age of the contacted person, with a contact defined as an in-person conversation of at least a few words, or a skin-to-skin contact. The CoMix survey was repeatedly performed in different waves and different survey periods. More specifically, an initial survey period containing 8 waves was carried out between 4 March 2020 and 27 July 2020 targeting adults only. A second survey period, still ongoing in 2021, began on 11 November 2020 targeting participants of all ages. The waves are conducted with an interval of two weeks (14 days). For more detailed information on the CoMix survey and the stratification process, the reader is referred to [7, 8]. A detailed timetable of the CoMix waves and survey periods is presented in Table A of S1 Appendix. A schematic timeline of CoMix waves according to the evolution of the alpha variant of concern and vaccination campaign in Belgium is presented in Fig A of S1 Appendix.

We use the following notation. $N_i$ denotes the number of individuals in the Belgian population of age $i$ according to Belgian demographic data [23], as integrated into the Socrates tool [22]. In general, we use subscripts $i$ as an index for the participant's age, and $j$ as an index for the contacted person's age. The following observable quantities (dependent on the wave chosen) can be extracted from the survey:

• $m_{ij}$ represents the average daily number of individuals of age $j$ who are contacted by a participant of age $i$. The elements $m_{ij}$ constitute a matrix $M$ called the social contact matrix.

• $c_{ij}$ is the per capita contact rate per day for participants of age $i$ with persons of age $j$ in the population. The elements $c_{ij}$ constitute a matrix $C$ called the contact rate matrix. This matrix is related to the social contact matrix by the relation $c_{ij} = m_{ij} / N_j$.

In theory, due to the reciprocal nature of contacts, the total number of contacts between members of two age classes, as reported by participants in each of the age groups, must be equal, hence $N_i m_{ij} = N_i c_{ij} N_j = N_j c_{ji} N_i = N_j m_{ji}$, which is equivalent to the condition that the contact rate matrix should be symmetric, i.e., $c_{ij} = c_{ji}$, $\forall i, j$. The social contact matrix $M$ respects the relation $N_i m_{ij} = N_j m_{ji}$, but is in general not symmetric due to differences in $N_i$ and $N_j$. In practice, the observed total numbers of contacts $N_i m^{\mathrm{raw}}_{ij}$ and $N_j m^{\mathrm{raw}}_{ji}$ are not necessarily equal due to sampling bias; hence, we calculate the reciprocal social contact matrix by:

$$m_{ij} = \frac{N_i\,m^{\mathrm{raw}}_{ij} + N_j\,m^{\mathrm{raw}}_{ji}}{2\,N_i}.$$

All these notations and definitions are similar to those described in detail in [24] , except that the subscripts i and j and order of indices are inverted here such that the definition of the social contact matrix M corresponds to the natural output of the Socrates tool [22] .
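As an illustration of the reciprocity correction and of the relation between the social contact matrix and the contact rate matrix, a minimal Python sketch follows. It is not the Socrates implementation. It assumes the usual averaging of the two reported totals, and the function names and toy numbers are hypothetical.

```python
def reciprocal_contact_matrix(m_raw, pop):
    """Average the two reported totals N_i*m_raw[i][j] and N_j*m_raw[j][i],
    then express the result per participant of age class i."""
    n = len(pop)
    return [
        [(pop[i] * m_raw[i][j] + pop[j] * m_raw[j][i]) / (2.0 * pop[i])
         for j in range(n)]
        for i in range(n)
    ]


def contact_rate_matrix(m, pop):
    """Per capita contact rates c[i][j] = m[i][j] / N_j."""
    n = len(pop)
    return [[m[i][j] / pop[j] for j in range(n)] for i in range(n)]


# Hypothetical toy values for two age classes (not CoMix data).
pop = [1000.0, 3000.0]
m_raw = [[4.0, 3.0],
         [0.8, 6.0]]

m = reciprocal_contact_matrix(m_raw, pop)
c = contact_rate_matrix(m, pop)
```

After the correction, the totals N_i m_ij and N_j m_ji agree and the contact rate matrix is symmetric, while m itself generally remains asymmetric because the population sizes differ.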

The social contact hypothesis [11] implies that the age-specific number of infectious contacts is proportional to the self-reported age-specific number of social contacts. There are two ways to interpret empirical social contact survey data in light of this hypothesis: either survey participants can be infected by their infectious contacts or participants can infect their susceptible contacts. Here, we consider the first interpretation as initial definition, since the CoMix survey did not specifically target infected persons, and symptomatic participants may have been less likely to participate in the survey at the height of their symptoms. However, we will show that the two interpretations lead to the same mathematical result under the assumption of reciprocity of social contacts.

If we denote by $w_j$ the incidence within age class $j$ over a short observation interval (e.g. corresponding to a wave period), then $v_j = w_j / N_j$ is the risk of being infected during the observation interval for that age class (incidence rate or force of infection). The new generation of infected people is given by:

$$w_i^{(n+1)} = q\,N_i \sum_j m_{ij}\,v_j^{(n)} = q \sum_j m_{ij}\,\frac{N_i}{N_j}\,w_j^{(n)},$$

where the superscript $(n)$ indexes successive generations of infection and $q$ is a general proportionality factor completely defining the relationship between infection and contact events. The q-factor accommodates several effects such as susceptibility to infection, infectiousness upon infection, duration of the infectious period, type and effectiveness of contacts, seasonality, pre-existing natural and vaccine-induced immunity, etc. The elements $k_{ij} = q\,m_{ij}\,N_i / N_j = q\,N_i\,c_{ij}$ define a matrix $K$ called the next generation matrix (or reproduction matrix) since $k_{ij}$ represents the mean number of individuals of age $i$ that are infected through a single individual of age $j$ during their entire infectious period (for which the time between consecutive generations of infected individuals is chosen to be equal to the average duration of infectiousness).
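The construction of the next generation matrix from the social contact matrix and the population sizes can be sketched in a few lines of Python. The values of `q`, `pop`, `m` and `w` are hypothetical toy inputs, and the function name is an assumption. The matrix `m` is chosen to satisfy reciprocity, so the sketch also lets one check numerically that `K` coincides with `q` times the transpose of `m` in that case.

```python
def next_generation_matrix(m, pop, q):
    """k[i][j] = q * m[i][j] * N_i / N_j: mean number of infections in age
    class i caused by one infected individual of age class j."""
    n = len(pop)
    return [[q * m[i][j] * pop[i] / pop[j] for j in range(n)] for i in range(n)]


# Hypothetical toy inputs for two age classes (not CoMix data).
pop = [1000.0, 3000.0]
m = [[4.0, 2.7],
     [0.9, 6.0]]          # reciprocal: N_i * m_ij == N_j * m_ji
q = 0.05

K = next_generation_matrix(m, pop, q)

# One generation step: current incidence w -> next generation of infections.
w = [10.0, 40.0]
w_next = [sum(K[i][j] * w[j] for j in range(2)) for i in range(2)]
```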

Note that under the reciprocity assumption leading to a symmetric matrix $C$, the relation $N_i m_{ij} = N_j m_{ji}$ provides:

$$k_{ij} = q\,m_{ij}\,\frac{N_i}{N_j} = q\,m_{ji}, \quad \text{i.e.,} \quad w^{(n+1)} = q\,M^T\,w^{(n)},$$

with $w^{(n)}$ the incidence vector of generation $n$, corresponding to the second interpretation that survey participants (on the right side of the transpose contact matrix $M^T$) can directly infect their contacts (now on the left side) modulo the proportionality factor. This expression relying on the transpose of the social contact matrix obtained as a direct output of the Socrates tool, $M$, is chosen because of its better numerical stability. The recurrence relation of the next generation matrix $K$:

$$w^{(n+1)} = K\,w^{(n)}$$

tends to a stable distribution due to the Perron-Frobenius theorem [25], i.e.,

$$K\,w = R_t\,w,$$

with $R_t$ corresponding to the reproduction number of SARS-CoV-2 [26], which is defined as the leading eigenvalue of the next generation matrix $K$. More specifically, estimation of the reproduction number $R_t$ and the relative incidence $w$ can be done by computing the leading eigenvalue and corresponding right-eigenvector of $K$. However, $R_t$ depends on the proportionality factor $q$, which might be unknown, but the relative incidence $w$ is independent of $q$ and can therefore be directly extracted from the social contact data $M^T$. The reproduction number is initially the basic reproduction number $R_0$, but switches to the effective reproduction number as long as social contact data evolve and the proportionality factor $q$ captures the depletion of susceptibles. We emphasize here that the eigenvector $w$ is only recovered up to a global constant and therefore individual components $w_j$ have no meaning. What can be interpreted are relative ratios such as $w_i / w_j$, providing an estimate of the relative incidence in age class $i$ as compared to the incidence in age class $j$. This vector is usually normalized such that $\sum_i w_i = 1$. In the same way, the incidence rate $v_i$ can be recovered, in relative sense, as the leading left-eigenvector of $M^T$. The switch from a homogeneous proportionality factor $q$ to a heterogeneous $q_{ij}$ is performed by assuming:

$$q_{ij} = \tilde{q}\,a_i\,h_j,$$

where the vector $(a_i)$ acts on the susceptible side, the vector $(h_j)$ acts on the infectiousness side, and $\tilde{q}$ is a remaining global proportionality factor capturing any residual effect. This remaining factor has no influence on the computation of the relative incidence $w$. However, due to the presence of $\tilde{q}$, the vectors $(a_i)$ and $(h_j)$ only have a relative interpretation.
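The eigen-analysis for the homogeneous case can be illustrated with a small Python sketch based on power iteration, which is one simple way to obtain a leading eigenpair and is not necessarily the numerical routine used for the paper. The toy matrix, the population sizes and the two values of q are hypothetical. The sketch shows that the relative incidence does not depend on q whereas the leading eigenvalue scales with it, and that the relative incidence rate is the leading left-eigenvector of the transposed contact matrix.

```python
def leading_eigenpair(K, iters=1000, tol=1e-13):
    """Power iteration for a matrix with positive entries.

    Returns (eigenvalue, eigenvector), the eigenvector scaled to sum to 1.
    """
    n = len(K)
    w = [1.0 / n] * n
    lam = 0.0
    for _ in range(iters):
        new = [sum(K[i][j] * w[j] for j in range(n)) for i in range(n)]
        lam = sum(new)            # sum(w) == 1, so sum(K w) estimates the eigenvalue
        new = [x / lam for x in new]
        converged = max(abs(x - y) for x, y in zip(new, w)) < tol
        w = new
        if converged:
            break
    return lam, w


def transpose(M):
    return [list(col) for col in zip(*M)]


# Hypothetical toy inputs for two age classes (not CoMix data).
pop = [1000.0, 3000.0]
M = [[4.0, 2.7],
     [0.9, 6.0]]                  # reciprocal: N_i * m_ij == N_j * m_ji

# Relative incidence w: leading right-eigenvector of K = q * M^T.
R_a, w_a = leading_eigenpair([[0.05 * x for x in row] for row in transpose(M)])
R_b, w_b = leading_eigenpair([[0.10 * x for x in row] for row in transpose(M)])

# Relative incidence rate v: leading left-eigenvector of M^T,
# i.e. the leading right-eigenvector of M.
_, v = leading_eigenpair(M)
v_from_w = [w_a[i] / pop[i] for i in range(2)]
total = sum(v_from_w)
v_from_w = [x / total for x in v_from_w]
```

Doubling q doubles the leading eigenvalue but leaves `w_a` and `w_b` identical, and `v` agrees with the normalised ratios of incidence to population size.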

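The heterogeneous next generation matrix is defined as:

$$K = \tilde{q}\,\mathrm{diag}(a_i)\,M^T\,\mathrm{diag}(h_j), \quad \text{i.e.,} \quad k_{ij} = \tilde{q}\,a_i\,m_{ji}\,h_j.$$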

We note that we are working here with a next generation matrix with small domain. There also exists a next generation matrix with large domain taking explicitly into account the different states of the disease and their duration for each age class [4] . However, the small domain approach is appropriate here since we do not work with a dynamical system and heterogeneity in disease duration is part of the effects captured by the proportionality factors.
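To make the role of the two vectors concrete, the following Python sketch builds a small-domain heterogeneous next generation matrix and extracts its relative incidence by power iteration. The two-class contact matrix and all parameter values are hypothetical, and the function names are assumptions made for this example.

```python
def heterogeneous_ngm(M, a, h, q_tilde=1.0):
    """Entries k[i][j] = q_tilde * a[i] * M[j][i] * h[j],
    i.e. q_tilde * diag(a) * M^T * diag(h)."""
    n = len(M)
    return [[q_tilde * a[i] * M[j][i] * h[j] for j in range(n)] for i in range(n)]


def relative_incidence(K, iters=500):
    """Leading right-eigenvector of K (scaled to sum to 1) by power iteration."""
    n = len(K)
    w = [1.0 / n] * n
    for _ in range(iters):
        new = [sum(K[i][j] * w[j] for j in range(n)) for i in range(n)]
        s = sum(new)
        w = [x / s for x in new]
    return w


# Hypothetical toy contact matrix: class 0 = children, class 1 = adults.
M = [[4.0, 2.7],
     [0.9, 6.0]]
h = [1.0, 1.0]

w_hom = relative_incidence(heterogeneous_ngm(M, [1.0, 1.0], h))
w_het = relative_incidence(heterogeneous_ngm(M, [0.5, 1.0], h))
# Same ratio a_0 / a_1 and a different global factor: identical relative incidence.
w_scaled = relative_incidence(heterogeneous_ngm(M, [1.0, 2.0], h, q_tilde=0.3))
```

In this toy setting, halving the children's q-susceptibility lowers their share of the relative incidence, while rescaling the whole vector or changing the global factor leaves the relative incidence unchanged, which is why the vectors only have a relative interpretation.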

Estimating relative q-susceptibility (a i ) and q-infectiousness (h j ) from COVID-19 age-structured indicators

The vectors $(a_i)$ and $(h_j)$ have an important impact on the determination of the leading right eigenvector in the system:

$$\tilde{q}\,\mathrm{diag}(a_i)\,M^T\,\mathrm{diag}(h_j)\,w^* = R_t\,w^*.$$

The obtained relative incidence $w^*$ can be compared with the normalized relative incidence $\tilde{w}$ estimated from the observed incidence data in Belgium. Using this approach, we are able to determine q-susceptibility and q-infectiousness corresponding to SARS-CoV-2 transmission in Belgium. However, the $(a_i)$ and $(h_j)$ vectors cannot be estimated simultaneously in a unique way from this process since there remains an indeterminacy [2, 12]. Nevertheless, the identifiability problem can be solved by imposing a constraint on one of the two vectors.

For this study, we choose each time a heterogeneous constraint coming from the literature as well as a homogeneous constraint (whose results are only presented in S1 Appendix). The heterogeneous constraints are defined from the following assumptions: as assumed in [14] using data from [27] . Assuming that the relative infectiousness of asymptomatic versus symptomatic individuals is 0.51 [14] , we obtain the following constraint: 

We use Belgian data on daily incidence of COVID-19 confirmed by means of a positive PCR test, as provided by the Belgian Institute for Public Health, Sciensano [19] . In order to reduce testing biases, the period of study is restricted to a period with almost constant testing policy (with testing allowed for both symptomatic and asymptomatic people and before biases induced by the introduction of the EU Digital COVID Certificate).

Since there is a delay between a change in social contact behaviour and its effect on the relative incidence, we consider PCR test results for the period starting 7 days after the onset of a specific CoMix wave and lasting for 14 days thereafter. Concerning social contact data, the initial CoMix survey waves (1 to 8) are discarded due to a variable testing policy and a lack of information regarding child-child contacts. The three subsequent waves (9 to 11) are also discarded since, despite the introduction of measuring child-child contacts, the information was collected using a different survey formulation. CoMix waves 12 to 23 correspond to a period with a constant testing policy, an identical survey design and no vaccination in children, which implies that the results for the age classes [0, 6), [6, 12) and [12, 18) years are expected to be more stable. The start of wave 12 corresponds to 22 December 2020, when the vaccination campaign in adults had not yet started, and the last wave considered corresponds to 26 May 2021, with PCR tests considered up to 15 June 2021 (thus when vaccination of the oldest individuals in the Belgian population was nearly completed).
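The PCR window attached to a CoMix wave can be sketched with simple date arithmetic. The Python snippet below assumes that the 14-day window is counted inclusively from the seventh day after the wave onset, and reads 26 May 2021 as the onset of the last wave. Under that reading the last window ends on 15 June 2021, which agrees with the date given above. The function name is hypothetical.

```python
from datetime import date, timedelta

DELAY_DAYS = 7    # lag between the wave onset and the first PCR day considered
WINDOW_DAYS = 14  # length of the PCR window, counted inclusively (assumption)


def pcr_window(wave_onset):
    """Return the first and last day of PCR results matched to a CoMix wave."""
    first = wave_onset + timedelta(days=DELAY_DAYS)
    last = first + timedelta(days=WINDOW_DAYS - 1)
    return first, last


first_wave_window = pcr_window(date(2020, 12, 22))  # onset of wave 12
last_wave_window = pcr_window(date(2021, 5, 26))    # onset of the last wave
print(first_wave_window)  # 2020-12-29 to 2021-01-11
print(last_wave_window)   # 2021-06-02 to 2021-06-15
```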

The calibration is performed using the statistical software R with a best fit search using the Hellinger distance (which is suitable for distributions) on the relative incidences w* and w. The optimization is done by means of a random walk, until no change in the distance is observed during 100 consecutive iterations, with an initial homogeneous prior (a_i = 1 or h_i = 1 for all i = 1, ..., 10) and steps drawn from N(0, 0.005^2)^10. The sensitivity analysis is performed by repeating the process over 200 nonparametric bootstraps using the previous posterior as the new prior. Uncertainty is quantified by means and 95% percentile confidence intervals (i.e., 2.5% and 97.5% quantiles of all bootstrap-based estimates).
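The calibration loop can be sketched as follows, in Python rather than R and on a hypothetical 3-class example instead of the 10 age classes. The acceptance rule (keep a proposal only if it strictly decreases the distance), the clipping at zero and the iteration cap are assumptions of this sketch. It is not a transcription of the published code.

```python
import math
import random


def hellinger(p, q):
    """Hellinger distance between two discrete distributions."""
    return math.sqrt(sum((math.sqrt(x) - math.sqrt(y)) ** 2
                         for x, y in zip(p, q)) / 2.0)


def relative_incidence(a, M, h, iterations=100):
    """Leading eigenvector of diag(a) M^T diag(h), normalized to sum to 1."""
    n = len(a)
    K = [[a[i] * M[j][i] * h[j] for j in range(n)] for i in range(n)]
    w = [1.0 / n] * n
    for _ in range(iterations):
        new = [sum(K[i][j] * w[j] for j in range(n)) for i in range(n)]
        total = sum(new)
        w = [x / total for x in new]
    return w


def random_walk_fit(w_obs, M, h, step_sd=0.005, patience=100,
                    max_iter=200_000, seed=1):
    """Random-walk search for a; stops after `patience` iterations without change."""
    rng = random.Random(seed)
    a = [1.0] * len(w_obs)  # homogeneous starting point
    best = hellinger(relative_incidence(a, M, h), w_obs)
    unchanged = 0
    for _ in range(max_iter):  # safety cap, an assumption of this sketch
        if unchanged >= patience:
            break
        proposal = [max(0.0, x + rng.gauss(0.0, step_sd)) for x in a]
        d = hellinger(relative_incidence(proposal, M, h), w_obs)
        if d < best:
            a, best, unchanged = proposal, d, 0
        else:
            unchanged += 1
    return a, best


# Hypothetical contact matrix and synthetic "observed" incidence.
M = [[4.0, 2.0, 0.5], [2.0, 6.0, 1.0], [0.5, 1.0, 2.0]]
h = [1.0, 1.0, 1.0]
w_obs = relative_incidence([0.5, 1.0, 1.0], M, h)

start_dist = hellinger(relative_incidence([1.0, 1.0, 1.0], M, h), w_obs)
a_fit, final_dist = random_walk_fit(w_obs, M, h)
a_relative = [x / a_fit[1] for x in a_fit]  # relative to the second class
print(start_dist, final_dist, a_relative)
```

Because the leading eigenvector is unchanged when a is multiplied by a constant, the fitted vector is only meaningful up to scale. This is why the sketch reports it relative to a reference class.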

Since the q-susceptibility and q-infectiousness vectors represent relative values, a normalization must be chosen for the representation of the results. We choose the following normalization:

• The mean a_4 or h_4 (across the bootstrap runs) for the first adult age class [18, 30) is set to one.

This constraint is chosen because we mainly want to compare the susceptibility and infectiousness of children versus adults, while the age class [18, 30) is one of the most stable ones in the bootstrapping process.
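A minimal sketch of this normalization, combined with the bootstrap summary (means and 2.5% and 97.5% quantiles), is given below in Python. The bootstrap estimates are randomly generated placeholders rather than results of the study. The quantile rule (linear interpolation between order statistics) and the function names are assumptions of this sketch.

```python
import random


def quantile(values, q):
    """Empirical quantile with linear interpolation between order statistics."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def normalize_and_summarize(estimates, ref=3):
    """Scale all runs so the mean of class `ref` over the runs equals 1,
    then return (mean, 2.5% quantile, 97.5% quantile) for each age class."""
    n_runs = len(estimates)
    ref_mean = sum(run[ref] for run in estimates) / n_runs
    scaled = [[x / ref_mean for x in run] for run in estimates]
    summary = []
    for k in range(len(scaled[0])):
        column = [run[k] for run in scaled]
        summary.append((sum(column) / n_runs,
                        quantile(column, 0.025),
                        quantile(column, 0.975)))
    return summary


# Placeholder bootstrap output: 200 runs and 10 age classes (hypothetical values).
rng = random.Random(0)
centers = [0.3, 0.5, 0.6, 1.2, 1.3, 1.2, 1.2, 1.1, 1.0, 0.9]
bootstrap_estimates = [[max(0.0, rng.gauss(c, 0.05)) for c in centers]
                       for _ in range(200)]

summary = normalize_and_summarize(bootstrap_estimates)  # ref=3 is the class [18, 30)
for mean, low, high in summary:
    print(f"{mean:.2f} [{low:.2f}, {high:.2f}]")
```

Dividing every run by the same constant keeps the ratios between age classes within each bootstrap run unchanged, so only the reporting scale is affected.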

Via the next generation approach, the ratio of the leading eigenvalues of two next generation matrices can be used to evaluate the relative reduction in the reproduction number. This can be done to compare the time-varying reproduction number derived from the CoMix survey with independent evaluations of the reproduction number. We use as comparison the R_t computed from the daily number of cases [20] and the daily number of hospitalizations [19]. In order to account for the time delays associated with infections and hospitalizations (e.g. time to develop symptoms, time to hospitalization, etc.), the reproduction number computed from the CoMix social contact data is shifted forward in time. A time shift of 7 (respectively 14) days is considered when comparing R_t estimates with the reproduction number computed from the number of confirmed cases (respectively hospitalizations). As the reproduction number is known up to the overall constant q, we fix the reproduction number for CoMix wave 12 to be equal to the reproduction number computed from infections or hospitalizations. Uncertainty due to sampling variability is estimated via 10000 nonparametric bootstraps.
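The anchoring on a reference wave can be sketched as follows in Python. The next generation matrices and the reference value of the reproduction number are hypothetical. The second matrix is simply the first one scaled by 0.8, so that the expected ratio is known in advance.

```python
def leading_eigenvalue(K, iterations=500):
    """Leading eigenvalue of a positive matrix by power iteration."""
    n = len(K)
    w = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        new = [sum(K[i][j] * w[j] for j in range(n)) for i in range(n)]
        lam = sum(new)  # equals the eigenvalue once w (summing to 1) has converged
        w = [x / lam for x in new]
    return lam


def reproduction_numbers(ngms, r_reference, reference_index=0):
    """Scale eigenvalue ratios so the reference wave matches r_reference."""
    eigenvalues = [leading_eigenvalue(K) for K in ngms]
    ref = eigenvalues[reference_index]
    return [r_reference * lam / ref for lam in eigenvalues]


# Hypothetical next generation matrices for a reference wave and a later wave.
K_reference = [[4.0, 2.0, 0.5], [2.0, 6.0, 1.0], [0.5, 1.0, 2.0]]
K_later = [[0.8 * x for x in row] for row in K_reference]

# r_reference = 1.1 is an illustrative value, not an estimate from the study.
r_values = reproduction_numbers([K_reference, K_later], r_reference=1.1)
print(r_values)
```

Only the ratio of eigenvalues enters the result, so the unknown overall constant cancels out. The reference wave supplies the absolute level.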

This preprint (which was not certified by peer review) is made available under a CC-BY-NC 4.0 International license. The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. This version posted October 14, 2021; https://doi.org/10.1101/2021.10.10.21264753.

We thank several researchers from the SIMID COVID-19 consortium (an interuniversity collaboration between the University of Antwerp (CHERMID) and UHasselt (DSI, CenStat)) as well as other researchers from the Interuniversity Institute of Biostatistics and statistical Bioinformatics (I-BioStat) (KU Leuven and UHasselt) for numerous constructive discussions and meetings. The authors thank the EpiPose consortium partners for useful discussions and for help in setting up the CoMix survey as part of EpiPose. The authors are also very grateful for access to the data from the Belgian Scientific Institute for Public Health, Sciensano. The funders had no role in the study design, data collection and analysis, decision to publish or preparation of the manuscript.

None.

R code and all data necessary to run the code (potentially aggregated) are available on GitHub at https://github.com/nicolas-franco-unamur/Next-gen. CoMix data and PCR test data re-aggregated by age class are provided with the code. CoMix social contact data are also available via http://www.socialcontactdata.org. PCR test data are owned by Sciensano and are publicly available in aggregate form at https://epistat.wiv-isp.be/covid/, while non-aggregate data can be requested via the online form https://epistat.wiv-isp.be/datarequest.

In this Supplementary Material, we present the complete results obtained on the estimation of heterogeneous proportionality factors using the next generation matrix. Social contact data M are taken from CoMix waves 12 to 23 as described in the timetable given in Table A and the timeline in Fig A. Relative incidences are computed as the leading eigenvector of the next generation matrix diag(a_i) M^T diag(h_j) and compared with the normalized relative incidence estimated from PCR positive tests. The q-susceptibility (a_i) and q-infectiousness (h_j) vectors are either assumed or estimated. We present 4 different sets of results:

• Estimation of q-susceptibility (a_i) using the homogeneous infectiousness assumption h_j = 1 ∀j (Figs B to E and Tables B and C)

• Estimation of q-susceptibility (a_i) using the heterogeneous infectiousness assumption (h_j) coming from the literature (Figs F to I and Tables D and E)

• Estimation of q-infectiousness (h_j) using the homogeneous susceptibility assumption a_i = 1 ∀i (Figs J to M and Tables F and G)

• Estimation of q-infectiousness (h_j) using the heterogeneous susceptibility assumption (a_i) coming from the literature (Figs N to Q and Tables H and I)

Each time, we present the estimate of the proportionality factors for the complete period (waves 12 to 23) or estimated by groups of two waves, with the exact values presented in the subsequent tables. The normalization method is described at the beginning of each subsection. We associate with the results the estimate of the relative incidence, with the real data presented in blue, the estimate without any proportionality factor (a_i = h_i = 1 ∀i) in green and the estimate with the assumed and estimated proportionality factors in red. Dots represent means and bars represent 95% nonparametric bootstrap confidence intervals (2.5% and 97.5% quantiles).

Estimation of q-susceptibility using homogeneous infectiousness

Method: Estimation of the relative q-susceptibility vector (a_i).

Assumption: homogeneous infectiousness (h_j) = (1, 1, 1, 1, 1, 1, 1, 1, 1, 1).

Normalization method: The mean q-susceptibility among the children age classes [0, 6), [6, 12) and [12, 18) is assumed constant among bootstraps and wave groups (if applicable). The mean of the first adult age class [18, 30) is set to 1 for the first period.

Figure: Relative incidence using estimated q-susceptibility and assumption on infectiousness (1, 1, 1, 1, 1, 1, 1, 1, 1, 1).

Table B. Relative q-susceptibility using assumption on infectiousness (1, 1, 1, 1, 1, 1, 1, 1, 1, 1) corresponding to Figure B.

Table C. Relative q-susceptibility with time evolution using assumption on infectiousness (1, 1, 1, 1, 1, 1, 1, 1, 1, 1) corresponding to Figure D.

Estimation of q-susceptibility using heterogeneous infectiousness

Method: Estimation of the relative q-susceptibility vector (a_i).

Assumption: heterogeneous infectiousness (h_j) = (0.54, 0.55, 0.56, 0.59, 0.7, 0.76, 0.9, 0.99, 0.99, 0.99), using the proportion of asymptomatic cases in the Belgian population with asymptomatic infectiousness assumed at 0.51, as used in [14] using data from [27].

Normalization method: The mean q-susceptibility among the children age classes [0, 6), [6, 12) and [12, 18) is assumed constant among bootstraps and wave groups (if applicable). The mean of the first adult age class [18, 30) is set to 1 for the first period.

Estimation of q-infectiousness using homogeneous susceptibility

Method: Estimation of the relative q-infectiousness vector (h_j).

Assumption: homogeneous susceptibility (a_i) = (1, 1, 1, 1, 1, 1, 1, 1, 1, 1).

Normalization method: No normalization across bootstraps and wave groups is applied here, since the q-infectiousness among the children age classes is estimated at (0, 0, 0) for several bootstraps, which prevents using the same normalization method as in the other subsections. The mean of the first adult age class [18, 30) is set to 1 for the first period.

Table F. Relative q-infectiousness using assumption on susceptibility (1, 1, 1, 1, 1, 1, 1, 1, 1, 1).

Table G. Relative q-infectiousness with time evolution using assumption on susceptibility (1, 1, 1, 1, 1, 1, 1, 1, 1, 1) corresponding to Figure L.

Estimation of q-infectiousness using heterogeneous susceptibility

Method: Estimation of the relative q-infectiousness vector (h_j).

Assumption: heterogeneous susceptibility (a_i) = (0.4, 0.39, 0.38, 0.79, 0.86, 0.8, 0.82, 0.88, 0.74, 0.74) taken from [1].

Normalization method: The mean q-infectiousness among the children age classes [0, 6), [6, 12) and [12, 18) is assumed constant among bootstraps and wave groups (if applicable). The mean of the first adult age class [18, 30) is set to 1 for the first period.

Age-dependent effects in the transmission and control of COVID-19 epidemics

The impact of contact tracing and household bubbles on deconfinement strategies for COVID-19

Modelling the early phase of the Belgian COVID-19 epidemic using a stochastic compartmental model and studying its implied future trajectories

Extended SEIR-QD model with nursing homes and long-term scenarios-based forecasts

A data-driven metapopulation model for the Belgian COVID-19 epidemic: assessing the impact of lockdown and exit strategies

Estimating the impact of reopening schools on the reproduction number of SARS-CoV-2 in England, using weekly contact survey data

Contact surveys reveal heterogeneities in age-group contributions to SARS-CoV-2 dynamics in the United States

Belgian public health institute

COVID-19 Data Dashboard

SOCRATES: an online tool leveraging a social contact data sharing initiative to assess mitigation strategies for COVID-19

Social Contact Rates (SOCRATES) Data Tool: as part of the socialcontactdata.org initiative

StatBel, the Belgian statistical office

Handbook of Infectious Disease Data Analysis

On the definition and the computation of the basic reproduction ratio R0 in models for infectious diseases in heterogeneous populations

Estimating clinical severity of COVID-19 from the transmission dynamics in Wuhan, China