key: cord-219107-klpmipaj
authors: Zachreson, Cameron; Mitchell, Lewis; Lydeamore, Michael; Rebuli, Nicolas; Tomko, Martin; Geard, Nicholas
title: Risk mapping for COVID-19 outbreaks using mobility data
date: 2020-08-14
journal: nan
DOI: nan
sha: 
doc_id: 219107
cord_uid: klpmipaj

COVID-19 is highly transmissible and containing outbreaks requires a rapid and effective response. Because infection may be spread by people who are pre-symptomatic or asymptomatic, substantial undetected transmission is likely to occur before clinical cases are diagnosed. Thus, when outbreaks occur there is a need to anticipate which populations and locations are at heightened risk of exposure. In this work, we evaluate the utility of aggregate human mobility data for estimating the geographic distribution of transmission risk. We present a simple procedure for producing spatial transmission risk assessments from near-real-time population mobility data. We validate our estimates against three well-documented COVID-19 outbreak scenarios in Australia. Two of these were well-defined transmission clusters and one was a community transmission scenario. Our results indicate that mobility data can be a good predictor of geographic patterns of exposure risk from transmission centres, particularly in scenarios involving workplaces or other environments associated with habitual travel patterns. For community transmission scenarios, our results demonstrate that mobility data adds the most value to risk predictions when case counts are low and spatially clustered. Our method could assist health systems in the allocation of testing resources, and potentially guide the implementation of geographically-targeted restrictions on movement and social interaction.

Similar to other respiratory pathogens such as influenza, the transmission of SARS-CoV-2 occurs when infected and susceptible individuals are co-located and have physical contact, or exchange bioaerosols or droplets [1, 2]. Behavioural modification in response to symptom onset (i.e., self-isolation) can act as a spontaneous negative feedback on transmission potential by reducing the rate of such contacts, making epidemics much easier to control and monitor.

However, COVID-19 (the disease caused by the SARS-CoV-2 virus) has been associated with relatively long periods of pre-symptomatic viral shedding (approximately 5-10 days), during which time case ascertainment and behavioural modification are unlikely [3, 4]. In addition, many cases are characterised by mild symptoms, despite long periods of viral shedding [5]. Transmission studies have demonstrated that asymptomatic and pre-symptomatic transmission hamper control of SARS-CoV-2 [6] [7] [8]. Pre-symptomatic and asymptomatic transmission has also been documented systematically in several residential care facilities in which surveillance was essentially complete [9, 10]. Currently, there are no prophylactic pharmaceutical interventions that are effective against SARS-CoV-2 transmission. Therefore, interventions based on social distancing and infection control practices have constituted the operative framework, applied in innumerable variations around the world, to combat the COVID-19 pandemic.

Social distancing policies directly target human mobility. Therefore, it is logical to suggest that data describing aggregate travel patterns would be useful in quantifying the complex effects of policy announcements and decisions [11]. The ubiquity of mobile phones and public availability of aggregated near-real-time movement patterns has led to several such studies in the context of the ongoing COVID-19 pandemic [12] [13] [14]. One source of mobility data is the social media platform Facebook, which offers users a mobile app that includes location services at the user's discretion. These services document the GPS locations of users, which are aggregated as origin-destination matrices and released for research purposes through the Facebook Data For Good program. The raw data is stored on a temporary basis and aggregated in such a way as to protect the privacy of individual users [15]. Several studies have utilised subsets of this data for analysis of the effects of COVID-19 social distancing restrictions [16] [17] [18] [19]. In this work, we complement these studies by addressing the question: to what degree can real-time mobility patterns estimated from aggregate mobile phone data inform short-term predictions of COVID-19 transmission risk?

To do so, we develop a straightforward procedure to generate a relative estimate of the spatial distribution of future transmission risk based on current case data or locations of known transmission centres. To critically evaluate the performance of our procedure, we retrospectively generate risk estimates based on data from three outbreaks that occurred in Australia when there was little background transmission.

The initial wave of infections in Australia began in early March, 2020, and peaked on March 28th with 469 new cases. The epidemic was suppressed through widespread social distancing measures which escalated from bans on gatherings of more than 500 people (imposed on March 16th) to a nation-wide "lockdown" which began on March 29th and imposed a ban on gatherings of more than 3 people. By late April, daily incidence had dropped to fewer than 10 cases per day [20]. The outbreaks we examine occurred during the subsequent period over which these general suppression measures were progressively relaxed. One of these occurred in a workplace over several weeks, one began during a gathering at a social venue, and one was a community transmission scenario with no single identified outbreak centre, which marked the beginning of Australia's "second wave" (which is ongoing as of August, 2020). The term "community transmission" refers to situations in which multiple transmission chains have been detected with no known links identified from contact tracing and no specific transmission centres are clearly identifiable.

In each case, we use the Facebook mobility data that was available during the early stages of the outbreak to estimate future spatial patterns of relative transmission risk. We then examine the degree to which these estimates correlate with the subsequently observed case data in those regions. Our results indicate that the accuracy of our estimates varies with outbreak context, with higher correlation for the outbreak centred on a workplace, and lower correlation for the outbreak centred on a social gathering. In the community transmission scenario without a well-defined transmission locus, we compare the risk prediction based on mobility data to a null prediction based only on active case numbers. Our results indicate that mobility is more informative during the initial phases of the outbreak, when detected cases are spatially localised and many areas have no available case data.

Our general method is to use an Origin-Destination (OD) matrix based on Facebook mobility data to estimate the diffusion of transmission risk based on one or more identified outbreak sources. The data provided by Facebook comprises the number of individuals moving between locations occupied in subsequent 8-hr intervals. For an individual user, the location occupied is defined as the most frequently-visited location during the 8-hr interval. More details on the raw data, the aggregation and pre-processing performed by Facebook before release, and our pre-processing steps can be found in the Supplemental Information.

COVID-19 case data is made publicly available by most Australian state health authorities on the scale of Local Government Areas (LGAs). In these urban and suburban regions, LGA population densities typically vary from approximately 0.2 × 10^3 to 5 × 10^3 residents per km^2, but can be as low as 20 residents per km^2 in the suburban fringe where LGAs contain substantial parkland and agricultural zones. The output of our method is a relative risk estimate for each LGA based on their potential for local transmission. The general method is as follows:

1. Construct the prevalence vector p, a column vector with one element for each location with a value corresponding to the transmission centre status of that location. For point-outbreaks in areas with no background transmission, we use a vector with a value of 1 for the location containing the transmission centre and 0 for all other locations. For outbreaks with transmission in multiple locations, we construct p using the number of active cases as reported by the relevant public health agency.

2. Construct an OD matrix M, where the value of a component M_ij gives the number of travellers starting their journey at location i (row index) and ending their journey at location j (column index). To approximately match the pre-symptomatic period of COVID-19, we average the OD matrix over the mobility data provided by Facebook during the week preceding the identification of the targeted transmission centre. By averaging over an appropriate time interval, the OD matrix is built to represent mobility during the initial stages of the outbreak, when undocumented transmission may have been occurring. The choice of appropriate time interval varied by scenario, as described below.

3. Multiply the OD matrix by the prevalence vector to produce an unscaled risk vector r with a value for each location corresponding to the aggregate strength of its outgoing connections to transmission centres, weighted by the prevalence in each transmission centre. This is re-scaled to give the relative transmission risk for each region, R_i. In other words, we treat the OD matrix as analogous to the stochastic transition matrix in a discrete-time Markov chain, and compute the unscaled vector of risk values r as:

$$ r = M p, \qquad r_i = \sum_j M_{ij}\, p_j, \tag{1} $$

so that r_i is approximately proportional to the average interaction rate between susceptible individuals from location i and infected individuals located in the outbreak centres. These approximate interaction rates are then re-scaled to give relative risk values R_i between 0 and 1:

$$ R_i = \frac{r_i}{\sum_j r_j}. \tag{2} $$

For point-outbreaks, this is simply:

$$ R_i = \frac{M_{ik}}{\sum_j M_{jk}}, \tag{3} $$

where k is the column index of the single outbreak location. The numerator is the number of individuals travelling from region i to the outbreak centre, and the denominator is the total number of travellers into the outbreak centre over all origin locations j.
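The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: it assumes the unscaled risk is r_i = Σ_j M_ij p_j and the relative risk is R_i = r_i / Σ_j r_j, as described in the text, and the function name `relative_risk` and the 3-location matrix are hypothetical.

```python
def relative_risk(od_matrix, prevalence):
    """Return relative risk R_i = r_i / sum_j r_j, where r = M p."""
    # r_i = sum_j M[i][j] * p[j]: trips from origin i into each location j,
    # weighted by the prevalence (or outbreak indicator) at j.
    unscaled = [
        sum(m_ij * p_j for m_ij, p_j in zip(row, prevalence))
        for row in od_matrix
    ]
    total = sum(unscaled)
    if total == 0:
        raise ValueError("no recorded trips into any transmission centre")
    return [r_i / total for r_i in unscaled]


# Hypothetical OD matrix (rows: origins, columns: destinations).
M = [
    [80, 15, 5],
    [10, 70, 20],
    [5, 5, 90],
]
# Point outbreak in location index 1: p is an indicator vector.
p = [0, 1, 0]
R = relative_risk(M, p)
print(R)
```

For a point outbreak the result reduces to column k of M divided by its column sum, so the values sum to 1 and the outbreak location itself usually receives the largest share because most trips start and end in the same region.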

In addition to the typical assumptions about equilibrium mixing (in the absence of more detailed interaction data), this interpretation is subject to the assumption that the strength of transmission in each centre is proportional to the number of active cases in that location.

This assumption is consistent with the observation that the majority of individuals start and end their journeys in the same locations, but there is not sufficient data to unequivocally determine the relationship between transmission risk within an area and active case numbers in the resident population of that area. Therefore, it is appropriate to think of our method as a heuristic approach to estimating transmission risk based only on qualitative information about epidemiological factors and informed by near-real-time estimates of mobility patterns.

These are derived from a biased sample of the population (a subset of Facebook users), and aggregated to represent movement between regions containing on the order of 10^3 to

Outbreaks occur in different contexts, some of which may suggest use of external data sources to infer at-risk sub-populations. Such inference can be used to refine spatial risk prediction.

For example, the workplace outbreak we investigated occurred in a meat processing facility, where the virus spread among workers at the plant and their contacts. To adapt the general method to this context, we averaged OD matrices over the subset of our data capturing the transition between nighttime and daytime locations, as an estimate of work-related travel. In addition, we examined the effect of including industry of employment statistics as an additional risk factor. In this case, we used data collected by the Australian Bureau of Statistics (ABS) to estimate the proportion of meat workers by residence in each LGA, and weighted the outgoing traveller numbers by the proportion associated with the place of origin.

The resulting relative risk value R i is a crude estimate of the probability that an individual:

• travelled from origin location i into the region containing the outbreak centre;

• travelled during the period when many cases were pre-symptomatic and no targeted intervention measures had been applied;

• made their trip(s) during the time of day associated with travel to work; and

• were part of the specific subgroup associated with the outbreak centre (in this case, those employed in meat-processing occupations).

The variation described above is specific for workplace outbreaks in which employees are infected, but could be generally applied to any context where a defined subgroup of the population is more likely to be associated (e.g., school children, aged-care workers, etc.), or in which habitual travel patterns associated with particular times of day are applicable.
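As a rough sketch of this context-specific variation, the trips from each origin into the outbreak location can be multiplied by the subgroup proportion at the origin before re-scaling. The re-scaling to a unit sum is an assumption made here for consistency with the general method; the function name `weighted_point_risk` and all numbers are hypothetical.

```python
def weighted_point_risk(od_matrix, outbreak_index, origin_weights):
    """Relative risk for a point outbreak, weighted by an origin-level factor."""
    # r_i = w_i * M[i][k]: trips from origin i into outbreak location k,
    # scaled by the subgroup proportion w_i among residents of i.
    unscaled = [
        w_i * row[outbreak_index]
        for row, w_i in zip(od_matrix, origin_weights)
    ]
    total = sum(unscaled)
    if total == 0:
        raise ValueError("all weighted trip counts are zero")
    return [r_i / total for r_i in unscaled]


# Hypothetical nighttime -> daytime OD matrix (rows: origins).
M = [
    [80, 15, 5],
    [10, 70, 20],
    [5, 5, 90],
]
# Hypothetical proportion of employed residents working in the relevant
# industry, one value per origin location.
subgroup_share = [0.02, 0.01, 0.001]
R_w = weighted_point_risk(M, 1, subgroup_share)
print(R_w)
```

In this toy example the weighting raises the relative risk of location 0 (which has the highest subgroup share) and lowers that of location 2, compared with using trip counts alone.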

For each of the three outbreak scenarios, we present the mobility-based estimates of the relative transmission risk distribution, and a time-varying correlation between our estimate and the case numbers ascertained through contact tracing and testing programs. For details of these correlation computations, see the Supplementary Information.

Cedar Meats is an abattoir (slaughterhouse and meat packing facility) in Brimbank, Victoria. It is located in the western area of Melbourne. It was the locus of one of the first sizeable outbreaks in Australia after the initial wave of infections had been suppressed through widespread physical distancing interventions. Meat processing facilities are particularly high-risk work environments for transmission of SARS-CoV-2, so it is perhaps unsurprising that the first large outbreak occurred in this environment [21, 22]. It began at a time when community transmission in the region was otherwise undetected. As the transmission cluster grew, it was thoroughly traced and subsequently controlled. The contact-tracing effort included (but was not limited to) intensive testing of staff, each of whom required a negative test before returning to work, 14-day isolation periods for all exposed individuals, and daily follow-up calls with every close contact.

The outbreak was officially recognised on April 29th, when four cases were confirmed in workers at the site and, according to media reports, Victoria DHHS informed the meatworks of these findings [23]. We also explored the effect of weighting mobility by a context-specific factor: the proportion of employed persons with occupations in meat processing (Figure 1b).

The geographic distribution of relative transmission risk due to mobility into Brimbank during the nighttime → daytime transition is presented in Figure 2(a), while the distribution generated by including both mobility and the proportion of meat workers in each LGA is shown in Figure 2(b).

To validate our estimate, we computed Spearman's correlation between this risk estimate for each region and the time-dependent case count for each region documented over the course of the outbreak (supplied by the Victorian Department of Health and Human Services). We use Spearman's rather than Pearson's correlation because, while we expect monotonic dependence between estimated relative risk and case counts, we have no reason to expect linear dependence or normally-distributed errors. The outbreak case data was supplied as a time series of cumulative detected cases in each LGA for each day of the outbreak. Therefore, we present our correlation as a function of time from April 29th, when recorded case numbers began to increase dramatically (before May 1st, the number of affected LGAs was too small to compute a confidence interval (n ≤ 3)). As case numbers increase, correlation between our risk estimates and case numbers

[Figure residue: map labels giving LGA names with bracketed numbers for panels a) and b).]

The next scenario we examine began with a single spreading event that occurred during a large gathering at a social venue in Western Sydney. While workplaces have frequently been the locus of COVID-19 clusters, many outbreaks have also been sparked by social gatherings [25, 26]. In urban environments, such outbreaks can prove more challenging to trace, as the exposed individuals may be only transiently associated with the outbreak location.

The Crossroads Hotel was the site of the first COVID-19 outbreak to occur in New South Wales after the initial wave of infections was suppressed. The cluster was identified on July 10th, 2020, during a period when new cases numbered fewer than 10 notifications per day. However, the second wave of community transmission in Victoria produced sporadic introductions in NSW, one of which led to a spreading event at the Crossroads Hotel [27]. Based on media reports, state contact-tracing data indicated that the cluster began on the evening of July 3rd, during a large gathering [28].

Unlike the Cedar Meats cluster, the Crossroads Hotel scenario was not a workplace outbreak with transmission occurring in the same context for a sustained time period, but a single spreading event in a large social centre. For this reason, to estimate relevant mobility patterns we averaged trip numbers over all time-windows in our data (daytime → evening → nighttime → daytime) for the period of June 27th to July 4th. It was also necessary to perform some pre-processing of the mobility data provided by Facebook in order to correlate case data provided by New South Wales Health to our mobility-based risk estimates due to substantial differences in the geographic boundaries used in the respective data sets (see Supplemental Information and Technical Note).

Aside from these minor differences, the method applied in this scenario is essentially the same as the one described above for the Cedar Meats outbreak. Risk of transmission in an area is assessed as the proportion of travellers who entered the outbreak location from that area (see Equation 3).

Correlation of our risk estimate to the number of cases in each LGA as a function of time is shown in Figure 4(a). Heat maps of estimated risk and case numbers are shown in Figures 4(b) and 4(c), respectively. In this analysis, the available data did not explicitly identify the outbreak with which each case was associated; however, it did distinguish between cases associated with local transmission clusters and those associated with international importation. Because the Crossroads Hotel cluster was the only documented outbreak during this time, we attribute to it all cluster-associated cases during the period investigated. This assumption is anecdotally consistent with media reports that specify more detailed information about the residential location of individuals associated with the outbreaks. The COVID-19 case data for New South Wales is publicly available [29].

could have been predicted based on case numbers and mobility data that were available in early June. Our goal is to examine whether the effectiveness of mobility patterns in predicting relative transmission risk from point outbreaks can extend to community transmission scenarios in which outbreak sources are unknown.

In the community transmission scenario, as with the Crossroads Hotel outbreak, there were no clear context-dependent factors that suggested the use of other population data. In contrast to the first two scenarios, community transmission was occurring in multiple locations at the beginning of our investigation period. For each day, the unscaled risk estimate r_i is the product of the OD matrix (averaged over the preceding week) and the vector of active case numbers in each location (see Equation 1). Therefore, in this case the relative risk value R_i represents the proportion of travellers into all areas containing active cases, with the contribution of each infected region weighted by the number of active cases (see Equation 2).
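The daily calculation described in this paragraph might look like the following sketch, which first averages a set of OD matrices element-wise and then weights destinations by active case counts. Two days stand in for the preceding week to keep the example short; the helper names `mean_od_matrix` and `risk_from_active_cases` and all values are hypothetical.

```python
def mean_od_matrix(daily_matrices):
    """Element-wise mean of equally sized OD matrices (one per day)."""
    n_days = len(daily_matrices)
    n = len(daily_matrices[0])
    return [
        [sum(day[i][j] for day in daily_matrices) / n_days for j in range(n)]
        for i in range(n)
    ]


def risk_from_active_cases(od_matrix, active_cases):
    """R_i = r_i / sum_j r_j with r_i = sum_j M[i][j] * cases[j]."""
    unscaled = [
        sum(m_ij * c_j for m_ij, c_j in zip(row, active_cases))
        for row in od_matrix
    ]
    total = sum(unscaled)
    if total == 0:
        raise ValueError("no trips into locations with active cases")
    return [r_i / total for r_i in unscaled]


# Hypothetical OD matrices for two days (rows: origins).
day_1 = [[90, 10, 0], [20, 70, 10], [0, 30, 70]]
day_2 = [[70, 30, 0], [0, 90, 10], [0, 10, 90]]
M_avg = mean_od_matrix([day_1, day_2])

# Hypothetical active case counts per location; location 0 has none.
active = [0, 4, 1]
R_c = risk_from_active_cases(M_avg, active)
print(R_c)
```

Location 0 has no active cases yet still receives a non-zero relative risk through its trips into location 1, which illustrates how mobility can add information in areas where no case data is available.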

For this scenario, we investigate the correlation between relative risk estimates at time t, and incident case numbers (notifications) at time t′, for all dates between June 1st and July 21st. We

The results of our correlation analysis for the Victoria community transmission scenario are shown in Figure 5 

The goal of this study was to develop and critically analyse a simple procedure for translating aggregate mobility data into estimates of the spatial distribution of relative transmission risk from COVID-19 outbreaks. Our results indicate that aggregate mobility data can be a useful tool in estimation of COVID-19 transmission risk diffusion from locations where active cases have been identified. The utility of mobility data depends on the context of the outbreak and appears to be more helpful in scenarios involving environments where context indicates specific risk factors.

The procedure we presented may also be useful during the early stages of community transmission and could help determine the extent of selective intervention measures.

In community transmission scenarios, mobility will already have played a role in determining the distribution of case counts when community transmission is detected. Our results indicate that the insight added by the incorporation of mobility data diminishes as case counts grow.

However, we also observed low correlations due to stochastic effects in the Crossroads Hotel scenario. Taken together, these results indicate that there is an optimal usage window that opens when case counts are high enough for aggregate mobility patterns to shed light on transmission patterns, and closes when these transmission patterns begin to determine the distribution of active cases which then predict their own future distribution with only limited information added by considering mobility.

Our examination of the second wave of community transmission in Victoria showed that several weeks before it was recognised, the spatial distribution of a small number of active cases 

It is essential that the use of mobility data for disease surveillance comply with privacy and ethical considerations [11] . Due to this requirement, there will always be trade-offs between the spatiotemporal resolution of aggregated mobility data and the completeness of the data set after curation, which typically involves the addition of noise and the removal of small numbers based on a specified threshold. To help ensure users cannot be identified, Facebook removes OD pairs with fewer than 10 unique users over the 8-hr aggregation period. The combination of this aggregation period with the 10-user threshold affects regional representation in the data set, particularly in more sparsely populated areas. The final product resulting from these choices contains frequently-updated and temporally-specific mobility patterns for densely populated urban areas, at the cost of incomplete data in sparsely populated regions. In general, increased temporal or spatial resolution will reduce trip numbers in any given set of raw data, which can have a dramatic impact on the amount of information missing from the curated numbers [31] .

The comparison of our results from the Cedar Meats outbreak and those from the Crossroads Hotel cluster demonstrates that the utility of aggregated mobility patterns in estimation of the spatial distribution of relative risk depends on the context of the outbreak, with more value in situations involving habitual mobility such as commuting to and from work. Detailed examination of the inconsistencies between risk estimates and case data from the Crossroads Hotel outbreak indicates that small numbers of people travelling longer distances were responsible for the relative lack of correspondence in that scenario. In particular, news reports discussed instances of single individuals who had travelled from the rural suburbs to visit the Crossroads Hotel for the July 3rd gathering who then infected their family members. These scenarios were not consistent with the risk predictions produced by the mobility patterns into and out of the region and exemplify the limitations of risk assessment based on aggregate behavioural data.

The mobility data provided by the Facebook Data For Good program represents a non-uniform and essentially uncharacterised sample of the population. While it is a large sample, with aggregate counts on the order of 10% of ABS population figures, the spatial bias introduced by the condition of mobile app usage cannot be determined due to data aggregation and anonymisation.

While it is possible to count the number of Facebook users present in any location during the specified time-intervals, it is not possible to distinguish which of those are located in their places of residence. In order to account for the (possibly many) biases affecting the sample, a detailed demographic study would be necessary that is beyond the scope of the present work. A heat map (Supplemental Figure S1) of the average number of Facebook users present during the nighttime period (2am to 10am) as a proportion of the estimated resident population reported by the ABS (2018 [32]) shows qualitative similarity to the spatial distributions of active cases and relative risk shown in Figure 5

On a fundamental level, mobility patterns are responsible for departures from continuum mechanics observed in real epidemics [33]. Over the past two decades, due to public health concern over the pandemic potential of SARS, MERS, and novel influenza, spatially explicit models of disease transmission have become commonplace in simulations of realistic pandemic intervention policies [34, 35]. Such models rely on descriptions of mobility patterns which are usually derived from static snapshots of mobility obtained from census data [31, 36, 37]. While this approach is justifiable given the known importance of mobility in disease transmission, it is also clear that the shocks to normal mobility behaviour induced by the intervention policies of the COVID-19 pandemic will not be captured by static treatments of mobility patterns. To account for the dynamic effects of intervention, several models have been developed to simulate the imposition of social distancing measures through adjustments to the strength of context-specific transmission factors [38, 39]. This type of treatment implicitly affects the degree of mixing between regions without explicitly altering the topology of the mobility network on which the model is based, and it is unclear whether such a treatment is adequate to capture the complex response of human population behaviour. Given the results of our analysis, the incorporation of real-time changes in mobility patterns could add policy-relevant layers of realism to such models that currently rely on static, sometimes dated, depictions of human movement.

Example scripts and data used for computing risk estimates and correlations can be found in the associated GitHub repository:

https://github.com/cjzachreson/COVID-19-Mobility-Risk-Mapping

However, due to release restrictions on the mobility data provided by Facebook, the OD matrices are not included as these were derived from the data provided by the Facebook Data For Good program (random matrices are included as placeholders). The processed mobility data used in this work may be made available upon request to the authors, subject to conditions of release consistent with the Facebook Data For Good Program access agreement.

A generic implementation of the code used to re-partition OD matrices between different geospatial boundary definitions is enclosed in the supplementary Technical Note.

The data used in our study was provided by the Facebook Data for Good program. The data set (in the Disease Prevention Maps subset) is aggregated from individual-level GPS coordinates collected from the use of Facebook's mobile app. Therefore, the raw data is biased to over- (national-scale) and smaller (city-scale) regions of interest, we determined that the state-level data provided the best balance, with trip numbers large enough to produce a sufficiently dense network of connections while still providing a subregion size that is usually smaller than the Local Government Areas for which case data is reported.

The raw mobility data is provided as movements between tiles, while case data is provided based on the boundaries of Local Government Areas. We note that while Facebook releases data aggregated to administrative regions, these regions were not geographically consistent with the current LGA boundaries for Australia. In order to ensure consistency of our method across datasets and jurisdictions, we produced our own correspondence system. We did this by performing two spatial join operations. These associate either tiles or LGAs with Meshblocks (the smallest geographic partition on which the Australian Bureau of Statistics releases population data). Meshblocks were associated based on their centroid locations. Each meshblock centroid was associated to the tile with the nearest centroid and to the LGA containing it. We did not split meshblocks whose boundaries lay on either side of an LGA or tile boundary, as their sizes are sufficiently small that edge effects are negligible (in addition, the set of LGAs forms a complete partition of meshblocks, so edge effects were only observed for tile associations). We then associated tiles to LGAs proportionately based on the fraction of the total meshblock population within that tile that was associated with each overlapping LGA.
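The final step of this correspondence can be sketched as follows, assuming the two spatial joins have already been performed so that each meshblock carries a tile identifier, an LGA identifier, and a population. The record layout, the function name `tile_to_lga_weights`, and the identifiers are hypothetical.

```python
from collections import defaultdict


def tile_to_lga_weights(meshblocks):
    """Map each tile to the population share falling in each LGA.

    meshblocks: iterable of (tile_id, lga_id, population) records, one per
    meshblock, as produced by the two centroid-based spatial joins.
    """
    population = defaultdict(lambda: defaultdict(float))
    for tile_id, lga_id, pop in meshblocks:
        population[tile_id][lga_id] += pop

    weights = {}
    for tile_id, by_lga in population.items():
        tile_total = sum(by_lga.values())
        if tile_total > 0:  # tiles with no resident population are skipped
            weights[tile_id] = {
                lga_id: pop / tile_total for lga_id, pop in by_lga.items()
            }
    return weights


# Hypothetical joined meshblock records.
meshblocks = [
    ("tile_A", "LGA_1", 300),
    ("tile_A", "LGA_2", 100),
    ("tile_B", "LGA_2", 250),
]
W = tile_to_lga_weights(meshblocks)
print(W)
```

Each tile's weights sum to 1, so a tile that straddles two LGAs contributes to both in proportion to where its meshblock population sits.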

Once a correspondence is established between the tile partitions on which mobility data is released and the LGA partitions on which case data is released, the matrix of connections between tiles must be converted into a matrix of connections between LGAs. The Supplementary Technical Note explains how we performed this step, and gives a general method for converting matrices between partition schemes. Briefly, the number of trips between two locations in the initial data is split between the overlapping set of partitions in the new set of boundaries (in this case, local government areas), based on the correspondence between partition schemes determined as explained in the previous subsection.
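One plausible reading of this conversion is sketched below. It assumes that the origin and destination of each trip are split independently according to the tile-to-LGA population shares, which is equivalent to computing W^T M W for a weight matrix W; the Technical Note may differ in detail. The function name `repartition_od` and the data are hypothetical.

```python
from collections import defaultdict


def repartition_od(tile_trips, weights):
    """Convert tile-to-tile trip counts into LGA-to-LGA trip counts.

    tile_trips: {(origin_tile, destination_tile): trips}
    weights:    {tile: {lga: share}}, with shares summing to 1 per tile
    """
    lga_trips = defaultdict(float)
    for (origin_tile, dest_tile), trips in tile_trips.items():
        for origin_lga, w_origin in weights[origin_tile].items():
            for dest_lga, w_dest in weights[dest_tile].items():
                # Assumed: origin and destination shares apply independently.
                lga_trips[(origin_lga, dest_lga)] += trips * w_origin * w_dest
    return dict(lga_trips)


# Hypothetical correspondence and tile-level trips.
weights = {
    "tile_A": {"LGA_1": 0.75, "LGA_2": 0.25},
    "tile_B": {"LGA_2": 1.0},
}
tile_trips = {
    ("tile_A", "tile_B"): 40,
    ("tile_B", "tile_B"): 100,
}
lga_od = repartition_od(tile_trips, weights)
print(lga_od)
```

Because every tile's shares sum to 1, the total number of trips is preserved by the conversion, which is a useful sanity check on any re-partitioning scheme.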

To investigate the spatial sample biases present in the mobility data provided by Facebook, we examined the ratio of Facebook users to ABS 2018 population for each suburb in Victoria.

While the true number varies from day to day, an example of this distribution is shown as a heat map in Supplemental Figure S1, which displays the average number of Facebook mobile app users indexed to each LGA between the hours of 2am and 10am from May 15th to June 25th, divided by the estimated resident population reported by the ABS in 2018. The distribution is narrow, with most urban areas falling in the range of 5% to 10% Facebook users. However, this is not an exact representation of residential population proportions, as many mobile users work during the nighttime and will not be located at their residence during the selected period.

Unfortunately, it is not possible to precisely quantify the bias introduced by Facebook's sampling scheme.

Despite these limitations, it may still be informative to examine whether accounting for the bias

For the Cedar Meats outbreak scenario, accounting for the Facebook sample bias in this way improves the correlation between our mobility-based relative risk estimate and the recorded case counts (Figure S2a). For the community transmission scenario, performing this extra step does not appear to substantially change the result shown in Figure 5 (compare Figure S2b and Figure 5c).

We used Spearman's rank correlation to investigate the correspondence between our relative risk estimates and documented case data. This measure of correlation is typically used when comparing ordinal data, or, more generally, when monotonic relationships are expected, but errors are not normally-distributed. In order to investigate the monotonicity between relative risk estimates and reported case numbers, we aligned the documented case data for all regions in which infections had been tabulated against the corresponding relative risk estimates for those regions. Note that our correlations did not include regions for which no case data was available.

Therefore, our correlation results illustrate the degree to which risk estimates are monotonic with case numbers, but do not account for any risk estimates made in areas with no cases to compare to. This results in a high degree of uncertainty when the number of affected areas is small, reflected by the wide confidence intervals observed in the early stages of the Cedar Meats and Crossroads Hotel outbreaks (Figures 3 and 4a, respectively).

The 95% confidence intervals were computed using Fisher's Z transformation with the standard normal quantile z = 1.96 (significance level α = 0.05).
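The correlation analysis described above can be illustrated with a short, dependency-free Python sketch. It is not the authors' code: the function names are hypothetical, ties are given average ranks, and the standard error 1/sqrt(n - 3) is the textbook Fisher form for a correlation coefficient, assumed here because the text does not state which variant was used for Spearman's coefficient.

```python
import math


def align(risk, cases):
    """Keep only regions with case data, as described in the text."""
    regions = sorted(r for r in cases if r in risk)
    return [risk[r] for r in regions], [cases[r] for r in regions]


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def fisher_ci(rho, n, z_crit=1.96):
    """Approximate 95% interval via Fisher's Z transformation.

    Assumes standard error 1 / sqrt(n - 3); requires n > 3.
    """
    if n <= 3:
        raise ValueError("need more than 3 regions")
    if abs(rho) >= 1.0:
        return rho, rho  # atanh diverges at +/-1; interval is degenerate
    z = math.atanh(rho)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)
```

With only a handful of affected regions the interval from `fisher_ci` is very wide (for five regions and a coefficient of 0.9 it spans most of the positive range), which is consistent with the wide early-stage intervals noted above.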

Two data sets from the Australian Bureau of Statistics were used in this study: 1) number of residents by industry of occupation (2016), and 2) resident population (2018).

The distributions shown in Figure S1 were computed by dividing the number of Facebook users indexed to each LGA during the nighttime period by the resident population in each LGA. We obtained the population data from the ABS 2018 population dataset, which is publicly available [32]. The Facebook user populations are provided by the Data For Good program in addition to the mobility data discussed above.
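As a rough illustration of the ratio just described, the sketch below averages the nighttime user counts for each LGA over the observation days and divides by the resident population. The function name and the dictionary layout are hypothetical, and LGAs with no population figure or no counts are simply skipped, which is an assumption rather than a documented choice.

```python
def facebook_user_fraction(nightly_users, resident_population):
    """Average nighttime Facebook users per LGA divided by resident population.

    nightly_users       : dict mapping LGA -> list of daily nighttime user counts
    resident_population : dict mapping LGA -> estimated resident population
    """
    fractions = {}
    for lga, daily_counts in nightly_users.items():
        population = resident_population.get(lga)
        if not population or not daily_counts:
            continue  # assumption: skip LGAs with missing or zero data
        mean_users = sum(daily_counts) / len(daily_counts)
        fractions[lga] = mean_users / population
    return fractions
```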

As a context-specific risk factor for the Cedar Meats outbreak we obtained the number of To compute the factors used to weight the mobility-based relative risk predictions, we divided the total number of workers in both of the above categories by the number of employed persons (those employed full time or part-time) in each LGA, which we also drew from the 2016 Australian Census via Census TableBuilder.
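The weighting factor described above is a simple proportion, sketched below in Python. The text does not spell out how the factor is combined with the mobility-based estimate, so the multiplication and renormalisation in `weight_risk` are assumptions made for illustration only, and all names are hypothetical.

```python
def industry_weights(workers_in_categories, employed_persons):
    """Share of employed persons in each LGA who work in the relevant categories."""
    return {
        lga: workers_in_categories.get(lga, 0) / employed
        for lga, employed in employed_persons.items()
        if employed > 0
    }


def weight_risk(relative_risk, weights):
    """Scale each LGA's relative risk by its weight, then renormalise to sum to 1.

    Assumption: the weight is applied multiplicatively. Renormalising does not
    affect a rank correlation, so it only matters for presentation.
    """
    weighted = {lga: risk * weights.get(lga, 0.0) for lga, risk in relative_risk.items()}
    total = sum(weighted.values())
    if total == 0:
        return weighted
    return {lga: value / total for lga, value in weighted.items()}
```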

COVID-19 case data by local government area is available from Australian jurisdictional health authorities. For this work, we used data provided by NSW Health [29] (all data is publicly available) and from Victoria DHHS. The data used for the Cedar Meats outbreak scenario was obtained from DHHS through a formal request to the Victorian Agency for Health Information (VAHI) and cannot be made public in this work. The case data by LGA used to evaluate the Victoria community transmission scenario was taken directly from the COVID-19 daily update archives available on the DHHS public website [30].

Transmission routes of respiratory viruses among humans. Current opinion in virology

guideline for isolation precautions: preventing transmission of infectious agents in health care settings

The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: estimation and application

Temporal dynamics in viral shedding and transmissibility of COVID-19

Epidemiologic features and clinical course of patients infected with SARS-CoV-2 in Singapore

Quantifying SARS-CoV-2 transmission suggests epidemic control with digital contact tracing

Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (SARS-CoV-2)

Presymptomatic Transmission of SARS-CoV-2-Singapore

Presymptomatic SARS-CoV-2 infections and transmission in a skilled nursing facility

Asymptomatic and presymptomatic SARS-CoV-2 infections in residents of a long-term care skilled nursing facility

Aggregated mobility data could help fight COVID-19

COVID-19 outbreak response, a dataset to assess mobility changes in Italy following national lockdown. Scientific data

Effectiveness of social distancing strategies for protecting a community from a pandemic with a data driven contact network based on census and real-world mobility data

Social Distancing as a Health Behavior: County-level Movement in the United States During the COVID-19 Pandemic is Associated with Conventional Health Behaviors

Facebook Disaster Maps: Aggregate Insights for Crisis Response & Recovery

Economic and social consequences of human mobility restrictions under COVID-19

Job Loss and Behavioral Change: The Unprecedented Effects of the India Lockdown in Delhi

Interdependence and the Cost of Uncoordinated Responses to COVID-19

Human Mobility in Response to COVID-19 in France

COVID-19) at a glance -10

COVID-19 Among Workers in Meat and Poultry Processing Facilities-19 States

Interregional SARS-CoV-2 spread from a single introduction outbreak in a meat-packing plant in northeast

First Cedar Meats COVID-19 case confirmed on 2

Infection fatality rate of SARS-CoV-2 infection in a German community with a super-spreading event. medrxiv

High SARS-CoV-2 attack rate following exposure at a choir practice

COVID-19 Weekly Surveillance in NSW, Epidemiological Week 31, Ending

Fears of further spread as Crossroads Hotel virus cases become infectious within a day

NSW COVID-19 cases by location and likely source of infection

Updates about the outbreak of the coronavirus disease (COVID-19)

Creating a surrogate commuter network from Australian Bureau of Statistics census data. Scientific data

by Region

Synchrony, waves, and spatial hierarchies in the spread of influenza

Mitigation strategies for pandemic influenza in the United States

Interfering with influenza: nonlinear coupling of reactive and static mitigation strategies

What can urban mobility data reveal about the spatial distribution of infection in a single city

Investigating spatiotemporal dynamics and synchrony of influenza epidemics in Australia: An agent-based modelling approach

Impact of non-pharmaceutical interventions (NPIs) to reduce COVID19 mortality and healthcare demand

Modelling transmission and control of the COVID-19 pandemic in Australia

Bing Maps Tile System

About TableBuilder; 2020