key: cord-0937695-ycg49y28
authors: Novakovic, Aleksandar; Marshall, Adele H
title: The CP-ABM Approach for Modelling COVID-19 Infection Dynamics and Quantifying the Effects of Non-Pharmaceutical Interventions
date: 2022-05-14
journal: Pattern Recognit
DOI: 10.1016/j.patcog.2022.108790
sha: fdeeb0e0c81dbf8f51a424537f31d4ac02f4a057
doc_id: 937695
cord_uid: ycg49y28

The motivation for this research is to develop an approach that reliably captures the disease dynamics of COVID-19 for an entire population in order to identify the key events driving change in the epidemic through accurate estimation of daily COVID-19 cases. This has been achieved through the new CP-ABM approach which uniquely incorporates Change Point detection into an Agent Based Model taking advantage of genetic algorithms for calibration and an efficient infection centric procedure for computational efficiency. The CP-ABM is applied to the Northern Ireland population where it successfully captures patterns in COVID-19 infection dynamics over both waves of the pandemic and quantifies the significant effects of non-pharmaceutical interventions (NPI) on a national level for lockdowns and mask wearing. To our knowledge, there is no other approach to date that has captured NPI effectiveness and infection spreading dynamics for both waves of the COVID-19 pandemic for an entire country population.

On 30th January 2020, the World Health Organization declared a global outbreak, and on 11th March of the same year, it declared the pandemic of coronavirus disease 2019 (COVID-19). COVID-19 is caused by a respiratory virus named SARS-CoV-2, whose main mode of human-to-human transmission is direct, indirect or close contact with an infected person through infected secretions (droplets, aerosols, saliva) produced when that person sneezes, coughs, talks or sings [1].

One of the main characteristics of this virus is that people who get infected do not develop symptoms (i.e. become symptomatic or clinical) immediately upon contracting it. The period of time that passes between exposure to the virus and symptom onset is called the incubation period [1]. The incubation period varies from person to person, with a median of 5.1 days (95% CI = [4.5, 5.8]), but in some instances it can last as long as 14 days [2]. However, studies have also demonstrated that infected people without symptoms, that is, those who are in the incubation period and not yet symptomatic (i.e. presymptomatic or preclinical carriers) and those who never develop any symptoms (i.e. asymptomatic or subclinical carriers), can spread the disease too [3]. Given that subclinical people are not always tested, numerous research findings have suggested that the total number of infections is very likely greater than the number of reported cases [4]. The true proportion of COVID-19 transmissions that is accounted for by preclinical or subclinical people is unknown, which has major implications for prevention [5].

To fight the global pandemic, the nations around the world have been introducing a range of different non-pharmaceutical interventions (NPIs) that aim to control and reduce the rapid spread of the virus.

These NPIs include, but are not limited to, partial and/or total lockdowns on a regional and/or national level, promoting social distancing, and making the wearing of face masks mandatory on public transport, indoors and/or outdoors. Although these measures are proven to be effective in reducing the spread of disease [6], their enforcement affects sociopolitical, economic and all other aspects of life. Given the complexity of societies and the differences in the interventions that different nations are taking in fighting COVID-19, it is very difficult to predict their short and medium term impact on both a national and a global level [7, 8]. Some interventions, such as regional and total national lockdowns, can have a devastating impact on national economies and on the mental health of society, particularly of those most vulnerable. It is therefore particularly important to quantify the effect that these NPIs have on reducing the spread of the virus, so that taking similar approaches can be justified in future decisions.

When it comes to modelling the spread of COVID-19 the three approaches most commonly found in the literature are the compartmental, AI and agent-based modelling approaches, each with their own advantages and disadvantages.

Compartmental models belong to the group of equation based models, and provide a theoretical framework for describing disease dynamics and analysing a specific outbreak or epidemic within a closed and well mixed homogeneous population [9, 10]. Each compartment represents one disease status, and each individual in the population can be in exactly one compartment at a given time but can move from one compartment to another depending on the model parameters [11].

A compartmental model consists of a system of differential equations, with each differential equation representing a single compartment in the model [12]. There are many types of these models, but given that there is a known incubation period for COVID-19, the SEIR (Susceptible-Exposed-Infected-Recovered) compartmental models [13] are most frequently used for modelling its disease dynamics, as, for instance, in the research by Kuniya [14]. By adding new compartments, many refinements of the standard SEIR COVID-19 models have been made in order to create more realistic models that include, for example, super spreaders [15], preclinical and subclinical [16], quarantined [17], and other types of patient.
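To make the "one differential equation per compartment" idea concrete, the following is a minimal sketch of the standard SEIR system integrated with a forward-Euler scheme. It is not taken from any of the cited models: the function name, the rate symbols `beta`, `sigma` and `gamma`, and all numerical values are assumptions chosen purely for illustration (the incubation rate loosely reuses the 5.1 day median quoted above).

```python
def simulate_seir(population, exposed0, beta, sigma, gamma, days, dt=0.1):
    """Forward-Euler integration of the standard SEIR equations.

    beta  : transmission rate per day (illustrative assumption)
    sigma : 1 / incubation period, per day
    gamma : 1 / infectious period, per day
    Returns one (S, E, I, R) tuple per simulated day, including day 0.
    """
    s, e, i, r = float(population - exposed0), float(exposed0), 0.0, 0.0
    steps_per_day = round(1 / dt)
    trajectory = [(s, e, i, r)]
    for _day in range(days):
        for _step in range(steps_per_day):
            # Flows between compartments during one small time step.
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        trajectory.append((s, e, i, r))
    return trajectory


# Hypothetical closed population of one million with ten initially exposed people.
trajectory = simulate_seir(population=1_000_000, exposed0=10,
                           beta=0.5, sigma=1 / 5.1, gamma=1 / 7, days=120)
```

Because every flow leaves one compartment and enters another, the total population stays constant, which is the "closed population" assumption mentioned above.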

Compartmental models represent the majority of the models that can be found in the literature for simulating COVID-19 disease dynamics [8]. Their main advantage is their capability to capture large scale infection dynamics at a macro level (e.g. a country-wide or continent-wide pandemic) with relatively low computational overhead. At the same time, this top down approach is usually listed in the literature as their biggest limitation, as they are unable to capture more refined information on the spread of disease, such as the interaction between individuals [10].

More recently, with increasing data availability, AI approaches have been gaining more traction within COVID-19 modelling, in particular for diagnostic purposes from medical images [18] [19] [20] and for forecasting the spread of the disease. As an example of the former, Wang et al. [18] proposed a new framework that utilises deep learning to differentiate and localise COVID-19 from chest X-ray images of community acquired pneumonia patients. Similarly, an example of the latter case is the work of ArunKumar et al. [21]

Agent Based Models (ABMs) take the opposite approach to the macro level view of the compartmental and AI models, by creating a micro level view and simulating it up to the macro level. ABMs are computer based simulations that consist of heterogeneous and adaptive individual entities called agents that are uniquely identifiable, and capable of acting autonomously and interacting with one another in the simulated environment [23, 24]. Their main characteristic is that, by defining the set of micro-rules that describe the agents' behaviour in the simulated environment, they are able to capture emergent macro effects and realistically model a real world system [25]. More precisely, when it comes to epidemiological modelling, the behaviour of the agents combined with the transmission pattern and disease progression will lead to emerging population dynamics such as a disease outbreak or pandemic [24]. In order to be as realistic as possible, so that the results of the ABMs can be applied on a population level, the characteristics of agents, their behaviour and the characteristics of the disease in the simulated environment should be as close as possible to the ones found in the real world. However, this capability of ABMs to mimic real world scenarios comes at a price: the more detail the models capture, the more computing power is required to run the simulations. Therefore, scalability when creating highly detailed models is often listed in the literature as one of the main limitations of ABMs [10].

2. In the Chernivtsi region of Ukraine, an ABM was developed to predict the spread of COVID-19 using 1000 agents and further applied to regions in Slovakia, Turkey and Serbia using between 500 and 1000 agents [29]. The authors include visual comparisons of the real data and forecasted spread to demonstrate the approach's suitability and report that statistical results and sensitivity analysis were also conducted. The method is noted to be slower and disadvantaged due to its dependence on the random number generator, which can produce different simulation results for the same initial parameters. This is overcome by parallelizing the models on different cores/computers.

3. In Italy, the Calabria region was modelled using an ABM with a closed population of 250 agents moving within a square section of 250 × 250 m² [9]. The number of agents was kept low to minimize the computational cost; the average CPU time for the simulations over the 90 day period was reported to be 3.5 hours on a computer with four cores and 6 GB of RAM. The model was assessed visually by comparing the simulated number of cases with those reported for Cambia and other regions of Italy.

4. In Brazil, an ABM was created to replicate a closed society consisting of 300 agents made up of 5 agent types (people, houses, businesses, the government and the healthcare system) and simulated for a 2 month period [8]. The paper also models the financial impact; however, no information is provided on the calibration of the model.

5. In Ireland, an ABM previously developed for modelling the spread of measles was extended into a hybrid model to represent COVID-19 infection for 12 small towns, with one agent representing each individual in populations ranging in size from 73 to 1782 people [10]. The overall hybrid model was compared with the ABM but no calibration with the real data was discussed.

6. In France, an ABM was constructed for 500,000 agents and run for a period of 360 days [30].

The Araya [32] who model 100 construction worker agents working on one specific project every hour of their working day for a period of 3 working months. Due to the lack of data, both models are hypothetical in nature and based on subject matter expert information rather than data.

The model presented in this paper is motivated by challenges faced by the authors when modelling the infection spread of COVID-19 for the Northern Ireland COVID-19 Modelling Group. Previous models list a large number of fixed parameters, which requires making a large number of assumptions that may be unrealistic or too specific to the population in question to be easily transferable to Northern Ireland. Only one paper [30] clearly describes the calibration performed on its model, and in many papers no calibration is described or appears to have been performed. Likewise, for those papers that consider a subsection of their population and wish to expand it to the entire population, it is unclear how the extrapolation is performed, if at all. Additionally, all of the previously reported models consider only the first wave of the COVID-19 pandemic and are run for a short period of time.

The main contributions of this paper are as follows:

1. We propose a novel methodology based on the hybridisation of change point detection and agent based modelling techniques for modelling COVID-19 infection dynamics and quantifying the effects of non-pharmaceutical interventions on a national level. The proposed methodology is programming language agnostic and enables researchers to develop models that can be run and calibrated on consumer grade hardware and not necessarily just on supercomputers.

2. We demonstrate the effectiveness of the methodology by modelling the spread of new daily confirmed COVID-19 cases in Northern Ireland during the period between 9th March and 15th November 2020. This is accompanied by a detailed description of the entire calibration and validation process, as well as interpretations of the fitted parameters.

3. We are able to successfully capture the role that subclinical (also known as asymptomatic) patients have in the spread of the virus and estimate their relative infectiousness, as well as successfully quantify the effects that the wearing of masks, one regional and two national lockdowns have had on reducing the spread of the virus.

4. As far as we are aware, this study is the first of its kind to successfully capture COVID-19 infection dynamics over such a long period of time and to successfully model both infection waves.

The paper proceeds as follows. In Section 2 we introduce the CP-ABM methodology and provide detailed descriptions of the algorithms of each of its components. In Section 3 we describe how we applied this methodology in Northern Ireland (as our case study) and provide a thorough discussion of the obtained results, drawing parallels with the outcomes found in other independent research studies and literature. Finally, in Section 4 we give our closing remarks and lay out directions for our future research.

The main objective of the CP-ABM modelling approach is to realistically simulate COVID-19 infection transmission within a closed society living in a shared environment during the observed time period. The model assumes that disease transmission may occur only when an infectious (preclinical, clinical or subclinical) person comes into direct contact with a susceptible person, and that changes in the contact behaviour of people are the driving force behind COVID-19 infection rates [33].

Many timeseries datasets can exhibit a sudden change in structure, such as unexpected jumps or drops in level or volatility [34]. The indexes that denote the moments in time at which the characteristics of a timeseries abruptly change are called change points. Change points usually indicate changes in the underlying data generation mechanism. Given that changes in social mixing patterns have a major impact on COVID-19 spreading dynamics, in our work we associate change points with the moments in the timeseries that correspond to those non-pharmaceutical interventions (NPIs) that played a critical role in governing contact behaviour between individuals in the population.

In order to explain how the CP component works we need to introduce some basic definitions. Let us assume that the observed data for the confirmed daily COVID-19 cases is represented by $\mathcal{Y} = \langle y_t \rangle_{t=1}^{n}$, where $n \geq 1$, a sequence of observations in $\mathbb{N}$. Also, let $\mathcal{E} = \langle e_{(i)} \rangle_{i=1}^{m}$ be an increasing sequence of integers that can take the values between 1 and $n$ (inclusive), representing the indexes of occurrence in $\mathcal{Y}$ of all NPIs that were used for regulating social-mixing patterns (e.g. lockdowns, opening/closing schools, etc.). The main task of the CP component is to find $\tau = \langle \tau_{(i)} \rangle_{i=1}^{k} \subset \mathcal{E}$, an increasing sequence of integers that can take the values between 1 and $n - 1$ (inclusive), that represent the change points in $\mathcal{Y}$ and correspond to the key NPIs from $\mathcal{E}$ that were most influential on the spread of the disease.

These change points segregate the timeseries into $k + 1$ segments, with the $i$-th segment containing the observations indexed by $(\tau_{(i-1)}, \tau_{(i)}]$, where $\tau_{(0)} = 0$ and $\tau_{(k+1)} = n$.

In practice the exact number of change points $k$ is usually unknown [34], hence the problem of estimating $\tau$ can be formulated as a model selection problem where the objective is to find the best segmentation $\hat{\tau}$ that minimizes the following quantification criterion:

$$\sum_{i=1}^{k+1} \mathcal{C}\left(y_{(\hat{\tau}_{(i-1)}+1):\hat{\tau}_{(i)}}\right) + \beta(k) \qquad (1)$$

where $\mathcal{C}(\cdot)$ represents a cost function that measures the goodness of fit for segment $(\hat{\tau}_{(i-1)}, \hat{\tau}_{(i)}]$, while $\beta(k)$ is a carefully selected penalty parameter that increases with $k$ [35]. This penalty parameter is used to prevent overfitting by controlling the number of change points and thus penalizing for complexity.

The choice of $\mathcal{C}(\cdot)$ most often depends on the assumptions of how the distribution of the underlying dataset is parametrised. When it comes to the spread of COVID-19, a growing body of evidence suggests that the number of confirmed cases follows a power-law distribution [36]. Fagan et al. [37] demonstrate that no single change point detection method fully captures the heavy-tailed behaviour of data from power-law distributions. To mitigate these issues, they propose a hybrid solution where the optimal segmentation that minimizes (1) is found by utilising the ED-PELT non-parametric change point detection algorithm, in which the empirical distribution of $\mathcal{Y}$ is used to define $\mathcal{C}(\cdot)$ [38], in conjunction with two penalty choices for $\beta(k)$: the modified Bayesian information criterion (mBIC) [39] and the change points for a range of penalties (CROPS) [35].
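The penalised criterion described above can be illustrated with a short sketch. Note that this is deliberately not ED-PELT: it swaps in a simple squared-error (mean shift) cost, a constant penalty per change point, and exhaustive dynamic programming (optimal partitioning) without pruning, so it only shows how "segment costs plus penalty" selects a segmentation. The function name and the test series are assumptions made for illustration.

```python
def detect_change_points(series, penalty):
    """Optimal partitioning: minimise summed segment costs plus a penalty per change point.

    Returns change points as 1-based indexes of the last observation of each
    segment except the final one, matching segments of the form (tau_prev, tau].
    """
    n = len(series)
    # Prefix sums make the squared-error cost of any segment an O(1) lookup.
    prefix = [0.0] * (n + 1)
    prefix_sq = [0.0] * (n + 1)
    for t, value in enumerate(series, start=1):
        prefix[t] = prefix[t - 1] + value
        prefix_sq[t] = prefix_sq[t - 1] + value * value

    def cost(start, end):
        """Squared error of observations start+1..end around their own mean."""
        total = prefix[end] - prefix[start]
        return (prefix_sq[end] - prefix_sq[start]) - total * total / (end - start)

    best = [0.0] * (n + 1)      # best[t]: minimal penalised cost of the first t points
    best[0] = -penalty          # so that the first segment is not penalised
    last_change = [0] * (n + 1)
    for end in range(1, n + 1):
        candidates = [(best[start] + cost(start, end) + penalty, start)
                      for start in range(end)]
        best[end], last_change[end] = min(candidates)

    # Walk back through the stored segment starts to recover the change points.
    change_points = []
    t = n
    while last_change[t] > 0:
        t = last_change[t]
        change_points.append(t)
    return sorted(change_points)


# A hypothetical series with a single jump in level after the 50th observation.
step_series = [10] * 50 + [100] * 50
detected = detect_change_points(step_series, penalty=50.0)
```

Raising `penalty` yields fewer change points and lowering it yields more, which is the trade-off that penalty choices such as mBIC and CROPS are designed to manage.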

A schematic overview of the CP component that we use is shown in Figure 2. The approach of Fagan et al. [37] is utilised in the approximation phase of our algorithm for the CP Detection Component to approximate the locations where the statistical properties of the timeseries $\mathcal{Y}$ change. The pseudo-code of the approximation phase along with the identification, confirmation and segmentation phases is provided in Algorithm 1. The main task of the approximation phase (lines 2-9) is to find the optimal number of change points and their location in $\mathcal{Y}$, and in the case of misalignment to adjust the positions of detected change points to match the nearest NPIs from $\mathcal{E}$ that were crucial in influencing social-mixing patterns in the population (identification phase, lines 10-19). This component also enables the modeller to include any additional NPIs of interest that are not fully captured by the previous two phases (NPI confirmation phase, lines 20-24) and to produce the final segmentation (segmentation phase, line 25) to inform the ABM component of the model to appropriately adapt the behaviour of its agents during the simulation.

Input: $\mathcal{Y} = \langle y_t \rangle_{t=1}^{n}$, $y_t \in \mathbb{N}^+$: a sequence representing a timeseries dataset containing the daily numbers of confirmed COVID-19 cases; $\mathcal{E} = \langle e_{(i)} \rangle_{i=1}^{m}$, $e_{(i)} \leq n$: an increasing sequence of integers taking values between 1 and $n$, representing all indexes in $\mathcal{Y}$ that correspond to NPIs that affected social mixing patterns
Output: $\tau = \langle \tau_{(i)} \rangle_{i=1}^{k}$: an ordered sequence of change points representing the key NPIs from $\mathcal{E}$ that affected disease spreading dynamics the most
1. $\tau \leftarrow \langle \rangle$; $k \leftarrow 1$ (Initialisation)
2. $\hat{\tau}_{mBIC} \leftarrow$ estimated set of change points by applying ED-PELT with mBIC penalty (eq. 1) (Approximation Phase, Figure 2 a)) [37]
3. $\hat{\tau}_{CROPS} \leftarrow$ estimated set of change points by applying ED-PELT with CROPS (eq. 1)
4. if $|\hat{\tau}_{mBIC}| = 0$ and $|\hat{\tau}_{CROPS}| > 2$ then
5-9. …
10. for each $\hat{\tau}_j \in \hat{\tau}$ do (Identification Phase, Figure 2 b))
11. $e \leftarrow \arg\min_{e \in \mathcal{E}} |\hat{\tau}_j - e|$
12. $\tau_{(k)} \leftarrow e$; $k \leftarrow k + 1$
13. end for
14. for each $\tau' \in \hat{\tau}'$ do
15. $e \leftarrow \arg\min_{e \in \mathcal{E}} |\tau' - e|$
16. if interpretation leads to conclusion that $e$ is relevant then
17-18. …
19. end for
20. for each $e \in \mathcal{E} \setminus \tau$ do (Confirmation Phase, Figure 2 c))
21. if interpretation leads to conclusion that $e$ is relevant then
22. $\tau_{(k)} \leftarrow e$; $k \leftarrow k + 1$
23. end if
24. end for
25. return sort($\tau$) (Segmentation Phase, Figure 2 d))
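The alignment step of the identification phase (the arg min over the NPI indexes) can be sketched as follows. This covers only the automatic nearest-NPI matching; the interpretation-based checks of the identification and confirmation phases are judgement calls by the modeller and are not represented. The function name and the detected positions are hypothetical, while the NPI ticks reuse values quoted later in the text.

```python
def align_to_npis(detected_change_points, npi_indexes):
    """Snap each detected change point to the nearest NPI index.

    Implements e = arg min over e in E of |tau_hat - e| for every detected
    change point; duplicates are merged and the result is returned sorted.
    On an exact tie, the earlier entry of npi_indexes is chosen.
    """
    aligned = {min(npi_indexes, key=lambda e: abs(tau_hat - e))
               for tau_hat in detected_change_points}
    return sorted(aligned)


# Hypothetical detected positions, NPI ticks taken from the case study.
npi_ticks = [19, 144, 154, 206, 221]
key_npis = align_to_npis([21, 140, 210], npi_ticks)
```

Here the three detected positions are moved to ticks 19, 144 and 206 respectively, which is the kind of small misalignment the identification phase is meant to correct.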

The main objective of the ABM component is to realistically simulate COVID-19 infection dynamics inside the closed heterogeneous population that inhabits the shared closed environment. It also aims to estimate the effectiveness of the key NPIs that lead to changes in population mixing dynamics, as well as the effectiveness that mask wearing has had on the disease spread during the observation time.

Each cell can contain multiple agents, with agents inheriting the infection status of the cell in which they are located (e.g. all agents on the susceptible cell are susceptible, while those on the exposed cell are exposed, etc.). Let us denote with $\mathfrak{C} = \{c_i\}_{i=1}^{N}$, $N \in \mathbb{N}^+$, the set of $N$ agents that represent all individuals in the simulation, with the $i$-th agent belonging to one of the seven cells (susceptible, shielded susceptible, exposed, preclinical, clinical, subclinical and recovered). An overview of the attributes characterising cells and agents is provided in Table 1.

Table 1. Attributes characterising cells and agents, including the age stratified susceptibility to infection [40].

The ABM component conducts an iterative process (i.e. simulation) that is implemented through a series of discrete steps that we call ticks, with each tick ($t$) in the simulation representing one day in the real world. The simulation starts by initialising the input parameters for the baseline relative infectiousness for each agent type, the number of agents in each state at the beginning of the simulation, the mean daily contacts in each segment between change points, the probability of the exposed agents developing a subclinical infection at the end of the incubation period, and the variations attributed to additional behavioural effects. The output of the model is the estimated number of daily cases $\hat{\mathcal{Y}} = \langle \hat{y}_t \rangle_{t=1}^{n}$, which is then used to evaluate the effectiveness of the NPIs by comparing the estimated values from a number of different scenarios against the true observed data $\mathcal{Y}$.
The model is also able to implicitly capture the number of agents on each cell and track their interactions and behaviours at each tick, but for the focus of this paper we keep our attention on the number of estimated daily cases.

In order to produce outputs that are as realistic as possible, the number of agents to be used in the simulation and their age stratification should be initialised to match publicly available demographics data of the modelled population or geographical region. It is important to highlight that the size of the agent set remains constant within the simulated time period, that is, there is no change in the population demographics. The number of agents to be randomly allocated to each cell during the initialisation phase ($t = 0$) is controlled by a per-cell initial count parameter. In reality, the true number of agents with exposed, preclinical, subclinical and recovered infection status is unknown, so the initial counts for these four cells need to be either assumed or estimated through calibration. In addition, there is the possibility that some susceptible agents are protected and shielded from all other agents. Such individuals cannot be infected and are described as shielded susceptible. Shielded agents are located on the shielded susceptible cell, and the time at which shielding occurs depends on the scenario being modelled.

It has been assumed that the agents' social mixing patterns are the driving force behind COVID-19

Likewise, we introduced a parameter that controls whether or not the agents with a developed clinical infection are also involved in spreading infection at tick $t$.

Overall, the relative infectiousness of the agents is affected by two main factors. The first factor is the baseline relative infectiousness of the agents, which depends on the cell in which they are located. We assume that the baseline relative infectiousness of agents with preclinical and clinical infection is 1 (corresponding to 100%) and that the agents who develop subclinical infection are less infectious than those who develop preclinical and clinical infection, as suggested by the evidence from the literature [5, 16]. Since there is no clear agreement in the literature about how much smaller their relative infectiousness is, the value needs to be obtained via calibration. Once initialised, the baseline relative infectiousness remains constant throughout the simulation for all infectious agents.

The second factor that has a significant impact on the overall relative infectiousness of the agents is the introduction and relaxation of non-pharmaceutical interventions, such as social distancing, mask wearing enforcement or hygiene promoting rules, whose effect cannot be captured solely by a decrease or increase of the mean daily contacts. To capture this extra variation attributed to external behavioural effects, we introduce a behavioural parameter that takes the value 0 if no non-pharmaceutical intervention is taking place at tick $t$, and otherwise takes a real value from the [0,1] interval that is again obtained by calibration. The value of the parameter varies from intervention to intervention and requires separate fitting for each intervention.
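As a rough illustration of how the two factors might combine, the sketch below scales the baseline relative infectiousness by one minus the behavioural parameter. This multiplicative form is purely an assumption made for illustration, since the exact expression used by the model is not reproduced in this text; the function and dictionary names are likewise hypothetical. The subclinical baseline of 0.39 is the calibrated estimate reported later in the paper.

```python
# Baseline relative infectiousness per infectious cell: 1.0 for preclinical and
# clinical agents (stated assumption of the model), 0.39 for subclinical agents
# (the calibrated estimate reported in the results).
BASELINE_INFECTIOUSNESS = {"preclinical": 1.0, "clinical": 1.0, "subclinical": 0.39}


def effective_infectiousness(cell, behavioural_effect):
    """Combine baseline infectiousness with a behavioural effect in [0, 1].

    ASSUMPTION: the behavioural effect acts as a proportional reduction,
    i.e. baseline * (1 - effect). A value of 0 means no intervention is active.
    """
    if not 0.0 <= behavioural_effect <= 1.0:
        raise ValueError("behavioural_effect must lie in [0, 1]")
    return BASELINE_INFECTIOUSNESS[cell] * (1.0 - behavioural_effect)


no_npi = effective_infectiousness("clinical", 0.0)
with_npi = effective_infectiousness("subclinical", 0.5)  # hypothetical effect size
```

Under this assumed form, a behavioural parameter of 0 leaves the baseline unchanged, which matches the description of the no-intervention case.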

The probability that the susceptible agent will become infected at tick depends on its age stratified susceptibility to infection ℊ ( 

Table 2 contains a timeline of the occurrences of these interventions and the ticks at which the agents should change their mean daily contact behaviour in the simulation. In the simulation time period, the two tick intervals (19, 144] and (221, 251] correspond to the two national lockdowns that were introduced in March and October respectively, whereas the tick interval (206, 221] corresponds to a regional lockdown in the Derry and Strabane local government district (LGD) in October 2020. In NI the mandatory wearing of masks commenced on 10th August 2020 and remained in place until the end of the observational study. The wearing of masks is incorporated into the model through its impact on the overall relative infectiousness, not mean daily contacts, and is captured by the behavioural parameter. An overview of all the behavioural and mean daily contact parameters that were used in the simulations is provided in Table 3. This requires fitting a total of 22 parameters, the process of which is described in Section 3.1. If we were interested in modelling a shorter time window, fewer mean daily contact and behavioural parameters would need to be fitted, resulting in fewer parameters overall. This is entirely driven by the number of key NPIs identified by the CP component during the modelled time period. The timeline of these events plotted against the daily numbers of confirmed COVID-19 cases is illustrated in Figure 4.

Figure 4. Timeline of key events (Table 2) plotted against the daily numbers of confirmed COVID-19 cases; the shaded backgrounds represent the duration of key NPIs such as the national and regional lockdowns, the wearing of facemasks and a period of no lockdown restrictions at all.
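The mapping between calendar dates and ticks can be made explicit with a small sketch. It assumes that tick 0 corresponds to 9th March 2020, which is consistent with the ticks quoted in the text (10th August as tick 154, 1st October as tick 206, 15th November as tick 251). The names used are hypothetical, and treating mask wearing as active from tick 154 inclusive is an assumption.

```python
from datetime import date

SIMULATION_START = date(2020, 3, 9)  # assumed to be tick 0

# Half-open tick intervals (start, end] as given in the text.
NPI_INTERVALS = {
    "first national lockdown": (19, 144),
    "regional lockdown (Derry and Strabane)": (206, 221),
    "second national lockdown": (221, 251),
}
MASKS_FROM_TICK = 154  # 10th August 2020


def date_to_tick(day):
    """Number of days elapsed since the assumed simulation start."""
    return (day - SIMULATION_START).days


def active_npis(tick):
    """Names of the interventions in force at a given tick."""
    active = [name for name, (start, end) in NPI_INTERVALS.items()
              if start < tick <= end]
    if tick >= MASKS_FROM_TICK:  # assumption: masks count from tick 154 inclusive
        active.append("mandatory mask wearing")
    return active


mask_tick = date_to_tick(date(2020, 8, 10))
npis_on_tick_221 = active_npis(221)
```

Because the intervals are open on the left and closed on the right, tick 221 still belongs to the regional lockdown and the second national lockdown only applies from tick 222 onwards.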

For the purpose of the ABM component calibration we utilised the standard generational genetic algorithm (GA) [43] with population size 50, 15% mutation rate and 85% crossover rate, using tournament selection.

To identify the optimal values of the input parameters of the ABM component for the NI model, the GA ran continuously for almost 18 days and was stopped when there was no further improvement in the fitness score. The fitness score of the best fit ABM component was 10899.72. An overview of the input parameters and their values that correspond to the best fitted model is provided in Table 3. All estimates produced by the ABM component are obtained by running the simulations 50 times, using the same random number generator seeds as the ones used by the GA with the best solution (a choice made for reproducibility), and averaging the results obtained. In order to ensure that the ABM component produces realistic patterns, deeming it fit for purpose, the patterns in its output had to agree with the patterns in the observed dataset, which is validated against several different criteria. Firstly, all GA estimated input parameters relating to the mean daily contacts
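A generational GA with the stated settings (population size 50, 15% mutation rate, 85% crossover rate, tournament selection) can be sketched as below. Everything beyond those four settings is an assumption: binary tournaments, uniform crossover, per-gene Gaussian mutation clipped to [0, 1], single-individual elitism, and a toy fitness function that stands in for running the ABM and scoring its output against observed cases.

```python
import random


def genetic_algorithm(fitness, n_params, generations=100, pop_size=50,
                      crossover_rate=0.85, mutation_rate=0.15, seed=0):
    """Minimise `fitness` over parameter vectors in [0, 1]^n_params."""
    rng = random.Random(seed)
    population = [[rng.random() for _ in range(n_params)] for _ in range(pop_size)]
    best = None  # (score, parameters) of the best individual seen so far

    def evaluate(individuals):
        scored = [(fitness(ind), ind) for ind in individuals]
        return scored, min(scored, key=lambda pair: pair[0])

    for _generation in range(generations):
        scored, candidate = evaluate(population)
        if best is None or candidate[0] < best[0]:
            best = candidate
        next_population = [list(best[1])]  # elitism (assumed)
        while len(next_population) < pop_size:
            parents = []
            for _pick in range(2):
                a, b = rng.sample(scored, 2)  # binary tournament (assumed size)
                parents.append(a[1] if a[0] <= b[0] else b[1])
            if rng.random() < crossover_rate:
                child = [rng.choice(genes) for genes in zip(*parents)]  # uniform crossover
            else:
                child = list(parents[0])
            for g in range(n_params):
                if rng.random() < mutation_rate:  # per-gene mutation (assumed)
                    child[g] = min(1.0, max(0.0, child[g] + rng.gauss(0.0, 0.1)))
            next_population.append(child)
        population = next_population

    _, candidate = evaluate(population)
    if best is None or candidate[0] < best[0]:
        best = candidate
    return best


# Toy stand-in for "run the ABM and compare with observed daily cases".
TARGET = [0.2, 0.8, 0.5]


def toy_fitness(params):
    return sum((p - t) ** 2 for p, t in zip(params, TARGET))


best_score, best_params = genetic_algorithm(toy_fitness, n_params=3)
```

Fixing the seed makes a run repeatable, which mirrors the choice described above of reusing the GA's random number generator seeds when re-running the best solution.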

Following the validation of the model, we can focus our attention on the other ABM input parameters that have been estimated by the GA. We have estimated that the overall proportion of people who contract COVID-19 and develop a subclinical infection is 38%, which falls within the 95% prediction interval of the proportions reported in the study by Buitrago-Garcia et al. [5]. There are no clear guidelines on the baseline relative infectiousness of people with a subclinical infection. However, given that the model validates well against several different criteria and that the estimated subclinical proportion is in line with the findings reported in the literature, we estimate that their baseline relative infectiousness is 39%.

The main effect attributed to lockdowns is the change in social mixing patterns captured by the mean daily contacts parameter. In addition, lockdowns can also lead to a change in other behavioural aspects of the population, which is captured by the behavioural effects parameter. We assume that these additional behavioural changes are a direct result of other NPIs that were not directly impacting social

With this in mind, the resulting behavioural effects parameter for the regional lockdown

In the period between 10th August (tick 154) and 1st October (tick 206) only the wearing of masks was compulsory (blue coloured area in Figure 4)

where $C$ and $\bar{C}$ are the actual and projected numbers of cumulative confirmed cases respectively.
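The comparison between actual and projected cumulative cases can be sketched as a relative change. Since the expression for equation (3) is not reproduced in this text, the usual definition of a percentage increase is assumed here; the function name and the counts in the example are hypothetical.

```python
def percentage_increase(actual_cumulative, projected_cumulative):
    """Percentage by which the projected cumulative cases exceed the actual ones.

    ASSUMPTION: the standard relative change 100 * (projected - actual) / actual.
    """
    if actual_cumulative <= 0:
        raise ValueError("actual_cumulative must be positive")
    return 100.0 * (projected_cumulative - actual_cumulative) / actual_cumulative


# Hypothetical counts: 1000 actual cases against 1470 projected without an NPI.
increase = percentage_increase(1000, 1470)
```

A positive value indicates that the counterfactual scenario would have produced more cumulative confirmed cases than were observed.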

These results have been further validated by comparing the CP-ABM estimations with the ones found in the literature. In Germany, Mitze et al. [44] cases in the scenario when masks were never introduced is illustrated in Figure 6 a).

A similar approach was taken for quantifying the effectiveness of the lockdowns. Given the assumption that the NPIs being made during lockdowns lead to changes in social-mixing patterns and other aspects of people's behaviour, the effectiveness of the lockdowns was estimated by altering both the mean daily contacts and behavioural effect parameters. Experiments are then simulated to consider what would have happened if lockdowns were never introduced by setting the mean daily contacts and behavioural parameters during the lockdown to be equal to their respective values prior to lockdown and kept constant for the entire evaluation period.
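The construction of a "lockdown never introduced" scenario can be sketched as copying the fitted parameters and overwriting the lockdown segments with the values of the segment that preceded them. The data layout (a dictionary keyed by tick intervals), the function name and all numerical values are assumptions for illustration only; they are not the fitted values from Table 3.

```python
def counterfactual_parameters(fitted, lockdown_segments, reference_segment):
    """Return a copy of the fitted parameters in which each lockdown segment
    reuses the mean daily contacts and behavioural effect of the reference
    (pre-lockdown) segment. The fitted parameters are left untouched."""
    scenario = {segment: dict(values) for segment, values in fitted.items()}
    for segment in lockdown_segments:
        scenario[segment] = dict(fitted[reference_segment])
    return scenario


# Hypothetical fitted values keyed by half-open tick intervals (start, end].
fitted = {
    (0, 19): {"mean_daily_contacts": 10.0, "behavioural_effect": 0.0},
    (19, 144): {"mean_daily_contacts": 3.0, "behavioural_effect": 0.4},
}
no_first_lockdown = counterfactual_parameters(fitted, [(19, 144)], (0, 19))
```

Running the simulation once with `fitted` and once with `no_first_lockdown` would then give the two cumulative case series that are compared in the experiments.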

We conducted three experiments in total. In the first experiment we explore the impact of the first national lockdown by projecting the number of cumulative confirmed cases in the scenario in which the first national lockdown was never introduced, and hence perform the simulations with the mean daily contacts and behavioural effects parameters set to match those prior to the lockdown. In the second experiment we explore the impact of the regional lockdown by projecting the number of cumulative confirmed cases in the scenario in which the regional lockdown and second national lockdown did not occur. This required setting the mean daily contacts to be the same as the ones that preceded the regional lockdown. In the third experiment we consider the scenario in which the second national lockdown was never introduced, with only the regional lockdown taking place. Therefore, the mean daily contacts and behavioural effects parameters were set to match those of the regional lockdown.

The projection for the first experiment, in which there was no first national lockdown, is shown in Figure 6 b). Having only a regional lockdown in the Derry and Strabane local government district was not a strong enough intervention on its own to prevent the spread of disease throughout the country; however, it did slow the rise in cases, as the projected number of cumulative confirmed cases is almost 47% greater than observed, as opposed to 78% greater when there is neither a regional nor a second national lockdown (Figure 6 c)). The figures also demonstrate that without the second national lockdown being introduced, we would have expected an increase of 47% in the number of confirmed cases on 15th November.

Figure 6. The estimated percentage increase in cumulative confirmed COVID-19 cases (equation (3)) in Northern Ireland if: a) the mask wearing rules were never introduced, b) there was no first national lockdown, c) there was no regional lockdown in Derry and Strabane local government district nor second national lockdown, and d) there was no second national lockdown.

The NI COVID-19 infection dynamics model that was built on the CP-ABM methodology, that we introduce in this paper, is implemented using a combination of multiple technologies. 

This paper introduces the CP-ABM, a new state-of-the-art methodology that is capable of accurately mimicking the disease dynamics of COVID-19. The application of the CP-ABM to the Northern Ireland population is included as a demonstration of its suitability and effectiveness for this purpose. The model covers a period of 251 days and, to the best of our knowledge, this study is the first of its kind to successfully capture COVID-19 infection dynamics over such a long period of time and to model both infection waves. Genetic algorithms were used for calibration to ensure that the exact infection dynamics patterns can be replicated by the CP-ABM. This was validated through the existing literature. We also capture the role played by subclinical people in the infection dynamics workflow.

The CP-ABM approach uniquely incorporates change point detection into agent based models in order to identify key events which lead to changes in contact behaviour between people, which is the driving force behind the COVID-19 infection rates [43]. The ABM component, unlike the top down approaches of the equation based modelling techniques, allows the simulation of each individual's behaviour in the population. The resulting key events from the CP component are used in the ABM component to appropriately direct the adaptive behaviour of its agents to produce realistic results. The demand for high computational power has limited the application of ABMs on a population level for COVID-19. However, we overcome this issue by employing an efficient infection centric modelling approach which aims to capture the interactions of only those agents who are able to transmit the disease to others during the simulation. The proposed methodology is programming language agnostic and does not necessarily require supercomputers, so it enables researchers to develop models that can be run and calibrated on consumer grade hardware.
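The two ideas above, change points that switch the agents' contact behaviour and an infection centric loop that only visits agents able to transmit, can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the names `contacts_for_day` and `step`, the two-state (`"S"`/`"I"`) agents, random mixing, the rounding of the mean contacts and the absence of recovery are all hypothetical simplifications.

```python
import random


def contacts_for_day(day, change_points, contact_levels):
    """Return the mean daily contacts in force on `day`.

    change_points: sorted days on which contact behaviour changes.
    contact_levels: one level per segment, i.e. len(change_points) + 1 values.
    """
    segment = sum(1 for cp in change_points if day >= cp)
    return contact_levels[segment]


def step(states, infectious, day, change_points, contact_levels, p_transmit, rng):
    """Advance one day, iterating only over agents who can transmit."""
    n = len(states)
    mean_contacts = contacts_for_day(day, change_points, contact_levels)
    newly_infected = set()
    for _agent in infectious:  # susceptible agents are never visited directly
        for _ in range(round(mean_contacts)):  # simplification: fixed contact count
            other = rng.randrange(n)  # assumed random mixing
            if states[other] == "S" and rng.random() < p_transmit:
                newly_infected.add(other)
    for agent in newly_infected:
        states[agent] = "I"
    infectious.update(newly_infected)
    return len(newly_infected)


# Hypothetical usage: contacts drop from 8 to 2 per day at a change point on day 10.
rng = random.Random(0)
states = ["S"] * 1000
for seed in (0, 1, 2):
    states[seed] = "I"
infectious = {0, 1, 2}
for day in range(30):
    step(states, infectious, day, [10], [8, 2], 0.05, rng)
print(len(infectious))
```

The cost of each step in this sketch scales with the number of infectious agents rather than with the population size, which is the property that makes population-level runs feasible on modest hardware.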

The new CP-ABM methodology is also able to quantify the effects of non-pharmaceutical interventions on a national level which we demonstrated in use cases that estimate the effectiveness of masks and regional and national lockdowns. This has been achieved through the simulations of different scenarios which consider what would have happened if those NPIs were never introduced, and compare the estimated projections of the confirmed daily cases with the real observations. We identify that the NPIs in Northern Ireland considered in our experiments were necessary and highly effective in preventing an even more extreme outbreak of the country's pandemic.

Although not explored in this paper, the model can be expanded to have the functionality to capture the number of agents on each cell and their interactions at each tick during simulations. This can be utilised in future work to consider the interaction between agents in different settings and to evaluate scenarios for considering the impact of vaccines, new COVID-19 variants and detailed analysis of clusters dynamically changing over time. We demonstrated the CP-ABM capabilities through the implementation of the model on the NI population but the model is not unique to NI and therefore it can be applied to other countries and geographical areas (both large and small). Also, the methodology is not COVID-19 specific and hence may be used as a basis for creating epidemiological models to capture infection dynamics properties and the assessment of the NPI effects of other diseases both in human and animal health.

Our research is implemented using the NetLogo language, which is a higher level language for rapid development of agent based models. The speed could be significantly improved if the algorithm were implemented in one of the lower level languages, such as C, C++ or Rust. Also, in this paper we use standard genetic algorithms to calibrate the ABM component's hyperparameters, and in the future we wish to explore other optimisation strategies to investigate whether they would yield results faster.
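To make the calibration step concrete, the following is a minimal sketch of a genetic algorithm that fits model parameters to an observed case series. The operators shown (tournament selection, uniform crossover, Gaussian mutation, elitism), the RMSE objective and every name in the code are assumptions for illustration; the section does not specify which operators or objective were used. The `toy_model` function is a hypothetical stand-in for a full ABM run.

```python
import random


def rmse(simulated, observed):
    """Root mean squared error between two equally long series."""
    return (sum((s - o) ** 2 for s, o in zip(simulated, observed)) / len(observed)) ** 0.5


def toy_model(params, days=30):
    """Stand-in for an ABM run: exponential growth from `seeds` at `rate`."""
    rate, seeds = params
    return [seeds * (1.0 + rate) ** t for t in range(days)]


def calibrate(observed, bounds, rng, pop_size=40, generations=60, mutation_sd=0.05):
    """Search for parameters within `bounds` that minimise RMSE to `observed`."""

    def fitness(ind):
        return rmse(toy_model(ind, len(observed)), observed)

    def tournament(pop):
        a, b = rng.sample(pop, 2)
        return a if fitness(a) < fitness(b) else b

    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        children = [min(population, key=fitness)]  # elitism: keep the best unchanged
        while len(children) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            child = [rng.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
            # Gaussian mutation, clipped back into the allowed range
            child = [min(hi, max(lo, g + rng.gauss(0.0, mutation_sd * (hi - lo))))
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        population = children
    return min(population, key=fitness)


# Hypothetical usage: recover the parameters of a synthetic "observed" series.
observed = toy_model([0.1, 5.0])
best = calibrate(observed, [(0.0, 0.3), (1.0, 10.0)], random.Random(0))
print(best, rmse(toy_model(best), observed))
```

Because each fitness evaluation in a real calibration is a full simulation run, the number of evaluations (population size times generations) dominates the runtime, which is why alternative optimisation strategies may be worth exploring.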

Transmission of SARS-CoV-2: implications for infection prevention precautions

The incubation period of coronavirus disease 2019 (COVID-19) from publicly reported confirmed cases: Estimation and application

Evidence for transmission of covid-19 prior to symptom onset

Estimating the asymptomatic proportion of coronavirus disease 2019 (COVID-19) cases on board the Diamond Princess cruise ship

Occurrence and transmission potential of asymptomatic and presymptomatic SARS-CoV-2 infections: A living systematic review and meta-analysis

The effect of control strategies to reduce social mixing on outcomes of the COVID-19 epidemic in Wuhan, China: a modelling study

Short-term forecasting COVID-19 cumulative confirmed cases: Perspectives for Brazil

COVID-ABS: An agent-based model of COVID-19 epidemic to simulate health and economic effects of social distancing interventions

A hybrid multi-scale model of COVID-19 transmission dynamics to assess the potential of nonpharmaceutical interventions

A hybrid agent-based and equation based model for the spread of infectious diseases

Modeling Epidemics with Compartmental Models

Mathematical and computational approaches to epidemic modeling: a comprehensive review

Global dynamics of a SEIR model with varying total population size

Prediction of the Epidemic Peak of Coronavirus Disease in Japan

Mathematical modeling of COVID-19 transmission dynamics with a case study of Wuhan

Age-dependent effects in the transmission and control of COVID-19 epidemics

Mathematical modelling of COVID-19 transmission and mitigation strategies in the population of Ontario

Automatically discriminating and localizing COVID-19 from community-acquired pneumonia on chest X-rays

MetaCOVID: A Siamese neural network framework with contrastive loss for n-shot diagnosis of COVID-19 patients

Multi-task contrastive learning for automatic CT and X-ray diagnosis of COVID-19

Forecasting of COVID-19 using deep layer Recurrent Neural Networks (RNNs) with Gated Recurrent Units (GRUs) and Long Short-Term Memory (LSTM) cells

A Survey on Mathematical, Machine Learning and Deep Learning Models for COVID-19 Transmission and Diagnosis

Analyzing the impact of modeling choices and assumptions in compartmental epidemiological models

Why should we apply ABM for decision analysis for infectious diseases?-An example for dengue interventions

An agent-based approach for modeling dynamics of contagious disease spread

Modelling transmission and control of the COVID-19 pandemic in Australia

OpenABM-Covid19 - An agent-based model for non-pharmaceutical interventions against COVID-19 including contact tracing

Spatio-temporal simulation of the novel coronavirus (COVID-19) outbreak using the agent-based modeling approach (case study

Modeling and analysis of different scenarios for the spread of COVID-19 by using the modified multi-agent systems - Evidence from the selected countries

A stochastic agent-based model of the SARS-CoV-2 epidemic in France

An agent-based model to evaluate the COVID-19 transmission risks in facilities

Modeling the spread of COVID-19 on construction workers: An agent-based approach

Quantifying the impact of physical distance measures on the transmission of COVID-19 in the UK

Analysis of changepoint models

Computationally Efficient Changepoint Detection for a Range of Penalties

Power-law distribution in the number of confirmed COVID-19 cases

Change point analysis of historical battle deaths

A computationally efficient nonparametric approach for changepoint detection

A modified Bayes information criterion with applications to the analysis of comparative genomic hybridization data

Estimation of country-level basic reproductive ratios for novel Coronavirus (SARS-CoV-2/COVID-19) using synthetic contact matrices

Agent-Based and Individual-Based Modeling: A Practical Introduction

Home - NINIS: Northern Ireland Neighbourhood Information Service

Adaptation in Natural and Artificial Systems

Face masks considerably reduce COVID-19 cases in Germany

Genetic Algorithms for the Exploration of Parameter Spaces in Agent-Based Models

The authors wish to acknowledge the Northern Ireland COVID-19 Modelling Group for their interaction in the model building process and their valuable feedback. We also wish to acknowledge Advanced Analytics Labs Ltd for the provision of the computing resources for model building, calibration and running of the experiments described in this paper.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

This research has been funded by Queen's University Belfast.


Ontario Tech University with expertise in the fields of data analytics and artificial intelligence and their interdisciplinary applications in solving big data problems, especially the ones related to real-time healthcare analytics, and decision making under uncertainty.