key: cord-1038268-u0vq8987
authors: Decock, Kristof; Debackere, Koenraad; Vandamme, Anne-Mieke; Van Looy, Bart
title: Scenario-driven forecasting: modeling peaks and paths. Insights from the COVID-19 pandemic in Belgium
date: 2020-07-13
journal: Scientometrics
DOI: 10.1007/s11192-020-03591-6
sha: 5fcc6bf27fd1e159506f4346c7bbbe89ec8d46b4
doc_id: 1038268
cord_uid: u0vq8987

The recent 'outburst' of COVID-19 spurred efforts to model and forecast its diffusion patterns, in terms of infections, people in need of medical assistance (ICU occupation) or casualties. Forecasting patterns and their implied end states remains cumbersome when only a few (stochastic) data points are available during the early stage of a diffusion process. Extrapolations based on compounded growth rates account for neither inflection points nor end states. To remedy this situation, we advance a set of heuristics which combine forecasting and scenario thinking. Inspired by scenario thinking, we allow for a broad range of end states (and their implied growth dynamics and parameters), which are subsequently assessed in terms of how well they coincide with actual observations. When applying this approach to the diffusion of COVID-19, it becomes clear that combining potential end states with unfolding trajectories provides a better-informed decision space: short-term predictions are accurate, while a portfolio of different end states informs the long view. The creation of such a decision space requires temporal distance: more plausible end states become visible only to the extent that one refrains from immediately incorporating more recent data. Such a dynamic approach also allows one to assess the potential effects of mitigating measures. As such, our contribution implies a plea for dynamically blending forecasting algorithms and scenario-oriented thinking, rather than conceiving them as substitutes or complements.
In December 2019, the first cases of COVID-19 were reported in Wuhan City (Hubei province, China). COVID-19 is a respiratory infectious disease caused by SARS-CoV-2, a virus of the Coronaviridae family that is believed to have a zoonotic origin. By January 20, 2020, cases were confirmed in Thailand, Japan and South Korea. The first European cases were reported from France on January 24 and from Germany by January 28. The World Health Organization (WHO) declared the outbreak a "public health emergency of international concern" on January 30, 2020, and the WHO Director-General labelled it a global pandemic on March 11, 2020. Italy was the first European country heavily affected by COVID-19, with clusters of cases being reported in Lombardy, Piedmont and Veneto on February 22, 2020. Over the following days and weeks, additional cases were reported from several other European countries. By March 18, Italian hospitals experienced a severe shortage of Intensive Care Unit (ICU) beds, hospital beds and breathing assistance equipment. Up to 1000 people per day died from COVID-19 in Italy, with the death toll exceeding 10,000 by March 28.

During the early stage of a pandemic outbreak, governments have to take critical decisions relying on only a few available stochastic data points. Epidemiological modeling is one of the key methods available to inform governments about the presumed underlying dynamics and parameters of the epidemic, by fitting the observed data to existing models obtained from previous epidemics and delineating the uncertainty around parameters. As soon as the uncertainty is low enough to model the course of the epidemic, the effect of containment and mitigation measures can be simulated, and the resulting outcomes can then inform decision making. The most used models for the spread of COVID-19 in a naïve population are variants of Susceptible/Infected/Recovered (SIR) models, with potentially additional categories (Wu et al.
2020). They represent a virus moving into a Susceptible population, where those Exposed progress to Infected and either Recover or Die (E and D being additional categories relevant for COVID-19), and where the rates of transition between compartments have to be estimated from the data points. Such estimation also implies that one uncovers the reproductive number R (the average number of secondary infections caused by one infected person) and either the number of infected people or the infection fatality rate (the one being derived from the other). When not enough data points are available to estimate those parameters (e.g. by combining differential equations with stochastic and Bayesian inference), and when neither the number of infected people (given the absence of surveillance testing) nor the infection fatality rate (given the unknown proportion of undiagnosed cases) is known, such models have to make assumptions that are not necessarily true, which widens the uncertainty of the predictions. As epidemiological models cannot accurately estimate the basic epidemiological parameters during the early phases of a newly emergent virus, it becomes speculative to delineate or forecast future capacity needs within the health system (e.g. number of hospital beds, ICU beds, …), including the implied timeframes. Assessing the resilience of the healthcare system and taking appropriate decisions on the medical capacity potentially required calls for insight into both short-term evolutions and medium-term end states.

Within this paper, we examine whether diffusion models that are widely known and used within the innovation (and technology) discipline, and that are 'agnostic' in terms of underlying parameters, bear relevance for modeling and forecasting pandemic phenomena like the spread of COVID-19. Within the next section, we briefly outline the essence of diffusion models, whereby the seminal work of Bass (1969) provides our starting point.
Next, we outline the heuristics, building on the Bass model, advanced by Decock et al. (2020b) that project and assess multiple end states ('scenarios') in parallel. This approach is then applied to forecast and monitor the ICU capacity in Belgium during the (first) rise of the COVID-19 pandemic in March 2020. It will become visible that this approach yields accurate predictions and is instrumental in informing capacity-related decision-making processes. We end by discussing limitations, potential refinements and avenues for future research.

When modelling the diffusion of an innovation, two processes are advanced: adoption initiated by so-called innovators (Rogers 1962), followed by imitation. As such, these processes resemble the phenomena observed in the context of epidemics, which revolve around infection resulting in local spread. Management scholars apply sigmoid growth curves, like the logistic equation (Verhulst 1838) or the Bass model (1969), to model and forecast the outcomes of the underlying dynamics. During the early stages after introduction, exponential growth prevails; as the diffusion process evolves, the growth rate diminishes and leads to an end stage of adoption, i.e. market or segment saturation. In the context of the COVID-19 pandemic, we apply the Bass (1969) diffusion model to mirror the underlying growth dynamics. In general, pandemics evolve in a sigmoidal manner whereby infectious spread ultimately results in herd immunity. This process towards herd immunity might imply high peaks (of simultaneously infected people), whereby the height of the peak is unknown in the case of a new virus. These peaks will, after a time delay, result in an increase in the number of hospitalized people, then in patients entering the ICU, and lastly in a rise in the number of deceased.
All these subsequent curves follow a sigmoid shape as well, with heights and timing driven by exposure and transition rates which are also not exactly known in the case of a new virus (Kermack and McKendrick 1927). As this process 'mimics' the diffusion patterns known in the management literature, modelling based on diffusion models stemming from that literature might become relevant.

As explained in Decock et al. (2020b), the straightforward intuition and predictive ability of the model advanced by Bass (Parker 1994; Chandrasekaran and Tellis 2007) resulted in its adoption by marketing and innovation scholars alike (e.g. Mahajan et al. 1990; Massiani and Gohs 2015). The Bass model disentangles innovators from imitators, the two groups following a different adoption rationale and reacting to different means of communication (Bass 1969; Lekvall and Wahlbin 1973; Mahajan et al. 1990; Bass et al. 1994). In its discrete form, the Bass model can be written as:

$$N_t = N_{t-1} + p\,(m - N_{t-1}) + q\,\frac{N_{t-1}}{m}\,(m - N_{t-1})$$

where N_t = the cumulative number of adopters (here: ICU occupation) at time t, m = the potential market (here: the ultimate ICU occupation; the end state), p = the coefficient of innovation (here: the infection parameter), q = the coefficient of imitation (here: the intensity of contamination), (m − N_{t−1}) = the non-adopters at the beginning of period t, and (N_{t−1}/m) = the fraction that has already adopted.

Predicting the evolution of these S-shaped curves requires an estimation of at least three parameters, related to the takeoff, the steepness, and the saturation level of the curve. Most scholars use Bass's model as a starting point to derive the 'optimal' set of parameter estimates and consequently distill an end state. Decock et al. (2020b) conclude their overview of the literature by pointing at the consensus among scholars and practitioners alike: it remains a cumbersome endeavor to delineate robust parameter estimates when only few trend data are available (e.g.
Van den Bulte and Lilien 1997; Mahajan et al. 1990). As such, scholars and practitioners tend to favor scenario-oriented methods during these early stages, which are characterized by high levels of uncertainty, in order to explore plausible futures and their constituents. Indeed, from the 1970s onwards, quantitative forecasting (including trend extrapolation) has been losing momentum in favor of qualitative foresight approaches. The 'traumatic' effect of the oil crisis of 1973 initiated a "paradigm shift" in future-oriented research (Mietzner and Reger 2005), as this event did not fit most of the anticipated and predicted futures. Only a few companies, including Royal Dutch/Shell, had been elaborating scenarios in which this type of event was portrayed as plausible (van der Heijden 1997; Schwartz 1991). In the early 1970s, Pierre Wack, inspired by the pioneering work of Herman Kahn at the Rand Corporation in the 1960s (e.g. Kahn and Wiener 1967), applied foresight theories from the field of public planning to business contexts (see also Wack 1985a, b). Since the 1980s, scenario-oriented thinking has been further developed for management purposes (with pioneering work by members of Shell's Strategic Planning Group, including Arie de Geus, Peter Schwartz and Kees Van der Heijden) at the expense of more quantitative approaches, including extrapolation and forecasting.

Recently, Decock et al. (2020b) built on the initial Bass model in order to model quantitatively different scenarios related to the diffusion of the Battery Electric Vehicle. The heuristics advanced start from the premise of multi-finality (Buckley 1967): "similar initial conditions may lead to dis-similar end-states". To allow different end states to unfold, a wide range of the implied parameters (m, p and q) is initially considered.
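The discrete Bass recursion referred to above can be made concrete with a few lines of code. The following is a minimal sketch, assuming the standard discrete form N_t = N_{t−1} + (p + q·N_{t−1}/m)(m − N_{t−1}); the function name `bass_path` and the parameter values are hypothetical and chosen only for illustration, not taken from the study.

```python
def bass_path(m, p, q, n0=0.0, steps=30):
    """Iterate the discrete Bass recursion and return [N_0, ..., N_steps].

    m: end state (saturation level), p: coefficient of innovation,
    q: coefficient of imitation, n0: starting level (an assumption here).
    """
    path = [float(n0)]
    for _ in range(steps):
        prev = path[-1]
        remaining = m - prev                     # those who have not yet 'adopted'
        growth = (p + q * prev / m) * remaining  # innovation plus imitation effect
        path.append(prev + growth)
    return path


# Illustrative (made-up) parameters: one scenario with an end state of 2000.
path = bass_path(m=2000, p=0.01, q=0.4, n0=0, steps=60)
```

With p + q below 1, as in this example, the path rises monotonically and levels off at m, which is the sigmoid behaviour the text relies on. Different (m, p, q) triples can produce very similar early values, which is why a handful of early observations rarely pins down the end state.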
In a next step, more plausible scenarios are selected by means of a loss function, which confronts the obtained forecasts with the few real observations already available. As such, this approach requires no ex ante assumptions about the relevant range of the implied growth parameters (p and q), nor about potential end states.

The proposed heuristics revolve around the development of a three-dimensional search space, reflecting the presence of three model parameters to be estimated. The search grid considered in this paper consists of 250,000 different parameter combinations reflecting different scenarios. For m, we vary the end state in steps of 10% (from 10 to 100%), while the ranges of p and q are divided into 250 and 100 steps respectively. The considered m values have been defined based on the population size, whereas relevant ranges of p and q have been identified in line with ranges documented within the innovation diffusion literature. The exhaustive search grid allows us to assess all different combinations of the three parameters, with no ex ante assumptions on either initial values for each parameter or potential equivalences between different value combinations across the three parameters. Thus, in this first step, all imaginable combinations are considered to gauge potential diffusion pathways.

In a next step, we select more plausible diffusion pathways by introducing a loss function. This loss function assesses how well each parameter combination explains the currently available observations; we therefore calculate 250,000 loss values, whereby the 'goodness of fit' of each combination is expressed as an R² value. In a final step, we introduce a threshold value pertaining to the R², in order to select (and, in a subsequent step, to assess and qualify) more plausible scenarios.
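The three steps described above (exhaustive grid, goodness of fit per combination, threshold selection) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the goodness of fit is taken to be the ordinary coefficient of determination, the first observation is used as the starting level, the grid is much coarser than the 250,000 combinations of the paper, and the 'observations' are synthetic. All function names and parameter ranges are hypothetical.

```python
import numpy as np


def bass_path(m, p, q, n0, steps):
    """Discrete Bass recursion, returned as an array [N_0, ..., N_steps]."""
    path = [float(n0)]
    for _ in range(steps):
        prev = path[-1]
        path.append(prev + (p + q * prev / m) * (m - prev))
    return np.array(path)


def r_squared(observed, predicted):
    """Coefficient of determination (assumed here as the goodness of fit)."""
    ss_res = np.sum((observed - predicted) ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def plausible_scenarios(observed, m_values, p_values, q_values, threshold=0.99):
    """Evaluate every (m, p, q) combination and keep those above the threshold."""
    steps = len(observed) - 1
    kept = []
    for m in m_values:
        for p in p_values:
            for q in q_values:
                fit = r_squared(observed, bass_path(m, p, q, observed[0], steps))
                if fit >= threshold:
                    kept.append((m, p, q, fit))
    return kept


# Hypothetical, coarse grid: 10 end states x 25 p values x 20 q values.
m_values = np.linspace(1000, 10000, 10)
p_values = np.linspace(0.001, 0.05, 25)
q_values = np.linspace(0.05, 1.0, 20)

# Synthetic 13-point series generated from one grid point (not real ICU data).
observed = bass_path(m_values[2], p_values[5], q_values[6], 50.0, 12)
kept = plausible_scenarios(observed, m_values, p_values, q_values)
```

The list `kept` plays the role of the subset of more likely scenarios. Inspecting which end states m appear in it gives a first impression of how many different saturation levels remain compatible with a short series.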
This subset of diffusion paths is considered to contain the more likely scenarios, as they perform best at explaining what we currently observe in terms of diffusion; they will be analyzed in terms of growth dynamics and end states.

In the case of the COVID-19 pandemic, it was unclear for regions and countries which end states would materialize with respect to the total number of infected persons and the number of deceased to expect, as well as the implications in terms of capacity and occupation of hospital beds and ICU. In this context, decision makers would benefit from forecasts which also consider capacity requirements at the highest burden. At the same time, precise estimates of relevant parameters are to a large extent absent: the available numbers for a wide range of countries suggest high spreads (both in terms of contamination and potential end states). This complicates the quest for accurate predictions, unless one simultaneously considers the presence of multiple end states and different growth dynamics (reflected in the three parameters to gauge). Stated otherwise, incorporating the multi-finality logic replaces the quest for the most accurate estimation by a systematic assessment of plausible trajectories (paths) and their implied peaks based on limited time series. By applying this set of heuristics, the COVID-19 diffusion patterns for Belgium have been modeled to analyze whether they allow one to arrive at a better-informed decision space, both in terms of the short-term path that is being 'walked' (daily/weekly) and in terms of the peaks to expect (end states). Models have been developed both for the number of deceased and for ICU occupation rates in several countries; however, we focus in this contribution on the ICU occupation in Belgium. We argue and demonstrate how blending the initial forecasting models with scenario-oriented thinking-i.e.
forecasting 'paths' leading to different 'peaks'-could yield novel insights in terms of decision making under highly uncertain circumstances.

In Belgium, the first confirmed COVID-19 case was reported on February 3, 2020, related to a person repatriated from Wuhan, while the first COVID-19 death was registered on March 10. In 'steady state', the Belgian hospitals have around 1900 ICU beds at their disposal. Policy makers decided to increase ICU capacity and allocated approximately 2300 ICU beds exclusively to COVID-19 patients. The question that then becomes crucial is: will this be enough for the coming weeks and months? In order to obtain insights on the likelihood of potentially unfolding scenarios, it becomes crucial to monitor the ICU occupation and delineate policy options for adjusting the ICU capacity in time. Table 1 depicts the number of patients in ICU (related to COVID-19) in Belgium, between March 12 and March 24.

A multi-dimensional search space was composed, consisting of scenarios whereby the ICU capacity was allowed to range from 1000 to 10,000 beds. Combined with variations in p (250) and q (100), we calculate 250,000 possible curves, based on the observations until March 24 (i.e. a time series of 13 data points). Next, we retrieve all parameter combinations that pass the 99% threshold (n = 806) and label them as the more likely scenarios. Figure 1a, b combine the initial observations (i.e., the green dotted line) with the stylized more likely scenarios (averaged by end state), for the ICU occupation in both the long term (Fig. 1a) and the short term (i.e. 1 week ahead) (Fig. 1b). The latter also includes dashed grey lines representing the curves composed of the minimum and the maximum of the daily values of this subset of 806 models. Based on the subset of more likely scenarios, we could compose a dashboard informing decision makers about the corresponding short-term evolutions, related to all plausible end states.
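A dashboard of this kind boils down to a few aggregations over the retained pathways: the average forecast per end state, the daily minimum and maximum across all retained pathways, and overall daily averages. The sketch below illustrates these aggregations with NumPy; the function name `dashboard`, the dictionary keys and the numbers are hypothetical placeholders, not values from Tables 1 or 2.

```python
import numpy as np


def dashboard(end_states, forecasts):
    """Summarise retained scenarios.

    end_states: sequence of length n_scenarios (the m of each scenario).
    forecasts:  array-like of shape (n_scenarios, n_days) with daily forecasts.
    """
    end_states = np.asarray(end_states)
    forecasts = np.asarray(forecasts, dtype=float)

    # Average daily forecast of all scenarios sharing the same end state.
    per_end_state = {
        float(m): forecasts[end_states == m].mean(axis=0)
        for m in np.unique(end_states)
    }
    return {
        "per_end_state": per_end_state,
        "daily_min": forecasts.min(axis=0),        # lower envelope
        "daily_max": forecasts.max(axis=0),        # upper envelope
        "daily_mean_all": forecasts.mean(axis=0),  # average over all pathways
        # Average of the per-end-state averages (each end state weighs equally).
        "daily_mean_of_end_states": np.mean(list(per_end_state.values()), axis=0),
    }


# Made-up example: three retained scenarios, two forecast days.
end_states = [1000, 1000, 2000]
forecasts = [[100, 120], [110, 140], [150, 200]]
board = dashboard(end_states, forecasts)
```

The two overall averages differ whenever some end states are represented by more retained pathways than others, which is why it can be useful to report both.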
Table 2 depicts the average forecasts of ICU occupation for the different end states between March 25 and March 31, the overall daily minimum and maximum forecast, the overall daily average forecast of the 806 pathways, and the average daily forecast of the 10 averaged end states. One could conclude that, based on the observations until March 24, worst-case scenarios (implying end states requiring 5000 ICU beds and beyond) could not yet be excluded.

As days went by, data for the 15 days between March 25 and April 8 became available, as reported in Table 3 and visualized as red dots in Fig. 2a, b for the total ICU occupation and in Fig. 2c for the daily net increase. Based on this unfolding path, it becomes feasible to qualify the likelihood of different pathways and their corresponding end states. During the first 3 days after modelling the different end states and their implied paths, i.e. between March 25 and March 27, the reported actuals were higher than the average forecasts of the 10,000 ICU occupation pathway, while still below the overall maximum. From March 28 onwards, the observed patterns started to move in the direction of scenarios suggesting that the required levels would be situated between 1000 and 3000 occupied ICU beds. Worst-case scenarios, pertaining to an ICU capacity of 5000 and beyond, became less and less likely. Based on these insights, we reported to the Belgian government an expected maximum ICU occupation of between 2000 and 3000 beds, needed by the third week of April (see Fig. 2a, b). In addition, the peak of the net increase, for both end states, would be reached by the beginning of April (see Fig. 2c). Consecutive model updates and refinements allowed us to predict the paths and the peaks even more precisely.
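The link between the peak of the daily net increase and the end state can be made explicit for a single scenario. The short derivation below is a sketch that assumes the standard discrete Bass recursion with the symbols N_t, m, p and q used in this paper, treats the occupation level as a continuous variable, and assumes q > p; it is offered as an aid to intuition rather than as part of the original analysis.

```latex
% Daily net increase implied by the discrete Bass recursion
\Delta N_t \;=\; N_t - N_{t-1}
          \;=\; \Bigl(p + q\,\frac{N_{t-1}}{m}\Bigr)\,(m - N_{t-1})

% Write the right-hand side as g(N) with N = N_{t-1}:
g(N) \;=\; p\,m + (q - p)\,N - \frac{q}{m}\,N^{2},
\qquad
g'(N) \;=\; (q - p) - \frac{2q}{m}\,N

% Setting g'(N^{*}) = 0 gives the level at which the net increase peaks
N^{*} \;=\; \frac{m\,(q - p)}{2q} \qquad (q > p)
```

Under these assumptions the net increase peaks when the occupation is somewhat below half of the end state m, and the smaller p is relative to q, the closer N* lies to m/2. Observing that the daily net increase has passed its peak is therefore informative about which end states remain plausible.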
Based on the updates, we reported from March 30 onwards that an expected maximum ICU occupation of between 1000 and 2000 beds would be required by the second week of April, with a corresponding peak of the net increase, for both end states, reached by the end of March. From April 3 onwards, we could narrow this prediction down to a maximum capacity range of between 1500 and 2000 beds. In addition, we confirmed to the Belgian authorities that the peak of the daily net increase had been reached between March 25 and March 27, and that in the coming days the growth rate of the ICU occupation would decline substantially. The plateau of ICU occupation in Belgium was reached by April 4, fluctuating since then around 1260 beds. On April 6, we reported that the peak in ICU occupation was reached exactly 1 week after the peak of the daily net increase. The growth peak was reached exactly 2 weeks after the introduction of the mitigation measures by the Belgian government.

The heuristics outlined above combine a broad range of plausible scenarios (implying different end states and their resulting peaks) with an assessment of more likely pathways. Not only do the resulting models provide accurate predictions in the short term; when additional observations become available, they also point out plausible end states. By initially allowing a wide range of futures (scenarios) to unfold, decision makers have at their disposal a decision space which includes different (and thus also worst-case) scenarios. When new observations become available, pathways become visible that point at specific end states in the medium term. As such, the creation of such a decision space requires temporal distance: more plausible end states become visible only to the extent that one refrains from immediately incorporating more recent data.
At the same time, updating and re-calibrating the pathways seems to offer potential to start qualifying Knightian uncertainty (Knight 1921), which is a line of future research worth pursuing. More specifically, the frequency distribution of the end states retained in the subset of more plausible scenarios (as reported in "Appendix") might inform a quantitative assessment of the implied uncertainty. As such, our contribution implies a plea for combining forecasting algorithms with a scenario-oriented lens and vice versa. Dynamically blending both approaches has the potential to inform policy makers in situations of urgent decision needs conditioned by profound uncertainty. At such critical moments during an unfolding pandemic, large amounts of data cannot inform decision-making, as those data are largely absent.

In addition, the number of parameters to be estimated in epidemiological modeling increases with the complexity of the model: the more categories are added, the more data points are needed for confident estimates. To estimate the peak in ICU occupation, even more parameters are needed, such as daily influx and daily outflux. Diffusion modeling allows one to gauge the peak without any underlying assumption, except that the overall shape of the curve is sigmoidal. In the absence of containment or mitigation measures, this is a valid assumption, given that R0 is not changing. At the beginning of the epidemic, R0 is the basic reproductive number, indicating the number of secondary infections caused by each infected person if no measures are taken. When mitigation measures are taken, R changes (Rt being the reproductive number under the set of mitigation measures in place at that time). The goal of any measure is to reduce the reproductive number from R0 to a value Rt that is below 1. Our heuristics will lay out likely paths towards potential end states based on initial observations coinciding with R0.
As long as Rt remains unchanged or diminishes (below one), the heuristics allow one to forecast a peak (which, in the case of a diminishing Rt, reflects 'worst-case' predictions). If, on the other hand, Rt is not influenced in a consistent (downward) way (below one), the sigmoid assumption no longer holds, and the diffusion of the pandemic will start to display multiple peaks, which can no longer be grasped by the underlying model specification, as it relies on only three parameters (and hence two inflection points).

The dynamic use of forecasting techniques, combined with multi-finality and end-state-based foresight scenarios and coupled to a temporal distancing axiom and the subsequent loss function calculations, brings a novel insight to outcome prediction modeling that is deemed directly relevant to policy makers in times of pandemic. Rather than just extrapolating from evidence, our approach signals the judicious governance of evidence, as those model building blocks merge into one coherent set of pathways towards increasingly likely outcomes. Policy makers can decide and act in a better-informed way on the basis of this set of pathways, in line with Justin Parkhurst's plea (2016) for the good governance of evidence. As the pathways unfold, policy makers know that they incorporate the effects of the portfolio of mitigating measures they have taken. As such, our heuristics support simultaneous policy making, policy experimenting and policy learning. A coherent and consistent approach to this triad of policy making, experimenting and learning is deemed indispensable to policy development in times of deep crises, as the COVID-19 crisis illustrates. Such a conclusion warrants further research into the development and management of policy design for deep societal crises. Our modeling approach offers a first endeavor in that direction.
A new product growth for model consumer durables
Why the Bass model fits without decision variables
Sociology and modern systems theory
A critical review of marketing research on diffusion of new products
Scenario-driven forecasting: Lessons learned from modeling the COVID-19 pandemic
Bass re-visited: Quantifying multi-finality
European Centre for Disease Prevention and Control (ECDC)
The year 2000: A framework for speculation on the next thirty-three years
Contributions to the mathematical theory of epidemics-I
Risk, uncertainty and profit
A study of some assumptions underlying innovation diffusion functions
New product diffusion models in marketing: A review and directions for research
The choice of Bass model coefficients to forecast diffusion for innovative products: An empirical investigation for new automotive technologies
Advantages and disadvantages of scenario approaches for strategic foresight
Aggregate diffusion forecasting models in marketing: A critical review
The politics of evidence: From evidence-based policy to the good governance of evidence
Diffusion of innovations
The art of the long view: Planning for the future in an uncertain world
Bias and systematic change in the parameter estimates of macrolevel diffusion models
Scenarios. The art of strategic conversation
Notice sur la loi que la population suit dans son accroissement
Scenarios: Uncharted waters ahead
Scenarios: Shooting the rapids
statement-on-the-second-meeting-of-the-international-health-regulations-(2005)-emergency-committee-regarding-the-outbreak-of-novel
World Health Organization. WHO Director-General's opening remarks at the media briefing on COVID-19-11
Nowcasting and forecasting the potential domestic and international spread of the 2019-nCoV outbreak originating in Wuhan, China: A modelling study

Acknowledgements This paper is an extended version of a previous research note published in the ISSI Newsletter (Decock et al. 2020a).
This study benefited from useful input and reflections by Jorge Ricardo Blanco Nova (KU Leuven), Michela Bergamini (KU Leuven), Sien Luyten (Flanders Business School) and Xiaoyan Song (KU Leuven). We want to express our gratitude to the Rega Institute and the Institute for the Future for providing a context to validate our models, and to EURO POOL GROUP for funding part of the research reported here.

Including scenarios below the 99% threshold allows more pathways to be explored. However, this approach yields a similar frequency distribution for the different end states, as summarized in Table 4.