key: cord-0115451-s0sl659c
authors: Chopra, Ayush; Gel, Esma; Subramanian, Jayakumar; Krishnamurthy, Balaji; Romero-Brufau, Santiago; Pasupathy, Kalyan S.; Kingsley, Thomas C.; Raskar, Ramesh
title: DeepABM: Scalable, efficient and differentiable agent-based simulations via graph neural networks
date: 2021-10-09
journal: nan
DOI: nan
sha: 21546c480d75f05db7b2eff09bbb616a741d1034
doc_id: 115451
cord_uid: s0sl659c

We introduce DeepABM, a framework for agent-based modeling that leverages the geometric message passing of graph neural networks to simulate actions and interactions over large agent populations. DeepABM allows simulations to scale to large agent populations in real-time and to run efficiently on GPU architectures. To demonstrate the effectiveness of DeepABM, we build the DeepABM-COVID simulator, which supports various non-pharmaceutical interventions (quarantine, exposure notification, vaccination, testing) for the COVID-19 pandemic and can scale to populations of representative size in real-time on a GPU. Specifically, DeepABM-COVID can model 200 million interactions (over 100,000 agents across 180 time-steps) in 90 seconds, and is made available online to help researchers with modeling and analysis of various interventions. We explain the components of the framework and discuss results from one research study, conducted in collaboration with clinical and public health experts, that evaluates the impact of delaying the second dose of the COVID-19 vaccine. While we simulate COVID-19 spread, the ideas introduced in the paper are generic and can be easily extended to other forms of agent-based simulations. Furthermore, while beyond the scope of this document, DeepABM enables inverse agent-based simulations, which can be used to learn physical parameters in the (micro) simulations using gradient-based optimization with large-scale real-world (macro) data. We are optimistic that the current work can have interesting implications for bringing the ABM and AI communities closer.

The coronavirus has had a significant impact on global society and the economy. Successfully navigating this phase involves answering a series of "what-if" questions that can have long-lasting implications for our future. These questions may include: what if we delay the second dose of the vaccine, what if we reopen schools, what if we impose a lockdown, etc. Agent-based modeling (ABM) has emerged as a central tool that can help policy makers ground their decisions when tackling these challenging questions. ABMs are descriptive simulation models that enable one to (i) study the actions and interactions of large heterogeneous populations, and (ii) analyze the emergent effects of behavioral and clinical interventions on various public health outcomes such as cumulative mortality, infections and hospitalizations.

While ABMs are powerful descriptive tools, emergent behavior can be highly sensitive to the scale of the input population and the calibration of the input parameters. Conventional ABM frameworks such as Mesa and NetLogo (Masad and Kazil 2015; Wilensky 2021) follow an object-oriented design that is centered around agent definitions and actions, which are characterized as objects. While conceptually appealing, these frameworks often scale inefficiently to the large agent populations needed to represent behavior in real-world contexts. In this work, we introduce DeepABM, a novel framework for agent-based modeling that takes a network-centric approach revolving around the interaction networks of the agents in the simulation. DeepABM builds upon concepts of tensor calculus and graph neural networks in deep learning to deliver scale and efficiency to these descriptive simulations.

In DeepABM, individual agents (and their states) are modeled as tensors, while their interactions are represented as permutation-invariant message-passing operations in graph neural networks. Leveraging recent advances in graph deep learning, DeepABM can seamlessly scale to large populations (of more than 100,000 agents) in real-time and execute efficiently on graphics processing units (GPUs). Furthermore, while not used to obtain the results discussed in this paper, DeepABM can also concurrently calibrate a large number of input parameters using supervised gradient-based optimization instead of (restrictive) randomized search methods.

DeepABM has been adapted during the COVID-19 pandemic to efficiently simulate transmission dynamics and study the impact of various non-pharmaceutical interventions (NPIs) on various public health outcomes. We will, henceforth, refer to this model as DeepABM-COVID. The agent distribution and agent interactions in DeepABM-COVID are parameterized using real-world census data. Parameters related to disease transmission and progression are calibrated using a plethora of research studies that analyze real-life clinical data. DeepABM-COVID models agent interaction over multiple networks (household, occupation and random) and can concurrently support several diverse interventions (quarantine, exposure notification, testing, vaccination). DeepABM-COVID can model 200 million interactions over 100,000 agents across 180 time-steps in about 90 seconds on a GPU. To contextualize the extent of this gain, we implemented the same simulation in Mesa, where it takes around five hours to execute.

DeepABM-COVID has already been used in a research study that analyzes the impact of delaying the second dose of the two-dose COVID-19 vaccines in favor of administering the vaccine to a broader segment of the population. This research has recently been published in the British Medical Journal and has received attention from the various research communities considering effective vaccine allocation strategies. The simulation study, along with the parameters and behavior governing vaccine efficacy, vaccine-induced immunity, etc., has been designed in collaboration with leading clinical and public health experts from the Mayo Clinic and Arizona State University. Other members of the team include computational scientists from MIT as well as biostatisticians from Mayo Clinic and Harvard University. We outline some of the relevant results from that study in Section 5 below, and refer the reader to Romero-Brufau et al. (2021).

The following sections provide more detailed information about the framework. The paper is structured as follows: Section 2 presents preliminaries for the DeepABM framework, Section 3 discusses the DeepABM-COVID framework, and Section 4 highlights the various interventions that are supported. Section 5 presents a detailed case study on the effect of delaying the second dose, and Section 6 concludes.

Chopra, Gel, Subramanian, Krishnamurthy, Romero-Brufau, Pasupathy, Kingsley, and Raskar

DeepABM takes a tensor-calculus-based approach to simulation. In this approach, agents are not modeled as objects; instead, agent states are modeled as tensors, and tensor algebra is used to model state transitions. Vectorized implementations of ABMs have been explored in the literature, primarily for mean-field or similar approximation-based models and also for some agent-based models. However, to the best of our knowledge, this is the first time a network-based model that captures individual interactions has been modeled using tensor calculus. Essentially, we use the message passing abstraction of graph convolutional neural networks to model the inter-agent interaction and associated infection dynamics (Kipf and Welling 2016). The message passing network (Zhong, Li, and Pang 2020) provides infrastructure to collect messages (effects of interactions) from all neighbours of a node, for all nodes in a network. This design has been central to significant progress in geometric deep learning (Kipf and Welling 2017; LeCun et al. 2015; Veličković et al. 2017). Hence, deep learning frameworks such as PyTorch (Paszke et al. 2019) provide optimized differentiable implementations of it with support for GPU execution. We leverage this graph deep learning infrastructure to design interaction networks and model inter-agent interactions over them (such as disease spread) in agent-based models. The disease dynamics are modeled using standard parametrized tensor operations.
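The message-passing step described above can be illustrated with a minimal sketch. It uses NumPy rather than the GPU-backed deep learning framework the paper relies on, and the function name `aggregate_messages`, the edge-list layout, and the toy graph are assumptions made for illustration only, not the DeepABM API.

```python
import numpy as np

def aggregate_messages(edge_index, node_msg, num_nodes):
    """Sum, at every receiving node, the messages sent along its incoming edges."""
    src, dst = edge_index
    agg = np.zeros(num_nodes, dtype=float)
    # Scatter-add: each directed edge (src -> dst) adds the sender's message at dst.
    np.add.at(agg, dst, node_msg[src])
    return agg

# Hypothetical 4-agent contact chain 0-1-2-3, stored as directed edges in both directions.
edge_index = np.array([[0, 1, 1, 2, 2, 3],
                       [1, 0, 2, 1, 3, 2]])
infectious = np.array([1.0, 0.0, 0.0, 1.0])  # per-agent "message" (illustrative values)
exposure = aggregate_messages(edge_index, infectious, num_nodes=4)
```

Because the aggregation is a sum, reordering the edges leaves the result unchanged, which is the permutation invariance the text refers to. In a tensor framework the same scatter-add runs as a single batched operation over all edges.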

The underlying mathematical framework for DeepABM can be considered to be a partially observable semi-Markov game (POSMG), which is a slight modification of the definition given in Littman (1994). Our ABM is modeled as a game because each agent receives different information (partial observations) at each step and chooses its response in a decentralized manner based on its private information and its own objective. We further consider this a semi-Markov game because the decision epochs or state sojourn times are sampled from a distribution that depends on the current state and next state of the agent, unlike the fixed sojourn time in Markov games. Furthermore, the action spaces of the agents are state-dependent.

To design DeepABM-COVID, we adopted the basic disease infection and progression parameters used in Abueg et al. (2020) and implemented a number of non-trivial extensions. The following sections provide information on the various components of DeepABM-COVID for the sake of completeness. Further information on the choice of parameters and other characterizations of realistic behavior can be found in Romero-Brufau et al. (2021).

Agent State Definition: At any given time step, the state of the agent is composed of static and dynamic components. The dynamic components include agent attributes that change over time when an agent interacts with others. The static components are attributes of the agent that do not change over the course of the simulation. However, these static components influence an agent's interaction with its neighborhood and the subsequent evolution of the dynamic attributes. This agent state is represented as a one-dimensional tensor obtained by concatenating the static and dynamic components.

For the agent representation in DeepABM-COVID, the static component includes the following attributes: (i) age, (ii) household, (iii) occupation, (iv) (random) number of daily interactions. Each of these attributes is a categorical variable and is initialised for each agent using real-world census data (for age, household, and occupation) and mobility data (to generate the number of interactions) for King County in the State of Washington. In particular, age corresponds to an identifier from 1 to 8 (non-overlapping) age-groups and occupation corresponds to an identifier from 1 to 23 possible defined occupations.

The dynamic component of each agent's state includes the following attributes: (i) (current) disease stage, (ii) quarantine status, and (iii) vaccination status. At any step, the disease stage of each agent can be one of the eleven defined values: susceptible, asymptomatic, presymptomatic mild, presymptomatic severe, mild symptomatic, severe symptomatic, hospitalized, critical in ICU, recovered, vaccinated or dead. For vaccination, we assume that the available vaccines follow a two-dose regimen, and the vaccination status can take one of three distinct values: pre-vaccination (first dose eligible), partially vaccinated (first dose completed), and fully vaccinated (second dose completed). The quarantine status for an agent is a binary variable: true or false. These dynamic components change over time and drive various events (such as transmission instances) in the simulations. To initialize the agent population, we use the census and mobility statistics for King County, WA as in Hinch et al. (2020).
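As a hedged illustration of the state layout described above, the sketch below stores each agent as one row that concatenates the four static and three dynamic attributes. The column names, integer codes, and the toy update rule are assumptions chosen for the example; the simulator's actual encoding is not specified in the text.

```python
import numpy as np

STATIC_FIELDS = ("age_group", "household", "occupation", "daily_interactions")
DYNAMIC_FIELDS = ("disease_stage", "quarantined", "vaccination_status")
FIELDS = STATIC_FIELDS + DYNAMIC_FIELDS
COL = {name: i for i, name in enumerate(FIELDS)}  # column index of each attribute

def make_population(static, dynamic):
    """Concatenate static and dynamic attributes into one (num_agents, 7) array."""
    return np.concatenate([static, dynamic], axis=1)

# Three hypothetical agents; every value is an illustrative integer code.
static = np.array([[3, 0, 5, 4],
                   [7, 0, 22, 2],
                   [1, 1, 11, 6]])
# Code 0 stands for susceptible, not quarantined, pre-vaccination in this sketch.
dynamic = np.zeros((3, len(DYNAMIC_FIELDS)), dtype=static.dtype)
state = make_population(static, dynamic)

# A state transition is a vectorized write to a dynamic column for all matching
# agents at once (toy rule: quarantine every agent in age group 7 or above).
state[state[:, COL["age_group"]] >= 7, COL["quarantined"]] = 1
```

Only the dynamic columns are ever written after initialization; the static columns are read to parameterize interactions.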

Interaction Networks: At any step of the simulation, each agent interacts with neighboring agents concurrently across three independent networks. In the DeepABM-COVID simulator, these networks represent (i) a household network, which defines cross-age interactions among family members, (ii) an occupation network, which defines interactions arising from individuals' employment at workplaces in different industries, and (iii) a random interaction network, which defines other random interactions agents may have over the course of a day.

There are multiple household networks in the simulation, one for every household, each of which is modeled as a fully-connected network. Each agent is assigned to a single household network at initialization and this assignment remains fixed throughout the simulation. We assume that each (non-quarantined) agent is connected to and will interact with every other agent in its household at every step.

There are 23 occupation networks (as in Hinch et al. (2020)) defined in the current version of DeepABM-COVID. Every agent is assigned to a single occupation network once at initialization, and this assignment does not change throughout the simulation. Each occupation network is modeled as a small-world Watts-Strogatz network (Watts and Strogatz 1998), which is initialized once but re-parameterized at every step. This means that the specific agents in an occupation network (i.e., those that share the same occupation) do not change over time, but each agent may interact with a different subset of agents on any given day. The re-parameterization of each occupation network is done independently in accordance with the value of the parameter that defines the mean number of daily interactions for that occupation.

There is a single global random network in the simulation. This is modeled as a small-world Watts-Strogatz network that is re-initialized at every time step. The network is used to simulate infections that result from random interactions that agents may have with unknown individuals. These interactions, for instance, may occur during visits to the grocery store, the doctor's office, the train station, etc. We observe that these random interactions play a significant role in viral spread in the simulation, consistent with real-world estimates. At any given time step, we use each agent's (age-stratified) number of random interactions to parameterize this global random network. We define the networks using the interaction parameters given by Hinch et al. (2020).
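For readers unfamiliar with the Watts-Strogatz construction used for the occupation and random networks, a minimal pure-Python sketch follows. It implements the textbook procedure (ring lattice, then random rewiring); the function name and the parameters `k` (neighbours per node) and `beta` (rewiring probability) are illustrative assumptions and are not taken from the DeepABM-COVID code.

```python
import random

def watts_strogatz_edges(n, k, beta, rng):
    """Return the undirected edges of a Watts-Strogatz graph on n nodes.

    Each node starts connected to its k nearest ring neighbours (k even);
    each lattice edge is then rewired with probability beta.
    """
    if k % 2 or not 0 < k < n:
        raise ValueError("k must be even and satisfy 0 < k < n")
    neighbours = {i: set() for i in range(n)}
    # Step 1: regular ring lattice.
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            neighbours[i].add(j)
            neighbours[j].add(i)
    # Step 2: rewire the far end of each lattice edge with probability beta,
    # avoiding self-loops and duplicate edges.
    for step in range(1, k // 2 + 1):
        for i in range(n):
            j = (i + step) % n
            if j in neighbours[i] and rng.random() < beta:
                candidates = [c for c in range(n)
                              if c != i and c not in neighbours[i]]
                if candidates:
                    new_j = rng.choice(candidates)
                    neighbours[i].discard(j)
                    neighbours[j].discard(i)
                    neighbours[i].add(new_j)
                    neighbours[new_j].add(i)
    return sorted((i, j) for i in neighbours for j in neighbours[i] if i < j)

# Re-initializing the network at every step amounts to drawing a fresh graph per day.
rng = random.Random(0)
daily_edges = [watts_strogatz_edges(20, 4, 0.1, rng) for _ in range(3)]
```

Rewiring replaces one edge with another, so the total number of edges (n * k / 2) is preserved while the set of contacts changes from day to day.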

We next explain the transition dynamics that govern the agent state transitions over simulation steps in DeepABM-COVID. This involves studying the infection transmission that regulates per-step inter-agent interactions and the consequent inter-step per-agent disease progression.

Infection Transmission: Infection is spread through interactions between infected and susceptible individuals. The rate of transmission primarily depends upon: (i) infectiousness of the pathogen, (ii) age-dependent susceptibility of the infectee to transmission, and (iii) type of interaction (i.e., in which network it occurred). The infectiousness varies over time, starting at zero when the agent is infected, peaking at an intermediate time, and eventually tending to zero. Duration of infectiousness is modeled with a gamma distribution and we use the values referenced by Abueg et al. (2020). In particular, we stratify age into nine age-groups: 0-10, 11-20, 21-30, 31-40, 41-50, 51-60, 61-70, 71-80, and 80+, and follow the susceptibility parameters used in Abueg et al. (2020). The type of interaction is defined by the specific network in which it originates. To account for the duration of interaction, which is not directly observable, we use a multiplicative scalar to amplify the effect of interaction in household networks. Finally, viral transmission in each interaction is represented by the following equations:

where t denotes the amount of time since infection; s_i indicates the infector's symptom status (asymptomatic, mild, moderate/severe); a_s is the age of the susceptible; n is the type of network where the interaction occurred; I is the mean number of daily interactions; f_Γ(u; µ_i, σ_i^2) is the probability density function of a gamma distribution; µ_i and σ_i are the mean and width of the infectiousness curve; R scales the overall infection rate (under some simplifying assumptions it is the mean number of people infected by each moderately/severely symptomatic individual); S_{a_s} is the scale-factor for the age of the susceptible; A_{s_i} is the scale-factor for the infector being asymptomatic; B_n is the scale-factor for the network on which the interaction occurred.
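The symbols defined above fit the following form of the per-interaction transmission rate and probability. This is a reconstruction, written under the assumption that DeepABM-COVID retains the transmission model of Hinch et al. (2020), whose parameters it adopts; it should be checked against the published version of the paper.

```latex
\lambda(t, s_i, a_s, n) = \frac{R \, S_{a_s} \, A_{s_i} \, B_n}{I}
    \int_{t-1}^{t} f_\Gamma(u; \mu_i, \sigma_i^2) \, du,
\qquad
P(t, s_i, a_s, n) = 1 - e^{-\lambda(t, s_i, a_s, n)}
```

Under this reading, the integral gives the share of total infectiousness that falls on day t since infection, the scale-factors adjust it for the susceptible's age, the infector's symptom status, and the network, and the exponential maps the resulting rate to a probability of transmission in a single interaction.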

Disease Progression:

When an agent is infected, it enters a hierarchy of disease progression, as summarized in Romero-Brufau et al. (2021). This corresponds to the evolution of the agent's disease stage (which is part of the dynamic component of the agent state) during the course of the simulation. The progression time delay and probability between two disease stages depend primarily on the age of the agent. These parameters are obtained from established clinical literature on COVID-19 disease progression. Essentially, the duration of each disease stage of an agent is randomly generated using continuous time distributions with disease-stage- and age-dependent parameters.
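The progression mechanism above (a random next stage plus a random sojourn time) can be sketched as follows. The transition table, its probabilities and durations, and the choice of a gamma distribution are all placeholders assumed for illustration; the text only states that continuous time distributions with stage- and age-dependent parameters are used, and the age dependence is omitted here for brevity.

```python
import random

# Hypothetical table: stage -> [(next_stage, probability, mean_days, sd_days)].
# All numbers are placeholders, not the calibrated values used in the paper.
TRANSITIONS = {
    "presymptomatic_mild": [("mild_symptomatic", 1.0, 2.0, 1.0)],
    "mild_symptomatic": [("recovered", 1.0, 12.0, 5.0)],
    "severe_symptomatic": [("hospitalized", 0.4, 5.0, 2.5),
                           ("recovered", 0.6, 12.0, 5.0)],
}

def sample_transition(stage, rng):
    """Draw the next stage and the sojourn time (in days) spent in `stage`."""
    options = TRANSITIONS[stage]
    weights = [prob for _, prob, _, _ in options]
    next_stage, _, mean, sd = rng.choices(options, weights=weights, k=1)[0]
    # Gamma distribution with the requested mean and standard deviation:
    # shape = (mean / sd)^2 and scale = sd^2 / mean.
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return next_stage, rng.gammavariate(shape, scale)

rng = random.Random(0)
next_stage, sojourn_days = sample_transition("severe_symptomatic", rng)
```

Making the parameters age-dependent would only require keying the table by (stage, age group) instead of stage alone.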

The DeepABM-COVID simulator currently supports the following interventions: isolation and self-quarantine, digital exposure notification (DEN), testing, and vaccination.

Upon experiencing symptoms, agents undertake a diagnostic test. If the test returns positive, the agent self-quarantines for 14 time steps (days). There is a daily dropout probability from quarantine to model real-world non-compliance. When an individual starts quarantine, the quarantine status attribute in the agent state is set to true (=1). Once an agent leaves quarantine, after 14 days or due to non-compliance, the quarantine status is reset to false. The status cannot be updated again until a new test is administered (following re-emergence of symptoms) that returns positive and initializes a new quarantine instance. Agent quarantine status influences the infection transmission from interactions on the given day. While an agent is in quarantine, we scale the infectiousness of the agent to 0 when modeling interactions.
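The quarantine rule above can be sketched for a single agent as follows. The 14-day duration and the zeroed infectiousness come from the text; the function names and the representation of quarantine as a days-remaining counter are assumptions made for this example.

```python
import random

QUARANTINE_DAYS = 14  # from the text

def step_quarantine(days_left, dropout_prob, rng):
    """Advance one agent's quarantine by one day.

    `days_left` is the remaining quarantine length (0 means not quarantined).
    Returns the updated value; reaching 0 resets the quarantine status to false.
    """
    if days_left == 0:
        return 0
    if rng.random() < dropout_prob:  # non-compliance: the agent breaks quarantine
        return 0
    return days_left - 1

def effective_infectiousness(infectiousness, days_left):
    """Quarantined agents contribute zero infectiousness to interactions."""
    return 0.0 if days_left > 0 else infectiousness

# A positive test starts a new quarantine instance for the agent.
rng = random.Random(0)
days_left = QUARANTINE_DAYS
```

With a dropout probability of zero the counter reaches 0 after exactly 14 calls; with a positive dropout probability the quarantine can end earlier on any day.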

When an agent tests positive, the agent starts to self-quarantine for a period of 14 days. If DEN is enabled, a test notification is sent to the other (non-quarantined) agents that this positive agent interacted with over the past 7 steps (days). We note that this notification is only sent to agents that also have access to the exposure notification app (kept as part of the agent state) to mimic real-world constraints on digital contact tracing. Agents are assigned DEN app access randomly at initialization with a probability governed by an app-adoption parameter. A notified agent may undergo a test (based on a compliance probability) on the next step of the simulation and self-quarantines if the test returns positive.
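A minimal sketch of the notification rule described above is given below. The 7-step lookback comes from the text; the data structures, the function name `agents_to_notify`, and the requirement that the positive agent also has the app are assumptions made for illustration.

```python
from collections import deque

LOOKBACK_STEPS = 7  # from the text

def agents_to_notify(contact_log, positive_agent, has_app, quarantined):
    """Return the agents who should receive an exposure notification.

    contact_log: iterable of per-step contact lists [(a, b), ...] covering the
                 most recent steps (at most LOOKBACK_STEPS entries).
    has_app, quarantined: dicts mapping agent id -> bool.
    """
    if not has_app[positive_agent]:
        return set()  # assumption: without the app, no alerts can be triggered
    notified = set()
    for contacts in contact_log:
        for a, b in contacts:
            if positive_agent in (a, b):
                other = b if a == positive_agent else a
                if other != positive_agent and has_app[other] and not quarantined[other]:
                    notified.add(other)
    return notified

# A deque with maxlen keeps only the most recent LOOKBACK_STEPS steps of contacts.
contact_log = deque(maxlen=LOOKBACK_STEPS)
contact_log.append([(0, 1), (2, 3)])
contact_log.append([(0, 2), (0, 4)])
has_app = {0: True, 1: True, 2: False, 3: True, 4: True}
quarantined = {0: False, 1: False, 2: False, 3: False, 4: True}
notified = agents_to_notify(contact_log, 0, has_app, quarantined)
```

In this toy example only agent 1 is notified: agent 2 has no app and agent 4 is already quarantined. Each notified agent would then decide whether to test according to the compliance probability.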

[Figure 1: Cumulative mortality (rate per 100,000) under four configurations: no intervention, self-quarantine, self-quarantine + DEN, and self-quarantine + DEN + POC test.]

DeepABM-COVID simulations support several types of tests: an RT-PCR test (which is the most accurate type of testing available), an antigen test, and a much less accurate, rapid point-of-care test. During a specific simulation run, we assume that only one of the three tests is present in the community, although this assumption can be relaxed. Each test is parameterized by two variables: (i) specificity, and (ii) turnaround time. To model real-world constraints of sample collection, analysis and delivery, we assume the following parameters for each of the three tests: (i) rapid antigen test (specificity = 0.65, time = 2 steps), (ii) RT-PCR test (specificity = 0.95, time = 3 to 5 steps, uniformly sampled), (iii) rapid point-of-care test (specificity = 0.85, time = 1 step). For all simulations, we use the RT-PCR test by default, unless specified otherwise.

DeepABM-COVID simulates the two-dose vaccination regimen used for mRNA vaccines such as those from Pfizer and Moderna. In the simulation, it is possible to make different assumptions on the timing of the doses, the efficacy of each dose, and the prioritization scheme to be followed in the vaccination campaign. Further details on the vaccination modeling capabilities of DeepABM-COVID can be found in Section 5.

We study the effect of simulating with the different interventions presented above and present a few sample results in Figure 1. The results correspond to the mean and standard deviation of 10 independent runs over 120 steps. We compare four configurations: (a) no intervention, (b) self-quarantine, (c) self-quarantine + DEN, and (d) self-quarantine + DEN + rapid point-of-care testing. To ensure consistency with real-world scenarios, we assume a DEN app-adoption of 0.3 and an app compliance probability of 0.8. The point-of-care test has a turnaround time of 1 step (i.e., same day) and we assume a specificity of 0.85 to account for real-world variability in sample collection and analysis. The results show that all interventions help reduce the cumulative incidence of deaths and infections, and that the incremental addition of new interventions further helps in controlling spread. We study the effect of the vaccination intervention with a detailed sensitivity analysis in Section 5.

One of the most significant accomplishments of the COVID-19 pandemic response has been the development, manufacture and deployment of vaccines against the disease. Massive investment and innovative science have yielded multiple effective SARS-CoV-2 vaccines in record time. Two of the most effective vaccines, namely those produced by Pfizer and Moderna, consist of two-dose regimens with the second dose administered 21 or 28 days following the initial dose, while the viral vector vaccine was approved with a second dose 4-12 weeks after the first. As with most clinical trials of new therapeutics, dosing regimens were decided a priori, based on a combination of preliminary data and intuition. Emerging data, however, have suggested that most of the protective benefit conferred by SARS-CoV-2 vaccines may result from a single dose alone. If true, this may have significant implications regarding the optimal allocation of limited vaccines. In particular, multiple public health authorities have speculated that deviating from the standard two-dose regimen in favor of a broader one-dose regimen may save lives and reduce infectious spread.

Despite the increasing pace of vaccinations in the US, the vaccine supply remains constrained relative to the worldwide populations in need. We therefore sought to understand the population health implications of delaying the administration of second-dose vaccines in favor of vaccinating a larger number of people with a first dose. In particular, our goal in this study was to use an agent-based simulation model to compare the effectiveness of the standard dosing strategy to that of a delayed second dose strategy, which prioritizes the vaccination of first-dose eligible individuals. In alignment with age-based prioritization policies, both strategies we considered prioritize by age. Although our agent-based simulation model involves representations of social and employment networks, we did not consider any prioritization with respect to employment sector; i.e., healthcare workers, for example, were not prioritized over other employment networks. In addition to the standard dosing and delayed second dose strategies, we consider a third alternative strategy, which we refer to as delayed second dose except for 65+, to test whether the increased mortality risk among the elderly implies a differential vaccine dosing regimen for this segment of the population.

To demonstrate the three vaccination strategies clearly, we offer the following example of six hypothetical individuals: Adam, a first-dose eligible 78-year-old; Betty, a second-dose eligible 78-year-old; Charlie, a first-dose eligible 68-year-old; David, a second-dose eligible 68-year-old; Eleanor, a first-dose eligible 40-year-old; and Frank, a second-dose eligible 40-year-old. Table 1 below shows the order in which these individuals are prioritized for vaccine administration under each strategy. To analyze the effectiveness of these three strategies, we use the DeepABM-COVID simulator explained above. The modular design of the DeepABM toolkit enabled seamless implementation of various prioritization logic with minimal edits. Further, we also add several extensions to accurately reflect the immunity that first-dose and second-dose vaccines confer on individuals, as we explain below. Further details on the statistical analysis of clinical data that justifies our choices of these parameters are provided in Romero-Brufau et al. (2021).

We assume that a dose of the vaccine provides a certain probability of becoming immune to infections. This probability is dependent on whether the vaccine is administered as a first dose or a second dose. In accordance with trial data, we assume that twelve days after receiving the first dose of the vaccine, agents have a 60%, 70%, 80% or 90% probability of becoming immune (depending on the scenario). After receiving the second dose, agents reach a 95% probability of becoming immune.
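The immunity assumptions above can be expressed as a small sketch. The 12-day delay and the 95% second-dose figure come from the text; the function names are hypothetical, and the "top-up" rule (how a second dose lifts the cumulative probability to 95%) is one possible interpretation rather than the authors' stated mechanism.

```python
import random

FIRST_DOSE_DELAY_DAYS = 12    # from the text
SECOND_DOSE_EFFICACY = 0.95   # from the text

def immunity_probability(doses, days_since_first, first_dose_efficacy):
    """Cumulative probability that a vaccinated agent is immune."""
    if doses >= 2:
        return SECOND_DOSE_EFFICACY
    if doses == 1 and days_since_first >= FIRST_DOSE_DELAY_DAYS:
        return first_dose_efficacy
    return 0.0

def second_dose_top_up(first_dose_efficacy):
    """Chance that an agent NOT immune after dose one becomes immune after dose two,
    chosen so that the cumulative probability equals SECOND_DOSE_EFFICACY
    (an assumed interpretation)."""
    return (SECOND_DOSE_EFFICACY - first_dose_efficacy) / (1.0 - first_dose_efficacy)

# Sample the immunity of one agent, 20 days after a first dose with 80% efficacy.
rng = random.Random(0)
is_immune = rng.random() < immunity_probability(1, 20, first_dose_efficacy=0.8)
```

For a first-dose efficacy of 80%, the top-up is (0.95 - 0.80) / (1 - 0.80) = 0.75: three quarters of the agents left unprotected by the first dose become immune after the second, giving 0.80 + 0.20 * 0.75 = 0.95 overall.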

We consider two versions of the immunity offered by the vaccine. In the first version, we assume that the vaccine provides sterilizing immunity, meaning that agents, even when they are exposed to the virus, do not get infected. Under the other version of the immunity effect we tested, agents still get infected at the same rate as non-vaccinated agents, but they experience asymptomatic infections with the associated probabilities. In this case, they can still transmit the disease with the same probability as a non-vaccinated asymptomatic patient.

Under each vaccination strategy, we run 15 replications of the agent-based simulation for 180 days, and observe the number of infections, hospitalizations and deaths over time for each simulation replication. We initialize the simulations with 10 infected agents and start vaccinations when the number of infected agents reaches 1% of the population, which happens around simulated day 20. For each outcome of interest, we plot the median as well as the 25th and 75th percentiles of those 15 runs under the vaccination strategies, in order to compare their effectiveness in reducing the number of cumulative deaths, cumulative infections, and hospitalizations over time. We evaluate the performance of the three strategies under a number of carefully constructed cases to support decisions by public health officials tasked with designing and deploying vaccination campaigns. More details can be found in Romero-Brufau et al. (2021), but we include below two important observations on the impact of (i) the efficacy of the first dose of vaccine and (ii) the daily vaccine administration rate.
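The summary statistics described above (median and quartiles across replications, per day) can be computed as in the following sketch. The function name and the toy data are illustrative, and the "inclusive" quantile method is an assumption, since the text does not say which percentile definition was used.

```python
import statistics

def summarize_runs(runs):
    """Per-day (25th percentile, median, 75th percentile) across replications.

    `runs` is a list of equal-length time series, one per replication.
    """
    summary = []
    for day_values in zip(*runs):  # all replications' values for one day
        q1, med, q3 = statistics.quantiles(day_values, n=4, method="inclusive")
        summary.append((q1, med, q3))
    return summary

# Five hypothetical replications of a two-day cumulative-deaths series.
runs = [[1, 10], [2, 20], [3, 30], [4, 40], [5, 50]]
bands = summarize_runs(runs)
```

Plotting the median with the quartile band, rather than the mean, makes the comparison between strategies less sensitive to a single outlying replication.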

Impact of Vaccine Efficacy on the Effectiveness of Delayed Second Dose Strategy: To evaluate the effectiveness of delaying the second dose, we first set the daily vaccination rate to 0.3% of the population size (100K in our simulation study) and observe the differences in cumulative deaths under each vaccination strategy. The vaccine administration rate of 0.3% per day is obtained from the reported vaccination numbers in the U.S. (early in the vaccination campaign) and other countries around the world. At the time of writing, US vaccination rates have increased dramatically; consequently, we also analyzed the effectiveness of these strategies under a higher daily vaccination rate, as we describe below. Figure 2 is adapted from Romero-Brufau et al. (2021), and provides the quartiles of observed cumulative deaths in the replications under the standard dosing and delayed second dose strategies for a daily vaccination rate of 0.3%. The results demonstrate the important finding that the comparative effectiveness of the delayed second dose strategy depends strongly on the efficacy of the first dose of the vaccine. In particular, the total cumulative mortality on day 180 is lower for the delayed second dose scenario under the assumption that the first dose effectiveness is higher than 80%, which is typically justified by the data from clinical trials as well as from vaccine deployment programs all over the world for the two-dose Pfizer and Moderna vaccines.

Impact of Vaccine Administration Rate on Effectiveness of Delayed Second Dose Strategy: The results presented above also depend on the speed at which the vaccination campaign can be run. To test this, we adopt a first dose vaccine efficacy of 80% (which is well supported by studies on clinical and trial data) and evaluate the effectiveness of the delayed second dose strategy under daily administration rates of 0.1% (very slow rate), 0.3% (nominal rate) and 1% (relatively fast rate, reflective of more recent US vaccination rates). Figure 3 is again adapted from Romero-Brufau et al. (2021), and demonstrates this effect clearly. Essentially, standard dosing becomes the preferable strategy as the daily administration rate improves. However, given that most countries are lagging behind in vaccination efforts, this study points to the important advantage that delaying second doses offers by providing a broader, albeit less protective, first-dose administration across the population.

The case study clearly demonstrates the impact of a computationally efficient simulation framework like DeepABM-COVID. At each stage of the pandemic, there are interesting public health policy questions that our team of experts plans to explore using the modeling and simulation capabilities offered by DeepABM-COVID. The DeepABM-COVID simulator is also made available online, to help researchers with modeling various interventions for COVID-19 as well as adapting it to study other interesting emergent phenomena in public health and beyond.


In this paper, we introduce DeepABM, a toolkit for agent-based modeling that leverages graph neural network frameworks from deep learning to bring scale and efficiency to agent-based simulations. DeepABM can seamlessly scale to large populations (with more than 100,000 agents) in real-time and also execute efficiently on a GPU. We extend the toolkit to introduce DeepABM-COVID for simulating the spread of COVID-19 with (concurrent) support for several interventions. We use DeepABM-COVID to study the public health impact of delaying the second dose of the mRNA vaccine, present a sample of our results, and offer recommendations on when this strategy could be usefully adopted.

An interesting direction of future research is to leverage DeepABM to introduce inverse agent-based simulations, which can be used to learn physical parameters in the (micro) simulations using gradient-based optimization with large-scale real-world (macro) data. This can be used to calibrate agent-based simulations, as a significant shift from current grid search techniques, as well as to enable them for real-world predictive modeling. Furthermore, DeepABM separates the modeling of agent transition and agent behavior, enabling adaptive agent behavior to be learned by searching over a space of rules instead of using fixed rule-based behavior. We are optimistic that the current work can have interesting implications for bringing the ABM and AI communities closer.

RAMESH RASKAR, Ph.D. is an Associate Professor at MIT Media Lab and directs the Camera Culture research group. His focus is on AI and Imaging for health and sustainability. He is also founder and chairman at PathCheck foundation, a non-profit for COVID-19 response, and has deployed digital contact tracing solutions in several states. He received the Lemelson Award (2016), ACM SIGGRAPH Achievement Award (2017), DARPA Young Faculty Award (2009), Alfred P. Sloan Research Fellowship (2009), TR100 Award from MIT Technology Review (2004) and Global Indus Technovator Award (2003). His email address is raskar@mit.edu.

Proceedings of the 2021 Winter Simulation Conference. S. Kim, B. Feng, K. Smith, S. Masoud, Z. Zheng, C. Szabo, and M. Loper, eds.

As an aid to authors who seek to improve the clarity and readability of their papers in the Proceedings of the Winter Simulation Conference, this paper summarizes some useful guidelines on technical writing, including current references on each topic that is discussed.

Writing a clear, readable exposition of complex technical work is at least as difficult as doing the work in the first place. Given below is an outline of key considerations to bear in mind during all stages of writing a paper that will be reviewed for possible presentation at the Winter Simulation Conference (WSC) as well as publication in the Proceedings of the Winter Simulation Conference. For questions about these guidelines, please send e-mail to jwilson@ncsu.edu or contact the proceedings editors.

Organizing the paper (what to do before beginning to write)

A. Analyze the situation: that is, the problem, the solution, and the target audience.

1. Formulate the objectives of the paper.

2. Specify the scope of the paper's coverage of the subject and the results to be discussed. Orient the paper toward the theme of your session as indicated either by the title of your session or by the instructions of your session chair. Also take into account the general focus of the track containing your paper, which could be tutorials, case studies, vendors, methodologies, domain-specific applications, or general applications.

3. Identify the target audience and determine the background knowledge that you can assume for this particular group of people. Introductory tutorials are generally attended by newcomers who are interested in the basics of simulation. Advanced tutorials are designed to provide more experienced professionals with a thorough discussion of special topics of much current interest; and some special-focus sessions in this track are designed to provide experts with an overview of recent fundamental advances in simulation theory. Methodology sessions are attended by professionals who have at least an undergraduate-level background in computer simulation techniques. In the case studies and applications tracks, session attendees are generally familiar with the area covered by their session. Vendor sessions may contain both new and experienced users of the relevant software products.

4. Formulate the most logical sequence for presenting the information specified in item 2 to the readers identified in item 3. For a discussion of effective aids in organizing your paper (specifically, brainstorming, clustering, issue trees, and outlining), see chapter 3 of Matthews and Matthews (2014). In structuring your presentation, keep the following points in mind.

a. Introductory and advanced tutorials should have an educational perspective. Within the advanced tutorials track, special-focus sessions should synthesize the latest research results in a unified treatment of a given topic.

b. Methodology contributions should provide state-of-the-art information on proven techniques for designing, building, and analyzing simulation models.

c. Application papers should relate directly to the practice of simulation, and they should emphasize lessons of transferable value.

B. Make outlines to organize your thoughts and then to plan both the written and oral presentations of your work. For excellent discussions of the construction and use of various types of outlines, see the following: chapter 1 of Menzel, Jones, and Boyd (1961); the sections titled "Develop an issue tree to assess presentation balance" and "Outline to develop the paper's framework" in chapter 3 of Matthews and Matthews (2014); and chapter 3 of Pearsall and Cook (2010).

1. The introductory paragraph(s)

a. State the precise subject of the paper immediately.

b. State the problem to be solved.

c. Summarize briefly the main results and conclusions.

d. Tell the reader how the paper is organized.

2. The main body of the paper

a. Include enough detail in the main body of the paper so that the reader can understand what you did and how you did it; however, you should avoid lengthy discussions of technical details that are not of general interest to your audience.

b. Include a brief section covering notation, background information, and key assumptions if it is awkward to incorporate these items into the introductory paragraph(s).

c. Include sections on theoretical and experimental methods as required. For an application paper, you should discuss the development of the simulation model, including input data acquisition as well as design, verification, validation, and actual use of the final simulation model. For a methodological or theoretical paper that requires substantial mathematical development, see Halmos (1970), Higham (1998), pages 1-8 of Knuth, Larrabee, and Roberts (1989), Krantz (1997, 2001), or Swanson (1999).

c. Avoid paragraphs of extreme length, that is, one-sentence paragraphs and those exceeding 200 words.

d. Place the important conclusions in the stress position at the end of the paragraph.

5. Allocate space to a topic in proportion to its relative importance.

6. For methodology papers, emphasize the concepts of general applicability that underlie the solution procedure rather than the technical details that are specific to the problem at hand. Supply only the technical details and data that are essential to the development.

7. For application papers, emphasize the new insights into the problem that you gained from designing, building, and using the simulation model.

8. Use standard technical terms correctly.

a. For standard usage of mathematical terms, see James and James (1992) and Borowski and Borwein (2002). For example, a nonsquare matrix cannot be called "orthogonal" even if any two distinct columns of that matrix are orthogonal vectors.

b. For standard usage of statistical terms, see Dodge (2003), Porkess (2005), and Upton and Cook (2014). For example, the probability density function of a continuous random variable cannot be called a "probability mass function."

c. For standard usage of computer terms, see The Free On-Line Dictionary of Computing (Howe 1993) and Dictionary of Algorithms and Data Structures (Black 1998).

d. For standard usage of industrial engineering terms, see Industrial Engineering Terminology (IISE 2000). For example, the time that a workpiece spends in a manufacturing cell may be called "cycle time" or "flow time" but not "throughput time."

9. Avoid illogical or potentially offensive sexist language. See Miller and Swift (2001) for a commonsense approach to this issue.

10. Strictly avoid the following:

a. religious, ethnic, or political references;

b. personal attacks;

c. excessive claims about the value or general applicability of your work; and

d. pointed criticism of the work of other people.

Such language has no place in scientific discourse under any circumstances, and it will not be tolerated by the proceedings editors. With respect to vendor sessions, items c and d immediately above require authors to avoid invidious comparisons of their products with competing products.

11. In writing the final section of the paper containing conclusions and recommendations for future work, you should keep in mind the following maxim:

The mark of a good summary is revelation: "Remember this, reader? And that? Well, here's how they fit together." (van Leunen 1992, 116)

C. For each table, compose a caption that briefly summarizes the content of the

AIP Style Manual

The Craft of Scientific Writing

The Careful Writer: A Modern Guide to English Usage

Dictionary of Algorithms and Data Structures

Communicating in Science: Writing a Scientific Paper and Speaking at Scientific Meetings

Collins Web-linked Dictionary of Mathematics

Fowler's Dictionary of Modern English Usage

Writing for Your Peers: The Primary Journal Paper

Line by Line: How to Improve Your Own Writing

The Oxford Dictionary of Statistical Terms

The Little, Brown Handbook

The Chicago Guide to Grammar, Usage, and Punctuation

How to Write and Publish a Scientific Paper

The Science of Scientific Writing

Webster's Third New International Dictionary of the English Language

Sin and Syntax: How to Craft Wickedly Effective Prose

How to Write Mathematics

Handbook of Writing for the Mathematical Sciences

The Free On-Line Dictionary of Computing. London: Imperial College Department of Computing

Technical Writing and Professional Communication for Nonnative Speakers of English

GA: Institute of Industrial Engineers

ISO 80000-2: Quantities and Units: Part 2: Mathematical Signs and Symbols to Be Used in the Natural Sciences and Technology. Geneva: International Organization for Standardization

Mathematics Dictionary

Mathematical Writing

Primer of Mathematical Writing: Being a Disquisition on Having Your Ideas Recorded

Handbook of Typography for the Mathematical Sciences

Successful Scientific Writing: A Step-by-Step Guide for the Biological and Medical Sciences

Writing a Technical Paper

The Handbook of Nonsexist Writing

ANSI/NISO Z39.14-1997 (R2009): Guidelines for Abstracts

Woe Is I: The Grammarphobe's Guide to Better English in Plain English

The Elements of Technical Writing

Collins Web-linked Dictionary of Statistics

Mathematical Notation: A Guide for Engineers and Scientists

Oxford English Dictionary

The Elements of Style

The Chicago Manual of Style

The Visual Display of Quantitative Information

A Handbook for Scholars. Rev

In How to Use the Power of the Printed Word

The Aims of Education

Style: The Basics of Clarity and Grace

Style: Lessons in Clarity and Grace

Responsible Authorship and Peer Review

An Outline of Scientific Writing: For Researchers with English as a Foreign Language

On Writing Well: The Classic Guide to Writing Nonfiction

AUTHOR BIOGRAPHY

His current research interests are focused on probabilistic and statistical issues in the design and analysis of simulation experiments. He has held the following editorial positions: departmental editor of Management Science

During the period 1997-2004, he was a member of the WSC Board of Directors corepresenting the INFORMS Simulation Society; and he served as secretary

b. Summarize any unresolved issues that should be the subject of future work.

c. State the final conclusions explicitly in plain language.

Writing the paper

A. Prepare an abstract that is concise, complete in itself, and intelligible to a general reader in the field of simulation. The abstract may not exceed 150 words, and it should not contain any references or mathematical symbols.

1. Summarize the objectives of the paper.

2. Summarize the results and conclusions.

3. State the basic principles underlying any new theoretical or experimental methods that are developed in the paper.

4. For complete instructions on the preparation of scientific abstracts, see Guidelines for Abstracts (NISO 2010), pages 91-93 of Carter (1987), page 5 of the AIP Style Manual (AIP 1990), or chapter 9 of Gastel and Day (2016).

B. Write the rest of the paper as though you were talking to a group of interested colleagues about your work.

1. Strive for accuracy and clarity above all else.

2. In writing the introduction, you should remember the following maxim:

The opening paragraph should be your best paragraph, and its opening sentence should be your best sentence. (Knuth, Larrabee, and Roberts 1989, 5)

You cannot achieve such an ambitious goal on the first try; instead, as you add new sections to the paper, you should review and revise all sections written so far. For more on the spiral plan of writing, see pages 131-133 of Halmos (1970).

a. Like the abstract, the introduction should be accessible to general readers in the field of simulation.

b. For methodology papers and advanced tutorials, substantially more advanced background may be assumed in the sections following the introduction.

3. In constructing each sentence, place old and new information in the respective positions where readers generally expect to find such information. For an excellent discussion of the principles of scientific writing based on reader expectations, see Gopen and Swan (1990) and Williams and Bizup (2014, 2017).

a. Place in the topic position (that is, at the beginning of the sentence) the old information linking backward to the previous discussion.

b. Place in the stress position (that is, at the end of the sentence) the new information you want to emphasize.

c. Place the subject of the sentence in the topic position, and follow the subject with the verb as soon as possible.

d. Express the action of each sentence in its verb.

4. Make the paragraph the unit of composition.

a. Begin each paragraph with a sentence that summarizes the topic to be discussed or with a sentence that helps the transition from the previous paragraph.

b. Provide a context for the discussion before asking the reader to consider new information.

D. For each figure, compose a caption (or legend) that explains every detail in the figure: every curve, point, and symbol. See the AIP Style Manual (AIP 1990) or chapters 17 and 18 of Gastel and Day (2016) for excellent examples.

E. Revise and rewrite until the truth and clarity of every sentence are unquestionable.

1. For questions about the rules of English grammar and usage, see Bernstein (1965), Butterfield (2015), Fowler and Aaron (2016), Garner (2016), Hale (2013), O'Conner (2009), Strunk and White (2000), the Oxford English Dictionary (Simpson and Weiner 1989), and Webster's Third New International Dictionary of the English Language, Unabridged (Gove 1993).

2. For those who use English as a second language, particularly helpful references are Booth (1993), Fowler and Aaron (2016), Huckin and Olsen (1991), and Yang (1995).

3. For guidelines on how to edit your own writing effectively, see Cook (1985).

4. For a comprehensive discussion of all aspects of scientific writing, see Alley (1996) and Gastel and Day (2016).

F. Prepare a complete and accurate set of references that gives adequate credit to the prior work upon which your paper is based.

1. The author-date system of documentation is required for all papers appearing in the Proceedings of the Winter Simulation Conference. Chapter 15 of The Chicago Manual of Style (University of Chicago Press 2010) provides comprehensive, up-to-date information on this citation system.

2. In preparing your list of references, you should strive for completeness, accuracy, and consistency. Using the information provided in your list of references, the interested reader should be able to locate each source of information cited in your paper.

4. The final electronic version of your paper, that is, the portable document format (PDF) file ultimately produced from the Word or LaTeX source file of your paper, may include external hyperlinks referring to some of the electronic sources cited in the paper that are accessible online.

a. If an external hyperlink is live, then it is colored blue; and when viewing the PDF file of your paper on a computer, the reader may select (click) that hyperlink for immediate online access to the cited material. More specifically, selecting (clicking) a live external hyperlink will activate the reader's web browser so that, if all goes well, the cited source of information will be displayed in the web browser. A live external hyperlink may also be used to activate the reader's e-mail software for sending a message to a specific e-mail address; for example, see the hyperlink given in the first paragraph of this document.

b. If an external hyperlink is not live, then it is colored black; and such a hyperlink merely displays the URL or DOI of the cited material without providing a mechanism for immediate online access to that material.

c. If you use external hyperlinks in your paper, then you must ensure that the text displayed for each external hyperlink is correct and complete so that a reader who has only a hard copy of the paper can still access the cited material by (carefully) typing the relevant displayed text of the hyperlink into the address bar of a web browser or e-mail program. Remember that your responsibility for the accuracy and completeness of each hyperlink in your paper parallels your responsibility for the accuracy and completeness of each conventional citation of a nonelectronic source; neither the editors nor the publisher of the proceedings can verify any of this information for you.

G. See Wilson (2002) for a discussion of the following ethical and "strategic" considerations in writing a scientific paper that will be considered for publication in a peer-reviewed journal or conference proceedings such as the Proceedings of the Winter Simulation Conference:

1. achieving a consensus among collaborators on who should be a coauthor of the paper;

2. achieving a consensus among coauthors on the order of authorship in the paper's byline; and

3. writing the paper so as to anticipate and answer key questions that will be asked by the paper's referees and readers.

Achieving a natural and effective style

A. Alfred North Whitehead memorably expressed the gist of the matter of writing style:

Finally, there should grow the most austere of all mental qualities; I mean the sense for style. It is an aesthetic sense, based on admiration for the direct attainment of a foreseen end, simply and without waste. Style in art, style in literature, style in science, style in logic, style in practical execution have fundamentally the same aesthetic qualities, namely attainment and restraint. The love of a subject in itself and for itself, where it is not the sleepy pleasure of pacing a mental quarter-deck, is the love of style as manifested in that study.

Here we are brought back to the position from which we started, the utility of education. Style, in its finest sense, is the last acquirement of the educated mind; it is also the most useful. It pervades the whole being. The administrator with a sense for style hates waste; the engineer with a sense for style economises his material; the artisan with a sense for style prefers good work. Style is the ultimate morality of mind. (Whitehead 1929, 12)

Kurt Vonnegut made the following equally trenchant observation on writing style.

Find a subject you care about and which you in your heart feel others should care about. It is this genuine caring, and not your games with language, which will be the most compelling and seductive element in your style. (Vonnegut 1985, 34)

Strunk and White (2000), Williams and Bizup (2014, 2017), and Zinsser (2006) are excellent references on achieving a natural and effective writing style.

B. Contrast the following descriptions of an experiment in optics:

1. I procured a triangular glass prism, to try therewith the celebrated phenomena of colors. And for that purpose, having darkened my laboratory, and made a small hole in my window shade, to let in a convenient quantity of the sun's light, I placed my prism at the entrance, that the light might be thereby refracted to the opposite wall. It was at first a very pleasing diversion to view the vivid and intense colors produced thereby.

2. For the purpose of investigating the celebrated phenomena of chromatic refrangibility, a triangular glass prism was procured. After darkening the laboratory and making a small aperture in an otherwise opaque window covering in order to ensure that the optimum quantity of visible electromagnetic radiation (VER) would be admitted from solar sources, the prism was placed in front of the aperture for the purpose of reflecting the VER to the wall on the opposite side of the room. It was found initially that due to the vivid and intense colors which were produced by this experimental apparatus, the overall effect was aesthetically satisfactory when viewed by the eye.

The most striking difference between these two accounts of the experiment is the impersonal tone of the second version. According to version 2, literally nobody performed the experiment. Attempting to avoid the first person, the author of version 2 adopted the third person; this in turn forced the author to use passive verbs. As Menzel, Jones, and Boyd (1961, 79) point out, "Passive verbs increase the probability of mistakes in grammar; they start long trains of prepositional phrases; they foster circumlocution; and they encourage vagueness." Notice the dangling constructions in the second sentence of version 2. Version 1 was written by Isaac Newton (1672, 3076). Even though it was written over 340 years ago, Newton's prose is remarkable for its clarity and readability.

C. To achieve a natural and effective writing style, you should adhere to the following principles that are elaborated in chapter 5 of Menzel, Jones, and Boyd (1961):

1. Write simply.

2. Use the active voice.

3. Use plain English words rather than nonstandard technical jargon or foreign phrases.

4. Use standard technical terms correctly.

5. Avoid long sentences and extremely long (or short) paragraphs.

6. Avoid slavish adherence to any set of rules for technical writing, including the rules enumerated here.

7. Remember that the main objective is to communicate your ideas clearly to your audience.

In writing a paper for publication in the Proceedings of the Winter Simulation Conference, the author should keep in mind the key considerations outlined in this paper. Questions and suggestions for improvement of this document are welcome.

These guidelines are based on a similar document prepared by James O. Henriksen, Stephen D. Roberts, and James R. Wilson for the Proceedings of the 1986 Winter Simulation Conference.