key: cord-0492227-65gv4o88
authors: Huang, Yiduo; Shen, Zuojun Max
title: Optimizing timetable and network reopen plans for public transportation networks during a COVID19-like pandemic
date: 2021-09-08
journal: nan
DOI: nan
sha: 2267e24076b4dcc60e07706665ca12e4fa5e9ec2
doc_id: 492227
cord_uid: 65gv4o88

The recovery of the public transportation system is critical for both social re-engagement and economic rebooting after the shutdown during pandemic like COVID-19. In this study, we focus on the integrated optimization of service line reopening plan and timetable design. We model the transit system as a space-time network. In this network, the number of passengers on each vehicle at the same time can be represented by arc flow. We then apply a simplified spatial compartmental model of epidemic (SCME) to each vehicle and platform to model the spread of pandemic in the system as our objective, and calculate the optimal open plan and timetable. We demonstrate that this optimization problem can be decomposed into a simple integer programming and a linear multi-commodity network flow problem using Lagrangian relaxation techniques. Finally, we test the proposed model using real-world data from the Bay Area Rapid Transit (BART) and give some useful suggestions to system managers.

In response to the COVID-19 outbreak, the Center of Disease Control and Prevention (CDC) has promulgated guidance, including the practice of social distancing. The state of California also made intensive efforts and became the first state to issue a mandatory shelter-in-place order, which has largely mitigated the spread of the disease. Consequently, both the demand and supply for public transport decreased drastically from March to September.

As the situation evolves, regular activities will resume in the foreseeable future. Public transit, which is an integral part of personal lives, will become the first to face the sheer volume of crowds. Great concerns along with transmission relapse may be raised again when quantifying a crowded transportation system's specific risk. It is worth noting that although many efforts have been made to model the spread of the disease and the public transportation network design problem has been thoroughly studied in the transportation literature, no clear guidance is available to public transit agencies during the recovery phase. In contrast, the infection risk of a pandemic like COVID-19 is not considered in the existing network planning and scheduling methods. Furthermore, safety measures such as social distancing and vehicle disinfection change the public transportation system's working procedure. These factors need to be considered when drafting the recovery plan.

Therefore, a reopening design method considering the risk of new infection is needed for public transportation and becomes more urgent in the coming months. Taking the active social distancing order and the risk of new infection into consideration, we address the following problems simultaneously: (1) Network reopening design: how to re-gauge the network and determine which lines and stations to open in the given region. We need to decide if we can open each line or station at the beginning of the day;

(2) Timetable design: how to design a timetable to satisfy all demands and control the risk of new cases. We need to decide when we should dispatch one train on line l, assuming a fixed travel time. We say one train on line l is dispatched at time t if one train of line l leaves its first stop and start a new run. We call one train operation a 'dispatch' and we use term 'run' and 'dispatch' interchangeably in this paper.

The COVID-19 pandemic has dramatically impacted the global economy in 2020, with 167K deaths and > 5M confirmed cases in the US alone by September 2020 [28] .

The transmission mechanism of COVID-19 and other similar epidemics have been studied by public health researchers from January to August, and new results are being consistently obtained. It is already known that COVID-19 can be transmitted directly from person to person through respiratory droplets [14] , and it is likely that "a person can get COVID-19 by touching a surface or object that has the virus on it" [15] . The disease is highly contagious, with a reproduction rate of R 0 estimated to be 2.7 to 5.7 in Wuhan, China [42] in September 2019 and 6.4 in New York State [27] in March 2020.

To model the transmission of COVID-19 or other pandemic in a public transit network, the spatial nature of the pandemic's spread should be captured explicitly. There are plenty of studies on modeling the spread of pandemics in the mathematical epidemiology literature (see books such as [3] , [12] , and [37] ). Generally, the pandemic spread can be modeled using a network where each node represents one location (a city, province, or block). Inside each node, the pandemic spread according to differential or difference equations, similar to the Susceptible, Infectious, or Recovered (SIR) model.

During the COVID-19 pandemic, researchers have further developed these models to evaluate COVID-19 travel policies. Chinazzi [17] studied the impact of travel restrictions in China. They used the global epidemic and mobility model (GLEAM) and model regions as nodes in a metapopulation network based on air transportation data. Jia et al. [29] developed a spatiotemporal 'risk source' model to assess the risk of COVID-19 community transmission using the actual movement data of cell phone users. They modeled the infection based on a multiplicative exponential model and predicted the number of new infections in cities or provinces outside Wuhan. Li et al. [33] simulated the spatiotemporal dynamics of infection and inferred important parameters using an iterative filtering-ensemble adjustment Kalman filter framework. They concluded that the transmission rate β in spatial compartmental models is 1.12 (95% CI [1.06, 1.19] ). This parameter captures the rate at which an infected agent infects others and plays an important role in our model.

Based on these models, operations management and economics researchers have studied the optimal lockdown or reopen decision making aspects during the pandemic. Birge et al. [11] proposed a spatial compartmental model of epidemics (SCME) that was an extension of the works of [3] [12] [37] .They optimized the neighborhood travel ban/reopen plan policy in New York City. [31] proposed a decision-making model implemented at Yale University. Alvarez et al. [4] attempted to find the optimal lockdown policy to minimize the loss. Other researchers such as [1] [22] [23] used heterogeneous Susceptible-Infectious-Removed (SIR) or Susceptible-Exposed-Infectious-Removed (SEIR) models to make critical decisions on the lockdown period or medical supply distribution during the pandemic.

The COVID-19 pandemic has heavily impacted public transit systems in the US. For example, in the bay area, Bay Area Rapid Transit (BART) implements social distancing procedures, and masks are mandatory. They use hospital-grade disinfectants to clean the carts multiple times a day [7] . AC transit, who provide bus services to the east San Francisco Bay Area, have modified the timetable and canceled some line services [2] . Ridership in New York dropped by 92% in mid-April, and night services are no longer available [20] . Similar measures have been implemented by different transit agencies across the US [6] [16] .

To date (May 2021), there is no confirmed evidence of COVID-19 transmission via public systems as long as proper precautions and necessary safety measures are in place [30] [41] [43] . However, the public transit system may play some role in the spread of the disease [15] which is supported by the early cases in Hunan and Zhejiang, China [36] [44] , and the risks must be considered when a reopening plan is drafted for the transit systems. Transportation researchers are now trying to model the pandemic within public transportation systems [38] [32] [35] . Mo et al. [38] modeled the pandemic using an public transportation encounter network (PTN) and an epidemic transmission model.

The aim of this study is to integrate public transit network reopen plan and timetable optimization. The transit network reopen is a network design problem aiming to determine the network layout, frequencies, timetables, and ticket prices. Transportation network design has been studied for decades. The reader can refer to survey papers such as [19] [24] [26] and some of the latest papers on this topic [5] [13] [18] .

The integrated transit network and timetable design problem can be formulated as a two-level network design problem. The upper level involves the design problem with integer design variables, while the lower-level problem is the traffic assignment problem with path or link flow variables. The formulation of the lower-level problem depends on the assumption of the user routing choice. One typical assumption is user equilibrium (UE), that is, Wardrop's first principle. UE assumes that a Nash equilibrium will be achieved among all users once the network design is given. For example, in [21] , the UE condition was formulated as variational inequalities. Their algorithm can determine a near-optimal network layout and frequency design of a transit system. However, the arc capacity constraints were not explicitly stated in their model. Verbas and Mahmassani [47] assumed an elasticity of travel demand and optimized the frequency allocation.

Another common assumption made when solving the integrated network and timetable design problem is the system optimal (SO), that is, Wardrop's second principle. It assumes that the manager can control the route choice of all users. When the SO is assumed, the two-level problem becomes a one-level optimization problem because the decision-maker determines all the decision variables, so the problem is usually easier to solve than UE. For example, in Niu's papers [39] and [40] , the authors only considered a railway corridor, and there were no transfers so that the SO could be defined analytically. Although they only considered one simple corridor, they used a space-time network with a time dimension, so their nonlinear network UE was still complicated to solve. Szeto and Jiang [45] considered a capacitated static transit network with transfers, and as the lower-level problem became linear, the SO could be solved efficiently.

Besides the SO and UE, users can be assumed to be boundedly rational, where the users will comply with the system assigned path as long as the path does not cost much more than the current shortest path. Liu and Zhou [34] used this assumption and built an agent-based space-time network. Their model captured many subtle behaviors of the passengers, such as transfer and waiting time on platforms.

The contributions of this work can be summarized as follows: (1) Providing references to transportation managers. We will answer several important questions: How can the public transportation system look ahead to meet the challenge of demand during the pandemic? How can the existing transit network be reopened? How can we design an optimal schedule? (2) Novel formulation of a pandemic in public transit systems. We use a space-time dynamic transit network to capture complicated behaviors such as transfers and waiting at platforms, and we capture the risk of pandemic transmission using a simplified spatial compartmental model of epidemics (SCME) [11] (3) Efficient algorithms for similar problems. Although the full formulation is difficult to solve because of a huge number of variables and constraints, we show that with Lagrangian relaxation, the problem can be decomposed into two subproblems. One subproblem involves a multi-commodity network flow problem, while an off-the-shelf solver such as Gurobi [25] can be used to solve the other subproblem.

The remainder of this paper is organized as follows. In section 2, we present the assumptions of our model, and detail the formulation of our dynamic transit network considering the pandemic risk and social-distancing measures. A simple example is used to illustrate the major features of the dynamic network. In Section 3, we formulate the dynamic network design problem with pandemic risk as our objective function and social-distancing constraints, and the Lagrangian relaxation procedure used to solve this problem is presented in section 4. In section 5, we validate our model using real-world data from the BART system in the San Francisco Bay Area and provide an optimal timetable design. In section 6, the conclusions and potential future research questions are presented.

Consider a city planing to reopen public train/bus network under a pandemic similar to COVID-19. We will call this pandemic 'the pandemic' in the following sections. We will build a space-time passenger flow network in section 2.1, and we will integrate epidemic/pandemic model with this network in section 2.2. In each station s, there could be one or many platforms plt j ∈ P LT (s), where P LT (s) is the set of platforms in station s. In many real-world systems, multiple lines can share one platform while it is also common that one station has many platforms for different lines.

Let the set of all candidate bus/train lines be L = {0, 1, ..., |L|}, and the set of lines passing station s ∈ S using platform plt j be L(s, plt j ). One line consists several arcs connecting platforms from one station to another.

There are three types of nodes in our network: origin, destination, and platforms. We use tuple (i, k) to index nodes in N 0 , where i ∈ S ∪ O ∪ D There are also several types of arcs, as is illustrated in figure 1 , including walking arcs between origins/destinations and stations, in-vehicle travel arcs between platforms of the same line at two different stations, and transfer arcs from one platform to another at the same station. Note that there are two types of transfers: transfers using the same platform, and transfers using different platforms. We use set A 0 to represent all arcs described above. The physical network is defined as

We can illustrate the network using a simple example in figure connects station C to D. Consider one passenger travel from the 'resident' area to the 'commercial' area. He/she can take line 1 from station B to station A, transfers to line 2 at station A, transfers to line 3 at station C, takes line 3 from station A to station C, and walk to the commercial area. His/her trajectory is plotted in figure 1 using red arcs. Note that in station A, there are two platforms so that transfer passengers need to change their platforms, while in station C, all lines use the same platform.

During a pandemic like COVID-19, the capacity of each vehicle is strictly limited according to the social distancing rules. Therefore, we need to know the exact number of passengers on each vehicle and impose a strict social distancing capacity constraint, instead of just calculating the average number of passengers and compare it with the capacity. In addition, we need to track all passengers who stayed at the same vehicle or waited at the same platform with patients, since infection may happens on platforms or vehicles. Besides, we want to know the exact waiting time to optimize the timetable so that passengers can transfer between different lines smoothly.

We cannot answer these questions using the physical network in section 2.1.1, which can only give network flows averaged over time. We need to add an time dimension to the network and model the city as a space-time network similar to the models in [18] [39] [40] . The idea is to make many copies of the nodes physical network and connect them using different types of arcs representing in-vehicle travelling, waiting on the platforms, walking, and early/late departure/arrival.

Recall that the set of locations is O ∪ S ∪ D, the set of lines is L, and for each station s ∈ S, L(s, plt j ) ⊂ L is the set of lines coming through station s using platform plt j . Consider a discrete time horizon T := {1, 2, 3, ...}, we make |T | copies of N 0 in the physical network and add a time index to each node. In this network, each node can be represented by a tuple (i, k, t), where i ∈ O ∪ S ∪ D is the physical location, k ∈ {ori, dest} ∪ {plt j ∈ P LT (s)} is the type of the node (defined in 2.1.1), and t ∈ T is the time step. Mathematically, the set of nodes is defined as:

Where we make |T | copies of the node set in physical network:

In words, node set N t at each time step t is the union of the set of origin nodes, the set of destinations, and the set of all platforms. N is the union of all |T | of N t s. Using the nodes defined, let the platform for line l at station s be plt(l, s), we can construct the whole space-time network using the following 6 types of arcs:

Where sets are defined:

• A transf er := {(s, plt(l 1 , s), t) → (s, plt(l 2 , s), t), ∀s ∈ S, l 1 , l 2 ∈ L(s), t ∈ T } contains arcs between two platforms the same station, where L(s) is the set of lines travelling through station s. Note that we ignore the walking time inside the stations.

, t ∈ T , s 1 , s 2 are two consecutive stations on line l} contains arcs between platforms of the same line at different stations. These are in-vehicle travel arcs, where ∆t s 1 ,s 2 ,l is the train running time from s 1 to s 2 on line l, plt(l, s) is the platform that line l used at platform s.

• A wait := {(s, plt(l, s), t) → (s, plt(l, s), t + 1)|s ∈ S, l ∈ L(s, plt i ), t ∈ T } contains arcs between the same platform at different time period.

These are platform waiting arcs where passengers can wait until the next available train/bus arrival, plt(l, s) is the platform that line l used at platform s.

t ∈ T , plt j ∈ P LT (s)} contains walking arcs from stations to destinations, where ∆t s,d is walking time from s to d. 

These arcs connects all destination nodes of the same location together, so as long as the passenger arrive in their destination, they can complete their journey.

We define the set of time dependent OD pairs:

and for each w ∈ W, the demand is d w .

Consider the previous example in figure 1 , the passenger's trajectory on the space-time graph is in figure 2. The whole process is:

1. Walking from resident to station B's platform plt 1 . This arc is in set

2. Waiting at the platform until the next train from line 1 arrives at station B. These arcs are in set A wait

3. Riding a train/bus from B to A using line 1. 5. Waiting at station A. These arcs are in set A wait .

6. Riding line 2 to station C, assuming the travel time from station A to C on line 2 is

7. Waiting for line 3 at the same platform, and travelling to station D

8. Walking to commercial area from station C, assuming it takes t 8 − t 7 to walk from station C to 'commercial' area.

(D, plt 1 , t 7 ) → (commercial, dest, t 8 )

According to pandemic models like SCME [11] models. We know that pandemic like COVID-19 transmission happens when susceptible passengers stay with patients in the same location for a certain period of time. In other words, a healthy person must spend some time with a patient together to get infected.

If we ignore the walking time inside stations (see assumption 4 in section 3.1), the pandemic transmission happens only when susceptible passengers stay with patients in the same vehicle or on the same platform for a period of time. Using the dynamics from SCME [11] , we treat each origin node o ∈ O as the place where population lived, and each in-vehicle travel link and onplatform waiting link as place where transmission happens. We ignore the dead population since our time horizon is short. We consider people in three categories: healthy but not immune (susceptible), cured or vaccinated, and infected (exposed + clinical infected + subclinical infected). The daily new cases for population living in area o ∈ O is

where IP o is the infected population in area o, ∆IP o is the expected number of new cases, T T o,(i,j) is the total time that susceptible people from area o spent on link (i, j), and IR ij is the infection rate of all passengers on link (i, j). β is the transmission rate (≈ 1.12 for COVID-19 [33] ). For example, if 5 susceptible passengers from origin o stay in the same train from i to j and the travel time is 6 minutes, then T T o,(i,j) = 5 × 6 = 30 minutes. If there is one patients in a train from i to j with other 99 healthy (susceptible) passengers, then IR ij = 0.01. Let the network flow of w ∈ W on link (i, j) be u w ij , the travel time of link (i, j) be c ij , and the ratio of susceptible population be q s (healthy people who have never been infected and have not taken vaccine) we can calculate the total time that people from o spent on link (i, j):

Let the probability that a person from OD pair w is infected be q w , we can calculate the infection rate on link (i, j):

Plug 2 and 3 into 1, we can estimate the new infections in our transit system among people from origin o,

Therefore, we can simply sum all infections together to estimate the total number of new infections in our system:

3. Network and timetable design problem under the pandemic

We make these key assumptions:

1. Users are boundedly rational. 2. Buses/trains are capacitated.

3. The OD demand at each time step is deterministic, and the system will serve all the demand. 4. Walking time inside stations is ignored. 5. Bus/train travel time between two stations are fixed. 6. Buses/trains need to be cleaned and disinfected at the last station after one run. 7. The train dwell time is ignored, and passenger boarding takes no time. 8. Assumptions on the pandemic as in section 2.2.

In assumption 1, we assume users are boundedly rational, similar to model in [34] , that the system manager can control the flow of passengers to some degree, but only when the users are deviated from the shortest path by limited amount of time. The manager can control the flo by changing the ticket price, enforcing capacity rules so that passengers are forced to change route when a cart reaches its capacity, or releasing real-time on-board crowd data so that passengers know the risk and they will avoid crowded areas for their own safety concerns.

When demand is small and capacity constraints are loose, our optimization problem 20 will give an all-pair shortest path solution, which is the only user equilibrium, because the objective function is linear. When demand is large, the optimization problem 19 gives one of the equilibrium: one user cannot improve his/her cost by changing route, because all better routes are congested. With the boundedly rational assumption, we know this equilibrium is similar to the actual equilibrium in real life, where people greedily look for the shortest path. Moreover, during the pandemic, some transit agencies got temporary power to manage passenger flow network by rejecting entrance to stations. In this case, agencies can manage the network flow and system optimal/boundedly rational can be achieved. For example, in mid-March, the government requires that every passenger submits application before using the bus system [46] .

In assumption 2, we assume that bus/train is capacitated due to the social distancing order. Although in some public transportation system like SF BART, there is no supervisors in the stations and the limit-capacity social distancing rule seems not mandatory. However, BART provides real-time onboard crowd data to public [7] and people who are concerned may check these information and decide whether boarding the next train or not. According BART's data, in most of the cases, the actual number of passengers are smaller than the requirement of social distancing capacity and the capacity constraints are valid.

In assumption 3, we assume that all time-dependent OD demand must be satisfied. Considering the fact that people who need public transportation system during a pandemic like COVID-19 usually don't have access to other means of transportation like private cars, while carpool service suspends and taxis are expensive. In addition, during a pandemic like COVID-19, people are required to 'go out only when necessary' so travel demand are deemed 'necessary'. Therefore, in our model, all demand must be satisfied, otherwise the problem is not feasible.

In assumption 4, we ignore the walking time inside the station so that we only need to consider the transmission on platforms and vehicles (see our SCME model in section 2.2). Noting that passengers spent most of their time waiting on platforms and staying in vehicles while using public transportation systems, and the time that passengers spend walking inside the stations (usually less than 1 min) can be ignored compared to waiting time and traveling time (can be > 10 min), so this assumption is close to reality.

In assumption 5, we assume that the running time between two stations is constant. Because when train/bus follows the timetable, there is no congestion in most of the mass transit systems if no accident happens. The travel time from station to station can be deemed as fixed since there is little congestion for bus system during the pandemic, and the travel time is relatively stable in railway systems.

During the pandemic like COVID-19, we know trains/buses need to be disinfected after each run. We assume in assumption 6 that this happens only at the last station.

We also ignore the train/bus dwell time at stations as assumption 7, since vehicles in mass transit like BART spend most time travelling [9] .

The last assumption is valid for public transportation systems: it is only possible that a healthy passenger to get the pandemic when an infected passenger stays within a certain distance.

We only list important notations used in our optimization model. You can find the definition of different sets of arcs and nodes in section 2.2. The cost of dinsinfection is included in this cost. Sum all these costs together and consider our budeget, we have the following budget constraint:

Using the definition from table 1, we define binary variables x lt = 1 if we run one train on line l at time t from its first station, z l = 1 if line l is opened, and y s = 1 if station s is opened. We know the binary design variables must comply:

x lt ≤ z l , ∀l ∈ L, t ∈ T (7)

x lt , z l , y s ∈ {0, 1}, ∀l, t, s

In addition, we must have enough bus at each first/last station through the time horizon.

In 9, the LHS is the total number of vehicles at time t 1 at the station s, and we need to make sure that there are always non-negative number of vehicles at each station. We define L in (s) the set of lines using station s as the last station, L out (s) the set of lines using s as the first station, and T l is the time for l to complete a run (from the first to the last station). In 10, the total number of vehicles at time 0 must be less or equal than the total available vehicle number.

We use the path-flow formulation in our model, let P w be the set of paths for OD pair w, and let f p w be the flow on path w for OD pair w. Since we consider deterministic OD demand, we have the network flow conservation:

and f p w ≥ 0, ∀w ∈ W, p ∈ P w

We can calculate the flow on each arc using path flow, using the 0-1 indicator α p ij : α p ij = 1 if path p takes arc (i, j), = 0 otherwise.

We also have the social-distancing capacity constraints for each vehicles:

Consider the design variables, no flows are allowed if there is no available line/station/train run, which are:

where A travel (l, t) ⊂ A travel is the set of arcs corresponding to train/bus of line l departure at time t

where A os (s) ⊂ A os , A sd (s) ⊂ A sd are the sets of arcs correspond to station s.

For each OD pair w, there would be a shortest path solution SP (w) on physical network, the travel cost of this path is (i,j)∈SP (w) c ij . We have the boundedly rational constraint:

This constraint means that the route that the users choose cannot be much worse from the shortest path. This set of constraints can also be written as p ∈ P w , where P w is the set of paths satisfying boundedly rational conditions.

We are looking forward to control the total number of pandemic contraction within the system while all passengers are satisfied. Using the expression of total new infections from equation 5 in section 2.2 , the problem is min u,v,x,y,z,r,Ms βq s

such that 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17 , and 18 are satisfied. Note that the goal to minimize the pandemic contraction coincide with the goal to minimize total passenger travel time since we have a linear objective function with positive coefficient proportional to link travel time.

To make the problem feasible when doing numerical iterations. We add arcs A od := {(o, ori, t) → (d, dest, t), o ∈ O, d ∈ D, t ∈ T } between origin and destination nodes with large penalty c ij = M , and flow on this path doesn't subject to rational bounded condition 18 

Here the second term is not the travel time, but a large penalty for unsatisfied demand. Arcs in A od can be used only when this OD demand cannot be served and problem is infeasible. We cannot directly add the travel cost and the cost of new cases since the travel cost is in short term while the cost of new infection is in long term, and it's difficult to estimate the net present value of pandemic cost, or the value of time for each passenger during the pandemic. Also, we know that minimizing the total pandemic contraction can help minimizing total travel time, therefore, We don't need to add link travel cost to the objective function.

To find the optimal design, we can use Lagrangian relaxation method by relaxing constraints 16 and 17. Let multipliers of 16 be π lt for each line l at time t, and multipliers for 17 be µ s for station s. We have the following dual function:

L(π, µ) := min u,v,x,y,z,r,Ms βq s

such that 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 and 18 are satisfied. Therefore, we can decompose the problem into two sub problems,

The first sub problem is:

such that 11, 12, 13, 14, 15, and 18 where lt(i, j) is the line and time correspond to arc (i, j) if (i, j) ∈ A travel , and s(i, j) is the station that correspond to arc (i, j) if (i, j) ∈ A os ∪ A sd .

The second sub problem is

such that 6, 7, 8, 9, 10.

For SU B 2 , we have an integer programming with small number of integer variables, which can be solved efficiently using solvers like GUROBI [25] . If the manager don't care about the fleet problem, constraints 9, 10 can be ignored and this problem become a bin packing problem.

SU B 1 is a linear min-cost multi-commodity network flow problem with boundedly rational route choice conditions. We have introduced the path flow variable in the last section, here we can rewrite all link flow variable using path flow variables, and then solve it using column generation. We can reformulate SU B 1 as

such that 11, 12, 18 and w∈W p∈Pw

, and c w ij = 0 for all other arcs. We call c w ij the link cost with multipliers. Note that 26 is equivalent to 15 and 14 combined.

Note that although SU B 1 has exponentially many variables, the number of constraints are significantly smaller. We can apply column generation techniques to SU B 1 with 11, 12, 26 and 18:

Given a set of paths, if the dual variables for each equation in flow conservation 11 is λ w , and the dual variables for each capacity constraint in 26 is λ ij , then the reduced cost for each path flow f p w is C p − λ w − ij∈p λ ij , where we define the total path cost

To find the minimum reduced cost for one OD pair w, define all the set:

Here, P w contains all paths for OD pair w that satisfies boundedly rational condition and one path that directly connect origin to destination but with large link cost as penalty. We only need to solve the following problems to generate a path:

If −λ w + min p∈P w (C p − ij∈p λ ij ) < 0, we can generate a new path using the path with minimum (C p − ij∈p λ ij ). Since λ w is a constant given w, C p > 0 and λ ij ≤ 0, this path can be found by solving a shortest path problem with non-negative link cost. If the minimum reduced cost is negative, we can add this path to basis and do one simplex iteration. Note that we only search paths in P w , if the shortest path (with respect to link cost c w ij ) that we found doesn't satisfy 18, and the reduced cost of this path is negative, we will try to find the second shortest path and so on, until the reduced cost in 28 become non-negative.

The upper bound can be found using the opening plan from SU B 2 . We then apply the column generation algorithm on this network (with opened link only, and multipliers π = 0 and µ = 0) to find the network flow solution. Note that this problem is always feasible using objective function in 20, but the cost will be extremely large if some demand cannot be satisfied by the public transportation system.

We know the dual function is concave, the subgradient for each π lt is SG lt := (i,j)∈A travel (l,t) v ij − x lt (i,j)∈A travel (l,t) N ij , and the subgradient for each µ s is SG s := (i,j)∈Aos(s)∪A sd (s) v ij − y s (i,j)∈Aos(s)∪A sd (s) N ij . Our Lagrangian relaxation (LR) algorithm is summarized below: LR Algorithm

Step 1. Initialize multipliers, and solve SU B 1 with these multipliers using column generation.

Step 2. Solve SU B 2 using solvers

Step 3. Check if |SU B 2 − SU B 1 | is small enough or we have done more than 1000 iterations.

Step 4. Find an upper bound based on the result from SU B 2 using column generation algorithm for SU B 1 , but only with opened links and 0 multipliers.

Step 5. Calculate sub-gradient (i.e. constraint violation), update the multipliers and integer variables using Line search. Go to step 3.

The step length is chosen using the equation below and line search,

where U B is the best upper bound obtained so far and LB(π t , µ t ) is the value of dual function at this iteration. We set α t = 0.1 at first. At each LR iteration, we will do several steps of line search Line search

Step 1. Calculate γ t using 29, k = 1

Step 2. Calculate U B = U B((π lt + γ t SG lt ) + , (µ s + γ t SG s ) + ) by solving SU B 1 ((π lt +γ t SG lt ) + , (µ s +γ t SG s ) + ) using column generation and SU B 2 ((π lt + γ t SG lt ) + , (µ s + γ t SG s ) + ) using MIP solver.

Step 3. If U B < U B and k < max k , then k = k + 1, γ t = γ t /2 and go to step 2, otherwise update π lt = (π lt + γ t SG lt ) + , µ s = (µ s + γ t SG s ) + and stop. • Transmission rate of the pandemic β

In this model, the contingency of the pandemic represented by β. According to [33] , β = 1.12person/day = 1.12 60×24 person/min. The definition of this parameter can be shown using an example: if we have 1000 susceptible people spent one day in a place where 5% of population is infected, then the expected number of new infection is 1000 × 0.05 × 1.12 = 56 persons. We set the time tolerance for each user to be T L = 30 min in the toy model.

The cost parameters estimated in the BART example in section 5.2 are used. In addition, the pandemic infection rates in different counties are extracted using the JHU data [28] on May 10, 2021. We assume that stations in this toy problem are located in different counties in SF bay area. We set the cost of opening one line to be twice the price of operating one train run.

The OD demand is generated randomly from a simulated Poisson distribution. At every minute, for every OD pair, the demand is P oisson(Demand) distributed, where Demand is the demand intensity. We set Demand = 5/6 in this example (50 passengers per hour). As this is an easy problem, it can be solved using GUROBI solver [25] and compared with the lower bound determined using the Lagrangian relaxation. The iteration process is shown in figure 4 . The lower/upper bound after 1000 iterations is (1.4497, 1.4947), while the exact solution of the solver is 1.4947 in the toy problem. The proposed algorithm has a relatively tight bound (< 4%) for the toy problem, and the optimal solution is found when solving the upper bound. Therefore, our algorithm can correctly identify the optimal solution in this example.

When budget is limited, we cannot operate too many train/bus runs. The total number of pandemic infection will increase as we cut the budget (See 5). We can see there is a minimum budget where the model become infeasible, and both of the pandemic cases and the total travel time are more To show the impact of budget, we will use the best upper bound result after 1000 LR iterations for different parameters. We set the capacity to be 600. Note that our algorithm is just a heuristic and might not be the actual optimal solution, but just an upper bound.

In this example, when budget is greater than 70% of the total budget provided in the last example, we can see the optimal solution will not change since the budget constrain is loose. When budget is smaller than 70%, the pandemic infection (our objective function) will increase when we cut the budget further. When budget is smaller than 45% of the total budget, the problem could be infeasible, since we cannot find a feasible solution satisfying all demand after 1000 LR iterations. Therefore, the budget transition point is < 45%.

Note that the pandemic cases (objective function) is more sensitive when budget level gets smaller (see fig 5 (a) ). This is because the LP relaxed lower bound F (b) := LP relaxed optimal objective function with budget b is convex with respect to b. Although the actual optimal objective function is not convex with respect to budget since we have a lot of integer variables, the trend is similar: the new pandemic cases is not sensitive when budget is great, but will be sensitive when budget is small.

For total travel time, we can see in fig 5 (b) that the trend of total travel time is the same as the pandemic cases, even if we don't have total travel time explicitly in our objective function. This is because the goal to minimize the pandemic contraction coincide with the goal to minimize total passenger travel time (see section 3.7). When the capacity of one train is limited, we need to assign more train runs to the system to satisfy all demand. However, with the budget constraint, the number of open lines and train runs are limited, and the objective function (the number of new cases) will increase when we have smaller capacity for each train.

The trend of the new cases and total travel time are similar to what happens when we change budget: there is a minimum capacity where this problem become infeasible, and both of the new cases and total passenger travel are more sensitive when the capacity gets closer to the minimum capacity.

We change the capacity of one train in this model to see how the new case will change when we have different level of capacity (see fig 6) . For each different parameter, we set the budget level to be 60% and did 1000 iterations of our LR algorithm and use the best upper bound as the result.

In our experiments, we can see that when capacity is greater or equal to 400, the objective function is the same. While when capacity become less than 320, there is still no feasible solution found after 1000 LR iterations, so the transition point is < 320. The trend of total travel time is similar, but with some fluctuations since the total travel time is not our objective function but is just positively correlated to the objective (new cases), and we are only using approximated algorithms here.

Also note that the transition point for budget and capacity are correlated. If the budget is smaller then the transition point for capacity is greater. For example, we also did the sensitivity analysis when the budget level is 100%. In this case, the transition point for budget is less than 110, which is much smaller than 320.

BART is a rapid transit system that serves the San Francisco Bay Area. It spans 121 miles of double track, consisting of 48 stations [10] . We consider the peak hours operation (7:00-10:00) as the time horizon.

The network is shown in figure [9] . The time horizon is discretized into 1-minute intervals. According to the timetable, it can be observed that the in-vehicle travel time is much longer than the dwell time at each station, so the assumptions are satisfied. There are many parallel links in BART, and transfers are prevalent in this system. The proposed space-time network model can capture all these network features.

The operation cost can be estimated from the BART budget report [9] . The total operating cost budget is 767.8M in 2020. As the operation time of each train dispatch for each line can be calculated, assuming that the operation cost is proportional to the operating time, the cost of one dispatch can be estimated, as shown in Table 2 .

Since the demand originated at each station is non-zero, and the walking distance between two stations in BART system is long. We know that we have to open all the lines otherwise the system would be infeasible, since we have strict demand constraints. We set the time tolerance for each user to be T L = 45min. Figure 7 : map of BART [9] During the pandemic, BART reduced operation in 2020, and now the system is being reopened according to the guidelines [7] . Disinfectants are being sprayed on the surfaces of train cars and station platforms, and face coverings are required at all times for all riders aged 13 and above. According to BART, the capacity of each car is 30 people under the 6-feet social distancing rule. As long trains are in use during the pandemic, each train's capacity is 600 people. As a platform's capacity is approximately 1.5 times the train capacity, we use 900 as the platform capacity.

In the current timetable, lines 0 and 1 have high operation frequencies as these two lines are Oakland International Airport shuttles that current employ smaller carts called AirBART. Therefore, the operation pattern is different from the other line and we will ignore the these two lines from Oakland Airport and relocate the demand from Oakland Airport to Coliseum station instead.

Hourly OD demand data from the BART API [8] are used to estimate the intensity of the Poisson flow during the decision time horizon. The demand is aggregated every 20 minutes to reduce the dimensions of the problem. That is, the desired departure time in our time dependent OD matrix must be multiplies of 20 minute, like 0,20,40,60 etc. Here, we consider operation from 8:00AM to 11:00AM. Using the proposed algorithm, the following iteration process is executed. To further simplify the model, train dispatch is only allowed every 10 min, that is, dispatch occurs only at time 8:10, 8:20, and so on.

After plugging the BART network in our model and simplifying the problem by only considering a max frequency of 10 minutes and ignoring Oakland Airport line, we got 277 0-1 variables. The number of integer variable x lt is 12 × 6 × 3 = 216 since we have 12 lines during a 3-hour time period. And we have one variable for each line and each station.

The number of continuous variables is exponentially many, but we only generate a few of them in our iteration. The network has 22630 arcs: We have 3888 links in A travel , and 2398 links in A wait using sparse storage of network by only considering time step where there could be an arrival/departure of a train at the platform. Most of the lines will share platforms except at station MacArthur, 19th St Oakland, Coliseum and Balboa Park, we have 71 + 53 + 107 + 125 = 356 links in A transf er where 71, 53, 107 and 125 corresponds to MacArthur, 19th St Oakland, Coliseum and Balboa Park respectively. In addition to these, we have 13000 links in A os and 2988 links in A sd . Noting that a large number of links in A os and A sd will not make the algorithm much more difficult to solve since they will not add too much route choice, and we use the path-flow formulation and column generation, which only requires doing one shortest path for each OD pair.

The number of non-zero OD pair is 2800 as we aggregate demand every 20min and most of the time dependent OD matrix is zero. When solving SUB1, we need to find shortest paths for 2800 OD pairs. We will keep generate paths until all paths satisfying boundedly rational conditions with negative reduced cost is found, so that the number of basis expands quickly. When most of the capacity constraints are not tight, and the network structure is not too complicated like BART, we can find the optimal solution after less than 100 column generation steps. The typical number of columns needed is less than 15000 in our iterations and we only have at most |A travel | + |W| < 5000 constraints (rows) while most of the capacity constraints are not tight.

We have to admit that if the demand level is high and most of the constraints are tight, this problem can be difficult to solve and we cannot find an feasible solution after many LR iteration steps. This is a limit of our problem. But luckily, for BART system, only some trains reaches their capacity like trains running on cross-bay link, where there is a large flow from east bay to the downtown SF every workday morning.

After 1000 LR iterations, the gap is approximately 5.4%. (LB: 0.470605, UB: 0.496054). The upper bound solution is better than the cost calculated using the current running timetable (cost: 0.500649), which is calculated based on timetable of BART weekday operation [7] using our column generation algorithm to find the optimal network flow and cost.

In the proposed optimal solution, all train lines and stations will be opened. The reason is that we need to cover all stations to have a feasible solution and we must open most of lines in BART system to make sure all stations are covered. Although passengers can walk directly from the origins to stations, the walking distance is great since the distance between two station is quite large in BART system. Therefore, passengers using the system must have access to a station nearby, and we need to open most of the lines to cover all stations. Another reason is given by experiments in section 5.2.3, the optimal design will open all lines once all stations are covered and the timetable become more flexible to reduce the number of new infections.

The optimal timetable is shown in table 3. For example if x 2,10 = 1 then we will have one train run dispatched for line 2 at time = 10min (8 : 10). It can be seen that most of the trains will have two to four runs per hour in our optimal plan, while the current timetable has only two runs. It can also be observed that lines 11 and 13 (Richmond to Millbrae, and Antioch to SFIA) has much more runs than lines 10 and 12 ( Millbrae to Richmond, and SFIA to Antioch). During the morning peak hour, people need to leave their homes in the East Bay for travel to the San Francisco CBD area, and our design captured this demand pattern.

Compared to the original timetable, our timetable has higher frequency in the first two hours (8:00AM-10:00AM) or (0-120 min). This is because the demand in the first two hours are much higher than the demand from 10:00AM to 11:00AM. In BART data, the total daily demand is 2200 in 8:00-9:00 AM, 1649 in 9:00-10:00AM and 1337 in 10:00-11:00AM.

Another major difference between our timetable and the existing timetable is that we have much fewer operations for line 8 and 9 (Millbrae to SFIA and SFIA to Millbrae). There are two reasons for this: (1). These lines serves only passengers between Millbrae and SFIA, and the number of passengers is small compared to the whole system (≈ 0.1%, with 14.75 passengers from MLBR to SFIA and 15.57 passengers from SFIA to MLBR while the whole system has an average of 27677.38 daily users). (2) . Our algorithm favors long distance lines since we assume all trains need to be cleaned after one operation, and the cleaned cost and time is fixed. Therefore, the per-mile cost is higher for short lines compared to other lines, and the algorithm tend to operate long distance lines. 

From the previous example, we know with the boundedly rational constraints, the problem can be infeasible if we don't cover all the stations, and given the stations are covered, we will open all lines in the optimal solution. To find out why it is the case and what will happen if we don't open all the lines or stations. We assume in this experiment that passengers will go to the closest station from their origin and leave the system from the station closest to their destination using the shortest path in the physical network, so the total demand is the same. We open the lines one by one following table 4. The proposed algorithm is then executed for 1000 LR iterations to find the upper/lower bound of the COVID-19 cost, optimal timetable design, and network flow assignment for each experiment. The resulting objective function (new cases), total travel time inside and outside the transit system are shown in table 5. infeasible  60734  66413  86091  80032  70073  Travel time outside the system infeasible  14134  9057  336  216  0  Line 2 runs  infeasible  14  12  13  6  8  Line 3 runs  infeasible  15  12  13  2  9  Line 4 runs  infeasible  0  9  2  10  9  Line 5 runs  infeasible  0  12  3  13  5  Line 6 runs  infeasible  0  0  13  13  7  Line 7 runs  infeasible  0  0  3  10  8  Line 8 runs  infeasible  0  0  0  0  2  Line 9 runs  infeasible  0  0  0  0  2  Line 10 runs  infeasible  0  0  0  4  4  Line 11 runs  infeasible  0  0  0  4  11  Line 12 runs  infeasible  17  14  14  2  9  Line 13 runs  infeasible  17  17  7  15  11 From table 5, we can see when we only open two lines, the demand is huge and we cannot satisfy all the demand in experiment 0. As we expand the number of lines, more stations are covered, the travel cost outside the system (time it takes for passengers to travel from origins to stations and from stations to destinations) will decrease, and the number of train runs needed for each line decreases since we don't rely solely one these lines after opening new lines. Also, we notice that the new cases or travel cost inside the system is not monotone. It depends on the network structure.

1. When we only open a small fraction of lines, not all stations are covered and many people spend most of their time walking outside the system. Therefore, the number of new infection inside the system is small. At this time, opening a new line result in a increase in demand with respect to passenger distance traveled, and the objective function (new cases) will also increase. 2. When we have the most of the network opened, all the stations are covered and we have many parallel lines. These features enable us to make flexible timetables and the optimal new infection number is small. At this time, if we close some existing lines and stations, passengers will go to nearby opened stations and the number of queuing passengers waiting will increase, leading to more infections. In addition, if we close parallel lines in the system, the designed capacity between two stations are less flexible and leads to an increase in both total travel time and the waiting time on platforms, thus increasing the number of infections. This also explained why all lines are opened in the optimal design from section 5.2.2

In this study, we combined a simplified SCME model and a space-time dynamic transit network to estimate the impact of the pandemic on the public transit system. The proposed model can predict new the pandemic cases in any transit system. We also developed an algorithm to find the optimal network reopening plan and timetable simultaneously, considering the risk of pandemic infection and social distancing rules. The proposed model and the Lagrangian relaxation frame can also be extended to other economics problems where reopening too quickly under pandemic conditions may result in a 'second wave' of the pandemic.

Based on the numerical example, it can be observed that the proposed algorithm can provide a relatively good lower and upper bounds for the design problem. However, we also notice that this algorithm can only solve relative simple problem where not too many capacity constraints are tight, otherwise the column generation will take a long time and the number of variables explodes.

From the experiments, we know that the sensitivity of budget and capacity constraints are correlated. With capacity fixed, there is a transition point for budget where the problem become infeasible if budget is smaller than this point, and both new infection number and travel time are more sensitive as budget gets closer to this transition point. Similar phenomena holds for capacity with budget fixed. In addition, with more budget, the transition point of capacity become smaller and the number of new cases is less sensitive with respect to capacity.

We also know that if we assume people who need the public transportation system will go to the closest opened station and keep using the system. Then we should either (1) open as many lines as possible and operate parallel lines, so that the system has enough flexibility to deal with time and space dependent travel demand and control the spread of the disease, or (2) only operate a small number of lines so that people are forced outside the system to prevent infections. It is not wise to cover many stations with only small number of lines. In that case, we will keep most of the demand in the system while we cannot deal with this demand with many lines closed.

Our future research will consider the dynamics in stochastic demand modeling. In the current model, the OD demand is deterministic and must be given in constraints 11 . This means that if one OD pair relies on one particular line, then this line must be opened with the optimal design. In contrast, some demand may be sacrificed to decrease the risk of the pandemic. Furthermore, the demand itself is a stochastic process (possibly Markovian) following specific transition models. It is possible to design a non-myopic network reopening plan for each period during the entire reopening phase.

Millbrae to SFIA 40

168 11 Richmond to Daly City/Millbrae 10

13 Antioch to SFIA/Millbrae

A multi-risk SIR model with optimally targeted lockdown

Mathematical epidemiology

A simple planning problem for covid-19 lockdown

Two-phase stochastic program for transit network design under demand uncertainty

San Francisco Bay Area Rapid Transit District Adopted Budget Fiscal Year 2020

Controlling epidemic spread: Reducing economic losses with targeted closures

Mathematical models in epidemiology

Mathematical programming formulations for transit network design

Covid-19: Frequently asked questions

Covid-19: How it spreads

Covid-19: using transportatoin: Public transit, rideshares and taxis, micro-mobility devices, and personal vehicles

The effect of travel restrictions on the spread of the 2019 novel coronavirus (covid-19) outbreak

Optimal design of intersecting bimodal transit networks in a grid city

A review of urban transportation network design problems

New york city subway ridership down 92 percent due to coronavirus

A continuous equilibrium network design model and algorithm for transit systems

Managing covid-19 pandemic without destructing the economy

Health versus wealth: On the distributional effects of controlling a pandemic

Transit network design and scheduling: A global review

Planning, operation, and control of bus transport systems: A literature review

State-by-state estimates of r0 at the start of covid-19 outbreaks in the usa. medRxiv

Population flow drives spatio-temporal distribution of covid-19 in china

There is little evidence that mass transit poses a risk of coronavirus outbreaks

Covid-19 scratch models to support local decisions

Modeling epidemic spreading through public transit using time-varying encounter network

Substantial undocumented infection facilitates the rapid dissemination of novel coronavirus (sars-cov-2)

Capacitated transit service network design with boundedly rational agents

Investigating physical encounters of individuals in urban metro systems with large-scale smart card data

Transmission of sarscov-2 in public transportation vehicles: A case study in hunan province, china. Open Forum Infectious Diseases 7

An introduction to mathematical epidemiology

Modeling epidemic spreading through public transit using timevarying encounter network

Optimizing urban rail timetable under timedependent demand and oversaturated conditions. Transportation Research Part C: Emerging Technologies

Train scheduling for minimizing passenger waiting time with time-dependent demand and skipstop patterns: Nonlinear integer programming models with linear constraints

Fear of public transit got ahead of the evidence

High contagiousness and rapid spread of severe acute respiratory syndrome coronavirus 2

Covid-19 : point epidémiologique du 4 juin 2020

Community Outbreak Investigation of SARS-CoV-2 Transmission Among Bus Riders in Eastern China

Transit route and frequency design: Bi-level modeling and hybrid artificial bee colony algorithm approach

Wuhan public transport will officially provide commuter services from march 16

Integrated frequency allocation and user assignment in multimodal transit networks

This research is funded by ITS-Berkeley SB 1 (Project ID: 2021-09) Conflicts of interest: None