key: cord-1030269-ufwue1ww
authors: Shirai Reyna, Olivia Sashiko; Flores de la Mota, Idalia; Rodríguez Vázquez, Katya
title: Complex networks analysis: Mexico’s city metro system during the pandemic of COVID-19
date: 2021-07-14
journal: Case Stud Transp Policy
DOI: 10.1016/j.cstp.2021.07.003
sha: 66c8fc447aa171d303dd89a4d5a57b1f309883c5
doc_id: 1030269
cord_uid: ufwue1ww

The COVID-19 (SARS-CoV-2) pandemic is changing the world, the way we socialize, and politics and public transport logistics, and Mexico is no exception. Authorities are designing new policies for public transport. The first restrictions include the closure of several metro stations to ensure social distancing, avoid overcrowding and reduce demand, which also changes the metro system and the way it operates. This paper presents a model based on the analysis and comparison of the Mexico City Metro System as a complete network and as the network without the stations that are closed due to COVID-19. Complex network analysis provides information about the connectivity, efficiency and robustness of the system, which can be used to make improvements, plan adequately, and set up policies that meet the needs of the system after COVID-19.

The Mexico City Metro System entered operation on September 4th, 1969 with Line 1. The Metro System (Figure 1) is now a metropolitan train network consisting of 12 lines covering 226 km with 195 stations, of which 115 are underground, 55 are at ground level and 25 are elevated (https://www.businessinsider.nl/mexico-city-metro-review-2019-1?international=true&r=US). The system has 48 transfer stations that permit transfer to other lines, and it has the capacity to transport more than 4.5 million users per day.
The Metro System estimates that it has the capacity to transport over 4.5 million users per day; however, real data indicate about 5.5 million passengers per day, which makes Mexico City's Metro one of the most crowded systems in the world (https://www.metro.cdmx.gob.mx/operacion/cifras-de-operacion). It serves areas of Mexico City and the State of Mexico and is managed by the Collective Transportation System (STC, Sistema de Transporte Colectivo). The price of a one-way ticket is 5 pesos (approx. 0.25 USD). It has a total of 384 trains: 321 run on pneumatic tires (292 with 9 cars and 29 with 6 cars), while 63 have railway wheels (12 with 6 cars, 21 with 9 cars and 30 with 7 cars). Table 1 shows how these trains are distributed among the lines (https://metro.cdmx.gob.mx/storage/app/media/Metro%20Acerca%20de/Mas%20informacion/planmaestro18_30.pdf). Out of this total of 384 convoys, 285 are in operation and 99 are out of service for the following reasons: 33 owing to a lack of spare parts; 20 in reserve; 17 for maintenance; 15 for general check-up; 7 because of breakdowns; 5 for modernization; one for special work; and one for the reprofiling of wheels. These convoys also have different capacities depending on their number of cars. Table 2 presents the passenger capacity per train (https://www.metro.cdmx.gob.mx/parque-vehicular), which is important because it directly affects the operation.

According to the last population census conducted by INEGI (Instituto Nacional de Estadística y Geografía, National Institute of Statistics and Geography), in 2015 there were 119.5 million inhabitants in Mexico, making it one of the eleven most populous countries in the world.
The Mexican entities with the highest number of inhabitants are the State of Mexico (16,187,608) and Mexico City (8,918,653), which together add up to over 25 million inhabitants (http://cuentame.inegi.org.mx/poblacion/habitantes.aspx?tema=P) and make up the Metropolitan Zone of the Valley of Mexico (ZMVM, Zona Metropolitana del Valle de México). Mexico City's Metro is one of the most crowded in the world. One study found it to be the second most crowded metro system in the world, just behind New Delhi, India, with up to six passengers crammed into every square meter of train during rush hours. Both the Metrobús and the Metro are oversaturated and unappealing options during commuting hours. These systems simply do not have the capacity to move the city's entire population of commuters, and they do not connect many of the suburban neighborhoods.

Public transport in Mexico City consists of a variety of systems. Apart from the Metro (https://www.globalmasstransit.net/archive.php?id=23169), there is, for example, the Metrobús, a rapid transit bus with dedicated lanes that runs along Insurgentes Avenue and other principal avenues, with a total of 7 lines; the RTP (Passengers Transport Network), whose buses operate in specific lanes along 94 routes; trolley buses with defined stops on 8 lines; the "Tren Suburbano" (Suburban Train), which connects Mexico City with the State of Mexico in the north; and traditional buses operated by private companies along defined routes but without specific stops. The main problems of the metro system result from factors such as natural wear and tear of the gears; cracks or fractures in older lines; a lack of lubrication on rails; and differential subsidence of the city's subsoil.

In December 2019 a worldwide crisis was unleashed by the new coronavirus SARS-CoV-2, which causes COVID-19.
The World Health Organization (World Health Organization, 2020a) recognized COVID-19 as a pandemic on 11 March 2020, which led many governments to implement social distancing policies or targeted lockdowns. Owing to its fast and easy propagation, the disease has spread from China throughout the world, with millions of confirmed cases and deaths (https://www.who.int/emergencies/diseases/novel-coronavirus-2019/situation-reports) (World Health Organization, 2020b) (https://covid19.who.int/table). Studying the behavior of the spread of this disease is extremely useful for its mitigation and control. Among the methods used for modeling aspects of the COVID-19 pandemic are mathematical models of the spread as well as complex networks for analyzing outbreaks like the one in Wuhan (Liu et al. 2020). Among the aspects of interest for modeling are the factors that influence the transmission of the virus. One of these factors is the rate at which the pandemic spreads in each particular place, as every country can establish its own strategies according to the conditions of its infrastructure, economy and demographics. In addition to the population's response to these strategies, the capacity of the country's infrastructure and human resources must be taken into account; in some cases, these factors can lead to a disastrous response to a pandemic. It is important to mention that the reports produced for decision makers are based on a broad range of supports, including the aforementioned analyses, and are used for the implementation of policies such as closing down borders and international flights, enacting national lockdowns, closing parts of a city or the economy, relaxing said restrictions and deploying equipment and resources for field epidemiology (Bruinen de Bruin et al., 2020).
The policies being applied worldwide for the mitigation of COVID-19 include:
• Restrictions on mobility: limiting the movement of people to contain or slow down the spread of the virus.
• Socio-economic restrictions: limiting social and economic activities where people gather for educational, recreational, sporting or work-related purposes. This includes total closure or limitations.
• Physical distancing: also known as social distancing, this indicates a suitable physical distance between people (currently defined as between 1.5 m and 2 m).
• Hygiene measures: limiting the risk of spreading the virus and the direct or indirect contamination of others by giving concise hygiene instructions, such as washing hands for at least 20 seconds, coughing and sneezing into the elbow, avoiding touching surfaces and making contactless payments.
• Communication: always keeping the general public informed about the rates of the spread of infection and the mitigation measures.

Due to COVID-19 (SARS-CoV-2), the Mexican government (https://semovi.cdmx.gob.mx/comunicacion/nota/tarjeta-informativa-cierre-temporal-de-estaciones-por-fase-3-covid-19) decided to close 36 metro stations (about 20% of the system) with the objective of reducing the number of passengers in stations and trains, reducing passenger flow, avoiding congestion and improving the speed and frequency of the trains. This measure started on April 23, 2020 and, at the time of writing this article (towards the end of May 2020), it is still uncertain when the restriction will end. The closed stations are the ones that serve fewer passengers and have no connection with other lines, because closing a transfer station would severely disrupt the connections in the network. The only line that still has all its stations open is Line 3, which runs from Universidad to Indios Verdes.
This is because it is an important line that connects the south with the north of the city and has very important stations that serve major hospitals, such as La Raza, Centro Médico and Hospital General. Other restrictions that the Mexican government has implemented are the compulsory use of facemasks within the metro system and on all public transport, the use of hand sanitizer, avoiding talking, singing or yelling during the journey, and avoiding eating inside trains and stations. Since the first restriction began in Mexico, it is estimated that the flow of passengers has been reduced by almost 80%. The government (Table 3) also decided to close 45 Metrobús stations, 4 "Tren Ligero" (tram) stations and 1 station of ECOBICI (Ecobike), the public bicycle rental system. The government has proposed measures to further restrict car use during the COVID-19 contingency in order to limit mobility during the pandemic. The policy operates like the usual "Hoy no circula" but applies to all cars, including cars with hologram 0 and 00 stickers, hybrids, and electric cars. An exception is made for public transport, special vehicles such as ambulances, fire engines and police cars, vehicles belonging to health-care workers, and freight transport.

This analysis therefore focuses on using complex networks to identify the stations with the most significant problems, as well as the vulnerability and robustness of the network. It carries out a simulation of the entire system, creating different scenarios to see how the system works under each of them.

There are some references relating to the study of the Mexico City Metro System (Dillarza-Andrade, 2017; Hernández González and Flores de la Mota, 2018; Guerra, 2014; Vera-Morales, 2017), but they examine specific aspects of Mexico City's Metro.
For example, the thesis of Dillarza-Andrade (Dillarza-Andrade 2017) only addresses boarding times at one station on one line, specifically Pantitlán, one of the most crowded stations, and uses agent-based simulation to represent the problem. On the other hand, there are several studies on transport networks in general (Háznagy et al., 2015; Louf et al., 2014; Shiau and Lee, 2017). There are also many papers that study metro (subway) systems only as complex networks and examine their topology. Some of these papers study metro or subway systems in general (Camille Roth, 2012; Derrible, 2012; Derrible and Kennedy, 2009, 2010; Drozdowski et al., 2014; Fortin et al., 2016; Gattuso and Miriello, 2005; Kim and Song, 2015; Stoilova and Stoev, 2015; Wang et al., 2015, 2017; Wu et al., 2016, 2018; Zhang, 2017). These kinds of studies can be used as general references for methodologies. Other papers use specific cases such as London, Boston, Beijing, Stockholm and Kuala Lumpur, among others (Cats, 2017; Chopra et al., 2016; Ding et al., 2015; Latora and Marchiori, 2002; Sun et al., 2018), which could be useful when comparing Mexico City's Metro System with other metro systems. There are not many studies relating public transport and COVID-19. Some papers (Koehl, 2020) discuss pollution and climate change during the lockdown in different countries. Another study (Hensher, 2020) examines mobility in public transport for the case of Australia and analyzes two scenarios for public transport and rideshare services: in the first, normal activity returns within a few months; in the second, normal service returns after more than 18 months. This study focuses on the analysis of the metro as a whole system, as a first approach.
The first step, data gathering, is perhaps one of the most difficult because all the information related to the analysis needs to be looked up. The second step analyzes the data using basic statistical techniques with statistical tools such as R (Dalgaard, 2008).

The network can then be constructed in different ways, of which the adjacency matrix is the most common. An adjacency matrix is a square matrix used to describe a network or a finite graph (Newman, 2010). A network can also be created if the nodes, the edges and the connections or relationships between them are available. Once the network has been built, the complex networks methodology is applied, specifically to analyze the topology or structure of the network through metrics such as clustering, closeness, betweenness, assortativity and other complex network metrics (Caldarelli and Catanzaro, 2012; Newman, 2010). The degree distribution of the network can be obtained as a good approximation of the network's behavior. With all the metrics and the degree distribution, networks can be classified into one of several network models (Random Networks, Small World Networks and Scale-free Networks).

The time series step consists of translating the information into time series and applying a decomposition algorithm that splits each series into three component series (Seasonal, Trend and Random); the ACF (Auto Correlation Function) and the PACF (Partial Auto Correlation Function) are then obtained. In this step, time series models such as ARIMA (Autoregressive Integrated Moving Average) models are also created in order to forecast the data (Mood, 1974).

For the simulation process, many techniques and methodologies (Banks, 1998) can be applied, and different scenarios of the network can be created in order to evaluate the various structures and their vulnerability, which makes it possible to compare networks and find which one is better.
Here, different scenarios can be proposed; for example, what happens when one node or one edge is deleted.

For the statistical analysis (Dalgaard, 2008; Mood, 1974), R, an open-source programming language and software environment for statistical computing and graphics, is used with RStudio (Version 1.2.5033). Specific R packages such as igraph, networks, tkrplot, sand, sna, forecast, TimeSeries, TSA and others were considered. This software makes it possible to generate graphs, compute network metrics such as clustering or transitivity and different centrality metrics, plot networks, create mathematical models and forecast data, among other functions. A BI (Business Intelligence) software package (MicroStrategy Desktop Version 11.2.0300.14736) is also used; it allows data to be prepared, much like an ETL (Extract, Transform, Load) process, and reports and visualizations of the data to be created.

According to the methodology (Shirai Reyna & Flores de la Mota, 2019), in the first step the number of passengers by station and quarter is gathered, from the first quarter of 2011 to the fourth quarter of 2019 (https://www.metro.cdmx.gob.mx/operacion/cifras-de-operacion).

Computing basic statistics: First, the number of passengers per line is analyzed in order to rank the lines by number of passengers. Table 4 shows that Line 2 has the greatest number of passengers. Then, the number of passengers per station is analyzed to rank the stations with the highest numbers of passengers. Table 5 shows that Indios Verdes is the most crowded station; however, Pantitlán is a hub, so its cumulative number of passengers is higher than that of Indios Verdes. This is important because it means that special attention must be paid to this station, while Line 2 and Line 3 are the lines with the most passengers.
Table 6 shows that Tlaltenco, a station on Line 12, is the least crowded. There is also a trend on Lines 5, 4 and 6, which have many stations that are not very crowded, although some of these are transfer stations. It is important to point out that stations connecting with another line cannot be closed, because this would cause a major failure in the system's connectivity.

The next step in the methodology is to create the network. The structure of the metro system is characterized as an adjacency matrix in which the nodes represent the stations and the edges represent the connections along the lines. Fig. 3 shows the structure of the system as a complex network.

Once the system is characterized as a complex network, the various complex network metrics used to study the topological structure of the network can be computed. Table 7 presents the results of the complex network analysis carried out for the entire system. The minimum degree is 1, which is understandable as it corresponds to terminal stations, while the maximum degree is 4, which corresponds to stations like Pantitlán or Chabacano. The average degree is 2.25, which means that there are very few transfer or connection stations. Density is important as it describes how connected the network is. Real systems modeled as networks are, in general, not very dense, owing to the cost of the links. The network has a density of 0.011, which indicates that connectivity within the network is very low. Another relevant metric in this analysis is the mean distance, which is the average distance between all pairs of nodes. Networks are expected to have a low average distance, which is related to the small-world property, but in this case the mean distance is 12.94, which is quite high in relation to the number of nodes and edges.
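The degree, density and mean distance described above can be illustrated with a minimal sketch. The paper's own analysis was carried out in R with igraph; the Python code below is only an illustration on a hypothetical seven-station toy network (two lines crossing at a transfer station `X`), not the real metro data, and all names in it are assumptions made for the example.

```python
from collections import deque

# Hypothetical toy network: line A (A1-A2-X-A3-A4) crosses line B (B1-X-B2)
# at the transfer station X. Station names are illustrative, not real stations.
EDGES = [("A1", "A2"), ("A2", "X"), ("X", "A3"), ("A3", "A4"),
         ("B1", "X"), ("X", "B2")]

def build_graph(edges):
    """Build undirected adjacency sets from a list of station pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degrees(adj):
    """Degree of each node: the number of directly connected stations."""
    return {node: len(nbrs) for node, nbrs in adj.items()}

def density(adj):
    """Share of possible links that exist: 2E / (N (N - 1))."""
    n = len(adj)
    m = sum(len(nbrs) for nbrs in adj.values()) // 2
    return 2 * m / (n * (n - 1))

def bfs_distances(adj, source):
    """Shortest-path length (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def mean_distance(adj):
    """Average shortest-path length over all ordered pairs (connected graph)."""
    total, pairs = 0, 0
    for source in adj:
        for target, d in bfs_distances(adj, source).items():
            if target != source:
                total += d
                pairs += 1
    return total / pairs

graph = build_graph(EDGES)
deg = degrees(graph)
print("min/max degree:", min(deg.values()), max(deg.values()))
print("mean degree:", round(sum(deg.values()) / len(deg), 3))
print("density:", round(density(graph), 4))
print("mean distance:", round(mean_distance(graph), 4))
```

For an undirected network the density equals the mean degree divided by N − 1, so the reported values can be cross-checked: 2.25 / 194 ≈ 0.0116, in line with the reported density of 0.011 for 195 stations.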
In addition to the metrics listed above, it is important to study the topology of the network, starting with clustering. The clustering coefficient of a node is the ratio of the existing links among the node's neighbors to the maximum possible number of such links; averaging this value over all nodes gives the mean local clustering, while the global clustering measures the tendency of the whole network to form triangles, that is, to be transitive. The global clustering is very low (0.056), so the network has a low tendency to form triangles. The mean local clustering is similar but even lower (0.017). It can therefore be said that there is no tendency to form small groups, i.e., the nodes remain in one whole group. Betweenness centrality helps to identify how important a node is within a network by computing how many shortest paths pass through the node in question. The average betweenness centrality is 0.144, which means that the network has a very low betweenness centrality. Closeness centrality is based on the shortest paths from each node to all the other nodes in the network, and here the closeness is relatively high. Regarding the correlation of nodes, the assortativity coefficient takes values between −1 and 1 and indicates whether a network is assortative or disassortative. In this work the network has a value of 0.24, which means that it is assortative. From Fig. 4, it is observed that the degree distribution seems to follow a binomial distribution. With all these results, it is possible to analyze and compare the behavior of the different stations and lines. In addition, the topology of the whole system was analyzed in order to conclude what type of network model it follows and what specific characteristics and properties it shares with that model.

The next step is to perform the time series analysis.
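As a brief aside on the clustering metrics discussed above, the following sketch shows why the mean local clustering and the global clustering (transitivity) can take different values on the same network. It is an illustration only: the five-node graph is hypothetical, and treating nodes with fewer than two neighbors as having clustering 0 is an assumption of this sketch (software packages differ in how they handle such nodes).

```python
from itertools import combinations

# Hypothetical toy graph: a triangle a-b-c with a tail c-d-e.
ADJ = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c", "e"},
    "e": {"d"},
}

def linked_neighbor_pairs(adj, node):
    """Number of pairs of neighbors of `node` that are themselves linked."""
    return sum(1 for u, v in combinations(adj[node], 2) if v in adj[u])

def local_clustering(adj, node):
    """Existing links among neighbors divided by the maximum possible number."""
    k = len(adj[node])
    if k < 2:
        return 0.0  # assumption: undefined cases are counted as zero
    return 2 * linked_neighbor_pairs(adj, node) / (k * (k - 1))

def mean_local_clustering(adj):
    """Average of the local clustering coefficient over all nodes."""
    return sum(local_clustering(adj, n) for n in adj) / len(adj)

def transitivity(adj):
    """Global clustering: closed connected triples / all connected triples."""
    closed = sum(linked_neighbor_pairs(adj, n) for n in adj)
    triples = sum(len(nb) * (len(nb) - 1) // 2 for nb in adj.values())
    return closed / triples if triples else 0.0

print("mean local clustering:", round(mean_local_clustering(ADJ), 4))
print("global clustering:", round(transitivity(ADJ), 4))
```

The two quantities weight nodes differently: the mean local value gives every station equal weight, whereas transitivity gives more weight to high-degree stations, which is one reason the two values reported for the metro network need not coincide.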
First, the data are organized and sorted by date (from the most recent date backwards). The next step is to plot the time series, as in the example shown in Fig. 5, where the 12 lines are plotted as time series. Some lines show the same behavior; for example, Lines 1, 2 and 3 form a cluster because they exhibit the same patterns. Line 12 shows a strange behavior because it was opened at the end of October 2012 and a section of the line then had to be closed again owing to technical problems. Fig. 6 shows the time series of all the passengers, and the next step is to analyze it. Time series decomposition, a mathematical procedure that transforms a time series into multiple different time series, is applied. The original time series is often split into 3 component series:
• Seasonal: patterns that repeat with a fixed periodicity.
• Trend: the underlying trend of the metrics.
• Random: also called "noise", "irregular" or "remainder". This is the remainder of the original time series after the seasonal and trend series are removed (Fig. 7).

To continue with the analysis, the ACF (Auto-Correlation Function), which gives the autocorrelation values of a series with its lagged values, is used. These values are plotted along with the confidence band to produce an ACF plot. In simple terms, it describes how well the present value of the series is related to its past values. A time series can have components such as trend, seasonality, cyclic and residual. The ACF considers all these components while finding correlations, hence it is a complete auto-correlation plot (https://towardsdatascience.com/significance-of-acf-and-pacf-plots-in-time-series-analysis-2fa11a5d10a8) (Fig. 8). The PACF (Partial Auto-Correlation Function) is also applied.
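Before turning to the PACF, the sample autocorrelation described above can be sketched in a few lines. The paper computed the ACF in R; the Python code below is only an illustration, and the quarterly series in it is made up (a pure period-4 pattern over 36 quarters), not the metro ridership data. The ±1.96/√n band is the usual approximate 95% band for white noise and is used here as an assumption about how the confidence band is drawn.

```python
import math

def acf(series, max_lag):
    """Sample autocorrelation r_k for k = 0..max_lag.

    r_k = sum_t (x_t - mean)(x_{t+k} - mean) / sum_t (x_t - mean)^2
    """
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
            for k in range(max_lag + 1)]

# Hypothetical quarterly series: 9 years x 4 quarters with a fixed seasonal
# pattern. These numbers are illustrative only.
quarterly = [10, 12, 14, 12] * 9

r = acf(quarterly, max_lag=8)
band = 1.96 / math.sqrt(len(quarterly))  # approximate 95% white-noise band

for lag, value in enumerate(r):
    marker = "*" if lag > 0 and abs(value) > band else ""
    print(f"lag {lag}: {value:+.3f} {marker}")
```

With a seasonal pattern of period four, the autocorrelation is strongly positive at lags 4 and 8 and strongly negative at lags 2 and 6, which is the kind of signature an ACF plot of quarterly data makes visible.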
Instead of finding correlations of the present value with its lags, as the ACF does, the PACF finds the correlation of the residuals (what remains after removing the effects already explained by the earlier lags) with the next lag value; hence it is 'partial' and not 'complete', as variations are removed before the next correlation is found. If there is any hidden information in the residual that can be modeled by the next lag, a good correlation may be obtained, and the next lag can be kept as a feature while modeling. When modeling, it is not desirable to keep many correlated features, as that can create multicollinearity issues; hence only the relevant features are retained (Fig. 9). The ACF and PACF plots are commonly used to obtain the values of p and q to feed into the ARIMA model. All these analyses are important because they show the patterns, seasonality and trends that the passengers follow throughout the entire period.

For the simulation scenario, a network without the 36 closed stations is considered, and the adjacency matrix is built under the hypothesis that these stations are completely deleted from the system. Thus, instead of 195 stations, only 159 stations of the metro system are considered (Fig. 10). In the COVID-19 scenario, 36 stations (nodes) and the edges that connect them are deleted, and the network remains connected.

Following the methodology, the complex network metrics for the COVID-19 scenario are computed, and the results are then analyzed and compared with those of the original network (Table 8). Comparing the metrics of the two networks, the maximum and minimum degree are the same in both scenarios.
The mean degree is almost the same, while the diameter, which is the shortest-path distance between the two most distant nodes in the network, is smaller in the COVID-19 scenario: 32 stations, compared with 39 in the original network. The mean distance is very important. In the COVID-19 scenario there is a small increase, to almost 14, which means that it takes more stations to go from one station to another, whereas a small value is desirable. The number of cliques stays the same. Density describes the proportion of the potential connections in a network that are actual connections. The values are very small in both cases, and the COVID-19 scenario network has the higher value, so it is better connected by this measure. The clustering metrics are slightly higher in the COVID-19 scenario, but not by much, so the same hypothesis as for the original network is maintained. The centrality metrics do not change much: degree and betweenness centrality are somewhat higher and closeness centrality is somewhat lower, but the difference is not large. The scenario therefore keeps almost the same characteristics as the original network.

From the simulation of the COVID-19 scenario it can be concluded that the decision implemented by the government was the right one, because the number of stations closed did not majorly affect the connectivity and operation of the original network, meaning that connectivity, robustness and efficiency seem to remain good, and also because the government opted not to close transfer stations. The pandemic situation around the world increases the uncertainty about when all the stations will operate as before and what the new policies for resuming regular operations will be.
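The scenario comparison above (delete the closed stations, then recompute the metrics) can be sketched as follows. This is an illustration on a hypothetical toy network, not the 195-station system, and it rests on a loudly stated assumption: the paper does not spell out how the network stays connected after deletion, so this sketch assumes that trains still run through a closed non-transfer station and therefore links its two neighbors directly. The function `close_station` and all station names are hypothetical.

```python
from collections import deque

# Hypothetical toy network: line A (A1-A2-X-A3-A4) crosses line B (B1-X-B2).
EDGES = [("A1", "A2"), ("A2", "X"), ("X", "A3"), ("A3", "A4"),
         ("B1", "X"), ("X", "B2")]

def build_graph(edges):
    """Build undirected adjacency sets from a list of station pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def close_station(adj, station):
    """Return a copy of the network without `station`.

    Assumption of this sketch: a closed intermediate station is bypassed,
    so its two neighbors become directly linked. Transfer stations
    (more than two neighbors) are never closed, as in the paper.
    """
    nbrs = sorted(adj[station])
    if len(nbrs) > 2:
        raise ValueError(f"{station} is a transfer station and stays open")
    new_adj = {n: set(nb) - {station} for n, nb in adj.items() if n != station}
    if len(nbrs) == 2:
        u, v = nbrs
        new_adj[u].add(v)
        new_adj[v].add(u)
    return new_adj

def density(adj):
    """Share of possible links that exist: 2E / (N (N - 1))."""
    n = len(adj)
    m = sum(len(nb) for nb in adj.values()) // 2
    return 2 * m / (n * (n - 1))

def is_connected(adj):
    """Breadth-first search from any node reaches every other node."""
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u] - seen:
            seen.add(v)
            queue.append(v)
    return len(seen) == len(adj)

full = build_graph(EDGES)
covid = full
for closed in ("A2", "A3"):  # hypothetical low-demand, non-transfer stations
    covid = close_station(covid, closed)

for name, net in (("original", full), ("scenario", covid)):
    max_deg = max(len(nb) for nb in net.values())
    print(name, "stations:", len(net), "max degree:", max_deg,
          "density:", round(density(net), 4), "connected:", is_connected(net))
```

In this toy case the maximum degree is unchanged and the density rises after the closures, which mirrors the direction reported for those two metrics in Table 8; other toy values should not be expected to reproduce the paper's results.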
If a station (node) that is not important in terms of its number of passengers and its connections with other stations or lines is deleted, the impact will not be great: the network will continue operating, although with somewhat lower connectivity, efficiency and robustness than the original one, and the effect will not be as bad as if one of the important nodes were deleted.

It is concluded that the degree distribution of the network follows a binomial distribution (Fig. 4 and Fig. 11). In this case the network follows a Random Network Model because of the binomial degree distribution; the mean distance is high (tends to p ~ log N); and the clustering is low (tends to k/N), where k is the average degree of the nodes. In random networks, the neighbors of a given node are chosen at random, so there is no correlation between the degrees of neighboring nodes. Finally, these networks are more robust in resisting targeted attacks, while at the same time being vulnerable to internal errors.

After the time series analysis, it is concluded that there is no evidence of a growing trend in the number of passengers and that there are some patterns in the seasonal cycles. It is hard to find behavioral patterns at a macro level, so in the next steps of the research the same analysis will be done at medium and micro levels. Also, when the data become available, a time series analysis comparing the COVID-19 scenario with past time series should be done to find out how much the flow of passengers has decreased and how long the restriction will last.

As part of future work, the methodology can be applied to different scenarios, for example the return to normality, as the government is planning to reopen some of the closed stations. In addition, the government decided to close extra stations during national holidays to discourage the use of public transport and reduce congestion.
For future work it would be important to analyze the Metro's 2018-2030 Master Plan (https://metro.cdmx.gob.mx/storage/app/media/Metro%20Acerca%20de/Mas%20informacion/planmaestro18_30.pdf), as it proposes a variety of actions to improve the metro system, such as new lines, the extension of current lines and a maintenance program, which could be simulated and compared with the current network in order to analyze the topology of the new networks. Different actions to improve the metro service can then be proposed. Further work could also create a multilayer network with all of Mexico City's different public transport systems, such as Metrobús, RTP and trolley buses, among others, to analyze passenger flow, alternative routes, and the connectivity and efficiency of the multi-modal network, as well as to improve urban public transport in general.

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Handbook of Simulation: Principles, Methodology, Advances, Applications, and Practice
Initial impacts of global risk mitigation measures taken during the combatting of the COVID-19 pandemic
Networks: A Very Short Introduction
Evolution of subway networks
Topological evolution of a metropolitan rail transport network: the case of Stockholm
A network-based framework for assessing infrastructure resilience: a case study of the London metro system
Introductory Statistics with R
Network centrality of metro systems
Network analysis of world subway systems using updated graph theory
Characterizing metro networks: state, form and structure
Medición del tiempo de abordaje de los pasajeros del STC: estación Pantitlán
Complex network theory applied to the growth of Kuala Lumpur's Public Urban Rail Transit Network
Mind the gap: a heuristic study of subway tours
Applying complex network theory to the analysis of metro networks
Innovative GTFS data application for transit network analysis using a graph-oriented method
Compared analysis of metro networks supported by graph theory
Mexico City's suburban land use and transit connection: the effects of the Line B Metro expansion
Complex network analysis of public transportation networks: a comprehensive study, Models and Technologies for Intelligent Transportation Systems (MT-ITS) 3-5
What might Covid-19 mean for mobility as a service (MaaS)?
Examining accessibility and reliability in the evolution of subway systems
Urban transport and COVID-19: challenges and prospects in low- and middle-income countries
Is the Boston subway a small-world network?
A new SAIR model on complex networks for analysing the 2019 novel coronavirus (COVID-19)
Scaling in transportation networks
Measuring network-based public transit performance using fuzzy measures and fuzzy integrals
Complex network analysis: Mexico's City metro system
An application of the graph theory which examines the metro networks
Vulnerability assessment of urban rail transit based on multi-static weighted method in Beijing
Un Modelo de simulación para mejorar los mecanismos de evacuación en el STC Metro
Quantifying the robustness of Metro Networks
Multicriteria robustness analysis of metro networks
A Network Analysis of World's Metro Systems
Analysis of metro network performance from a complex network perspective
Structural analysis of bus networks using indicators of graph theory and complex network theory. The Open Civil Eng
Coronavirus disease (COVID-2019) situation reports
WHO Coronavirus Disease (COVID-19) Dashboard

Sashiko Shirai gratefully acknowledges CONACYT and UNAM for the scholarships. The authors also acknowledge UNAM for its financial support of this research under PAPIIT DGAPA project IT102117.