key: cord-0545613-qqvuh2a9
authors: Louail, Thomas; Barthelemy, Marc
title: A dominance tree approach to systems of cities
date: 2022-02-28
journal: nan
DOI: nan
sha: 3b036a7c56dfb4b0004d4c31cc8c9e1d725844e3
doc_id: 545613
cord_uid: qqvuh2a9

Characterizing the spatial organization of urban systems is a challenge which points to the more general problem of describing marked point processes in spatial statistics. We propose a non-parametric method that goes beyond standard tools of point pattern analysis and which is based on a mapping between the points and a"dominance tree", constructed from a recursive analysis of their Voronoi tessellation. Using toy models, we show that the height of a node in this tree encodes both its mark and the structure of its neighborhood, reflecting its importance in the system. We use historical population data in France (1876-2018) and the US (1880-2010) and show that the method highlights multiscale urban dynamics experienced by these countries. These include non monotonous city trajectories in the US, as revealed by the evolution of their height in the tree. The method also captures the attraction basins of cities at successive scales, and while in both countries these basin sizes become more homogeneous at larger scales, they are also more heterogeneous in France than in the US. We introduce a simple graphical representation - the height clock - that monitors the evolution of the role of each city in its country. Finally, we show that the height of a city in the tree is less sensitive to different statistical definitions of cities than its rank in the urban hierarchy.

Understanding the organization of urban systems has always been a central challenge in geography and economics [1] [2] [3] . On one hand the population sizes of cities have been extensively discussed since Zipf's work [4] [5] [6] , and have been shown to follow a broad distribution: there is a hierarchy of cities characterized by many small cities, a few medium cities and a very few large ones, whose population sizes are much larger than the rest. While this distribution has been fully characterized and discussed [6, 7] , the spatial distribution of cities has also been an important subject of debate [8] [9] [10] [11] . Cities are not scattered at random but follow some logic based on geographical constraints, economical considerations and historical path dependency. Central place theory [8] seeked to explain the spatial distribution of cities of different sizes based on the idea that settlements function as 'central places' that provide services to surrounding areas. The result of consumers' preferences is then a system of centers of various sizes, forming different levels of a hierarchy. A consequence obtained by Christaller [8] is that the most efficient pattern to serve areas without overlap is a triangular or hexagonal arrangement of settlements. Although this idea has been very successful and inspirational to many scientists, few works have tried to validate it quantitatively. For example, cities have been studied from the correlation point of view [13, 14] , but an important contribution to this problem is due to Okabe and Sadahiro [15] who showed that random (uniform) arrangements of cities could explain Christaller's findings. * Electronic address: thomas.louail@cnrs.fr † Electronic address: marc.barthelemy@ipht.fr

To reach this result they had to define a quantitative tool that captures the spatial dominance of a city on another, and we will elaborate on this idea in this paper. More generally, the difficulty of characterizing the spatial organisation of urban systems appears in many problems of spatial statistics [16, 17] where the points (cities in our case) have a position and are described by (at least) one quantity such as the population. In spatial statistics this general setting is referred as a 'marked point process' [16] , in which each point x i of the process carries extra information called a mark m i . The mark can be a random variable, a categorical variable, or any other additional information about the points. This problem is not only important in geography with the study of human settlements, but is also relevant for many fields ranging from ecology (positions of plants of different species), to epidemiology (locations of infected individuals), material science (positions of defects) or astronomy that is interested in the location of stars and galaxies. An urban system is an example of such marked point process, where the points are the cities and the mark is their population size. Space is obviously important in this system, and has to be considered jointly with population: if for example two cities are close to each other, it makes a difference if they have similar or very different population sizes.

Standard tools developed for the analysis of point processes usually consist in measuring spatial autocorrelations [18] , or testing the null hypothesis of complete spatial randomness with the K or L statistics which summarize the deviations from a uniform (Poisson) distribution [16, 17] . These tools were then extended to marked point processes and describe deviations from known cases such as the Poisson process, or intensity and moments mea-sures [19] . A tool that would go beyond these measures and provide a more precise characterization of these processes would then be extremely useful for a wealth of different problems.

To further characterize marked point processes it is possible to start from geometrical structures constructed on top of the point pattern. These 'secondary structures' [16] comprise in particular tessellations and networks [20] . The Voronoi tessellation is one of the most relevant structure in computational geometry, and is of major importance in the resolution of many problems, in particular in location science [21] . Networks can also be constructed over a set of points, and useful tools include proximity graphs (such as the random geometric graph) or excluded volume graphs (such as the Gabriel graph). Measures on these secondary structures can then characterize the point process itself, and constructing a spatial network on top of the point process enables to import all the networks knowledge into spatial statistics. For example, a recent approach uses first-passage times of random walks on networks constructed over a set of points in order to quantify correlations in complex systems [22] .

However, approaches based on secondary structures are in general used to estimate deviations from uniformity, or the importance of some correlations. Here, starting from the Voronoi tessellation, we will construct a tool that is useful to understand the multiscale structure of a system of marked points, and to compare it to other systems. Such a tool for characterizing the organization of this type of systems should thus encode both the spatial information and the population (in more general terms both the positions and the marks of the points). The purpose of this tool is not to describe the deviation from a uniform distribution, but to compare two systems (such as urban systems in different countries for example) and to understand the temporal evolution of these systems from a non-local, high-level perspective.

In the following of this paper we describe such a tool and its associated measures. First we will describe the dominance tree, introduced in [15] , that will constitute the basis of our analysis. We will then introduce different measures to extract information from this tree. In particular, we will show that the height of a node in this tree encodes both its mark and space-related information. We will first illustrate this method on toy models, and then on empirical data, to compare the evolution of the French and the US urban systems over the 20th century.

The method we propose can be applied to any marked point process, and we will illustrate it on the case of cities in a country. We assume that a given country has N cities and each city i is characterized by its location x i in some coordinate system, and its population P i (t) which can vary over time. In order to characterize this system we have to define a data structure that encodes both the spatial information (the location of cities) and their importance (their population size).

The first step is to construct the dominance tree proposed by Okabe and Sadahiro [15] . The idea is to recursively construct Voronoi tesselations over the set of nodes (see for example the book [20] and references therein). The Voronoi cell V i of a node i is the set of points that are closer to i than to any other node, Fig. 1(A) an example of a Voronoi tesselation computed for nodes distributed in the plane. Starting from this Voronoi tessellation, we identify local maxima or local 'centers': a point i is a local maximum if its population is larger than those of the neighboring Voronoi cells. In other words, for any point i we define with the help of the Voronoi tesselation the set of neighbors Γ(i) that are the nodes whose Voronoi cell is adjacent to V i . A city i is then a local center if its population P i is larger than the populations of its neighbors: P i > P j ∀ j ∈ Γ(i). In the example of Fig. 1 (A), we thus have 3 local maxima labeled a, l, and m. In a second step, we keep these local centers and construct a new Voronoi tessellation over them ( Fig. 1(B) ). Cities that do not appear anymore (i.e. that were not local centers at the previous step) belong now in the new Voronoi cell of a local center, and are therefore 'dominated' by this city and belong to its 'attraction basin' (for example, nodes b, c, d, f , g, belong to the attraction basin of node a). Here again we determine local maxima, i.e. cells that do not have any neighbor whose population is larger than their own. We repeat this procedure recursively until only one city is left ( Fig. 1(C) ) which, by construction, is the city with the largest population. We therefore understand that a city will 'survive' many iterations if its population is large but also if it is well located. A large city located very close to an even larger one will quickly be absorbed by this larger city. It is interesting to note that this process bear some ressemblance to the coarse-graining obtained by means of real-space renormalization [23] . The dominance tree -shown in Fig. 1 (D) for the simple case we just described -is then some sort of bookkeeping of the changes of scale and the hierarchical organization of the marked points: each node has children that correspond to other nodes which belong to its attraction basin. We show on Fig. 1 (E-H) the Voronoi tesselations that correspond to the successive steps of this process applied to US cities. In this case the root is New York City (successive Voronoi cells of New York and Los Angeles are colored) and the depth of the tree, that is the number of iterations to reach the root, is H = 6.

We thus represent this process by the dominance tree ( Fig. 1(D) ) where we keep at each level the remaining points and where the links denote the spatial dominance relation. At height zero (the leaves of the tree), we then have all cities of the system that are locally dominated by a larger neighbor city. After one iteration of the process, we have the local centers, etc. until we reach the root of the tree which is the city with the largest population. We first construct the Voronoi tessellation (shown with dotted lines) for these points and we observe that there are three local maxima: a, l, m, in red and blue. These are the points which do not have any neighbor whose mark is larger than their own. (B) From step (A) we keep the local maxima only, we construct the Voronoi tessellation over this set, and determine the local maxima (which is here the node a). (C) At the end of the process, we are left with the node with the largest mark.

(D) The entire process can be described by the 'dominance tree': leaves correspond to the points which are not local maxima, and the children of a given node are the points that belong to its attraction basin. For example, as can be seen in (B), nodes k, j and e belong to the attraction basin of node l, which is a local maximum at level/height h=1. (E-H) Successive Voronoi tesselations obtained when applying this 'decimation' process for US cities marked with their 2010 population sizes [24] . After H = 6 iterations, we obtain a single city. By construction, the node with the largest mark will always be the root of the dominance tree, but for the other nodes their height in this tree will not only depend on their mark but also on their location.

Okabe and Sadahiro proposed this construction [15] and used it to prove that the most important quantitative statements in Christaller's central place theory were also observed for a random spatial Poisson point process. We can then represent a system of cities by a dominance tree, where the height of a city is the largest iteration before it is absorbed by a larger city. Once we have constructed the dominance tree, each city i has its height h i . We recall here that, by definition, the height of a node in a tree is the number of edges between this node and the furthest leaf going down in the tree. In other words, it is the length of the longest path (i.e. its number of edges) from the node to the deepest leaf (in contrast, the depth of a node in a tree is the distance from a node to the root). The height of a leaf is then h = 0, while the root has the largest height (denoted by H in the following). We will show that, due to the statistical properties of Voronoi tesselations built from spatial Poisson point processes, H ≈ log 6 (N ) where N is the number of points in the system, and H is in general of order 5 or 6 in our empirical analysis of urban systems. We introduce n f (h) which is the number of cities such that their final height is h. In contrast, the total number of cities at a given height h is denoted by n i (h) and we have the following relation

Once we have constructed the dominance tree, we have to characterize it. There is a large number of possible measures, but we will focus here on the height of a city, and we will discuss it for toy models and for empirical data. A city i at time t is then characterized by its population P i (t) (or equivalently by its rank r i (t) when population are sorted by decreasing order), and its height h i (t) in the dominance tree. This height characterizes the role of the city in the hierarchical organization of the urban system. While in the following we analyze urban systems at the national scale, the method could be applied at different scales as well, such as the regional or continental scale.

In order to define a toy model we have to specify both the point distribution and the population distribution. Different models are possible and we will mainly consider the following variants. First, we will consider that populations are distributed according to the power law ρ(P ) ∼ 1/P α with α = 2. This case corresponds to the classical result of Zipf and even if recent data show that this exponent can fluctuate considerably [6, 7] , it won't affect our discussion here, as the relevant quantity is not the population size itself, but its corresponding rank in the hierarchy: indeed, what matters for the construction of the dominance tree is to know if a city is larger than its neighbors. In the following, we will thus indifferently discuss population or rank for characterizing a given city.

For the point distribution we will consider a uniform distribution (spatial Poisson process) of the x i in the square [−1, +1] 2 . We thus have two lists: a list of population sizes P i and a list of positions x i (i = 1, . . . , N ). In order to define a model we have to specify how to match these lists, and the resulting correlations encode in a simple way the spatial organization of the system. To simplify the description we will assume that the populations are sorted in decreasing order P 1 > P 2 > · · · > P N and that locations are sorted in increasing order according to their distance to the center (0, 0): ||x 1 || < ||x 2 || < · · · < ||x N ||. We will consider the following three cases:

• The 'deterministic model' : we associate the largest population to the closest point to the center (0, 0): P 1 ↔ x 1 and follow the order P i ↔ x i . This model mimics in some way a system with a central organization around the main city.

• The 'random model': we associate the largest population to the closest point to the center (0, 0): P 1 ↔ x 1 , and then associate randomly the rest of the P i s to the rest of x i s. In this case, we have a central large city but there is no specific organization around it.

• Finally, the 'tunable model' is less deterministic. We first assign P 1 to x 1 as before, but for the remaining N − 1 cities we proceed as follows. For P 2 we choose at random a po-

is the distance between x 1 and x k , Z = k=2,...,n exp(−d(x 1 , x k )/L) is the normalization and L is a positive parameter that can be interpreted as a polarization or interaction distance. We then proceed in a similar way for P 3 with the remaining x i left, etc. until the list of populations is exhausted. This tunable model is then able to interpolate between the deterministic case with L 1 and the random case with L 1.

These toy models allow us to explore the effect of various parameters and to test various aspects of our characterization of spatial hierarchies. We show in Fig. 2 (A) the Voronoi tessellation of 10 3 points in the square. The color code represents here the rank of cities according to their population (the darker, the larger the associated population). On Fig. 2 we have represented the 'tunable' model that interpolates from the deterministic case (on the left) with a clear 'monocentric' pattern with population decreasing with the distance to the center, to the random model (last figure on the right) where there are no correlations between rank and distance to the center.

Before turning to these specific cases, we first discuss some general properties about the depth of the tree, and the number of nodes at a certain level. First, by construction, the root of the dominance tree is the city with the largest population. An important quantity is the depth H of the tree that will measure the number of different hierarchical levels in the system. In order to estimate H, we first evaluate the number of cities n i (h) that are left at the height h. We denote by z h the average number of neighbors in the Voronoi tessellation at level h, and the number of cities at level h is then

where the brackets · denote the average over all levels. This expression implies that the number of cities and the level h are related as follows

where a = ln N and b = ln z ≡ ln z eff . The depth of the tree H corresponds to the height of its root, at which there is only city left (n i = 1). Hence we obtain

In the 'random' toy model, cities are uniformly distributed and we can roughly expect z eff ≈ 6 [20] and with N = 10 4 cities we obtain a tree of height H ≈ 5. In contrast, for the deterministic model, there is essentially a single local maximum leading to a small depth (see Supplementary Figures S1 and S2) We consider now the relation between the rank and the height of a city in the dominance tree, and the result is shown for the random model in Fig. 2 (B) where locations are fixed and different disorder realizations correspond to different arrangement of populations sizes on these fixed locations (in other words, location is the quenched disorder, while the mark is the annealed disorder). This plot shows that on average, smaller ranks lead indeed to larger heights (and eventually the largest city with rank r = 1 has the largest height h = H). However, we also observe on this plot that this relation is not univocal, as there are large fluctuations around the average behavior. In other words, the rank (or population) of a city does not determine its height in the tree. This shows that the height encodes simultaneously the rank of a city in the urban hierarchy and its location. In particular, we see on Fig. 2 (B) that cities with ranks r ≥ 2 can have different heights depending on the disorder realization.

For each value of L for the tunable model, using the general relation Eq. 2, from the measure of n i (h) versus h, we compute the average number z eff = exp(b) of neighbors of the Voronoi tessellations obtained in the decimation process (see Fig. 2(C) ). For small L values, this model reproduces the deterministic case where cities are distributed around the largest one in decreasing order. In this case in many regions there are no local maxima and the effective number of neighbors z eff is very large. As L increases we tend towards the random model where cities are distributed randomly in the plane. In this case we expect a regular Voronoi tessellation constructed over a Poisson point process, with z eff ≈ 6 [20] .

We have shown on these simple toy models that the height of a city in the tree does not depend on its rank alone, but also on its location and its neighborhood. In order to illustrate further the effect of the spatial arrangement of cities on their heights, we consider the following simple situation. We locate the largest city 1 (of rank r = 1) at the center of the plane and we assign to a point at distance d(1, 2) the second largest city 2. We then locate at random all the other cities and measure the height h 2 of city 2 as a function of the distance d(1, 2). We also record the maximum and minimum height reached by 2 in the random configurations. The resulting plot is shown in Fig. 3 . We see in this plot that the height h(2) of city 2 depends strongly on the spatial arrangement of cities in the space between it and the larger city (1): if there is a smaller city in between, the city 2 is a local maximum and has a larger height (Fig. 3(A) ). In the opposite case, city 2 will be absorbed by city 1 at the next iteration of the decimation process. We test this numerically and the result is shown in Fig. 3(B) . First, as expected we observe that the height h(2) is on average an increasing function of the distance to the largest city. At small distance, the city 2 has a large chance to belong to the attraction basin of city 1 and will be quickly 'absorbed' by it, leading to a small height. In contrast, as the distance increases it will take more steps before the city 2 is in the attraction basin of city 1, leading to a larger height. In addition, for a given distance d(1, 2) we observe important fluctuations of h(2). This clearly shows that the height encodes the spatial organization of the neighborhood of a city, and in particular of the space between the city and the closest larger one. The height is then related to the number of local maxima in this intermediate space.

These toy models demonstrate the relevance of the height of a city in the dominant tree for monitoring its importance in the system. We will now apply these ideas on empirical data.

The previous section helped us to understand the main properties of the dominance tree and the factors that determine the height of a marked point. We will now apply this tool to the spatial organization of the French and the US urban systems, and their evolution over the last 130 years (see the Data section at the end for details).

We first focus on the height of cities as discussed above. We show on Fig. 4(A) and (B) the heights, at different dates, of the 20 cities that are currently the largest ones in the US and in France.

The root of the tree is by construction the largest city of the system, which in both countries has remained the same over the entire period (New York City in the US, and Paris in France). We observe small height variations for French large cities, a fact that was already observed in [28] for the ranks of old European cities in the urban hierarchy, but with notable exceptions such as Lille. In the US it interesting to note that even if the population of San Francisco grew steadily since its beginning, its height in the US tree decreased, signalling a changing environment and the higher growth of neighboring cities.

We observe that in both countries there are different cities which at some point of their history reached the height H − 1: in the US Los Angeles, Chicago, or San Francisco; and for France, Marseille, Lyon, Nantes, Bordeaux or Lille. We also observe that the 20 municipalities that are currently the largest ones in France experienced much less height fluctuations than the 20 largest US cities, some of the latter spanning 3 or 4 different levels in the tree. The height hence captures ascending and descending city trajectories in the US that were not observed in France during the last 130 years. Like other countries in Europe, France has experienced a longstanding urbanization, with a slow and regular evolution, and a city like Paris was already an international hub in the Middle-Age. In contrast, the US is a relatively new country, which experienced successive waves of urban development in the last centuries [29] . The trajectory of a given city in the dominance tree highlights how the role and the local importance of this city evolved in time. It provides a richer information that the rank alone [28] , because it encapsulates information not only about the position of the city in the urban hierarchy, but also about its importance among its neighbors.

Beyond single cities trajectories, the dominance tree also contains information about the spatial organization of the urban hierarchy at larger scales. At each step of the iterative process for constructing the dominance tree, a local maximum -say city i -contains a number of cities in its Voronoi cell V (i). We denote by φ i the total population of this attraction basin: φ i = j∈V (i) P j (the population of the 'seed' city P i is not taken into account in φ i ). At each step of the iterative process we thus have a collection of values of φ, one for each Voronoi cell. We compute the Gini coefficient G of these values (see for example [30] ): a large value of G (ie. close to one) indicates that there are a few very large basins of attraction and all the others are small, while in contrast a small G (close to zero) indicates that most attraction basins have roughly the same population size. We plot G for France and the US at different tree levels h, that correspond to different spatial scales, and for different time periods (see Fig. 4(C) ). There are two remarkable features on this plot. First, in both countries the Gini is decreasing as we approach the root, which means that even if we start from a very heterogeneous situation at small scales (at the urban agglomeration level), at a large scale the population sizes of the attraction basins become comparable. Second, we observe an important difference between the two countries. In France the system has evolved towards a situation where the attraction basins are becoming more heterogeneous (at all spatial scales), while the data show the opposite in the US. Surprisingly enough, it seems that in France, the urban system did not evolve towards a uniform distribution of important basins, but that inequalities in population sizes, at all spatial scales increased in the recent periods.

We also investigate the relation between the population size of a city and its height in the dominance tree. In order to quantity this relation, we compute two lists in which cities are ranked in decreasing order of (i) population sizes and (ii) heights. We compare these lists using Kendall's τ correlation coefficient (see for example [31] ). This quantity is equal to one when the order of the cities is strictly identical in the two lists, and −1 when the lists have an opposite order. For both countries we calculate this quantity for each year available in the data (see Supplementary Figure S3 ), and we also calculate it for the toy models previously discussed. In particular, we obtain for the 'random' toy model the average value τ ≈ 0.43 (represented on Supplementary Figure S3 by the dashed line). For the deterministic model we obtain a value close to 0, and for the tunable model, τ interpolates between these two values for different L. Overall, we observe relatively small values of Kendall's τ in these systems, confirming that the height is not determined by population alone, and that the location of a city has its own importance and effect. Supplementary Figures 4 and  5 complement this observation, and show that there are many pairs of cities such as P i > P j and h i < h j .

We can understand these different results with the following considerations. When large cities are close to one another, some are dominated by even larger ones, and consequently their height in the tree is small. This leads to a small value of τ as observed for the deterministic model. In contrast, when the population size and the location are uncorrelated as in the random model, large cities can reach higher levels in the tree before being 'ab- sorbed' by an even larger city. In this case, there is a higher correlation between rank and height, leading to a larger value of τ . We observe that τ is smaller for France than for the US, suggesting that the system of cities is more 'mature' in France, with a stronger concentration of population within large urban agglomerations, composed of large municipalities located close to one another, in the viccinity of a historical large municipality that constitutes the core of the agglomeration. In addition, we observe that τ is decreasing in time for both countries, signalling a decrease of the correlation between the height and the rank in the urban hierarchy. This can be due to two different effects: smaller cities becoming more important (large h) due to their strategical location, or in contrast cities with small h whose population size increased because they are located near an important city from which they depend.

A city i of population P i (t) has thus a certain height h i (t) in the dominance tree which characterizes the importance of its role at a regional level. In contrast, the leaves of the tree (with height h = 0) do not dominate any other city and depend directly on a more important city in their vicinity. In order to characterize the evolution of the urban system and how different cities can see their role evolving in this system, we study the temporal variation of their height. A first approach was shown in Fig. 4(A,B) , but for a better visualization we propose a 'height clock', in the same spirit as the rank clock introduced by Batty in [28] . On a polar plot, the radius is equal to the height h i (t) and the angle is proportional to time. During a time range [t 1 , t 2 ], the total height variation for a city i is then given by ∆ i = t2−1 t=t1 |h i (t + 1) − h i (t)| and according to the dynamics of this variation, we classify cities into four categories (we show some representative examples in Fig. 5 ).

The first type of cities display a constant height ('stable' cities, not shown here). For France this is for example the case of Paris (and New York City for the US), which was always the most important city in the system since 1880. Toulouse in France has a constant height equal to 4 (H − 2, which can also be seen in Fig. 4) , which indicates the steady importance of this city at a regional level. This class of cities is by far the most important for both countries with a share of 86.7% for France and 72.4% for the US. This is an important indication that the French system reached earlier some kind of more stable state, compared to the US where more active change dynamics were observed in the last century.

The second type comprises cities for which h(t 1 ) = h(t 2 ), but which experienced height fluctuations in between. This is the case for example of Mulhouse, Nantes or Lyon in France, or Austin in the US. As we can see, Mulhouse saw its height fluctuating between 1870 and 1970 before stabilizing in the last decades. During the whole period its population grew steadily (except during the 2nd world war) and this again shows that the height analysis encodes more than just the population dynamics, but the whole system dynamics by considering also the population dynamics of neighboring cities. The third type correspond to cities whose height decreased, while the fourth type are cities whose height increased and which became more important in the organization of the system. The proportion of cities that belong to one of these two categories is much larger in the US than 5.5% 8.4%

3.1% 6.6%

12.6% 4.7% in France, which is consistent with the urban dynamics observed in both countries during the 20th century [29] . Obviously these results need to be replaced in a geographical context, but these observations prove that non-trivial information can be extracted through this single quantity h, that can shed light on the evolution of cities and their importance in the urban system.

The impact of the definition of a city is a central problem in urban studies, in particular when the goal is to characterize, sort and compare cities [7, 26, 27] . In general, administrative boundaries (such as municipalities) fail to capture meaningful borders, and many statistical definitions have been built in order to group municipal-ities in a meaningful set. There are two main types of definitions used by national statistics bureaus in France and in the US. The first one is based on the contiguously built-up area ('morphological' definitions): starting from a dense core municipality, neighbor municipalities are aggregated to this core as long as there is no 'gap' (i.e. an empty area between buildings) larger than a given distance, which is a free parameter in this method (often taken as a few hundred of meters). The second category consists of definitions that are based on people's mobility, generally journey-to-work commuting flows. In this case municipalities are aggregated if the proportion of their residents that commute for work to the previously aggregated municipalities is larger than a given threshold. These are 'functional definitions', and resulting objects are often refered to as 'functional urban areas'. These different statistical definitions capture different properties and serve different purposes. For a given city the municipalities that are included in its 'morphological' and 'functional' definitions may then be different. Consequently for a given city, the shape, the total area and the population size are different from one definition to another. For example, in 2017 according to INSEE, the municipality of Paris had approx. 2.2M inhabitants, the Paris urban unit ('unité urbaine' definition) about 10.7M, and the Paris urban area ('aire d'attraction des villes') had 13M inhabitants.

While these statistical definitions lead to different sets of municipalities composing the cities, key variables that quantity the importance of a given city should not depend on the fluctuations at smaller scales. For example, for cities that are regional, national or international hubs, key variables such as their rank in the urban hierarchy (according to their population) or their height in the dominance tree should be the same whatever the city definition considered.

In 2021, there were about 2, 400 urban units (following the morphological definition) in Metropolitan France, that gathered approx. 7, 500 municipalities representing a total population of 50M people. Following the functional definition, there were 682 urban areas, gathering 26, 000 municipalities for a total population of 60M. For each definition we determined the ranks of cities in the population hierarchy, and then attributed their rank to all the municipalities that composed the city according to this definition. Since Paris' urban unit is the most populated in France, all the municipalities that are included in the Paris urban unit were attributed rank 1, all the municipalities included in Lyon's urban unit were attributed rank 2, etc. We did the same for the urban area definition: we determined the population hierarchy of urban areas, and attributed their rank to each of their municipalities. The 7k+ municipalities that are simultaneously part of an urban area and of an urban unit hence have two population ranks r i and r i , one according to the morphological definition, the other according to the functional definition. In order to compare these lists of ranks, we computed the Kendall correlation coefficient τ , and obtained a value τ r = 0.286.

The height of a city in the dominance tree should have the same robust behavior as the rank, as it encodes the role of the city in the system. The height of a city in the tree should therefore not depend too strongly on the definition chosen for delineating cities in space, and should be at least as robust as the rank. In order to test this, we computed the dominance tree for both statistical definitions, determined the height of each city and attributed this height to all the municipalities comprised in the city. For the 7k+ municipalities that are both in an urban unit and in an urban area, we ended up with two lists of heights, and computed the Kendall τ coefficient between these two lists. We obtained τ h = 0.3. We thus have τ h > τ r , proving that the height of a city in the dominance tree is in fact a little less sensitive to different city definitions than its rank in the population hierarchy.

An important problem in spatial statistics is to characterize a system of points that have marks. For cities, the simplest mark is the population size and the importance of a city at different geographical scales results from both its location and its rank in the urban hierarchy, determined by its population size. We presented a simple tool based on the dominance tree that encodes the spatial structure of a marked point process. The height of a point in this tree appears to be a useful information, and its monitoring allows to understand the dynamics of spatial hierarchies. We have applied this method to toy models and to the evolution of the French and the US systems of cities, but this method goes beyond these examples and could be applied to a wealth of problems, depending on the nature of points and on the quantity chosen to characterize them. It allows for the study of spatial correlations in a simple way, and could help in comparing the dynamics of many different urban systems, discussing their evolution in time, classifying them, etc. and finally contributing quantitatively to urban theory thanks to a shared, non-parametric tool. It could also be useful in connecting distinct bodies of literature, namely the studies of the statistical distribution of settlement and city sizes [6, 7] , and those that investigate their spatial distribution on the Earth [11, 12] . Further studies could also investigate the relation between the height of cities and socio-economical factors, such as their GDP.

Concerning the application of the method to urban systems, several lines of future research can be identified. The first one would be to adapt the method so that it could be applied not only to points but alos to cellular tissues, by considering the real geographical envelopes of the geographical units rather the Voronoi cells computed from their centroids. Since the set of municipalities, departments, regions, etc. in a country naturally form tessellations, the rest of the procedure would remain the same: identifying the cells that are the local maxima, aggregating the dominated cells to the one that is a local maxima, and so on.

Future studies may also further investigate the sensitivity of the method against different definitions of cities, that would correspond to different marked point processes or to different elementary building blocks used for building the dominance tree. In the same vein, it would be interesting to measure differences in the tree structure when considering more elaborated distances than the Euclidean distance. One could for example consider transport networks distances, or average travel times. Such data are however much harder to obtain for many different countries, and for ancient periods of time.

Beyond urban geography, other problems in spatial statistics could also be addressed with this simple tool. As it encodes in a simple way various correlations, it could shed a new light on the characterization of random tessellations and random patterns in two dimensions, and could help in revisiting important problems in the physics of foams, such as the Aboav-Weaire and Lewis laws [32, 33] which in general focus on the distribution of local statistics such as the number of edges, the neighbor statistics, etc. but not on the global spatial organization.

We use census population data for France provided by the French national statistics bureau (INSEE) [25] . These data give the population sizes of approximately 32, 000 French municipalities (administrative boundaries) of Metropolitan France for every decade during the period 1876-2018. For the US, the dataset [24] is a compilation of US cities populations between 1790 and 2010 (every 10 years). The data come primarily from the US Census Bureau dataset and contain 7, 500 incorporated cities which at a certain point of their history had a population larger than 2, 500 individuals. Other cities were added from a variety of sources (see [24] for details). Both datasets are public and can be freely downloaded (see [24, 25] for the URLs).

We note that the underlying criteria used to delineate cities in space for computing their population size may influence our results. Here we use administrative boundaries (municipalities) for France which are known to be problematic in many ways when comparing historical population data from one city to another, or between countries [26] . However, the dominance tree is constructed in order to highlight the spatial organization of urban systems at various scales and while most methods used to determine relevant spatial boundaries for cities rely on arbitrary thresholds that are different from one country to another (shares of commuting flows, minimal distance between consecutive built-up areas, etc.), this non-parametric method could provide an alternative way to construct urban agglomerations with Voronoi cells at different heights, and could constitute an interesting di-rection for future studies.

Data and source code availability. The data that support the findings of this study are available for download at https://github.com/cestastanford/ historical-us-city-populations for the US data, and at https://www.insee.fr/fr/information/ 2414405 for the French data. The source code is available at https://gitlab.huma-num.fr/tlouail/voronoize.

TL and MB designed the study. TL wrote the code and performed calculations. TL and MB analyzed and interpreted the results, prepared the figures and wrote the manuscript. Kendall τ value measuring the correlation between the rank r and the height h of cities in the dominance tree, in the US and in France, at different periods for which city population data were available and for which the dominance tree was built. The dashed line corresponds to the average value obtained for 10 3 configurations of the 'random' toy model, calculated for N = 10 4 points randomly distributed in a square box. One sees that the two variables r and h are less correlated in the French case, which is farther from the total random pairing of ranks and locations of the 'random' toy model. This means that in France there are more municipalities with small ranks (large population) which are at small heights (dominated by a close, larger city). This is the case when large municipalities grow in the immediate neighbourhood of even larger ones ("metropolisation"). The overlap area of the horizontal stripes for two consecutive h values indicates, for that given model, to what extent the mark/rank of the point alone is a poor predictor of its height in the dominance tree, which depends not only on its mark but also on its location. As for each h value there are a large number of municipalities the dots are jittered along the y axis for a given h value to ease the vizualisation. Figure 4 , we see horizontal stripes that overlap, proving that the population size of a municipality alone is a poor predictor of its height in the dominance tree. Some large municipalities have a small height because they are absorbed by a larger municipality from which they depend -this is the case for example of Villeurbanne whose population is larger than Troyes, which is however four levels above in the dominance tree. On each line we report the names of the largest and the smallest municipality at this height. 

Matching and agglomeration economies in a system of cities

SIMPOP: a multiagent system for the study of urbanism. Environment and Planning B: Planning and design

Firm linkages, innovation and the evolution of urban systems

Human behavior and the principle of least effort: An introduction to human ecology

Zipf's law for cities: an explanation

The growth equation of cities

MetaZipf. A dynamic metaanalysis of city size distributions

Die zentralen Orte in Suddeutschland: Eine okonomisch-geographische Untersuchung uber die Gesetzmafligkeit der Verbreitung der Siedlungen mit stddtischen Funktionen PhD thesis

A theory of location for cities

Locational analysis in human geography

The Spatial Distribution of US Cities

The agglomeration and dispersion dichotomy of human settlements on Earth

Uniform distribution of objects in a homogeneous field: Cities on a plain

Space-time correlations in urban sprawl

An illusion of spatial hierarchy: spatial hierarchy in a random configuration

Statistical analysis and modelling of spatial point patterns

Point process theory and applications: marked point and piecewise deterministic processes

Notes on continuous stochastic phenomena

Stochastic geometry for wireless networks

Spatial Tessellations: Concepts and Applications of Voronoi Diagrams

Location Science

First-passage times to quantify and compare structural correlations and heterogeneity in complex systems

Statistical physics: statics, dynamics and renormalization

Spatial History Project. United States Historical City Populations

Historique des populations communales. Recensements de la population 1876

Multilevel comparison of large urban systems

Diverse cities or the systematic paradox of Urban Scaling Laws. Computers, Environment and Urban Systems

Rank clocks

The Organization of Urban Systems

The Kendall rank correlation coefficient. Encyclopedia of Measurement and Statistics

Aboav-Weaire's and Lewis' laws-A review

Soap, cells and statistics -random patterns in two dimensions. Contemporary Physics

Simple Features for R: Standardized Support for Spatial Vector Data

ggplot2: Elegant Graphics for Data Analysis

TL thanks the many developers of the free software packages that were used in this study, especially sf [34] , GDAL, GEOS and PROJ to perform the geocomputation and ggplot2 [35] to draw the figures. Last but not least, TL warmly thanks the Ramelet family for hosting him in the winter of 2021 during the COVID-19 lockdown, during which most of this work was done.