key: cord-0506802-21oainly authors: Baíllo, Amparo; Cárcamo, Javier; Mora-Corral, Carlos title: Extremal points of Lorenz curves and applications to inequality analysis date: 2021-03-04 journal: nan DOI: nan sha: d047551457f874061e0b5a64eb6cf78a0ef4f6e5 doc_id: 506802 cord_uid: 21oainly

We find the set of extremal points of Lorenz curves with fixed Gini index and compute the maximal $L^1$-distance between Lorenz curves with given values of their Gini coefficients. As an application we introduce a bidimensional index that simultaneously measures relative inequality and dissimilarity between two populations. This proposal employs the Gini indices of the variables and an $L^1$-distance between their Lorenz curves. The index takes values in a right-angled triangle, two of whose sides characterize perfect relative inequality, expressed by the Lorenz ordering between the underlying distributions. Further, the hypotenuse represents maximal distance between the two distributions. As a consequence, we construct a chart that shows graphically either the evolution of (relative) inequality and distance between two income distributions over time or a comparison of the income distribution of a specific population between a fixed time point and a range of years. We prove the mathematical results behind the above claims and provide a full description of the asymptotic properties of the plug-in estimator of this index. Finally, we apply the proposed bidimensional index to several real EU-SILC income datasets to illustrate its performance in practice.

Inequality is one of the main global issues in today's world; see Atkinson (2015). It is commonly accepted that social inequality has increased across the globe over the last few decades; see for example Greselin (2014) for an analysis of income inequality in the United States of America and Bosmans et al. (2014), who analyze inequality in many countries.
The media often claim that the social gap has widened considerably; day by day the richest are getting richer and the poor poorer. Assertions of this kind are commonly based on striking facts such as "the world's richest 1%, those with more than $1 million, own 45% of the world's wealth". Obviously, social inequality has wide-ranging adverse impacts on both society and the economy; see Stiglitz (2012), Bourguignon (2017), Jarman (2016), and the references therein. Therefore, in the econometric literature there is major interest in developing and studying indices that quantify inequality accurately, as well as summaries of the dispersion/heterogeneity of income or wealth distribution in a given population. The analysis of suitable empirical counterparts to make statistical inferences related to inequality has also played a central role within this field of research.

The motivation of this work is to compare and quantify the inequality between two populations, bringing to light the intrinsic differences between the underlying distributions. The usual econometric tools for quantifying inequality are based on the comparison of one-dimensional indices that might coincide for very different distributions; see Yitzhaki and Schechtman (2012) and Fontanari et al. (2018, Appendix B) for concrete examples regarding the Gini index. As a matter of fact, the first goal of this work is to specify how different two distributions can be (with respect to a suitably chosen distance) if we know their Gini indices. To this end, we have explicitly computed the set of extreme points of Lorenz curves with a fixed value of their Gini coefficient as well as the maximum $L^1$-distance between the Lorenz curves of the distributions when their Gini indices are given. We also want to identify extremal distributions, that is, those pairs of distributions for which this maximal distance is attained.
We believe that these issues are relevant in themselves and that they turn out to be interesting and deep mathematical problems. A second, though primary, goal of this work is to propose a new inequality index that amends (to the extent possible) the deficiencies of the current econometric approaches to comparing inequality between two populations. Specifically, we aim at introducing an index (defined for pairs of distributions) that combines the main elements in social welfare evaluations: the Lorenz curve, the Gini index, and the Lorenz ordering. A detailed inspection of the minimum requirements such an index should satisfy reveals that a statistic with appropriate properties cannot be one-dimensional, as the most frequently used inequality measures are; see Section 2.2. For this reason we propose a two-dimensional index that measures at the same time the difference between the Gini indices and an $L^1$-distance between the Lorenz curves of the two populations. We can further normalize this index so that it takes values in a triangle in $\mathbb{R}^2$ with vertices $(0, 0)$, $(1, 1)$ and $(-1, 1)$. We prove that the legs of this triangle characterize perfect inequality (expressed as Lorenz ordering between the distributions), while the hypotenuse (the upper side) characterizes maximum dissimilarity. This new index allows us to evaluate all at once the difference between the income distributions of the two populations and their relative inequality. We can use this index to visualize the evolution (over time) of relative inequality and distance.

This paper is structured as follows: Section 2 reviews the usual econometric approaches to quantifying and comparing inequality. We also list the ideal properties a relative inequality index should satisfy. In Section 3 we review the basic concepts related to inequality used throughout the paper: the Lorenz curve, the Gini index and the Lorenz ordering.
In Section 4 we consider the class of Lorenz curves with a given value of their Gini coefficients and show that this set is compact (and convex) in the space $L^1$. We further compute the extreme points of this set. Section 5 is devoted to computing maximum distances between two Lorenz curves and to finding extremal pairs of distributions with fixed values of their Gini indices. In Section 6 we introduce the aforementioned inequality index and enumerate its main properties. Two simple normalizations of the index are also proposed to improve data visualization. In Section 7 we follow a "plug-in approach" to estimate the proposed indices. We show the strong consistency of the estimators and compute the limit distributions of their normalized versions. Necessary and sufficient conditions for some of the statistics to be asymptotically normal are provided. We also include the conclusions of a simulation study to evaluate the behaviour of the asymptotic results in finite samples. In Section 8 we compute the bidimensional index for various income datasets from EU-SILC (European Union Statistics on Income and Living Conditions). Additional examples and material regarding the analysis of real data sets and the simulation study are included in the Supplementary Material file. Finally, the proofs of the main results are collected in Section 9, a technical appendix.

In this section, we first give a general review of the current techniques to quantify inequality as well as to compare income distributions across different populations. The most frequent approach is to summarize the distribution into a one-dimensional quantity. However, we conclude the section by pointing out that there is no unidimensional index satisfying simultaneously all the reasonable properties a good relative inequality measure between two distributions should fulfill.

Comparing inequality in two populations is far from being new.
The usual econometric tools to carry out comparisons among income distributions within countries or to analyze the evolution of inequality at different moments in time can be essentially divided into two groups.

1. Inequality measures. In the literature there is a large number of statistics to assess economic inequality. We can mention the well-known Theil, Hoover, Amato, Atkinson and generalized entropy indices. These are only a few examples among many others, and new measures are still introduced from time to time; see Prendergast and Staudte (2018) for a recent proposal. The interested reader might consult the book by Cowell (2011) or Eliazar and Sokolov (2012) for a panoramic overview of equality indices. These measurements summarize and quantify, usually in a single (normalized) real number, the statistical dispersion and heterogeneity of income distribution in a population. There is no doubt that the most commonly used measure in this context is the Gini index (see Section 3.2), which is at the heart of social welfare evaluations. In practice, it is quite frequent to analyze the situation of two (or more) countries in terms of evenness by comparing their respective Gini indices. Most rankings where countries are ordered by income equality, as well as poverty mappings (i.e., maps of income disparity), are obtained in this way. In this first group, the comparison of income distributions relies on the corresponding analysis of suitable income inequality metrics.

2. Stochastic comparisons. An essentially different way to compare (income) distributions is to establish a stochastic ordering between them; see Sriboonchita et al. (2009). Stochastic orders, also known as stochastic dominance rules in the economic literature, are nothing but partial order relations in the set of probability measures; see Shaked and Shanthikumar (2006). Therefore, they allow comparing and ranking distributions according to some specific criterion.
If such a criterion is evenness, the Lorenz order is the most commonly accepted rule, primarily in economic sciences; see Arnold and Sarabia (2018). Two distributions are ordered with respect to this relation if one of their Lorenz curves (see the precise definition in Section 3.1) is completely above the other one. In terms of inequality, this roughly means that the wealth is distributed in a fairer way in one of the two populations. Hence, this second group of techniques consists in performing global comparisons of the distributions by means of stochastic dominance rules that take into account evenness.

Pros and cons. Each of the previous two approaches has advantages and limitations. Inequality metrics provide useful summaries of income distributions that are simple and easily interpretable. These measurements can be used to make comparisons (related to inequality) among distributions by simply ranking the values of the selected index. However, it is clear that a single real number cannot faithfully represent the distribution of income of a population. The same inequality index might correspond to many very different distributions. On the other hand, if two distributions are stochastically ordered (with respect to a certain relation that takes into account inequality), we can derive many important consequences regarding the underlying distributions. Typically, stochastic dominance implies an ordering among many inequality measures simultaneously. For instance, if the Lorenz ordering holds, an inequality between expectations of convex functions of the involved variables is satisfied; see Arnold and Sarabia (2018). Additionally, from the empirical point of view, there are various hypothesis tests to check whether it is reasonable to assume that two variables are ordered; see Anderson (1996), Barrett and Donald (2003), Zheng (2002), Berrendero and Cárcamo (2011), Barrett et al. (2014), Sun and Beare (2021), among others.
Nevertheless, dominance rules are partial orders and, hence, not every pair of distributions can be arranged; see Davies and Hoy (1995). Further, in general, the statement that two distributions are ordered cannot be proved statistically, as one would desire. This happens because a test with null hypothesis "two variables are not ordered" and alternative "the variables are ordered" is usually ill-posed: given a pair of ordered distributions, we can normally find pairs of non-ordered distributions arbitrarily close to the initial ones and hence the null and alternative hypotheses are indistinguishable; see Ermakov (2017).

Let us assume that we want to construct a unidimensional index, say $I$, to measure relative inequality between two populations $X_1$ and $X_2$. We might want this index to combine the most commonly employed tools in social welfare evaluations: the Gini index and the Lorenz ordering. One can easily describe the most desirable properties a reasonable index should fulfill.

(P1) Normalization: The index has to take values on a reference interval. As we intend to measure relative inequality (of one variable with respect to another one), this interval has to be symmetric. Let us assume that the normalization is such that $I(X_1, X_2) \in [-1, 1]$. By convention, positive and negative values of $I(X_1, X_2)$ can indicate that $X_1$ is fairer (in some precise sense) than $X_2$, or the other way around. For instance, we might ask that $I(X_1, X_2) > 0$ if and only if the Gini index of $X_1$ (or any other reference inequality index) is less than the corresponding index of $X_2$.

(P2) Symmetry: The index has to verify that $I(X_1, X_2) = -I(X_2, X_1)$.

(P3) Extreme values: The endpoints of the reference interval should reflect that one distribution is uniformly more equitable than the other one. Ideally, this fact can be translated into the Lorenz ordering between the variables:

(i) $I(X_1, X_2) = +1$ if and only if $X_1$ is smaller than $X_2$ in the Lorenz order.
(ii) $I(X_1, X_2) = -1$ if and only if $X_2$ is smaller than $X_1$ in the Lorenz order.

(P4) Value at the origin: If $I(X_1, X_2) = 0$, then $X_1$ and $X_2$ have to satisfy some kind of equality. A weak version of this property could be that $I(X_1, X_2) = 0$ if and only if their Gini indices (or some other related statistic) coincide. However, preferably the equality should be in distribution. For instance, we could ask that '$I(X_1, X_2) = 0$' be equivalent to '$X_1 =_{st} cX_2$', where $c > 0$ is a constant and '$=_{st}$' stands for stochastic equality. In other words, '$I(X_1, X_2) = 0$' would mean that $X_1$ and $X_2$ distribute the wealth in exactly the same manner up to a size factor.

(P5) Continuity: Let $\{X_{1,n_1}\}$ and $\{X_{2,n_2}\}$ be two sequences of random variables such that $X_{1,n_1} \rightsquigarrow X_1$ and $X_{2,n_2} \rightsquigarrow X_2$ (as $n_1, n_2 \to \infty$), where '$\rightsquigarrow$' stands for some suitable mode of convergence of random variables. In this setting, continuity of the index amounts to $I(X_{1,n_1}, X_{2,n_2}) \to I(X_1, X_2)$ (as $n_1, n_2 \to \infty$). This property is essential to estimate the index well in practice with random samples of the populations.

Unfortunately, some of these properties are usually incompatible for a one-dimensional index. For instance, (P3) and (P5) generally cannot hold at the same time. The reason is that there are pairs of distributions, ordered according to the Lorenz ordering, that are arbitrarily close. We might have that $X_{1,n_1} \rightsquigarrow X_1$ and $X_1$ is smaller than $X_{1,n_1}$ in the Lorenz order if $n_1$ is even and the other way around if $n_1$ is odd, so that under (P3), $I(X_1, X_{1,n_1}) = (-1)^{n_1}$, a sequence that does not converge, in contradiction with (P5). Therefore, if we want to construct an index satisfying properties similar to those enumerated before, we need to quantify the difference between two income distributions with more than one number.

cumulative distribution function $F(x) = P(X \le x)$, for $x \ge 0$.
Formally, the Lorenz curve of the variable $X$ (or of the distribution $F$) is
$$\ell(t) = \frac{1}{\mu}\int_0^t F^{-1}(x)\,dx, \quad 0 \le t \le 1, \tag{1}$$
where
$$F^{-1}(x) = \inf\{y \ge 0 : F(y) \ge x\} \quad (0 < x < 1) \tag{2}$$
is the quantile function of $X$, that is, the generalized inverse of $F$. Hence, if $X$ measures income in a population, for each value $t \in [0, 1]$, the function in (1) gives us the (normalized) total income accumulated by the proportion $t$ of the poorest in that population. Note that $F^{-1}$ is non-decreasing, $\mu = \int_0^1 F^{-1}(x)\,dx$ and $\ell'(t) = F^{-1}(t)/\mu$ for a.e. $t \in (0, 1)$. Therefore, $\ell$ is a convex and non-decreasing function such that $\ell(0) = 0$ and $\ell(1) = 1$. In particular, $\ell$ is continuous except perhaps at the point 1 and has non-negative second derivative $\ell''$ a.e. Moreover, as the quantile function characterizes the probability distribution, $\ell$ determines the distribution of the underlying variable up to a (positive) scale transformation. Explicit analytic expressions for the Lorenz curves of the usual parametric distributions can be found in Kleiber and Kotz (2003, Section 2.1.2).

By convexity, for every Lorenz curve $\ell$ it holds that
$$\ell_{pi}(t) \le \ell(t) \le \ell_{pe}(t), \quad t \in [0, 1], \tag{3}$$
where
$$\ell_{pe}(t) = t \quad \text{and} \quad \ell_{pi}(t) = \begin{cases} 0, & 0 \le t < 1, \\ 1, & t = 1. \end{cases} \tag{4}$$
Figure 1 shows a graphical representation of the inequalities in (3). The function $\ell_{pe}$ is called the perfect equality curve as it corresponds to the Lorenz curve of a Dirac delta measure, i.e., the probability measure corresponding to a population in which all individuals have equal (and positive) incomes. Additionally, $\ell_{pi}$ is the perfect inequality curve because it can be viewed as the limit (when the total number of individuals tends to infinity) of Lorenz curves in finite populations where only one person accumulates all the wealth.

Different characteristics, functionals and values of the Lorenz curve are employed to construct inequality indices; see Arnold and Sarabia (2018). The Gini, Pietra, Amato, 20:20 ratio and Palma ratio indices are some examples of inequality measures derived from the Lorenz curve.
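To make definition (1) concrete for data, the following minimal sketch computes the vertices of the empirical Lorenz curve of a finite sample, that is, the Lorenz curve of the empirical distribution, which is piecewise linear between the points $(k/n, \text{cumulative share of the } k \text{ poorest})$. The function name `lorenz_points` and the example incomes are illustrative assumptions and do not come from the original text.

```python
from itertools import accumulate


def lorenz_points(incomes):
    """Return the vertices (t_k, l_k) of the empirical Lorenz curve.

    Assumes non-negative incomes with a strictly positive total.
    """
    xs = sorted(incomes)
    total = sum(xs)
    if not xs or xs[0] < 0 or total <= 0:
        raise ValueError("need non-negative incomes with a positive total")
    n = len(xs)
    # cum[k] is the total income held by the k poorest individuals
    cum = [0.0] + list(accumulate(xs))
    return [(k / n, cum[k] / total) for k in range(n + 1)]


pts = lorenz_points([1, 2, 3, 4])

# The curve lies below the perfect equality curve l_pe(t) = t ...
below_equality = all(l <= t + 1e-12 for t, l in pts)

# ... and it is convex: slopes between consecutive vertices do not decrease.
slopes = [(l2 - l1) / (t2 - t1) for (t1, l1), (t2, l2) in zip(pts, pts[1:])]
is_convex = all(s1 <= s2 + 1e-12 for s1, s2 in zip(slopes, slopes[1:]))

print(pts, below_equality, is_convex)
```

For the incomes 1, 2, 3, 4 the poorest half of the sample holds 3 out of 10 units, so the curve passes through the point $(0.5, 0.3)$, and both checks agree with the bounds in (3) and with convexity.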
As we have mentioned before, a simple and effective comparison frequently used by the media can be made by analyzing the evolution of the proportion of income accumulated by the top (or bottom) 1% of the population, which is nothing but the analysis of one single value of the Lorenz curve.

The most popular inequality measure derived from the Lorenz curve is the Gini index. This index has an almost uncountable number of interesting interpretations and representations; see Yitzhaki and Schechtman (2012, Chapter 2). One possible way to define it is the following:
$$G(X) = \frac{\|\ell_{pe} - \ell\|}{\|\ell_{pe} - \ell_{pi}\|}, \tag{5}$$
where, here and in the sequel, we denote by $\|f\| = \int_0^1 |f(t)|\,dt$ the $L^1$-norm of a function $f$ on $[0, 1]$. From (3)-(4), we have that $G(X) = 2\|\ell_{pe} - \ell\| = 1 - 2\|\ell\|$. Geometrically, the Gini index corresponds to twice the shaded area in Figure 1. In particular, the denominator in (5) equals $1/2$ (the maximum $L^1$-distance between Lorenz curves) and acts as a normalizing constant so that $0 \le G(X) \le 1$. The Gini index has many desirable properties: it is scale-free (because the Lorenz curve is itself invariant under positive scaling); it can be computed whenever the considered random variable is integrable (a finite second moment is not necessary); it is normalized so that it takes values between 0 (perfect equality) and 1 (perfect inequality); it has a simple and effective interpretation (small values of this index amount to fair income distributions, whereas high values indicate unequal distributions); it is a quasi-convex measure (see Blackorby and Donaldson (1980)), i.e., for all variables $X_1$, $X_2$ and $\lambda \in [0, 1]$,
$$G(\lambda X_1 + (1 - \lambda) X_2) \le \max\{G(X_1), G(X_2)\}.$$

Another important instrument to compare distributions according to inequality is the so-called Lorenz ordering. Let $X_1$ and $X_2$ be two variables with Lorenz curves $\ell_1$ and $\ell_2$, respectively. It is said that $X_1$ is less than or equal to $X_2$ in the Lorenz order, written $X_1 \le_L X_2$, if $\ell_1(t) \ge \ell_2(t)$, for all $t \in [0, 1]$. In this case, we have that $\ell_{pe} \ge \ell_1 \ge \ell_2$, where $\ell_{pe}$ is the perfect equality curve defined in (4).
In other words, income is distributed in a more equitable manner in $X_1$ than in $X_2$. For many families of parametric distributions usually considered in applications, two members of the family differing in the dispersion parameter are usually ordered in accordance with this relation. For instance, Pareto, normal, lognormal, Gamma, Weibull distributions (among others) satisfy this property; see Kleiber and Kotz (2003).

Let us consider the class
$$\mathcal{L} = \overline{\{\ell : \ell \text{ is the Lorenz curve of a positive and integrable random variable with strictly positive expectation}\}}, \tag{6}$$
where the closure is taken with respect to pointwise convergence. For example, the function $\ell_{pi}$ defined in (4) (see also Figure 1), which is not a proper Lorenz curve, belongs to $\mathcal{L}$. For simplicity, we will refer to $\mathcal{L}$ as the class of Lorenz curves. The Gini index of $\ell \in \mathcal{L}$ will also be denoted by $G(\ell)$. In other words,
$$G(\ell) = 2\|\ell_{pe} - \ell\| = 1 - 2\|\ell\|, \quad \ell \in \mathcal{L}. \tag{7}$$
For $a \in [0, 1]$, we define
$$\mathcal{L}_a = \{\ell \in \mathcal{L} : G(\ell) = a\}, \tag{8}$$
the collection of Lorenz curves with Gini index $a$. Note that $\mathcal{L}_a = \{\ell \in \mathcal{L} : \|\ell\| = (1 - a)/2\}$. The following proposition shows the compactness of $\mathcal{L}_a$ in the space $L^1$.

Proposition 1. For each $a \in [0, 1]$, $\mathcal{L}_a$ is a compact and convex set in $L^1$.

To achieve a deeper understanding of the class $\mathcal{L}_a$ in (8) we need some basic concepts about convex sets. The notion of extreme point plays a prominent role in convex analysis; see, for example, Simon (2011, Section 8). Roughly speaking, an extreme point of a convex set is a point that cannot be expressed as a proper convex combination of other points within the set. Formally, given a convex set $C$, $x \in C$ is an extreme point of $C$ if $x = tx_1 + (1 - t)x_2$, for some $t \in (0, 1)$ and $x_1, x_2 \in C$, implies that $x_1 = x_2$. In the following we denote by $\mathrm{Ext}(C)$ the set of extreme points of $C$. The relevance of extreme points is understood through the Krein-Milman theorem (see, e.g., Simon (2011, Theorem 8.14)), which is a central result in convex analysis.
This theorem affirms that a convex and compact set in a locally convex space is the closed convex hull of its extreme points. Therefore, we can retrieve the entire convex set by knowing only the (usually much smaller) set of extreme points. Further, Bauer's maximum principle (see, e.g., Phelps (2013, Proposition 16.6) or Aliprantis and Border (2006, 7.69)) states that a convex, upper-semicontinuous functional on a non-empty, compact and convex set of a locally convex space attains its maximum at an extreme point. In consequence, the knowledge of extreme points is fundamental in mathematical optimization. The practical application of these powerful results goes through the explicit computation of the extreme points of the convex set under study, which is usually a difficult task in infinite-dimensional spaces.

The next theorem determines the set of extreme points of $\mathcal{L}_a$ in (8). We believe that this result might be of independent interest and it is indeed necessary for further developments of this work.

Theorem 1. For $a \in [0, 1]$, we have that where $\ell^a_{x_1}$, $m^a_{x_2}$ and $n^a_{x_1,x_2}$ are the piecewise affine functions of $\mathcal{L}_a$ such that (with the notation $\ell^a_{x_1}(1-) = \lim_{t \uparrow 1} \ell^a_{x_1}(t)$ and the convention $\ell^1_1 = \ell_{pi}$ in (4)).

Theorem 1 summarizes the information of $\mathcal{L}_a$ (an infinite-dimensional collection) in the set of its extreme points, which has only dimension 2. Furthermore, as $\mathcal{L}_a$ is compact in $L^1$, it identifies all possible maximizers of convex and continuous functionals. To prove Theorem 1 we first show that twice differentiation determines an affine isomorphism between $\mathcal{L}_a$ and the set of non-negative measures on $(0, 1)$ with some restrictions. Afterwards, we identify those combinations of delta measures that are extreme points. The last step of the proof of Theorem 1 is related to the results of Winkler (1988) and Pinelis (2016), who analyze the set of extreme points of a subset of measures defined through some inequalities.
In Figure 2 we have depicted various extreme points of $\mathcal{L}_a$, with $a = 0.5$. The probabilistic and economic meaning of some of these Lorenz curves is described in the next section.

As stated in the introduction, for two distributions with fixed Gini indices, one aim of this work is to quantify how "far" they can be from one another. Specifically, we are interested in computing the value $d(\mathcal{L}_a, \mathcal{L}_b)$ ($a, b \in [0, 1]$), for a suitable metric $d$ on $\mathcal{L} \times \mathcal{L}$, where $\mathcal{L}$ is defined in (6), and $\mathcal{L}_a$ and $\mathcal{L}_b$ as in (8). Theorem 1 is extremely useful for this purpose. If $d$ is defined through a norm, $d$ is a convex and continuous functional on (the convex set) $\mathcal{L}_a \times \mathcal{L}_b$. Therefore, since $\mathcal{L}_a$ and $\mathcal{L}_b$ are compact, by Bauer's maximum principle, the supremum of $d$ on $\mathcal{L}_a \times \mathcal{L}_b$ is attained in $\mathrm{Ext}(\mathcal{L}_a \times \mathcal{L}_b) = \mathrm{Ext}(\mathcal{L}_a) \times \mathrm{Ext}(\mathcal{L}_b)$. Thus, thanks to Theorem 1, we reduce the calculation of $d(\mathcal{L}_a, \mathcal{L}_b)$ to a finite-dimensional problem. The exact computation of $d(\mathcal{L}_a, \mathcal{L}_b)$ will eventually depend on the particular choice of the metric $d$. In Section 5.1, we introduce a distance between Lorenz curves which is natural in this context. The computation of this maximal distance, carried out in Section 5.2, as well as the characterization of the distributions where the maximum is attained, is crucial to define the bidimensional inequality index proposed in Section 6.

Depending on the interests of the researcher and the problem at hand, there are many probability metrics that can be used to quantify the distance between two random variables; see the compilation volume on probability distances and their applications by Rachev et al. (2013). However, we note that the Gini coefficient itself is defined in terms of a (normalized) $L^1$-distance between Lorenz curves; see formula (5). Therefore, a sensible and convenient choice to measure dissimilarities between distributions is also a normalized $L^1$-norm of the difference between the corresponding Lorenz curves.
The $L^1$-distance between Lorenz curves has also been used in Zheng (2018) in connection with the almost stochastic dominance of Leshno and Levy (2002). Explicitly, given two random variables $X_1$ and $X_2$ with Lorenz curves $\ell_1$ and $\ell_2$, respectively, we define the Lorenz distance between the variables as
$$d_L(X_1, X_2) = 2\int_0^1 |\ell_1(t) - \ell_2(t)|\,dt. \tag{10}$$
We observe that $0 \le d_L(X_1, X_2) \le 1$ and $d_L$ is actually a pseudo-metric because $d_L(X_1, X_2) = 0$ holds if and only if $X_1 =_{st} cX_2$, where $c > 0$ is a constant. Further, $d_L$ can only achieve the value 1 when the variables have the perfect equality and inequality Lorenz curves in (4). Observe that with this definition the Gini index of a variable is nothing but the Lorenz distance between the variable and a positive constant. We endow the set $\mathcal{L}$ in (6) with the Lorenz distance
$$d_L(\ell_1, \ell_2) = 2\int_0^1 |\ell_1(t) - \ell_2(t)|\,dt, \quad \ell_1, \ell_2 \in \mathcal{L}.$$
By (3), the diameter of $\mathcal{L}$ with respect to the metric $d_L$ is
$$\mathrm{diam}(\mathcal{L}) = d_L(\ell_{pe}, \ell_{pi}) = 1.$$
We further observe that $\mathcal{L}_a$ in (8) is the set of $\ell \in \mathcal{L}$ such that $d_L(\ell, \ell_{pe}) = a$. For any fixed $a, b \in [0, 1]$, $\mathcal{L}_a$ and $\mathcal{L}_b$ are compact sets in $L^1$ (see Proposition 1). Therefore, from Theorem 1, the maximum
$$M(a, b) = \max\{d_L(\ell_1, \ell_2) : \ell_1 \in \mathcal{L}_a,\ \ell_2 \in \mathcal{L}_b\} \tag{11}$$
is attained at $\mathrm{Ext}(\mathcal{L}_a) \times \mathrm{Ext}(\mathcal{L}_b)$.

Definition 1. We say that the pair $(\ell_1, \ell_2) \in \mathcal{L} \times \mathcal{L}$ is extremal if
$$d_L(\ell_1, \ell_2) = M(G(\ell_1), G(\ell_2)).$$
The pair of probability distributions associated to an extremal pair of Lorenz curves will also be called extremal distributions.

For notational convenience, we rename the functions $\ell^a_a$ and $\ell^a_0$ in (9) as $\ell^-_a$ and $\ell^+_a$, respectively. In other words, for $0 \le a \le 1$, $\ell^-_a, \ell^+_a \in \mathcal{L}_a$ are defined as
$$\ell^-_a(t) = \max\left\{0, \frac{t - a}{1 - a}\right\} \quad \text{and} \quad \ell^+_a(t) = \begin{cases} (1 - a)t, & 0 \le t < 1, \\ 1, & t = 1 \end{cases} \tag{12}$$
(with the agreement that $\ell^-_1 \equiv \ell_{pi}$ defined in (4)). These two functions will play an essential role in the rest of the section. In Figure 3 we display two of these functions.

The following theorem, which is the main theoretical result of this section, provides an explicit expression for $M(a, b)$ and shows that this maximum distance is precisely attained at functions of the form (12). It should be mentioned that the computation of $M(a, b)$ is a mathematical problem whose statement is very simple and seems to be deceptively easy.
However, the proof of this result, which begins at Theorem 1 and is collected in the technical appendix, reveals that this issue is indeed more delicate and complex than expected.

Theorem 2. For $0 \le a, b \le 1$, let $M(a, b)$ be as in (11). We have that
$$M(a, b) = \frac{a^2(1 - b) + b^2(1 - a)}{a + b - ab} \tag{13}$$
(the value $M(0, 0) = 0$ is taken by continuity). Moreover, $(\ell^-_a, \ell^+_b)$ and $(\ell^+_a, \ell^-_b)$ are pairs of extremal Lorenz curves within the set $\mathcal{L}_a \times \mathcal{L}_b$.

In Figure 4 we have plotted the function $M(a, b)$. From (13), it is easy to check that

Theorem 2 asserts that the maximum distance in (11) is attained at the pairs $(\ell^-_a, \ell^+_b)$ and $(\ell^+_a, \ell^-_b)$. Hence, the associated probability distributions (unique up to positive scale transformations) are extremal. The function $\ell^-_a$ is the Lorenz curve of a population in which a proportion $a$ of the people have 0 income and the rest, a proportion $1 - a$, have equal and positive income. Also, $\ell^-_a$ is the Lorenz curve of a variable $X_a$ with Bernoulli distribution with parameter $1 - a$, that is, $P(X_a = 0) = a$ and $P(X_a = 1) = 1 - a$. On the other hand, $\ell^+_b$ is not a proper Lorenz curve, but it can be expressed as the limit (as $n$ goes to infinity) of Lorenz curves of populations with $n$ individuals where $n - 1$ of them fairly share a proportion $1 - b$ of the wealth and there is only one "lucky person" who accumulates the rest of the total wealth (the proportion $b$). From a probabilistic perspective, we have that $\ell^+_b = \lim_{p \to 0} \ell_{X(b,p)}$, where $\ell_{X(b,p)}$ is the Lorenz curve of $X(b, p)$, a random variable with distribution $P(X(b, p) = 1 - b) = 1 - p$ and $P(X(b, p) = 1 - b + b/p) = p$.

From Theorem 2 we can easily see the range of values of the distance $d_L(\ell_1, \ell_2)$, when $(\ell_1, \ell_2)$ varies in $\mathcal{L}_a \times \mathcal{L}_b$.

Corollary 1. For $0 \le a, b \le 1$, we have that
$$\{d_L(\ell_1, \ell_2) : (\ell_1, \ell_2) \in \mathcal{L}_a \times \mathcal{L}_b\} = [\,|a - b|,\ M(a, b)\,],$$
where $M(a, b)$ is given in (13).

Theorem 2 also allows us to explicitly compute the maximum distance between Lorenz curves with a given difference of their Gini indices. We have that and

Definition 2.
We say that the pair $(\ell_1, \ell_2) \in \mathcal{L}^2$ is super-extremal if
$$d_L(\ell_1, \ell_2) = M^*(|G(\ell_2) - G(\ell_1)|).$$
The associated pairs of probability distributions will also be called super-extremal distributions.

Obviously, each super-extremal pair is extremal because it always holds that
$$d_L(\ell_1, \ell_2) \le M(G(\ell_1), G(\ell_2)) \le M^*(|G(\ell_2) - G(\ell_1)|). \tag{16}$$
However, from Theorem 2 and for any $0 \le c \le 1$, among all the pairs $(\ell^-_a, \ell^+_{a+c})$ and $(\ell^-_{a+c}, \ell^+_a)$ (with $a \in [0, 1 - c]$) of extreme Lorenz curves with a value $c$ for the difference of their Gini indices there are only two super-extremal pairs, namely, those corresponding to $a = a_c$ in (14).

Observe that $M^*(0)$ is the maximum possible distance between Lorenz curves with equal Gini indices. By Theorem 2 and Corollary 2, we have that the maximum distance between two income distributions both with Gini indices equal to $a$ is
$$M(a, a) = \frac{2a(1 - a)}{2 - a},$$
which attains its maximum at the point $a_0 = 2 - \sqrt{2} \approx 0.59$. Therefore, the maximum (Lorenz) distance between distributions with the same Gini index is
$$M^*(0) = M(a_0, a_0) = 6 - 4\sqrt{2} \approx 0.34.$$
Additionally, $M(a, a)$ is the $d_L$-diameter of $\mathcal{L}_a$. A graphical representation of $M(a, a)$ and $M^*(0)$ is presented in Figure 5. The Lorenz curves $\ell^-_{a_0}$ and $\ell^+_{a_0}$ are hence super-extremal Lorenz curves with equal Gini indices; see Figure 6.

The curves $\ell^-_a$ and $\ell^+_a$ satisfy another extremal property related to inequality within the class $\mathcal{L}_a$. Observe that for $t \in [0, 1]$, by Fubini's theorem, we have that
$$\int_0^t \ell(x)\,dx = \frac{1}{\mu}\int_0^t (t - x)\,F^{-1}(x)\,dx.$$
If we measure income, this quantity is a weighted average of the income accumulated by the proportion $t$ of the poorest in that population. The weight function, $w(x) = t - x$, for $0 \le x \le t$, places more weight on the poorest. Therefore, the inequalities in (18) show that the Lorenz curve $\ell^+_a$ (respectively, $\ell^-_a$) is the most equitable (respectively, least equitable) within the class $\mathcal{L}_a$ in this precise sense. In other words, the distributions given by $\ell^-_a$ and $\ell^+_a$ are extremes for the stochastic relation given in (18). This relationship is closely related to a stochastic ordering called third order inverse stochastic dominance; see de la Cal and Cárcamo (2010).
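The value $a_0 = 2 - \sqrt{2}$ quoted above can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the two extremal curves to be $\ell^-_a(t) = \max\{0, (t - a)/(1 - a)\}$ and $\ell^+_a(t) = (1 - a)t$ for $t < 1$ (our reading of the verbal descriptions in this section), takes the Lorenz distance to be twice the $L^1$-distance, and approximates integrals with a midpoint rule. All function names are hypothetical.

```python
from math import sqrt


def lorenz_minus(a, t):
    """Assumed lower curve: a share a has zero income, the rest equal income."""
    if a >= 1.0:
        return 1.0 if t >= 1.0 else 0.0  # convention: perfect inequality curve
    return max(0.0, (t - a) / (1.0 - a))


def lorenz_plus(a, t):
    """Assumed upper curve: slope 1 - a on [0, 1) and a jump to 1 at t = 1."""
    return 1.0 if t >= 1.0 else (1.0 - a) * t


def lorenz_distance(curve1, curve2, n=2000):
    """Midpoint-rule approximation of 2 * integral over [0, 1] of |curve1 - curve2|."""
    h = 1.0 / n
    mids = ((k + 0.5) * h for k in range(n))
    return 2.0 * h * sum(abs(curve1(t) - curve2(t)) for t in mids)


def distance_equal_gini(a):
    """Distance between the two assumed curves when both have Gini index a."""
    return lorenz_distance(lambda t: lorenz_minus(a, t),
                           lambda t: lorenz_plus(a, t))


# Sanity check: the distance to the perfect equality curve is the Gini index.
gini_check = lorenz_distance(lambda t: lorenz_minus(0.3, t), lambda t: t)

# Grid search for the Gini value at which the two curves are farthest apart.
grid = [k / 500 for k in range(1, 500)]
best_a = max(grid, key=distance_equal_gini)
best_distance = distance_equal_gini(best_a)

print(gini_check)                      # close to 0.3
print(best_a, 2 - sqrt(2))             # grid maximizer vs. 2 - sqrt(2)
print(best_distance, 6 - 4 * sqrt(2))  # maximal distance vs. 6 - 4*sqrt(2)
```

A grid search is used instead of a closed form so that the check depends only on the shapes of the two curves and not on any explicit expression for $M(a, a)$.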
In this section we introduce a two-dimensional inequality index defined for pairs of distributions that combines the Gini coefficients of two variables with the Lorenz distance defined in (10). Hence, the proposed index simultaneously measures relative inequality and dissimilarity between two populations. We will show that this bidimensional index satisfies many desirable properties. As the definitions in this section only involve Lorenz curves, we refer to pairs $(\ell_1, \ell_2)$, with $\ell_1, \ell_2 \in \mathcal{L}$, instead of considering random variables.

Let $\ell_1$ and $\ell_2$ be two Lorenz curves in $\mathcal{L}$. As a measure of relative inequality we simply consider the difference of the Gini indices, that is, $G(\ell_2) - G(\ell_1)$. To quantify dissimilarity we employ the distance $d_L(\ell_1, \ell_2)$ in (10). Therefore, a natural proposal for a new two-dimensional index is the following:
$$I(\ell_1, \ell_2) = \big(G(\ell_2) - G(\ell_1),\ d_L(\ell_1, \ell_2)\big). \tag{19}$$
The next result provides the region of $\mathbb{R}^2$ where $I$ takes values.

Proposition 3. Let $I$ be defined in (19) and let us consider the region of $\mathbb{R}^2$ defined by
$$\Delta = \{(x, y) \in \mathbb{R}^2 : -1 \le x \le 1,\ |x| \le y \le M^*(|x|)\}. \tag{20}$$
We have that $I(\mathcal{L} \times \mathcal{L}) = \Delta$, where the function $M^*$ is defined in (15).

In Figure 7 we have plotted the region $\Delta$ specified in (20). This graphical representation is very informative because we see how different the Gini indices of two variables can be in accordance with the distance between their Lorenz curves. We first notice that the range of variation of the index $I$ is limited as $\Delta$ has a very small area:

It is easy to check that the index $I$ in (19) satisfies all the ideal properties enumerated in Section 2.2. Nevertheless, it has the disadvantage that its values are difficult to interpret because they are located in a narrow region. Therefore, in the rest of this section, we suggest two possible transformations of the index taking values in a more convenient region.

To facilitate the understanding of the graphical representation of the index, in this section we propose two normalizations of $I$ in (19) taking values on a simpler region of the plane, instead of lying on the set $\Delta$ displayed in Figure 7.
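As an illustration of the unnormalized index $I$ in (19), the following minimal sketch computes its two components from two finite samples, using the empirical Lorenz curves and a midpoint rule for the integrals. It assumes the Gini index is $1 - 2\int_0^1 \ell$ and the Lorenz distance is $2\int_0^1 |\ell_1 - \ell_2|$; the function names and the sample data are hypothetical and chosen only for the example.

```python
def lorenz_curve(incomes):
    """Return the empirical Lorenz curve of a sample as a function on [0, 1].

    Assumes non-negative incomes with a strictly positive total.
    """
    xs = sorted(incomes)
    total = float(sum(xs))
    n = len(xs)
    cum = [0.0]
    for x in xs:
        cum.append(cum[-1] + x / total)  # cumulative income shares

    def curve(t):
        k = min(int(t * n), n - 1)       # index of the linear piece containing t
        return cum[k] + (t * n - k) * (cum[k + 1] - cum[k])

    return curve


def integrate(f, n=4000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(f((k + 0.5) * h) for k in range(n))


def gini(curve):
    return 1.0 - 2.0 * integrate(curve)


def lorenz_distance(curve1, curve2):
    return 2.0 * integrate(lambda t: abs(curve1(t) - curve2(t)))


def index_I(incomes1, incomes2):
    """Pair (difference of Gini indices, Lorenz distance) for two samples."""
    l1, l2 = lorenz_curve(incomes1), lorenz_curve(incomes2)
    return gini(l2) - gini(l1), lorenz_distance(l1, l2)


# Equal incomes versus one individual holding everything (Lorenz-ordered pair).
first, second = index_I([1, 1, 1, 1], [0, 0, 0, 1])
# Two samples whose Lorenz curves cross.
other_first, other_second = index_I([1, 2, 3, 4], [2, 2, 5, 1])
print(first, second, other_first, other_second)
```

In the first example the two curves are ordered, so the distance coincides with the difference of the Gini indices and the point lies on the lower boundary $y = |x|$ of $\Delta$; in the second example the second component is at least the absolute value of the first, as the region $\Delta$ requires.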
For example, we can transform, through a suitable homeomorphism, the set ∆ in (20) into the triangle T in (21), whose vertices are (0, 0), (−1, 1) and (1, 1). There are several alternatives to carry out this normalization. The simplest way to transform the set ∆ into T is by linearly stretching the segment . We thus consider the map t * ∶ ∆ → T given by (22). We introduce the two-dimensional index defined by

I * (ℓ 1 , ℓ 2 ) = t * (I(ℓ 1 , ℓ 2 )), (23)

where t * is the homeomorphism defined in (22). By construction, I * takes values in the triangle (21).

The mapping t * in (22) is perhaps the most natural homeomorphism to transform ∆ in (20) into T in (21). Nevertheless, only super-extremal pairs of distributions (see Definition 2) lie on the uppermost side of the triangle T , that is, the segment [−1, 1] × {1}. For instance, among all pairs of extremal distributions with equal Gini, {(ℓ − a , ℓ + a ) ∶ a ∈ [0, 1]}, only the pair with a = 2 − √ 2 achieves a value of I * equal to (0, 1). This happens because I * only takes into account the difference between the Gini indices of the involved variables.

The second proposal is to incorporate into I in (19) the value of the Gini index of each variable separately. In this way, we can send all extremal pairs of distributions to the uppermost side of T . We start with the following proposition.

Proposition 4. Let us consider the region of R 3 defined by We have that where M is given in (11).

To understand the actual size of ∆ * , we point out that its volume is Vol(∆ * ) = (2/3)(10 − π 2 ) ≈ 0.087. However, the normalization used to construct I * in Section 6.2 (through the function M * ) generates a set that contains ∆ * (by (16)) and whose volume is approximately 0.14 (1.58 times larger than that of ∆ * ). Next, we consider the map t * ∶ ∆ * → T defined by (25). Observe that t * (∆ * ) = T , but t * is not injective. This is not a problem, as we want to send all extremal distributions with a given difference of their Gini indices to the same point on the frontier of T .
Finally, we define the bidimensional index I * as

I * (ℓ 1 , ℓ 2 ) = t * (G(ℓ 1 ), G(ℓ 2 ), d L (ℓ 1 , ℓ 2 )). (26)

By construction, I * takes values in T . Moreover, I * sends all pairs of extremal distributions (see Definition 1) to the upper side of T . Observe that, from (16), the second component of I * is always larger than the corresponding one of I * in (23). In this regard, we highlight that M (a, b) could be very different from M * (b − a). This is especially noticeable when the Gini indices of both variables are simultaneously small or large, as can be seen in Figure 5. Hence, this second proposal could be significantly different from the previous one in this situation.

Using the expression for M in (13), we can rewrite the index I * (ℓ 1 , ℓ 2 ) in a slightly different way. For simplicity, let us set We have that

The following proposition enumerates the main properties of the indices I * and I * defined in (23) and (26), respectively.

Proposition 5. Let (X 1 , X 2 ) be a pair of random variables with Lorenz curves ℓ 1 and ℓ 2 . We consider the index I * = (I * 1 , I * 2 ) defined in (26). The following properties hold:

(i) Normalization: We have that I * (L × L) = T in (21), i.e., I * takes values in the triangle T . Moreover, positive (respectively, negative) values of the first component of I * indicate that ℓ 1 (respectively, ℓ 2 ) is fairer than ℓ 2 (respectively, ℓ 1 ) according to the Gini index.

(ii) Symmetry: We have that I * (ℓ 2 , ℓ 1 ) = (−I * 1 (ℓ 1 , ℓ 2 ), I * 2 (ℓ 1 , ℓ 2 )).

(1) where M is given in (11) and (13).

(iv) Value at extreme points of T :
(1) I * (ℓ 1 , ℓ 2 ) = (0, 0) if and only if ℓ 1 = ℓ 2 . Therefore, the value at the origin means that the associated distributions satisfy that X 1 = st cX 2 , where c > 0 is a constant.
(2) I * (ℓ 1 , ℓ 2 ) = (1, 1) if and only if ℓ 1 = ℓ pe and ℓ 2 = ℓ pi .
(3) I * (ℓ 1 , ℓ 2 ) = (−1, 1) if and only if ℓ 1 = ℓ pi and ℓ 2 = ℓ pe .
(v) Continuity: If {ℓ 1,n 1 } n 1 ≥1 ⊂ L and {ℓ 2,n 2 } n 2 ≥1 ⊂ L are sequences such that ℓ 1,n 1 → ℓ 1 and ℓ 2,n 2 → ℓ 2 pointwise as n 1 , n 2 → ∞, then I * (ℓ 1,n 1 , ℓ 2,n 2 ) → I * (ℓ 1 , ℓ 2 ).

The same properties hold for the index I * defined in (23). Figure 9 summarizes graphically the properties in Proposition 5.

Figure 9: The triangle T (in green) in which the indices I * and I * take values. The x-axis represents the difference between the Gini indices of two variables and the y-axis the normalization of the Lorenz distance through t * or t * in (22) and (25), respectively.

In this section we prove that the plug-in estimators of the indices defined in the previous section are strongly consistent. Moreover, we determine their asymptotic distributions and obtain necessary and sufficient conditions so that the estimator of I in (19) is asymptotically normal. To finish this section, we include the conclusions of a small simulation study with generalized beta distributions of the second kind to evaluate the behaviour of the asymptotic results in finite samples.

Let X 1 , X 2 be two random variables with distribution functions F 1 and F 2 and Lorenz curves ℓ 1 and ℓ 2 , respectively. For j = 1, 2, we consider random samples X j,1 , . . . , X j,n j from X j . For simplicity, we will assume that both samples are mutually independent. However, similar convergence results can be obtained when we observe "matched pairs" drawn from a bivariate distribution (X 1 , X 2 ) with copula C whose maximal correlation is strictly less than one; see Beare (2010). As pointed out in Barrett et al. (2014) and Sun and Beare (2021), this second setting is more reasonable when we have one sample of individuals and two measures of welfare. To simplify the notation, in the sequel all estimated quantities are denoted with a "hat", and their dependence on the corresponding sample sizes is implicitly understood.
To estimate the inequality indices introduced in Section 6, the starting point is the natural estimator of the distribution function of the sample. Namely, for j = 1, 2, we denote by F̂ j the empirical distribution functions of the samples, i.e., F̂ j (x) = (1/n j ) ∑ i 1 {X j,i ≤ x} , where 1 A stands for the indicator function of the set A. The corresponding empirical quantile functions are F̂ −1 j (x) = inf{y ≥ 0 ∶ F̂ j (y) ≥ x} (0 < x < 1) and the empirical Lorenz curves are ℓ̂ j (t) = (1/X̄ j ) ∫ 0 t F̂ −1 j (x) dx (0 ≤ t ≤ 1), where X̄ j = (1/n j ) ∑ i X j,i are the sample means. Therefore, the plug-in estimators of the indices I, I * and I * defined in equations (19), (23) and (26) are obtained by evaluating each index at the pair of empirical Lorenz curves (ℓ̂ 1 , ℓ̂ 2 ); see (27). The next proposition shows the strong consistency of these estimators.

Proposition 6. As n 1 , n 2 → ∞, the plug-in estimators in (27) converge almost surely to the corresponding indices.

The proof of Proposition 6 (see the Appendix) shows that strong consistency of the estimators of the indices follows from the (almost surely) uniform convergence of the empirical Lorenz curves to their theoretical counterparts. We observe that this convergence can be derived under weaker assumptions regarding the samples of X 1 and X 2 . For instance, in Csörgő and Yu (1999, Theorem 2.1) strong uniform consistency of ℓ̂ j to ℓ j is obtained under very general conditions.

The computation of the asymptotic distribution of the indices relies on the convergence of the empirical Lorenz processes (associated with X j with Lorenz curves ℓ j , respectively, for j = 1, 2) given by √n j (ℓ̂ j − ℓ j ). (28) The analysis of the convergence of Lorenz processes can be traced back to Goldie (1977). However, we will use a recent result by Sun and Beare (2021) in which the weak joint convergence of the processes in (28) is obtained by using a new result regarding the convergence of the quantile process in L 1 (see Kaji (2018) and Kaji (2019)) together with the functional delta method; see van der Vaart and Wellner (1996, Section 3.9).
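The plug-in construction described above can be sketched in a few lines of code. The following is only an illustration under stated assumptions: it takes G(ℓ) = 1 − 2∫ℓ and assumes that the Lorenz distance in (10) is d L (ℓ 1 , ℓ 2 ) = 2‖ℓ 1 − ℓ 2 ‖ L 1 , a scaling chosen so that the distance equals the absolute difference of the Gini indices when the two curves are ordered. All function names are hypothetical and are not taken from the paper or from any library.

```python
from bisect import bisect_right
from itertools import accumulate


def lorenz_knots(sample):
    """Knots (k/n, S_k/S_n) of the piecewise-linear empirical Lorenz curve."""
    xs = sorted(sample)
    n, total = len(xs), sum(xs)
    ps = [k / n for k in range(n + 1)]
    ls = [0.0] + [s / total for s in accumulate(xs)]
    return ps, ls


def lorenz_eval(knots, t):
    """Evaluate the empirical Lorenz curve at t in [0, 1] by linear interpolation."""
    ps, ls = knots
    k = min(bisect_right(ps, t), len(ps) - 1)  # ps[k-1] <= t <= ps[k]
    w = (t - ps[k - 1]) / (ps[k] - ps[k - 1])
    return ls[k - 1] + w * (ls[k] - ls[k - 1])


def gini(knots):
    """Gini index as 1 - 2 * (area under the curve); trapezoids are exact here."""
    ps, ls = knots
    area = sum((ps[k] - ps[k - 1]) * (ls[k] + ls[k - 1]) / 2
               for k in range(1, len(ps)))
    return 1.0 - 2.0 * area


def lorenz_distance(knots1, knots2):
    """Assumed d_L = 2 * integral of |l1 - l2|, exact for piecewise-linear curves."""
    grid = sorted(set(knots1[0]) | set(knots2[0]))
    diffs = [lorenz_eval(knots1, t) - lorenz_eval(knots2, t) for t in grid]
    total = 0.0
    for k in range(1, len(grid)):
        h, d0, d1 = grid[k] - grid[k - 1], diffs[k - 1], diffs[k]
        if d0 * d1 >= 0:
            # No sign change on this segment: plain trapezoid of |difference|.
            total += h * (abs(d0) + abs(d1)) / 2
        else:
            # The difference crosses zero: add the two triangles on each side.
            total += h * (d0 * d0 + d1 * d1) / (2 * (abs(d0) + abs(d1)))
    return 2.0 * total


def index_estimate(sample1, sample2):
    """Plug-in estimate of I = (G(l2) - G(l1), d_L(l1, l2))."""
    k1, k2 = lorenz_knots(sample1), lorenz_knots(sample2)
    return gini(k2) - gini(k1), lorenz_distance(k1, k2)


equal = [1.0, 1.0, 1.0, 1.0]         # everyone has the same income
concentrated = [0.0, 0.0, 0.0, 4.0]  # one individual holds everything
print(index_estimate(equal, concentrated))
```

In this toy example the two empirical curves are ordered, so both components of the estimate coincide. For samples whose curves cross, such as [1, 1, 4] and [0, 3, 3], the first component can vanish while the second stays positive, which is the situation the bidimensional index is designed to detect.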
Finally, we show the (directional) Hadamard differentiability of the map (19), which essentially follows from Cárcamo (2017, Lemma 4) , and apply the (extended) functional delta method (see Shapiro (1990, Theorem 2 .1)) to derive the asymptotic distributions. Therefore, we need to impose various conditions on the variables so that the associated Lorenz processes converge in L 1 . Assumption 2 (Regularity condition). For j = 1, 2, we have that F j (0) = 0 and F j has at most finitely many jumps and is continuously differentiable elsewhere with strictly positive density. Assumption 1 amounts to saying that the variables X j belong to the Lorentz space L 2,1 ; see Grafakos (2008, Section 1.4 ). This condition is equivalent to the convergence of the classical empirical process (associated with F j ) in the space L 1 ; see del Barrio et al. (1999, Theorem 2.1) . Condition Λ 2,1 (X j ) < ∞ is slightly stronger than EX 2 j < ∞: it holds for example when EX 2+ j < ∞, for some > 0. The smoothness condition in Assumption 2 is necessary to conclude the convergence of the quantile process in L 1 through the differentiability of the inverse map plus the convergence of the empirical process in L 1 ; see Kaji (2019) . Under these assumptions, and with the functional delta method, Sun and Beare (2021, Lemma 2.1) obtained the asymptotic behaviour of the empirical Lorenz process in C([0, 1]) ≡ the space of continuous real-valued functions on [0, 1]. This result is collected in the following lemma where we use the arrow '↝' to denote the weak convergence of probability measures in the sense of Hoffmann-Jørgensen; see van der Vaart and Wellner (1996) . Further, for j = 1, 2, B j will denote two independent standard Brownian bridges on [0, 1]. Lemma 1. For j = 1, 2, let us assume that X j satisfy Assumptions 1 and 2. As n j → ∞, where L j are (independent) centered Gaussian processes with continuous trajectories a.s. 
that can be expressed as where B j are independent standard Brownian bridges.

Some comments should be made regarding the previous key lemma. First, we have opted for the less restrictive assumptions given in Kaji (2019, Proposition 4.2) instead of those considered in Sun and Beare (2021, Assumption 2.1) or the more demanding ones in Barrett et al. (2014, Assumption 1). However, for simplicity, we assume that X 1 and X 2 are independent. If this is not the case, a similar result can be stated (see Sun and Beare (2021, Lemma 2.1)), but then the Brownian bridges B 1 and B 2 in (30) are correlated.

The computation of the asymptotic distribution of the estimator of I follows from Lemma 1 together with the functional delta method. Traditionally, to apply this latter tool it is assumed that the considered maps are Hadamard differentiable. However, as shown by Shapiro (1991) (see also Dümbgen (1993)), it is enough to have Hadamard directional differentiability. We recall this concept in the following definition.

Definition 3. Let D and E be real Banach spaces with norms ‖⋅‖ D and ‖⋅‖ E , respectively. A map φ ∶ D → E is said to be Hadamard directionally differentiable at θ ∈ D tangentially to a set D 0 ⊂ D if there exists a map φ ′ θ ∶ D 0 → E such that

‖(φ(θ + t n h n ) − φ(θ))/t n − φ ′ θ (h)‖ E → 0, (31)

for all h ∈ D 0 and all sequences {h n } ⊂ D, {t n } ⊂ R such that t n ↓ 0 and ‖h n − h‖ D → 0.

The main difference between full and directional Hadamard differentiability is that the derivative φ ′ θ is not necessarily linear in Definition 3. However, if equation (31) is satisfied, then φ ′ θ is continuous and positive homogeneous of degree 1; see Shapiro (1990, Proposition 3.1). The proof of the following lemma follows from Cárcamo (2017, Lemma 4).

Lemma 2. The map δ ∶ L 1 → R defined by δ(f ) = ‖f ‖ L 1 is Hadamard directionally differentiable at every f ∈ L 1 . For g ∈ L 1 , its derivative is given by

δ ′ f (g) = ∫ {f ≠0} g sgn(f ) + ∫ {f =0} |g|, (32)

where sgn(⋅) is the sign function. In particular, if the Lebesgue measure of the set {f = 0} is zero, then δ ′ f is linear.

In the following proposition we establish the asymptotic behaviour of the normalized estimator of the index I in (19).
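Before stating that result, the directional derivative in Lemma 2 can be checked numerically. The sketch below assumes that the derivative of the L 1 -norm has the standard form δ ′ f (g) = ∫ {f ≠0} g sgn(f ) + ∫ {f =0} |g|, and compares it with a one-sided finite difference for a function f that vanishes on half of the interval. The functions f and g, the step t and the helper names are hypothetical choices made for this illustration only.

```python
def midpoint_integral(func, n=20000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(func((i + 0.5) * h) for i in range(n))


def f(x):
    return max(x - 0.5, 0.0)  # vanishes on [0, 0.5], positive afterwards


def g(x):
    return 1.0 - 4.0 * x      # direction of the perturbation


def l1_norm(func):
    return midpoint_integral(lambda x: abs(func(x)))


def directional_derivative(f, g):
    """Assumed derivative: int_{f != 0} g sgn(f) + int_{f = 0} |g|."""
    def integrand(x):
        fx = f(x)
        if fx == 0.0:
            return abs(g(x))            # non-linear part, on the set {f = 0}
        return g(x) if fx > 0 else -g(x)
    return midpoint_integral(integrand)


t = 1e-4
finite_difference = (l1_norm(lambda x: f(x) + t * g(x)) - l1_norm(f)) / t
formula = directional_derivative(f, g)
# What a purely linear derivative would give if the set {f = 0} were ignored.
linear_part = midpoint_integral(lambda x: g(x) if f(x) > 0 else 0.0)
print(finite_difference, formula, linear_part)
```

The finite difference agrees with the directional formula (close to −0.75 here) and not with the linear part alone (close to −1). The gap comes entirely from the set where f vanishes, which is why the Lebesgue measure of the set where two Lorenz curves coincide governs whether the limit distribution is Gaussian.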
We impose the following condition on the sample sizes.

Assumption 3 (Sampling condition). The sample sizes n 1 and n 2 are (weakly) balanced, that is, as n 1 , n 2 → ∞, n 1 /(n 1 + n 2 ) → λ, where λ ∈ [0, 1].

Proposition 7. Let Assumptions 1, 2 and 3 be fulfilled. Then, as n 1 , n 2 → ∞, where with L j the independent centered Gaussian processes in (30) (j = 1, 2) and the derivative δ ′ in (32).

The proof of Proposition 7 (see the Appendix) relies on the joint convergence of the underlying Lorenz processes. Therefore, any sampling scheme ensuring this joint convergence is enough to derive the asymptotic distribution of the normalized estimator. The following corollary provides necessary and sufficient conditions for the limit distribution in (33) to be bivariate normal.

Corollary 3. Under the conditions of Proposition 7, the following three assertions are equivalent:
(a) The set {ℓ 1 = ℓ 2 } = {x ∈ [0, 1] ∶ ℓ 1 (x) = ℓ 2 (x)} has zero Lebesgue measure.
(b) The limit distribution in (33) can be expressed as where L is defined in (34).
(c) The limit in (33) has a centered bivariate normal distribution.

We observe that {ℓ 1 = ℓ 2 } is the set of crossing points of the two Lorenz curves. The case when this set has zero Lebesgue measure (Corollary 3 (a)) is a reasonable assumption when we consider two different populations in practice; for instance, when comparing the annual household income of two different countries. In this scenario, the asymptotic distribution of the estimator of the index I is normal, which simplifies implementing the usual inferential procedures (confidence intervals, hypothesis testing). Otherwise, if the set {ℓ 1 = ℓ 2 } has positive Lebesgue measure, the limit distribution in (33) could be complicated to handle. Further, as the derivative appearing in the limit is not linear, the corresponding map δ is not fully Hadamard differentiable and, consequently, the standard bootstrap scheme fails.
Fang and Santos (2019) propose several methodologies to correct the bootstrap scheme in this situation. Once the asymptotic distribution of the estimator of I has been established in the previous section, the corresponding distributions for the indices I * and I * can be derived thanks to the (traditional) delta method; see for instance van der Vaart (1998, Theorem 3.1). In the case of the index I * in (23), we have that I * ( 1 , 2 ) = t * (I 0 ( 1 , 2 )), where t * in (22) is (by construction) a smooth map. Hence, we can state the following result. Proposition 8. Let I * be the index defined in (23). In the conditions of Proposition 7 we have that n 1 n 2 n 1 + n 2 where I is the limit distribution in (33), t * = (t * 1 , t * 2 ) is in (22) and The expression of ∂t * 2 ∂x (x, y) can be easily computed, but it is too long to be included here. We observe that the evaluation of the derivative of t * in (36) is understood as a product of matrices. At least theoretically, Proposition 8 provides the asymptotic distribution of I * . Nevertheless, we point out that even when the distribution of I is bivariate normal (see Corollary 3), the second component of the asymptotic distribution in (36) could be complicated as it is expressed as a non-linear transformation of I. The asymptotic distribution of the estimator of I * in (26) can be computed by following the same steps as in the proof of Proposition 7. First, we consider the map ψ ∶ C([0, 1]) × C([0, 1]) → R 3 given by Let 1 , 2 ∈ L. We observe that ψ( 1 , 2 ) = (G( 1 ), G( 2 ), d L ( 1 , 2 )) and I * ( 1 , 2 ) = t * (ψ( 1 , 2 )), where t * is in (25). Again, it can be checked that ψ is Hadamard directionally differentiable at ( 1 , 2 ) with derivative given by where δ ′ is defined in (32). Therefore, from (37) and by the chain rule, we obtain the following result. Proposition 9. Let I * be the index defined in (26). 
In the conditions of Proposition 7 we have that where ψ ′ is in (38), t * = (t * 1 , t * 2 ) is in (25) and t ′ (x,y,z) = ∂t 1 ∂x (x, y, z) ∂t 1 ∂y (x, y, z) ∂t 1 ∂z (x, y, z) ∂t 2 ∂x (x, y, z) ∂t 2 ∂y (x, y, z) ∂t 2 ∂z (x, y, z) The expressions of ∂t 2 ∂x (x, y, z) and ∂t 2 ∂y (x, y, z) can be easily computed, but they are too long to be included here. We illustrate here some of the previous asymptotic results through a small simulation study. We focus on the index I in (19) since the other proposals, I * in (23) and I * in (26), are smooth transformations of I. To carry out the simulations there are two options to generate the data: to consider parametric families of Lorenz curves (see for instance Sarabia (2008) ) or to use parametric probability density functions. We have chosen to simulate the data from probability densities that are used in practice to model income distributions. Many families of probability distributions have two parameters that represent changes in location and scale; Weibull or Lognormal distributions are examples of such type of families. When this happens, in the Lorenz curve (1) the location parameter disappears (because of the normalization) and only the scale parameter remains. Typically, by moving this dispersion parameter, distributions of the same family are ordered in the Lorenz sense. Therefore, the value of the index between two variables of the same (two-parameter) family is usually located on one of the 2 diagonals of the region ∆ in (20). Consequently, to look for examples whose index lies within the triangle we resort to distributions with more than 2 parameters. Here we propose to use generalized beta distributions of the second kind (GB2). This family has been previously considered as a model for the distribution of income; see Chotikapanich et al. (2018) and McDonald and Ransom (2008) . 
The probability density of the GB2 distribution depends on 4 positive parameters, a, b, p, q, and is given by

f (x | a, b, p, q) = a x ap−1 / (b ap B(p, q) [1 + (x/b) a ] p+q ), x > 0, (39)

where B(⋅, ⋅) is the Euler beta function. We will denote by X ∼ GB2(a, b, p, q) a random variable with this density. From the expression of the density in (39), we see that f (x | a, b, p, q) behaves as 1/x aq+1 as x → ∞. In particular, if α > 0, we have that EX α < ∞ if and only if aq > α. Consequently, GB2 variables are integrable whenever aq > 1 and, in such a case, we can compute their Lorenz curves. Further, to apply Proposition 7 or Corollary 3 it is sufficient that aq > 2. It can be checked that the Lorenz curve of X ∼ GB2(a, b, p, q) is given by (40), which is expressed in terms of the incomplete beta function.

The Lorenz curve of X ∼ GB2(a, b, p, q) in (40) does not depend on b, as b is essentially a scale parameter. However, it is not convenient to select b = 1 for the considered examples, as the mean of X is EX = b B(p + 1/a, q − 1/a)/B(p, q), which also depends on the rest of the parameters. Therefore, to better compare the densities of the considered models, in all the GB2 distributions we fix the value b = B(p, q)/B(p + 1/a, q − 1/a). In this way, the variables always have expectation 1. We have considered 5 models of pairs of GB2 variables to reflect a wide range of possible situations regarding the value of the index I in (19).

Model 1: In this example, we have that Therefore, I(X 1 , X 2 ) ≈ (−0.0037, 0.0858). Hence, the value of the index lies almost on the vertical line of equality of Gini indices and the distance between the distributions is intermediate. We also note that a i q i > 2, which means that the integrability condition given in (29) is satisfied. In particular, we can apply Corollary 3 to obtain that the (normalized) plug-in estimator of the index is asymptotically normal.

Model 2: For i = 1, 2, we consider We can compute Therefore, I(X 1 , X 2 ) ≈ (−0.3510, 0.3510). That is, the variables are ordered with respect to the Lorenz dominance.
Further, again a i q i > 2 and the integrability condition in (29) holds. By Corollary 3 we see that the (normalized) plug-in estimator of the index is asymptotically normal and the limit distribution is concentrated on the diagonal L 2 .

Model 3: We obtain that G(X 1 ) ≈ 0.3547, G(X 2 ) ≈ 0.3723, d L (X 1 , X 2 ) ≈ 0.041. Then, I(X 1 , X 2 ) ≈ (−0.0176, 0.0414). The two variables have a similar Gini index and a small distance between them. The parameters satisfy a i q i > 2 and we can use Corollary 3 to conclude that the (normalized) plug-in estimator of the index is asymptotically normal.

Model 4: For i = 1, 2, we consider We have that We hence obtain that I(X 1 , X 2 ) ≈ (−0.0007, 0.1369). We see that the variables have a similar Gini index and a large distance between them. In this case, a 1 q 1 = 1.2 and we cannot apply Proposition 7 or Corollary 3 because X 1 does not satisfy the integrability condition (29). Only the consistency of the estimator is guaranteed by Proposition 6.

Model 5: In this example X 1 = X 2 (in distribution). We consider The index takes the value (0, 0) and, as a i q i = 4, we can apply Proposition 7 to conclude that the (normalized) estimator of the index converges in distribution to a non-Gaussian random vector.

Additional details of these simulations can be found in the Supplementary Material. The main conclusions are the following. In Models 1-3, the asymptotic distribution of the estimator of the index is normal. However, if the corresponding Lorenz curves are close to each other, then larger sample sizes are needed to observe a Gaussian distribution. This is reasonable because when the variables coincide (Model 5), the limit distribution is not normal (its second component is positive). In the case of Model 4, for which convergence in distribution is not guaranteed, we observe that the index can still be estimated reasonably well, as the estimator is consistent (see Proposition 6).
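For readers who wish to reproduce this kind of experiment, GB2 samples with unit mean can be generated as sketched below. The sketch relies on the standard representation X = b (Z/(1 − Z)) 1/a with Z ∼ Beta(p, q), and on the scale b = B(p, q)/B(p + 1/a, q − 1/a) that makes the expectation equal to 1. The parameter values, the seed and the function names are hypothetical and do not correspond to any of the five models above.

```python
import math
import random


def beta_function(x, y):
    """Euler beta function computed through log-gamma for numerical stability."""
    return math.exp(math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y))


def unit_mean_scale(a, p, q):
    """Scale b such that a GB2(a, b, p, q) variable has expectation 1."""
    if a * q <= 1:
        raise ValueError("the mean is finite only when a * q > 1")
    return beta_function(p, q) / beta_function(p + 1 / a, q - 1 / a)


def sample_gb2(n, a, p, q, rng):
    """Draw n values through X = b * (Z / (1 - Z)) ** (1 / a), Z ~ Beta(p, q)."""
    b = unit_mean_scale(a, p, q)
    draws = []
    for _ in range(n):
        z = rng.betavariate(p, q)
        draws.append(b * (z / (1.0 - z)) ** (1.0 / a))
    return draws


rng = random.Random(2021)  # fixed seed so that the run is reproducible
# Illustrative parameters with a * q = 6 > 2, so second moments are finite.
sample = sample_gb2(20000, a=3.0, p=2.0, q=2.0, rng=rng)
print(sum(sample) / len(sample))  # should be close to 1
```

Choosing parameters with aq > 2 keeps the example inside the range where Proposition 7 applies. Taking aq between 1 and 2 instead yields the heavy-tailed situation of Model 4, where only consistency is available.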
EU-SILC (European Union Statistics on Income and Living Conditions) is the reference source for comparable longitudinal and cross-sectional microdata on income, living conditions, poverty and social inclusion in Europe. As part of its objective of monitoring poverty and social inclusion in the EU, the EU-SILC project releases statistics and reports on income and living conditions, for instance, indicators on the distribution of income. The microdata are separately provided to EU-SILC by each country participating in the project, as collected by the administrative organism in charge of compiling the official statistics of that state. In this section we compute the plug-in estimator (27) of the bidimensional inequality index introduced in (19) for cross-sectional income microdata (at the household level) obtained from EU-SILC collection. The random variable X under consideration is the equivalised disposable income, the total disposable income of a private household divided by the equivalised household size. The total disposable income represents the total income of a household which is available for saving or spending. The equivalised household size is the number of household members converted into "equivalised" adults by the modified OECD (Organisation for Economic Co-operation and Development) equivalence scale: the first household member aged 14 years or more counts as 1 person, each other household member aged 14 years or more counts as 0.5 person, each household member aged 13 years or less counts as 0.3 person. The equivalised disposable income is one of the variables describing income at the household level which is used by Eurostat (the statistical office of the EU) to compute Gini coefficients and other inequality indicators. We use the bidimensional inequality index I in (19) to compare the empirical Lorenz curves derived from the equivalised disposable income of two populations, X 1 and X 2 . 
These are generated in two different ways:

(i) Cross-temporal comparisons within countries: Fix a reference year and country, and analyze the evolution of the index in that country (relative to the initial year) over a period of time. This gives insight into the evolution of income inequality in a specific country along the period under study.

(ii) Longitudinal comparisons between two countries: Compare two different countries in the same year, letting the year run over an available period. This allows comparing the relative evolution of inequality in the two countries.

EU-SILC offers microdata from several European countries over a span of more than a decade. Due to obvious proximity reasons, as an example of (i), we have chosen to focus on the evolution of Spain.

The year 2008 was the onset of a severe Spanish financial and economic crisis, officially ending in 2014. It was triggered by the world financial crisis of 2007-08, but one of its main causes was the heavy dependency of the Spanish economy and labour market on low-productivity activities such as construction and services (see, e.g., Royo (2020, Chapter 4)). The existence of a housing bubble and a record level of family indebtedness had a snowball effect. There was a first recession period between 2008 and 2010 and a second one between 2011 and 2013. In 2012, Spain had to apply for a €100 billion rescue package provided by the European Stability Mechanism. Due to the resulting steep rise in the unemployment rate, in 2013 more than half a million immigrants returned from Spain to their countries of origin. Figure 10 plots the evolution of the mean equivalised disposable income and the Gini index from 2008 to 2019 and clearly reflects this abrupt crisis. Figure 10 also shows the hard climb towards a recovery of the pre-crisis level, which has taken more than 9 years (The Economist (2018), IMF (2017)) and has been abruptly ended by the COVID-19 pandemic (IMF (2020)).
Although in 2017 Spanish GDP went beyond its pre-crisis peak of 2007 and many indicators reflected the impressive recovery, it was generally agreed that the country was more unequal than in 2008 (The Economist (2018), IMF (2017)). Thus, it is interesting to analyze the evolution of inequality in Spanish society from 2008 onwards (see Blavier (2017)), in particular to compare the distribution of income between 2008 (held fixed) and the following years. Microdata from INE and EU-SILC cover up to year 2019 (included). Income data from 2020 are not yet available, but they will undoubtedly reflect the severe economic contraction induced by the pandemic and an increase of socio-economic disparities in the population.

To this end, we have computed the estimate Î of the bidimensional inequality index (19) and one of its normalized versions, Î * , for the equivalised disposable income in Spain in 2008 (X 1 ) and in each of the years in the span 2009-2019 (X 2 ). Observe that the index Î * separates the points more than Î, especially those with similar Gini coefficient (near the vertical axis). The resulting indices (see Figure 11) show the devastating effects of the crisis on the distribution of income. From 2011 to 2017 the values of the index were either on the frontier L 1 (years 2011 and 2016) or very near it, meaning that the Lorenz curves were strictly ordered, ℓ 1 ≥ ℓ 2 (or almost so), and income was distributed more equitably in 2008 than in 2011 or 2016. The curve ℓ 2 for the rest of the years from 2011 to 2017 is below ℓ 1 except for a rightmost interval contained in [0.8, 1] (see the Supplementary Material). Income distribution in 2008 is therefore almost more equitable than that of the years 2012, 2013. . . (in the sense defined by Zheng (2018)). This would support the generalized social perception that the 2008 crisis in Spain struck not only the lowest income class but also the middle class (see Alonso et al.
(2017)), broadening the gap between both groups and the richest (last income decile).

Another collateral effect of the 2007-08 world financial crisis was the abrupt deterioration of the Greek sovereign-debt crisis. In 2009 the newly elected Greek government announced that its predecessor had underreported national debt levels and deficits. The consequent loss of confidence in the Hellenic economy, its structural weaknesses and other problems such as tax evasion triggered a chain reaction: the increase of national bond yields and the recession resulted in the downgrading of Greek bonds to junk status and a threat of sovereign default in 2010. Successive international bailout loan programs (in 2010, 2012 and 2015) came at the cost of severe austerity measures in Greece. As a result, a huge number of businesses went bankrupt and the unemployment rate rose without control, thus entering a spiral of economic implosion and population impoverishment.

Surprisingly, the effects of this deep and prolonged crisis on inequality in Greece were not as dramatic as one could expect (see the evolution of the Greek Gini index in Figure 12 and the Supplementary Material). As noted by Mitrakos (2014), Greece already entered the crisis with a high level of income inequality and, also, the thousands of people who ended up homeless as a result of the crisis were not part of the Household Budget Surveys. The effects of the austerity measures are noticeable in the sharp decline of the household disposable income (see Figure 12 and the Supplementary Material). In 2018 Greece exited the last of the bailouts, still with a debt-to-GDP ratio of more than 150%.

In contrast, the Finnish economy has been growing steadily since the country joined the euro zone and is stable, diversified and competitive. The negative effects of the 2007-08 world crisis on the Finnish economy were not severe. Income inequality is among the lowest in the EU (see Figure 12).
We compare the evolution of the bidimensional inequality index I between these two extreme countries of the EU. Our aim is to check the ability of the index to reflect that the Greek and Finnish income distributions are ordered (or almost so) in all the years of the period. Indeed, in Figure 13 we can see that the index I is mainly on the left frontier L 2 of the region ∆ or extremely close to it, that is, Finland (almost) uniformly distributes income more evenly than Greece. However, observe that in 2018 and 2019 the distance between the two countries has greatly diminished, indicating an improvement in the Greek distribution of income.

In the Supplementary Material we have compared Greece with Portugal, a country facing economic problems well before the world crisis of 2007-08 and whose income inequality was even greater than that of Greece at the start of the crisis. In this case, the bidimensional index also lies on the L 1 frontier of ∆ (or very near it).

In Figures 14 and 15 we can see the relative evolution of inequality in Spain and Portugal from 2008 to 2018. Observe that in 2008 the inequality index Î was almost on the right frontier of the region ∆, indicating that income distribution in Portugal was nearly ordered with respect to that of Spain (Spain uniformly distributed income better than Portugal). But, as the crisis struck in Spain, the Portuguese economy cut the distance with the Spanish one and the index Î moved towards the vertical line of equal Gini indices.

We examine an example of two countries, Germany and France, whose relative inequality has great variations, as reflected in the position of the bidimensional index I (see Figure 17). The value and the evolution of the Gini index are heavily dependent on the variable under study (e.g., disposable equivalised household income or personal labour income); see Battisti et al. (2016).
The subject of growing inequality in Germany has been a matter of interesting discussions (see, e.g., Dao (2020) ): corporate investments and assets revenue benefit the richest and have widened top income inequality; a decrease of unemployment has increased the variability and range of wages. The German reunification caused an inequality increase, but the Gini index has been stable since the mid-2000s (see Figure 16 ). Inequality in France is a matter of great concern. Extensive redistribution of wealth and income through taxes and social transfers is carried out with the aim of correcting poverty and reducing income disparities. This explains the tendency of the bidimensional index I to be in the left half of the region ∆, indicating a more unequal distribution of income in Germany than in France. Here we collect the proofs of the main results stated in Sections 4-7. First, we enumerate below some regularity properties of the functions in the set L defined in (34) that will be useful throughout the appendix. Figure 13 : Bidimensional inequality index I for income data from Greece (X 1 ) and Finland (X 2 ). We start with a slight change in the definition of the functions in the set L of (6). Given ∈ L we redefine the value of at 1 as (1) = sup [0, 1) . This redefinition is motivated by the fact that, as shown in the following proposition, functions in L become continuous in [0, 1]. In addition, the convexity of L and L a remains true. With this definition, L becomes the set of convex ∶ [0, 1] → [0, 1] such that (0) = 0 and (1) = sup [0,1) . In the following proposition we denote by W 1,1 (0, 1) the Sobolev space W 1,1 in the interval (0, 1), which is equivalent to the set of absolutely continuous functions in [0, 1]; it is endowed with the norm where ′ is the distributional derivative of , which coincides a.e. with the derivative of . 
For $\alpha \in (0,1)$ we denote by $W^{1,\infty}(0,\alpha)$ the Sobolev space $W^{1,\infty}$ in the interval $(0,\alpha)$, which is equivalent to the set of Lipschitz continuous functions in $[0,\alpha]$; it is endowed with the norm $\|\ell\|_{W^{1,\infty}(0,\alpha)} = \operatorname{ess\,sup}_{(0,\alpha)} |\ell| + \operatorname{ess\,sup}_{(0,\alpha)} |\ell'|$. See, e.g., Brezis (2011, Chapter 8) or Evans and Gariepy (1992, Chapter 4) for the definition and properties of these spaces.

for each $\alpha \in (0,1)$. Moreover, and for each $\alpha \in (0,1)$,
(b) The function $\ell'$ is locally of bounded variation, the right derivative $\ell'(x^+)$ exists for all $x \in [0,1)$ and is non-decreasing. Moreover, $\ell'(0^+) \ge 0$.
(c) $\ell''$ is a non-negative Radon measure.

Proof. Convex functions are locally Lipschitz (see Evans and Gariepy (1992, Theorem 6.3.1) or Simon (2011, Theorem 1.19)), have a first derivative locally of bounded variation (see Evans and Gariepy (1992, Theorem 6.3.3)) and have a second derivative in the sense of distributions (see Evans and Gariepy (1992, Theorem 6.3.2) or Simon (2011, Theorem 1.29)), which in fact is a non-negative Radon measure. Further, the right derivative $\ell'(x^+)$ exists for all $x \in [0,1)$ and is non-decreasing (see Simon (2011, Theorem 1.26)). As $\ell(0) = 0$ and $\ell \ge 0$, we necessarily have that $\ell'(0^+) \ge 0$, so $\ell'(x^+) \ge 0$ for all $x \in [0,1)$. By the version of the fundamental theorem of calculus for convex functions (see Simon (2011, Theorem 1.28)), $\ell$ is non-decreasing. In addition, the derivative of $\ell$ exists a.e. and coincides a.e. with the right derivative, so $\ell' \ge 0$ a.e. In particular, $\int_0^1 |\ell'| = \ell(1) - \ell(0) \le 1$. Since $\int_0^1 |\ell| = \frac{1-a}{2}$, we conclude that $\|\ell\|_{W^{1,1}(0,1)} \le 1 + \frac{1-a}{2}$.

On the other hand, we observe that the affine function $s : [0,1] \to \mathbb{R}$ given by $s(x) = \ell'(\alpha^+)(x - \alpha) + \ell(\alpha)$ is a supporting line of $\ell$ at the point $(\alpha, \ell(\alpha))$. By convexity, we hence have that $s \le \ell$ and then, $\ell'(\alpha^+)(1-\alpha) + \ell(\alpha) = s(1) \le \ell(1) \le 1$. As $\ell'$ is non-negative and non-decreasing, we conclude that $\operatorname{ess\,sup}_{(0,\alpha)} |\ell'| \le \ell'(\alpha^+) \le \frac{1}{1-\alpha}$.

We are now ready to prove that $\mathcal{L}_a$ is a compact set of $L^1$.

Proof of Proposition 1 in Section 4 (Compactness of $\mathcal{L}_a$). The convexity of $\mathcal{L}_a$ is straightforward. Let us prove its compactness. Let $\{\ell_n\}_{n\in\mathbb{N}}$ be a sequence in $\mathcal{L}_a$.
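By Proposition 10, $\{\ell_n\}_{n\in\mathbb{N}}$ is bounded in $W^{1,1}(0,1)$, so by the Rellich-Kondrachov theorem (see Brezis (2011, Theorem 8.8)), there exist a subsequence (not relabelled) and an $\ell \in L^1$ such that $\ell_n \to \ell$ in $L^1$ as $n \to \infty$. This also implies that $G(\ell) = a$. On the other hand, for each $\alpha \in (0,1)$ we have by Proposition 10 (a) that $\{\ell_n\}_{n\in\mathbb{N}}$ is bounded in $W^{1,\infty}(0,\alpha)$, so by the Ascoli-Arzelà theorem (see Brezis (2011, Theorems 4.25 and 8.8)), for a further subsequence, $\ell_n \to \ell$ uniformly in $[0,\alpha]$ as $n \to \infty$. In particular, $\ell(0) = 0$ and $0 \le \ell \le 1$ in $[0,\alpha]$; moreover, since the pointwise limit of convex functions is a convex function, we obtain that $\ell$ is convex in $[0,\alpha]$. Therefore, $0 \le \ell \le 1$ in $[0,1)$ and $\ell$ is convex in $[0,1)$. We redefine $\ell(1)$ as $\ell(1) = \sup_{[0,1)} \ell$, so that $\ell$ becomes continuous in $[0,1]$. We also obtain that $0 \le \ell \le 1$ in $[0,1]$ and $\ell$ is convex in $[0,1]$. Therefore, $\ell \in \mathcal{L}_a$ and the proof is finished. ◻

9.2 Proof of Theorem 1 in Section 4 (Extreme points of $\mathcal{L}_a$)

We will use an alternative description of the elements in $\mathcal{L}_a$ in terms of positive measures concentrated on the interval $(0,1)$. The main idea is based on the following fact: any curve $\ell \in \mathcal{L}_a$ is uniquely determined by its second derivative, $\ell''$, together with the conditions $\ell(0) = 0$ and $G(\ell) = a$ (or, equivalently, $\int_0^1 \ell = \frac{1-a}{2}$). Given $a \in [0,1]$, we denote by $\mathcal{M}_a$ the set of non-negative Radon measures $\mu$ concentrated on the interval $(0,1)$ and such that

$$\int_{(0,1)} (1-s)^2 \, d\mu(s) \le 1-a \quad \text{and} \quad \int_{(0,1)} s(1-s) \, d\mu(s) \le a. \qquad (41)$$

Proposition 11. For $a \in [0,1]$, the map $T_a : \mathcal{L}_a \to \mathcal{M}_a$ defined by $T_a(\ell) = \ell''$ is an affine isomorphism with inverse $T_a^{-1} : \mathcal{M}_a \to \mathcal{L}_a$ given by

$$T_a^{-1}(\mu)(x) = \Big(1 - a - \int_{(0,1)} (1-s)^2 \, d\mu(s)\Big) x + \int_{(0,x)} (x-s) \, d\mu(s), \quad x \in [0,1]. \qquad (42)$$

Proof. First we see that the map $T_a$ is well defined. Given $\ell \in \mathcal{L}_a$, we have from Proposition 10 that $\ell''$ is a non-negative Radon measure. As $\ell'$ is locally of bounded variation, $\ell'(t) = \ell'(0^+) + \ell''((0,t])$ for a.e. $t \in (0,1)$.

Figure 17: Bidimensional inequality index I for income data from Germany ($X_1$) and France ($X_2$).

As $\ell$ is locally Lipschitz and $\ell(0) = 0$, for $x \in [0,1]$, we have that

$$\ell(x) = \int_0^x \ell'(t) \, dt = \ell'(0^+) x + \int_0^x \ell''((0,t]) \, dt = \ell'(0^+) x + \int_{(0,x)} (x-s) \, d\ell''(s), \qquad (43)$$

where for the last equality we have used Fubini's theorem.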
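Integrating in $x \in (0,1)$ equality (43) (and by Fubini's theorem again) we obtain the restriction

$$\frac{1-a}{2} = \int_0^1 \ell = \frac{\ell'(0^+)}{2} + \frac{1}{2} \int_{(0,1)} (1-s)^2 \, d\ell''(s), \quad \text{that is,} \quad \ell'(0^+) = 1 - a - \int_{(0,1)} (1-s)^2 \, d\ell''(s). \qquad (44)$$

As $\ell'(0^+) \ge 0$, from (44) we directly obtain the first inequality of (41). On the other hand, (43) and (44) show that

$$\ell(x) = \Big(1 - a - \int_{(0,1)} (1-s)^2 \, d\ell''(s)\Big) x + \int_{(0,x)} (x-s) \, d\ell''(s), \quad x \in [0,1]. \qquad (45)$$

Hence, imposing $\ell(1) \le 1$ we have the second inequality of (41).

Now, for $\mu \in \mathcal{M}_a$, we define $\ell$ as in the right-hand side of (42) and we will check that $\ell \in \mathcal{L}_a$. First, $\ell(0) = 0$ and $\int_0^1 \ell = \frac{1-a}{2}$, so $G(\ell) = a$. Further, thanks to the second inequality of (41), we obtain that $\ell(1) = 1 - a + \int_{(0,1)} s(1-s) \, d\mu(s) \le 1$. By Leibniz integral rule (differentiation under the integral sign), it can also be checked that

$$\ell'(x) = 1 - a - \int_{(0,1)} (1-s)^2 \, d\mu(s) + \mu((0,x)) \quad \text{for a.e. } x \in (0,1), \qquad (46)$$

and hence

$$\ell'' = \mu \qquad (47)$$

as measures. As $\mu$ is positive, from (46) we have that $\ell'$ is essentially non-decreasing, $\ell$ is convex and $\ell'(x) \ge 1 - a - \int_{(0,1)} (1-s)^2 \, d\mu(s) \ge 0$, a.e. $x \in (0,1)$, by the first inequality of (41), so $\ell$ is non-decreasing. In particular, $\ell \ge 0$. This shows that $\ell \in \mathcal{L}_a$.

Finally, we prove that the maps $T_a$ and (42) are mutually inverse. Given $\ell \in \mathcal{L}_a$, if we apply first $T_a$ and then (42) we get back $\ell$ thanks to (45). Conversely, given $\mu \in \mathcal{M}_a$, if we apply first (42) and then $T_a$ we recover $\mu$ by (47). Since $T_a$ is affine, the proof is concluded.

Next we calculate $\operatorname{Ext}(\mathcal{M}_a)$. We denote by $\delta_x$ the Dirac measure at $x \in [0,1]$.

Proposition 12. For $a \in [0,1]$, we have that

$$\operatorname{Ext}(\mathcal{M}_a) = \{0\} \cup \Big\{ \tfrac{1-a}{(1-x_1)^2} \delta_{x_1} : x_1 \in (0,a] \Big\} \cup \Big\{ \tfrac{a}{x_2(1-x_2)} \delta_{x_2} : x_2 \in (a,1) \Big\} \cup \Big\{ \tfrac{x_2-a}{(x_2-x_1)(1-x_1)} \delta_{x_1} + \tfrac{a-x_1}{(x_2-x_1)(1-x_2)} \delta_{x_2} : x_1 \in (0,a),\ x_2 \in (a,1) \Big\}.$$

Proof. The proof is divided into several smaller results.

Step 1: The null measure $\mu \equiv 0 \in \operatorname{Ext}(\mathcal{M}_a)$. This is direct as all the measures in $\mathcal{M}_a$ are non-negative.

Step 2: For all $x_1 \in (0,a]$, the measure $\mu = \frac{1-a}{(1-x_1)^2} \delta_{x_1} \in \operatorname{Ext}(\mathcal{M}_a)$. Clearly, $\mu \in \mathcal{M}_a$. Assume that $\mu = t_1\mu_1 + t_2\mu_2$, for some $t_1, t_2 > 0$ with $t_1 + t_2 = 1$ and $\mu_1, \mu_2 \in \mathcal{M}_a$. Then $\mu_i = \beta_i \delta_{x_1}$ and, due to (41), $\beta_i \le \frac{1-a}{(1-x_1)^2}$ for $i = 1, 2$. Since $t_1\beta_1 + t_2\beta_2 = \frac{1-a}{(1-x_1)^2}$, both bounds are attained and, hence, $\mu_1 = \mu_2$. Therefore, $\mu \in \operatorname{Ext}(\mathcal{M}_a)$.

Step 3: For all $x_2 \in (a,1)$, the measure $\frac{a}{x_2(1-x_2)} \delta_{x_2} \in \operatorname{Ext}(\mathcal{M}_a)$. The proof is similar to the one of the previous step and it is therefore omitted.

Step 4: If for some $x_1 \in (0,a]$ and $\alpha \in \mathbb{R} \setminus \{0, \frac{1-a}{(1-x_1)^2}\}$, $\mu = \alpha\delta_{x_1} \in \mathcal{M}_a$, then $\mu \notin \operatorname{Ext}(\mathcal{M}_a)$. The fact $\mu \in \mathcal{M}_a$ implies that $0 < \alpha < \frac{1-a}{(1-x_1)^2}$.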
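Therefore, for $\varepsilon > 0$ small enough, we have that $\mu \pm \varepsilon\delta_{x_1} \in \mathcal{M}_a$ since both are positive measures and the restrictions (41) are satisfied; indeed, $(\alpha \pm \varepsilon)(1-x_1)^2 \le 1-a$ and $(\alpha \pm \varepsilon) x_1 (1-x_1) \le a$. Finally, we can write $\mu = \frac{1}{2}(\mu + \varepsilon\delta_{x_1}) + \frac{1}{2}(\mu - \varepsilon\delta_{x_1})$, and, hence, $\mu \notin \operatorname{Ext}(\mathcal{M}_a)$.

Step 5: If for some $x_2 \in (a,1)$ and $\alpha \in \mathbb{R} \setminus \{0, \frac{a}{x_2(1-x_2)}\}$, $\mu = \alpha\delta_{x_2} \in \mathcal{M}_a$, then $\mu \notin \operatorname{Ext}(\mathcal{M}_a)$. The proof is similar to that of Step 4 and it is left to the reader.

Step 6: For all $x_1 \in (0,a)$ and $x_2 \in (a,1)$, $\mu = \frac{x_2-a}{(x_2-x_1)(1-x_1)} \delta_{x_1} + \frac{a-x_1}{(x_2-x_1)(1-x_2)} \delta_{x_2} \in \operatorname{Ext}(\mathcal{M}_a)$. It is immediate to check that $\int_{(0,1)} (1-s)^2 \, d\mu(s) = 1-a$ and $\int_{(0,1)} s(1-s) \, d\mu(s) = a$. Therefore, $\mu \in \mathcal{M}_a$. Moreover, if $\mu = t_1\mu_1 + t_2\mu_2$ for some $t_1, t_2 > 0$ with $t_1 + t_2 = 1$ and $\mu_1, \mu_2 \in \mathcal{M}_a$, then $\int_{(0,1)} (1-s)^2 \, d\mu_i(s) = 1-a$ and $\int_{(0,1)} s(1-s) \, d\mu_i(s) = a$ for $i = 1, 2$. Furthermore, as $\mu_i = \sum_{j=1}^2 \beta_{ij}\delta_{x_j}$ for some $\beta_{ij} \ge 0$ (for $i, j = 1, 2$), we have that $\sum_{j=1}^2 \beta_{ij}(1-x_j)^2 = 1-a$ and $\sum_{j=1}^2 \beta_{ij} x_j (1-x_j) = a$, $i = 1, 2$, and, hence, $\beta_{i1} = \frac{x_2-a}{(x_2-x_1)(1-x_1)}$ and $\beta_{i2} = \frac{a-x_1}{(x_2-x_1)(1-x_2)}$. Therefore, $\mu_1 = \mu_2$. Consequently, $\mu \in \operatorname{Ext}(\mathcal{M}_a)$.

Step 7: If for some $\alpha_1, \alpha_2 > 0$ and $0 < x_1 < x_2 < 1$ with $(\alpha_1, \alpha_2)$ different from the pair of weights of Step 6, $\mu = \alpha_1\delta_{x_1} + \alpha_2\delta_{x_2} \in \mathcal{M}_a$, then $\mu \notin \operatorname{Ext}(\mathcal{M}_a)$. As $\mu \in \mathcal{M}_a$, we have that

$$\sum_{i=1}^2 \alpha_i (1-x_i)^2 \le 1-a \quad \text{and} \quad \sum_{i=1}^2 \alpha_i x_i (1-x_i) \le a. \qquad (48)$$

If both inequalities in (48) were equalities, we necessarily have that $\alpha_1$ and $\alpha_2$ are the weights of Step 6, against our assumption. Therefore, at least one of the two inequalities of (48) is strict. If $\sum_{i=1}^2 \alpha_i (1-x_i)^2 < 1-a$, then we consider the signed measure defined, for instance, by $\mu_0 = x_2(1-x_2)\delta_{x_1} - x_1(1-x_1)\delta_{x_2}$. Then, it is straightforward to check that, for small enough $\varepsilon > 0$, $\mu \pm \varepsilon\mu_0 \in \mathcal{M}_a$ and

$$\mu = \tfrac{1}{2}(\mu + \varepsilon\mu_0) + \tfrac{1}{2}(\mu - \varepsilon\mu_0). \qquad (49)$$

Therefore, $\mu \notin \operatorname{Ext}(\mathcal{M}_a)$. If $\sum_{i=1}^2 \alpha_i x_i (1-x_i) < a$, we then consider the signed measure given, for instance, by $\mu_0 = (1-x_2)^2\delta_{x_1} - (1-x_1)^2\delta_{x_2}$. Again, we have that, for small enough $\varepsilon > 0$, $\mu \pm \varepsilon\mu_0 \in \mathcal{M}_a$ and equality (49) holds. We conclude that $\mu \notin \operatorname{Ext}(\mathcal{M}_a)$.

Step 8: If $\mu \in \mathcal{M}_a$ is supported in more than two points, then $\mu \notin \operatorname{Ext}(\mathcal{M}_a)$. We consider the signed measure $\mu_0 = \sum_{i=1}^3 \alpha_i \mu_{A_i}$. For $\varepsilon > 0$ small enough, define $\mu_+$ and $\mu_-$ as $\mu_\pm = \mu \pm \varepsilon\mu_0$. Then $\mu = \frac{1}{2}\mu_+ + \frac{1}{2}\mu_-$ with $\mu_\pm \ne \mu$. Moreover, $\mu_\pm$ are positive measures since where $A^c$ stands for the complement of the set $A$ in $(0,1)$. In fact, $\mu_\pm \in \mathcal{M}_a$ since Therefore, we conclude that $\mu \notin \operatorname{Ext}(\mathcal{M}_a)$.

The eight steps above complete the proof.

Step 8 of the previous proof is related to the works by Winkler (1988) and Pinelis (2016), which analyze the set of extreme points of subsets of measures defined through some inequalities.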
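The one-atom and two-atom measures appearing in Steps 2, 3 and 6 can be checked numerically. The following Python lines are only a sketch under stated assumptions: the two constraints defining $\mathcal{M}_a$, the form of the inverse map $T_a^{-1}$ and the two-atom weights are reconstructed here from the steps of the proof (they are not quoted from the paper), and the helper names `constraint_integrals`, `in_M_a`, `lorenz_curve` and `gini` are hypothetical.

```python
# Atomic measures mu = sum_i w_i * delta_{s_i} are stored as lists of (s_i, w_i).
# ASSUMPTIONS (reconstructed from the proof steps, not quoted from the paper):
#   M_a = { mu >= 0 : int (1-s)^2 dmu <= 1-a  and  int s(1-s) dmu <= a },
#   T_a^{-1}(mu)(x) = (1 - a - int (1-s)^2 dmu) * x + int_(0,x) (x-s) dmu(s),
#   G(l) = 1 - 2 * int_0^1 l.

def constraint_integrals(atoms):
    """Return int (1-s)^2 dmu and int s(1-s) dmu for an atomic measure."""
    first = sum(w * (1 - s) ** 2 for s, w in atoms)
    second = sum(w * s * (1 - s) for s, w in atoms)
    return first, second

def in_M_a(atoms, a, tol=1e-12):
    """Check non-negativity, support in (0,1) and the two constraints."""
    first, second = constraint_integrals(atoms)
    admissible = all(w >= 0 and 0 < s < 1 for s, w in atoms)
    return admissible and first <= 1 - a + tol and second <= a + tol

def lorenz_curve(atoms, a):
    """Curve obtained from the measure through the assumed inverse map."""
    first, _ = constraint_integrals(atoms)
    slope_at_zero = 1 - a - first
    def curve(x):
        return slope_at_zero * x + sum(w * max(x - s, 0.0) for s, w in atoms)
    return curve

def gini(curve, n=10_000):
    """Gini index 1 - 2 * integral, with the trapezoidal rule on a uniform grid."""
    h = 1.0 / n
    vals = [curve(i * h) for i in range(n + 1)]
    integral = h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
    return 1 - 2 * integral

a, x1, x2 = 0.4, 0.2, 0.7
one_atom_low = [(x1, (1 - a) / (1 - x1) ** 2)]            # Step 2
one_atom_high = [(x2, a / (x2 * (1 - x2)))]               # Step 3
two_atoms = [(x1, (x2 - a) / ((x2 - x1) * (1 - x1))),     # Step 6 (weights solve
             (x2, (a - x1) / ((x2 - x1) * (1 - x2)))]     # both equality constraints)

for name, atoms in [("step 2", one_atom_low), ("step 3", one_atom_high),
                    ("step 6", two_atoms)]:
    curve = lorenz_curve(atoms, a)
    print(name, in_M_a(atoms, a), round(gini(curve), 6), round(curve(1.0), 6))
```

In this sketch every generated curve should have Gini index close to $a$ and end value at most 1, while a one-atom measure with a strictly smaller weight (as in Step 4) stays in the set after a small perturbation in both directions, which is why it cannot be extreme.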
Proof of Theorem 1 in Section 4 (Extreme points of $\mathcal{L}_a$). By Proposition 11, we have the equality $\operatorname{Ext}(\mathcal{L}_a) = T_a^{-1}(\operatorname{Ext}(\mathcal{M}_a))$. Now, we can use Proposition 12 to determine the set $\operatorname{Ext}(\mathcal{L}_a)$. By (42) and Proposition 12, we obtain three families of extreme curves in $\mathcal{L}_a$. First, for $x_1 \in (0,a]$, let $\ell^a_{x_1} = T_a^{-1}\big(\frac{1-a}{(1-x_1)^2}\delta_{x_1}\big)$ and $\ell^a_0 = T_a^{-1}(0)$. More explicitly, we have that, for Second, for $x_2 \in (a,1)$, we set $m^a_{x_2} = T_a^{-1}\big(\frac{a}{x_2(1-x_2)}\delta_{x_2}\big)$. Finally, for $x_1 \in (0,a)$ and $x_2 \in (a,1)$, let $n^a_{x_1,x_2}$ be the image under $T_a^{-1}$ of the two-atom measure of Step 6 in the proof of Proposition 12. In this case we have that These curves admit the characterization as piecewise affine functions given in (9). Therefore, the proof of Theorem 1 is complete. ◻

Here we compute the exact value of $M(a,b)$ in (11). The proof of this theorem is long and we have divided it into several results. It is based on the following proposition.

Proposition 13. For $a, b \in [0,1]$, let $M(a,b)$ be defined in (11). We have that $M(a,b) = \max\{d_L(\ell_1, \ell_2) : \ell_1 \in \operatorname{Ext}(\mathcal{L}_a) \text{ and } \ell_2 \in \operatorname{Ext}(\mathcal{L}_b)\}$.

Proof. The distance $d_L : \mathcal{L}_a \times \mathcal{L}_b \to \mathbb{R}$ is a convex and continuous functional in $L^1$. Further, by Proposition 1, the convex sets $\mathcal{L}_a$ and $\mathcal{L}_b$ are compact in $L^1$. Therefore, by Bauer's maximum principle (see, e.g., Phelps (2013, Proposition 16.6)), the maximum of $d_L$ is attained at the set $\operatorname{Ext}(\mathcal{L}_a \times \mathcal{L}_b) = \operatorname{Ext}(\mathcal{L}_a) \times \operatorname{Ext}(\mathcal{L}_b)$.

Proposition 13 together with Theorem 1 reduces the calculation of $M(a,b)$ to a finite-dimensional problem, in fact, to several problems of dimension at most 4. Although in principle these problems can be solved using elementary analytic techniques, the computations are extremely cumbersome. For this reason, in the following we present several auxiliary results to simplify the calculations.

Let $a \in [0,1]$ and $\ell \in \mathcal{L}_a$. To prove the inequalities in (18), we will first show that there exist $c_+, c_- \in [0,1]$ (depending on $\ell$) such that

$$\ell \le \ell^+_a \text{ in } [0, c_+] \quad \text{and} \quad \ell \ge \ell^+_a \text{ in } [c_+, 1), \qquad (51)$$

$$\ell \ge \ell^-_a \text{ in } [0, c_-] \quad \text{and} \quad \ell \le \ell^-_a \text{ in } [c_-, 1). \qquad (52)$$

To check (51), we note that the right derivative of $\ell$ at 0, $\ell'(0^+)$, is necessarily less than or equal to $1-a$.
Further, $\ell'(0^+) = 1-a$ if and only if $\ell = \ell^+_a$, as in this case $\ell^+_a$ is a supporting line of $\ell$ at 0. If $\ell'(0^+) < 1-a$, then $\ell - \ell^+_a$ is a continuous and convex function in $[0,1)$, starting at 0, with negative derivative at 0, and zero integral. Hence, there exists $c_+$ satisfying (51). To prove (52), observe that $\ell^-_a - \ell \le 0$ in $[0,a)$. Also, $\ell(a) = 0$ if and only if $\ell = \ell^-_a$. If $\ell(a) > 0$, then $\ell^-_a - \ell$ is a continuous and concave function in $[a,1)$ such that $\ell^-_a(1) - \ell(1^-) \ge 0$. As $\ell^-_a - \ell$ has zero integral, there exists $c_- \in (a,1)$ satisfying (52).

Now, for $0 \le t \le c_+$, by (51), we directly have that $\int_0^t \ell \le \int_0^t \ell^+_a$. For $c_+ \le t \le 1$, as $\ell$ and $\ell^+_a$ have the same integral in $(0,1)$, we have that $\int_0^t \ell = \int_0^1 \ell - \int_t^1 \ell \le \int_0^1 \ell^+_a - \int_t^1 \ell^+_a = \int_0^t \ell^+_a$. An analogous reasoning shows the second inequality in (18) and the proof is complete. ◻

Firstly, we carry out a detailed analysis of the maximum value of the distance $d_L(\ell, m)$ (for $\ell \in \mathcal{L}_a$ and $m \in \mathcal{L}_b$) when the number of crossing points of the functions $\ell$ and $m$ is at most one. In the following lemma we consider the simpler case in which the Lorenz curves are ordered.

Proof. By (7) and the triangle inequality, we always have that $d_L(\ell, m) \ge |a - b|$. Moreover, if the inequality above is an equality, we conclude that $\ell$ and $m$ are ordered since, up to exchanging the roles of $\ell$ and $m$, $|\ell - m| - (\ell - m)$ is a continuous and non-negative function on $(0,1)$ with zero integral. The last part follows from the fact that $d_L(\ell^+_a, \ell^+_b) = |a - b|$, where $\ell^+_a$ and $\ell^+_b$ are defined as in (12).

Observe that the minimal distance is attained when the curves $\ell \in \mathcal{L}_a$ and $m \in \mathcal{L}_b$ are (pointwise) ordered, i.e., when the underlying variables are ordered in the Lorenz sense. The second result (Lemma 5 below) analyzes the maximum distance between pairs of curves with only one sign switch. We need the following auxiliary lemma.

Lemma 4. For any convex function $\varphi : [0,1] \to \mathbb{R}$, we have that

Proof. For this proof we consider the original definition of $\mathcal{L}$ given by (6), i.e., any $\ell \in \mathcal{L}$ satisfies $\ell(1) = 1$ instead of $\ell(1) = \sup_{[0,1)} \ell$.
In this way, each $\ell \in \mathcal{L}_a$ is itself a distribution function of a random variable, say $X_\ell$, concentrated on the interval $[0,1]$. The inequalities in (18) together with the fact that all variables $X_\ell$ ($\ell \in \mathcal{L}_a$) have the same expectation imply (53); see Shaked and Shanthikumar (2006, Theorem 3.A.1).

We now establish the maximum value of the Lorenz distance for curves with one crossing point.

Lemma 5. Let $a, b \in [0,1]$ and consider $\ell \in \mathcal{L}_a$, $m \in \mathcal{L}_b$ and $\ell^\pm_a$, $\ell^\pm_b$ the curves defined in (12). We have that Moreover, it holds that

Proof. As in Lemma 4, in this proof we consider the definition of $\mathcal{L}$ given by (6), i.e., any $\ell \in \mathcal{L}$ satisfies $\ell(1) = 1$. Thus, each function in $\mathcal{L}$ is a distribution function of a random variable concentrated on $[0,1]$. We only show (a), as the proof of (b) is analogous. Let us consider the function $s_{t_0} : [0,1] \to \mathbb{R}$ defined as Clearly, the function $s_{t_0}$ is convex, Lipschitz and its distributional derivative is given by $s'_{t_0} = -1_{[0,t_0]} + 1_{(t_0,1]}$, where $1_A$ stands for the indicator function of the set $A$. By integration by parts in the Lebesgue-Stieltjes integral, we obtain that As $s_{t_0}$ is convex, by Lemma 4, we conclude that Now, from the expression of the Wasserstein distance between probability distributions on the line (see Vallender (1974)) and by virtue of the Kantorovich-Rubinstein duality (see Villani (2009, eq. (6.3))), we have that Putting together the relations above we obtain that Therefore, part (a) of this lemma is proved. To finish, we note that the curves $\ell^+_a$ and $\ell^-_b$ have one sign switch at the point $t_0 = a/(a+b-ab)$. The Lorenz distance in (54) can be directly computed by elementary geometry (as twice the sum of the areas of two triangles; see Figure 6), which completes the proof of the lemma.

Another observation that greatly simplifies the calculations is a symmetry argument. Given the graph of a function in $\mathcal{L}_a$, its reflection with respect to the line $y = 1 - x$ corresponds to the graph of another curve in $\mathcal{L}_a$.
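The symmetry can be illustrated numerically for piecewise affine curves before turning to the formal statement. The sketch below assumes that the Lorenz distance is $d_L(\ell, m) = 2\int_0^1 |\ell - m|$ and that $G(\ell) = 1 - 2\int_0^1 \ell$ (both inferred from the surrounding text, not quoted), and it represents a curve by its list of vertices; the function names are hypothetical. Reflecting the graph with respect to the line $y = 1 - x$ sends a vertex $(x, y)$ to $(1 - y, 1 - x)$.

```python
from bisect import bisect_right

def evaluate(vertices, x):
    """Value at x of the piecewise affine curve through the sorted vertices."""
    xs = [vx for vx, _ in vertices]
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    (x0, y0), (x1, y1) = vertices[i], vertices[i + 1]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

def reflect(vertices):
    """Reflect the graph with respect to y = 1 - x: (x, y) -> (1 - y, 1 - x)."""
    return sorted((1 - y, 1 - x) for x, y in vertices)

def integral(f, n=20_000):
    """Trapezoidal rule on [0, 1] with a uniform grid."""
    h = 1.0 / n
    vals = [f(i * h) for i in range(n + 1)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def gini(vertices):
    # ASSUMPTION: G(l) = 1 - 2 * int_0^1 l
    return 1 - 2 * integral(lambda x: evaluate(vertices, x))

def lorenz_distance(v1, v2):
    # ASSUMPTION: d_L(l, m) = 2 * int_0^1 |l - m|
    return 2 * integral(lambda x: abs(evaluate(v1, x) - evaluate(v2, x)))

# Two illustrative convex curves without a jump at 1 (hypothetical data).
ell = [(0.0, 0.0), (0.5, 0.2), (1.0, 1.0)]
m = [(0.0, 0.0), (0.3, 0.1), (1.0, 1.0)]

print(gini(ell), gini(reflect(ell)))
print(lorenz_distance(ell, m), lorenz_distance(reflect(ell), reflect(m)))
```

Each printed pair should agree up to discretization error, in line with the claim that the reflection preserves both the Gini index and the distance between curves.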
The precise definition, statement and proof are as follows.

Lemma 6. For $\ell \in \mathcal{L}$, define $\ell^{-1} : [0,1] \to [0,1]$ as in (2). Let us consider the function $\tilde\ell(x) = 1 - \ell^{-1}(1-x)$, $x \in [0,1]$. We have that the map $\ell \mapsto \tilde\ell$ is a bijective isometry from $(\mathcal{L}, d_L)$ to $(\mathcal{L}, d_L)$ whose inverse is itself. That is, for any $\ell, m \in \mathcal{L}$, we have that $d_L(\ell, m) = d_L(\tilde\ell, \tilde m)$. Moreover, for $a \in [0,1]$, the map $\ell \mapsto \tilde\ell$ is also a bijective isometry from $(\mathcal{L}_a, d_L)$ to $(\mathcal{L}_a, d_L)$.

Proof. As in the proof of Lemma 5, we consider the original definition of $\mathcal{L}$ given by (6), i.e., any $\ell \in \mathcal{L}$ satisfies $\ell(1) = 1$. Let $\ell \in \mathcal{L}$. To check the first assertion we will first show that $\tilde\ell \in \mathcal{L}$. Let us consider Indeed, by Proposition 10, $\ell$ is non-decreasing. We shall see that $\ell : [x_\ell, 1] \to \mathbb{R}$ is strictly increasing. Assume, by contradiction, that there exist $x_1, x_2$ with $x_\ell \le x_1 < x_2 \le 1$ and $\ell(x_1) = \ell(x_2)$. Then, the constant function $\ell(x_2)$ is a supporting line of $\ell$ at $(x_2, \ell(x_2))$, so, by convexity, $\ell \ge \ell(x_2)$. Since $x_2 > x_\ell$, by definition of $x_\ell$ we have that $\ell(x_2) > 0$. In particular $\ell(0) > 0$, which is a contradiction. We conclude that $\ell : [x_\ell, 1] \to \mathbb{R}$ is strictly increasing. By Proposition 10, $\ell|_{[0,1)}$ is continuous and, in particular, $\ell : (x_\ell, 1) \to (0, \ell(1^-))$ is a bijection. Moreover, $\ell|_{[0,x_\ell]} = 0$ and $\ell(1) = 1$. With these properties of $\ell$, it is immediate to check that $\ell^{-1}$ is given as in (55). Now, from (55), we obtain that The inverse of an increasing convex function is concave (see Simon (2011, Example 1.6)). Hence, the function $\ell^{-1}$ in (55) is concave, and this implies that $\tilde\ell$ in (56) is convex. In particular, it follows that $\tilde\ell \in \mathcal{L}$.

Next we will check that $\tilde{\tilde\ell} = \ell$. It can be easily seen that $\tilde\ell(1^-) = 1 - x_\ell$ and $x_{\tilde\ell} = 1 - \ell(1^-)$. Moreover, $\big(\tilde\ell|_{(x_{\tilde\ell},1)}\big)^{-1} : (0, 1 - x_\ell) \to (x_{\tilde\ell}, 1)$ is given by $\big(\tilde\ell|_{(x_{\tilde\ell},1)}\big)^{-1}(y) = 1 - \ell(1-y)$. Therefore, by (56), we conclude that $\tilde{\tilde\ell} = \ell$.

Next we need to prove that the map $\ell \mapsto \tilde\ell$ is an isometry. We consider $m \in \mathcal{L}$ and observe first that $\|\tilde\ell - \tilde m\|_{L^1} = \|\ell^{-1} - m^{-1}\|_{L^1}$.
Further, as $\ell$ and $m$ are distribution functions of random variables concentrated on the interval $[0,1]$, by using the expression of the Wasserstein distance for real-valued random variables (see Vallender (1974)), we obtain that $\int_0^1 |\ell - m| = \int_0^1 |\ell^{-1} - m^{-1}|$ and, hence, $d_L(\ell, m) = d_L(\tilde\ell, \tilde m)$.

Finally, to check the last assertion of the lemma it is enough to verify that, for $a \in [0,1]$ and $\ell \in \mathcal{L}_a$, one has $G(\tilde\ell) = a$. Let us fix $\ell \in \mathcal{L}_a$. We will equivalently show that $\int_0^1 \tilde\ell = (1-a)/2$. We apply Laisant's formula (see, for instance, Parker (1955)) for the integral of the inverse to obtain Therefore, From (56), we finally obtain that which completes the proof of the lemma.

Lemma 6 helps us to disregard some cases in the computation of the value of $M(a,b)$, as the next result shows.

Lemma 7. For $a, b \in [0,1]$, let $M(a,b)$ be defined in (11). We have that

Proof. We describe first how the extreme points of Theorem 1 are affected under the isometry $\ell \mapsto \tilde\ell$ defined in Lemma 6. From (9) and (56), we find that they are the piecewise affine functions such that As $\tilde n^a_{x_1,x_2} \notin \operatorname{Ext}(\mathcal{L}_a)$, by Proposition 13 and since $\ell \mapsto \tilde\ell$ is an isometry (see Lemma 6), we can exclude the functions $n^a_{x_1,x_2}$ in the computation of the maximum given in (50).

Here, we will combine all the previous results to prove Theorem 2. We start with the following lemma.

Lemma 8. For $a, b \in [0,1]$, let $d_1(a,b)$ and $d_2(a,b)$ be defined in (57). We have that where the curves $\ell^\pm_a$, $\ell^\pm_b$ are defined as in (12).

Proof. It is easy to see that the curves $\ell^a_{x_1}$ and $\ell^b_{y_1}$ cannot have two crossing points, and neither can $m^a_{x_2}$ and $m^b_{y_2}$. Therefore, the conclusion follows from Lemmas 3 and 5.

We are then led to the computation of $d_3(a,b)$, i.e., the maximum value of $d_L(\ell^a_{x_1}, m^b_{x_2})$, when $x_1 \in [0,a]$ and $x_2 \in (b,1)$. This question is addressed in the following result.

Lemma 9. For $a, b \in [0,1]$, let $d_3(a,b)$ be defined in (57). We have that

Proof. Let $x_1 \in [0,a]$ and $x_2 \in (b,1)$.
Thanks to Lemmas 3 and 5, we can assume that the curves $\ell^a_{x_1}$ and $m^b_{x_2}$ cross each other twice; Figure 18 represents this situation. This happens if and only if the triangle with vertices has positive orientation; equivalently, if and only if In this case, they cross each other at the points $(x^*_1, y^*_1)$ and $(x^*_2, y^*_2)$, with $x^*_1 \le x^*_2$ and

Figure 18: A graphical representation of the functions $\ell^a_{x_1}$ (in black) and $m^a_{x_2}$ (in blue), as well as their crossing points, $(x^*_1, y^*_1)$ and $(x^*_2, y^*_2)$.

For simplicity, from now on we will call $A = A(x_1, x_2) = d_L(\ell^a_{x_1}, m^b_{x_2})$. The value of $A$ can therefore be calculated as twice the sum of the areas of the three triangles (see Figure 18). So we are led to the maximization of $A = A(x_1, x_2)$ under the constraints (58). Obviously, the supremum under (58) coincides with the maximum under the analogous inequalities of (58) but replacing the '<' with '≤', and '>' with '≥'. Observe that $A$ is a rational function, and, hence, the computation of its maximum in region (58) can be done by elementary techniques. Nevertheless, the computations are extremely long, so we have used the programme Mathematica (2012) in the rest of the proof to avoid unnecessary details. We find that the only solution of $\partial A/\partial x_1 = \partial A/\partial x_2 = 0$ under (58) is which in fact requires a + b 2 < 2b. The corresponding value of $A$ is which is seen to be less than the value of $d_L(\ell^-_a, \ell^+_b)$ (computed in Lemma 5), thanks to the restriction a + b 2 < 2b.

After having checked the value of $A$ at the critical points in the interior of region (58), we analyze the value of $A$ on the boundary. Describing the boundary of region (58) is cumbersome since it involves several cases, according to the values of $a$, $b$. In any case, the boundary is clearly contained in the set $\ell^a_{x_1}$ and $m^b_{x_2}$ do not have a proper crossing, so, by Lemma 3, $A = |b - a|$, which does not yield a maximum.
Therefore, we are led to the maximization of $A(x_1, x_2)$ in the set The value of $A$ when $x_1 = 0$ is which is decreasing in $x_2$, so the maximum is attained at $x_2 = b$ and equals which is increasing in $x_2$, so the maximum is attained at $x_2 = 1$ and equals The value of $A$ when $x_2 = b$ is which is decreasing in $x_1$, so the maximum is attained at $x_1 = 0$ and equals The value of $A$ when $x_2 = 1$ is which is increasing in $x_1$. Therefore, the maximum is attained at $x_1 = a$ and equals This concludes the proof.

We finally observe that the proof of Theorem 2 directly follows from Lemmas 5, 7, 8, and 9.

Proof of Proposition 4 (The set $\Delta^*$). This result follows from Corollary 1.

Proof of Proposition 5 (Properties of the indices). We will only consider the index $I^*$ as the corresponding proofs for $I_*$ are analogous. Parts (i) and (ii) are fulfilled by construction. Parts (1) and (2) of (iii) are consequences of Lemma 3, while (iii) (3) is fulfilled by the definition of extremal Lorenz curves (see Definition 1). Part (iv) (1) is trivial while (iv) (2) and (3) follow from (iii). Finally, part (v) holds by dominated convergence. ◻

Proof of Proposition 6 (Strong consistency). By the triangle inequality, we obtain that From Goldie (1977), we have that $\hat\ell_j \to \ell_j$ (uniform convergence) a.s., as $n_j \to \infty$ (for $j = 1, 2$). Using dominated convergence, we therefore have that the right-hand side of (59) goes to 0 a.s. as $n_j \to \infty$, and, consequently, $\hat I(\ell_1, \ell_2) \to I(\ell_1, \ell_2)$ a.s. Moreover, as the maps $t^*$ in (22) and $t_*$ in (25) are continuous, we also conclude that $\hat I^*(\ell_1, \ell_2) \to I^*(\ell_1, \ell_2)$ and $\hat I_*(\ell_1, \ell_2) \to I_*(\ell_1, \ell_2)$ a.s. ◻

Proof of Proposition 7 (Asymptotic behaviour).
Let us consider the map $\varphi$ : From (19), we obviously have that

$$\sqrt{\frac{n_1 n_2}{n_1 + n_2}} \big( I(\hat\ell_1, \hat\ell_2) - I(\ell_1, \ell_2) \big) = \sqrt{\frac{n_1 n_2}{n_1 + n_2}} \big( \varphi(\hat\ell_1, \hat\ell_2) - \varphi(\ell_1, \ell_2) \big). \qquad (60)$$

Further, from Lemma 2, it is easy to check that $\varphi$ is Hadamard directionally differentiable at $(\ell_1, \ell_2)$ with derivative given by On the other hand, by Lemma 1 (and the independence assumption of the samples) we also have that Therefore, from (60)-(61), the application of the extended version of the functional delta method (see Shapiro (1990, Theorem 2.1)) yields the result. ◻

Proof of Corollary 3 (Asymptotic normality). The equivalence between (a) and (b) follows from (32). As L is a centered Gaussian process, the distribution in (35) is normally distributed with mean (0, 0), so (b) implies (c). Conversely, if (c) holds, we have that the variable has zero mean normal distribution. Therefore, the set $\{\ell_1 = \ell_2\}$ has zero Lebesgue measure, since otherwise, the first summand in the previous equation would have normal distribution, which is clearly not possible.

The simulated values of the normalized empirical index, $\sqrt{\frac{n_1 n_2}{n_1 + n_2}} \big( I(\hat\ell_1, \hat\ell_2) - I(\ell_1, \ell_2) \big)$, appear in Figure 20. The green point is the origin (0,0). The level sets in red are those of a mixture of normal densities fit (with the R library mclust) to the 1000 simulations of the inequality index. The number of components in the mixture and the parameterization of the covariance matrices of the Gaussian components were chosen via the Bayesian Information Criterion (BIC). The points in the mixture components with weight lower than 10% are colored in blue: we are thus able to detect the points in the data cloud which contribute most to the lack of normality. The convergence to normality is clear as $n$ increases.

Model 2: Since in this model the variables $X_1$ and $X_2$ are ordered ($X_2 \le_L X_1$, see Figure 21), the bidimensional index I lies on $L_2 = \{(-x, x) : x \in [0,1]\}$ and only one of the components of I is of interest. We take the second component, $d_L(\ell_1, \ell_2)$, as it is positive.
for $n = 10000$ and $n = 50000$. We have superimposed a mixture of normal densities (red line with higher weight and blue line with lower weight), whose number of components and homo/heteroscedasticity were determined with the BIC. Even though the number of components was 2 for both sample sizes, the histograms have Gaussian appearance and the component of the mixture with the highest weight (black line) is almost coincident with the mixture density. In Model 2, as the index I is one-dimensional, the convergence to normality is faster in $n$.

The structure of the graphics is analogous to those of Model 1. The 1000 simulated values of the normalized $\hat I$ appear in Figure 24. We have superimposed the level sets (in red) of a mixture of Gaussian densities fit to these points. As before, BIC was used to choose the optimal parameters in the mixture. The points in the mixture component with weight lower than 10% are colored in blue. The convergence to normality is clear as $n$ increases, but it is slower than in Model 1 as the Lorenz curves in Model 3 are closer to each other than in Model 1.

In this case we have taken the sample sizes equal to $n = 10^4$ and $n = 10^5$ to illustrate that there is no convergence in distribution to normality. The 1000 simulated values of the empirical inequality index $\hat I$ appear in Figures 26 and 27. The green dot marks the population index I. We see that there is consistency of the empirical inequality index $\hat I$, but the rate of convergence seems slow.

Model 5: Recall that in Model 5 the variables $X_1$ and $X_2$ follow the same distribution, so the real value of I is (0,0). The 1000 simulated values of the normalized empirical index $\hat I$ appear in Figure 28.

8 Application to EU-SILC income data

8.1 Yearly evolution of inequality in Spain with respect to 2008

In Figures 29-32, on the left we display the Lorenz curves, $\hat\ell_1$ and $\hat\ell_2$, of Spanish income for 2008 and a year between 2009 and 2019, respectively.
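Empirical Lorenz curves and a plug-in version of the bidimensional index of the kind displayed in these figures can be sketched as follows. This is an illustration under assumptions, not the authors' code: the index is taken to be $I = (G(\ell_2) - G(\ell_1), d_L(\ell_1, \ell_2))$ with $d_L = 2\int_0^1 |\ell_1 - \ell_2|$ (a convention inferred from the description of the frontier $L_2$ in the simulations), the empirical Lorenz curve is the piecewise affine interpolation of the normalized cumulative sums of the ordered sample, and the function names and the simulated incomes are hypothetical.

```python
import random

def empirical_lorenz(sample):
    """Vertices (i/n, S_i/S_n) of the piecewise affine empirical Lorenz curve."""
    ordered = sorted(sample)
    total = sum(ordered)
    n = len(ordered)
    vertices, running = [(0.0, 0.0)], 0.0
    for i, value in enumerate(ordered, start=1):
        running += value
        vertices.append((i / n, running / total))
    return vertices

def evaluate(vertices, t):
    """Linear interpolation; the vertices are equally spaced in the first coordinate."""
    n = len(vertices) - 1
    i = min(int(t * n), n - 1)
    (t0, y0), (_, y1) = vertices[i], vertices[i + 1]
    return y0 + (y1 - y0) * (t - t0) * n

def plug_in_index(sample1, sample2, grid=5_000):
    """ASSUMED convention: (G(l2) - G(l1), 2 * int |l1 - l2|), trapezoidal rule."""
    l1, l2 = empirical_lorenz(sample1), empirical_lorenz(sample2)
    h = 1.0 / grid
    diff = [evaluate(l1, k * h) - evaluate(l2, k * h) for k in range(grid + 1)]
    signed = h * (sum(diff) - 0.5 * (diff[0] + diff[-1]))
    absolute = h * (sum(abs(d) for d in diff)
                    - 0.5 * (abs(diff[0]) + abs(diff[-1])))
    # G(l) = 1 - 2 * int l, hence G(l2) - G(l1) = 2 * int (l1 - l2).
    return 2 * signed, 2 * absolute

random.seed(0)
unequal = [random.lognormvariate(0.0, 1.0) for _ in range(500)]  # hypothetical incomes
equal = [1.0] * 100                                              # perfect equality

print(plug_in_index(unequal, equal))
print(plug_in_index(unequal, unequal))
```

When the second sample is perfectly equal its Lorenz curve dominates the first one, so the two components should be opposite numbers, that is, the index falls on the left frontier; for identical samples the index is the origin.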
Since the two Lorenz curves are always very similar and it is difficult to appreciate the change from 2008 to the other year, on the right we plot the difference of the two Lorenz curves, $\hat\ell_1 - \hat\ell_2$, scaled by the supremum norm of this difference, $\|\hat\ell_1 - \hat\ell_2\|_\infty$.

In Figure 38, on the left we display the Lorenz curves, $\hat\ell_1$ and $\hat\ell_2$, of income in Spain and Portugal, respectively, for a year between 2008 and 2019. Since the two Lorenz curves are close to each other, on the right we plot the difference of the two Lorenz curves, $\hat\ell_1 - \hat\ell_2$, scaled by $\|\hat\ell_1 - \hat\ell_2\|_\infty$.

References

Infinite Dimensional Analysis: A Hitchhiker's Guide
"I think the middle class is disappearing": Crisis perceptions and consumption patterns in Spain
Nonparametric tests of stochastic dominance in income distributions
Majorization and the Lorenz Order with Applications in Applied Mathematics and Economics
Inequality: What can be Done?
Consistent tests for stochastic dominance
Central limit theorems for the Wasserstein distance between the empirical and the true distributions
Copulas and temporal dependence
Consistent nonparametric tests for Lorenz dominance
Inequality in Germany: Myths, Facts, and Policy Implications, ifo Working Paper
Tests for the second order stochastic dominance based on L-statistics
Ethical indices for the measurement of poverty
How did the great recession affect income inequality in Spain?
economic sociology the european electronic newsletter
The relativity of decreasing inequality between countries
The Globalization of Inequality
Functional analysis, Sobolev spaces and partial differential equations
Inverse stochastic dominance, majorization, and mean order statistics
Integrated empirical processes in $L^p$ with applications to estimate probability metrics
Measuring Inequality
Using the GB2 income distribution
Weak approximations for empirical Lorenz curves and their Goldie inverses of stationary observations
Wealth inequality and private savings: the case of Germany. International Monetary Fund Working Paper
Making inequality comparisons when Lorenz curves intersect
On nondifferentiable functions and the bootstrap. Probability Theory and Related Fields
Lessons from Spain's recovery after the euro crisis
Measuring statistical evenness: A panoramic overview
On consistent hypothesis testing
Measure Theory and Fine Properties of Functions
Inference on directionally differentiable functions. The Review of Economic Studies
From concentration profiles to concentration maps
Sulla misura della concentrazione e della variabilità dei caratteri
Convergence theorems for empirical Lorenz curves and their inverses
Classical Fourier Analysis
More equal and poorer, or richer but more unequal? Economic Quality Control
International Monetary Fund. Spain: 2020 Article IV Consultation-Press Release; Staff Report; and Statement by the Executive Director for Spain
Social inequality and its consequences in the twenty-first century
Essays on Asymptotic Methods in Econometrics. Doctoral thesis
Asymptotic theory of L-statistics and integrable empirical processes
Statistical Size Distributions in Economics and Actuarial Sciences
Integrals of inverse functions
Preferred by "all" and preferred by "most" decision makers: Almost stochastic dominance
Methods of measuring the concentration of wealth
The generalized beta distribution as a model for the distribution of income: estimation of related measures of inequality
Inequality, poverty and social welfare in Greece: distributional effects of austerity
Lectures on Choquet's theorem
On the extreme points of moments sets
A simple and effective inequality measure
The Methods of Distances in the Theory of Probability and Statistics
Why Banks Fail. The Political Roots of Banking Crises in Spain
Parametric Lorenz curves: Models and applications
Stochastic Orders
On concepts of directional differentiability
Asymptotic analysis of stochastic programs
Convexity
Stochastic Dominance and Applications to Finance, Risk and Economics
The Price of Inequality: How Today's Divided Society Endangers our Future
Improved nonparametric bootstrap tests of Lorenz dominance
Calculation of the Wasserstein distance between probability distributions on the line
Weak Convergence and Empirical Processes: With Applications to Statistics
Optimal transport: old and new
Extreme points of moment sets
The Gini Methodology: A Primer on a Statistical Methodology
Testing Lorenz curves with non-simple random samples
Almost Lorenz dominance

Extremal points of Lorenz curves and applications to inequality analysis. Supplementary material, 2021

the R package GB2, for each model we have generated 1000 Monte Carlo samples with sample size $n$, that is, $n_1 = n_2 = n$ from two variables $X_1 \sim \mathrm{GB2}(a_1, b_1, p_1, q_1)$ and $X_2 \sim \mathrm{GB2}$

Table 1: Summaries for the annual household equivalised disposable incomes in Spain

$\hat\ell_1$ and $\hat\ell_2$, of the equivalised household income in Greece and Finland, respectively, for a year between 2004 and 2019. The two Lorenz curves are close to each other and it is difficult to appreciate the full detail of their differences (for instance, one could think that Finnish income is less than Greek income in the Lorenz order for any of the years)

Comparing inequality between Greece and Portugal

Bidimensional inequality index I for income data from Greece ($X_1$) and Portugal ($X_2$)