key: cord-0179404-v9qhudt6 authors: Csató, László title: Group draw with unknown qualified teams: A lesson from the 2022 FIFA World Cup draw date: 2022-03-31 journal: nan DOI: nan sha: bafdb60de989416bd8319354a97444ce91eed9e4 doc_id: 179404 cord_uid: v9qhudt6

The draw for the 2022 FIFA World Cup has been organised before the identity of three winners of the play-offs is revealed. Seeding has been based on the FIFA World Ranking released on 31 March 2022, but the placeholders for these three teams have been drawn from the weakest Pot 4. We show that the official seeding policy does not balance the difficulty levels of the groups to the extent possible: a better alternative would have been to assign the placeholders according to the highest-ranked potential winner, similar to the rule used in the UEFA Champions League qualification. Our simulations reinforce that this is the best strategy in general to create balanced groups in the FIFA World Cup.

Inspired by the criticism of Guyon (2014) and Guyon (2015), FIFA has reformed the draw of the 2018 World Cup in order to produce balanced groups (Guyon, 2018; Cea et al., 2020): according to the classical scheme, the 32 national teams have been divided into four pots based on the FIFA World Ranking (except for favouring the host Russia by assigning it to the strongest pot), and each group has consisted of one team from each pot.

However, because of the COVID-19 pandemic and the 2022 Russian invasion of Ukraine, FIFA has been forced to draw the groups of the 2022 World Cup while the identity of three teams has remained unknown. The draw has taken place on 1 April 2022, while the two inter-confederation play-offs are contested in June 2022 and the qualification match(es) of Ukraine have been rescheduled to the same month (FIFA, 2022a). This raises a problem since seeding is based on the FIFA World Ranking released on 31 March 2022, in which a placeholder has no position of its own.
The Organising Committee for FIFA Competitions has decided to assign the three placeholders to Pot 4, that is, among the weakest teams. We show that this questionable policy unnecessarily worsens the balance in the strengths of the groups. A better outcome can be provided by assigning the unknown placeholders according to the highest-ranked remaining team in each undecided contest. The proposed rule is currently used in the qualifications for the European club competitions (Csató, 2022d). The solution chosen by FIFA is detrimental to some national teams, including Ukraine.

To generalise our finding, three policies are compared in a stylised model of the FIFA World Cup draw with respect to the expected strengths of the groups: the placeholder from an undecided contest is assigned (1) according to the highest-ranked participant; (2) according to the lowest-ranked participant; or (3) to the weakest pot. The first option turns out to be the best to produce balanced groups.

Naturally, our paper has antecedents in the extant literature. The unevenness of the groups in the 1990 (Jones, 1990), 2006 (Rathgeber and Rathgeber, 2007), 2014 (Guyon, 2015), and 2018 FIFA World Cups (Csató, 2022a) has already been demonstrated. Cea et al. (2020), Guyon (2015), and Laliena and López (2019) have proposed draw systems for sports tournaments in the presence of geographical or seeding restrictions to create balanced groups with roughly the same competitive level. However, all these suggestions require a fundamentally new draw procedure, which is unlikely to be adopted soon. In contrast, our recommendation builds on a principle already used by the Union of European Football Associations (UEFA).

The remainder of the work is structured as follows. Section 2 summarises the rules of the 2022 FIFA World Cup draw. The methodology of our analysis is described in Section 3, and the findings are presented in Section 4.
Section 5 attempts to derive general results, while Section 6 concludes.

This section describes the draw procedure that has been used in the 2022 FIFA World Cup draw on 1 April 2022 (FIFA, 2022b). It determines the allocation of 29 qualified teams, the winners of two inter-confederation play-offs, and the placeholder of a UEFA play-off slot into eight groups of four teams each. In addition, we argue for an alternative draw procedure.

The 32 teams are divided into four pots on the basis of the FIFA World Ranking released on 31 March 2022, which already takes the results of qualification games played in the March 2022 international match window into account:
• Pot 1 contains the host Qatar, automatically assigned to Group A, and the seven highest-ranked teams;
• Pot 2 contains the teams ranked 8th to 15th;
• Pot 3 contains the teams ranked 16th to 23rd;
• Pot 4 includes the teams ranked 24th to 28th plus the two placeholders from the two inter-confederation play-offs and the winner of the UEFA play-off Path A.

The inter-confederation play-offs are scheduled to be played on 13-14 June 2022 in Qatar. Two matches in the UEFA play-off Path A have been postponed out of necessity due to the Russian invasion of Ukraine, as one semifinal and possibly the final involve Ukraine.

The draw sequence starts with Pot 1 and ends with Pot 4. Each pot is emptied before moving on to the next one. Some draw conditions apply to ensure geographic separation (FIFA, 2022b):
• No group can have more than one team from any continental confederation other than UEFA (that is, from AFC, CAF, CONMEBOL, or CONCACAF).
• Each group should consist of at least one but no more than two European teams. Since the 2022 World Cup will be contested by 13 UEFA members, five out of the eight groups are guaranteed to include two teams from Europe.

For the placeholders of the two inter-confederation play-offs, these conditions are applied on the basis of the confederations of both potential winners.
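The geographic conditions above can be illustrated with a small validity check. This is only a sketch under stated assumptions, not the procedure used by FIFA: the function names, the dictionary layout (team name mapped to a set of confederations), and the treatment of a placeholder as counting towards every confederation it may come from are illustrative choices made here.

```python
from collections import Counter


def confederation_counts(group):
    """Count teams per confederation in a group.

    `group` maps a team name to the set of confederations it may belong to.
    A play-off placeholder lists the confederations of both potential
    winners, so it counts towards each of them (assumption of this sketch).
    """
    counts = Counter()
    for confederations in group.values():
        counts.update(confederations)
    return counts


def is_valid_group(group, complete=True):
    """Check the two geographic draw conditions for one group.

    At most one team per non-UEFA confederation, at most two UEFA teams,
    and (only for a complete group) at least one UEFA team.
    """
    counts = confederation_counts(group)
    for confederation, n in counts.items():
        limit = 2 if confederation == "UEFA" else 1
        if n > limit:
            return False
    if complete and counts["UEFA"] < 1:
        return False
    return True


# Illustrative groups; the placeholder name is hypothetical.
allowed = {
    "Qatar": {"AFC"},
    "Netherlands": {"UEFA"},
    "Senegal": {"CAF"},
    "Ecuador": {"CONMEBOL"},
}
blocked = {
    "Qatar": {"AFC"},
    "Netherlands": {"UEFA"},
    "Senegal": {"CAF"},
    "AFC/CONMEBOL play-off winner": {"AFC", "CONMEBOL"},
}
print(is_valid_group(allowed), is_valid_group(blocked))
```

The second group is rejected because the placeholder might turn out to be an AFC team and would then share a group with Qatar. Calling `is_valid_group(..., complete=False)` checks only the upper bounds, which is the relevant test while a group is still being filled.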
Even though the official overview of the draw procedure (FIFA, 2022b) does not specify how the draw constraints are met, clearly, the standard FIFA/UEFA procedure (Csató, 2022a) is used. In particular, the team drawn is placed in the first available group in alphabetical order as indicated by the computer such that any deadlock situation (when the teams still to be drawn cannot be allocated into the remaining slots without violating a draw condition) is prevented.

For instance, assume that Groups F, G, and H contain Senegal, Morocco, and Tunisia (all CAF) from Pot 3, respectively, whereas Cameroon and Ghana (both CAF) are among the five remaining teams in Pot 4. If the next empty slot in alphabetical order is in Group D and the fourth team drawn from Pot 4 is neither Cameroon nor Ghana, this team cannot be assigned to Group D: otherwise, two African countries would have to be allocated to the four remaining groups, three of which are prohibited for CAF teams, which is impossible. This procedure is explained in a video available at https://www.youtube.com/watch?v=jDkn83FwioA through the example of the 2018 FIFA World Cup.

The mechanism has first been proposed in Guyon (2014) for the FIFA World Cup draw and has been adopted by FIFA in 2018 (Guyon, 2018). It has already received serious scrutiny in the literature (Boczoń and Wilson, 2018; Csató, 2022a; Klößner and Becker, 2013).

The assignment of the three placeholders representing the winners of the play-offs to the weakest pot is a questionable decision since they can be relatively strong teams, as we will see later.
The same problem arises in the qualification stages of the UEFA Champions League and the UEFA Europa Conference League but it is treated in a different way: "If, for any reason, any of the participants in such rounds are not known at the time of the draw, the coefficient of the club with the higher coefficient of the two clubs involved in an undecided tie is used for the purposes of the draw" (UEFA, 2021a,b, Article 13.03). According to this policy, the placeholders of the play-offs that are still to be contested for the FIFA World Cup should be placed in a pot based on the highest-ranked potential winner instead of Pot 4.

Due to the assignment of the host to Group A, the 2022 FIFA World Cup draw has $7! \times (8!)^3 \approx 3.3 \times 10^{17}$ possible outcomes without accounting for geographic restrictions. Even though these criteria significantly decrease the number of feasible solutions, it is still impossible to exactly calculate the probability of each assignment. Furthermore, the consequences of choosing a particular seeding regime can only be uncovered if the results of matches played in the play-offs and the groups are determined. To that end, computer simulations will be used as recommended in the literature on tournament design (Scarf et al., 2009).

Since two teams from each group advance to the Round of 16, a group is usually judged to be tough when three teams have high rankings, even if the fourth one is much weaker (Guyon, 2015; Laliena and López, 2019). Hence, our measure of group strength will be the weighted average of the ratings of the four participants, where the weight of the strongest, the second strongest, and the third strongest team is two, whereas the weight of the weakest team is one.

The abilities of the teams will be quantified in two ways. The first is the rating points in the FIFA World Ranking of March 2022, underlying the pot allocation.
Although FIFA has adopted the Elo method of calculation after the 2018 FIFA World Cup (FIFA, 2018a,b), the current FIFA World Ranking does not take home advantage and the margin of victory into account. Both factors are considered in the World Football Elo Ratings (http://eloratings.net), which is a widely used benchmark in the literature (Cea et al., 2020; Csató, 2022b; Guyon, 2014, 2015; Lasek et al., 2013, 2016). This will provide the second measure for the strengths of the teams.

As the FIFA ranking is somewhat slow to react to the changing skill level of the teams (the example of Canada and Ecuador will be seen later) and is still influenced by the transformation from the old ranking method in 2018 (Ecuador has a real difficulty gaining enough points to climb substantially since it mainly plays against other South American teams), the outcomes of all matches will be simulated on the basis of the World Football Elo Ratings.

A traditional choice for the distribution of the number of goals in soccer is the Poisson distribution (Chater et al., 2021; Maher, 1982; Van Eetvelde and Ley, 2019). Then the probability that team $i$ scores $k$ goals against team $j$ is

$$P_{ij}(k) = \frac{\left( \lambda_{ij}^{(f)} \right)^{k} \exp \left( -\lambda_{ij}^{(f)} \right)}{k!},$$

where the expected number of goals scored by team $i$ against team $j$ is $\lambda_{ij}^{(f)}$ if the match is played on field $f$ (home: $f = h$; away: $f = a$; neutral: $f = n$).

Football rankings (2020) determines parameter $\lambda_{ij}^{(f)}$ as a quartic polynomial of the win expectancy $W_{ij}$ of team $i$ against team $j$, which is

$$W_{ij} = \frac{1}{1 + 10^{-(E_i - E_j)/400}},$$

$E_i$ and $E_j$ being the Elo ratings of the two teams, respectively. The rating of the home team is increased by 100 to reflect home advantage. The exact formulas are estimated by a least squares regression based on more than 29 thousand home-away matches and almost 10 thousand games played on neutral ground between national football teams. In addition, they contain a regime change at $W_{ij} = 0.9$ since unbalanced games usually mean an excessive number of goals.
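The two building blocks of this match model, the Elo win expectancy and the Poisson probability of a given number of goals, can be sketched as follows. The logistic form with a scale of 400 is the standard Elo convention and is assumed here; the function names are illustrative. The fitted quartic polynomial that maps the win expectancy to the expected number of goals is not reproduced, so the Poisson mean `lam` is simply an input.

```python
import math

HOME_ADVANTAGE = 100  # Elo points added to the home team's rating


def win_expectancy(elo_i, elo_j, i_at_home=False):
    """Elo win expectancy of team i against team j (standard 400-point scale)."""
    if i_at_home:
        elo_i = elo_i + HOME_ADVANTAGE
    return 1.0 / (1.0 + 10 ** (-(elo_i - elo_j) / 400))


def poisson_pmf(k, lam):
    """Probability of scoring exactly k goals when the expected number is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


# Illustrative ratings only, not taken from any ranking table.
w = win_expectancy(1850, 1750)
print(round(w, 3))
print([round(poisson_pmf(k, 1.4), 3) for k in range(5)])
```

Equal ratings give a win expectancy of exactly 0.5, and a 100-point deficit is exactly offset by playing at home, which is a quick sanity check on the home-advantage adjustment.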
Most of the games are played on neutral ground, where the expected number of goals for team $i$ against team $j$ is the neutral-field value $\lambda_{ij}^{(n)}$. However, there are some home-away matches (two in the UEFA play-offs and the three group matches of Qatar) to be simulated, where the expected number of goals for the home team equals $\lambda_{ij}^{(h)}$ with $R^2 = 0.984$, and the expected number of goals for the away team is given by $\lambda_{ij}^{(a)}$.

The same simulation model has been used recently to quantify the incentive incompatibility of the European Qualifiers for the 2022 FIFA World Cup (Csató, 2022b) and the unfairness of the 2018 FIFA World Cup qualification (Csató, 2022c).

The play-offs contain single-game matches, hence draws are not allowed. If the two teams score the same number of goals, the winner is chosen randomly. This effectively means that there is no goal in extra time and the penalty shootout provides equal chances for the two teams, independently of the field of the game.

The ranking of the teams in the groups is determined according to the following criteria:
(a) greatest number of points obtained in all group matches;
(b) goal difference in all group matches;
(c) greatest number of goals scored in all group matches;
(d) drawing of lots.

Table 1 shows the composition of the pots, as well as the two measures of strength for the teams. Three play-off paths are not yet finished at the time of the draw. The corresponding matches are listed in Table 2. Finally, the alternative pot allocation is presented in Table 3.

4. All group matches are played; group rankings and the set of qualified teams are obtained.

All simulations are carried out 1 million times to smooth the effect of random fluctuations.

In the following, our findings about the 2022 FIFA World Cup draw will be presented. In particular, Section 4.1 addresses the balance across groups by quantifying their competitive level. The consequences of the official seeding regime with respect to the probability of qualification are uncovered in Section 4.2.
Two seeding rules have been outlined in Section 2 and two measures of group strength have been suggested in Section 3. In each simulation run, the groups have been ordered according to their strength, and the averages of these values over the 1 million iterations have been computed. Group A is treated separately since Qatar is guaranteed to play there.

Figure 1 focuses on the expected difficulty levels of the groups. The alternative seeding regime implies a smaller variance in the average strength of Groups B-H under both measures. According to the FIFA World Ranking, the expected strength of the strongest group is reduced by our proposal, and the expected strengths of all other groups (except for Group A) are increased. Even though the average difficulty level of the strongest group does not change under the World Football Elo Ratings, the weakest groups contain better teams, thus, the groups are more balanced overall. It is substantially easier to qualify from Group A. However, this is caused by assigning the host Qatar there, a decision that is not debated in the current paper. The advantage of the recommended allocation rule is rather small but it improves balance at a minimal cost, if at all.

Figure 2 reinforces this message by presenting the distribution of group strengths. Clearly, the probability of a "group of death" is diminished if it is identified by the FIFA World Ranking. Note the case of Group A again, which accounts for the distributions having two modes. However, both distributions are more "peaky" around the primary mode under the alternative seeding regime, implying a more balanced level of difficulty across the groups.

Advancement to the Round of 16 is a zero-sum game. Consequently, if there are two competing draw procedures, one of them will favour some nations compared to the other. Figure 3 presents the effect of the official seeding rule compared to our proposal, which provides a more balanced outcome as can be seen in Figures 1-2.
Five countries (Mexico, the United States, Serbia, Poland, and the winner of the AFC vs CONMEBOL play-off) lose more than one percentage point in the probability of qualification. South Korea and Tunisia benefit from being assigned to Pot 3 rather than Pot 4. Uruguay is better off because the official seeding rule places two strong CONMEBOL teams (Ecuador and the winner of an inter-confederation play-off) in Pot 4 instead of only one, implying that the expected opponent of Uruguay from Pot 4 will be weaker. The relative effects (Figure 4) are mitigated for the teams drawn from Pots 1 and 2 but can reach or even exceed 3-4% for weaker teams. There is a positive correlation among nations in the same association and pot: Mexico and the United States, the five European [...]

So far, only the specific case of the 2022 FIFA World Cup has been examined. Therefore, it remains uncertain whether our proposal is universally advantageous concerning group balancedness. To that end, three seeding options are compared in a stylised model of the FIFA World Cup draw with an unknown placeholder, the winner of a play-off tie:
• There are 33 teams;
• The strength of team $i$ ($0 \leq i \leq 32$) is $33 - i$;
• One randomly chosen team is the host, automatically assigned to Pot 1 and placed in Group A;
• Two teams contest a play-off to be played after the draw;
• The teams are assigned to the pots according to their strength, except for the host and the winner of the play-off;
• Eight groups are formed by randomly selecting a team from each of the four pots;
• There are no draw constraints;
• Group strength is measured as before.
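The stylised model listed above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code behind the reported results: the function names are hypothetical, the host is assumed not to take part in the play-off, and the strength used to seed the placeholder is left as a free parameter (`placeholder_seed`) so that different seeding policies can be plugged in. Group strength is the weighted average from Section 3, with weight two for the three strongest teams and weight one for the weakest.

```python
import random

WEIGHTS = (2, 2, 2, 1)  # three strongest teams count double, the weakest once


def group_strength(strengths):
    """Weighted average of the four team strengths in a group."""
    if len(strengths) != 4:
        raise ValueError("a group has exactly four teams")
    ordered = sorted(strengths, reverse=True)
    return sum(w * s for w, s in zip(WEIGHTS, ordered)) / sum(WEIGHTS)


def draw_groups(host, contestants, winner, placeholder_seed, rng=random):
    """Run one draw and return eight groups as lists of actual strengths.

    Teams are indexed 0..32 and team t has strength 33 - t. The play-off
    winner is seeded with `placeholder_seed` but plays with its own strength.
    """
    if host in contestants or winner not in contestants:
        raise ValueError("host must not play the play-off; winner must be a contestant")
    strength = lambda t: 33 - t
    known = [t for t in range(33) if t != host and t not in contestants]
    # (seeding value, actual strength) for the 31 entries other than the host
    entries = [(strength(t), strength(t)) for t in known]
    entries.append((placeholder_seed, strength(winner)))
    entries.sort(key=lambda e: e[0], reverse=True)
    actual = [a for _, a in entries]
    pots = [actual[0:7], actual[7:15], actual[15:23], actual[23:31]]
    for pot in pots:
        rng.shuffle(pot)
    pot1 = [strength(host)] + pots[0]  # the host is fixed in Group A
    return [[pot1[g], pots[1][g], pots[2][g], pots[3][g]] for g in range(8)]


# Example: teams 5 and 30 contest the play-off, the weaker one (30) wins,
# and the placeholder is seeded with the strength of team 5.
groups = draw_groups(host=20, contestants=(5, 30), winner=30,
                     placeholder_seed=33 - 5, rng=random.Random(0))
print([round(group_strength(g), 2) for g in groups])
```

Passing a `placeholder_seed` below 1 pushes the placeholder to the last slot of Pot 4, whereas passing the strength of one of the two contestants seeds it as if that team had already qualified.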
We consider three options for how to seed the winner of the play-off contested by teams $i$ and $j$:
• Seeding A: it is assigned according to the strength of the better team, which is equal to $\max \{ 33 - i; 33 - j \}$;
• Seeding B: it is assigned according to the strength of the worse team, which is equal to $\min \{ 33 - i; 33 - j \}$;
• Seeding C: it is assigned automatically to Pot 4.

Seeding A corresponds to our proposal for the 2022 FIFA World Cup draw. Seeding B or Seeding C can be the underlying principle of the official FIFA rule.

The seeding regimes are investigated in three different scenarios:
• Setting 1: teams $i$ and $j$ are chosen randomly from the whole set ($0 \leq i, j \leq 32$) to contest the play-off, and team $i$ advances with a probability of $0.5 + (j - i)/50$;
• Setting 2: teams $i$ and $j$ are chosen randomly from the set of 21 weakest teams ($12 \leq i, j \leq 32$) to contest the play-off, and team $i$ advances with a probability of $0.5 + (j - i)/50$;
• Setting 3: teams $i$ and $j$ are chosen randomly from the set of 21 weakest teams ($12 \leq i, j \leq 32$) to contest the play-off, and team $i$ advances with a probability of $0.5 + (j - i)/25$.

In Setting 1, the "natural" place of the play-off winner can be in any pot. On the other hand, the participants of the undecided play-off cannot be among the 12 strongest teams according to Settings 2 and 3. The winning probabilities are more unequal in Setting 3 compared to Setting 2. For each setting, the results will be based on 1 million simulations.

Figures 5-7 show the average difficulty levels of the groups in Settings 1-3, respectively. Under Setting 1, Seeding A is the best rule to balance the groups, followed by Seeding B and Seeding C (Figure 5). In particular, the expected strength of the strongest "group of death" is the lowest under Seeding A, while all other groups, including Group A, are tougher according to Seeding A than according to the other two rules.
This finding is intuitive: the better contestant in the play-off is the likely winner, thus, the smallest error is made if the placeholder is assigned according to the highest-ranked possible winner. Similar to the 2022 FIFA World Cup, Group A is an outlier due to the automatic assignment of the host.

However, the situation is somewhat more complicated if the contestants of the play-off are relatively weak as in Setting 2 (Figure 6). While Seeding A minimises the imbalance across Group A and the six strongest groups, the weakest of Groups B-H is expected to be closer to the other groups under Seeding B. In other words, Seeding A allows for a relatively easy group if the play-off is won by the lower-ranked contestant, which is nevertheless assigned according to the strength of the higher-ranked contestant. This conjecture is reinforced by Setting 3, where the participants of the play-off are more different in the probability of winning (Figure 7). Consequently, it is less likely, ceteris paribus, that the lower-ranked contestant will advance from the play-off, and the pattern seen in Figure 5 remains valid.

To summarise, Seeding A seems to be the best regime in general. Even though its dominance can be debated in Setting 2, the stakeholders probably prefer six balanced groups together with an easy one (after all, Group A is guaranteed to be easy by the organiser of the 2022 FIFA World Cup) rather than seven groups such that any six of them are less balanced than the six strongest groups under Seeding A.

The current paper has examined the draw system of the 2022 FIFA World Cup. The official seeding rule has been demonstrated to violate an important principle by failing to balance the competitive levels across the groups. Allocating the winners of the unfinished play-offs according to the highest-ranked candidate does provide a fairer outcome. The questionable decision of FIFA has harmed some countries, including Ukraine.
Our proposal of using the rating of the higher-ranked team in an undecided tie for seeding purposes seems to be a fairer policy in general.

Although the methodology used to simulate the outcome of the matches played in the play-offs and the FIFA World Cup is relatively simple, we have mainly focused on the difference between the official and the alternative seeding rules. Hence, the direction of the effects (the variation in the strengths of the groups and the set of countries that benefit/suffer from the official pot allocation) is likely to remain unchanged under a wide set of prediction models.

Our study will probably inspire more researchers to analyse sports rules. Hopefully, FIFA and other tournament organisers will begin extensive consultation with the academic community before similar decisions.

Goals, constraints, and public assignment: A field study of the UEFA Champions League
Fixing match-fixing: Optimal schedules to promote competitiveness
On the fairness of the restricted group draw problem in the
Quantifying incentive (in)compatibility: A case study from sports
Quantifying the unfairness of the 2018 FIFA World Cup qualification
UEFA against the champions? An evaluation of the recent reform of the Champions League qualification
2026 FIFA World Cup™: FIFA Council designates bids for final voting by the FIFA Congress
Revision of the FIFA / Coca-Cola World Ranking
Decisions taken concerning FIFA World Cup Qatar 2022™ qualifiers
Draw procedures. FIFA World Cup Qatar
Simulation of scheduled football matches. 28 December
Rethinking the FIFA World Cup™ final draw
Rethinking the FIFA World Cup™ final draw
Pourquoi la Coupe du monde est plus équitable cette année. The Conversation
The World Cup draw's flaws. The Mathematical Gazette
Odd odds: The UEFA Champions League Round of 16 draw
Fair draws for group rounds in sport tournaments
The predictive power of ranking systems in association football
How to improve a team's position in the FIFA ranking? A simulation study
Modelling association football scores
Why Germany was supposed to be drawn in the group of death and why it escaped
A numerical study of designs for sporting contests
Regulations of the UEFA Champions League
Regulations of the UEFA Europa Conference League
Ranking methods in soccer

This paper could not have been written without my father (also called László Csató), who has primarily coded the simulations in Python. We are grateful to Julien Guyon for inspiration and useful advice. Hans van Eetvelde, Mark Gagolewski, and Tamás Halm have provided important suggestions. We are indebted to the Wikipedia community for summarising important details of the sports competition discussed in the paper. The research was supported by the MTA Premium Postdoctoral Research Program grant PPD2019-9/2019.