key: cord-0511105-zg0a0ym4 authors: Csató, László title: On the role of tournament design in sporting success: A study of the North, Central American and Caribbean qualification for the 2022 FIFA World Cup date: 2021-10-12 journal: nan DOI: nan sha: 36f045da68e92345d3c5ac8cbe19e71d72259de9 doc_id: 511105 cord_uid: zg0a0ym4

Playing in the FIFA World Cup finals is an ambition shared by several nations. Since, besides luck and skill, the probability of qualification depends on the design of the qualifiers, the study of these competitions forms an integral part of sports analytics. The Confederation of North, Central America and Caribbean Association Football (CONCACAF) announced a novel qualifying format for the 2022 FIFA World Cup in July 2019. However, the COVID-19 pandemic forced the organisers to return to a more traditional structure. The present chapter analyses how this reform impacted the chances of the national teams to qualify. It is found that the probability of participating in the FIFA World Cup finals can change by more than 5 percentage points under the assumption of fixed strengths for the teams. The idea behind the original design, to divide the contestants into two distinct sets, is worth considering due to the increased competitiveness of the matches played by the strongest and the weakest teams. We recommend mitigating the sharp nonlinearity caused by the seeding policy via a probabilistic rule, by analogy with the NBA draft lottery system.

Association football, commonly known as football or soccer (henceforth football), is probably the most popular sport around the world: the games of the 2018 FIFA World Cup were watched by more than half of the global population (FIFA, 2018a). Therefore, analysing the determinants of success in football, for example, the factors on which the participation of the national teams in the FIFA World Cup finals depends, is an important topic for both academics and football fans.
One might think that the chance of qualifying is governed by the talents of the players, the skills of the coach, or the implementation of innovative tactics, in other words, by elements that are independent of the decisions made by the organiser. But this is not the case. Besides the performances of the competitors, the structural dimensions of the contest have a non-negligible effect on its expected outcome, too. Hwang (1982) and Horen and Riezman (1985) demonstrate the role of draws (pairings) in single-elimination tournaments. According to Krumer and Lechner (2017), the winning probabilities in round-robin tournaments with sequential games between three and four players are influenced by the schedule. Lasek and Gagolewski (2018) evaluate the formats used in the majority of European top-tier football leagues with respect to their ability to produce accurate team rankings. Csató (2021c) discusses many problems of tournament design from an axiomatic perspective.

The extensive study of tournament structures is also necessary because the rules of several high-profile sporting events are regularly modified. Corona et al. (2019) and Dagaev and Rudyak (2019) examine the effects of a seeding system reform in the UEFA Champions League. Csató (2021b) compares the four tournament formats with 24 teams that have been applied in the recent World Men's Handball Championships.

Although a great number of papers have investigated the design of the FIFA World Cup finals, especially its draw and seeding policy (Cea et al., 2020; Guyon, 2015, 2018; Jones, 1990; Laliena and López, 2019; Rathgeber and Rathgeber, 2007; Scarf and Yusof, 2011), little research has been conducted on the World Cup continental qualifiers. Stone and Rod (2016) provide a non-econometric analysis of the bias in the allocation of qualifying spots among the continents. Pollard and Armatas (2017) assess home advantage in the group stages of qualification for the 2006, 2010, and 2014 World Cups.
Durán et al. (2017) construct schedules for the South American Qualifiers to overcome the main drawbacks of the previous approach. Their proposal was unanimously approved by the participating nations and is currently being used. Csató (2021a) studies the fairness of the 2018 FIFA World Cup qualification process.

The present chapter deals with the North, Central American and Caribbean section of the 2022 FIFA World Cup qualification. This qualifying system has been chosen because its organiser, the Confederation of North, Central America and Caribbean Association Football (CONCACAF), had planned to fundamentally restructure the tournament format; however, the COVID-19 pandemic forced the confederation to update the rules. Therefore, we can quantify the effects of an external shock on the contestants under the assumption that their strengths do not change at all.

Our main contribution resides in showing how tournament designs can be compared and evaluated via Monte Carlo simulations in a basic statistical model based on Elo ratings. This approach is worth adopting by any sports governing body before the modification of a tournament format in order to derive a first estimate of the effects of the suggested changes, which can be useful information for decision-makers. The results also have some policy implications. Small differences in the FIFA World Ranking translate into large differences in the probability of qualification. Since this can be judged unfair, we propose to consider a probabilistic mechanism for the seeding to replace the current deterministic rule, similarly to the NBA draft lottery system.

The remainder of the chapter is organised as follows. Section 2 presents the two tournament formats of the CONCACAF Qualifiers for the 2022 FIFA World Cup. The methodology is detailed in Section 3, and the findings from our simulations are summarised in Section 4. Finally, Section 5 concludes.
to make the leading CONCACAF teams more competitive on the global stage and to give emerging footballing nations the chance to pursue their dreams (CONCACAF, 2019). It consists of two parts organised simultaneously:
• Hexagonal group: The six highest ranked member associations play a home-and-away round-robin tournament. The top three teams qualify directly for the 2022 FIFA World Cup.
• Qualifier for lower-seeded teams: The member associations ranked 7-35 (29 teams) are divided into eight groups, three groups of three teams and five groups of four teams, to play home-and-away round-robin matches. The group winners advance to the knockout phase, composed of two-legged home-and-away quarterfinals, semifinals, and a final.
The winner of the qualifier for lower-seeded teams faces the fourth-ranked team of the Hexagonal group in a home-and-away play-off to determine the CONCACAF representative in the inter-confederation play-offs for the 2022 FIFA World Cup.
The ranking of the countries was intended to be based on the FIFA World Ranking of June 2020. This design is called the divided format in what follows.

However, CONCACAF confirmed on 25 June 2020 that the qualifying process would change due to the disruption caused by the COVID-19 pandemic (TSN, 2020). Finally, a more traditional tournament design was chosen and published on 27 July 2020 (CONCACAF, 2020). This is composed of three phases as follows:
• First round: The member associations ranked 6-35 (30 teams) are drawn into six groups of five teams each, where they play single round-robin matches (two home and two away). The group winners progress to the next stage.
• Second round: The six group winners from the first round play in three home-and-away single-elimination matchups. The winners advance to the next stage.
• Final round: The three second round winners join the five highest ranked countries to play home-and-away round-robin matches.
The top three teams qualify directly for the 2022 FIFA World Cup, while the fourth-placed team represents CONCACAF in the inter-confederation play-offs for the 2022 FIFA World Cup.
The ranking of the countries is based on the FIFA World Ranking of July 2020.

The draw procedure favours higher-ranked teams (FIFA, 2020). For the draw of the first round, the 30 contestants are assigned to five pots of six teams each such that Pot k consists of the teams ranked from 6k to 6k + 5. The six teams from Pot 1, ranked 6-11, are pre-seeded into groups A-F: the highest ranked team (6th) occupies position A1, the next team (7th) occupies position B1, and so on, until the lowest-ranked team from Pot 1 (11th) occupies position F1. The teams from the remaining four pots are allocated to the groups sequentially according to a random draw. The position of any team is determined by the pot from which it is drawn; for example, C4 is the participant of Group C that is drawn from Pot 4.

The schedule depends on these draw positions. In each group, the following 10 matches take place: #1 vs. #2, #1 vs. #3, #4 vs. #1, #5 vs. #1, #2 vs. #3, #2 vs. #4, #5 vs. #2, #3 vs. #4, #3 vs. #5, #4 vs. #5, where the first team plays at home. Consequently, team B2 (the only team in Group B that is drawn from Pot 2) plays against B3 and B4 at home, while it plays against B1 and B5 away. Naturally, any team plays twice at home and twice away.

In the second round, the pairings are pre-determined, too: the winner of Group A vs. the winner of Group F, the winner of Group B vs. the winner of Group E, the winner of Group C vs. the winner of Group D. There are no specific draw constraints. This design is called the undivided format in what follows.

Therefore, in the undivided tournament format, the member associations ranked 6-11 enjoy a priority in the first round as they play against the strongest opponents (the teams drawn from Pots 2 and 3) at home.
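The first-round draw and the position-based schedule of the undivided format described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the simulations: the names `draw_first_round`, `group_fixtures`, `GROUPS`, and `SCHEDULE` are hypothetical, and the sequential draw from Pots 2-5 is simplified to an independent random shuffle of each pot over groups A-F.

```python
import random

GROUPS = "ABCDEF"
# (home position, away position) pairs, in the order listed in the text.
SCHEDULE = [(1, 2), (1, 3), (4, 1), (5, 1), (2, 3),
            (2, 4), (5, 2), (3, 4), (3, 5), (4, 5)]


def draw_first_round(rng):
    """Return a mapping from draw position (e.g. 'C4') to the team's rank."""
    positions = {}
    for k in range(1, 6):
        pot = list(range(6 * k, 6 * k + 6))  # Pot k holds ranks 6k .. 6k + 5
        if k > 1:
            # Assumption: a uniform shuffle stands in for the sequential draw.
            rng.shuffle(pot)
        # Pot 1 is not shuffled: ranks 6..11 are pre-seeded into A1..F1.
        for group, rank in zip(GROUPS, pot):
            positions[f"{group}{k}"] = rank
    return positions


def group_fixtures(group):
    """List the 10 (home, away) fixtures of a group by draw position."""
    return [(f"{group}{home}", f"{group}{away}") for home, away in SCHEDULE]


positions = draw_first_round(random.Random(2020))
fixtures_b = group_fixtures("B")
home_counts = {
    pos: sum(1 for home, _ in fixtures_b if home == pos)
    for pos in ("B1", "B2", "B3", "B4", "B5")
}
print(fixtures_b)
print(home_counts)
```

The `home_counts` dictionary makes it easy to check the statement that every team plays twice at home and twice away.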
Furthermore, due to the pre-seeding of the teams from Pot 1, the team ranked 6/7/8th cannot face a team ranked higher than 11/10/9th, respectively, in the second round.

Since the draw procedure of the qualifier for the lower-seeded teams in the divided format was never announced, a similar policy is assumed there in order to prefer higher-ranked countries:
• Pot 1 contains the eight teams ranked 7-14, Pot 2 contains the eight teams ranked 15-22, Pot 3 contains the eight teams ranked 23-30, and Pot 4 contains the remaining five teams ranked 31-35.
• The eight teams from Pot 1 are pre-seeded into groups A-H: the highest ranked team (7th) occupies position A1, the next team (8th) occupies position B1, and so on, until the lowest-ranked team from Pot 1 (14th) occupies position H1.
• The eight teams drawn from Pots 2 and 3, respectively, are allocated to groups A-H in a random sequence.
• The five teams from Pot 4 are assigned to groups D-H sequentially as provided by a random draw.
• The pairings in the quarterfinals are the winner of Group A vs. the winner of Group H, the winner of Group B vs. the winner of Group G, the winner of Group C vs. the winner of Group F, and the winner of Group D vs. the winner of Group E.
• In the semifinals, the winner of Group A or H plays against the winner of Group D or E, and the winner of Group B or G plays against the winner of Group C or F.

Hence, although higher-ranked teams are not favoured by the schedule because the groups are home-and-away round-robin tournaments, the member associations ranked 7-9 enjoy a priority in the group stage due to competing against only two other teams instead of three (albeit the additional teams are the weakest participants). In addition, the team ranked 7/8/9/10th cannot face a team ranked higher than 14/13/12/11th, respectively, in the quarterfinals due to the pre-seeding of the teams from Pot 1.
Finally, the knockout bracket implies that the team ranked 7th cannot play against a team ranked higher than the 10th in the semifinals.

For the comparison of different tournament designs, it is almost always necessary to use Monte Carlo simulations (Scarf et al., 2009). Even though some limited results can be derived from historical data, this is not possible in the case of formats that were never applied in practice.

Every simulation model is based on a prediction technique for individual games. Modelling the outcome of a football match has a long history since Maher (1982) first suggested Poisson models to that end. The statistical literature often attempts to improve forecasting accuracy through sophisticated time-varying (Baker and McHale, 2018), Bayesian (Corona et al., 2019), or maximum likelihood approaches. However, if the main aim is to understand better the properties of a tournament format and to inform decision-makers about the effects of a planned reform, it is preferred to keep the prediction model as simple as possible.

An elegant method is to derive the winning probabilities exclusively from the Elo ratings of the teams (Van Eetvelde and Ley, 2019). This procedure was first suggested by the Hungarian-born American physics professor Árpád Élő (Elo, 1978) to rank chess players. Consider a match played by teams $i$ and $j$, having an Elo rating of $E_i$ and $E_j$, respectively. The winning probability of team $i$ is

$$W_{ij} = \frac{1}{1 + 10^{-(E_i - E_j)/s}}, \tag{1}$$

where $s$ is a scaling factor. Note that $W_{ij} + W_{ji} = 1$, thus draws are excluded. Although drawn matches are relatively common in football, it is unlikely that allowing for this possibility would substantially modify our findings. On the other hand, accounting for draws requires at least one more arbitrary parameter.

The key question here is the reliability of the Elo ratings. The official FIFA World Ranking has used the Elo system since August 2018 with the parameter $s = 600$ (FIFA, 2018b).
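As a minimal sketch, the winning probability can be computed as follows, assuming that formula (1) takes the standard Elo logistic form $W_{ij} = 1 / (1 + 10^{-(E_i - E_j)/s})$. The function name `win_probability` and the two ratings are hypothetical and chosen only for illustration; they are not taken from Table 1.

```python
def win_probability(rating_i: float, rating_j: float, scale: float = 600.0) -> float:
    """Probability that team i beats team j under the Elo formula (draws excluded).

    Assumption: standard logistic Elo curve with base 10 and scaling factor `scale`.
    """
    return 1.0 / (1.0 + 10.0 ** (-(rating_i - rating_j) / scale))


# Illustrative ratings only.
p_ij = win_probability(1600.0, 1400.0)
p_ji = win_probability(1400.0, 1600.0)
print(p_ij, p_ji, p_ij + p_ji)
```

The two probabilities sum to one, which reflects the property $W_{ij} + W_{ji} = 1$ noted in the text.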
As the contestants in the undivided format of the CONCACAF Qualifiers are ranked on the basis of the FIFA World Ranking of July 2020, this will be our first set of inputs. There is also a much older alternative ranking project called World Football Elo Ratings (World Football Elo Ratings, 2021). Its scaling factor is $s = 400$ and the rating of the home team is increased by a fixed amount of 100. In addition, its adjustment depends on the goal difference in the game, while the FIFA version accounts neither for the margin of victory nor for home advantage.

The strengths of the national teams are reported in Table 1 according to both metrics. The scaling factor of the World Football Elo Ratings is lower but the variance of its ratings is higher, which has a non-negligible effect on the chances of the teams. For instance, Mexico defeats Puerto Rico with a probability of 99.16% on the basis of the World Football Elo Ratings even if it plays away, which is reduced to 96.57% under the FIFA World Ranking. The discrepancy can partially be attributed to the different updating principles. However, a stronger factor can be the recent introduction of the Elo system in the FIFA World Ranking: the number of matches played since the summer of 2018 is probably insufficient for the necessary adjustment of the ratings.

In both tournament formats, there are two types of matches: group matches played either home or away, and single-elimination clashes. Two-legged knockout matches are worth considering as one "long" game because the teams focus mostly on advancing to the next round. Therefore, we adopt the methodology of the Football Club Elo Ratings project (Football Club Elo Ratings, 2021), that is, these matchups are assumed to be played on a neutral field and the difference of the Elo ratings is multiplied by the square root of 2.

We use formula (1) as follows. For each group match, $W_{ij}$ is calculated according to the rating method chosen.
For a single-elimination matchup, $W_{ij}$ is computed analogously with the above modification. A random number ε is generated between 0 and 1. If ε < $W_{ij}$, the winner is team $i$; otherwise, the winner is team $j$. The ranking in groups depends on the number of wins, and all ties in the number of wins are broken randomly.

In each run, all matches of both the divided and the undivided formats are simulated, and the teams that qualify directly as well as the nation going to the inter-confederation play-offs are determined. The set of matches played is recorded, too. As the CONCACAF representative in the inter-confederation play-offs for the 2022 FIFA World Cup should win a two-legged clash against a team from another continent (see Csató (2021a) for the details), that place is counted as half when the probability of qualification is computed. Consequently, 3.5 qualifying slots are allocated in every iteration. The whole process is repeated 10 million times to get reasonable expected values.

We think that this relatively simple approach to model the outcomes of the games has several advantages:
• The FIFA World Ranking is extensively used for seeding in football tournaments;
• The World Football Elo Ratings proved to have good predictive power (Lasek et al., 2013) and are a competitive indicator of football performance (Gásquez and Royuela, 2016);
• Debates around methodological details can be avoided to a certain degree;
• It is easy to understand and the calculations can be replicated by any stakeholder possessing some programming skills.

The scientific literature on tournament structures often applies similar techniques (Appleton, 1995; Csató, 2020, 2021b; Marchand, 2002; McGarry and Schutz, 1997). As Appleton (1995, p. 534) argues: "since our intention is to compare tournament designs, and not to estimate the chance of the player winning any particular tournament, we may within reason take whatever model determining winners that we please".
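The simulation steps described above (group matches with home advantage, two-legged ties treated as one neutral game with the rating difference multiplied by the square root of 2, ranking by wins with random tie-breaking, and the play-off place counted as half) can be sketched as follows. This is a toy illustration and not the author's implementation: all function names, the six ratings, and the single hypothetical challenger `G` are assumptions, the World Football Elo parameters (s = 400, home advantage 100) are used as defaults, and only a Hexagonal-style group plus one play-off is simulated rather than a full qualifying format.

```python
import math
import random


def elo_prob(diff: float, scale: float) -> float:
    """Standard Elo winning probability for a given rating difference."""
    return 1.0 / (1.0 + 10.0 ** (-diff / scale))


def group_match_prob(home: float, away: float, scale=400.0, home_adv=100.0) -> float:
    """Probability that the home team wins a single group match."""
    return elo_prob(home + home_adv - away, scale)


def two_legged_prob(r_i: float, r_j: float, scale=400.0) -> float:
    """Two-legged tie as one neutral game; rating difference scaled by sqrt(2)."""
    return elo_prob(math.sqrt(2.0) * (r_i - r_j), scale)


def simulate_group(ratings: dict, rng: random.Random) -> list:
    """Home-and-away round robin; rank by wins, ties broken randomly."""
    wins = {team: 0 for team in ratings}
    for home in ratings:
        for away in ratings:
            if home == away:
                continue
            p_home = group_match_prob(ratings[home], ratings[away])
            winner = home if rng.random() < p_home else away
            wins[winner] += 1
    return sorted(ratings, key=lambda team: (-wins[team], rng.random()))


def estimate(ratings: dict, challenger: str, challenger_rating: float,
             runs: int = 2000, seed: int = 0) -> dict:
    """Estimate qualification probabilities; 3.5 slots are allocated per run."""
    rng = random.Random(seed)
    credit = {team: 0.0 for team in list(ratings) + [challenger]}
    for _ in range(runs):
        order = simulate_group(ratings, rng)
        for team in order[:3]:
            credit[team] += 1.0          # direct qualification
        fourth = order[3]
        p_fourth = two_legged_prob(ratings[fourth], challenger_rating)
        playoff_winner = fourth if rng.random() < p_fourth else challenger
        credit[playoff_winner] += 0.5    # inter-confederation place counts as half
    return {team: value / runs for team, value in credit.items()}


# Hypothetical ratings, for illustration only.
RATINGS = {"A": 1900, "B": 1800, "C": 1700, "D": 1650, "E": 1600, "F": 1550}
probs = estimate(RATINGS, challenger="G", challenger_rating=1500)
print(probs)
```

Because exactly 3.5 slots are credited in every iteration, the estimated probabilities sum to 3.5 regardless of the number of runs, which is a convenient sanity check for any such simulation.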
However, it should be emphasised that the following numerical results are primarily for comparative purposes.

[Figure: see Table 1 for abbreviations of country names.]

Figure 1 outlines how the undivided tournament format of the CONCACAF Qualifiers for the 2022 FIFA World Cup has modified the probability of qualification for each national team compared to the divided format. According to the FIFA World Ranking, the greatest winners are Honduras, which only has to enter the final round, and the countries ranked 7th or closely below (Canada, Curaçao, Panama, Haiti) because they could not qualify directly in the divided format. On the other hand, there are three losers (Mexico, United States, Salvador): these teams have to play against only five other teams in the divided format but they face seven opponents in the undivided structure. Furthermore, the 6th ranked Salvador already enters the first round, hence even its participation in the last group stage is not guaranteed. In financial terms, one percentage point in the probability of qualification translates into at least 95 thousand USD in expected prize money as all participants of the 2018 FIFA World Cup received at least 9.5 million USD (FIFA, 2017).

The consequences of the reform are somewhat different if the abilities are measured by the World Football Elo Ratings, mainly due to the different distribution of strengths and the increased variance of winning probabilities. For example, Mexico has such a high chance to qualify that even the presence of more contestants in its group cannot substantially worsen its outlook. Therefore, the chances for the next strongest teams of the United States, Costa Rica, and Honduras will decrease by a higher amount. Jamaica and Salvador have a higher probability of advancing to the play-offs in the undivided format, which balances the lower chance of direct qualification.
For Canada, the 7th team in the FIFA World Ranking of July 2020, the divided format is especially disadvantageous since it closes the way of direct qualification.

Figure 2 focuses on the connection between strength and the probability of qualification for both tournament formats. The relationship is monotonic, except for Jamaica and Canada using the World Football Elo Ratings: the latter team has a more difficult path to qualify as the seeding is based on the FIFA World Ranking. While this is not an inherent failure of the tournament designs, the distortion is clearly lower under the undivided format where the teams are not divided preliminarily into two distinct sets. The impact of the tournament structure cannot be neglected. For instance, the reform almost halves the probability of qualification for Salvador (from 20.2% to 11.78%) but it is roughly doubled for Canada.

Another important attribute of any tournament is the competitiveness of the matches played, quantified in Figure 3 through the average strength of the opponents. Note that in a round-robin group, this value is the highest for the weakest team. The divided tournament format essentially consists of two parallel contests, thus the six best teams of the Hexagonal group play against much stronger competitors than the lower-seeded teams, among which our metric remains approximately constant. Therefore, the weakest teams can expect more success in the divided format but the matches of the middle teams are more balanced under the undivided design. Finally, the leading CONCACAF nations contest more competitive games on average in the divided format, too, since the Hexagonal contains fewer teams than the group in the last round of the undivided structure.

Theoretical models are also important to examine because the actual distribution of teams' strengths may hide certain features of tournament designs.
To that end, we set the Elo rating of the $i$th ranked team at $1{,}300 + (18 - i)\Delta$ (that is, the 18th country at the middle of the ranking has a fixed strength of 1,300) with a parameter $\Delta \geq 0$, and derive the winning probabilities from formula (1) such that $s = 400$ and home advantage is 100, analogously to the World Football Elo Ratings. The chances of qualification are presented in Figure 4. $\Delta = 0$ corresponds to the fully competitive scenario, while $\Delta = 20$ and, especially, $\Delta = 40$ represent unbalanced contests.

The findings convey two basic messages. First, the sharp distinction of the teams in the divided format does not lead to misaligned incentives, that is, no team is interested in being ranked 7th to play among the lower-seeded teams instead of the Hexagonal group. Therefore, the divided tournament design is strategy-proof with respect to the initial ranking of the teams. As illustrated by the UEFA Euro 2020 qualifying tournament (Csató, 2021a, Chapter 6.4), this is far from guaranteed since the 7th ranked team can obtain the CONCACAF slot in the inter-confederation play-offs for the 2022 FIFA World Cup after playing only one two-legged clash with a higher-ranked team.

Second, the probability of qualification exhibits a sharp nonlinearity, see the breaking point between the 6th and the 7th teams in the divided design, as well as between the 5th and 6th teams in the undivided design. While this policy might intensify the chase for a particular position in the FIFA World Ranking, it could be unfair when small differences have such powerful implications. Thus it is worth considering a lottery system: the seeding of the teams can be determined by the FIFA World Ranking according to a probabilistic mechanism instead of the current rule.
For instance, the fifth team that enters only the final group stage in the undivided tournament format should not necessarily be the 5th ranked one but the 5th ranked with a probability of 60%, the 6th with a probability of 30%, or the 7th with a probability of 10%. A similar procedure is used in the NBA draft lottery. Such a policy would also be able to mitigate the effects of changing from the divided to the undivided format as the comparison of Figures 1 and 4 highlights.

The current chapter has aimed to assess how a reform in the format of the CONCACAF Qualifiers for the 2022 FIFA World Cup, induced by the calendar disruption due to the COVID-19 global pandemic, impacted the chances of the national teams to qualify. It has turned out that the tournament design on its own can have quite a substantial role in the probability of playing in the FIFA World Cup finals, even amounting to 5-6 percentage points under the assumption of fixed Elo ratings for the teams. Since the initial division of the contestants into two sets in the original structure of the qualifiers has increased the competitiveness of the matches played by the strongest and the weakest teams without creating misaligned incentives for being seeded lower, this solution offers a reasonable alternative to the traditional hybrid format consisting of subsequent group and knockout stages. Finally, we recommend reducing the sharp nonlinearity caused by the deterministic seeding policy via a probabilistic rule, by analogy with the NBA draft lottery.

These proposals can contribute to making qualifying tournaments more exciting and fair. Hopefully, our chapter will also draw the attention of sports governing bodies worldwide to the fact that even basic statistical models, based on widely used metrics such as the FIFA World Ranking, might give useful insights into the problem of designing a tournament.
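The probabilistic seeding rule suggested in this chapter (the fifth direct entrant of the final round being the 5th, 6th, or 7th ranked team with probabilities of 60%, 30%, and 10%) can be illustrated by a short sketch. The names `LOTTERY` and `draw_fifth_seed` are hypothetical, and the code only demonstrates the weighted draw itself; it makes no claim about how the NBA draft lottery or any real draw is implemented.

```python
import random
from collections import Counter

# Rank -> probability of receiving the fifth direct place (example from the text).
LOTTERY = {5: 0.6, 6: 0.3, 7: 0.1}


def draw_fifth_seed(rng: random.Random) -> int:
    """Draw the rank of the team that enters only the final round."""
    ranks = list(LOTTERY)
    weights = list(LOTTERY.values())
    return rng.choices(ranks, weights=weights, k=1)[0]


rng = random.Random(1)
n_draws = 100_000
counts = Counter(draw_fifth_seed(rng) for _ in range(n_draws))
shares = {rank: counts[rank] / n_draws for rank in LOTTERY}
print(shares)
```

In a full simulation, such a draw would be performed at the start of each run, before the pots of the first round are formed, so that the probability of qualification changes more smoothly around the 5th and 6th positions of the ranking.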
May the best man win
Time-varying ratings for international football teams
An analytics approach to the FIFA ranking procedure and the World Cup final draw
Concacaf Announces Format for the 2022 FIFA World Cup Confederation Qualifiers. 10 July
New Concacaf Qualifiers announced for regional qualification to FIFA World Cup Qatar 2022
Bayesian forecasting of UEFA Champions League under alternative seeding regimes
Optimal tournament design: Lessons from the men's handball Champions League
Quantifying the unfairness of the 2018 FIFA World Cup qualification
A simulation comparison of tournament designs for the World Men's Handball Championships. International Transactions in Operational Research
Tournament Design: How Operations Research Can Improve Sports Rules. Palgrave Pivots in Sports Economics
Seeding the UEFA Champions League participants: Evaluation of the reform
Scheduling the South American Qualifiers to the 2018 FIFA World Cup by integer programming
The Rating of Chessplayers, Past and Present
FIFA Council confirms contributions for FIFA World Cup participants
More than half the world watched record-breaking
Revision of the FIFA / Coca-Cola World Ranking
FIFA World Cup 2022 TM Preliminary Draw (Concacaf)
The system or how this works
The determinants of international football success: A panel data analysis of the Elo rating
Rethinking the FIFA World Cup TM final draw
What a fairer 24 team UEFA Euro could look like
Comparing draws for single elimination tournaments
New concepts in seeding knockout tournaments
The World Cup draw's flaws. The Mathematical Gazette
First in first win: Evidence on schedule effects in round-robin tournaments in mega-events
Fair draws for group rounds in sport tournaments
The efficacy of league formats in ranking teams
The predictive power of ranking systems in association football
Ranking soccer teams on the basis of their current strength: A comparison of maximum likelihood approaches
Modelling association football scores
On the comparison between standard and random knockout tournaments
Efficacy of traditional sport tournament structures
Factors affecting home advantage in football World Cup qualification
Why Germany was supposed to be drawn in the group of death and why it escaped
A numerical study of designs for sporting contests
A numerical study of tournament structure and seeding policy for the soccer World Cup Finals
Unfair play in World Cup qualification? An analysis of the 1998-2010 FIFA World Cup performances and the bias in the allocation of tournament berths
CONCACAF confirms 2022 World Cup qualifying will change
Ranking methods in soccer

We are grateful to Yves Dominicy and Christophe Ley for their beneficial remarks. This project could not have been implemented without my father (also called László Csató), who has coded the simulations in Python. We are indebted to the Wikipedia community for collecting and structuring valuable information on the sports tournaments discussed. The research was supported by the MTA Premium Postdoctoral Research Program grant PPD2019-9/2019.