title: Should policy makers trust composite indices? A commentary on the pitfalls of inappropriate indices for policy formation
authors: Matthias Kaiser; Andrew Tzer-Yeu Chen; Peter Gluckman
date: 2020-08-31

Abstract: This paper critically discusses the use and merits of global indices, in particular the Global Health Security Index or GHSI (Cameron et al. 2019). The index ranked 195 countries according to their expected preparedness in case of a pandemic or other biological threat. The Covid-19 pandemic provides the background for comparing each country's predicted performance under the GHSI with its actual performance. In general, there is an inverted relation between predicted and actual performance, i.e. the predicted top performers are among those worst hit. Obviously, this reflects poorly on the potential policy uses of the index. This paper analyses the reasons for the poor match between prediction and reality in the index, and presents six general observations applying to global indices in this respect. The level of abstraction in these global indices builds uncertainties upon uncertainties, which potentially distances them from the policy needs on the ground. From this, the question is raised whether the policy community might have better tools for decision making. On the basis of data from the INGSA Policy-Making Tracker, some simple heuristics are suggested which may be more useful than a global index.

Why would anyone want to construct a global composite index of anything? The standard answer is that it would offer a useful tool for policy design and decision making. In theory, the score is easier to understand than a complex concept such as wellbeing or sustainability. The next question is then: how can users know whether the index is good and useful? Tentatively, we might suggest that the index has value if using it leads to better policies and decisions than would have been the case without it. However, most global composite indicators are never really put to the test, because performance is normally not directly measurable. We surmise that the case of the Global Health Security Index (GHSI) [1] may be an exception. This index was published in October 2019 after two and a half years of research, and contained a ranking of 195 countries with associated scores indicating their preparedness for global epidemics and pandemics. The GHSI aimed to be a key resource in the "face of increasing risks of high-consequence and globally catastrophic biological events in light of major gaps in international financing for preparedness" [1]. The developers "believe that, over time, the GHS Index will spur measurable changes in national health security" and sought to "illuminate preparedness and capacity gaps to increase both political will and financing to fill them at the national and international levels" [1]. It utilized 140 questions organized into 6 categories, 34 indicators and 85 sub-indicators, all constructed from open-source information. Out of a total possible score of 100, the average across these countries was 40.2, with scores ranging from 83.5 down to 16.2. Fewer than 7% of the countries were ranked as being able to effectively prevent the emergence or release of pathogens. Less than half a year later, the novel SARS-CoV-2 coronavirus led to the Covid-19 pandemic.
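To make the general mechanism of such an index tangible, the following minimal sketch shows how sub-indicator scores can be rolled up into category scores and an overall score out of 100 by weighted averaging. The category labels mirror the six GHSI categories discussed below, but the sub-indicator values and weights are invented for illustration and do not reproduce the actual GHSI methodology documented in its report [1].

```python
# Minimal sketch of how a composite index aggregates scores.
# All sub-indicator scores and weights below are hypothetical.

def weighted_score(scores, weights):
    """Weighted arithmetic mean of scores, each on a 0-100 scale."""
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

# Hypothetical sub-indicator scores and weights per category for one country.
categories = {
    "Prevention":        ([70, 55, 40], [0.4, 0.3, 0.3]),
    "Detection":         ([80, 65],     [0.5, 0.5]),
    "Rapid response":    ([50, 45, 60], [0.3, 0.3, 0.4]),
    "Health system":     ([35, 55],     [0.6, 0.4]),
    "Norms & financing": ([60, 70],     [0.5, 0.5]),
    "Risk environment":  ([75, 65],     [0.5, 0.5]),
}

category_scores = {name: weighted_score(s, w) for name, (s, w) in categories.items()}

# Equal category weights are assumed here; the real index assigns its own.
overall = weighted_score(list(category_scores.values()), [1] * len(category_scores))

for name, score in category_scores.items():
    print(f"{name:18s} {score:5.1f}")
print(f"{'Overall index':18s} {overall:5.1f}")
```

The point of the sketch is simply that every aggregation step discards information: two countries with very different category profiles can end up with the same overall number.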
This now gives us the opportunity to compare the index's prior assessment with the actual performance of countries' health systems.

Firstly, we can see how the GHSI sorted countries into three levels of preparedness. The United States and the United Kingdom top the index at ranks 1 and 2 (scoring 83.5 and 77.9), while Sweden (72.2), South Korea (70.2) and France (68.1) also rank highly, at 7, 9, and 11 respectively. Then there are countries like Germany (66.0), Spain (65.9), Norway (64.9), Italy (56.2), New Zealand (54.0) and others, which are placed in the middle category of preparedness, apparently not so well prepared. Brazil (59.7, rank 22) is ranked slightly better than Singapore (58.7, rank 24). Mongolia is at least above the average with a score of 49.5, while Jamaica (29.0, rank 147) and Fiji (25.7, rank 168) are among those ranked as very poorly prepared.

For those who have been watching the global spread of Covid-19, the incongruities between the GHSI rankings and the case numbers in each country will be obvious. June 2020 data from the Worldometer website [https://www.worldometers.info/coronavirus/] show the opposite of what we might expect from the GHSI rankings: the United States, the United Kingdom, Sweden, and Brazil are among the worst-hit countries, while others have surpassed the expectations derived from the index: Germany, Norway, Singapore, New Zealand, and Vietnam. This is certainly the case for Jamaica and Fiji, which have virtually eliminated Covid-19 but were ranked among the least prepared. Two of the best-performing economies, Taiwan and Hong Kong, are not even included in the global index. We do note that there are some limitations with the use of data aggregators like Worldometer, and potential issues around the reliability of testing in some countries, but we feel that these data are suitable enough for demonstrating the magnitude of the problem (see Table 1).

Quantifying the expected and actual performance data for these 29 countries invites us to look more deeply into why the differences were so large. It is not always obvious what should count as performance data: one might pick the number of infections per million inhabitants or, as we have, the number of deaths per million people. We think it is generally desirable to avoid a high death rate. What we have not done is break down the total GHSI score into its 6 subcategories, which would somewhat complicate the picture. These subcategories in the GHSI report are: 1. Prevention of the emergence or release of pathogens; 2. Early detection & reporting of epidemics of potential international concern; 3. Rapid response to and mitigation of the spread of an epidemic; 4. Sufficient & robust health system to treat the sick & protect health workers; 5. Commitments to improving national capacity, financing and adherence to norms; 6. Overall risk environment and country vulnerability to biological threats. Any single country will score differently on these subcategories, and the GHSI thereby claimed to provide more detailed information on where to act in order to improve general preparedness. Yet the total score is what is used for international comparison and ranking, and what carries the most weight in political discussion. In late February, US President Donald Trump cited the GHSI to argue that the United States was well prepared for Covid-19, saying "the United States, we're rated No. 1" [3]. And here the discrepancy with actual performance arises most clearly.
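One way to make the mismatch concrete is to compute a rank correlation between GHSI scores and an outcome measure such as deaths per million. The sketch below uses the GHSI scores quoted above for a handful of countries, but the deaths-per-million figures are placeholders rather than the values in Table 1; it relies on SciPy's spearmanr and is meant only to illustrate the kind of check a reader could run against real outcome data.

```python
# Sketch: Spearman rank correlation between GHSI score and deaths per million.
# GHSI scores are taken from the text; the outcome figures are placeholders.
from scipy.stats import spearmanr

ghsi_score = {
    "United States": 83.5, "United Kingdom": 77.9, "Sweden": 72.2,
    "South Korea": 70.2, "Germany": 66.0, "New Zealand": 54.0,
    "Jamaica": 29.0, "Fiji": 25.7,
}
deaths_per_million = {   # placeholder outcome data for illustration only
    "United States": 380, "United Kingdom": 620, "Sweden": 500,
    "South Korea": 6, "Germany": 107, "New Zealand": 4,
    "Jamaica": 3, "Fiji": 0,
}

countries = list(ghsi_score)
rho, p_value = spearmanr(
    [ghsi_score[c] for c in countries],
    [deaths_per_million[c] for c in countries],
)
# A positive rho means higher "preparedness" scores go with MORE deaths per
# million -- the inverted relation discussed in the text.
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
```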
There have already been several assessments of how the GHSI fared in the light of Covid-19. All noted significant shortcomings, and Aitken et al [4], based on data from Worldometer on 11 April 2020, noted the reversal of relationships. Razavi et al [5] question the wisdom of the ranking system: "ranking countries based on weighted scores across indicators that are scored variably and are not directly comparable with one another is problematic". They based this assertion on their criticism that the scoring system is not consistent (some indicators score either 0 or 100, while others use the whole range), that the use of weightings is arbitrary, and that the inclusion of some indicators, such as urbanization, is of questionable validity. Chang and McAleer [6] extended the analysis of the GHSI by adding other approaches to averaging (the geometric and harmonic means, in addition to the arithmetic mean used in the GHSI) and saw positive potential in the GHSI, but stressed the significant differentiation across the 6 categories: "Rapid Response and Detection and Reporting have the largest impacts" [6]. They also commented on the implicit political bias: "As part of China, Hong Kong was not included in the GHS Index as a country, while Taiwan was not included undoubtedly for political reasons" [6].

With the partial exception of the Aitken et al critique, most of the criticism focuses on the technical aspects of constructing the composite index, in particular when it comes to combining sub-categories to create a total score assigned to individual countries. It is noteworthy, though, that these critics seem to find pragmatic policy value in the sub-categories, thereby ignoring that these too were constructed from various indicators and sub-indicators.

Given the extreme discrepancy between expected and actual performance for most countries, one must ask whether this failure is due to a deeper inherent weakness in the underlying concept of the index, or to other contextual factors. For instance, could underperformance be due to political decision makers not utilizing their countries' capabilities, or feeling overconfident? Could performance exceeding expectations be due to political decision makers compensating for a lack of preventative capacity through more stringent interventions, perhaps supported by geographical luck? We would argue that it is wrong to put the blame or praise solely on the side of politics, when in all of these countries the decision-making was presented as evidence-based, and the GHSI purports to capture the whole range of relevant, publicly available facts. It is important to note that we have little evidence that the index formed a key part of policy making in governments around the world, but it is clear that the index was constructed with this intention in mind.

Given that the Global Health Security Index was obviously meticulously prepared, based on a wealth of data compiled by a large group of international experts, we might even generalize the question and ask whether the production of any such global composite indicator, whatever its subject matter, makes sense at all as a strategic decision-making tool. We make the following observations in this regard:

(1) Indices like the GHSI comprise several layers of specificity, looking for measurable (quantifiable) features that are considered essential for higher-level properties.
By implication, the higher-level properties are not directly measurable, which is why one seeks to circumvent the problem by using subordinate indicators that indirectly contribute to the higher-level property. Typically, these higher-level properties resist direct measurement because of their complexity, which implies the possibility of diverse emergent and unpredictable phenomena.

(2) The upward process from the concrete to the more abstract implies building uncertainties upon uncertainties, without the means to quantify them precisely. In global studies, large uncertainties are typically already present through differences in how base local data are registered and counted. Communicating a single index score or rank for each country masks the inherent uncertainty and volatility in the measurements.

(3) Groups of properties that fall under a common concept are presumed to contribute to it in a uniformly linear and additive way. This excludes local variation in response to the threat or property indexed in the study; one such variation might be cultural differences in risk communication. It also ignores interdependencies and the mutual strengthening or weakening of constitutive properties.

(4) Constructing a global composite index as a strategic tool in decision-making presupposes the existence of a normative benchmark for ideal states. Any such benchmark will indirectly introduce a socio-political and cultural bias that does not do justice to the diversity of viewpoints among both experts and non-experts. Countries with high compliance with imposed rules and regulations may have other needs in terms of preparedness than countries with low compliance.

(5) While solid and comprehensive reporting of the way a global composite indicator was constructed may shield one from some academic criticism, the fact remains that users of the index, the policy makers, will almost certainly focus on the overall performance figures as reflected in the indicator. In the current example, the authors of the GHSI stated that "national health security is fundamentally weak around the world" and that "no country is fully prepared for epidemics or pandemics, and every country has important gaps to address" [3]. But that did not stop an MP in the South African National Assembly from claiming that "the Global Health Security Index for 2019 Report revealed that South Africa is ranked 34 out of 195 countries... this gives confidence that the South African government through the national Department of Health is doing all within its power to strengthen its health systems to safeguard the public from the outbreak of any other form of infections." [7] The key utility of a global index is ultimately being able to rely on the overall performance ranking. Thus, there is bias already in the framing of the issue, and alternative framings that would emerge in a transdisciplinary approach are seldom considered.

It is by no means surprising that scientific systematization will always include uncertainties and will never be completely objective. Facts and values are intertwined in science for policy; the framework of post-normal science [8] has stressed this for a long time. It has also been pointed out in areas outside of health that composite indices may be misleading and may hide important information in some of their elements. Giampietro and Saltelli [9] have raised this issue with regard to the global ecological footprint (EF), and this has spurred a number of reactions [10, 11, 12].
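Observation (2) can be illustrated with a toy Monte Carlo simulation: even modest, unreported uncertainty in the underlying sub-indicator scores produces a noticeable spread in the composite score that a single published number conceals. All values in the sketch below are invented for illustration and have no connection to the GHSI's actual data.

```python
# Toy Monte Carlo illustration of observation (2): uncertainty in sub-indicator
# scores propagates into the composite score. All numbers are hypothetical.
import random

random.seed(0)

nominal_scores = [70, 55, 40, 80, 65, 50]   # hypothetical sub-indicator scores
weights = [1, 1, 1, 1, 1, 1]                # equal weights for simplicity
noise_sd = 10                               # assumed measurement uncertainty

def composite(scores, weights):
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)

samples = []
for _ in range(10_000):
    perturbed = [min(100, max(0, random.gauss(s, noise_sd))) for s in nominal_scores]
    samples.append(composite(perturbed, weights))

samples.sort()
low, high = samples[int(0.05 * len(samples))], samples[int(0.95 * len(samples))]
print(f"Nominal composite score: {composite(nominal_scores, weights):.1f}")
print(f"90% interval under assumed noise: {low:.1f} - {high:.1f}")
```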
We also note other composite indices that rely on poor proxies, such as the OECD Better Life Index [13] or the Human Development Index [14], and indices that rely on subjective perceptions, such as university rankings [15]. The danger is that policy decisions made on the basis of flawed indices and rankings are likely to be equally flawed.

Let us put the question the other way round: could one make sensible and scientifically informed policies without these global indicators and indices? With the experience of Covid-19 fresh in our minds, we would venture that good pandemic policies (leaving the other issues aside for the time being) could be based on sensible data presentation and some simple heuristics, rather than on over-stated modelling with its inherent limitations. One key to effective control of the pandemic was acting preventatively at an early stage and implementing counter-measures like widespread testing, lockdowns and the closing of borders [16]. Taiwan is one of the best examples in this respect: noting the rapid rise of infections in neighboring China in late December 2019, it implemented widespread testing of incoming travelers and set in motion a National Health Command Center. It soon closed its borders, quarantined all cases, and rapidly promoted the use of face masks. Early detection and reaction were the key to controlling the pandemic in many countries, and they showed success. Laissez-faire attitudes, as in the UK, Sweden, or the USA, proved fatal. A UN report has this key message: "Act decisively and early to prevent the further spread or quickly suppress the transmission of COVID-19 and save lives" [17]. Other writers have already noted that simplicity may be a better guide than getting lost in complexities: "An imperative to prioritize simplicity over complexity is at the core of social health" [18].

Using data from the INGSA Policy-Making Tracker [https://www.ingsa.org/covid/policymaking-tracker-landing/], we have been able to analyze the interventions taken by over 120 countries and when they took place. From these data, we have seen two particular patterns so far. In East Asian countries such as Japan and South Korea, governments took swift action to increase the supply of PPE and face masks, and began public education campaigns at a very early stage (at least 14 days before the third death), avoiding the need for harsh restrictions. In some other developed countries, lockdown measures such as limiting gathering sizes, closing schools, limiting non-essential movement, and closing borders were implemented well before the threat of COVID-19 spiraled out of control. The countries that have fared the worst in terms of deaths per million people waited longer before implementing similar policies, as shown in Table 2. A similar analytical approach was taken by journalists at POLITICO when comparing lockdown measures across Europe [19].
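The timing comparison described above can be sketched in a few lines of code: for each country, compute the lag between the date a given intervention was introduced and the date of the third recorded death. The country names and dates below are hypothetical and do not reproduce the INGSA tracker data or Table 2; the sketch only shows the shape of the calculation.

```python
# Sketch of the timing comparison: days before (negative) or after (positive)
# the third recorded COVID-19 death that an intervention was introduced.
# All dates and country names are hypothetical.
from datetime import date

third_death_date = {
    "Country A": date(2020, 3, 20),
    "Country B": date(2020, 3, 10),
}
intervention_date = {   # e.g. date a border closure or mask campaign began
    "Country A": date(2020, 3, 1),
    "Country B": date(2020, 3, 25),
}

for country in third_death_date:
    lag = (intervention_date[country] - third_death_date[country]).days
    timing = "before" if lag < 0 else "after"
    print(f"{country}: intervention {abs(lag)} days {timing} the third death")
```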
It should be noted that a number of countries with fragmented state-level responses were not included in this comparison.

We hypothesize that there are some decision steps that may serve as useful heuristics for performing well in a pandemic:
1) Recognize the threat to your country and the need for a response early;
2) Agree on a broad societal basis what response strategy is most acceptable for your country, such as elimination of the virus within your borders, "flattening the curve", or keeping the occurrence of infections below a predefined damage threshold;
3) Fill out the chosen response strategy with a combination of practical measures appropriate to the epidemiological, economic, and socio-cultural circumstances;
4) Monitor, adjust, or change your chosen strategy according to the predefined goals and current data.

Through further analysis of the INGSA Policy-Making Tracker data, we aim to understand the role of the timing of interventions and the levels of responses, and to identify the sources of evidence and justification behind these approaches. This will allow us to categorize the different types of strategies taken by governments around the world, and to identify different styles of leadership and their ideological underpinnings. It is noteworthy that the most obvious control mechanism successfully used, border closure, was not seen by the WHO as a key response; yet border closure was key to elimination in island states, where early closure was perhaps easier. However, there is probably no global silver bullet to avoid or contain an emerging pandemic. Integrating the diversity of values, along with epidemiological, economic and cultural considerations, into a robust strategy is also a bridge to societal compliance.

We conclude with the hypothesis that in order to be prepared for pandemics or to manage other crises and global challenges, we may not need more sophistication in the construction of global composite indicators; these indices may not be as useful as they purport to be. We may actually do without them to a large extent, and could learn from the past instead, recognizing the power of simple heuristics that make sense for the context and point us in the right direction.

References

[1] Nuclear Threat Initiative & Johns Hopkins Center for Health Security. Global Health Security Index, 2019.
[2] Gallup. Median Household Income about $10,000.
[3] Here's the Johns Hopkins study President Trump referenced in his coronavirus news conference.
[4] Aitken et al. Rethinking pandemic preparation: Global Health Security Index (GHSI) is predictive of COVID-19 burden, but in the opposite direction.
[5] Razavi et al. The Global Health Security Index: what value does it add?
[6] Chang and McAleer. Alternative global health security indexes for risk analysis of COVID-19.
[7] Unrevised Hansard for Proceedings of the National Assembly.
[8] Science for the Post-Normal Age.
[9] Giampietro and Saltelli. Footprints to nowhere.
[10] Footprint facts and fallacies: A response to Giampietro and Saltelli.
[11] Questioning the ecological footprint.
[12] On the policy relevance of ecological footprints.
[13] OECD. Better Life Index.
[14] The human development index: a critical review.
[15] University Rankings: Theoretical Basis, Methodology and Impacts on Global Higher Education.
[16] Weathering COVID-19 storm: Successful control measures of five Asian countries.
[17] United Nations. Shared Responsibility, Global Solidarity: Responding to the Socio-economic Impacts of COVID-19.
[18] Simplicity, Clarity and Minimalism: Social health during COVID-19.
[19] POLITICO. Europe's coronavirus lockdown measures compared.

Acknowledgements: Matthias Kaiser gratefully acknowledges funding from the Norman Barry Foundation which enabled his stay at Koi Tū - The Centre for Informed Futures at the University of Auckland, and enabled his contribution to this paper. All authors are grateful for helpful discussion with Tatjana Buklijas, Kristiann Allen, and Anne Bardsley.