key: cord-0070611-n9kigvnl
authors: Ilseven, Ekin; Puranam, Phanish
title: Measuring organizational resilience as a performance outcome
date: 2021-11-30
journal: J Org Design
DOI: 10.1007/s41469-021-00107-1
sha: d2a157e768562a3beb38e64f793bed95990b2de0
doc_id: 70611
cord_uid: n9kigvnl

Interest in investigating organizational resilience has surged due to the increasing number of unexpected shocks and disruptions in the global economy. It is more important than ever to have well defined ways of measuring organizational resilience as a precursor to understanding its antecedents. In this article, we discuss the assumptions (regarding choices of counterfactuals and time intervals) needed to operationalize organizational resilience as a performance outcome and identify the minimal set of variables that can be used to estimate the resilience of an organization. We highlight the importance of the choice of time window (rule-based vs. variable) and counterfactuals (absolute vs. relative) to measure resilience.

Organizational resilience remains an important topic in both the research and practice of organization design, and it is studied through a diverse set of perspectives and methods (see Table 1). In this note, we propose an approach to measuring organizational resilience that is scalable and generalizable across contexts. Such an approach may prove useful to test the qualitative conclusions drawn from a large variety of inductive studies about resilient organizations that have accumulated in the literature (Van der Vegt et al. 2015).

The scope of our analysis is limited by three choices we make. First, our focus is on organizational (rather than individual) resilience, though we take a broad perspective on what an organization is (Puranam 2018). The measure of organizational resilience we propose here can be applied to organizations that are smaller (e.g., divisions, departments and teams) or larger (e.g., alliances, eco-systems, metaorganizations) than a single firm.

Second, our focus is on measuring organizational resilience as an observable outcome, and we are agnostic to the antecedents that produce it. The literature on the mechanisms that underlie organizational resilience is vast, beginning at least with Thompson's (1967) distinction between buffering vs. adaptation, and recent reviews provide a detailed account of how those seminal ideas have been developed (e.g., Williams et al. 2017; Mithani 2020). Accordingly, we do not offer any deep elaboration on the antecedents of these outcomes and encourage further research to do so.

Third, we propose to measure resilience in terms of changes to organizational performance after unexpected adversity (Weick and Sutcliffe 2001; Lengnick-Hall and Beck 2009). This means that our approach is ideally suited to measure organizational resilience when time series data on organizational performance are available.

Typically, an organization's performance after an unexpected adversity (such as the entry of a competitor, Argyres et al. 2015; an unexpected terror attack, Kendra and Wachtendorf 2003; an extreme weather event, Dutta 2017; an epidemic, Rao and Greve 2018; to name a few) will exhibit a drop at the onset of adversity, as well as a possible recovery. Obviously, the incidence and the magnitude of the adversity experienced by an organization should not be measured by the same performance metric being used to assess its resilience, to avoid circularity. Intuitively, there are at least four separately measurable components of resilience as an outcome: (a) the magnitude and (b) the rate of the drop in performance, and (c) the magnitude and (d) the rate of recovery in performance after the adverse event. Several labels have been used to describe these measurable components derived from performance trajectories, which have been discussed by Ayyub (2014) in detail (see also Tang 2019). For instance, the rate of drop in performance ranges from graceful to brittle as it increases. Robustness is described as the residual performance, associated with the magnitude of the drop of performance. Lack of full recovery in magnitude indicates scarring. In certain cases, the magnitude of performance recovery in the post-shock period may lead to higher performance than in the pre-shock period and is labelled antifragility (e.g., Taleb 2012; Kupers and Mullie 2014; Martin 2020).

While these notions are intuitive, when used in isolation, they make implicit assumptions about the window over which observations take place, and the counterfactual being used to evaluate performance. To make such assumptions explicit, we build on previous work (such as Tierney and Bruneau 2007; Cimellaro et al. 2016; Zhang et al. 2019; Tang 2019) and define a measure of organizational resilience as the negatively signed cumulative performance difference between a hypothetical performance trajectory that would have been realized in a world without the shock, $\pi_0$, and the realized performance trajectory, $\pi$ (see the area between dashed and solid curves in Fig. 1). In formal terms, this corresponds to the integral:

$$R = -\int_{t_{pre}}^{t_{post}} \left[ \pi_0(t) - \pi(t; \alpha) \right] dt \qquad (1)$$

where $t_{pre}$ and $t_{post}$ correspond to moments in time before the shock happens and after the shock happens, $\pi_0$ corresponds to the counterfactual performance, $\pi$ to the actual performance, and $\alpha$ denotes a set of parameters associated with organizational mechanisms determining resilience. The negative sign ensures that the more the realized performance matches or exceeds the counterfactual, the greater the organizational resilience and vice versa.
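As a minimal sketch of how Eq. (1) could be evaluated on sampled data, the integral can be approximated with the trapezoidal rule over aligned observations of actual and counterfactual performance. The function name `resilience` and the quarterly figures below are hypothetical illustrations, not values from the article.

```python
from typing import Sequence


def resilience(times: Sequence[float],
               actual: Sequence[float],
               counterfactual: Sequence[float]) -> float:
    """Discrete version of Eq. (1): minus the trapezoidal area between the
    counterfactual and the actual performance over [times[0], times[-1]]."""
    if not (len(times) == len(actual) == len(counterfactual)) or len(times) < 2:
        raise ValueError("need at least two aligned observations")
    area = 0.0
    for i in range(1, len(times)):
        gap_prev = counterfactual[i - 1] - actual[i - 1]
        gap_curr = counterfactual[i] - actual[i]
        area += 0.5 * (gap_prev + gap_curr) * (times[i] - times[i - 1])
    return -area


# Hypothetical quarterly series: flat counterfactual, shock after quarter 1.
quarters = [0, 1, 2, 3, 4, 5, 6]
counterfactual = [100.0] * 7
actual = [100.0, 100.0, 80.0, 70.0, 85.0, 100.0, 110.0]
r_value = resilience(quarters, actual, counterfactual)
print(r_value)  # -60.0
```

In this toy series the late overshoot above the counterfactual offsets part of the earlier loss, which is why the result is less negative than the loss accumulated up to full recovery.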

The four intuitive measures (drop, rate of drop, recovery and rate of recovery) can be shown to be approximations of R and positively correlated with it (see Appendix). However, there may be negative correlations among them, implying that researchers who use different subsets of these measures may reach opposite conclusions about the same firms. For instance, drop of performance and time to recovery are two variables measured in several works (Table 1). However, time to recovery is the product of a subset of variables we have derived, given that Time to Rec. = Drop × (1/Rate of Drop + 1/Rate of Rec.). There are two important consequences of using these two variables:

First, it omits the possibility that organizations may recover beyond the previous performance, leading to an incomplete description of resilience by omitting the fourth component of the R measure (the level of recovery). Second, when not adjusted for the drop, it becomes a measure of the sum of correlations between the drop of performance and the inverse rate of drop and the inverse rate of recovery; this runs the risk of drawing inconsistent conclusions from measurements. We argue that, if all four variables are used together, they constitute a minimal set of variables that can capture an unbiased (even if approximate) measure of the resilience of an organization. Therefore, a key injunction from our analysis is that researchers should ideally aim to use all four measures, or at least make explicit their assumptions about the measures they are not capturing, in order to make the resilience measurement more transparent.

Table 1 Our literature review revealed 19 empirical works, of which 11 operationalized resilience either as a dependent variable (outcome) or as an independent variable (trait). The table illustrates the diversity in operationalizations and empirical methods.

In its general form, this measure directs our attention to two explicit choices, namely, the time interval $(t_{pre}, t_{post})$, and the counterfactual trajectory, $\pi_0(t)$. Below we discuss the implications of these choices for observed organizational resilience. We summarize the following discussion in Table 2, highlighting the assumptions, advantages, limitations, and conditions according to which one assumption may be more suitable than the other.
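To make the four components concrete, the following sketch extracts them from a sampled gap series (counterfactual minus actual performance) and checks the identity Time to Rec. = Drop × (1/Rate of Drop + 1/Rate of Rec.). It assumes average rates over the drop and recovery phases and a gap of zero at onset; the function name `four_components` and the data are hypothetical.

```python
def four_components(times, gap):
    """Four descriptors from a gap series, gap[i] = counterfactual - actual.

    Assumes times[0] is the shock onset with gap[0] == 0. Rates are averages
    over the drop phase and the recovery phase, respectively."""
    i_min = max(range(len(gap)), key=lambda i: gap[i])  # trough of performance
    drop = gap[i_min]
    if drop <= 0:
        raise ValueError("performance never falls below the counterfactual")
    rate_of_drop = drop / (times[i_min] - times[0])
    # First observation at or above the counterfactual after the trough.
    i_rec = next((i for i in range(i_min + 1, len(gap)) if gap[i] <= 0), None)
    if i_rec is None:  # no full recovery inside the observation window
        rate_of_recovery = time_to_recovery = None
    else:
        rate_of_recovery = drop / (times[i_rec] - times[i_min])
        time_to_recovery = times[i_rec] - times[0]
    return {
        "drop": drop,
        "rate_of_drop": rate_of_drop,
        "rate_of_recovery": rate_of_recovery,
        "recovery_level": -gap[-1],  # > 0 above counterfactual, < 0 scarring
        "time_to_recovery": time_to_recovery,
    }


# Hypothetical gap series sampled once per period.
c = four_components([0, 1, 2, 3, 4, 5, 6], [0, 15, 30, 20, 10, 0, -5])
implied = c["drop"] * (1 / c["rate_of_drop"] + 1 / c["rate_of_recovery"])
print(c["time_to_recovery"], implied)
```

With average rates the identity holds by construction, which illustrates the point of the text: time to recovery adds no information beyond the drop and the two rates, and it says nothing about the level reached after recovery.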

Choice of counterfactual $\pi_0(t)$

Measuring the absolute resilience of an organization requires determining a counterfactual which describes the performance of an organization as if the shock did not take place. This approach is, for example, visualized by Ayyub (2014); the labels arise from the comparison of a realized performance curve with an established counterfactual one. In the organizational context, when the exposure to the adverse event cannot be randomized, the determination of absolute resilience relies on forecasting methods. For example, in the context of regional economic organization, Sensier et al. (2016) use macro business cycles to forecast expected performances and determine the drop of performance at each cycle to measure resilience. In Fig. 1, we see an illustration of this: Martin Marietta Materials Inc. (MMM) is an S&P500 company in the construction materials industry per GICS categorization, coded as 151020. As the quarterly revenue curve (solid line) shows, during the 2008 Financial Crisis, the company suffered a 50% drop in revenues over almost 2 years. To assess the absolute resilience, we used the Holt-Winters exponential smoothing method to forecast what would have happened if the shock had not occurred (dashed line beginning in 2008-Q1). In this case, we observe that the actual curve surpasses the forecasted one in 2014-Q1, indicating that MMM experienced an antifragile outcome after this shock. However, such forecasting methods rely on time series data up to the moment of the shock and strictly assume that a model derived from past data predicts future outcomes well enough; forecasts over prolonged horizons may not fulfill this assumption.
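The forecasting step can be illustrated with a simplified additive Holt-Winters smoother written in plain Python. This is a sketch under stated assumptions, not the implementation used for Fig. 1: the smoothing parameters `alpha`, `beta` and `gamma` are fixed by hand rather than estimated, the initialization uses only the first two seasons, and the pre-shock revenue series is hypothetical.

```python
def holt_winters_additive(history, season_len, horizon,
                          alpha=0.3, beta=0.1, gamma=0.2):
    """Forecast `horizon` periods ahead from pre-shock `history` using
    additive level, trend and seasonal components (fixed smoothing weights)."""
    m = season_len
    if len(history) < 2 * m:
        raise ValueError("need at least two full seasons of pre-shock data")
    first = sum(history[:m]) / m           # mean of season 1
    second = sum(history[m:2 * m]) / m     # mean of season 2
    trend = (second - first) / m           # average change per period
    mid = (m - 1) / 2                      # time index the season-1 mean refers to
    # Seasonal offsets relative to the initial trend line.
    season = [history[i] - (first + (i - mid) * trend) for i in range(m)]
    level = first + (m - 1 - mid) * trend  # level at the end of season 1
    for t in range(m, len(history)):
        s = season[t % m]
        prev_level = level
        level = alpha * (history[t] - s) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        season[t % m] = gamma * (history[t] - level) + (1 - gamma) * s
    n = len(history)
    return [level + h * trend + season[(n - 1 + h) % m]
            for h in range(1, horizon + 1)]


# Hypothetical pre-shock quarterly revenue: linear growth plus seasonality.
seasonal = [-5.0, 5.0, 10.0, -10.0]
pre_shock = [100.0 + 2.0 * t + seasonal[t % 4] for t in range(12)]
counterfactual = holt_winters_additive(pre_shock, season_len=4, horizon=4)
print([round(x, 2) for x in counterfactual])
```

The forecast only extends the pre-shock pattern, so its credibility as a counterfactual declines with the forecast horizon, in line with the caveat above.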

An alternative approach measures the relative resilience of an organization, where the counterfactual is the observed performance of a group of organizations that have faced the same adversity. For instance, one can compare the performance of a firm to the industry average, in an industry that has been affected by an adversity befalling all its firms at the same point in time. The relative resilience of firms to the COVID-19 pandemic or the government regulatory restrictions that arose to cope with it can easily be measured in this way. Measurement of relative resilience is considerably easier as the counterfactual curve is derived from observable performance. The counterfactual group can also be constructed using synthetic controls (Abadie et al. 2010; Tirunillai and Tellis 2017; Conti and Valentini 2018) ahead of a shock to establish more accurate counterfactuals after the shock period. In this measurement, it is assumed that the only heterogeneity in shock response arises from the underlying resilience related capabilities (the α parameters in Eq. 1), not from differing levels of shocks across organizations. Otherwise, organizations that are hit by smaller shocks may appear to be more resilient, although they just faced less adversity.

Fig. 1 (caption fragment) [...] whereas its revenue recovers to pre-shock levels in Q3 2013. For high seasonality and growth companies, forecasting will offer more accurate measures. Accordingly, in the post-shock period, the company improves on its pre-shock trend and experiences a larger growth (antifragile outcome)

In Fig. 2a (top), we illustrate the measurement of relative resilience for MMM and its peers, Vulcan Materials Co. (VM) and James Hardie Industries PLC (JHI). A closer look at the industry of MMM shows that it is rather concentrated; the top 3 companies with more than 10% market share cover almost 80% of the market in the time period of interest. Since the impact of the shock will differ across different market shares, we focus on these top three companies to establish the counterfactual through averaging. Adjusting the performance for pre-shock performance (2007-Q4),1 we see that during the shock period the market leader VM has taken advantage of its market position and continued to perform better compared to its peers. However, its relative resilience declines drastically over the longer term. Meanwhile, JHI, MMM's next closest rival, falls below the population response, only to recover strongly in the post-shock period. Overall, MMM's relative resilience remains unperturbed. In Fig. 2b, we further illustrate how the choice of peers matters. The resilience of all three companies is considerably underestimated when an industry-wide average is used instead of the peer based one. To minimize such errors in choosing the appropriate peer group, their bottom-up determination through synthetic control approaches may prove to be much more appropriate for these purposes. For comparison purposes with the absolute resilience measure, we also illustrate the results following from Eq. (1) for the three companies in Fig. 2a (bottom). We observe that the choice of counterfactual can impact the analysis: MMM turns out to be the most resilient almost throughout the whole observation period. Meanwhile, VM's early resilience remains both for relative and absolute measures.

1 The condition $\Delta(t_{onset}) = 0$ in Eq. (A2) in Appendix is not fulfilled for the counterfactual based on peer average as $\Delta(t_{onset}) = \bar{\pi}(t_{onset}) - \pi(t_{onset}) \neq 0$, $\bar{\pi}$ being the peer average performance. We propose to remove this initial non-zero difference from the integrand and capture only the non-linear response, which is also implemented in our examples.

Fig. 2 [...] (1). All companies with more than 5% market share (top three companies registered under GICS Industry code 151020) cover more than 75% of the mining and quarrying industry. 3rd one is dropped as its downturn (and eventual bankruptcy) during the financial crisis is because of idiosyncratic reasons. Martin Marietta Materials Inc. is resilient during the shock, overall maintaining its performance during and beyond the shock. The market leader Vulcan Materials Co. shows a resilient response in the short term, only to fall behind peer response in the long run. Finally, James Hardie Industries PLC shows a highly resilient response in the long run, although in the immediate short term it performs worst in the peer group. (Bottom) As comparison, we show the absolute resilience measure of each company. MMM appears to be the most resilient compared with its own forecasted counterfactual, whereas the rest of its peers continue to underperform. This illustrates how the counterfactual choice can impact analysis. b Measurement of relative resilience using Eq. (1) using all companies categorized under the same 6-digit GICS code. Resilience is drastically lower for the top companies in the long run, whereas it is increased in the short run. More specifically, we see that MMM and James Hardie Industries PLC show an almost non-negative resilience measure. However, MMM remains scarred in the long run, which is not the case in a (top)
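The peer-average counterfactual, including the removal of the initial non-zero difference at onset, can be sketched as follows. The sketch assumes equally spaced observations, performance already indexed to the pre-shock period, and a peer average that includes the focal firm; the function name `relative_resilience` and the three series are hypothetical and do not reproduce the Fig. 2 data.

```python
def relative_resilience(firm, group, onset=0):
    """Running relative resilience of `firm` against the average of `group`.

    Returns the cumulative value of Eq. (1) after each period following
    `onset`, with the gap at onset subtracted from the integrand."""
    n = len(firm)
    peer_avg = [sum(series[t] for series in group) / len(group) for t in range(n)]
    gap = [peer_avg[t] - firm[t] for t in range(n)]
    offset = gap[onset]                    # initial difference at shock onset
    gap = [g - offset for g in gap]
    curve, area = [], 0.0
    for t in range(onset + 1, n):
        area += 0.5 * (gap[t - 1] + gap[t])  # trapezoid with unit time step
        curve.append(-area)
    return curve


# Hypothetical indexed revenues (pre-shock period = 100) for three peers.
firms = {
    "a": [100.0, 90.0, 80.0, 90.0, 100.0],
    "b": [100.0, 80.0, 70.0, 90.0, 110.0],
    "c": [100.0, 100.0, 90.0, 90.0, 90.0],
}
group = list(firms.values())
curves = {name: relative_resilience(series, group) for name, series in firms.items()}
print(curves["b"])  # [-5.0, -15.0, -20.0, -15.0]
print(curves["c"])  # [5.0, 15.0, 20.0, 15.0]
```

Because the counterfactual here is the mean of the same group, the curves sum to zero in every period: one firm's relative gain is another's relative loss. Firm "c" looks resilient early but loses ground later, while "b" does the opposite, so the ranking depends on the horizon.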

We see two potential ways of determining the interval of measurement. Many researchers apply a rule-based choice of time window, such as the time to full recovery. However, its implementation poses several problems that are typical of time series. For example, fast-growing organizations tend to grow fast in the post-shock period as well, which shortens the time to full recovery. Here fast growth is a firm-specific parameter that is not part of α. Another issue arises from seasonality effects, as the seasonal fluctuations make it difficult to pinpoint the start and the end of the shock period. When we revisit Fig. 1, we see that MMM's revenue stream demonstrates a positive trend and strong seasonality, with maxima frequently coinciding with summers and minima with winters. First, the onset and the end of the shock will depend on which quarter is being considered, Q2s showing quicker recovery than Q4s. Meanwhile, the pre-shock trend and the post-shock trend do not seem to differ significantly; determining the shock duration through the moment of full recovery (without considering the trend) may lead to an overestimation of resilience by 1 year. This has also been noted by Tang (2019). Finally, rules that assume the shock period is over at full recovery directly limit the assessment of the post-shock magnitude of performance recovery, as it may be realized long after the recovery period. Using the time to full recovery by definition misses out on the anti-fragile outcome of MMM, which is observed past 2014. As such, the ease of the implementation of "time to full recovery" windows comes with crude and potentially misleading approximations to an organization's resilience.
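The effect of ignoring the trend in a "time to full recovery" rule can be shown with a small sketch. It compares recovery to the pre-shock level with recovery to a trend-based counterfactual; the function name `time_to_recovery`, the growth rate and the series are hypothetical.

```python
def time_to_recovery(series, onset, target):
    """Number of periods after `onset` until series[t] first reaches
    target(t); returns None if that never happens within the data."""
    for t in range(onset + 1, len(series)):
        if series[t] >= target(t):
            return t - onset
    return None


# Hypothetical growing firm: pre-shock level 100, pre-shock trend +2 per period.
performance = [100, 90, 85, 95, 102, 110, 118, 124]

level_rule = time_to_recovery(performance, 0, lambda t: 100)          # ignores trend
trend_rule = time_to_recovery(performance, 0, lambda t: 100 + 2 * t)  # follows trend
print(level_rule, trend_rule)  # 4 5
```

The level-based rule declares recovery one period earlier than the trend-based rule, so a fast-growing organization appears more resilient than it is relative to its own counterfactual.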

An alternative to this rule-based choice is the consideration of time windows that stretch into both the past and the future of the shock period. This ensures a better estimation of the magnitude of recovery in the post-shock period. Moreover, one could consider short-term, mid-term, and long-term resilience outcomes that could offer a more nuanced understanding of an organization's resilience. In Fig. 2a (top), using Eq. (1), we see that the market leader's resilient response in the short- to mid-term did not last forever. JHI's post-shock performance, meanwhile, indicates anti-fragility, with strong post-shock revenue growth. On the other hand, the choice of these time windows would have to pay attention to avoiding the occurrence of other shocks in the past and the future.

The resilience outcome should be assessed on a single shock basis to avoid construct validity problems. The time window should also be sufficiently localized to capture the resilience response. Nokia's decline in performance due to competition from Blackberry, Microsoft, and Apple in 2004 lasted a long time (Doz and Wilson 2017). After almost a decade, we observe that Nokia is once again a rising player in the semiconductor and infrastructure markets. Nevertheless, these two performance outcomes are not associated with a response to the same unexpected shock. This tension between the length of the observation window and the need to exclude other adverse events is very similar to that arising in using cumulative abnormal returns on share prices when conducting event studies (e.g., Christie 1983). As with event studies, it may be useful to report resilience measures for different time windows to assess the robustness of conclusions.
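Reporting the measure over several windows, as suggested above, can be sketched as follows. The sketch assumes equally spaced observations and windows that all start at the shock onset; the function name `resilience_by_window`, the horizons and the data are hypothetical.

```python
def resilience_by_window(actual, counterfactual, onset, horizons):
    """Eq. (1) evaluated from `onset` to `onset + h` for each horizon h,
    using the trapezoidal rule with a unit time step."""
    gap = [c - a for a, c in zip(actual, counterfactual)]
    results = {}
    for h in horizons:
        end = min(onset + h, len(gap) - 1)  # clip to the available data
        area = sum(0.5 * (gap[t - 1] + gap[t]) for t in range(onset + 1, end + 1))
        results[h] = -area
    return results


# Hypothetical series: drop, full recovery, then performance above counterfactual.
actual = [100, 100, 80, 70, 85, 100, 110, 115, 115]
counterfactual = [100] * 9
windows = resilience_by_window(actual, counterfactual, onset=1, horizons=(2, 4, 7))
print(windows)  # {2: -35.0, 4: -65.0, 7: -32.5}
```

Here the mid-term window, which ends at full recovery, gives the most negative value, while the long window credits the later overshoot. Reporting all three makes the dependence on the window visible.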

While the time window to study organizational resilience to any particular shock should not accidentally include other shocks, somewhat paradoxically it is useful to study multiple shocks to estimate the organizational parameters captured by α in Eq. (1) in an unbiased manner, that is, to say something confidently about what the mechanisms and antecedents to an organization's resilience are. This is because an empirical study that considers a single shock will ultimately be equivalent to conducting a cross-sectional study: we cannot be sure if the observed resilience is due to observed organizational features or unobserved heterogeneity. In contrast, observing resilience to multiple shocks is equivalent to estimating a fixed effect in panel data, which allows for control of all stable unobserved heterogeneity at the panel level. Table 1 shows that Dutta (2017) and Rao and Greve (2018) are rare instances of works comparing the resilience of organizations across multiple shocks. It is important to note that in such a case the interpretation of an organization's resilience will differ depending on the nature of the multiple shocks: the literature differentiates between general and specific resilience (Nykvist and Von Heland 2014). When recurring shocks are of the same nature, we measure specific resilience. On the other hand, currently, many scholars question whether resilience to the 2008 Financial Crisis is predictive of resilience to the 2020 Coronavirus healthcare crisis. Such studies may inform us regarding the general resilience of an organization. Finally, we also highlight the fact that studies identifying the antecedents of organizational resilience (α) through such a longitudinal method also have to assume stability in the antecedents over the time frame considered to make consistent inferences.
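The panel analogy can be illustrated with a deliberately simple sketch: given resilience values measured for the same organizations across several shocks, subtracting each shock's average isolates a stable organization-level component. This is an illustration of the idea only, assuming a balanced panel and additive effects; it is not an estimator proposed in the article, and the function name `organization_effects` and the numbers are hypothetical.

```python
def organization_effects(r):
    """r[org][shock] holds a measured resilience value (balanced panel).

    Returns, per organization, its average resilience after removing the
    average resilience observed for each shock across organizations."""
    orgs = list(r)
    shocks = list(next(iter(r.values())))
    shock_mean = {s: sum(r[o][s] for o in orgs) / len(orgs) for s in shocks}
    return {
        o: sum(r[o][s] - shock_mean[s] for s in shocks) / len(shocks)
        for o in orgs
    }


# Hypothetical resilience values for two organizations and two shocks.
measured = {
    "A": {"shock_1": -40.0, "shock_2": -10.0},
    "B": {"shock_1": -60.0, "shock_2": -30.0},
}
effects = organization_effects(measured)
print(effects)  # {'A': 10.0, 'B': -10.0}
```

With a single shock, the same calculation could not separate a stable organizational component from whatever was specific to that organization in that one episode.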

So far we have set forth a rather technical discussion of how to measure organizational resilience, which we summarized in Table 2 . In addition to this, we would like to raise several caveats regarding its application in the context of larger research design and address fundamental blocks of organizational resilience research, namely, the organization, the shock, and the performance. The organization-shock pair constitutes the main unit of analysis as the performance outcome can only be generated by such a pair. Accordingly, their properties require close scrutiny in the larger research design context.

(1) Consistent unit of analysis The initial response to the 2020 Coronavirus pandemic included some companies, such as General Motors, L'Oreal, and Dyson, taking on crisis (pandemic) specific business activities thanks to their economies of scope. More drastically, some companies engaged in divestitures (e.g., GE divesting GECAS and Dell divesting VMWare) and some others filed for bankruptcy (e.g., Hertz). Beyond these observations, Lin et al. (2006) find that 25 out of 80 organizations observed in crisis have changed their organizational design. These responses to crises direct our attention to the consistency of the unit of analysis. Researchers will need to specify the identity of the organizational entity whose resilience they are interested in; the identity of this entity has to be stable and its performance should remain measurable and consistent before and after the shock.

(2) Appropriate performance metric Markman and Venzin (2014) document that six major banks performed the best along seven different performance metrics during the 2008 Financial Crisis. This observation demonstrates that an organization can be assessed along many performance metrics and that the choice will determine the interpretation accordingly. Once again, the research context will be the ultimate guide in choosing the performance metric of interest (e.g., see Modica and Reggiani's (2015) review documenting various metrics used in regional economics). In the organizational context, we would like to point out that organizations may differ in their purposes and their performance metric of concern may vary accordingly. Performance feedback and aspiration level theory (Greve 1998) indicates that organizations will respond to a downturn in some performance metrics more than in others. Considering such context specific central performance metrics may be a useful lead to follow in choosing the appropriate one.

(3) Shock duration can range from short-term perturbations to long-lasting environmental shifts. We argue that for short-term perturbations, where the organizational environment mostly returns to its original state, forecasting methods facilitating the measurement of absolute resilience will be suitable. On the other hand, for long-lasting systemic shifts, measurement of relative resilience may be preferable, as other methods of building counterfactuals may become questionable in their long-term accuracy. The nature of the organization-shock pair will determine the applicability of these methods. For instance, during the 2008 Financial Crisis, many companies were hit by the lack of financial resources and lower demand due to the recession. On the other hand, banks faced a series of regulation changes corresponding to long-term environmental shifts. Such shock related factors play an important role in choosing the appropriate method, as illustrated in the last column of Table 2.

We encourage future empirical research to elaborate carefully on these research design elements as well as discuss other properties of these elements that may be impactful in measuring organizational resilience, eventually contributing to and extending the earlier discussions by Carpenter et al. (2001) and Powley et al. (2020) .

Several conceptualizations of resilience in terms of performance outcomes following an unexpected adversity are available in the literature (e.g., robustness, anti-fragility, brittleness) but, as we have shown, they involve implicit assumptions about time windows and counterfactuals; counterfactuals can be built through forecasting, leading to absolute measures, or through peer-based estimations, leading to relative measures. Meanwhile, time windows can be determined by rule-based methods or can be varied for richer interpretations. These choices have a significant effect on what can be measured and how it should be interpreted. Their suitability will, however, depend on the research question. A field such as strategic management may value competitive dynamics more and accordingly emphasize relative resilience, whereas organization design may rather be interested in the relationship between certain organizational design choices and absolute resilience. Furthermore, we pointed out three important caveats regarding the match between the measurement and the research design: the consistency of the unit of analysis, the relevance of the performance metric, and consideration of the shock duration (among other properties) are important research design related factors to consider in measurement and should at least be discussed transparently. The minimal set of variables derived from Eq. (1) may open up a fruitful research direction aimed at capturing the configurational nature of organizational resilience, more specifically the correlation structure among all four variables of the minimal set. We are optimistic about future theoretical developments in this direction, which may complement research on the mechanisms that produce resilience as an outcome.

Funding Not applicable.

Availability of data and materials Not applicable.

Competing interests No competing interests to declare.

Synthetic control methods for comparative case studies: estimating the effect of California's tobacco control program

Dominant designs, innovation shocks, and the follower's dilemma

Systems resilience for multihazard environments: definition, metrics, and valuation for decision making

IQ and ego-resiliency: conceptual and empirical connections and separateness

CEO narcissism, risk-taking, and resilience: an empirical analysis in US commercial banks

From metaphor to measurement: resilience of what to what

On information arrival and hypothesis testing in event studies

PEO-PLES: a framework for evaluating resilience

Super partes? Assessing the effect of judicial independence on entry

Caught in the crossfire: dimensions of vulnerability and foreign multinationals' exit from war-afflicted countries

Bouncing back: building resilience through social and environmental practices in the context of the 2008 global financial crisis

Creating in the crucibles of nature's fury: associational diversity and local social entrepreneurship after natural disasters in California

Performance, aspirations, and risky organizational change

Elements of resilience after the world trade center disaster: reconstituting New York City's Emergency Operations Centre

Turbulence: a corporate perspective on collaborating for resilience

Resilience capacity and strategic agility: prerequisites for thriving in a dynamic environment. UTSA, College of Business

Organizational design and restructuring in response to crises: Lessons from computational modeling and real-world cases

Resilience: lessons from banks that have braved the economic crisis-and from those that have not

When more is not better: overcoming America's obsession with economic efficiency

Spatial economic resilience: overview and perspectives

Social-ecological memory as a source of general and specified resilience

The long-term benefits of organizational resilience through sustainable business practices

CEO positive psychological traits, transformational leadership, and firm performance in high-technology start-up and established firms

Research handbook on organizational resilience

Disasters and community resilience: Spanish flu and the formation of retail cooperatives in Norway

CEO greed, corporate social responsibility, and organizational resilience to systemic shocks

Measuring regional economic resilience across Europe: operationalizing a complex concept

Resources for change: the relationships of organizational inducements and psychological resilience to employees' attitudes and behaviors toward organizational change

Shady characters: the implications of illicit organizational roles for resilient team performance

Quantitative assessment of resilience in complex systems (Doctoral dissertation)

Conceptualizing and measuring resilience: A key to disaster loss reduction

Does offline TV advertising affect online chatter? Quasi-experimental analysis using synthetic control

Managing risk and resilience

Development and psychometric

Organizational response to adversity: fusing crisis management and resilience research streams

Positive organizational behavior in the workplace: the impact of hope, optimism, and resilience

Scale-free resilience of real traffic jams


For generalizability, we first denote with Δ the performance difference between the counterfactual and the actual performance, i.e., $\Delta(t) = \pi_0(t) - \pi(t; \alpha)$, the integrand of Eq. 1, which can be applied both for estimation of absolute and relative organizational resilience. Next, we separate the components of the integral in Eq. 1 and re-write as follows:

$$-R = \int_{t_{pre}}^{t_{onset}} \Delta(t)\,dt + \int_{t_{onset}}^{t_{min}} \Delta(t)\,dt + \int_{t_{min}}^{t_{rec}} \Delta(t)\,dt + \int_{t_{rec}}^{t_{afrag}} \Delta(t)\,dt + \int_{t_{afrag}}^{t_{post}} \Delta(t)\,dt$$

The first integral term captures the performance difference between the time when measurement begins and the shock onsets $(t_{pre}, t_{onset})$, the second term from shock onset until performance reaches its minimum $(t_{onset}, t_{min})$, the third term from when minimum performance is reached until the performance fully recovers $(t_{min}, t_{rec})$, the fourth term from full recovery to attaining a steady antifragile state $(t_{rec}, t_{afrag})$, and the last one from equilibration to when the measurement of resilience ends $(t_{afrag}, t_{post})$. Assuming a steady state before the onset, Δ is zero until shock onset and $\Delta(t_{onset}) = 0$.

To a first order approximation, we can use the trapezoidal rule (a method of quadrature) to estimate the second term:

$$\int_{t_{onset}}^{t_{min}} \Delta(t)\,dt \approx \frac{(t_{min} - t_{onset})\left[\Delta(t_{onset}) + \Delta(t_{min})\right]}{2} = \frac{\Delta(t_{min})^2}{2d}$$

where $d$ denotes the average deterioration rate following the shock. The second step follows from the fundamental theorem of calculus:

$$\Delta(t_{min}) = \Delta(t_{min}) - \Delta(t_{onset}) = \int_{t_{onset}}^{t_{min}} \dot{\Delta}(t)\,dt = (t_{min} - t_{onset}) \cdot \frac{\int_{t_{onset}}^{t_{min}} \dot{\Delta}(t)\,dt}{t_{min} - t_{onset}} = (t_{min} - t_{onset})\,d$$

where we used $\Delta(t_{onset}) = 0$ and the ratio term corresponds to the average deterioration rate. Given that $\Delta(t_{onset}) = \Delta(t_{rec}) = 0$, we apply the same approximation to the third and fourth terms as well to obtain

$$R \approx -\left[\frac{\Delta(t_{min})^2}{2d} + \frac{\Delta(t_{min})^2}{2r} - \frac{\Delta(t_{afrag})^2}{2r}\right] - \int_{t_{afrag}}^{t_{post}} \Delta(t)\,dt \qquad (A1)$$

where $r$ denotes recovery rate, $\Delta(t_{afrag}) \le 0$ when performance ends above the counterfactual, and, for simplification, we assumed that the average recovery rate after reaching minimum remains the same until the antifragile state is reached even after total recovery is realized. The last integral corresponds to a measurement far into the future after the shock and should be minimized if possible by choosing an appropriate $t_{post}$.

In this approximation, we see all four intuitive measures offer an approximation to the full integral we propose: rate of performance drop ($d$), magnitude of performance drop ($\Delta(t_{min})$), rate of performance recovery ($r$) and the level to which the performance recovers ($\Delta(t_{afrag})$).

It is important to note that this linear approximation does not capture descriptions that incorporate non-linearity, such as those offered by Ayyub (2014). For example, the author describes a sudden drop in performance as brittle and a slower drop as graceful. Similarly, he visualizes different recovery trajectories, some accelerating and some others decelerating.

However, the trapezoidal rule does offer further insight into the error that can appear due to the first order approximation. For example, the absolute value of the error for the first of these approximated integrals (from $t_{onset}$ to $t_{min}$) corresponds to

$$\frac{(t_{min} - t_{onset})^3}{12}\left|\ddot{\Delta}(t')\right|$$

where $\ddot{\Delta}(t')$ denotes the second derivative of the performance trajectory at some point $t'$ in the interval. Trajectories such as brittleness or gracefulness differ along the second derivative and higher order approximations will be necessary to reduce such errors, essentially requiring a larger set of variables than the four we have proposed.
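As a numerical check on the first-order approximation, one way to write the approximated area between counterfactual and actual performance is drop²/(2d) + drop²/(2r) − overshoot²/(2r), where overshoot is the amount by which performance ends above the counterfactual. The sketch below compares this expression with a direct trapezoidal integration of a piecewise-linear gap, for which the approximation is exact. The function names and the trajectory are hypothetical, and the sign convention (gap = counterfactual minus actual) is an assumption of this sketch.

```python
def linear_area(drop, d, r, overshoot):
    """First-order area between counterfactual and actual performance:
    a triangle for the drop phase, a triangle for the recovery phase,
    minus a triangle for the overshoot beyond full recovery."""
    return drop ** 2 / (2 * d) + drop ** 2 / (2 * r) - overshoot ** 2 / (2 * r)


def trapezoid(ts, ys):
    """Trapezoidal integral of the samples ys taken at times ts."""
    return sum(0.5 * (ys[i - 1] + ys[i]) * (ts[i] - ts[i - 1])
               for i in range(1, len(ts)))


# Hypothetical piecewise-linear gap: rises to 30 at rate 15 per period,
# falls at rate 10 per period, and ends 5 below zero (overshoot).
ts = [0.0, 2.0, 5.0, 5.5]    # onset, trough, full recovery, steady state
gap = [0.0, 30.0, 0.0, -5.0]

approx = linear_area(drop=30.0, d=15.0, r=10.0, overshoot=5.0)
direct = trapezoid(ts, gap)
print(approx, direct)  # 73.75 73.75
```

For curved trajectories (for example a brittle or graceful drop) the two values would differ, with the discrepancy governed by the second derivative of the gap, as discussed above.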

We conducted our literature search by searching for the word "resilience", "resiliency", or "resilient" on the online platforms of 11 leading management journals, namely,