key: cord-0305064-h30ovg12
authors: Pal, Ranjan; Huang, Ziyuan; Yin, Xinlong; Lototsky, Sergey; De, Swades; Tarkoma, Sasu; Liu, Mingyan; Crowcroft, Jon; Sastry, Nishanth
title: Aggregate Cyber-Risk Management in the IoT Age: Cautionary Statistics for (Re)Insurers and Likes
date: 2021-05-04
journal: nan
DOI: 10.1109/jiot.2020.3039254
sha: f44fcb7a70497ec7eb864eef4a34a6dabb223e69
doc_id: 305064
cord_uid: h30ovg12

In this paper, we provide (i) a rigorous general theory to elicit conditions on (tail-dependent) heavy-tailed cyber-risk distributions under which a risk management firm might find it (non)sustainable to provide aggregate cyber-risk coverage services for smart societies, and (ii) a real-data driven numerical study to validate claims made in theory assuming boundedly rational cyber-risk managers, alongside providing ideas to boost markets that aggregate dependent cyber-risks with heavy tails. To the best of our knowledge, this is the only complete general theory to date on the feasibility of aggregate cyber-risk management.

IoT-driven smart cities are examples of service-networked ecosystems that are on the rise around the globe, with major cities like Singapore, Dubai, Barcelona, and Amsterdam being working examples. The proper functioning of such cities depends heavily on the success of supply chain relationships from diverse sectors such as automobiles, electronics, energy, finance, aerospace, etc. In the IoT age, these relationships are often realized via large-scale systemic network linkages (see Figure 1.1 in [1]) that operate via the interplay of IoT hardware (e.g., sensors, actuators, cameras), application software (e.g., Oracle for DBMS support, cloud service software), and IoT firmware. Currently, robust IoT security is a challenge [2], with a significant fraction of users controlling IoT systems being naive about effective cyber-security practices (e.g., the use of non-default device passwords, periodic patch updates).
Consequently, a cyber-attack exploiting a software vulnerability can have a catastrophic cascading service disruption effect that could amount to losses in billions of dollars across various service sectors. Recent examples of such cyber-attacks include the Mirai DDoS (2016), NotPetya ransomware (2017), and WannaCry ransomware (2017) attacks, which wreaked havoc among firms in various industries across the globe, resulting in huge financial losses due to service interruption (see [1] for more examples). As a result of such large losses, a certain section of society could be negatively impacted and experience psychological depression and disrupted lifestyles.

As instruments to cover cyber-losses in society, markets for commercial third-party services (e.g., cyber-insurance) are steadily but sluggishly gaining traction with the rapid increase of societal IoT deployment, and provide a channel for members (individuals and organizations) to transfer residual cyber-risk post cyber-attack events. The primary benefits of commercial cyber-loss management services have been recently cited in detail by Biener et al. [3], and include (i) indemnification of loss events, (ii) helping corporations estimate the cost of cyber-risk, and (iii) improving cyber-security [4] [5] [6] [7]. The steady rise in market requirement for such services primarily arises from a combination of (a) the naivety of user security practices, (b) the non-foolproof nature of technical security solutions to remove cyber-risk [8], (c) higher board-level concerns in organizations after notable cyber-breach incidents (e.g., Sony, Target, WannaCry) and their negative effect on stock prices [9] [10], and (d) the growing perception of cyber-risk in the digital society [11]. Despite the promised potential for commercial cyber-risk management services, the markets have been too sluggish for our liking.
The yearly estimates of cyber-loss amount to approximately USD 600 billion globally (1% of US GDP) [1], whereas the cumulative global public and private sector spending on cyber-security amounts to only USD 174 billion [12]. In addition, the total yearly market for cyber-insurance services, the most popular form of commercial third-party cyber-risk management (CRM) offerings, amounts to a paltry USD 6 billion globally [12], compared to the amount of net cyber-loss. The primary reasons for such a low (but increasing) market penetration are (a) misunderstanding and lack of coverage awareness by the demand side (users and organizations) [12], (b) unavailability

It is obvious that the oncoming pervasive IoT age, with hundreds of IoT devices per home/organization, will bring forth the need for businesses and homes to increasingly buy coverage CRM solutions like cyber-insurance. This is simply because the cyber-attack space will be too broad in the digital terrain for humans to always prevent being hacked by smart adversaries. As a result, any coverage CRM solution provider will face aggregate cyber-risks from its clients. The idea of spreading aggregate cyber-risk among multiple risk managers (e.g., cyber (re)insurers) is gaining traction [1] [18] [19] for IoT-driven smart society settings, whereby insurers covering the aggregate cyber-risk of organizations in a given sector (e.g., manufacturing) wish to spread that risk among insurers of firms that are higher up in the supply chain (e.g., energy companies). However, (a) there is no formal analysis on the effectiveness of this idea for general individual cyber-risk distributions, and (b) there may be significant differences between the cyber and non-cyber re-insurance settings: the benefits of non-systemic outcomes in the latter (as qualitatively stated in [18]) may not apply to the former (see Section IV for more details).
Consequently, without a formal analysis, aggregate cyber-risk managers may not have the confidence to scale their service markets [20]. Our main goal in this paper is to devise a foundational methodology that analyzes the effect of individual heavy-tailed and tail-dependent cyber-risks on the effectiveness of aggregate cyber-risk management markets.

We make the following research contributions in this paper.

1. We prove that spreading catastrophic heavy-tailed cyber-risks that are independent and identically distributed (i.i.d.), i.e., not tail-dependent, is not an effective practice for aggregate cyber-risk managers. However, spreading i.i.d. heavy-tailed cyber-risks that are not catastrophic is an effective practice for aggregate cyber-risk managers. While this latter point has long been believed and empirically validated in the cyber-insurance research literature, the former point is a surprising new facet that we unravel in this paper via theory (see Section II).

2. We prove that spreading catastrophic and curtailed heavy-tailed cyber-risks that are independent and (non-)identically distributed, i.e., not tail-dependent, is not an effective practice for aggregate cyber-risk managers (see Section III).

3. We show that spreading catastrophic and tail-dependent heavy-tailed cyber-risks is not an effective practice for aggregate cyber-risk managers. Though this result has been empirically established in the past for some heavy-tailed distributions (and is also somewhat intuitive from the results of Section II), there exists no formal proof for general heavy-tailed cyber-risk distributions, let alone catastrophic heavy-tailed distributions (see Section IV).

4. We experimentally validate our theory using a real-world Privacy Rights Clearinghouse cyber-breach data set (see Section V).
Our proposed research, based on ideas in [21], presents a foundational methodology to analyze the effectiveness of spreading catastrophic heavy-tailed and tail-dependent cyber-risks. To the best of our knowledge, this is the only complete general theory to date on the feasibility of aggregate cyber-risk management, and it is independent of the specific threat models that eventually induce cyber-risk distributions. Though the empirical occurrence of catastrophic cyber-risks is uncommon, it is only a matter of time before we start encountering them relatively more frequently in the IoT age (see Chapters 1.2, 1.3 in [1]). A basic primer of important statistical and econometric concepts used in the paper is provided in online Appendix A, and a table of important notations in the paper is presented in Table I.

Our research contributions stated thus far are primarily targeted towards advancing the economics and econometrics of cyber-risk management in the IoT age through the solution of open research issues, which is the main focus of our research. However, each of these contributions has a direct impact on IoT security improvement, and a consequent positive impact on society. To start with, according to data sources, the global number of connected devices had already reached 22 billion at the end of 2018, more than half of which belong to enterprise IoT [22], and will grow to 29 billion by 2022 [23]. Moreover, worldwide spending on IoT is projected to reach a significant 1.2 trillion USD by 2022, with the number of Internet-connected devices projected to reach a whopping 125 billion by 2030 [24]. A thing common to nearly all IoT devices is the poor cyber-hygiene associated with their use (e.g., default passwords), a primary reason being the scale of such devices in operation and the disproportionate human effort (that is likely to continue) needed to strengthen basic security in such devices [1].
As this increasingly becomes common knowledge, organizations and individual households would be pushed to consider investing in third-party cyber-risk management (CRM) solutions as a necessary risk management step in the upcoming pervasive IoT age.

Contribution #1 states that cyber-risk "buyers" (i.e., the CRM firms) need to develop regulated pricing policies for their CRM solutions. These solutions will enable end-users to voluntarily (incentive compatibly) "look after", to a considerable degree, the security hygiene (and hence cyber-risk exposure) of IoT devices under their control. Consequently, such steps will prevent each end-user (individual household or organization) from being a source of a cyber-risk distribution that is heavy-tailed, i.e., catastrophic. This will allow CRM solution markets to scale and flourish, and improve cyber-security in society.

Contribution #2 reflects the same things for the CRM solution buyers as Contribution #1, but additionally warns the "risk-buyer" side to put increasing focus on pricing policies that prevent IoT-controlled sources (organizations or individual households) from being a root of catastrophic cyber-risk distributions. The increased focus is needed due to the fact that statistical curtailment of such cyber-risks (unlike that in Contribution #1) will also not allow CRM markets to scale and flourish, thereby having a negative effect on society as a whole.

Contribution #3 reflects similar learnings for both the CRM solution provider and the buyer sides as Contributions #1 and #2.
Contribution #4 clearly states that when CRM solution providers suffer from practical and subjective behavioral biases in appropriately assessing the extent of cyber-risk [1], they should not aggregate cyber-risks of a catastrophic nature. This implies, similar to Contributions #1-3, that solution pricing policies should be designed so as to incentivize CRM solution buyers to invest enough effort in cyber-security not to be a source of catastrophic cyber-risks.

Finally, while appropriate CRM pricing policies might "nudge" the demand side to improve their cyber-hygiene, all the contributions together indicate the important role of regulators (e.g., the government) in enforcing improved security strength in the factory settings of IoT devices during/post manufacturing. This will mitigate (a) the negative effect of human "laziness" towards improving cyber-hygiene, and (b) the chances of society dealing with catastrophic risks.

One of the key features of risk management (CRM) (e.g., via insurance) in general as a business model is its ability to pool different types of risks, thereby reducing an underwriter's overall risk exposure. This is particularly true for a reinsurer (not necessarily a cyber re-insurer), who is in a position to significantly diversify its risks by selling reinsurance contracts to very different front-line insurers who specialize in different sectors (e.g., retail, pharmaceutical, manufacturing, etc.) that are primarily independent of one another. This means that a reinsurer typically takes on, or aggregates, a fraction of many different risks that are most likely to be independent of one another. However, this independence property may not hold true for some cyber-risks. In Sections II and III, we make the simplifying assumption that cyber-risks aggregated by an aggregate cyber-risk manager are independent, and leave the analysis of tail-dependent cyber-risks for Section IV.
Specifically, in this paper we will often consider the average of $n$ (dependent or independent) cyber-risks $X_1, \dots, X_n$ arising from different IoT-driven organizations in a smart society, given by $Z_w = \frac{1}{n}\sum_{i=1}^{n} X_i$, or more generally, the weighted average given a fraction of each cyber-risk $w = [w_1, \dots, w_n]$:

$$Z_w = \sum_{i=1}^{n} w_i X_i.$$

In what follows, in this section we will first examine, for increasing cyber-risk spread (variance), the distribution resulting from aggregating catastrophic cyber-risks, whose first and second moments are undefined. We will then generalize this result and examine the standard VaR risk measure (see online Appendix A for a definition and a valid rationale for using the VaR metric) as a result of aggregating $n$ cyber-risks (catastrophic or otherwise).

| Symbol | Description |
|---|---|
| $VaR_q(X)$ | Value-at-Risk (VaR) of $X$ at level $q$ |
| $S_\alpha(\sigma, \beta, \mu)$ | stable and heavy-tailed distribution characterized by the index of stability $\alpha$, scale parameter $\sigma$, symmetry index $\beta$, and location parameter $\mu$ |
| $\overline{CS}(r)$ | class of symmetric distributions that are convolutions of $S_\alpha(\sigma, 0, 0)$ distributions with $r \le \alpha < 2$ and $\sigma > 0$ |
| $\underline{CS}(r)$ | class of symmetric distributions that are convolutions of $S_\alpha(\sigma, 0, 0)$ distributions with $0 \le \alpha < r$ and $\sigma > 0$ |
| $CSLC$ | class of symmetric distributions that are convolutions of symmetric distributions that are either log-concave or stable with exponent $\alpha > 1$ |
| $Z_w$ | aggregated risk with weights $w$ and risk portfolio $X_1, \dots, X_n$, such that $Z_w = \sum_{i=1}^{n} w_i X_i$ |
| $a$ | length of support of a probability distribution |

Table 1: Important notation used in the paper.

To give some intuition, we begin with a simple comparison of risk spread (standard deviation) between aggregating light-tailed distributions and heavy-tailed distributions. Consider the Normal distribution as a representative of the former, and the Levy [25] and the Cauchy distributions as representatives of the latter that are statistically stable [26]; the latter exhibit power-law decay with cdf given by $F(-x) \approx x^{-\alpha}$, $x, \alpha > 0$.
For $n$ IID normal $X_1, \dots, X_n \sim N(\mu, \sigma^2)$, their average $\frac{1}{n}\sum_{i=1}^{n} X_i$ is also normally distributed, with $N(\mu, \frac{1}{n}\sigma^2)$. The implication here is that the aggregate risk has a spread (the standard deviation) that shrinks as $\sigma/\sqrt{n}$ for a given $\mu$, suggesting a decrease in average risk as one spreads over an increasing number of individual risks. Thus in this case higher diversification, i.e., spreading over a larger pool of risks, is desirable.

Now consider the Levy distribution denoted by $L(\mu, \sigma)$, with location parameter $\mu$ and scale $\sigma$, whose pdf and cdf for $x > \mu$ are respectively given by

$$f(x) = \sqrt{\frac{\sigma}{2\pi}}\,\frac{e^{-\sigma/(2(x-\mu))}}{(x-\mu)^{3/2}}, \qquad F(x) = \operatorname{erfc}\left(\sqrt{\frac{\sigma}{2(x-\mu)}}\right).$$

A simple algebraic manipulation will suggest that for IID $X_1, \dots, X_n \sim L(\mu, \sigma)$, we have $\frac{1}{n}\sum_{i=1}^{n} X_i \sim L(\mu, n\sigma)$. In other words, contrary to the normal case, the risk spread as a result of aggregating Levy distributions increases linearly in the number of individual risks for a given $\mu$. This suggests that risk aggregation in this case is undesirable.

As another example, consider the Cauchy distribution denoted by $G(\mu, \sigma)$, with location parameter $\mu$ and scale $\sigma$, pdf given by

$$f(x) = \frac{1}{\pi\sigma\left[1 + \left(\frac{x-\mu}{\sigma}\right)^2\right]},$$

and the corresponding cdf given by

$$F(x) = \frac{1}{2} + \frac{1}{\pi}\arctan\left(\frac{x-\mu}{\sigma}\right).$$

Again, standard results suggest that for IID $X_1, \dots, X_n \sim G(\mu, \sigma)$, we have $\frac{1}{n}\sum_{i=1}^{n} X_i \sim G(\mu, \sigma)$, meaning that the spread of the aggregate risk is unchanged from the individual risk spread. So in this case risk aggregation does not bring any risk reduction benefit; it is neither desirable nor undesirable.

The above suggests that the notion of spreading risks is sound when the underlying individual risks are light-tailed, but casts doubt on the wisdom of doing so when the underlying risks are heavy-tailed. In the remainder of this section we formally establish this result using the VaR risk measure.
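The three closed-form cases above can be checked numerically. The following is a minimal Python sketch, not part of the original study: it uses only the standard library, fixes $\mu = 0$ and $\sigma = 1$, and measures spread by the interquartile range (an assumption made here because the interquartile range exists even when moments do not). It relies on two standard sampling facts: $\tan(\pi(U - 1/2))$ is standard Cauchy for uniform $U$, and $1/Z^2$ is standard Levy for standard normal $Z$. All function names are illustrative.

```python
import math
import random
import statistics


def iqr(values):
    """Interquartile range: a spread measure that exists even without moments."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1


def sample_normal(rng):
    return rng.gauss(0.0, 1.0)


def sample_cauchy(rng):
    # Inverse-CDF sampling of a standard Cauchy variate.
    return math.tan(math.pi * (rng.random() - 0.5))


def sample_levy(rng):
    # If Z ~ N(0, 1), then 1 / Z^2 follows a standard Levy distribution.
    z = rng.gauss(0.0, 1.0)
    while z == 0.0:  # guard against an (extremely unlikely) exact zero
        z = rng.gauss(0.0, 1.0)
    return 1.0 / (z * z)


def spread_of_average(sampler, n, trials=4000, seed=0):
    """Spread (IQR) of the average of n IID risks, estimated by simulation."""
    rng = random.Random(seed)
    means = [sum(sampler(rng) for _ in range(n)) / n for _ in range(trials)]
    return iqr(means)


def spread_ratio(sampler, n, **kwargs):
    """Spread of the n-risk average relative to the spread of a single risk."""
    return spread_of_average(sampler, n, **kwargs) / spread_of_average(sampler, 1, **kwargs)


if __name__ == "__main__":
    for name, sampler in [("normal", sample_normal),
                          ("cauchy", sample_cauchy),
                          ("levy", sample_levy)]:
        print(name, round(spread_ratio(sampler, 16), 3))
```

Under these assumptions, and up to Monte Carlo error, the ratio for $n = 16$ should be close to $1/\sqrt{16} = 1/4$ for the normal case, close to 1 for the Cauchy case, and close to 16 for the Levy case.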
We first consider aggregating IID risks $X_i$ from the family $\underline{CS}(1)$, the class of distributions that are convolutions of symmetric stable distributions with characteristic exponent $\alpha < 1$, i.e., those exhibiting an infinite mean and variance and representing catastrophic cyber-risks (see online Appendix A for details). We have the following result regarding VaR performance post cyber-risk aggregation, the proof of which is in online Appendix B.

Theorem 2.1 Let $X_1, \dots, X_n$ be IID risks from $\underline{CS}(1)$. Consider any $n$, $q \in (0, 1)$, and $n$-vectors of weights $w, v \in \mathbb{R}^n_+$. Then $VaR_q(Z_v) > VaR_q(Z_w)$ if $v \prec w$ and $v$ is not a permutation of $w$; in other words, the function $VaR_q(Z_w)$ is strictly Schur-concave in $w \in \mathbb{R}^n_+$. In particular,

$$VaR_q(Z_{\overline{w}}) < VaR_q(Z_w) < VaR_q(Z_{\underline{w}}), \quad \forall w \in I_n \text{ such that } w \neq \underline{w} \text{ and } w \text{ is not a permutation of } \overline{w},$$

where $\overline{w} = (1, 0, \dots, 0)$ is the single-risk portfolio and $\underline{w} = (1/n, \dots, 1/n)$ is the equal-weight portfolio.

Theorem Implications - On a practical note, the theorem simply implies that when an aggregate cyber-risk covering agency is faced with covering independent and identical catastrophic cyber-risk distributions, the spread of the combined distribution (as measured by VaR) increases with the number of piled-up cyber-risks, which is a dampening signal for for-profit cyber-risk managers looking to contribute to a sustainable aggregate loss coverage market.

Now consider the special borderline case $\alpha = 1$ (borderline catastrophic), which corresponds to IID $X_1, \dots, X_n$ with a symmetric Cauchy distribution $S_1(\sigma, 0, 0)$. In this case, for all $w = (w_1, \dots, w_n) \in I_n$, $VaR_q(Z_w)$ is independent of $w$ and is the same for all portfolios of risks $X_i$ with weights $w \in I_n$. In other words, in such a case variations in a portfolio have no effect on the riskiness of its aggregate return. Thus, the symmetric Cauchy distribution with characteristic exponent $\alpha = 1$ is the boundary between extremely heavy-tailed distributions with infinite first moments (for which aggregate coverage is statistically not incentive compatible) and moderately heavy-tailed distributions with finite first moments (for which aggregate coverage might be sustainable).
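For the pure stable members of these classes, the direction of the inequality can be read off a closed form. The sketch below is an illustration under an explicit assumption, not the theorem's proof: it takes the risks to be IID symmetric stable $S_\alpha(\sigma, 0, 0)$ (a special case of the convolution classes in the text) and uses the standard stability property $\sum_i w_i X_i \overset{d}{=} \left(\sum_i w_i^\alpha\right)^{1/\alpha} X_1$ for non-negative weights, so that $VaR_q(Z_w)$ equals that multiplier times $VaR_q(X_1)$. The function names are hypothetical.

```python
def stable_var_multiplier(weights, alpha):
    """Ratio VaR_q(Z_w) / VaR_q(X_1) for IID symmetric alpha-stable risks.

    Assumes X_i ~ S_alpha(sigma, 0, 0) and non-negative weights, so that
    sum_i w_i X_i has the same law as (sum_i w_i^alpha)^(1/alpha) * X_1.
    """
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    if any(w < 0.0 for w in weights):
        raise ValueError("weights must be non-negative")
    return sum(w ** alpha for w in weights) ** (1.0 / alpha)


def equal_weights(n):
    """Equal-weight portfolio (1/n, ..., 1/n)."""
    return [1.0 / n] * n


if __name__ == "__main__":
    for alpha in (0.5, 1.0, 1.5):
        print(alpha, stable_var_multiplier(equal_weights(10), alpha))
```

With equal weights the multiplier reduces to $n^{1/\alpha - 1}$: it exceeds 1 for $\alpha < 1$ (pooling raises VaR), equals 1 at the Cauchy boundary $\alpha = 1$, and falls below 1 for $\alpha > 1$ (pooling lowers VaR).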
Similarly, for general weights $w \in \mathbb{R}^n_+$, $VaR_q(Z_w)$ is both Schur-convex and Schur-concave in $w$ for IID $X_i \sim S_1(\sigma, 0, 0)$.

We now consider aggregating IID risks $X_i$ from the family $CSLC$, the class of distributions that are convolutions of symmetric distributions that are either log-concave or stable with exponent $\alpha > 1$, i.e., those exhibiting finite mean and variance and representing non-catastrophic heavy-tailed cyber-risks (see online Appendix A for details). We have the next result regarding VaR performance post cyber-risk aggregation, the proof of which is in online Appendix B.

Theorem 2.2 Let $X_1, \dots, X_n$ be IID risks from $CSLC$. Consider any $n$, $q \in (0, 1)$, and $n$-vectors of weights $w, v \in \mathbb{R}^n_+$. Then $VaR_q(Z_v) < VaR_q(Z_w)$ if $v \prec w$ and $v$ is not a permutation of $w$; in other words, the function $VaR_q(Z_w)$ is strictly Schur-convex in $w \in \mathbb{R}^n_+$. In particular,

$$VaR_q(Z_{\underline{w}}) < VaR_q(Z_w) < VaR_q(Z_{\overline{w}}), \quad \forall w \in I_n \text{ such that } w \neq \underline{w} \text{ and } w \text{ is not a permutation of } \overline{w},$$

where $\underline{w} = (1/n, \dots, 1/n)$ is the equal-weight portfolio and $\overline{w} = (1, 0, \dots, 0)$ is the single-risk portfolio.

Theorem Implications - On a practical note, the theorem simply implies that when an aggregate cyber-risk covering agency is faced with covering independent and identical non-catastrophic cyber-risk distributions, the spread of the combined distribution (as measured by VaR) does not increase with the number of piled-up cyber-risks, which is an encouraging signal for for-profit cyber-risk managers looking to contribute to a sustainable aggregate loss coverage market. While this latter point has long been believed and empirically validated in the cyber-insurance research literature, the result from Theorem 2.1 is a surprising new facet that we unravel in this paper via theory.

In this section we analyze what happens when aggregating multiple heavy-tailed risks, each of which has been curtailed, to fit the realistic scenario where cyber-risk managers have upper bounds on coverage. We also study how the length of the distributional support needed for the analogue to hold depends on the number of cyber-risks in a manager's portfolio and on the degree of heavy-tailedness of the unbounded cyber-risk distributions. In this regard, we have the following result, an analogue of Theorem 2.1 for curtailed catastrophic cyber-risks, the proof of which is in online Appendix B.

Theorem 3.1 Let $n \ge 2$ and let $w \in I_n$ be a weight vector with $w_{[1]} \neq 1$.
Let X i , i = 1, · · · , n be IID r.v.'s ∼ CS(r) for some r ∈ (0, 1) and their respective a-truncated version given by Y i defined above. Denote G(w, z) = P (w [1] the following inequality holds: Note that G(w, z) reflects that V aR q [X w ] > V aR q [w [1] The implications of this theorem are multifarious and are presented in multiple blocks. Implication 1 -The practical implications of the theorem are analogous to Theorem 2.1 in the case of bounded cyber-risks. More specifically, cyber-risk aggregation coverage continues to be disadvantageous in general for catastrophic truncated heavy-tailed distributions. For n ≥ 2 and any cyber-risk valuation z > 0, there exists n cyberrisks with finite support with the property that the variance return of the aggregate cyber-risk portfolio is riskier than that of the portfolio consisting of a single cyber-risk. From a mathematical viewpoint, Theorems 2.1 and 3.1 indicate that VaR is not sub-additive and, thus, its coherency (see online Appendix A for details) is always violated in the class of extremely heavy-tailed cyber-risks with infinite first moments. More specifically, Theorem 3.1 implies that VaR may also be non-coherent in the world of cyber-risks with bounded distributional support. We just proposed conditions under which it is statistically incentive compatible for a (re)-insurer to spread catastrophic cyber-risks having heavy tails. One could also further study conditions under which it will not be optimal to spread risks -in the interest of space, this analysis is provided in online Appendix C and also in [27] . Implication 2 -We note that in the special case of a cyber-risk portfolio with equal weights,w n = 1 n , 1 n , ...., 1 n , we have This means that the length of the distributional support reflecting statistical incentive non-compatibility to aggregate cyber-risk coverage in Theorem 3.1 can be taken to be same for all the portfolios with equal weightsw n . 
This holds, obviously, for the whole class of the portfolios w such that w [1] < 1 2 . Furthermore, a similar result holds as well for the class of portfolios w such that w [1] < 1 − , (and, thus, w i < 1 for all i), where 0 < < 1 2 . As follows from the proof of Theorem 3.1, for all such portfolios w, the theorem holds for a > E[|X1| r ](n−1) z) . This follows since any vector w with w [1] < 1 − is majorized (see basics of majorization in the online Appendix A ) by the vector (1 − , , 0, .., 0) . Implication 3 -From the proof of Theorem 3.1, it follows that, in the special case of portfolios with equal weights w n = 1 n , 1 n , . . . , 1 n where n > 2, the length of the interval of truncation a can be reduced to a smaller value. In such a case, the theorem holds under the restriction a > E|X1| r (n−1) Note that, by Theorem 2.1, F n (z) > H(z) = G (w n , z) for n ≥ 3. This suggests that if the support is large compared to the number of cyber-risks to be aggregated, it might be infeasible for an aggregate risk manager to cover the risks. This demonstrates the "unpleasant" properties of VaR as a cyber-risk measure under heavy-tailedness does not arise from the relatively high likelihood of getting very large losses but rather from the fact that there are too few cyber-risks available for the profitable aggregate cyber-risk coverage to work. Implication 4 -Theorem 3.1 also shows that, for a specific loss probability q, there exists a sufficiently large a such that the value at risk V aR q [Y w (a)] of the return Y w (a) at level q is greater than the value at risk V aR q [Y 1 (a)] of the return Y 1 (a) at the same level: This highlights the dampening factor to the sustainability of covering aggregate heavy-tailed cyber-risks. 
One should emphasize that the last inequality between the returns $Y_w(a)$ and $Y_1(a)$ holds for the particular fixed loss probability $q$ and that, in the comparisons of the values at risk $VaR_q[Y_w(a)]$ and $VaR_q[Y_1(a)]$, the length of the interval needed for the reversals of the stylized facts on the portfolio variation depends on $q$ (similar to the fact that, in Theorem 3.1, the length of the distributional support $a$ depends on the value of the disaster level $z$, which denotes the degree of heavy-tailedness). This is the crucial qualitative difference of the results in Theorem 3.1 for bounded/curtailed cyber-risk distributions, and their implications for the value at risk, from those given by Theorem 2.1 and Theorem 3.1 for unbounded risks, where the inequalities hold for all $z > 0$ and all $q \in (0, 1)$.

Implication 5 (Case of non-identical distributions) - The analogues of Theorem 2.1 hold for i.i.d. risks $X_1, \dots, X_n$ that have skewed, extremely thick-tailed stable distributions with infinite first moments: $X_i \sim S_\alpha(\sigma, \beta, 0)$, $\alpha \in (0, 1)$, $\sigma > 0$, $\beta \in [-1, 1]$, $i = 1, \dots, n$. As follows from the proof of Theorem 3.1 (see online Appendix B), this implies that complete analogues of the results in the present section for bounded versions of symmetric risks from the classes $\underline{CS}(r)$ continue to hold for truncated extremely heavy-tailed stable distributions $S_\alpha(\sigma, \beta, 0)$ with $\alpha \in (0, 1)$, $\sigma > 0$, and an arbitrary skewness parameter $\beta \in [-1, 1]$. In particular, Theorem 3.1 continues to hold for arbitrary skewed risks $X_i \sim S_\alpha(\sigma, \beta, 0)$.

Results Overview and Impact on IoT Societies - As a summary of the theory results in this section and the previous one, Figure 1 provides a graphical illustration of the impact of the type and number of cyber-risks on a risk manager's valuation (statistical utility, i.e., decreased VaR) of covering aggregate cyber-risk.
The interesting observation is that for cases B and C, illustrating curtailed cyber-risks, there is a drop in the utility (i.e., increased VaR) of covering aggregate risk as a function of the number of cyber-risks, followed by an indefinite increase in utility thereafter. The initial drop is due to the tradeoff between the higher costs of aggregation and increased variance-induced coverage, up to a certain threshold of $n$ catastrophic cyber-risks, and the benefit received from coverage premiums. Clearly, beyond $n$ cyber-risks the statistical benefits of aggregate cyber-risk coverage outweigh the negatives of increased risk spread. The outcomes of cases A and D are intuitively obvious.

In [18], the author rationalizes why aggregate loss coverage services like re-insurance might be sustainable and not encounter a systemic catastrophe problem. For the general re-insurance setting, he mentions (i) a portfolio of independent risks and geographical diversification, (ii) partial cession of risk with proper risk screening, and (iii) lack of liability loops as the major factors in favor of re-insurance services being sustainable. However, there are major differences between general re-insurance and cyber re-insurance services, which lead us to look closely at the sustainability of cyber re-insurance services under universal risk types. Clearly, (i) and (ii) are impractical when major cyber-catastrophes occur and impact IoT societies (e.g., ones caused by the WannaCry and Mirai attacks) (Curve D). In the most optimistic scenarios, Figure 1 illustrates what the size and nature of the coverage portfolio should look like for a cyber re-insurer assuming limited coverage liability for i.i.d. heavy-tailed cyber-risks (Curves A-C). However, the challenge still remains to deal with non-i.i.d. heavy-tailed cyber-risks such as those posed by WannaCry and Mirai.

Cyber-risks are not only heavy-tailed in nature, but are also likely to be correlated, i.e., tail-dependent.
This is especially true in scenarios of cyber-attacks causing major systemic impact. The likelihood of systemic loss impacts is fairly high in a service-networked smart society [28] [1] driven by IoT technologies. In this section we study the effect on VaR of aggregating such cyber-risk types.

Statistical correlations and dependencies between distributions are often captured systematically using copulas [29] [30] (see online Appendix A for a preliminary introduction), which are multivariate functions of marginal distributions outputting dependence values. In our case, the marginal distributions are cyber-risk random variables having a heavy tail characterized via a power-law distribution family. To illustrate dependencies between such marginal distributions, we start with the bivariate (generalization to follow) Eyraud-Farlie-Gumbel-Morgenstern (EFGM) copula, a power-type copula (see online Appendix A for more details), whose marginal distributions obey the power law to reflect heavy-tailed cyber-risk distributions (both catastrophic and otherwise). Let $(X_1, X_2)$ be random variables with the EFGM copula and power-law marginals. Then, for any $x \ge 1$ and for $j = 1, 2$, we have )]

Let $(\xi_1(\alpha), \xi_2(\alpha))$ be independent random variables from power-law distributions with tail index $\alpha$, often called independent copies of $(X_1, X_2)$. Our key insight is that in the tail, the behavior of products and powers of power-law densities and distributions of the $X_j$'s is identical to the behavior of their independent copies. This makes it possible to provide asymptotic (with respect to the loss) comparisons between the VaR of the aggregated loss and that of a single risk. More specifically, the crucial component of $P\left(\frac{X_1 + X_2}{2} > x\right)$ under the EFGM copula can be written as follows where the behavior of the individual summands for large $z$ is driven by the lowest tail index of $\xi_j$ in the spreading portfolio.
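The qualitative effect described here can be explored by simulation. The sketch below is a hedged illustration rather than the paper's derivation: it assumes the standard bivariate EFGM form $C(u, v) = uv[1 + \theta(1-u)(1-v)]$ with $\theta \in [-1, 1]$, samples it by conditional inversion, and assumes Pareto marginals with $P(X_j > x) = x^{-\alpha}$ for $x \ge 1$ as the power-law family. The function names, the parameter `theta`, and the chosen values of $\alpha$ and $z$ are illustrative assumptions.

```python
import math
import random


def sample_efgm_pair(theta, rng):
    """Draw (u, v) from the bivariate EFGM copula by conditional inversion."""
    u = rng.random()
    t = rng.random()
    a = theta * (1.0 - 2.0 * u)
    # The conditional cdf of V given U = u is v + a * v * (1 - v); solve it = t.
    root = math.sqrt(max(0.0, (1.0 + a) ** 2 - 4.0 * a * t))
    denom = 1.0 + a + root
    v = 2.0 * t / denom if denom > 0.0 else 0.0
    return u, v


def pareto_from_uniform(u, alpha):
    """Inverse cdf of a Pareto law with P(X > x) = x^(-alpha), x >= 1."""
    tail = max(1.0 - u, 1e-12)  # clamp guards against rounding to exactly 0
    return tail ** (-1.0 / alpha)


def tail_probabilities(alpha, theta, z, draws=100_000, seed=1):
    """Estimate P(X1 > z) and P((X1 + X2) / 2 > z) under EFGM dependence."""
    rng = random.Random(seed)
    single = pooled = 0
    for _ in range(draws):
        u, v = sample_efgm_pair(theta, rng)
        x1 = pareto_from_uniform(u, alpha)
        x2 = pareto_from_uniform(v, alpha)
        single += x1 > z
        pooled += 0.5 * (x1 + x2) > z
    return single / draws, pooled / draws


if __name__ == "__main__":
    print("alpha = 0.5:", tail_probabilities(alpha=0.5, theta=1.0, z=100.0))
    print("alpha = 3.0:", tail_probabilities(alpha=3.0, theta=1.0, z=5.0))
```

In this setup, for a tail index below 1 the averaged loss should exceed a large threshold more often than a single loss does, while for a tail index well above 1 the ordering should reverse, in line with the independence-case intuition carrying over to this copula family.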
We formalize this result in the following theorem (see online Appendix B for a proof), which generalizes to $n$ dependent heavy-tailed random variables $X_1, X_2, \dots, X_n$ with multivariate EFGM copula and power-law marginals.

Theorem 4.1 For an asymptotically large $z > 0$, and any $n$, $\alpha > 0$

Theorem Implications - The result suggests that the sub-optimality of cyber-risk aggregation in the VaR framework for extremely heavy-tailed losses carries over from independence to the dependence-capturing EFGM copula. That is, cyber-risk aggregation increases the VaR of dependent, extremely heavy-tailed risks within this copula family. It is also easy to see that for dependent losses with the EFGM copula and sufficiently small loss probability $q$, we have

Important generalizations of Theorem 4.1 arise if we consider the wider class of power-type copulas. The most popular members of this class, such as the polynomial copula of Drouet Mari and Kotz [31] and the copula with cubic section of Nelsen et al. [32], can be written in the following general form

$$C(u_1, \dots, u_n) = \sum_{i_1, \dots, i_n = 0, 1, \dots} \gamma_{i_1, i_2, \dots, i_n} \cdot u_1^{i_1} \cdot u_2^{i_2} \cdots u_n^{i_n} \qquad (5)$$

for a multiple index $i = (i_1, i_2, \dots, i_n)$ and a set of corresponding parameters $\gamma_i$ with appropriate restrictions that make $C(u_1, \dots, u_n)$ a copula. For example, Drouet Mari and Kotz [31] [21] show how to obtain a polynomial copula from the function $f = u^k v^q$. The key feature of such copulas is that they and their densities can be expressed as powers of the $u_j$'s. This allows us to apply similar arguments as for EFGM. To this end, we have the following theorem, the proof of which is in online Appendix B.

Theorem 4.2 For dependent losses with a power-type copula in (5), for an asymptotically large $z > 0$, and for any $n$, $\alpha > 0$, the conclusions of Theorem 4.1 hold.

Theorem Implication - The implications are the same as those of Theorem 4.1.

In this section, we put our theory to a rigorous test using real-world cyber-loss data.
We want to study whether aggregating individual cyber-risks from different IoT-driven organizational sources (assumed to show characteristics of real-world cyber-loss) in a smart society increases or decreases a risk manager's VaR/Expected Utility (EU), the scalar metric for measuring the extent of aggregate cyber-risk. In a nutshell, we first show using real-world data that individual cyber-losses can indeed exhibit a heavy-tailed statistical nature. We then investigate the VaR/EU trends with an increasing number of heavy-tailed cyber-risks to be aggregated.

We consider 9015 cyber losses extracted from the publicly available Privacy Rights Clearinghouse database, published in 2017. We first perform several goodness-of-fit tests for several widely used distributions to characterize the true nature of the cyber-loss distribution. Namely, we use the normal, log-normal, and generalized Pareto distributions for the purpose of comparison, as in [15]. Based on the goodness-of-fit statistics (log-likelihood, AIC, BIC, and the Kolmogorov-Smirnov and Anderson-Darling statistics), we find that the generalized Pareto distribution fits the data best. The estimated Pareto index (the exponent in a power-law distribution) characterizing a heavy-tailed distribution for the generalized Pareto distribution is 0.1862, using analysis adopted from [33].

If a cyber-risk manager (e.g., an insurer) takes on a random risk $X$, a function of $n$ (the number of cyber-risks it accepts to aggregate), the effective outcome (before opting for cyber re-insurance services) for the insurer once $X$ is realized is: where $k$ is the limit of the amount of cyber-risk it can accept, as is true in practice. In the special case when there is no limited liability, i.e., when $k = \infty$, we have $V(X) = X$ for all $X$. If $k < \infty$, $u$ is defined only on $[0, k]$, and without loss of generality $u(k) = 0$.
Here, we assume the utility function of a perfectly rational and risk-averse cyber-insurer to be generally of the following form: which is the power utility function and, for $x$ being a risk variable, is a von Neumann-Morgenstern (VNM) utility function. $\beta$ is the degree of risk-aversion of the cyber-insurer. We perform 100,000 Monte Carlo simulations to obtain our results.

We observe from Figure 2 that $VaR_{0.995}(X)$ monotonically decreases for normal and log-normal individual cyber-risk distributions (fitted using our data set), although the VaR for log-normal risks decreases at a slower rate. On the other hand, $VaR_{0.995}(X)$ (denoted as VaR from now on throughout the section) increases (not monotonically) for Pareto individual cyber-risk distributions, as is expected from theory (for symmetric stable cyber-risk distributions). However, the non-monotonicity indicates (also in accordance with our theory) that for heavy-tailed cyber-risks simulated in practice, there exists a certain number of sampled risks for which aggregation does not increase VaR.

To focus on our empirical data set, we use statistical bootstrapping to simulate the VaR for a varying number of aggregated cyber-risks. In this regard, we draw directly from our original sample instead of the different distributions assumed above. The sample is drawn with replacement (thus, i.i.d.) and is of the same size as the original data set (m = 9015 observations). Due to the symmetric stable nature of the cyber-risk distribution induced by the empirical data set, the Conditional Value-at-Risk (CVaR) measure provides similar performance (see Figure 3a) to the VaR measure, as the VaR measure is a coherent risk measure for symmetric and stable risk distributions [34]. Moreover, we calculate the confidence interval by repeating the bootstrapping itself. Figure 3b shows the bootstrapped VaR and its confidence interval.
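The bootstrapping procedure described above can be sketched as follows. This is one hedged reading of it, not the authors' code: for a given number of aggregated risks, each bootstrap portfolio is the average of that many losses drawn with replacement from the observed sample, the VaR is the empirical 0.995-quantile over the portfolios, and the whole bootstrap is repeated to obtain a percentile confidence interval. Because the loss records are not reproduced here, the data are a synthetic Pareto stand-in; the function names, the number of portfolios and repeats, the 90% confidence level, and the stand-in tail index are all assumptions.

```python
import math
import random


def quantile(values, level):
    """Empirical quantile at the given level (0 < level < 1)."""
    ordered = sorted(values)
    index = int(math.ceil(level * len(ordered))) - 1
    return ordered[min(len(ordered) - 1, max(0, index))]


def bootstrap_var(losses, n_risks, level, n_portfolios, rng):
    """One bootstrap VaR estimate for the average of n_risks losses,
    each drawn with replacement from the observed losses."""
    averages = [
        sum(rng.choices(losses, k=n_risks)) / n_risks
        for _ in range(n_portfolios)
    ]
    return quantile(averages, level)


def bootstrap_var_interval(losses, n_risks, level=0.995, n_portfolios=2000,
                           n_repeats=30, confidence=0.90, seed=11):
    """Repeat the bootstrap; return (lower, median, upper) VaR estimates
    using a percentile confidence interval."""
    rng = random.Random(seed)
    estimates = [
        bootstrap_var(losses, n_risks, level, n_portfolios, rng)
        for _ in range(n_repeats)
    ]
    tail = (1.0 - confidence) / 2.0
    return (quantile(estimates, tail),
            quantile(estimates, 0.5),
            quantile(estimates, 1.0 - tail))


def make_synthetic_losses(size, alpha, seed):
    """Hypothetical stand-in for real loss records: Pareto losses, x >= 1."""
    rng = random.Random(seed)
    return [(1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(size)]


if __name__ == "__main__":
    losses = make_synthetic_losses(size=5000, alpha=0.8, seed=3)
    for n in (1, 5, 10):
        low, mid, high = bootstrap_var_interval(losses, n)
        print(f"n={n:2d}  VaR ~ {mid:.1f}  (90% CI: {low:.1f} to {high:.1f})")
```

Replacing `make_synthetic_losses` with actual loss amounts would give a bootstrapped VaR curve of the kind the text attributes to Figure 3b; the interval width depends heavily on the chosen number of repeats and portfolios.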
We observe that the bootstrapped VaR (induced by the empirical loss distribution) always lies above the log-normal VaR and the aggregation benefit is much less prevalent than assumed. As a consequence, in accordance with theory, not aggregating heavy-tailed risks at all would be optimal from a cyber-risk management perspective.

We now focus on an expected utility (EU) setting induced on limited liability where applicable, to assess cyber-risk aggregation performance. Figures 4a and 4b show the EU-theoretic performance based on a power utility function $u(x)$ for aggregating i.i.d. cyber-risks. As expected, for log-concave cyber-risk distributions, i.e., for normally distributed and log-normal i.i.d. cyber-risks (Figure 4a), we do not observe a change in the derivative of the expected utility with an increase in the number of cyber-risks aggregated. However, this is not true for a heavy-tailed distribution such as the one induced by the empirical data set and fitted to a Pareto distribution (see Figure 4b). More specifically, the rate of decrease (corroborated via theory) in expected utility with heavy-tailed cyber-risk distributions fluctuates (instead of exhibiting monotonic behavior, as is usual in sampling scenarios); on average, however, it is borderline negative with a high standard deviation.

We also study the role of a pool of homogeneous cyber-risk managers (CRMs) that share 1 aggregate cyber-risk (e.g., like in a cyber re-insurance business), on the EU of a single manager in that pool. We consider two instances of individual cyber-risks: one with a synthetic Pareto index $\alpha$ that is 1 (characterizing the heavy-tailed nature of cyber-risk), and one lying below 1 (characterizing extremely heavy-tailed cyber-risks), which is characterized by our real-world data set. Figure 5 shows that for risk with a Pareto index of 1 and limited liability of $k = 60$, the expected utility of a single manager for different aggregation and cyber-risk pooling sizes (#CRMs) is U-shaped.
The U-shape denotes that the benefit from aggregation first decreases before it eventually increases again (a trend similar to that in Figure 1, which illustrates our theory). Using a Pareto index of 0.1862 (as estimated from the data, and indicating an extremely heavy-tailed distribution) changes, ceteris paribus, the result completely, as shown in Figures 6a and 6b. Since the expected utility decreases monotonically (at a nearly constant rate of decrease), not providing any (pooled) coverage management such as insurance would be optimal, and the aggregate coverage market would fail completely. Our numerical analysis shows that the U-shape can only be observed if the Pareto tail index is in the range of (0.8, 1.12) [model-based]. While the situation in Figure 5 leaves room for regulatory intervention that promotes traditional cyber (re)insurance by curtailing heavy-tailed cyber-risk distributions sourced from organizations so that their tail index lies in a feasible range, the one in Figure 6 suggests otherwise for a risk-aggregating cyber re-insurance business that does not curtail very heavy-tailed cyber-risks. More precisely, cyber-risk pooling is not business-beneficial for cyber-risk managers (CRMs) if individual cyber-risks are heavy-tailed (unless these risks are curtailed), and the subsequent coverage market fails.

An Important Note on the Results: The data used in this paper is not generated from an IoT system. The data captures heavy-tailed properties of cyber-loss distributions that we take advantage of to show whether it is feasible to aggregate individual cyber-risks having such properties. To the best of our knowledge, there is as yet no real-world data pertaining to IoT systems that reflects heavy-tailed cyber-loss distributions, though in principle it is fair to assume that some IoT-related cyber-loss data sets would exhibit heavy-tailed characteristics.

In this section, we solely focus on research related to cyber-risk aggregation.
We partition this section into two parts: (i) the heavy-tailed and tail-dependent nature of cyber-risk, and (ii) feasibility insights regarding the profitable coverage of aggregate heavy-tailed cyber-risk. The readers are referred to [13] [35] for references to research on pricing cyber-risk.

Nature of Cyber-Risk [43].

Shortcomings: Existing research in cyber-security has been successful in elucidating the heavy-tailed and tail-dependent nature of cyber-risk; however, it is yet to propose formally proven directions to allow a profit-minded cyber-risk manager to judge whether a collection of such risks is suitable to aggregate, under various degrees of heavy-tailedness. This decision-making problem will increasingly arise in the IoT age, where major cyber-risks affecting smart societies will give rise to systemic effects that cyber-risk managers have to deal with.

It is a common perception from empirical studies and insurance literature that i.i.d. cyber-risks, even though heavy-tailed, are suitable for aggregation. In this paper, we showed quite the contrary for i.i.d. catastrophic heavy-tailed risks. In a recent work, a group of researchers [28] studied the problem of whether (a) the underlying network of service organizations in society relying on IT/IoT technologies, and (b) the statistical nature of cyber-risk distributions, positively or negatively affect aggregate cyber-risk managers in expanding their business. The authors surprisingly show that neither the underlying network nor i.i.d. and non-i.i.d. non-heavy-tailed cyber-risk distributions have a major role to play (which does not imply independence) in encouraging or discouraging aggregate cyber-risk managers to expand or contract their coverage business.
Shortcomings: The cited work, though tackling the problem of judging the role of the network and the nature of cyber-risk distributions on the future of the cyber-risk aggregation business, does not model catastrophic and tail-dependent heavy-tailed cyber-risks that may be a possibility in modern IoT-driven societies. However, as a major positive, their result does provide confidence to aggregate cyber-risk managers to boost their cyber-loss coverage business for non-heavy-tailed cyber-risks in a networked interdependent setting, something the digital society is in need of.

In this section, we first provide a brief review of the current state of insurance-driven CRM (an indicator of the degree of cyber-risk control) in small and medium IT-driven businesses (SMBs), which represent the majority of IT businesses in operation, and gauge the likelihood of cyber-risk distributions that may be sourced at these businesses. More importantly, SMBs are highly service-networked among themselves, and this network can pose significant cyber-risk aggregation challenges for CRM solution providers [28]. Our review is based on recent reports by Advisen and CyberScout, industry leaders in CRM and cyber-security solutions. Finally, we summarize the paper.

Small and medium-sized businesses are an important driver of the economy and should be empowered with progressive insurance policies that include cyber-risk protection services, incident response, and insurance coverages to provide the financial support needed to keep the doors open after an attack. As of 2020, insurers and cybersecurity services firms are innovating around the clock to create risk mitigation policies and procedures that can provide peace of mind to SMB leaders.
However, despite a rise in cyberattacks against small and mid-size businesses, about 69% of SMB respondents to a recent survey by CyberScout said they did not carry cyber insurance coverage and, worryingly, many do not even have the appropriate security safeguards in place, clearly indicating a lack of seriousness by SMBs to improve their cyber-hygiene. Moreover, in the age of COVID, business owners are under a lot of pressure from the economic disruptions caused by the pandemic, and are finding it even more challenging to find the time to prioritize cyber-security. CyberScout found that 16% of the respondents had experienced a ransomware event and 40% said they would not know who to contact if they did fall victim to ransomware. SMBs also may not be aware enough of the ransomware risk: data breach ranks as the highest concern for 30% of respondents, but ransomware tops the list for only 10%. Moreover, only 22% have a backup plan in place. Over half (51%) of survey respondents had no formal cyber-security training program, but 76% said they felt confident about their company's security infrastructure. However, the results revealed some possible gaps. A quarter of respondents said they send out "best practices" emails to employees, 22% reported performing "live fire" trainings, and 20% also performed vulnerability testing. Annual trainings were the only measure taken by 18% of the respondents. Due to the pandemic, just over half (53%) reported having employees work remotely, but only 34% required the use of a VPN connection and only 17% took any steps to create or remind employees of remote work security protocols. In fact, 14% said they had no specific cyber measures for remote working. Clearly, even in 2020, the state of cybersecurity strength in SMBs is far from what is desired, and there is a significant likelihood of each being a source of heavy-tailed, i.e., catastrophic, cyber-risks in the event of major cyberattacks.
In this paper, we provided a rigorous general theory to elicit conditions on (tail-dependent) heavy-tailed cyber-risk distributions under which a risk management firm will find it (un)profitable to provide aggregate cyber-risk coverage for IoT-driven smart societies. As our primary novel contributions, we proved that (a) spreading catastrophic heavy-tailed cyber-risks that are independent and identically distributed (i.i.d.), i.e., not tail-dependent, is not an effective practice for aggregate cyber-risk managers, whereas spreading non-catastrophic i.i.d. heavy-tailed cyber-risks is, and (b) spreading catastrophic and tail-dependent heavy-tailed cyber-risks is not an effective practice for aggregate cyber-risk managers. A summary of cyber-risk management effectiveness results for various i.i.d./non-i.i.d. distributions is shown in Figure 7. We conducted a real-data driven numerical study to validate claims made in theory.

[1] Solving Cyber Risk: Protecting Your Company and Society
[2] IoT security issues
[3] The Geneva Papers on Risk and Insurance-Issues and Practice
[4] Analyzing self-defense investments in internet security under cyber-insurance coverage
[5] Will cyber-insurance improve network security? A market analysis
[6] Improving cyber-security via profitable insurance markets
[7] On robust estimates of correlated risk in cyber-insured IT firms: A first look at optimal AI-based estimates under "small" data
[8] Information security: where computer science, economics and psychology meet
[9] Reducing informational disadvantages to improve cyber risk management
[10] The effect of data breaches on shareholder wealth
[11] Growth in the perception of cyber risk: evidence from US P&C insurers
[12] Integrated framework for information security investment and cyber insurance
[13] Content analysis of cyber insurance policies: how do carriers price cyber risk?
[14] The cyber insurance market in Sweden
[15] Extreme cyber risks and the nondiversification trap
[16] Modeling and predicting cyber hacking breaches
[17] Heavy-tailed distribution of cyber-risks
[18] Why (re) insurance is not systemic
[19] Scor paper
[20] Systemic cyber risk and aggregate impacts
[21] Heavy-tailed distributions and robustness in economics and finance
[22] Global connected and IoT device forecast update
[23] Internet of Things forecast
[24] The internet of things: a movement, not a market
[25] Statistical distributions
[26] One-dimensional stable distributions
[27] Sustainable catastrophic cyber-risk management in IoT societies
[28] When are cyber blackouts in modern service networks likely? A network oblivious theory on cyber (re) insurance feasibility
[29] Correlations and copulas for decision and risk analysis
[30] Modeling multivariate distributions using copulas: applications in marketing
[31] Correlation and dependence
[32] Bivariate copulas with cubic sections
[33] Infinite mean models and the LDA for operational risk
[34] Quantitative risk management: Concepts
[35] Cybersecurity insurance: Modeling and pricing
[36] Hype and heavy tails: A closer look at data breaches
[37] The extreme risk of personal data breaches and the erosion of privacy
[38] Modelling Extremal Events for Insurance and Finance
[39] Copula-based actuarial model for pricing cyber-insurance policies
[40] Cyber-risk decision models: To insure IT or not?
[41] Models and measures for correlation in cyber-insurance
[42] A vine copula model for predicting the effectiveness of cyber defense early-warning
[43] Modeling multivariate cybersecurity risks

This work has been supported by the NSF under grants CNS-1616575, CNS-1939006, and ARO W911NF1810208.