key: cord-0045844-bvgzy7yq
authors: Śliwka, Piotr; Socha, Leslaw
title: A Comparison of Generalized Stochastic Milevsky-Promislov Mortality Models with Continuous Non-Gaussian Filters
date: 2020-05-23
journal: Computational Science - ICCS 2020
DOI: 10.1007/978-3-030-50423-6_26
sha: 87a0ccdc0a6a3d82aa0aa89abfe441d3415237f6
doc_id: 45844
cord_uid: bvgzy7yq

The ability to model mortality rates $\mu_{x,t}$ precisely plays an important role, from an economic point of view, in healthcare. The aim of this article is to compare estimates of mortality rates obtained from a class of stochastic Milevsky-Promislov mortality models. We assume that excitations are modeled by second, fourth and sixth order polynomials of outputs from a linear non-Gaussian filter. To estimate the model parameters we use the first and second moments of $\mu_{x,t}$. The theoretical values obtained in both cases were compared with the theoretical $\mu_{x,t}$ based on a classical Lee-Carter model. The obtained results confirm the usefulness of the switched model based on continuous non-Gaussian processes for modeling $\mu_{x,t}$.

Determining mortality models is one of the basic problems not only in the field of life insurance but, recently, particularly in the economics of healthcare [1, 3, 4, 5]. Currently, the most frequently used model is the Lee-Carter model (see [11, 12, 16, 17]), which captures not only the change in mortality associated with age x and calendar year t but also the influence of belonging to a particular generation (cohort effect), and takes the form: $\ln(\mu_{x,t}) = \alpha_x + \beta_x k_t + \varepsilon_{x,t}$. The assumption that the estimated $\alpha_x$ and $\beta_x$ are fixed in time t has drawn criticism, especially from the point of view of forecasting. Therefore, there is a need to look for other methods of predicting mortality rates that take into account the variability of parameters over time.
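The Lee-Carter form $\ln(\mu_{x,t}) = \alpha_x + \beta_x k_t + \varepsilon_{x,t}$ can be illustrated with a minimal sketch. It is not the estimation code used in the paper: it assumes the common least-squares fit via a rank-one singular value decomposition with the usual identifiability constraints (the $\beta_x$ sum to one, the $k_t$ sum to zero). The function name `fit_lee_carter` and all numbers in the synthetic example are hypothetical.

```python
import numpy as np


def fit_lee_carter(log_m):
    """Least-squares fit of ln(mu[x, t]) = alpha[x] + beta[x] * k[t].

    log_m: 2-D array of log mortality rates, rows = ages x, columns = years t.
    Returns (alpha, beta, k) with sum(beta) = 1 and sum(k) = 0.
    """
    log_m = np.asarray(log_m, dtype=float)
    # alpha[x] is the time average of the log rates at age x.
    alpha = log_m.mean(axis=1)
    centered = log_m - alpha[:, None]
    # The leading singular triple gives the best rank-1 approximation.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    beta, k = u[:, 0], s[0] * vt[0]
    # Rescale so that beta sums to one (assumes the sum is not zero).
    scale = beta.sum()
    return alpha, beta / scale, k * scale


# Synthetic example (made-up numbers, not the Polish data used in the paper).
rng = np.random.default_rng(0)
ages, years = 5, 30
alpha_true = np.linspace(-6.0, -3.0, ages)
beta_true = np.full(ages, 1.0 / ages)
k_true = np.linspace(3.0, -3.0, years)
log_m = alpha_true[:, None] + np.outer(beta_true, k_true)
log_m += rng.normal(scale=0.01, size=log_m.shape)
alpha_hat, beta_hat, k_hat = fit_lee_carter(log_m)
```

Because $\alpha_x$ and $\beta_x$ are estimated once for the whole observation window, the fitted surface cannot follow a change of regime inside that window, which is the limitation the switched models below are meant to address.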
One such proposal is the approach recently put forward in Rossa and Socha [18], Rossa et al. [19], Sliwka and Socha [20], and Sliwka [21, 22], which is based on the Milevsky-Promislov family of models [15] with extensions [2, 7, 13]. Methods of modeling $\mu_{x,t}$ that take into account the causes of death are characterized in [23]. Switching models, which consist of several subsystems with the same structure but different parameters and which can switch over time according to an unknown switching rule, have been studied, among others, in Sliwka and Socha [20]. That paper showed that modeling the empirical mortality coefficients $\mu_{x,t}$ with the second-order non-Gaussian linear scalar filter model with switchings (nGLSFo2s) gives a more precise estimate of $\mu_{x,t}$ than the Gaussian linear scalar filter model with switchings (GLSFs) or the Lee-Carter model with switchings (LCs) for some fixed ages x.

In this paper we first propose three extended Milevsky-Promislov models with continuous non-Gaussian filters. We assume that excitations are modeled not only by the second-order but also by the fourth-order (nGLSFo4) and the sixth-order (nGLSFo6) polynomials of outputs from a linear nGLSF. To estimate the model parameters we use the first and second moments of mortality rates. We show that in the considered models some of the parameters can be estimated. Next, we use these models to create hybrid models whose submodels have the same structure and possibly different parameters. To the best of our knowledge, the mortality models proposed above, their hybrid versions, and the methods for estimating their parameters and switching points are new in the field of life insurance.

The paper is organized as follows. In Sect. 2, basic notation and definitions of stochastic hybrid systems are introduced.
Three new basic models represented by even-order polynomials of outputs from a linear Gaussian filter are introduced in Sect. 3, together with the non-stationary solutions of the corresponding moment equations. These non-stationary solutions are derived in the Appendix. In Sect. 4 the procedure for parameter estimation and the determination of switching points is presented. Parameter estimation is performed with an adapted numerical algorithm for a nonlinear minimization problem. In Sect. 5 we compare empirical mortality rates with the theoretical ones obtained from the proposed models as well as from the standard LC model in two versions, with and without switchings. The last section summarizes the obtained results.

Throughout this paper we use the following notation. Let $|\cdot|$ and $\langle\cdot,\cdot\rangle$ be the Euclidean norm and the inner product in $\mathbb{R}^n$, respectively. Let $(\Omega, \mathcal{F}, \{\mathcal{F}_t\}_{t\ge 0}, P)$ be a complete probability space with a filtration $\{\mathcal{F}_t\}_{t\ge 0}$ satisfying the usual conditions. Let $\sigma(t): \mathbb{R}_+ \to S$ be the switching rule, where $S = \{1, \ldots, N\}$ is the set of states. We denote the switching times by $\tau_1, \tau_2, \ldots$ and assume that there is a finite number of switches on every finite time interval. Let $W_k(t)$ be independent Brownian motions. We assume that the processes $W_k(t)$ and $\sigma(t)$ are both $\{\mathcal{F}_t\}_{t\ge 0}$-adapted.

By a stochastic hybrid system we mean the vector Itô stochastic differential equations with a switching rule described by

$$dx(t) = f(x(t), t, \sigma(t))\,dt + \sum_{k=1}^{M} g_k(x(t), t, \sigma(t))\,dW_k(t), \qquad (\sigma(0), x(0)) = (\sigma_0, x_0), \tag{1}$$

where $x \in \mathbb{R}^n$ is the state vector, $(\sigma_0, x_0)$ is an initial condition, $t \in T$, and M is the number of Brownian motions. The functions $f(x(t), t, \sigma(t))$ and $g_k(x(t), t, \sigma(t))$ are defined by the sets of $f(x(t), t, l)$ and $g_k(x(t), t, l)$, respectively, i.e. $f(x(t), t, \sigma(t)) = f(x(t), t, l)$ and $g_k(x(t), t, \sigma(t)) = g_k(x(t), t, l)$ for $\sigma(t) = l$. The functions $f: \mathbb{R}^n \times T \times S \to \mathbb{R}^n$ and $g_k: \mathbb{R}^n \times T \times S \to \mathbb{R}^n$ are locally Lipschitz and such that for all $l \in S$, $t \in T$: $f(0, t, l) = g_k(0, t, l) = 0$, $k = 1, \ldots, M$.
These conditions, together with those imposed on the switching rule $\sigma$, ensure that there exists a unique solution to the hybrid system (1). Hence Eq. (1) can be treated as a family (set) of subsystems defined by

$$dx(t, l) = f(x(t, l), t, l)\,dt + \sum_{k=1}^{M} g_k(x(t, l), t, l)\,dW_k(t), \qquad l \in S, \tag{2}$$

where $x(t, l) \in \mathbb{R}^n$ is the state vector of the l-th subsystem. We additionally assume that the trajectories of the hybrid system are continuous. This means that when the stochastic system is switched from subsystem $l_1$ to subsystem $l_2$ at the moment $\tau_j$, then

$$x(\tau_j, l_2) = x(\tau_j^-, l_1). \tag{3}$$

We consider a family of mortality models with a continuous nGLSF described by Eqs. (4)-(5), where $\mu_x(t, l)$ is a stochastic process representing the mortality rate for a person aged x ($x \in X = \{0, 1, \ldots, \omega\}$) at time t; $\alpha^l_x$, $\beta^l_{x1}$, $q^l_{xi}$, $i = 1, \ldots, m$, $\mu^l_{x0}$, $\gamma^l_{x1}$ are constant parameters, $l \in S$; and $W(t)$ is a standard Wiener process. We will show that the proposed model (4), (5) can be transformed to the form (2) for all $l \in S$. Introducing the new variables $y_1(t, l) = y(t, l)$, $y_i(t, l) = y^i(t, l)$, $i = 1, \ldots, m$, $l \in S$, and applying the Itô formula, we obtain Eqs. (6)-(7). Taking the natural logarithm of both sides of Eq. (4) and applying the Itô formula for all $l \in S$ we find [Formula: see text].

Now we consider in detail three cases of the model (4) and (6)-(7), namely m = 2, 4 and 6. For m = 6, Eqs. (4) and (6)-(7) take the form (10)-(16). Introducing a new state vector with components $z_{xi}(t, l)$, $i = 1, \ldots, 7$, one can rewrite Eqs. (10)-(16) in the vector form (19), with matrices $A^6_x(l)$, $C^6_x(l)$ and vector $b^6$. We note the similarity of this form to Eq. (2).

Using the linear vector stochastic differential Eq. (19) and the Itô formula we derive differential equations for the first-order moments $E[z_{xi}(l)]$ and the second-order moments $E[z_{xi}(l) z_{xj}(l)]$, $i, j = 1, \ldots, 7$. Next, we find the nonstationary solutions for the first moment of the processes $z_{xi}(t, l)$ for nGLSF models of all orders, i.e. the (nGLSFo1), (nGLSFo2), ..., (nGLSFo6) models. For the second moment of the processes $z_{xi}(t, l)$, we first find the nonstationary solutions for the nGLSF even-order models.
In the case of the sixth-order model it has the form [Formula: see text], where $l \in S$, $q^l_{x2} = q^l_{x4} = q^l_{x6} = 1$, and $\alpha^l_{0x}$, $c^l_{0x}$ are constants of integration (see Sect. A). To obtain the moment equations for the nGLSF second- and fourth-order models and the corresponding stationary and nonstationary solutions we assume that:
- in the case of the second-order model, $q^l_{x2} = 1$ and $q^l_{x4} = q^l_{x6} = 0$,
- in the case of the fourth-order model, $q^l_{x2} = q^l_{x4} = 1$ and $q^l_{x6} = 0$.

The corresponding nonstationary solution for the second moment of the process $z_{x1}(t, l)$ takes the form [Formula: see text] for the nGLSF second-order model, where $c^l_{0x}$ is an integration constant, and [Formula: see text] for the nGLSF fourth-order model, where $c^l_{0x}$ is an integration constant and p = [Formula: see text]. It can be proved that for odd-order models the nonstationary solutions have similar forms: for the first-order (nGLSFo1) model, [Formula: see text], and for the other odd-order models (nGLSFo3), (nGLSFo5), (nGLSFo7) the nonstationary solutions are the same as those for the nGLSF even-order models (nGLSFo2), (nGLSFo4), (nGLSFo6), respectively.

Simultaneous estimation of the parameters $\alpha^l_0$, $\alpha^l_x$, $\beta^l_x$, $\gamma^l_x$, $c^l_{0x}$, $q^l_x$ (where $l \in S$) of the nGLSF models of order 2, 4 or 6 given by formulas (22)-(26) using traditional methods does not provide unambiguous results (this problem has already been considered in [20], part 4.1.1, in particular by considering the analytical formula for estimating the parameters of the GLSFo2 model). Therefore, a two-step procedure was used to estimate the parameters. In the first step, the parameters $\alpha_{0x}$ and $\alpha_x$ of the first moment $E[z_{x1}(t)]$ of the process $z_{x1}(t)$ were estimated. In the second step, $c_{0x}$ and $p_x$ of the second moment $E[z^2_{x1}(t)]$ were estimated based on the already known $\alpha^l_{0x}$, $\alpha^l_x$, where $p_x$ was defined as follows: [Formula: see text].
The applied procedure yields unambiguous estimates of all parameters, assuming that $q_{xi} = 1$ for all $i = 1, \ldots, 6$.

One of the fundamental problems in the field of switching models is to find the set of switching points. This problem is closely related to the segmentation of a time series, which is discussed in many papers (see for instance [10, 14]). We propose a procedure that combines a statistical test (based on [6]) with the so-called Top-Down algorithm. It has the following form.

First we introduce some notation. We assume that an extracted time series (Input) consists of n empirical values $y_{emp1}, y_{emp2}, \ldots, y_{empn}$ defined at the time points $t_1, t_2, \ldots, t_n$, respectively. By $\langle t_1, t_2 \rangle$ we denote an interval that begins at $t_1$ and ends at $t_2$. We define three sets: P, the set of non-verified intervals; R, the set of intervals without switching points; and T, the set of switching points. The initial conditions then have the form $P = \{\langle t_1, t_n \rangle\}$, $R = \emptyset$, $T = \emptyset$.

Step 1 We calculate the values of the function $L(*)$ given by formula (28) for all points of the considered interval, assuming the random component $\varepsilon_t \sim N(\mu, \sigma^2)$. If $L(\tau_1) = \max L(*)$ is found at the beginning or at the end of the considered interval, then there is no switching point in this interval, and we obtain $P = \emptyset$, $\langle t_1, t_n \rangle \in R$, $T = \emptyset$. If $L(\tau_1) = \max L(*)$ is found inside the interval for $\tau_1 = t_k$, then $t_k$ is added to T and the interval is split at $t_k$ into two subintervals, which are placed in P.
Step 2 Choose an interval from the set P and check whether its length is greater than 2.
Step 3 If "no", transfer this interval from the set P to the set R and go back to Step 2; if "yes", go back to Step 1.
Step 4 The procedure ends when $P = \emptyset$. Then R consists only of subintervals without switching points, and T consists of all switching points, which can be sorted from the smallest to the greatest.

In Subsect. 4.1 we have established the set of switching points, which allows the submodels to be defined.
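The Top-Down procedure above can be sketched in code. This is an illustration under stated assumptions, not the authors' implementation: formula (28) is not reproduced here, so the sketch substitutes a Chow-type F statistic (in the spirit of the test cited as [6]) computed from straight-line fits on each side of a candidate split, and it accepts a split only when that statistic exceeds a threshold. The names `rss_linear`, `chow_stat`, `top_down_switch_points` and the parameters `f_crit` and `min_len` are hypothetical.

```python
import numpy as np


def rss_linear(t, y):
    """Residual sum of squares of an ordinary least-squares line fit."""
    design = np.column_stack([np.ones_like(t), t])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def chow_stat(t, y, k):
    """Chow-type F statistic for a break after index k (stand-in for L(*))."""
    n, p = len(y), 2  # p = parameters per fitted line
    rss_pooled = rss_linear(t, y)
    rss_split = rss_linear(t[:k + 1], y[:k + 1]) + rss_linear(t[k + 1:], y[k + 1:])
    # The floor guards against division by (almost) zero on exact fits.
    return ((rss_pooled - rss_split) / p) / max(rss_split / (n - 2 * p), 1e-12)


def top_down_switch_points(t, y, f_crit=10.0, min_len=3):
    """Return (sorted switch indices, sorted intervals without switches)."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    pending = [(0, len(y) - 1)]  # set P: intervals still to be verified
    no_switch = []               # set R: intervals without switching points
    switches = []                # set T: last index of each left-hand regime
    while pending:
        lo, hi = pending.pop()
        n = hi - lo + 1
        if n < 2 * min_len:      # too short to be split any further
            no_switch.append((lo, hi))
            continue
        seg_t, seg_y = t[lo:hi + 1], y[lo:hi + 1]
        candidates = range(min_len - 1, n - min_len)
        scores = [chow_stat(seg_t, seg_y, k) for k in candidates]
        best = int(np.argmax(scores))
        if scores[best] < f_crit:  # no convincing break in this interval
            no_switch.append((lo, hi))
            continue
        split = lo + candidates[best]
        switches.append(split)
        pending.extend([(lo, split), (split + 1, hi)])
    return sorted(switches), sorted(no_switch)
```

The three lists play the roles of the sets P, R and T. Unlike the procedure in the text, which treats a maximum at an interval boundary as "no switching point", this sketch uses a threshold on the test statistic, so `f_crit` would have to be tuned or replaced by a proper critical value in any real application.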
From (21) and further considerations we find that the unknown parameters in the family (19) are those listed in (29), and the parameters $q_{x2}$, $q_{x4}$, $q_{x6}$ are equal to 0 or 1. The parameters (29) appearing in formulas (23)-(26) were estimated with a numerical nonlinear minimization algorithm under the additional constraint $\alpha^l_{0x} < 0$ for all x. The algorithm generates a population of random starting points and then runs a local optimization method from each of the starting points to converge to a local minimum. The best local minimum was chosen as the solution.

For a fixed sex, a fixed age x, and known switching points (determined in accordance with the procedure described above), two sets of time series of $\mu_{x,t}$ values were created. In the first case, the estimation of $\mu_{x,t}$ was based on empirical data from 1958-2010 (using the next 6 years for ex-post error evaluation). In the second case, a similar estimation was based on the years 1958-2016. In both cases the theoretical value of $\mu_{x,t}$ at a fixed moment t was chosen from the theoretical values of the models (nGLSFo2), (nGLSFo4) and (nGLSFo6) by minimizing the absolute error (AE), i.e. $\min_{i=2,4,6} |\mu^{nGLSFoi}_{x,t} - \mu_{x,t}|$. In addition, point forecasts for the period 2017-2025 were determined. The parameters of the Lee-Carter model with switchings were estimated using the formulas given in the literature [11] and the same set of switches as for the nGLSF model. We note that the hybrid model (19) is continuous. However, the first- and second-order moment equations defined by (22)-(27) are not continuous at the switching points, because the empirical mortality data we used were discrete and these moments are determined separately for every submodel.

Selected results for a 45-year-old and a 60-year-old woman and man are presented in Figs. 1 and 2 (source of empirical data: [8]). In Figs. 1 and 2, blue circular points indicate empirical data; red, black and green solid lines indicate the theoretical values of the models Lee-Carter (LCs), nGLSF order 2 (nGso2) and nGLSF mixed order 2, 4 and 6 (nGs) with switchings, respectively; and the solid purple line indicates the forecast of the nG (nGsf) for the next five years.

To verify the goodness of fit of the proposed nGs models with switchings to the empirical mortality rates, and to compare it with that of the Lee-Carter model, the mean squared errors (MSE) between the empirical mortality rates $\mu_{x,t}$ and the corresponding theoretical values in the years 1958-2010 ('10) and 1958-2016 ('16), as well as the 95% confidence intervals for the MSE, were calculated. Selected results (45- and 60-year-old female and male) are presented in Table 1 (where $CI_L$ and $CI_U$ denote the lower and upper confidence limits, and $\{W, M\}_{X,MSE}$ is the MSE value for a {female, male} aged X). The results in the 5th column illustrate the model (nGso2) considered in [20].

The MSE values calculated on the basis of empirical and theoretical data from 1958-2016 and included in Table 1, together with Figs. 1 and 2, lead to the following conclusions:
- the theoretical values of the mortality rate $\mu^{nGs}_{x,t}$ based on the non-Gaussian linear scalar filters with switching are closer to the empirical values than $\mu^{LCs}_{x,t}$ based on the LC model and $\mu^{nGso2}_{x,t}$ with switching, for both a 45-year-old and a 60-year-old woman and man,
- the width of the confidence interval is the smallest for the nGs model among all models given in Table 1, which means greater precision of the proposed nGs for forecasting than the other models presented here,
- the empirical mortality rates for women are fitted more accurately by the proposed nGs model than those for men (lower MSE value),
- based on the graphical results (Figs. 1 and 2), the proposed method of modeling $\mu_{x,t}$ using nGs adapts more precisely to the empirical data than the LC model, especially for data with a large variance (e.g. see the empirical data from 1980-1990 for a 60-year-old man in Fig. 2, right side).

Moreover, taking into account all results for people aged x = 0, ..., 100 years (also partly included in Table 1), the proposed nGs model fits the empirical data more accurately for younger than for older people (lower MSE for 45-year-old than for 60-year-old men and women).

In this paper, three extended Milevsky-Promislov models with excitations modeled by the second-, the fourth- and the sixth-order polynomials of outputs from a linear non-Gaussian filter are proposed and adapted to Polish mortality data. To obtain hybrid models, procedures for parameter estimation and for the determination of switching points were proposed. Based on the theoretical values obtained from these three models, one series of theoretical values was constructed using the AE criterion and compared with the theoretical mortality rates based on the classical Lee-Carter model. In addition, a point forecast was computed. The obtained results confirm the usefulness of the switched model based on the continuous non-Gaussian process for modeling mortality rates. A natural extension of the research contained in this article is the application of Markov chains (homogeneous or heterogeneous) to describe the space of states built on the extended Milevsky-Promislov models with excitations modeled by the second-, the fourth- and the sixth-order polynomials. These issues will be examined in the next article.

The derivation of stationary and nonstationary solutions of moment equations in the nGLSF sixth-order model

Using the linear vector stochastic differential equation (19) and the Itô formula we derive differential equations for the first-order moments $E[z_{xi}(l)]$ and the second-order moments $E[z_{xi}(l) z_{xj}(l)]$, $i, j = 1, \ldots, 7$.
Next we find the stationary solutions for the first-order moments $E[z_{xi}(l)]$, $i = 2, 3, \ldots, 7$, and for the second-order moments $E[z_{xi}(l) z_{xj}(l)]$, $i, j = 1, \ldots, 7$, $(i, j) \neq (1, 1)$, by equating the corresponding time derivatives to zero, which gives conditions (30)-(31). Then we obtain (32)-(33). Hence, from conditions (32)-(33) and the corresponding moment equation we find the nonstationary solution for the first moment of the process $z_{x1}(t, l)$, where $\alpha^l_0$ is an integration constant. Next, taking into account conditions (30)-(31), (32)-(33) and (35) we obtain Eq. (49). Hence, from Eq. (49) and equality (35) we find the nonstationary solution for the second moment of the process $z_{x1}(t, l)$, where $c^l_{0x}$ is an integration constant.

Mortality modelling and forecasting: a review of methods
Stochastic Hybrid Systems: Analysis and Design
Modelling and management of mortality risk: a review
A quantitative comparison of stochastic mortality models using data from England and Wales and the United States
Trend analysis of mortality rates and causes of death in children under 5 years old in Beijing, China from 1992 to 2015 and forecast of mortality into the future: an entire population-based epidemiological study
Tests of equality between sets of coefficients in two linear regressions
A stochastic model for mortality rate on Italian Data
Trend forecasting of main groups of causes-of-death in Iran using the Lee-Carter model
Segmenting time series: a survey and novel approach
Modeling and forecasting the time series of U.S. mortality
Evaluating the performance of the Lee-Carter method for forecasting mortality
Switching in Systems and Control
Algorithmic methods for segmentation of time series: an overview
Mortality derivatives and the option to annuitise
Lee-Carter mortality forecasting with age-specific enhancement
A cohort-based extension to the Lee-Carter model for mortality reduction factor
Proposition of a hybrid stochastic Lee-Carter mortality model
Hybrid Dynamic and Fuzzy Models of Mortality
A proposition of generalized stochastic Milevsky-Promislov mortality models
Proposed methods for modeling the mortgage and reverse mortgage installment
Application of the Markov chains in the prediction of the mortality rates in the generalized stochastic Milevsky-Promislov model
Application of the model with a non-Gaussian linear scalar filters to determine life expectancy, taking into account the cause of death