title: Hidden Markov Models for Longitudinal Rating Data with Dynamic Response Styles
authors: Colombi, R.; Giordano, S.; Kateri, M.
date: 2021-11-26
This work deals with the analysis of longitudinal ordinal responses. The novelty of the proposed approach lies in simultaneously modeling the temporal dynamics of a latent trait of interest, measured via the observed ordinal responses, and the answering behaviors influenced by response styles, through hidden Markov models (HMMs) with two latent components. This approach enables the modeling of (i) the substantive latent trait, controlling for response styles; (ii) the change over time of the latent trait and of the answering behavior, allowing also dependence on individual characteristics. For the proposed HMMs, estimation procedures, methods for calculating standard errors, measures of goodness of fit and classification, and full-conditional residuals are discussed. The proposed model is fitted to ordinal longitudinal data from the Survey on Household Income and Wealth (Bank of Italy) to give insights into the evolution of Italian households' financial capability.
The psychometric literature has widely debated the different behavior patterns of respondents to rating surveys, which may introduce distortions or inaccuracies in their responses. Questions on attitudes, opinions and perceptions are usually Likert-type or rating-scale items, and the observed responses may not reflect the respondents' true preferences but rather their tendency to use only a small number of the available rating scale options, governed by an underlying behavioral mechanism known as Response Style (RS) (see Van Vaerenbergh and Thomas, 2013, for an overview).
The activation of a response style mechanism systematically influences the way interviewees use response scales, introducing bias in the responses and scale usage heterogeneity, which may impair the data quality and the validity of the results (e.g., Baumgartner and Steenkamp, 2001; Roberts, 2016). What is new in our approach is the focus on the longitudinal perspective, where respondents are asked, on several occasions, to give a subjective assessment of rating-scale items and their responses are indicators of a latent trait of interest (e.g., health status, environmental risk, customer satisfaction). Moreover, responses may or may not be driven by an RS, the RS attitude can vary dynamically, and the change over time of responses and answering behaviors can also depend on individual characteristics. More precisely, in the context of longitudinal ordered categorical data analysis, the methodological contribution of the paper is a hidden Markov model (HMM) with a bivariate latent Markov chain that jointly models an unobservable trait of interest and an unobservable binary indicator of the respondent's form of answering (response style driven or not) over time. The use of HMMs in the context of categorical longitudinal data is not new (see Bartolucci et al., 2012, for a comprehensive review), but to date no HMM-based procedure exists for modeling the evolution of an underlying response behavior over time. A further contribution of the proposed approach lies in providing a parsimonious parametrization of the probability functions of the observed responses dictated by an RS. Several RSs have been identified and studied (e.g., Baumgartner and Steenkamp, 2001; Van Vaerenbergh and Thomas, 2013), and here a model is introduced that easily captures the most commonly encountered RSs.
In fact, in our approach, the observation probability functions, conditionally on the presence of an RS, depend on two parameters only, but offer great flexibility in the types of RSs that can be modelled, such as the tendency to select categories at random (careless RS, CRS), the tendency to prefer positive response categories or answer with agreement (acquiescent RS, ARS), negative response categories (disacquiescent RS, DRS), middle/neutral categories (middle RS, MRS), or extreme categories (extreme RS, ERS). Other approaches that tackle multiple RSs simultaneously, for cross-sectional data, rely on more complex models such as multi-trait models (e.g., Wetzel and Carstensen, 2015; Falk and Cai, 2016), item response models (e.g., Böckenholt, 2012; Henninger and Thorsten, 2020; Zhang and Wang, 2020) or latent class factor models (Kieruj and Moors, 2013). Also novel is the use of stereotype logit models (Anderson, 1984) to investigate how covariates affect the initial and transition probabilities of the latent Markov chain. To our knowledge, such parsimonious models of sound interpretation have not previously been used in the HMM framework. In summary, our approach enables:
The proposed methodology is of interest in all longitudinal surveys that model attitudes, opinions, perceptions or beliefs that are indicators of variables which cannot be directly measured or observed. For example, in healthcare studies, patients are asked, on several occasions, to give a subjective assessment of their health status or disability in daily living; in marketing research, customers are required to evaluate their satisfaction with services or products; in socio-economic contexts, citizens are invited to state to what extent they agree or disagree on sensitive topics (immigration, criminality, gender gap); in environmental studies, interviewees are asked to reveal their perception of the impact of climate change and environmental risk.
In all these cases, the presence of RS effects cannot be ignored, and substantive latent traits need to be measured taking into account the effects due to RSs. To show the practical usefulness of our proposal, we investigate the evolution over time of the household financial capability (a broad term encompassing the behaviour, knowledge, skills and attitudes of people with regard to managing their financial resources, e.g. Zottel et al., 2013) as a latent psychological and behavioral trait that influences the household's decision-making when facing financial issues. The latent financial capability is here measured in terms of two observed indicators: the self-perceived ability to make ends meet and the self-reported perceived risk related to financial investments. These indicators have a great impact on the score measuring the financial capability, as defined according to the Organization for Economic Cooperation and Development methodology (survey OECD, 2020), applied by 36 countries and implemented in Italy by the Bank of Italy (D'Alessio et al., 2020).
The structure of the work is as follows. In Section 2, the data of our motivating problem, from a survey on the financial conditions of Italian households, are introduced, and the issues to be tackled are described. In Section 3, the modeling of different response style effects in the longitudinal perspective through HMMs is proposed and the advantages of our approach are highlighted. In Section 4, the latent and observation components of the proposed HMM are described in detail. Alternative HMMs are examined in Section 5, most of them being special cases of the model presented here. Section 6 is devoted to methodological contributions on maximum likelihood estimators of the parameters, measures of goodness of fit and classification, and full-conditional residuals.
In Section 7, the proposed model is fitted to the real data of Section 2, implementing the developed estimation procedure and providing answers to the questions raised in Section 2. Concluding remarks are given in Section 8. Technical details on the methods to calculate the standard errors are deferred to the Appendix.
Our work meets the growing interest in households' financial capability. Governments are now playing an active role in meeting the financial capability challenge. Initiatives to increase capability have been promoted widely, from the National Strategy for Financial Capability in the UK (HM-Treasury, 2007) to the EU Commission (Valant, 2015), the National Financial Capability Study in the USA (Lin et al., 2019) and the OECD (OECD, 2020), among others. The European Commission recognised that individual financial and economic behavior is relevant to the EU policy-making process, and since 2009 it has incorporated behavioral insights into the design, implementation and monitoring of EU policies (Van Bavel et al., 2013). Psychological and behavioural aspects affecting people's economic and financial decisions (studied in behavioral economics) are also built into practices to strengthen financial consumer protection (as agreed in the action plan endorsed by G20/OECD, Lefevre and Chapman, 2017). In this direction, we propose here to model, through HMMs, the dynamics of the households' perception of their financial conditions, accounting for the way households disclose their perceptions.
The data are from the waves of the Survey on Household Income and Wealth (SHIW). The survey has been conducted by the Bank of Italy every two years since the 1960s to collect information about the income, wealth and saving of Italian households. Over the years, the survey has grown in scope and now also includes aspects of households' economic and financial behaviour; furthermore, since 2004 it has contained information on attitudes towards financial risk.
The data used refer to 1109 Italian households involved in all the waves from 2006 to 2016. We considered the items:
- $R_1$ reveals the perception of the household's financial ability to make ends meet, based on the answers of the head of the household to the question: Is your household's income sufficient to see you through to the end of the month... very easily, easily, fairly easily, fairly difficultly, with some difficulty, very difficultly;
- $R_2$ indicates the risk perception in managing financial investments, measured through the response to the question: in managing your financial investments, would you say you have a preference for investments that offer: low returns, with no risk of losing the invested capital (risk averse); a fair return, with a good degree of protection for the invested capital (risk tolerant); good-high returns, but with a fair-high risk of losing part of the capital (risk lover).
We focus on these two indicators, among others, since they strongly orient policy makers' choices. In particular, insights into this ability help in developing effective programmes to educate people to manage their resources, in reducing welfare dependency, and in identifying vulnerable groups of the population for which targeted interventions can be designed. The OECD, in its recent survey (OECD, 2020), recognises that large groups of citizens lack the financial behavior and financial resilience needed to deal effectively with everyday financial management. This is particularly concerning at the time of the unfolding crisis resulting from the COVID-19 pandemic, which is likely to put considerable economic and financial pressure on individuals and test their ability to preserve their financial well-being. Moreover, to understand financial capability it is important to comprehend how households think and feel about the risks they face (Slovic, 2010).
Risk perception is an important determinant of protective behavior since, in general, the success of public intervention programs largely depends on individual risk perception. Understanding perceived risk may offer useful prompts for the design of effective investor education programmes and help direct preventive initiatives against bad financial decisions towards vulnerable individuals (e.g., Pidgeon, 1998; Gentile et al., 2015; Nguyen et al., 2019). Jappelli et al. (2014)
The frequencies of all the 18 pairs of categories of the two items, $R_1$ and $R_2$, over the six years are represented in Figure 1. The perceptions evidently change over time. The most commonly chosen responses are a fairly difficult ability to make ends meet and a risk loving behavior towards financial proposals. The choice also falls frequently on the pairs: very easily-averse, easily-averse, fairly easily-averse. The number of households that selected these four most common responses is represented in Figure 2, over the years, for four groups of households, identified as those with the most representative profiles (the most frequent configurations of the covariates among the 94 actually observed in the data at hand). These plots exemplify just a portion of the data on the three dimensions: responses × covariates × time.
In our context, responses $R_1$ and $R_2$ can be considered as the manifest expressions of the household's latent financial capability. Moreover, we believe that an unobservable answering behavior drives respondents, who can reveal their perceptions in two ways: with awareness, when their answers reflect their true opinion, or according to a response style, when, being in doubt or reluctant to disclose their opinion, they prefer extreme or middle categories of the rating scale or, according to their inclination, focus on the positive or negative side of the rating scale.
Our approach gives us various opportunities: (i) to describe simultaneously the dynamic behavior of respondents in the way of answering and in disclosing their per- (iii) to discriminate groups of respondents, with certain profiles, in the latent classes that identify various degrees of the latent financial capability, taking into account that they can answer with awareness or may prefer a response style. Section 7 will shed light on these aspects of interest.
RS mechanisms lead to biased measurement of the traits of interest, which may seriously influence the results of a survey and thus be responsible for non-optimal decisions. Underlying RSs affect all levels of the analysis of survey data, from violations of the adopted model assumptions up to biased estimation of parameters and measures of interest, like correlations in cross-sectional survey data, as shown, among others, by Piccolo and Simone (2019) for CRS, Dolnicar and Grün (2009) (Bachman and O'Malley, 1984; Paulhus, 1991) or it is not necessarily consistent over time, depending on the measurement situation (Weijters, 2006; Aichholzer, 2013). In this regard, the RS is described through time-invariant and time-specific latent factors in Weijters et al. (2010). A recent proposal in the direction of dynamic response styles is by Soland and Kuhfeld (2020), who concluded that the stability over time of within-subject RS factors is not always justified, by comparing multidimensional nominal response models (Bolt and Johnson, 2009 (Billiet and Davidov, 2008).
Consider $r$ ordinal responses observed on $n$ units (subjects/items) at $T$ time occasions. In particular, let $Y_{jit}$, $Y_{jit} \in C_j = \{1, \dots, c_j\}$, denote the $j$-th ordinal response variable, The latent variables $L_{it}$ and $U_{it}$ are independent across units and, for every unit, the process $\{L_{it}, U_{it}\}_{t \in T}$ is assumed to evolve in time according to a first order bivariate Markov chain with states $(u, l)$, $u \in S_U$, $l \in S_L$.
For the sequel, let always $i \in I$ and consider states $u, \bar u \in S_U$ and $l, \bar l \in S_L$. The latent component of the model is specified through its initial and transition probabilities. The initial probabilities are $\pi_{i1}(u, l) = P(U_{i1} = u, L_{i1} = l)$, and the transition probabilities are
$$\pi_{it}(u, l|\bar u, \bar l) = P(U_{it} = u, L_{it} = l \mid U_{it-1} = \bar u, L_{it-1} = \bar l) = \pi^{L}_{it}(l|\bar u, \bar l)\,\pi^{U|L}_{it}(u|l, \bar u, \bar l), \quad t = 2, \dots, T,$$
where $\pi^{L}_{it}(l|\bar u, \bar l) = P(L_{it} = l \mid U_{it-1} = \bar u, L_{it-1} = \bar l)$ denote the marginal transition probabilities for the latent variables $L_{it}$ and
$$\pi^{U|L}_{it}(u|l, \bar u, \bar l) = P(U_{it} = u \mid L_{it} = l, U_{it-1} = \bar u, L_{it-1} = \bar l) \qquad (2)$$
are the transition probabilities of the latent RS indicators $U_{it}$, conditioned on the transition $(\bar l, l)$ of the latent construct, called for short conditional RS transition probabilities. The introduced probabilities are required to satisfy the following conditions:
A1. Granger non causality assumption: $\pi^{L}_{it}(l|\bar u, \bar l) = \pi^{L}_{it}(l|\bar l)$, $t = 2, \dots, T$. It states that $L_{it} \perp U_{it-1} \mid L_{it-1}$, i.e. the latent construct, given its past, does not depend on the past of the RS indicator.
A2. Conditional independence of the current latent RS indicator from the past of the latent construct: $\pi^{U|L}_{it}(u|l, \bar u, \bar l) = \pi^{U|L}_{it}(u|l, \bar u)$, $t = 2, \dots, T$. This restriction on the probabilities (2) means that the current way of answering depends on its past and on the contemporaneous latent construct, but not on the past of the latent construct.
A3. Independence of the latent processes at the initial time: $\pi_{i1}(u, l) = \pi^{U}_{i1}(u)\,\pi^{L}_{i1}(l)$.
Assumptions A1 and A2 simplify the transition probabilities of the bivariate Markov chain to reduce the number of parameters, but can be relaxed. In the sequel, x Under the assumptions A1-A3, the initial and transition probabilities of the latent RS indicator and of the latent construct are specified by the following logit models:
- A linear baseline logit model for the initial probabilities of the latent construct:
- A logit model for the initial probabilities of the RS indicator: This model has $(1 + p^{(U)}_1)$ parameters.
- A set of $|S_L| = k$ linear baseline logit models for the marginal transition probabilities of the latent construct, each having as reference category the state $\bar l$ of the previous occasion, i.e.
for $\bar l \in S_L$: The total number of parameters for these models equals $k(k - 1)(1 + p^{(L)}_2)$.
- A logit model for the conditional RS transition probabilities for each possible RS state $\bar u$ of the previous occasion and for each current state $l$ of the latent construct: The $2k$ models have in total $2k(1 + p^{(U)}_2)$ parameters.
The number of parameters of models (3) and (5) for the latent construct increases with the number of states $k$, which is a drawback of these models. More parsimonious models can be considered, alternative to (3) and (5), that provide sound interpretation options. A convenient class of models for such purposes is that of stereotype logit models (Anderson, 1984; Agresti, 2010), which we shall employ for modeling the initial and marginal transition probabilities of $L_{it}$. The stereotype logit model for the initial probabilities, that can replace (3), is: where $\mu_l$, $l = 2, \dots, k$, are scores to be estimated. For identifiability purposes, since the model is invariant under scale transformations of the scores, we set $\mu_2 = 1$. This model has $(k - 2)(p^{(L)}_1 - 1)$ fewer parameters than model (3): the $(k - 1)\,p^{(L)}_1$ covariate effects of (3) are replaced by $p^{(L)}_1$ common effects and $k - 2$ free scores. Model (7) imposes a special structure on the way the covariates affect the odds of any two categories of $L_{it}$. In particular, for any $l_1, l_2 \in S_L$, we have: i.e. the effect of the covariates on the log-odds is proportional to the difference between the $\mu$-scores corresponding to the categories $l_1$ and $l_2$.
The stereotype logit models for the transition probabilities, that can replace (5), are defined analogously as: for $t = 2, \dots, T$. For $\bar l \neq 1$, $\nu_{1\bar l} = 1$; for $\bar l = 1$, $\nu_{2\bar l} = 1$; the rest of the $\nu$-scores are parameters to be estimated. These models require $k(k - 2)(p^{(L)}_2 - 1)$ fewer parameters than model (5). For any $l_1 \neq l_2$, we have: i.e. the effect of the covariates on the log-odds is proportional to the difference between the $\nu$-scores corresponding to the categories $l_1$ and $l_2$.
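The proportionality property of the stereotype model can be checked numerically. The Python sketch below is illustrative only: it assumes that the stereotype model for the initial probabilities takes the usual form $\log[\pi(l)/\pi(1)] = \alpha_{0l} + \mu_l\,\alpha_1' x$ with $\mu_1 = 0$ and $\mu_2 = 1$; the function name `stereotype_probs`, the parameter values and the covariate profiles are hypothetical and not taken from the paper.

```python
import numpy as np

def stereotype_probs(alpha0, mu, alpha1, x):
    """Probabilities of k latent states under an assumed stereotype logit model.

    Assumed form: log[pi(l) / pi(1)] = alpha0[l] + mu[l] * (alpha1 @ x),
    with alpha0[0] = 0 and mu[0] = 0 for the reference state.
    """
    eta = np.asarray(alpha0) + np.asarray(mu) * float(np.dot(alpha1, x))
    eta = eta - eta.max()            # subtract the maximum for numerical stability
    p = np.exp(eta)
    return p / p.sum()

# Hypothetical values for k = 3 states and 2 covariates
alpha0 = np.array([0.0, 0.3, -0.2])  # intercepts, reference state first
mu = np.array([0.0, 1.0, 2.5])       # scores: mu_1 = 0, mu_2 = 1, mu_3 free
alpha1 = np.array([0.8, -0.4])       # covariate effects shared by all states

def log_odds(x, l1, l2):
    """Log-odds of state l1 versus state l2 (0-based indices) at covariates x."""
    p = stereotype_probs(alpha0, mu, alpha1, x)
    return np.log(p[l1] / p[l2])

x_a, x_b = np.array([1.0, 0.0]), np.array([0.0, 1.0])
# Change in log-odds when moving from profile x_b to profile x_a
d_32 = log_odds(x_a, 2, 1) - log_odds(x_b, 2, 1)   # states 3 vs 2
d_21 = log_odds(x_a, 1, 0) - log_odds(x_b, 1, 0)   # states 2 vs 1
ratio = d_32 / d_21   # equals (mu_3 - mu_2) / (mu_2 - mu_1) under the assumed form
```

Under the assumed form, the covariate effect on any pair of states is the same linear predictor rescaled by the difference of their scores, which is why a single vector of effects suffices for all $k$ states.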
If the scores $\nu_{l\bar l}$, $l \neq \bar l$, in (9) are all equal to 1, we obtain a more parsimonious model, according to which the log odds (10) do not depend on covariates when both $l_1$ and $l_2$ differ from $\bar l$. According to this model there is a covariate effect on the odds of a transition to a different state, but this effect is the same for all the states different from the current one. Under this restriction, (9) simplifies to parallel baseline logit models for the
A different simplification follows by assuming that the scores $\mu_l$ (with $\mu_1 = 0$, $\mu_2 = 1$) and $\nu_{l\bar l}$ (with $\nu_{\bar l\bar l} = 0$, $\nu_{1\bar l} = 1$ for $\bar l \neq 1$, $\nu_{2\bar l} = 1$ for $\bar l = 1$) are linear functions of $l$, $l \in S_L$. In this case, (7) and (9) are equivalent to parallel adjacent categories logit models for the initial and transition probabilities, having $(k - 1) + p^{(L)}_1$ and $k\,((k - 1) + p^{(L)}_2)$ parameters, respectively. Nevertheless, while the previous stereotype models are invariant with respect to permutations of the $k$ latent states, the parallel adjacent categories logit model is not, and it should be considered only when the ordering of the latent classes is known a priori.
Simplifying restrictions can also be introduced for the conditional RS transition probabilities if, coherently with the idea that covariate effects are captured by the latent constructs $L_{it}$, the conditional RS transition probabilities are assumed time and subject invariant, that is:
The following independence assumptions specify the observation model:
B1. Subject independence. The vectors $Y_i$, $i \in I$, are independent random vectors.
B2. Hidden Markov assumption. For every unit $i$ and occasion $t$, the responses $Y_{jit}$, $j \in R$, are independent of their own past and depend only on $(L_{it}, U_{it})$.
B3. Contemporaneous independence. For every unit $i$, at any occasion $t$, the responses $Y_{jit}$, $j \in R$, are independent given the current state of the latent process.
B4. Subject and time invariance. The marginal probability functions of $Y_{jit}$, conditioned on the RS or AWR latent states $(u, l)$, are both time and subject invariant.
That is, for $t \in T$ and $i \in I$, it holds:
Under the previous assumptions, the observation probability functions are parameterized by the following logit models (without covariates), involving $k \sum_{j=1}^{r} (c_j - 1) + 2rk$ parameters:
- Given the RS regime, every probability function $f_{j|1}(y_j|l)$, $j \in R$, $l \in S_L$, is specified by the linear local logit model:
- Given the AWR regime of the RS indicator, every probability function $f_{j|2}(y_j|l)$, $j \in R$, $l \in S_L$, is parameterized by $c_j - 1$ adjacent categories logits:
The $\varphi_{0lj}$, $\varphi_{1lj}$ in (12) are parameters to be estimated and the scores $s_j(y_j)$ are known constants defined as: $s_j(y_j) = 1$ for $y_j < c_j/2$, $s_j(y_j) = 0$ for $y_j = c_j/2$, $s_j(y_j) = -1$ for $y_j > c_j/2$, $y_j = 1, 2, \dots, c_j - 1$. These scores have been proposed by Tutz and Berger (2016) to extend the adjacent categories logit model to account for RS effects. Parameter $\varphi_{0lj}$ governs the skewness of the probability function $f_{j|1}(y_j|l)$, so that it is symmetric with $\varphi_{0lj} = 0$, and left and right skewed with $\varphi_{0lj} > 0$ and $\varphi_{0lj} < 0$, respectively. Increasing positive values of $\varphi_{1lj}$ raise (decrease) the logits (12)
Figure 3: Response probability functions of respondents with ARS ($\varphi_{0lj} = 1$, $\varphi_{1lj} = 1$, or $\varphi_{0lj} = 2$, $\varphi_{1lj} = 1$), DRS ($\varphi_{0lj} = -2$, $\varphi_{1lj} = 1$, or $\varphi_{0lj} = -1$, $\varphi_{1lj} = 1$), MRS ($\varphi_{0lj} = 0$, $\varphi_{1lj} = 1$, or $\varphi_{0lj} = -0.5$, $\varphi_{1lj} = 1$, or $\varphi_{0lj} = 0.5$, $\varphi_{1lj} = 1$), ERS ($\varphi_{0lj} = 0$, $\varphi_{1lj} = -1$) and CRS ($\varphi_{0lj} = 0$, $\varphi_{1lj} = 0$) patterns, for response categories ranging from 1 = strongly disagree to 6 = strongly agree.
The suitability of model (12) for describing ARS, DRS, MRS, ERS and CRS behaviors is justified by the fact that the RS probability function defined by (12) can be unimodal only at the middle or extreme categories of the response scale.
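The shapes listed in the caption of Figure 3 can be reproduced with a few lines of code. The Python sketch below rests on an assumption: it takes the local logit of the RS regime to be $\log[f(y+1)/f(y)] = \varphi_0 + \varphi_1 s(y)$, $y = 1, \dots, c-1$, in the spirit of the adjacent categories extension of Tutz and Berger (2016). The function names `rs_scores` and `rs_pmf` are hypothetical.

```python
import numpy as np

def rs_scores(c):
    """Scores s(y), y = 1..c-1: +1 below c/2, 0 at c/2, -1 above c/2."""
    y = np.arange(1, c)
    return np.where(y < c / 2, 1.0, np.where(y > c / 2, -1.0, 0.0))

def rs_pmf(phi0, phi1, c):
    """Response-style probability function over the categories 1..c.

    Assumed local (adjacent categories) logit:
        log[f(y + 1) / f(y)] = phi0 + phi1 * s(y),  y = 1..c-1.
    """
    logits = phi0 + phi1 * rs_scores(c)
    log_f = np.concatenate(([0.0], np.cumsum(logits)))  # log f(y) up to a constant
    f = np.exp(log_f - log_f.max())                     # stabilised exponentiation
    return f / f.sum()

c = 6                          # six categories, as in Figure 3
crs = rs_pmf(0.0, 0.0, c)      # careless: uniform over the categories
mrs = rs_pmf(0.0, 1.0, c)      # middle: mass concentrated on categories 3 and 4
ers = rs_pmf(0.0, -1.0, c)     # extreme: mass concentrated on categories 1 and 6
ars = rs_pmf(2.0, 1.0, c)      # acquiescent: probabilities increase up to category 6
drs = rs_pmf(-2.0, 1.0, c)     # disacquiescent: probabilities decrease from category 1
```

With only two parameters per response and latent state, the sign of `phi1` switches between middle and extreme patterns, while `phi0` tilts the mass towards one end of the scale.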
In detail, for $\varphi_{1lj} > 0$, the probability function has a mode at the smallest category For $\varphi_{1lj} > 0$ and $-\varphi_{1lj} < \varphi_{0lj} < \varphi_{1lj}$, the mode is at the middle (MRS) category $y_j = (c_j + 1)/2$ when $c_j$ is odd, while for even $c_j$, the mode is at the middle category , the previous modal categories and all the categories to the left (to the right) are equiprobable modes. and the mode corresponds to the smallest (highest) category when $\varphi_{0lj} < 0$ ($\varphi_{0lj} > 0$). If $\varphi_{0lj} = 0$, then the extreme categories are equiprobable modes. Finally, it is worth noting that $\varphi_{1lj} = \varphi_{0lj} = 0$ gives the uniform distribution, commonly used to model CRS. Examples of the different shapes of the RS probability functions are illustrated in Figure 3.
By modifying assumption A2, interesting models, proposed in the literature in different frameworks, can be obtained as special cases of the RS-HMM. These models deserve to be considered because they help us to understand the assumptions on the latent RS component of our approach. Under the assumption $\pi^{U|L}_{it}(u|l, \bar u, \bar l) = \pi^{U}_{it}(u|\bar u)$, $i \in I$, $t \in T$, more restrictive than A2, the Markov chains $L_{it}$, $t \in T$, and $U_{it}$, $t \in T$, are independent for every $i \in I$ (i.e. parallel Markov chains) and the RS-HMM model becomes a factorial HMM (Ghahramani and Jordan, 1997) for $l \in S_L$, $y_j \in C_j$, $j \in R$ where f j|1 and f it (u|ū,l) = $\pi^{U}_{it}(u|\bar u)$, $t = 2, \dots, T$, which is analogous to the Granger non causality assumption A1. This model, according to which each latent variable does not Granger cause the other one, is a special case of the graphical multiple HMMs introduced by Colombi and Giordano (2015). The drawback of this model is that, under the two Granger non causality conditions, the transition probabilities $\pi_{it}(u, l|\bar u, \bar l)$ do not have a closed form expression and must be computed numerically as a function of the probabilities $\pi^{U}_{it}(u|\bar u)$, $\pi^{L}_{it}(l|\bar l)$ and a set of $k - 1$ odds ratios defined on the bivariate transition probabilities.
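By contrast, under A1 and A2 the bivariate transition probabilities are available in closed form as the product of the marginal transition probabilities of the latent construct and the conditional RS transition probabilities, and in the factorial HMM special case this product becomes a Kronecker product. The Python sketch below illustrates the construction for one unit and one occasion; the function `joint_transition`, the ordering of the states and the numerical values are assumptions made only for this example.

```python
import numpy as np

def joint_transition(P_L, P_U_given_L):
    """Transition matrix of the bivariate chain (U, L) under A1 and A2.

    P_L[lbar, l]            : P(L_t = l | L_{t-1} = lbar), shape (k, k)
    P_U_given_L[l, ubar, u] : P(U_t = u | L_t = l, U_{t-1} = ubar), shape (k, 2, 2)
    The state (u, l) is mapped to the index u * k + l (an arbitrary convention).
    """
    k = P_L.shape[0]
    P = np.zeros((2 * k, 2 * k))
    for ubar in range(2):
        for lbar in range(k):
            for u in range(2):
                for l in range(k):
                    P[ubar * k + lbar, u * k + l] = (
                        P_L[lbar, l] * P_U_given_L[l, ubar, u]
                    )
    return P

# Hypothetical values with k = 2 states of the latent construct
P_L = np.array([[0.9, 0.1],
                [0.2, 0.8]])
P_U = np.array([[0.7, 0.3],
                [0.4, 0.6]])

# Factorial HMM special case: the RS transitions do not depend on L_t
P = joint_transition(P_L, np.stack([P_U, P_U]))
same_as_kron = np.allclose(P, np.kron(P_U, P_L))
```

Letting the slices `P_U_given_L[l]` differ across `l` gives the general A2 case, in which the current latent construct modifies the persistence of the answering regime.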
See Colombi and Giordano (2015) for more details on these Granger non causality conditions and on a marginal parametrization that can be used in this context.
Let $\theta$ denote the vector of all the parameters of the latent and observation models. For example, in the simple case of a memoryless model with $k = 2$, no covariates and one response with four categories, it is: $\theta = (\alpha_{01}, \bar\alpha_{0}, \beta_{21}, \beta_{12}, \bar\beta_{01}, \bar\beta_{02}, \phi_{11}, \phi_{21}, \phi_{31}, \phi_{12}, \phi_{22}, \phi_{32}, \varphi_{01}, \varphi_{11}, \varphi_{02}, \varphi_{12})$. Hereafter, procedures to provide maximum likelihood estimates (MLE) of these parameters and their standard errors are illustrated.
The latent binary variable $d_{it}(u, l)$ is equal to 1 when the $i$-th unit (subject) is at time $t$ in state $(u, l)$, and the latent binary variable $d_{it}(u, l; \bar u, \bar l)$ is 1 if at time $t$, $t > 1$, the $i$-th subject is in state $(u, l)$ while at occasion $t - 1$ it was in $(\bar u, \bar l)$, $l, \bar l \in S_L$, $u, \bar u \in S_U$. Moreover, the observable binary variable $d_{jit}(y_j)$ is equal to 1 if at time $t$ the category $y_j$ of $Y_{jit}$, $j \in R$, is observed on the $i$-th individual, $i \in I$. If the above binary latent variables were observable, the parameters could be estimated by maximizing the following complete log-likelihood (i.e. the joint log-likelihood of the observations and the latent variables):
$$\begin{aligned}
\ell^{*}(\theta) ={}& \sum_{i \in I} \sum_{l} \Big(\sum_{u} d_{i1}(u, l)\Big) \log \pi^{L}_{i1}(l) + \sum_{i \in I} \sum_{u} \Big(\sum_{l} d_{i1}(u, l)\Big) \log \pi^{U}_{i1}(u) \\
&+ \sum_{i \in I} \sum_{t=2}^{T} \sum_{\bar l} \sum_{l} \Big(\sum_{u} \sum_{\bar u} d_{it}(u, l; \bar u, \bar l)\Big) \log \pi^{L}_{it}(l|\bar l) + \sum_{i \in I} \sum_{t=2}^{T} \sum_{l} \sum_{\bar u} \sum_{u} \Big(\sum_{\bar l} d_{it}(u, l; \bar u, \bar l)\Big) \log \pi^{U|L}_{it}(u|l, \bar u) \\
&+ \sum_{j \in R} \sum_{l} \sum_{y_j} \Big(\sum_{i \in I} \sum_{t \in T} d_{it}(1, l)\, d_{jit}(y_j)\Big) \log f_{j|1}(y_j|l) + \sum_{j \in R} \sum_{l} \sum_{y_j} \Big(\sum_{i \in I} \sum_{t \in T} d_{it}(2, l)\, d_{jit}(y_j)\Big) \log f_{j|2}(y_j|l),
\end{aligned}$$
where $f_{j|1}$ and $f_{j|2}$ are provided in (12) and (13). As the latent variables are not observable and it is not easy to maximize the marginal log-likelihood, obtained by summing the joint likelihood over all the possible realizations of the latent indicators, it is common, in the context of HMMs, to use the EM algorithm to compute the maximum likelihood estimates. Details on the EM algorithm in the context of HMMs are presented in many papers and books; see Bartolucci and Farcomeni (2015) for a presentation specific to the context of longitudinal data. Every iteration of the EM algorithm is composed of two steps: the Expectation (E) step and the Maximization (M) step.
With respect to our model, in the E step the following expected values are computed:
$$\delta_{it}(u, l; \tilde\theta) = E_{obs}(d_{it}(u, l)), \qquad \delta_{it}(u, l; \bar u, \bar l; \tilde\theta) = E_{obs}(d_{it}(u, l; \bar u, \bar l)), \qquad (17)$$
where $E_{obs}(\cdot)$ is the expected value taken conditionally on the observed values of the responses $Y_{jit}$ and on the covariates, given the current value $\tilde\theta$ of the parameters. The previous expected values are computed by the Baum-Welch forward-backward algorithm (Zucchini and MacDonald, 2009, Ch. 4). In the M step, the following conditional expectation of the complete log-likelihood function is maximized in order to obtain an updated $\tilde\theta$: Note that $Q(\theta|\tilde\theta)$ is obtained from the complete log-likelihood by replacing $d_{it}(u, l)$ and $d_{it}(u, l; \bar u, \bar l)$ with their expected values (17). The six addends of (18), corresponding to the models specified by (3) or (7), (4), (5) or (9), (6) and (12)-(13) of Sections 4.1 and 4.2, depend on disjoint subsets of the vector $\theta$ and can be maximized separately. The maximization of the sixth addend is simple, as there is a closed form for the maxima. Moreover, the first addend is equivalent to the ML estimation of the logit model (3) or its stereotype variant (7), and the third and fifth terms simplify to the estimation of $k(r + 1)$ separate logit models described by (5) or (9) and (12). A similar remark applies to the second and fourth addends and the logit models defined by (4) and (6), respectively. The terms within curly brackets correspond to the log-likelihoods of the logit models that can be maximized separately. In the first two addends, the curly brackets are omitted as only one logit model is involved. The expected values within square brackets play the role of observed frequencies.
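The E step can be sketched for a single unit with a generic scaled forward-backward recursion. The Python code below is a minimal illustration, not the authors' implementation (their functions are written in R): it treats the pair $(u, l)$ as one flattened state with $m = 2k$ values, takes the transition matrix as constant over time for brevity, and expects the emission probabilities of the observed responses to be supplied, for instance as products over $j$ of the response probability functions under B3. The name `forward_backward` and the toy numbers are hypothetical.

```python
import numpy as np

def forward_backward(init, trans, emis):
    """Posterior state probabilities for one unit of a discrete-state HMM.

    init  : (m,)   initial state probabilities
    trans : (m, m) trans[a, b] = P(state_t = b | state_{t-1} = a)
    emis  : (T, m) emis[t, s] = probability of the responses observed at t given state s
    Returns gamma (T, m), xi (T-1, m, m) and the log-likelihood, where
    gamma[t, s] = P(state_t = s | data) and
    xi[t, a, b] = P(state_t = a, state_{t+1} = b | data).
    """
    T, m = emis.shape
    alpha = np.zeros((T, m))
    beta = np.ones((T, m))
    scale = np.zeros(T)

    # Forward pass with normalisation at every occasion
    alpha[0] = init * emis[0]
    scale[0] = alpha[0].sum()
    alpha[0] /= scale[0]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ trans) * emis[t]
        scale[t] = alpha[t].sum()
        alpha[t] /= scale[t]

    # Backward pass using the same scaling constants
    for t in range(T - 2, -1, -1):
        beta[t] = trans @ (emis[t + 1] * beta[t + 1]) / scale[t + 1]

    gamma = alpha * beta
    xi = np.zeros((T - 1, m, m))
    for t in range(T - 1):
        xi[t] = (alpha[t][:, None] * trans
                 * (emis[t + 1] * beta[t + 1])[None, :] / scale[t + 1])
    return gamma, xi, float(np.log(scale).sum())

# Toy example with m = 2 states and T = 3 occasions (hypothetical numbers)
init = np.array([0.6, 0.4])
trans = np.array([[0.9, 0.1], [0.3, 0.7]])
emis = np.array([[0.5, 0.1], [0.4, 0.3], [0.1, 0.6]])
gamma, xi, loglik = forward_backward(init, trans, emis)
```

In this sketch `gamma` and `xi` play the role of the two families of expected indicators, which then enter the M step as weights in separate logit fits.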
If the model is correctly specified, the estimates of the standard errors can be based either on the matrix of second derivatives of the log-likelihood function (observed information matrix, in short OIM; see Bartolucci and Farcomeni, 2015) or on the outer products of the individual contributions to the score functions (outer product information matrix, OPIM, or BHHH estimate; Berndt et al., 1974). When the model is misspecified, the information matrix equivalence does not hold and the standard errors have to be calculated using the so-called sandwich matrix (White, 1982), say SDW. Alternatively, standard errors can be computed using the bootstrap (BOOT) technique. Technical details are given in the Appendix. All the R functions for the estimates and standard errors (with the four mentioned methods) are available from the authors.
Goodness of fit testing and model selection in latent Markov models for longitudinal data are not straightforward, since standard asymptotic results for test statistics may not hold. The use of Akaike's information criterion (AIC) or the Bayesian information criterion (BIC) is a widely used and accepted procedure. In particular, for HMMs, the use of BIC dominates, even though its theoretical properties are not clear (e.g., Bartolucci et al., 2009; Zucchini and MacDonald, 2009). for assessing the fit of the model against the independence model characterized by $k = 1$ and no RS effects, with $\sum_{j=1}^{r} (c_j - 1)$ parameters and log-likelihood function $\hat\ell_0$. It holds $R^2 \in [0, 1]$, with higher values indicating a better fit. Indices can also be introduced for measuring the quality of classification and the distinguishability of the latent classes; Bartolucci et al. (2009) proposed an index based on the posterior probabilities of the latent classes, which in our set-up is: with $\delta^{*}_{it}$ being, for unit $i$ at time $t$, the maximum with respect to $(u, l)$ of the posterior latent class probabilities $\delta_{it}(u, l; \hat\theta)$, introduced in (17).
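A classification index of this kind is easy to compute from the posterior probabilities returned by the E step. The Python sketch below assumes the normalised form $\sum_{i}\sum_{t} (\delta^{*}_{it} - 1/m) / ((1 - 1/m)\, n T)$ with $m = 2k$ latent states, which is how the index of Bartolucci et al. (2009) is usually written; the exact expression used in the paper is not reproduced here, and the function name `classification_index` and the array layout are hypothetical.

```python
import numpy as np

def classification_index(post):
    """Classification index in [0, 1] from posterior state probabilities.

    post : (n, T, m) array, post[i, t, s] = posterior probability that unit i
           is in latent state s at occasion t (each post[i, t] sums to 1).
    Assumed form: sum_{i,t} (max_s post - 1/m) / ((1 - 1/m) * n * T).
    """
    n, T, m = post.shape
    delta_star = post.max(axis=2)     # largest posterior probability per (i, t)
    return float((delta_star - 1.0 / m).sum() / ((1.0 - 1.0 / m) * n * T))

# Hypothetical posteriors for n = 2 units, T = 3 occasions, k = 2 (m = 4 states),
# with the state (u, l) stored at index u * k + l
k = 2
rng = np.random.default_rng(0)
post = rng.dirichlet(np.ones(2 * k), size=(2, 3))

s_joint = classification_index(post)                  # all 2k latent states
post_L = post.reshape(2, 3, 2, k).sum(axis=2)         # marginalise over u
s_construct = classification_index(post_L)            # k states of the latent construct
```

The same function applied to posteriors summed over `l` instead of `u` would measure how well the two answering regimes are separated.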
Measure $S_k$ lies between 0 and 1, where 1 represents certainty in classification and a perfect separation among latent classes, while values close to 0 indicate that most of the $\delta^{*}_{it}$ are close to $1/(2k)$, that is, like choosing the classes at random. This index is very suitable for our context, where the observed responses are manifest realizations of the latent variables, so that a good quality in terms of separation of the $2k$ latent states is crucial. In line with the literature which ignores the answering behavior, we can measure the quality of the separation of the latent construct states marginally with respect to $U$, so that (20) reduces to: it (u, l;θ). Moreover, in our context, the distinguishability among the $k$ states of the latent construct can be interestingly measured at the AWR and RS regimes separately. The $S_k$ index is specified for this aim as follows: . It can also be of interest to measure the ability to discriminate between AWR and RS behaviors, regardless of the latent construct. For this aim, the measure (20) is modified as: it (u, l;θ). Finally, the concern can be directed to measuring how well separated the two responding regimes are, in every class of the latent construct. An insight in this sense is given by: , $l \in S_L$.
After the selection of a reasonable model according to indices of goodness of classification and indices for judging the overall fit of the model, a residual analysis which detects features of the data not captured by the model has to be carried out. We assess the adequacy of the selected model by analysing full-conditional residuals, introduced in the context of HMMs by Buckby et al. (2020) as exvisive residuals. Full-conditional residuals are an alternative to the forecast or predictive residuals (Buckby et al., 2020).
The difference is that in full-conditional residuals the expected values of the observed counts at time $t$ are taken given all the other observations, while in forecast residuals they are taken given the observations before time $t$. Full-conditional residuals are more useful for evaluating goodness of fit, while forecast residuals are more helpful for assessing the predictive accuracy of the model. In the application that follows, we use Pearson full-conditional residuals, whose technical details are given below.

To simplify the notation, let $x_{it}$ be the set of covariates for individual $i$ at time $t$, $i \in I$, $t \in T$. Let $D_t = \{x_1, x_2, \dots, x_{d_t}\}$ be the set of different configurations of covariates observed at time $t \in T$ and $D = \cup_t D_t$. Moreover, $C$ is the set of the $c = \prod_j c_j$ different configurations of the responses. For every vector $y_{it} \in C$ of the $r$ responses of unit $i$ at time $t$, we define the rest of $y_{it}$ as $Y^-_{it} = \{y_{i1}, y_{i2}, \dots, y_{i,t-1}, y_{i,t+1}, \dots, y_{iT}\}$. For $y \in C$, $i \in I$, $t \in T$, the indicator $d_{it}(y)$ equals 1 if $y_{it} = y$ and 0 otherwise; summing over the units with the same covariate configuration gives the counts

$$n_t(y, x) = \sum_{i \in I:\, x_{it} = x} d_{it}(y).$$

We introduce a residual for every $x \in D_t$, $t \in T$, and $y \in C$ by comparing these counts with their expected values defined below. Let $f_{it}(y \mid D, Y^-_{it})$, $y \in C$, $i \in I$, $t \in T$, be the joint probability density function (pdf) of the responses given the covariates and the rest of $y_{it}$. The computation of this pdf is described by Buckby et al. (2020) and, in the related context of pseudo-residuals, by Zucchini and MacDonald (2009); it can be obtained as a by-product of the Baum-Welch algorithm. Starting from these pdfs, we define the following conditional expected values of the counts $n_t(y, x)$:

$$\hat{n}_t(y, x) = \sum_{i \in I:\, x_{it} = x} f_{it}(y \mid D, Y^-_{it}),$$

for every $x \in D_t$, $t \in T$, $y \in C$. Accordingly, the following full-conditional Pearson residuals are introduced:

$$\rho_t(y, x) = \frac{n_t(y, x) - \hat{n}_t(y, x)}{\sqrt{\hat{n}_t(y, x)}}, \qquad (21)$$

for every $x \in D_t$, $t \in T$, $y \in C$.
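The construction above can be sketched in Python for a single occasion $t$. The sketch assumes that the full-conditional probabilities $f_{it}(y \mid D, Y^-_{it})$ have already been computed (for instance from a forward-backward pass) and that the residual has the usual Pearson form, observed minus expected count divided by the square root of the expected count. All names are hypothetical, and response and covariate configurations are coded as integer indices.

```python
import numpy as np

def full_conditional_pearson(y_idx, x_idx, f_cond, n_cov):
    """Pearson residuals for one occasion t (illustrative sketch).

    y_idx : (n,) index in 0..c-1 of the response configuration of each unit
    x_idx : (n,) index in 0..n_cov-1 of the covariate configuration
    f_cond: (n, c) full-conditional probabilities, assumed precomputed
    Returns (observed, expected, residuals), each of shape (n_cov, c).
    """
    f_cond = np.asarray(f_cond, dtype=float)
    c = f_cond.shape[1]
    observed = np.zeros((n_cov, c))
    expected = np.zeros((n_cov, c))
    np.add.at(observed, (x_idx, y_idx), 1.0)   # counts n_t(y, x)
    np.add.at(expected, x_idx, f_cond)         # sums of f_it over units with x_it = x
    # Cells with zero expected count yield inf or nan and should be inspected.
    with np.errstate(divide="ignore", invalid="ignore"):
        resid = (observed - expected) / np.sqrt(expected)
    return observed, expected, resid

# Toy data: 4 units, 2 response configurations, 2 covariate configurations.
y_idx = np.array([0, 1, 1, 0])
x_idx = np.array([0, 0, 1, 1])
f_cond = np.array([[0.8, 0.2], [0.8, 0.2], [0.5, 0.5], [0.5, 0.5]])
obs, exp_counts, res = full_conditional_pearson(y_idx, x_idx, f_cond, n_cov=2)
```

Summing the squared residuals along each row of `res` gives a Pearson chi-squared statistic for the corresponding covariate configuration.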
Plotting full-conditional Pearson residuals is a useful tool for investigating the lack of fit of the model and for highlighting particular features of the data. Standardizing these residuals is possible in theory, but computing their standard errors is not an easy analytical or computational task. This could be done with the methods used by Titman (2009) for HMMs in continuous time, but an in-depth study is needed to assess its feasibility in the presence of many residuals. For every time occasion $t \in T$ and every observed covariate configuration $x$, the squared full-conditional Pearson residuals sum to the corresponding Pearson's chi-squared statistic $\chi^2_t(x) = \sum_{y \in C} \rho_t(y, x)^2$. In this paper, the averages of the full-conditional Pearson residuals over the $c$ response configurations, for $x \in D_t$, $t \in T$, are used to summarize the comparison of the estimated cell probabilities under the assumed model with the observed proportions. In applications with multivariate responses, practical interest may lie in the univariate responses $Y_j$ or in the bivariate responses $(Y_j, Y_{j'})$, with $j \neq j'$, $j, j' \in R$. In such cases, residuals (21) can be marginalized to:

$$\rho_t(\tilde{y}, x) = \frac{n_t(\tilde{y}, x) - \hat{n}_t(\tilde{y}, x)}{\sqrt{\hat{n}_t(\tilde{y}, x)}},$$

where $\tilde{y}$ is a configuration of the responses of interest, $\tilde{R}$ is the associated set of indices, and the count $n_t(\tilde{y}, x)$ and its conditional expected value $\hat{n}_t(\tilde{y}, x)$ are obtained by summing over all the configurations $y \in C$ that agree with $\tilde{y}$ on the responses in $\tilde{R}$. Marginalized residuals are also useful in case of sparsity in the response configurations.

We applied the proposed models to the panel data from the Survey on Household Income and Wealth described in Section 2 to answer the questions raised there. The household's financial capability (or condition) is the latent trait of interest, measured through the ability to make ends meet $R_1$ and the perceived financial risk $R_2$, with covariates gender (G), job (J), children (CH), debts (D), savings (S) and education (E); cf. Section 2. Models based on different hypotheses on the latent transition probabilities are compared in Table 1, each one considered for an increasing number of latent states.
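Before turning to the results, the marginalization of the residuals described earlier in this section can be sketched as follows. The sketch assumes that, for a fixed occasion and covariate configuration, the observed and expected counts of the $c$ joint configurations are available and that all expected counts are positive; the function name and the toy counts are hypothetical and are not taken from the survey data.

```python
import math
from itertools import product

def marginal_pearson(observed, expected, configs, keep):
    """Marginal Pearson residuals for the responses indexed by `keep`.

    observed, expected: sequences of length c (one entry per configuration)
    configs: list of c tuples, the joint response configurations
    keep: tuple of positions of the responses of interest
    """
    obs_m, exp_m = {}, {}
    for cfg, o, e in zip(configs, observed, expected):
        key = tuple(cfg[j] for j in keep)       # marginal configuration
        obs_m[key] = obs_m.get(key, 0.0) + o    # collapse observed counts
        exp_m[key] = exp_m.get(key, 0.0) + e    # collapse expected counts
    return {key: (obs_m[key] - exp_m[key]) / math.sqrt(exp_m[key])
            for key in obs_m}

# Toy example with two binary responses (illustrative numbers only).
configs = list(product(range(2), range(2)))
observed = [3.0, 1.0, 2.0, 4.0]
expected = [1.0, 2.0, 3.0, 4.0]
res_first = marginal_pearson(observed, expected, configs, keep=(0,))
```

Collapsing counts before forming the residual, rather than averaging joint residuals, keeps the marginal residual in the same Pearson form as the joint one.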
Initial or transition probabilities that depend on the covariates are said to be heterogeneous; otherwise they are homogeneous. In the models of Table 1, the initial probabilities $\pi^{L}_{i1}(l)$ are modelled through [...] The RS initial probabilities are always assumed to depend on the covariates, to capture heterogeneity in the answering behavior at the beginning. The minimum BIC corresponds to model M8 with $k = 4$ states, defined by stereotype models (7) for the latent construct initial probabilities, parallel baseline logit models (9) with scores $\nu_{l\bar{l}} = 1$, $l \neq \bar{l}$, for the transition probabilities, and no covariate effects on the RS conditional transition probabilities specified in (6). [...] The indices of quality of classification (Section 6.2), illustrated in Table 2, confirm the superiority of M8 with $k = 3$ over the analogous model with $k = 4$ in distinguishing the states of the latent financial capability within the two groups of AWR and RS respondents, and also its greater ability to distinguish the AWR and RS behaviors, both marginally and conditionally on the latent classes $l$, with the only exception of $l = 1$. Therefore, the latent construct, the households' financial capability, is reasonably modelled with three states, meaning that households can be grouped according to whether they feel financially confident ($l = 1$), financially fair ($l = 2$) or financially distressed ($l = 3$). The choice of model M8 implies that the transition probabilities of the [...]

Table 2: Results of the indices of quality of classification illustrated in Section 6.2 for models M8 with $k = 3$ and $k = 4$ states of the latent construct.

To complete the assessment of the chosen model, we carry out a residual analysis. Figure 4 [...] while only 22 residuals in total are greater than 5 (2.5‰). The maximum residual corresponds to the profile of a male with a job (self-employed or employee), no children, no debts, no savings and a low educational level. A slightly larger dispersion appears for residuals (see
Fig. 4) corresponding to the choice [...] Both correspond to female respondents with quite opposite profiles: in one, the respondent is not self-employed, has no children, debts or savings, and has a low educational level; in the other, she has a job (self-employed or employee), children, debts and savings, and a high educational level.

Table 3: Estimates (standard errors) of the parameters $\phi_{0lj}$ and $\phi_{1lj}$ of the RS probability functions in the states $l = 1, 2, 3$ (financially confident, $l = 1$; financially fair, $l = 2$; financially distressed, $l = 3$) and for the responses $R_j$, $j = 1, 2$.

The bottom panels of Figure 6 refer to the probability functions of the observed response $R_2$ on financial risk perception for AWR (colored bars) and RS (grey bars) households. The RS probability functions in every latent state have their mode at the smallest category, risk averse, since all the estimates of the parameters $\phi_{0l2}$ and $\phi_{1l2}$ are negative (Table 3).

Estimates of the parameters of the models for the initial and transition probabilities are in Table 4. The reported standard errors are calculated using the OPIM method, although all the methods illustrated in Section 6.1 and in the Appendix have been applied. They provided broadly similar results, which are also close to the standard errors obtained by the bootstrap method. Table 5 shows, for the sake of simplicity, only the standard errors of the estimators of the parameters of the models for the initial and transition probabilities of the latent construct, calculated with the three illustrated methods and the bootstrap technique. The results are coherent, except in a few cases. Some numerical issues appear, mostly in correspondence with large parameter estimates.

From the signs of the estimates of the parameters of model (7), in row 1 of Table 4, we deduce that at the first occasion employees, people without savings and people with high education are more likely to be in a worse financial status ($l = 2, 3$).
This effect is stronger for the status that describes greater financial incapability, as the score $\hat{\mu}_3$ is greater than 1. In particular, for highly educated people, the odds of being financially distressed ($l = 3$) rather than confident ($l = 1$) are about 10 times the odds for low-educated respondents. Similarly, for households with no savings (with an employee job), the odds of being financially vulnerable ($l = 3$) rather than confident in managing the disposable income are 7 times (about 6 times) the odds for households that can count on savings (on a self-employed job). In addition, by considering the difference between the scores in discussing the effect of the educational level, it follows that, at the first occasion, the propensity to struggle strongly to make ends meet ($l = 3$) rather than to manage their finances without much effort ($l = 2$) is, for highly educated respondents, about 2.5 times that of low-educated ones. Analogously, the odds ratios are 2.23 and 2.07 when groups of households with and without savings, and with an employee or a self-employed householder, respectively, are compared.

Looking at the estimated parameters (row 4) of the RS initial probabilities modelled by (4), we deduce that, at the beginning of the survey, female and low-educated respondents seem more inclined towards response styles when describing their financial condition. From the estimated parameters (rows 7, 10, 13, 15) of model (9) with the parallel restriction ($\nu_{l\bar{l}} = 1$) for the transition probabilities of the latent financial capability, it seems that, between two consecutive occasions, women, highly educated respondents, and those with no children and no savings are more likely to move from a financially confident condition ($l = 1$) to a worse status of financial vulnerability ($l = 2, 3$).
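The odds-ratio arithmetic used above for the initial probabilities can be illustrated with a short Python sketch. It assumes a stereotype parametrization of the form $\log[\pi(l)/\pi(1)] = \alpha_l + \mu_l \beta x$ with scores $\mu_1 = 0$ and $\mu_2 = 1$, which may differ in detail from model (7). The numerical values are hypothetical, chosen only to mimic the magnitudes quoted in the text; they are not the estimates of Table 4.

```python
import math

def stereotype_odds_ratio(beta, mu, l_num, l_den):
    """Odds ratio of state l_num versus l_den for a one-unit covariate change,
    under the assumed form log[pi(l)/pi(1)] = alpha_l + mu_l * beta * x."""
    return math.exp((mu[l_num] - mu[l_den]) * beta)

# Hypothetical scores and covariate effect (NOT the paper's estimates).
mu = {1: 0.0, 2: 1.0, 3: 1.66}
beta_edu = math.log(4.0)

or_31 = stereotype_odds_ratio(beta_edu, mu, 3, 1)  # distressed vs confident
or_32 = stereotype_odds_ratio(beta_edu, mu, 3, 2)  # distressed vs fair
or_21 = stereotype_odds_ratio(beta_edu, mu, 2, 1)  # fair vs confident
```

Under this parametrization the odds ratio for states 3 versus 1 is the product of those for 3 versus 2 and 2 versus 1, which is why a score $\mu_3 > 1$ amplifies the covariate effect for the most distressed state.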
When the starting status corresponds to a fair financial confidence ($l = 2$), self-employed respondents with no savings to rely on and a low level of education are more likely to move towards the other levels of financial capability ($l = 1, 3$). Individuals who experience financial distress on one occasion but can count on personal savings tend to improve their condition at the next occasion, moving with higher probability towards the states of financial stability ($l = 1, 2$). Moreover, it is worth noting that all the intercepts are negative; therefore, there is evidence of a higher propensity to remain in the previous financial status. This is more striking for households who experience financial distress ($l = 3$), which move to more comfortable conditions ($l = 1, 2$) with very small probabilities. The estimated intercepts of model (6) for the RS transition probabilities (rows 18, 21) suggest that respondents tend to keep the same behavior in answering the two questions on two consecutive occasions, regardless of the latent state that represents the current perceived financial capability.

A HMM for longitudinal data of ordered categorical variables, which takes into account that responses can be determined by a RS, has been introduced. The proposed model [...]

Under assumption A2 and if the conditional RS transition probabilities are homogeneous, the hypothesis $\pi^{U|L}(u \mid l, \bar{u}) = d_{\bar{u}}(u)$ of time invariance of the RS indicator constrains $2k$ parameters to the boundary of the parameter space. A test based on the log-likelihood ratio statistic can be used, but in this case the asymptotic distribution of the statistic is a mixture of chi-squared distributions, known as the chi-bar-squared distribution. The test can be easily implemented as shown by Bartolucci (2006) and Colombi and Forcina (2016), who dealt with related problems. Point (ii) above can be based on the approach of graphical HMMs by Colombi and Giordano (2015).
Regarding point (iii), Assumption B4 on the observation model can be relaxed by modelling the observation probabilities as functions of individual covariates, as an alternative to the presence of covariate effects on the latent component. This can be the case when the main interest is in the observed responses and the latent variable serves to account for time dependence and for the respondents' unobserved heterogeneity not explained by RS.

We discuss three approaches to the estimation of the standard errors of the maximum likelihood estimator $\hat{\theta}$ of $\theta$. Hereafter, the upper index $(m)$, $m \in \{L, U\}$, will be omitted from the vectors of covariates $x^{(m)}_i$ and $z^{(m)}_{it}$ to simplify the notation. Let $y_i$, $i \in I$, be a realization of the $Tr$ observable variables $Y_{jit}$, $j \in R$, $t \in T$, collected in the vector $Y_i$. The joint probability function of $Y_i$, conditioned on the vectors of covariates $x_i$, $z_i$ ($z_i$ is obtained by stacking the $z_{it}$, $t > 1$), is denoted by $q(y_i \mid x_i, z_i; \theta)$. The log-likelihood function of the observations $y_i$, $i \in I$, is:

$$\ell(\theta) = \sum_{i \in I} \log q(y_i \mid x_i, z_i; \theta),$$

and the vector of the score functions is:

$$s(\theta) = \sum_{i \in I} s_i(\theta), \qquad s_i(\theta) = \frac{\partial \log q(y_i \mid x_i, z_i; \theta)}{\partial \theta}.$$

The calculation of standard errors can be based on the OIM, OPIM and SDW methods, as mentioned in Section 6.1. We briefly sketch here some technical details of the three methods; an alternative approach is based on the well-known parametric bootstrap technique. The OIM can be computed using the Oakes identity (Oakes, 1999):

$$\hat{J} = -\left[ \frac{\partial^2 Q(\theta \mid \bar{\theta})}{\partial \theta\, \partial \theta'} + \frac{\partial^2 Q(\theta \mid \bar{\theta})}{\partial \bar{\theta}\, \partial \theta'} \right]_{\theta = \bar{\theta} = \hat{\theta}},$$

as shown by Bartolucci and Farcomeni (2015). The first term inside the square brackets is easy to compute using the outputs of the last M step, as it is block diagonal with blocks given by the Hessian matrices of the six addends of (18). The computation of the second term inside the square brackets is more demanding, as it requires the derivatives with respect to $\bar{\theta}$ of $\log \delta_{it}(u, l; \bar{\theta})$ and $\log \delta_{it}(u, l; \bar{u}, \bar{l}; \bar{\theta})$. These derivatives can be obtained as an output of the Baum-Welch forward-backward algorithm, as described by Bartolucci and Farcomeni (2015).
For every element $\bar{\theta}_h$ of $\bar{\theta}$, the terms of $-\frac{\partial^2 Q(\theta \mid \bar{\theta})}{\partial \bar{\theta}_h\, \partial \theta}\Big|_{\theta = \bar{\theta} = \hat{\theta}}$ are obtained from the derivatives with respect to $\theta$ of the six addends of (18), if $\delta^{(1)}_{it}(u, l; \bar{\theta})$ and $\delta^{(2)}_{it}(u, l; \bar{u}, \bar{l}; \bar{\theta})$ are replaced by their derivatives with respect to $\bar{\theta}_h$.

If the RS-HMM is correctly specified, estimates of the standard errors can also be derived from the OPIM matrix $I(\theta) = \sum_{i} s_i(\theta) s_i(\theta)'$. The matrix $I(\theta)$ is estimated by $\hat{I} = I(\hat{\theta})$, and the estimated standard errors of the maximum likelihood estimators are the square roots of the diagonal elements of $\hat{I}^{-1}$. The matrix $\hat{I}$ is easier to compute than $\hat{J}$, because of the effort needed to compute $\frac{\partial^2 Q(\theta \mid \bar{\theta})}{\partial \bar{\theta}\, \partial \theta'}\Big|_{\theta = \bar{\theta} = \hat{\theta}}$.

Recall that the HMM with a RS component is misspecified if there does not exist a $\theta$ such that $\tau(y \mid x, z) = q(y \mid x, z; \theta)$ with probability 1, where, for every $x$, $z$, $\tau(y \mid x, z)$ is the true probability function generating the data. In this case, $\hat{\theta}$ is a pseudo maximum likelihood estimator, which is a consistent estimator of the pseudo-true value $\theta_0 = \arg\min_{\theta} E_{x,z} E_y \left[ \log \frac{\tau(y \mid x, z)}{q(y \mid x, z; \theta)} \right]$; see Vuong (1989) and White (1982). When the RS-HMM is misspecified, the estimated standard errors of the pseudo maximum likelihood estimators are given by the square roots of the diagonal elements of $\hat{J}^{-1} \hat{I} \hat{J}^{-1}$. The estimators of the standard errors obtained in this way are robust, in the sense that they are consistent regardless of whether the model is correctly specified. The matrix $n \hat{J}^{-1} \hat{I} \hat{J}^{-1}$ is a consistent estimator of the SDW matrix $A(\theta_0)^{-1} B(\theta_0) A(\theta_0)^{-1}$, where $A(\theta_0) = -E_{x,z} E_y \left[ \frac{\partial^2 \log q(y \mid x, z; \theta)}{\partial \theta\, \partial \theta'} \right]$ and $B(\theta_0) = E_{x,z} E_y \left[ \frac{\partial \log q(y \mid x, z; \theta)}{\partial \theta} \frac{\partial \log q(y \mid x, z; \theta)}{\partial \theta'} \right]$, both evaluated at $\theta = \theta_0$. The SDW matrix plays a central role in testing problems on misspecified models (Vuong, 1989). As all models are possibly misspecified, the estimator of the standard errors based on the sandwich matrix should always be used in practice.
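The OPIM and sandwich standard errors described above can be sketched in Python as follows. The sketch assumes that the individual score contributions $s_i(\hat{\theta})$ and the observed information matrix $\hat{J}$ are already available; the function names are hypothetical, and the toy Bernoulli model is used only because its scores and information have a closed form. It is not the RS-HMM.

```python
import numpy as np

def opim_se(scores):
    """Standard errors from the outer product of the individual scores.

    scores: (n, p) array whose i-th row is s_i(theta_hat).
    """
    scores = np.asarray(scores, dtype=float)
    I_hat = scores.T @ scores                       # sum_i s_i s_i'
    return np.sqrt(np.diag(np.linalg.inv(I_hat)))

def sandwich_se(scores, J_hat):
    """Robust standard errors from J^{-1} I J^{-1}.

    J_hat: (p, p) observed information (minus the log-likelihood Hessian).
    """
    scores = np.asarray(scores, dtype=float)
    I_hat = scores.T @ scores
    J_inv = np.linalg.inv(np.asarray(J_hat, dtype=float))
    return np.sqrt(np.diag(J_inv @ I_hat @ J_inv))

# Toy check on a Bernoulli sample with success probability p.
y = np.array([1, 0, 0, 1, 1, 0, 1, 1], dtype=float)
p_hat = y.mean()
scores = ((y - p_hat) / (p_hat * (1 - p_hat))).reshape(-1, 1)
J_hat = np.array([[len(y) / (p_hat * (1 - p_hat))]])
se_opim = opim_se(scores)
se_sdw = sandwich_se(scores, J_hat)
```

In this toy case the two estimates coincide with the textbook value $\sqrt{\hat{p}(1 - \hat{p})/n}$, because the outer-product and observed information matrices are equal at the maximum likelihood estimate; in a misspecified model they would generally differ.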
However, computational complexity and numerical instability problems make the use of estimates based on the matrix $\hat{I}$ more practical in the case of the model considered here.

Analysis of Ordinal Categorical Data
Intra-individual variation of extreme response style in mixed-mode panel studies
Household financial vulnerability: an empirical analysis
Regression and ordered categorical variables
Yea-saying, nay-saying, and going to extremes: black-white differences in response styles
Likelihood inference for a class of latent Markov models under linear hypotheses on the transition probabilities
Information matrix for hidden Markov models with covariates
Latent Markov Models for Longitudinal Data
Latent Markov model for longitudinal binary data: an application to the performance evaluation of nursing homes
Response styles in marketing research: a cross-national investigation
Marginal Models: For Dependent, Clustered, and Longitudinal Categorical Data
Estimation and inference in nonlinear structural models
The index of household financial condition, combining subjective and objective indicators: an appraisal of Italian households
Testing the stability of an acquiescence style factor behind two interrelated substantive variables in a panel design
Modeling multiple response processes in judgment and choice
Response style analysis with threshold and multi-process IRT models: a review and tutorial
Applications of a MIRT model to self-report measures: addressing score bias and DIF due to individual differences in response style
Model checking for hidden Markov models
Testing order restrictions in contingency tables
Multiple hidden Markov models for categorical time series
Hierarchical marginal models with latent uncertainty
A rating scale mixture model to account for the tendency to middle and extreme categories
Financial literacy in Italy: results from the 2020 Bank of Italy Survey. Occasional Papers
Massive earthquakes, risk aversion, and entrepreneurship
Hidden Markov models for zero-inflated Poisson counts with an application to substance use
Response style contamination of student evaluation data
A flexible full-information approach to the modeling of response styles
Financial disclosure, risk perception and investment choices
Factorial hidden Markov models
Response style corrected market segmentation for ordinal data
Longitudinal Data Analysis
Different approaches to modeling response styles in divide-by-total item response theory models (part 1): a model integration
Financial Capability: the Government's Long-Term Approach. HM Stationery Office
Mixture random-effect IRT models for controlling extreme response style on rating scales
Households' saving and debt in Italy
Response style behavior: question format dependent or personal style?
Behavioural economics and financial consumer protection. OECD Working Papers on Finance, Insurance and Private Pensions 42
The state of US financial capability: the 2018 National Financial Capability Study. FINRA Investor Education Foundation
Mixed hidden Markov models for longitudinal data: an overview
Models for Discrete Longitudinal Data
The joint influence of financial risk perception and risk tolerance on individual investment decision-making
Direct calculation of the information matrix via the EM algorithm
OECD/INFE international survey of adult financial literacy competencies
Measurement and control of response bias
The class of CUB models: statistical foundations, inferential issues and empirical evidence
Risk assessment, risk values and the social science programme: why we do need risk perception research
Chapter Response styles in surveys: understanding their causes and mitigating their impact on data quality
Are risk preferences stable
The Feeling of Risk: New Perspectives on Risk Perception. Earthscan
Do response styles affect estimates of growth on social-emotional constructs? Evidence from four years of longitudinal survey scores
Measuring financial capability and its determinants using survey data
Computation of the asymptotic null distribution of goodness-of-fit tests for multi-state models
Response styles in rating scales: simultaneous modeling of content-related effects and the tendency to middle or extreme categories
Improving the financial literacy of European consumers
Applying behavioural sciences to EU policy-making. EUR - Scientific and Technical Research Reports
Response styles in survey research: a literature review of antecedents, consequences, and remedies
Hidden Markov models with mixtures as emission distributions
Likelihood ratio tests for model selection and non-nested hypotheses
Response Styles in Consumer Research
The stability of individual response styles
Multidimensional modeling of traits and response styles
Maximum likelihood estimation of misspecified models
Validity of three IRT models for measuring and controlling extreme and midpoint response styles
Financial capability surveys around the world: why financial capability is important and how surveys can help
Hidden Markov Models for Time Series: An Introduction Using R