Edinburgh Research Explorer

Are rating agencies' assignments opaque? Evidence from international banks

Citation for published version: Bellotti, T, Matousek, R & Stewart, C 2011, 'Are rating agencies' assignments opaque? Evidence from international banks', Expert Systems with Applications, vol. 38, no. 4, pp. 4206-4214. https://doi.org/10.1016/j.eswa.2010.09.085

Document Version: Peer reviewed version
Published In: Expert Systems with Applications

CENTRE FOR EMEA BANKING, FINANCE & ECONOMICS

Are Rating Agencies' Assignments Opaque? Evidence from International Banks

Tony Bellotti, Roman Matousek and Chris Stewart

No.
01/10 Working Paper Series

Are Rating Agencies' Assignments Opaque? Evidence from International Banks

Tony Bellotti [1], Roman Matousek [2] and Chris Stewart [3]

Abstract

We compare the ability of ordered choice models and support vector machines to model and predict international bank ratings. Although support vector machines can identify significant determinants we argue that ordered choice models are more reliable for this. Our findings suggest that ratings reflect a bank's financial position, the timing of rating assignment and a bank's country of origin. Accounting for country effects substantially improves predictive performance. We find that support vector machines can produce considerably better predictions of international bank ratings than ordered choice models due to the former's ability to estimate a large number of country dummies unrestrictedly.

Keywords: International banks, ratings, support vector machines, ordered choice models, country effects

JEL Classification: C25, C51, C52, G21.

[1] University of Edinburgh Business School, William Robertson Building, 50 George Square, Edinburgh EH8 9JY. Email: Tony.Bellotti@ed.ac.uk.
[2] Corresponding author: Centre for EMEA Banking, Finance and Economics, London Metropolitan Business School, London Metropolitan University, 84 Moorgate, London, EC2M 6SQ. Tel: 020-7320-1569. E-mail: r.matousek@londonmet.ac.uk.
[3] London Metropolitan Business School, London Metropolitan University, 84 Moorgate, London, EC2M 6SQ. Tel: 020-7320-1651. E-mail: c.stewart@londonmet.ac.uk.

1. Introduction

Ratings agencies' exclusive position may be justified on the grounds of reducing asymmetric information between investors and companies (Portes, 2008). However, the current global financial crisis has severely damaged the reputation of ratings agencies (RAs) that mispriced credit risk through their ratings assignments.
A number of banks that were relatively financially sound according to their ratings assignments were forced to close or be bailed out by governments. This raises the question of how RAs determine bank ratings.

Ratings are ordinal measures that should not only reflect the current financial position of sovereign nations, firms, banks, etc. but also provide information about their future financial positions. There has been extensive research in predicting bond ratings using ordered choice models, non-parametric techniques and artificial intelligence methods, for example, Altman and Saunders (1998), Kamstra et al (2001), Huang et al (2004), Kim (2005) and Lee (2007). However, we are not aware of any research that models bank ratings, although Morgan (2002) attempts to identify the determinants of the difference between two separate RAs' bank rating assignments using (ordered) logit regressions. Morgan's work is motivated by the inherent opacity of banks: those outside a bank, including the RAs, find it difficult to assess the risks that the bank has taken.

Within this context, we seek to shed light upon how ratings agencies determine the risks of banks. Thus, we employ financial variables, in addition to country risk (which we model using country specific dummy variables), as determinants of bank ratings using both ordered choice models and support vector machines (SVMs). The main challenge in modelling ratings is to increase the probability of correct classifications. Therefore, our comparison of SVMs with ordered choice models for predicting individual bank ratings as produced by Fitch Ratings (FR) is a significant contribution to current research in this field. In doing so, we model the bank ratings assigned by FR using both ordered choice models and SVMs with the aim of shedding light upon their determination and comparing the two modelling methodologies. We consider three sets of determinants of ratings with the first set being financial variables.
Secondly, we examine whether bank ratings are systematically determined by the timing of the rating. Thirdly, we incorporate country indicator variables to capture country-specific variations in ratings under the rationale that a bank's rating is related to the country in which it is based. Accounting for country (fixed) effects within the context of modelling bank ratings is an additional contribution of our study: we demonstrate that it substantially enhances the predictive accuracy of our models. We also assess the predictive power of our models and compare the performances of ordered choice models and SVMs.

The organization of the paper is as follows. Section 2 provides a brief literature review. Section 3 describes the data and the methods applied while Section 4 discusses the principal empirical findings. Section 5 then considers our models' predicted ratings. Section 6 concludes and provides policy recommendations and suggestions for further research.

2. Brief Literature Review

The ability of RAs to assign ratings correctly has been extensively questioned (Altman and Saunders, 1998, Levich et al, 2002, Altman and Rijken, 2004, Amato and Furfine, 2004, Portes, 2008). One of the most frequent arguments about the prediction abilities of RAs is that they could provide misleading information since the analysis is backward looking rather than forward looking. In addition, the low transparency of ratings assignments contributes to the concern over the accuracy of ratings. Further, RAs do not have, and cannot have, superior information to market participants about uncertainty and the degree of insolvency (illiquidity) of companies.

Predicting the financial soundness of banks, corporations and sovereign countries has been of central importance for analysts, regulators and policy makers.
Research focusing on the prediction of bank failures by applying Early Warning Systems (EWS) has also been extensive; see, for example, Mayer and Pifer (1970), Altman and Saunders (1998), Kolari et al (1996) and Kolari et al (2002). There has been widespread research in predicting bond ratings using multivariate discriminant analysis as well as probit and logit models (Altman and Saunders, 1998). Kamstra et al (2001) have recently demonstrated that predictive accuracy can be improved by combining several forecasting methods to predict bond ratings. Kim (2005) used non-parametric techniques designed to capture the dynamic relationship between input and output variables. Huang et al (2004) and Lee (2007) show that artificial intelligence methods do not provide superior predictions of bond ratings to standard ordered choice methods.

Comparing ordered logit/probit regressions with SVM is a valid way of addressing the main challenge in modelling ratings, which is to increase the probability of correct classifications. However, we are not aware of any previous studies that compare ordered choice models and SVMs in terms of modelling and predicting individual bank ratings, which is the aim of this paper.

3. Data and Methodology

FR, as one of the largest rating companies for the banking industry around the world, releases four types of ratings: legal ratings, long-term and short-term (security) ratings and individual ratings. We focus on individual ratings because they assess the financial position of a bank itself. As stated by FR, the rating is closely linked with financial performance (financial variables). The individual rating provided by FR is divided into five categories according to the performances of rated banks and further subdivided to give a total of nine rating categories. [4] Using data on 681 international banks' ratings between 2000 and 2007 collected from BankScope, we estimate models of the determinants of these ratings, denoted $Y_i$.
This variable is ordinal and has up to nine ranked categories that are assigned integer values from 1 to 9, such that lower values indicate a lower rating. The sample size falls as higher-order lagged explanatory factors are added to the model and this can cause all banks in a particular category to be excluded from the sample. In our application the number of categories is 8 (ratings 1 to 8) because 4 lags are included in all of our models. The eight rating categories (with assigned values in brackets) are: E (1), D/E (2), D (3), C/D (4), C (5), B/C (6), B (7), A/B (8). There are no data on banks with an A rating in our sample. Since all models include a fourth lag of at least one financial variable the sample size used in estimation is reduced to between 359 and 360 observations. The average numerical ratings range from 5.83 in 2002 to 4.31 in 2005, suggesting a general decline in ratings over time. We assess whether ratings have declined through time in our modelling.

We apply ordered choice estimation techniques and SVMs to this rating data. The SVM approach has a number of advantages over other machine learning algorithms such as neural networks (NN): (1) it has a solid theoretical foundation in statistical learning theory (Vapnik 1995), (2) SVM finds a single global minimum whereas NN may find local rather than global minima, (3) computational complexity does not increase with the size of the input space as it does with NN, and (4) SVM includes a regularization term to control model complexity and therefore avoid over-fitting. The last point is a critical advantage that SVM also has over standard methods such as logistic regression: it enables SVM to implement complex non-linear models, such as polynomial and Gaussian models, and to handle a large number of covariates (features) such as multiple country dummy variables (codes).

[4] The standard classification of the individual rating is A, B, C, D and E. A further gradation among these five ratings is used, that is, A/B, B/C, C/D and D/E. An A rating indicates that the bank is in an impeccable financial position with a consistent record of above average performance. The B rating defines a bank as having a sound risk profile without any significant problems. The bank's performance generally has been in line with, or in a better position than, that of its peers. The C rating includes banks which have an adequate risk profile but possess one troublesome aspect, giving rise to the possibility of risk developing, or which have generally failed to perform in line with their peers. The D rating includes banks which are currently under-performing in some notable manner. Their financial conditions are likely to be below average and their profitability is poor. These banks have the capability of recovering using their own resources, but this is likely to take some time. Finally, the E rating includes banks with very serious problems which either require or are likely to require external support.

3.1 Ordered Choice Estimation Techniques

The ordered dependent variable model assumes the latent variable form below (see Greene 2008). The model has both cross-sectional and time-series dimensions; however, the latter is referenced to the year a rating was made rather than the calendar year. Further, there is only one time-series observation for each bank's dependent variable, although lags are available on the explanatory factors. As such, the model is not a pooled data specification.
Rather it is a cross-sectional model with time-series dynamics in the explanatory variables, thus:

$$Y^*_{it} = \sum_{k=1}^{K} \beta_{k1} X_{k,it-1} + \sum_{k=1}^{K} \beta_{k2} X_{k,it-2} + \sum_{k=1}^{K} \beta_{k3} X_{k,it-3} + \sum_{k=1}^{K} \beta_{k4} X_{k,it-4} + u_{it} \tag{1}$$

where $X_{k,it-l}$, $l = 1, 2, 3, 4$, is the $l$th lag on the $k$th explanatory variable for the $i$th bank in period $t$, $u_{it}$ is a stochastic error term, and $Y^*_{it}$ is the unobserved dependent variable that is related to the observed dependent variable, $Y_{it}$, (assuming eight categories) as follows:

$$Y_{it} = \begin{cases} 1 & \text{if } Y^*_{it} \le \lambda_1 \\ j & \text{if } \lambda_{j-1} < Y^*_{it} \le \lambda_j, \quad j = 2, 3, \ldots, 7 \\ 8 & \text{if } \lambda_7 < Y^*_{it} \end{cases} \tag{2}$$

where $\lambda_1, \lambda_2, \ldots, \lambda_7$ are unknown parameters (limit points) to be estimated with the coefficients (the $\beta_{kl}$s).

Our interest is primarily confined to the general direction of correlation between the dependent and independent variables. Therefore, we use the sign of $\beta_{kl}$ to provide guidance on whether the estimated signs of coefficients concur with our a priori expectations. This is instead of looking at the marginal effects, which indicate the direction of change in the probability of each value of the dependent variable in response to a change in $X_{k,it-l}$. For ordered choice models these marginal effects are difficult to interpret. Greene (2008) suggests that probit and logit models yield results that are very similar in practice.

3.2 Support Vector Machines

Since the SVM classifier is binary it is not immediately suitable for the bank rating problem. Multiple bank ratings can be modelled using multiple SVM classifiers in a “one-against-one” or directed acyclic graph approach (Huang et al 2004). However, this requires many SVMs, which further increases model complexity and computation time. A simpler method is to use SVM regression to model ratings directly.
An advantage that SVM regression has in contrast to classical regression techniques such as OLS is that the loss function is $\varepsilon$-insensitive, meaning that differences between the predicted and true value less than $\varepsilon$ are not treated as errors. Since we are interested in predicting integers, predictions are rounded to the nearest integer and so are indifferent to the fractional part of the prediction; for example, if the target rating is 2, then predictions of 2.1 or 2.3 are equally valid. Hence we can set $\varepsilon \le 0.5$ to represent this indifference. This property allows SVM to be a more sensitive model for bank ratings.

SVM regression is expressed formally as follows. Given $n$ observations $(x_i, y_i)$ where $x_i$ is a vector of covariates and $y_i$ is a real number outcome, SVM regression constructs a linear model $\hat{y} = w \cdot x + b$ by solving the quadratic optimization problem:

$$\min_{w, \xi, \xi^*} \; \frac{1}{2} w \cdot w + C \sum_{i=1}^{n} \left( \xi_i + \xi_i^* \right) \quad \text{subject to} \quad \begin{cases} y_i - w \cdot x_i - b \le \varepsilon + \xi_i \\ y_i - w \cdot x_i - b \ge -\left( \varepsilon + \xi_i^* \right) \\ \xi_i, \xi_i^* \ge 0 \end{cases} \tag{3}$$

This is essentially a least absolute value regression method with the $\varepsilon$-insensitive loss function:

$$L(y, \hat{y}) = \begin{cases} 0 & \text{if } |y - \hat{y}| \le \varepsilon \\ |y - \hat{y}| - \varepsilon & \text{otherwise} \end{cases} \tag{4}$$

and a regularization term to minimize the magnitude (complexity) of $w$. The relative importance of the two optimization goals of fitting the data and reducing model complexity is controlled by the parameter $C$. Lower values of $C$ give greater emphasis to reducing model complexity whilst larger values give greater emphasis to model fit. By reducing the complexity of the model we make it less likely that the model over-fits the in-sample training data and so it should yield better performance on out-of-sample test data.

This representation is in primal form, but it can be transformed into dual form using Lagrange multipliers. In dual form non-linear models can be implemented efficiently using kernel methods. Typical kernels are polynomial and Gaussian kernels (Vapnik 1995).
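As a rough illustration of the ε-insensitive loss in (4) and of rounding a real-valued SVM regression output to an integer rating, the following Python sketch may help. The function names, the default ε = 0.5 and the clipping of predictions to the range 1 to 8 are assumptions made for illustration only; they are not taken from the implementation used in the study.

```python
import math


def eps_insensitive_loss(y_true, y_pred, eps=0.5):
    """Loss in equation (4): zero inside the eps-tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)


def to_rating(y_pred, lowest=1, highest=8):
    """Round a real-valued prediction to the nearest integer rating.

    Clipping to [lowest, highest] is an illustrative assumption, not
    something stated in the text.
    """
    nearest = math.floor(y_pred + 0.5)  # round half up, unlike built-in round()
    return int(min(highest, max(lowest, nearest)))


# For a target rating of 2, predictions of 2.1 and 2.3 incur no loss
# and both round to 2; a prediction of 2.8 is penalised and rounds to 3.
for pred in (2.1, 2.3, 2.8):
    print(pred, eps_insensitive_loss(2, pred), to_rating(pred))
```

With ε at or below 0.5, any prediction that the loss treats as error-free also rounds to the target rating, which is the indifference the text describes.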
However, for our study we used the simple linear model since we want to extract and report coefficient estimates $w$ to compare with the ordered choice model. We also do not use complex kernels since we want to avoid over-fitting on the in-sample data set. We apply the SVM method to bank ratings data and compare its in-sample predictive performance with that of the current standard method for modelling ratings, namely ordered choice models. Various combinations of $\varepsilon$ and $C$ are considered in the SVM application. For these experiments we used LIBSVM, a popular implementation of SVM by Chang and Lin (2001).

3.3 Covariates and Modelling

We consider three sets of covariates to model bank ratings: financial variables, the year in which the rating was made [denoted $time_{it}$] and country effects. The financial variables that we consider are as follows: the ratio of equity to total assets [denoted $Equity_{it}$], the ratio of liquid assets to total assets [$Liquidity_{it}$], the natural logarithm of total assets [$\ln(Assets_{it})$], the net interest margin [NI_Margin], the variable $NOA_{it} = OIA_{it} - OEA_{it}$ (where $OIA_{it}$ is the ratio of operating income to total assets and $OEA_{it}$ is the ratio of operating expenses to assets), the ratio of operating expenses to total operating income [$OEOI_{it}$] and the return on equity [$ROAE_{it}$]. [5]

[5] The following three further variables were also considered for inclusion in the model: the ratio of operating expenses to assets [$OEA$], the ratio of operating income to assets [$OIA$] and the return on assets [$ROAA$]. These were excluded from the model because they would cause a high degree of multicollinearity and their effects could be captured in other ways. That is, the effects of $OEA$ and $OIA$ are captured by the variable $NOA = OIA - OEA$ while $ROAA$ is a close substitute of $ROAE$ (with which it is highly correlated). The highest pairwise simple correlations amongst the explanatory factors involve these variables. Specifically, the simple correlation coefficients for each pairing (calculated using a common sample) are the following: $OEA$ and $OIA$, 0.98; $ROAA$ and $NOA$, 0.89; $OIA$ and $NOA$, 0.84; $OEA$ and $NOA$, 0.72; $OIA$ and $ROAA$, 0.71; $ROAA$ and $ROAE$, 0.62; $ROAA$ and $OEA$, 0.60. The simple correlation coefficients of pairs of variables retained in the model are all well below 0.5 (most are substantially lower than this), which helps to ensure that the reported regressions do not suffer from severe multicollinearity.

The first to fourth lagged values of the financial variables are considered as potential determinants of bank ratings. We do not include current values of these seven variables because they may contain information that was unknown at the time the rating was made. For example, if a bank's rating was decided in January 2007 then the value of any explanatory factor measured over the whole of 2007 would be unknown when the rating was made. Models could not be estimated when the lag length exceeded four. Therefore, models are estimated from one up to four lags of these variables.

Finally, we incorporate country indicator variables to capture country-specific variations in ratings. 89 country dummy variables (there are banks from 90 countries in total) account for country-specific effects (capturing, for example, country risk). The 89 individual country dummy variables were all entered simultaneously in the SVM application. However, for the ordered choice models these country dummy variables could not all be entered simultaneously because such a model could not be estimated. Therefore, these country dummies are combined into a single index of indicators, following Hendry (2001). [6]

To obtain an initial index, $I_1$, we calculate the average rating of each country in our whole sample, where $\delta_m$ denotes the average rating for the $m$th country. A dummy variable for each country is constructed such that it is unity for that country's observations and zero otherwise. The initial index is then constructed as:

$$I_1 = \sum_{m=1}^{M-1} \delta_m D_m,$$

where $D_m$ denotes the dummy for the $m$th country and $M$ is the total number of countries. This index was checked for appropriateness by estimating a single ordered choice model (we used the probit form in our application) that included the country index plus one individual country's dummy. If the latter was significant at the 5% level the value of this dummy's coefficient was incorporated into the country index. This was repeated for all ninety countries, that is, ninety distinct regressions that contained only two variables (the country index and a particular country's dummy) were estimated. After all the coefficients of the individual country dummies that were significant in these ninety regressions had been incorporated into the index, this step was repeated until no individual country dummies were significant at the 5% level (when included in a regression with the country index). [7] The weights used in the resulting country index (denoted $Country_{it}$) are reported in Table 1.

[6] Hendry's analysis is within the context of modelling inflation using time-series data. Hendry and Santos (2005) discuss the potential advantages of using such an index.

Models were then constructed using the country specific terms and the other explanatory factors (financial variables and time term). For the ordered choice models a cross-sectional variant of the general-to-specific method was employed to produce the favoured model. When more than one model could be chosen the favoured parsimonious model was selected upon the basis of the lowest SBC. Both general and parsimonious models are reported. Regarding the SVM procedure we estimate a general model with all variables included and, based upon bootstrapped confidence intervals, a parsimonious SVM is selected.
That is, those covariates that are individually significant according to the confidence intervals are selected for the parsimonious model; no joint tests of significance for sets of variables are conducted. The production and use of bootstrapped confidence intervals within the SVM approach is an innovation of this study.

An advantage of the SVM approach over the use of ordered choice models is that it allows all of the individual country dummies to be included simultaneously and their coefficients to be freely estimated in an unrestricted model. For comparative purposes we report results on predictive accuracy for SVMs that use the country index employed in the ordered choice models. We also report predictive performance measures for both SVMs and ordered choice models that exclude country effects in an attempt to determine the importance of accounting for country effects when modelling bank ratings. Finally, the predictive performance of ordered choice models that incorporate a country index based upon the weights obtained from the general SVM is also reported.

[7] The adjustments made to the initial country index, which is based upon the average of each country's rating, were, first, to add 0.118 to the weights in the index for Argentina, Benin, Iran, Jamaica, Kenya, Lebanon, Mongolia, Nigeria and Tunisia and, second, to subtract 7.379766 from the weight in the index for Bangladesh.

4. Empirical Results: determinants of ratings

The ordered logit and probit regression results for the determinants of bank ratings with four lags of the explanatory variables are given in Table 2. For all specifications we report a general model (including all lags of the variables) and one parsimonious specification. For the ordered choice models the favoured parsimonious specification only includes individually (according to z-statistics) and jointly (according to a likelihood ratio test, denoted LR statistic) significant variables.
In all cases the restrictions placed on the general model to obtain the parsimonious model cannot be rejected according to a likelihood ratio test [LR(general→*)]. Whilst these generally are exclusion restrictions we also consider combining $Liquidity_{it-2}$ and $Liquidity_{it-3}$ into the difference variable, $\Delta Liquidity_{it-2} = Liquidity_{it-2} - Liquidity_{it-3}$, given that their coefficients are of approximately equal magnitude and opposite sign. Upon this basis the favoured model includes $\Delta Liquidity_{it-2}$ for both probit and logit forms. The favoured parsimonious models will yield more efficient inference relative to the general model and are, therefore, used for inference. The same models are favoured for the probit and logit forms.

The favoured parsimonious models include the following statistically significant effects with an unambiguous direction of correlation. The variable time has a negative effect on bank ratings: the more recently the bank's rating was made the lower the rating will be, ceteris paribus. Equity (capital adequacy) has a positive effect on a bank's rating: a more capitalised bank has a higher rating. The natural log of assets also has a positive effect on bank ratings: banks with a larger size of assets have a higher rating. $OEOI$ has a negative correlation with a bank's rating. The return on equity, $ROAE$, has a positive impact upon ratings. All of these effects are consistent with prior beliefs.

Country has a positive coefficient indicating that country specific effects affect a bank's rating: a bank in a less stable/developed/rich economy appears to have a lower rating. For example, Canada, Ireland, Norway and Sweden are in the group of countries with the highest country specific rating while Bangladesh has the lowest country specific rating (interestingly Andorra is ranked in the top band of the country index).
This finding confirms our hypothesis that a bank's country of origin plays an important role in assigning individual ratings, capturing constant country specific effects (rather like fixed-effects in a panel data model) that are not explained by the financial variables.

Both the second and third lags of Liquidity are significant and their coefficients are of approximately equal magnitude and opposite sign in the general models. Hence, it is the second lag of the change in liquidity, $\Delta Liquidity_{t-2}$, rather than its level, that appears to be important (in the parsimonious specification) and it has a plausible positive effect upon bank ratings. That is, a bank whose liquidity increased two periods ago has a higher rating. We note that this effect would not have been revealed had we not allowed for sufficient lags in the dynamic specification. We believe that allowing for such lags is a strength of our investigation relative to analyses that do not consider such dynamics. NI_Margin is not significant, thus it appears that NI_Margin does not determine bank ratings.

Finally, the second lag of $NOA$ is significant. We are cautious of interpreting this as supportive of a significant effect upon rating because $NOA_{t-2}$ has a theoretically implausible negative sign. This apparent and unexpected correlation may be due to a Type I error (of which there is a 5% chance given our chosen significance level).

Under the heading ‘Undifferenced specifications’ in Table 3 are the SVM estimated weights with two sets of bootstrap confidence intervals at the 80% level. We do not report the 95% confidence intervals because only two non-country covariates are significant using this level of significance (and one has an unexpected sign), being $\ln(Assets_{t-1})$ and $Liquidity_{t-3}$, and, in addition, up to twenty country dummies are significant. [8] To obtain a number of significant covariates closer to that obtained with the ordered choice models we employed a broader 80% confidence interval to produce our reported parsimonious model. The following eight non-country variables are included in the parsimonious model (their weights' signs are given in parentheses): time (negative), $\ln(Assets_{t-1})$ (positive), $OEOI_{t-1}$ (negative), $OEOI_{t-2}$ (negative), $ROAE_{t-1}$ (positive), $Liquidity_{t-2}$ (positive), $Liquidity_{t-3}$ (negative) and $NOA_{t-3}$ (negative). All these weights' signs are plausible except for $Liquidity_{t-3}$ and $NOA_{t-3}$. However, the weight on $Liquidity_{t-3}$ is of approximately equal magnitude and opposite sign to that on $Liquidity_{t-2}$, suggesting that the difference of this variable is important, which is consistent with the findings from the ordered choice model.

[8] With bootstrapped confidence intervals 13 country dummies are significant while 20 of the country indicators are significant according to the confidence intervals based upon the normal distribution.

Nevertheless, the 95% confidence intervals are broad and disappointing and using 80% confidence intervals simply to secure more acceptable results is arbitrary. We believe that these disappointing confidence intervals may be due to multicollinearity between the lags of covariates leading to unstable SVM estimates. Therefore, we applied an alternative “difference” model with undifferenced covariates used for lag 1 and the first three lags of differenced variables. The weights of the model, with two sets of 95% confidence intervals, are reported in Table 3 under the heading ‘Differenced specifications’. These results seem more satisfactory because there are a greater number of significant covariates in the ‘differenced specification’ compared to the ‘undifferenced specification’ using the 95% confidence interval, and they broadly corroborate the results produced by the ordered choice model.
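The “difference” specification described above can be sketched in Python as follows. This is one reading of the text, not the authors' code: it assumes that the differenced variable at lag l is the level at lag l minus the level at lag l + 1, matching the definition of the change in liquidity used for the ordered choice models. The function names and the example values are hypothetical.

```python
def difference_spec(lags):
    """Re-express four lagged levels as one level plus three differences.

    lags = [x(t-1), x(t-2), x(t-3), x(t-4)]
    returns [x(t-1), dx(t-1), dx(t-2), dx(t-3)],
    where dx(t-l) = x(t-l) - x(t-l-1) (assumed definition).
    """
    x1, x2, x3, x4 = lags
    return [x1, x1 - x2, x2 - x3, x3 - x4]


def undo_difference_spec(spec):
    """Recover the four lagged levels from the difference specification."""
    level, d1, d2, d3 = spec
    x2 = level - d1
    x3 = x2 - d2
    x4 = x3 - d3
    return [level, x2, x3, x4]


# Made-up liquidity ratios (in percent) for one bank at lags 1 to 4.
liquidity_lags = [30, 28, 31, 27]
spec = difference_spec(liquidity_lags)
print(spec)
print(undo_difference_spec(spec) == liquidity_lags)
```

Because the transformation is invertible, the differenced covariates carry the same information as the four lagged levels; only the parameterisation of the linear model changes.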
The reported parsimonious model, based on the 95% confidence intervals, includes the following variables: 1−tEquity (positive), ( ) 1ln −tAssets (positive), 1−tOEOI (negative), 1−tROAE (positive) and 2−∆ tLiquidity (positive) plus 11 country dummy variables. All of the weights on the included variables have theoretically plausible signs that reinforce our satisfaction with this ‘differenced specification’. Overall SVMs used with bootsrapped confidence intervals applied at standard levels can be employed to determine the significant variables in the same way as ordered choice models. Upon this basis a parsimonious specification can be chosen. To determine if SVMs select parsimonious models that are as good as those obtained using ordered choice models we need to compare the predictive accuracy of these specifications. The predictive accuracy of these models is considered in the next section. 5: Empirical Results: predictive performance The percentage of correct predictions of the general and parsimonious specifications for both ordered choice models and SVMs are reported in Table 4 and Table 5, respectively. 9 From Table 4 (row entitled With Country) we see that there are between 50.1% and 52.1% correct predictions for the ordered choice models including the country variable (predictions 9 This prediction is calculated using the same sample employed to estimate the data. It is a fit measure rather than providing an assessment on out-of-sample data. We did not reserve any data for out-of-sample evaluation in order to maximise the period that could be used for estimation and, therefore, maximise efficiency of estimation, which is especially important for ordered choice models. 17 are obtained from the models reported in Table 2). 10 The percentages of correct predictions for these models excluding the country variable are reported in the row entitled No Country of Table 4 for comparative purposes. 
¹¹ The parsimonious model is obtained by applying the general-to-specific method with all variables except for Country included in the general model. The estimated percentages of correct predictions for these regressions are between 34.0% and 36.6%. The predictive accuracy is substantially greater (by at least 13.5 percentage points) for models incorporating the Country variable than for those that do not. The regressions including this country index also have much larger pseudo $R^2$s than those that do not, and the country index is highly significant in all models in which it is included. This further demonstrates the importance of modelling country effects for predicting international bank ratings.

The percentages of correctly predicted bank ratings obtained from the undifferenced SVM with all country dummies included simultaneously and estimated unrestrictedly, for various combinations of C and ε, are reported in Table 5 in the section headed Undifferenced Model with Unrestricted Dummies. The predictive accuracy of the general SVM is between 48.5% and 62.4% (the majority of SVM predictions exceed 57%), which is substantially better than that obtained from the ordered choice models.¹² If such performance can be repeated out-of-sample, this would suggest that the adoption of SVMs would provide greater predictive accuracy than the ordered choice models that are currently used as standard for this purpose.

10 These percentages of correct predictions are similar for the probit and logit specifications, although the latter, in general, produce slightly more accurate predictions.

11 To save space we do not report these estimated models; however, these results are available from the authors on request.

12 To place this predictive performance in context, we note that when there are nine (eight) rating categories the expected accuracy of predictions by chance is 11.1% (12.5%). Hence, the best performing SVM can increase the predictive accuracy by over fifty percentage points.
¹³ This is important given that prediction is the primary purpose of such models.

The predictive performance of the parsimonious undifferenced SVM with unrestricted dummies is between 42.49% and 52.09%. This performance is substantially worse than that of the general SVM version of this model and no better (and often worse) than that of the ordered choice models (reported in the row With Country of Table 4). The implication is that model reduction within the SVM method has led to important variables being excluded, suggesting that one should be cautious when using the SVM methodology to identify the significant determinants of ratings. This contrasts with the ordered choice method, where there was no systematic difference in the predictive performance of general and parsimonious models.

We report the predictive accuracy of the differenced SVM (the estimation results are given in Table 3) in the section headed Differenced Model with Unrestricted Dummies of Table 5. The predictive accuracy is between 47.4% and 61.8% for the general differenced SVM and between 43.2% and 47.4% for its parsimonious counterpart. The maximum predictive accuracy of the differenced SVM (61.8%) is slightly less than that of the undifferenced SVM (62.4%). It is usually understood that multicollinearity is a problem for accurate estimation of coefficients rather than for prediction, and these results show that this is the case for our SVM models: there is no problem for prediction, only for reporting the weight estimates and their associated confidence intervals. As with the undifferenced SVM, the predictive accuracy of the parsimonious differenced SVM is substantially lower (with a maximum of 47.4%) than that of its general counterpart. This reinforces our earlier findings and demonstrates that, for the SVM method, using as many covariates as possible is best for achieving good predictive results, even if some of those covariates are not statistically significant in the conventional sense.
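The accuracy ranges quoted in this section are the minimum and maximum over the grid of C and ε values in Table 5. As a small illustration, the sketch below transcribes the General columns of the section Undifferenced Model with Unrestricted Dummies and recovers the range, the best-performing parameter pair and the number of cells above 57%. The variable and function names are hypothetical and the code plays no part in the original estimation.

```python
# In-sample percentage of correct predictions for the general undifferenced
# SVM with unrestricted country dummies, transcribed from Table 5.
# Keys are values of C; each list follows the order of EPSILONS.
EPSILONS = [0.05, 0.25, 0.5]
GENERAL_UNDIFFERENCED = {
    0.25: [53.2, 53.8, 48.5],
    0.5: [56.8, 59.3, 51.0],
    1: [57.9, 61.3, 53.2],
    2: [58.2, 62.4, 56.3],
    4: [59.9, 61.6, 53.2],
    8: [58.5, 61.3, 53.2],
    12: [58.2, 61.6, 52.6],
}

def flatten(grid, epsilons):
    """Return one (accuracy, C, epsilon) tuple for every cell of the grid."""
    return [(acc, c, eps)
            for c, row in grid.items()
            for eps, acc in zip(epsilons, row)]

def summarise(grid, epsilons, threshold=57.0):
    """Best cell, worst cell, number of cells above the threshold, and cell count."""
    cells = flatten(grid, epsilons)
    best = max(cells)    # tuples compare on accuracy first
    worst = min(cells)
    n_above = sum(1 for acc, _, _ in cells if acc > threshold)
    return best, worst, n_above, len(cells)

best, worst, n_above, n_cells = summarise(GENERAL_UNDIFFERENCED, EPSILONS)
print("best (accuracy, C, epsilon):", best)
print("worst (accuracy, C, epsilon):", worst)
print("cells above 57%:", n_above, "of", n_cells)
```

The best cell is C = 2 with ε = 0.25 (62.4%), the worst is C = 0.25 with ε = 0.5 (48.5%), and 11 of the 21 cells exceed 57%, in line with the figures quoted in the text.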
13 The in-sample predictive performance of the general undifferenced SVMs is at least as good as that of the ordered choice models with ε < 0.5 and C > 0.5, and is best when ε = 0.25 and C = 2 (with 62.4% correct predictions). This suggests that the choice of these parameters is important in the selection of the SVM used for prediction.

The predictive accuracy of undifferenced SVMs that exclude the country dummy variables is reported in the section headed Undifferenced Model with No Country Dummies in Table 5. The percentages of correct predictions are between 34.8% and 39.0% (29.0% and 31.8%) for the general (parsimonious) undifferenced SVM without country dummies. This is a substantially worse predictive performance than when country dummies are included in the SVMs. It confirms the findings obtained from the ordered choice models’ predictive performance and further emphasises the importance of accounting for country effects when modelling and predicting bank ratings.

SVMs with undifferenced variables are re-estimated using the single country index (employed in the ordered choice models) to capture country effects instead of the individual country dummies, and their predictive accuracy is reported in the section headed Undifferenced Model with Single Country Index of Table 5. The predictive accuracy is substantially lower for the SVMs using the single country index (between 48.7% and 53.2% for the general model and between 49.0% and 52.6% for the parsimonious specification) than for the SVMs with unrestricted dummies, but is similar to that produced by the ordered choice models (which also use this country index). Hence, the superior predictive performance of SVMs over ordered choice models seems to arise because the former can estimate the coefficients of the country dummies simultaneously and unrestrictedly, whereas the ordered choice models cannot.
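The difference between the single country index and unrestricted dummies can be made concrete with a short sketch. Following the notes to Table 1, the index is the sum of the products of the country weights and the individual country dummy variables, so for any one bank it collapses to the weight of that bank's country. Only a handful of weights from Table 1 are transcribed here, and the function names are hypothetical.

```python
# A few country weights transcribed from Table 1 (not the full table).
COUNTRY_WEIGHTS = {
    "USA": 6.82,
    "UNITED KINGDOM": 5.64,
    "JAPAN": 4.86,
    "TURKEY": 3.76,
    "UKRAINE": 2.21,
}

def country_dummies(country, countries):
    """One 0/1 dummy per country; exactly one equals 1 for a known country."""
    return {c: 1 if c == country else 0 for c in countries}

def country_index(country, weights):
    """Sum of weight x dummy over all countries, as in the notes to Table 1."""
    dummies = country_dummies(country, weights)
    return sum(weights[c] * dummies[c] for c in weights)

for name in COUNTRY_WEIGHTS:
    print(name, country_index(name, COUNTRY_WEIGHTS))
```

A model that uses the index estimates a single coefficient on this one regressor, so the relative sizes of the country effects are fixed in advance by the weights. A model with unrestricted dummies instead estimates one coefficient per entry of `country_dummies`, which is the flexibility the text attributes to the SVM.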
To explore this issue further, we calculate the percentage of correct predictions from ordered choice models that include a new country index constructed using the weights obtained from the general undifferenced SVM with unrestricted dummies.¹⁴ The weights for this model are reported in Table 6. Predictive performance is reported for ordered choice models that include this new country index in the row entitled With Country SVM of Table 4. The original index is replaced with the new index in the general models and the general-to-specific methodology is applied to obtain parsimonious models. The predictive performance measures fall in the range of 53.8% to 57.8%, with the parsimonious logit model securing the highest accuracy. It is noticeable that all of the ordered choice models that use the index based upon SVM weights have greater predictive accuracy than the best performing ordered choice model that employs the original index. However, these ordered choice models can still be substantially outperformed by the undifferenced SVM that estimates the country dummies unrestrictedly. This further highlights the advantage that the SVM has in terms of its ability to freely estimate the coefficients of all of the country dummies.

14 The simple and Spearman rank correlation coefficients are 0.793 and 0.862, respectively, for the correlation between the original country index and the one based upon the SVM weights. Hence, whilst they are highly correlated, there are still notable differences between the two indexes.

6. Conclusions

Using data on banks from around the world, we compare two different methodologies, ordered choice models and SVMs, for predicting and identifying the significant determinants of bank ratings. The ordered choice models unambiguously identify the following significant determinants of ratings. Banks with a greater capitalisation ($Equity$), larger assets [$\ln(Assets)$], and a higher return on assets ($ROAE$) have higher bank ratings.
Further, if a bank’s liquidity increased two periods ago ($\Delta Liquidity$), its rating will rise. Conversely, the greater is a bank’s ratio of operating expenses to total operating income ($OEOI$) and the more recent is the date at which the rating is made ($time$), the lower is the rating of the bank. However, we also find that net operating income to total assets ($NOA$) has an unexpected negative influence on bank ratings. In addition, there is strong evidence that a bank’s country of origin has a significant influence on bank ratings. Inclusion of this country effect substantially raises the ability of an ordered choice model to predict international bank ratings accurately relative to models that exclude country effects.

The SVM results confirm the importance of country effects as significant determinants of ratings: when they are excluded, the predictive performance of the model deteriorates considerably relative to a model including country effects. However, although the SVM method could be adapted to identify the significant variables, it suggested few significant determinants at the 95% level. Using an 80% confidence interval unsurprisingly raised the number of significant variables. Given the arbitrary and unconventional choice of confidence level required to secure this result, we considered whether the breadth of the confidence intervals was due to multicollinearity among the lags of the covariates. Using an SVM that incorporated variables in both difference and level form yielded more significant determinants at the conventional 95% confidence level, all of which had the expected signs on their weights. These significant variables are $Equity$, $\ln(Assets)$, $ROAE$, $\Delta Liquidity$ and $OEOI$, which are the same as for the ordered choice models except for $NOA$ and $time$, which were not indicated as significant by the differenced SVM.
However, the predictive performance of the parsimonious SVMs was considerably lower than that of the general SVMs, suggesting that the model reduction method applied to SVMs leads to important determinants being excluded from the model and, therefore, not being identified. For this reason we are cautious about presenting the variables identified by the SVM method as the only significant determinants of ratings. Thus, based primarily on the results obtained from the parsimonious ordered choice models, we conclude that ratings reflect a bank’s financial position (as measured by various financial variables), the timing of when the rating was made and a bank’s country of origin. Regarding the timing effect, FR may have applied more prudent views and policies as a reaction to critiques of their role during the financial turbulence of the late 1990s. We have therefore identified a set of determinants that are revealing in how ratings agencies determine the risks of inherently opaque banks, especially given the high predictive accuracy achieved by our models.

We have also found that SVMs can produce substantially better in-sample predictions of international bank ratings than the standard method currently used for this purpose, ordered choice models. This appears to be due to the SVM’s ability to estimate a large number of country dummies’ coefficients unrestrictedly, which was not possible with the ordered choice models due to the small sample size. Given that prediction is the primary purpose of modelling ratings, this is an important result. In this paper we have only considered in-sample predictions because reserving some of the data for an out-of-sample data set would have severely restricted the in-sample (training) data to fewer than the 360 observations.
However, consideration of the relative out-of-sample predictive performance of SVMs and ordered choice models, requiring more observations than were available here, would be a desirable avenue for further research. Additionally, we deliberately did not use SVMs with non-linear kernels to model the data in this exercise, since we did not have sufficient data for an independent validation set to optimise non-linear model parameter settings. However, we expect that using different kernels may improve performance. We intend to pursue these lines of research with a larger data set.

References

Altman, E. I., and Saunders, A. (1998). Credit risk measurement: Developments over the last 20 years. Journal of Banking and Finance, 21, 1721–1742.
Altman, E., and Rijken, H. A. (2004). How Rating Agencies Achieve Rating Stability. Journal of Banking & Finance, 28, 2679–2714.
Altman, E., and Rijken, H. A. (2006). A Point in Time Perspective on Through-the-Cycle Ratings. Financial Analysts Journal, 62, 54–70.
Altman, E., Bharath, S., and Saunders, A. (2002). Credit ratings and the BIS capital adequacy reform agenda. Journal of Banking & Finance, 26, 929–951.
Altman, E., and Saunders, A. (2001). An analysis and critique of the BIS proposal on capital adequacy and ratings. Journal of Banking & Finance, 25, 25–46.
Amato, J. D., and Furfine, C. H. (2004). Are Credit Ratings Procyclical? Journal of Banking and Finance, 29, 2641–2677.
Chang, C.-C., and Lin, C.-J. (2001). LIBSVM: a library for support vector machines. Software available at http://www.csie.ntu.edu.tw/~cjlin/libsvm
Greene, W. H. (2008). Econometric Analysis, 6th edition. Pearson, Prentice Hall.
Hendry, D. F., and Santos, C. (2005). Regression models with data-based indicator variables. Oxford Bulletin of Economics and Statistics, 67, 5, 571–595.
Hendry, D. F. (2001). Modelling UK Inflation, 1875–1991. Journal of Applied Econometrics, 16, 255–275.
Huang, Z., Chen, H., Hsu, C.-J., Chen, W.-H., and Wu, S. (2004). Credit rating analysis with support vector machines and neural networks: a market comparative study. Decision Support Systems, 37, 543–558.
Kamstra, M., Kennedy, P., and Suan, T. K. (2001). Combining bond rating forecasts using logit. The Financial Review, 37, 75–96.
Kim, S. K. (2005). Predicting bond ratings using publicly available information. Expert Systems with Applications, 29, 75–81.
Kolari, J., Glennon, D., Shin, H., and Caputo, M. (2002). Predicting large US commercial bank failures. Journal of Economics and Business, 54, 361–387.
Lee, Y. C. (2007). Application of support vector machines to corporate credit rating prediction. Expert Systems with Applications, 33, 67–74.
Levich, R., Majnoni, G., and Reinhart, C. (2002). Ratings, Rating Agencies and the Global Financial System. Kluwer Publishing.
Meyer, P., and Pifer, H. (1970). Prediction of bank failures. Journal of Finance, 25, 853–868.
Morgan, D. P. (2002). Rating Banks: Risk and Uncertainty in an Opaque Industry. The American Economic Review, 92 (4), 874–888.
Pinto, A. R. (2006). Control and Responsibility of Credit Rating Agencies in the United States. American Journal of Comparative Law, 54, 341–356.
Portes, R. (2008). Ratings agency reform. Vox, 22 January 2008.
Vapnik, V. (1995). The Nature of Statistical Learning Theory. Springer, NY.

Table 1: Ordered Choice Models’ Country Index Weights

Country | Weight | Country | Weight
ANDORRA | 7.00 | TAIWAN | 4.09
CANADA | 7.00 | COLOMBIA | 4.00
IRELAND | 7.00 | COSTA RICA | 4.00
NORWAY | 7.00 | LITHUANIA | 4.00
SWEDEN | 7.00 | MALTA | 4.00
USA | 6.82 | MOROCCO | 4.00
SWITZERLAND | 6.75 | PERU | 4.00
SPAIN | 6.71 | TURKEY | 3.76
NETHERLANDS | 6.50 | EL SALVADOR | 3.75
SAUDI ARABIA | 6.43 | INDONESIA | 3.75
AUSTRIA | 6.00 | INDIA | 3.54
CZECH REPUBLIC | 6.00 | EGYPT | 3.50
ESTONIA | 6.00 | HUNGARY | 3.50
HONG KONG | 6.00 | BULGARIA | 3.40
ICELAND | 6.00 | LATVIA | 3.33
JORDAN | 6.00 | ARGENTINA | 3.12
SAN MARINO | 6.00 | BENIN | 3.12
SOUTH AFRICA | 6.00 | IRAN | 3.12
KOREA REP. OF | 5.82 | JAMAICA | 3.12
FRANCE | 5.71 | KENYA | 3.12
UNITED KINGDOM | 5.64 | LEBANON | 3.12
ITALY | 5.55 | MONGOLIA | 3.12
CHILE | 5.50 | NIGERIA | 3.12
GERMANY | 5.50 | TUNISIA | 3.12
KUWAIT | 5.50 | KAZAKHSTAN | 3.00
SLOVENIA | 5.50 | ROMANIA | 3.00
GREECE | 5.43 | RUSSIAN FEDERATION | 2.90
BERMUDA | 5.33 | PHILIPPINES | 2.83
QATAR | 5.25 | VENEZUELA | 2.75
UNITED ARAB EMIRATES | 5.25 | VIETNAM | 2.67
BAHRAIN | 5.17 | GEORGIA REP. OF | 2.50
AUSTRALIA | 5.00 | PAKISTAN | 2.50
MACAU | 5.00 | CHINA-PEOPLE'S REP. | 2.44
OMAN | 5.00 | UKRAINE | 2.21
PANAMA | 5.00 | ALBANIA | 2.00
TRINIDAD AND TOBAGO | 5.00 | ARMENIA | 2.00
MEXICO | 4.89 | AZERBAIJAN | 2.00
JAPAN | 4.86 | BOSNIA-HERZEGOVINA | 2.00
ISRAEL | 4.67 | MACEDONIA (FYROM) | 2.00
SLOVAKIA | 4.50 | NIGER | 2.00
BRAZIL | 4.38 | SERBIA | 2.00
CYPRUS | 4.33 | SRI LANKA | 1.80
THAILAND | 4.33 | DOMINICAN REPUBLIC | 1.75
POLAND | 4.29 | BELARUS | 1.60
MALAYSIA | 4.25 | BANGLADESH | –6.38

Table 1 notes. The country index for the ordered choice models is constructed as the sum of the products of the country weights (given in the column headed Weight) and the individual country dummy variables.

Table 2: Bank ratings ordered choice regressions

Variables | Logit: General model | Logit: Parsimonious model | Probit: General model | Probit: Parsimonious model
Country | 1.975 (12.938) | 1.904 (14.391) | 1.078 (12.81) | 1.039 (13.618)
time | –0.298 (–2.114) | –0.249 (–2.147) | –0.182 (–2.50) | –0.155 (–2.407)
$Equity_{t-1}$ | –0.001 (–0.014) | | 0.005 (0.18) |
$Liquidity_{t-1}$ | 0.437 (0.295) | | 0.446 (0.55) |
$\ln(Assets)_{t-1}$ | 0.840 (2.762) | 0.524 (6.900) | 0.522 (2.97) | 0.290 (7.072)
$NI\_Margin_{t-1}$ | –0.034 (–0.239) | | –0.028 (–0.47) |
$NOA_{t-1}$ | 6.520 (0.663) | | 1.954 (0.38) |
$OEOI_{t-1}$ | –0.528 (–2.094) | –0.508 (–3.449) | –0.296 (–2.01) | –0.310 (–3.840)
$ROAE_{t-1}$ | 0.021 (1.367) | 0.033 (3.595) | 0.015 (1.98) | 0.020 (3.948)
$Equity_{t-2}$ | 0.032 (0.611) | | 0.014 (0.49) |
$Liquidity_{t-2}$ | 5.180 (2.805) | | 2.877 (2.75) |
$\ln(Assets)_{t-2}$ | –0.009 (–0.013) | | –0.161 (–0.41) |
$NI\_Margin_{t-2}$ | 0.108 (0.697) | | 0.066 (0.89) |
$NOA_{t-2}$ | –27.913 (–1.553) | –10.376 (–2.360) | –15.194 (–1.55) | –5.824 (–2.081)
$OEOI_{t-2}$ | –1.377 (–2.091) | –1.812 (–3.132) | –0.749 (–2.03) | –1.005 (–3.128)
$ROAE_{t-2}$ | 0.016 (1.616) | | 0.008 (1.30) |
$Equity_{t-3}$ | 0.066 (0.868) | 0.081 (6.7346) | 0.044 (1.32) | 0.051 (6.909)
$Liquidity_{t-3}$ | –5.481 (–2.548) | | –2.914 (–2.67) |
$\ln(Assets)_{t-3}$ | –0.356 (–0.361) | | –0.120 (–0.25) |
$NI\_Margin_{t-3}$ | –0.067 (–0.850) | | –0.039 (–0.92) |
$NOA_{t-3}$ | 4.089 (0.410) | | 2.978 (0.57) |
$OEOI_{t-3}$ | –0.302 (–0.592) | | –0.265 (–0.84) |
$ROAE_{t-3}$ | 0.001 (0.233) | | 0.000 (0.05) |
$Equity_{t-4}$ | –0.002 (–0.039) | | –0.004 (–0.20) |
$Liquidity_{t-4}$ | 0.205 (0.141) | | –0.184 (–0.26) |
$\ln(Assets)_{t-4}$ | 0.105 (0.188) | | 0.079 (0.29) |
$NI\_Margin_{t-4}$ | 0.035 (1.091) | | 0.024 (1.30) |
$NOA_{t-4}$ | 0.392 (0.100) | | –0.398 (–0.19) |
$OEOI_{t-4}$ | –0.012 (–0.094) | | –0.030 (–0.37) |
$ROAE_{t-4}$ | 0.002 (0.0665) | 0.004 (2.599) | 0.001 (0.74) | 0.002 (2.121)
$\Delta Liquidity_{t-2}$ | | 5.311 (4.157) | | 3.029 (4.125)
Limit Points
$\lambda_1$ | 8.049 (5.315) | 6.797 (5.643) | 4.505 (5.593) | 3.866 (5.988)
$\lambda_2$ | 11.618 (7.1275) | 10.316 (7.916) | 6.304 (7.402) | 5.649 (8.205)
$\lambda_3$ | 14.158 (8.507) | 12.822 (9.384) | 7.685 (8.675) | 7.012 (9.703)
$\lambda_4$ | 16.123 (9.335) | 14.786 (10.239) | 8.735 (9.749) | 8.058 (10.543)
$\lambda_5$ | 18.456 (10.211) | 17.098 (11.195) | 10.023 (10.367) | 9.333 (11.483)
$\lambda_6$ | 20.121 (10.877) | 18.740 (11.968) | 10.934 (11.029) | 10.232 (12.228)
$\lambda_7$ | 22.618 (11.803) | 21.175 (12.946) | 12.301 (12.034) | 11.567 (13.309)
Fit Measures
% correct | 52.089 | 50.278 | 50.139 | 50.278
Pseudo $R^2$ | 0.385 | 0.381 | 0.373 | 0.368
SBC | 2.993 | 2.681 | 2.926 | 2.729
LR statistic | 536.110 [0.000] | 532.230 [0.000] | 519.016 [0.000] | 514.638 [0.000]
LR(general→*) | NA | 6.993 [0.997] | NA | 7.368 [0.995]
Observations | 359 | 360 | 359 | 360

Table 2 notes. The dependent variable is a bank’s rating, which has eight categories that correspond to the integer values in the range of 1 to 8 and yields seven limit points, $\lambda_i$, $i = 1, 2, \ldots, 7$ (the intercept is not separately identified from the limit points). Z-statistics (in parentheses) are based upon Huber-White standard errors, and the percentage of correct predictions (% correct) uses the category with the highest probability to give the predicted rating.
Also reported are the pseudo $R^2$, Schwarz’s information criterion (SBC), and likelihood ratio tests for the model’s explanatory power (LR statistic) and for the deletion of variables from the general model to obtain the parsimonious model, LR(general→*). Probability values are given in square brackets. All regressions were estimated using E-Views 6.0 and STATA 10.

Table 3: Bank ratings SVMs

Undifferenced specifications
Variables | General model | Parsimonious model
time | –0.096 [–0.24, –0.01] * | –0.091 [–0.01, 0.01]
$Equity_{t-1}$ | 0.016 [–0.01, 0.06] |
$Liquidity_{t-1}$ | 0.369 [–0.13, 1.25] |
$\ln(Assets)_{t-1}$ | 0.371 [0.20, 0.64] * | 0.350 [0.28, 0.39] *
$NI\_Margin_{t-1}$ | 0.010 [–0.08, 0.05] |
$NOA_{t-1}$ | –0.444 [–0.79, 0.05] |
$OEOI_{t-1}$ | –0.266 [–0.53, –0.10] * | –0.183 [–0.43, –0.08] *
$ROAE_{t-1}$ | 0.010 [0.01, 0.02] * | 0.009 [0.002, 0.02] *
$Equity_{t-2}$ | 0.017 [–0.02, 0.05] |
$Liquidity_{t-2}$ | 0.793 [0.01, 1.47] * | 1.055 [0.28, 1.62] *
$\ln(Assets)_{t-2}$ | 0.242 [–0.27, 0.55] |
$NI\_Margin_{t-2}$ | 0.007 [–0.09, 0.08] |
$NOA_{t-2}$ | –0.508 [–8.01, 11.32] |
$OEOI_{t-2}$ | –0.494 [–1.04, –0.03] * | –0.977 [–1.31, –0.57] *
$ROAE_{t-2}$ | 0.008 [–0.05, 0.01] |
$Equity_{t-3}$ | 0.011 [–0.03, 0.06] |
$Liquidity_{t-3}$ | –1.272 [–2.00, –0.47] * | –1.251 [–1.82, –0.47] *
$\ln(Assets)_{t-3}$ | –0.475 [–0.90, 0.15] |
$NI\_Margin_{t-3}$ | –0.039 [–0.08, 0.08] |
$NOA_{t-3}$ | –0.654 [–1.07, –0.08] * | –0.235 [–1.14, 0.55]
$OEOI_{t-3}$ | –0.151 [–0.79, 0.17] |
$ROAE_{t-3}$ | 0.000 [–0.01, 0.01] |
$Equity_{t-4}$ | 0.007 [–0.03, 0.03] |
$Liquidity_{t-4}$ | –0.036 [–0.88, 0.50] |
$\ln(Assets)_{t-4}$ | 0.259 [–0.18, 0.63] |
$NI\_Margin_{t-4}$ | –0.012 [–0.05, 0.01] |
$NOA_{t-4}$ | –0.015 [–0.62, 0.65] |
$OEOI_{t-4}$ | 0.005 [–0.18, 0.08] |
$ROAE_{t-4}$ | 0.002 [–0.001, 0.005] |

Differenced specifications
Variables | General model | Parsimonious model
time | –0.102 [–0.29, 0.03] |
$Equity_{t-1}$ | 0.050 [0.03, 0.08] * | 0.050 [0.03, 0.07] *
$Liquidity_{t-1}$ | –0.133 [–0.08, 0.95] |
$\ln(Assets)_{t-1}$ | 0.402 [0.25, 0.46] * | 0.434 [0.36, 0.52] *
$NI\_Margin_{t-1}$ | –0.039 [–0.11, 0.01] |
$NOA_{t-1}$ | –0.459 [–1.12, 0.25] |
$OEOI_{t-1}$ | –0.776 [–1.77, –0.14] * | –0.364 [–1.13, –0.02] *
$ROAE_{t-1}$ | 0.020 [0.003, 0.04] * | 0.014 [–0.01, 0.03]
$\Delta Equity_{t-1}$ | –0.036 [–0.08, 0.02] |
$\Delta Liquidity_{t-1}$ | 0.300 [–0.71, 1.46] |
$\Delta\ln(Assets)_{t-1}$ | –0.034 [–0.30, 0.52] |
$\Delta NI\_Margin_{t-1}$ | 0.050 [–0.07, 0.12] |
$\Delta NOA_{t-1}$ | 0.008 [–0.54, 0.41] |
$\Delta OEOI_{t-1}$ | 0.511 [–0.08, 1.42] |
$\Delta ROAE_{t-1}$ | –0.010 [–0.03, 0.02] |
$\Delta Equity_{t-2}$ | –0.005 [–0.05, 0.06] |
$\Delta Liquidity_{t-2}$ | 1.404 [0.23, 2.55] * | 1.165 [–0.09, 2.30]
$\Delta\ln(Assets)_{t-2}$ | 0.218 [–0.43, 0.75] |
$\Delta NI\_Margin_{t-2}$ | 0.043 [–0.10, 0.11] |
$\Delta NOA_{t-2}$ | 0.155 [–0.46, 0.74] |
$\Delta OEOI_{t-2}$ | 0.041 [–0.53, 1.00] |
$\Delta ROAE_{t-2}$ | –0.002 [–0.03, 0.004] |
$\Delta Equity_{t-3}$ | –0.009 [–0.04, 0.05] |
$\Delta Liquidity_{t-3}$ | –0.195 [–1.54, 0.93] |
$\Delta\ln(Assets)_{t-3}$ | –0.210 [–0.88, 0.40] |
$\Delta NI\_Margin_{t-3}$ | 0.018 [–0.03, 0.07] |
$\Delta NOA_{t-3}$ | –0.639 [–1.66, 0.45] |
$\Delta OEOI_{t-3}$ | –0.018 [–0.22, 0.26] |
$\Delta ROAE_{t-3}$ | –0.002 [–0.01, 0.003] |

Table 3 notes. The dependent variable is the bank rating (which is never differenced). Confidence intervals from a bootstrap (1,000 samples) are reported in square brackets. These are 80% confidence intervals for the undifferenced specifications, while 95% intervals are employed for the differenced specifications. An asterisk (*) indicates a variable that is significant according to the reported confidence intervals. Models also include dummy variables for countries, but these are not shown to save space.

Table 4: Percentage Correct Predictions: Ordered Choice Models

Row | Logit: General | Logit: Parsimonious | Probit: General | Probit: Parsimonious
With Country | 52.1% | 50.3% | 50.1% | 50.3%
No Country | 35.9% | 36.6% | 34.0% | 34.1%
With Country SVM | 56.8% | 57.8% | 53.8% | 53.8%

Table 4 notes. The predicted rating for each observation is chosen upon the basis of the category with the highest probability. The row entitled With Country refers to the percentage of correct predictions obtained from the models reported in Table 2. The row entitled No Country reports the correct predictions for models developed using the general-to-specific method where the country variable is excluded from the general model.
The row entitled With Country SVM denotes the predictive accuracy of ordered choice models that use the country index constructed with the weights obtained from the SVM’s general model (reported in Table 6); the parsimonious models are developed by applying the general-to-specific method.

Table 5: Percentage Correct Predictions from SVMs

Undifferenced Model with Unrestricted Dummies
C | General ε = 0.05 | General ε = 0.25 | General ε = 0.5 | Parsimonious ε = 0.05 | Parsimonious ε = 0.25 | Parsimonious ε = 0.5
0.25 | 53.2% | 53.8% | 48.5% | 48.2% | 46.2% | 44.3%
0.5 | 56.8% | 59.3% | 51.0% | 50.1% | 49.3% | 46.8%
1 | 57.9% | 61.3% | 53.2% | 50.4% | 51.8% | 48.8%
2 | 58.2% | 62.4% | 56.3% | 51.5% | 52.1% | 47.6%
4 | 59.9% | 61.6% | 53.2% | 51.3% | 50.7% | 49.0%
8 | 58.5% | 61.3% | 53.2% | 51.5% | 50.7% | 48.8%
12 | 58.2% | 61.6% | 52.6% | 51.5% | 50.7% | 48.2%

Differenced Model with Unrestricted Dummies
C | General ε = 0.05 | General ε = 0.25 | General ε = 0.5 | Parsimonious ε = 0.05 | Parsimonious ε = 0.25 | Parsimonious ε = 0.5
0.25 | 52.4% | 54.3% | 47.4% | 46.5% | 47.1% | 44.3%
0.5 | 56.3% | 59.3% | 51.5% | 46.8% | 46.8% | 45.1%
1 | 57.4% | 60.2% | 52.4% | 46.8% | 47.4% | 45.4%
2 | 57.7% | 61.8% | 54.3% | 47.4% | 47.4% | 44.6%
4 | 58.8% | 60.7% | 53.8% | 46.5% | 47.1% | 44.3%
8 | 59.1% | 61.6% | 53.8% | 46.5% | 46.0% | 44.0%
12 | 58.5% | 59.6% | 54.9% | 46.5% | 46.0% | 43.2%

Undifferenced Model with No Country Dummies
C | General ε = 0.05 | General ε = 0.25 | General ε = 0.5 | Parsimonious ε = 0.05 | Parsimonious ε = 0.25 | Parsimonious ε = 0.5
0.25 | 37.6% | 37.9% | 34.8% | 29.5% | 29.5% | 29.0%
0.5 | 38.7% | 38.4% | 35.4% | 30.9% | 31.8% | 29.5%
1 | 38.12% | 38.4% | 36.5% | 30.9% | 30.9% | 30.6%
2 | 39.3% | 38.2% | 36.8% | 29.2% | 29.5% | 30.1%
4 | 38.4% | 38.4% | 38.2% | 30.1% | 29.5% | 29.5%
8 | 38.7% | 39.0% | 38.4% | 29.0% | 29.5% | 29.8%
12 | 38.4% | 38.4% | 38.2% | 29.0% | 29.2% | 29.2%

Undifferenced Model with Single Country Index
C | General ε = 0.05 | General ε = 0.25 | General ε = 0.5 | Parsimonious ε = 0.05 | Parsimonious ε = 0.25 | Parsimonious ε = 0.5
0.25 | 52.9% | 52.6% | 49.6% | 51.8% | 51.3% | 50.1%
0.5 | 53.2% | 52.4% | 49.3% | 51.5% | 51.5% | 49.6%
1 | 52.9% | 52.9% | 48.7% | 52.4% | 52.1% | 49.0%
2 | 52.6% | 52.9% | 50.1% | 52.4% | 51.5% | 49.6%
4 | 52.9% | 52.9% | 51.8% | 52.1% | 51.5% | 50.7%
8 | 52.9% | 52.1% | 50.7% | 52.6% | 50.7% | 51.3%
12 | 52.6% | 52.1% | 51.3% | 52.4% | 50.4% | 51.8%

Table 5 notes. The statistics are the percentages of correct predictions obtained from SVMs.
All sections except for the second section are based upon SVMs using undifferenced variables. The section headed Undifferenced Model with Unrestricted Dummies refers to SVMs that allow all country dummies to be estimated unrestrictedly; the regressions from which the predictions are made are reported in Table 3 under the heading Undifferenced specifications. The section headed Undifferenced Model with No Country Dummies reports predictive accuracy using SVMs that exclude all country terms. The section headed Undifferenced Model with Single Country Index provides results based upon SVMs including the single country index constructed from the ordered choice regressions (the weights are reported in Table 1) and no individual country dummy variables. The section headed Differenced Model with Unrestricted Dummies refers to SVMs that allow all country dummies to be estimated unrestrictedly and include differenced variables; the regressions from which the predictions are made are reported in Table 3 under the heading Differenced specifications.

Table 6: Undifferenced General SVM Model’s Country Index Weights

Country | Weight | Country | Weight
USA | 2.283 | KENYA | 0.000
ANDORRA | 2.000 | LATVIA | 0.000
UNITED KINGDOM | 2.000 | MACEDONIA (FYROM) | 0.000
SWITZERLAND | 1.705 | MALTA | 0.000
BERMUDA | 1.684 | NETHERLANDS | 0.000
CANADA | 1.281 | NORWAY | 0.000
KOREA REP. OF | 1.066 | PHILIPPINES | 0.000
CZECH REPUBLIC | 1.065 | SAN MARINO | 0.000
SAUDI ARABIA | 1.050 | SERBIA | 0.000
FRANCE | 1.019 | SOUTH AFRICA | 0.000
SLOVENIA | 0.929 | SPAIN | 0.000
PANAMA | 0.644 | SWEDEN | 0.000
UNITED ARAB EMIRATES | 0.616 | TAIWAN | 0.000
KUWAIT | 0.612 | TUNISIA | 0.000
MACAU | 0.608 | HUNGARY | –0.009
OMAN | 0.519 | ALBANIA | –0.015
LITHUANIA | 0.452 | MALAYSIA | –0.026
AUSTRIA | 0.435 | POLAND | –0.093
CYPRUS | 0.412 | INDONESIA | –0.133
MEXICO | 0.401 | ROMANIA | –0.170
SLOVAKIA | 0.382 | MOROCCO | –0.179
BAHRAIN | 0.337 | ISRAEL | –0.185
QATAR | 0.307 | THAILAND | –0.251
EL SALVADOR | 0.297 | NIGER | –0.266
TRINIDAD AND TOBAGO | 0.297 | PERU | –0.320
BRAZIL | 0.249 | TURKEY | –0.396
JAPAN | 0.246 | AZERBAIJAN | –0.463
GERMANY | 0.228 | KAZAKHSTAN | –0.473
JORDAN | 0.108 | RUSSIAN FEDERATION | –0.539
GEORGIA REP. OF | 0.034 | INDIA | –0.550
MONGOLIA | 0.003 | COLOMBIA | –0.619
ARMENIA | 0.000 | UKRAINE | –0.763
AUSTRALIA | 0.000 | JAMAICA | –0.869
BENIN | 0.000 | VIETNAM | –0.969
BOSNIA-HERZEGOVINA | 0.000 | VENEZUELA | –0.984
BULGARIA | 0.000 | NIGERIA | –1.072
CHILE | 0.000 | ARGENTINA | –1.083
COSTA RICA | 0.000 | LEBANON | –1.134
EGYPT | 0.000 | BELARUS | –1.205
ESTONIA | 0.000 | PAKISTAN | –1.416
GREECE | 0.000 | BANGLADESH | –1.452
HONG KONG | 0.000 | SRI LANKA | –1.611
ICELAND | 0.000 | DOMINICAN REPUBLIC | –1.696
IRELAND | 0.000 | IRAN | –2.141
ITALY | 0.000 | CHINA-PEOPLE'S REP. | –2.188

Table 6 notes. The estimated weights for the individual country dummy variables are obtained from the SVM model where the variables are not differenced and the model includes all possible variables (general model). The other estimation results for this model are reported in Table 3.
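The comparison between the two sets of country weights (Table 1 and Table 6) rests on a simple correlation and a Spearman rank correlation. The sketch below shows how both can be computed, using only eight countries whose weights are transcribed from the two tables. It is an illustration under that assumption, with hypothetical function names: because it uses a small subset, its output is not expected to reproduce the coefficients reported in the paper for the full set of countries.

```python
import math

# (Table 1 weight, Table 6 weight) for a small subset of countries.
WEIGHTS = {
    "USA": (6.82, 2.283),
    "UNITED KINGDOM": (5.64, 2.000),
    "SWITZERLAND": (6.75, 1.705),
    "FRANCE": (5.71, 1.019),
    "JAPAN": (4.86, 0.246),
    "TURKEY": (3.76, -0.396),
    "INDIA": (3.54, -0.550),
    "UKRAINE": (2.21, -0.763),
}

def pearson(xs, ys):
    """Simple (Pearson) correlation coefficient."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def average_ranks(values):
    """1-based ranks, with tied values sharing the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

original = [w[0] for w in WEIGHTS.values()]
svm_based = [w[1] for w in WEIGHTS.values()]
r = pearson(original, svm_based)
rho = spearman(original, svm_based)
print("simple correlation:", round(r, 3))
print("Spearman rank correlation:", round(rho, 3))
```

Tie handling matters for the full tables, where many countries share a weight (for example the zero weights in Table 6), which is why the ranks are averaged rather than assigned in arbitrary order.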