Credit scoring using the clustered support vector machine



Terry Harris ⇑
Credit Research Unit, Department of Management Studies, The University of the West Indies, Cave Hill Campus, P.O. Box 64, Barbados

Article info

Article history:
Available online 6 September 2014

Keywords:
Credit risk
Credit scoring
Clustered support vector machine
Support vector machine

Abstract

This work investigates the practice of credit scoring and introduces the use of the clustered support
vector machine (CSVM) for credit scorecard development. This recently designed algorithm addresses some
of the limitations, noted in the literature, of traditional nonlinear support vector machine (SVM) based
methods for classification. Specifically, it is well known that as historical credit scoring datasets
grow large, these nonlinear approaches, while highly accurate, become computationally expensive.
Accordingly, this study compares the CSVM with other nonlinear SVM based techniques and shows that
the CSVM can achieve comparable levels of classification performance while remaining relatively
cheap computationally.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

In recent years, credit risk assessment has attracted significant
attention from managers at financial institutions around the world.
This increased interest has been caused in no small part by the
weaknesses of existing risk management techniques that were
revealed by the recent financial crisis, and by the growing
demand for consumer credit (Wang, Yan, & Zhang, 2011). In response
to these concerns, credit scoring has become increasingly important
over the past decades as financial institutions move away from
traditional manual approaches to this more advanced method,
which entails the building of complex statistical models (Huang,
Chen, & Wang, 2007; Zhou, Lai, & Yu, 2010).

Many of the statistical methods used to build credit scorecards
are based on traditional classification techniques such as logistic
regression or discriminant analysis. However, in recent times
non-linear approaches,¹ such as the kernel support vector machine,
have been applied to credit scoring. These methods have helped to
increase the accuracy and reliability of many credit scorecards
(Bellotti & Crook, 2009; Yu, 2008). Nevertheless, despite these
advances, credit analysts at financial institutions are pressed to
continually pursue improvements in classifier performance in an
attempt to mitigate the credit risk faced by their institutions.
However, many of these improvements remain unreported because
industry-led credit scoring research, which seeks more efficient
and effective algorithms, is proprietary.

In the wider research community, the recent vintages of
non-linear classifiers (e.g. the kernel support vector machine) have
received a lot of attention and have been critiqued for, inter alia,
their large time complexities. In fact, the best-known time
complexity for training a kernel based support vector machine is
still quadratic (Bordes, Ertekin, Weston, & Bottou, 2005). As a
result, training these classifiers on reasonably sized real world
credit scoring datasets consumes substantial computational
resources. Accordingly, efforts to develop and apply new
classifiers to credit scoring, which are capable of separating
nonlinear data while remaining relatively inexpensive
computationally, are well placed.
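To give a rough sense of what quadratic growth means in practice, the following sketch counts the pairwise kernel evaluations K(x_i, x_j) in a dense kernel (Gram) matrix and the memory needed to hold it. It is an illustration only: the sample sizes are hypothetical rather than datasets from this study, the function names are introduced here for the example, and practical solvers typically cache only part of the matrix rather than storing it in full.

```python
# Illustrative arithmetic only: the sample sizes below are hypothetical.
BYTES_PER_FLOAT64 = 8


def gram_matrix_entries(n_samples: int) -> int:
    """Number of pairwise kernel values K(x_i, x_j) for n samples."""
    return n_samples * n_samples


def gram_matrix_gib(n_samples: int) -> float:
    """Memory needed to hold the dense kernel matrix as float64, in GiB."""
    return gram_matrix_entries(n_samples) * BYTES_PER_FLOAT64 / 2**30


for n in (1_000, 10_000, 100_000):
    print(f"n={n:>7,}: {gram_matrix_entries(n):>15,} entries, "
          f"{gram_matrix_gib(n):8.2f} GiB")
```

Doubling the number of historical clients quadruples the number of kernel values, which is why methods that avoid the full kernel computation become attractive as datasets grow.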

This paper investigates the suitability for credit scoring of a
recently developed support vector machine based algorithm that
has been proposed by Gu and Han (2013). Their clustered support
vector machine has been shown to offer comparable performance
to kernel based approaches while remaining cheap in terms of
computational time. Furthermore, this study makes some novel
adjustments to their implementation and explores the use of
radial basis function (RBF) kernels in addition to the linear kernel
posited by Gu and Han.

The remainder of this paper is organized as follows. Section 2
briefly reviews the literature concerning the field of
credit scoring and sets the stage for the proposed CSVM model
for credit scoring that is presented in Section 3. The details of the
historic clients’ loan dataset and modeling method are highlighted
in Section 4. Section 5 presents the study results, and Section 6
discusses the findings, presents conclusions, and outlines possible
directions for future research.

http://dx.doi.org/10.1016/j.eswa.2014.08.029
0957-4174/© 2014 Elsevier Ltd. All rights reserved.

⇑ Tel.: +1 (246) 417 4302; fax: +1 (246) 438 9167.
E-mail address: terry.harris@cavehill.uwi.edu

¹ These approaches have been applied because credit-scoring data are often not linearly separable.

Expert Systems with Applications 42 (2015) 741–750


Journal homepage: www.elsevier.com/locate/eswa



2. Background and related works

2.1. Overview

Credit scoring has been critical in permitting the exceptional
growth in consumer credit over recent decades. Indeed, without
accurate, automated credit risk assessment tools, lenders could
not have expanded their balance sheets effectively over this time.
This section presents a brief review of the relevant literature that
has emerged in this space.

2.2. What is credit scoring?

Credit scoring can be viewed as a method of measuring the risk
attached to a potential customer, by analyzing their data to
determine the likelihood that the prospective borrower will default on a
loan (Abdou & Pointon, 2011). According to Eisenbeis (1978), Hand
and Jacka (1998), and Hand, Sohn, and Kim (2005), credit scoring
can also be described as the statistical technique employed to
convert data into rules that can be used to guide credit granting
decisions. As a result, it represents a critical process in a firm’s credit
management toolkit. Durand (1941) posited that the procedure
includes collecting, analyzing and classifying different credit
elements and variables in order to make credit granting decisions.
He noted that the objective of the credit evaluation process,
namely classifying a firm’s customers, is to reduce the current and
expected risk of a customer being “bad” for credit. Thus credit
scoring is an important technology for banks and other financial
institutions as they seek to minimize risk.

2.3. Related works

Over the years, the demand for consumer credit has increased
markedly. According to Steenackers and Goovaerts (1989), this
increase in the demand for credit is attributable to
increased levels of consumption and the reliance on credit to
support this activity. In the United States, this rising level of
consumerism followed the introduction of the first modern credit card in
the 1950s, so that by the 1980s over 55% of American households
owned a credit card. Crook, Edelman, and Thomas (2007) posited
that by this time, in the US, the total amount of outstanding
consumer credit was over $700 billion. By comparison, at the end of
June 2013 this figure had risen to a staggering $2800 billion,
roughly a fourfold increase (BGFRS, 2013).

Henley (1994) noted that the increasing demand for consumer
credit has led to the development of many practical scoring
models, which have adopted a wide range of statistical and
non-linear methods. Similarly, Mays (2001) posited that a variety of
techniques have been used to build credit scoring applications
by credit analysts, researchers, and software developers. These
techniques have included discriminant analysis, linear regression,
logistic regression, decision trees, neural networks, support vector
machines, and k-means, among others.

In recent times, the use of more complex non-linear techniques,
such as neural networks and support vector machines, to build
credit scoring applications has produced significant increases in the
reported accuracy and performance on benchmarking datasets
(Baesens et al., 2003). Irwin, Warwick, and Hunt (1995) and
Paliwal and Kumar (2009) both provide evidence that advanced
statistical techniques yield superior performance when compared
to traditional statistical techniques, such as discriminant analysis,
probit analysis and logistic regression. Masters (1995) also
provided evidence that the use of sophisticated techniques, such
as neural networks, was essential because they had the capability
to more accurately model credit scoring data that exhibits
interactions and curvature. However, as pointed out by Hand
(2006), the increased performance of these more advanced
techniques could be illusory and, if real, diminished by shifts
in the class distribution over time. The following sub-sections
present a brief discussion of some of the classical and
advanced statistical models used for credit risk assessment.

2.4. Discriminant analysis

In his seminal paper, Fisher (1936) proposed the use of
discriminant analysis to differentiate between two or more classes in a
dataset. Since that time, Durand (1941) and Altman (1986) have
both applied Fisher’s (1936) discriminant analysis to credit scoring.
Durand used discriminant analysis to assess the creditworthiness
of car loan applicants, while Altman used it to explore corporate
bankruptcy, proposing his popular Z-scores (Altman, 1968).
Works published separately by Desai, Crook, and Overstreet
(1996), Hand and Henley (1997), Hand, Oliver, and Lunn (1998),
Sarlija, Bensic, and Bohacek (2004), and Abdou and Pointon
(2009) showed that discriminant analysis is indeed a valid
technique for credit scoring. Hand and Henley (1997) noted that
discriminant analysis, a parametric statistical technique, was well
suited to credit scoring because it was designed to classify groups
and variables into two or more categories or to discriminate between
two groups. However, Saunders and Allen (1998) noted that with
this type of method certain assumptions about the data must be
met. These assumptions include normality, linearity,
homoscedasticity, and non-multicollinearity, among others. Falbo (1991) and
Sarlija et al. (2004) posited that, despite these limitations, this
technique has been frequently applied over the years to build credit
scoring applications, and it remains one of the most popular approaches
taken today when classifying customers as creditworthy or
uncreditworthy.
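As a minimal sketch of the idea, the following code fits a two-class Fisher linear discriminant on a handful of made-up applicants and classifies a new applicant by projecting onto the discriminant direction. The data, the feature names, and the midpoint cut-off (which assumes equal priors and equal misclassification costs) are illustrative assumptions; they are not taken from Durand, Altman, or any dataset used in this study.

```python
# Hypothetical toy data: (income in $1000s, debt-to-income ratio in %).
good = [(60.0, 10.0), (75.0, 15.0), (80.0, 12.0), (65.0, 18.0)]
bad = [(30.0, 40.0), (25.0, 35.0), (40.0, 45.0), (35.0, 38.0)]


def mean(points):
    """Component-wise mean of a list of 2-D points."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter(points, mu):
    """Within-class scatter of 2-D points, returned as (sxx, sxy, syy)."""
    sxx = sum((p[0] - mu[0]) ** 2 for p in points)
    sxy = sum((p[0] - mu[0]) * (p[1] - mu[1]) for p in points)
    syy = sum((p[1] - mu[1]) ** 2 for p in points)
    return sxx, sxy, syy


mu_good, mu_bad = mean(good), mean(bad)
sg, sb = scatter(good, mu_good), scatter(bad, mu_bad)
sxx, sxy, syy = sg[0] + sb[0], sg[1] + sb[1], sg[2] + sb[2]

# Fisher direction w = S_w^{-1} (mu_good - mu_bad), via the 2x2 inverse.
det = sxx * syy - sxy * sxy
dx, dy = mu_good[0] - mu_bad[0], mu_good[1] - mu_bad[1]
w = ((syy * dx - sxy * dy) / det, (-sxy * dx + sxx * dy) / det)


def project(p):
    """Discriminant score: projection of a point onto w."""
    return w[0] * p[0] + w[1] * p[1]


# Cut-off at the midpoint of the projected class means (assumed rule).
cutoff = (project(mu_good) + project(mu_bad)) / 2


def classify(p):
    return "good" if project(p) > cutoff else "bad"


print(classify((70.0, 14.0)), classify((30.0, 42.0)))
```

The pooled scatter matrix in this sketch reflects the equal-covariance (homoscedasticity) assumption noted above: both classes are assumed to share one covariance structure, which is what makes the resulting decision boundary linear.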

Several authors have criticized the use of discriminant analysis
in credit scoring. Eisenbeis (1978) pointed out a number of
statistical problems in applying discriminant analysis to credit scorecard
development. These problems include group definition,
classification error prediction, the estimation of population priors,
and the use of linear functions instead of quadratic functions, to
mention a few. Nevertheless, Greene (1998) and Abdou (2009)
noted that despite these limits, discriminant analysis is one of
the most commonly used techniques in credit scoring.

2.4.1. Linear regression
Another popular classical statistical technique applied to credit
scoring is linear regression. This method has developed into an
essential component of data analysis in general and is concerned
with describing the relationship between a dependent variable
and one or more independent variables. Thus, customers’ historical
payments, guarantees, default rates and other factors can be
analyzed using linear regression to set up a score for each factor,
and the resulting score can be compared with the bank’s cut-off
(threshold) score. Hence, only if a new customer’s score exceeds
the bank’s cut-off score will credit be granted (Hand & Jacka, 1998).
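The cut-off rule can be sketched in a few lines of code for a linear score of the form Z = θ0 + θ1 x1 + ... + θn xn. The weights, feature names, and cut-off value below are hypothetical and chosen only for illustration; in practice the weights would be estimated from the historical clients’ data.

```python
# Hypothetical weights (theta) and cut-off; none of these values come
# from the paper or from any fitted model.
THETA_0 = 200.0  # intercept
THETA = {
    "years_at_address": 5.0,
    "on_time_payment_rate": 300.0,
    "has_guarantor": 40.0,
}
CUTOFF = 450.0


def credit_score(applicant: dict) -> float:
    """Z = theta_0 + sum over features of theta_i * x_i."""
    return THETA_0 + sum(THETA[name] * applicant[name] for name in THETA)


def grant_credit(applicant: dict) -> bool:
    """Credit is granted only if the score exceeds the cut-off."""
    return credit_score(applicant) > CUTOFF


strong = {"years_at_address": 4, "on_time_payment_rate": 0.95, "has_guarantor": 1}
weak = {"years_at_address": 1, "on_time_payment_rate": 0.50, "has_guarantor": 0}

for label, applicant in (("strong", strong), ("weak", weak)):
    print(label, round(credit_score(applicant), 2), grant_credit(applicant))
```

Keeping the weights in a named mapping makes each factor’s contribution to the score easy to inspect, which is one reason linear scorecards remain simple to explain to credit officers.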

In its basic form, linear regression used for credit scoring
requires the establishment of a threshold score. This threshold
credit score is derived from the relationships between the features
of the firm’s historic clients and their associated weights, as can be
seen in the linear equation $Z = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n$, where
the variable $n$ denotes the number of features collected from past
and potential clients. These features are represented by the $x$’s,
which are multi-dimensional vectors in $\mathbb{R}^m$, where $m$ denotes the
number of clients in the historical clients’ database. The $\theta$’s
represent the weights, and the feature variables and their weights are used
to calculate a credit score, $Z \in \mathbb{R}$, thus when an applicant scores


