doi:10.1016/j.eswa.2005.01.003


Building credit scoring models using genetic programming

Chorng-Shyong Ong a, Jih-Jeng Huang a, Gwo-Hshiung Tzeng b,c,*

a Department of Information Management, National Taiwan University, Taipei, Taiwan
b Institute of Management of Technology, National Chiao Tung University, Ta-Hsuch Rd, Hsunchu 300, Hsinchu 1001, Taiwan
c College of Management, Kainan University, Taoyuan, Taiwan

Abstract

Credit scoring models have been widely studied in the areas of statistics, machine learning, and artificial intelligence (AI). Many novel approaches such as artificial neural networks (ANNs), rough sets, and decision trees have been proposed to increase the accuracy of credit scoring models. Since an improvement in accuracy of a fraction of a percent might translate into significant savings, a more sophisticated model is needed to improve the accuracy of credit scoring. In this paper, genetic programming (GP) is used to build credit scoring models. Two numerical examples are employed to compare its error rate with that of other credit scoring models, including the ANN, decision trees, rough sets, and logistic regression. On the basis of the results, we conclude that GP can provide better performance than the other models.

© 2005 Elsevier Ltd. All rights reserved.

Keywords: Credit scoring; Artificial neural network (ANN); Decision trees; Genetic programming (GP); Rough sets
1. Introduction

Credit scoring models have been widely used by financial institutions to determine whether loan customers belong to a good applicant group or a bad applicant group. The advantages of using credit scoring models include reducing the cost of credit analysis, enabling faster credit decisions, ensuring credit collections, and diminishing possible risk (Lee, Chiu, Lu, & Chen, 2002; West, 2000). Since an improvement in accuracy of a fraction of a percent might translate into significant savings (West, 2000), this paper proposes a more sophisticated model to improve the accuracy of credit scoring.

In order to obtain a satisfactory credit scoring model, numerous methods have been proposed. Roughly, these methods can be classified into parametric statistical methods (e.g. discriminant analysis and logistic regression), non-parametric statistical methods (e.g. k nearest neighbor and decision trees), and soft-computing approaches (e.g. artificial neural network (ANN) and rough sets). Recently, ANNs have become the most popular tool for credit scoring, and their accuracy has been reported to be superior to that of traditional statistical methods in dealing with credit scoring problems, especially with regard to non-linear patterns (Desai, Crook, & Overstreet, 1996, 1997; Mahlhotra & Malhotra, 2003; Jensen, 1992; Piramuthu, 1999). On the other hand, ANNs have been criticized for their poor performance in the presence of irrelevant attributes or small data sets (Castillo, Marshall, Green, & Kordon, 2003; Feraud & Cleror, 2002; Nath, Rajagopalan, & Ryker, 1997).

* Corresponding author. Address: Institute of Management of Technology, National Chiao Tung University, Ta-Hsuch Rd, Hsunchu 300, Hsinchu 1001, Taiwan. Tel.: +886 3571212157505; fax: 886 35753926. E-mail address: ghtzeng@cc.nctu.edu.tw (G.-H. Tzeng).

In order to build an effective discriminant function, two

issues should be considered. First, the relationships among

attributes and classes may be linear or non-linear. Second,

the irrelevant attributes should be removed in order to

increase the accuracy of the classification model. In this

paper, GP is employed to automatically and heuristically

determine the adequate discriminant functions and the valid

attributes simultaneously. In addition, unlike ANNs which

are only suited for large data sets, GP can perform well even

in small data sets (Nath et al., 1997).

In order to efficiently obtain the discriminant function, the data set is preprocessed by discretization. Two real-world cases are used below to compare the accuracy of GP with that of other classification models, including the logistic regression model, ANN, decision trees and rough sets. On the basis of the results, we conclude that GP can provide better performance than the other models.

Expert Systems with Applications 29 (2005) 41–47
www.elsevier.com/locate/eswa

The rest of this paper is organized as follows. Section 2

describes the models for credit scoring. Discretization and

genetic programming are proposed in Section 3. Two real-

world examples are used to demonstrate the proposed

method in Section 4. Discussions are presented in Section 5

and conclusions are in Section 6.
2. Credit scoring models

In this section, we describe three popular models used in building credit scoring models. The first is logistic regression, which is widely used for classification problems in statistics. The second is the ANN, which is known for its excellent ability to learn non-linear relationships in a system. The third is rough sets, a kind of induction-based algorithm that has been widely used in classification problems since the 1990s.
2.1. Logistic regression

The logistic regression model is one of the most popular statistical tools for classification problems. Unlike other statistical tools (e.g. discriminant analysis or ordinary linear regression), it can suit various kinds of distribution functions such as Gumbel, Poisson, normal, etc. (Press & Wilson, 1978) and is more suitable for credit scoring problems. In addition, in order to increase its accuracy and flexibility, several methods have been proposed to extend the traditional binary logistic regression model, including the multinomial logistic regression model (Agresti, 1990; Aldrich & Nelson, 1984; DeMaris, 1992; Knoke & Burke, 1980; Liao, 1994) and the logistic regression model for ordered categories (McCullagh, 1980). The generalized logistic regression model is therefore the general form of the binary and multinomial logistic regression models.

Let $x' = (x_1, x_2, \ldots, x_p)$ be a vector of p explanatory variables and let Y be the response variable with categories $1, 2, \ldots, r$. Then the multinomial logistic regression model is given by the equation

$$ \operatorname{logistic}(p) = \ln\left[\frac{P(Y = j \mid x)}{P(Y = k \mid x)}\right] = x'\beta_j, \quad 0 \le j \le r,\ j \ne k \tag{1} $$

where $\beta_j$ is a $(p+1)$ vector of the regression coefficients for the jth response category.

Let the last response level be the reference level. The response probabilities $p_1, p_2, \ldots, p_r$ can then be calculated by the equations

$$ p_r \equiv P(Y = r \mid x) = \frac{e^{x'\beta_r}}{\sum_{l=1}^{r} e^{x'\beta_l}} = \frac{e^{x'\beta_r}}{e^{x'\beta_r} + \sum_{l=1}^{r-1} e^{x'\beta_l}} = \frac{1}{1 + \sum_{l=1}^{r-1} e^{x'\beta_l}} \tag{2} $$

$$ p_j \equiv P(Y = j \mid x) = p_r\, e^{x'\beta_j}, \quad 1 \le j \le r-1 \tag{3} $$

where l is a response level (the last equality in Eq. (2) takes $\beta_r = 0$ for the reference level), and

$$ \ell = \ell(\beta_j;\ 1 \le j \le r,\ j \ne k) = \sum_{i=1}^{n} \ln\big(P(Y = y_i \mid x_i)\big), \quad l \in [1, 2, \ldots, r] \tag{4} $$

is the log-likelihood for the multinomial logistic regression model and $\{(y_i, x_i),\ 1 \le i \le n\}$ denotes the sample of n objects. When the number of categories is equal to two, the multinomial logistic regression model reduces to a binary logistic regression model.
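As an illustration of Eqs. (2) and (3), the following minimal Python sketch computes the response probabilities for one observation. It is not taken from the paper: the function name, the coefficient values, and the convention that `x` carries a leading 1.0 for the intercept are assumptions made for the example, and the reference level r is assumed to have a zero coefficient vector.

```python
import math

def response_probabilities(x, betas):
    """Return [p_1, ..., p_r] for one observation.

    x     : explanatory values with a leading 1.0 for the intercept
            (an assumption of this sketch)
    betas : coefficient vectors beta_1 .. beta_{r-1}; the reference
            level r is assumed to have beta_r = 0
    """
    # Linear scores x'beta_l for the non-reference levels.
    scores = [sum(xi * bi for xi, bi in zip(x, beta)) for beta in betas]
    exp_scores = [math.exp(s) for s in scores]
    p_ref = 1.0 / (1.0 + sum(exp_scores))      # Eq. (2)
    probs = [p_ref * e for e in exp_scores]    # Eq. (3)
    return probs + [p_ref]

# Hypothetical observation and coefficients for r = 3 response levels.
x = [1.0, 0.5, -1.2]
betas = [[0.2, 1.0, -0.5], [-0.3, 0.4, 0.8]]
probs = response_probabilities(x, betas)
```

With a single coefficient vector (r = 2) the same function gives the binary logistic model, which matches the reduction noted above.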

Although the logistic regression model performs well in many applications, its accuracy decreases when the relationships in the system are non-linear, and ANNs have been proposed to deal with this problem.
2.2. Artificial neural network

Artificial neural networks were developed to mimic the neurophysiology of the human brain and serve as a type of flexible non-linear regression, discriminant, and clustering model. The architecture of an ANN can usually be represented as a three-layer system consisting of input, hidden, and output layers. The input layer first passes the input features to the hidden layer. The hidden layer then calculates the adequate weights by using a transfer function such as the hyperbolic tangent, softmax, or logistic function before sending the result to the output layer.

Combining many computing neurons into a highly interconnected system, we can detect the complex non-linear relationships in the data. The simple three-layer perceptron, which is the one most commonly used in credit scoring problems, is depicted in Fig. 1.
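The forward pass of such a three-layer network can be sketched as follows. This is only an illustrative sketch, not the network used in the paper: the weights are hypothetical, a hyperbolic tangent is assumed for the hidden layer, and a logistic function is assumed for the single output unit.

```python
import math

def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass through an input-hidden-output network."""
    # Hidden layer: weighted sum of the inputs followed by tanh.
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    # Output layer: weighted sum of hidden activations, then logistic.
    z = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical network: 3 inputs, 2 hidden neurons, 1 output.
w_hidden = [[0.5, -0.2, 0.1], [-0.3, 0.8, 0.4]]
b_hidden = [0.0, 0.1]
w_out = [1.2, -0.7]
b_out = 0.05
score = forward([1.0, 0.0, 2.0], w_hidden, b_hidden, w_out, b_out)
```

The output lies strictly between 0 and 1 and could be thresholded (for example at 0.5) to assign an applicant to the good or bad group; the threshold is likewise an assumption of this sketch.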

Recently, ANNs have been widely used in credit scoring problems, and their accuracy has been reported to be superior to that of traditional statistical methods such as discriminant analysis and logistic regression (Desai et al., 1996, 1997; Jensen, 1992; Mahlhotra & Malhotra, 2003; Piramuthu, 1999). However, as mentioned previously, ANNs have been criticized for their poor performance in the presence of irrelevant attributes or small data sets. Although many methods have been proposed to deal with the problem of variable selection (Feraud & Cleror, 2002; Nath et al., 1997), they are time-consuming and make the model more complicated. In addition, other scholars have criticized the long training process required to design the optimal network topology in credit scoring problems (Chung & Gray, 1999; Craven & Shavlik, 1997).

Fig. 1. Three-layer neural network.
2.3. Rough sets

Rough sets, originally proposed by Pawlak (1982), is a mathematical tool used to deal with vagueness or uncertainty. Compared to fuzzy sets, rough set theory has some advantages (Pawlak, Grzymala-Busse, Slowinski, & Ziarko, 1995). One main advantage is that rough sets do not need any pre-assumptions or preliminary information about the data, such as the grade of membership function in fuzzy sets (Grzymala-Busse, 1988). Recently, rough set theory and fuzzy set theory have been used to complement or incorporate each other (Chakrabarty, Biswas, & Nanda, 2000; Mordeson, 2001; Radzikowska & Kerre, 2002) rather than to compete (Dubois & Prade, 1991). A more detailed discussion of the process of rough set theory can be found in Walczak and Massart (1999).

The original concept of approximation space in rough sets can be described as follows. Given an approximation space

$$ apr = (U, A) $$

where U is the universe, a finite and non-empty set, and A is the set of attributes, we can define the lower and upper approximations of a set.

Let X be a subset of U. The lower approximation of X in A is

$$ \underline{apr}(A) = \{ x \mid x \in U,\ U/Ind(A) \subset X \} \tag{5} $$

The upper approximation of X in A is

$$ \overline{apr}(A) = \{ x \mid x \in U,\ U/Ind(A) \cap X \neq \emptyset \} \tag{6} $$

where

$$ U/Ind(A) = \{ (x_i, x_j) \in U \times U,\ f(x_i, a) = f(x_j, a)\ \forall a \in A \} \tag{7} $$

Eq. (5) represents the greatest composed set in A contained in X, called the best lower approximation of X in A, and Eq. (6) represents the least composed set in A containing X, called the best upper approximation.

After constructing the upper and lower approximations, the boundary can be represented as

$$ BN(A) = \overline{apr}(A) - \underline{apr}(A) \tag{8} $$

According to the approximation space, we can calculate reducts and decision rules. Given an information system $I = (U, A)$, the reduct, $RED(B)$, is a minimal set of attributes $B \subseteq A$ such that $\rho_B(U) = \rho_A(U)$, where

$$ \rho_B(U) = \frac{\sum card(\underline{B}X_i)}{card(U)} \tag{9} $$

denotes the quality of approximation of U by B.
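The approximations in Eqs. (5), (6) and (8) and the quality of approximation in Eq. (9) can be illustrated with a small Python sketch. The information table, attribute names and object labels below are hypothetical and serve only as an example; objects are grouped into indiscernibility classes by their attribute values.

```python
from collections import defaultdict

def equivalence_classes(table, attributes):
    """Group objects that share the same values on all given attributes."""
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attributes)].add(obj)
    return list(classes.values())

def approximations(table, attributes, X):
    """Return the (lower, upper) approximations of the object set X."""
    lower, upper = set(), set()
    for block in equivalence_classes(table, attributes):
        if block <= X:      # block entirely inside X, cf. Eq. (5)
            lower |= block
        if block & X:       # block overlaps X, cf. Eq. (6)
            upper |= block
    return lower, upper

def quality(table, attributes, decision_classes):
    """Quality of approximation, cf. Eq. (9)."""
    covered = sum(len(approximations(table, attributes, X)[0])
                  for X in decision_classes)
    return covered / len(table)

# Hypothetical information table with five applicants.
table = {
    "o1": {"income": "high", "history": "good"},
    "o2": {"income": "high", "history": "good"},
    "o3": {"income": "low", "history": "good"},
    "o4": {"income": "low", "history": "good"},
    "o5": {"income": "low", "history": "bad"},
}
good = {"o1", "o2", "o3"}
bad = set(table) - good
lower, upper = approximations(table, ["income", "history"], good)
boundary = upper - lower                                   # Eq. (8)
gamma = quality(table, ["income", "history"], [good, bad])
```

In this toy table, o3 and o4 are indiscernible but belong to different classes, so they fall into the boundary region and lower the quality of approximation.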

Once the reducts have been derived, overlaying the reducts on the information system can induce the decision rules. A decision rule can be expressed as $\phi \Rightarrow \theta$, where $\phi$ denotes the conjunction of elementary conditions, $\Rightarrow$ denotes 'indicates', and $\theta$ denotes the disjunction of elementary decisions.

The advantage of the induction-based approaches (e.g. rough sets and decision trees) is that they can provide intelligible rules for decision-makers (DMs). These intelligible rules can help DMs to understand the contents of data sets. Although these induction methods have been well developed and successfully used in credit scoring problems (Ahn, Cho, & Kim, 2000; Beynon & Peel, 2001; Dimitras, Slowinski, Susmaga, & Zopounidis, 1999), their main problem is their forecasting ability. If a newly entered object does not match any rule, it cannot be determined which class it belongs to. Next, in Section 3, we describe the concepts of GP, which is used here to build the credit scoring models.
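The forecasting limitation of rule-based classifiers can be made concrete with a short Python sketch. The rules and attribute names are hypothetical; each rule pairs a conjunction of elementary conditions with a decision, and an object that matches no rule is left unclassified.

```python
def classify(obj, rules):
    """Return the decision of the first matching rule, or None."""
    for conditions, decision in rules:
        # A rule fires only if every elementary condition holds.
        if all(obj.get(attr) == value for attr, value in conditions.items()):
            return decision
    return None  # no rule matches: the class cannot be determined

# Hypothetical decision rules induced from a training set.
rules = [
    ({"income": "high"}, "good"),
    ({"income": "low", "history": "bad"}, "bad"),
]
known = classify({"income": "high", "history": "bad"}, rules)
unknown = classify({"income": "low", "history": "good"}, rules)
```

Here the second applicant matches neither rule, so the classifier has no answer, which is the situation described above.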
3. Genetic programming

Genetic programming was proposed by Koza (1992) to automatically extract intelligible relationships in a system and has been used in many applications such as symbolic regression (Davidson, Savic, & Walters, 2003) and classification (Stefano, Cioppa, & Marcelli, 2002; Zhang & Bhattacharyya, 2004). The representation of GP can be viewed as a tree-based structure composed of a function set and a terminal set. The function set contains the operators, functions or statements available in GP, such as arithmetic operators ({+, −, ×, ÷}) or conditional statements (If…then…). The terminal set contains all inputs, constants and other zero-argument functions in the GP tree. For example, the expression xy + 3/x can be represented by the GP tree shown in Fig. 2.
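One simple way to hold such a tree in code is as nested tuples, with the function at the head of each tuple and terminals as leaves. The Python sketch below is an illustrative assumption about representation, not the implementation used in the paper; it encodes xy + 3/x and evaluates it for given inputs.

```python
import operator

# Function set of this sketch: the four arithmetic operators.
FUNCTIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def evaluate(node, env):
    """Recursively evaluate a GP tree against variable bindings."""
    if isinstance(node, tuple):          # internal node: (function, left, right)
        op, left, right = node
        return FUNCTIONS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(node, str):            # terminal: an input variable
        return env[node]
    return node                          # terminal: a constant

# The expression xy + 3/x as a tree.
tree = ("+", ("*", "x", "y"), ("/", 3, "x"))
value = evaluate(tree, {"x": 2, "y": 5})   # 2*5 + 3/2
```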

Fig. 2. The representation of a GP tree.

Table 1
The discretization of the continuous attributes in Australian data set (values 1–6)

Value  1           2               3               4               5               6
A2     [*, 17.75)  [17.75, 21.21)  [21.21, 22.38)  [22.38, 23.04)  [23.04, 23.34)  [23.34, 24.38)
A3     [*, 0.480)  [0.480, 1.793)  [1.793, 2.793)  [2.793, 4.020)  [4.020, 6.103)  [6.103, *)
A7     [*, 0.145)  [0.145, 1.020)  [1.020, 2.145)  [2.145, *)
A10    [*, 1)      [1, *)
A13    [*, 23)     [23, 93)        [93, 171)       [171, 262)      [262, *)
A14    [*, 13)     [13, *)

Once a population of GP trees has been initialized, the subsequent procedures are similar to those of genetic algorithms (GAs), including defining the fitness function, the genetic operators such as crossover, mutation and reproduction, and the termination criterion. Next, we introduce the three main operators, crossover, mutation and reproduction, to show the procedure of finding the (approximately) optimal generation.

In GP, the crossover operator swaps subtrees between the parents to produce the children according to a mating selection policy, rather than exchanging bit strings as in GAs. An example of a crossover in GP is shown in Fig. 3.

Fig. 3. The crossover operator of GP tree.

Similar to GAs, GP uses the mutation operator in order to avoid falling into a local optimum. The mutation operator randomly chooses a node in a subtree and replaces it with a newly created random subtree. Finally, a new generation can be reproduced from two parents using the reproduction operator to represent a better solution.
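The subtree crossover and mutation operators described above can be sketched in Python on nested-tuple trees. This is a minimal illustration under stated assumptions: trees are tuples of the form (function, child, child), nodes are chosen uniformly at random, and the replacement subtree for mutation is supplied by the caller rather than generated. None of these choices is taken from the paper.

```python
import random

def paths(tree, prefix=()):
    """Yield the index path to every node of a nested-tuple tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    """Return the subtree found by following the index path."""
    for i in path:
        tree = tree[i]
    return tree

def replace(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(a, b, rng):
    """Swap one randomly chosen subtree between two parents."""
    pa = rng.choice(list(paths(a)))
    pb = rng.choice(list(paths(b)))
    return replace(a, pa, get(b, pb)), replace(b, pb, get(a, pa))

def mutate(tree, new_subtree, rng):
    """Replace one randomly chosen node with a new subtree."""
    return replace(tree, rng.choice(list(paths(tree))), new_subtree)

rng = random.Random(0)   # fixed seed so the sketch is repeatable
parent_a = ("+", ("*", "x", "y"), ("/", 3, "x"))
parent_b = ("-", "x", ("*", 2, "y"))
child_a, child_b = crossover(parent_a, parent_b, rng)
mutant = mutate(parent_a, ("+", "y", 1), rng)
```

Because crossover only exchanges subtrees, the total number of nodes across the two children equals that of the two parents. A practical implementation would also enforce a depth limit on the offspring.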

In order to determine the adequate discriminant function, the fitness function of GP can be described as

$$ \mathit{ffi} = \frac{\sum_{i}^{N} \operatorname{abs}(o_i - e_i)}{N} \tag{10} $$

where $\operatorname{abs}(\cdot)$ denotes the absolute value operator, $o_i$ denotes the observed class and $e_i$ denotes the expected class.
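Eq. (10) is a mean absolute difference between two class vectors, which can be written directly in Python. The sketch assumes the classes are coded numerically (for example 0 and 1, a coding not stated in the paper), in which case the value equals the misclassification rate and lower is better.

```python
def fitness(observed, expected):
    """Mean absolute difference between class labels, cf. Eq. (10)."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum(abs(o - e) for o, e in zip(observed, expected)) / len(observed)

# Hypothetical labels for four applicants, coded 0/1.
error = fitness([1, 0, 1, 1], [1, 1, 1, 0])   # two of four disagree
```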

It should be highlighted that the function set and the terminal set should be varied enough to represent the relations among the independent and response variables. Moreover, in order to satisfy the principle of parsimony, the depth of the GP tree should also be limited.

In addition, in order to obtain the discriminant function efficiently and effectively, the continuous attributes should be discretized before GP is applied. Many discretization algorithms, such as the Boolean reasoning algorithm, the entropy algorithm and the naïve algorithm, have been proposed to deal with this problem (Shan, Hamilton, Ziarko, & Cercone, 1996; Wu, 1996). In this paper, the Boolean reasoning algorithm is employed to determine the adequate discrete values. Next, two empirical cases are used to compare the proposed method with the other models.
4. Empirical analysis

In this section, GP is compared with the multilayer perceptron (MLP), classification and regression tree (CART), C4.5, rough sets, and logistic regression (LR) using two real-world data sets. The first data set contains Australian credit scoring data with 307 examples of creditworthy customers and 383 examples of credit-unworthy customers. It has 14 attributes, of which six are continuous and eight are categorical. The second data set, called the German Credit Data Set, was provided by Prof. Hofmann in Hamburg. It includes customer credit scoring data with 20 features, such as age, gender, marital status, credit history records, job, account, loan purpose, other personal information, etc. There are 700 records judged to be creditworthy and 300 records judged to be credit-unworthy. Both data sets are publicly available from the UCI Repository of Machine Learning Databases and are widely used to compare the performance of various classification models.

Table 1 (continued, values 7–11)

Value  7               8               9               10              11
A2     [24.38, 27.92)  [27.92, 32.38)  [32.38, 37.38)  [37.38, 48.96)  [48.96, *)

Table 2
The parameter settings of GP

Parameter                     Value
Population size               40
Fitness function              Eq. (10)
Function set                  {+, −, ×, sin, cos, ≥, =, ≤, and, or, not, if}
Terminal set                  {Attributes, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14}
Maximum number of generation  1000
Selection                     Lexictour
Crossover rate                0.9
Mutation rate                 0.01

The first step of the proposed method is to discretize the continuous attributes. For the Australian data set, the results of the discretization are shown in Table 1. The discretization of the continuous attributes in the German data set is shown in Appendix A.
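Once the cut points are known, applying the discretization is a simple interval lookup. The Python sketch below uses the cut points listed for attribute A7 in Table 1; the helper name is hypothetical, and the sketch only applies given cut points rather than deriving them with the Boolean reasoning algorithm.

```python
from bisect import bisect_right

# Interior cut points of A7 in Table 1:
# [*, 0.145), [0.145, 1.020), [1.020, 2.145), [2.145, *)
A7_CUTS = [0.145, 1.020, 2.145]

def discretize(value, cuts):
    """Map a continuous value to its 1-based interval index.

    Intervals are half-open [low, high), so a value equal to a cut
    point falls into the interval that starts at that cut point.
    """
    return bisect_right(cuts, value) + 1

codes = [discretize(v, A7_CUTS) for v in (0.1, 0.145, 1.5, 3.0)]
```

The resulting codes for the four sample values are 1, 2, 3 and 4, one for each interval of A7.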

Next, we set the GP parameters for the Australian data set as shown in Table 2; the parameters for the German data set are shown in Appendix B. In order to make the discriminant function as flexible as possible, we incorporate the logic operators into the function set. In addition, because the discretized values range from 1 to 14, we incorporate the constants from 1 to 14 into the terminal set.
Table 3
The comparison of the credit scoring models in Australian data set (samples 1–3)

Australian data  Sample 1 (%)  Sample 2 (%)  Sample 3 (%)
GP               0.1111        0.1280        0.1304
MLP              0.1352        0.1256        0.1352
CART             0.1497        0.1256        0.1449
C4.5             0.1594        0.1304        0.1400
Rough sets       0.1382        0.1729        0.1538
LR               0.1497        0.1449        0.1304

Table 4
The comparison of the credit scoring models in German data set (samples 1–3)

German data      Sample 1 (%)  Sample 2 (%)  Sample 3 (%)
GP               0.2166        0.2266        0.2200
MLP              0.2400        0.2382        0.2500
CART             0.2765        0.2617        0.2435
C4.5             0.2446        0.2500        0.2227
Rough sets       0.2533        0.2649        0.2631
LR               0.2400        0.2421        0.2500
Five sub-samples are used to compare the error rates of the credit scoring models. In addition, the holdout method is used to avoid the problem of overfitting. The error rates on the test sets of the Australian and German data sets are shown in Tables 3 and 4.
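A holdout evaluation of this kind can be sketched in Python as follows. The split fraction, the seed and the record layout (a pair of features and label) are assumptions of the example; the paper does not state the proportions it used.

```python
import random

def holdout_split(records, test_fraction, seed):
    """Shuffle the records and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def error_rate(predict, test_set):
    """Fraction of test records whose predicted label is wrong."""
    wrong = sum(1 for features, label in test_set if predict(features) != label)
    return wrong / len(test_set)

# Hypothetical data: ten records with alternating labels.
records = [((i,), i % 2) for i in range(10)]
train, test = holdout_split(records, test_fraction=0.3, seed=1)
baseline = error_rate(lambda features: 0, test)   # always predict class 0
```

The model is fitted on `train` only and scored on `test`, so the reported error rate reflects records the model has not seen.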

On the basis of the results, we can conclude that the proposed method outperforms the other models in our empirical analysis. In addition, ANN and logistic regression also perform well in this study and can be alternative choices for the credit scoring model. Next, we discuss these findings on the basis of our implementation.
5. Discussions

Due to the rapid growth of the credit industry, building an effective credit scoring model has become an important task for reducing costs and making efficient decisions. Although many novel approaches have been proposed, several issues should still be considered in order to increase the accuracy of the credit scoring model.

First, irrelevant variables destroy the structure of the data and decrease the accuracy of the discriminant function. Second, the credit scoring model should determine the correct discriminant function (linear or non-linear) automatically. Third, the credit scoring model should be useful for both large and small data sets. For these reasons, GP is used to build the credit scoring models in this paper.

On the basis of the simulated results, we can conclude that GP outperforms the other models. However, ANN and logistic regression also provide satisfactory solutions and can serve as alternatives. The accuracy of the induction-based approaches (decision trees and rough sets) is inferior in this study. The decision rules are derived from the training set; if a newly entered object in the test set does not match any rule, it cannot be determined which class it belongs to.

Table 3 (continued, samples 4–5 and overall)

Australian data  Sample 4 (%)  Sample 5 (%)  Overall
GP               0.1207        0.0966        0.1173
MLP              0.1062        0.1014        0.1207
CART             0.1400        0.1497        0.1419
C4.5             0.1014        0.1159        0.1294
Rough sets       0.1718        0.1777        0.1628
LR               0.1304        0.1352        0.1381

Table 4 (continued, samples 4–5 and overall)

German data      Sample 4 (%)  Sample 5 (%)  Overall
GP               0.2433        0.2266        0.2266
MLP              0.2433        0.2533        0.2449
CART             0.3170        0.3721        0.2941
C4.5             0.2926        0.3318        0.2683
Rough sets       0.2353        0.2551        0.2543
LR               0.2479        0.2500        0.2460

Table B1
The parameter settings of GP

Parameter                     Value
Population size               40
Fitness function              Eq. (10)
Function set                  {+, −, ×, ≥, =, ≤, and, or, not, if}
Terminal set                  {Attributes, 1, 2, 3, 4, 5}
Maximum number of generation  40
Selection                     Lexictour
Crossover rate                0.9
Mutation rate                 0.01

We consider GP more suitable for credit scoring problems than the other models for the following reasons. Unlike the traditional statistical methods, which require assumptions about the data set and the attributes, GP is a non-parametric tool and is suitable for any situation and data set. Compared to ANNs, GP can determine the adequate discriminant function automatically, rather than relying on a transfer function assigned by decision-makers. In addition, GP can select the important variables automatically. Finally, the discriminant function derived by GP can provide better forecasting performance than the induction-based algorithms.
6. Conclusions

Building a credit scoring model involves the problems of variable selection and model identification. Although many approaches have been proposed, few of them are both flexible and accurate. In this paper, GP is employed to build the discriminant function for credit scoring problems. On the basis of the empirical results, we can conclude that GP is more flexible and achieves significantly better accuracy in credit scoring problems.
Appendix A

The discretization of the continuous attributes in the German data set using the Boolean reasoning algorithm is shown in Table A1.
Appendix B

The parameters of German data set can be shown as in

Table B1.
Table A1
The discretization of the continuous attributes in German data set

Value     1         2             3             4             5
Checking  [*, 1)    [1, 2)        [2, *)
Duration  [*, 12)   [12, 23)      [23, 32)      [38, *)
History   [*, 2)    [2, *)
Amount    [*, 714)  [714, 1387)   [1387, 2045)  [2045, 3914)  [3914, *)
Saving    [*, 2)    [2, *)
Employed  [*, 2)    [2, 3)        [3, *)
Installp  [*, 4)    [4, *)
Resident  [*, 2)    [2, 4)        [4, *)
Age       [*, 27)   [27, 33)      [33, *)
Existcr   [*, 2)    [2, *)
Job       [*, 1)    [1, *)
References

Agresti, A. (1990). Categorical data analysis. New York: Wiley.

Ahn, B. S., Cho, S. S., & Kim, C. Y. (2000). The integrated methodology of

rough set theory and artificial neural network for business failure

prediction. Expert Systems with Applications, 18(2), 65–74.

Aldrich, J. H., & Nelson, F. D. (1984). Linear probability, logit, and probit

models. Beverly Hills, CA: Sage.

Beynon, M. J., & Peel, M. J. (2001). Variable precision rough set theory and

data discretisation an application to corporate failure prediction.

OMEGA: the International Journal of Management Science, 29(6),

561–576.

Castillo, F., Marshall, K., Green, J., & Kordon, A. (2003). A methodology for combining symbolic regression and design of experiments to improve empirical model building. Genetic and Evolutionary Computation Conference, 1975–1985.

Chakrabarty, K., Biswas, R., & Nanda, S. (2000). Fuzziness in rough sets.

Fuzzy Sets and Systems, 110(2), 247–251.

Chung, H. M., & Gray, P. (1999). Special section: Data mining. Journal of

Management Information Systems, 16(1), 11–16.

Craven, M. W., & Shavlik, J. W. (1997). Using neural networks for data

mining. Future Generation Computer Systems, 13(2/3), 221–229.

Davidson, J. W., Savic, D. A., & Walters, G. A. (2003). Symbolic and

numerical regression: Experiments and applications. Information

Sciences, 150(1/2), 95–117.

DeMaris, A. (1992). Logit modeling. Beverly Hills, CA: Sage.

Desai, V., Crook, J., & Overstreet, G. (1996). A comparison of neural

networks and linear scoring models in credit union environment.

European Journal of Operations Management, 95(1), 24–37.

Desai, V., Crook, J., & Overstreet, G. (1997). Credit scoring models in the

credit union environment using neural networks and genetic algorithms.

IMA Journal of Mathematics Applied in Business and Industry, 8(4),

324–346.

Dimitras, A. I., Slowinski, R., Susmaga, R., & Zopounidis, C. (1999).

Business failure prediction using rough sets. European Journal of

Operational Research, 144(2), 263–280.

Dubois, D., & Prade, H. (1991). In Z. Pawlak (Ed.), Rough sets: Theoretical aspects of reasoning about data. Dordrecht, The Netherlands: Kluwer.

Feraud, R., & Cleror, F. (2002). A methodology to explain neural network

classification. Neural Network, 15(2), 237–246.

Grzymala-Busse, J. W. (1988). Knowledge acquisition under uncertainty: A rough set approach. Journal of Intelligent and Robotic Systems, 1(1), 3–16.

Jensen, H. L. (1992). Using neural networks for credit scoring. Managerial

Finance, 18(1), 15–26.

Knoke, D., & Burke, P. J. (1980). Log-linear models. Beverly Hills,

CA: Sage.

Koza, J. (1992). Genetic programming: On the programming of computers

by means of natural selection. Cambridge, MA: MIT Press.


Lee, T. S., Chiu, C. C., Lu, C. J., & Chen, I. F. (2002). Credit scoring using

the hybrid neural discriminant technique. Expert Systems with

Applications, 23(3), 245–254.

Liao, T. F. (1994). Interpreting probability model: Logit, probit, and other

generalized linear models. Beverly Hills, CA: Sage.

Mahlhotra, R., & Malhotra, D. K. (2003). Evaluating consumer loans using

neural networks. OMEGA: The International Journal of Management

Science, 31(2), 83–96.

McCullagh, P. (1980). Regression model for ordinal data. Journal of the

Royal Statistical Society, Series B, 42(2), 109–142.

Mordeson, J. N. (2001). Rough set theory applied to (fuzzy) ideal theory.

Fuzzy Sets and Systems, 121(2), 315–324.

Nath, R., Rajagopalan, B., & Ryker, R. (1997). Determining the saliency of

input variables in neural network classifiers. Computers and Operations

Researches, 24(8), 767–773.

Pawlak, Z. (1982). Rough set. International Journal of Computer and

Information Science, 11(5), 341–356.

Pawlak, Z., Grzymala-Busse, J., Slowinski, R., & Ziarko, W. (1995). Rough

sets. Communications of the ACM, 38(11), 88–95.

Piramuthu, S. (1999). Financial credit-risk evaluation with neural and

neurofuzzy systems. European Journal of Operational Research,

112(2), 310–321.
Press, S. J., & Wilson, S. (1978). Choosing between logistic regression and

discriminant analysis. Journal of the American Statistical Association,

73(4), 699–705.

Radzikowska, A. M., & Kerre, E. E. (2002). A comparative study of fuzzy

rough sets. Fuzzy Sets and Systems, 126(2), 137–155.

Shan, N., Hamilton, H. J., Ziarko, W., & Cercone, N. (1996). Discretization of continuous valued attributes in attribute-value systems. Proceedings of the Fourth International Workshop on Rough Sets, Fuzzy Sets, and Machine Discovery, Tokyo, Japan, 74–81.

Stefano, C. D., Cioppa, A. D., & Marcelli, A. (2002). Character

preclassification based on genetic programming. Pattern Recognition

Letters, 23(12), 1439–1448.

Walczak, B., & Massart, D. L. (1999). Rough sets theory. Chemometrics

and Intelligent Laboratory Systems, 47(1), 1–16.

West, D. (2000). Neural network credit scoring models. Computers and

Operations Research, 27(11/12), 1131–1152.

Wu, X. D. (1996). A Bayesian discretizer for real-valued attributes. The

Computer Journal, 39(8), 688–691.

Zhang, Y., & Bhattacharyya, S. (2004). Genetic programming in

classifying large-scale data: an ensemble method. Information Science,

163(1/3), 85–101.

