Building credit scoring models using genetic programming

https://doi.org/10.1016/j.eswa.2005.01.003

Abstract

Credit scoring models have been widely studied in the areas of statistics, machine learning, and artificial intelligence (AI). Many novel approaches, such as artificial neural networks (ANNs), rough sets, and decision trees, have been proposed to increase the accuracy of credit scoring models. Since an improvement in accuracy of a fraction of a percent might translate into significant savings, a more sophisticated model is worth proposing to improve the accuracy of credit scoring. In this paper, genetic programming (GP) is used to build credit scoring models. Two numerical examples are used to compare its error rate with those of other credit scoring models, including ANNs, decision trees, rough sets, and logistic regression. On the basis of the results, we conclude that GP can provide better performance than the other models.

Introduction

Credit scoring models have been widely used by financial institutions to determine whether loan customers belong to a good applicant group or a bad applicant group. The advantages of using credit scoring models include reducing the cost of credit analysis, enabling faster credit decisions, ensuring credit collections, and diminishing possible risk (Lee et al., 2002; West, 2000). Since an improvement in accuracy of a fraction of a percent might translate into significant savings (West, 2000), this paper proposes a more sophisticated model to improve the accuracy of credit scoring.

Numerous methods have been proposed to obtain a satisfactory credit scoring model. Roughly, these methods can be classified into parametric statistical methods (e.g. discriminant analysis and logistic regression), non-parametric statistical methods (e.g. k-nearest neighbor and decision trees), and soft-computing approaches (e.g. artificial neural networks (ANNs) and rough sets). Recently, ANNs have become the most popular tool for credit scoring, and their accuracy has been reported to be superior to that of traditional statistical methods in credit scoring problems, especially with regard to non-linear patterns (Desai et al., 1996; Desai et al., 1997; Malhotra and Malhotra, 2003; Jensen, 1992; Piramuthu, 1999). On the other hand, ANNs have been criticized for their poor performance when the data include irrelevant attributes or when the data set is small (Castillo et al., 2003; Feraud and Cleror, 2002; Nath et al., 1997).

In order to build an effective discriminant function, two issues should be considered. First, the relationships between attributes and classes may be linear or non-linear. Second, irrelevant attributes should be removed in order to increase the accuracy of the classification model. In this paper, GP is employed to determine the appropriate discriminant function and the relevant attributes simultaneously, in an automatic and heuristic manner. In addition, unlike ANNs, which are suited only to large data sets, GP can perform well even on small data sets (Nath et al., 1997).

In order to obtain the discriminant function efficiently, the data set is preprocessed by discretization. Two real-world cases are used below to compare the accuracy of GP with that of other classification models, including logistic regression, ANNs, decision trees, and rough sets. On the basis of the results, we conclude that GP can provide better performance than the other models.
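As a rough illustration of what discretizing a continuous attribute involves, the following sketch maps numeric values to bin indices. The text above does not state which discretization method is used, so equal-width binning is an assumption made here for illustration only, as are the function names and the sample values.

```python
from bisect import bisect_right


def equal_width_cut_points(values, n_bins):
    """Return the n_bins - 1 interior cut points of equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + width * i for i in range(1, n_bins)]


def discretize(values, cut_points):
    """Map each value to the 0-based index of the bin it falls into."""
    return [bisect_right(cut_points, v) for v in values]


# Made-up values for a hypothetical continuous attribute (not from the paper).
ages = [22, 25, 31, 38, 44, 52, 61, 70]

cuts = equal_width_cut_points(ages, 4)
bins = discretize(ages, cuts)

print(cuts)  # interior cut points of the four bins
print(bins)  # one bin index per input value
```

After this step each continuous attribute takes only a small number of distinct values, which reduces the search space that a model-building procedure has to explore.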

The rest of this paper is organized as follows. Section 2 describes the models for credit scoring. Discretization and genetic programming are described in Section 3. Two real-world examples are used to demonstrate the proposed method in Section 4. Discussions are presented in Section 5 and conclusions in Section 6.

Credit scoring models

In this section, we describe three popular models used in building credit scoring models. The first model is logistic regression, which is widely used for classification problems in statistics. The second model is the ANN, which is known for its excellent ability to learn non-linear relationships in a system. The third model is rough sets, an induction-based approach that has been widely used in classification problems since the 1990s.
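To make the first of these models concrete, the sketch below shows how a fitted logistic regression model scores a single applicant: a linear combination of the attributes is passed through the logistic function to give a probability, which is then thresholded. The weights, the 0.5 threshold, and the convention that a high probability means a good applicant are illustrative assumptions, not values from the paper.

```python
import math


def logistic_score(weights, bias, x):
    """Return P(good | x) under a logistic regression model."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def classify(probability, threshold=0.5):
    """Assign the 'good' class when the probability reaches the threshold."""
    return "good" if probability >= threshold else "bad"


# Hypothetical coefficients and applicant record, for illustration only.
weights = [0.8, -1.2]
bias = 0.0
applicant = [1.0, 0.5]

p = logistic_score(weights, bias, applicant)
print(round(p, 4), classify(p))
```

Because the decision boundary is linear in the attributes, a model of this form cannot capture non-linear relationships unless transformed attributes are supplied by hand.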

Genetic programming

Genetic programming was proposed by Koza (1992) to automatically extract intelligible relationships in a system and has been used in many applications, such as symbolic regression (Davidson, Savic, & Walters, 2003) and classification (Stefano et al., 2002; Zhang and Bhattacharyya, 2004). A GP individual can be viewed as a tree-based structure composed of elements of the function set and the terminal set. The function set consists of operators, functions, or statements such as arithmetic operators
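The tree-based representation can be sketched as follows, with internal nodes drawn from a function set and leaves drawn from a terminal set of attribute names and constants. This is a minimal illustration rather than the paper's implementation: the nested-tuple encoding, the protected division, the attribute names x1 to x4, and the rule that a non-negative output means a good applicant are all assumptions.

```python
import operator


def protected_div(a, b):
    """Division that returns 1.0 for a zero denominator, a common GP convention."""
    return a / b if b != 0 else 1.0


# Function set: binary arithmetic operators (assumed for this sketch).
FUNCTION_SET = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": protected_div,
}


def evaluate(node, record):
    """Recursively evaluate a tree on one record (a dict of attribute values)."""
    if isinstance(node, tuple):          # internal node: (operator, left, right)
        op, left, right = node
        return FUNCTION_SET[op](evaluate(left, record), evaluate(right, record))
    if isinstance(node, str):            # terminal: attribute name
        return record[node]
    return node                          # terminal: numeric constant


def attributes_used(node):
    """Collect the attribute names that appear as terminals in the tree."""
    if isinstance(node, tuple):
        _, left, right = node
        return attributes_used(left) | attributes_used(right)
    return {node} if isinstance(node, str) else set()


# Hypothetical individual encoding (x1 * x3) - (x2 + 2.0).
tree = ("-", ("*", "x1", "x3"), ("+", "x2", 2.0))
record = {"x1": 3.0, "x2": 1.0, "x3": 2.0, "x4": 9.0}

score = evaluate(tree, record)
label = "good" if score >= 0 else "bad"
print(score, label, sorted(attributes_used(tree)))
```

Note that x4 never appears in the tree, so it has no influence on the output. This is how a tree of this kind can express a non-linear discriminant function and drop irrelevant attributes at the same time.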

Empirical analysis

In this section, GP is compared with the multilayer perceptron (MLP), classification and regression tree (CART), C4.5, rough sets, and logistic regression (LR) using two real-world data sets. The first data set is the Australian credit scoring data, with 307 examples of creditworthy customers and 383 examples of non-creditworthy customers. It contains 14 attributes, of which six are continuous and eight are categorical. The second data set, called the German Credit Data Set, was provided by Prof.
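For context when reading error rates on the Australian data, the class counts given above imply a simple reference point: a trivial classifier that always predicts the larger class. The short calculation below uses only those two counts; the variable names are illustrative, and the paper itself may not report this baseline.

```python
# Class counts for the Australian data set, as stated in the text.
n_worthy = 307
n_unworthy = 383
n_total = n_worthy + n_unworthy

# Always predicting the majority class misclassifies every minority example.
majority_share = max(n_worthy, n_unworthy) / n_total
baseline_error = 1 - majority_share

print(n_total)
print(f"{baseline_error:.4f}")
```

Any useful credit scoring model should have an error rate well below this majority-class baseline of roughly 44.5%.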

Discussions

Due to the rapid growth of the credit industry, building an effective credit scoring model has become an important task for reducing costs and making decisions efficiently. Although many novel approaches have been proposed, further issues should be considered in order to increase the accuracy of credit scoring models.

First, irrelevant variables distort the structure of the data and decrease the accuracy of the discriminant function. Second, the credit scoring model should determine the

Conclusions

Building a credit scoring model involves the problems of variable selection and model identification. Although many approaches have been proposed, flexible and accurate methods remain scarce. In this paper, GP is employed to build the discriminant function for credit scoring problems. On the basis of the empirical results, we conclude that GP is more flexible and achieves significantly better accuracy in credit scoring problems.

References (36)

  • S. Piramuthu. Financial credit-risk evaluation with neural and neurofuzzy systems. European Journal of Operational Research (1999)
  • A.M. Radzikowska et al. A comparative study of fuzzy rough sets. Fuzzy Sets and Systems (2002)
  • B. Walczak et al. Rough sets theory. Chemometrics and Intelligent Laboratory Systems (1999)
  • D. West. Neural network credit scoring models. Computers and Operations Research (2000)
  • Y. Zhang et al. Genetic programming in classifying large-scale data: an ensemble method. Information Sciences (2004)
  • A. Agresti. Categorical data analysis (1990)
  • J.H. Aldrich et al. Linear probability, logit, and probit models (1984)
  • F. Castillo et al. A methodology for combining symbolic regression and design of experiments to improve empirical model building. Genetic and Evolutionary Computation Conference (2003)