A genetic programming model for bankruptcy prediction: Empirical evidence from Iran
Introduction
Corporate bankruptcy is an important economic phenomenon. The health and success of firms are of widespread concern to policy makers, industry participants, investors, and managers (O’Leary, 1998). Bankruptcy is also a problem that affects the economy of every country: the number of failing firms can be considered an index of the development and robustness of an economy (Zopounidis & Dimitras, 1998). The high individual, economic, and social costs of corporate failures or bankruptcies have spurred searches for better understanding and prediction capability (McKee & Lensberg, 2002).
Prediction of corporate bankruptcy is of increasing interest to investors, creditors, borrowing firms, and governments alike. Timely identification of a firm’s impending failure is clearly desirable (Jones, 1987). To date, several methods have been used for predicting bankruptcy. Early research focused primarily on univariate models based on individual financial ratios. Among these studies, Beaver (1966) is the most notable. He introduced a univariate technique for classifying firms into two groups using financial ratios. The ratios were used individually, and a cut-off score was calculated for each ratio so as to minimize misclassification. Despite their considerable results, univariate methods were later criticized because the ratios are correlated with one another and because different ratios can give conflicting signals for the same firm (Dimitras, Zanakis, & Zopounidis, 1996).
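The cut-off idea behind the univariate approach can be illustrated with a minimal sketch. It is not Beaver's original procedure: the function name `best_cutoff`, the rule "ratio at or below the cut-off means bankrupt", and the toy values are all assumptions made for illustration.

```python
def best_cutoff(values, labels):
    """Return (cutoff, errors) minimizing misclassifications for one ratio.

    Assumption: a firm whose ratio is <= cutoff is predicted bankrupt
    (label 1); a firm above the cutoff is predicted non-bankrupt (label 0).
    """
    pairs = list(zip(values, labels))
    xs = sorted(set(values))
    # Candidate cut-offs: one below the minimum, the midpoints between
    # consecutive distinct values, and one above the maximum.
    candidates = (
        [xs[0] - 1.0]
        + [(a + b) / 2 for a, b in zip(xs, xs[1:])]
        + [xs[-1] + 1.0]
    )
    best = None
    for c in candidates:
        errors = sum((v <= c) != bool(y) for v, y in pairs)
        if best is None or errors < best[1]:
            best = (c, errors)
    return best


# Hypothetical ratio values for eight firms (not data from the study).
ratios = [0.05, 0.10, 0.12, 0.30, 0.18, 0.35, 0.40, 0.50]
bankrupt = [1, 1, 1, 1, 0, 0, 0, 0]
cutoff, errors = best_cutoff(ratios, bankrupt)
print(cutoff, errors)
```

Running the search separately for each ratio yields one cut-off per ratio, which is why two ratios can assign the same firm to different groups.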
Later research turned to multivariate models, as researchers found that corporate bankruptcy can be affected by many different factors at the same time. Altman (1968) introduced a multivariate technique, multiple discriminant analysis (MDA), for failure prediction. Because this study used more than one variable for bankruptcy prediction and applied an advanced statistical technique to determine the relationship among the predictor variables, it attracted considerable interest.
MDA provides good predictions but suffers from some limitations. Hence, a variety of methods have been introduced to overcome its shortcomings and improve accuracy. These methods can be grouped into two categories: statistical models and artificial intelligence models. The first group includes Logit (Foreman, 2002; Ohlson, 1980; Zavgren, 1985), Probit (Casey et al., 1986; Theodossiou, 1991), Linear Probability (Stone and Rasp, 1991; Vranas, 1992), and Cumulative Sums (Kahya & Theodossiou, 1999), among others. The second group includes Neural Networks (Altman et al., 1994; Coats and Fant, 1993; Jo et al., 1997), Genetic Algorithms (Shin and Lee, 2002; Varetto, 1998), Case-Based Reasoning (Park & Han, 2002), Rough Sets (Dimitras et al., 1999; McKee and Lensberg, 2002), and Support Vector Machines (Min & Lee, 2005), among others. Some of these models achieve high predictive accuracy, but in the absence of a theory of bankruptcy, attempts to establish a generally accepted prediction model have not succeeded. Comprehensive surveys of bankruptcy prediction methods are provided by Dimitras et al. (1996), Jones (1987), and Kumar and Ravi (2007).
A common approach to bankruptcy prediction is to review the literature to identify a large set of potentially predictive financial and/or non-financial variables and then to develop a reduced set of variables, through some combination of judgmental and mathematical analysis, that will predict bankruptcy (Lensberg, Eilifsen, & McKee, 2006). In this study we followed this approach in the variable selection stage. We then used genetic programming (GP), a relatively new technique for bankruptcy prediction, to construct an accurate classification model. This model was benchmarked against MDA, the most commonly used classification model for this problem. In the rest of the paper, we first discuss GP and MDA, the two techniques used for bankruptcy prediction modeling. Section 4 explains the variable selection process, followed by model development, empirical results, and conclusions.
Genetic programming
Genetic programming (GP) is a search methodology belonging to the family of evolutionary computation (EC). GP can be considered an extension of genetic algorithms (GAs) (Koza, 1992). GAs are stochastic search techniques that can explore large and complicated spaces, based on ideas from natural genetics and evolutionary principles (Goldberg, 1989; Holland, 1975). They have been demonstrated to be effective and robust in searching very large spaces in a wide range of applications (Colin, 1994, Shin and
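As a rough illustration of how GP searches a space of expressions, the sketch below evolves small arithmetic trees over feature indices and classifies a firm as bankrupt when the tree output is positive. It is a simplified assumption, not the configuration used in this study: it uses truncation selection with subtree mutation only (no crossover), and the function set, probabilities, and all names are hypothetical.

```python
import operator
import random


def protected_div(a, b):
    """Division that returns 1.0 when the denominator is near zero."""
    return a / b if abs(b) > 1e-9 else 1.0


FUNCTIONS = {
    "add": operator.add,
    "sub": operator.sub,
    "mul": operator.mul,
    "div": protected_div,
}


def evaluate(tree, x):
    """A tree is an int (feature index), a float (constant),
    or a tuple (function_name, left_subtree, right_subtree)."""
    if isinstance(tree, int):
        return x[tree]
    if isinstance(tree, float):
        return tree
    name, left, right = tree
    return FUNCTIONS[name](evaluate(left, x), evaluate(right, x))


def random_tree(rng, n_features, depth):
    """Grow a random tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return rng.randrange(n_features)
        return round(rng.uniform(-1, 1), 2)
    name = rng.choice(sorted(FUNCTIONS))
    return (name,
            random_tree(rng, n_features, depth - 1),
            random_tree(rng, n_features, depth - 1))


def fitness(tree, X, y):
    """Fraction of firms classified correctly (output > 0 means bankrupt)."""
    hits = sum((evaluate(tree, x) > 0) == bool(label) for x, label in zip(X, y))
    return hits / len(y)


def mutate(tree, rng, n_features, depth=2):
    """Replace one randomly chosen subtree with a fresh random subtree."""
    if not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(rng, n_features, depth)
    name, left, right = tree
    if rng.random() < 0.5:
        return (name, mutate(left, rng, n_features, depth), right)
    return (name, left, mutate(right, rng, n_features, depth))


def evolve(X, y, generations=20, pop_size=30, seed=0):
    """Keep the better half each generation and refill with mutants."""
    rng = random.Random(seed)
    n = len(X[0])
    pop = [random_tree(rng, n, 3) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda t: fitness(t, X, y), reverse=True)
        survivors = pop[: pop_size // 2]
        children = [mutate(rng.choice(survivors), rng, n)
                    for _ in range(pop_size - len(survivors))]
        pop = survivors + children
    return max(pop, key=lambda t: fitness(t, X, y))


# Hypothetical toy data: two ratios per firm, label 1 = bankrupt.
X = [[0.1, 0.9], [0.2, 0.8], [0.8, 0.2], [0.9, 0.1]]
y = [1, 1, 0, 0]
best = evolve(X, y, generations=5, pop_size=10)
print(best, fitness(best, X, y))
```

Unlike a GA, which typically evolves fixed-length strings, the individuals here are variable-size expression trees, which is the sense in which GP extends GAs.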
Multiple discriminant analysis (MDA)
MDA is a statistical technique used to classify an observation into one of several a priori groupings, depending on the observation’s individual characteristics. MDA therefore allows the researcher to study the differences between two or more groups of objects with respect to several variables simultaneously (Klecka, 1985). It is used primarily to classify and/or make predictions in problems where the dependent variable appears in qualitative form. In the case of two groups consisting of bankrupt
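For the two-group case, discriminant analysis reduces to a single linear score. The sketch below fits such a score for two features using the pooled within-group covariance matrix. It assumes equal prior probabilities and equal misclassification costs, so the cut-off is the score at the midpoint of the group means. The function names and the toy ratios are hypothetical, not taken from the study.

```python
def mean_vector(rows):
    n = len(rows)
    return [sum(r[j] for r in rows) / n for j in range(len(rows[0]))]


def pooled_covariance(group_a, group_b):
    """Pooled within-group covariance matrix (two features only)."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows in (group_a, group_b):
        m = mean_vector(rows)
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    dof = len(group_a) + len(group_b) - 2
    return [[v / dof for v in row] for row in s]


def fit_discriminant(non_bankrupt, bankrupt):
    """Return (weights, cutoff); a score above the cutoff means non-bankrupt."""
    ma, mb = mean_vector(non_bankrupt), mean_vector(bankrupt)
    (a, b), (c, d) = pooled_covariance(non_bankrupt, bankrupt)
    det = a * d - b * c  # assumed non-zero (covariance matrix invertible)
    inv = [[d / det, -b / det], [-c / det, a / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    mid = [(ma[0] + mb[0]) / 2, (ma[1] + mb[1]) / 2]
    cutoff = w[0] * mid[0] + w[1] * mid[1]
    return w, cutoff


def classify(x, w, cutoff):
    score = w[0] * x[0] + w[1] * x[1]
    return "non-bankrupt" if score > cutoff else "bankrupt"


# Hypothetical toy data: two financial ratios per firm.
healthy = [[0.30, 1.8], [0.25, 2.1], [0.35, 1.6], [0.28, 2.4]]
failed = [[0.05, 0.9], [-0.02, 1.1], [0.08, 0.7], [0.01, 1.2]]
w, cutoff = fit_discriminant(healthy, failed)
print(classify([0.27, 2.0], w, cutoff))
```

Because the score combines both ratios with weights derived from their joint covariance, it addresses the correlation among ratios that univariate cut-offs ignore.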
Data collection
The data set used for this research consists of 144 Iranian companies, all of which were or still are listed on the Tehran Stock Exchange (TSE). Seventy-two companies went bankrupt under paragraph 141 of Iran Trade Law from 1998 through 2005. The other 72 companies are “matched” companies from the same period of listing on the TSE. Because of small
GP model
Table 2 shows the results obtained from the GP model. The model classified firms in the training sample with an overall accuracy rate of 94%. In detail, it achieved a 96% accuracy rate in correctly classifying bankrupt firms and a 93% accuracy rate in correctly classifying non-bankrupt firms in the training sample. In addition, the GP model was applied to the holdout sample for testing, where it correctly classified 90% of the firms. That these rates for correct
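The three kinds of rates reported here (bankrupt, non-bankrupt, and overall) can be computed from actual and predicted labels as in the sketch below. The labels are hypothetical toy values, not the study's samples, and the function name `accuracy_rates` is an assumption.

```python
def accuracy_rates(actual, predicted):
    """Per-group and overall accuracy; 1 = bankrupt, 0 = non-bankrupt."""
    pairs = list(zip(actual, predicted))

    def rate(group):
        # Accuracy restricted to firms whose actual label is `group`.
        members = [(a, p) for a, p in pairs if a == group]
        return sum(a == p for a, p in members) / len(members)

    overall = sum(a == p for a, p in pairs) / len(pairs)
    return {"bankrupt": rate(1), "non_bankrupt": rate(0), "overall": overall}


# Hypothetical labels for ten firms.
actual = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 1, 0, 0, 0, 1, 0]
rates = accuracy_rates(actual, predicted)
print(rates)
```

When the two groups are the same size, the overall rate is the simple average of the two group rates; with unequal groups it is weighted by group size.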
Summary and conclusions
Bankruptcy is a highly significant worldwide problem that affects the economic well-being of all countries. The high social costs incurred by the various stakeholders of bankrupt firms have spurred searches for better theoretical understanding and prediction capability.
In this paper genetic programming (GP) and multiple discriminant analysis (MDA) techniques have been used to find out whether it is possible to predict the survival or failure of Iranian corporations based on financial
References (45)
- et al. Corporate distress diagnosis: Comparisons using linear discriminant analysis and neural networks (the Italian experience). Journal of Banking and Finance (1994).
- et al. Discovering interesting classification rules, with genetic programming. Applied Soft Computing (2002).
- et al. Business failure prediction using rough sets. European Journal of Operational Research (1999).
- et al. A survey of business failure with an emphasis on prediction methods and industrial application. European Journal of Operational Research (1996).
- Analyzing bankruptcy in the restaurant industry: A multiple discriminant model. Hospitality Management (2002).
- Failure prediction: sensitivity of classification accuracy to alternative statistical methods and variable sets. Journal of Accounting and Public Policy (1983).
- et al. Bankruptcy prediction using case-based reasoning, neural networks, and discriminant analysis. Expert Systems with Applications (1997).
- et al. Bankruptcy theory development and classification via genetic programming. European Journal of Operational Research (2006).
- et al. Genetic programming and rough sets: A hybrid approach to bankruptcy classification. European Journal of Operational Research (2002).
- et al. Bankruptcy prediction using support vector machine with optimal choice of kernel function parameters. Expert Systems with Applications (2005).