Heuristic procedures for improving the predictability of a genetic programming financial forecasting algorithm

  • Focus
Soft Computing

Abstract

Financial forecasting is an important area in computational finance. Evolutionary Dynamic Data Investment Evaluator (EDDIE) is an established genetic programming (GP) financial forecasting algorithm that has been applied successfully to a number of international financial datasets. The purpose of this paper is to further improve the algorithm’s predictive performance by incorporating heuristics in the search. We propose two heuristics: a sequential covering strategy that iteratively builds a solution in combination with the GP search, and an entropy-based dynamic discretisation procedure for numeric values. To examine the effectiveness of the proposed improvements, we test the new EDDIE version (EDDIE 9) across 20 datasets and compare its predictive performance against three previous EDDIE algorithms. We also compare the new algorithm’s performance against C4.5 and RIPPER, two state-of-the-art classification algorithms. Results show that the introduction of heuristics is very successful, allowing the algorithm to outperform all previous EDDIE versions and the well-known C4.5 and RIPPER algorithms. Results also show that the algorithm is able to return significantly high rates of return across the majority of the datasets.

Notes

  1. We use these indicators because they have proved quite useful in developing GDTs in previous works such as Martinez-Jaramillo (2007), Allen and Karjalainen (1999) and Austin et al. (2004). Of course, there is no reason not to use other information, such as fundamentals or the limit order book. However, the aim of this work is not to find the ultimate indicators for financial forecasting.

  2. These are the 6 indicators mentioned earlier; each indicator has two different period lengths, 12 and 50 days, resulting in a total of 12 technical indicators.

  3. As we have mentioned, each GDT makes recommendations of buy (1) or not-to-buy (0). The former denotes a positive signal and the latter a negative. Thus, within the range of the training period, which is \(t\) days, a GDT will have returned a number of positive signals.

  4. To make this clearer, let us give an example: if a given GP tree can have a maximum of \(k\) indicators, then the permutations of the available 12 indicators (we are using 6 different indicators, with 2 periods each, thus \(6 \times 2 = 12\)) under EDDIE 7 are \(12^k\). If, on the other hand, EDDIE 8 is using the same 6 indicators with periods within the range of 2 to 65 days, then the permutations of the available 384 indicators (we are using 6 different indicators with \(65 - 1 = 64\) periods each, thus \(64 \times 6 = 384\)) are \(384^k\). EDDIE 8’s search space is therefore significantly larger, which can explain EDDIE 8’s difficulty in consistently finding good solutions.

  5. The datasets used in our experiments can be downloaded from: http://www.cs.kent.ac.uk/people/staff/mk451/datasets.html.

  6. Refer to Sect. 5.2 for the definition of best tree.
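
The counting argument in note 4 can be checked with a short calculation. The Python sketch below is only an illustration: the function name `search_space_size` and the value `k = 3` are assumptions chosen for the example and do not come from the paper. It counts the ways to fill \(k\) indicator slots when each slot may hold any of the available (indicator, period) pairs.

```python
def search_space_size(num_indicator_types: int, num_periods: int, k: int) -> int:
    """Count the ways to fill k indicator slots, repetition allowed.

    Each slot can hold any (indicator type, period) pair, so there are
    num_indicator_types * num_periods choices per slot.
    """
    choices_per_slot = num_indicator_types * num_periods
    return choices_per_slot ** k


K = 3  # hypothetical maximum number of indicators in a tree (assumption)

# EDDIE 7: 6 indicator types, 2 fixed periods (12 and 50 days) -> 12 choices.
eddie7 = search_space_size(6, 2, K)   # 12 ** 3 = 1728

# EDDIE 8: 6 indicator types, periods from 2 to 65 days -> 64 periods, 384 choices.
eddie8 = search_space_size(6, 64, K)  # 384 ** 3 = 56623104

# The ratio is (384 / 12) ** k = 32 ** k, so the gap grows exponentially in k.
ratio = eddie8 // eddie7              # 32 ** 3 = 32768

print(eddie7, eddie8, ratio)
```

Even for this small assumed `k`, the EDDIE 8 space is 32768 times larger than the EDDIE 7 space, and the factor is multiplied by 32 for every additional indicator slot.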

References

  • Abdelmalek W, Hamida S, Abid F (2009) Selecting the best forecasting-implied volatility model using genetic programming. J Appl Math Decis Sci 2009:179230

  • Abdou H (2009) Genetic programming for credit scoring: the case of Egyptian public sector banks. Expert Syst Appl 36(9):11402–11417

  • Agapitos A, O’Neill M, Brabazon A (2010) Evolutionary learning of technical trading rules without data-mining bias. In: Schaefer R, Cotta C, Kołodziej J, Rudolph G (eds) Parallel problem solving from nature—PPSN XI, Springer, Lecture notes in computer science, vol 6238, pp 294–303

  • Allen F, Karjalainen R (1999) Using genetic algorithms to find technical trading rules. J Financ Econ 51:245–271

  • Austin M, Bates G, Dempster M, Leemans V, Williams S (2004) Adaptive systems for foreign exchange trading. Quant Financ 4(4):37–45

  • Backus J (1959) The syntax and semantics of the proposed international algebraic language of Zurich. In: International conference on information processing, UNESCO, pp 125–132

  • Binner J, Kendall G, Chen SH (eds) (2004) Applications of artificial intelligence in finance and economics. Advances in econometrics, vol 19. Elsevier

  • Brookhouse J, Otero FEB, Kampouridis M (2014) Working with OpenCL to speed up a genetic programming financial forecasting algorithm: initial results. In: Wagner S, Affenzeller M (eds) GECCO 2014 workshop on evolutionary computation software systems (EvoSoft), pp 1117–1124

  • Chen SH (2002) Genetic algorithms and genetic programming in computational finance. Springer-Verlag, New York LLC

  • Cohen W (1995) Fast effective rule induction. In: Proceedings of the 12th international conference on machine learning, Morgan Kaufmann, pp 115–123

  • Demšar J (2006) Statistical comparisons of classifiers over multiple data sets. J Mach Learn Res 7:1–30

  • Edwards R, Magee J (1992) Technical analysis of stock trends. New York Institute of Finance, New York

  • Fayyad U, Piatetsky-Shapiro G, Smyth P (1996) From data mining to knowledge discovery: an overview. In: Fayyad UM, Piatetsky-Shapiro G, Smyth P, Uthurusamy R (eds) Advances in knowledge discovery and data mining. MIT Press, pp 1–34

  • García S, Herrera F (2008) An extension on “statistical comparisons of classifiers over multiple data sets” for all pairwise comparisons. J Mach Learn Res 9:2677–2694

  • Giacobini M, Provero P, Vanneschi L, Mauri G (2014) Towards the use of genetic programming for the prediction of survival in cancer. In: Cagnoni S, Mirolli M, Villani M (eds) Evolution, complexity and artificial life. Springer, Berlin, pp 177–192

  • Hu Y (1998) Constructive induction: covering attribute spectrum. Feature extraction construction and selection. Kluwer Academic Publishers, pp 257–272

  • Kampouridis M, Otero FEB (2013) Using attribute construction to improve the predictability of a GP financial forecasting algorithm. In: Proceedings of the conference on technologies and applications of artificial intelligence, IEEE Xplore, pp 55–60

  • Kampouridis M, Tsang E (2010) EDDIE for investment opportunities forecasting: extending the search space of the GP. In: Proceedings of the IEEE world congress on computational intelligence, Barcelona, Spain, pp 2019–2026

  • Kampouridis M, Tsang E (2012) Investment opportunities forecasting: extending the grammar of a GP-based tool. Int J Comput Intell Syst 5(3):530–541

  • Koza J (1992) Genetic programming: on the programming of computers by means of natural selection. MIT Press, Cambridge

  • Krawiec K (2002) Genetic programming-based construction of features for machine learning and knowledge discovery tasks. Genet Program Evol Mach 3(4):329–343

  • Li J (2001) FGP: a genetic programming-based financial forecasting tool. PhD thesis, Department of Computer Science, University of Essex

  • Martinez-Jaramillo S (2007) Artificial financial markets: an agent-based approach to reproduce stylized facts and to study the red queen effect. PhD thesis, CFFEA, University of Essex

  • Otero FEB, Silva M, Freitas A, Nievola J (2003) Genetic programming for attribute construction in data mining. In: Proceedings of EuroGP, LNCS 2610, pp 384–393

  • Otero FEB, Freitas A, Johnson C (2008) cAnt-Miner: an ant colony classification algorithm to cope with continuous attributes. In: Ant colony optimization and swarm intelligence (Proceedings of ANTS 2008), pp 48–59

  • Otero FEB, Freitas A, Johnson C (2013) A new sequential covering strategy for inducing classification rules with ant colony algorithms. IEEE Trans Evol Comput 17(1):64–76

  • Otero FEB, Johnson CG (2013) Automated problem decomposition for the boolean domain with genetic programming. In: Proceedings of the 16th European conference on genetic programming, EuroGP 2013, Vienna, Austria, pp 169–180

  • Phua C, Lee V, Smith K, Gayler R (2010) A comprehensive survey of data mining-based fraud detection research. http://www.bsys.monash.edu.au/people/cphua/

  • Piatetsky-Shapiro G, Frawley W (1991) Knowledge discovery in databases. AAAI Press, Menlo Park, California

  • Poli R, Langdon W, McPhee N (2008) A field guide to genetic programming. Lulu.com

  • Quinlan JR (1993) C4.5: programs for machine learning. Morgan Kaufmann Publishers Inc, San Francisco

  • Dos Santos J, Ferreira C, Da S, Torres R, Gonçalves M, Lamparelli R (2011) A relevance feedback method based on genetic programming for classification of remote sensing images. Inf Sci 181(13):2671–2684

  • Tsang E, Martinez-Jaramillo S (2004) Computational finance. IEEE Comput Intell Soc Newsl 3–8

  • Tsang E, Li J, Markose S, Er H, Salhi A, Iori G (2000) EDDIE in financial decision making. J Manag Econ 4(4) (online)

  • Tsang E, Markose S, Er H (2005) Chance discovery in stock index option and future arbitrage. New Math Nat Comput World Sci 1(3):435–447

  • Wang P, Tsang E, Weise T, Tang K, Yao X (2010) Using GP to evolve decision rules for classification in financial data sets. In: Cognitive informatics (ICCI), 2010 9th IEEE international conference on, pp 720–727

  • Wilson G, Banzhaf W (2010) Fast and effective predictability filters for stock price series using linear genetic programming. In: Evolutionary computation (CEC), 2010 IEEE congress on, pp 1–8, doi:10.1109/CEC.2010.5586297

  • Witten IH, Frank E (2005) Data mining: practical machine learning tools and techniques, 2nd edn. Morgan Kaufmann, San Francisco, California

Author information

Corresponding author

Correspondence to Fernando E. B. Otero.

Additional information

Communicated by C.-S. Lee.

About this article

Cite this article

Kampouridis, M., Otero, F.E.B. Heuristic procedures for improving the predictability of a genetic programming financial forecasting algorithm. Soft Comput 21, 295–310 (2017). https://doi.org/10.1007/s00500-015-1614-8

  • DOI: https://doi.org/10.1007/s00500-015-1614-8
