Skip to main content

Advertisement

Log in

Impacts of sampling strategies in tournament selection for genetic programming

  • Original Paper
  • Published:
Soft Computing Aims and scope Submit manuscript

Abstract

Tournament selection is one of the most commonly used parent selection schemes in genetic programming (GP). While it has a number of advantages over other selection schemes, it still has some issues that need to be thoroughly investigated. Two of the issues are associated with the sampling process from the population into the tournament. The first one is the so-called “multi-sampled” issue, where some individuals in the population are picked up (sampled) many times to form a tournament. The second one is the “not-sampled” issue, meaning that some individuals are never picked up when forming tournaments. In order to develop a more effective selection scheme for GP, it is necessary to understand the actual impacts of these issues in standard tournament selection. This paper investigates the behaviour of different sampling replacement strategies through mathematical modelling, simulations and empirical experiments. The results show that different sampling replacement strategies have little impact on selection pressure and cannot effectively tune the selection pressure in dynamic evolution. In order to conduct effective parent selection in GP, research focuses should be on developing automatic and dynamic selection pressure tuning methods instead of alternative sampling replacement strategies. Although GP is used in the empirical experiments, the findings revealed in this paper are expected to be applicable to other evolutionary algorithms.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Fig. 1
Fig. 2
Fig. 3
Fig. 4
Fig. 5
Fig. 6
Fig. 7
Fig. 8
Fig. 9
Fig. 10
Fig. 11
Fig. 12
Fig. 13
Fig. 14
Fig. 15

Similar content being viewed by others

Notes

  1. It is the degree to which the better individuals are favoured (Miller and Goldberg 1995).

  2. Other selection schemes may also suffer this problem.

  3. It refers to individual programs that have never participated into any tournament in a parent selection phase.

  4. It refers to individual programs that have participated into tournaments but have never won any tournament, and accordingly have not been selected for mating.

  5. \(\alpha\) is significance level, and \(100(1-\alpha)\%\) is the confidence level.

  6. Assuming \(\left(^a_b\right)=0\) if b > a.

References

  • Abramowitz M, Stegun IA (eds) (1965) Handbook of mathematical functions. Dover, New York

  • Affenzeller M, Wagner S, Winkler S (2005) GA-selection revisited from an ES-driven point of view. In: Artificial intelligence and knowledge engineering applications: a bioinspired approach. Lecture notes in computer science, vol 3562. Springer, Berlin, pp 262–271

  • Agnelli D, Bollini A, Lombardi L (2002) Image classification: an evolutionary approach. Pattern Recognit Lett 23(1–3):303–309

    Article  MATH  Google Scholar 

  • Akyol A, Yaslan Y, Erol OK (2007) A genetic programming classifier design approach for cell images. In: Mellouli K (ed) Proceedings of the 9th European conference on symbolic and quantitative approaches to reasoning with uncertainty, ECSQARU, Hammamet, Tunisia, October 31–November 2, 2007. Lecture notes in computer science, vol 4724. Springer, Berlin, pp 878–888

  • Andreae P, Xie H, Zhang M (2008) Genetic programming for detecting rhythmic stress in spoken english. Int J Knowl-Based Intell Eng Syst (Special Issue on Genetic Programming) 12(1):15–28

    Google Scholar 

  • Back T (1994) Selective pressure in evolutionary algorithms: a characterization of selection mechanisms. In: Proceedings of the first IEEE conference on evolutionary computation, pp 57–62

  • Blickle T, Thiele L (1995) A mathematical analysis of tournament selection. In: Proceedings of the sixth international conference on genetic algorithms, pp 9–16

  • Blickle T, Thiele L (1997) A comparison of selection schemes used in evolutionary algorithms. Evol Comput 4(4):361–394

    Article  Google Scholar 

  • Box G, Hunter S, Hunter WG (2005) Statistics for experimenters: design, innovation, and discovery, 2nd edn. Wiley, Berlin

  • Brameier M, Banzhaf W, Informatik F (2001) A comparison of linear genetic programming and neural networks in medical data mining. IEEE Trans Evol Comput 5:17–26

    Article  Google Scholar 

  • Branke J, Andersen HC, Schmeck H (1996) Global selection methods for SIMD computers. In: Proceedings of the AISB96 workshop on evolutionary computing, pp 6–17

  • Brindle A (1981) Genetic algorithms for function optimisation. PhD thesis, Department of Computing Science, University of Alberta

  • Bulmer MG (1980) The mathematical theory of quantitative genetics. Oxford University Press, Oxford

    MATH  Google Scholar 

  • Castillo F, Kordon A, Sweeney J, Zirk W (2006) Using genetic programming in industrial statistical model building. In: O’Reilly U-M et al (eds) Genetic programming theory and practice II, chap 3. Springer, Berlin, pp 31–48

  • Ciesielski V, Mawhinney D (2002) Prevention of early convergence in genetic programming by replacement of similar programs. In: Proceedings of the 2002 Congress on evolutionary computation. IEEE Press, Berlin, pp 67–72

  • de Jong K (2007) Parameter setting in eas: a 30 year perspective. In: Parameter setting in evolutionary algorithms. Springer, Berlin, pp 1–18

  • de Sa LB, Mesquita A (2008) Evolutionary synthesis of low-sensitivity equalizers using adjacency matrix representation. In: Keijzer M et al (eds) Proceedings of the 10th annual conference on genetic and evolutionary computation, Atlanta, GA, USA, 12–16 July 2008. ACM, New York, pp 1283–1290

  • Eiben AE, Smith JE (2003) Introduction to evolutionary computing. Springer, Berlin

  • Espejo PG, Ventura S, Herrera F (2010) A survey on the application of genetic programming to classification. IEEE Trans Syst Man Cybernet C 40(2):121–144

    Article  Google Scholar 

  • Goldberg DE, Deb K (1991) A comparative analysis of selection schemes used in genetic algorithms. In: Foundations of genetic algorithms, pp 69–93

  • Grefenstette JJ, Baker JE (1989) How genetic algorithms work: a critical look at implicit parallelism. In: Schaffer JD (ed) Proceedings of the 3rd international conference on genetic algorithms. Morgan Kaufmann, pp 20–27

  • Gustafson SM (2004) An analysis of diversity in genetic programming. PhD thesis, University of Nottingham

  • Holland JH (1975) Adaptation in natural and artificial systems. University of Michigan Press, Ann Arbor

    Google Scholar 

  • Hong J-H, Cho S-B (2004) Lymphoma cancer classification using genetic programming with snr features. In: Proceedings of 7th EuroGP conference, pp 78–88

  • Koza JR (1992) Genetic programming—on the programming of computers by means of natural selection. MIT Press, Cambridge

    MATH  Google Scholar 

  • Koza JR, Bennett FH III, Andre D, Keane MA (1999) Genetic programming III: Darwinian invention and problem solving, 1st edn. Morgan Kaufmann

  • Koza JR, Keane MA, Streeter MJ, Mydlowec W, Yu J, Lanza G (2003) Genetic programming IV: routine human-competitive machine intelligence. Kluwer, Dordrecht

  • Lee W-C (2006) Genetic programming decision tree for bankruptcy prediction. In: Proceedings of the 2006 joint conference on information sciences, JCIS 2006, Kaohsiung, Taiwan, ROC, October 8–11 2006. Atlantis Press

  • Li J, Tsang EPK (2000) Reducing failures in investment recommendations using genetic programming. In: Computing in economics and finance. Universitat Pompeu Fabra, Barcelona, Spain, 6–8 July 2000

  • Lima CF, Pelikan M, Goldberg DE, Lobo FG, Sastry K, Hauschild M (2007) Influence of selection and replacement strategies on linkage learning in boa. In: Proceedings of IEEE Congress on evolutionary computation, pp 1083–1090

  • Miller BL, Goldberg DE (1995) Genetic algorithms, tournament selection, and the effects of noise. Technical report 95006, University of Illinois at Urbana-Champaign, July 1995

  • Miller BL, Goldberg DE (1996) Genetic algorithms, selection schemes, and the varying effects of noise. Evol Comput 4(2):113–131

    Article  Google Scholar 

  • Motoki T (2002) Calculating the expected loss of diversity of selection schemes. Evol Comput 10(4):397–422

    Article  Google Scholar 

  • Muhlenbein H, Schlierkamp-Voosen D (1993) Predictive models for the breeder genetic algorithm, I: continuous parameter optimization. Evol Comput 1(1):25–49

    Article  Google Scholar 

  • Newman DJ, Hettich S, Blake CL, Merz CJ (1998) UCI repository of machine learning databases

  • Poli R, Langdon WB (2006) Backward-chaining evolutionary algorithms. Artif Intell 170(11):953–982

    Article  MathSciNet  MATH  Google Scholar 

  • Popovici E, de Jong K (2003) Understanding EA dynamics via population fitness distributions. In: Proceedings of the genetic and evolutionary computation conference 2003, pp 1604–1605

  • Popp RL, Montana DJ, Gassner RR, Vidaver G, Iyer S (1998) Automated hardware design using genetic programming, VHDL, and FPGAs. In: IEEE international conference on systems, man, and cybernetics, vol 3, San Diego, CA USA, 11–14 October 1998. IEEE, pp 2184–2189

  • Schmidt MD, Lipson H (2007) Learning noise. In: Thierens D, et al (eds) Proceedings of the 9th annual conference on genetic and evolutionary computation, vol 2, London, 7–11 July 2007. ACM Press, pp 1680–1685

  • Smits G, Kordon A, Vladislavleva K, Jordaan E, Kotanchek M (2005) Variable selection in industrial datasets using pareto genetic programming. In: Yu T, Riolo RL, Worzel B (eds) Genetic programming theory and practice III. Genetic programming, vol 9, chap 6, 12–14 May 2005. Springer, Ann Arbor, pp 79–92

  • Sokolov A, Whitley D (2005) Unbiased tournament selection. In: Proceedings of genetic and evolutionary computation conference. ACM Press, New York, pp 1131–1138

  • Vanyi R (2005) Practical evaluation of efficient fitness functions for binary images. In: Rothlauf F et al (eds) Applications of evolutionary computing, EvoWorkshops2005: EvoBIO, EvoCOMNET, EvoHOT, EvoIASP, EvoMUSART, EvoSTOC. LNCS, vol 3449, Lausanne, Switzerland, 30 March–1 April 2005. Springer, Berlin, pp 310–324

  • Winkler S, Affenzeller M, Wagner S (2008) Offspring selection and its effects on genetic propagation in genetic programming based system identification. Cybernet Syst 2:549–554

    Google Scholar 

  • Xie H, Zhang M, Andreae P (2007) Another investigation on tournament selection: modelling and visualisation. In: Proceedings of genetic and evolutionary computation conference, pp 1468–1475

  • Zhang M, Ciesielski V, Andreae P (2003) A domain independent window-approach to multiclass object detection using genetic programming. EURASIP J Appl Signal Process 2003(8):841–859

    Article  MATH  Google Scholar 

  • Zhang W, ming Wu Z, ke Yang G (2004) Genetic programming-based chaotic time series modeling. J Zhejiang Univ Sci 5(11):1432–1439

    Article  Google Scholar 

  • Zhang M, Gao X, Lou W (2006) Gp for object classification: brood size in brood recombination crossover. In: The 19th Australian joint conference on artificial intelligence. LNAI, vol 4303. Springer, Berlin, pp 274–284

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Huayang Xie.

Appendix: Proof of Eqs. 12 and 17 being equivalent

Appendix: Proof of Eqs. 12 and 17 being equivalent

Equation 17 can be simplified to:

$$ \begin{aligned} P(W_j)&=\frac{\sum^{k}_{n=1}\frac{1}{n} \frac{(|S_j|-1)!}{(n-1)!(|S_j|-1-n+1)!} \left(\begin{array}{l} \sum^{j-1}_{i=1}|S_i|\\ k-n\end{array}\right)} {\left(\begin{array}{l} N\\ k \end{array}\right)}\\ &=\frac{\sum^{k}_{n=1}\frac{(|S_j|-1)!}{n!(|S_j|-n)!} \left(\begin{array}{l} \sum^{j-1}_{i=1}|S_i|\\ k-n\end{array}\right)} {\left(\begin{array}{c} N\\ k \end{array}\right)}\\ &= \frac{\sum^{k}_{n=1}\frac{1}{|S_j|} \frac{|S_j|!}{n!(|S_j|-n)!}\left(\begin{array}{l} \sum^{j-1}_{i=1}|S_i|\\ k-n \end{array}\right)} {\left(\begin{array}{l} N\\ k \end{array}\right)}\\ &= \frac{\sum^{k}_{n=1} \left(\begin{array}{l} |S_j|\\ n \end{array}\right) \left(\begin{array}{l} \sum^{j-1}_{i=1}|S_i|\\ k-n \end{array}\right)} {\left(\begin{array}{l} N\\ k \end{array}\right)|S_j|} \end{aligned} $$

After applying the relation \(\sum^{n}_{m=0}\left(\begin{array}{l}r\\m\end{array}\right)\left(\begin{array}{l}s\\n-m\end{array}\right)=\left(\begin{array}{l}r+s\\n\end{array}\right)\) (Abramowitz and Stegun 1965), we can further simply the equation to

$$ \begin{aligned} &=\frac{\left(\begin{array}{c}|S_j|+ \sum^{j-1}_{i=1}|S_i|\\ k \end{array}\right)- \left(\begin{array}{l} |S_j|\\ 0 \end{array}\right) \left(\begin{array}{l} \sum^{j-1}_{i=1}|S_i|\\ k \end{array}\right)} {\left(\begin{array}{l} N\\ k \end{array} \right)|S_j|}\\ &=\frac{\left(\begin{array}{l} \sum^{j}_{i=1}|S_i|\\ k\end{array}\right)- \left(\begin{array}{l} \sum^{j-1}_{i=1}|S_i|\\ k \end{array}\right)} {\left(\begin{array}{c} N\\ k \end{array}\right)|S_j|} \end{aligned} $$

which is the same as Eq. 12.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Xie, H., Zhang, M. Impacts of sampling strategies in tournament selection for genetic programming. Soft Comput 16, 615–633 (2012). https://doi.org/10.1007/s00500-011-0760-x

Download citation

  • Published:

  • Issue Date:

  • DOI: https://doi.org/10.1007/s00500-011-0760-x

Keywords

Navigation