Use of graphics processing units for automatic synthesis of programs

https://doi.org/10.1016/j.compeleceng.2015.04.006

Highlights

  • A new quantum-inspired linear genetic programming system that runs on the GPU.

  • Allows the synthesis of solutions for large-scale real-world problems.

  • Eliminates the overhead of copying the fitness results from the GPU to the CPU.

  • Proposes a new selection mechanism to recognize the programs with the best evaluations.

  • Improves the performance of GP execution by fully exploiting the GPU environment.

Abstract

Genetic programming (GP) is an evolutionary method that allows computers to solve problems automatically. However, the computational power required for the evaluation of billions of programs imposes a serious limitation on the problem size. This work focuses on accelerating GP to support the synthesis of solutions to large problems. This is done by completely exploiting the highly parallel environment of graphics processing units (GPUs). Here, we propose a new quantum-inspired linear GP approach that implements all the GP steps in the GPU and provides the following: (1) significant performance improvements in the GP steps, (2) elimination of the overhead of copying the fitness results from the GPU to the CPU, and (3) incorporation of a new selection mechanism to recognize the programs with the best evaluations. The proposed approach outperforms the previous approach for large-scale synthetic and real-world problems. Further, it provides a remarkable speedup over the CPU execution.

Introduction

The idea of enabling the computer to automatically create programs that solve problems establishes a new paradigm for developing reliable applications. The field of genetic programming (GP) has demonstrated that devising computer programs on the basis of a high-level description is viable.

Genetic programming extends conventional evolutionary algorithms to deal with computer programs. The essence of GP is the Darwinian principle of natural selection: a population of computer programs is maintained and modified according to genetic variation. A GP system progresses toward a solution by stochastically transforming populations of programs into better populations until a stopping criterion is met.
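
As a concrete illustration of this loop (our toy example in C, not the system described in this paper), the following minimal sketch evolves byte strings toward a fixed target by repeatedly evaluating, selecting the better half, and mutating copies of it:

```c
/* Minimal evolutionary loop in C: evolves byte strings toward a fixed
 * target. Purely illustrative of the evaluate-select-vary cycle; the
 * paper's system evolves GPU machine code, not strings. */
#include <stdio.h>
#include <stdlib.h>

#define POP  64
#define GENS 10000
static const char target[] = "genetic programming";
#define LEN (sizeof(target) - 1)

typedef struct { char genome[LEN]; int fitness; } Individual;

static int eval(const Individual *ind) {          /* lower is better */
    int errors = 0;
    for (size_t i = 0; i < LEN; i++)
        errors += (ind->genome[i] != target[i]);
    return errors;
}

static int cmp(const void *a, const void *b) {
    return ((const Individual *)a)->fitness - ((const Individual *)b)->fitness;
}

int main(void) {
    Individual pop[POP];
    srand(42);
    for (int i = 0; i < POP; i++)                 /* random initialization */
        for (size_t j = 0; j < LEN; j++)
            pop[i].genome[j] = ' ' + rand() % 95;

    for (int g = 0; g < GENS; g++) {
        for (int i = 0; i < POP; i++) pop[i].fitness = eval(&pop[i]);
        qsort(pop, POP, sizeof(Individual), cmp); /* selection: keep best half */
        if (pop[0].fitness == 0) {                /* stopping criterion met */
            printf("gen %d: %.*s\n", g, (int)LEN, pop[0].genome);
            return 0;
        }
        for (int i = POP / 2; i < POP; i++) {     /* variation: mutate copies */
            pop[i] = pop[i - POP / 2];
            pop[i].genome[rand() % LEN] = ' ' + rand() % 95;
        }
    }
    return 0;
}
```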

In the past few years, GP has been successfully applied to a wide variety of problems, including automatic design, pattern recognition, financial prediction, robotic control, data mining, image processing, and synthesis of analog electrical circuits [1], [2], [3], [4], [5], [6], [7]. However, a major drawback of GP is that the search space of candidate programs can become enormous. For example, to solve the 20-bit Boolean multiplexer problem, a total of 1,310,720,000 candidate programs have to be evaluated [8]. In addition, evaluating the fitness of a single program in the search space may demand testing it against numerous different combinations of input data. Consequently, the time required to evaluate the programs may become unreasonably long. The computational power required by GP to evaluate billions of programs, each against hundreds or thousands of input cases, can be a huge obstacle to solving large real-world problems.

The last few years have witnessed remarkable advances in the design of parallel processors. In particular, graphics processing units (GPUs) have become increasingly popular. Further, their high computational power, low cost, and reasonable floating-point capabilities have made them attractive platforms for speeding up GP. Modern GPUs contain thousands of cores, and the massive parallelism provided is highly suitable to process GP in parallel.

The power of the GPU has been previously exploited to accelerate GP by using different methodologies. The compilation methodology [9], [10], [11], [12] generates programs using the GPU high-level language, and each of these programs has to be compiled before its evaluation. The pseudo-assembly methodology [13], [14] creates programs in the pseudo-assembly code of the GPU, and a just-in-time (JIT) compilation is performed for each individual before the evaluation. The interpretation methodology [15], [16], [17] interprets programs during their evaluation. The machine code methodology [18] generates programs directly in GPU machine code and does not require any compilation step before evaluation.
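
To make the contrast concrete, here is a minimal CUDA sketch of the interpretation methodology (a generic illustration with an invented toy instruction set, not the code of [15], [16], [17]): a single fixed kernel decodes the opcodes of an evolved linear program at runtime, one thread per fitness case, so no per-individual compilation is needed.

```cuda
// Sketch of the interpretation methodology: one generic kernel
// interprets an evolved linear program; each thread handles one
// fitness case. The opcodes and program layout are toy assumptions.
#include <cstdio>

enum Op { OP_ADD, OP_SUB, OP_MUL, OP_LOAD_X };

__global__ void interpret(const int *prog, int prog_len,
                          const float *inputs, float *outputs, int n_cases)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n_cases) return;

    float x = inputs[tid];
    float acc = 0.0f;                       // single-accumulator toy machine
    for (int pc = 0; pc < prog_len; pc++) { // no compilation: decode at runtime
        switch (prog[pc]) {
            case OP_ADD:    acc += x; break;
            case OP_SUB:    acc -= x; break;
            case OP_MUL:    acc *= x; break;
            case OP_LOAD_X: acc  = x; break;
        }
    }
    outputs[tid] = acc;
}

int main() {
    const int n = 1024, plen = 3;
    int h_prog[plen] = { OP_LOAD_X, OP_MUL, OP_ADD };   // computes x*x + x
    float h_in[n], h_out[n];
    for (int i = 0; i < n; i++) h_in[i] = (float)i;

    int *d_prog; float *d_in, *d_out;
    cudaMalloc(&d_prog, plen * sizeof(int));
    cudaMalloc(&d_in,  n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_prog, h_prog, plen * sizeof(int), cudaMemcpyHostToDevice);
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    interpret<<<(n + 255) / 256, 256>>>(d_prog, plen, d_in, d_out, n);
    cudaMemcpy(h_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("f(3) = %g\n", h_out[3]);        // expect 12
    cudaFree(d_prog); cudaFree(d_in); cudaFree(d_out);
    return 0;
}
```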

The above-mentioned methodologies have achieved different levels of success. However, the machine code methodology, called GPU machine code genetic programming (GMGP) [18], has exhibited significant performance gains over the others. Avoiding the compilation overhead without paying the cost of parsing the evolved program is the key to substantial reductions in computational time. In addition, GMGP implements linear genetic programming by using a quantum-inspired evolutionary algorithm, which is more efficient because it incorporates the past evaluation history into the generation of new programs. The linear genetic programming approach is well suited to machine code programs, as computer architectures require programs to be provided as linear sequences of instructions.

This work extends GMGP in two directions: First, we propose a new approach where the power of the GPU is fully exploited in the GP algorithm. Second, we assess the impact of selecting programs with the best evaluations on the best final solutions (i.e., programs). We have developed two approaches: GMGP-gpu and GMGP-gpu+. GMGP-gpu implements all the GP steps in the GPU and provides the following: (1) significant performance improvements in the GP steps and (2) elimination of the overhead of copying the fitness results from the GPU to the CPU through the PCIe bus. GMGP-gpu+ incorporates a new selection mechanism to recognize programs with the best evaluations. The new selection mechanism produces a more efficient comparison of the past population with the current population, bringing more diversity to the search.
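
The PCIe point can be sketched in a few lines (our illustration using Thrust, not GMGP-gpu's actual code): if the fitness array never leaves device memory, selection can be performed with a device-side reduction, and at most a single scalar crosses the bus.

```cuda
// Sketch: keep the fitness array on the GPU and find the best
// individual with a device-side reduction (Thrust), instead of
// copying every fitness value back to the CPU. Illustrative only.
#include <thrust/device_vector.h>
#include <thrust/extrema.h>
#include <thrust/sequence.h>
#include <cstdio>

int main() {
    const int pop_size = 100000;
    thrust::device_vector<float> fitness(pop_size);

    // In a real system a GPU evaluation kernel fills this array;
    // here we fake it with a simple sequence.
    thrust::sequence(fitness.begin(), fitness.end());
    fitness[12345] = -1.0f;                      // pretend this one is best

    // Device-side argmin: no bulk fitness transfer over PCIe.
    thrust::device_vector<float>::iterator best =
        thrust::min_element(fitness.begin(), fitness.end());
    int best_idx = (int)(best - fitness.begin());
    float best_fit = *best;                      // copies one scalar only

    printf("best individual: %d (fitness %g)\n", best_idx, best_fit);
    return 0;
}
```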

The two proposed approaches were compared with GMGP and found to outperform it for large-scale synthetic and real-world problems, with speedups ranging from 1.3× to 2.6×. They also provided substantial speedups over CPU execution; the parallel GMGP-gpu+ executed up to 325.5 times faster than the CPU version.

The remainder of this paper is organized as follows: Section 2 presents previous work on parallel GP. Section 3 briefly introduces GP and quantum-inspired GP. Section 4 describes the GMGP system. Section 5 describes our approach to fully exploit the power of the GPU in GMGP. Section 6 presents the experimental results obtained for synthetic and real-world problems. Finally, Section 7 draws some conclusions and presents future research directions.

Section snippets

Related work

GP has been extensively used to solve a wide range of problems. The problem of a high computational burden associated with GP has been tackled with a variety of parallel processing methods: distributed algorithms [19], hardware implementation [20], and FPGA implementation [21].

Recently, the power of GPUs has been harnessed to accelerate GP processing in different ways. The first GPU implementations of GP were proposed by Chitty [9] and Harding and Banzhaf [10]. They used the compilation methodology.

Genetic programming

In GP, a population of computer programs is evolved. Each computer program is called an individual and represents a potential solution to the problem. The basic algorithm defines the following [22] (a minimal sketch of these three ingredients follows the list):

  1. Terminal Set: A set of input variables and/or constants.

  2. Function Set: A set of instructions that, combined with the terminals, are used to build a program.

  3. Fitness Function: A score value assigned to each individual that measures how well the program solves the stated problem.
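
For concreteness, here is a minimal C sketch of these three ingredients for a toy symbolic-regression task (our example, not the paper's setup): the terminal set is {x, 1}, the function set is {+, *}, and the fitness function sums absolute errors over the fitness cases.

```c
/* Toy illustration of the three GP ingredients (our example):
 * terminal set {x, 1}, function set {+, *}, and a fitness function
 * that sums absolute errors over 20 fitness cases. */
#include <stdio.h>
#include <math.h>

static double target(double x) { return x * x + 1.0; } /* unknown to GP */

/* A candidate individual, built only from the terminal and function
 * sets; GP would evolve this expression rather than hard-code it. */
static double candidate(double x) { return (x * x) + 1.0; }

static double fitness(double (*prog)(double)) {
    double err = 0.0;
    for (int i = 0; i < 20; i++) {            /* 20 fitness cases */
        double x = (double)i / 2.0;
        err += fabs(prog(x) - target(x));     /* lower is better */
    }
    return err;
}

int main(void) {
    printf("fitness = %g\n", fitness(candidate)); /* 0 = perfect program */
    return 0;
}
```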

GPU machine code genetic programming

GPU machine code genetic programming (GMGP) is the first system to automatically synthesize computer programs as GPU machine code [18]. GMGP uses a quantum-inspired linear GP algorithm based on QILGP (quantum-inspired linear genetic programming). The basic algorithm of GMGP follows the evolutionary algorithm of Fig. 2; however, the evaluation step is executed in parallel on the GPU, while the initialization, observation, selection, and operator P steps are executed on the CPU.

The evaluation step is the most computationally demanding step of the algorithm.
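
A generic sketch of such a parallel evaluation step is shown below (our layout assumption, not GMGP's actual kernel, which is described in [18]): one thread block evaluates one program, threads stride over the fitness cases using the toy instruction set from the earlier interpreter sketch, and a shared-memory reduction accumulates the error. Host-side setup follows the same pattern as that earlier example.

```cuda
// Generic sketch of a parallel evaluation step: one thread block per
// program, threads stride over the fitness cases, and a shared-memory
// tree reduction accumulates the error. Opcodes are the toy set from
// the earlier interpreter sketch; THREADS must be a power of two.
#define THREADS 256

__global__ void evaluate(const int *progs, int prog_len,
                         const float *inputs, const float *expected,
                         int n_cases, float *fitness)
{
    __shared__ float err[THREADS];
    const int *prog = progs + blockIdx.x * prog_len;  // this block's program

    float local = 0.0f;
    for (int c = threadIdx.x; c < n_cases; c += blockDim.x) {
        float x = inputs[c], acc = 0.0f;
        for (int pc = 0; pc < prog_len; pc++) {       // interpret the program
            switch (prog[pc]) {
                case 0: acc += x; break;              // OP_ADD
                case 1: acc -= x; break;              // OP_SUB
                case 2: acc *= x; break;              // OP_MUL
                case 3: acc  = x; break;              // OP_LOAD_X
            }
        }
        local += fabsf(acc - expected[c]);            // absolute error
    }
    err[threadIdx.x] = local;
    __syncthreads();
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {    // tree reduction
        if (threadIdx.x < s) err[threadIdx.x] += err[threadIdx.x + s];
        __syncthreads();
    }
    if (threadIdx.x == 0) fitness[blockIdx.x] = err[0];
}
// Launch: evaluate<<<pop_size, THREADS>>>(d_progs, len, d_in, d_exp, n, d_fit);
```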

Complete exploitation of the GPU in GMGP

This work proposes a novel approach to exploit the power of GPUs within the framework of GP. Our approach is called GMGP-gpu and aims at moving the remaining genetic programming steps into the parallel execution environment of the GPU. GMGP-gpu is based on the observation that large problems imply large programs and large search spaces. Large programs have a large number of instructions, and both the observation and the operator P steps of the quantum-inspired evolutionary algorithm can take a substantial amount of time.
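
As an illustration of what moving the observation step onto the GPU might look like (a heavily hedged sketch: the paper does not spell out GMGP-gpu's kernels, and the probability-table layout below is our assumption), each thread samples one program token from that gene's cumulative probability table using cuRAND:

```cuda
// Hedged sketch of a GPU-side "observation" step: each thread samples
// one program token from its gene's cumulative probability table (the
// quantum-inspired representation). A generic illustration of moving
// the observation step onto the GPU, not GMGP-gpu's actual code.
#include <curand_kernel.h>

__global__ void observe(const float *probs,  // [n_genes][n_tokens], cumulative
                        int n_genes, int n_tokens,
                        int *program, unsigned long long seed)
{
    int g = blockIdx.x * blockDim.x + threadIdx.x;
    if (g >= n_genes) return;

    curandState st;
    curand_init(seed, g, 0, &st);             // per-gene RNG stream
    float u = curand_uniform(&st);            // u in (0, 1]

    const float *cdf = probs + g * n_tokens;
    int tok = 0;                              // inverse-CDF sampling
    while (tok < n_tokens - 1 && u > cdf[tok]) tok++;
    program[g] = tok;                         // observed classical token
}
// Launch: observe<<<(n_genes + 255) / 256, 256>>>(d_cdf, n_genes,
//                                                 n_tokens, d_prog, seed);
```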

Experimental results

GMGP-gpu and GMGP-gpu+ were implemented in C and CUDA 5.5, compiled with gcc 4.4.7 and nvcc release 5.5 (V5.5.0), using 256 threads per block in the evaluation step. The experiments were conducted on a GeForce GTX TITAN GPU with 2688 CUDA cores (at 837 MHz) and 6 GB of RAM (no ECC), with a memory bandwidth of 288.4 GB/s over a 384-bit data bus. GMGP-gpu and GMGP-gpu+ were compared with a CPU-only parallel processing version, where Intel x86 machine code was synthesized. For this

Conclusions

The synthesis of programs based on user-defined requirements is an exciting field of evolutionary computation. At present, however, this ambition is bounded by the computational power needed to perform the evaluation of billions of programs. This work focuses on accelerating genetic programming (GP) to support the synthesis of solutions to large problems. We worked on exploiting the highly parallel environment of the GPU to tackle the huge computational effort required by GP.

We provide a new perspective on

References (25)

  • J.R. Koza. Genetic programming: on the programming of computers by means of natural selection (1992).
  • W.A. Tackett. Genetic programming for feature discovery and image discrimination. In: Proceedings of the 5th...
  • M. Santini et al. Genetic programming for financial time series prediction.
  • J. Busch et al. Automatic generation of control programs for walking robots using genetic programming.
  • F.E. Otero et al. Genetic programming for attribute construction in data mining.
  • S. Harding et al. Genetic programming on GPUs for image processing. Int J High Perform Syst Archit (2008).
  • J.R. Koza et al. Automated synthesis of analog electrical circuits by means of genetic programming. IEEE Trans Evol Comput (1997).
  • W.B. Langdon. A many threaded CUDA interpreter for genetic programming.
  • D.M. Chitty. A data parallel approach to genetic programming using programmable graphics hardware.
  • S. Harding et al. Fast genetic programming on GPUs.
  • S.L. Harding, W. Banzhaf. Distributed genetic programming on GPUs using CUDA. In: Workshop on parallel architectures and...
  • W.B. Langdon, M. Harman. Evolving a CUDA kernel from an nVidia template. In: IEEE congress on evolutionary computation;...

Cleomar Pereira da Silva received his MSc and DSc in Electrical Engineering at the Pontifical Catholic University of Rio de Janeiro (2015). Currently, he is a professor in the Department of Education Development at the Federal Institute of Education, Science and Technology Catarinense. His current research interests include GPUs and genetic programming.

Douglas Mota Dias is a postdoctoral researcher in the Department of Electrical Engineering at the Pontifical Catholic University of Rio de Janeiro (PUC-Rio) and a member of the IEEE Computational Intelligence Society. He received his MSc and DSc in Electrical Engineering from PUC-Rio, Brazil, in 2010. His current research interests include machine learning, genetic programming, quantum-inspired evolutionary algorithms, and high-performance computing.

Cristiana Bentes received her MSc and DSc in Systems Engineering and Computer Science at the Federal University of Rio de Janeiro (1998). She is currently an associate professor in the Department of Systems Engineering and Computer Science at the State University of Rio de Janeiro. Her current research interests include high-performance genetic programming, parallel computing, and GPU programming.

Marco Aurélio Cavalcanti Pacheco received his PhD in Computer Science at University College London in 1991. He is currently a professor in the Department of Electrical Engineering at the Pontifical Catholic University of Rio de Janeiro, Brazil, where he coordinates ICA, the Applied Computational Intelligence Laboratory. His research interests include evolutionary computation, evolvable hardware, neural networks, fuzzy systems, applied computational intelligence, and knowledge discovery in databases.

Reviews processed and recommended for publication to the Editor-in-Chief by Associate Editor Dr. Jesus Carretero.
