Image feature selection using genetic programming for figure-ground segmentation

https://doi.org/10.1016/j.engappai.2017.03.009

Abstract

Figure-ground segmentation is the process of separating regions of interest from unimportant background. One challenge is to segment images with high variations (e.g. containing a cluttered background), which requires effective feature sets that capture the distinguishing information between objects and backgrounds. Feature selection is necessary to remove noisy/redundant features from those extracted by image descriptors. As a powerful search algorithm, genetic programming (GP) is employed for the first time to build feature selection methods that aim to improve the segmentation performance of standard classification techniques. Both single-objective and multi-objective GP techniques are investigated, based on which three novel feature selection methods are proposed. Specifically, one method is single-objective, called PGP-FS (parsimony GP feature selection); the other two are multi-objective, named nondominated sorting GP feature selection (NSGP-FS) and strength Pareto GP feature selection (SPGP-FS). The feature subsets produced by the three proposed methods, two standard sequential selection algorithms, and the original feature set are tested via standard classification algorithms on two datasets with high variations (the Weizmann and Pascal datasets). The results show that the two multi-objective methods (NSGP-FS and SPGP-FS) produce feature subsets that lead to better segmentation performance with fewer features than the sequential algorithms and the original feature set, based on standard classifiers for the given segmentation tasks. In contrast, PGP-FS produces results that are not consistent across classifiers. This indicates that the proposed multi-objective methods can help standard classifiers improve segmentation performance while reducing processing time. Moreover, compared with SPGP-FS, NSGP-FS is equally capable of producing effective feature subsets, yet is better at keeping diverse solutions.

Introduction

Figure-ground image segmentation can be regarded as a special case of image segmentation: it identifies only the regions of interest, treats all other parts as background, and thus produces binary images as the result. Figure-ground segmentation is an important topic, as many tasks in computer vision and image processing, e.g. robot grasping and image editing, are only interested in certain regions of images and use it as a preprocessing step to isolate these regions (Zou et al., 2014). It is difficult to achieve accurate segmentation, especially for images with high variations (Liang et al., 2015, Liang et al., 2017), e.g. in terms of object shapes and background regions. In such cases, effective image features that capture distinguishing information are necessary. However, the features extracted by existing feature descriptors often contain noisy/redundant components, which feature selection (FS) can help remove, thus improving the segmentation performance.

One challenge in FS is the large search space of possible feature subsets, so effective search methods are crucial. Generally, existing FS algorithms use three types of search methods, i.e. exhaustive, sequential and random methods, to search for good feature subsets (Liu and Yu, 2005). FS using exhaustive search, e.g. breadth-first search, evaluates all possible combinations of the original features and then selects the best subset. Exhaustive search has a high computational cost and may lead to over-fitting (Nagata et al., 2015). Sequential search methods, such as forward selection and backward selection (Zhou and Hansen, 2006), aim to produce a good solution in a reasonable time by trading off accuracy and optimality for speed. Random search methods start from a randomly selected feature subset and then proceed in one of two ways (Kumar and Minz, 2014). One is to generate the next subset completely at random, as in the Las Vegas algorithm; however, its guarantees assume unlimited run time, which is not realistic, and when a good feature subset cannot be discovered in the allocated time, it often performs poorly. The other is to include heuristic knowledge in the search process, as evolutionary computation (EC) techniques do.
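The sequential forward selection mentioned above can be sketched as follows. This is a generic illustration, not the exact procedure of Zhou and Hansen (2006); the `score` function is a hypothetical stand-in for whatever subset evaluation is used:

```python
def sequential_forward_selection(n_features, score):
    """Greedy forward selection: repeatedly add the feature that most
    improves score(subset) -> float (higher is better), stopping when
    no remaining feature improves the current best score."""
    selected, remaining = [], set(range(n_features))
    best_score = float("-inf")
    while remaining:
        candidate, candidate_score = None, best_score
        for f in sorted(remaining):
            s = score(tuple(selected + [f]))
            if s > candidate_score:
                candidate, candidate_score = f, s
        if candidate is None:  # no feature improves the score: stop
            break
        selected.append(candidate)
        remaining.remove(candidate)
        best_score = candidate_score
    return tuple(selected), best_score
```

Backward selection is the mirror image: start from the full set and greedily remove the feature whose removal hurts the score least. Both trade optimality for speed, as they never revisit earlier decisions.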

EC techniques have the potential to solve problems with large search spaces efficiently, and can be applied to a wide range of optimisation problems (Nag and Pal, 2016), e.g. feature selection. EC techniques can be mainly categorized into three groups (Borenstein and Ullman, 2008): evolutionary algorithms (e.g. genetic algorithms and genetic programming (GP)), swarm intelligence (e.g. particle swarm optimisation and the bee algorithm) and others (e.g. memetic algorithms). Compared with other EC techniques, GP is more flexible (Espejo et al., 2010), as it can utilize complex and variable-length representations, such as trees. The flexibility of GP makes it possible to evolve better solutions than those designed by experts. GP has been applied to feature selection (Nag and Pal, 2016, Smart and Burrell, 2015, Davis et al., 2006, Muni et al., 2006), and promising results have been achieved.

Existing GP based FS methods can be divided into three categories according to how selected feature subsets are evaluated: the filter, the wrapper and the embedded approaches (Xue et al., 2016), which are described in detail in Section 2.1. The filter approach keeps learning algorithms out of the evaluation loop, which makes evaluating feature subsets challenging, as it ignores the actual performance of the selected features on a given task (e.g. a classification problem). In contrast, the wrapper approach generally produces better performing feature subsets than the filter approach (Xue et al., 2016). The embedded approach, meanwhile, is conceptually more complex, and the required modifications to the learning algorithm may cause poor performance (Maldonado and Weber, 2011). Therefore, only wrapper feature selection methods are investigated in this work.
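As an illustration of the wrapper idea, the sketch below scores a feature subset by the hold-out accuracy of a learner trained on just those features. The tiny 1-nearest-neighbour classifier here is an assumption for illustration only, not one of the standard classifiers used in the paper:

```python
def wrapper_fitness(subset, train_set, test_set):
    """Wrapper evaluation: fitness of a feature subset is the hold-out
    accuracy of a classifier (here 1-NN) using only those features.
    train_set/test_set: lists of (feature_vector, label)."""
    def project(x):  # keep only the selected feature dimensions
        return [x[i] for i in subset]

    def predict(x):  # 1-NN by squared Euclidean distance in the subspace
        nearest = min(train_set,
                      key=lambda ex: sum((a - b) ** 2
                                         for a, b in zip(project(ex[0]),
                                                         project(x))))
        return nearest[1]

    correct = sum(predict(x) == y for x, y in test_set)
    return correct / len(test_set)
```

Because the learner sits inside the evaluation loop, the fitness reflects actual task performance of the subset, which is why wrappers typically outperform filters at a higher computational cost.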

Our initial work on feature selection (Liang et al., 2016) was published at the IEEE Congress on Evolutionary Computation 2016. In that paper, GP was used to evolve segmentation algorithms rather than for feature selection itself; based on the GP-evolved solutions, a simple feature selection was conducted by selecting the features with high occurrence rates. In contrast, this paper employs GP for the first time to develop feature selection methods for figure-ground segmentation tasks, aiming to produce effective feature subsets that help improve segmentation performance on complex images (e.g. images with high variations). Three novel wrapper FS methods using both single-objective and multi-objective GP techniques have been developed: PGP-FS (parsimony GP FS), NSGP-FS (nondominated sorting GP FS) and SPGP-FS (strength Pareto GP FS). PGP-FS is single-objective, while NSGP-FS and SPGP-FS are multi-objective, based respectively on NSGA-II (the nondominated sorting genetic algorithm II) and SPEA2 (the strength Pareto evolutionary algorithm 2). The feature subsets generated by the three proposed methods are studied and compared with those of two standard selection methods (sequential forward and sequential backward selection) and with the whole feature set. Specifically, we investigate the following objectives:

  1. explore whether the proposed methods can produce effective feature subsets for complex segmentation tasks,

  2. compare the single-objective method (PGP-FS) with the multi-objective methods (NSGP-FS and SPGP-FS),

  3. investigate which of the two multi-objective methods (NSGP-FS and SPGP-FS) performs better.
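The multi-objective methods rest on Pareto dominance over two minimisation objectives, here taken to be (classification error, number of selected features). The sketch below shows the dominance test and nondominated filtering that sit at the core of both NSGA-II and SPEA2 style selection; the function names are ours, and this is only the shared comparison step, not either full algorithm:

```python
def dominates(a, b):
    """a, b: (error, n_features) tuples to be minimised. a dominates b
    if it is at least as good on every objective and strictly better
    on at least one."""
    return (all(x <= y for x, y in zip(a, b)) and
            any(x < y for x, y in zip(a, b)))

def nondominated(candidates):
    """Return the Pareto front: candidates not dominated by any other."""
    return [c for c in candidates
            if not any(dominates(o, c) for o in candidates if o != c)]
```

For example, a subset with error 0.10 using 15 features is dominated by one with error 0.10 using 12 features, while subsets trading lower error for more features remain on the front; the two methods differ in how they rank and preserve diversity among such nondominated solutions.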

The rest of the paper is organised as follows. Section 2 introduces feature selection and existing work that utilizes GP to select features. Section 3 describes the related methods, i.e. feature extraction and the pixel classification based figure-ground segmentation method. In Section 4, the new feature selection methods are described. Section 5 provides the experimental preparation. In Section 6, the results are presented, and conclusions and future work are given in Section 7.

Section snippets

Feature selection

Feature selection (FS) is the process of selecting a feature subset from a large space of possible subsets (Saeys et al., 2007). There are three major steps in a general feature selection procedure, i.e. generating subsets, evaluating subsets and checking whether the stopping criterion has been met (Dash and Liu, 1997). As shown in Fig. 1, firstly the possible feature subsets are produced by search methods, then the subsets are evaluated. Two classes of evaluation criteria are widely used, i.e.

Baseline methods

Figure-ground segmentation tasks can be treated as pixel classification problems, hence our previous work (Liang et al., 2016) proposed a pixel classification based segmentation system, which used GP to evolve pixel classifiers. Since this paper focuses on GP's capacity for feature selection, a more generic system based on the previous one is proposed, which applies standard classifiers for pixel classification, and a feature selection process based on GP is introduced into the segmentation

New wrapper feature selection methods

This section introduces three new wrapper feature selection methods using GP, i.e. one single-objective method and two multi-objective methods. Fig. 5 illustrates how feature subsets are generated from the evolved solutions of the GP based FS methods. Tree representation is widely used in GP, so it is also applied in this work. For one GP solution, its terminals are collected, and the unique features that occur in the terminals form the selected feature subset. For example, the left tree

GP settings

The GP set-up employs the parameter settings used by Koza (1992) except for the following. Compared with the 1024 set by Koza, the population size in this work is set to 500, which is sufficient to solve the related problems while reducing the computational cost. Moreover, crossover and mutation are applied as reproduction operators, with rates of 90% and 10% respectively. The function set, including four standard arithmetic operators and five conditional operators, is displayed in

Training results

This part analyses the results of the training stage that generates solutions (feature subsets). Fig. 7 displays the single best solutions over 30 runs for PGP-FS, and the nondominated solutions of the aggregated Pareto fronts over 30 runs for NSGP-FS and SPGP-FS. From this figure, it can be seen that solutions contain fewer than 25 features, much smaller than the whole set of 53 features. Most NSGP-FS and SPGP-FS solutions have fewer features than PGP-FS solutions. Moreover, solutions with the same

Conclusions and future work

Three wrapper feature selection methods using GP were proposed in this paper, two of which (i.e. NSGP-FS and SPGP-FS) are multi-objective and one of which (i.e. PGP-FS) is single-objective. The contributions of this work are twofold. Firstly, this is the first work that investigates the ability of GP to select effective feature subsets for image segmentation tasks. Even though GP has been studied for feature selection in existing works, those works mainly target classification problems. GP has not

References (39)

  • K. Deb et al.

A fast and elitist multiobjective genetic algorithm: NSGA-II

    IEEE Trans. Evolut. Comput.

    (2002)
  • P.G. Espejo et al.

    A survey on the application of genetic programming to classification

    IEEE Trans. Syst., Man, Cybern., Part C.

    (2010)
  • M. Everingham et al.

The Pascal visual object classes challenge: a retrospective

    Int. J. Comput. Vision.

    (2015)
  • M. Hall et al.

The WEKA data mining software: an update

    ACM SIGKDD Explor. Newsl.

    (2009)
  • J.R. Koza

    Genetic Programming: On the Programming of Computers by Means of Natural Selection

    (1992)
  • V. Kumar et al.

    Feature selection

    SmartCR

    (2014)
  • Liang, Y., Zhang, M., Browne, W.N., 2015. A supervised figure-ground segmentation method using genetic programming. In:...
  • Liang, Y., Zhang, M., Browne, W.N., 2016. Figure-ground image segmentation and feature selection using genetic...
  • Liang, Y., Zhang, M., Browne, W.N., Multi-objective genetic programming for figure-ground image segmentation. In:...