Image feature selection using genetic programming for figure-ground segmentation
Introduction
Figure-ground image segmentation can be regarded as a special case of image segmentation: it identifies only the regions of interest and treats all other parts as the background, thus producing binary images as the result. Figure-ground segmentation is an important topic, as many tasks in computer vision and image processing, e.g. robot grasping and image editing, are only interested in certain regions of images and use figure-ground segmentation as a preprocessing step to isolate these regions (Zou et al., 2014). It is difficult to achieve accurate segmentation, especially for images with high variations (Liang et al., 2015, Liang et al., 2017), e.g. in terms of object shapes and background regions. In these cases, effective image features that capture distinguishing information are necessary. However, the features extracted by existing feature descriptors often contain noisy/redundant components, which feature selection (FS) can help remove, thus improving the segmentation performance.
One challenge in FS is the large search space of possible feature subsets, so effective search methods are crucial. Existing FS algorithms generally use three types of search methods, i.e. exhaustive, sequential and random methods, to search for good feature subsets (Liu and Yu, 2005). FS using exhaustive search methods, e.g. breadth-first search, evaluates all possible combinations of the original features and then selects the best subset. Exhaustive search has a high computational cost and may lead to over-fitting (Nagata et al., 2015). Sequential search methods, such as forward selection and backward selection (Zhou and Hansen, 2006), aim to produce a good solution in a reasonable time by trading off accuracy and optimality for speed. Random search methods initially select a feature subset at random, after which one of two strategies is applied to search for an optimal subset (Kumar and Minz, 2014). One is to generate the next subset completely at random, as in the Las Vegas algorithm. However, this assumes that the run time is infinite, which is not realistic; when the Las Vegas algorithm cannot discover a good feature subset in the allocated time, it often performs poorly. The other is to include heuristic knowledge in the search process, as evolutionary computation (EC) techniques do.
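To illustrate the sequential family of search methods mentioned above, the following is a minimal sketch of sequential forward selection (illustrative only, not the methods proposed in this paper): features are added greedily while a caller-supplied scoring function improves. The `score` function is a hypothetical stand-in for a wrapper evaluation such as classifier accuracy.

```python
# Sketch of sequential forward selection: greedily add the feature
# that most improves the score until no addition helps.

def sequential_forward_selection(n_features, score):
    selected = []
    remaining = set(range(n_features))
    best_score = score(selected)
    while remaining:
        # Evaluate adding each unselected feature; keep the best one.
        gains = {f: score(selected + [f]) for f in remaining}
        f_best = max(gains, key=gains.get)
        if gains[f_best] <= best_score:
            break  # no candidate improves the score; stop
        selected.append(f_best)
        remaining.remove(f_best)
        best_score = gains[f_best]
    return selected, best_score

# Toy scoring function: the "useful" features are 1 and 3, and a
# small penalty discourages large subsets.
useful = {1, 3}
toy_score = lambda subset: len(useful & set(subset)) - 0.01 * len(subset)
subset, s = sequential_forward_selection(5, toy_score)
print(sorted(subset))  # -> [1, 3]
```

As the introduction notes, such greedy methods trade optimality for speed: once a feature is added it is never reconsidered.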
EC techniques have the potential to solve problems with large search spaces efficiently, and can be applied to a wide range of optimisation problems (Nag and Pal, 2016), e.g. feature selection. EC techniques can be categorized into three main groups (Borenstein and Ullman, 2008): evolutionary algorithms (e.g. genetic algorithms and genetic programming (GP)), swarm intelligence (e.g. particle swarm optimisation and the bee algorithm) and others (e.g. memetic algorithms). Compared with other EC techniques, GP is more flexible (Espejo et al., 2010), as it can utilize complex and variable-length representations, such as trees. This flexibility makes it possible for GP to evolve better solutions than those designed by experts. GP has been applied to feature selection (Nag and Pal, 2016, Smart and Burrell, 2015, Davis et al., 2006, Muni et al., 2006), and promising results have been achieved.
Existing GP based FS methods can be divided into three categories based on how selected feature subsets are evaluated: the filter, the wrapper and the embedded approaches (Xue et al., 2016), which are described in detail in Section 2.1. The filter approach excludes learning algorithms from the evaluation loop, which makes evaluating feature subsets difficult because it ignores the actual performance of the selected features on the given task (e.g. a classification problem). In contrast, the wrapper approach generally produces better performing feature subsets than the filter approach (Xue et al., 2016). The embedded approach is more complex conceptually, and the required modifications to the learning algorithm may cause poor performance (Maldonado and Weber, 2011). Therefore, we only investigate wrapper feature selection methods in this work.
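The wrapper idea can be sketched as follows: a candidate subset is scored by training a classifier on just those features and measuring its accuracy, i.e. the learning algorithm sits inside the evaluation loop. This is an illustrative sketch, not the paper's exact setup; a 1-nearest-neighbour classifier stands in for whatever learning algorithm is wrapped.

```python
# Minimal wrapper evaluation: score a feature subset by the accuracy
# of a classifier trained on only those features.

def wrapper_score(subset, X_train, y_train, X_test, y_test):
    def project(x):            # keep only the selected features
        return [x[i] for i in subset]
    def predict(x):            # 1-NN on the projected features
        px = project(x)
        dists = [(sum((a - b) ** 2 for a, b in zip(project(t), px)), y)
                 for t, y in zip(X_train, y_train)]
        return min(dists)[1]   # label of the nearest training point
    correct = sum(predict(x) == y for x, y in zip(X_test, y_test))
    return correct / len(y_test)

# Toy data: feature 0 carries the class signal, feature 1 is noise.
X_tr = [[0.0, 5.0], [0.1, -3.0], [1.0, 4.0], [0.9, -2.0]]
y_tr = [0, 0, 1, 1]
X_te = [[0.05, -9.0], [0.95, 9.0]]
y_te = [0, 1]
print(wrapper_score([0], X_tr, y_tr, X_te, y_te))  # -> 1.0
```

On this toy data the informative subset `[0]` scores perfectly, while the noisy subset `[1]` does not, which is exactly the task-specific signal the filter approach lacks.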
Our initial work on feature selection (Liang et al., 2016) was published at the IEEE Congress on Evolutionary Computation 2016. In that paper, GP is used to evolve segmentation algorithms rather than for the purpose of feature selection; based on the GP-evolved solutions, a simple feature selection is conducted by selecting the features with high occurrence rates. In contrast, this paper employs GP for the first time to develop feature selection methods for figure-ground segmentation tasks, aiming to produce effective feature subsets that help improve segmentation performance on complex images (e.g. images with high variations). Three novel wrapper FS methods using both single-objective and multi-objective GP techniques, i.e. PGP-FS (parsimony GP FS), NSGP-FS (nondominated sorting GP FS) and SPGP-FS (strength Pareto GP FS), have been developed. Specifically, PGP-FS is single-objective, while NSGP-FS and SPGP-FS are multi-objective, based respectively on two multi-objective techniques, i.e. NSGA-II (the nondominated sorting genetic algorithm II) and SPEA2 (the strength Pareto evolutionary algorithm 2). The feature subsets generated by the three proposed methods will be studied and compared with two standard selection methods (sequential forward and sequential backward selection) and the whole feature set. Specifically, we investigate the following objectives:
1. explore whether the proposed methods can produce effective feature subsets for complex segmentation tasks;
2. compare the single-objective method (PGP-FS) with the multi-objective methods (NSGP-FS and SPGP-FS);
3. investigate which of the two multi-objective methods (NSGP-FS and SPGP-FS) performs better.
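The Pareto dominance idea underlying the multi-objective methods (NSGP-FS and SPGP-FS) can be sketched as follows. Each candidate is scored on two objectives to be minimised, assumed here to be (classification error, number of features); a solution is retained if no other solution dominates it. This is generic multi-objective machinery, not the paper's exact NSGA-II/SPEA2 implementations.

```python
# Pareto dominance and the nondominated set for two minimised
# objectives, e.g. (error, n_features).

def dominates(a, b):
    # a dominates b: no worse in every objective, better in at least one.
    return (all(x <= y for x, y in zip(a, b)) and
            any(x < y for x, y in zip(a, b)))

def nondominated(points):
    return [p for p in points
            if not any(dominates(q, p) for q in points if q != p)]

# (error, n_features) for five hypothetical feature subsets.
candidates = [(0.10, 20), (0.12, 8), (0.25, 3), (0.11, 25), (0.30, 10)]
print(sorted(nondominated(candidates)))
# -> [(0.1, 20), (0.12, 8), (0.25, 3)]
```

Here (0.11, 25) is dominated by (0.10, 20) and (0.30, 10) by (0.12, 8); the survivors form the accuracy-versus-size trade-off front that the multi-objective methods evolve.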
The rest of the paper is organised as follows. Section 2 provides an introduction to feature selection and existing works that utilize GP to select features. Section 3 describes the related methods, i.e. feature extraction and the pixel classification based figure-ground segmentation method. In Section 4, the new feature selection methods are described. Section 5 provides the experimental preparation. In Section 6, the results are presented, and conclusions and future work are given in Section 7.
Feature selection
Feature selection (FS) is the process of selecting a feature subset from a large space of possible subsets (Saeys et al., 2007). There are three major steps in a general feature selection procedure, i.e. generating subsets, evaluating subsets and checking whether the stopping criterion has been met (Dash and Liu, 1997). As shown in Fig. 1, firstly the possible feature subsets are produced by search methods, then the subsets are evaluated. Two classes of evaluation criteria are widely used, i.e.
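The generate/evaluate/stop procedure above can be sketched as a generic loop. For brevity the generator here is exhaustive enumeration (the simplest, though costly, instantiation); the evaluator is a caller-supplied score, a placeholder for whichever criterion is plugged in.

```python
from itertools import combinations

def feature_selection(n_features, score):
    best_subset, best_score = (), score(())
    # Generate candidate subsets, evaluate each, and stop once all
    # subsets have been checked (the stopping criterion here).
    for k in range(1, n_features + 1):
        for subset in combinations(range(n_features), k):
            s = score(subset)
            if s > best_score:
                best_subset, best_score = subset, s
    return list(best_subset), best_score

# Toy criterion rewarding features {2, 4} and penalising subset size.
score = lambda sub: len({2, 4} & set(sub)) - 0.01 * len(sub)
subset, s = feature_selection(6, score)
print(subset)  # -> [2, 4]
```

Swapping the enumeration for a sequential or EC-based generator yields the other search families discussed in the introduction, with the same loop skeleton.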
Baseline methods
Figure-ground segmentation tasks can be treated as pixel classification problems, hence our previous work (Liang et al., 2016) proposed a pixel classification based segmentation system, which used GP to evolve pixel classifiers. Since this paper focuses on GP's capacity for feature selection, a more generic system based on the previous one is proposed, which applies standard classifiers for pixel classification, and a feature selection process based on GP is introduced into the segmentation
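The pixel-classification view described above can be sketched as follows: every pixel is described by a feature vector, a classifier labels each pixel as figure (1) or ground (0), and the labels form the binary output image. This is an illustrative sketch; `classify` stands in for any standard classifier trained on pixel features, and the one-feature threshold rule below is a hypothetical example.

```python
# Segment an image by classifying every pixel's feature vector.

def segment(feature_image, classify):
    """feature_image: H x W grid of per-pixel feature vectors;
    returns an H x W binary mask (1 = figure, 0 = ground)."""
    return [[classify(fv) for fv in row] for row in feature_image]

# Toy example: one intensity feature per pixel; threshold classifier.
features = [[[10], [200], [220]],
            [[12], [240], [15]]]
mask = segment(features, lambda fv: 1 if fv[0] > 128 else 0)
print(mask)  # -> [[0, 1, 1], [0, 1, 0]]
```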
New wrapper feature selection methods
This section introduces three new wrapper feature selection methods using GP, i.e. one single-objective method and two multi-objective methods. Fig. 5 illustrates how feature subsets are generated from the evolved solutions of the GP based FS methods. The tree representation is widely used in GP, so it is also applied in this work. For one GP solution, its terminals are collected, and the unique features that occur in the terminals form the selected feature subset. For example, the left tree
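The subset-extraction step just described can be sketched in a few lines: traverse the evolved tree, collect its terminals, and keep the unique feature names. The nested-tuple tree encoding used here is an illustrative assumption, not the paper's actual representation.

```python
# Read the selected feature subset off a GP tree: the unique feature
# terminals that occur anywhere in the tree.

def selected_features(tree):
    """tree: ('op', child, child, ...) for function nodes,
    'Fi' for feature terminals, or a numeric constant."""
    if isinstance(tree, tuple):          # internal node: recurse
        feats = set()
        for child in tree[1:]:
            feats |= selected_features(child)
        return feats
    if isinstance(tree, str):            # feature terminal
        return {tree}
    return set()                         # numeric constant terminal

# (F1 + F3) * (F1 - 0.5): F1 occurs twice but is selected only once.
tree = ('*', ('+', 'F1', 'F3'), ('-', 'F1', 0.5))
print(sorted(selected_features(tree)))  # -> ['F1', 'F3']
```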
GP settings
The GP set-up parameters follow the settings used by Koza (1992) except for the following. Compared with the 1024 set by Koza, the population size in this work is set to 500, which is sufficient to solve the related problems while reducing the computational cost. Moreover, crossover and mutation are applied as reproduction operators, with rates of 90% and 10% respectively. The function set, including four standard arithmetic operators and five conditional operators, is displayed in
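A hedged sketch of such a configuration is shown below, using the parameter values stated above. The protected division and if-then-else operator are common choices for GP function sets of this kind; they are assumptions here, as the paper's exact five conditional operators are not reproduced.

```python
# Illustrative GP configuration matching the stated parameters.
GP_PARAMS = {
    "population_size": 500,   # vs. 1024 in Koza's settings
    "crossover_rate": 0.90,
    "mutation_rate": 0.10,
}

def protected_div(a, b):
    # Division that returns 0 on a zero denominator, so evolved
    # trees never raise ZeroDivisionError during evaluation.
    return a / b if b != 0 else 0.0

def if_gt(a, b, x, y):
    # Example conditional operator: returns x if a > b, else y.
    return x if a > b else y

print(protected_div(1.0, 0.0))   # -> 0.0
print(if_gt(3, 2, "yes", "no"))  # -> yes
```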
Training results
This part analyses the results of the training stage, in which solutions (feature subsets) are generated. Fig. 7 displays the single best solutions over 30 runs for PGP-FS, and the nondominated solutions of the aggregated Pareto fronts over 30 runs for NSGP-FS and SPGP-FS. From this figure, it can be seen that solutions contain fewer than 25 features, much smaller than the whole set of 53 features. Most NSGP-FS and SPGP-FS solutions have fewer features than PGP-FS solutions. Moreover, solutions with the same
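The aggregated Pareto front reported above can be formed as sketched below: pool the final solutions of all runs and keep only those not dominated by any pooled solution. Objectives are assumed to be (error, number of features), both minimised, and the numbers are hypothetical.

```python
# Aggregate the Pareto fronts of several runs into one front.

def dominates(a, b):
    return (all(x <= y for x, y in zip(a, b)) and
            any(x < y for x, y in zip(a, b)))

def aggregate_front(runs):
    pool = [p for run in runs for p in run]
    return sorted(p for p in pool
                  if not any(dominates(q, p) for q in pool if q != p))

run1 = [(0.15, 10), (0.20, 5)]
run2 = [(0.14, 12), (0.22, 4)]
print(aggregate_front([run1, run2]))
# -> [(0.14, 12), (0.15, 10), (0.2, 5), (0.22, 4)]
```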
Conclusions and future work
Three wrapper feature selection methods using GP were proposed in this paper, two of which (i.e. NSGP-FS and SPGP-FS) are multi-objective and one of which (i.e. PGP-FS) is single-objective. The contributions of this work are twofold. Firstly, this is the first work that investigates the ability of GP to select effective feature subsets for image segmentation tasks. Even though GP has been studied for feature selection in existing works, these mainly target classification problems. GP has not
References (39)

- Dash and Liu, 1997. Feature selection for classification. Intell. Data Anal.
- Davis et al., 2006. Novel feature selection method for genetic programming using metabolomic 1H NMR data. Chemom. Intell. Lab. Syst.
- Liang et al., 2017. Genetic programming for evolving figure-ground segmentors from multiple features. Appl. Soft Comput.
- Smart and Burrell, 2015. Genetic programming and frequent item set mining to identify feature selection patterns of iEEG and fMRI epilepsy data. Eng. Appl. Artif. Intell.
- Zhou and Hansen, 2006. Breadth-first heuristic search. Artif. Intell.
- Ahmed, S., Zhang, M., Peng, L., 2013. Enhanced feature selection for biomarker discovery in LC-MS data using GP. In: ...
- Borenstein and Ullman, 2008. Combined top-down/bottom-up segmentation. IEEE Trans. Pattern Anal. Mach. Intell.
- Bui, L.T., Essam, D., Abbass, H.A., Green, D., 2004. Performance analysis of evolutionary multi-objective optimization ...
- Multi-objective evolutionary algorithms. Encycl. Artif. Intell., 2009.
- Dash, M., Liu, H., Motoda, H., 2000. Consistency based feature selection. In: Pacific-Asia Conference on Knowledge ...
- Deb et al., 2002. A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans. Evol. Comput.
- Espejo et al., 2010. A survey on the application of genetic programming to classification. IEEE Trans. Syst., Man, Cybern., Part C.
- Everingham et al., 2015. The PASCAL visual object classes challenge: a retrospective. Int. J. Comput. Vision.
- Hall et al., 2009. The WEKA data mining software: an update. ACM SIGKDD Explor. Newsl.
- Koza, J.R., 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press.
- Kumar and Minz, 2014. Feature selection. SmartCR.