A genetic programming-based feature selection and fusion for facial expression recognition
Introduction
Recognition of human emotions has long been an active research area. A wide range of human-interaction applications need to decipher the emotional state conveyed by the face. Unlike other non-verbal gestures, the emotional state of the face can be related to several expressions. Most research has focused on posed facial expressions and has reached a high level of efficiency in recognizing human emotions [1], [2]. However, some pairs of posed expressions remain very hard to discriminate, and considerably less work has made progress in interpreting spontaneous facial emotions. Several factors affect the precision of facial expression recognition (FER) systems on spontaneous or posed expressions, including prominent facial feature selection, feature fusion, and classifier design. Since FER applications have to deal with natural emotions, our goal in this work is to develop a system that achieves accurate recognition rates on posed as well as spontaneous facial expressions. Extracting efficient facial features is crucial for facial emotion recognition. Commonly, two types of features are used to discriminate facial emotions: geometric and appearance features. Geometric features give clues about the shape and position of face components, whereas appearance-based features capture information about furrows, bulges, wrinkles, etc. Appearance features contain micro-patterns which provide important information about facial expressions, but a major drawback is that they are difficult to generalize across different persons. Although geometric features are noise sensitive and difficult to extract, they have proved sufficient to give accurate facial expression recognition results [3]. Moreover, He et al. [4]
demonstrated that geometric features are more effective than appearance ones in most cases. However, geometric-based FER methods still have difficulties discriminating some expressions. As an illustration, Fig. 1 shows two faces whose displayed emotional states were misclassified using geometric-based features but correctly recognized using local binary patterns (LBP), according to a study presented in [5]. Indeed, the micro-patterns captured by LBP features can offset the weakness of geometric-based features by encoding micro-variations caused by wrinkles, which are useful for separating such easily confused emotional states. Therefore, the facial geometry distortions given by geometric features are complementary to the textural information captured by appearance ones. In other words, fusing geometric and appearance features is a promising way to design more discriminative features for the challenges of FER.
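To make the notion of LBP micro-patterns concrete, the following is a minimal sketch of the basic 8-neighbour LBP operator and the histogram commonly used as an appearance feature. It is an illustrative toy, not the implementation used in this work:

```python
import numpy as np

def lbp_code(patch):
    """Basic 8-neighbour LBP code of a 3x3 patch: each neighbour is
    thresholded against the centre pixel and the bits are packed
    clockwise into one byte (0-255)."""
    center = patch[1, 1]
    # clockwise neighbour order starting at the top-left pixel
    neighbours = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
                  patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    return sum((1 << i) for i, p in enumerate(neighbours) if p >= center)

def lbp_histogram(gray):
    """256-bin LBP histogram of a grayscale image (appearance feature)."""
    h, w = gray.shape
    hist = np.zeros(256, dtype=float)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            hist[lbp_code(gray[y - 1:y + 2, x - 1:x + 2])] += 1
    return hist / max(hist.sum(), 1.0)  # normalise so faces of any size compare
```

The histogram discards pixel positions and keeps only the frequency of each micro-pattern, which is what makes LBP sensitive to wrinkle-induced texture changes.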
Feature-fusion-based methods face the problem of high dimensionality, which can degrade the quality of facial emotion recognition. Indeed, dealing with a large number of features can increase the computational time and overwhelm classifiers with unnecessary or redundant information. In this case, a rigorous feature selection step is necessary. To select a good feature subset, several factors must be considered. First, feature selection cannot be performed in the same way for spontaneous and posed expressions: spontaneous facial muscle movements have been proven to be significantly different from deliberate ones. According to Ekman et al. [6], the zygomatic major is the only muscle contracted in posed smiles, whereas the muscles around the eyes (i.e. the orbicularis oculi) are also contracted during genuine smiles. Moreover, Namba et al. [7] studied the difference in action units (AUs) [8] between posed and genuine emotions. For instance, for certain genuine expressions the most commonly observed AUs include eye squinting and raising of the upper lip, whereas glaring and chin raising are often spotted in their posed counterparts. The eyebrow raiser, the lips part, the jaw drop, and the upper lid raiser are hardly observed in genuine expressions compared to acted ones, as specified in [7]. In other words, for the same emotional state, spontaneous AUs differ from posed ones. Second, the AUs that discriminate between expressions may change from one pair of expressions to another, even within the same expression category (spontaneous or posed). Therefore, a static selection method that picks the single best-on-average feature subset for all the expressions may not be efficient: choosing the average does not always mean choosing the best. Even if the selected subset shows good results on most expressions, it may perform poorly on others.
Thus, for better training and emotion detection, choosing relevant and effective features is crucial, as irrelevant and noisy features may mislead and negatively affect the recognition system.
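As a toy illustration of why per-pair selection can beat a single "average best" subset, the following hypothetical sketch scores features with a simple Fisher criterion and keeps a different top-k subset for every pair of expression classes. The paper itself evolves this selection with genetic programming; the Fisher heuristic here is only an assumption for illustration:

```python
import numpy as np
from itertools import combinations

def fisher_scores(X, y):
    """Per-feature Fisher score for a two-class problem:
    (m1 - m2)^2 / (v1 + v2). Higher = more discriminative."""
    a, b = X[y == 0], X[y == 1]
    num = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    den = a.var(axis=0) + b.var(axis=0) + 1e-12
    return num / den

def pairwise_selection(X, labels, k=10):
    """Select the top-k features separately for every pair of classes,
    instead of one 'average best' subset shared by all expressions."""
    subsets = {}
    for c1, c2 in combinations(np.unique(labels), 2):
        mask = (labels == c1) | (labels == c2)
        Xp = X[mask]
        yp = (labels[mask] == c2).astype(int)
        subsets[(c1, c2)] = np.argsort(fisher_scores(Xp, yp))[::-1][:k]
    return subsets
```

Each expression pair thus gets its own subset, mirroring the observation that the discriminating AUs change from one pair to another.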
To sum up, research in the field of facial expression recognition faces two main challenges. First, is it possible to fuse hybrid features (geometric, texture, etc.) within the framework of facial expression recognition without information redundancy? Second, how can a feature selection mechanism be designed to handle the expressions that are commonly misclassified by FER systems? The aim of this work is to explore soft computing techniques, such as genetic programming, to enhance the way features are selected and fused for more accurate facial expression recognition. The main contributions of this work are threefold:
- We propose a genetic programming (GP) based framework for posed and spontaneous facial expression recognition that combines hybrid facial features; we then test it on the fusion of geometric and appearance features.
- The feature selection and fusion, in this work, are performed in a binary way: the most prominent features are selected and fused differently for each pair of expressions.
- To the best of our knowledge, we are the first to propose genetic programming-based classifiers for facial emotion recognition that incorporate simultaneous feature selection and fusion within the evolutionary learning process.
The rest of this paper is organized as follows. Section 2 presents a brief literature review of feature fusion and selection methods for facial expression recognition. The proposed method is detailed in Section 3, and the results are presented and discussed in Section 4. Finally, Section 5 concludes the paper and presents some ideas for future studies.
Section snippets
Related work
A basic FER framework has to perform emotion recognition by representing facial emotional states through an appropriate set of features, while ensuring a trade-off between recognition accuracy and computation time. Geometric and texture features have been widely used, mostly separately, for FER purposes, and there has been some research on feature fusion in emotion recognition over recent years. In [9], the authors proposed a hybrid method combining geometric and appearance features. This
Proposed method
This section provides a detailed description of the proposed method for spontaneous and posed facial emotion recognition. The general flowchart of the suggested method is illustrated in Fig. 2. First, a face detection step is performed, followed by feature extraction, in which geometric and texture features are extracted and fed to genetically evolved programs. A binary genetic program then learns to select the most discriminating features and to fuse them specifically
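The idea of a binary program that selects and fuses features at the same time can be sketched as follows. The tuple-based tree representation, the operator set, and the threshold decision rule are illustrative assumptions for exposition, not the paper's exact GP configuration:

```python
import operator
import random

# Toy GP individual: an expression tree whose leaves index into the
# feature vector, so an evolved program *selects* features (leaves)
# and *fuses* them (internal arithmetic nodes) simultaneously.
OPS = {'add': operator.add, 'sub': operator.sub, 'mul': operator.mul}

def evaluate(tree, features):
    """Recursively evaluate a tree such as
    ('add', ('mul', 0, 1), 2) over a feature vector."""
    if isinstance(tree, int):          # leaf = feature index
        return features[tree]
    op, left, right = tree
    return OPS[op](evaluate(left, features), evaluate(right, features))

def classify(tree, features, threshold=0.0):
    """Binary GP classifier: positive output -> class A, else class B."""
    return 'A' if evaluate(tree, features) > threshold else 'B'

def random_tree(n_features, depth=3, rng=None):
    """Grow a random individual for an initial GP population."""
    rng = rng or random.Random(0)
    if depth == 0 or rng.random() < 0.3:
        return rng.randrange(n_features)   # terminal: pick a feature index
    op = rng.choice(list(OPS))
    return (op, random_tree(n_features, depth - 1, rng),
                random_tree(n_features, depth - 1, rng))
```

In a full GP loop, such trees would be evolved (crossover on subtrees, mutation of nodes) against a fitness measuring binary classification accuracy on one expression pair, as done here with the DEAP framework.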
Experimental results
In this section, the proposed facial emotion recognition method is tested and compared to several relevant FER methods. The proposed method has been implemented using the Anaconda Python distribution and the DEAP1 evolutionary computation framework. The experiments have been executed on a PC with Windows 10 as an operating
Conclusion and future work
In this work, a robust method for automatic facial expression recognition is proposed. Genetic programming-based binary programs, which incorporate feature selection and fusion in the learning process, are evolved to discriminate between pairs of expression classes. The overall expression recognition is performed using a unique tournament elimination between the learned binary classifiers. The suggested method selects and combines linear, eccentricity and LBP features differently for each
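For intuition, the tournament elimination among pairwise classifiers can be sketched as follows, where `decide(a, b)` stands in for a hypothetical learned binary classifier for the pair (a, b). This is a schematic of the decision scheme only, not the paper's implementation:

```python
def tournament_winner(classes, decide):
    """Single-elimination tournament over expression classes.
    `decide(a, b)` is the binary classifier for the pair (a, b) and
    returns the winning class; with an odd count, the last class
    gets a bye into the next round."""
    round_ = list(classes)
    while len(round_) > 1:
        nxt = []
        for i in range(0, len(round_) - 1, 2):
            nxt.append(decide(round_[i], round_[i + 1]))
        if len(round_) % 2:            # odd count: last class advances
            nxt.append(round_[-1])
        round_ = nxt
    return round_[0]
```

With N expression classes, this scheme needs only N - 1 binary decisions per test face, rather than evaluating all N(N-1)/2 pairwise classifiers.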
CRediT authorship contribution statement
Haythem Ghazouani: Conceptualization, Methodology, Software, Validation, Writing.
Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References (47)
- et al., Facial expression recognition of intercepted video sequences based on feature point movement trend and feature block texture variation, Appl. Soft Comput. (2019)
- et al., Facial expression recognition using distance and texture signature relevant features, Appl. Soft Comput. (2019)
- et al., A principal component analysis of facial expressions, Vision Res. (2001)
- et al., A decision-theoretic generalization of on-line learning and an application to boosting, J. Comput. Syst. Sci. (1997)
- et al., 300 faces in-the-wild challenge: database and results, Image Vision Comput. (2016)
- et al., Facial expression recognition based on deep convolution long short-term memory networks of double-channel weighted mixture, Pattern Recogn. Lett. (2020)
- et al., Deep multi-path convolutional neural network joint with salient region attention for facial expression recognition, Pattern Recogn. (2019)
- et al., Fully automated facial expression recognition using 3D morphable model and mesh-local binary pattern
- et al., Sparse coding-based representation of LBP difference for 3D/4D facial expression recognition, Multimedia Tools Appl. (2019)
- M.F. Valstar, I. Patras, M. Pantic, Facial action unit detection using probabilistic actively learned support vector...