A novel representation in genetic programming for ensemble classification of human motions based on inertial signals

https://doi.org/10.1016/j.eswa.2021.115624

Abstract

Sensing technologies and novel computational methods for automated motion detection can play a major role in improving quality of life. Recently, researchers have become interested in using inertial sensors to record human motion signals and in applying new machine learning methods to signal-based motion detection. This manuscript proposes a novel method for human motion detection based on inertial sensors. The method first uses the spatial information of a motion for geometric feature extraction. The manuscript also introduces a novel ensemble learning approach built on the genetic programming paradigm. To reduce the overall complexity of designing the proposed classifier, an initial population of binary trees (genes) is first created and then improved through genetic programming to select the best classifier. A comprehensive experiment was conducted to evaluate the proposed ensemble classifier on inertial signals of human motions. According to the experimental results on several well-known datasets of inertial signals, the proposed approach performed competitively with existing methods.

Introduction

Due to their small size and low cost, inertial sensors such as accelerometers and gyroscopes are well suited to smart wearable devices. Such devices can be worn on different body parts (e.g. hands, chest, knees, and legs) for long-term recording of physical data (Wang, Chen, Yang, Zhao, & Chang, 2016). This has increased researchers’ interest in designing human motion detection systems based on inertial signals. In a motion detection system with inertial sensors, every motion $C_i$ in $\{C_i\}_{i=1}^{N}$ ($N$ indicates the number of motions) is measured by $M$ inertial sensors. Moreover, every inertial sensor $S_j$ in $\{S_j\}_{j=1}^{M}$ has $P$ signals; therefore, $M \times P$ inertial signals are stored for every motion $C_i$. The classification task is to find a mapping function that automatically assigns the matrix of inertial signals to $C_i$. In these systems, the larger $M$ is, the larger the number of measurements, and thus the more accurately motions are recorded. However, a larger $M$ also increases the problem dimensionality, which makes the classification model more complicated. As a result, system scalability declines because of the hardware limitations of inertial devices. Hence, these systems pursue two conflicting goals: accuracy and scalability.
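
The formulation above can be written compactly. The following is only a notational sketch: the number of time samples $T$ and the symbol $s_{j,p}(t)$ for the $p$th signal of sensor $S_j$ at sample $t$ are assumptions introduced here for illustration, not notation taken from the paper.

```latex
X =
\begin{bmatrix}
s_{1,1}(1) & \cdots & s_{1,1}(T) \\
\vdots     & \ddots & \vdots     \\
s_{M,P}(1) & \cdots & s_{M,P}(T)
\end{bmatrix}
\in \mathbb{R}^{(M \cdot P) \times T},
\qquad
f : \mathbb{R}^{(M \cdot P) \times T} \to \{C_1, \ldots, C_N\}
```

For instance, in a hypothetical setup with $M = 3$ sensors that each deliver $P = 6$ signals, $X$ has 18 rows, so every additional sensor adds $P$ rows to the classifier input.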

The major challenge addressed so far has been to increase classification accuracy. For this purpose, many powerful methods based on deep learning have been proposed; in addition to accuracy, they offer good generalization. A deep learning model is trained in an end-to-end manner, which eliminates the manual feature extraction process (Wang, Cang, & Yu, 2019). The most well-known deep learning model is the convolutional neural network (CNN), which yielded an average accuracy of 93% on REALDISP with 11 different motions (San et al., 2017). Another class of models is based on long short-term memory (LSTM), which resulted in an average accuracy of 91% on PAMAP2 with 17 different motions (Lv, Xu, & Chen, 2019). Despite the acceptable accuracy of deep learning methods, scalability remains an open problem that depends directly on the number and size of model parameters. Because of their high computational costs, these models are ill-suited to inertial devices with limited memory, processing power, and battery capacity (Ahmed Bhuiyan et al., 2020; Demrozi et al., 2020; Sepahvand and Abdali-Mohammadi, 2019). Recently, researchers have tried to propose lightweight deep learning models by working on topics such as parameter pruning and knowledge distillation (Gou, Yu, Maybank, & Tao, 2020) and by employing models such as MobileNets (Howard, Zhu, Chen, Kalenichenko, Wang, Weyand, & Adam, 2017) and ShuffleNets (Zhang, Zhou, Lin, & Sun, 2018). Nevertheless, these models are still in their infancy, and only a few methods have so far been proposed for image classification. Deep learning methods based on MobileNets and ShuffleNets will probably become appropriate for inertial devices in the future.

Given that heavyweight deep learning models are ill-suited to scalable motion classification, ensemble learning is a good option. Ensemble methods generally classify more accurately than single classifiers, and their key advantage is scalability. Combining a set of diverse classifiers can generally improve accuracy and reduce overfitting (Zhou & Feng, 2017). Generalization error reduction and diversity are two major factors in the performance of an ensemble. Diversity means that the errors of the classifiers are uncorrelated, which is not easily measurable (Kuncheva & Whitaker, 2003). In particular, integrating Genetic Programming (GP) with ensemble methods has attracted many researchers to the development of diverse ensembles because of the tangible advances GP has brought to this area. GP (Koza, 1994) is an evolutionary method in which every solution is a tree-like program that evolves through selection, crossover, and mutation operations to perform a specific task in a single- or multi-objective problem. Through the stochastic nature of GP, an optimal group of classifiers can be obtained by repeating the iterations of the algorithm until it converges in the problem space. Many methods have been proposed in recent years to design GP-based ensembles. Examples are the multifactor GP model based on multifactor optimization (Zhong, Feng, Cai, & Ong, 2020), the grammar-guided GP ensemble (Moyano, Gibaja, Cios, & Ventura, 2020), the scalable GP ensemble (Kumar, Satapathy, & Murthy, 2009), the hybrid GP ensemble (Chan, Kwong, & Kremer, 2020), and the distributed GP ensemble (Folino, Pizzuti, & Spezzano, 2008), as well as other methods introduced in (Tao, Chen, Fu, Jiang, & Zhang, 2019), (Hengpraprohm & Chongstitvatana, 2008) and (Nag & Pal, 2016).
In these methods, the overall idea is to generate a population of $N_{POP}$ ensembles coded in chromosomes, $\Theta(\tau)=\{\theta_1(\tau),\theta_2(\tau),\ldots,\theta_{N_{POP}}(\tau)\}$, in which $\theta_i(\tau)$ represents the $i$th chromosome in the $\tau$th iteration. In fact, $\theta_i(\tau)$ is an ensemble represented as a tree of simple arithmetic operators ($\times$, $+$, $-$, $/$) or trigonometric operators (sin, cos, tan, cot). For complicated pattern recognition problems in dynamic human motion signals (e.g. an accelerometer signal), in which natural and dynamic noise is high, a regressor built from simple mathematical operators would be inadequate. Therefore, such GP-based ensembles lose efficiency in these problems and fail to catch up with powerful methods such as deep learning.
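
To make the conventional representation concrete, the sketch below builds a small population of expression trees over arithmetic and trigonometric operators and uses the sign of a tree's output as a binary decision. It is a minimal illustration, not the encoding of any cited method: the nested-tuple encoding, `protected_div`, the sign threshold in `classify`, and the names `random_tree` and `N_POP` are assumptions made here, and tan and cot are left out only to avoid their singularities. Selection, crossover, and mutation are not shown.

```python
import math
import random


def protected_div(a, b):
    """Division that returns 1.0 near zero denominators (a common GP convention, assumed here)."""
    return a / b if abs(b) > 1e-9 else 1.0


BINARY = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": protected_div,
}
UNARY = {"sin": math.sin, "cos": math.cos}


def random_tree(depth, n_features, rng):
    """Grow one random chromosome as nested tuples; leaves are ("x", feature_index)."""
    if depth == 0 or rng.random() < 0.2:
        return ("x", rng.randrange(n_features))
    if rng.random() < 0.25:
        return (rng.choice(sorted(UNARY)), random_tree(depth - 1, n_features, rng))
    return (
        rng.choice(sorted(BINARY)),
        random_tree(depth - 1, n_features, rng),
        random_tree(depth - 1, n_features, rng),
    )


def evaluate(tree, x):
    """Recursively evaluate a tree on the feature vector x."""
    op = tree[0]
    if op == "x":
        return x[tree[1]]
    if op in UNARY:
        return UNARY[op](evaluate(tree[1], x))
    return BINARY[op](evaluate(tree[1], x), evaluate(tree[2], x))


def classify(tree, x):
    """Hypothetical decision rule: class 1 if the regressor output is non-negative."""
    return 1 if evaluate(tree, x) >= 0.0 else 0


rng = random.Random(0)
N_POP = 5  # illustrative population size
population = [random_tree(3, 2, rng) for _ in range(N_POP)]

# A hand-written chromosome: sin(x0) + x1 * x1
example = ("+", ("sin", ("x", 0)), ("*", ("x", 1), ("x", 1)))
print(evaluate(example, [0.0, 2.0]), classify(example, [0.0, 2.0]))
```

Every node here is a scalar function of scalars, which is the limitation the paragraph above points to: the expressive power of the ensemble is bounded by what such operators can compose.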

A flexible tree-based representation can be employed to create a more powerful ensemble for complicated classification problems. In this type of representation, the tree nodes of GP programs include not only simple mathematical operators but also high-level operators (e.g. single homogeneous and heterogeneous regressors), which can be used in complicated classification problems. According to the literature, the flexible tree-based representation was used for image classification in the model introduced in (Bi, Xue, & Zhang, 2019) and recently finalized in (Bi, Xue, & Zhang, 2020). The nodes of the tree introduced in (Bi et al., 2020) are divided into image texture operators and ensemble learning layers. This method builds an ensemble of single classifiers with the majority voting algorithm, in which the output labels of the classifiers are compared. Destructive diversity occurs in the voting algorithm when there are too many wrong answers. In this case, the accuracy of the ensemble may fall below that of an individual model (Wang, 2008).
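
The effect of destructive diversity under majority voting can be reproduced with a toy example. The sketch below is illustrative only: the labels and the three classifiers' predictions are made-up values chosen so that two weak classifiers outvote a stronger one, and the helper names are assumptions of this example.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label among the classifiers' outputs for one sample."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predicted, truth):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


# Hypothetical ground truth and predictions of three single classifiers.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
clf_a = [1, 1, 1, 1, 0, 0, 0, 1]  # one error
clf_b = [0, 0, 1, 1, 1, 0, 0, 0]  # three errors
clf_c = [0, 1, 1, 0, 1, 0, 0, 0]  # three errors, partly shared with clf_b

ensemble = [majority_vote(votes) for votes in zip(clf_a, clf_b, clf_c)]

print("best single classifier:", accuracy(clf_a, truth))
print("majority-vote ensemble:", accuracy(ensemble, truth))
```

Here the ensemble reaches 0.75 while the best single classifier reaches 0.875, because the two weaker classifiers make the same mistakes on two samples and outvote the correct answer.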

To overcome destructive diversity, single classifiers can be connected through the perceptron topology of a neural network instead of being combined through voting (Ojha, Abraham, & Snášel, 2017). In a sequential perceptron connection in neural networks, every perceptron, acting as a regressor, can use the support output of other perceptrons as a new input and experience. The final output of the classification is a decision made through the experiences of all regressors. In this type of ensemble, a general collaboration is formed between the functions of all regressors. As a result, developing a powerful topology makes a better and more comprehensive search of the pattern recognition problem space possible. The same model was used in this paper to introduce a flexible ensemble tree based on GP (ETGP), in which sets of heterogeneous single classifiers are interconnected from the lower levels up to the root of the novel tree to form an ensemble. In the novel tree, the nodes are high-level operators divided into classification and combination layers. In the classification layer, the nodes are single classifiers tasked with classifying their inputs and obtaining a true positive set. These single classifiers are nonlinear models that are able to estimate complicated patterns. Between every two classification layers, a combination layer combines the outputs of the lower classification layer by using set operators and delivers them to the upper classification layer. Finally, every ensemble tree extracts, in its root node, the samples belonging to the true positive set of the current class. After these trees are evaluated through the GP paradigm in the training phase, the best tree is selected for the test phase. The proposed ETGP was used to classify motions based on the signals of inertial sensors.
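
The alternation of classification and combination layers can be sketched as follows. This is a loose illustration under stated assumptions, not the authors' implementation: the class names `ClassifierNode` and `CombinationNode`, the choice of union and intersection as the set operators, the rule that an upper classifier filters the set delivered from below, and the threshold lambdas standing in for trained nonlinear classifiers are all hypothetical. Fitness evaluation and GP evolution of the trees are not shown.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, Optional, Sequence

Sample = Sequence[float]


@dataclass
class CombinationNode:
    """Combination layer: merges the sets produced by two lower classifier nodes."""
    op: str  # "union" or "intersection" (assumed operators)
    left: "ClassifierNode"
    right: "ClassifierNode"

    def run(self, data: Dict[int, Sample], ids: FrozenSet[int]) -> FrozenSet[int]:
        a = self.left.run(data, ids)
        b = self.right.run(data, ids)
        if self.op == "union":
            return a | b
        if self.op == "intersection":
            return a & b
        raise ValueError(f"unknown set operator: {self.op}")


@dataclass
class ClassifierNode:
    """Classification layer: keeps the sample ids it assigns to the current class."""
    accepts: Callable[[Sample], bool]  # stand-in for a trained single classifier
    below: Optional[CombinationNode] = None  # None marks a leaf classifier

    def run(self, data: Dict[int, Sample], ids: FrozenSet[int]) -> FrozenSet[int]:
        if self.below is not None:
            ids = self.below.run(data, ids)  # set delivered by the layer below
        return frozenset(i for i in ids if self.accepts(data[i]))


# Hypothetical two-feature samples keyed by sample id.
data = {0: (0.9, 0.1), 1: (0.2, 0.7), 2: (0.1, 0.2), 3: (0.6, 0.6), 4: (0.55, 0.1)}

leaf_a = ClassifierNode(lambda x: x[0] > 0.5)
leaf_b = ClassifierNode(lambda x: x[1] > 0.5)
root = ClassifierNode(
    lambda x: x[0] + x[1] > 0.8,
    below=CombinationNode("union", leaf_a, leaf_b),
)

selected = root.run(data, frozenset(data))
print(sorted(selected))
```

In this toy tree the two leaves propose candidate sets, the combination node takes their union, and the root classifier refines it, so the final decision depends on all nodes rather than on a vote between independent labels.
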
In this paper, two preprocessing phases were implemented to make the input of the ensemble lightweight and reduce the overall complexity. Inertial navigation algorithms were first employed to derive, from the inertial signals, the position signal corresponding to the trajectory of a motion, from which high-level geometric features were then extracted (their efficiency was highlighted in our previous work (Sepahvand, Abdali-Mohammadi, & Mardukhi, 2017)). The evaluations cover the accuracy and scalability of motion classification. The model size and the number of its parameters were evaluated to analyze scalability. Moreover, the comparisons were divided into two parts. The proposed ETGP was first compared with peer methods, and the results were analyzed. To assess accuracy, it was then compared with many state-of-the-art deep learning methods for detecting motions based on inertial sensors.
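
The idea behind the position signal can be illustrated with the simplest possible case. The sketch below integrates a single-axis acceleration twice with the trapezoidal rule. It assumes that gravity has already been removed, that the sensor axis is aligned with the direction of motion, and that sampling is uniform; a full inertial navigation pipeline would also use gyroscope data to track orientation, which is not shown. The function names and sample values are hypothetical and are not taken from the paper.

```python
def cumulative_trapezoid(values, dt, initial=0.0):
    """Running trapezoidal integral of uniformly sampled values."""
    out = [initial]
    for prev, cur in zip(values, values[1:]):
        out.append(out[-1] + 0.5 * (prev + cur) * dt)
    return out


def position_from_acceleration(acc, dt):
    """Integrate acceleration to velocity, then velocity to position (zero initial state)."""
    velocity = cumulative_trapezoid(acc, dt)
    return cumulative_trapezoid(velocity, dt)


dt = 0.1           # sampling period in seconds (assumed)
acc = [2.0] * 11   # constant 2 m/s^2 over one second (made-up signal)
pos = position_from_acceleration(acc, dt)
print(round(pos[-1], 6))  # displacement after one second, analytically 0.5 * 2 * 1^2
```

Because integration accumulates sensor noise and bias, practical systems typically add drift correction; that step is outside the scope of this sketch.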

The most important contributions of this manuscript are as follows:

  1. Proposing a scalable method for detecting human motions based on inertial signals.
  2. Employing high-level geometric features to describe human motions.
  3. Developing an ensemble learning model based on a novel representation in the GP algorithm for motion classification.

Finally, the most important questions that this article seeks to answer are the following:

  1. Is the computational complexity of a GP-based ensemble classifier lower than that of a deep learning classifier?
  2. Can the accuracy of a GP-based ensemble compete with that of a deep learning model?

The rest of this paper is organized as follows. Section 2 briefly reviews related work. Section 3 describes the components of the proposed method. Section 4 evaluates the proposed method and analyzes the results. Finally, the discussion and conclusions are presented in Sections 5 and 6, respectively.

Related works

In recent years, many systems have been proposed for human motion detection based on inertial signals. Various datasets of recorded human motions have been collected to design these systems. The most important datasets, for which many classification methods have recently been proposed, include REALDISP with 33 motions, PAMAP2 with 24 motions, Wrist-Worn with 14 motions, and Chest-Mounted with 7 motions. Different methods of hand-crafted and automated feature learning as well as single and deep

Proposed method

Fig. 1 shows the pipeline of the proposed system for human motion detection based on inertial signals obtained from accelerometers and gyroscopes. The pipeline consists of three phases: position signal calculation, geometric signal extraction, and classification through the proposed ETGP. The system was designed in a three-phase pipeline framework for an important reason. Our previous study can be reviewed to justify the use of position signal and geometric feature extraction phases (

Experimental results

In this section, the proposed system is evaluated using four large datasets called REALDISP (Banos, Toth, Damas, Pomares, & Rojas, 2014), Chest-Mounted (Casale, Pujol, & Radeva, 2012), Wrist-worn (Bruno, Mastrogiovanni, Sgorbissa, Vernazza, & Zaccaria, 2013), and PAMAP2 (Reiss & Stricker, 2012), which are available at the UCI Machine Learning Repository. These datasets contain the signals of various human motions, collected by inertial sensors including accelerometers and gyroscopes. In these

Discussion

In the construction of ensemble classifiers with evolutionary algorithms such as GP, the classifier is developed automatically in the course of evolution. This avoids the difficulties of traditional methods, which involve designing the classifier structure manually through trial and error [37]. In the proposed ETGP, which is shown in Fig. 4, the tree height and the number of nodes in the tree are flexible. Therefore, the tree can contain an arbitrary

Conclusions

This paper presented a new system of human motion classification based on the signals recorded from inertial sensors. The main focus of this paper was to introduce a new GP-based ensemble method (ETGP). In this method, the computational cost is reduced by constructing an initial population of binary trees (genes) in one run of GP, with each tree representing one classifier. For each tree, the mapping function was defined as $X \rightarrow Y$, where Y is the classification output generated by the gene and X

CRediT authorship contribution statement

Majid Sepahvand: Conceptualization, Software, Data curation, Writing - original draft, Visualization, Investigation, Validation. Fardin Abdali-Mohammadi: Conceptualization, Methodology, Project administration, Supervision, Writing - review & editing.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

References (49)

  • Y. Bi et al. Genetic Programming with a new representation to automatically learn features and evolve ensembles for image classification. IEEE Transactions on Cybernetics (2021)
  • B. Bruno et al. Human motion modelling and recognition: A computational approach. Paper presented at the 2012 IEEE International Conference on Automation Science and Engineering (CASE) (2012)
  • B. Bruno et al. Analysis of human behavior recognition algorithms based on acceleration data. Paper presented at the 2013 IEEE International Conference on Robotics and Automation (2013)
  • P. Casale et al. Personalization and user verification in wearable systems using biometric walking patterns. Personal and Ubiquitous Computing (2012)
  • P. De et al. Recognition of human behavior for assisted living using dictionary learning approach. IEEE Sensors Journal (2018)
  • F. Demrozi et al. Human activity recognition using inertial, physiological and environmental sensors: A comprehensive survey. IEEE Access (2020)
  • Ö.F. Ertuǧrul et al. Determining the optimal number of body-worn sensors for human activity recognition. Soft Computing (2017)
  • G. Folino et al. Training distributed GP ensemble with a selective algorithm based on clustering and pruning for pattern classification. IEEE Transactions on Evolutionary Computation (2008)
  • F.-A. Fortin et al. DEAP: Evolutionary algorithms made easy. The Journal of Machine Learning Research (2012)
  • W. Gomaa et al. ADL classification based on autocorrelation function of inertial signals. Paper presented at the 2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA) (2017)
  • Gou, J., Yu, B., Maybank, S. J., & Tao, D. (2020). Knowledge Distillation: A Survey. arXiv preprint...
  • S. Hengpraprohm et al. A genetic programming ensemble approach to cancer microarray data classification. Paper presented at the 2008 3rd International Conference on Innovative Computing Information and Control (2008)
  • T. Hossain et al. A method for sensor-based activity recognition in missing data scenario. Sensors (2020)
  • Howard, A. G., Zhu, M., Chen, B., Kalenichenko, D., Wang, W., Weyand, T., … Adam, H. (2017). Mobilenets: Efficient...