Article

Classification of Faults Operation of a Robotic Manipulator Using Symbolic Classifier

Faculty of Engineering, University of Rijeka, Vukovarska 58, 51000 Rijeka, Croatia
* Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(3), 1962; https://doi.org/10.3390/app13031962
Submission received: 30 December 2022 / Revised: 31 January 2023 / Accepted: 1 February 2023 / Published: 2 February 2023
(This article belongs to the Special Issue Robot Intelligence for Grasping and Manipulation)

Abstract

In autonomous manufacturing lines, it is very important to detect the faulty operation of robot manipulators to prevent potential damage. In this paper, the application of a genetic programming algorithm (symbolic classifier) with a random selection of hyperparameter values, trained using a 5-fold cross-validation process, is proposed to determine expressions for fault detection during robotic manipulator operation, using a dataset that was made publicly available by the original researchers. The original dataset was reduced to a binary dataset (fault vs. normal operation); however, due to the class imbalance, random oversampling and SMOTE methods were applied. The quality of the best symbolic expressions (SEs) was assessed using the highest mean values of accuracy ($\overline{ACC}$), area under the receiver operating characteristic curve ($\overline{AUC}$), $\overline{Precision}$, $\overline{Recall}$, and $\overline{F1\,Score}$. The best results were obtained on the SMOTE dataset, with $\overline{ACC}$, $\overline{AUC}$, $\overline{Precision}$, $\overline{Recall}$, and $\overline{F1\,Score}$ equal to 0.99, 0.99, 0.992, 0.9893, and 0.99, respectively. Finally, the best set of mathematical equations obtained using the GPSC algorithm was evaluated on the initial dataset, where the mean values of $\overline{ACC}$, $\overline{AUC}$, $\overline{Precision}$, $\overline{Recall}$, and $\overline{F1\,Score}$ are equal to 0.9978, 0.998, 1.0, 0.997, and 0.998, respectively. The investigation showed that, using the described procedure, symbolically expressed models of high classification performance are obtained for the purpose of detecting faults in the operation of robotic manipulators.

1. Introduction

The main mode of operation for industrial robotic manipulators is unsupervised automatic operation. In modern production facilities, industrial robots perform the set tasks completely autonomously, either repeating the same series of operations [1] or adjusting the movement according to sensor inputs [2]. One of the issues that can arise with such operation is faults. In this paper, faults are catastrophic faults that occur when the industrial manipulator hits an object in its surroundings [3]. Such a failure requires an immediate cessation of operation, as further movement can cause significant damage to the equipment or, crucially, serious injury to a human operator. One of the ideas for achieving timely stopping is to utilize machine learning (ML) methods.
ML methods have a wide application in the area of fault detection. In [4], the authors discuss the application of ML algorithms in Internet of Things (IoT) systems to detect possible faults in photovoltaic systems. The authors conclude that the wide applicability of such techniques can lead to promising results. The authors in [5] have utilized long short-term memory (LSTM) networks on time series data relating to faults in power transmission. The developed models are tested and demonstrate high resiliency despite varying operating conditions. In [6], the authors demonstrate the use of ML in a maritime application, namely, for the detection of faults and operation optimization. The one-dimensional convolutional neural networks (CNN) the authors use show promising results. Dang et al. [7] demonstrate the application of multiple algorithms to the problem of arcing fault detection in DC systems. The authors compare long short-term memory (LSTM), gated recurrent unit (GRU), and deep neural network (DNN) algorithms, concluding that all have satisfying performance. Another application in electrical engineering is demonstrated in [8]. The authors demonstrate an application of private reinforcement learning to achieve a precise detection that is above the stated baseline. In [9], ANN and k-nearest neighbor approaches were applied to the problem of spiral bevel gear fault detection. In [10], a method was proposed for effective connection between users and providers. The method can discover users' suitable queries and understand their preferences. The proposed system anticipates the connections among various technical fields and helps personnel discover useful technical knowledge. A combination of MC-SIOT with deep learning algorithms can better maintain the functional system state. In [11], a chaotic back-propagation (BP) neural network was utilized for the prediction of smart manufacturing information system (SMIS) reliability. The results showed that when the SMIS fails, the failure behavior can easily lead the SMIS into chaos through the propagation of an interdependent network.
The previous literature overview shows that different algorithms outperform others depending on operating conditions, indicating the need to test novel algorithms on fault detection problems. It also suggests that such methods may be readily applicable to robotic fault detection.
In [12], an experimental investigation on a robot manipulator is presented in which neural networks were used to analyze the vibration condition of joints. In this research, two different types of neural networks were used, i.e., a self-organizing map neural network (SOMNN) and a radial basis neural network (RBNN). The investigation showed that at both running speeds, the RBNN outperforms the SOMNN with an $RMSE$ value of 0.0004. The fault diagnosis approach of robotic manipulators was investigated in [13] using support vector machines (SVM). The interpolation of unknown actuator faults was achieved using a radial basis function (RBF) network and was successfully tested experimentally. A neural-network fault diagnosis module and a reinforcement learning-based fault-tolerant control module were used in [14] as fault-tolerant control frameworks for robotic manipulators that are subjected to joint actuator faults. After the actuator fault is detected and diagnosed, the additive reinforcement learning controller produces compensation torques to achieve system safety and maintain control. Using the proposed method, an average accuracy of 97% was achieved. The SVM was used in [15] for fault detection of the robot manipulator, and the results were compared with results obtained with ANN. The results showed that the recognition rate was higher in the case of SVM (99.6%). An adaptive neural network model for diagnosing faults (FD), in combination with adaptive, fuzzy, backstepping, variable control (FC), was proposed and used in [16] for fault-tolerant control. The variable structure observer (VSO) was used for the FD technique of the robot manipulator, while a higher-order VSO (HVSO) was used to solve the chattering phenomenon of the VSO. The estimation performance of the HVSO was improved by the implementation of a neural network in the FD pipeline. The improvement of FC was achieved using an adaptive higher-order variable structure observer (AHVSO), and the results showed a 27% to 29% improvement in fault detection when compared to the HVSO and VSO methods.
The faults in sensors and actuators of a SCARA robot were detected using ANN and fuzzy logic in [17]. The proposed approach successfully detects and isolates the actuator and sensor faults.
The scientific papers in which the same dataset [18] was used as in this research are listed and described below. The self-organizing map (SOM) and a genetic algorithm with SOM (GA-SOM) were used in [19] to detect the fault operation of the robot manipulator. The GA-SOM outperforms the SOM with an achieved classification accuracy of 91.95%. In [20], the performance of base-level and meta-level classifiers was compared on [18]. The ML classifiers that were used are Naive Bayes, boosted Naive Bayes, bagged Naive Bayes, SVM, boosted SVM, bagged SVM, decision table (DT), boosted DT, bagged DT, decision tree (DTr), boosted DTr, bagged DTr, plurality voting, stacking meta decision trees, and stacking ordinary decision trees. The bagged Naive Bayes method achieved the highest classification accuracy of 95.24%. Deep convolutional neural networks (DCNN) are used in [21] to detect robot manipulator execution failures using sensor data from force and torque sensors. The best classification accuracy of 98.82% was achieved with a one-dimensional CNN, followed by a two-dimensional CNN with 98.77% classification accuracy. In [22], 24 neural network (NN) architectures with seven learning algorithms were used to predict execution failures. The 10-8-5-4 NN architecture with the Bayesian regularization algorithm achieved a classification accuracy of 95.45%. The multi-layer perceptron (MLP) was used in [23] to predict robot execution failures. The highest classification accuracy with the MLP was 90.45%. The deep-belief neural network (DBN) for detection of the robotic manipulator's failure execution was investigated in [24]. The performance of the DBN was compared with other standard ML classifiers (C-support vector classifier, logistic regression, decision tree classifier, k-nearest neighbor classifier, MLP, AdaBoost classifier, random forest classifier, bagging classifier, voting classifier), and the results showed that the approach has a higher detection accuracy (80.486%) than the other algorithms. In [3], the MLP, SVM, CNN, and Siamese neural network (SNN) were utilized for the detection of faults during the operation of the robotic manipulator. The highest $F1\,Score$ value (1.0) was achieved with the SNN.
An overview of all research papers that used the dataset [18] in their investigations, with the various ML algorithms used and the achieved classification performance ($ACC$, $F1\,Score$) values, is presented in Table 1.
It can be seen from Table 1 that in none of the research papers was the genetic programming-symbolic classifier (GPSC) used.

Research Novelty, Investigation Hypotheses, and Overall Scientific Contribution

It can be noticed from the previously presented state of the art that there are a few papers in which ML algorithms were used for fault operation analysis and detection. The main disadvantage which can be noticed in the previous literature overview is that these ML models are computationally intensive, i.e., they require large computational resources, especially CNNs or DNNs. After these algorithms are trained on a specific dataset, they have to be stored to be used in the future for processing new data. Storing and re-using these models requires a lot of computational resources, so their usage in systems that control robot manipulators is questionable. Another problem with the previously mentioned algorithms is that it is nearly impossible to express the complex models obtained by them as comparatively simple mathematical expressions which are understandable to scientists and engineers.
The research novelty is to present the process of GPSC algorithm implementation to obtain symbolic expressions (SEs) which can detect fault operation with high classification performance. The SEs are easier to implement in existing control systems of robotic manipulators since they have lower computational requirements when compared to complex models such as CNNs or DNNs. The main idea of this research is to present the procedure of applying the GPSC algorithm to generate SEs which will be utilized as high-performance models for detecting robot manipulator faults.
The algorithm used in this paper [25] begins by creating an initial population whose members are unable to solve the particular task. However, over a predefined number of generations, through the application of evolutionary computing operations (recombination and mutation), the population members adapt into solutions for the particular task.
From state-of-the-art, defined novelty, and research ideas, the following hypotheses will be investigated in this paper:
  • Is there a possibility to use the GPSC algorithm to generate SEs for the detection of fault operation of a robotic manipulator with high classification performance?
  • Can the proposed algorithm achieve high classification performance on datasets that are balanced using various oversampling methods?
  • Is it possible to use the GPSC algorithm with a random selection of hyperparameter values (RSHV) method, validated with 5-fold cross-validation (5FCV) to obtain SEs for the detection of fault operation of a robotic manipulator with high classification accuracy?
  • Can high classification performance be achieved with SEs that consist of a reduced number of input variables?
The scientific contributions are:
  • Investigates the possibility of obtaining an SE for detecting robot manipulator fault operation using the GPSC algorithm.
  • Investigates the influence of dataset oversampling methods (OMs) on the classification performance of the obtained SEs.
  • Investigates if using the GPSC algorithm with an RSHV method, validated using a 5FCV process can generate a set of robust SEs with a high detection accuracy of robot manipulator fault operation.
The outline of this paper consists of four sections. Firstly, the Materials and Methods section where research methodology, dataset, OMs, GPSC, RSHV, 5FCV procedure, and resources are described. Afterward, in the Results section, the SEs obtained using the GPSC algorithm with RSHV validated with 5FCV are presented. In the Discussion section, the results are discussed and justified in detail. Lastly, in the Conclusions section, the conclusions of the conducted investigation are provided.

2. Materials and Methods

The approach used in this investigation is described here, starting with the dataset description and accompanying statistical analysis, followed by the oversampling methods, the GPSC algorithm, model evaluation, and finally the computational resources used during the research.

2.1. Research Methodology

From the initial investigation, it was noticed that the dataset is highly imbalanced. For that reason, it was necessary to apply balancing methods to equalize the number of samples per class. This was achieved using the random oversampling and SMOTE methods. The aforementioned methods generated two variations of the initial dataset, which were used in the GPSC to obtain SEs. The results obtained on each modified dataset were compared, and the one with the highest classification performance and the smallest size is considered to be the best, as the goal is to obtain the best-performing, but still relatively simple, equations. The final validation is performed on the initial dataset. The presentation of this process is given in Figure 1.

2.2. Dataset Description

The publicly available dataset [18] was used in this research. The dataset is well documented in [26,27] so in this subsection, only a brief dataset description is given. The dataset consists of a total of 463 data points and each of the data points D is shaped as:
$D = \left[ F_{1x}\; F_{1y}\; F_{1z}\; T_{1x}\; T_{1y}\; T_{1z}\; F_{2x}\; F_{2y}\; F_{2z}\; T_{2x}\; T_{2y}\; T_{2z}\; \dots\; F_{15x}\; F_{15y}\; F_{15z}\; T_{15x}\; T_{15y}\; T_{15z} \right].$
As seen in the equation above, each of the data points consists of 90 measurements, organized into 15 subpoints. Each of the subpoints is a measurement of force and torque along each of the x, y, and z axes. The time difference between each point is 0.315 s, meaning that an entire data point spans 28.35 s. The statistical analysis and GPSC variable representation for all dataset variables are shown in Table A1 (Appendix A.1).
It should be noted that in GPSC all input variables are represented with $X_i$ notation, where $i = 0, \dots, 89$, since there are 90 input variables. The output variable in GPSC is represented with $y$.
The data points are split into 15 classes in total—“collision in tool”, “collision in part”, “bottom obstruction”, “bottom collision”, “lost”, “moved”, “slightly moved”, “right collision”, “left collision”, “back collision”, “obstruction”, “front collision”, “collision”, “normal”, and “ok”. The number of elements in each class is given in Figure 2.
Out of the listed classes, “normal” and “ok” indicate a non-faulty/normal operation, while the others indicate a fault. Since for many applications just knowing whether a fault has occurred is enough, and it is not necessary to immediately discern its type, the dataset can be converted into a binary dataset by combining all the “faulty” data points and all the “non-faulty” data points, as shown in Figure 3. A minimal sketch of this reduction is given below.
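As a minimal sketch of this reduction, assuming the dataset has been loaded into a pandas DataFrame with a "Class" column holding the 15 original labels (the file name and column name are illustrative, not part of the original dataset distribution):

```python
import pandas as pd

# Hypothetical loading step; the actual file layout of dataset [18] may differ.
df = pd.read_csv("robot_execution_failures.csv")

# "normal" and "ok" indicate non-faulty operation (label 0);
# every other class is treated as a fault (label 1).
non_faulty = {"normal", "ok"}
df["Class"] = df["Class"].map(lambda c: 0 if c in non_faulty else 1)
```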
As can be seen from Figure 2 and Figure 3, the classes in the dataset are not balanced. This fact opens the possibility of applying the dataset balancing techniques which will be described in the following section.
In this paper, only the binary problem will be considered, because even balancing methods could not provide a sufficient dataset in the case of multiple classes. In the multi-class problem, as seen in Figure 2, there are 15 classes. The extremes in the number of samples are the “lost” class with only 3 samples and the “normal” class with 109 samples. An equal number of samples per class (14 classes in total) can be achieved using OMs. However, even with the implementation of dataset balancing methods, each class would contain only 109 samples, which is not enough for the implementation of ML methods.
One of the investigations important to statistical dataset analysis is the correlation analysis between dataset variables. The correlation between variables could indicate whether the dataset used in the training process will produce a trained ML model with high classification accuracy. In this paper, the authors calculated Pearson's coefficients of correlation, which return values in the $\langle -1.0, 1.0 \rangle$ range. Strong correlation lies in the $\pm\langle 0.5, 1.0 \rangle$ range, while the $\langle -0.5, 0.5 \rangle$ range is considered low correlation, with 0 being the weakest. If the value is negative, the growth of the variables is inverse, and vice versa. Since the dataset consists of 91 variables, the full correlation heatmap is too large for clear visualization, so the interrelationship between dataset variables is presented in Figure 4.
As seen in Figure 4, correlation values between the input and the target variables are in the 0.1 to −0.21 range. Moreover, the majority of variables (75) are weakly correlated with the target variable “Class” (0.05 to −0.1 range). Due to the low correlation values between dataset variables, all variables were used in the GPSC algorithm to obtain SEs. In the Results section, the obtained SEs are analyzed to determine which input variables ended up in the best SEs. A minimal sketch of this correlation analysis is given below.
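A minimal sketch of this correlation analysis, reusing the hypothetical DataFrame from the sketch above:

```python
# Pearson correlation (the pandas default) of each input with the target.
corr = df.corr(method="pearson")["Class"].drop("Class")

# Inputs whose correlation with "Class" falls in the weak -0.1..0.1 band.
weak = corr[corr.abs() < 0.1]
print(len(weak), corr.min(), corr.max())
```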

2.3. Dataset Balancing Methods

Due to the low number of samples per class even in the binary classification problem (Figure 3), only dataset OMs were used, namely the random oversampling and SMOTE methods. The class that contains the lower number of points in the dataset is labeled the “minority” class. On the other hand, the opposite is referred to as the “majority” class.

2.3.1. Random Oversampling

Oversampling data randomly is one of the simplest balancing methods in which the minority class samples are randomly selected and copied to match the number of majority class samples.

2.3.2. SMOTE

The Synthetic Minority Oversampling Technique (SMOTE) [28] is a technique in which synthetic samples are generated in the following way:
  • Calculate the number of samples N that have to be generated to obtain a 1:1 class distribution;
  • Apply an iterative process consisting of the following steps:
    - A random selection of a minority class sample is performed;
    - Its K nearest neighbors (by default K = 5) are searched for;
    - N of the K samples are randomly chosen to generate new instances using an interpolation procedure. The difference between the sample under consideration and a selected neighbor is multiplied by a factor in the $\langle 0, 1 \rangle$ range and appended to the sample. Using this procedure, new synthetic samples are created.
Using the previously described procedure, new synthetic samples are created along the line segments connecting two dataset samples. The results of balancing the dataset using the different oversampling methods are listed in Table 2. A minimal sketch of both balancing steps is given below.
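A minimal sketch of both balancing steps using the imblearn package listed in Section 2.7 (variable names are illustrative):

```python
from imblearn.over_sampling import RandomOverSampler, SMOTE

# X: the (463, 90) force/torque matrix, y: the binary labels from above.
X = df.drop(columns=["Class"]).values
y = df["Class"].values

# Random oversampling: duplicate randomly chosen minority samples
# until both classes contain the same number of samples.
X_ros, y_ros = RandomOverSampler(random_state=42).fit_resample(X, y)

# SMOTE: interpolate between a minority sample and one of its
# K = 5 nearest neighbours (the default) to synthesize new samples.
X_sm, y_sm = SMOTE(k_neighbors=5, random_state=42).fit_resample(X, y)
```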

2.4. Genetic Programming Symbolic Classifier

The GPSC algorithm is a method in which a randomly generated initial population, initially unable to detect the target variable, is made fit for detecting the target variable with high classification accuracy through the evolution process. When GPSC execution begins, the initial population is created. This is a complex process that requires the definition of specific GPSC hyperparameter values, i.e., NumGens, PopSize, InitDepth, InitMethod, TourSize, ConstRange, and FunSet. To define the input variables before GPSC execution, the training part of the dataset must be provided. The InitMethod used to create the initial population in this paper was “ramped half-and-half”. This method combines the full and grow methods, and to ensure population diversity the maximum depth limit (InitDepth) is defined in a specific range. The length of an SE (population member) is measured by its number of elements, i.e., constants, functions, and input variables. It should be noted that in the Results section, the best SEs will also be compared in terms of length.
To create one population member, input variables from the dataset are required, as well as functions and constants. The constants are numbers defined in a range using the hyperparameter ConstRange. The functions are randomly selected from the FunSet, and in this case the FunSet consists of the following functions: “+”, “−”, “·”, “÷”, “min”, “max”, “sin”, “cos”, “tan”, “ln”, “log2”, “log10”, and “√”. After the creation of the initial population, the next step is to evaluate and calculate the fitness value of the population members. In this paper, the logarithmic loss fitness function was used, which is calculated as follows:
  • Using the obtained expression (model), compute the predicted output for all training data points;
  • Calculate the Sigmoid function of the generated output:
    $S(t) = \frac{1}{1 + e^{-t}},$
  • The log-loss function is calculated with the predicted and real training data points, per:
    $H_p(q) = -\frac{1}{N}\sum_{i=1}^{N} \left[ y_i \cdot \log(p(S_i)) + (1 - y_i) \cdot \log(1 - p(S_i)) \right],$
    with $y$ representing the real dataset output and $p(S)$ the output of the sigmoid function.
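As an illustration, Equations (2) and (3) can be written in NumPy as follows (a sketch of the fitness computation, not the gplearn internals):

```python
import numpy as np

def sigmoid(t):
    # Equation (2): squash the raw SE output into (0, 1).
    return 1.0 / (1.0 + np.exp(-t))

def log_loss(y_true, raw_output, eps=1e-15):
    # Equation (3): binary cross-entropy between the real labels and the
    # sigmoid-transformed SE outputs, averaged over the N samples.
    p = np.clip(sigmoid(raw_output), eps, 1.0 - eps)
    return -np.mean(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))
```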
In GPSC, two different genetic operators (GOs) are used, crossover and mutation, with mutation split into three different subtypes: subtree mutation, hoist mutation, and point mutation. These four genetic operations are performed on winners of tournament selection. In the case of crossover, two winners of tournament selection are required. The hyperparameters with which the probabilities of the GOs are defined are Cross, SubMute, HoistMute, and PointMute. The sum of all genetic operation probabilities must be less than or equal to 1. If it is less than 1, then some parents enter the next generation unmodified.
The GPSC has two termination criteria, i.e., StopCrit (lowest predefined value of the fitness function) and NumGens (maximum number of generations). The MaxSamp hyperparameter defines how much of the training dataset will be used to evaluate the population members from generation to generation.
To prevent the bloat phenomenon, the parsimony pressure method is used. Bloat occurs during GPSC algorithm execution when the population members' size rapidly increases from generation to generation without benefiting the fitness function. When this method is used, the value of ParsCoef must be defined. The method is applied during the tournament selection process when very large population members are found. To make them less favorable for winning tournament selection, their fitness value is modified. This is achieved by using the equation:
$f_p(x) = f(x) - c\, l(x),$
where $f(x)$ is the original fitness value, $c$ is the parsimony pressure coefficient, and $l(x)$ is the total length (size) of the population member's expression. A minimal sketch of this penalty is given below.
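As a small illustration of Equation (4) (a sketch, not the gplearn implementation):

```python
def penalized_fitness(fitness, length, c, greater_is_better=False):
    # Equation (4): long expressions get a less favorable fitness so they
    # rarely win tournaments; the penalty sign follows the optimization
    # direction (log loss is minimized, so the penalty is added).
    penalty = c * length
    return fitness - penalty if greater_is_better else fitness + penalty
```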

2.5. Training Procedure of the GPSC Algorithm

To develop the RSHV method, initial tuning of the GPSC hyperparameters had to be performed. The hyperparameter ranges are listed in Table 3.
The ranges in Table 3 were defined through the initial tuning of the GPSC algorithm. The PopSize was set to the 100–1000 range and was propagated for 100–300 NumGens. The StopCrit range values are very low to prevent early termination of GPSC algorithm execution. The GO coefficients have the same ranges from which the values are randomly selected. The idea was to see which of the genetic operations would be most influential in obtaining the set of best SEs. The MaxSamp was set to the 0.99–1 range, i.e., practically the entire training dataset was used to evaluate each population member in each generation. The initial investigation showed that the best ParsCoef range for this research is the one shown in Table 3, since it prevents the bloat phenomenon while enabling stable growth of population members as the fitness function value is lowered.
The modeling using the described algorithm consists of:
  • Random hyperparameter selection;
  • Training the GPSC algorithm with the randomly selected hyperparameters;
  • Evaluating obtained SEs and testing if all EMs are above 0.99. If they are above 0.99 the process is terminated, otherwise, the process is repeated.
Some successful implementations of this process are well documented in [29,30,31]. Initially, the dataset is split into a 70:30 training/testing split for faster execution. Then, the obtained models are tested using a 5FCV procedure on the 70% training portion. The training procedure is repeated until it yields a score higher than 0.99. If that fails, the random hyperparameter choice, along with the entire process, is repeated. A minimal sketch of this loop is given below.
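A minimal sketch of this loop using gplearn's SymbolicClassifier is given below; the hyperparameter ranges shown are illustrative approximations of Table 3 rather than the exact values, and the function set is limited to gplearn's built-in operators:

```python
import numpy as np
from gplearn.genetic import SymbolicClassifier
from sklearn.model_selection import cross_val_score, train_test_split

# 70:30 split of one balanced dataset (X_sm, y_sm from Section 2.3).
X_train, X_test, y_train, y_test = train_test_split(
    X_sm, y_sm, test_size=0.3, random_state=42)

rng = np.random.default_rng()
score = 0.0
while score <= 0.99:
    pop = int(rng.integers(100, 1001))
    # Four GO probabilities that sum to slightly less than 1.
    p = rng.dirichlet(np.ones(4)) * rng.uniform(0.9, 1.0)
    model = SymbolicClassifier(
        population_size=pop,
        generations=int(rng.integers(100, 301)),
        tournament_size=int(rng.integers(10, max(11, int(0.3 * pop)))),
        init_depth=(3, 12),
        const_range=(-10000.0, 10000.0),
        function_set=('add', 'sub', 'mul', 'div', 'min', 'max',
                      'sin', 'cos', 'tan', 'log', 'sqrt'),
        p_crossover=p[0], p_subtree_mutation=p[1],
        p_hoist_mutation=p[2], p_point_mutation=p[3],
        parsimony_coefficient=rng.uniform(1e-5, 1e-4),
        max_samples=0.99, stopping_criteria=1e-6, random_state=42)
    # Mean accuracy over the 5 folds decides whether to keep the model.
    score = cross_val_score(model, X_train, y_train, cv=5,
                            scoring='accuracy').mean()
```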

2.6. GPSC Evaluation Methods

In this subsection, the evaluation metrics (EMs) are described, as well as the evaluation methodology.

2.6.1. Evaluation Metrics

The accuracy, area under the receiver operating characteristic curve (AUC), recall and precision values, and F1-score were used to evaluate the obtained models. When discussing classification problems, there are four types of outcomes: true positive (TP), true negative (TN), false positive (FP), and false negative (FN). The TP is an outcome where the ML model correctly predicts the positive class. In the case of binary classification, there are two classes, defined as the positive class (labeled 1) and the negative class (labeled 0 or −1). The TN is an outcome where the ML model correctly predicts the negative class. The FP is an outcome where the ML model incorrectly predicts the positive class. The FN is an outcome where the ML model incorrectly predicts the negative class.
The classification accuracy [32] is the fraction of predictions the ML model got right, defined as the ratio of correct predictions to the total number of predictions. $ACC$ can be written in the following form:
$ACC = \frac{TN + TP}{FP + FN + TP + TN}.$
$AUC$ is one of the EMs used in this research; it computes the area under the ROC curve, which plots the true positive rate against the false positive rate [33]. The precision metric value [34] gives information on how many positive classifications were correct. $Precision$ may be expressed as:
$Precision = \frac{TP}{TP + FP}.$
The recall metric value [34] provides information on how many actual members of the positive class were identified correctly by the trained ML model. $Recall$ is calculated as:
$Recall = \frac{TP}{TP + FN}.$
The $F1\,Score$, according to [35], can be written as:
$F1\,Score = \frac{2 \cdot Recall \cdot Precision}{Recall + Precision}.$
All evaluation metric values are expressed in the $[0, 1]$ range, with higher scores representing better performance. These EMs map directly onto scikit-learn helpers, as sketched below.
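A minimal sketch of computing all five EMs with scikit-learn:

```python
from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                             recall_score, roc_auc_score)

def evaluate(y_true, y_pred, y_prob):
    # y_pred holds hard class labels; y_prob holds sigmoid outputs,
    # which roc_auc_score needs to trace the ROC curve.
    return {"ACC": accuracy_score(y_true, y_pred),
            "AUC": roc_auc_score(y_true, y_prob),
            "Precision": precision_score(y_true, y_pred),
            "Recall": recall_score(y_true, y_pred),
            "F1": f1_score(y_true, y_pred)}
```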

2.6.2. Evaluation Methodology

The process of determining the models starts with the random determination of hyperparameters. Then, cross-validation is performed, and after each fold the evaluation metric values are calculated. After the cross-validation process is completed, the mean EMs are determined. If the values are higher than 0.99, then the conclusive training and evaluation are performed. If the values of the EMs are below the threshold, the process is repeated.
In case the system progresses to the final stage, the GPSC algorithm training is executed on the training part of the dataset. After training is finished, the results are evaluated on the training and testing datasets. If this yields satisfactory results, the process is finished. Otherwise, the whole process is repeated.

2.7. Computational Resources

The training and evaluation of the models are performed on the system as described below:
  • Hardware
    - Intel i7-4770
    - 16 GB DDR3 RAM
  • Software
    - Python 3.9.13
      * imblearn 0.9.1
      * scikit-learn 1.2.0
      * gplearn 0.4.2

3. Results

In this section, the results achieved on each balanced dataset, as well as the best SEs evaluated on the initial dataset, are presented. In the subsection “The Results Obtained on Balanced Datasets Using the GPSC Algorithm”, the best results are presented and compared. In the subsection “Evaluating Best Models”, the results of the best SEs evaluated on the initial dataset are presented.

3.1. The Results Obtained on Balanced Datasets Using the GPSC Algorithm

The hyperparameter values used to obtain the best SEs using the GPSC algorithm on each balanced dataset are shown in Table 4.
From Table 4, it can be seen that the best-performing model on the random oversampling dataset was obtained with a very small initial population (172) compared to the SMOTE case, where the initial population consisted of 850 members. The population members in the random oversampling case evolved for 293 generations (the upper end of the hyperparameter range), while in the SMOTE case the population members evolved for 227 generations (the lower end of the range). The SMOTE case had a larger tournament selection size, and the population members in tree form were much deeper (InitDepth range 6–11). In the random oversampling case, subtree mutation (0.49) was the dominating genetic operation, while in the SMOTE case crossover (0.3) was the dominating one. In both cases, as planned, the stopping criterion was never met since its value was set very low. The parsimony coefficient in the SMOTE case was larger than in the random oversampling case. Classification performances are illustrated in Figure 5.
The mean results and standard deviations, (error bars in Figure 5) are also given in Table 5.
Table 5 and Figure 5 show that the SEs obtained in the SMOTE case have higher classification accuracy. The average CPU time in both cases is equal to 100 min. One of the factors that influences the CPU time required to execute GPSC is the dataset size. The dataset is small in terms of samples (668); however, the number of input variables per sample is rather large (90). The average CPU time to execute one fold of the 5FCV was 20 min, so the entire 5FCV takes 100 min.
The average length of the SEs obtained on the randomly oversampled dataset is lower than in the SMOTE case; however, the latter achieved higher classification performance. The best results were therefore those obtained on the SMOTE dataset.

3.2. Evaluating Best Models

As stated in the previous subsection, the best models in terms of classification performance were those obtained on the SMOTE dataset. There are five equations, since they were obtained in the 5-fold cross-validation. The best SEs are shown in Appendix B.
Equation (A1) in Appendix B consists of 19 input variables: $X_1$, $X_4$, $X_5$, $X_6$, $X_7$, $X_8$, $X_{11}$, $X_{12}$, $X_{16}$, $X_{18}$, $X_{40}$, $X_{44}$, $X_{48}$, $X_{56}$, $X_{60}$, $X_{77}$, $X_{78}$, $X_{81}$, and $X_{83}$. From Table A1, these input variables are $F_{1y}$, $T_{1y}$, $T_{1z}$, $F_{2x}$, $F_{2y}$, $F_{2z}$, $T_{2z}$, $F_{3x}$, $T_{3y}$, $F_{4x}$, $T_{7y}$, $F_{8z}$, $F_{9x}$, $F_{10z}$, $F_{11x}$, $T_{13z}$, $F_{14x}$, $T_{14x}$, and $T_{14z}$. Equation (A2) consists of 31 input variables: $X_0$, $X_1$, $X_2$, $X_3$, $X_4$, $X_5$, $X_6$, $X_7$, $X_8$, $X_{14}$, $X_{15}$, $X_{16}$, $X_{18}$, $X_{19}$, $X_{24}$, $X_{25}$, $X_{28}$, $X_{30}$, $X_{39}$, $X_{44}$, $X_{45}$, $X_{47}$, $X_{49}$, $X_{52}$, $X_{58}$, $X_{63}$, $X_{73}$, $X_{77}$, $X_{78}$, $X_{80}$, and $X_{88}$. From Table A1, these input variables are $F_{1x}$, $F_{1y}$, $F_{1z}$, $T_{1x}$, $T_{1y}$, $T_{1z}$, $F_{2x}$, $F_{2y}$, $F_{2z}$, $F_{3z}$, $T_{3x}$, $T_{3y}$, $F_{4x}$, $F_{4y}$, $F_{5x}$, $F_{5y}$, $T_{5y}$, $F_{6x}$, $T_{7x}$, $F_{8z}$, $T_{8x}$, $T_{8z}$, $F_{9y}$, $T_{9y}$, $T_{10y}$, $T_{11x}$, $F_{13y}$, $T_{13z}$, $F_{14x}$, $F_{14z}$, and $T_{15y}$. Equation (A3) consists of 23 input variables: $X_0$, $X_1$, $X_2$, $X_3$, $X_4$, $X_5$, $X_6$, $X_7$, $X_8$, $X_{18}$, $X_{28}$, $X_{36}$, $X_{40}$, $X_{43}$, $X_{49}$, $X_{59}$, $X_{68}$, $X_{69}$, $X_{70}$, $X_{81}$, $X_{82}$, $X_{84}$, and $X_{86}$. From Table A1, these input variables are $F_{1x}$, $F_{1y}$, $F_{1z}$, $T_{1x}$, $T_{1y}$, $T_{1z}$, $F_{2x}$, $F_{2y}$, $F_{2z}$, $F_{4x}$, $T_{5y}$, $F_{7x}$, $T_{7y}$, $F_{8y}$, $F_{9y}$, $T_{10z}$, $F_{12z}$, $T_{12x}$, $T_{12y}$, $T_{14x}$, $T_{14y}$, $F_{15x}$, and $F_{15z}$. Equation (A4) consists of 28 input variables: $X_0$, $X_1$, $X_2$, $X_4$, $X_5$, $X_7$, $X_8$, $X_9$, $X_{16}$, $X_{18}$, $X_{20}$, $X_{21}$, $X_{22}$, $X_{25}$, $X_{27}$, $X_{28}$, $X_{40}$, $X_{44}$, $X_{46}$, $X_{49}$, $X_{50}$, $X_{51}$, $X_{59}$, $X_{76}$, $X_{77}$, $X_{79}$, $X_{80}$, and $X_{88}$. From Table A1, these input variables are $F_{1x}$, $F_{1y}$, $F_{1z}$, $T_{1y}$, $T_{1z}$, $F_{2y}$, $F_{2z}$, $T_{2x}$, $T_{3y}$, $F_{4x}$, $F_{4z}$, $T_{4x}$, $T_{4y}$, $F_{5y}$, $T_{5x}$, $T_{5y}$, $T_{7y}$, $F_{8z}$, $T_{8y}$, $F_{9y}$, $F_{9z}$, $T_{9x}$, $T_{10z}$, $T_{13y}$, $T_{13z}$, $F_{14y}$, $F_{14z}$, and $T_{15y}$. Equation (A5) consists of 28 input variables: $X_0$, $X_1$, $X_2$, $X_3$, $X_4$, $X_5$, $X_6$, $X_7$, $X_8$, $X_{14}$, $X_{16}$, $X_{18}$, $X_{24}$, $X_{27}$, $X_{28}$, $X_{32}$, $X_{33}$, $X_{36}$, $X_{42}$, $X_{46}$, $X_{56}$, $X_{63}$, $X_{66}$, $X_{67}$, $X_{69}$, $X_{73}$, $X_{74}$, and $X_{86}$. From Table A1, these input variables are $F_{1x}$, $F_{1y}$, $F_{1z}$, $T_{1x}$, $T_{1y}$, $T_{1z}$, $F_{2x}$, $F_{2y}$, $F_{2z}$, $F_{3z}$, $T_{3y}$, $F_{4x}$, $F_{5x}$, $T_{5x}$, $T_{5y}$, $F_{6z}$, $T_{6x}$, $F_{7x}$, $F_{8x}$, $T_{8y}$, $F_{10z}$, $T_{11x}$, $F_{12x}$, $F_{12y}$, $T_{12x}$, $F_{13y}$, $F_{13z}$, and $F_{15z}$.
Equation (A1) requires the lowest number of input variables (19) to compute the output, and its length is the lowest when compared to the other equations. Equations (A4) and (A5) require an equal number of input variables (28), although not the same ones. Equations (A3) and (A2) require 23 and 31 input variables, respectively.
Although it seems that these five SEs require many input variables, many of those variables are shared across multiple SEs. Based on a detailed comparison of the best SEs, it was found that a large number of variables are not required to compute the output. The input variables that do not appear in the best set of SEs are: $X_{10}$, $X_{13}$, $X_{17}$, $X_{23}$, $X_{26}$, $X_{29}$, $X_{31}$, $X_{34}$, $X_{35}$, $X_{37}$, $X_{38}$, $X_{41}$, $X_{53}$, $X_{54}$, $X_{55}$, $X_{57}$, $X_{61}$, $X_{62}$, $X_{64}$, $X_{65}$, $X_{71}$, $X_{72}$, $X_{75}$, $X_{85}$, $X_{87}$, and $X_{89}$. From Table A1, these input variables are $T_{2y}$, $F_{3y}$, $T_{3z}$, $T_{4z}$, $F_{5z}$, $T_{5z}$, $F_{6y}$, $T_{6y}$, $T_{6z}$, $F_{7y}$, $F_{7z}$, $T_{7z}$, $T_{9z}$, $F_{10x}$, $F_{10y}$, $T_{10x}$, $F_{11y}$, $F_{11z}$, $T_{11y}$, $T_{11z}$, $T_{12z}$, $F_{13x}$, $T_{13x}$, $F_{15y}$, $T_{15x}$, and $T_{15z}$, respectively.
The evaluation was performed as follows:
  • Use the variables from the non-augmented dataset inside the expressions to obtain the predicted outputs;
  • Apply the sigmoid function (Equation (2)) to that output, and round the result to transform it into an integer class label;
  • Compare the obtained values with the original target values from the dataset and compute the evaluation metric values. A sketch of this final check is given below.
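A sketch of this final check on the initial (non-augmented) dataset, reusing the sigmoid and metric helpers from the earlier sketches; se_fn is a hypothetical placeholder standing in for one of Equations (A1)-(A5):

```python
def predict_with_se(se_fn, X):
    # se_fn is a stand-in for one of Equations (A1)-(A5), implemented as a
    # Python function of one 90-element row using the protected operators
    # defined after Appendix B.
    raw = np.array([se_fn(row) for row in X])
    prob = sigmoid(raw)                     # Equation (2)
    return (prob >= 0.5).astype(int), prob  # integer labels + probabilities

se_fn = lambda row: row[1]                  # placeholder, not a real SE
y_pred, y_prob = predict_with_se(se_fn, X)  # X, y: the initial dataset
print(evaluate(y, y_pred, y_prob))
```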
The results of the evaluation of the models are listed in Table 6.
Table 6 shows that the best model achieved even better classification performance on the original dataset. The dataset balanced with the SMOTE method did contain some synthetic samples that deviate from the original dataset samples. The final evaluation of the proposed approach is to compare the results obtained on the original dataset with the results obtained in other research papers. In Table 7, the list of research papers with their achieved classification performance is compared with the results obtained in this research on the initial (binary) dataset.
As seen from Table 7, the highest score was achieved in [3] using the SNN. The results in this paper are slightly lower, although this approach outperforms all other research papers in terms of classification accuracy. An additional benefit of this approach is that a set of robust SEs was obtained which can be easily stored and used.

4. Discussion

The dataset consists of 90 input variables (forces and torques) and the output (target) variable which defines the operation class, with a total of 15 classes. However, due to the small dataset and large number of classes, the samples were divided into two classes, “normal” and “fault” operation. By reducing the multi-class problem to a binary one, the number of samples for these two classes increased. Since the initial dataset was improperly balanced, balancing methods were applied to rectify this. One of the reasons why the multi-class problem was abandoned in this research is that some classes in the multi-class dataset had an extremely low number of samples per class (3, 5, 7, and 9). The dataset oversampling methods could not be applied to the classes with an extremely low number of original samples. The initial investigation with GPSC using the multi-class dataset generated poor results, so the multi-class problem was abandoned. Low correlation values were determined in the initial analysis, yet most of the variables (64 out of 90) ended up being included in the best-performing equations.
The definition of hyperparameter ranges for the random hyperparameter method is a time-consuming process, since it requires initial tuning of the ranges and observing the behavior of GPSC algorithm execution. The most crucial hyperparameters are the genetic operation coefficients, the relation between population size and tournament selection size, and the parsimony coefficient value. The starting hyperparameter tuning showed that an increase in the value of any genetic operation probability would not benefit the evolution process, so they were all set to a fairly general case (all in the initial range of 0.001 to 1). The initial investigation showed that a small tournament size combined with a large population size can prolong GPSC execution time drastically. To prevent this, the tournament size is, in the best-case scenario, 30% of the entire population. Finally, the most sensitive hyperparameter is the anti-bloat mechanism (parsimony). As mentioned, due to the low correlation between inputs and outputs, a low parsimony coefficient value can result in rapid growth of the equations. On the other hand, too large a value will stifle the evolution process, resulting in non-convergence of the model.
During this initial testing stage, it was ensured that the dominating stopping criteria would be a predefined maximum number of generations. The lowest value of the log loss fitness function was never achieved.
The described methodology was successful in generating models for all of the dataset variations, with the best-performing ones resulting from the SMOTE-augmented dataset. The results shown in Table 5 indicate that random oversampling yields lower classification accuracy than SMOTE, as well as smaller SEs. However, the SEs obtained in the SMOTE case are not much larger, so they were chosen as the best results.
The models trained on the SMOTE dataset showed that not all 90 inputs are necessary for classification. Each of the five SEs can be used on its own to determine the class; however, all five are used together to obtain a robust solution, which was the initial idea behind utilizing 5FCV. The SE that requires the lowest number of input variables is Equation (A1). This research also showed that not all of the input variables are required to obtain the best results. A detailed comparison of the best SEs showed that 26 input variables out of 90 are not needed to detect fault operation, so these variables can be omitted from further investigation.
The utilization of the obtained models on the non-augmented dataset showed that using these SEs high classification accuracy could be achieved which can be seen from the results shown in Table 6.

5. Conclusions

This paper demonstrates an application of GPSC on a robot operation failures dataset. In the analyzed binary problem, the GPSC was applied and validated with randomly selected hyperparameters to obtain SEs which can detect faulty operation. Since the binary variation of the original dataset was imbalanced, the idea was to apply various balancing methods to equalize the dataset. The random oversampling and SMOTE methods were applied. The best results were evaluated on the initial dataset. Based on the conducted investigation, the following conclusions can be drawn:
  • The GPSC algorithm can be applied to obtain the models that detect the faulty operation of the robot manipulator and show high-performance metrics;
  • The investigation showed that with the application of OMs, the balance between class samples was reached, and using these datasets in GPSC generated high-performing models. The conclusion is therefore that dataset OMs have some influence on the classification accuracy of the obtained results;
  • The conducted research demonstrates that by using the dataset balanced with the SMOTE method in the GPSC algorithm with RSHV and 5FCV, the best SEs in terms of high mean evaluation metric values with low standard deviation can be obtained. When the aforementioned SEs were applied to the initial imbalanced dataset, the EM values deviated only slightly from those obtained on the SMOTE dataset;
  • The GPSC algorithm as applied procured a set of the best SEs that can be used to obtain a robust solution;
  • This investigation also showed that not all variables are necessary to detect the faulty operation of a robotic manipulator. In this case, a total of 26 input variables are not required which is a great reduction in the experimental measurement of these variables.
The main pros of the proposed method are:
  • The entire GPSC model does not have to be stored. No matter how long the equation is, it still requires lower computational resources than an entire CNN or DNN. The aforementioned CNN or DNN cannot simply be transformed into the form of an SE.
  • The generated SEs with GPSC in some cases do not require all dataset input variables. However, other machine learning methods require all input variables that were used to train them.
The main cons of the proposed method are:
  • The development of the RSHVs method is a time-consuming process that requires changing each hyperparameter value and running GPSC execution to investigate its influence on the performance of the algorithm;
  • The ParsCoef is the most sensitive for tuning. A small change in its value can have a great impact on the GPSC performance.
Future work will be focused on creating an original dataset, preferably a balanced one. The idea is to apply the same procedure in an experimental study to validate the mathematical equations obtained using the procedure described in this paper. Besides that, the dataset which was used in this investigation will be subjected to a synthetic data generation process to try to artificially create samples of the original dataset, after which oversampling will be performed. Synthetic data generation is, in this case, mandatory, especially for the classes with an extremely small number of samples. Hopefully, this procedure will contribute to high classification accuracy in the multi-class problem.

Author Contributions

Conceptualization, N.A. and I.L.; data curation, N.A. and S.B.Š.; formal analysis, I.L. and M.G.; funding acquisition, N.A. and I.L.; investigation, S.B.Š. and M.G.; methodology, N.A. and S.B.Š.; project administration, N.A. and I.L.; resources, I.L.; software, N.A. and M.G.; supervision, N.A. and I.L.; validation, I.L.; visualization, S.B.Š.; writing—original draft, N.A. and S.B.Š.; writing—review and editing, I.L. and M.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

The institutional review board statement is not applicable.

Informed Consent Statement

The informed consent statement is not applicable.

Data Availability Statement

Publicly available repository located at https://archive.ics.uci.edu/ml/datasets/Robot+Execution+Failures, accessed on 1 December 2022.

Acknowledgments

This research has been (partly) supported by University of Rijeka scientific grants uniri-mladi-technic-22-61, uniri-mladi-technic-22-57, uniri-tehnic-18-275-1447, the CEEPUS network CIII-HR-0108, project CEKOM under the grant KK.01.2.2.03.0004, European Regional Development Fund under the grant KK.01.1.1.01.0009 (DATACROSS), and Erasmus+ project WICT under the grant 2021-1-HR01-KA220-HED-000031177.

Conflicts of Interest

The authors of this paper declare no conflict of interest.

Appendix A. Dataset Additional Information

Appendix A.1. Dataset Statistics and GPSC Variable Representation

In Table A1, the results of the dataset statistical analysis are shown, as well as the GPSC variable representation.
Table A1. Dataset statistics and GPSC variable representation.
Dataset Variable | Mean | Std | Min | Max | GPSC Variable | Dataset Variable | Mean | Std | Min | Max | GPSC Variable
Class0.7213820.44880401y T 8 x −10.913670.98129−524400 X 45
F 1 x 5.42980654.7229−254353 X 0 T 8 y −5.790566.34445−492433 X 46
F 1 y 1.04535644.96064−338219 X 1 T 8 z −4.3542117.17883−15064 X 47
F 1 z −37.7624369.2985−3617361 X 2 F 9 x −1.5745147.21401−389339 X 48
T 1 x −5.23974117.2394−450686 X 3 F 9 y −2.2894236.62957−343190 X 49
T 1 y 7.358531111.824−286756 X 4 F 9 z −62.5702403.8498−2792151 X 50
T 1 z −1.7732225.55332−137149 X 5 T 9 x −8.758170.60608−567410 X 51
F 2 x 3.48164148.37184−246337 X 6 T 9 y −7.8336962.94308−487437 X 52
F 2 y −0.198739.63278−360205 X 7 T 9 z −2.7732213.33397−8388 X 53
F 2 z −58.1361399.9206−3261146 X 8 F 10 x 0.41684734.27027−248338 X 54
T 2 x −7.989286.15365−467605 X 9 F 10 y −3.3002236.81007−353188 X 55
T 2 y −7.4902864.08104−271261 X 10 F 10 z −51.7041340.3173−278889 X 56
T 2 z −1.306715.80375−69135 X 11 T 10 x −9.1101568.56809−563408 X 57
F 3 x 2.59827242.13289−247331 X 12 T 10 y −5.2872662.68289−502462 X 58
F 3 y −2.3282942.9849−367276 X 13 T 10 z −3.1965412.09228−8997 X 59
F 3 z −61.5961379.1127−3281132 X 14 F 11 x −1.3585365.97879−492448 X 60
T 3 x −7.4708485.93531−535620 X 15 F 11 y −7.4276551.87442−364185 X 61
T 3 y −2.6069174.5475−427476 X 16 F 11 z −110.994510.0633−323494 X 62
T 3 z −1.3369315.03959−66144 X 17 T 11 x −8.010878.76645−558404 X 63
F 4 x 3.15334839.23666−249337 X 18 T 11 y −7.2267875.12606−576454 X 64
F 4 y −1.1144745.40036−364354 X 19 T 11 z −4.4794822.87002−19981 X 65
F 4 z −67.635428.3144−3292107 X 20 F 12 x −4.4384488.33604−883460 X 66
T 4 x −6.3369384.95088−495567 X 21 F 12 y −8.4881255.64501−364187 X 67
T 4 y −3.2440675.36126−633464 X 22 F 12 z −134.477575.1008−3451179 X 68
T 4 z −1.7602617.14544−88161 X 23 T 12 x −3.31102105.7364−5401016 X 69
F 5 x 0.71706343.43762−251351 X 24 T 12 y −5.8099489.37178−568458 X 70
F 5 y 0.66738752.38475−368438 X 25 T 12 z −4.7537826.98989−23386 X 71
F 5 z −79.6004479.6485−3348107 X 26 F 13 x −5.4751689.50921−851462 X 72
T 5 x −12.0842105.922−824536 X 27 F 13 y −7.2613453.43668−352181 X 73
T 5 y −8.7775485.63418−725406 X 28 F 13 z −140.566586.8129−3275126 X 74
T 5 z −2.2116619.82233−128201 X 29 T 13 x −7.306783.26654−516400 X 75
F 6 x −0.5550838.56243−262324 X 30 T 13 y −10.261382.56735−567471 X 76
F 6 y −2.263540.62626−320254 X 31 T 13 z −4.1576719.97507−19793 X 77
F 6 z −57.8942417.1217−3051418 X 32 F 14 x −1.6630769.58861−480460 X 78
T 6 x −3.5637189.31248−672747 X 33 F 14 y −5.2159848.83071−343212 X 79
T 6 y −2.4254967.07047−468389 X 34 F 14 z −133.618559.7376−322692 X 80
T 6 z −1.6025923.169−162244 X 35 T 14 x −7.8574587.05328−527531 X 81
F 7 x −2.6976241.83346−389338 X 36 T 14 y −7.8920178.53241−600466 X 82
F 7 y −4.2591838.11985−382192 X 37 T 14 z −3.6781920.93429−248101 X 83
F 7 z −62.5767440.9198−3557126 X 38 F 15 x −3.241960.06207−497342 X 84
T 7 x −4.0669571.7687−547472 X 39 F 15 y −3.5507649.50523−343242 X 85
T 7 y −6.9157763.1772−429601 X 40 F 15 z −104.626483.9425−295595 X 86
T 7 z −2.9935213.10753−9183 X 41 T 15 x −9.9352187.94631−599462 X 87
F 8 x −1.4773241.89043−262340 X 42 T 15 y −9.94676.72955−646466 X 88
F 8 y −1.9935237.24492−331190 X 43 T 15 z −2.9028114.6582−91108 X 89
F 8 z −57.879389.6659−279597 X 44

Appendix B. The Best SEs

The set of best SEs obtained on the dataset balanced with the SMOTE OM is given below.
y 1 = 2 . sin ( ( 1.4 log ( min ( sin ( X 81 ) , X 44 ) ) + tan ( tan ( | X 81 | ) ) + 1.4 log ( X 1 ) + 2 X 16 3 + sin ( X 40 3 ) + 1.4 log ( X 48 ) + 2 sin ( X 60 3 ) + 1.4 log ( X 78 ) ) 1 3 ) + 2 . sin ( ( 1.4 log ( min ( sin ( X 81 ) , X 44 ) ) + 1.4 log ( X 1 ) + sin ( X 11 + X 12 ) + 1.4 log ( log ( X 12 ) ) + 2 X 16 3 + sin ( X 40 3 ) + 1.4 log ( X 48 ) + sin ( X 60 3 ) + 1.4 log ( X 78 ) ) 1 3 ) + 2 . | log ( X 12 ) | + 1.4 log ( X 1 ) + log ( X 12 ) + X 16 3 + sin ( X 40 3 ) + sin ( X 48 3 ) + sin ( X 60 3 ) + 1.4 log ( log ( X 78 ) ) 3 + 2 . 1.4 log ( X 1 ) + 2 log ( X 12 ) + X 16 3 + sin ( X 40 3 ) + sin ( X 48 3 ) + sin ( X 60 3 ) + 1.4 log ( log ( X 78 ) ) 3 + 2 . sin ( X 1 + X 12 + 2 sin ( X 40 3 ) + X 77 + 1.4 log ( X 78 ) 3 ) + 2 . sin ( X 1 + X 40 3 + sin ( X 40 3 ) 3 + X 77 + tan ( 1.4 log ( X 83 ) ) 3 ) + 2.88539 log ( X 1 ) + sin ( X 11 + X 12 ) + 2 . sin ( X 11 + X 12 3 ) + 11.5416 log ( log ( X 12 ) ) + 2 . sin ( 2 X 16 3 + X 4 X 18 + sin ( X 40 ) + 1.4 log ( X 48 ) + 1.4 log ( 1.4 log ( X 48 ) ) + X 60 3 3 ) + 2 . sin ( 2 X 16 3 + X 4 X 18 + sin ( X 40 3 ) + 1.4 log ( X 48 ) + 1.4 log ( 1.4 log ( X 48 ) ) + X 60 3 3 ) + 2 . X 16 3 + 2 . X 56 X 18 3 + 2.88539 log ( X 48 ) + 2 . sin ( X 60 3 ) + 2.88539 log ( X 78 )
y 2 = min ( 1.4 log ( X 0 + tan ( log ( X 88 ) ) ) , min ( 1.4 log ( X 0 + tan ( log ( X 88 ) ) ) , 1.4 log ( max ( X 28 , 1.2 log ( X 28 ) ) min ( X 80 , sin ( X 47 ) ) ) + 1.4 log ( 1.4 log ( max ( X 30 , X 4 ) ) ) + min ( 1.4 log ( 0.4 X 25 log ( X 0 ) ) + 0.4 log ( X 15 X 14 ) + log ( X 77 X 39 ) , 1.4 log ( tan ( X 63 ) ) ) + 1.4 log ( cos ( tan ( log ( X 88 ) ) ) 1 . min ( X 80 , sin ( X 47 ) ) ) + 1.4 log ( 1.4 log ( 1.4 log ( 1.4 log ( X 0 ) ) ) + sin ( X 0 ) 1 . sin ( X 47 ) ) + 1.4 log ( 1.4 log ( X 0 ) ) + 0.4 log ( X 15 X 14 ) + log ( X 19 ) + 0.4 log ( X 30 ) + 0.4 log ( cos ( X 39 ) ) + 1.4 log ( X 49 ) + 3 . X 52 3 + 1.4 log ( 1.4 log ( X 52 3 ) ) + log ( X 73 ) ) + min ( 1.4 log ( X 0 + sin ( X 16 ) ) , 1.4 log ( max ( 1.4 log ( X 0 ) , log ( X 19 ) ) sin ( X 47 ) ) + 1.4 log ( 1.4 log ( 1.4 log ( X 0 ) ) + 0.4 log ( 0.4 log ( X 16 ) ) ) + log ( X 18 ) + 0.4 log ( cos ( X 45 ) ) + X 52 3 + 0.4 log ( X 78 ) ) + max ( X 0 , X 30 , X 44 X 58 ) min ( X 80 , sin ( X 4 ) ) + max ( X 18 X 44 , 1.4 log ( X 49 ) ) 1 . min ( X 80 , sin ( X 47 ) ) + min ( 1.4 log ( 0.4 X 25 log ( X 0 ) ) + 0.4 log ( X 15 X 14 ) + log ( X 77 X 39 ) , 1.4 log ( tan ( X 63 ) ) ) + log ( X 19 ) 1 . min ( X 80 1 . X 7 , 1.4 log ( X 0 ) ) + min ( log ( X 19 ) , 1.4 log ( 1.4 log ( X 28 ) 1 . min ( X 80 , sin ( X 47 ) ) ) + 1.4 log ( X 24 ) + X 52 3 ) + 1.4 log ( cos ( tan ( log ( X 88 ) ) ) 1 . min ( X 80 , sin ( X 47 ) ) ) + 1.4 log ( 1.4 log ( 1.4 log ( 1.4 log ( X 0 ) ) ) + sin ( X 0 ) 1 . sin ( X 47 ) ) + 1.4 log ( 1.4 log ( X 0 ) ) + 3 . log ( X 19 ) + 0.4 log ( 0.4 log ( sin ( X 3 ) ) ) + 0.4 log ( X 30 ) + 0.4 log ( cos ( X 39 ) ) + 0.4 log ( cos ( X 45 ) ) + 1.4 log ( X 49 ) + 2 . X 52 3 + 1.4 log ( 1.4 log ( X 52 3 ) ) + 0.4 log ( cos ( X 52 3 ) ) + log ( X 73 ) ) + min ( 1.4 log ( X 0 + sin ( X 16 ) ) , 1.4 log ( max ( 1.4 log ( X 0 ) , log ( X 19 ) ) 1 . sin ( X 47 ) ) + 1.4 log ( 1.4 log ( 1.4 log ( X 0 ) ) + 0.4 log ( 0.4 log ( X 16 ) ) ) + log ( X 18 ) + 0.4 log ( cos ( X 45 ) ) + X 52 3 + 0.4 log ( X 78 ) ) + max ( X 0 , X 30 , X 44 X 58 ) 1 . min ( X 80 , sin ( X 4 ) ) + max ( X 18 X 44 , 1.4 log ( X 49 ) ) 1 . min ( X 80 , sin ( X 47 ) ) + log ( X 19 ) 1 . min ( X 80 1 . X 7 , 1.4 log ( X 0 ) ) + min ( log ( X 19 ) , 1.4 log ( 1.4 log ( X 28 ) 1 . min ( X 80 , sin ( X 47 ) ) ) + 1.4 log ( X 24 ) + X 52 3 ) + 2 . log ( X 19 ) + 0.4 log ( 0.4 log ( sin ( X 3 ) ) ) + 0.4 log ( cos ( X 45 ) ) + 0.4 log ( cos ( X 52 3 ) )
y 3 = 6 ( sec ( cos ( X 0 ) ) ( ( 2.4 cot ( cos ( cos ( tan ( X 82 ) ) ) ) ( 2.4 ( sec ( cos ( X 0 ) ) ( ( 2.4 cot ( cos ( cos ( tan ( X 82 ) ) ) ) ( sec ( cos ( X 0 ) ) ( cot ( cos ( cos ( log ( X 0 ) ) ) ) ( cot ( cos ( cos ( log ( X 0 ) ) ) ) ( 1.4 ( max ( log ( X 0 ) , 2.4 ( sec ( cos ( X 0 ) ) ( cot ( cos ( cos ( log ( X 0 ) ) ) ) ( cot ( cos ( cos ( log ( cos ( log ( X 0 ) ) ) ) ) ) ( 1.4 ( max ( X 70 , X 84 , log ( X 0 ) ) ) + cos ( | X 86 3 | ) ) + log ( max ( X 28 , X 36 , log ( max ( X 18 , X 36 , X 70 , log ( X 0 ) , cos ( max ( cos ( log ( X 43 ) ) , log ( X 43 ) ) ) + 1.4 ( cos ( log ( X 0 ) ) ) X 28 , log ( max ( X 59 , X 40 X 68 , X 70 ) ) ) ) ) ) ) + log ( log ( max ( X 70 , log ( X 0 ) , 3.55 log ( X 49 ) ) ) ) ) + log ( max ( X 28 , X 36 , log ( X 36 ) ) ) ) ) ) + cos ( | X 86 3 | ) ) + log ( max ( X 28 , X 36 , log ( max ( 4485.04 , X 18 , X 36 , X 70 , log ( X 0 ) , log ( max ( X 40 X 68 , X 69 , X 70 ) ) ) ) ) ) ) + log ( log ( max ( X 70 , log ( X 0 ) , 3.55 log ( X 49 ) ) ) ) ) + log ( max ( X 28 , log ( max ( X 70 , log ( X 0 ) , 2.4 ( log ( max ( X 28 , X 36 , log ( max ( X 40 X 68 , X 70 , X 81 ) ) ) ) + cos ( | X 86 3 | ) | cos ( cos ( log ( cos ( log ( X 0 ) ) ) ) ) | ) ) ) ) ) ) + log ( max ( X 28 , X 36 , log ( max ( X 70 , cos ( log ( X 43 ) ) , log ( X 0 ) , log ( X 43 ) ) ) ) ) ) / | cos ( cos ( log ( cos ( log ( X 0 ) ) ) ) ) | + log ( log ( cos ( log ( X 0 ) ) ) ) ) + log ( max ( X 28 , X 36 , log ( max ( X 28 , X 36 ) ) ) ) ) + log ( max ( X 28 , log ( max ( X 70 , log ( X 0 ) , 2.4 ( log ( max ( X 28 , X 36 , log ( max ( X 40 X 68 , X 70 , X 81 ) ) ) ) + cos ( | X 86 3 | ) | cos ( cos ( log ( cos ( log ( X 0 ) ) ) ) ) | ) ) ) ) ) ) + log ( max ( X 28 , X 36 , log ( max ( X 70 , cos ( log ( X 43 ) ) , log ( X 0 ) , log ( X 43 ) ) ) ) ) ) / | cos ( cos ( log ( cos ( log ( X 0 ) ) ) ) ) | + log ( log ( cos ( log ( X 0 ) ) ) ) ) + log ( max ( X 28 , X 36 , log ( log ( X 0 ) ) ) ) )
y 4 = max ( X 22 , max ( X 22 , X 88 , max ( X 22 , X 76 , X 88 , 1.4 log ( min ( X 80 , 0.4 log ( max ( X 0 , 0.4 log ( X 20 ) ) ) ) ) , max ( X 16 , X 18 , X 22 , X 25 , X 28 , X 59 , X 88 , max ( X 18 , X 59 , X 88 , 1.4 log ( min ( X 80 , 0.4 log ( log ( 0.4 log ( max ( X 51 X 21 , 1.4 log ( | log ( max ( X 0 , 0.4 log ( X 20 ) ) ) | ) ) ) + | tan ( X 59 ) | ) ) ) ) + max ( X 1 , X 88 , X 27 + 1.4 log ( 0.4 log ( log ( X 8 ) ) ) , X 49 | X 0 | min ( X 77 , X 79 ) ) ) + max ( X 22 , X 76 , X 88 , 1.4 log ( min ( X 80 , 0.4 log ( max ( X 0 , 0.4 log ( X 20 ) ) ) ) ) , max ( X 28 , 1.4 log ( min ( X 2 , X 50 + X 9 , 0.4 log ( 0.4 log ( X 20 ) ) ) ) , max ( X 22 , X 76 , 1.4 log ( min ( X 2 , X 80 , 0.4 log ( log ( 0.4 log ( max ( X 51 X 21 , 0.4 log ( X 27 3 ) ) ) + | log ( 0.4 log ( X 20 ) ) | ) ) ) ) + max ( X 16 , X 18 , X 22 , X 25 , X 28 , X 40 , X 46 , X 59 , sin ( X 44 ) ) ) + 1.4 log ( min ( X 2 , 0.4 log ( log ( 0.4 log ( max ( X 51 X 21 , X 88 ) ) + | log ( 0.4 log ( X 20 ) ) | ) ) ) ) + 1.4 log ( min ( X 80 , 0.4 log ( log ( tan ( X 59 ) + 0.4 log ( X 51 X 21 ) ) ) ) ) ) + max ( X 16 , X 18 , X 22 , X 25 , X 28 , X 40 , X 46 , X 59 , sin ( X 44 ) ) ) + 1.4 log ( min ( X 2 , X 80 , 0.4 log ( log ( 0.4 log ( max ( X 0 , 0.4 log ( X 20 ) ) ) ) ) ) ) , sin ( X 44 ) ) + max ( X 28 , 1.4 log ( min ( X 2 , X 50 + X 9 , 0.4 log ( 0.4 log ( X 20 ) ) ) ) , max ( X 22 , X 76 , 1.4 log ( min ( X 2 , X 80 , 0.4 log ( log ( 0.4 log ( max ( X 51 X 21 , 0.4 log ( X 27 3 ) ) ) + | log ( 0.4 log ( X 20 ) ) | ) ) ) ) + max ( X 16 , X 18 , X 22 , X 25 , X 28 , X 40 , X 46 , X 59 , sin ( X 44 ) ) ) + 1.4 log ( min ( X 2 , 0.4 log ( log ( 0.4 log ( max ( X 51 X 21 , X 88 ) ) + | log ( 0.4 log ( X 20 ) ) | ) ) ) ) + 1.4 log ( min ( X 80 , 0.4 log ( log ( tan ( X 59 ) + 0.4 log ( X 51 X 21 ) ) ) ) ) ) ) + max ( X 18 , X 59 , X 88 , 1.4 log ( min ( X 80 , 0.4 log ( log ( 0.4 log ( max ( X 51 X 21 , 1.4 log ( | log ( max ( X 0 , 0.4 log ( X 20 ) ) ) | ) ) ) + | tan ( X 59 ) | ) ) ) ) + max ( X 1 , X 88 , X 27 + 1.4 log ( 0.4 log ( log ( X 8 ) ) ) , X 49 | X 0 | min ( X 77 , X 79 ) ) ) + 1.4 log ( min ( X 2 , X 80 , 0.4 log ( log ( 0.4 log ( max ( X 0 , 0.4 log ( X 20 ) ) ) ) ) ) ) ) + 1.4 log ( 0.4 log ( max ( X 0 , 0.4 log ( X 20 ) ) ) ) ) + 1.4 log ( 0.4 log ( 0.4 log ( X 20 ) ) )
y 5 = max ( X 16 , max ( X 16 , max ( X 16 , max ( X 16 , max ( X 46 , ( 1.4 log ( sin ( ( max ( X 16 , log ( X 33 ) max ( 1.4 log ( 1.20112 log ( X 73 3 ) ) , tan ( 0.4 log ( X 73 ) ) ) ) ) / | X 28 | ) ) + max ( X 42 , tan ( 1.4 log ( 1.4 log ( 1.2 log ( min ( X 14 , X 18 ) ) ) ) ) ) ) 1 3 ) + 1.4 log ( max ( X 16 , ( 1.4 log ( sin ( max ( X 0 , X 32 ) ) ) + tan ( 1.4 log ( 0.65 log ( X 69 ) ) ) ) 1 3 ) + 1.4 log ( min ( X 14 , X 18 ) ) ) + max ( X 16 , max ( X 16 , X 46 , max ( X 16 , X 46 , ( max ( X 16 , X 46 , max ( X 16 , ( max ( X 16 , tan ( 1.4 log ( 1.2 log ( min ( X 3 , X 66 ) ) ) ) ) + max ( X 16 , 1.13 log ( sin ( max ( X 0 , X 32 ) ) ) 3 ) + 1.4 log ( sin ( X 67 ) ) ) 1 3 ) + 1.4 log ( min ( X 18 , X 86 ) ) ) ) 1 3 ) + max ( X 16 , ( 1.4 log ( max ( X 16 , log ( X 0 ) , max ( X 16 , min ( X 18 , X 74 ) , tan ( 1.4 log ( 1.4 log ( X 18 ) ) ) ) ) ) + max ( X 16 , tan ( 1.4 log ( 1.4 log ( X 18 ) ) ) ) + max ( X 16 , X 4 9 ) ) 1 3 ) ) + max ( X 24 , tan ( tan ( 1.4 log ( 1.20112 log ( X 36 ) ) ) ) ) ) ) + max ( 1.4 log ( min ( X 18 , X 86 ) ) , tan ( 0.4 log ( X 73 ) ) ) ) + max ( X 16 , max ( X 16 , ( max ( X 16 , max ( X 16 , ( max ( X 16 , 1.13 log ( sin ( max ( X 0 , X 32 ) ) ) 3 ) + X 4 9 + 1.4 log ( sin ( 1.4 log ( X 67 ) ) ) ) 1 3 ) + 1.4 log ( min ( X 18 , X 86 ) ) ) ) 1 3 ) + max ( X 16 , ( max ( X 16 , max ( X 16 , X 46 , X 4 3 , tan ( 1.4 log ( min ( X 14 , X 18 ) ) ) ) + max ( X 16 , 1.4 log ( sin ( log ( X 69 ) log ( X 56 ) ) ) ) + 1.4 log ( log ( X 33 ) log ( X 56 ) ) ) + max ( X 16 , tan ( 1.4 log ( 1.4 log ( min ( X 18 , X 74 ) ) ) ) ) ) 1 3 ) ) + max ( X 18 , max ( X 16 , tan ( 1.4 log ( max ( X 16 , tan ( 1.4 log ( 1.4 log ( X 18 ) ) ) ) ) ) ) + 1.4 log ( sin ( min ( X 18 , X 74 ) log ( X 56 ) ) ) ) + 1.4 log ( 1.4 log ( min ( X 74 , X 4 3 ) ) ) ) + max ( 1.4 log ( max ( X 24 , tan ( 1.4 log ( max ( X 16 , 1.4 log ( sin ( X 14 ) ) + cos ( X 27 X 18 ) 3 ) + max ( X 16 , tan ( 1.4 log ( X 63 ) ) ) + X 2 ) ) ) ) , tan ( tan ( 1.4 log ( 1.20112 log ( X 73 ) ) ) ) ) )
It should be noted that during GPSC execution, the division, square root, natural logarithm, and base-2 and base-10 logarithm functions are replaced with protected versions to avoid numerical errors (e.g., division by zero or the logarithm of a non-positive value). If Equations (A1)–(A5) are used, the protected versions of these functions, defined as follows, must be applied:
  • Division function
    y_{\mathrm{DIV}}(x_1, x_2) = \begin{cases} \dfrac{x_1}{x_2} & \text{if } |x_2| > 0.001 \\ 1 & \text{if } |x_2| \le 0.001 \end{cases} \tag{A6}
  • Square root
    y_{\mathrm{SQRT}}(x) = \sqrt{|x|} \tag{A7}
  • Natural logarithm
    y_{\log}(z) = \begin{cases} \log(|z|) & \text{if } |z| > 0.001 \\ 0 & \text{if } |z| \le 0.001 \end{cases} \tag{A8}
Equation (A8) also applies to log2 and log10; the log function is simply replaced with log2 or log10, respectively. In Equations (A6)–(A8), z, x_1, and x_2 denote arbitrary variables.
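To make these definitions concrete, the following is a minimal Python sketch of the protected functions, useful when evaluating Equations (A1)–(A5) outside of the GPSC framework. The function names are ours, and the se_to_label helper (a sigmoid followed by a 0.5 threshold, a common decision rule for symbolic classifiers) is an illustrative assumption rather than a rule taken from this paper.

```python
import numpy as np

# A minimal sketch (ours, not the paper's code) of the protected functions
# in Equations (A6)-(A8), using the 0.001 guard threshold defined above.

def protected_div(x1, x2):
    # Equation (A6): ordinary division when |x2| > 0.001, otherwise 1.
    x1, x2 = np.asarray(x1, dtype=float), np.asarray(x2, dtype=float)
    safe = np.abs(x2) > 0.001
    return np.where(safe, x1 / np.where(safe, x2, 1.0), 1.0)

def protected_sqrt(x):
    # Equation (A7): square root of the absolute value.
    return np.sqrt(np.abs(x))

def protected_log(z, log_fn=np.log):
    # Equation (A8): log of |z| when |z| > 0.001, otherwise 0.
    # Pass log_fn=np.log2 or np.log10 for the other two bases.
    z = np.asarray(z, dtype=float)
    safe = np.abs(z) > 0.001
    return np.where(safe, log_fn(np.where(safe, np.abs(z), 1.0)), 0.0)

def se_to_label(se_output):
    # Hypothetical decision rule: squash the SE output with a sigmoid
    # and threshold at 0.5 to obtain a binary class label.
    return (1.0 / (1.0 + np.exp(-np.asarray(se_output))) >= 0.5).astype(int)
```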

References

  1. Polic, M.; Maric, B.; Orsag, M. Soft robotics approach to autonomous plastering. In Proceedings of the 2021 IEEE 17th International Conference on Automation Science and Engineering (CASE), Lyon, France, 23–27 August 2021; pp. 482–487.
  2. Bonci, A.; Cen Cheng, P.D.; Indri, M.; Nabissi, G.; Sibona, F. Human-robot perception in industrial environments: A survey. Sensors 2021, 21, 1571.
  3. Anđelić, N.; Car, Z.; Šercer, M. Neural Network-Based Model for Classification of Faults During Operation of a Robotic Manipulator. Tehnički Vjesn. 2021, 28, 1380–1387.
  4. Mellit, A.; Kalogirou, S. Artificial intelligence and internet of things to improve efficacy of diagnosis and remote sensing of solar photovoltaic systems: Challenges, recommendations and future directions. Renew. Sustain. Energy Rev. 2021, 143, 110889.
  5. Rafique, F.; Fu, L.; Mai, R. End to end machine learning for fault detection and classification in power transmission lines. Electr. Power Syst. Res. 2021, 199, 107430.
  6. Theodoropoulos, P.; Spandonidis, C.C.; Giannopoulos, F.; Fassois, S. A Deep Learning-Based Fault Detection Model for Optimization of Shipping Operations and Enhancement of Maritime Safety. Sensors 2021, 21, 5658.
  7. Dang, H.L.; Kim, J.; Kwak, S.; Choi, S. Series DC arc fault detection using machine learning algorithms. IEEE Access 2021, 9, 133346–133364.
  8. Belhadi, A.; Djenouri, Y.; Srivastava, G.; Jolfaei, A.; Lin, J.C.W. Privacy reinforcement learning for faults detection in the smart grid. Ad Hoc Netw. 2021, 119, 102541.
  9. Tayyab, S.M.; Chatterton, S.; Pennacchi, P. Fault detection and severity level identification of spiral bevel gears under different operating conditions using artificial intelligence techniques. Machines 2021, 9, 173.
  10. Dou, Z.; Sun, Y.; Wu, Z.; Wang, T.; Fan, S.; Zhang, Y. The architecture of mass customization-social Internet of Things system: Current research profile. ISPRS Int. J. Geo-Inf. 2021, 10, 653.
  11. Zhu, J.; Gong, Z.; Sun, Y.; Dou, Z. Chaotic neural network model for SMISs reliability prediction based on interdependent network SMISs reliability prediction by chaotic neural network. Qual. Reliab. Eng. Int. 2021, 37, 717–742.
  12. Eski, I.; Erkaya, S.; Savas, S.; Yildirim, S. Fault detection on robot manipulators using artificial neural networks. Robot. Comput.-Integr. Manuf. 2011, 27, 115–123.
  13. Caccavale, F.; Cilibrizzi, P.; Pierri, F.; Villani, L. Actuators fault diagnosis for robot manipulators with uncertain model. Control Eng. Pract. 2009, 17, 146–157.
  14. Yan, Z.; Tan, J.; Liang, B.; Liu, H.; Yang, J. Active Fault-Tolerant Control Integrated with Reinforcement Learning Application to Robotic Manipulator. In Proceedings of the 2022 American Control Conference (ACC), Atlanta, GA, USA, 8–10 June 2022; pp. 2656–2662.
  15. Matsuno, T.; Huang, J.; Fukuda, T. Fault detection algorithm for external thread fastening by robotic manipulator using linear support vector machine classifier. In Proceedings of the 2013 IEEE International Conference on Robotics and Automation, Karlsruhe, Germany, 6–10 May 2013; pp. 3443–3450.
  16. Piltan, F.; Prosvirin, A.E.; Sohaib, M.; Saldivar, B.; Kim, J.M. An SVM-based neural adaptive variable structure observer for fault diagnosis and fault-tolerant control of a robot manipulator. Appl. Sci. 2020, 10, 1344.
  17. Khireddine, M.S.; Chafaa, K.; Slimane, N.; Boutarfa, A. Fault diagnosis in robotic manipulators using artificial neural networks and fuzzy logic. In Proceedings of the 2014 World Congress on Computer Applications and Information Systems (WCCAIS), Hammamet, Tunisia, 17–19 January 2014; pp. 1–6.
  18. UCI Machine Learning Repository: Robot Execution Failures Data Set. Available online: https://archive.ics.uci.edu/ml/datasets/Robot+Execution+Failures (accessed on 1 December 2022).
  19. Parisi, L.; RaviChandran, N. Genetic algorithms and unsupervised machine learning for predicting robotic manipulation failures for force-sensitive tasks. In Proceedings of the 2018 4th International Conference on Control, Automation and Robotics (ICCAR), Auckland, New Zealand, 20–23 April 2018; pp. 22–25.
  20. Koohi, T.; Mirzaie, E.; Tadaion, G. Failure prediction using robot execution data. In Proceedings of the 5th Symposium on Advances in Science and Technology, Mashhad, Iran, 12–17 May 2011; pp. 1–7.
  21. Liu, Y.; Wang, X.; Ren, X.; Lyu, F. Deep Convolution Neural Networks for the Classification of Robot Execution Failures. In Proceedings of the 2019 CAA Symposium on Fault Detection, Supervision and Safety for Technical Processes (SAFEPROCESS), Xiamen, China, 5–7 July 2019; pp. 535–540.
  22. Diryag, A.; Mitić, M.; Miljković, Z. Neural networks for prediction of robot failures. Proc. Inst. Mech. Eng. Part C J. Mech. Eng. Sci. 2014, 228, 1444–1458.
  23. Junior, J.J.A.M.; Pires, M.B.; Vieira, M.E.M.; Okida, S.; Stevan, S.L., Jr. Neural network to failure classification in robotic systems. J. Appl. Instrum. Control 2016, 4, 1–6.
  24. Dash, P.B.; Naik, B.; Nayak, J.; Vimal, S. Deep belief network-based probabilistic generative model for detection of robotic manipulator failure execution. Soft Comput. 2021, 1–13.
  25. Poli, R.; Langdon, W.; Mcphee, N. A Field Guide to Genetic Programming. 2008. Available online: http://www0.cs.ucl.ac.uk/staff/W.Langdon/ftp/papers/poli08_fieldguide.pdf (accessed on 15 December 2022).
  26. Lopes, L.S.; Camarinha-Matos, L.M. Feature transformation strategies for a robot learning problem. In Feature Extraction, Construction and Selection; Springer: Berlin/Heidelberg, Germany, 1998; pp. 375–391.
  27. Camarinha-Matos, L.M.; Lopes, L.S.; Barata, J. Integration and learning in supervision of flexible assembly systems. IEEE Trans. Robot. Autom. 1996, 12, 202–219.
  28. Fernández, A.; Garcia, S.; Herrera, F.; Chawla, N.V. SMOTE for learning from imbalanced data: Progress and challenges, marking the 15-year anniversary. J. Artif. Intell. Res. 2018, 61, 863–905.
  29. Anđelić, N.; Baressi Šegota, S.; Lorencin, I.; Car, Z. The Development of Symbolic Expressions for Fire Detection with Symbolic Classifier Using Sensor Fusion Data. Sensors 2022, 23, 169.
  30. Anđelić, N.; Lorencin, I.; Baressi Šegota, S.; Car, Z. The Development of Symbolic Expressions for the Detection of Hepatitis C Patients and the Disease Progression from Blood Parameters Using Genetic Programming-Symbolic Classification Algorithm. Appl. Sci. 2023, 13, 574.
  31. Anđelić, N.; Baressi Šegota, S.; Lorencin, I.; Glučina, M. Detection of Malicious Websites Using Symbolic Classifier. Future Internet 2022, 14, 358.
  32. Sturm, B.L. Classification accuracy is not enough. J. Intell. Inf. Syst. 2013, 41, 371–406.
  33. Hosmer, D.W.; Lemeshow, S.; Sturdivant, R. Area under the receiver operating characteristic curve. In Applied Logistic Regression, 3rd ed.; Wiley: Hoboken, NJ, USA, 2013; pp. 173–182.
  34. Buckland, M.; Gey, F. The relationship between recall and precision. J. Am. Soc. Inf. Sci. 1994, 45, 12–19.
  35. Chicco, D.; Jurman, G. The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation. BMC Genom. 2020, 21, 1–13.
Figure 1. The methodology research flowchart.
Figure 2. The distribution of the data points in the dataset, by class.
Figure 3. The distribution of the data points in the dataset, by class, after sorting for binary classification.
Figure 4. Correlation of the output (“Class”) variable and the inputs.
Figure 5. The mean and standard deviation values of EMs.
Table 1. The list of research papers, with the ML methods used and the fault-classification results achieved on dataset [18].

Reference | Methods | Results
[19] | GA-SOM, SOM | ACC: 91.95%
[20] | Naive Bayes, Boosted Naive Bayes, Bagged Naive Bayes, SVM, Boosted SVM, Bagged SVM, Decision Table (DT), Boosted DT, Bagged DT, Decision Tree (DTr), Boosted DTr, Bagged DTr, Plurality Voting, Stacking Meta Decision Trees, Stacking Ordinary Decision Trees | ACC: 95.24%
[21] | DCNN | ACC: 98.82%
[22] | NN with Bayesian regularization | ACC: 95.45%
[23] | MLP | ACC: 90.45%
[24] | DBN, C-support vector classifier, logistic regression, decision tree classifier, K-Nearest Neighbor Classifier, MLP, AdaBoost Classifier, Random Forest Classifier, Bagging Classifier, Voting Classifier | ACC: 80.486%
[3] | SNN | F1-Score: 1.00
Table 2. The results of dataset balancing methods.

Dataset Balancing Method | Number of Minority Class Samples | Number of Majority Class Samples | Total Number of Samples
Random Oversampling | 334 | 334 | 668
SMOTE | 334 | 334 | 668
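For context on how the counts in Table 2 arise, below is a hedged sketch of applying random oversampling and SMOTE with the imblearn library. The synthetic dataset is a stand-in for the reduced binary (fault vs. normal) dataset, so all sizes, seeds, and variable names here are illustrative only.

```python
from collections import Counter

from imblearn.over_sampling import RandomOverSampler, SMOTE
from sklearn.datasets import make_classification

# Synthetic stand-in for the reduced binary dataset; the real data have
# the input variables X_0, X_1, ... used in Equations (A1)-(A5) and a
# 334-sample majority class.
X, y = make_classification(n_samples=470, n_features=90,
                           weights=[0.71, 0.29], random_state=0)

X_ros, y_ros = RandomOverSampler(random_state=0).fit_resample(X, y)
X_sm, y_sm = SMOTE(random_state=0).fit_resample(X, y)

# Both methods duplicate or synthesize minority samples until the classes
# are equal, as in Table 2 (334 + 334 = 668 for the real dataset).
print(Counter(y_ros), Counter(y_sm))
```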
Table 3. The range of GPSC hyperparameters.

GPSC Hyperparameter | Range
PopSize | 100–1000
NumGens | 100–300
TourSize | 100–300
InitDepth | 3–12
Cross | 0.001–1
SubMute | 0.001–1
HoistMute | 0.001–1
PointMute | 0.001–1
StopCrit | 1 × 10⁻⁷ – 1 × 10⁻⁶
MaxSamp | 0.99–1
ConstRange | −10,000 – 10,000
ParsCoef | 1 × 10⁻⁵ – 2 × 10⁻⁴
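Since the hyperparameter values were chosen at random from the ranges in Table 3 for each GPSC execution, the search can be sketched as below. The function name and dictionary keys are ours, the negative exponents on StopCrit and ParsCoef follow our reading of the table, and the constraint that the four genetic-operator probabilities sum to at most 1 is an assumption about how such a search is typically made valid.

```python
import random

def draw_gpsc_hyperparameters():
    # Draw one random candidate configuration from the Table 3 ranges.
    while True:
        hp = {
            "PopSize": random.randint(100, 1000),
            "NumGens": random.randint(100, 300),
            "TourSize": random.randint(100, 300),
            # InitDepth as a (min, max) initial-tree-depth pair in 3-12,
            # matching the pairs reported in Table 4.
            "InitDepth": tuple(sorted(random.sample(range(3, 13), 2))),
            "Cross": random.uniform(0.001, 1),
            "SubMute": random.uniform(0.001, 1),
            "HoistMute": random.uniform(0.001, 1),
            "PointMute": random.uniform(0.001, 1),
            "StopCrit": random.uniform(1e-7, 1e-6),
            "MaxSamp": random.uniform(0.99, 1),
            "ConstRange": (random.uniform(-10000, 0),
                           random.uniform(0, 10000)),
            "ParsCoef": random.uniform(1e-5, 2e-4),
        }
        # Assumed constraint: re-draw if the genetic-operation
        # probabilities exceed 1 in total.
        if (hp["Cross"] + hp["SubMute"]
                + hp["HoistMute"] + hp["PointMute"]) <= 1:
            return hp
```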
Table 4. GPSC hyperparameters yielding the best performing models on each dataset (values listed in the order of Table 3).

Dataset Variation | GPSC Hyperparameters
Random Oversampling | 172, 293, 161, (4, 7), 0.1, 0.49, 0.36, 0.032, 7.78 × 10⁻⁷, 0.99, (−280.17, 5256.88), 6.95 × 10⁻⁶
SMOTE | 850, 227, 270, (6, 11), 0.39, 0.13, 0.23, 0.23, 1.4 × 10⁻⁷, 0.99, (−7689.72, 8984.85), 1.63 × 10⁻⁵
Table 5. The evaluation metric values, as well as the computational time and the lengths of the SEs.

Dataset Type | ACC (mean ± SD) | AUC (mean ± SD) | Precision (mean ± SD) | Recall (mean ± SD) | F1-Score (mean ± SD) | Average CPU Time per Simulation [min] | Length of SEs
Random Oversampling | 0.985 ± 0.0056 | 0.984 ± 0.0057 | 0.9882 ± 0.00273 | 0.982 ± 0.0086 | 0.985 ± 0.0058 | 100 | 729/368/78/119/159
SMOTE | 0.99 ± 0.0089 | 0.99 ± 0.00888 | 0.992 ± 0.006 | 0.9893 ± 0.0106 | 0.99 ± 0.0084 | – | 544/430/354/387/334
Table 6. The mean and standard deviation scores obtained using the best models on the non-augmented dataset.

Evaluation Metric | Mean Value | Standard Deviation
ACC | 0.9978 | 5 × 10⁻⁵
AUC | 0.998 | 3.48 × 10⁻⁵
Precision | 1.0 | 0
Recall | 0.997 | 1.6 × 10⁻⁵
F1-Score | 0.9985 | 2.74 × 10⁻⁵
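The statistics in Table 6 are the mean and standard deviation of each metric over the best SEs evaluated on the original (non-augmented) dataset. A sketch of that bookkeeping with scikit-learn is shown below; y_true and the per-SE prediction arrays are placeholders, and the function name is ours.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, roc_auc_score,
                             precision_score, recall_score, f1_score)

def summarize_metrics(y_true, per_se_predictions):
    # per_se_predictions: list of binary label arrays, one per best SE,
    # e.g., the labels produced by Equations (A1)-(A5).
    metric_fns = {"ACC": accuracy_score, "AUC": roc_auc_score,
                  "Precision": precision_score, "Recall": recall_score,
                  "F1-Score": f1_score}
    for name, fn in metric_fns.items():
        scores = [fn(y_true, y_pred) for y_pred in per_se_predictions]
        print(f"{name}: mean = {np.mean(scores):.4f}, "
              f"SD = {np.std(scores):.3g}")
```

Called with the five prediction vectors of the best SEs, this would reproduce the layout of Table 6.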
Table 7. The comparison of the results obtained in this paper with results from other research papers.

Reference | Methods | Results
[19] | GA-SOM, SOM | ACC: 91.95%
[20] | Naive Bayes, Boosted Naive Bayes, Bagged Naive Bayes, SVM, Boosted SVM, Bagged SVM, Decision Table (DT), Boosted DT, Bagged DT, Decision Tree (DTr), Boosted DTr, Bagged DTr, Plurality Voting, Stacking Meta Decision Trees, Stacking Ordinary Decision Trees | ACC: 95.24%
[21] | DCNN | ACC: 98.82%
[22] | NN with Bayesian regularization | ACC: 95.45%
[23] | MLP | ACC: 90.45%
[24] | DBN, C-support vector classifier, logistic regression, decision tree classifier, K-Nearest Neighbor Classifier, MLP, AdaBoost Classifier, Random Forest Classifier, Bagging Classifier, Voting Classifier | ACC: 80.486%
[3] | SNN | F1-Score: 100%
This paper | GPSC | ACC: 99.78%, AUC: 0.998, Precision: 100%, Recall: 99.7%, F1-Score: 99.85%