Application of Symbolic Classifiers and Multi-Ensemble Threshold Techniques for Android Malware Detection
Created by W.Langdon from
gp-bibliography.bib Revision:1.8506
@Article{andelic:2025:BDCC,
author = "Nikola Andelic and Sandi {Baressi Segota} and
Vedran Mrzljak",
title = "Application of Symbolic Classifiers and Multi-Ensemble
Threshold Techniques for Android Malware Detection",
journal = "Big Data and Cognitive Computing",
year = "2025",
volume = "9",
number = "2",
pages = "Article No. 27",
keywords = "genetic algorithms, genetic programming",
ISSN = "2504-2289",
URL = "https://www.mdpi.com/2504-2289/9/2/27",
DOI = "doi:10.3390/bdcc9020027",
abstract = "Android malware detection using artificial
intelligence today is a mandatory tool to prevent cyber
attacks. To address this problem in this paper the
proposed methodology consists of the application of
genetic programming symbolic classifier (GPSC) to
obtain symbolic expressions (SEs) that can detect if
an Android application is malware or not. To find the optimal
combination of GPSC hyperparameter values, the random
hyperparameter values search (RHVS) method was used, and
the GPSC was trained using 5-fold cross-validation
(5FCV). It should be noted that the initial dataset is
highly imbalanced (publicly available dataset). This
problem was addressed by applying various preprocessing
and oversampling techniques thus creating a huge number
of balanced dataset variations and on each dataset
variation the GPSC was trained. Since the dataset has
many input variables three different approaches were
considered: the initial investigation with all input
variables, input variables with high feature
importance, and application of principal component
analysis. After the SEs with the highest classification
performance were obtained, they were used in
threshold-based voting ensembles (TBVEs), and the threshold
values were adjusted to improve classification
performance. Multiple TBVEs (multi-TBVE) were developed, and using
them a robust system for Android malware detection
was achieved, with the highest accuracy of 0.98.",
notes = "also known as \cite{bdcc9020027}",
}