Improvement of pulsars detection using dataset balancing methods and symbolic classification ensemble
Created by W.Langdon from
gp-bibliography.bib Revision:1.8194
@Article{ANDELIC:2024:ascom,
author = "N. Andelic",
title = "Improvement of pulsars detection using dataset
balancing methods and symbolic classification
ensemble",
journal = "Astronomy and Computing",
volume = "47",
pages = "100801",
year = "2024",
ISSN = "2213-1337",
DOI = "doi:10.1016/j.ascom.2024.100801",
URL = "https://www.sciencedirect.com/science/article/pii/S2213133724000167",
keywords = "genetic algorithms, genetic programming, Dataset
balancing methods, Genetic programming symbolic
classifier, Pulsars detection",
abstract = "Highly accurate detection of pulsars is mandatory.
With the application of machine learning (ML)
algorithms, the detection of pulsars can certainly be
improved if the dataset is balanced. In this paper, the
publicly available dataset (HTRU2) is highly imbalanced
so various balancing methods were applied. The balanced
dataset was used in genetic programming symbolic
classifier (GPSC) to obtain symbolic expressions (SEs)
that can detect pulsars with high classification
accuracy. To find the optimal combination of GPSC
hyperparameters the random hyperparameter search (RHS)
method was developed and applied. The GPSC was trained
using 5-fold cross-validation so after each training a
total of 5 SEs were obtained. The best set of SEs are
selected based on their classification performance and
all of them are applied on the original dataset. The
best classification accuracy (ACC), the area under
receiver operating characteristic (AUC), precision,
recall, and f1-score were achieved in the case of the
dataset balanced with the AllKNN method i.e. all mean
evaluation metric values are equal to 0.995. The
ensemble consisted of 25 SEs that achieved the
ACC=0.978, AUC=0.9452, Precision=0.905, Recall=0.9963,
and F1-Score=0.94877, on the original dataset.",
}
Genetic Programming entries for
Nikola Andelic