Generating Fake News Detection Model Using A Two-Stage Evolutionary Approach
Created by W.Langdon from
gp-bibliography.bib Revision:1.8194
@Article{Kong:2023:ACC,
author = "Jeffery T. H. Kong and W. K. Wong and
Filbert H. Juwono and Catur Apriono",
journal = "IEEE Access",
title = "Generating Fake News Detection Model Using A Two-Stage
Evolutionary Approach",
year = "2023",
volume = "11",
pages = "85067--85085",
abstract = "While fake news is morally reprehensible,
irresponsible parties intentionally use it to achieve
their goals by disseminating it to vulnerable and
targeted groups. Machine learning techniques have been
researched extensively to detect fake news. On the
other hand, evolutionary-based algorithms are now
gaining popularity in the research community. In this
study, a two-stage evolutionary approach is proposed to
generate and optimise a mathematical equation for fake
news detection. In the first stage, a tree-based Genetic
Programming (GP) algorithm is used to generate
mathematical expressions to detect correlations between
the language-independent (Lang-IND) features, extracted
from the Fake.my-COVID19 dataset, a newly curated fake
news dataset in mixed Malay-English language. The
uniqueness of the proposed approach is that the
mathematical expressions are formed from basic arithmetic
operators, optionally extended with complex operators;
the full set comprises addition, multiplication,
subtraction, division, square, abs, log1p, sign, square
root, and exponential, with the Lang-IND features as the
variables. Prior to the second stage of the evolutionary
approach, a sensitivity analysis is applied to shorten
the best equation while maintaining the F1-score
performance. In the second stage, Adaptive
Differential Evolution (ADE) is used to fine-tune the
mathematical model. The experimental results conclude
that the proposed two-stage evolutionary approach can
be applied in fake news detection and the model can
learn to predict using the Lang-IND features. Results
from the first stage show that the equation from GP
achieves an F1-score of 83.23\% on the Fake.my-COVID19
dataset using complex arithmetic operators at a tree
depth of 8. After the fine-tuning stage, the F1-score
increases to 84.44\%. The proposed two-stage evolutionary
approach outperforms the baseline performance of six
commonly used machine learning algorithms, of which
Random Forest has the highest F1-score at 84.07\%. The
mathematical model is also tested separately on two
other unseen datasets that differ in domain topic or
language and achieves acceptable F1-scores.",
keywords = "genetic algorithms, genetic programming, differential
evolution, Fake news, Feature extraction, Mathematical
models, Random forests, Machine learning, COVID-19,
Social networking (online), Fake news detection,
evolutionary approach",
DOI = "doi:10.1109/ACCESS.2023.3303321",
ISSN = "2169-3536",
notes = "Also known as \cite{10210550}",
}
Genetic Programming entries for
Jeffery T H Kong
Wei Kitt Wong
Filbert H Juwono
Catur Apriono