Detecting phishing e-mails using text and data mining
Created by W.Langdon from
gp-bibliography.bib Revision:1.8081
@InProceedings{Pandey:2012:ICCIC,
author = "Mayank Pandey and Vadlamani Ravi",
booktitle = "Computational Intelligence \& Computing Research (ICCIC),
2012 IEEE International Conference on",
title = "Detecting phishing e-mails using text and data
mining",
year = "2012",
DOI = "10.1109/ICCIC.2012.6510259",
abstract = "This paper presents text and data mining in tandem to
detect the phishing email. The study employs Multilayer
Perceptron (MLP), Decision Trees (DT), Support Vector
Machine (SVM), Group Method of Data Handling (GMDH),
Probabilistic Neural Net (PNN), Genetic Programming
(GP) and Logistic Regression (LR) for classification. A
dataset of 2500 phishing and non phishing emails is
analysed after extracting 23 keywords from the email
bodies using text mining from the original dataset.
Further, we selected 12 most important features using
t-statistic based feature selection. Here, we did not
find statistically significant difference in
sensitivity as indicated by t-test at 1 percent level of
significance, both with and without feature selection
across all techniques except PNN. Since, the GP and DT
are not statistically significantly different either
with or without feature selection at 1 percent level of
significance, DT should be preferred because it yields
'if-then' rules, thereby increasing the
comprehensibility of the system.",
keywords = "genetic algorithms, genetic programming,
Classification, Decision Tree, Group Method Of Data
Handling, Logistic regression, Multilayer Perceptron,
Phishing webpage, Probabilistic Neural Network, Support
Vector Machine, Text mining",
notes = "Also known as \cite{6510259}",
}