Optimization of Computing and Networking Resources of a Hadoop Cluster Based on Software Defined Network
Created by W.Langdon from
gp-bibliography.bib Revision:1.8051
@Article{Khaleel:2018:IEEEAccess,
author = "Ali Khaleel and Hamed Al-Raweshidy",
journal = "IEEE Access",
title = "Optimization of Computing and Networking Resources of
a Hadoop Cluster Based on Software Defined Network",
year = "2018",
volume = "6",
pages = "61351--61365",
abstract = "In this paper, we discuss some challenges regarding
the Hadoop framework. One of the main ones is the
computing performance of Hadoop MapReduce jobs in terms
of CPU, memory, and hard disk I/O. The networking side
of a Hadoop cluster is another challenge, especially
for large-scale clusters with many switch devices and
computing nodes, such as a data centre network. The
configurations of Hadoop MapReduce parameters can have
a significant impact on the computing performance of a
Hadoop cluster. All issues relating to Hadoop MapReduce
parameter settings are addressed. Some significant
parameters of Hadoop MapReduce are tuned using a novel
intelligent technique based on both genetic programming
and a genetic algorithm, with the aim of optimizing the
performance of a Hadoop MapReduce job. The Hadoop
framework has more than 150 configurations of
parameters and hence, setting them manually is not only
difficult, but also time-consuming. Consequently, the
above-mentioned algorithms are used to search for the
optimum values of parameter settings. The
software-defined network (SDN) is also employed to
improve the networking performance of a Hadoop cluster,
thus accelerating Hadoop jobs. Experiments have been
carried out on two typical applications of Hadoop,
namely the Word Count and Tera Sort applications, using
14 virtual machines in both a
traditional network and an SDN. The results for the
traditional network show that our proposed technique
improves MapReduce jobs' performance for 20 GB with the
Word Count application by 69.63\% and 30.31\%
when compared to the default and Gunther work,
respectively. For the Tera Sort application, the
performance of Hadoop MapReduce is improved by
73.39\% and 55.93\%, compared with the
default and Gunther work, respectively. Moreover, the
experimental results in an SDN environment showed that
the performance of a Hadoop MapReduce job is further
improved due to the advantages of the intelligent and
centralized management that the SDN provides. Another
experiment has been conducted to evaluate the
performance of Hadoop jobs using a large-scale cluster
in a data centre network, also based on SDN, with the
results revealing that this exceeded the performance of
a conventional network.",
keywords = "genetic algorithms, genetic programming",
DOI = "10.1109/ACCESS.2018.2876385",
ISSN = "2169-3536",
notes = "Also known as \cite{8494732}",
}