1 Introduction

The availability of fresh drinking water is a focus of major concern throughout the globe. The presence of arsenic and its compounds makes water unsuitable for drinking purposes. The contamination of water by arsenic is mainly due to natural or anthropogenic processes such as oxidation, reduction, biochemical methylation, precipitation, metal plating, glass industries, mining and fertilizers (Choong et al. 2007; Shevade and Ford 2004). Arsenic usually exists as arsenic (III) and arsenic (V) in natural water. As(III) species are more harmful than arsenic (V) because of lower stability and high mobility in the water environment (Wang and Mulligan 2006; Vaughan and Reed 2005; Asli et al. 2013). Research reports have shown that arsenic poisoning causes skin, lung and liver cancers besides other effects (Jain and Ali 2000). Due to its detrimental effects on human health, the World Health Organization (WHO) recommends a maximum permissible limit of 10 μg/L, beyond which water is undesirable for drinking (WHO 2008).

The removal of arsenic from water by a cost-effective method is a challenging task that has gained much attention from researchers. A number of methods and materials have been used for the removal of arsenic. Among them, adsorption is a promising and attractive process when the adsorbent is suitable and versatile (Ma et al. 2011; Chandra et al. 2010; Wan Ngah et al. 2011).

Modeling and optimization of the removal process are an important part of studying its effectiveness. In the past, artificial neural networks, response surface methodology, neuro-fuzzy systems and particle swarm optimization have been widely used to predict contaminant removal from water systems with reasonable accuracy (Kotti et al. 2013; Dora et al. 2013; Aleboyeh et al. 2008; Akratos et al. 2008; Chu 2003; Saha et al. 2010; Elmolla et al. 2010; Khajeh and Modarress 2010). More recently, some researchers have applied genetic programming (GP) and support vector machine (SVM) tools to prediction and optimization problems. Examples include the prediction of carbon monoxide concentrations using hybrid Partial Least Square Support Vector Machine (LS-SVM) models (Garg et al. 2014; Yeganeh et al. 2012), and the use of GP techniques for photovoltaic plant output forecasting (Russo et al. 2014), modeling of the vaporization enthalpy of petroleum fractions and pure hydrocarbons (Parhizgar et al. 2013) and groundwater level prediction and simulation (Fallah-Mehdipour et al. 2013).

However, As(III) removal by cerium oxide tetraethylenepentamine and the prediction of As(III) removal processes by GP and SVM techniques have not yet been tested. Hence, a more detailed study is needed to understand the prediction behavior of both models. The main motivation behind this study was to investigate cerium oxide tetraethylenepentamine as a low-cost hybrid material for As(III) removal and to model the removal process using GP and LS-SVM techniques. Furthermore, the two models were compared on the basis of statistical measures, the sensitivity of the variables and confirmatory experiments.

2 Experimental

2.1 Chemicals

All chemicals used were of AR grade and procured from Merck, Frankfurt, Germany or Sigma Aldrich, St. Louis, United States of America. Standard solutions were prepared according to procedures available in literature (Jain and Ali 2000). The glassware used was obtained from Borosil.

2.2 Synthesis and Characterization of Adsorbent

A series of hybrid materials was synthesized following the procedure reported in our previous publication (Mandal et al. 2011). The required pH of 9 was maintained by adding the required amount of 1 M NH4OH. Three solutions, i.e., solution-A containing 0.5 M adipic acid (100 mL), solution-B containing 1 M cerium oxide (100 mL) and solution-C containing 2 M tetraethylenepentamine (100 mL), were added dropwise, in that sequence, to a 500 mL round-bottomed flask containing 100 mL of distilled water. The mixture was stirred at 50 °C under an inert atmosphere. After a predetermined time of 3 h, the white product obtained was filtered and kept undisturbed in an oven at 70 °C for drying. The other hybrid materials were prepared in the same way by changing solution-B (magnesium oxide, zirconium oxychloride or calcium chloride). The dried material was calcined at 200 °C for 2 h and used as the adsorbent. The materials were characterized using the analytical instruments and techniques listed in Table S-1 [see Supplemental online material (S-1)].

2.3 Adsorption Experiments

To compare the removal efficiency of the hybrid materials, 0.3 g of each material was contacted with 100 mg/L As(III) solution for 30 min in a series of conical flasks, and the residual As(III) concentration was measured after separation by filtration. The adsorption experiments were carried out according to methods available in the literature (Vatutsina et al. 2007) using 5 mg/L, 25 mg/L and 50 mg/L As(III) solutions without any other ions. Table 1 presents the experimental parameter settings for As(III) removal by batch process.

Table 1 Experimental parameter settings for As(III) removal from water by batch process

In order to understand how competing ions affect the removal efficiency, the interaction of the following anions (CO3 2−, SO4 2−, HCO3 −, NO3 −, F −, Cl −) was also studied. The As(III) adsorption capacity and % removal were calculated using the following relations:

$$ {\mathrm{q}}_{\mathrm{e}}=\left[\frac{\left({\mathrm{C}}_{\mathrm{o}}-{\mathrm{C}}_{\mathrm{e}}\right)\mathrm{V}}{\mathrm{m}}\right] $$
(1)
$$ \%\;\mathrm{removal}=\left(\frac{{\mathrm{C}}_{\mathrm{o}}-{\mathrm{C}}_{\mathrm{e}}}{{\mathrm{C}}_{\mathrm{o}}}\right)\times 100 $$
(2)

where qe is equilibrium adsorption capacity in mg/g, Co and Ce are the initial and equilibrium concentrations of As(III) in solution (mg/L), m is the adsorbent mass (g) and V is the volume of solution taken.
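Equations (1) and (2) can be evaluated directly. The sketch below is illustrative only: the function names and the example concentrations, volume and mass are hypothetical and are not data from this study. It also assumes that V is expressed in litres, which is what the stated units of qe (mg/g) require.

```python
def adsorption_capacity(c0_mg_l, ce_mg_l, volume_l, mass_g):
    """Eq. (1): equilibrium adsorption capacity q_e in mg/g (V assumed in litres)."""
    return (c0_mg_l - ce_mg_l) * volume_l / mass_g


def percent_removal(c0_mg_l, ce_mg_l):
    """Eq. (2): percentage of As(III) removed from solution."""
    return (c0_mg_l - ce_mg_l) / c0_mg_l * 100.0


# Hypothetical example values, not measurements from the paper
c0, ce = 25.0, 2.5        # initial and equilibrium concentrations, mg/L
volume, mass = 0.1, 0.07  # solution volume in L, adsorbent mass in g

qe = adsorption_capacity(c0, ce, volume, mass)
removal = percent_removal(c0, ce)
print(f"qe = {qe:.2f} mg/g, removal = {removal:.1f} %")
```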

The isoelectric point (pHzpc) was determined from electrophoretic mobility measurements in solutions of ionic strength 0.01 M NaNO3 and 0.01 M NaCl, according to the procedure available in the literature (Ma et al. 2011).

3 Computational Modeling Approach

3.1 Computational Models

3.1.1 Genetic Programming

Genetic programming (GP; the multigene GPTIPS implementation) is a biologically inspired technique based on the Darwinian theory of evolution and on Herbert Spencer's phrase “survival of the fittest” (Garg et al. 2014). In GP, a population of candidate models, represented as trees of different sizes, is generated randomly and then evolved towards the fittest population by applying mutation and crossover operations to the best-performing trees. The number of mathematical models generated is given by the population size. The models are generated by randomly combining elements from the functional and terminal sets. A model example is presented in Fig. S-1a. The performance of the initial population is checked on the training input data using a fitness function, generally taken as the root mean square error (RMSE, see Eq. 3).

$$ \mathrm{RMSE}=\sqrt{\frac{1}{\mathrm{N}}{\displaystyle {\sum}_{\mathrm{i}=1}^{\mathrm{N}}{\left({\mathrm{Y}}_{\mathrm{model},\mathrm{i}}-{\mathrm{Y}}_{\mathrm{actual},\mathrm{i}}\right)}^2}} $$
(3)

where Ymodel,i is the value predicted of ith data sample by GP model, Yactual,i is the actual value of the ith sample and N is the number of training samples.

Based on this fitness value, the genetic algorithm selects models for the subsequent genetic operations, namely mutation and crossover. The purpose of the genetic operations is to form the best-performing population, which represents a new generation. The generation of new populations through mutation and crossover continues until the termination criterion is met. The termination criterion is specified as a maximum threshold limit and a maximum number of generations. The parameter settings of genetic programming play a significant role in implementing the model efficiently, and they are selected by trial and error. The population size and number of generations depend on the complexity of the input data. A study by Garg et al. (2014) suggests that, for complex data, the minimum error is achieved when the population size and number of generations are kept in a higher range.

3.1.2 Support Vector Machines Model

Support vector machines are powerful learning tools for data classification and regression based on statistical learning theory (Andrés et al. 2012). The model can predict both linear and non-linear data series by mapping. The least squares support vector machine (LS-SVM) is an advanced version of the standard SVM, which replaces the inequality constraints with equality constraints in solving the quadratic problem. Input variables in the lower dimensional space are projected into a higher dimensional space H so as to convert the non-linear regression problem into a linear regression problem. Several transfer functions can be used to assist with this conversion. Following Garg et al. (2014), we have formulated the LS-SVM (see Fig. S-1b) for prediction in the present study as follows:

$$ \begin{array}{c}\hfill \mathrm{Y}=\mathrm{f}\left(\mathrm{x}\right)={\displaystyle {\sum}_{\mathrm{i}=1}^{\mathrm{N}}{\mathrm{w}}_{\mathrm{i}}{\varnothing}_{\mathrm{i}}\left(\mathrm{x}\right)+\mathrm{b}={\mathrm{w}}^{\mathrm{H}}\varnothing \left(\mathrm{x}\right)+\mathrm{b}}\hfill \\ {}\hfill \mathrm{w}={\left[{\mathrm{w}}_1\ {\mathrm{w}}_2\ {\mathrm{w}}_3\dots {\mathrm{w}}_{\mathrm{N}}\right]}^{\mathrm{H}}\ \mathrm{and}\ \varnothing ={\left[{\varnothing}_1\ {\varnothing}_2\ {\varnothing}_3\dots {\varnothing}_{\mathrm{N}}\right]}^{\mathrm{H}}\hfill \end{array} $$
(4)

where xi are the values of ith input variable and yi are the values calculated numerically. The function ∅ i(x) is the transformed higher dimensional space. The model given by ∅ (x) is linear in nature and is a converted form of a nonlinear model in higher dimensional space. In this study, there are five input variables and one output variable. In the present LS-SVM model, the kernel function is used for learning which minimizes the regularized risk (Lr). Minimizing the regularized risk is estimated by the weight (w) and bias (b). The Lr equation can be used as studied by Garg et al. (2014). The parameter setting used in the present study is presented in Table 2.
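As a rough illustration of how an LS-SVM regressor of this form can be trained, the sketch below solves the usual LS-SVM linear system for the bias b and the dual coefficients. It is a minimal sketch under stated assumptions, not the implementation used in this study: the RBF kernel, the values of the regularization parameter `gamma` and kernel width `sigma`, the function names and the synthetic five-input data set are all assumptions made for illustration (the actual settings are those of Table 2).

```python
import numpy as np


def rbf_kernel(a, b, sigma):
    """RBF kernel matrix between the rows of a and the rows of b (assumed kernel)."""
    d2 = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma**2))


def lssvm_fit(x, y, gamma, sigma):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = x.shape[0]
    a = np.zeros((n + 1, n + 1))
    a[0, 1:] = 1.0
    a[1:, 0] = 1.0
    a[1:, 1:] = rbf_kernel(x, x, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(a, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]  # bias b, dual coefficients alpha


def lssvm_predict(x_train, b, alpha, x_new, sigma):
    """Prediction f(x) = sum_i alpha_i k(x, x_i) + b."""
    return rbf_kernel(x_new, x_train, sigma) @ alpha + b


# Synthetic data: five normalized inputs and one output (not data from the paper)
rng = np.random.default_rng(0)
x = rng.uniform(0.1, 0.9, size=(40, 5))
y = 0.5 + 0.3 * np.sin(3.0 * x[:, 0]) - 0.2 * x[:, 1] ** 2

b, alpha = lssvm_fit(x, y, gamma=100.0, sigma=0.5)
y_hat = lssvm_predict(x, b, alpha, x, sigma=0.5)
print("training RMSE:", float(np.sqrt(np.mean((y_hat - y) ** 2))))
```

In this formulation the equality constraints reduce training to a single linear solve, which is the practical difference from the quadratic programme of the standard SVM.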

Table 2 Parameter settings for GP and LS-SVM modeling for As(III) removal

The normalization of input and output variables is done by using the following equation (Garg et al. 2014):

$$ {X}_{norm}=0.8\times \left(\frac{X_i\mathit{\hbox{-}}{X}_{min}}{X_{max}\mathit{\hbox{-}}{X}_{min}}\right)+0.1 $$
(5)

where Xi is ith input, Xmin and Xmax are minimum and maximum values of variable Xi. The problem formulation of As(III) removal by experimental and computational modeling using GP and LS-SVM models is presented in Fig. 1.
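A minimal sketch of the scaling in Eq. (5) is given below. It maps each variable onto the interval [0.1, 0.9]; the inverse mapping, the function names and the example pH values are added here for illustration only.

```python
import numpy as np


def normalize(x):
    """Eq. (5): scale a variable linearly onto [0.1, 0.9]."""
    x = np.asarray(x, dtype=float)
    return 0.8 * (x - x.min()) / (x.max() - x.min()) + 0.1


def denormalize(x_norm, x_min, x_max):
    """Inverse of Eq. (5), used to map model outputs back to physical units."""
    return (np.asarray(x_norm, dtype=float) - 0.1) / 0.8 * (x_max - x_min) + x_min


# Hypothetical pH settings spanning the studied range of 1 to 12
ph = np.array([1.0, 4.0, 7.0, 12.0])
ph_norm = normalize(ph)
print(ph_norm)
```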

Fig. 1
figure 1

Problem formulation of modeling of arsenic (III) removal

3.2 Model Performance

Several statistical measures are used to validate model performance (Garg et al. 2014; Andrés et al. 2012; Pai et al. 2011). These are the mean square error (MSE), mean absolute percentage error (MAPE), average absolute relative error (AARE), standard deviation (σ), chi square (χ 2), normalized bias (NB) and correlation coefficient (R2), which are described in Eqs. (6) to (12):

$$ \mathrm{M}\mathrm{S}\mathrm{E}=\frac{1}{\mathrm{N}}{\displaystyle {\sum}_{\mathrm{i}=1}^{\mathrm{N}}{\left({\mathrm{Y}}_{\mathrm{model},\mathrm{i}}-{\mathrm{Y}}_{\mathrm{actual},\mathrm{i}}\right)}^2} $$
(6)
$$ \mathrm{MAPE}=\frac{1}{\mathrm{N}}{\displaystyle {\sum}_{\mathrm{i}=1}^{\mathrm{N}}\left|\frac{\left({\mathrm{Y}}_{\mathrm{model},\mathrm{i}}-{\mathrm{Y}}_{\mathrm{actual},\mathrm{i}}\right)}{{\mathrm{Y}}_{\mathrm{actual},\mathrm{i}}}\right|\times 100} $$
(7)
$$ \mathrm{AARE}=\frac{1}{\mathrm{N}}{\displaystyle {\sum}_{\mathrm{i}=1}^{\mathrm{N}}\left|\frac{\left({\mathrm{Y}}_{\mathrm{model},\mathrm{i}}-{\mathrm{Y}}_{\mathrm{actual},\mathrm{i}}\right)}{{\mathrm{Y}}_{\mathrm{actual},\mathrm{i}}}\right|} $$
(8)
$$ \sigma =\sqrt{{\displaystyle {\sum}_{\mathrm{i}=1}^{\mathrm{N}}\frac{1}{\mathrm{N}-1}{\left[\left|\frac{\left({\mathrm{Y}}_{\mathrm{model},\mathrm{i}}-{\mathrm{Y}}_{\mathrm{actual},\mathrm{i}}\right)}{{\mathrm{Y}}_{\mathrm{actual},\mathrm{i}}}\right|-\mathrm{AARE}\right]}^2}} $$
(9)
$$ {\upchi}^2={\displaystyle {\sum}_{\mathrm{i}=1}^{\mathrm{N}}{\frac{\left({\mathrm{Y}}_{\mathrm{model},\mathrm{i}}-{\mathrm{Y}}_{\mathrm{actual},\mathrm{i}}\right)}{{\mathrm{Y}}_{\mathrm{actual},\mathrm{i}}}}^2} $$
(10)
$$ \mathrm{N}\mathrm{B}\%={\displaystyle {\sum}_{\mathrm{i}=1}^{\mathrm{N}}\frac{\left({\mathrm{Y}}_{\mathrm{model},\mathrm{i}}-{\mathrm{Y}}_{\mathrm{actual},\mathrm{i}}\right)/{\mathrm{Y}}_{\mathrm{actual},\mathrm{i}}}{\mathrm{N}}} $$
(11)
$$ {\mathrm{R}}^2=\frac{{\displaystyle {\sum}_{\mathrm{i}=1}^{\mathrm{N}}}{\left({\mathrm{Y}}_{\mathrm{actual},\mathrm{i}}-{\mathrm{Y}}_{\mathrm{model},\mathrm{mean}}\right)}^2-{\displaystyle {\sum}_{\mathrm{i}=1}^{\mathrm{N}}}{\left({\mathrm{Y}}_{\mathrm{model},\mathrm{i}}-{\mathrm{Y}}_{\mathrm{actual},\mathrm{i}}\right)}^2}{{\displaystyle {\sum}_{\mathrm{i}=1}^{\mathrm{N}}}{\left({\mathrm{Y}}_{\mathrm{actual},\mathrm{i}}-{\mathrm{Y}}_{\mathrm{model},\mathrm{mean}}\right)}^2} $$
(12)

where Yactual,i is experimental value of % removal, Ymodel,i is predicted % removal by the model, Ymodel,mean is the average value of model % removal prediction, and N is the number of experiments.
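The sketch below shows one way to compute several of these measures, namely Eqs. (6) to (9), (11) and (12), as they are written above. The function name and the example values are hypothetical and are not results from this study. R2 is computed with the mean of the model predictions, following the definition given for Eq. (12).

```python
import numpy as np


def performance_metrics(y_actual, y_model):
    """Statistical measures of Eqs. (6)-(9), (11) and (12)."""
    y_actual = np.asarray(y_actual, dtype=float)
    y_model = np.asarray(y_model, dtype=float)
    n = y_actual.size
    err = y_model - y_actual
    rel = np.abs(err / y_actual)          # absolute relative error per sample
    aare = rel.mean()
    sst = np.sum((y_actual - y_model.mean()) ** 2)
    sse = np.sum(err ** 2)
    return {
        "MSE": float(np.mean(err ** 2)),
        "MAPE": float(aare * 100.0),
        "AARE": float(aare),
        "sigma": float(np.sqrt(np.sum((rel - aare) ** 2) / (n - 1))),
        "NB": float(np.mean(err / y_actual)),
        "R2": float((sst - sse) / sst),
    }


# Hypothetical % removal values, for illustration only
y_actual = [90.0, 80.0, 70.0, 60.0]
y_model = [91.0, 79.0, 71.0, 58.0]
metrics = performance_metrics(y_actual, y_model)
print(metrics)
```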

4 Results and Discussion

According to initial adsorption experiments, cerium oxide tetraethylenepentamine (CTEPA) hybrid material exhibited maximum removal of As(III) from water as compared to other materials, and hence the CTEPA was selected for further study. The ion exchange capacity, specific surface area and removal percentage of all hybrid materials synthesized are presented in Table 3.

Table 3 Selection of best hybrid material towards As(III) removal efficiency

The thermogravimetric analysis (TGA) of the materials is represented in Fig. 2. From Fig. 2, it is clear that there is a rapid weight loss of 53 % in the temperature range 230–300 °C, and the corresponding endothermic DSC peak is due to the decomposition of organic component present in the material. Another weight loss of 8.1 % is observed in the range of temperature 150–230 °C, which may be due to the loss of free and/or bonded water molecules present in the material. The other weight loss of 15.9 % is observed in the range of 300–350 °C, which is due to the decomposition of residual organic components present in the material. The above observations indicate that the organic component of the material tetraethylenepentamine is an essential component of the material either in bonded or adsorbed form on the cerium oxide gel.

Fig. 2
figure 2

TGA and DSC of CTEPA hybrid material

Phase analysis of CTEPA and As(III)-adsorbed CTEPA was carried out using the X-ray diffraction patterns shown in Fig. 3. A series of sharp and intense peaks can be seen, with (111) as the sharpest and most intense peak and clear reflections at low 2 theta values. This characteristic peak is observed for mesoporous cerium oxide (Laha and Ryoo 2003) and matches well with JCPDS No. 810972. The sharp peaks suggest the highly crystalline nature of the hybrid material before and after adsorption. The presence of arsenic is seen as small peaks at lower 2 theta values. After adsorption, a slight shift of the peaks to higher angle is observed, suggesting inter-planar rearrangements due to the introduction of arsenic species and the formation of a complex framework with the CTEPA hybrid material, as marked in the figure. The intensity of the (111) peak increases after arsenic adsorption, which relates to an increase in the crystallite size of the particles. The crystallite sizes are calculated using the Debye-Scherrer equation (Holzwarth and Gibson 2011):

Fig. 3
figure 3

The powder XRD of fresh CTEPA and CTEPA adsorbed arsenic (III)

$$ \mathrm{D}=\frac{0.9\lambda }{\beta \cos \theta } $$
(13)

where D is the crystallite size, λ is the X-ray wavelength, β is the full width at half maximum of the peak located at 2θ, and θ is the Bragg angle. The average crystallite size before and after adsorption is calculated to be 21 nm and 102 nm, respectively. These results suggest an increase in crystallite size due to adsorption of arsenic on the material sites.
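The Scherrer calculation of Eq. (13) can be sketched as follows. The Cu Kα wavelength (0.15406 nm), the peak position and the peak width used here are assumptions for illustration; the source does not state the radiation used or the measured widths. Note that β must be converted to radians and that θ is half of the measured 2θ position.

```python
import math


def scherrer_size_nm(fwhm_deg, two_theta_deg, wavelength_nm=0.15406, k=0.9):
    """Eq. (13): crystallite size D = K * lambda / (beta * cos(theta)), in nm."""
    beta = math.radians(fwhm_deg)              # FWHM converted to radians
    theta = math.radians(two_theta_deg / 2.0)  # Bragg angle is half of 2-theta
    return k * wavelength_nm / (beta * math.cos(theta))


# Hypothetical peak parameters, not measured values from Fig. 3
d_broad = scherrer_size_nm(fwhm_deg=0.40, two_theta_deg=28.5)
d_narrow = scherrer_size_nm(fwhm_deg=0.08, two_theta_deg=28.5)
print(f"broad peak: {d_broad:.1f} nm, narrow peak: {d_narrow:.1f} nm")
```

A narrower peak gives a larger calculated crystallite size, which is the relationship underlying the comparison of sizes before and after adsorption.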

Field emission scanning electron microscopy (FE-SEM) images of the CTEPA hybrid material and the As(III)-adsorbed material are presented in Fig. 4, along with EDS micrographs. CTEPA is observed to have a highly porous, spongy structure with non-uniform pores. The high amounts of cerium and oxygen in the EDS micrograph indicate the absence of other adsorbed impurities before adsorption. After adsorption, a fairly high amount of arsenic is found in the EDS micrograph, which is consistent with the small, fine agglomerated particles covering the CTEPA surface and indicates possible adsorption of As(III). High-resolution transmission electron microscopy (HR-TEM, Fig. 5) was carried out to better understand the morphology of the CTEPA hybrid material. CTEPA is seen to be mesoporous in nature with low crystallinity, as observed in the corresponding selected area electron diffraction (SAED) pattern showing major concentric circles. As(III) adsorption is confirmed by the HR-TEM image, which shows small, hard-agglomerated particles with high crystallinity over the surface of the CTEPA material. All the above results support the adsorption of arsenic species on the surface active sites of the adsorbent.

Fig. 4
figure 4

FE-SEM and EDS micrographs of CTEPA material a before and b after adsorption of arsenic (III)

Fig. 5
figure 5

SAED and TEM pattern of CTEPA hybrid material a-b before and c-d after adsorption of arsenic (III)

The nitrogen adsorption-desorption isotherm, t-plot and pore size distribution of the material are represented in Fig. 6. The material exhibits a specific surface area of 394 m2/g, a micropore volume of 0.522 cm3/g and an average micropore diameter of 17.332 Å. After adsorption of As(III), the specific surface area of the material is reduced to 24.59 m2/g, which indicates that the pores may be closed due to adsorption of As(III). The material shows Type II and Type III isotherms before and after adsorption, indicating indefinite multi-layer formation after completion of the monolayer. A study by Chandra et al. (2010) supports the possibility of this observation.

Fig. 6
figure 6

The typical nitrogen adsorption-desorption isotherms; t-plot and pore size distributions for a-c before and d-f after adsorption of arsenic (III)

4.1 Batch Adsorption Experiments

4.1.1 Influence of Dose and Solution pH on Adsorption Process

The influence of adsorbent dose on percentage removal of As(III) ions from aqueous solution is graphically represented in Fig. S-2. It is observed from the figure that the removal of As(III) increased from 16 to 95.6 %, 11.8 to 87.1 % and 7.36 to 79.2 % for 0.1–0.9 g/L of the material of initial concentration 5 mg/L, 25 mg/L and 50 mg/L As(III) solution, respectively. The optimum dose is found to be 0.7 g/L, as there is no noticeable change in percentage removal of As(III) beyond this dose. This dose was used for further adsorption studies.

The effect of pH on removal of As(III) ions and zeta potential of the material was studied and presented in Fig. S-3a-b. The isoelectric point of CTEPA is found to be 6.7 and 6.9 for 0.01 M NaCl and 0.05 M NaCl solutions, respectively.

The removal of As(III) was examined over a solution pH range of 1 to 12. The percentage removal of As(III) decreases rapidly from 97.8 to 12.4 %, 90.6 to 3.5 % and 87.1 to 2.7 % as the pH increases from 8 to 12, for initial As(III) concentrations of 5 mg/L, 25 mg/L and 50 mg/L, respectively, which indicates that pH plays an important role in the adsorption process. With increasing pH, the OH− ion concentration in the solution increases, and these ions may compete with As(III) species for adsorption. The suitable pH for removal of As(III) species is between pH 6 and 7, which is in good agreement with the obtained isoelectric point (pHzpc) values. The experimental data for variable pH and adsorbent dose were used to train the best-fit model, and the model was found to predict the experimental observations satisfactorily.

4.1.2 Effect of Temperature

The effect of temperature on the percentage removal of As(III) by the material was studied at initial concentrations of 5 mg/L, 25 mg/L and 50 mg/L using the optimum dose and pH. The results are presented graphically in Fig. S-4. The percentage removal of As(III) increases from 73.2 to 96.8 % (5 mg/L), 46.6 to 96.2 % (25 mg/L) and 47.3 to 97.1 % (50 mg/L) over the temperature range of 20 °C to 70 °C. These observations are further supported by the thermodynamic parameters. The free energy change (ΔG), enthalpy change (ΔH) and entropy change (ΔS) of the adsorption process are calculated using the following equations (Asli et al. 2013):

$$ \log {\mathrm{K}}_{\mathrm{c}}=\frac{\varDelta \mathrm{S}}{2.303\mathrm{R}}-\frac{\varDelta \mathrm{H}}{2.303\mathrm{R}\mathrm{T}} $$
(14)
$$ \varDelta \mathrm{G}=\varDelta \mathrm{H}-\mathrm{T}\varDelta \mathrm{S} $$
(15)

A plot of log Kc versus 1/T for initial As(III) concentrations of 5 mg/L, 25 mg/L and 50 mg/L was found to be linear. The value of Kc was calculated using the following equation (Asli et al. 2013):

$$ {\mathrm{K}}_{\mathrm{c}}=\frac{{\mathrm{C}}_1}{{\mathrm{C}}_2} $$
(16)

where C1 is the amount of As(III) ion adsorbed per unit mass of CTEPA and C2 is the concentration of As(III) in the aqueous phase. The values obtained are presented in Table 4. For a chemical reaction to occur, bonds must break before new bonds can form. Bond breaking absorbs energy (endothermic, +ΔH), whereas bond formation releases energy (exothermic, −ΔH). Thus, new bond formation possibly takes place between As(III) and the adsorbent, resulting in the negative enthalpy (exothermic nature) of the ongoing reaction. The positive value of ΔS indicates an increase in randomness in the system and suggests a structural modification that increases the affinity of the adsorbent for As(III) species. The negative value of ΔG at each temperature indicates the spontaneous nature and feasibility of the adsorption. The decrease in ΔG with increasing temperature suggests that As(III) adsorption is more favourable at higher temperature.
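The thermodynamic parameters can be extracted from the linear plot of log Kc versus 1/T as sketched below: the slope gives −ΔH/(2.303R) and the intercept gives ΔS/(2.303R), after which ΔG follows from Eq. (15). The temperatures and the ΔH and ΔS values used to generate the synthetic Kc data are hypothetical and are not the values of Table 4; they serve only to check that the fit recovers its inputs.

```python
import numpy as np

R = 8.314  # gas constant, J/(mol K)


def vant_hoff_fit(temps_k, kc):
    """Fit log10(Kc) against 1/T (Eq. 14) and return (delta_H, delta_S)."""
    slope, intercept = np.polyfit(1.0 / np.asarray(temps_k, dtype=float), np.log10(kc), 1)
    delta_h = -2.303 * R * slope      # J/mol
    delta_s = 2.303 * R * intercept   # J/(mol K)
    return delta_h, delta_s


def gibbs_energy(delta_h, delta_s, temps_k):
    """Eq. (15): delta_G = delta_H - T * delta_S, in J/mol."""
    return delta_h - np.asarray(temps_k, dtype=float) * delta_s


# Synthetic Kc values generated from assumed parameters (not Table 4 data)
temps = np.array([293.15, 313.15, 333.15])
true_dh, true_ds = -20000.0, 50.0
kc = 10.0 ** (true_ds / (2.303 * R) - true_dh / (2.303 * R * temps))

dh, ds = vant_hoff_fit(temps, kc)
dg = gibbs_energy(dh, ds, temps)
print(f"dH = {dh:.1f} J/mol, dS = {ds:.2f} J/(mol K), dG = {dg}")
```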

Table 4 Thermodynamic parameters obtained for As(III) removal using CTEPA, As(III) concentration of 5 mg/L, 25 mg/L and 50 mg/L

4.1.3 Effect of Contact Time and Adsorption Kinetics

The effect of contact time on the percentage removal of As(III) is presented graphically in Fig. S-5. It is evident from the figure that the removal of As(III) increased from 42.6 to 95.8 %, 47.1 to 90.6 % and 41.5 to 90.8 % as the contact time increased from 10 to 50 min, for initial As(III) concentrations of 5 mg/L, 25 mg/L and 50 mg/L, respectively. About 40 % removal took place within 10 min, and equilibrium was established after 40 min. The rapid removal during the first 10 min may be due to the availability of vacant sites and the high concentration gradient. Further, the kinetics of adsorption and mass transfer were evaluated using the pseudo-first-order, pseudo-second-order and intraparticle diffusion (Weber-Morris) models for initial As(III) concentrations of 5 mg/L, 25 mg/L and 50 mg/L. The linearized forms of the three models are given in Eqs. (17), (18) and (19), respectively:

$$ \log \left({q}_e-{q}_t\right)=\log {q}_e-\frac{K_1t}{2.303} $$
(17)
$$ \frac{t}{q_t}=\frac{1}{K_2{q}_e^2}+\frac{t}{q_e} $$
(18)
$$ {q}_t={K}_p{t}^{1/2}+C $$
(19)

where qe and qt (mg/g) are the amounts of As(III) adsorbed at equilibrium and at time ‘t’, respectively. K1, K2 and Kp are first-order rate constant, second-order rate constant and intraparticle diffusion rate constant, respectively. The values are calculated from their respective slope and intercept (Fig. S-5b-d). The data obtained are presented in Table 5. Due to poor regression coefficient (R2) values obtained for pseudo-first order and intraparticle diffusion, it may be considered that the adsorption process is best fitted with second-order rate equation. The contact time and kinetic data were trained with the best fitted model and it was found that the model predicts satisfactorily with high correlation values. The trend of observation is presented in respective plots (Fig. S-5b-d).
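The rate constants are obtained from the slope and intercept of the linearized plots. As a sketch, the pseudo-second-order model in its standard linearized form, t/qt = 1/(K2 qe²) + t/qe, can be fitted as shown below: the slope of t/qt versus t gives 1/qe and the intercept gives 1/(K2 qe²). The contact times match the range studied here, but the qe and K2 values used to generate the synthetic data are hypothetical and are not the values of Table 5.

```python
import numpy as np


def fit_pseudo_second_order(t, qt):
    """Linear fit of t/qt versus t; returns (qe, k2)."""
    t = np.asarray(t, dtype=float)
    qt = np.asarray(qt, dtype=float)
    slope, intercept = np.polyfit(t, t / qt, 1)
    qe = 1.0 / slope               # slope = 1/qe
    k2 = slope ** 2 / intercept    # intercept = 1/(k2 * qe^2)
    return qe, k2


# Synthetic uptake curve from assumed parameters (not data from Table 5)
t = np.array([10.0, 20.0, 30.0, 40.0, 50.0])  # contact time, min
qe_true, k2_true = 35.0, 0.004                # mg/g and g/(mg min)
qt = k2_true * qe_true ** 2 * t / (1.0 + k2_true * qe_true * t)

qe_fit, k2_fit = fit_pseudo_second_order(t, qt)
print(f"qe = {qe_fit:.2f} mg/g, K2 = {k2_fit:.4f} g/(mg min)")
```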

Table 5 Kinetic and isotherm parameter for As(III) removal by CTEPA

It is observed from Fig. S-5d that there is a deviation of line from the origin which indicates that the intraparticle transport is not the only rate limiting step, but the transport of adsorbate through the pores of the adsorbent and adsorption on the surface of the material are also responsible. Subsequently, the acceleration of the adsorption process indicates that the diffusion is not consecutive due to pore size (Erhan et al. 2004). At the final phase of adsorption process, the rate of diffusion remains constant due to exhaustion of pores in the material. When the Kp value is higher, the rate of adsorption is increased, but when the C value is higher, the adsorption is better due to better bonding between adsorbate and adsorbent. The concentration gradient decides the flow of particles into the pores of material. With the increase in the concentration of As(III), the Kp value increases, suggesting that the intraparticle diffusion is considered as the concentration diffusion (Erhan et al. 2004; Biyan et al. 2009).

4.1.4 Impact of Adsorbate Concentration and Adsorption Isotherms

The effect of initial concentration on the % removal of As(III) from aqueous solution is presented graphically in Fig. 7a, which shows the impact of adsorbate concentration on the removal efficiency of the material. It is evident from the graph that as the initial concentration of As(III) in the solution increases from 1 to 180 mg/L, the removal efficiency decreases from 92 to 55.2 % due to the lack of active sites at higher concentrations of As(III).

Fig. 7
figure 7

Experimental and model prediction a effect of initial concentration b Langmuir isotherm plot c Freundlich isotherm plot d D-R isotherm plot

To understand the adsorption behaviour, adsorption data obtained from initial concentration was fitted to linear transformed Langmuir (Eq. 20), Freundlich (Eq. 21) and Dubinin-Radushkevich (D-R) (Eq. 22) adsorption isotherms, as presented in the following equations (Foo and Hameed 2010):

$$ \frac{1}{q_e}=\frac{1}{q_ob{C}_e}+\frac{1}{q_o} $$
(20)
$$ \log {q}_e=\log {K}_f+\frac{1}{n}\log {C}_e $$
(21)
$$ \ln {q}_e=\ln {q}_m-K{\varepsilon}^2 $$
(22)

where qo is the maximum amount of As(III) ion adsorbed per unit weight of the material to form a complete monolayer on the surface (adsorption capacity), qe is the amount of As(III) adsorbed at equilibrium (mg/g), Ce is the equilibrium adsorbate concentration (mg/L), and b is the binding energy constant. Kf and 1/n are the constants representing the Freundlich adsorption capacity and adsorption intensity, respectively. The Polanyi potential is represented as ε and is equal to RT ln(1 + 1/Ce), where R is the gas constant and T is the temperature in kelvin; qm is the theoretical adsorption capacity and K is the D-R constant. The plots of the Langmuir, Freundlich and D-R isotherms are presented in Fig. 7b-d.
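As an illustration of how the Langmuir parameters follow from the linearized form in Eq. (20), the sketch below fits 1/qe against 1/Ce: the intercept is 1/qo and the slope is 1/(qo b). The function name and the synthetic equilibrium data are hypothetical; the qo and b values used to generate them are not the values of Table 5.

```python
import numpy as np


def fit_langmuir(ce, qe):
    """Linear fit of 1/qe versus 1/Ce (Eq. 20); returns (qo, b)."""
    inv_ce = 1.0 / np.asarray(ce, dtype=float)
    inv_qe = 1.0 / np.asarray(qe, dtype=float)
    slope, intercept = np.polyfit(inv_ce, inv_qe, 1)
    qo = 1.0 / intercept     # intercept = 1/qo
    b = intercept / slope    # slope = 1/(qo * b)
    return qo, b


# Synthetic equilibrium data from assumed parameters (not data from Table 5)
qo_true, b_true = 120.0, 0.4              # mg/g and L/mg
ce = np.array([0.5, 2.0, 8.0, 20.0, 60.0])  # mg/L
qe = qo_true * b_true * ce / (1.0 + b_true * ce)

qo_fit, b_fit = fit_langmuir(ce, qe)
print(f"qo = {qo_fit:.1f} mg/g, b = {b_fit:.3f} L/mg")
```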

The values of different parameters are calculated from the slope and intercept of the plot and are presented in Table 5. The high correlation coefficient (R2) value suggests that the adsorption data were best fitted to Langmuir adsorption isotherm with a maximum adsorption capacity of 124.8, 201.4 and 358.4 mg/g for 25, 50 and 75 °C, respectively. The mean free energy of adsorption (E) is calculated from the constant K using the relation (Foo and Hameed 2010)

$$ E={(2K)}^{-1/2} $$
(23)

E is defined as the change in free energy when 1 mol of ion is transferred from the solution to the surface of the solid. The value of E is very useful in predicting the type of adsorption. The different types of adsorption processes are: (a) physical (< 8 kJ/mol); (b) ion exchange (8–16 kJ/mol); and (c) chemisorption (> 16 kJ/mol). As presented in Table 5, the values obtained at the different temperatures are higher than 16 kJ/mol, suggesting the chemisorption nature of the adsorption process.
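The D-R analysis can be sketched as follows: ln qe is regressed on ε² (Eq. 22), the constant K is the negative of the slope, and E follows from Eq. (23). The synthetic data, the temperature and the parameter values are hypothetical and are not those of Table 5. The sketch assumes K in mol²/J², so that E is obtained in J/mol and converted to kJ/mol, and it uses Ce in the units given in the text.

```python
import numpy as np

R = 8.314  # gas constant, J/(mol K)


def fit_dr(ce, qe, temp_k):
    """Fit ln(qe) versus eps^2 (Eq. 22); returns (qm, K, E in kJ/mol)."""
    eps = R * temp_k * np.log(1.0 + 1.0 / np.asarray(ce, dtype=float))  # Polanyi potential
    slope, intercept = np.polyfit(eps ** 2, np.log(qe), 1)
    k = -slope                            # D-R constant, mol^2/J^2
    qm = float(np.exp(intercept))         # theoretical adsorption capacity
    e_kj = (2.0 * k) ** -0.5 / 1000.0     # Eq. (23), converted to kJ/mol
    return qm, float(k), float(e_kj)


def adsorption_type(e_kj):
    """Classify the process using the energy ranges quoted in the text."""
    if e_kj < 8.0:
        return "physical"
    if e_kj <= 16.0:
        return "ion exchange"
    return "chemisorption"


# Synthetic data generated from assumed parameters (not data from Table 5)
temp = 298.15
ce = np.array([0.5, 2.0, 10.0, 40.0])
k_true = 1.0 / (2.0 * 18000.0 ** 2)  # chosen so that E = 18 kJ/mol
qe = 100.0 * np.exp(-k_true * (R * temp * np.log(1.0 + 1.0 / ce)) ** 2)

qm_fit, k_fit, e_fit = fit_dr(ce, qe, temp)
print(f"E = {e_fit:.2f} kJ/mol -> {adsorption_type(e_fit)}")
```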

The dimensionless equilibrium parameter r is calculated by using the following equation (Foo and Hameed 2010):

$$ r=\frac{1}{1+b{C}_o} $$
(24)

where Co is the initial As(III) concentration in mg/L. Values of r between 0 and 1 indicate a favourable adsorption process. The values of r are found to be 0.340, 0.094 and 0.020 for initial concentrations of 5 mg/L, 25 mg/L and 50 mg/L, respectively, which indicates a favourable adsorption system. A similar trend, with a high correlation value, is observed when the above isotherm and concentration data are used to train the best-fit model. The model predictions are presented in the respective plots (Fig. 7a-d).
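Equation (24) is simple to evaluate once the Langmuir constant b is known. In the sketch below, the value of b is hypothetical and is not taken from Table 5; the initial concentrations are those used in this study.

```python
def separation_factor(b_l_mg, c0_mg_l):
    """Eq. (24): dimensionless equilibrium parameter r."""
    return 1.0 / (1.0 + b_l_mg * c0_mg_l)


b_assumed = 0.4  # L/mg, hypothetical Langmuir constant
r_values = [separation_factor(b_assumed, c0) for c0 in (5.0, 25.0, 50.0)]
print([round(r, 3) for r in r_values])
```

For any positive b, r lies between 0 and 1 and decreases as the initial concentration increases.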

4.1.5 Competitive Ions and Regeneration Studies

All the above studies were conducted with synthetic As(III) solution only. However, real water contains several other anions, which may compete for adsorption. In order to understand the effect of anions on the adsorption of As(III), common anions such as hydroxide, nitrate, chloride, phosphate, sulphate, fluoride, carbonate and bicarbonate were added to known quantities of As(III) solution. The initial concentrations of As(III) were kept fixed at 5 mg/L, 25 mg/L and 50 mg/L, while the concentrations of the other anions were varied from 5 to 100 mg/L. The observations are plotted in Fig. S-6.

It is understood from the figure that the presence of these anions strongly reduces the As(III) removal efficiency. The anions reduced As(III) adsorption in the order hydroxide > bicarbonate > carbonate > phosphate > nitrate > fluoride > sulphate > chloride. This analysis shows that hydroxide, bicarbonate and carbonate have the greatest effect, and sulphate and chloride the least effect, on the removal of As(III) by the material. The percentage removal of As(III) is reduced by >71 % in the presence of hydroxide, bicarbonate and carbonate in the As(III) solution. This may be because the material has a higher interaction with and strong attraction towards hydroxide, bicarbonate and carbonate ions, as it has a tendency to form the corresponding hydroxides, carbonates and bicarbonates. The presence of sulphate and chloride reduces As(III) removal by < 15 %. It can be concluded that this material can be used for As(III) removal in the presence of chloride and sulphate at high concentrations; however, the presence of hydroxide, bicarbonate and carbonate can affect the removal percentage. An alkalinity test must therefore be carried out before using CTEPA for As(III) removal.

A desorption and a regeneration study was carried out in order to understand the reusability of the material. The results obtained are presented in Table S-2.

Generally, poor desorption occurs, when the adsorption process is of chemical nature and this happens because in chemisorption process, adsorbate and adsorbent are associated with stronger bonds (Ma et al. 2011). The data obtained suggest the chemisorption nature of the adsorption process, which is in good agreement with the above data presented in free energy.

Adsorbents are usually regenerated in acidic or alkali medium. In the present study, the material is unstable in extreme acidic conditions (below pH of 3), so the regeneration tests are carried out between pH 8 and 11. It is observed from Table S-2 that regeneration is less than 10 % at different pH values. This suggests poor regeneration of CTEPA material. Studies by Ma et al. (2011) suggest the possibility of above regeneration and desorption studies.

4.1.6 Mechanism of As(III) Adsorption

The FTIR spectra of CTEPA before and after adsorption are shown in Fig. 8. Broad and intense bands at ~3128 cm−1 indicate overlapped O-H and NH2 stretching vibrations (Ma et al. 2011). The band at ~1383 cm−1 corresponds to the carbonate anion, which may be due to absorption of atmospheric CO2. A few lattice vibrations of metal-oxygen (M-O) bonds are also observed at ~661 cm−1 and ~486 cm−1 (Hameed et al. 2006). Two new peaks appear at ~1625 cm−1 and ~560 cm−1 after As(III) adsorption onto the material, possibly indicating the formation of surface complexes (M-O-X, where X = HAsO3 3−, H2AsO4, AsO2). It is well known that As(III) species are adsorbed via electrostatic attraction and complexation (Ma et al. 2011). According to the FTIR spectra, the M-OH and M-NH2+ functional groups play an important role in the adsorption process. There is possibly electrostatic attraction between the negatively charged As(III) species and the positively charged surface of the adsorbent (NH2+), since the solution pH is below the isoelectric point (pHzpc) (see Section 4.1.1). At low pH, surface hydroxyl groups are protonated and facilitate complexation, because –OH2+ is easier to displace from the metal binding system than the OH functional group (Ma et al. 2011). At the active adsorption sites, As(III) species replace the hydroxyl groups, which are then released into the solution. The adsorption process can be described in two steps, as illustrated in Fig. 9: (a) electrostatic interaction between the positively charged center (nitrogen) and the negatively charged arsenite species in solution; and (b) complexation between the positively charged surface hydroxyl group and arsenite. Table 6 compares the As(III) adsorption capacity of the present material with that of various other hybrid materials reported in the literature, together with the pH range.

Fig. 8 FTIR comparison spectrum of CTEPA hybrid material before and after adsorption of arsenic (III)

Fig. 9 Mechanism of arsenic (III) removal by CTEPA hybrid material

Table 6 Comparison of present adsorbent material with materials available in literature

4.2 Computational Modeling Approach

4.2.1 GP and LS-SVM Models

In GP and LS-SVM modeling, the input and output variables are first normalized to bring the individual factors to a uniform scale. The models are executed by trial and error, as discussed in Section 3. Figures 10a and b illustrate the prediction of arsenic removal at training and testing using the GP and LS-SVM models. As seen in both figures, there is good agreement between the model and experimental results, and the models give a reasonable prediction of the system behaviour. The LS-SVM prediction is terminated after satisfying the optimized parameters (dF = 0.1129, dX = 0.13198, X = 6.6667, F(X) = 174.2688; hyperparameters gamma = 785.7, sig2 = 40.24), and the best-so-far GP model obtained after satisfying the termination criteria is given by Eq. (25):

Fig. 10 Correlation prediction plot of arsenic (III) removal at training and testing using (a) GP and (b) LS-SVM models

$$ \begin{aligned} ypred ={} & 15.71\times x_1-0.5959\times x_3+0.5959\times x_4-0.7665\times x_5 \\ & +0.6579\times \tan\left(x_3\right)\times \left(\cos\left(x_3\right)-x_5+0.9605\right)-0.03035\times \tan\left(x_2\right)\times \left(x_5-1.148\right) \\ & +0.7665\times \cos\left(x_1\right)+15.71\times \cos\left(x_3\right)+0.5959\times \cos\left(x_5\right)-15.71\times \sin\left(x_2\right) \\ & -0.5959\times \cos\left(\sin\left(x_5\right)\times \sin\left(x_1\right)\times x_1-x_3\right) \\ & +0.03035\times x_3\times x_5\times \left(\sin\left(x_5\right)-x_3\times \tan\left(x_1\right)\right)+0.4997\times x_1\times x_3\times x_5+10.85 \end{aligned} $$
(25)

The equation obtained through genetic programming satisfied the proposed parameter settings presented in Table 2. In Eq. (25), \( x_1 \), \( x_2 \), \( x_3 \), \( x_4 \) and \( x_5 \) denote the normalized values of dose, pH, time, temperature and concentration, respectively. Running the GP and LS-SVM algorithms for the present study took approximately 10 min on a Dell OPTIPLEX 980 (Core i5, 4 GB RAM, Windows 7). The prediction behaviour of the GP and LS-SVM models over the experimental runs is illustrated in Fig. S-7a-b.
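For readers who want to evaluate Eq. (25) numerically, the expression can be transcribed term by term, as in the following minimal sketch. The function name `gp_removal_eq25` is hypothetical, and the inputs are assumed to be already normalized in the same way as in the modeling; the normalization itself is not reproduced here.

```python
import math


def gp_removal_eq25(x1, x2, x3, x4, x5):
    """Evaluate Eq. (25) on normalized inputs.

    x1..x5 are the normalized dose, pH, time, temperature and concentration.
    Assumption: the caller has already applied the same normalization that
    was used when the GP model was trained.
    """
    return (
        15.71 * x1
        - 0.5959 * x3
        + 0.5959 * x4
        - 0.7665 * x5
        + 0.6579 * math.tan(x3) * (math.cos(x3) - x5 + 0.9605)
        - 0.03035 * math.tan(x2) * (x5 - 1.148)
        + 0.7665 * math.cos(x1)
        + 15.71 * math.cos(x3)
        + 0.5959 * math.cos(x5)
        - 15.71 * math.sin(x2)
        - 0.5959 * math.cos(math.sin(x5) * math.sin(x1) * x1 - x3)
        + 0.03035 * x3 * x5 * (math.sin(x5) - x3 * math.tan(x1))
        + 0.4997 * x1 * x3 * x5
        + 10.85
    )
```

Because the expression contains tangent terms, it can change very rapidly near the poles of `tan`, so it should only be evaluated within the normalized range covered by the experiments.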

The residuals versus experimental run for both models (GP and LS-SVM; see Fig. 11a-b) are uniform in nature at training and testing, suggesting that the models were drifting slowly to lower values as the prediction continued. The points are scattered randomly between −10 and +10 for GP and between −20 and +40 for LS-SVM, regardless of the size of the fitted value, although the residuals may increase as the fitted value increases. In that case the residual points appear "funnel shaped", with the larger end toward the larger fitted values, because the residuals scatter more and more widely as the value of the response increases. In the present investigation the residuals are well within the range of values used for prediction in genetic programming, but the residuals obtained with LS-SVM were somewhat higher than those of the GP model because LS-SVM was unable to capture the relationship between the process variables satisfactorily.

Fig. 11 Experimental runs versus residuals obtained at training and testing by using (a) GP and (b) LS-SVM models

The percentage error distribution of the predictions for both models at training and testing is presented in Fig. S-8a-b. The % error lies within −25 to +25 for GP and −70 to +60 for LS-SVM. To assess the reliability and performance of the models, a comparison of the models is discussed in Section 4.2.2.
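The residual and percentage error bands quoted above can be computed along the following lines. This is only a sketch: the sign convention (actual minus predicted) and the helper names are assumptions, and the sample values are hypothetical and not taken from the study.

```python
def residuals(actual, predicted):
    """Residual for each run, taken here as actual minus predicted."""
    return [a - p for a, p in zip(actual, predicted)]


def percent_errors(actual, predicted):
    """Signed percentage error for each run, relative to the actual value."""
    return [100.0 * (a - p) / a for a, p in zip(actual, predicted)]


def spread(values):
    """Smallest and largest value, e.g. to report a band such as -25 to +25."""
    return min(values), max(values)


# Hypothetical removal percentages, for illustration only (not from the study).
actual = [90.0, 80.0, 50.0]
predicted = [85.0, 84.0, 45.0]

print(spread(residuals(actual, predicted)))
print(spread(percent_errors(actual, predicted)))
```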

4.2.2 Comparison and Performance of the Models

Separate comparisons of the predicted data for the thermodynamic, kinetic and isotherm parameters are presented and discussed in Section 4.1. In this section, the overall performance of both models in predicting As(III) removal in batch studies is assessed using the statistical parameters presented in Section 3.3. The values obtained are presented in Table 7. The values (MSE, RMSE, MAPE, AARE, chi square, NB% and standard deviation) suggest that the GP model has learned the non-linear relationship between the input and output process variables well, with high correlation coefficient values (R2) of 0.988 and 0.977 at the training and testing phases, respectively. The GP model performed better than the LS-SVM model. A statistical comparison based on box plots of relative error (%) for the two models at training and testing (Fig. 12) is made using Eq. (26):

Table 7 Model comparison and performance at training and testing (GP and LS-SVM models)

Fig. 12 Statistical comparison based on box plots of relative error (%) for GP and LS-SVM models at training and testing

$$ Relative\ error\ \left(\%\right)=\frac{\left|{\overline{model}}_i-{\overline{actual}}_i\right|}{actual_i}\times 100 $$
(26)

where \( {\overline{model}}_i \) and \( {\overline{actual}}_i \) are the average values of predicted and actual quantities, respectively. The low mean relative errors of 3.48 % and 3.86 % at training and testing, respectively, show that GP captures the relationship between the process variables satisfactorily. The descriptive statistics of the errors (mean, standard error, standard deviation, median, maximum and minimum) are shown in Table 7. The lower range of the confidence intervals at training and testing for the proposed GP model indicates that its overall performance is better than that of the LS-SVM model. Thus, from the statistical comparison presented, it can be concluded that the proposed evolutionary GP model is better than LS-SVM in predicting As(III) removal from water by CTEPA hybrid material.
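As an illustration of how such a comparison can be computed, the sketch below evaluates the relative error of Eq. (26) point by point, together with a few of the statistics named above. The formulas are the common textbook definitions and are an assumption here; the exact definitions of Section 3.3 may differ, and R2 is taken as the coefficient of determination.

```python
import math


def relative_error_percent(model, actual):
    """Relative error (%) for each point: |model_i - actual_i| / actual_i * 100."""
    return [abs(m - a) / a * 100.0 for m, a in zip(model, actual)]


def mean_relative_error(model, actual):
    """Mean of the per-point relative errors (%)."""
    errors = relative_error_percent(model, actual)
    return sum(errors) / len(errors)


def mse(model, actual):
    """Mean squared error."""
    return sum((m - a) ** 2 for m, a in zip(model, actual)) / len(actual)


def rmse(model, actual):
    """Root mean squared error."""
    return math.sqrt(mse(model, actual))


def r_squared(model, actual):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - m) ** 2 for m, a in zip(model, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

Applying the same functions to the training and testing predictions of each model gives a like-for-like basis for a table such as Table 7.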

4.2.3 Sensitivity Analysis

Once the best model has been selected by establishing the relationship between input and output, the impact of each input on the output is assessed using Eq. (27) (Elmolla et al. 2010):

$$ SCO=\frac{\left(\text{Scaled output for 10\% increase in input}\right)-\left(\text{Scaled output for 10\% decrease in input}\right)}{2} $$
(27)

The scaled change in output (SCO) is calculated with the current input increased by 10 % and decreased by 10 %. The results obtained are thus the scaled output change per 10 % change in input. An increase or decrease of an input from its base value results in an increase or decrease in the performance level. Logically, the net effect of a change in input is a positive score for the average scaled change in output. The data obtained from the sensitivity analysis are presented in Table 8.
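The calculation in Eq. (27) can be sketched as follows. The function name, the `model` callable and the toy linear model are hypothetical; the model is assumed to return an already scaled output, and only one input is perturbed at a time while the others stay at their base values.

```python
def scaled_change_in_output(model, base_inputs, index, fraction=0.10):
    """SCO for one input: half the difference between the model output with
    that input raised by `fraction` and with it lowered by `fraction`."""
    raised = list(base_inputs)
    lowered = list(base_inputs)
    raised[index] = base_inputs[index] * (1.0 + fraction)
    lowered[index] = base_inputs[index] * (1.0 - fraction)
    return (model(raised) - model(lowered)) / 2.0


def toy_model(x):
    """Hypothetical stand-in for a trained model (not the GP model of Eq. 25)."""
    return 2.0 * x[0] + 3.0 * x[1]


base = [1.0, 1.0]
for i in range(len(base)):
    print(i, scaled_change_in_output(toy_model, base, i))
```

Using a central difference around the base value keeps the score symmetric, so a model that responds more strongly to increases than to decreases is not over- or under-stated.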

Table 8 Net sensitivity at output and relative contribution of input variables affecting removal percentage of arsenic (III)

The percentage relative contribution of each input variable in removal percentage is calculated using the following equation (Bring 1996):

$$ Relative\ contribution\ \left(\%\right)=\frac{variance_x}{variance_{x(total)}}\times 100 $$
(28)

The relative contribution of the input variables to As(III) removal is presented in Fig. 13. Dose, pH and time are the most important influencing variables and affect As(III) removal the most. The sensitivity data suggest that the variable concentration is more sensitive than the others, but its contribution towards the removal percentage is low, which makes it a less significant variable.
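Eq. (28) amounts to expressing each variable's variance as a share of the total. A minimal sketch is given below, assuming that the total variance is the sum of the per-variable variances; the function name and the numbers are hypothetical and are not the values of Table 8.

```python
def relative_contributions(variances):
    """Percentage contribution of each variable: variance_x / total * 100.

    Assumption: the total is the sum of the per-variable variances supplied.
    """
    total = sum(variances.values())
    return {name: value / total * 100.0 for name, value in variances.items()}


# Hypothetical variances, for illustration only (not from the study).
example = {"dose": 4.0, "pH": 3.0, "time": 2.0, "temperature": 0.5, "concentration": 0.5}
for name, share in relative_contributions(example).items():
    print(name, share)
```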

Fig. 13 The relative contribution of input variables in arsenic (III) removal by CTEPA hybrid material

4.2.4 Best Optimal Settings

In the present study, the best optimal setting was obtained by feeding all possible experimental runs of the batch studies into the GP model (with the parameters presented in Table 2). The optimum removal of 99.7 % is achieved with dose = 0.7 g/L, pH = 6, time = 50 min, temperature = 65 °C and initial concentration = 25 mg/L. The best-so-far GP model for predicting the optimized value is:

$$ \begin{aligned} ypred ={} & 11.79\times \cos\left(\sin\left(x_1\right)\times x_4\right)+\left(\tan\left(x_3\right)+x_5-1.127\right)-5.952\times \cos\left(\cos\left(x_2\right)+0.8083\right) \\ & -11.8\times \left(\sin\left(x_3\right)-5.221\times \sin\left(x_4\right)\right)-3.516\times x_1\times \left(x_1-3.504\right)+5.952\times \sin\left(x_3-x_5\right) \\ & +6.278\times \tan\left(\sin\left(\sin\left(x_3\times x_4\right)\times \left(x_2+0.8083\right)\right)\right) \\ & -0.1955\times \tan\left(5.221\times \cos\left(x_2+0.8083\right)-\left(x_1-3.504\right)\times \left(x_2-0.7015\right)\right) \\ & -10.92\times \sin\left(3.516\times x_1\times \left(x_1-3.504\right)+5.221\times x_1\times x_4+0.7015\right) \\ & -5.952\times \tan\left(3.516\times x_2\right)+7.442\times \tan\left(\sin\left(x_1\times x_4\right)-0.9009\times x_2\right) \\ & -0.4905\times \left(x_1-3.516\right)\times \left(\cos\left(x_2\times x_4\right)-0.7015\right)\times \left(x_1-x_2+x_4\right) \\ & +1.667\times \left(x_1-3.504\right)\times \left(\cos\left(x_2\times x_4\right)-0.7015\right)\times \left(\cos\left(x_2+0.8083\right)+\cos\left(x_2\times x_4\right)+7.352\right)+88.72 \end{aligned} $$
(29)

The GP model learned the nonlinear relationship between input and output with a correlation coefficient (R2) of 0.987 at the optimization stage. These observations suggest that the maximum removal percentage can be achieved if the optimized input combination is followed.
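Feeding all candidate runs to a fitted model and keeping the best one is an exhaustive search over the factor levels. The sketch below illustrates the idea under stated assumptions: `best_setting`, the factor levels and the toy response surface are hypothetical and do not reproduce the GP model or the experimental design of the study.

```python
from itertools import product


def best_setting(model, levels):
    """Evaluate `model` on every combination of factor levels and return the
    combination with the highest predicted value, together with that value."""
    names = list(levels)
    best_inputs, best_value = None, float("-inf")
    for combo in product(*(levels[name] for name in names)):
        value = model(*combo)
        if value > best_value:
            best_inputs, best_value = dict(zip(names, combo)), value
    return best_inputs, best_value


def toy_removal(dose, ph):
    """Hypothetical response surface with its maximum at dose 0.7 and pH 6."""
    return 100.0 - 50.0 * (dose - 0.7) ** 2 - (ph - 6) ** 2


levels = {"dose": [0.5, 0.7, 0.9], "ph": [4, 6, 8]}
print(best_setting(toy_removal, levels))
```

An exhaustive search is only practical when the number of factor levels is small, as in a batch study; for finer grids the number of combinations grows multiplicatively with each added factor.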

4.2.5 Confirmatory Experiments

After the best optimal settings were found from the modeling, ten actual experiments were performed in the laboratory under the same conditions to confirm the model adequacy and predictive relationship. The 95 % confidence interval of confirmation experiments (CICE) is calculated using the equation reported by Ross (1988), Roy (1990) and Krishnaiah and Shahabudeen (2012). The confidence interval is estimated as 34.78. Therefore, the predicted confidence interval for the confirmatory experiments is in the range of:

$$ 34.78<\mu <99.7 $$

The actual average of the ten confirmation removal experiments is found to be 96.52 %. As the overall average of the confirmatory experiments falls well within the 95 % CICE, the confidence interval obtained using the equation reported by Ross (1988), Roy (1990) and Krishnaiah and Shahabudeen (2012) supports the accuracy of GP modeling in predicting the removal percentage.

5 Conclusions

The hybrid material cerium oxide tetraethylenepentamine (CTEPA) was successfully synthesized by the co-precipitation method and was characterized using analytical techniques. This study presents an alternative approach using genetic programming and least square support vector machine tools to investigate the removal percentage of As(III) ions from water using the hybrid material, with a maximum removal percentage of 97.2 %. The XRD study confirms the crystalline nature of the material, with average crystallite sizes of 21 and 120 nm before and after adsorption, respectively. The porous nature of the material was confirmed by Fe-SEM studies and As(III) adsorption was confirmed by the corresponding EDS. The thermodynamic study indicated that the adsorption process is exothermic, spontaneous and feasible in nature. The kinetic study indicated that the adsorption process is of second order. The Langmuir isotherm was found to fit the experimental data best, with a high correlation coefficient of R2 > 0.98 and a maximum adsorption capacity of 124.8 mg/g at 25 °C. The chemisorption nature of the adsorption process was confirmed by D-R isotherm studies. The FTIR studies suggested a mechanism of electrostatic attraction and complexation for As(III) ions on the hybrid material. The lowest values of MSE, RMSE, MAPE, AARE, chi square, NB% and standard deviation, and the high correlation coefficient values (R2) (0.988 and 0.977), suggested that the genetic programming (GP) model fitted the experimental data best and had a higher predictive capability than LS-SVM. The variable concentration has the highest sensitivity among the variables, but it makes the least contribution to the removal percentage. The study concludes that GP modeling can be used as an effective prediction tool for As(III) removal from water. The results also indicate the superiority of genetic programming tools in capturing the nonlinear behaviour of the adsorption system. Based on the above findings, the study can be used as a guide for the prediction of As(III) removal from water.