1 Introduction

Noise is introduced into images during acquisition, signal amplification, transmission, and storage, primarily through malfunctioning optical sensors or electromagnetic interference. Image-denoising techniques aim to recover a clean version of a digital image that has been contaminated by additive or multiplicative noise. Noise typically alters some pixels while leaving others unchanged, and distinguishing corrupted pixels from noise-free ones is often difficult. Image restoration therefore remains a common research topic in the field of image processing [1, 2].

Existing image-denoising methods are categorized into heuristically optimized and nonparametric techniques. In the former type, linear and nonlinear filters are utilized in the restoration process. Linear filters are commonly used because of their low cost; however, their application is limited by poor performance, as reflected in blurred edges, ridges, and fine textural details [3, 4]. Furthermore, these filters cannot address multiplicative and mixed noises effectively. By contrast, nonlinear filters can remedy the shortcomings of linear filters [5] and effectively preserve edges and sharp ridges. Both local and nonlocal filtering methods exist; one example is the bilateral filter, which was further modified and improved in [6, 7]. Several local filters have been introduced to improve performance under the Gaussian noise model, given that Gaussian noise is the most common type of noise in the image-denoising field. Noisy images can also be smoothed in different directions with the anisotropic filter proposed in [8]; with this filter, smoothing is applied only along the directions orthogonal to the gradient, which decreases the blur effect of a Gaussian filter. Rudin and Osher [9] employed a regularization method based on the total variation to handle the homogeneous and soft areas of an image, which do not include sharp-edged and high-frequency components. The bilateral filter presented in [10] averages the pixels that are located in the same neighborhood and take values close to that of the central pixel. This filter was also utilized to improve the edges, ridges, and sharp components of a noisy image; nonetheless, its visual denoising quality depends mainly on parameter optimization, as is the case for all local filters. In [11], a method was proposed that uses tetrolet transforms with a locally adaptive thresholding scheme to improve edge-preserving image denoising.

Local filters are typically considered nonparametric filters that can induce a desirable visual quality. Weighted averaging-based filters adhere to the same principle as early local filters do [12, 13]. Such filters are trained to restore images [14]. The sole difference between weighted averaging-based filters and local filters is that the weights are derived from an offline training set; they are not trained online on a large number of patches or images. Liu et al. [15] analyzed and exploited the properties of a dictionary for a sparse-coding technique to robustly estimate the sparse codes in a natural image noise removal process. The Bayesian least squares–Gaussian scale mixture (BLS–GSM) model, which operates on a multiscale wavelet analysis, was introduced in [16] to model local image statistics; thus, the dependencies among coefficients in the local neighborhood can be utilized. Nonlocal filtering-based methods, such as block-matching and 3D filtering (BM3D) [17], BM3D-SAPCA, where SAPCA stands for shape-adaptive principal component analysis [18], and singular value decomposition (SVD) [19, 20], effectively maintain the visual quality of denoised images. K-clustering with K-SVD was introduced in [21] to incorporate a dynamic basis in local representations; this approach achieves good restoration results by training a dictionary from a corrupted image. In K-SVD, each patch in a noisy image can be represented with elements from the entire dictionary.

Given that noise models can take various forms, such as mixed, impulse, uniform, and Gaussian [22], methods must be developed that can easily adapt to different noises. In this regard, learning-based methods are employed because of their adaptive capability [23, 24]. Nonetheless, such techniques can offer only a linear solution, whereas real noise models are mainly nonlinear [25]. Accordingly, the current study addresses this problem by introducing a local adaptive learning-based denoising method that can estimate two types of noises (Gaussian and salt-and-pepper noises). Genetic programming (GP) has recently gained attention in solving many image processing problems. GP approaches have also been used for the removal of impulse noise. However, the results generated with GP cannot always be guaranteed given the random behavior of this process; thus, unpredicted results beyond the consideration of human experts may be generated. GP has been used extensively and performs well in the image restoration field, as reported in [26, 27]. In [28], an adaptive algorithm that was based on SURE risk and considered a universal threshold for noise removal was presented; the mean square error was estimated via an adaptive genetic algorithm (GA).

In [29], a distributed GA was introduced to denoise frames corrupted by vertical line scratches; the scratches in the noisy image were restored in parallel patches, where sub-populations simultaneously evolved with the use of the GA. This technique is advantageous given that it saves significantly more CPU running time than the sequential version of the GA does. In addition, a GA was employed by Korurek et al. [30] to determine the values of the parameters in a mixed noise model to induce the near-field effect of MRI images. Petrovic et al. [31] proposed a successful GP-based denoising approach that involves the detection and suppression of impulse noise; however, this technique was designed exclusively for multiplicative noise models and not for other noise types, such as mixed and additive noise.

A switching median filter was presented in [32] that performed well in both low and medium noise densities; however, this filter was inadequate in conditions with high noise levels. Chan et al. [33] developed a two-phase scheme to eliminate salt-and-pepper noise; these researchers employed an adaptive median filter to detect and remove noise with the use of a specialized regularization approach. Another GA-based image restoration approach was established in [34] to optimize a back-propagation neural network. In [35], a Bayesian framework was minimized with GA to create an energy function for image reconstruction.

In the present work, an image-patch denoising technique is established based on training with GP. In the first step of the training process, we conduct patch clustering before the GP process to classify significant patches of the tested image. The filter derived from GP is adaptive to local image details. The function set is composed of a wavelet thresholding filter and basic arithmetic operators, such as addition, subtraction, and multiplication. The adaptive capability of the proposed algorithm is attributed to the random combinations of wavelet thresholding filter candidates and arithmetic operators. Furthermore, although the offline training procedure is computationally expensive, the online testing procedure is fast and generates results that rank among the best of the spatial filters. The images denoised by the proposed algorithm under additive white noise compare well with those generated by the best denoising methods. Additionally, the suggested algorithm is extended to handle multiplicative noise models, such as salt-and-pepper noise; thus, the proposed algorithm is more versatile than other denoising techniques, including the GP-based denoising approach in [31].

The remainder of this paper is organized as follows. Section 2 presents a brief review of second-generation wavelet transform approaches and an introduction to GP. Section 3 describes the proposed methodology. Section 4 discusses the experimental results, including a comparison of the proposed method with other denoising techniques. Finally, Sect. 5 summarizes the study conclusions.

2 Related work

In this section, GP and second-generation wavelet thresholding techniques are briefly described.

2.1 Second-generation wavelet transform

Second-generation wavelets are relatively new versions of discrete wavelets that are designed based on the lifting scheme, and their application to natural image denoising has been influential. In the 2D discrete wavelet domain (Fig. 1), each square-shaped set of wavelet coefficients is called a sub-band (each separate packet in Fig. 1 is also a sub-band). The lifting scheme filter is employed to determine the filtered value for all pixels recognized as noise. In a 2D image, each decomposition level consists of four filtered sub-bands: LL, LH, HL, and HH. The sub-bands labeled LH\(_{1}\), HL\(_{1}\), and HH\(_{1}\) are the sets of wavelet coefficients at the finest level. The sub-bands at the next coarser level, namely, LH\(_{2}\), HL\(_{2}\), and HH\(_{2}\), are derived from the scaling coefficients LL\(_{1}\) of the finer level [36]. Decomposition is repeated until a chosen level is reached.
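To make the sub-band structure concrete, the following minimal sketch performs one decomposition level with the simplest lifting scheme (Haar: split, predict, update). It is an illustration only: the Haar wavelet, the function names, and the sub-band naming convention (first letter for the row filter, second for the column filter) are assumptions and are not taken from the method described here. Image dimensions are assumed to be even.

```python
def haar_lift_1d(signal):
    """One Haar lifting step on a 1D sequence: split, predict, update."""
    even, odd = signal[0::2], signal[1::2]
    detail = [o - e for e, o in zip(even, odd)]            # predict step
    approx = [e + d / 2.0 for e, d in zip(even, detail)]   # update step (pair mean)
    return approx, detail


def transpose(matrix):
    return [list(col) for col in zip(*matrix)]


def lift_columns(block):
    """Apply the 1D lifting step to every column of a block."""
    approx_cols, detail_cols = [], []
    for col in transpose(block):
        approx, detail = haar_lift_1d(col)
        approx_cols.append(approx)
        detail_cols.append(detail)
    return transpose(approx_cols), transpose(detail_cols)


def haar_lift_2d(image):
    """One decomposition level; returns the sub-bands LL, LH, HL, HH."""
    lows, highs = [], []
    for row in image:                      # transform along the rows first
        approx, detail = haar_lift_1d(row)
        lows.append(approx)
        highs.append(detail)
    ll, lh = lift_columns(lows)            # then along the columns
    hl, hh = lift_columns(highs)
    return ll, lh, hl, hh


ll, lh, hl, hh = haar_lift_2d([[0, 2], [4, 6]])
```

A second level is obtained by calling `haar_lift_2d(ll)` again, which mirrors how the coarser sub-bands are derived from LL\(_{1}\).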

2.2 Brief introduction to genetic programming

Genetic programming is an evolutionary technique that is primarily used to evolve programs, sub-programs, or the structures of procedures [37, 38]. The main steps in a GP algorithm are depicted in the flowchart in Fig. 2. GP runs each candidate program and evaluates a fitness function to determine how effective the program is. The fitness values of the individuals in the population are then compared to select the superior programs. New programs are generated from the selected ones through crossover and mutation. The cycle of selection, crossover, and mutation is repeated for a specified number of iterations or until a certain objective is achieved.
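As a rough illustration of how GP represents and varies programs, the sketch below encodes an individual as a nested tuple `(operator, left, right)` and implements subtree crossover and subtree mutation. The terminal set, the three arithmetic operators, and all function names are hypothetical choices made for this example; they are not the function set used in this paper.

```python
import operator
import random

FUNCTIONS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
TERMINALS = ["x", 1.0, 2.0]  # assumed terminal set for the illustration


def random_tree(rng, depth):
    """Grow a random expression tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    op = rng.choice(sorted(FUNCTIONS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    """Run the program encoded by the tree on the input x."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return FUNCTIONS[op](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree


def paths(tree, prefix=()):
    """Yield the position of every node as a tuple of child indices."""
    yield prefix
    if isinstance(tree, tuple):
        yield from paths(tree[1], prefix + (1,))
        yield from paths(tree[2], prefix + (2,))


def get_subtree(tree, path):
    for index in path:
        tree = tree[index]
    return tree


def replace_subtree(tree, path, new):
    if not path:
        return new
    node = list(tree)
    node[path[0]] = replace_subtree(tree[path[0]], path[1:], new)
    return tuple(node)


def crossover(rng, parent_a, parent_b):
    """Replace a random subtree of parent_a with a random subtree of parent_b."""
    target = rng.choice(list(paths(parent_a)))
    donor = get_subtree(parent_b, rng.choice(list(paths(parent_b))))
    return replace_subtree(parent_a, target, donor)


def mutate(rng, tree, depth=2):
    """Replace a random subtree with a freshly grown one."""
    target = rng.choice(list(paths(tree)))
    return replace_subtree(tree, target, random_tree(rng, depth))


tree = ("add", ("mul", "x", "x"), 1.0)      # encodes x * x + 1
child = crossover(random.Random(0), tree, ("sub", "x", 2.0))
```

Trees are kept immutable so that a child never shares mutable state with its parents, which keeps the variation operators easy to reason about.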

Fig. 1

Parent–child relationship among the three levels of the 2D orthogonal wavelet transformation for the benchmark image “Baboon”

Fig. 2

Basic genetic programming framework

3 Methodology

This study mainly aims to utilize GP to create a novel adaptive local filter that can deal with diverse image details. The framework of the proposed technique is depicted in Fig. 3.

The supervised learning process of the proposed algorithm consists of two main parts: the classification process, which is performed based on training in offline mode, and the online restoration technique, which is the core of this algorithm. A noise-free image is first corrupted by adding noise. Subsequently, patch clustering is applied to the noisy image in order to gather all patches that carry the same features and textures. For every individual class, an optimum filter design is developed through GP. The model structure of the filter used in GP is unique. Filtering is achieved by applying second-generation wavelet thresholding to the clustered patches to suppress the noise as well as to smooth the edges and sharp ridges of the corrupted image.

In GP analysis, \(x_i\) is assumed to be a patch of the noise-free image, whose center is pixel i; \(y_i\) is the matching patch in the noisy image for a specific class k. The filtered patch \(\hat{x}_i\) is then obtained based on the chosen optimal filter parameters for class k as follows:

$$\begin{aligned} \hat{x}_i =F_k (y_i ), \end{aligned}$$
(1)
Fig. 3

Framework of the proposed filter

where \(F_{k}\) is the filter function generated with GP for an individual class k. Furthermore, filter estimation is intended to reduce the structural model errors that may occur because of a high noise level. Figure 3 depicts the overall structure of the proposed filter, covering the GP training and testing procedures. When a corrupted image is tested, local patches are extracted from it, their SVD features are computed, and the distances between these features and the cluster centers are calculated. The cluster centers and the associated coefficients are determined offline, which simplifies the filtering process.
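The online step of Eq. (1) can be sketched as a simple dispatch: find the cluster center nearest to the patch feature, then apply the filter of that class. In the sketch below, the scalar feature, the two cluster centers, and the `smooth` and `keep` functions are hypothetical stand-ins for the GP-evolved filters; they only show the control flow.

```python
def nearest_cluster(feature, centers):
    """Index k of the offline cluster center closest to a patch feature."""
    return min(range(len(centers)), key=lambda k: abs(feature - centers[k]))


def restore_patch(noisy_patch, feature, centers, filters):
    """Eq. (1): x_hat_i = F_k(y_i), with k chosen by the nearest center."""
    k = nearest_cluster(feature, centers)
    return filters[k](noisy_patch)


# Hypothetical stand-ins for two GP-evolved filters F_0 and F_1.
def smooth(patch):
    mean = sum(patch) / len(patch)
    return [mean] * len(patch)


def keep(patch):
    return list(patch)


centers = [0.2, 6.0]        # assumed offline cluster centers (scalar features)
filters = [smooth, keep]    # one filter per class
flat = restore_patch([10, 12, 11, 11], feature=0.3, centers=centers, filters=filters)
edge = restore_patch([0, 0, 255, 255], feature=5.1, centers=centers, filters=filters)
```

Because the centers and filters are fixed offline, the per-patch cost at test time is one nearest-center lookup plus one filter call.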

3.1 Patch clustering

The proposed algorithm groups individual patches that share image details and textures, such as sharp edges, ridges, and fine image elements. Existing clustering techniques exploit several image features, such as intensity, brightness, and gradient [39]. In the image restoration field, however, the disturbance from noise necessitates the extraction of robust structural features from the tested images. In this regard, the wavelet filter elements used in the function set of the proposed algorithm are robust to the high-frequency components of the image details, including sharp edges and fine textures. SVD [39] is applied to extract features from noisy images, and K-means clustering is then performed on these features. Because noise has no preferred direction, the clustering considers the magnitude of the singular value along the dominant direction of every sub-image. In addition, this magnitude indicates whether the investigated patch lies in a smooth region or in an edge/texture region of the local neighborhood.

For every patch that is centered at pixel i and has size \(N=n\times n\), the gradient values can be collected in a matrix \(G_i\). The corresponding SVD can be derived as follows:

$$\begin{aligned} G_i =[\nabla y_i ( 1)^T\nabla y_i ( 2)^T\ldots \nabla y_i (N)^T]^T,G_i =U_i S_i V_i^T , \end{aligned}$$
(2)

where

$$\begin{aligned} \nabla y_i (j)=\left[ \frac{\partial y_i (j)}{\partial \alpha }\quad \frac{\partial y_i (j)}{\partial \beta }\right] ^T. \end{aligned}$$
(3)

\(\nabla y_i(j)\) represents the gradient of patch \(y_{i}\) at point j, where j = 1, 2,..., N. The gradient matrix \(G_i\) of image y at point i is decomposed into three parts: \(U_i\) is an \(N\times N\) orthogonal matrix, \(V_i\) is a \(2\times 2\) orthogonal matrix that indicates the dominant direction of the gradient field in the individual patch, and \(S_i\) is an \(N\times 2\) matrix that holds the singular values. Thus, the sub-images inside every dataset are grouped by K-means clustering as in [40]:

$$\begin{aligned} \mathop {\mathrm{argmin}}\limits _c \mathop \sum \limits _{k=1}^K \mathop \sum \limits _{w(y_{ki} )\in W_{m_k},\; i=1,2,\ldots ,N_k } \left| {w( {y_{ki} })-\mu _k} \right| ^2, \end{aligned}$$
(4)

where \(y_{ki}\) is a corrupted patch that is located at pixel i and belongs to cluster k; \(w({y_{ki}})\) represents the magnitude of the dominant orientation of the local gradient; and \(\mu _k\) represents the mean vector of the kth cluster \(W_{m_k}\), which is the set of magnitude values of the dominant orientation of the local gradient. Accordingly, K clusters \(W_{m_1}\), \(W_{m_2}\),..., \(W_{m_k}\),..., \(W_{m_K}\) can be obtained. Each cluster \(W_{m_k }\) is composed of \(N_k \) vectors \(W_{m_{kq}}\) \((k = 1,{\ldots },K;\; q = 1, 2,{\ldots },N_k)\).
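The feature extraction of Eqs. (2) and (3) and the clustering objective of Eq. (4) can be sketched in a few lines. The example below is a simplified illustration under stated assumptions: gradients are approximated by central differences with clamped borders, the singular values of the N x 2 matrix \(G_i\) are obtained from the eigenvalues of the 2 x 2 matrix \(G_i^T G_i\), the clustered feature is the scalar dominant singular value, and the K-means initialisation is a deterministic choice made for reproducibility. None of these details are specified in the text.

```python
import math


def patch_gradients(patch):
    """Rows of G_i: central-difference gradient at every pixel (clamped borders)."""
    rows, cols = len(patch), len(patch[0])
    grads = []
    for r in range(rows):
        for c in range(cols):
            left, right = patch[r][max(c - 1, 0)], patch[r][min(c + 1, cols - 1)]
            up, down = patch[max(r - 1, 0)][c], patch[min(r + 1, rows - 1)][c]
            grads.append(((right - left) / 2.0, (down - up) / 2.0))
    return grads


def gradient_svd(grads):
    """Singular values s1 >= s2 of G_i and the dominant right singular vector."""
    a = sum(gx * gx for gx, _ in grads)        # entries of the 2x2 matrix G^T G
    b = sum(gx * gy for gx, gy in grads)
    d = sum(gy * gy for _, gy in grads)
    mean, root = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
    lam1, lam2 = mean + root, max(mean - root, 0.0)
    if b != 0.0:
        v = (lam1 - d, b)                      # eigenvector belonging to lam1
    else:
        v = (1.0, 0.0) if a >= d else (0.0, 1.0)
    norm = math.hypot(v[0], v[1])
    return math.sqrt(lam1), math.sqrt(lam2), (v[0] / norm, v[1] / norm)


def kmeans_1d(values, k, iterations=50):
    """Lloyd's algorithm on scalar features; minimises the cost of Eq. (4)."""
    ordered = sorted(values)
    # Deterministic initialisation: spread the centers over the sorted values.
    centers = [ordered[(2 * j + 1) * len(ordered) // (2 * k)] for j in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: (v - centers[j]) ** 2) for v in values]
        new_centers = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centers.append(sum(members) / len(members) if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


flat = [[7.0] * 3 for _ in range(3)]
ramp = [[float(c) for c in range(3)] for _ in range(3)]
s_flat, _, _ = gradient_svd(patch_gradients(flat))
s_ramp, _, direction = gradient_svd(patch_gradients(ramp))
centers, labels = kmeans_1d([0.1, 0.2, 0.15, 5.0, 5.2, 4.8], k=2)
```

A flat patch yields a zero dominant singular value, whereas the horizontal ramp yields a positive one with a horizontal dominant direction, which is the property the clustering relies on to separate smooth regions from edges and textures.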

3.2 Wavelet thresholding filter

Wavelet coefficients are correlated within a small neighborhood: a large coefficient generally has large neighbors as well. The proposed thresholding function is therefore derived from the neighborhood coefficients of the corrupted image. Supposing that \(R_{i,j}\) represents the discrete wavelet coefficient under investigation in the corrupted image, the following is set:

$$\begin{aligned} U_{i,j}^2 =R_{i,j-1}^2 +R_{i,j}^2 +R_{i,j+1}^2, \end{aligned}$$
(5)

where \(U_{i,j}^2\) is the sum of the squared coefficients in the neighborhood. The neighboring coefficients lie in the same row as the coefficient to be thresholded, and (i, j) represents the location of that coefficient in the contaminated image. Given the following conditional inequality,

$$\begin{aligned} \mathrm{If } \; U_{i,j}^{2} <\lambda ^{2}, \end{aligned}$$
(6)

then the wavelet coefficient \(R_{i,j}\) is set to zero. Otherwise, this coefficient decreases according to the following equation:

$$\begin{aligned} R_{i,j(\mathrm{New})}=\frac{R_{i,j}^2 -R_{i,j}^2 *\lambda ^{2} }{U_{i,j}^2}, \end{aligned}$$
(7)

where \(\lambda =\sqrt{2\ln M_j}\,\sigma _w\), \(\sigma _w\) is the noise standard deviation, and \(M_j \) is the number of coefficients in the sub-band under investigation.
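The neighborhood test of Eqs. (5) and (6) can be sketched as follows for one row of a sub-band. Two points are assumptions rather than details from the text: coefficients beyond the row ends are treated as zero, and surviving coefficients are shrunk with the standard neighboring-coefficient factor \(1-\lambda ^2/U_{i,j}^2\), which may differ from the exact form of Eq. (7).

```python
import math


def universal_threshold(num_coeffs, sigma):
    """lambda = sqrt(2 ln M_j) * sigma_w for a sub-band with M_j coefficients."""
    return math.sqrt(2.0 * math.log(num_coeffs)) * sigma


def neighbour_threshold_row(row, lam):
    """Threshold one row of wavelet coefficients using left and right neighbors."""
    out = []
    for j, r in enumerate(row):
        left = row[j - 1] if j > 0 else 0.0              # assumed zero padding
        right = row[j + 1] if j + 1 < len(row) else 0.0
        u2 = left ** 2 + r ** 2 + right ** 2             # Eq. (5)
        if u2 < lam ** 2:                                # Eq. (6): treated as noise
            out.append(0.0)
        else:
            # ASSUMPTION: standard neighborhood shrinkage factor, used here
            # in place of Eq. (7).
            out.append(r * (1.0 - lam ** 2 / u2))
    return out


shrunk = neighbour_threshold_row([0.1, -0.2, 0.1, 10.0, 0.1], lam=1.0)
```

In this example the first two small coefficients are zeroed because their neighborhood energy is below \(\lambda ^2\), whereas the large coefficient and its immediate neighbors survive with a slight shrinkage.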

3.3 Training of genetic programming local filters-based image restoration

The main objective of this training stage is to develop an adaptive filter that can denoise the individual classes of corrupted image patches with different textural forms. Shao et al. [14] adopted a least squares optimization procedure to estimate the parameters of a linear model. In real-world applications, however, a linear filter alone is insufficient for image restoration. For instance, such filters effectively remove additive Gaussian noise but also blur edges. Moreover, these filters fail to successfully suppress auto unique-feature noise. By contrast, nonlinear filters can handle nonuniform smoothing and can also be adapted locally to image textures; thus, these filters are suitable for the removal of impulse and heavy noises. In the current study, a second-generation wavelet thresholding filter is adopted to address heavy and high-level noises. The main intuition behind the proposed algorithm is that once the input image is categorized by individual features, GP can be exploited to tune the wavelet thresholding filter for noisy patches that contain certain fine details. Specifically, all patches belonging to an individual class \(W_{m_k}\) are trained using GP processes. In each generation, the patches are processed by wavelet thresholding and then compared with the noise-free patches by applying the fitness function described in Eqs. (8) and (9).

3.3.1 Genetic programming function-set

In a problem domain, the function-set of the GP procedure is naturally emphasized to a certain extent. Practically, the functions of GP and their efficiency should be considered given the time consumed by and the complexity of the evolving process as well as of the individual tree. In the experimental test, the function-set of the system contains a wavelet thresholding filter with basic arithmetic operations that may affect the time consumption of the entire system. In this regard, the arithmetic operations are applied mostly on two images; however, the filtering process is used on a single patch (sub-image).

The parameters for the wavelet thresholding filter are presented in Eqs. (5)–(7). The wavelet thresholding filter is selected because of its capability to preserve edges. Moreover, its function is not highly complex; therefore, this filter can easily be trained by GP evolution [10, 41]. The arithmetic functions in a GP filter are used to improve individuals [42]. The structural individuals of the arithmetic functions positively affect the time complexity of the entire system, as shown in the experimental results.

3.3.2 Denoising-based fitness function

The fitness function, which evaluates the filter performance for each cluster, is the reciprocal of the average peak signal-to-noise ratio (PSNR) of the filtered patches:

$$\begin{aligned} \mathrm{fitness}=\frac{N_k}{\mathop \sum \nolimits _{l=1}^{N_k} \mathrm{PSNR}_l}. \end{aligned}$$
(8)

The PSNR for each individual patch can be calculated as follows:

$$\begin{aligned} \mathrm{PSNR}_l= & {} 10\log _{10}\left( \frac{N L^2}{\mathop \sum \nolimits _{j=1}^N [x_i (j)-F_k (y_i (j))]^2}\right) , \end{aligned}$$
(9)
$$\begin{aligned} \mathrm{MSE}= & {} \frac{1}{N}\mathop \sum \nolimits _{j=1}^N [x_i ( j)-F_k (y_i (j))]^2 \end{aligned}$$
(10)

where L represents the maximum possible pixel value; j is the pixel position within the patch; N is the number of pixels in the patch; and \(x_i (j)\) and \(y_i (j)\) are the pixel values at location j of the noise-free and noisy patches, respectively. The individual with the minimal fitness value, that is, the highest average PSNR, is the best choice.
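The fitness computation can be sketched directly from Eqs. (8) and (9). The sketch assumes 8-bit images (peak value 255) and patches given as flat lists; both are illustrative assumptions, as are the function names.

```python
import math


def patch_psnr(clean, filtered, peak=255.0):
    """Eq. (9): 10 log10(N L^2 / sum of squared errors) for one patch."""
    sse = sum((x - f) ** 2 for x, f in zip(clean, filtered))
    if sse == 0:
        return float("inf")          # identical patches
    return 10.0 * math.log10(len(clean) * peak ** 2 / sse)


def fitness(clean_patches, filtered_patches, peak=255.0):
    """Eq. (8): reciprocal of the mean PSNR over a cluster (lower is better)."""
    psnrs = [patch_psnr(c, f, peak) for c, f in zip(clean_patches, filtered_patches)]
    return len(psnrs) / sum(psnrs)


clean = [[100, 100, 100, 100]]
rough = fitness(clean, [[110, 110, 110, 110]])   # error of 10 grey levels
fine = fitness(clean, [[101, 101, 101, 101]])    # error of 1 grey level
```

Since the fitness is the reciprocal of the mean PSNR, the more accurate filter output (`fine`) receives the smaller value, which is consistent with GP minimizing the fitness.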

4 Experimental results and discussion

The GP process is implemented with the GP toolbox GPLAB 4.02 in Matlab 2014a. The system has a seven-core processor with 128 GB of RAM and runs Windows 8.1. At least 128 GB of RAM is recommended to complete the coding processes; the code can also run with 32 GB of RAM, but the execution time increases considerably.

4.1 Experimental dataset

In this study, the Berkeley segmentation dataset, which is composed of 250 images, is employed [43]. The images in our experiments are down-sampled to \(256\times 256\) and are converted into grayscale images to improve image quality. This method has been proven effective by many sources, such as [44]. In addition, all the candidate functions in the function set have already been tested and confirmed to have the best parameter range for images, with a patch size of \(9\times 9\). The test data in the experiments are obtained from seven standard test images, namely, “Boat”, “Barbara”, “F-16”, “Monarch”, “Baboon”, “Couple”, and “Lena”. These images comprise the standard grayscale image dataset [11] for most state-of-the-art denoising algorithms. In addition, 150 images are selected from the Pascal VOC 2007 dataset [45].

4.2 Determination of parameters

In this section, the parameters for the patch clustering and GP processes are introduced and explained in detail.

4.2.1 Patch clustering parameters

The size of the patches extracted from the dataset is set to \(9\times 9\). As per our experiments, the image restoration results are almost similar when the patch offset is set to either 1 or 3. Accordingly, the latter is selected for efficiency of the computational processes. Regarding cluster numbers, K is adjusted to meet the desired value in the clustering process. Practically, the experimental results indicated that performance can be optimized by selecting 7–28 clusters for the 7 selected images. In each individual experiment, the images are selected randomly from the training set. To implement the experimental clustering procedure, the K value is set to 18 for high performance and efficiency.
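The effect of the patch offset on the workload can be checked with a short sketch that enumerates the top-left corners of all patches on a regular grid. The function name and the grid convention (patches must lie fully inside the image) are assumptions made for this illustration.

```python
def patch_origins(height, width, size=9, offset=3):
    """Top-left corners of all size x size patches sampled with the given offset."""
    return [(r, c)
            for r in range(0, height - size + 1, offset)
            for c in range(0, width - size + 1, offset)]


coarse = patch_origins(256, 256, size=9, offset=3)
dense = patch_origins(256, 256, size=9, offset=1)
```

Under these assumptions, a \(256\times 256\) image yields 6889 patches with an offset of 3 and 61,504 patches with an offset of 1, so the larger offset reduces the number of patches to process by a factor of roughly nine.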

4.2.2 Main genetic programming processes

  • Initially, 50 generations and a population of 500 individuals are adopted in the training process. These settings allow GP to generate an adequately adaptive filter design. In addition, the ramped half-and-half technique [46] is used to generate the initial population. The mutation rates reported in the existing image-processing literature are low, and our experiments confirm that high mutation rates are ineffective. The mutation rate is therefore set to 0.05.

  • “Tournament” selection is conducted in the experiments. Owing to the significant population burden, we have selected this method to enhance the flexibility of the overfitting tree during implementation. The best individual from both the parent and child generations is retained in the new population [34]. Therefore, the “keep best” principle is applied as the survival method in this experiment.

  • Training is terminated when the fitness value is less than \(1\times 10^{-6}\) or when the maximum number of generations is reached.
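The selection, survival, and termination rules listed above can be sketched as a generic evolutionary loop. This is not GPLAB code: the function names, the tournament size of 3, and the toy numeric individuals are assumptions chosen so that the example stays short and runnable. Only the mutation rate of 0.05, the limit of 50 generations, and the fitness tolerance of \(1\times 10^{-6}\) are taken from the text. In actual GP, the individuals would be expression trees and the fitness would be Eq. (8).

```python
import random


def tournament(rng, population, scores, size=3):
    """Return the fittest (lowest score) of `size` randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), size)
    return population[min(contenders, key=lambda i: scores[i])]


def evolve(rng, population, fitness, crossover, mutate,
           generations=50, mutation_rate=0.05, tolerance=1e-6):
    """Tournament selection with "keep best" survival and two stopping rules."""
    scores = [fitness(ind) for ind in population]
    for _ in range(generations):
        best = min(range(len(population)), key=lambda i: scores[i])
        if scores[best] < tolerance:            # fitness-based termination
            break
        children = []
        while len(children) < len(population) - 1:
            child = crossover(rng,
                              tournament(rng, population, scores),
                              tournament(rng, population, scores))
            if rng.random() < mutation_rate:
                child = mutate(rng, child)
            children.append(child)
        # "Keep best": the best parent survives unchanged next to the children.
        population = [population[best]] + children
        scores = [fitness(ind) for ind in population]
    best = min(range(len(population)), key=lambda i: scores[i])
    return population[best], scores[best]


# Toy usage: individuals are plain numbers and the optimum is x = 3.
rng = random.Random(0)
start = [rng.uniform(-10.0, 10.0) for _ in range(20)]
winner, score = evolve(rng, start,
                       fitness=lambda x: (x - 3.0) ** 2,
                       crossover=lambda r, a, b: (a + b) / 2.0,
                       mutate=lambda r, x: x + r.gauss(0.0, 1.0))
```

Because the best parent is always copied into the next generation, the best fitness can never get worse from one generation to the next, which is the practical benefit of the "keep best" principle.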

4.3 Results

The experimental results are presented and discussed in detail in this section.

4.3.1 Objective performance comparison

In this study, two quantitative measures, the peak signal-to-noise ratio (PSNR) and the mean structural similarity (MSSIM), are adopted to assess performance objectively. PSNR is calculated for the entire image as defined in Eq. (9), whereas MSSIM [47] can be derived as follows:

$$\begin{aligned} \mathrm{MSSIM}( {Z,A})= & {} \frac{\mathop \sum \nolimits _{k=1}^B \mathrm{SSIM}(z_k ,a_k )}{B} \end{aligned}$$
(11)
$$\begin{aligned} \mathrm{SSIM}( {z_k ,a_k })= & {} \frac{(2\mu _{z_k } \mu _{a_k } +C_1 )(2\sigma _{z_k a_k } +C_2 )}{(\mu ^2_{z_k } +\mu ^2_{a_k } +C_1 )(\sigma ^2_{z_k } +\sigma ^2_{a_k } +C_2 )}, \end{aligned}$$
(12)

where Z and A represent the original image (noise-free image) and the reconstructed image, respectively; \(z_k \) and \(a_k \) represent the kth patches from the original and denoised images, respectively; B is the total number of local windows in the image; \(\mu _{z_k }\) and \(\mu _{a_k }\) are the mean intensities of \(z_k \) and \(a_k \), respectively; \(\sigma ^2_{z_k }\) and \(\sigma ^2_{a_k }\) are the corresponding variances; \(\sigma _{z_k a_k } \) is the covariance between \(z_k \) and \(a_k \); and \(C_1 \) and \(C_2 \) are the constants required to stabilize the SSIM.
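Equations (11) and (12) can be evaluated with a short sketch. The constants \(C_1=(0.01L)^2\) and \(C_2=(0.03L)^2\) with L = 255 are commonly used values and are an assumption here, as are the unbiased variance estimates and the representation of each window as a flat list.

```python
def ssim_window(z, a, peak=255.0):
    """Eq. (12) for two equally sized windows given as flat lists."""
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2   # assumed stabilizing constants
    n = len(z)
    mu_z, mu_a = sum(z) / n, sum(a) / n
    var_z = sum((v - mu_z) ** 2 for v in z) / (n - 1)
    var_a = sum((v - mu_a) ** 2 for v in a) / (n - 1)
    cov = sum((p - mu_z) * (q - mu_a) for p, q in zip(z, a)) / (n - 1)
    numerator = (2 * mu_z * mu_a + c1) * (2 * cov + c2)
    denominator = (mu_z ** 2 + mu_a ** 2 + c1) * (var_z + var_a + c2)
    return numerator / denominator


def mssim(windows_z, windows_a, peak=255.0):
    """Eq. (11): mean SSIM over B corresponding windows."""
    values = [ssim_window(z, a, peak) for z, a in zip(windows_z, windows_a)]
    return sum(values) / len(values)


same = ssim_window([10, 20, 30, 40], [10, 20, 30, 40])
flipped = ssim_window([10, 20, 30, 40], [40, 30, 20, 10])
```

Identical windows give an SSIM of 1, whereas reversing the intensity pattern makes the covariance negative and lowers the score, even though both windows have the same mean and variance.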

The proposed filter for the Gaussian noise model is trained on sub-images (patches) that are corrupted by additive white Gaussian noise (AWGN). The related noise standard deviations \(\sigma \) are 10, 15, 20, 30, 45, 60, and 70. All the parameters of the state-of-the-art denoising algorithms are set to the values suggested as optimal in previous studies. The quantitative performance of the proposed algorithm is compared with that of the top state-of-the-art local learning-based techniques. As shown in Table 1, the proposed algorithm improves significantly on K-SVD and BM3D-SAPCA at low and average noise levels. This outcome can be attributed to the characteristics of the second-generation wavelet filter, which can prevent the blurring of sharp edges at these noise levels while smoothing the homogeneous textures and fine details.

Table 1 PSNR and MSSIM results of images denoised through different algorithms at \(\sigma =25\)
Fig. 4

Results obtained with different denoising methods for the “Monarch” image with noise density 35 % salt and pepper: a noise-free image, b corrupted image (PSNR 9.95), c SMF (PSNR 17.51), d UINFGP (PSNR 16.78), e ROR–NLM (PSNR 27.83), f AMF (PSNR 28.63), and g the proposed algorithm (PSNR 29.17)

Fig. 5

Results obtained with different denoising methods for the “F-16” image with noise density 35 % salt and pepper: a noise-free image, b corrupted image (PSNR 10.01), c SMF (PSNR 22.86), d UINFGP (PSNR 30.48), e ROR–NLM (PSNR 28.04), f AMF (PSNR 32.43), and g the proposed algorithm (PSNR 32.58)

In the proposed algorithm, the main outline facilitates adaptive GP factors through every single patch, thereby enhancing the efficiency of the main filter parameters. By contrast, existing filters, such as K-SVD, use an overall dictionary to produce different forms for the individual patches, thereby generating outliers and bubbles in the edges and ridges of the reconstructed images as a result of inaccurate representations of the sparse structure.

Fig. 6

Denoising performance given salt-and-pepper noise in the F-16 standard image

Fig. 7

Denoising performance given salt-and-pepper noise in the Monarch standard image

At low noise levels, the performance of the proposed algorithm is comparable to that of BM3D-SAPCA and K-SVD in terms of PSNR, according to an objective assessment. The proposed algorithm also reports a higher MSSIM than K-SVD does. At high noise levels, however, our algorithm encounters difficulties in performing well because of the clustering model. K-SVD likewise generates inadequate results at these levels given the complications in clustering the structure; its sparse representation is therefore lacking. Several local denoising techniques have also been selected for comparison with the proposed algorithm, including higher-order singular value decomposition (HOSVD) [48], multiresolution bilateral filtering (MBF) [49], the BM3D-SAPCA filter, which is considered one of the best local filters [17], and an over-complete BLS–GSM filter, which exploits the wavelet-domain locality in image models [16]. The proposed algorithm performs considerably better than most of the aforementioned filters; however, it generates worse results than BM3D-SAPCA does at high noise levels.

Fig. 8

Results obtained with various denoising methods for the Baboon image (\(256 \times 256\)): a original image, b noisy image with \(\sigma = 25\), c K-SVD, d BM3D-SAPCA, e BLS–GSM, f MBF, g HOSVD, and h the proposed algorithm

Fig. 9

Results obtained with various denoising methods for the Couple image (\(256 \times 256\)): a original image, b noisy image with \(\sigma = 25\), c K-SVD, d BM3D-SAPCA, e BLS–GSM, f MBF, g HOSVD, and h the proposed algorithm

This result can be attributed to the fact that all the candidate filters in the function set experience difficulties at high noise levels. In such a case, the GP process also struggles to “create” a breakthrough over the evolution of many generations. All the parameters of the filters compared with the proposed filter are set according to the authors' settings in the original works. All the methods are implemented on the Matlab platform. The use of the second-generation wavelet thresholding filter in the proposed algorithm contributes significantly to reducing time consumption and complexity.

The proposed algorithm is also compared with state-of-the-art techniques in terms of removing multiplicative noise, especially salt-and-pepper noise. Several state-of-the-art denoising filters are contrasted, such as the standard median filter (SMF) [1], adaptive median filter (AMF) [50], universal impulse noise filtering using GP (UINFGP) [31], and Robust Outlyingness Ratio–Nonlocal Means (ROR–NLM) [51]. As illustrated in Figs. 4 and 5, the proposed algorithm outperforms state-of-the-art denoising algorithms when the noise level is medium (50 % and lower). In high-density noise levels, however, the performance of AMF is close to that of the proposed algorithm for the same reasons highlighted in the case of the Gaussian noise model. ROR–NLM generates better results than SMF and UINFGP do for two test images (F-16 and Monarch). Owing to the complex textures of both test images, the latter two filters performed the most poorly among the state-of-the-art denoising methods. By contrast, the performance of our algorithm is superior to that of the investigated algorithms given both images and under different noise models.

4.3.2 Comparison of visual performance levels

In terms of visual quality under the Gaussian noise model, the proposed algorithm exhibits better performance than the other techniques do at medium noise levels, as displayed in Figs. 6 and 7. At high noise levels, the proposed algorithm generates results that are close to those of AMF and ROR–NLM. The outcome is ascribed to the nature of the function set, which is mainly composed of wavelet thresholding filters that preserve edges and sharp ridges well, especially at medium noise levels. Therefore, the proposed filter exhibits superior qualitative performance and visual appearance among the examined techniques.

Given the salt-and-pepper noise model, Figs. 8 and 9 depict the sharp edges and soft textures that are well preserved by our algorithm, AMF, and ROR–NLM. By contrast, both SMF and UINFGP fail to preserve the main features and fine edges of the reconstructed image. In the resultant image denoised by NAFSM, some noise and blurred textures remain visible. Our algorithm clearly outperforms the other filters in terms of preserving the edges and ridges at the same noise levels. Nonetheless, considerable noise is detected in the median filter images. UINFGP generates severe artifacts, and the corresponding reconstructed images are the worst among those attributed to the other filters. These results agree with those of the objective analysis.

4.3.3 Computational efficiency

The offline training process of the proposed GP algorithm is computationally expensive. With 80 generations and a population of 800 in every class, the typical training time for each individual cluster that has 1900 features is 7.5 h. The online testing stage, by contrast, is fast. In the same environment and platform, the proposed algorithm is more efficient than the state-of-the-art techniques: only 0.24 s is required to process a \(256\times 256\) image, whereas MBF, BM3D-SAPCA, BLS–GSM, and K-SVD take approximately 6.21, 4.18, 54.72, and 782.32 s (with 15 iterations), respectively.

5 Conclusion

In this paper, we have presented an image-denoising algorithm that uses GP. Exploiting the randomness of GP enables us to generate an optimal filter for each class; this filter is highly efficient and can be adapted to different noise models. The experiments demonstrate that the performance of the proposed algorithm is comparable with that of state-of-the-art restoration techniques and that, in several cases, the algorithm generates better results in removing additive noise (represented by Gaussian noise) and multiplicative noise (represented by salt-and-pepper noise).

In future work, the proposed GP algorithm may be extended to other kinds of image enhancement applications, such as the removal of image registration artifacts and noise suppression in image coding-based learning dictionary, by changing the filter parameters. These parameters include the patch clustering procedure, the filter design (e.g., the bilateral filter), and the fitness function of GP. An approach that divides the images into several small frames can be investigated as well; the population cluster to be used would then be determined and developed from each small frame.