1 Introduction

Stochastic interpolation (SI) (Howard et al. 2009) is a recent mathematical procedure to recover a representative function from input data values \(u_{\rm d}\) that are distributed along the x axis. The recovered function is a representation of this input data: a putative model, process, or relation that may have generated \(u_{\rm d}.\) The output vector \(u_{\rm o}\) contains the function values that are recovered by SI at spatial locations chosen by the user. Under certain conditions in the construction of the stochastic framework, not only the values in \(u_{\rm o}\) but also the derivatives of the recovered function, \( {\rm d}^n u_{\rm o}/{{\rm d}x^n}, \) can be recovered at any desired location. Both the \(u_{\rm d}\) and \(u_{\rm o}\) locations can be non-uniformly distributed in space, and the vector dimension (size) of \(u_{\rm o}\) may differ from that of \(u_{\rm d}:\) it may have the same, a smaller, or a larger number of points.

Stochastic interpolation is obtained through the product of two matrices: a discrete de-convolution step followed by a discrete convolution step. If the model used is a probability distribution \(\phi(\sigma),\) then \(\sigma\) is its variance. In matrix notation this involves A, a square matrix whose elements depend on \(\phi_1(\sigma_1),\) and \(\tilde{A},\) the ‘augmented’ rectangular matrix whose elements depend on \(\phi_2(\sigma_2).\) The parameter \(\sigma\) is a tuning parameter that can be held constant, varied spatially as \(\sigma(x),\) or made to depend on the data as in \(\sigma(u_{\rm d}).\)

$$ A^{-1}(\phi_1(\sigma_1)) u_{\rm d} = p, $$
(1)
$$ \tilde{A}(\phi_2(\sigma_2)) p = u_{\rm o}. $$
(2)

The first step is a de-convolution step that involves a matrix inversion of A to obtain the ‘pre-image’ vector p which has the same size and spatial locations as \(u_{\rm d}.\) The second step is a discrete convolution step that involves a matrix multiplication of the pre-image p by the augmented matrix \(\tilde{A}\) to obtain \(u_{\rm o}.\) Derivatives of the recovered function are available anywhere, provided that \(\phi,\) the function generating the row space of \(\tilde{A},\) is differentiable. For example, the first derivative is calculated as follows:

$$ \left( \frac{\rm d}{{\rm d}x} \tilde{A}(\sigma_2)\right) A^{-1}(\sigma_1) u_{\rm d} $$
(3)

where \({\rm d}\tilde{A}/{\rm d}x\) is taken to mean the application of \({\rm d}/{\rm d}x\) to the function generating the rows of the matrix \(\tilde{A}.\) The \(a_{jk}\) elements of A involve the function (\(y = f(x)\)) sampled at points \(x_j \in [0,1],\) i.e., at \(f(x_j) = y_j,\) a function that is assumed to be piecewise constant in \((z_{k-1}, z_k)\) with value \(y_k,\) and where \(z_0 = -\infty,\) \(z_k = (x_{k+1}+x_k)/2\) for \(k = 1, 2, \ldots, n-1,\) and \(z_n = \infty.\) In the case when \(\phi\) is the Bernstein function corresponding to the Gaussian distribution or error function (erf), the elements of A are:

$$ a_{jk} = \frac{1}{2} \left[ \hbox{erf} \left( \frac{z_{k} - x_j}{\sqrt{\sigma(x) n}} \right) + \hbox{erf} \left( \frac{x_j - z_{k-1}} {\sqrt{\sigma(x) n}} \right) \right] . $$
(4)

and this choice for \(\phi\) ensures that the recovered function remains differentiable. Note that the matrices depend only on \(\sigma,\) \(\phi,\) and the locations of \(u_{\rm d}\) and \(u_{\rm o}.\) Interpolation occurs when \(\sigma_1 = \sigma_2:\) locations where \(u_{\rm d}\) and \(u_{\rm o}\) coincide then take exactly the same value of the recovered function (Howard et al. 2009).
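As a concrete illustration, the two-step recovery of Eqs. 1 and 2 with the erf-based elements of Eq. 4 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: constant \(\sigma,\) the bin edges \(z_k\) defined above, and toy data; the function name `si_matrix` and the example values are illustrative, not from the original.

```python
import numpy as np
from math import erf, sqrt, inf

def si_matrix(x_rows, x_cols, sigma):
    # Rows are generated by phi evaluated at x_rows; column k corresponds to
    # the bin (z_{k-1}, z_k) around data location x_cols[k] (z_0=-inf, z_n=+inf).
    n = len(x_cols)
    z = [-inf] + [(x_cols[k + 1] + x_cols[k]) / 2 for k in range(n - 1)] + [inf]
    s = sqrt(sigma * n)
    return np.array([[0.5 * (erf((z[k + 1] - xj) / s) + erf((xj - z[k]) / s))
                      for k in range(n)] for xj in x_rows])

x_d = [0.0, 0.3, 0.6, 1.0]                           # toy input locations
u_d = np.sin(np.pi * np.array(x_d))                  # toy input values
p = np.linalg.solve(si_matrix(x_d, x_d, 0.01), u_d)  # de-convolution (Eq. 1)
x_o = np.linspace(0.0, 1.0, 11)                      # output locations, m != n
u_o = si_matrix(x_o, x_d, 0.01) @ p                  # convolution (Eq. 2)
```

Because \(\sigma_1 = \sigma_2\) here, this is interpolation: at output locations that coincide with input locations, \(u_{\rm o}\) reproduces \(u_{\rm d}.\)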

2 The optimization of SI with genetic programming

Genetic programming (GP) (Koza 1992) can concurrently optimize the different components of the SI framework, \(u_{\rm d},\) \(\phi\) and \(\sigma,\) with the objective of discovering a function that satisfies a user requirement. Generally, SI optimization calls for a GP scheme that can self-determine both the size and the values of the SI components, e.g., reverse engineering the size of \(u_{\rm d},\) the values of its elements, and their distribution so as to discover a function with the required properties.

The GP scheme introduced in Howard and Roberts (2001) is described here and proposed for SI component optimization. It discovers a computer program (the GP tree). When evaluated, the GP tree operates on a linear memory using storage functions, arithmetic functions, and the movement of I/O pointers. The evaluation produces a variable-length vector of real numbers. Its coefficients are interpreted according to a user-supplied grammar. In this paper, for example, these coefficients give the position and value of the elements of \(u_{\rm d},\) while the length of the vector (the number of coefficients) gives the size of \(u_{\rm d}.\) That is, GP evolves candidates which, when executed, produce a variable-length vector of coefficients; a grammar then interprets the coefficients to determine how good the solution is (the Darwinian fitness of the GP tree). Hence, the functions and terminal nodes of the GP trees do not change between problems. Each problem has its own grammar that interprets the vector of numerical coefficients arising from GP tree evaluation.

Functions and terminals of this GP scheme are defined in Tables 1 and 2. The GP tree generates the required variable-length vector as it is being evaluated, combining ephemeral constants to produce very accurate coefficients. During evaluation, functions in the GP tree manipulate a vector of coefficients which is held in global memory. Functions also manipulate L and C, two global pointers to positions in the vector of coefficients. Pointer L stands for the “last index” or tail position, and pointer C stands for the “current” position.

Table 1 GP terminals, functions and variables
Table 2 GP run parameters

Prior to the evaluation of the GP tree, L and C are both set to zero. ADD, BACK and WRITE are functions of two arguments. Each returns one of its arguments, e.g., ADD returns its second, BACK its first and WRITE its second argument; the choice is arbitrary. Function ADD writes its first argument to the vector element pointed to by L. It increments L provided \(L < L_{\rm MAX}\) and enforces \(C = L.\) Function BACK decrements pointer C provided \(C > 0.\) Function WRITE overwrites the vector element at C with its first argument. Provided \(C < L_{\rm MAX}\) it also increments this pointer, and when \(C > L\) it increments pointer L as well.

The function set is enhanced with two memories \(m_1\) and \(m_2\) manipulated by functions, again of two arguments. Functions \(Wm_I\) return their first argument and write their second argument to memory location \(m_I.\) Functions \(Rm_I\) simply return the value of memory \(m_I\) and ignore both arguments. An ephemeral random constant \(S_C\) is stored as one byte and can represent up to 256 values. These are equally spaced in the range [0,1], obtained by dividing the numbers 0 to 255 by 255.
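The pointer semantics above can be made concrete with a small sketch. This is an assumed reading of ADD, BACK and WRITE; the class name and the \(L_{\rm MAX}\) value are illustrative, and the exact guard conditions at the array bounds are an interpretation of the text.

```python
L_MAX = 100  # illustrative capacity of the coefficient vector

class VectorMemory:
    def __init__(self):
        self.vec = [0.0] * L_MAX
        self.L = 0   # "last index" / tail position
        self.C = 0   # "current" position

    def ADD(self, a, b):
        self.vec[self.L] = a          # write first argument at L
        if self.L < L_MAX - 1:        # increment L while capacity remains
            self.L += 1
        self.C = self.L               # enforce C = L
        return b                      # ADD returns its second argument

    def BACK(self, a, b):
        if self.C > 0:                # decrement C provided C > 0
            self.C -= 1
        return a                      # BACK returns its first argument

    def WRITE(self, a, b):
        self.vec[self.C] = a          # overwrite the element at C
        if self.C < L_MAX - 1:        # increment C while capacity remains
            self.C += 1
            if self.C > self.L:      # when C > L, increment L too
                self.L = self.C
        return b                      # WRITE returns its second argument
```

For example, ADD, ADD, BACK, WRITE appends two coefficients and then overwrites the second one in place.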

This version of GP was successful in previous research at evolving a monomial that satisfied the convection–diffusion equation (Howard and Roberts 2001). In this paper, however, we use this GP method to discover the \(u_{\rm d}\) in SI that satisfies the convection–diffusion equation. Satisfaction of this objective is measured using the derivatives of the recovered function. The test is for the satisfaction of the differential equation at a number of points, i.e., at \(u_{\rm o}\) where the derivatives are computed with SI. Although the computational results presented in this paper manipulate \(u_{\rm d}\) only, the same GP scheme could be used to optimize (select) \(\phi\) and \(\sigma\) concurrently also.

3 Numerical solution of convection–diffusion equations

For the purposes of illustrating how GP can be applied to discover values for a variable number of the components of SI, this paper applies the scheme in Howard and Roberts (2001) to reverse engineer an SI function that solves a convection–diffusion equation problem. Although a new solution method for the convection–diffusion equations is an interesting and worthwhile endeavor and topic of research (for the reasons given in this section), the methods presented in this paper are incipient and require further research or modification to become competitive. The presentation is therefore principally motivated as an illustrative problem: it shows how GP has the power to discover usable values and choices for the components of SI. SI optimization also has potential for security applications in image analysis, e.g., to usefully zoom up an image, to achieve novel image registration by peak sharpening, or to extrapolate or interpolate data.

The numerical solution of convection–diffusion differential equations is of immense importance to science and engineering (Morton 1996). The steady-state 1D homogeneous version of the convection–diffusion equation becomes increasingly challenging to solve at higher levels of convection over diffusion, i.e., at higher Peclet number (P). This equation, with temperature T as the unknown, is:

$$ \frac{{\rm d}^2 T} {{\rm d}x^2} - P \frac{{\rm d}T}{{\rm d}x} = 0,\quad \hbox{with}\; T(0) = 1, \quad \hbox{and}\quad T(1) = 0. $$
(5)
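Equation 5 has the closed-form solution \(T(x) = (e^P - e^{Px})/(e^P - 1),\) whose boundary layer near \(x = 1\) sharpens as P grows. A quick check in Python (illustrative code, not part of the method) confirms the boundary values and that a central-difference residual of Eq. 5 vanishes at interior points:

```python
from math import exp

def T_exact(x, P):
    # Closed-form solution of d2T/dx2 - P dT/dx = 0 with T(0)=1, T(1)=0
    return (exp(P) - exp(P * x)) / (exp(P) - 1.0)

def residual(x, P, h=1e-4):
    # Central-difference approximation of d2T/dx2 - P dT/dx at x
    d1 = (T_exact(x + h, P) - T_exact(x - h, P)) / (2.0 * h)
    d2 = (T_exact(x + h, P) - 2.0 * T_exact(x, P) + T_exact(x - h, P)) / h**2
    return d2 - P * d1
```

At, say, P = 10 the residual is zero to finite-difference accuracy at every interior point.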

Differential equations are solved numerically by means of weighted residuals methods (WRMs) such as finite elements (FE) or finite differences (FD). In their standard setting of Galerkin weighting and central differences, respectively, they cannot generally solve these types of equations at high P (Axelsson and Barker 1984). Instead, they require special settings such as the Petrov–Galerkin weighting for FE (the Hemker function, see p.90 of Hemker 1977 or p. 163 of Morton 1996) or upwind differencing for FD.

The aforementioned special settings of WRMs are not easy to establish in the multi-dimensional (2D, 3D) setting, and it is not at all clear how to establish them for their non-linear relatives: the Navier–Stokes, heat transport, and combustion partial differential equations. Usually, a solution is attempted by tensor product of the special settings into multiple dimensions and by other directional schemes. Such schemes are generally inaccurate or can lead to misleading results.

Numerical solution of such systems is therefore not reliable for prediction, but rather such numerical solutions are calibrated to match real experiments and used for interpolating the results of experiments. For the general problem geometry and boundary conditions, the correct Petrov–Galerkin test space and correct upwind scheme are hard to establish and generally are unknown (see p. 212 of Morton 1996 and Benson 1991). Moreover, such ad-hoc approaches can lead to deceptively smooth solutions (Gresho and Lee 1979). This problem of WRM is a contributing factor that, generally speaking, makes computational fluid dynamics with WRM an interpolative rather than a predictive endeavor. Least squares FE methods circumvent some of these issues but squaring the differential equations gives rise to other numerical challenges.

For this reason, nascent and alternative methods of solving non-self-adjoint differential equations such as the convection–diffusion equation are worthy of investigation. There is no reason why the scheme that will be presented in Sect. 4 cannot be extended to tackle partial differential equations. For example, Dirichlet boundary conditions may be imposed as values of the input vector \(u_{\rm d}\) at their locations in the 2D domain of solution, and the satisfaction of the differential equation tested for anywhere in the domain, at arbitrary and expedient locations of \(u_{\rm o}.\)

4 Evolving SI

A previous attempt to evolve the analytical solution to these equations (Howard and Roberts 2001) developed the GP scheme of Sect. 2. The GP-generated variable-length vector of real-valued coefficients defined the constants of a monomial. This monomial was the candidate analytical solution to the convection–diffusion equation. Each coefficient, \(a_i,\) corresponded to a monomial coefficient in \(a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots.\) The residual (Eq. 5) was then squared and integrated over the domain [0,1]. The negative of this quantity was the Darwinian fitness to be maximized, i.e., driven to zero. Note that in this scheme the derivatives need not be sampled at points in the domain [0,1], because the integral over [0,1] can be obtained directly from the coefficients \(a_i.\) In spite of this advantage and its elegance, the scheme cannot easily be moved to multiple dimensions, as it is then very difficult to incorporate the boundary conditions.

The evolution of the function that is the solution to the convection–diffusion equation must satisfy two objectives: match the values at the boundary conditions and minimize the residual of the differential equation in the domain. Note that the search space involving simultaneous satisfaction of both is very challenging, particularly at significant P. The scheme in Howard and Roberts (2001) accounts for the Dirichlet boundary conditions (\(T=1.0\) at \(x=0\) and \(T=0\) at \(x=1.0\)) a priori by evolving the arbitrary-length monomial \(p = a_0 + a_1 x + a_2 x^2 +\cdots+ a_n x^n\) that is substituted into the expression \(T = x(1-x) p + 1 - x.\) This expression guarantees satisfaction of these boundary conditions whatever the value of p may be. This means that evolution need only minimize the residual of the differential equation. In Howard and Roberts (2001), the negative of the analytical integral of the square of the residual is used as the Darwinian fitness function. The principal drawback of the scheme in Howard and Roberts (2001) is the complicated treatment of boundary conditions, which cannot be easily extended to 2D and 3D because analogous expressions such as \(T = x(1-x) p + 1 - x\) are not always easy to find or available.
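The guarantee is easy to verify: whatever monomial p evolution produces, the substitution forces the boundary values. A tiny check with an arbitrary illustrative p:

```python
def T(x, p):
    # T = x(1-x) p + 1 - x satisfies T(0)=1 and T(1)=0 for any p
    return x * (1.0 - x) * p(x) + 1.0 - x

p = lambda x: 3.0 + 2.0 * x - 5.0 * x**2   # arbitrary illustrative monomial
```

At x = 0 the product term vanishes and T = 1; at x = 1 both the product term and 1 − x vanish and T = 0.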

Dirichlet boundary conditions with the SI framework, however, are straightforward because the end points are part of the input data \(u_{\rm d}.\) The evolutionary algorithm using SI can reverse engineer an SI function that satisfies the differential equation. This means evolving the points in \(u_{\rm d},\) but there is no reason why the end points (Dirichlet boundary conditions) cannot be set as constants of \(u_{\rm d}\) and only the internal points discovered by evolution. Therefore, it was considered worthy of investigation to evolve the SI function recovery by means of the exact same GP scheme as in Howard and Roberts (2001). Instead of evolving an arbitrary number of monomial coefficients, the scheme evolves the location and numerical value of an arbitrary number of points that define \(u_{\rm d},\) i.e., the function that it seeks: a function that satisfies the differential equation. Thus, the GP-evolved coefficients \([a_0, a_1, a_2,\ldots]\) now determine the position–value pairs of the internal points of the input data \(u_{\rm d}.\) The number of evolved coefficients also determines the size (dimension) of \(u_{\rm d}.\)

There exist a number of choices for the Darwinian fitness function, discussed below; the resulting GP–SI scheme processes a candidate solution by the following steps.

4.1 Processing of a GP candidate solution

1. Evaluate the candidate (GP tree); this results in a variable-length vector of real-numbered coefficients; if this vector’s length is less than \(sz,\) e.g., \(sz < 7,\) return a poor fitness, e.g., \(-10^{10},\) otherwise:

2. interpret the vector to obtain \(u_{\rm d}\) (non-uniform locations and associated values), i.e., n and \(\{(x_i, T_i)\},\; i = 1, \ldots, n;\)

3. convert the data coordinates to lie on the unit interval;

4. construct \(A_{nn}\) using a Bernstein function for \(\phi\) (Eq. 4);

5. compute the de-convolution matrix, \(A^{-1}_{nn};\)

6. construct the augmented matrix \(\tilde{A}_{mn},\) \(m > n,\) using the same generator of the row space \(\phi\) (Eq. 4);

7. evaluate \(\tilde{A}_{mn} A^{-1}_{nn} {{\mathbf{T}}}\) to obtain the m output data values \(u_{\rm o},\) i.e., \(\{ T_j\},\; j = 1, \ldots, m;\)

8. convert the output to the world coordinate system (solution to the convection–diffusion equation);

9. compute the Darwinian fitness f from the derivatives of T, which are available with SI as in Eq. 3: at each point j of \(u_{\rm o}\) compute the residual of the differential equation, \(r_j = {\rm d}^2 T_j / {\rm d}x^2 - P ({{\rm d}T_j}/{{\rm d}x}),\) and take the negative of some measure of this over the domain, e.g., \(f = - \parallel {{\mathbf{r}}} \parallel_2\) (the GP fitness is the negative of this least-squares type measure of the residual of the differential equation at the m chosen output locations of \(u_{\rm o}\)). GP evolves candidates to drive the error to zero by maximizing this fitness measure.
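The fitness computation of step 9 can be sketched as follows (illustrative Python; the function name and the exact-solution check are not from the original). Feeding it the derivatives of the closed-form solution of Eq. 5 gives a fitness of zero, the value evolution works toward:

```python
import numpy as np

def fitness(d1T, d2T, P):
    # r_j = d2T_j/dx2 - P dT_j/dx at each output point j; f = -||r||_2
    r = d2T - P * d1T
    return -np.linalg.norm(r)

P = 4.0
x = np.linspace(0.0, 1.0, 9)                # illustrative output locations
c = np.exp(P * x) / (np.exp(P) - 1.0)
d1T, d2T = -P * c, -P**2 * c                # derivatives of the exact solution
f_exact = fitness(d1T, d2T, P)              # ~0: the residual vanishes
```

Any imperfect candidate yields a strictly negative fitness, so maximizing f drives the residual toward zero.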

4.2 Vector interpretation: high Peclet numbers

The objective of the evolutionary method is to reverse engineer \(u_{\rm d},\) the dataset that defines the function that is the solution to the convection–diffusion equation. Different ways of interpreting the real-numbered variable-length vector of coefficients can be established to obtain \(u_{\rm d}\) (step 2 above). Two variants were implemented: one for low Peclet number cases (\(P \leq 5\)) and another for higher Peclet number cases \((P > 5).\) In both cases the length of the vector determines the size of the \(u_{\rm d}\) dataset, i.e., the number of input data points in \(u_{\rm d}.\)

The objective of GP was to discover an arbitrary number n of data points in \(u_{\rm d},\) \((x_i, T_i),\; i = 0, \ldots, n-1,\) inside the domain (excluding the end point boundary conditions). This involves discovery of both the \(x_i\) position and the \(T_i\) value at each data point.

Upon numerical experimentation at high Peclet numbers, it became apparent that the technique as it stood could not solve these. It is necessary to carry out further research to develop some approach for the high Peclet number cases.

The rest of this section explores the use of buffer zones outside of the solution region. These are neither elegant nor convincing for practical application. However, they help to obtain solutions at higher Peclet numbers, as demonstrated in this section. By this means, the domain of solution was mapped not to [0,1] but to [0.2,0.8]. Values of \(u_{\rm d}\) at \(x = 0.2\) and \(x = 0.8\) were set to \(T = 1.0\) and \(T = 0.0\) respectively (the boundary conditions). The locations \(x = 0.0\) and \(x = 1.0\) were also taken as fixed, but the interpretation of vector coefficients determined their T values.

The length of the vector indicated the size of \(u_{\rm d}\) with the remaining coefficients setting a variable number of \(u_{\rm d}\) data locations and T values: \((x_i, T_i)\) within the domain [0,1].

Care was taken to avoid GP indicating two input data points at the same location, at the same value of x. This problem was addressed by following these steps in the interpretation of the vector of coefficients:

1. If the size of the vector is odd then discard the last coefficient (see Fig. 1a);

2. the first coefficient gives the value of T at \(x = 0.0\) (see Fig. 1a);

3. the second coefficient gives the value of T at \(x = 1.0\) (see Fig. 1a);

4. discretize the domain [0,1] into 101 uniformly distributed locations, \(\Delta x = x_{i+1} - x_{i} = 0.01;\)

5. each remaining pair of coefficients represents a point in \(u_{\rm d},\) i.e., \((x_i, T_i)\) (see Fig. 1b).

Fig. 1

Interpretation of the vector that is produced by evaluation of the genetic programming candidate. Top the number of terms of the vector is made even by dropping the last one if the number of terms is odd; the first two vector numbers are the \(u_{\rm d}\) values at the end points of the domain. Middle other elements of \(u_{\rm d}\) are obtained by taking a pair of numbers at a time: the first term indicating a location and the second its corresponding function value. Bottom the locations are matched to discrete bins and repeated values are moved to the next available slot (in this illustration grey balls are moved to black balls)

Define \(b_j\) as the set of first coefficients in each pair j. Now convert each \(b_j\) to a percentage and match it to the closest discrete location as follows:

1. Sum the absolute value of every first coefficient of each pair: \(cs = \sum_j | b_j | .\)

2. Compute the gap percentage \(g_j\) for each \(x_i\) location as follows: \(g_j = | b_j | / cs.\)

3. Excluding locations 0.0, 0.2, 0.8 and 1.0, match each percentage to the closest allowable discrete location in [0,1] (see Fig. 1c).

4. Repeats are not allowed, so move them to the next allowed location (see Fig. 1c).
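
The interpretation steps above can be sketched as follows. This is a hedged reading: whether the gap percentages \(g_j\) are used directly as locations or accumulated is not fully specified here, and this sketch uses them directly; all names are illustrative.

```python
def interpret(coeffs, dx=0.01, reserved=(0.0, 0.2, 0.8, 1.0)):
    if len(coeffs) % 2 == 1:
        coeffs = coeffs[:-1]                 # drop the odd trailing coefficient
    T0, T1 = coeffs[0], coeffs[1]            # T values at x=0.0 and x=1.0
    pairs = list(zip(coeffs[2::2], coeffs[3::2]))
    cs = sum(abs(b) for b, _ in pairs) or 1.0
    taken = {round(r / dx) for r in reserved}  # fixed locations are excluded
    points = []
    for b, T in pairs:
        i = round((abs(b) / cs) / dx)        # match g_j to the nearest bin
        while i in taken:                    # repeats move to the next slot
            i += 1
        taken.add(i)
        points.append((i * dx, T))
    return T0, T1, sorted(points)
```

For example, two pairs with equal first coefficients map to the same bin, and the collision rule shifts the second pair one slot to the right.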

The decision to extend the function outside of the domain is motivated by allowing the SI interpolation to assume any value at the end points (at \(x=0.0\) and \(x=1.0\)), giving it more freedom to manoeuvre. However, this shift in the domain now requires scaling of Eq. 5 to account for the mapping from the [0.2,0.8] domain to the [0,1] domain:

$$ 0.6 \frac{{\rm d}^2 T}{{\rm d}x^2} - P \frac{{\rm d}T}{{\rm d}x} = 0\quad \hbox{ with}\; T(0.2) = 1,\quad \hbox{and}\quad T(0.8) = 0. $$
(6)
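The scaling factor 0.6 follows from the chain rule. Writing \(X \in [0,1]\) for the original coordinate of Eq. 5 and \(x = 0.2 + 0.6X\) for the mapped coordinate,

$$ \frac{{\rm d}T}{{\rm d}X} = 0.6\, \frac{{\rm d}T}{{\rm d}x}, \qquad \frac{{\rm d}^2 T}{{\rm d}X^2} = 0.36\, \frac{{\rm d}^2 T}{{\rm d}x^2}, $$

so Eq. 5 becomes \(0.36\, {\rm d}^2 T/{\rm d}x^2 - 0.6 P\, {\rm d}T/{\rm d}x = 0,\) and division by 0.6 yields Eq. 6.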

The aforementioned procedure is one of many alternative ways to interpret the vector of coefficients that results from the evaluation of the GP parse tree.

4.3 Darwinian fitness measure

The fitness is calculated at \(u_{\rm o}\) points that sample the function. Quite arbitrarily, a set of \(m = 241\) points was probabilistically varied using Latin sampling so that these points would be proportionately distributed in the domain [0.2,0.8].

Choices exist as to how to compute the fitness measure f from the residual \( r_j = | 0.6 ({\rm d}^2 T_j / {\rm d}x^2) - P ({{\rm d}T_j}/{{\rm d}x}) |\) at each point j:

1. \(f = -\sqrt {\sum_j r_j^2/241}.\)

2. Generate random locations by dividing [0.2,0.8] into 60 regions, each with four uniform gaps involving five output points. Integrate to obtain \(f = - \int_0^1 {r^2} {\rm d}x\) using the five-point Boole’s rule (Boole and Moulton 1960).

3. In principle, it is possible to perform an analytical integration. This was not attempted, however, as it is an arduous task.
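
The composite quadrature of option 2 can be sketched as follows (illustrative Python; the function names are assumptions). Boole’s five-point rule integrates each panel of four uniform gaps and is exact for polynomials up to degree five:

```python
def boole_panel(f, a, b):
    # Five-point Boole rule on [a, b]: (2h/45)(7f0 + 32f1 + 12f2 + 32f3 + 7f4)
    h = (b - a) / 4.0
    y = [f(a + i * h) for i in range(5)]
    return (2.0 * h / 45.0) * (7*y[0] + 32*y[1] + 12*y[2] + 32*y[3] + 7*y[4])

def boole_composite(f, a, b, regions=60):
    # One panel per region, as in the 60 regions of [0.2, 0.8] above
    w = (b - a) / regions
    return sum(boole_panel(f, a + i * w, a + (i + 1) * w) for i in range(regions))
```

Applied to the squared residual \(r^2\) over the domain, the negative of this integral plays the role of f.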

The derivatives can be obtained directly (Eq. 3) or computed by finite differences over \(u_{\rm o}.\)

4.4 Low Peclet numbers

The procedure for the low Peclet number cases is identical to the one just described except that the variable-length vector defines the values in \((x_i, T_i),\; i = 0, \ldots, n-1,\) inside the domain [0,1], and the two boundary conditions are additional points: \(T=1.0\) at \(x=0.0\) and \(T=0.0\) at \(x=1.0.\) Moreover, Eq. 5 replaces Eq. 6 as there is no need for scaling.

5 Numerical experiments

The results in this section are for the purpose of discussion and do not constitute a demonstration of any method capable of solving the convection–diffusion equations. However, they demonstrate how GP, a soft computing method, is able to discover a non-uniformly distributed data set \(u_{\rm d}\) of very precise real-valued coefficients that can numerically approximate the solution of a very challenging problem.

5.1 Low Peclet numbers

At low Peclet number the results are easy to obtain and accurate, but they become progressively more difficult to obtain as the Peclet number increases. For example, Fig. 2 is a typical result for a low Peclet number: \(P = 1.0.\) The figure shows the output data \(u_{\rm o}\) obtained at 200 random locations. The agreement between the evolved and analytical solution of the differential equation is very good, and is achieved simply by a function that is defined at the end point boundary conditions plus two other non-uniformly distributed internal points in \(u_{\rm d}\) (circled in the top right of the figure). Useful results are obtained with a value of \(\sigma = 1.0\) (Eqs. 1, 2).

Fig. 2

T versus x at low values of the Peclet number \((P = 1).\) GP easily recovers the function that satisfies the differential equation and the boundary conditions. The function at \(u_{\rm o}\) is essentially identical to the exact analytical solution when plotted. Top right GP evolved two points for \(u_{\rm d};\) these are circled

An example of a result at \(P=4\) is given in Fig. 3. In this particular experiment the domain was not formally mapped to (0.2,0.8); instead, the boundary conditions were brought inside the domain to the end points \(x=0.2\) and \(x=0.8,\) i.e., the boundary conditions were not chosen as \(T = 1.0\) and \(T = 0.0\) but as the analytical solution values \(T=0.977136\) at \(x=0.2\) and \(T=0.560953\) at \(x=0.8\) respectively. It is of course a toy example for illustration purposes. The result was achieved after sixteen generations of steady-state GP with a solution error of 0.494. For \(u_{\rm d},\) GP evolved three internal points with their corresponding T values: (0.3, 0.960784), (0.41, 0.929412), (0.73, 0.678431). Of interest is the behaviour of the error: the absolute value of \(\frac{{\rm d}^2 T}{{\rm d}x^2} - P \frac{{\rm d}T}{{\rm d}x}\) at each output point of \(u_{\rm o}.\) Figure 3 compares the derivatives recovered from SI as computed by Eq. 3 to the analytical derivatives. As expected, the error is more pronounced for the second derivative than for the first. The graph of pointwise absolute error shows how the error becomes smaller at the three evolved \(u_{\rm d}\) internal points and comparatively greater at the end points of the computational domain.

Fig. 3

Example of results at a higher Peclet number \((P = 4).\) Left T versus x, comparison of the recovered function \(u_{\rm o}\) to the analytical solution of the differential equation. Right spatial derivatives versus x, comparison of SI-computed derivatives to analytical solution derivatives, i.e., first derivative \(\frac{{\rm d}T} {{\rm d}x}\) and second derivative \(\frac{{\rm d}^2 T} {{\rm d}x^2}.\) Also shown is the distribution of the absolute value of the error of solution of the differential equation

5.2 Discussion on function derivatives

Derivatives of the function can also be computed locally and discretely by taking the difference in T values between the two closest points, i.e., by finite differences (FD). However, this approach was found to be less satisfactory than the computation of derivatives by means of SI. This is because, when obtained from equations such as Eq. 3, the derivatives involve every point in the SI approximation stencil and are consequently smoother than the very localized FD-computed derivatives.
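For reference, the local FD alternative can be sketched as follows (illustrative Python for a possibly non-uniform grid; this is not the SI route of Eq. 3). Each interior derivative uses only the two closest neighbours, which is what makes the result noisier than the all-points SI stencil:

```python
def fd_derivatives(x, T):
    # Three-point finite differences on a non-uniform grid (interior points only)
    n = len(x)
    d1, d2 = [0.0] * n, [0.0] * n
    for i in range(1, n - 1):
        hl, hr = x[i] - x[i-1], x[i+1] - x[i]
        d1[i] = (T[i+1] - T[i-1]) / (hl + hr)
        d2[i] = 2.0 * (hl*T[i+1] - (hl + hr)*T[i] + hr*T[i-1]) / (hl*hr*(hl + hr))
    return d1, d2
```

The second-derivative stencil is exact for quadratics and the first-derivative stencil for linear data, but both amplify any point-to-point noise in T.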

Figure 4 illustrates a typical candidate solution during the GP run. Apparent “clouds of dust” of points in the FD-computed second derivatives are marginally detrimental to the computation of the error that drives fitness. This effect is owing to the lack of smoothness in the FD computation of derivatives.

Fig. 4

Illustrating derivatives versus x computed using the finite difference measure. Top absolute pointwise error. Bottom second derivative. Note that the derivatives are not always smooth and contribute to error

5.3 Higher Peclet numbers

Computations at higher Peclet numbers are very challenging. Figure 5 gives a result for \(P=100.\) GP expends considerable effort to get the function to turn appropriately by manipulating the interpolation points outside of the domain [0.2,0.8], i.e., in the regions (0.0,0.2) and (0.8,1.0), where it evolves very large values for \(u_{\rm d}\) to encourage the function to turn appropriately inside [0.2,0.8]. This result required a population of 10,000 and running at very low values of \(\sigma,\) i.e., \(\sigma = 0.01\) (Eqs. 1, 2), to encourage the recovered function to achieve rapid turns. Although the error remained quite high (L2 norm of 66.0, using 241 sampling points for \(u_{\rm o}\)), the resulting function displayed the correct shape for both the solution and its derivatives (see Fig. 5). Note the stair-casing effect in the solution owing to the very low value of \(\sigma.\)

Fig. 5

GP results at \(P = 100\) (dark line) compared to the analytical solution (light grey line). Clockwise from top left T versus x, solution to the convection–diffusion equation; first derivative; second derivative; absolute pointwise error; evolved and prescribed interpolation points. Note that the first four panels have been re-scaled to the (0,1) range from their original (0.2,0.8) range by the appropriate coordinate transformation

6 Conclusions

Stochastic interpolation, a recently developed mathematical procedure to recover a representative function from input data values \(u_{\rm d}\) distributed along the x axis, has unprecedented flexibility for the tuning of its components; for this purpose this paper applies a GP-based procedure (Howard and Roberts 2001). This was illustrated with reference to a very challenging problem: the reverse engineering of \(u_{\rm d}\) in SI to recover a function that is a solution of the convection–diffusion differential equation. Such an example in itself represents an incipient application of GP and SI to the solution of non-self-adjoint differential equations such as the convection–diffusion equation; however, it would need to be developed further to compete with established approaches (the family of weighted residual methods). This approach has advantages over the polynomial approach presented in Howard and Roberts (2001) because it can more easily implement arbitrary boundary conditions or be extended to multiple dimensions.

The primary contribution of this paper is to show how the parameters in SI can be tuned with GP. Such a demonstration advances the application of SI to the signal processing solution of engineering problems.