Research papers
Prediction of wave ripple characteristics using genetic programming
Introduction
Sufficiently strong water wave propagation over a moveable bed composed of sand grains results in the development of rhythmic bedforms whose crest spacing is of the order of centimeters to meters while heights are of the order of centimeters. These features are often termed vortex ripples because of a recirculation cell that develops on the lee side of the bedform and is subsequently ejected upward during reversals in flow direction. Accurate prediction of vortex ripple size and shape is crucial for successful determination of seabed bottom roughness, a first-order control on wave attenuation (e.g., Ardhuin et al., 2002), as well as on sediment transport as suspended load (e.g., Green and Black, 1999, Bolaños et al., 2012). Furthermore, ripple migration is a fundamental mechanism of bedload transport (e.g., Traykovski et al., 1999, Becker et al., 2007), and parameterizations of bedload flux necessitate an accurate depiction of ripple size and shape.
Many predictors of equilibrium ripple geometry have been developed from field and laboratory datasets (e.g., Clifton, 1976, Nielsen, 1981, Grant and Madsen, 1982, Wiberg and Harris, 1994, Faraci and Foti, 2002, Styles and Glenn, 2002, Grasmeijer and Kleinhans, 2004, et al., Soulsby et al., 2012, Pedocchi and García, 2009a, Camenen, 2009). Equilibrium ripple size and shape are frequently broken down into three subpopulations, a convention developed by Clifton (1976), and reviewed here in order of increasing hydrodynamic forcing. Orbital ripples are believed to scale linearly with wave orbital diameter at the seabed and display the largest steepness (ripple height/wavelength ≈ 0.15). Suborbital ripples show spacing that depends on both wave orbital diameter and grain size. In even stronger hydrodynamic conditions anorbital ripples form, whose size is related to grain size alone and whose scaling is independent of wave orbital diameter. Suborbital ripples link the population of anorbital ripples with that of orbital ripples.
As noted by Smith and Wiberg (2006), recent field and laboratory work has challenged the existing typology for wave-generated ripples as a result of the addition of two new populations (Fig. 1). The first are ripples measured in fine sand under strong hydrodynamic conditions. Field and laboratory campaigns in more energetic conditions have discovered the presence of long wavelength, low amplitude ripples (‘hummocks’) in fine sands that scale with orbital diameter (e.g., Hanes et al., 2001, O’Donoghue et al., 2006). Predictors are unable to accurately capture this ripple size and shape (e.g., Bolaños et al., 2012), yet modeling (Chang and Hanes, 2004) and observation (Green and Black, 1999, Cummings et al., 2009) of these bedforms show that they eject vortices and are therefore important for their influence on seabed roughness and sediment transport. Furthermore at times these long wavelength ripples have superimposed anorbital ripples (e.g., Southard et al., 1990, Hanes et al., 2001, Williams et al., 2004), another unsolved problem in wave ripple prediction. Because of these complications, Pedocchi and García (2009a), who developed a recent well performing predictor, omit long wavelength ripples from their analysis, but note that these long wavelength ‘round crested’ ripples are observed above a critical threshold in U/ws (where U is the maximum orbital velocity at the bed and ws is the sediment fall velocity). Dumas et al. (2005) and Cummings et al. (2009) also show that the transition from anorbital scale ripples to round crested long wave orbital scale ripples is a function of orbital velocity (a set value for their given sediment mixtures).
The second new population of ripples are those found in medium to coarse sand (Traykovski et al., 1999, Ardhuin et al., 2002, Becker et al., 2007, Masselink et al., 2007, Traykovski, 2007, Cummings et al., 2009, Yamaguchi and Sekiguchi, 2011). Coarse grained ripples have been observed in shelf environments for several decades (e.g., Forbes and Boyd, 1987, Leckie, 1988 and references therein) but until recently ripple measurements have not been coupled to the hydrodynamic parameters of their formation. Recent lab work by Cummings et al. (2009) demonstrated the persistence of steep ripples with orbital scaling in coarse sand under strong hydrodynamic conditions.
These two new populations of ripples highlight a perennial problem with empirical predictors: unless equations are built using large, integrated data sets that encompass many conditions, prediction schemes are difficult to translate to different settings. A non-empirical approach, such as models based on first principles (e.g., Foti and Blondeaux, 1995, Blondeaux, 2001, Charru and Hinch, 2006), presents different problems: nonlinear, emergent processes that occur at the ripple scale, such as flow separation, vortex ejection, turbulence, sediment suspension, pattern coarsening, and defect creation, migration and annihilation (Werner and Kocurek, 1999), together with the existence of multiple stable configurations in ripple sizes/shapes at a given hydrodynamic condition (a stability balloon; Hansen et al., 2001), limit the usefulness of finite-amplitude predictions. Prediction by numerical models of coupled fluid flow and bed evolution presents promising results but has so far been tested under a narrow range of conditions and compared to few data sets (Marieu et al., 2008, Chou and Fringer, 2010).
If empirical data-driven predictors are currently the most broadly applicable tools to develop field scale predictions, how should they be built? Traditionally the development of an empirical predictor relies on transforming a single (or several) noisy multidimensional dataset to lower dimensions and fitting a curve (with a set functional form) through the resultant point cloud. Here we offer a different solution: a data integration campaign (the collection of many published datasets) followed by machine learning (ML), whereby computational optimization techniques are used to find solutions to multidimensional and nonlinear problems. The suite of techniques encompassed by ML is essentially identical to the empirical data-driven techniques used previously, except that the trial and optimization of solutions is outsourced to a computer.
The most common ML paradigm used in coastal studies is artificial neural networks (ANN). Recent examples of its use include predictions of alongshore sediment transport in the surfzone (van Maanen et al., 2010), sand bar behavior (Pape et al., 2010) and suspended sediment reference concentration under waves (Oehler et al., 2012). Yan et al. (2008) used an artificial neural network to predict wave ripple geometry (length and height) based on three input parameters (median grain size, wave period, and the maximum near bed wave orbital velocity). Judged by three statistical measures (scatter index, correlation coefficient, and mean geometric deviation), the ANN gives better predictions than four common empirical models (Nielsen, 1981, van Rijn, 1993, Wiberg and Harris, 1994, Grasmeijer and Kleinhans, 2004). Yet the ANN ripple prediction scheme derived by Yan et al. (2008) was developed on, and compared to, a limited dataset. Furthermore, ANNs are problematic because the highly nonlinear result is difficult to interpret and does not offer immediate insight into the physical nature of the problem at hand. Decision or regression trees (e.g., Oehler et al., 2012), another common and well-performing ML technique, are also hampered by the lack of direct physical significance and by other drawbacks such as the lack of smoothness.
In this contribution we use genetic programming (GP; Koza, 1992), a population based optimization technique where the population consists of individual equations (i.e., a population of individual predictors). The mathematical or logical operations that constitute each equation can be modified at every time step via an ‘evolutionary’ process (such as crossover and mutation) to produce expressions that optimize model–data fit. Outputs developed by GP can be smooth functions that are easy to examine and interpret for physical significance. Furthermore, a priori determination of the functional form of the predictor is not required and the final optimized solution can take on any mathematical form (within user defined limits). Thus far genetic programming has been applied to a wide range of problems including the prediction of freshwater phytoplankton dynamics (Whigham and Recknagel, 1999), downscaling of atmospheric model output (Coulibaly, 2004), determining appropriate parameterization for roughness in vegetated flows (Baptist et al., 2007), wave forecasting (Kambekar and Deo, 2012) and mapping of seafloor habitats (Silva and Tseng, 2008).
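The evolutionary loop described above can be illustrated with a minimal sketch. It is not the software or configuration used in this study: the operator set, the input names (`d0`, `D50`), the population settings, and the synthetic data are all assumptions chosen for illustration, and fitness is measured on the training data only.

```python
import math
import random

# Illustrative operator set and input names; assumptions, not taken from the paper.
OPS = {"+": lambda a, b: a + b, "-": lambda a, b: a - b, "*": lambda a, b: a * b}
VARIABLES = ["d0", "D50"]

def random_tree(rng, depth=2):
    """Grow a random expression stored as nested tuples."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.5:
            return ("const", rng.uniform(-1.0, 1.0))
        return ("var", rng.choice(VARIABLES))
    op = rng.choice(sorted(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, row):
    """Evaluate an expression tree for one observation (a dict of inputs)."""
    kind = tree[0]
    if kind == "const":
        return tree[1]
    if kind == "var":
        return row[tree[1]]
    return OPS[kind](evaluate(tree[1], row), evaluate(tree[2], row))

def mse(tree, data):
    """Mean squared error; non-finite results are treated as infinitely bad."""
    total = 0.0
    for row in data:
        diff = evaluate(tree, row) - row["wavelength"]
        total += diff * diff
    err = total / len(data)
    return err if math.isfinite(err) else float("inf")

def depth(tree):
    """Number of operator levels in the tree."""
    return 0 if tree[0] not in OPS else 1 + max(depth(tree[1]), depth(tree[2]))

def random_subtree(tree, rng):
    """Walk down the tree at random and return the node where the walk stops."""
    while tree[0] in OPS and rng.random() < 0.5:
        tree = rng.choice(tree[1:])
    return tree

def replace_random(tree, new, rng):
    """Return a copy of tree with one randomly chosen node replaced by new."""
    if tree[0] not in OPS or rng.random() < 0.3:
        return new
    if rng.random() < 0.5:
        return (tree[0], replace_random(tree[1], new, rng), tree[2])
    return (tree[0], tree[1], replace_random(tree[2], new, rng))

def evolve(data, generations=30, pop_size=60, max_depth=5, seed=0):
    """Evolve equations by selection, crossover, and mutation."""
    rng = random.Random(seed)
    pop = [random_tree(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda t: mse(t, data))
        history.append(mse(pop[0], data))
        survivors = pop[: pop_size // 4]  # the best quarter persists unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = rng.choice(survivors), rng.choice(survivors)
            child = replace_random(a, random_subtree(b, rng), rng)  # crossover
            if rng.random() < 0.3:  # mutation: graft a fresh random subtree
                child = replace_random(child, random_tree(rng, 1), rng)
            children.append(child if depth(child) <= max_depth else a)
        pop = survivors + children
    pop.sort(key=lambda t: mse(t, data))
    history.append(mse(pop[0], data))
    return pop[0], history

# Synthetic data for illustration only: wavelength proportional to d0.
data_rng = random.Random(1)
data = []
for _ in range(40):
    d0 = data_rng.uniform(0.1, 2.0)
    data.append({"d0": d0, "D50": data_rng.uniform(1e-4, 1e-3), "wavelength": 0.6 * d0})

best, history = evolve(data)
print("best expression:", best)
print("training error, first and last generation:", history[0], history[-1])
```

Because the best quarter of each generation is carried over unchanged, the best training error in this sketch can never increase from one generation to the next. The depth cap plays the role of the "user defined limits" on the form of the solution.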
The goal of this study is to demonstrate the applicability of ML techniques (specifically GP) to research questions in the coastal domain. To accomplish this goal we compile 27 different field and laboratory data sets of wave ripple prediction (995 individual measurements; Table 1) that span a broad range of conditions and develop a new wave-ripple predictor that is able to capture the morphology of ripple geometry in a wide range of forcing conditions, including conditions where long wave orbital ripples are present. We put our results in the context of existing formulations and theories, and assess the physical relevance of GP predictors. Our new equilibrium predictor ignores the effect of ripple orientation, time evolution, heterogeneous sediment, superimposed current, ripple asymmetry, and bio-degradation of ripples. We discuss these limitations in Section 5 but note here that other existing time dependent ripple prediction schemes capture one or more (but not all) of these processes (i.e., Soulsby et al., 2012, Traykovski, 2007). Finally, the compilation of published ripple data allows for the identification of gaps in knowledge and observations that should be pursued in future research. Future data collection campaigns can be added to this database, allowing for modifications to the prediction schemes shown below. In this sense the ripple prediction scheme we demonstrate here is dynamic.
Section snippets
Data
As a result of decades of study, many wave ripple datasets are available in the scientific literature. Examples of recent wave ripple data integration and compilations are Soulsby and Whitehouse (2005), Pedocchi and García (2009a) and Camenen (2009). Here we follow the lead of Pedocchi and García (2009a) and limit our data collection to studies using sediment with quartz (or near quartz) densities (2.65 g/cm³) performed in large oscillatory tunnels, large wave flumes, wave racetracks and field
Selection of training, validation, and testing data
The database is split into three subsets to be used as training, validation, and testing. The GP algorithm uses the training dataset to develop and optimize candidate solutions. The validation dataset is used to evaluate the fitness of GP derived solutions and define which predictors persist. Testing data is not used or seen by the GP algorithm and is instead reserved as an independent test of the final predictors (and other published predictors). In the genetic programming literature there
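A three-way partition of this kind can be sketched as follows. The selection procedure actually used in the study is not shown in this excerpt, so a plain random shuffle is swapped in here as a stand-in; the function name `split_indices` and the 50/25/25 fractions are assumptions. Only the total of 995 measurements comes from the text.

```python
import random

def split_indices(n, frac_train=0.5, frac_val=0.25, seed=0):
    """Randomly partition range(n) into training, validation, and testing indices."""
    rng = random.Random(seed)
    indices = list(range(n))
    rng.shuffle(indices)
    n_train = int(n * frac_train)
    n_val = int(n * frac_val)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # never shown to the optimization
    return train, val, test

# 995 is the number of measurements quoted in the text; the fractions are assumed.
train, val, test = split_indices(995)
print(len(train), len(val), len(test))
```

The essential property is that the three subsets are disjoint and together cover the whole database, so that the testing subset stays independent of anything the algorithm has seen.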
Ripple wavelength
The GP algorithm output is shown in Table 2. This experiment evaluated 10^10 formulas to develop the Pareto front shown in Fig. 6. Cliffs (large reductions in error for small changes in equation complexity) occur along the Pareto front at complexities of 3, 6, and 8 (Fig. 6). The first of these cliffs (at complexity 3) is a predictor, λ = 0.607d₀, that mimics the basic form of the orbital scale (i.e., weak hydrodynamics) predictor commonly used today, where ripple wavelength is a linear function of
Predictors derived from genetic programming
The suite of predictors produced as output of the genetic programming shows a trend of increasing predictive skill with increasing complexity. Highly nonlinear predictors have been avoided in this study because they may be fit to the noise or variance present in the training dataset (i.e., they are overfit). Yet the more complex nonlinear predictors can be used as hypotheses for further field and lab studies where grain size effects are a focus.
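The trade-off between error and complexity can be made concrete with a small sketch that extracts a Pareto front from a set of candidate equations. The function name `pareto_front` and the candidate list are hypothetical: the error values below are invented for illustration and are not the values reported in Table 2.

```python
def pareto_front(candidates):
    """Return the candidates that no simpler-or-equal candidate matches or beats.

    Each candidate is a (complexity, error, label) tuple. A candidate stays on
    the front only if its error is strictly lower than that of every candidate
    with equal or lower complexity.
    """
    front = []
    best_error = float("inf")
    for cand in sorted(candidates, key=lambda c: (c[0], c[1])):
        if cand[1] < best_error:
            front.append(cand)
            best_error = cand[1]
    return front

# Invented (complexity, error, label) values, for illustration only.
candidates = [
    (1, 0.90, "constant"),
    (3, 0.50, "linear in d0"),
    (4, 0.55, "dominated"),
    (6, 0.30, "adds grain size"),
    (7, 0.30, "no gain over complexity 6"),
    (8, 0.20, "more nonlinear"),
]

front = pareto_front(candidates)
for complexity, error, label in front:
    print(complexity, error, label)
```

Choosing a predictor then amounts to picking a point on this front: moving to higher complexity always lowers the error on the fitted data, but, as noted above, the most complex expressions risk fitting noise.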
Dependence on orbital scaling and grain size
Conclusion
We develop equilibrium predictors of oscillatory ripple geometry using genetic programming. Ripple length is a weakly nonlinear function of grain size and bottom orbital excursion. Ripple height and steepness are nonlinear functions of grain size and predicted ripple length (i.e., grain size and bottom orbital excursion). Furthermore, these new predictors encompass a wide range of hydrodynamic and sedimentological conditions not previously included in published prediction schemes. However, the
Acknowledgments
We thank Paula Camus for sharing her MDA routine, Malcolm Green for insightful comments at the beginning of this study, and three anonymous reviewers for critical feedback. EBG thanks ‘IH Cantabria’ for funding during his stay, where part of this work was completed. G.C. acknowledges funding from the “Cantabria Campus Internacional, Augusto Gonzalez Linares Program”.
References (97)
- et al.
Comparison of measurements and models of bed stress, bedforms and suspended sediments under combined currents and waves
Coastal Engineering
(2012)
Estimation of the wave-related ripple characteristics and induced bed shear stress
Estuarine, Coastal and Shelf Science
(2009)
- et al.
Analysis of clustering and selection algorithms for the study of multivariate wave climate
Coastal Engineering
(2011)
The distribution of nearshore bedforms and effects on sand suspension on low-energy, micro-tidal beaches in Southwestern Australia
Marine Geology
(2000)
- et al.
Geometry, migration and evolution of small-scale bedforms generated by regular and irregular waves
Coastal Engineering
(2002)
- et al.
Sea ripple formation: the heterogeneous sediment case
Coastal Engineering
(1995)
- et al.
Observed and predicted bed forms and their effect on suspended sand concentrations
Coastal Engineering
(2004)
- et al.
Suspended-sediment reference concentration under waves: field observations and critical analysis of two predictive models
Coastal Engineering
(1999)
- et al.
Suspension of coarse and fine sand on a wave-dominated shoreface, with implications for the development of rippled scour depressions
Continental Shelf Research
(2004)
- et al.
A new synergetic paradigm in environmental numerical modeling: hybrid models combining deterministic and machine learning components
Ecological Modelling
(2006)
Sheet flow and large wave ripples under combined waves and currents: field observations, model predictions and effects on boundary layer dynamics
Continental Shelf Research
Data splitting for artificial neural networks using SOM-based stratified sampling
Neural Networks
Geometry prediction for wave-generated bedforms
Coastal Engineering
Sand ripples generated by regular oscillatory flow
Coastal Engineering
The dimensions of sand ripples in full-scale oscillatory flows
Coastal Engineering
A data-driven approach to predict suspended-sediment reference concentration under non-breaking waves
Continental Shelf Research
Parameterization of bedform morphology and defect density with fingerprint analysis techniques
Continental Shelf Research
Prediction of time-evolving sand ripples in shelf seas
Continental Shelf Research
The effects of spatially complex inner shelf roughness on boundary layer turbulence and current and wave friction: Tairua Embayment, New Zealand
Continental Shelf Research
Observations of plan-view sand ripple behavior and spectral wave climate on the inner shelf of San Pedro Bay, California
Continental Shelf Research
Variability of wave-induced ripple migration in wave-flume experiments and its implications for sediment transport
Coastal Engineering
Prediction of sand ripple geometry under waves using an artificial neural network
Computers & Geosciences
Observations of wave-generated vortex ripples on the North Carolina continental shelf
Journal of Geophysical Research
Exploratory flow-duct experiments on combined-flow bed configurations, and some implications for interpreting storm-event stratification
Journal of Sedimentary Research
Relaxation time effects of wave ripples on tidal beaches
Geophysical Research Letters
On inducing equations for vegetation resistance
Journal of Hydraulic Research
Video-based observations of nearshore sand ripples and ripple migration
Journal of Geophysical Research
Mechanics of coastal forms
Annual Review of Fluid Mechanics
Optimal division of data for neural network models in water resources applications
Water Resources Research
Real-time deployment of artificial neural network forecasting models: understanding the range of applicability
Water Resources Research
Time-sequence observations of wave-formed sand ripples on an ocean shoreface
Sedimentology
Suspended sediment and hydrodynamics above mildly sloped long wave ripples
Journal of Geophysical Research
Ripple formation on a particle bed sheared by a viscous liquid. Part 2. Oscillating flow
Journal of Fluid Mechanics
A model for the simulation of coupled flow-bed form evolution in turbulent flows
Journal of Geophysical Research
Wave-formed sedimentary structures: a conceptual model
Downscaling daily extreme temperatures with genetic programming
Geophysical Research Letters
Fine-grained versus coarse-grained wave ripples generated experimentally under large-scale oscillatory flow
Journal of Sedimentary Research
Wave-formed sediment ripples: transient analysis of ripple spectral development
Journal of Geophysical Research
Geometry and grain-size sorting of ripples on low-energy sandy beaches: field observations and model predictions
Sedimentology
Response of sand ripples to change in oscillatory flow
Sedimentology
Experiments on oscillatory-flow and combined-flow bed forms: implications for interpreting parts of the shallow-marine sedimentary record
Journal of Sedimentary Research
Evolution of small scale regular patterns generated by waves propagating over a sandy bottom
Physics of Fluids
Gravel ripples on the inner Scotian Shelf
Journal of Sedimentary Research
Random sampling technique for overfitting control in genetic programming. Genetic programming
Movable bed roughness in unsteady oscillatory flow
Journal of Geophysical Research
Rippled scour depressions add ecologically significant heterogeneity to soft-bottom habitats on the continental shelf
Marine Ecology Progress Series