Genetic programming for analysis and real-time prediction of coastal algal blooms
Introduction
Harmful algal blooms (HABs) refer to the explosive growth and accumulation of harmful microscopic algae (phytoplankton). The best-known form of algal bloom, the red tide, has been widely reported and has become a serious environmental problem because of its negative impacts on human health and aquatic life (e.g. anoxia or shellfish poisoning). Over the past two decades there has been an increasing trend in the occurrence of harmful algal blooms throughout the world. In particular, in April 1998 a devastating red tide caused the worst fish kill in Hong Kong's history; it destroyed over 80% (3400 tonnes) of cultured fish stock, with an estimated loss of more than HK$312 million (Lee and Qu, 2004). A capability to analyze and predict the occurrence of algal blooms with acceptable accuracy and lead-time would therefore be of clear benefit to fisheries and environmental management.
Traditionally, models of phytoplankton dynamics are based on theories of how growth and decay factors depend on physical and biotic environmental variables (e.g. solar radiation, nutrients, flushing). These dependencies are expressed mathematically and incorporated in the advective diffusion equations of a water quality model. Such deterministic models are normally referred to as process-based models. With the availability of large amounts of data and the development of artificial intelligence techniques, a new modelling paradigm called “data-driven modelling” has emerged.
Data-driven models are well suited to modelling algal dynamics because they can be set up rapidly and are known to be effective in handling dynamic, non-linear and noisy data, especially when the underlying physical relationships are not fully understood or when the input data needed to drive process-based models are not available. In the recent past, various data-driven models, such as artificial neural network (ANN) and fuzzy logic models, have been applied to model water quality variables with different degrees of success (Recknagel et al., 1997, Recknagel et al., 2002, Maier et al., 1998, Chen and Mynett, 2003, Lee et al., 2003). In the present study, we employ an evolutionary data-driven model, Genetic Programming (GP), to analyse high-frequency (two-hourly) chlorophyll fluorescence data together with related hydro-meteorological and water quality data. In the following sections, we first outline the key principles of genetic programming and then describe its application to the modelling of algal dynamics. The optimal GP model for real-time prediction of algal dynamics is then presented, along with a comparison of its performance with that of other data-driven models. Finally, the relationship between the auto-regressive nature of the revealed algal dynamics and flushing time is investigated using long-term water quality data from a similar semi-enclosed coastal water (Tolo Harbour).
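The auto-regressive framing mentioned above can be illustrated with a small sketch. Assuming two-hourly sampling (12 readings per day), it pairs each reading with the reading one day later and fits a straight line by least squares. The function names, the synthetic series and the linear form are assumptions made for illustration only; they are not the GP model or the monitoring data used in this study.

```python
from statistics import mean

SAMPLES_PER_DAY = 12  # two-hourly sampling gives 12 readings per day


def make_lagged_pairs(series, lead):
    """Pair each reading with the reading `lead` steps later."""
    return [(series[i], series[i + lead]) for i in range(len(series) - lead)]


def fit_line(pairs):
    """Ordinary least-squares fit of future = a * current + b."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in pairs)
    a = sxy / sxx
    b = y_bar - a * x_bar
    return a, b


def predict(a, b, current):
    """One-day-ahead prediction from the current reading."""
    return a * current + b


# Synthetic illustration only (not field data): the series decays by a
# fixed factor per step, so the one-day-ahead relation is exactly linear.
series = [10.0 * 0.98 ** t for t in range(120)]
pairs = make_lagged_pairs(series, SAMPLES_PER_DAY)
a, b = fit_line(pairs)
```

A fitted slope close to the true one-day decay factor indicates that the current reading alone carries most of the information needed for the next day's value, which is the sense in which a series is called auto-regressive here.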
Section snippets
Genetic programming
Genetic Programming (GP) is a relatively new automatic programming technique for evolving computer programs to solve problems (Koza, 1992). In engineering applications, GP is frequently applied to model structure identification problems. In such applications, GP is used to infer the underlying structure of either a natural or experimental process in order to model the process numerically. A number of applications of GP have been reported, which include sediment transport modelling, salt-water
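As a rough illustration of model structure identification by GP, the sketch below evolves small arithmetic expressions to fit sampled data. It is a minimal toy under stated assumptions, not the GP software or configuration used in this study: the function set (add, sub, mul), the population size, the mutation rate, the node cap and all function names are hypothetical choices made for the example.

```python
import operator
import random

# Assumed function set and size cap; chosen for illustration only.
FUNCTIONS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
MAX_NODES = 60


def random_tree(rng, depth):
    """Grow a random expression over the variable x and small constants."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else float(rng.randint(-2, 2))
    name = rng.choice(sorted(FUNCTIONS))
    return (name, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, x):
    """Evaluate a tree: 'x', a float constant, or (name, left, right)."""
    if isinstance(tree, tuple):
        name, left, right = tree
        return FUNCTIONS[name](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree


def error(tree, data):
    """Mean squared error over (x, y) samples; lower is fitter."""
    return sum((evaluate(tree, x) - y) ** 2 for x, y in data) / len(data)


def paths(tree, path=()):
    """Yield the index path to every node in the tree."""
    yield path
    if isinstance(tree, tuple):
        yield from paths(tree[1], path + (1,))
        yield from paths(tree[2], path + (2,))


def get(tree, path):
    """Return the node reached by following path from the root."""
    for i in path:
        tree = tree[i]
    return tree


def replace(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace(tree[path[0]], path[1:], new)
    return tuple(parts)


def crossover(rng, a, b):
    """Graft a random subtree of b onto a random node of a."""
    donor = get(b, rng.choice(list(paths(b))))
    return replace(a, rng.choice(list(paths(a))), donor)


def mutate(rng, tree):
    """Replace a random node with a freshly grown subtree."""
    return replace(tree, rng.choice(list(paths(tree))), random_tree(rng, 2))


def tournament(rng, population, data, k=3):
    """Pick the fittest of k randomly drawn individuals."""
    return min(rng.sample(population, k), key=lambda t: error(t, data))


def evolve(data, generations=20, size=40, seed=0):
    """Run a small generational GP and return (best tree, error history)."""
    rng = random.Random(seed)
    population = [random_tree(rng, 3) for _ in range(size)]
    history = []  # best error per generation
    for _ in range(generations):
        best = min(population, key=lambda t: error(t, data))
        history.append(error(best, data))
        children = [best]  # elitism: the best individual always survives
        while len(children) < size:
            parent = tournament(rng, population, data)
            child = crossover(rng, parent, tournament(rng, population, data))
            if rng.random() < 0.2:
                child = mutate(rng, child)
            if sum(1 for _ in paths(child)) > MAX_NODES:
                child = parent  # crude bloat control
            children.append(child)
        population = children
    best = min(population, key=lambda t: error(t, data))
    history.append(error(best, data))
    return best, history


# Synthetic target y = x^2 + x, sampled at nine points (illustrative only).
data = [(x / 2, (x / 2) ** 2 + x / 2) for x in range(-4, 5)]
best, history = evolve(data)
```

Because the best individual is copied unchanged into each new generation, the recorded best error never increases. Whether the exact target expression is recovered depends on the seed and settings. The evolved tree can be read back as an explicit formula, which is the property that makes GP attractive for structure identification.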
Analysis and prediction of algal dynamics at Kat O
The application of GP to real-time algal bloom prediction at a station in the northeastern waters of Hong Kong is presented. Since the choice of appropriate input variables is important, GP models are first used to select the variables that are significant for the predictions; these variables are then used as inputs to the GP runs for real-time prediction. In the following, we give an account of the nature of the data used, the details of the GP modelling, the analysis of the high
Hydrography and water quality of Tolo Harbour
Tolo Harbour is a semi-enclosed bay in the northeastern coastal waters of Hong Kong (Fig. 4). It is connected to the open sea at Mirs Bay; in general the water quality improves from the more enclosed and densely populated inner Harbour Subzone towards the better flushed outer Channel Subzone.
The nutrient enrichment in the harbour due to municipal and livestock waste discharges has been a major environmental concern over the past two decades. The organic loads are derived from the two major
Conclusions
This study presents the first analysis of near-continuous algal dynamics data from a coastal observing station using a data-driven evolutionary algorithm, Genetic Programming (GP). Consistent with previous analysis of sparse data, the present results reveal a strong auto-regressive nature of the algal dynamics in the semi-enclosed water. The daily predictions of chlorophyll fluorescence are reasonably accurate for a lead-time of up to 1 day and are comparable with those from
Acknowledgements
This study was supported by a Hong Kong Research Grants Council Group Research Project on dynamics of algal blooms and red tides in subtropical coastal waters (RGC/HKU 2/98C and 1/02C), and partially by a grant from the University Grants Committee of the Hong Kong Special Administrative Region, China (Project No. AoE/P-04/04). The assistance of the Hong Kong Agriculture, Fisheries and Conservation Department in the field monitoring is gratefully acknowledged. The authors also wish to thank DHI
References (24)
- et al. Integration of data mining techniques and heuristic knowledge in fuzzy logic modelling of eutrophication in Taihu Lake. Ecol. Model. (2003)
- et al. Numerical determination of flushing time for stratified waterbodies. J. Mar. Syst. (2004)
- et al. Eutrophication dynamics of Tolo Harbour, Hong Kong. Mar. Pollut. Bull. (1999)
- et al. Neural network modelling of coastal algal blooms. Ecol. Model. (2003)
- et al. Use of artificial neural networks for modelling cyanobacteria Anabaena spp. in the River Murray, South Australia. Ecol. Model. (1998)
- et al. Artificial neural network approach for modelling and prediction of algal blooms. Ecol. Model. (1997)
- et al. Modelling rainfall-runoff relationships using genetic programming. Math. Comput. Model. (2001)
- et al. Hong Kong's worst red tide: causative factors reflected in a phytoplankton study at Port Shelter station in 1998. Harmful Algae (2004)
- et al. Genetic Programming as a model induction engine. J. Hydroinform. (2000)
- et al. The evolution of equations from hydraulic data, part I: Theory. J. Hydraul. Res. (1997)
- The evolution of equations from hydraulic data, part II: applications. J. Hydraul. Res.
- Phytoplankton productivity in Tolo Harbour. Asian Mar. Biol.
Cited by (113)
Machine learning based marine water quality prediction for coastal hydro-environment management
2021, Journal of Environmental Management
Citation Excerpt: In other word, the upcoming algal bloom events are strongly related to Chl-a concentration with 1–2 weeks ahead, which indicates the occurrence of HAB in Tolo Harbour with a cycle of 1–2 weeks. This auto-regressive characteristic of algal growth dynamics is also observed and concluded by other scholars (Lee et al., 2005; Muttil and Lee 2005; Muttil and Chau 2006). After comparing with three differently flushed stations, Muttil and Lee (2005) confirmed the phenomenon was related to the tidal flushing conditions.
Advanced control of membrane fouling in filtration systems using artificial intelligence and machine learning techniques: A critical review
2019, Process Safety and Environmental Protection
Development of computer vision system to predict peroxidase and polyphenol oxidase enzymes to evaluate the process of banana peel browning using genetic programming modeling
2018, Scientia Horticulturae
Citation Excerpt: Furthermore, it was observed that suggested method (the OFPs method) compared to method of back-propagation artificial neural network (BP-ANN) were more accurate, and it was less accurate than method of the support vector machine (SVM). Also, Muttil and Lee (2005) used the Genetic programming (GP) to real-time prediction of coastal algal blooms; the results of the GP were in agreement with the results of the artificial neural network method, and eventually these researchers concluded that the explanation of the results was facilitated using the form of analytical equations developed by GP method. In another study, in order to model the evapotranspiration process, the genetic programming (GP) was applied; compared with artificial neural network (ANN) models and the traditional Penman-Monteith (PM) method, the GP model performance was considerable, so that, its performance was comparable to the performance of the ANN model (Parasuraman et al., 2007).