Elsevier

Ecological Modelling

Volume 189, Issues 3–4, 10 December 2005, Pages 363-376

Genetic programming for analysis and real-time prediction of coastal algal blooms

https://doi.org/10.1016/j.ecolmodel.2005.03.018

Abstract

Harmful algal blooms (HABs) have been widely reported and have become a serious environmental problem worldwide due to their negative impacts on aquatic ecosystems, fisheries, and human health. A capability to predict the occurrence of algal blooms with acceptable accuracy and lead-time would clearly be very beneficial to fisheries and environmental management. In this study, we present the first real-time modelling and prediction of algal blooms using a data-driven evolutionary algorithm, Genetic Programming (GP). Daily prediction of algal blooms is carried out at Kat O station in Hong Kong using 3 years of high-frequency (two-hourly) chlorophyll fluorescence and related hydro-meteorological and water quality data. The predictions of chlorophyll fluorescence, a measure of algal biomass, are within reasonable accuracy for a lead-time of up to 1 day. The results generally concur with those obtained with an artificial neural network. Compared with traditional data-driven models, GP has the advantage of evolving an equation relating input and output variables. A detailed analysis of the results of the GP models shows that GP not only correctly identifies the key input variables in accordance with ecological reasoning, but also demonstrates the relationship between the auto-regressive nature of bloom dynamics and flushing time. This study shows GP to be a viable alternative for algal bloom modelling and prediction; the interpretation of the results is greatly facilitated by the analytical form of the evolved equations.

Introduction

Harmful algal blooms (HABs) refer to the explosive growth and accumulation of harmful microscopic algae (phytoplankton). The best-known form of algal bloom, the red tide, has been widely reported and has become a serious environmental problem due to its negative impacts on human health and aquatic life (e.g. anoxia or shellfish poisoning). Over the past two decades there has been an increasing trend in the occurrence of harmful algal blooms throughout the world. In particular, in April 1998, a devastating red tide resulted in the worst fish kill in Hong Kong's history; it destroyed over 80% (3400 tonnes) of cultured fish stock, with an estimated loss of more than HK$312 million (Lee and Qu, 2004). Thus, a capability to analyze and predict the occurrence of algal blooms with acceptable accuracy and lead-time would clearly be very beneficial to fisheries and environmental management.

Traditionally, models of phytoplankton dynamics are based on theories of the dependence of growth and decay factors on physical and biotic environmental variables (e.g. solar radiation, nutrients, flushing); these dependencies are expressed mathematically and incorporated in advective diffusion equations in a water quality model. Such deterministic models are normally referred to as process-based models. Nowadays, with the availability of large amounts of data and the development of artificial intelligence techniques, a new modelling paradigm called “data-driven modelling” has emerged.

Data-driven models are well suited to modelling algal dynamics, since such models can be set up rapidly and are known to be effective in handling dynamic, non-linear and noisy data, especially when the underlying physical relationships are not fully understood, or when the input data needed to drive process-based models are not available. In the recent past, various data-driven models, such as artificial neural network (ANN) and fuzzy logic models, have been applied to model water quality variables with different degrees of success (Recknagel et al., 1997, 2002; Maier et al., 1998; Chen and Mynett, 2003; Lee et al., 2003). In the present study, we employ an evolutionary data-driven model, Genetic Programming (GP), for the analysis of high-frequency (two-hourly) chlorophyll fluorescence and related hydro-meteorological and water quality data. In the following sections, we first outline the key principles of genetic programming, followed by its application to the modelling of algal dynamics. The optimal GP model for real-time prediction of algal dynamics is then presented, along with a comparison of its performance with that of other data-driven models. Finally, the relationship between the auto-regressive nature of the revealed algal dynamics and flushing time is investigated using long-term water quality data from a similar semi-enclosed coastal water body (Tolo Harbour).

Section snippets

Genetic programming

Genetic Programming (GP) is a relatively new automatic programming technique for evolving computer programs to solve problems (Koza, 1992). In engineering applications, GP is frequently applied to model structure identification problems. In such applications, GP is used to infer the underlying structure of either a natural or experimental process in order to model the process numerically. A number of applications of GP have been reported, which include sediment transport modelling, salt-water
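The idea of evolving an equation from data can be illustrated with a minimal sketch. This is not the GP software or configuration used in the study: the function set (+, -, *), the tree encoding, tournament selection, the elitism of two individuals, and every name and parameter value below (`evolve`, `pop_size`, `max_nodes`, and so on) are assumptions chosen for brevity, and the target data are a toy polynomial rather than water quality measurements.

```python
import math
import operator
import random

# Assumed function set; trees are "x", a float constant, or (function index, left, right).
FUNCTIONS = [(operator.add, "+"), (operator.sub, "-"), (operator.mul, "*")]

def random_tree(rng, depth):
    """Grow a random expression tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else round(rng.uniform(-2.0, 2.0), 2)
    return (rng.randrange(len(FUNCTIONS)),
            random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    """Evaluate an expression tree at the input value x."""
    if isinstance(tree, tuple):
        return FUNCTIONS[tree[0]][0](evaluate(tree[1], x), evaluate(tree[2], x))
    return x if tree == "x" else tree

def to_string(tree):
    """Render the evolved tree as a readable equation."""
    if isinstance(tree, tuple):
        return f"({to_string(tree[1])} {FUNCTIONS[tree[0]][1]} {to_string(tree[2])})"
    return str(tree)

def paths(tree, path=()):
    """Yield the path (child indices from the root) to every node."""
    yield path
    if isinstance(tree, tuple):
        yield from paths(tree[1], path + (1,))
        yield from paths(tree[2], path + (2,))

def get_at(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def replace_at(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    parts = list(tree)
    parts[path[0]] = replace_at(tree[path[0]], path[1:], new)
    return tuple(parts)

def crossover(rng, a, b):
    """Swap a random subtree of b into a random position of a."""
    return replace_at(a, rng.choice(list(paths(a))), get_at(b, rng.choice(list(paths(b)))))

def mutate(rng, tree):
    """Replace a random subtree with a freshly grown one."""
    return replace_at(tree, rng.choice(list(paths(tree))), random_tree(rng, 2))

def fitness(tree, data):
    """Mean squared error over (x, y) pairs; lower is better."""
    errors = [evaluate(tree, x) - y for x, y in data]
    mse = sum(e * e for e in errors) / len(data)
    return mse if math.isfinite(mse) else math.inf

def evolve(data, pop_size=100, generations=20, seed=0, max_nodes=60):
    """Run a small GP loop; returns the best tree and the best error per generation."""
    rng = random.Random(seed)
    pop = [random_tree(rng, 3) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        scored = sorted(((fitness(t, data), t) for t in pop), key=lambda p: p[0])
        history.append(scored[0][0])
        nxt = [scored[0][1], scored[1][1]]  # elitism: keep the two best unchanged
        while len(nxt) < pop_size:
            a = min(rng.sample(scored, 3), key=lambda p: p[0])[1]  # tournament of 3
            b = min(rng.sample(scored, 3), key=lambda p: p[0])[1]
            child = crossover(rng, a, b)
            if rng.random() < 0.2:
                child = mutate(rng, child)
            # Size cap to limit bloat: fall back to the parent if the child is too large.
            nxt.append(child if len(list(paths(child))) <= max_nodes else a)
        pop = nxt
    best_fit, best = min(((fitness(t, data), t) for t in pop), key=lambda p: p[0])
    history.append(best_fit)
    return best, history

if __name__ == "__main__":
    # Toy target y = x^2 + x, sampled on [-1, 1]; purely illustrative.
    toy = [(i / 10, (i / 10) ** 2 + i / 10) for i in range(-10, 11)]
    best, history = evolve(toy)
    print(to_string(best), history[-1])
```

Because the result is an explicit expression tree, it can be printed and inspected with `to_string`, which is the property that distinguishes GP from black-box data-driven models in the discussion above.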

Analysis and prediction of algal dynamics at Kat O

The application of GP for real-time algal bloom prediction at a station in the northeastern waters of Hong Kong is presented. Since the choice of appropriate input variables is important, GP models are first used to select the variables that are significant for the predictions; these significant variables are then used as inputs to the GP runs for real-time prediction. In the following, we give an account of the nature of the data used, the details of the GP modelling, the analysis of the high
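As a rough illustration of how two-hourly records might be arranged for next-day prediction, the sketch below averages readings into daily values and builds lagged input/target pairs. The daily averaging, the number of lags, and all function names are assumptions made for this example; the preprocessing actually used in the study is not described in the text available here.

```python
from statistics import mean

SAMPLES_PER_DAY = 12  # two-hourly sampling gives 12 readings per day

def daily_means(series):
    """Average consecutive blocks of 12 two-hourly readings into daily values.

    Any trailing partial day is dropped (an assumption of this sketch).
    """
    n_days = len(series) // SAMPLES_PER_DAY
    return [mean(series[d * SAMPLES_PER_DAY:(d + 1) * SAMPLES_PER_DAY])
            for d in range(n_days)]

def lagged_pairs(daily, n_lags=3, lead=1):
    """Build (inputs, target) pairs for auto-regressive prediction.

    inputs holds the n_lags most recent daily values up to day t (most recent
    first); target is the value at day t + lead.
    """
    pairs = []
    for t in range(n_lags - 1, len(daily) - lead):
        inputs = [daily[t - k] for k in range(n_lags)]
        pairs.append((inputs, daily[t + lead]))
    return pairs

def persistence_rmse(pairs):
    """RMSE of the naive forecast 'the target equals the most recent value'."""
    return (sum((x[0] - y) ** 2 for x, y in pairs) / len(pairs)) ** 0.5
```

A persistence baseline such as `persistence_rmse` is a common yardstick for strongly auto-regressive series: a data-driven model is only useful for prediction if it improves on simply carrying the latest value forward.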

Hydrography and water quality of Tolo Harbour

Tolo Harbour is a semi-enclosed bay in the northeastern coastal waters of Hong Kong (Fig. 4). It is connected to the open sea at Mirs Bay; in general the water quality improves from the more enclosed and densely populated inner Harbour Subzone towards the better flushed outer Channel Subzone.

The nutrient enrichment in the harbour due to municipal and livestock waste discharges has been a major environmental concern over the past two decades. The organic loads are derived from the two major

Conclusions

This study presents the first analysis of near-continuous algal dynamics data from a coastal observing station using a data-driven evolutionary algorithm, Genetic Programming (GP). Consistent with previous analysis of sparse data, the present results reveal a strong auto-regressive nature of the algal dynamics in the semi-enclosed water. The results for daily prediction of chlorophyll fluorescence are within reasonable accuracy for a lead-time of up to 1 day and are comparable with those from

Acknowledgements

This study was supported by a Hong Kong Research Grants Council Group Research Project on dynamics of algal blooms and red tides in subtropical coastal waters (RGC/HKU 2/98C and 1/02C), and partially by a grant from the University Grants Committee of the Hong Kong Special Administrative Region, China (Project No. AoE/P-04/04). The assistance of the Hong Kong Agriculture, Fisheries and Conservation Department in the field monitoring is gratefully acknowledged. The authors also wish to thank DHI
