Elsevier

Ecological Modelling

Volume 342, 24 December 2016, Pages 97-112
Spatially-explicit forecasting of cyanobacteria assemblages in freshwater lakes by multi-objective hybrid evolutionary algorithms

https://doi.org/10.1016/j.ecolmodel.2016.09.024

Highlights

  • A new multi-objective hybrid evolutionary algorithm to forecast toxic cyanobacteria.

  • Spatially-explicit forecasting of cyanobacteria assemblages in multiple sites/lakes.

  • Highly understandable rules to indicate water conditions favoured by cyanobacteria.

  • A decision tool for water managers to give early warning of cyanobacteria blooms.

Abstract

This paper proposes a novel multi-objective hybrid evolutionary algorithm (MOHEA) that allows spatially-explicit modelling of local outbreaks and dispersal of population density. The MOHEA was tested for simultaneously modelling two cyanobacteria populations at one lake site, the same population in two different lakes, and the same population at three different sites of one lake. All experiments with MOHEA utilized water quality time-series and abundances of Anabaena and Cylindrospermopsis monitored in the sub-tropical Lakes Wivenhoe and Somerset in Queensland (Australia) from 1999 to 2010. Results demonstrate the capacity of MOHEA to determine generic rules that: (1) reveal crucial thresholds for outbreaks of cyanobacteria blooms, and (2) perform spatially-explicit 7-day-ahead forecasting of the timing and magnitudes of bloom events.

Introduction

Current economic development in Australia and worldwide goes hand in hand with the global problems of eutrophication and climate change. There is evidence that high nutrient loads, rising temperatures, enhanced stratification, increased residence time and salinisation of drinking water reservoirs and lakes favour the dominance of cyanobacteria (Paerl and Huisman, 2008). Therefore, water industries have to consider the coinciding effects of eutrophication and climate change in their strategies to manage cyanobacterial blooms. However, our ability to predict the occurrence and composition of cyanobacteria blooms has lagged well behind our ability to control total algal biomass. We urgently need advances in our ability to predict and prevent the growth of undesirable algae and other nuisance-forming organisms (Smith and Schindler, 2009). Developing comprehensive lake-based monitoring and early warning systems for water quality and cyanobacteria is therefore the right step forward (Schindler, 2009). Frequent population outbreaks of toxic cyanobacteria in drinking water reservoirs and lakes will have detrimental effects on raw water quality and aquatic biodiversity, and costly technology will be required to sustain safe human water supplies (e.g. Dodds et al., 2009). To assist water industries in making informed decisions and adapting measures for preventing and controlling the effects of cyanobacteria in a timely manner, more adequate computer models are required (Jackson et al., 2001).

Traditionally, process-based models, which simulate food web dynamics and nutrient cycles over time by means of ordinary differential equations (ODEs) (Pei and Ma, 2002; Arhonditsis and Brett, 2005; Chen et al., 2014), have been widely used. However, these process-based models have several shortcomings. Firstly, they can hardly capture the causal complexity of the phytoplankton community well enough to make accurate daily forecasts of the population dynamics of algal species. Secondly, they are calibrated for a limited number of years with annual data, which constrains their validity to those years. Thirdly, their data demand by far exceeds the data that are operationally available for a lake or a lake site at a given point in time. Therefore, it is unlikely that process-based models will ever be applicable as operational forecasting tools for early warning.

With rapidly growing amounts of ecological data and progress in computing technology, powerful tools for inductive reasoning and forecasting from complex data have become available. Artificial neural networks (Hornik et al., 1989) approximate complex data with high accuracy by multivariate nonlinear models (Recknagel et al., 1997; Wei et al., 2001; Jeong et al., 2001), but lack an explicit representation of the models extracted from data. In recent years, evolutionary algorithms (EAs) (Holland, 1975) have gained wide popularity in domains such as machine learning, pattern recognition and economic prediction, owing to their characteristics of self-adaptation, self-organization, self-learning and generality (Bäck et al., 1997). After EA applications for ecological modelling were pioneered by Bobbin and Recknagel (2001), Cao et al. (2006) developed the hybrid evolutionary algorithm (HEA), which is now applied worldwide for non-spatially-explicit modelling of cyanobacteria blooms in lakes and rivers (e.g. Kim et al., 2007; Chan et al., 2007; Recknagel et al., 2014a) as well as for knowledge discovery (Recknagel et al., 2014b; Recknagel et al., 2016). Because the HEA was designed to develop non-spatially-explicit models, the resulting rule models, typically with a single output, do not represent spatial or multi-species relationships. However, plankton communities in lakes vary seasonally and spatially with abiotic factors such as advection, thermal stratification and nutrient loads, as well as with biotic factors such as competition, grazing and predation. Therefore, there is a demand for models that allow spatially-explicit forecasting and can identify local hotspots for seasonal outbreaks of cyanobacteria blooms.

Multi-objective optimization (MOO) techniques (Marler and Arora, 2004; Miettinen, 1999; Deb, 2001; Hanne, 2000) have been widely applied in many fields. The multi-objective hybrid evolutionary algorithm (MOHEA) proposed in this study allows the development of IF-THEN-ELSE rules with multiple outputs, whereby the fitting errors of all outputs are minimized by MOO. The resulting IF-THEN-ELSE rules with multiple outputs offer two benefits: (1) they reveal threshold conditions (IF-conditions) that trigger population outbreaks and are generic for all outputs, and (2) they forecast multiple species at a single site and single species at multiple sites (see Fig. 1). The functionality of MOHEA is tested for 7-day-ahead forecasting of the cyanobacteria Anabaena and Cylindrospermopsis in Lakes Wivenhoe and Somerset, Queensland (Australia), based on physical and chemical water quality data monitored from 1999 to 2010. The paper validates the forecasting results of different types of multi-output models and discusses ecological relationships revealed by input sensitivity analyses of the models.
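To make the idea of a multi-output IF-THEN-ELSE rule more concrete, the following is a minimal sketch in Python, not the authors' implementation. It shows one shared IF-condition driving several outputs, one fitting error per output, and Pareto dominance as one common way of comparing candidates in MOO; the source does not state which comparison scheme MOHEA uses. All names (`MultiOutputRule`, `rmse_per_output`, `dominates`, `pareto_front`), the input key `temp`, the threshold and the coefficients are hypothetical illustrations, not values from the paper.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Inputs = Dict[str, float]
Expr = Callable[[Inputs], float]


@dataclass
class MultiOutputRule:
    """One shared IF-condition with one THEN and one ELSE expression per output."""
    condition: Callable[[Inputs], bool]
    then_branch: Sequence[Expr]
    else_branch: Sequence[Expr]

    def predict(self, x: Inputs) -> List[float]:
        # The same condition selects the branch for every output.
        branch = self.then_branch if self.condition(x) else self.else_branch
        return [expr(x) for expr in branch]


def rmse_per_output(rule: MultiOutputRule,
                    inputs: Sequence[Inputs],
                    targets: Sequence[Sequence[float]]) -> List[float]:
    """Root-mean-square error of each output, i.e. one objective per output."""
    n_out = len(targets[0])
    squared = [0.0] * n_out
    for x, y in zip(inputs, targets):
        pred = rule.predict(x)
        for k in range(n_out):
            squared[k] += (pred[k] - y[k]) ** 2
    return [math.sqrt(s / len(inputs)) for s in squared]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if error vector a is no worse than b everywhere and better somewhere."""
    return (all(ai <= bi for ai, bi in zip(a, b))
            and any(ai < bi for ai, bi in zip(a, b)))


def pareto_front(errors: Sequence[Sequence[float]]) -> List[int]:
    """Indices of candidates whose error vectors no other candidate dominates."""
    return [i for i, e in enumerate(errors)
            if not any(dominates(o, e) for j, o in enumerate(errors) if j != i)]


# Hypothetical toy rule: the threshold 25.0 and all coefficients are made up.
toy_rule = MultiOutputRule(
    condition=lambda x: x["temp"] > 25.0,
    then_branch=[lambda x: 2.0 * x["temp"], lambda x: 10.0],
    else_branch=[lambda x: 0.0, lambda x: 1.0],
)
toy_inputs = [{"temp": 30.0}, {"temp": 20.0}]
toy_targets = [[60.0, 12.0], [0.0, 1.0]]
toy_errors = rmse_per_output(toy_rule, toy_inputs, toy_targets)
```

In this sketch an evolutionary search would generate many candidate rules, score each with `rmse_per_output`, and retain the non-dominated ones returned by `pareto_front`; how MOHEA actually evolves rule structure and parameters is described in the full paper, not here.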

Section snippets

Study sites and data

Different data were utilized for developing the three types of multi-output rule models. Eleven years of water quality data from 1999 to 2010 from Lake Wivenhoe in Queensland, Australia were used to develop single-site multi-species and multi-site single-species models. Measured data from Site30001 of Lake Wivenhoe (see Fig. 2) were used for developing single-site multi-species models and the measured data from sites 30015, 30016 and 30017 were used for developing multi-site single-species

Results for single-site multi-species model

Table 4 and Fig. 6 document the best-performing model developed by 100 runs of MOHEA for 7-day-ahead forecasting of Cylindrospermopsis and Anabaena at the same Site30001 of Lake Wivenhoe. As shown in Table 4, forecasts of Cylindrospermopsis (Cylind) achieved a higher average R² value (0.43) than those of Anabaena (0.29), which is also reflected by the R² values of 0.54 and 0.40 for the best models for Cylindrospermopsis and Anabaena, respectively. The best model selected all the input variables listed in Table 3 except Silica as

Conclusions and future work

This paper illustrates preliminary results of the multi-objective hybrid evolutionary algorithm (MOHEA) that show the potential for:

  • (1) spatially-explicit forecasting of population outbreaks and dispersal at different sites between or within lakes by one model with good accuracy regarding timing and differing accuracy regarding magnitudes of such events,

  • (2) revealing threshold conditions that trigger population outbreaks being generic for modelled sites and populations such as the water temperature of

Acknowledgements

This work was supported by the Australian Research Council (ARC Grant no: LP0990453) and the industry partners SA Water and Seqwater.

References (35)

  • F. Recknagel et al. Model ensemble for the simulation of plankton community dynamics of Lake Kinneret (Israel) induced from in situ predictor variables by evolutionary computation. Environ. Modell. Softw. (2014)

  • V.H. Smith et al. Eutrophication science: where do we go from here? Trends Ecol. Evol. (2009)

  • B. Wei et al. Use of artificial neural network in the prediction of algal blooms. Water Res. (2001)

  • T. Bäck et al. Evolutionary computation: comments on the history and current state. IEEE Trans. Evol. Comput. (1997)

  • H. Cao et al. Hybrid evolutionary algorithm for rule set discovery in time-series data to forecast and explain algal population dynamics in two lakes different in morphometry and eutrophication

  • H. Cao et al. Parameter optimization algorithms for evolving rule models applied to freshwater ecosystems. IEEE Trans. Evol. Comput. (2014)

  • A.C. Davison et al. Bootstrap Methods and Their Application (1997)