Abstract
Chlorophyll-a (chl-a) concentration is often used as a proxy for water quality problems and phytoplankton blooms. Available chl-a models range from simple phosphorus loading models to complex regression and dynamic models. This study compared multiple regression models with genetic programming (GP) techniques for predicting chl-a concentrations across a wide range of lakes (104 Swedish lakes). The independent variables were lake area, mean depth, iron, latitude, ammonium, nitrogen + nitrate, pH, phosphate, Secchi depth, silicon, temperature, total phosphorus, total nitrogen and total organic carbon. GP is a method based on Darwinian evolutionary theory: a program tests candidate mathematical equations and iteratively improves each one using fundamental ideas from evolution, with the aim of increasing predictive power. The multiple regression and GP modelling approaches corresponded well. GP results were in some cases more accurate than those of multiple regression, but GP gave no significant improvement in predictive power, and the best model was a multiple regression that used total phosphorus, total nitrogen and latitude as independent variables. Multiple regression is therefore recommended for predicting chl-a concentrations, because its models tend to be less complex and the modelling approach is easier to use. These findings are relevant to limnologists and modelling managers developing future models of chl-a concentrations in lakes.
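The comparison described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic stand-ins rather than the 104-lake dataset, the coefficients used to generate them are arbitrary, and the evolutionary search is a deliberately simplified, mutation-only variant (no crossover), not the GP software or settings used in the study. It fits an ordinary least-squares multiple regression on three predictors and then evolves small expression trees on the same predictors, scoring each tree after a least-squares fit of an intercept and slope.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in data (NOT the study's lakes): log-transformed total
# phosphorus and total nitrogen, plus latitude in degrees north.
n = 104
log_tp = rng.normal(1.2, 0.4, n)
log_tn = rng.normal(2.7, 0.3, n)
lat = rng.uniform(55.0, 69.0, n)
log_chl = 0.9 * log_tp + 0.4 * log_tn - 0.03 * lat + rng.normal(0.0, 0.15, n)
X = np.column_stack([log_tp, log_tn, lat])

# --- Multiple regression: ordinary least squares with an intercept ---
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, log_chl, rcond=None)
r2_mlr = 1.0 - np.mean((log_chl - A @ coef) ** 2) / np.var(log_chl)

# --- Toy GP: evolve expression trees over the same predictors ---
OPS = {"add": np.add, "sub": np.subtract, "mul": np.multiply}


def random_tree(depth):
    """Return a column index, a constant, or a tuple (op, left, right)."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return int(rng.integers(X.shape[1]))  # variable
        return float(rng.normal())                # constant
    op = str(rng.choice(list(OPS)))
    return (op, random_tree(depth - 1), random_tree(depth - 1))


def evaluate(tree):
    if isinstance(tree, int):
        return X[:, tree]
    if isinstance(tree, float):
        return np.full(n, tree)
    op, left, right = tree
    return OPS[op](evaluate(left), evaluate(right))


def fitness(tree):
    """MSE after a least-squares fit of a + b * f(X); lower is better."""
    f = evaluate(tree)
    if np.std(f) < 1e-12:  # constant expressions carry no information
        return np.inf
    B = np.column_stack([np.ones(n), f])
    ab, *_ = np.linalg.lstsq(B, log_chl, rcond=None)
    return float(np.mean((log_chl - B @ ab) ** 2))


def mutate(tree, depth=3):
    """Replace one random subtree, keeping the total depth at most `depth`."""
    if depth == 0 or not isinstance(tree, tuple) or rng.random() < 0.3:
        return random_tree(depth)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, depth - 1), right)
    return (op, left, mutate(right, depth - 1))


pop = [random_tree(3) for _ in range(200)]
history = []  # best fitness seen at the start of each generation
for _ in range(30):
    pop.sort(key=fitness)
    history.append(fitness(pop[0]))
    survivors = pop[:50]  # truncation selection; survivors are kept (elitism)
    children = [mutate(survivors[int(rng.integers(50))]) for _ in range(150)]
    pop = survivors + children

best = min(pop, key=fitness)
r2_gp = 1.0 - fitness(best) / np.var(log_chl)
print(f"r2 regression: {r2_mlr:.3f}, r2 toy GP: {r2_gp:.3f}")
print("best expression tree:", best)
```

Because the synthetic response is linear in the predictors by construction, this sketch says nothing about which method performs better on real lake data; it only shows the mechanics of the two approaches and why the evolved expressions are typically harder to read than a set of regression coefficients.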
Acknowledgments
The authors would like to thank one anonymous reviewer and the associate editor who greatly helped in improving this article. The authors would also like to thank Gesa Weyhenmeyer and Roger Herbert for valuable comments. The Swedish University of Agricultural Sciences and the Swedish Meteorological and Hydrological Institute are also acknowledged for making data available on their web pages.
Appendix 1
Models produced using the regression and GP modelling techniques (Eqs. 1 to 5); a = regression models, b = GP models. The p value is below 0.001 for Eqs. 1, 3 and 5 and below 0.05 for Eqs. 2 and 4. Cluster elimination was applied for Eqs. 4 and 5.
Cite this article
Dimberg, P.H., Olofsson, C.J. A Comparison Between Regression Models and Genetic Programming for Predictions of Chlorophyll-a Concentrations in Northern Lakes. Environ Model Assess 21, 221–232 (2016). https://doi.org/10.1007/s10666-015-9480-4
DOI: https://doi.org/10.1007/s10666-015-9480-4