abstract = "Several studies indicate that the data-driven models
have proven to be potentially useful tools in
hydrological modeling. Nevertheless, it is a common
perception among researchers and practitioners that the
usefulness of the system-theoretic models is limited to
forecast applications, and they cannot be used as a
tool for scientific investigations. Also, the
system-theoretic models are believed to be less
reliable as they characterise the hydrological
processes by learning the input-output patterns
embedded in the dataset and not based on strong
physical understanding of the system. It is imperative
that the above concerns be addressed before
the data-driven models can gain wider acceptability by
researchers and practitioners. In this research
different methods and tools that can be adopted to
promote transparency in the data-driven models are
probed with the objective of extending the usefulness
of data-driven models beyond forecast applications, as
tools for scientific investigations, by providing
additional insights into the underlying input-output
patterns based on which the data-driven models arrive
at a decision. In this regard, the utility of
self-organising networks (competitive learning and
self-organising maps) in learning the patterns in the
input space is evaluated by developing a novel neural
network model called the spiking modular neural
networks (SMNNs). The performance of the SMNNs is
evaluated based on its ability to characterise stream
flows and the actual evapotranspiration process. Also, the
utility of self-organising algorithms, namely genetic
programming (GP), is evaluated with regard to its
ability to promote transparency in data-driven models.
The robustness of the GP to evolve its own model
structure with relevant parameters is illustrated by
applying GP to characterise the
actual evapotranspiration process. The results from
this research indicate that self-organisation in
learning, both in terms of self-organising networks and
self-organising algorithms, could be adopted to promote
transparency in data-driven models.
In pursuit of improving the reliability of the
data-driven models, different methods for incorporating
uncertainty estimates as part of the data-driven model
building exercise are evaluated in this research. The
local-scale models are shown to be more reliable than
the global-scale models in characterising the saturated
hydraulic conductivity of soils. In addition, in this
research, the importance of model structure uncertainty
in geophysical modeling is emphasised by developing a
framework to account for the model structure
uncertainty in geophysical modeling. The contribution
of the model structure uncertainty to the predictive
uncertainty of the model is shown to be larger than the
uncertainty associated with the model parameters. Also,
it has been demonstrated that increasing the model
complexity may lead to a better fit of the function,
but at the cost of an increasing level of uncertainty.
It is recommended that the effect of model structure
uncertainty be considered when developing
reliable hydrological models.",