Forecasting performance of regional innovation systems using semantic-based genetic programming with local search optimizer
Introduction
Innovation is considered a complex and dynamic socio-technical, socio-economic and socio-political phenomenon that has been recognised as a central issue in economic development (Carayannis et al., 2016). The concept of regional innovation systems has recently received increased attention (Hajek et al., 2014, Lau and Lo, 2015), mainly due to the growing importance of regions (and other sub-national entities) in a globalised economy (Freeman, 2002). In regional innovation systems, private and public actors interact intensively and thus promote the generation, use, and dissemination of knowledge (Tödtling and Trippl, 2005). In addition, regions are critical entities for innovation policymaking because they provide favourable conditions for knowledge creation and transfer (Lau and Lo, 2015). In this context, measuring the innovation performance of regions has become a priority in order to develop integrated benchmarking systems in knowledge-based economies (Carayannis et al., 2016). This enables policymakers not only to compare the performance of regional innovation systems but also to identify best practices (innovation leaders) and target the regions in need (lagging regions). This is also why regional innovation performance is measured annually in many countries, for example using the innovation scoreboard for EU regions (Hollanders et al., 2016).
The advantages of using indicators of innovation performance at the regional level can be summarized as follows (Evangelista et al., 2001): (1) analysts and statisticians have strong experience with the collection and use of such indicators; (2) these indicators comprehensively cover all countries, industries and technological fields; (3) long time-series data are available to study the dynamics in the innovation performance of firms and industries across regions. In addition, the input-output innovation relationship is considered to be more robust at the regional level than at the firm level (Audretsch and Feldman, 2004). This is attributed to both the important role of the regional context and the existence of externalities. Indeed, firm-level models may lead to incorrect inference when the regional context strongly affects the generation of innovations (Naz et al., 2015). However, note that the results obtained at the regional level cannot be interpreted at the level of individual firms, as these results might differ significantly from those obtained from firm-level data due to biased estimates (Naz et al., 2015).
The main concern in measuring regional innovation performance is the complexity of, and dynamic changes in, regional innovation systems (Hajek et al., 2014). As a result, the data used for the evaluation quickly become obsolete. Therefore, developing an accurate and reliable forecasting tool to support decision making is a challenging task for optimization methods. Non-linear machine learning methods such as fuzzy rule-based systems and neural networks have been used for innovation forecasting at the firm level (Wang and Chien, 2006, Chien et al., 2010). These methods outperformed traditional statistical forecasting models in terms of accuracy, indicating non-linear patterns in firm innovation activities. In addition, recent empirical evidence supports this assumption at both the regional (Hajek and Henriques, 2017) and national levels (de la Paz-Marín et al., 2012). Moreover, chaos theory was used to detect non-linearity and strange attractors in the evolutionary path of patent counts (Hung and Tu, 2014). Regarding innovation systems, Samara et al. (2012) developed an integrated system dynamics approach to analyse the impact of innovation policies on the performance of national innovation systems. However, no previous research known to us has forecasted the performance of regional innovation systems using artificial intelligence methods. The main advantage of these methods, compared with traditional statistical forecasting methods, is that no complex mathematical formulation of the input-output relationships is necessary. Moreover, traditional methods are not suitable for modelling phenomena characterised by a high variance (Castelli et al., 2016). In the case of regional innovation systems, the high variance is mainly due to the highly dynamic socio-economic environment. To address these issues, we develop a forecasting model based on genetic programming in this study.
Specifically, we use a recently proposed and very promising variant of standard genetic programming that integrates the concept of semantic awareness with local search optimizers to generate forecasting models. We argue that this model is more appropriate for modelling the intrinsically non-linear character of innovation performance than traditional statistical and machine learning forecasting models because genetic programming: (1) has excellent evolvability on training data (Vanneschi, 2017) and (2) is able to generalize the solution to testing data (Castelli et al., 2015b). Our approach combines two recent advancements in genetic programming, namely (1) geometric semantic operators (GSOs), which eliminate local optima by inducing a unimodal error surface for any forecasting problem, and (2) a local search optimizer (LSO), which speeds up convergence. The main idea of combining these approaches is to achieve a balance between exploration (GSO) and exploitation (LSO). As a result, the forecasting model can be optimized faster and overfitting can be avoided.
To verify the appropriateness of the proposed model for forecasting the performance of regional innovation systems, data on European regions for the period 2004–2012 were used. The inputs of the forecasting model are indicators related to the regional knowledge base (regional knowledge generation, absorption, and transfer capacity) and regional competitiveness indexes approximating the regional socio-technical, socio-economic and socio-political environment. The outputs include four indicators of the performance of regional innovation systems, namely patent counts, technological and non-technological innovation activity, and economic effects of innovations. The models are first trained to forecast innovation performance for 2010 and are then tested on 2012 data. We demonstrate that the proposed model outperforms other statistical and artificial intelligence methods in terms of accuracy on testing data.
The remainder of this paper is structured as follows. In the next section, we present the inputs and outputs of the model and describe the data used. The variant of genetic programming proposed in this study for the forecasting problem is introduced in the following section. Section 4 describes the setting of the forecasting model and provides the experimental results comparing the proposed approach with other variants of the genetic programming algorithm and other state-of-the-art forecasting methods. Finally, we conclude the paper, highlighting the main contributions of this study.
Section snippets
Data
Four interacting categories of determinants have been introduced into the models of regional innovation systems, namely regional competitiveness, knowledge generation, knowledge absorption and knowledge transfer (Hajek et al., 2014, Lau and Lo, 2015, Tödtling and Trippl, 2005, Samara et al., 2012). Table 1 presents the determinants of regional innovation performance used in this study.
Different socio-economic conditions and regional competitiveness have been reported as an important determinant
An introduction to genetic programming
Genetic Programming (GP) (Koza, 1992) is a computational method that belongs to the computational intelligence research area called evolutionary computation (Eiben et al., 2003). GP consists of the automated learning of computer programs by means of a process inspired by the theory of biological evolution of Darwin. In the context of GP, the word program can be interpreted in general terms, and thus GP can be applied to the particular cases of learning expressions, functions and, as in this
Geometric semantic genetic programming
Even though the term semantics can have several different interpretations, it is a common trend in the GP community (and this is what we do also here) to define the semantics of a solution as the vector s(T) = [T(x1), T(x2), …, T(xn)] of its output values. From this perspective, a GP individual can be identified by a point (its semantics s(T)) in a multidimensional space that we call semantic space (where the number of dimensions is equal to the number of observations in the training set (or
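The definition of semantics above can be illustrated with a minimal Python sketch. It assumes a toy data set and two hand-written parent programs that are not taken from the paper. The crossover shown is the commonly cited geometric semantic crossover, r·T1 + (1 − r)·T2 with r in [0, 1]; this excerpt does not spell out the operators, so the function names and the lookup table standing in for a random tree are assumptions made for illustration only.

```python
import random


def semantics(program, inputs):
    """Semantics s(T): the vector of outputs of a program on the training inputs."""
    return [program(x) for x in inputs]


def geometric_crossover(t1, t2, r):
    """Build the offspring r(x)*t1(x) + (1 - r(x))*t2(x).

    r is assumed to return values in [0, 1], so every output of the
    offspring lies between the corresponding outputs of the two parents.
    """
    return lambda x: r(x) * t1(x) + (1.0 - r(x)) * t2(x)


# Hypothetical toy training inputs and parent programs (not from the paper).
inputs = [0.0, 0.5, 1.0, 1.5, 2.0]
parent_a = lambda x: 2.0 * x + 1.0
parent_b = lambda x: x * x

# Stand-in for a random tree with outputs in [0, 1): a fixed random lookup table.
rng = random.Random(0)
weights = {x: rng.random() for x in inputs}
random_tree = lambda x: weights[x]

child = geometric_crossover(parent_a, parent_b, random_tree)

s_a = semantics(parent_a, inputs)
s_b = semantics(parent_b, inputs)
s_child = semantics(child, inputs)

for a, b, c in zip(s_a, s_b, s_child):
    print(f"parents: {a:.3f}, {b:.3f} -> child: {c:.3f}")
```

Each printed child value falls between the two parent values for the same observation, which is the geometric property the semantic space view relies on.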
Local search in GP and GSGP
In Section 5.1, we discuss previous approaches for integrating Local Search (LS) with standard GP. Afterwards, in Section 5.2, we present the first integration of a local searcher within GSGP.
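To give a rough idea of how a local searcher can be combined with a semantic operator, the following Python sketch fits the step size of a semantic mutation by least squares. This is a simplified stand-in, not the local searcher used in the paper, whose details are not shown in this excerpt. The mutation form T + ms·(TR1 − TR2), the function names and the numeric vectors are all assumptions chosen for illustration.

```python
def optimal_mutation_step(s_parent, s_perturb, target):
    """Closed-form least-squares step ms for the child parent + ms * perturb.

    Minimises sum((target - (parent + ms * perturb))**2) over ms.
    Returns 0.0 when the perturbation is all zeros.
    """
    num = sum(d * (y - p) for p, d, y in zip(s_parent, s_perturb, target))
    den = sum(d * d for d in s_perturb)
    return num / den if den > 0.0 else 0.0


def sse(pred, target):
    """Sum of squared errors between a semantics vector and the target vector."""
    return sum((y - p) ** 2 for p, y in zip(pred, target))


# Hypothetical semantics vectors (one entry per training observation).
target = [1.0, 2.0, 3.0, 4.0]
s_parent = [0.5, 1.5, 3.5, 3.0]
s_perturb = [0.2, -0.1, -0.4, 0.5]  # semantics of TR1 - TR2 (assumed)

ms = optimal_mutation_step(s_parent, s_perturb, target)
s_child = [p + ms * d for p, d in zip(s_parent, s_perturb)]

print("step:", ms)
print("parent error:", sse(s_parent, target))
print("child error:", sse(s_child, target))
```

Because ms = 0 reproduces the parent, the fitted step can never increase the training error. This is the exploitation side of the exploration/exploitation balance: the random perturbation supplies a direction, and the local search chooses how far to move along it.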
Experiments
This section describes the data pre-processing, experimental settings and the obtained results.
Conclusion and future directions
This study argued that the proposed GP-based model is more appropriate for modelling the intrinsically complex and non-linear character of regional innovation performance than traditional statistical and machine learning forecasting models. The results of this study indicate that the GP-based model significantly outperforms other forecasting models in terms of test error. These results also suggest that the proposed forecasting model not only provides a good solution on training data but it also avoids
Acknowledgements
We gratefully acknowledge the help provided by constructive comments of the anonymous referees. This work was supported by the scientific research project of the Czech Sciences Foundation Grant no: 17-11795S.
References (60)
- et al. Patents and innovation counts as measures of regional production of new knowledge. Res. Policy (2002)
- et al. Knowledge spillovers and the geography of innovation. Handb. Reg. Urban Econ. (2004)
- et al. Innovation and spillovers in regions: evidence from European patent data. Eur. Econ. Rev. (2003)
- et al. A multilevel and multistage efficiency evaluation of innovation systems: a multiobjective DEA approach. Expert Syst. Appl. (2016)
- et al. An artificial intelligence system to predict quality of service in banking organizations. Comput. Intell. Neurosci. (2016)
- et al. Application of neuro-fuzzy networks to forecast innovation performance - the example of Taiwanese manufacturing industry. Expert Syst. Appl. (2010)
- et al. Non-linear multiclassifier model based on artificial intelligence to predict research and development performance in European countries. Technol. Forecast Soc. Change (2012)
- et al. Measuring the regional dimension of innovation. Lessons from the Italian innovation survey. Technovation (2001)
- Continental, national and sub-national innovation systems—complementarity and economic growth. Res. Policy (2002)
- et al. The impact on innovation performance of different sources of knowledge: evidence from the UK community innovation survey. Res. Policy (2009)
- Visualising components of regional innovation systems using self-organizing maps—evidence from European regions. Technol. Forecast Soc. Change
- Efficiency of knowledge bases in urban population and economic growth - evidence from European cities. Cities
- Is small actually big? The chaos of technological change. Res. Policy
- Regional innovation system, absorptive capacity and innovation performance: an empirical study. Technol. Forecast Soc. Change
- The impact of innovation policies on the performance of national innovation systems: a system dynamics analysis. Technovation
- One size fits all? Res. Policy
- Linear and logistic regression analysis. Kidney Int.
- Forecasting innovation performance via neural networks - a case of Taiwanese manufacturing industry. Technovation
- EU Regional Competitiveness Index
- Constructing regional advantage: platform policies based on related variety and differentiated knowledge bases. Reg. Stud.
- Support vector regression. Neuronal Inf. Process. Lett. Rev.
- Bagging predictors. Mach. Learn.
- Stacked regressions. Mach. Learn.
- Methodological issues in measuring innovation performance of spatial units. Ind. Innov.
- A C++ framework for geometric semantic genetic programming. Genet. Program. Evolvable Mach.
- Geometric semantic genetic programming with local search
- A patent time series processing component for technology intelligence by trend identification functionality. Neural Comput. Appl.
- A multi-facet survey on memetic computation. IEEE Trans. Evol. Comput.
- Knowledge economies. Clusters, Learning and Cooperative Advantage