Created by W.Langdon from gp-bibliography.bib Revision:1.5020
In an attempt to fill this gap, many Systems Biology tools have been developed over the last decade to process the large quantity of data generated by high-throughput experimental techniques with an increasingly sophisticated range of mathematical modelling techniques. The aim of systems biology is to integrate models at multiple biological scales and to investigate system-level properties of biological organisms. This aim includes understanding at four levels:
(a) the structure of biological interaction networks;
(b) their dynamics, how states change over time in different conditions;
(c) the methods biological systems use to control the state of a cell;
(d) the design of systems, including both how they have evolved and how they may potentially be artificially constructed.
A key feature of systems biology is the integration of both theoretical modelling and empirical investigation, in which current biological knowledge informs the development of models and the analysis of these models produces a set of predictions that may then be tested in the laboratory.
Many models have been proposed to describe the network. One of the most extensively used is the Boolean network which, notwithstanding its numerous successes, can in some cases be too coarse.
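As an illustration of why a Boolean network is coarse, the sketch below models each gene as simply on or off and updates all genes synchronously. The three genes and their rules are hypothetical and chosen only for illustration; they are not taken from the thesis.

```python
from typing import Callable, Dict, List

State = Dict[str, bool]

# Hypothetical three-gene network; the rules are illustrative assumptions.
RULES: Dict[str, Callable[[State], bool]] = {
    "A": lambda s: not s["C"],         # C represses A
    "B": lambda s: s["A"],             # A activates B
    "C": lambda s: s["A"] and s["B"],  # A and B jointly activate C
}


def step(state: State) -> State:
    """Synchronous update: every gene reads the same previous state."""
    return {gene: rule(state) for gene, rule in RULES.items()}


def simulate(state: State, n_steps: int) -> List[State]:
    """Return the trajectory of states, starting with the initial one."""
    trajectory = [state]
    for _ in range(n_steps):
        state = step(state)
        trajectory.append(state)
    return trajectory


if __name__ == "__main__":
    start = {"A": True, "B": False, "C": False}
    for t, s in enumerate(simulate(start, 5)):
        print(t, s)
```

Because every gene is reduced to a single bit, intermediate expression levels and graded responses cannot be represented, which is the limitation the text refers to.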
Another widely studied candidate is the system of differential equations, which is a very powerful and flexible model for describing complex relations among components. However, it is not necessarily easy to determine a suitable form for the equations that represent the network. For this reason, previous studies fixed the form of the differential equations during the learning phase. As a result, their goal was simply to optimize the parameters, i.e., the coefficients of the fixed equations.
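A minimal sketch of the fixed-form approach described above, assuming a single gene whose expression follows dx/dt = a - b*x (constant production, linear decay). The equation form, the function names, and the synthetic data are assumptions made for illustration. Since the form is fixed, learning reduces to estimating the two coefficients, here by ordinary least squares on finite-difference derivatives.

```python
from typing import List, Tuple


def simulate(a: float, b: float, x0: float, dt: float, n_steps: int) -> List[float]:
    """Forward-Euler integration of the assumed model dx/dt = a - b*x."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(xs[-1] + dt * (a - b * xs[-1]))
    return xs


def fit_coefficients(xs: List[float], dt: float) -> Tuple[float, float]:
    """Estimate (a, b) by regressing finite-difference derivatives on x."""
    derivs = [(x1 - x0) / dt for x0, x1 in zip(xs, xs[1:])]
    levels = xs[:-1]
    n = len(levels)
    mean_x = sum(levels) / n
    mean_d = sum(derivs) / n
    sxx = sum((x - mean_x) ** 2 for x in levels)
    sxd = sum((x - mean_x) * (d - mean_d) for x, d in zip(levels, derivs))
    slope = sxd / sxx                      # estimates -b
    intercept = mean_d - slope * mean_x    # estimates a
    return intercept, -slope


if __name__ == "__main__":
    series = simulate(a=2.0, b=0.5, x0=0.0, dt=0.1, n_steps=50)
    print(fit_coefficients(series, dt=0.1))
```

The structure of the equation is never questioned in this setting. Only the coefficients are searched, which is the restriction that motivates searching over equation forms as well.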
In the analysis of gene expression time series presented in this thesis, a mathematical model has been identified and a system for reconstructing a gene regulatory network driven from data has been implemented. Based on Genetic Programming, it aims to extract knowledge and properties from data and thereby generate the network that underlies the behaviour of the genes. For this reason the system is called Data Driven Gene Regulatory Network Generator.
To identify the mutual interactions between genes, a Genetic Programming application for the extraction of the best activation function of the genes has also been developed. To test this system, it has been applied to a time-series dataset of microarray gene expression data from breast cancer, and a study aimed at predicting the survival of a set of cancer patients has also been performed. This study has led to the definition of a Medical Decision Support System.
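The abstract does not specify the representation, genetic operators, or fitness function used in the thesis. The following is only a generic sketch of how Genetic Programming can search for a function that maps regulator expression levels to a target gene's level. It is a mutation-only loop with truncation selection and elitism. All names (`x0`, `x1`, `evolve`, and so on) and the sample data are hypothetical.

```python
import operator
import random

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}
VARS = ["x0", "x1"]  # hypothetical regulator genes


def random_tree(rng, depth=3):
    """Grow a random expression tree: tuples are operators, leaves are vars/constants."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return rng.choice(VARS)
        return round(rng.uniform(-1, 1), 2)
    op = rng.choice(list(OPS))
    return (op, random_tree(rng, depth - 1), random_tree(rng, depth - 1))


def evaluate(tree, env):
    """Evaluate a tree given a dict of variable values."""
    if isinstance(tree, tuple):
        op, left, right = tree
        return OPS[op](evaluate(left, env), evaluate(right, env))
    if isinstance(tree, str):
        return env[tree]
    return tree


def mutate(tree, rng, depth=2):
    """Replace one randomly chosen subtree with a freshly grown one."""
    if not isinstance(tree, tuple) or rng.random() < 0.2:
        return random_tree(rng, depth)
    op, left, right = tree
    if rng.random() < 0.5:
        return (op, mutate(left, rng, depth), right)
    return (op, left, mutate(right, rng, depth))


def error(tree, samples):
    """Sum of squared errors between tree output and observed target values."""
    return sum((evaluate(tree, env) - target) ** 2 for env, target in samples)


def evolve(samples, pop_size=60, generations=30, seed=0):
    """Truncation selection with elitism; returns best tree and best-error history."""
    rng = random.Random(seed)
    population = [random_tree(rng) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda t: error(t, samples))
        history.append(error(population[0], samples))
        survivors = population[: pop_size // 4]  # kept unchanged (elitism)
        children = [mutate(rng.choice(survivors), rng)
                    for _ in range(pop_size - len(survivors))]
        population = survivors + children
    best = min(population, key=lambda t: error(t, samples))
    return best, history
```

A possible usage, with synthetic samples generated from an assumed target `x0 * x1 + x0`: build `samples` as a list of `({"x0": a, "x1": b}, a * b + a)` pairs and call `evolve(samples)`. A fuller system would normally add subtree crossover and a penalty on tree size.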
The gene activation functions produced by this system have subsequently been used to reconstruct the gene regulatory network that underlies the development, response and regulation of the biological system. To test it, a synthetic gene regulatory network has been reverse engineered and a dynamic simulation has been performed, allowing the related time series to be reconstructed. The network used for the reverse engineering is the recently published IRMA network, a yeast synthetic network for the assessment of reverse-engineering and modelling approaches.
Finally, in order to apply this system to a realistic gene regulatory network composed of thousands of genes, a new cluster kernel method has been identified and a framework driven by it has been developed. It is based on Gene Ontology to facilitate the detection of similar patterns of interacting genes, with the aim of reducing the dimensionality of the related time-series data.
Supervisors: Prof. Giancarlo Mauri, Prof. Leonardo Vanneschi
Genetic Programming entries for Antonella Farinaccio