Petri net modeling of high-order genetic systems using grammatical evolution

doi:10.1016/S0303-2647(03)00142-4

Biosystems

Volume 72, Issues 1–2, November 2003, Pages 177-186

https://doi.org/10.1016/S0303-2647(03)00142-4

Abstract

Understanding how DNA sequence variations impact human health through a hierarchy of biochemical and physiological systems is expected to improve the diagnosis, prevention, and treatment of common, complex human diseases. We have previously developed a hierarchical dynamic systems approach based on Petri nets for generating biochemical network models that are consistent with genetic models of disease susceptibility. This modeling approach uses an evolutionary computation approach called grammatical evolution as a search strategy for optimal Petri net models. We have previously demonstrated that this approach routinely identifies biochemical network models that are consistent with a variety of genetic models in which disease susceptibility is determined by nonlinear interactions between two DNA sequence variations. In the present study, we evaluate whether the Petri net approach is capable of identifying biochemical networks that are consistent with disease susceptibility due to higher order nonlinear interactions between three DNA sequence variations. The results indicate that our model-building approach is capable of routinely identifying good, but not perfect, Petri net models. Ideas for improving the algorithm for this high-dimensional problem are presented.

Introduction

The field of human genetics is shifting its emphasis away from rare Mendelian diseases, such as cystic fibrosis, to common diseases that represent the vast majority of the public health burden. As this shift occurs, it is becoming increasingly clear that susceptibility to common human diseases, such as essential hypertension and sporadic breast cancer, is largely due to nonlinear interactions among multiple genes and multiple environmental factors (Ritchie et al., 2001; Moore and Williams, 2002; Moore, 2003). The idea that there will be a single gene with a large effect on disease susceptibility is not realistic. Thus, identification of genes that confer an increased susceptibility to a common disease will require a research strategy that embraces rather than ignores the complexity of these diseases. Further, if we are to successfully use genetic information at the public health level to improve the diagnosis, prevention, and treatment of common diseases, we will need to understand how DNA sequence information influences human health through hierarchical networks of biochemical and physiological systems. Making the connection between genes, biochemistry, and disease susceptibility using a discrete dynamic systems modeling approach is the focus of the present study.

We took the first step towards hierarchical systems modeling of disease susceptibility by addressing the following questions. First, is it possible to develop simple discrete dynamic systems models of biochemical networks that are consistent with nonlinear gene–gene interactions that are observed at the population level? Second, are these simple biochemical systems models biologically plausible? We used discrete dynamic systems models called Petri nets to develop two independent, biologically plausible, biochemical systems models of a well-known nonlinear gene–gene interaction model (unpublished results). This preliminary proof-of-principle study demonstrated the utility of Petri nets for modeling biochemical systems that are consistent with nonlinear gene–gene interactions in complex diseases. However, an important limitation of this modeling approach is that the Petri net models were developed by a manual trial-and-error approach that is time consuming and difficult due to combinatorial complexities. In response to this limitation, Moore and Hahn (2003a) developed a machine intelligence strategy that uses an evolutionary computation approach called grammatical evolution for the automatic discovery of Petri net models. This approach routinely generates Petri net models that are consistent with a variety of genetic models in which disease susceptibility is dependent on nonlinear interactions between two DNA sequence variations (Moore and Hahn, 2003a, 2003b).

The goal of the present study is to evaluate the ability of the grammatical evolution approach proposed by Moore and Hahn (2003a) to discover Petri net models of biochemical systems that are consistent with nonlinear interactions between three DNA sequence variations. The results indicate that our model-building approach is capable of routinely identifying good, but not perfect, Petri net models. Ideas for improving the algorithm for this high-dimensional problem are presented.

Section snippets

The nonlinear gene–gene interaction models

Our two high-order, nonlinear gene–gene interaction models are based on penetrance functions. Penetrance functions represent one approach to modeling the relationship between genetic variations and risk of disease. Penetrance is simply the probability (P) of disease (D) given a particular combination of genotypes (G) that was inherited (i.e. P[D|G]). A single genotype is determined by one allele (i.e. a specific DNA sequence state) inherited from the mother and one allele inherited from the
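As a minimal sketch of the penetrance idea, the Python below enumerates the 27 three-locus genotype combinations (three genotypes per locus, assuming two alleles per locus) and assigns each a P[D|G]. The genotype coding, the parity-based penetrance rule, the values 0.9 and 0.1, and the 0.5 high-risk threshold are all hypothetical placeholders chosen for illustration; they are not the two penetrance functions evaluated in the study.

```python
from itertools import product

# Three genotypes per biallelic locus, coded as the number of copies of one allele.
GENOTYPES = (0, 1, 2)

def all_genotype_combinations(n_loci=3):
    """Enumerate every multilocus genotype combination (3**n_loci of them)."""
    return list(product(GENOTYPES, repeat=n_loci))

def toy_penetrance(combo):
    """Hypothetical P[D|G]: high penetrance when the allele-count sum is odd.

    This parity rule is only an illustrative nonlinear interaction; it is
    NOT one of the models from the paper.
    """
    return 0.9 if sum(combo) % 2 == 1 else 0.1

def risk_labels(penetrance, threshold=0.5):
    """Label each genotype combination high-risk (True) or low-risk (False)."""
    return {g: penetrance(g) > threshold for g in all_genotype_combinations()}

labels = risk_labels(toy_penetrance)
print(len(labels), "genotype combinations,", sum(labels.values()), "labeled high-risk")
```

Under this toy rule no single locus determines risk on its own, which is the sense in which the interaction is nonlinear.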

Introduction to Petri nets for modeling discrete dynamic systems

Petri nets are a type of directed graph that can be used to model discrete dynamical systems (Desel and Juhas, 2001). Goss and Peccoud (1998) demonstrated that Petri nets could be used to model molecular interactions in biochemical systems. The core Petri net consists of two different types of nodes: places and transitions. Using the biochemical systems analogy of Goss and Peccoud (1998), places represent molecular species. Each place has a certain number of tokens that represent the number of
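The place and transition vocabulary can be illustrated with a small sketch. The class below is a bare place/transition net with integer token counts and weighted arcs; the class name, the method names, and the enzyme example are assumptions made for illustration, and the net has none of the timing features of the models used in the study.

```python
class PetriNet:
    """Minimal place/transition net with integer token counts."""

    def __init__(self, marking):
        # marking: dict mapping place name -> number of tokens
        self.marking = dict(marking)
        self.transitions = {}

    def add_transition(self, name, inputs, outputs):
        # inputs/outputs: dict mapping place name -> arc weight
        self.transitions[name] = (dict(inputs), dict(outputs))

    def enabled(self, name):
        """A transition is enabled when every input place holds enough tokens."""
        inputs, _ = self.transitions[name]
        return all(self.marking.get(p, 0) >= w for p, w in inputs.items())

    def fire(self, name):
        """Consume tokens from input places and produce tokens in output places."""
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for p, w in inputs.items():
            self.marking[p] -= w
        for p, w in outputs.items():
            self.marking[p] = self.marking.get(p, 0) + w

# Hypothetical example: an enzyme converts substrate to product and is recycled.
net = PetriNet({"substrate": 2, "enzyme": 1, "product": 0})
net.add_transition("reaction",
                   inputs={"substrate": 1, "enzyme": 1},
                   outputs={"enzyme": 1, "product": 1})
net.fire("reaction")
net.fire("reaction")
print(net.marking)
```

After two firings the substrate place is empty, so the transition is no longer enabled.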

Our Petri net modeling strategy

Moore and Hahn (2003a) developed a strategy for identifying Petri net models of biochemical systems that are consistent with observed population-level gene–gene interactions. The specific Petri nets used to model the biochemical pathways are Petri nets with time (Merlin, 1974; Ramchandani, 1974). Transitions had either a fixed delay or fired as soon as the preconditions of the transition were met. If a place provided input to two or more transitions but had only enough tokens to satisfy one

Overview of grammatical evolution

Evolutionary computation had many different independent origins, including evolutionary programming, which used simulated evolution for artificial intelligence (Fogel, 1962; Fogel et al., 1966), and evolution strategies for engineering optimization (Rechenberg, 1965; Schwefel, 1965). The focus of evolutionary computation on representations at the genotypic level led to the development of genetic algorithms by Holland (1969, 1975) and others. Genetic algorithms have become a popular
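For readers unfamiliar with grammatical evolution, the sketch below shows the commonly described genotype-to-phenotype mapping: each integer codon selects a production rule by taking the codon modulo the number of choices for the leftmost nonterminal. The toy grammar, the function name `map_genome`, and the wrapping limit are hypothetical and are not the grammar or settings used by Moore and Hahn.

```python
# Hypothetical toy grammar: a "net" is a sequence of arcs between places and transitions.
GRAMMAR = {
    "<net>": [["<arc>"], ["<arc>", "<net>"]],
    "<arc>": [["<place>", "->", "<transition>"], ["<transition>", "->", "<place>"]],
    "<place>": [["P0"], ["P1"]],
    "<transition>": [["T0"], ["T1"]],
}

def map_genome(genome, start="<net>", max_wraps=2):
    """Map integer codons to a sentence by leftmost derivation.

    Each nonterminal expansion consumes one codon and picks
    choices[codon % len(choices)]. The genome is reused (wrapped) up to
    max_wraps times; None is returned if the mapping does not terminate.
    """
    symbols = [start]
    output = []
    used = 0
    limit = len(genome) * (max_wraps + 1)
    while symbols:
        sym = symbols.pop(0)
        if sym not in GRAMMAR:
            output.append(sym)  # terminal symbol: emit as-is
            continue
        if used >= limit:
            return None  # ran out of codons, individual is invalid
        choices = GRAMMAR[sym]
        codon = genome[used % len(genome)]
        used += 1
        symbols = list(choices[codon % len(choices)]) + symbols
    return " ".join(output)

print(map_genome([0, 1, 2, 3]))
```

Because the search operates on plain integer strings while the grammar guarantees syntactically valid output, the same evolutionary operators can be reused for any structure the grammar can describe.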

Results

The grammatical evolution algorithm was run a total of 100 times for each of the two high-order, nonlinear gene–gene interaction models. For model 1, the grammatical evolution strategy did not yield a Petri net model that was perfectly consistent with the high-risk and low-risk assignments for each combination of genotypes. However, two models were identified that misclassified disease risk status for only one of the 27 genotype combinations. The worst model of the 100 misclassified 5 out of
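The misclassification counts reported here can be read as a comparison between target and predicted risk labels over the 27 genotype combinations. The following sketch shows that bookkeeping only; the function name, the parity-based target labels, and the single flipped prediction are hypothetical and do not reproduce the study's models or its fitness function.

```python
from itertools import product

def count_misclassified(target, predicted):
    """Count genotype combinations whose predicted risk label differs from the target."""
    return sum(1 for g in target if predicted[g] != target[g])

# All 27 three-locus genotype combinations, each locus coded 0, 1, or 2.
combos = list(product((0, 1, 2), repeat=3))

# Hypothetical target labels (True = high-risk), used only for illustration.
target = {g: sum(g) % 2 == 1 for g in combos}

# A hypothetical model output that gets exactly one combination wrong.
predicted = dict(target)
predicted[(0, 0, 0)] = not predicted[(0, 0, 0)]

errors = count_misclassified(target, predicted)
print(f"{errors} of {len(combos)} genotype combinations misclassified")
```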

Discussion

Moore and Hahn (2003a, 2003b) have previously developed a grammatical evolution approach to the discovery of discrete dynamic systems models that were consistent with genotype-specific distributions of disease risk for combinations of DNA sequence variations. These initial studies demonstrated that the grammatical evolution approach routinely identified Petri net models that are perfectly consistent with the high-risk and low-risk assignments for each combination of genotypes

Acknowledgements

We would like to thank two anonymous referees and the editors for their very thoughtful comments and suggestions. This work was supported by National Institutes of Health Grants HL65234, HL65962, GM31304, AG19085, and AG20135.

References (26)

Ritchie, M.D., et al., 2001. Multifactor dimensionality reduction reveals high-order interactions among estrogen metabolism genes in sporadic breast cancer. Am. J. Hum. Genet.
Cantu-Paz, E., 2000. Efficient and Accurate Parallel Genetic Algorithms. Kluwer Academic...
Desel, J., Juhas, G., 2001. What is a Petri net? Informal answers for the informed reader. In: Ehrig, H., Juhas, G....
Falconer, D.S., Mackay, T.F.C., 1996. Introduction to Quantitative Genetics. Longman,...
Fogel, L.J., 1962. Autonomous automata. Ind. Res.
Fogel, L.J., Owens, A.J., Walsh, M.J., 1966. Artificial Intelligence Through Simulated Evolution. Wiley, New...
Goldberg, D.E., 1989. Genetic Algorithms in Search, Optimization, and Machine Learning....
Goss, P.J., et al., 1998. Quantitative modeling of stochastic systems in molecular biology by using stochastic Petri nets. Proc. Natl. Acad. Sci. U.S.A.
Hahn, L.W., et al., 2003. Multifactor dimensionality reduction software for detecting gene–gene and gene–environment interactions. Bioinformatics
Holland, J.H., 1969. Adaptive plans optimal for payoff-only environments. In: Proceedings of the 2nd Hawaii...
Holland, J.H., 1975. Adaptation in Natural and Artificial Systems. University of Michigan Press, Ann...
Kitagawa, J., Iba, H., 2003. Identifying metabolic pathways and gene regulation networks with evolutionary algorithms....
Koza, J.R., 1992. Genetic Programming: On the Programming of Computers by Means of Natural Selection. The MIT...
