abstract = "We have developed a method for generating a
hidden Markov model (HMM) that represents a complex
motif in DNA sequences. The method consists of two
steps: (1) design of HMMs for the elemental motifs in
the given DNA sequences; (2) construction of a complex
motif HMM composed of the elemental motif HMMs.
Statistical analysis and genetic programming (GP) were
applied to steps (1) and (2), respectively. In step (1),
left-to-right HMMs were designed and their lengths were
determined on the basis of statistical significance. In
step (2), a probabilistic tree describing HMMs was
defined and its structure was optimized by GP against a
complex motif.
Operations such as concatenation, probabilistic union,
and probabilistic closure were assigned to nonterminal
nodes. The elemental motif HMMs and an HMM matching any
single letter were assigned to terminal nodes. Designing
the elemental motif HMMs in advance and adopting the
probabilistic tree as the encoding rule for GP lead to
efficient generation of the complex motif HMM. It was
observed that the generated HMM can detect the complex
motif in uncharacterized DNA sequences with high
accuracy. Furthermore, the generated HMM offers many
interesting suggestions about the complex motif.
(author abst.)",