abstract = "A novel method, MUTIC (Model Utilization-based Clustering), is described for
identifying complex interactions between genes or
gene-categories based on gene expression data. The
method deals with binary categorical data, which
consists of a set of gene expression profiles divided
into two biologically meaningful categories. It does
not require data from multiple time points. Gene
expression profiles are represented by feature vectors
whose component features are either gene expression
values, or averaged expression values corresponding to
Gene Ontology or Protein Information Resource
categories. A supervised learning algorithm (genetic
programming) is used to learn an ensemble of
classification models distinguishing the two categories
based on the feature vectors corresponding to their
members. Each feature is associated with a model usage
vector, which has an entry for each high-quality
classification model found, indicating whether or not
the feature was used in that model. These usage vectors
are then clustered using a variant of hierarchical
clustering called Omniclust. The result is a set of
model-usage-based clusters, in which features are
gathered together if they are often considered together
by classification models, which may be because they are
co-expressed, or may be for subtler reasons involving
multi-gene interactions. The MUTIC method is
illustrated by applying it to a dataset on gene
expression in human brains of various ages. Compared to
traditional expression-based clustering, MUTIC yields
clusters that have higher mathematical quality (in the
sense of homogeneity and separation) and also yield
novel insights into the underlying biological
processes.",
notes = "WCCI 2006 - A joint meeting of the IEEE, the EPS, and
the IEE.