Multiobjective Optimization in Quantitative Structure-Activity Relationships: Deriving Accurate and Interpretable QSARs
Created by W.Langdon from
gp-bibliography.bib Revision:1.8051
@Article{Nicolotti:2002:JMC,
author = "Orazio Nicolotti and Valerie J. Gillet and
Peter J. Fleming and Darren V. S. Green",
title = "Multiobjective Optimization in Quantitative
Structure-Activity Relationships: Deriving Accurate and
Interpretable QSARs",
journal = "Journal of Medicinal Chemistry",
year = "2002",
volume = "45",
number = "23",
pages = "5069--5080",
month = nov # " 7",
keywords = "genetic algorithms, genetic programming, QSAR,
cheminformatics, MOGA, MOGP, GSK, Akaike Information
Criterion (AIC)",
ISSN = "0022-2623",
URL = "http://pubs3.acs.org/acs/journals/doilookup?in_doi=10.1021/jm020919o",
DOI = "10.1021/jm020919o",
abstract = "Deriving quantitative structure-activity relationship
(QSAR) models that are accurate, reliable, and easily
interpretable is a difficult task. In this study, two
new methods have been developed that aim to find useful
QSAR models that represent an appropriate balance
between model accuracy and complexity. Both methods are
based on genetic programming (GP). The first method,
referred to as genetic QSAR (or GPQSAR), uses a penalty
function to control model complexity. GPQSAR is
designed to derive a single linear model that
represents an appropriate balance between the variance
and the number of descriptors selected for the model.
The second method, referred to as multiobjective
genetic QSAR (MoQSAR), is based on multiobjective GP
and represents a new way of thinking of QSAR.
Specifically, QSAR is considered as a multiobjective
optimization problem that comprises a number of
competitive objectives. Typical objectives include
model fitting, the total number of terms, and the
occurrence of nonlinear terms. MoQSAR results in a
family of equivalent QSAR models where each QSAR
represents a different tradeoff in the objectives. A
practical consideration often overlooked in QSAR
studies is the need for the model to promote an
understanding of the biochemical response under
investigation. To accomplish this, chemically intuitive
descriptors are needed but do not always give rise to
statistically robust models. This problem is addressed
by the addition of a further objective, called chemical
desirability, that aims to reward models that consist
of descriptors that are easily interpretable by
chemists. GPQSAR and MoQSAR have been tested on various
data sets including the Selwood data set and two
different solubility data sets. The study demonstrates
that the MoQSAR method is able to find models that are
at least as good as models derived using standard
statistical approaches and also yields models that
allow a medicinal chemist to trade statistical
robustness for chemical interpretability.",
notes = "http://pubs.acs.org/journals/jmcmar/index.html
PMID: 12408718",
}