abstract = "MOTIVATION: Current approaches to contact map
prediction in proteins have focused on amino acid
conservation and patterns of mutation at sequentially
distant positions. This sequence information is poorly
understood and very little progress has been made in
this area during recent years. RESULTS: In this study,
an observation of 'striped' sequence patterns across
beta-sheets prompted the development of a new type of
contact map predictor. Computer program code was
evolved with an evolutionary algorithm (genetic
programming) to select residues and residue pairs
likely to make contacts based solely on local sequence
patterns extracted with the help of self-organising
maps. The mean prediction accuracy is 27\% on a
validation set of 156 domains up to 400 residues in
length, where contacts are separated by at least 8
residues and length/10 pairs are predicted. The
retrospective accuracy on a set of 15 CASP5 targets is
27\% and 14\% for length/10 and length/2
predicted pairs, respectively (both using a minimum
residue separation of 24). This compares favourably to
the equivalent 21\% and 13\% obtained for the
best automated contact prediction methods at CASP5. The
results suggest that protein architectures impose
regularities in local sequence environments. Other
sources of information, such as correlated/compensatory
mutations, may further improve accuracy. AVAILABILITY:
A web-based prediction service is available at
http://www.sbc.su.se/~maccallr/contactmaps",