Created by W.Langdon from gp-bibliography.bib Revision:1.4910
Previous studies on bladder cancer have shown nodal involvement to be an independent indicator of prognosis and survival. This study aimed at developing an objective method for detection of nodal metastasis from molecular profiles of primary urothelial carcinoma tissues.

Methods
The study included primary bladder tumor tissues from 60 patients across different stages and 5 control tissues of normal urothelium. The entire cohort was divided into training and validation sets comprised of node positive and node negative subjects. Quantitative expression profiling was performed for a panel of 70 genes using standardized competitive RT-PCR and the expression values of the training set samples were run through an iterative machine learning process called genetic programming that employed an N-fold cross validation technique to generate classifier rules of limited complexity. These were then used in a voting algorithm to classify the validation set samples into those associated with or without nodal metastasis.

Results
The generated classifier rules using 70 genes demonstrated 81% accuracy on the validation set when compared to the pathological nodal status. The rules showed a strong predilection for ICAM1, MAP2K6 and KDR resulting in gene expression motifs that cumulatively suggested a pattern ICAM1>MAP2K6>KDR for node positive cases. Additionally, the motifs showed CDK8 to be lower relative to ICAM1, and ANXA5 to be relatively high by itself in node positive tumors. Rules generated using only ICAM1, MAP2K6 and KDR were comparably robust, with a single representative rule producing an accuracy of 90% when used by itself on the validation set, suggesting a crucial role for these genes in nodal metastasis.

Conclusion
Our study demonstrates the use of standardized quantitative gene expression values from primary bladder tumor tissues as inputs in a genetic programming system to generate classifier rules for determining the nodal status. Our method also suggests the involvement of ICAM1, MAP2K6, KDR, CDK8 and ANXA5 in unique mathematical combinations in the progression towards nodal positivity. Further studies are needed to identify more class-specific signatures and confirm the role of these genes in the evolution of nodal metastasis in bladder cancer.
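The abstract describes classifier rules built from relative gene expression and combined by a voting algorithm. A minimal Python sketch of that idea follows. The three rules are hypothetical illustrations modeled on the reported motifs (ICAM1>MAP2K6>KDR, CDK8 lower than ICAM1, ANXA5 high); they are not the published classifiers, and the ANXA5 threshold, the function names and the simple-majority cutoff are all assumptions.

```python
from typing import Callable, Dict, List

Sample = Dict[str, float]          # gene symbol -> expression value
Rule = Callable[[Sample], bool]    # True means "votes node positive"


# Hypothetical rules modeled on the motifs reported in the abstract.
def rule_icam1_map2k6_kdr(s: Sample) -> bool:
    return s["ICAM1"] > s["MAP2K6"] > s["KDR"]


def rule_cdk8_below_icam1(s: Sample) -> bool:
    return s["CDK8"] < s["ICAM1"]


def rule_anxa5_high(s: Sample, threshold: float = 1.0) -> bool:
    # The threshold is an arbitrary illustrative value, not from the paper.
    return s["ANXA5"] > threshold


def vote(sample: Sample, rules: List[Rule]) -> bool:
    """Return True (node positive) if a strict majority of rules fire."""
    positive_votes = sum(1 for rule in rules if rule(sample))
    return positive_votes * 2 > len(rules)


RULES: List[Rule] = [rule_icam1_map2k6_kdr, rule_cdk8_below_icam1, rule_anxa5_high]

if __name__ == "__main__":
    tumor = {"ICAM1": 5.0, "MAP2K6": 3.0, "KDR": 1.0, "CDK8": 2.0, "ANXA5": 0.5}
    print("node positive" if vote(tumor, RULES) else "node negative")
```

Each rule stays readable as a plain inequality between genes, which is the property the notes highlight as distinctive of GP output.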
65 samples. 11-fold cross validation. Max 7 genes per program.
Mixing of folds and majority voting scheme. 100 generations. p6 Analysis of gene usage 'motifs' (requires GP; could not be done with other approaches). Motifs indicate possible biochemical pathways.
p7 'Gene transitivity'. p12 'hypothesis-generating nature of GP'
p12 'A unique feature of GP is the final output, which consists of easily readable rules expressed as executable classifier programs that define tangible relationships between the most influential genes.' p12 'filtering can create an incomplete and biased dataset that may not be representative of many complex biological systems. The curse of dimensionality'
p13 'hierarchical, KNN, K-means clustering and Neural Nets which do not scale easily to larger numbers of variables.'
p13 GP can 'handle missing values in the data'.
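The notes mention 65 samples and 11-fold cross validation. As a rough sketch of what such a partition looks like, the Python below shuffles sample indices and deals them into folds. It assumes, purely for illustration, that all 65 samples are partitioned; the paper applies cross validation to its training set, whose size is not given here. The function names and the seed are hypothetical, and no stratification by nodal status is attempted.

```python
import random
from typing import Iterator, List, Tuple


def make_folds(n_samples: int, n_folds: int, seed: int = 0) -> List[List[int]]:
    """Shuffle sample indices and deal them round-robin into n_folds folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def train_test_splits(folds: List[List[int]]) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists, holding out one fold at a time."""
    for held_out, test in enumerate(folds):
        train = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        yield train, test


if __name__ == "__main__":
    folds = make_folds(n_samples=65, n_folds=11)
    for train, test in train_test_splits(folds):
        print(len(train), len(test))
```

With 65 samples and 11 folds the split is uneven: ten folds hold six samples and one holds five.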
Genetic Programming entries for Anirban P Mitra, Arpit A Almal, Ben George, David W Fry, Peter F Lenehan, Vincenzo Pagliarulo, Richard J Cote, Ram H Datar, William P Worzel