Created by W.Langdon from gp-bibliography.bib Revision:1.8081
Such barriers can be overcome using genetic programming. The aim of this thesis is to use genetic programming to produce classifiers, and in particular document representatives, that are human readable. Human readability makes these representatives more interactive and adaptable, because expert knowledge can be integrated into them.
Genetic programming, as a highly flexible non-deterministic method, is among the best options for producing human readable document representatives. The results of the chosen method are tested on standard test collections, so that the experiments are replicable and the results are reproducible by other researchers.
This thesis demonstrates the process of producing human readable document representatives that are transparent enough for further modification and analysis using expert knowledge, while retaining classification performance.
To obtain these findings, this thesis contributes to the field by developing a system that introduces a novel tree structure to improve the feature selection process, and a novel fitness function to improve the quality of the representative generator.
To produce a human readable representative, the tree structure is changed into a new shape with more control over the number of children. This reduces the depth of each tree for a given number of features and results in a flatter structure. A fitness function is constructed by combining classification accuracy on the training and validation sets with a parsimony component. This study found that the order in which documents are matched with representatives can improve overall performance. Different feature selection methods are investigated and integrated into our genetic programming based feature selection method, which relies on a probability distribution derived from the feature weights.
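The two mechanisms named above, a fitness that combines training accuracy, validation accuracy and parsimony, and feature selection driven by a weight-derived probability distribution, can be illustrated with a minimal Python sketch. The abstract does not give the actual formula, so the linear weighted sum, the weights `w_train`, `w_val` and `w_parsimony`, the size cap `max_size`, and both function names are assumptions made purely for illustration, not the method used in the thesis.

```python
import random


def fitness(train_acc, val_acc, tree_size, max_size=50,
            w_train=0.45, w_val=0.45, w_parsimony=0.10):
    """Hypothetical fitness: a weighted sum of two accuracies and a parsimony term.

    The linear form and all default weights are assumptions; the thesis
    only states that these three components are combined.
    """
    # Parsimony is 1.0 for an empty tree and 0.0 at or beyond max_size nodes.
    parsimony = 1.0 - min(tree_size, max_size) / max_size
    return w_train * train_acc + w_val * val_acc + w_parsimony * parsimony


def sample_feature(weights, rng):
    """Pick one feature with probability proportional to its weight (assumed scheme)."""
    features = list(weights)
    return rng.choices(features, weights=[weights[f] for f in features], k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical term weights for one document category.
    term_weights = {"gene": 3.0, "tree": 1.5, "fitness": 0.5}
    print(sample_feature(term_weights, rng))
    print(fitness(train_acc=0.9, val_acc=0.85, tree_size=12))
```

With equal accuracies, the smaller of two trees scores higher under this sketch, which is the role a parsimony component usually plays in keeping evolved representatives short enough to read.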
Supervisors: Masoud Saeedi and Ashok Jashapara
Genetic Programming entries for Yasaman Soltan-Zadeh