Created by W.Langdon from gp-bibliography.bib Revision:1.8168
The present study is based on the premiss that authorship attribution is just one type of text classification and that advances in this area can be made by applying and adapting techniques from the field of machine learning.
Five different trainable text-classification systems are described, which differ from current stylometric practice in a number of ways, in particular by using a wider variety of marker patterns than customary and by seeking such markers automatically, without being told what to look for. A comparison of the strengths and weaknesses of these systems, when tested on a representative range of text-classification problems, confirms the importance of paying more attention than usual to alternative methods of representing distinctive differences between types of text.
The thesis concludes with suggestions on how to make further progress towards the goal of a fully automatic, trainable text-classification system.
Genetic Programming entries for Richard Forsyth