Genre Classification

This thesis introduces an approach for automatically recognizing a text's genre. In a first step, I developed a new hierarchy of genres and explored the characteristics of each genre class. Based on these findings, I implemented a specific analyzer for each genre, a kind of hand-crafted decision tree. In a final step, I tried different approaches to combining these classifiers. The results were compared with those achieved by standard algorithms used in machine learning and knowledge discovery.
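
The idea of one hand-crafted analyzer per genre, combined into a single decision, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the genre names ("faq", "article"), the features (question_ratio, avg_sentence_len, link_ratio), the thresholds, and the highest-score combination rule are hypothetical and are not taken from the thesis.

```python
from typing import Callable, Dict, Optional

# Hypothetical surface features of a document (not the thesis's feature set).
Features = Dict[str, float]
Analyzer = Callable[[Features], float]


def analyze_faq(f: Features) -> float:
    """Hand-written rule cascade for a made-up 'faq' genre; returns a score in [0, 1]."""
    if f.get("question_ratio", 0.0) > 0.3:
        return 0.9 if f.get("avg_sentence_len", 0.0) < 20 else 0.6
    return 0.1


def analyze_article(f: Features) -> float:
    """Hand-written rule cascade for a made-up 'article' genre."""
    if f.get("avg_sentence_len", 0.0) >= 20:
        return 0.8 if f.get("link_ratio", 0.0) < 0.1 else 0.5
    return 0.2


# One analyzer per genre; the mapping itself is illustrative.
ANALYZERS: Dict[str, Analyzer] = {"faq": analyze_faq, "article": analyze_article}


def classify(f: Features, threshold: float = 0.5) -> Optional[str]:
    """Combine the analyzers by taking the highest score (one possible strategy).

    Returns None when no analyzer is confident enough.
    """
    scores = {genre: analyzer(f) for genre, analyzer in ANALYZERS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None
```

Taking the highest score is only one way to combine such classifiers; voting or a fixed priority order between genres would be equally plausible alternatives.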

The evaluation showed that my methods achieved better results than the standard algorithms. Average recall and precision are close to 60% and 75%, respectively. Performance varied considerably between genres.
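
For readers unfamiliar with these measures, the following sketch shows how per-genre precision and recall can be computed and then averaged. It assumes each document has exactly one gold genre and one predicted genre, and that the averages are taken over genres (macro-averaging); the abstract does not state which averaging was used, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def per_genre_scores(gold: List[str], predicted: List[str]) -> Dict[str, Tuple[float, float]]:
    """Return {genre: (precision, recall)} for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[g] += 1  # g was the true genre but was missed
    scores = {}
    for genre in set(gold) | set(predicted):
        predicted_total = tp[genre] + fp[genre]
        gold_total = tp[genre] + fn[genre]
        precision = tp[genre] / predicted_total if predicted_total else 0.0
        recall = tp[genre] / gold_total if gold_total else 0.0
        scores[genre] = (precision, recall)
    return scores


def macro_average(scores: Dict[str, Tuple[float, float]]) -> Tuple[float, float]:
    """Average precision and recall over genres, weighting each genre equally."""
    n = len(scores)
    avg_precision = sum(p for p, _ in scores.values()) / n
    avg_recall = sum(r for _, r in scores.values()) / n
    return avg_precision, avg_recall
```

Because every genre counts equally in a macro-average, a few poorly recognized genres can lower the average noticeably, which fits the observation that performance varied between genres.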

A list of English publications can be found here.

Download PDF (in German, 1.6 MB)