Genre Classification

These Perl programs can be used to identify the individual genres. contains_xxx.pl recognizes whether genre xxx is contained in the text; find_xxx.pl recognizes whether the complete text is of genre xxx.

Usage:

perl program.pl [DIRECTORY [CORPUS [FILENAME]]]
where CORPUS := train | test
If no filename is given, the whole directory is processed.
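
The nested brackets in the usage line mean that each argument can only be given if the preceding ones are given too. The following is a minimal sketch of how such positional arguments could be read, not the code of the actual programs. The helper name parse_args, the field names and the directory name 'corpora' are hypothetical. Omitted arguments are left undefined here, because the defaults the real programs use are not documented in this file.

```perl
use strict;
use warnings;

# Hypothetical helper: maps the positional arguments onto named fields.
# Omitted arguments stay undef; the defaults used by the real programs
# are not known here and are therefore not filled in.
sub parse_args {
    my @args = @_;
    die "Usage: perl program.pl [DIRECTORY [CORPUS [FILENAME]]]\n"
        if @args > 3;

    my ($directory, $corpus, $filename) = @args;

    # CORPUS := train | test
    if (defined $corpus && $corpus !~ /^(?:train|test)$/) {
        die "CORPUS must be 'train' or 'test', got '$corpus'\n";
    }

    return {
        directory => $directory,
        corpus    => $corpus,
        filename  => $filename,
        # Without a filename the whole directory is processed.
        whole_dir => defined $filename ? 0 : 1,
    };
}

# Example: directory and corpus given, no filename ('corpora' is a made-up name).
my $opts = parse_args('corpora', 'train');
print $opts->{whole_dir}
    ? "process whole directory\n"
    : "process $opts->{filename}\n";
```

In a real script the call would be parse_args(@ARGV); the literal arguments above only keep the sketch self-contained.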

For the programs contains_literatur and find_literatur, the TreeTagger has to be installed and the path to it adjusted in those programs. Some other programs need already tagged versions of the files. See the corpora for examples of what these files have to look like.

All programs