Packages
pt.tumba.ngram The TCatNG Toolkit is a Java package that you can use to apply N-Gram analysis techniques to the process of categorizing text files.
pt.tumba.ngram.bayes Implementation of Bayesian Network Classifiers that can be used to categorize text files using N-Grams as features.
pt.tumba.ngram.blr Implementation of Bayesian Logistic Regression classification that can be used to categorize text files using N-Grams as features, based on the "Bayesian Logistic Regression Software" package* by Alexander Genkin, David D.
pt.tumba.ngram.compression Implementation of the compression-based classification technique described in the papers "Towards Parameter-Free Data Mining" and "The Similarity Metric", respectively by Keogh et al. and Ming Li et al.
pt.tumba.ngram.svm Implementation of Support Vector Machines classification and regression that can be used to categorize text files using N-Grams as features.
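
All of the packages above use N-Grams as features. As a rough illustration of what such features look like, the following minimal sketch counts character N-Grams in a string. It is not taken from TCatNG: the class name NGramProfile and the method countNGrams are hypothetical, and the toolkit's own classes may extract and weight N-Grams differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only; not part of the pt.tumba.ngram API.
public class NGramProfile {

    // Counts every contiguous substring of length n (a character N-Gram) in text.
    public static Map<String, Integer> countNGrams(String text, int n) {
        if (n <= 0) {
            throw new IllegalArgumentException("n must be positive");
        }
        Map<String, Integer> counts = new HashMap<>();
        // Slide a window of width n over the text, one character at a time.
        for (int i = 0; i + n <= text.length(); i++) {
            counts.merge(text.substring(i, i + n), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Bigram counts for a short sample string.
        Map<String, Integer> profile = countNGrams("banana", 2);
        System.out.println(profile);
    }
}
```

For the input "banana" with n = 2, the resulting map holds "ba" once and "an" and "na" twice each. A classifier would typically compare such frequency profiles across documents, or use the counts as feature values.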