pt.tumba.ngram
Class NGramConstants

java.lang.Object
  extended by pt.tumba.ngram.NGramConstants

public class NGramConstants
extends java.lang.Object

Constant values used in the TCatNG package.

Author:
Bruno Martins

Field Summary
static int SIMILARITYJIANG
          Use the similarity metric proposed by Jiang & Conrath in "Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy".
static int SIMILARITYLIN
          Use the similarity metric proposed by Lin in "An information-theoretic definition of similarity".
static int SIMILARITYOUTOFPLACE
          Use the similarity metric proposed by Cavnar & Trenkle.
static byte[] SKIPABLE
          Bytes skipped while building the profiles.
static boolean SMOOTHING
          Use Good-Turing smoothing on the N-gram occurrence frequency.
static int USEDNGRAMSMAX
          The lowest ranking position for storage in the N-gram profiles.
static int USEDNGRAMSMIN
          The highest ranking position for storage in the N-gram profile.
 
Constructor Summary
NGramConstants()
           
 
Method Summary
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

SKIPABLE

public static final byte[] SKIPABLE
Bytes skipped while building the profiles. These correspond to numeric characters.
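The filtering step described above can be sketched as follows. This is only an illustration: the class `SkipBytesSketch`, the method names, and the assumption that the skipped bytes are exactly the ASCII digits are hypothetical, and the real contents of SKIPABLE may differ.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

final class SkipBytesSketch {
    // Assumed contents: the ASCII digits '0'..'9'. The real SKIPABLE array may differ.
    static final byte[] ASSUMED_SKIPPABLE = "0123456789".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the byte appears in the assumed skip list.
    static boolean isSkippable(byte b) {
        for (byte s : ASSUMED_SKIPPABLE) {
            if (s == b) {
                return true;
            }
        }
        return false;
    }

    // Copies the input, dropping every skippable byte, before N-grams are extracted.
    static byte[] strip(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(input.length);
        for (byte b : input) {
            if (!isSkippable(b)) {
                out.write(b);
            }
        }
        return out.toByteArray();
    }
}
```

Under these assumptions, stripping the ASCII bytes of "abc123def4" leaves the bytes of "abcdef".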


USEDNGRAMSMAX

public static final int USEDNGRAMSMAX
The lowest ranking position for storage in the N-gram profiles. For instance, with USEDNGRAMSMAX=400, only the 400 most frequent N-grams will be stored.

See Also:
Constant Field Values

USEDNGRAMSMIN

public static final int USEDNGRAMSMIN
The highest ranking position for storage in the N-gram profile. For instance, with USEDNGRAMSMIN=200, the 200 most frequent N-grams will be skipped.
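One way the two rank bounds might combine is sketched here. It is not TCatNG's code: the class `RankWindowSketch`, the method `select`, the alphabetical tie-break, and the assumption that the stored window runs from rank USEDNGRAMSMIN+1 through rank USEDNGRAMSMAX are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class RankWindowSketch {
    // Returns the N-grams whose 1-based frequency rank lies in (minRank, maxRank].
    static List<String> select(Map<String, Integer> counts, int minRank, int maxRank) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        // Most frequent first; ties are broken alphabetically so the order is deterministic.
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        // Clamp the window to the number of N-grams actually observed.
        int from = Math.min(Math.max(minRank, 0), entries.size());
        int to = Math.min(Math.max(maxRank, from), entries.size());
        List<String> kept = new ArrayList<>();
        for (Map.Entry<String, Integer> e : entries.subList(from, to)) {
            kept.add(e.getKey());
        }
        return kept;
    }
}
```

With five N-grams of distinct counts, `select(counts, 1, 3)` skips the most frequent one and keeps the second and third.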

See Also:
Constant Field Values

SIMILARITYLIN

public static final int SIMILARITYLIN
Use the similarity metric proposed by Lin in "An information-theoretic definition of similarity".
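For orientation, Lin's measure divides twice the information content of what two items have in common by the sum of their individual information contents. The sketch below shows only that ratio; the class `LinSimilaritySketch` and its methods are hypothetical, and how TCatNG derives the three information-content values from N-gram profiles is not specified here.

```java
final class LinSimilaritySketch {
    // Information content of an event with probability p, in nats: -ln(p).
    static double informationContent(double probability) {
        if (probability <= 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be in (0, 1]");
        }
        return -Math.log(probability);
    }

    // Lin similarity: 2 * IC(common) / (IC(a) + IC(b)).
    static double similarity(double icA, double icB, double icCommon) {
        double denominator = icA + icB;
        if (denominator <= 0.0) {
            throw new IllegalArgumentException("information contents must sum to a positive value");
        }
        return 2.0 * icCommon / denominator;
    }
}
```

The result is 1 when the shared information equals the information in each item, and falls toward 0 as the items share less.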

See Also:
Constant Field Values

SIMILARITYJIANG

public static final int SIMILARITYJIANG
Use the similarity metric proposed by Jiang & Conrath in "Semantic Similarity Based on Corpus Statistics and Lexical Taxonomy".
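The Jiang & Conrath measure is usually stated as a distance: the sum of the two items' information contents minus twice the information content of what they share. The sketch below computes only that quantity; the class `JiangConrathSketch` is hypothetical, and how TCatNG obtains the information-content values or converts the distance into a similarity score is an open assumption.

```java
final class JiangConrathSketch {
    // Jiang-Conrath distance: IC(a) + IC(b) - 2 * IC(common).
    // A smaller distance means the two items are more similar.
    static double distance(double icA, double icB, double icCommon) {
        return icA + icB - 2.0 * icCommon;
    }
}
```

The distance is 0 when the shared information equals the information in each item.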

See Also:
Constant Field Values

SIMILARITYOUTOFPLACE

public static final int SIMILARITYOUTOFPLACE
Use the similarity metric proposed by Cavnar & Trenkle.
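The Cavnar & Trenkle "out-of-place" measure compares two ranked N-gram profiles by summing, for each N-gram in the document profile, how far its rank is from its rank in the category profile. A minimal sketch follows; the class `OutOfPlaceSketch` is hypothetical, and the penalty used for N-grams missing from the category profile (here, the category profile size) is an assumption rather than TCatNG's documented choice.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class OutOfPlaceSketch {
    // Both lists hold N-grams ordered from most to least frequent.
    // A lower total means the profiles are more alike.
    static int distance(List<String> document, List<String> category) {
        Map<String, Integer> categoryRank = new HashMap<>();
        for (int i = 0; i < category.size(); i++) {
            categoryRank.put(category.get(i), i);
        }
        // Assumed penalty for an N-gram that the category profile does not contain.
        int missingPenalty = category.size();
        int total = 0;
        for (int i = 0; i < document.size(); i++) {
            Integer rank = categoryRank.get(document.get(i));
            total += (rank == null) ? missingPenalty : Math.abs(rank - i);
        }
        return total;
    }
}
```

Because the value grows as the profiles diverge, a classifier built on it would pick the category with the smallest total.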

See Also:
Constant Field Values

SMOOTHING

public static final boolean SMOOTHING
Use Good-Turing smoothing on the N-gram occurrence frequency.
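The basic Good-Turing estimate replaces a raw count r with r* = (r + 1) * N(r+1) / N(r), where N(r) is the number of distinct N-grams seen exactly r times. The sketch below applies that formula directly; the class `GoodTuringSketch` is hypothetical, and keeping the raw count when N(r+1) is zero is a simplifying assumption. TCatNG may use a different variant.

```java
import java.util.HashMap;
import java.util.Map;

final class GoodTuringSketch {
    // Returns Good-Turing adjusted counts for each N-gram.
    static Map<String, Double> smooth(Map<String, Integer> counts) {
        // N(r): how many distinct N-grams occur exactly r times.
        Map<Integer, Integer> freqOfFreq = new HashMap<>();
        for (int r : counts.values()) {
            freqOfFreq.merge(r, 1, Integer::sum);
        }
        Map<String, Double> adjusted = new HashMap<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            int r = e.getValue();
            int nr = freqOfFreq.get(r);
            int nrPlus1 = freqOfFreq.getOrDefault(r + 1, 0);
            // Assumed fallback: keep the raw count when no N-gram occurs r + 1 times.
            double rStar = nrPlus1 > 0 ? (r + 1) * (double) nrPlus1 / nr : r;
            adjusted.put(e.getKey(), rStar);
        }
        return adjusted;
    }
}
```

For counts {a=1, b=1, c=1, d=2, e=3}, N(1)=3 and N(2)=1, so each singleton is adjusted to 2 * 1 / 3, which shifts probability mass away from rarely seen N-grams.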

See Also:
Constant Field Values
Constructor Detail

NGramConstants

public NGramConstants()