pt.tumba.ngram.compression
Class ExcludingAdaptiveUnigramModel

java.lang.Object
  extended by pt.tumba.ngram.compression.ExcludingAdaptiveUnigramModel

final class ExcludingAdaptiveUnigramModel
extends java.lang.Object

Package-private class for use by PPMModel. A fragmentary adaptive unigram model that supports exclusions when converting points to intervals and vice versa. One such model is used for each unigram context.

Author:
Bruno Martins

Field Summary
private  int[] _count
          Counts for each outcome.
private static int EOF_INDEX
          Index in the count array for the end-of-file outcome.
private static int MAX_INDIVIDUAL_COUNT
          Maximum count before rescaling.
 
Constructor Summary
ExcludingAdaptiveUnigramModel()
          Construct an excluding adaptive unigram model.
 
Method Summary
 void increment(int i)
          Increment the count for the given outcome.
 void interval(int symbol, int[] result, ByteSet exclusions)
          Compute the resulting interval to code the specified symbol given the specified excluded bytes.
 int pointToSymbol(int midCount, ByteSet exclusions)
          Return the symbol corresponding to the specified count, given the specified excluded bytes.
private  void rescale()
          Rescale the counts by dividing all frequencies by 2, but taking a minimum of 1.
 int totalCount(ByteSet exclusions)
          Total count for the interval, given the specified set of exclusions.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

_count

private int[] _count
Counts for each outcome. Indices 0 to 255 for the usual counts, 256 for end-of-file, and 257 for total.


MAX_INDIVIDUAL_COUNT

private static final int MAX_INDIVIDUAL_COUNT
Maximum count before rescaling.

See Also:
Constant Field Values

EOF_INDEX

private static final int EOF_INDEX
Index in the count array for the end-of-file outcome.

See Also:
Constant Field Values

Constructor Detail

ExcludingAdaptiveUnigramModel

public ExcludingAdaptiveUnigramModel()
Construct an excluding adaptive unigram model.

Method Detail

interval

public void interval(int symbol,
                     int[] result,
                     ByteSet exclusions)
Compute the resulting interval to code the specified symbol given the specified excluded bytes.

Parameters:
symbol - Symbol to code.
result - Array into which the interval for coding the symbol is written.
exclusions - Bytes to exclude as possible outcomes for interval.
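
The interval computation described above can be sketched as follows. This is a minimal illustration, not the class's actual code: IntervalSketch is a hypothetical name, java.util.BitSet stands in for the package's ByteSet, and the {low, high, total} layout of the result array is an assumption, since the layout is not documented here. The symbol itself is assumed not to be excluded.

```java
import java.util.BitSet;

/** Hypothetical stand-in for illustration; not the library class. */
final class IntervalSketch {
    // Index 256 is the end-of-file outcome, per the _count field description.
    static final int EOF_INDEX = 256;

    /**
     * Writes {low, high, total} into result for the given symbol,
     * skipping every excluded outcome (assumed result layout).
     */
    static void interval(int[] counts, int symbol, int[] result, BitSet excluded) {
        int low = 0;
        int total = 0;
        for (int i = 0; i <= EOF_INDEX; i++) {
            if (excluded.get(i)) {
                continue; // excluded outcomes contribute nothing
            }
            if (i < symbol) {
                low += counts[i]; // mass of outcomes ordered before the symbol
            }
            total += counts[i];
        }
        result[0] = low;
        result[1] = low + counts[symbol];
        result[2] = total;
    }
}
```

Excluding outcomes shrinks the total, so the remaining symbols receive proportionally wider intervals and therefore cost fewer bits to code.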

pointToSymbol

public int pointToSymbol(int midCount,
                         ByteSet exclusions)
Return the symbol corresponding to the specified count, given the specified excluded bytes.

Parameters:
midCount - Count (a point within the total count) that identifies the symbol to return.
exclusions - Bytes to exclude from consideration.
Returns:
Symbol represented by specified count.
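
The point-to-symbol lookup is the inverse of the interval computation and might look like the sketch below. PointToSymbolSketch is a hypothetical name, java.util.BitSet stands in for ByteSet, and the linear scan and the half-open interval convention (low inclusive, high exclusive) are assumptions; the real class may search differently.

```java
import java.util.BitSet;

/** Hypothetical stand-in for illustration; not the library class. */
final class PointToSymbolSketch {
    // Index 256 is the end-of-file outcome, per the _count field description.
    static final int EOF_INDEX = 256;

    /** Returns the first non-excluded outcome whose cumulative count exceeds midCount. */
    static int pointToSymbol(int[] counts, int midCount, BitSet excluded) {
        int sum = 0;
        for (int i = 0; i <= EOF_INDEX; i++) {
            if (excluded.get(i)) {
                continue; // excluded outcomes occupy no part of the range
            }
            sum += counts[i];
            if (sum > midCount) {
                return i; // midCount falls inside this outcome's interval
            }
        }
        throw new IllegalArgumentException(
                "midCount " + midCount + " is not below the total count " + sum);
    }
}
```

The decoder must pass the same exclusions the encoder used, otherwise the cumulative counts differ and the wrong symbol is returned.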

totalCount

public int totalCount(ByteSet exclusions)
Total count for the interval, given the specified set of exclusions.

Parameters:
exclusions - Bytes to exclude as outcomes.
Returns:
Total count of all non-excluded outcomes.
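
Because the count array keeps a running total at index 257, one plausible way to obtain the excluded total is to subtract the excluded counts from it. The sketch below assumes this approach; TotalCountSketch and TOTAL_INDEX are hypothetical names, java.util.BitSet stands in for ByteSet, and the exclusion set is assumed to contain only byte values 0 to 255.

```java
import java.util.BitSet;

/** Hypothetical stand-in for illustration; not the library class. */
final class TotalCountSketch {
    // Index 257 holds the total, per the _count field description.
    static final int TOTAL_INDEX = 257;

    /** Returns the stored total minus the counts of all excluded outcomes. */
    static int totalCount(int[] counts, BitSet excluded) {
        int total = counts[TOTAL_INDEX];
        // Visit only the excluded outcomes (assumed to lie in 0..255).
        for (int i = excluded.nextSetBit(0); i >= 0; i = excluded.nextSetBit(i + 1)) {
            total -= counts[i];
        }
        return total;
    }
}
```

Subtracting touches only the excluded entries, which is cheaper than summing all 257 outcomes when few bytes are excluded.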

increment

public void increment(int i)
Increment the count for the given outcome.

Parameters:
i - Outcome to increment.

rescale

private void rescale()
Rescale the counts by dividing all frequencies by 2, but taking a minimum of 1.
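
How increment and rescale could interact is sketched below. RescaleSketch is a hypothetical class, MAX_COUNT is an illustrative threshold (the real value of MAX_INDIVIDUAL_COUNT is not given on this page), and the initial count of 1 per outcome is an assumption. Only the halving with a floor of 1 and the array layout come from the descriptions above.

```java
import java.util.Arrays;

/** Hypothetical stand-in for illustration; not the library class. */
final class RescaleSketch {
    static final int EOF_INDEX = 256;   // end-of-file outcome
    static final int TOTAL_INDEX = 257; // running total
    static final int MAX_COUNT = 8;     // illustrative threshold only

    final int[] counts = new int[TOTAL_INDEX + 1];

    RescaleSketch() {
        // Assumption: every outcome starts with a count of 1.
        Arrays.fill(counts, 0, TOTAL_INDEX, 1);
        counts[TOTAL_INDEX] = TOTAL_INDEX; // 257 outcomes, one count each
    }

    /** Adds one to the outcome and the total, rescaling once the outcome exceeds the maximum. */
    void increment(int i) {
        counts[i]++;
        counts[TOTAL_INDEX]++;
        if (counts[i] > MAX_COUNT) {
            rescale();
        }
    }

    /** Halves every count, never dropping below 1, and recomputes the total. */
    void rescale() {
        int total = 0;
        for (int i = 0; i <= EOF_INDEX; i++) {
            counts[i] = Math.max(1, counts[i] / 2);
            total += counts[i];
        }
        counts[TOTAL_INDEX] = total;
    }
}
```

The floor of 1 keeps every outcome codable after rescaling, while halving lets recent symbols outweigh older statistics.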