pt.tumba.ngram.compression
Class UniformModel

java.lang.Object
  extended by pt.tumba.ngram.compression.UniformModel
All Implemented Interfaces:
ArithCodeModel

public final class UniformModel
extends java.lang.Object
implements ArithCodeModel

A singleton uniform-distribution byte model. Provides a single static member, MODEL, that is a non-adaptive model assigning equal likelihood to all 256 byte values and the end-of-file marker. Coding with this model requires approximately -log2(1/257) = log2(257) ≈ 8.006 bits per symbol, including the end-of-file symbol.
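
The per-symbol cost quoted above follows directly from the uniform probability 1/257. A minimal sketch of that arithmetic; the class and method names are hypothetical and not part of this package:

```java
// Hypothetical helper for illustration; not part of pt.tumba.ngram.compression.
final class UniformCostSketch {
    /** Bits needed to code one outcome of probability 1/numOutcomes. */
    static double bitsPerSymbol(int numOutcomes) {
        // -log2(1/n) == log2(n); Math.log is the natural logarithm.
        return Math.log(numOutcomes) / Math.log(2.0);
    }

    public static void main(String[] args) {
        // 256 byte values plus the end-of-file marker.
        System.out.printf("%.3f bits per symbol%n", bitsPerSymbol(257));
    }
}
```

The overhead relative to a plain 8-bit byte comes entirely from reserving one extra outcome for end-of-file.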

Author:
Bruno Martins

Field Summary
private static int EOF_INDEX
          Index in the implicit count array for the end-of-file outcome.
static UniformModel MODEL
          A re-usable uniform model.
private static int NUM_BYTES
          Total number of bytes.
private static int NUM_OUTCOMES
          Index in the count array for the cumulative total of all outcomes.
 
Fields inherited from interface pt.tumba.ngram.compression.ArithCodeModel
EOF, ESCAPE
 
Constructor Summary
private UniformModel()
          Construct a uniform model.
 
Method Summary
 boolean escaped(int symbol)
          Returns true if current context has no count interval for given symbol.
 void exclude(int symbol)
          Excludes outcome from occurring in next estimate.
 void increment(int symbol)
          Increments the model as if it had just encoded or decoded the specified symbol in the stream.
 void interval(int symbol, int[] result)
          Calculates {low count, high count, total count} for the given symbol in the current context.
 int pointToSymbol(int midCount)
          Returns the symbol whose interval of low and high counts contains the given count.
 int totalCount()
          Returns the total count for the current context.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

MODEL

public static final UniformModel MODEL
A re-usable uniform model.


NUM_BYTES

private static final int NUM_BYTES
Total number of bytes.

See Also:
Constant Field Values

EOF_INDEX

private static final int EOF_INDEX
Index in the implicit count array for the end-of-file outcome.

See Also:
Constant Field Values

NUM_OUTCOMES

private static final int NUM_OUTCOMES
Index in the count array for the cumulative total of all outcomes.

See Also:
Constant Field Values

Constructor Detail

UniformModel

private UniformModel()
Construct a uniform model.

Method Detail

totalCount

public int totalCount()
Returns the total count for the current context.

Specified by:
totalCount in interface ArithCodeModel
Returns:
Total count for the current context.

pointToSymbol

public int pointToSymbol(int midCount)
Returns the symbol whose interval of low and high counts contains the given count. Ordinary outcomes are non-negative integers; the two special constants EOF and ESCAPE are negative.

Specified by:
pointToSymbol in interface ArithCodeModel
Parameters:
midCount - The given count.
Returns:
The symbol whose interval contains the given count.
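
For a uniform model this lookup can be very simple. The sketch below assumes that byte b owns the unit interval [b, b+1) and that end-of-file owns the last slot, [256, 257). That layout, the value 256 for EOF_INDEX, and the class name are assumptions made for illustration, not taken from the UniformModel source:

```java
// Hypothetical stand-in for illustration; not the UniformModel source.
final class PointToSymbolSketch {
    static final int EOF = -1;         // end-of-file symbol, as documented for interval
    static final int EOF_INDEX = 256;  // assumed slot for EOF, after the 256 byte values

    /** Maps a count in [0, 257) back to the symbol owning that unit interval. */
    static int pointToSymbol(int midCount) {
        return midCount == EOF_INDEX ? EOF : midCount;
    }
}
```

Under these assumptions, counts 0 through 255 map to themselves and count 256 maps to EOF.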

interval

public void interval(int symbol,
                     int[] result)
Calculates {low count, high count, total count} for the given symbol in the current context. The symbol is either an integer representation of a byte (0-255) or -1 to denote end-of-file. The cumulative counts written to result must satisfy 0 <= low count < high count <= total count.

This method will be called exactly once for each symbol being encoded or decoded, and the calls will be made in the order in which they appear in the original file. Adaptive models may only update their state to account for seeing a symbol after returning its current interval.

Specified by:
interval in interface ArithCodeModel
Parameters:
symbol - The next symbol to encode or decode.
result - Array into which the {low count, high count, total count} range is written.
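
As a sketch of how a uniform model could fill result, the example below gives every outcome an interval of width one out of a total of 257. The position of end-of-file at index 256 and all names here are assumptions for illustration; the helper isValid only restates the documented constraint on the counts:

```java
// Hypothetical stand-in for illustration; not the UniformModel source.
final class IntervalSketch {
    static final int EOF = -1;           // documented end-of-file symbol
    static final int EOF_INDEX = 256;    // assumed slot for EOF
    static final int NUM_OUTCOMES = 257; // 256 byte values plus EOF

    /** Writes {low count, high count, total count} for symbol into result. */
    static void interval(int symbol, int[] result) {
        int low = (symbol == EOF) ? EOF_INDEX : symbol;
        result[0] = low;
        result[1] = low + 1;
        result[2] = NUM_OUTCOMES;
    }

    /** Checks 0 <= low count < high count <= total count. */
    static boolean isValid(int[] r) {
        return 0 <= r[0] && r[0] < r[1] && r[1] <= r[2];
    }
}
```

With this layout, the byte 65 would be coded as {65, 66, 257} and end-of-file as {256, 257, 257}.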

escaped

public boolean escaped(int symbol)
Returns true if the current context has no count interval for the given symbol. Successive calls to escaped(symbol) followed by interval(ESCAPE) must eventually lead to a false return from escaped(symbol) after a number of calls equal to the maximum context size. The integer representation of symbol is as in interval.

Specified by:
escaped in interface ArithCodeModel
Parameters:
symbol - Symbol to test for an interval in the current context.
Returns:
true if given symbol is not represented in the current context.
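
The escaped/interval protocol described above can be sketched as a loop on the coding side. Everything in this example is hypothetical: the nested Model interface is a two-method subset written for illustration, ESCAPE is given the assumed value -2 (the documentation only states that it is negative), and the uniform stand-in assumes it never escapes because every byte and end-of-file has an interval:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the escape protocol; not code from this package.
final class EscapeLoopSketch {
    static final int ESCAPE = -2; // assumed value; documented only as negative

    /** Hypothetical subset of the model methods used by the loop. */
    interface Model {
        boolean escaped(int symbol);
        void interval(int symbol, int[] result);
    }

    /** Collects the intervals a coder would process for one symbol. */
    static List<int[]> intervalsFor(Model model, int symbol) {
        List<int[]> coded = new ArrayList<>();
        while (model.escaped(symbol)) {
            int[] esc = new int[3];
            model.interval(ESCAPE, esc); // code an escape, then retry
            coded.add(esc);
        }
        int[] result = new int[3];
        model.interval(symbol, result);  // the symbol's own interval
        coded.add(result);
        return coded;
    }

    /** Uniform stand-in: assumed never to escape. */
    static final Model UNIFORM = new Model() {
        public boolean escaped(int symbol) { return false; }
        public void interval(int symbol, int[] result) {
            int low = (symbol == -1) ? 256 : symbol;
            result[0] = low;
            result[1] = low + 1;
            result[2] = 257;
        }
    };
}
```

With a model that never escapes, the loop body is skipped and exactly one interval is produced per symbol.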

exclude

public void exclude(int symbol)
Excludes outcome from occurring in next estimate. A symbol must not be excluded and then coded or decoded. Exclusions in the model must be coordinated for encoding and decoding.

Specified by:
exclude in interface ArithCodeModel
Parameters:
symbol - Symbol which can be excluded from the next outcome.

increment

public void increment(int symbol)
Increments the model as if it had just encoded or decoded the specified symbol in the stream. May be used to prime models by "injecting" a symbol into the model's stream without coding/decoding it in the stream of coded bytes. Calls must be coordinated for encoding and decoding. Will be called automatically by the models for symbols they encode or decode.

Specified by:
increment in interface ArithCodeModel
Parameters:
symbol - Symbol to add to the model.
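
Because this class is described as non-adaptive, increment and exclude presumably leave its estimates unchanged. The sketch below illustrates that assumption with a hypothetical stand-in class; the no-op bodies and the interval layout are guesses for illustration, not the UniformModel source:

```java
// Hypothetical stand-in for illustration; not the UniformModel source.
final class NonAdaptiveSketch {
    private static final int EOF = -1;
    private static final int EOF_INDEX = 256;    // assumed layout
    private static final int NUM_OUTCOMES = 257; // 256 byte values plus EOF

    /** Assumed no-op: a fixed distribution keeps no counts to update. */
    void increment(int symbol) { }

    /** Assumed no-op: nothing is removed from the fixed distribution. */
    void exclude(int symbol) { }

    /** Same interval for a symbol regardless of earlier calls. */
    void interval(int symbol, int[] result) {
        int low = (symbol == EOF) ? EOF_INDEX : symbol;
        result[0] = low;
        result[1] = low + 1;
        result[2] = NUM_OUTCOMES;
    }
}
```

Under these assumptions, priming the model with any number of increment calls would not change the interval reported for a symbol.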