java.lang.Object pt.tumba.ngram.compression.AdaptiveUnigramModel
public final class AdaptiveUnigramModel
Provides an adaptive model based on bytes observed in the input stream.
Each byte count is initialized at 1 and incremented by 1 for each instance
seen. If incrementing an outcome causes the total count to exceed MAX_COUNT,
then all counts are divided by 2 and rounded up. Estimation is by frequency
(also known as a maximum likelihood estimate).
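The counting scheme described above can be illustrated with a minimal sketch. It is not the source of AdaptiveUnigramModel: the class name FrequencyEstimateSketch, its methods observe and estimate, and the 257-slot layout (256 bytes plus end-of-file) are assumptions made for illustration, and rescaling is left out.

```java
import java.util.Arrays;

/** Hypothetical sketch; not the pt.tumba.ngram.compression source. */
final class FrequencyEstimateSketch {
    // Assumed layout: 256 byte outcomes plus one end-of-file outcome.
    static final int NUM_OUTCOMES = 257;

    private final int[] counts = new int[NUM_OUTCOMES];
    private int total;

    FrequencyEstimateSketch() {
        Arrays.fill(counts, 1);   // every outcome starts with a count of 1
        total = NUM_OUTCOMES;
    }

    /** Records one more observation of the outcome. */
    void observe(int outcome) {
        counts[outcome]++;
        total++;
    }

    /** Frequency estimate: the outcome's count divided by the total count. */
    double estimate(int outcome) {
        return (double) counts[outcome] / total;
    }

    public static void main(String[] args) {
        FrequencyEstimateSketch model = new FrequencyEstimateSketch();
        System.out.println(model.estimate('a')); // 1/257 before any input
        model.observe('a');
        model.observe('a');
        model.observe('a');
        System.out.println(model.estimate('a')); // 4/260 after three 'a' bytes
    }
}
```

Because every count starts at 1, no outcome is ever assigned a zero estimate, so any byte can be coded even if it has not been seen yet.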
Field Summary | |
---|---|
private int[] | _count: Counts for each outcome. |
private static int | EOF_INDEX: Index in the count array for the end-of-file outcome. |
private static int | MAX_COUNT: Maximum count before rescaling. |
private static int | NUM_BYTES: Total number of bytes. |
private static int | TOTAL_INDEX: Index in the count array for the cumulative total of all outcomes. |
Fields inherited from interface pt.tumba.ngram.compression.ArithCodeModel |
---|
EOF, ESCAPE |
Constructor Summary | |
---|---|
AdaptiveUnigramModel() | Constructs an adaptive unigram model, initializing all byte counts and end-of-file to 1. |
Method Summary | |
---|---|
boolean | escaped(int symbol): Returns true if current context has no count interval for given symbol. |
void | exclude(int i): Excludes outcome from occurring in next estimate. |
private int | highCount(int i): The cumulative count of all outcomes below given outcome plus the count of the outcome. |
void | increment(int i): Increments the model as if it had just encoded or decoded the specified symbol in the stream. |
void | interval(int symbol, int[] result): Calculates {low count, high count, total count} for the given symbol in the current context. |
private int | lowCount(int i): The cumulative count of all outcomes below given outcome. |
int | pointToSymbol(int midCount): Returns the symbol whose interval of low and high counts contains the given count. |
private void | rescale(): Rescales the counts by adding 1 to all counts and dividing by 2. |
int | totalCount(): Returns the total count for the current context. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
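private int[] _count
Counts for each outcome.
private static final int MAX_COUNT
Maximum count before rescaling.
private static final int NUM_BYTES
Total number of bytes.
private static final int EOF_INDEX
Index in the count array for the end-of-file outcome.
private static final int TOTAL_INDEX
Index in the count array for the cumulative total of all outcomes.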
Constructor Detail |
---|
public AdaptiveUnigramModel()
Constructs an adaptive unigram model, initializing all byte counts and end-of-file to 1.
Method Detail |
---|
public void interval(int symbol, int[] result)
Calculates {low count, high count, total count} for the given symbol in the
current context. The symbol is either an integer representation of a byte
(0-255) or -1 to denote end-of-file. The cumulative counts in the return must
be such that 0 <= low count < high count <= total count.
This method will be called exactly once for each symbol being encoded or
decoded, and the calls will be made in the order in which they appear in the
original file. Adaptive models may only update their state to account for
seeing a symbol
Specified by: interval in interface ArithCodeModel
Parameters:
symbol - The next symbol to decode.
result - Array into which to write range.
public int pointToSymbol(int midCount)
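Returns the symbol whose interval of low and high counts contains the given
count. The result may be one of the special symbols EOF or ESCAPE, which are
negative.
Specified by: pointToSymbol in interface ArithCodeModel
Parameters:
midCount - The given count.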
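The descriptions of interval and pointToSymbol suggest that the two are inverses over cumulative counts. The sketch below illustrates that relationship under stated assumptions: CumulativeCountSketch and its static methods are hypothetical names, outcomes are plain array indices rather than the byte and EOF encoding used by ArithCodeModel, and each outcome is assumed to own the half-open range from its low count up to (but excluding) its high count.

```java
import java.util.Arrays;

/** Hypothetical sketch of cumulative-count lookup; names are illustrative. */
final class CumulativeCountSketch {
    /** Sum of the counts of all outcomes below the given index. */
    static int lowCount(int[] counts, int index) {
        int sum = 0;
        for (int i = 0; i < index; i++) {
            sum += counts[i];
        }
        return sum;
    }

    /** Writes {low count, high count, total count} for the outcome into result. */
    static void interval(int[] counts, int index, int[] result) {
        int low = lowCount(counts, index);
        result[0] = low;
        result[1] = low + counts[index];
        result[2] = lowCount(counts, counts.length);
    }

    /** Finds the outcome whose range [low, high) contains the given count. */
    static int pointToIndex(int[] counts, int midCount) {
        int high = 0;
        for (int i = 0; i < counts.length; i++) {
            high += counts[i];
            if (midCount < high) {
                return i;
            }
        }
        throw new IllegalArgumentException("count out of range: " + midCount);
    }

    public static void main(String[] args) {
        int[] counts = {3, 1, 4};
        int[] result = new int[3];
        interval(counts, 2, result);
        System.out.println(Arrays.toString(result));  // prints [4, 8, 8]
        System.out.println(pointToIndex(counts, 3));  // prints 1
    }
}
```

With positive counts, every result satisfies 0 <= low count < high count <= total count, and any count inside an outcome's range maps back to that outcome.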
public int totalCount()
Returns the total count for the current context.
Specified by: totalCount in interface ArithCodeModel
public boolean escaped(int symbol)
Returns true if current context has no count interval for given symbol.
Successive calls to escaped(symbol) followed by interval(ESCAPE) must
eventually lead to a false return from escaped(symbol) after a number of
calls equal to the maximum context size. The integer representation of symbol
is as in interval.
Specified by: escaped in interface ArithCodeModel
Parameters:
symbol - Symbol to test whether it is encoded.
Returns: true if given symbol is not represented in the current context.
public void exclude(int i)
Excludes outcome from occurring in next estimate.
Specified by: exclude in interface ArithCodeModel
Parameters:
i - Symbol which can be excluded from the next outcome.
public void increment(int i)
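Increments the model as if it had just encoded or decoded the specified symbol in the stream.
Specified by: increment in interface ArithCodeModel
Parameters:
i - Symbol to add to the model.
private int lowCount(int i)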
The cumulative count of all outcomes below given outcome.
Parameters:
i - Index of given outcome.
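private int highCount(int i)
The cumulative count of all outcomes below given outcome plus the count of the outcome.
Parameters:
i - Index of given outcome.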
private void rescale()
Rescales the counts by adding 1 to all counts and dividing by 2.
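Adding 1 before an integer division by 2 is one way to divide by 2 and round up. A minimal sketch of that arithmetic follows; RescaleSketch is a hypothetical name, and returning the new total (rather than storing it in a slot of the count array) is an assumption made to keep the example short.

```java
import java.util.Arrays;

/** Hypothetical sketch of rescaling; not the actual private method. */
final class RescaleSketch {
    /** Halves each count, rounding up, and returns the new total. */
    static int rescale(int[] counts) {
        int total = 0;
        for (int i = 0; i < counts.length; i++) {
            counts[i] = (counts[i] + 1) / 2; // add 1, then integer-divide by 2
            total += counts[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] counts = {1, 2, 5, 8};
        int total = rescale(counts);
        System.out.println(Arrays.toString(counts) + " total=" + total);
        // prints [1, 1, 3, 4] total=9
    }
}
```

A count of 1 stays at 1 under this rule, so no outcome drops to zero and every outcome keeps a non-empty count interval after rescaling.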