Implementation of the Context Tree Weighting algorithm as described in the paper by M. J. Willems.
Requires the javascript node environment.
git clone git@github.com:andrewrocks/ctw.git
cd ctw
npm install
npm test
To estimate the probability of the next symbol in a string.
> const ctw = require('ctw')
> new ctw.Predictor('01010101010101', 2).predict('0')
0.9281081081081081
> new ctw.Predictor('01010101010101', 2).predict('1')
0.0718918918918919
We compute the KT estimator with the recursive function shown on
wikipedia.
And there's another way to write the formula
which we see in On Prediction Using Variable Order Markov Models.
We test the implementation against examples from
Data Compression: The Complete Reference.
Our code can compute the estimated probability and weighted probability
for every node in the tree, just like in this diagram from the paper.
We don't. There are no trees. There doesen't seem to be some certain fancy
way of building them either. Have a look at the
On Prediction Using Variable Order Markov Models
paper:
They put into the tree every context seen in the given string without hesitating.
Even the paper includes the nodes for contexts that don't even occur in the tree.
Because the algorithm requires an initial context for the string to be
processed, we chop away D bits from the strings to use as context, the way they do
in the prediction paper,
Intuitivley we would have expected to see more conditional probability
assignments in the CTW paper, but apparently they can simply be implicit.
In On Prediction Using Variable Order Markov Models,
they actually say that they switch between a complete probability distribution
and a conditional one interchangebly.
We're computing the KT estimate with a recursive function. A memoized recursive function might not be the best way to compute these values. The cache could grow very large (or very sparse if the library throws our results). A better way might be to update the probabilities in the counts directly the way other libraries to it.