Skip to content

a-robu/ctw

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

42 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ctw

Implementation of the Context Tree Weighting algorithm as described in the paper by M. J. Willems.

Install and run unit tests

Requires the javascript node environment.

git clone git@github.com:andrewrocks/ctw.git
cd ctw
npm install
npm test

How to use the predictor

To estimate the probability of the next symbol in a string.

> const ctw = require('ctw')
> new ctw.Predictor('01010101010101', 2).predict('0')
0.9281081081081081
> new ctw.Predictor('01010101010101', 2).predict('1')
0.0718918918918919

Krichevsky–Trofimov estimator

We compute the KT estimator with the recursive function shown on wikipedia. wikipedia-function

And there's another way to write the formula

textbook-examples

which we see in On Prediction Using Variable Order Markov Models.

We test the implementation against examples from Data Compression: The Complete Reference. textbook-examples

Example Tree

Our code can compute the estimated probability and weighted probability for every node in the tree, just like in this diagram from the paper. diagram-of-tree

How we build the trees

We don't. There are no trees. There doesen't seem to be some certain fancy way of building them either. Have a look at the On Prediction Using Variable Order Markov Models paper: tree-making-hint They put into the tree every context seen in the given string without hesitating. Even the paper includes the nodes for contexts that don't even occur in the tree.

Initial Context

Because the algorithm requires an initial context for the string to be processed, we chop away D bits from the strings to use as context, the way they do in the prediction paper, diagram-of-tree

Prediction

Intuitivley we would have expected to see more conditional probability assignments in the CTW paper, but apparently they can simply be implicit. In On Prediction Using Variable Order Markov Models, they actually say that they switch between a complete probability distribution and a conditional one interchangebly. interchangable-snippet

To improve

We're computing the KT estimate with a recursive function. A memoized recursive function might not be the best way to compute these values. The cache could grow very large (or very sparse if the library throws our results). A better way might be to update the probabilities in the counts directly the way other libraries to it.

About

Implementation of the Context Tree Weighting algorithm.

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors