WORDCOMPRESSION

Let's test different approaches for compressing and decompressing a dictionary of English words (uppercase, no accents or punctuation, 1-5 letters long) in JavaScript.

The goal is not to find the smallest encoded string, but the string that will compress the best through RoadRoller.js and gzip.
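
To compare candidate encodings on this criterion, it helps to measure the gzipped size rather than the raw length. Below is a minimal sketch using Node's built-in `zlib` module. The helper name `gzippedSize` and the sample word list are assumptions made for illustration, and the RoadRoller.js step is not reproduced here, so the numbers only approximate the final result.

```javascript
const zlib = require("zlib");

// Size in bytes of a string after gzip at the maximum compression level.
// (Hypothetical helper, not part of this repository.)
function gzippedSize(text) {
  return zlib.gzipSync(Buffer.from(text, "utf8"), { level: 9 }).length;
}

// Tiny illustrative word list; the real input would be the contents of words.txt.
const sampleWords = ["CAR", "CARS", "CAT", "CATS", "DOG"];

// Two candidate serializations of the same data.
const asLines = sampleWords.join("\n");
const asJson = JSON.stringify(sampleWords);

console.log("lines:", asLines.length, "raw,", gzippedSize(asLines), "gzipped");
console.log("json: ", asJson.length, "raw,", gzippedSize(asJson), "gzipped");
```

On a list this small the gzip header dominates, so the comparison only becomes meaningful on the full dictionary.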

Previous work:

words.txt: raw data (14915 words, 80.8 kb)
words.js: json data (109 kb)
words.txt zipped: ~24 kb
txt + RoadRoller.js + zip: ~14 kb
json + MiniPrefixRemover.js + RoadRoller.js + zip: 12.4kb (12741b)

New approaches:

prefix.html:
Alphabetical ordering + smart prefix handling + one magic number to represent "last word + s"
encoded json + RoadRoller.js + zip: 11.3kb (11585b)
prefix_remapped.html:
same as above but with a remap of the encoded alphabet to use the most used letters first
encoded json + RoadRoller.js + zip: 11.2kb (11447b)
prefix_remapped_shuffled.html:
same as above but with a more zip-friendly order for the dictionary (see shuffler.html)
encoded json + RoadRoller.js + zip: 11.1kb (11335b)
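
The prefix idea can be sketched as follows. This is not the exact format used in prefix.html: the token layout (one digit for the shared prefix length, then the remaining letters) and the choice of `5` as the magic number for "last word + S" are assumptions made for illustration. The digit `5` is free because distinct words of at most 5 letters can share at most 4 leading letters.

```javascript
// Hypothetical marker meaning "previous word + S".
const PLURAL = "5";

// Encode a word list as a stream of tokens: <shared prefix length><new letters>.
function encode(words) {
  const sorted = [...new Set(words)].sort(); // alphabetical, duplicates removed
  let prev = "";
  let out = "";
  for (const word of sorted) {
    if (prev !== "" && word === prev + "S") {
      out += PLURAL; // a single character replaces the whole plural
    } else {
      // Count the leading letters shared with the previous word.
      let shared = 0;
      while (shared < prev.length && shared < word.length && prev[shared] === word[shared]) {
        shared++;
      }
      out += shared + word.slice(shared);
    }
    prev = word;
  }
  return out;
}

// Rebuild the sorted word list from the token stream.
function decode(encoded) {
  const words = [];
  let prev = "";
  // Each token is one digit followed by zero or more uppercase letters.
  for (const [, digit, suffix] of encoded.matchAll(/(\d)([A-Z]*)/g)) {
    const word = digit === PLURAL ? prev + "S" : prev.slice(0, Number(digit)) + suffix;
    words.push(word);
    prev = word;
  }
  return words;
}

const sample = ["CAT", "CATS", "CAR", "CARS", "DOG"];
console.log(encode(sample));         // "0CAR52T50DOG"
console.log(decode(encode(sample))); // [ 'CAR', 'CARS', 'CAT', 'CATS', 'DOG' ]
```

The remapped and shuffled variants would then transform this token stream further (substituting symbols by frequency, reordering words), which this sketch does not attempt.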

