Skip to content

JuliaIO/Zstandard.jl

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

31 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Zstandard.jl

A pure Julia implementation of the Zstandard (RFC 8878) compression algorithm. No C dependencies required.

Features

  • Compression and decompression of Zstandard frames
  • Streaming decompression via ZstdDecompressorStream (implements IO interface)
  • Huffman and FSE entropy coding
  • Repeat offset encoding
  • Multi-block support for data larger than 128KB
  • Dictionary decompression
  • Skippable frames
  • Optional content checksums (XXH64)

See Also

For production use, these C-backed packages are likely more reliable and performant. Zstandard.jl is useful when a pure Julia implementation is preferred, such as for environments where C dependencies are unavailable or for educational purposes.

Installation

using Pkg
Pkg.add(url="https://github.com/JuliaIO/Zstandard.jl")

Note: This package depends on a fork of XXHashNative.jl that adds XXH64 support. The [sources] section in Project.toml points to this fork until Moelf/XXHashNative.jl#28 is merged.

Usage

using Zstandard

# Compress
compressed = compress("Hello, world!")
compressed = compress(read("input.bin"))

# Decompress
data = decompress(compressed)

# Streaming decompression
open("file.zst") do io
    stream = ZstdDecompressorStream(io)
    decompressed = read(stream)
end

# Optional content checksum
compressed = compress(data, checksum=true)

Performance

Benchmarked against libzstd (via CodecZstd.jl) on an AMD Ryzen system, Julia 1.11:

Compression Ratio

Dataset Zstd.jl libzstd Gap
text 5KB 72.5:1 73.5:1 1.0x
text 50KB 838.1:1 825.0:1 1.0x
repetitive 160KB 4311.6:1 4551.1:1 1.1x
repetitive 1MB 8388.6:1 8738.1:1 1.0x
random 1MB 1.0:1 1.0:1 1.0x

Compression ratios are within 6% of libzstd across all tested datasets.

Throughput (MB/s)

Dataset Compress (jl/lib) Decompress (jl/lib)
text 5KB 141 / 112 314 / 488
text 50KB 374 / 894 551 / 957
repetitive 160KB 446 / 1611 497 / 924
repetitive 1MB 448 / 1708 450 / 1063
random 1MB 918 / 753 809 / 2474

Compression is faster than libzstd on small text and incompressible data. Decompression is within 2-3x of libzstd for compressible data.

Running Tests

julia --project=. -e 'using Pkg; Pkg.test()'

About

A pure Julia implementation of the Zstandard (RFC 8878) compression algorithm

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages