Experiment : pre-splitter #4136

Draft · wants to merge 6 commits into base: dev
Conversation

Cyan4973 (Contributor) commented Sep 3, 2024

This is currently just an experiment:

Instead of ingesting full blocks only (128 KB),
make an a-priori analysis of the data,
and infer a position to split, if useful.

This leads to small yet non-trivial compression gains, sometimes equivalent to a full compression level, for a comparatively small speed cost.
The benefit is larger when there is no post-splitter (one already runs at higher strategies, btultra and above),
but even when there is one, a small compression-ratio benefit remains, making the pre-splitter desirable for higher compression modes too.

However, the dynamic analysis is not free, so it's currently reserved for higher compression strategies (currently btlazy2 and above), where the speed cost can be considered "negligible" (< 5%).
For other modes, it is replaced by a static block-size strategy, which is no longer limited to 128 KB. Through tests, a 92 KB block size appears preferable, scoring a higher compression ratio at a very small compression-speed loss (due to the increased number of blocks, hence of block headers).
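
As an illustration of what such an a-priori analysis could look like, here is a minimal sketch assuming a simple histogram-drift heuristic: scan the input in small chunks, and propose a split once local byte statistics drift away from the block accumulated so far. This is not the PR's actual algorithm; every name, chunk size, and threshold below is invented for illustration.

```c
#include <stddef.h>

#define CHUNK_SIZE (8 * 1024)     /* granularity of the scan (invented) */
#define MAX_BLOCK  (128 * 1024)   /* never exceed the regular block size */

/* L1 distance between two byte histograms */
static unsigned histDistance(const unsigned a[256], const unsigned b[256])
{
    unsigned d = 0;
    for (int i = 0; i < 256; i++)
        d += (a[i] > b[i]) ? (a[i] - b[i]) : (b[i] - a[i]);
    return d;
}

/* Returns a proposed block boundary in (0, MAX_BLOCK];
 * falls back to a full block when no statistical shift is detected. */
static size_t proposeSplit(const unsigned char* src, size_t srcSize)
{
    unsigned acc[256] = {0};   /* statistics of the block built so far */
    size_t const limit = (srcSize < MAX_BLOCK) ? srcSize : MAX_BLOCK;
    size_t pos = 0;
    while (pos + CHUNK_SIZE <= limit) {
        unsigned cur[256] = {0};
        for (size_t i = 0; i < CHUNK_SIZE; i++)
            cur[src[pos + i]]++;
        if (pos > 0) {
            /* compare this chunk to the average chunk seen so far */
            unsigned avg[256];
            unsigned const nbChunks = (unsigned)(pos / CHUNK_SIZE);
            for (int i = 0; i < 256; i++) avg[i] = acc[i] / nbChunks;
            if (histDistance(avg, cur) > CHUNK_SIZE / 2)  /* arbitrary threshold */
                return pos;    /* statistics shifted: split here */
        }
        for (int i = 0; i < 256; i++) acc[i] += cur[i];
        pos += CHUNK_SIZE;
    }
    return limit;
}
```

The cost of a scan like this is exactly the "not free" part discussed above, which is why the dynamic analysis is reserved for strategies where it stays negligible.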

silesia.tar :

| level | dev (bytes) | this PR (bytes) | savings | note |
|---|---|---|---|---|
| 1 | 73422432 | 73324476 | -97956 | new static block size |
| 12 | 58213308 | 58149669 | -63639 | last static block size |
| 13 | 57994603 | 57720597 | -274006 | first dynamic size |
| 15 | 57176243 | 56905822 | -270421 | last dynamic size alone |
| 16 | 55338671 | 55256977 | -81694 | combine pre+post splitting |
| 19 | 52887208 | 52842629 | -44579 | idem |
| 22 | 52322760 | 52284324 | -38436 | idem |

calgary.tar :

| level | dev (bytes) | this PR (bytes) | savings | note |
|---|---|---|---|---|
| 1 | 1143607 | 1129229 | -14378 | new static block size |
| 12 | 930831 | 924326 | -6505 | last static block size |
| 13 | 922061 | 914326 | -7735 | first dynamic size |
| 15 | 921606 | 912055 | -9951 | last dynamic size alone |
| 16 | 882353 | 881217 | -1136 | combine pre+post splitting |
| 19 | 861336 | 859737 | -1599 | idem |
| 22 | 861200 | 859578 | -1622 | idem |

It might be preferable to consider this parameter as yet another knob that could be enabled by the trainer.
But this would first require a refactor, to make this parameter independent.

Todo:

  • Employ workspace for allocation
  • Make the parameter selectable (a hypothetical usage sketch follows below)
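
A hypothetical sketch of what "selectable" could look like through zstd's usual advanced-parameter API; `ZSTD_c_preSplitBlocks` is an invented name, not an actual zstd.h parameter:

```c
#include <zstd.h>
#include <stddef.h>

/* Hypothetical: enable the pre-splitter explicitly on a compression context.
 * ZSTD_c_preSplitBlocks does not exist in zstd.h; it stands in for whatever
 * name the parameter would eventually get. */
static ZSTD_CCtx* createPreSplitCCtx(int compressionLevel)
{
    ZSTD_CCtx* const cctx = ZSTD_createCCtx();
    if (cctx == NULL) return NULL;
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_compressionLevel, compressionLevel);
    ZSTD_CCtx_setParameter(cctx, ZSTD_c_preSplitBlocks, 1);  /* invented knob */
    return cctx;
}
```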

Commits:

  • instead of ingesting only full blocks, make an analysis of data, and infer where to split.
  • for better portability on Linux kernel (though I really wonder if this is a property worth maintaining).
Cyan4973 (Contributor, Author) commented Sep 4, 2024

mmmh,
this could be a problem:

```
test 45 : in-place decompression : Error : test (margin <= ZSTD_DECOMPRESSION_MARGIN(CNBuffSize, ZSTD_BLOCKSIZE_MAX)) failed
```

I suspect the problem is that the macro ZSTD_DECOMPRESSION_MARGIN() makes an assumption that blocks are always full (128 KB), and derives from it an assumption about maximum expansion, which is no longer respected when block sizes default to 92 KB (at lower compression levels). In contrast, ZSTD_decompressionMargin() scans the compressed data, hence ends up with a different result, and the two now disagree.

I'm not completely sure what the wanted behavior is here...

edit: confirmed that, when I change the default block size to anything other than 128 KB, it breaks this test.
On the other hand, the ZSTD_DECOMPRESSION_MARGIN() macro requires a blockSize parameter, and the test passes ZSTD_BLOCKSIZE_MAX, aka 128 KB, so it's pretty clear what it expects.
So the question is: how could this test pass "reasonably", i.e. without inner knowledge of how the internal block-splitting decision is made?
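
For intuition, here is a back-of-the-envelope sketch of the suspected mismatch. It leans on the overhead model documented for ZSTD_DECOMPRESSION_MARGIN() in zstd.h (roughly 3 bytes of block header per block, plus one block of margin); the buffer size is invented, and this is an illustration of the hypothesis above, not the actual zstd sources:

```c
#include <stdio.h>
#include <stddef.h>

#define BLOCK_HEADER_BYTES 3   /* per-block header cost in the margin model */

static size_t headerOverhead(size_t srcSize, size_t blockSize)
{
    size_t const nbBlocks = (srcSize + blockSize - 1) / blockSize;
    return nbBlocks * BLOCK_HEADER_BYTES;
}

int main(void)
{
    size_t const srcSize = (size_t)10 << 20;  /* e.g. a 10 MB test buffer */
    /* Overhead the macro accounts for, assuming full 128 KB blocks: */
    size_t const assumed = headerOverhead(srcSize, 128 << 10);
    /* Overhead actually present once blocks default to 92 KB: */
    size_t const actual  = headerOverhead(srcSize, 92 << 10);
    printf("assumed (128 KB blocks): %zu bytes of block headers\n", assumed);
    printf("actual   (92 KB blocks): %zu bytes of block headers\n", actual);
    /* The extra block headers are real bytes in the compressed frame, so a
     * margin measured from the data (ZSTD_decompressionMargin()) can end up
     * above the static estimate computed by
     * ZSTD_DECOMPRESSION_MARGIN(CNBuffSize, ZSTD_BLOCKSIZE_MAX). */
    return 0;
}
```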
