Performance optimization for size info in NTFS#33
Conversation
LordMike
left a comment
There was a problem hiding this comment.
Great contribution. We can probably improve even more, but it's a great start. 👍
Please see the comments I've made.
| return 0; | ||
| } | ||
|
|
||
| public int GetBytes(byte[] buffer, long index, int count) |
There was a problem hiding this comment.
We CANNOT have a public method formed like this. I would assume you mean the Index of the buffer, and not the index of the underlying structure.
There was a problem hiding this comment.
Either:
- make it private/internal
- change it so that it takes an offset for the buffer as well (like
Array.Copy()) - some other way, so that we keep within the observed standards in .NET API's .. :)
There was a problem hiding this comment.
Also. Name the offset in buffer just that: offset. So we're consistent with others. :)
There was a problem hiding this comment.
This signature was actually a bad copy of GetByte. I've added offset, made both internal and reordered the parameters to be more clear.
|
|
||
| namespace DiscUtils.Streams | ||
| { | ||
| public static class BitCounter |
There was a problem hiding this comment.
Could we add a little documentation on the purpose of this class? .. Much of the API does not have docmentation, but it'd be nice to know that the purpose of this class is (set) bit counting. Maybe on the public methods too.
| { | ||
| var end = start + count; | ||
| if (end > values.Length) | ||
| return 0; |
There was a problem hiding this comment.
Please throw exceptions for exceptional states.
There was a problem hiding this comment.
done - but I'm actually unsure which exception and message fits best - I first thought about ArgumentOutOfRangeException, but there's no single argument, which is out of range.
| return _lookupTable[value]; | ||
| } | ||
|
|
||
| public static long Count(byte[] values, int start, int count) |
There was a problem hiding this comment.
Can we name them offset and count?
I've found a major performance penalty in the space usage calculation of NTFS. Currenty the ClusterBitmap is read byte by byte from the underlying disk. This result in runtimes of up to one minute for large filesystems (>1TiB fs with a bitmap >50MiB).
I've changed the implementation to read the bitmap in blocks of 4K, which increased the performance by > 99% (in my test from ~1200ms down to ~45ms).
As a second optimization I've added a BitCounter which uses a precalculated lookup table to further speed up the size calculation from ~45ms down to ~15ms.