perf: cell reader buffer reuse, optimise maps by alexander-beedie · Pull Request #611 · tafia/calamine

alexander-beedie · 2026-02-27T12:57:58Z

Three optimisations - two (very) small/minor, one medium, no public API changes:

Optimisations

Replaced s.chunks(4) with s.chunks_exact(4) for the most micro of micro-optimisations.
As per @tafia's review suggestion on #606 (perf: cache path and attr lookups, reuse buffers, minimise allocs), replaced use of the standard HashMap with hashbrown::HashMap (which uses ahash). This was already a transitive dependency via zip, so it doesn't actually expand the current set of deps (though it is now explicit in Cargo.toml). Also replaced BTreeMap with HashMap in two places where the map does not actually need to maintain ordering ("read_relationships" and "number_formats") to get average O(1) lookups instead of O(log n); the impact is marginal though, as these maps are rarely (if ever?) going to be large.

Update: reverted the explicit hashbrown integration (the benefit was too marginal to measure reliably, as the maps are likely too small in relation to everything else), but kept the switch from BTreeMap to HashMap in places where order isn't needed.
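To illustrate the map change: when a table is only ever probed by key and never iterated in order, `std::collections::HashMap` gives average O(1) lookups where `BTreeMap` needs O(log n) comparisons. A minimal sketch, assuming a hypothetical relationship-id to target-path table (the function name and data are illustrative, not taken from calamine):

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a relationships table: maps a relationship
/// id (e.g. "rId1") to a target path. Not calamine's actual code.
fn build_relationships(pairs: &[(&str, &str)]) -> HashMap<String, String> {
    // Pre-size the map so inserting a known number of entries does not rehash.
    let mut map = HashMap::with_capacity(pairs.len());
    for (id, target) in pairs {
        map.insert(id.to_string(), target.to_string());
    }
    map
}

/// Look up a target by id. Only hashing and key equality are involved,
/// so no ordering of keys is required.
fn target_for<'a>(rels: &'a HashMap<String, String>, id: &str) -> Option<&'a str> {
    rels.get(id).map(String::as_str)
}
```

For maps with only a handful of entries, as the PR notes, the difference is unlikely to be measurable; the change mainly documents that ordering is not relied upon.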

Extended the same buffer-reuse optimisation that was part of #606 (perf: cache path and attr lookups, reuse buffers, minimise allocs) to XlsxCellReader, but cleaned it up a bit by consolidating the buffers in a new ValueBufs struct. No more use of read_string, so dropped it (only the variant that takes shared buffers is used now).
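The buffer-reuse idea can be sketched as follows: rather than allocating a fresh `String` for every cell, the reader owns a small set of scratch buffers that are cleared (keeping their capacity) and refilled per cell. This is a simplified illustration only; the `ValueBufs` fields and the `read_text_into` and `sum_numeric_cells` helpers are assumptions for the example, not the definitions used in the PR:

```rust
/// Hypothetical scratch buffers shared across cell reads. The real
/// `ValueBufs` in the PR may hold different fields.
#[derive(Default)]
struct ValueBufs {
    /// Text of the value currently being parsed.
    value: String,
    /// Text of a formula, if present.
    formula: String,
}

impl ValueBufs {
    /// Empty both buffers without releasing their allocations.
    fn clear(&mut self) {
        self.value.clear();
        self.formula.clear();
    }
}

/// Illustrative helper: copy raw bytes into a reusable buffer as UTF-8,
/// instead of returning a newly allocated `String` on each call.
fn read_text_into(raw: &[u8], buf: &mut String) -> Result<(), std::str::Utf8Error> {
    buf.clear(); // keeps capacity, so later calls rarely allocate
    buf.push_str(std::str::from_utf8(raw)?);
    Ok(())
}

/// Parse a sequence of raw cell values as f64, reusing one set of buffers.
fn sum_numeric_cells(cells: &[&[u8]], bufs: &mut ValueBufs) -> f64 {
    let mut total = 0.0;
    for raw in cells {
        bufs.clear();
        if read_text_into(raw, &mut bufs.value).is_ok() {
            // Non-numeric cells are simply skipped in this sketch.
            if let Ok(n) = bufs.value.parse::<f64>() {
                total += n;
            }
        }
    }
    total
}
```

Grouping the buffers in one struct means a single `&mut ValueBufs` can be threaded through the parsing functions instead of several separate `&mut String` or `&mut Vec<u8>` parameters.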

Miscellaneous

clippy complained about one of the functions having one too many args, so I also consolidated workbook-related args into a WorkbookContext struct; seems tidy?
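As a rough sketch of this kind of refactor (clippy's `too_many_arguments` lint fires once a function exceeds its configured parameter limit), related workbook-level parameters can be borrowed through one struct. The field names and helper functions below are hypothetical and are not taken from the PR:

```rust
/// Hypothetical bundle of workbook-level state that several reader
/// functions need. Field names are illustrative, not calamine's.
struct WorkbookContext<'a> {
    shared_strings: &'a [String],
    number_formats: &'a [String],
    is_1904: bool,
}

/// Instead of taking each piece of workbook state as its own parameter,
/// a function takes one borrowed context plus what is specific to the call.
fn resolve_shared_string<'a>(ctx: &WorkbookContext<'a>, index: usize) -> Option<&'a str> {
    ctx.shared_strings.get(index).map(String::as_str)
}

/// Another consumer of the same context: report which date epoch applies.
fn epoch_label(ctx: &WorkbookContext<'_>) -> &'static str {
    if ctx.is_1904 { "1904" } else { "1900" }
}

/// Number of known formats, read through the same context.
fn format_count(ctx: &WorkbookContext<'_>) -> usize {
    ctx.number_formats.len()
}
```

Adding a new piece of workbook state then means adding a field rather than touching every function signature along the call chain.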

Results

~6-7% speedup¹ when testing on a large (10 million cell) mixed-dtype Workbook.

¹ Benchmarked on: Apple Silicon M3 Max

jmcnamara

+1 on adding the context.

jmcnamara · 2026-02-28T12:23:23Z

Overall looks good. My only concern is with adding hashbrown as an explicit additional dependency (even if it is included by zip.rs anyway). What % performance does that give in your test case? Adding @jqnatividad for a second opinion on adding a dependency.

Also, I think the first commit could have been 3 functionally separate commits (even if they are small). You don't need to go back and change that though. Just a note for the future.

alexander-beedie · 2026-02-28T13:23:13Z

> What % performance does that give in your test case? Adding @jqnatividad for a second opinion on adding a dependency.

Honestly? Couldn't even measure it; just kept it as the dependency was already implicit 😄
Totally happy to switch it back to a regular HashMap to keep things simple 👍

(Update: done)

> Also, I think the first commit could have been 3 functionally separate commits (even if they are small). You don't need to go back and change that though. Just a note for the future.

Noted (and agreed ;)

jqnatividad · 2026-03-01T10:36:53Z

LGTM! Every little bit of performance is always appreciated!

jmcnamara · 2026-03-06T14:47:42Z

Merged. Thanks.

alexander-beedie force-pushed the perf-cell-reader-bufs branch 4 times, most recently from 68cd545 to 5ffad42 · February 27, 2026 13:05

jmcnamara reviewed Feb 28, 2026

alexander-beedie force-pushed the perf-cell-reader-bufs branch from 5efabc2 to bde3b37 · February 28, 2026 17:53

alexander-beedie added 2 commits March 1, 2026 12:59

perf: cell reader buffer reuse, optimise maps

cc00a2b

xlsx: use a WorkbookContext struct to tidy-up related parameters

0cf5dbc

alexander-beedie force-pushed the perf-cell-reader-bufs branch from bde3b37 to 0cf5dbc · March 1, 2026 08:59

jmcnamara merged commit d10b642 into tafia:master Mar 6, 2026
6 checks passed

alexander-beedie deleted the perf-cell-reader-bufs branch March 6, 2026 14:52

jmcnamara pushed a commit that referenced this pull request Mar 7, 2026

perf: cell reader buffer reuse, optimise maps (#611)

5fe50aa

alexander-beedie mentioned this pull request Mar 8, 2026

perf: optimise <v> cell value parsing #622

Merged
