xlsx: fix ignoring rich text <r> after initial plain <t> #637
Merged
ddimaria added a commit to ddimaria/calamine that referenced this pull request on May 12, 2026:
…eet layout

Add style support for Calamine xlsx files.

Public API:
- `Xlsx::worksheet_style(sheet)` returns a row x col grid of cell styles using run-length encoding for memory efficiency on large workbooks.
- `Xlsx::worksheet_layout(sheet)` returns column widths and row heights.

Style types (in `src/style.rs`):
- `Style` with optional Font / Fill / Borders / Alignment / NumberFormat / Protection.
- `Color` with theme + tint resolution and indexed-color fallback.
- `RichText` / `TextRun` for cells with mixed inline formatting.
- `StyleRange` with RLE storage and a `cells()` iterator.

Parser in `src/xlsx/style_parser.rs` handles fonts (bold / italic / underline / strikethrough / sz / color), fills, borders (with color and style per side), number formats (built-in + custom format codes), alignment (horizontal / vertical / wrap / indent / shrink / text rotation incl. stacked), protection (locked default per OOXML), theme colors with tint, and sysClr lastClr fallback.

Shared-string reader now decodes rich text runs and preserves their formatting, while also handling plain text that precedes rich runs (consistent with upstream PR tafia#637).

Includes benchmarks in `benches/style.rs` and test fixtures (styles.xlsx, borders.xlsx, EMSI_JobChange_UK.xlsx, problematic_formats.xlsx, styles_1M.xlsx) covering the various code paths.

Co-authored-by: Cursor <cursoragent@cursor.com>
Currently `read_string_with_bufs` ignores all subsequent events until the closing tag if it hits a plain text node first. This causes it to drop any text that follows, for example when a shared-string item has an initial plain `<t>` followed by rich text `<r>` runs.

I guess this was for speed reasons? I think in the typical case the next event would be the closing tag anyway, so performance shouldn't be much different, and I don't think this introduces any additional copies. It does make `rich_buffer` a bit of a misnomer though, so maybe it should be renamed (though `text_buf` means something different in this function).
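
To make the intended behaviour concrete, here is a minimal, dependency-free sketch. It is not calamine's implementation (which reads XML events rather than scanning strings); `si_text` is a hypothetical helper, and the sample `<si>` fragment is an assumed illustration of the "plain `<t>` followed by `<r>`" pattern, not the example from the original report. It skips entity decoding, self-closing `<t/>`, and phonetic `<rPh>` runs.

```rust
/// Hypothetical helper: concatenates the text of every `<t>` element in one
/// `<si>` fragment, whether it is a direct child or nested inside an `<r>` run.
/// Simplified on purpose: no entity decoding, no `<rPh>` filtering.
fn si_text(si: &str) -> String {
    let mut out = String::new();
    let mut rest = si;
    while let Some(start) = rest.find("<t") {
        let after = &rest[start + 2..];
        // Accept only `<t>` or `<t attr="...">`, not other tags starting with `t`.
        if !matches!(after.chars().next(), Some('>') | Some(' ')) {
            rest = after;
            continue;
        }
        // Skip to the end of the opening tag, then take everything up to `</t>`.
        let open_end = match after.find('>') {
            Some(i) => i,
            None => break,
        };
        let body = &after[open_end + 1..];
        let close = match body.find("</t>") {
            Some(i) => i,
            None => break,
        };
        out.push_str(&body[..close]);
        // Keep scanning after this element instead of stopping at the first text node.
        rest = &body[close + "</t>".len()..];
    }
    out
}

fn main() {
    // Assumed sample: a plain text node followed by one bold rich-text run.
    let si = r#"<si><t>plain </t><r><rPr><b/></rPr><t xml:space="preserve">bold</t></r></si>"#;
    println!("{}", si_text(si)); // prints "plain bold"
}
```

A reader that stops at the first plain text node would return only `plain ` for this input; continuing to the closing tag, as this change does, yields the full string.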