Skip to content

Improve appender speed v2#41

Open
JAicewizard wants to merge 4 commits into
duckdb:mainfrom
JAicewizard:improve_appender_speed_v2
Open

Improve appender speed v2#41
JAicewizard wants to merge 4 commits into
duckdb:mainfrom
JAicewizard:improve_appender_speed_v2

Conversation

@JAicewizard

Copy link
Copy Markdown

This currently also includes the changes of marcboeker#483, but I will rebase once that is merged.

This adds multiple ways to append, using all the variations of table sources.
Using just row-based is actually slower than the current solution, which may be expected as it operates very similarly but with extra steps. The real performance gains come from the parallel implementations.
The benchmarks don't show much improvement (~25% for both parallel row and parallel chunk), however this is due to the fact that they are both bottlenecked on appending the chunks in DuckDB itself. If instead the bottleneck would be the data ingestion/computation on the user-side (instead of a simple counter), this would be much more favorable to the parallel variations. Currently they barely show up as parallel on my laptop.

Another improvement would be setting entire vectors of data at a time, instead of individual values, this could be done in a future PR.

@JAicewizard

Copy link
Copy Markdown
Author

Let me know if this is still welcome

@taniabogatsch

Copy link
Copy Markdown
Member

Hi @JAicewizard - now that the dust has settled a bit on the migration: yes, this is still welcome! Feel free to ping me once there's a review to be left here. :)

@JAicewizard JAicewizard force-pushed the improve_appender_speed_v2 branch from 84954f6 to 39958b1 Compare December 28, 2025 11:53
@JAicewizard JAicewizard force-pushed the improve_appender_speed_v2 branch from 39958b1 to 8e9fbd2 Compare December 28, 2025 12:27
@JAicewizard JAicewizard force-pushed the improve_appender_speed_v2 branch from 8e9fbd2 to 957d854 Compare December 28, 2025 12:47
@JAicewizard

Copy link
Copy Markdown
Author

@taniabogatsch I really havn't been able to work on much on this over the past couple of months, but this should work now! I also noticed some nice improvements like projected chunks :)

@mlafeldt mlafeldt left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this big contribution!

Here's my initial review.

In addition to the inline remarks, I also noticed that this could use more comprehensive tests, e.g.

  • FillRow() returns error mid-stream
  • FillChunk() returns error
  • AppendDataChunk() fails (StateError)
  • Multiple goroutines fail simultaneously

Some work needed before merge, but the direction is solid.

Comment thread appender.go Outdated
Comment thread appender.go
Comment thread appender.go
Comment thread appender_test.go Outdated
Comment thread appender.go Outdated
Comment thread appender.go Outdated
Comment thread appender_test.go
@JAicewizard

Copy link
Copy Markdown
Author

Thanks for the review! I see you also worked on getting zig extensions to build nicely, I will definitely try that out as my current setup is quite outdated and annoying to work with!

Besides that, applied all the suggestions locally, still working on getting the error tests in place. With the error workgroup, I don't think testing multiple simultaneous errors is necessary, as the error handling itself is in a package from the Go project and I expect it to function. We could add one, but it would be a maintenance burden (defining a parallel table source + test takes up quite a bit of code) and be flaky at best.

@JAicewizard JAicewizard force-pushed the improve_appender_speed_v2 branch from 294e8ee to c7a7ee4 Compare January 20, 2026 23:27
@JAicewizard

Copy link
Copy Markdown
Author

I am not entirely sure how to test "AppendDataChunk() fails (StateError)", so that is not yet tested. Besides that I think adding error tests was a good idea, as it made me realise we could handle it a bit more gracefully.

@JAicewizard JAicewizard force-pushed the improve_appender_speed_v2 branch 2 times, most recently from a5db2c8 to ba322ae Compare January 20, 2026 23:41
@JAicewizard JAicewizard force-pushed the improve_appender_speed_v2 branch from ba322ae to 66b94ce Compare January 20, 2026 23:44
@mlafeldt

Copy link
Copy Markdown
Member

@JAicewizard Sorry for the delay. I really want to push this feature forward.

Do you mind resolving the merge conflict? This needs a bit more polish before merging. I would also be happy to take over the PR if you don't have time.

@mlafeldt mlafeldt added feature / enhancement Code improvements or a new feature changes requested Changes have been requested to a PR or issue labels Jun 12, 2026
@JAicewizard

Copy link
Copy Markdown
Author

Yes! I am currently overcome by events but I really want to get this merged as well. I'll mark this as unread, I'll get to it somewhere this summer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

changes requested Changes have been requested to a PR or issue feature / enhancement Code improvements or a new feature

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants