[Feature] Support data ingestion for range distribution table #66196
🧪 CI Insights: Here's what we observed from your CI run for a007d96. 🟢 All jobs passed!
✅ Bugbot reviewed your changes and found no bugs!
fe/fe-core/src/main/java/com/starrocks/planner/OlapTableSink.java
class SlotDescriptor;

// RangeRouter is responsible for routing rows with ranges which represent the entire (-inf, +inf)
class RangeRouter {
Can we reuse the partition range code?
The code logic is similar, but the RangeRouter is more memory-efficient. In addition, OlapTablePartitionParam contains a lot of partition-related logic, and I think decoupling it would be more appropriate for future iterations.
## What I'm doing:
Implement end-to-end range-based routing for range distribution tables on both FE and BE sides:
FE: Add TabletRange and Tuple.toThrift / Variant.toThrift to serialize tablet ranges as TTabletRange/TVariant. OlapTableSink / FrontendServiceImpl now populate range_distributed_columns and per-tablet range in TOlapTablePartitionParam and TOlapTableIndexTablets, and temporarily disable colocate MV index for RANGE distribution tables.
BE: Introduce RangeRouter to route rows to tablets based on TTabletRange, supporting multi-column ranges and various open/closed/±inf intervals; extend TabletSinkSender with RangeTabletSinkSender.
Tests: Add/extend FE tests (VariantTest, TabletRangeTest) to cover numeric, string, boolean and date/datetime serialization to Thrift. Add BE tests (RangeRouterTest, TabletSinkSenderRangeTest) covering typical routing paths, boundary behavior, sparse selections and error cases.
Signed-off-by: srlch <linzichao@starrocks.com>
[Java-Extensions Incremental Coverage Report] ✅ pass : 0 / 0 (0%)
[FE Incremental Coverage Report] ❌ fail : 54 / 69 (78.26%)
[BE Incremental Coverage Report] ✅ pass : 202 / 229 (88.21%)
// the _boundaries will be: [100, 200]
// the _lower_bound_inclusive will be: [false, true, true]
std::vector<ColumnPtr> _boundaries;
std::vector<uint8_t> _lower_bound_inclusive;
The first element of _lower_bound_inclusive is always false, so does it need to be stored at all?
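To make the question concrete, here is a minimal single-column sketch of how a router could use the two vectors quoted above. It is not the PR's implementation: `SimpleRangeRouter` and `route` are hypothetical names, plain `int64_t` values stand in for `ColumnPtr`, and the ranges are assumed to be contiguous and to cover (-inf, +inf). Under those assumptions the lookup only reads `lower_bound_inclusive[idx]` for `idx >= 1`, which is the property the comment is pointing at.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for the RangeRouter under review.
// boundaries holds the N-1 sorted split points between N contiguous ranges.
// lower_bound_inclusive[i] says whether range i owns its lower boundary value.
struct SimpleRangeRouter {
    std::vector<int64_t> boundaries;
    std::vector<uint8_t> lower_bound_inclusive;

    // Returns the index of the range that contains value.
    size_t route(int64_t value) const {
        // idx = number of boundaries that are <= value.
        size_t idx = static_cast<size_t>(
                std::upper_bound(boundaries.begin(), boundaries.end(), value) - boundaries.begin());
        // If value sits exactly on the lower boundary of range idx and that
        // range excludes its lower bound, the value belongs to range idx - 1.
        if (idx > 0 && boundaries[idx - 1] == value && !lower_bound_inclusive[idx]) {
            --idx;
        }
        return idx;
    }
};
```

With `boundaries = {100, 200}` and `lower_bound_inclusive = {false, true, true}`, the ranges are (-inf, 100), [100, 200) and [200, +inf), so 99 routes to range 0, 100 and 150 to range 1, and 200 to range 2. Element 0 of `lower_bound_inclusive` is never read in this sketch; keeping it anyway makes the vector index line up with the range index.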
What I'm doing:
Implement end-to-end range-based routing for range distribution tables on both FE and BE sides:
FE: Add TabletRange and Tuple.toThrift / Variant.toThrift to serialize tablet ranges as TTabletRange/TVariant. OlapTableSink / FrontendServiceImpl now populate range_distributed_columns and per-tablet range in TOlapTablePartitionParam and TOlapTableIndexTablets, and temporarily disable colocate MV index for RANGE distribution tables.
BE: Introduce RangeRouter to route rows to tablets based on TTabletRange, supporting multi-column ranges and various open/closed/±inf intervals; extend TabletSinkSender with RangeTabletSinkSender.
Tests: Add/extend FE tests (VariantTest, TabletRangeTest) to cover numeric, string, boolean and date/datetime serialization to Thrift. Add BE tests (RangeRouterTest, TabletSinkSenderRangeTest) covering typical routing paths, boundary behavior, sparse selections and error cases.
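As an illustration of what "multi-column ranges and various open/closed/±inf intervals" means for a membership check, the sketch below models one tablet range. It makes several assumptions that are not taken from the PR: `RangeSketch` and `Key` are hypothetical names, every column is an `int64_t`, an empty `std::optional` stands for an infinite bound, and multi-column keys are compared lexicographically.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical multi-column key: one int64_t per range-distribution column.
using Key = std::vector<int64_t>;

// Hypothetical model of a single tablet range; not the Thrift TTabletRange.
struct RangeSketch {
    std::optional<Key> lower; // empty optional means -inf
    std::optional<Key> upper; // empty optional means +inf
    bool lower_inclusive = false;
    bool upper_inclusive = false;

    // True if row falls inside the range. std::vector comparison operators
    // are lexicographic, which gives the multi-column ordering assumed here.
    bool contains(const Key& row) const {
        if (lower) {
            if (row < *lower) return false;
            if (row == *lower && !lower_inclusive) return false;
        }
        if (upper) {
            if (row > *upper) return false;
            if (row == *upper && !upper_inclusive) return false;
        }
        return true;
    }
};
```

For example, a range with `lower = {1, 10}` (inclusive) and `upper = {2, 0}` (exclusive) contains the rows `{1, 10}` and `{1, 999}` but not `{1, 9}` or `{2, 0}`. This requires C++17 for `std::optional`.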
Fixes #64986
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check:
Note
Implements range-based row routing for range-distributed tables, adding FE/BE support, Thrift/Proto range serialization, and comprehensive tests.
- RangeRouter and RangeTabletSinkSender to route rows by TTabletRange; wire into OlapTableSink when is_range_distribution().
- OlapTablePartitionParam to carry distribution_type; compute hashes only for HASH; randomize for RANGE/RANDOM.
- range_router_test and tablet_sink_sender_range_test.
- TabletRange, Tuple.toThrift(), and Variant.toThrift() (BOOL/INT/LARGEINT/STRING/DATE) to serialize tablet ranges.
- OlapTableSink/FrontendServiceImpl: populate distribution_type, range-distributed columns, and per-tablet range in TOlapTablePartitionParam/TOlapTableIndexTablets; disable colocate MV for RANGE.
- MetaUtils for range distribution columns/ids; Tablet now holds TabletRange.
- VariantTest, TabletRangeTest; adjust existing tests for distribution info.
- TOlapTableDistributionType and distribution_type to TOlapTablePartitionParam.
- TOlapTableTablet.range and TOlapTableIndexTablets.tablets.
- TTabletRange now includes bounds and inclusivity; TVariant/VariantPB use long_value and string_value fields.

Written by Cursor Bugbot for commit a007d96.