Skip to content

Conversation

@srlch
Copy link
Contributor

@srlch srlch commented Dec 2, 2025

What I'm doing:

Implement end-to-end range-based routing for range distribution tables on both FE and BE sides:

FE:
Add TabletRange and Tuple.toThrift / Variant.toThrift to serialize tablet ranges as TTabletRange/TVariant. OlapTableSink / FrontendServiceImpl now populate range_distributed_columns and per-tablet range in TOlapTablePartitionParam and TOlapTableIndexTablets, and temporarily disable colocate MV index for RANGE distribution tables.

BE:
Introduce RangeRouter to route rows to tablets based on TTabletRange, supporting multi-column ranges and various open/closed/±inf intervals; Extend TabletSinkSender with RangeTabletSinkSender.

Tests:
Add/extend FE tests (VariantTest, TabletRangeTest) to cover numeric, string, boolean and date/datetime serialization to Thrift. Add BE tests (RangeRouterTest, TabletSinkSenderRangeTest) covering typical routing paths, boundary behavior, sparse selections and error cases.

Fixes #64986

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 4.0
    • 3.5
    • 3.4
    • 3.3

Note

Implements range-based row routing for range-distributed tables, adding FE/BE support, Thrift/Proto range serialization, and comprehensive tests.

  • Backend (BE):
    • Add RangeRouter and RangeTabletSinkSender to route rows by TTabletRange; wire into OlapTableSink when is_range_distribution().
    • Enhance OlapTablePartitionParam to carry distribution_type; compute hashes only for HASH; randomize for RANGE/RANDOM.
    • Update build/tests; add range_router_test and tablet_sink_sender_range_test.
  • Frontend (FE):
    • Introduce TabletRange, Tuple.toThrift(), and Variant.toThrift() (BOOL/INT/LARGEINT/STRING/DATE) to serialize tablet ranges.
    • OlapTableSink/FrontendServiceImpl: populate distribution_type, range-distributed columns, and per-tablet range in TOlapTablePartitionParam/TOlapTableIndexTablets; disable colocate MV for RANGE.
    • Add helpers in MetaUtils for range distribution columns/ids; Tablet now holds TabletRange.
    • Add tests VariantTest, TabletRangeTest; adjust existing tests for distribution info.
  • Thrift/Proto:
    • Add TOlapTableDistributionType and distribution_type to TOlapTablePartitionParam.
    • Add TOlapTableTablet.range and TOlapTableIndexTablets.tablets.
    • TTabletRange now includes bounds and inclusivity; TVariant/VariantPB use long_value and string_value fields.

Written by Cursor Bugbot for commit a007d96. This will update automatically on new commits. Configure here.

@alvin-celerdata
Copy link
Contributor

@cursor review

@alvin-celerdata
Copy link
Contributor

@cursor review

@mergify
Copy link
Contributor

mergify bot commented Dec 5, 2025

🧪 CI Insights

Here's what we observed from your CI run for a007d96.

🟢 All jobs passed!

But CI Insights is watching 👀

@alvin-celerdata
Copy link
Contributor

@cursor review

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no bugs!


@srlch srlch force-pushed the range_write branch 2 times, most recently from a5ce2b8 to 6ebc144 Compare December 8, 2025 09:43
@alvin-celerdata
Copy link
Contributor

@cursor review

class SlotDescriptor;

// RangeRouter is responsible for routing rows will ranges which represent the entire (-inf, +inf)
class RangeRouter {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we reuse the partition range code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code logic is similar, but the RangeRouter is more memory-efficient. In addition, OlapTablePartitionParam contains a lot of partition-related logic, and I think decoupling it would be more appropriate for future iterations.

@alvin-celerdata
Copy link
Contributor

@cursor review

## What I'm doing:
Implement end-to-end range-based routing for range distribution tables on both FE and BE sides:

FE:
Add TabletRange and Tuple.toThrift / Variant.toThrift to serialize tablet ranges as TTabletRange/TVariant.
OlapTableSink / FrontendServiceImpl now populate range_distributed_columns and per-tablet range in TOlapTablePartitionParam and TOlapTableIndexTablets, and temporarily disable colocate MV index for RANGE distribution tables.

BE:
Introduce RangeRouter to route rows to tablets based on TTabletRange, supporting multi-column ranges and various open/closed/±inf intervals;
Extend TabletSinkSender with RangeTabletSinkSender.

Tests:
Add/extend FE tests (VariantTest, TabletRangeTest) to cover numeric, string, boolean and date/datetime serialization to Thrift.
Add BE tests (RangeRouterTest, TabletSinkSenderRangeTest) covering typical routing paths, boundary behavior, sparse selections and error cases.

Signed-off-by: srlch <linzichao@starrocks.com>
@alvin-celerdata
Copy link
Contributor

@cursor review

Signed-off-by: srlch <linzichao@starrocks.com>
@cursor
Copy link

cursor bot commented Dec 9, 2025

🚨 Bugbot couldn't run

Something went wrong. Try again by commenting "Cursor review" or "bugbot run", or contact support (requestId: serverGenReqId_b6e0ba7e-f4b8-410c-9135-5955b40f212f).

@alvin-celerdata
Copy link
Contributor

@cursor review

srlch added 2 commits December 9, 2025 14:59
Signed-off-by: srlch <linzichao@starrocks.com>
Signed-off-by: srlch <linzichao@starrocks.com>
@alvin-celerdata
Copy link
Contributor

@cursor review

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no bugs!


Signed-off-by: srlch <linzichao@starrocks.com>
@alvin-celerdata
Copy link
Contributor

@cursor review

@sonarqubecloud
Copy link

Quality Gate Failed Quality Gate failed

Failed conditions
3.5% Duplication on New Code (required ≤ 3%)

See analysis details on SonarQube Cloud

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no bugs!


@github-actions
Copy link

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

@github-actions
Copy link

[FE Incremental Coverage Report]

fail : 54 / 69 (78.26%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 com/starrocks/catalog/LargeIntVariant.java 0 4 00.00% [62, 63, 64, 65]
🔵 com/starrocks/sql/common/MetaUtils.java 0 6 00.00% [242, 247, 252, 256, 260, 264]
🔵 com/starrocks/catalog/Tablet.java 1 4 25.00% [41, 42, 48]
🔵 com/starrocks/planner/OlapTableSink.java 11 12 91.67% [517]
🔵 com/starrocks/catalog/TabletRange.java 11 12 91.67% [30]
🔵 com/starrocks/catalog/BoolVariant.java 4 4 100.00% []
🔵 com/starrocks/catalog/IntVariant.java 4 4 100.00% []
🔵 com/starrocks/catalog/Tuple.java 3 3 100.00% []
🔵 com/starrocks/catalog/DateVariant.java 4 4 100.00% []
🔵 com/starrocks/catalog/StringVariant.java 4 4 100.00% []
🔵 com/starrocks/service/FrontendServiceImpl.java 12 12 100.00% []

@github-actions
Copy link

[BE Incremental Coverage Report]

pass : 202 / 229 (88.21%)

file detail

path covered_line new_line coverage not_covered_line_detail
🔵 be/src/exec/range_tablet_sink_sender.h 0 4 00.00% [25, 30, 32, 34]
🔵 be/src/exec/tablet_sink.cpp 1 5 20.00% [323, 324, 325, 326]
🔵 be/src/exec/tablet_info.h 7 9 77.78% [283, 288]
🔵 be/src/exec/range_router.cpp 138 154 89.61% [32, 33, 37, 61, 62, 63, 93, 124, 128, 131, 152, 153, 268, 269, 275, 276]
🔵 be/src/exec/range_tablet_sink_sender.cpp 46 47 97.87% [81]
🔵 be/src/exec/range_router.h 2 2 100.00% []
🔵 be/src/exec/tablet_info.cpp 8 8 100.00% []

// the _boundaries will be: [100, 200]
// the _lower_bound_inclusive will be: [false, true, true]
std::vector<ColumnPtr> _boundaries;
std::vector<uint8_t> _lower_bound_inclusive;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first element of _lower_bound_inclusive is always false, so it do not need to save?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

New multi-tenant data management

6 participants