Prefix bloom filters are not used for normal range scans #1778

@hachikuji

SlateDB supports prefix bloom filters, and scan_prefix can use them to skip SSTs that do not contain the requested prefix. However, normal bounded range scans do not currently leverage prefix bloom filters, even when the range clearly targets a prefix-shaped keyspace.

For example, a scan like:

db.scan(b"user:123:"..b"user:124:")

may be semantically similar to a prefix scan over user:123:, but today it goes through the normal range-scan path, which does not pass a prefix query to SST filter evaluation. As a result, the cost of a range scan over prefix-partitioned keys can scale with the number of overlapping SSTs, even when the configured prefix bloom filters could rule many of them out.
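To illustrate the cost difference, here is a minimal, self-contained sketch. It is not SlateDB code: `Sst`, `overlaps`, `may_contain_prefix`, and `ssts_to_read` are hypothetical names, and an exact `HashSet` stands in for the bloom filter (a real bloom filter can return false positives, but never false negatives).

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for an SST with a prefix filter (not a SlateDB type).
/// An exact set replaces the probabilistic bloom filter to keep the sketch short.
struct Sst {
    first_key: Vec<u8>,
    last_key: Vec<u8>,
    prefixes: HashSet<Vec<u8>>,
}

impl Sst {
    /// True if the SST's key span intersects the half-open range `[start, end)`.
    fn overlaps(&self, start: &[u8], end: &[u8]) -> bool {
        self.first_key.as_slice() < end && self.last_key.as_slice() >= start
    }

    /// True if the SST may hold keys with this prefix.
    fn may_contain_prefix(&self, prefix: &[u8]) -> bool {
        self.prefixes.contains(prefix)
    }
}

/// Counts the SSTs a scan over `[start, end)` has to read.
/// With `prefix = None` (the range-scan path today) only key-span overlap prunes;
/// with `Some(prefix)` the prefix filter can rule out more SSTs.
fn ssts_to_read(ssts: &[Sst], start: &[u8], end: &[u8], prefix: Option<&[u8]>) -> usize {
    ssts.iter()
        .filter(|s| s.overlaps(start, end))
        .filter(|s| prefix.map_or(true, |p| s.may_contain_prefix(p)))
        .count()
}
```

When several SSTs span a wide key range but only a few hold the requested prefix, the `None` case reads all of them while the `Some(prefix)` case reads only the matching ones.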

This is related to the broader scan latency problem discussed in #1302 and the prefix filtering work in #1334, but the remaining gap is specifically that scan / scan_with_options cannot benefit from prefix bloom filters.

Possible directions:

  • Add an explicit scan option that lets callers provide a prefix-filter hint for a range scan, with validation that the requested range is contained within that prefix range.
  • Explore safe inference for ranges whose bounds unambiguously describe a single prefix interval, while keeping the current behavior (no inference) whenever the bounds are ambiguous.
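Both directions can be sketched over plain byte slices. This is an illustration under stated assumptions, not a proposed API: `prefix_upper_bound`, `range_within_prefix`, and `infer_prefix` are hypothetical helpers, keys are assumed to sort lexicographically by byte, and the containment check is deliberately conservative.

```rust
use std::ops::Bound;

/// Smallest key that sorts after every key starting with `prefix`,
/// or `None` when no such key exists (empty or all-0xFF prefix).
fn prefix_upper_bound(prefix: &[u8]) -> Option<Vec<u8>> {
    let mut upper = prefix.to_vec();
    // Drop trailing 0xFF bytes, then increment the last remaining byte.
    while let Some(last) = upper.pop() {
        if last != 0xFF {
            upper.push(last + 1);
            return Some(upper);
        }
    }
    None
}

/// Validation for an explicit hint: returns true only if every key in the
/// range starts with `prefix`. Conservative: some contained ranges with an
/// excluded start just below the prefix are rejected.
fn range_within_prefix(start: Bound<&[u8]>, end: Bound<&[u8]>, prefix: &[u8]) -> bool {
    let start_ok = match start {
        Bound::Included(s) | Bound::Excluded(s) => s >= prefix,
        Bound::Unbounded => prefix.is_empty(),
    };
    let end_ok = match (prefix_upper_bound(prefix), end) {
        (None, _) => true, // nothing sorts above this prefix's key space
        (Some(upper), Bound::Excluded(e)) => e <= upper.as_slice(),
        (Some(upper), Bound::Included(e)) => e < upper.as_slice(),
        (Some(_), Bound::Unbounded) => false,
    };
    start_ok && end_ok
}

/// Inference: yields a prefix only for the exact shape `[p, upper_bound(p))`;
/// any other shape is treated as ambiguous and yields `None`.
fn infer_prefix<'a>(start: Bound<&'a [u8]>, end: Bound<&[u8]>) -> Option<&'a [u8]> {
    match (start, end) {
        (Bound::Included(s), Bound::Excluded(e))
            if prefix_upper_bound(s).as_deref() == Some(e) => Some(s),
        _ => None,
    }
}
```

Under these assumptions, the range `user:123:`..`user:123;` is contained in the prefix `user:123:` and is inferred as that prefix. The range `user:123:`..`user:124:` is not, because it also admits keys such as `user:123;x`, so a hint of `user:123:` would fail validation and inference would fall back to today's behavior.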

The goal would be to let users express prefix-shaped bounded scans without losing the SST pruning benefits of prefix bloom filters.

Labels

enhancement: New feature or request
