[dbnode] Optimize block range scan in queryWithSpan by linasm · Pull Request #3813 · m3db/m3

linasm · 2021-10-05T05:51:25Z

What this PR does / why we need it:
The client could query for an arbitrarily wide time range (e.g. 1000 years), which would cause an expensive iteration over the whole range in the innermost loop. This change narrows the scanned range of blocks to at most what is actually covered by the index entry.

Special notes for your reviewer:

Does this PR introduce a user-facing and/or backwards incompatible change?:
NONE

Does this PR require updating code package or user-facing documentation?:
NONE

codecov · 2021-10-05T10:45:40Z

Codecov Report

Merging #3813 (e40889e) into master (e40889e) will not change coverage.
The diff coverage is n/a.

❗ Current head e40889e differs from pull request most recent head 4c5005f. Consider uploading reports for the commit 4c5005f to get more accurate results

@@          Coverage Diff           @@
##           master   #3813   +/-   ##
======================================
  Coverage    57.0%   57.0%           
======================================
  Files         552     552           
  Lines       63540   63540           
======================================
  Hits        36280   36280           
  Misses      24054   24054           
  Partials     3206    3206

Flag	Coverage Δ
aggregator	`63.3% <0.0%> (ø)`
cluster	`∅ <0.0%> (∅)`
collector	`58.4% <0.0%> (ø)`
dbnode	`60.7% <0.0%> (ø)`
m3em	`46.4% <0.0%> (ø)`
metrics	`19.7% <0.0%> (ø)`
msg	`74.4% <0.0%> (ø)`

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e40889e...4c5005f.

ryanhall07

nice find!

ryanhall07 · 2021-10-05T14:52:56Z

 	}
 }

+func (s *entryIndexState) indexedRangeWithRLock() (xtime.UnixNano, xtime.UnixNano) {


nit: why the separate method instead of inlining in IndexedRange?

IndexedRange is on the Entry struct; indexedRangeWithRLock is on entryIndexState.
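The split described in this reply can be sketched roughly as below. This is a minimal illustration, not the code from the PR: the field names, the map-based state, the `int64` timestamps, and the extra `ok` return value are all assumptions made for the example. It only shows the usual Go convention that a `...WithRLock` helper expects the caller to already hold the read lock, while the exported method on the outer struct takes the lock.

```go
package main

import (
	"fmt"
	"sync"
)

// entryIndexState is a hypothetical stand-in: it maps a block start
// (nanoseconds) to whether the entry is indexed for that block.
type entryIndexState struct {
	blocks map[int64]bool
}

// indexedRangeWithRLock returns the smallest and largest indexed block starts.
// By naming convention the caller must already hold at least a read lock.
func (s *entryIndexState) indexedRangeWithRLock() (lo, hi int64, ok bool) {
	for start, indexed := range s.blocks {
		if !indexed {
			continue
		}
		if !ok || start < lo {
			lo = start
		}
		if !ok || start > hi {
			hi = start
		}
		ok = true
	}
	return lo, hi, ok
}

// Entry is a hypothetical owner of the lock that guards the index state.
type Entry struct {
	mu    sync.RWMutex
	state entryIndexState
}

// IndexedRange acquires the read lock and delegates to the state helper.
func (e *Entry) IndexedRange() (int64, int64, bool) {
	e.mu.RLock()
	defer e.mu.RUnlock()
	return e.state.indexedRangeWithRLock()
}

func main() {
	e := &Entry{state: entryIndexState{
		blocks: map[int64]bool{100: true, 200: false, 300: true},
	}}
	lo, hi, ok := e.IndexedRange()
	fmt.Println(lo, hi, ok) // 100 300 true
}
```

Keeping the helper on the state type lets other methods that already hold the lock reuse it without locking twice.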

ryanhall07 · 2021-10-05T14:56:56Z

guess we have some work to do over the next 1000 years!

nbroyles · 2021-10-21T18:19:10Z

+			if currentBlock.Before(minIndexed) {
+				currentBlock = minIndexed
+			}
+			maxIndexedExclusive := maxIndexed.Add(time.Nanosecond)


Why add the nanosecond?

To convert the inclusive timestamp to an exclusive one, so that we can compare endExclusive against maxIndexedExclusive and replace it in the subsequent if.
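A small sketch of the conversion described here, under stated assumptions: `unixNano` is a hypothetical stand-in for `xtime.UnixNano`, `clampEndExclusive` is an illustrative helper name, and the `After` condition is a guess at the "subsequent if", which the diff excerpt does not show in full.

```go
package main

import (
	"fmt"
	"time"
)

// unixNano is a hypothetical stand-in for xtime.UnixNano.
type unixNano int64

func (t unixNano) Add(d time.Duration) unixNano { return t + unixNano(d) }
func (t unixNano) After(u unixNano) bool        { return t > u }

// clampEndExclusive narrows an exclusive query end so that it does not extend
// past maxIndexed, which is an inclusive timestamp.
func clampEndExclusive(endExclusive, maxIndexed unixNano) unixNano {
	// Adding one nanosecond turns the inclusive bound into an exclusive one,
	// so both values can be compared on the same footing.
	maxIndexedExclusive := maxIndexed.Add(time.Nanosecond)
	if endExclusive.After(maxIndexedExclusive) { // assumed condition
		return maxIndexedExclusive
	}
	return endExclusive
}

func main() {
	fmt.Println(clampEndExclusive(5000, 1000)) // 1001
	fmt.Println(clampEndExclusive(500, 1000))  // 500
}
```

Without the extra nanosecond, a loop bounded by `Before(endExclusive)` would skip the block that starts exactly at `maxIndexed`.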

nbroyles · 2021-10-21T18:34:00Z

+				endExclusive = maxIndexedExclusive
+			}
+
+			for !inBlock && currentBlock.Before(endExclusive) {


So, I don't quite follow this for loop. queryWithSpan is called on a block and takes a queryIter which iterates over index results within the same block. Given that, why do we need to loop over every block in the query range to check whether the doc is indexed? Don't we only need to check that the block itself is within the query range and that the doc is indexed for this same block (which it should be, since it's in the queryIter)? I'm not super familiar with the intricacies of the read path, so I could be misunderstanding here.

TBH I'm not really familiar with this aspect, either. I saw the opportunity to optimize this code without affecting its semantics, in which case I don't have to fully understand the context of it.
I think the best bet is to ask @robskillington who wrote the original code to shed some light on the purpose of this loop.

Hm, this loop behaves slightly differently than before. Previously, the loop always ran at least once; now it might not run at all. While the current implementation is more correct, maybe the less correct version was necessary for proper functioning? (Though the conditions under which the behavior would differ seem too rare for this to matter.)

You mean the case when start == endInclusive? I think this was a bug, and it would have been difficult to replicate with the optimized version, so I chose to fix it. I believe this is not a realistic edge case, and it is certainly not the issue that we could have seen.

linasm added 11 commits October 5, 2021 08:47

[dbnode] Optimize block range scan in queryWithSpan

2327419

TestEntryIndexedRange

0a7db36

Fix tests

f341b43

TestBlockE2EInsertAddResultsQueryNarrowingBlockRange

60f500d

lint

66a94fb

fmt

15dfb50

Fix TestNamespaceForwardIndexInsertQuery

7c4a702

Fix TestNamespaceIndexHighConcurrentQueries*

399e276

Fix TestIndexMultipleBlockQuery

0cf97a6

Fix edge case

e2590eb

Fix TestNamespaceIndexInsertQuery

79c1ef9

linasm changed the title ~~WIP [dbnode] Optimize block range scan in queryWithSpan~~ [dbnode] Optimize block range scan in queryWithSpan Oct 5, 2021

linasm marked this pull request as ready for review October 5, 2021 10:46

linasm requested review from rallen090, robskillington and ryanhall07 October 5, 2021 10:47

ryanhall07 approved these changes Oct 5, 2021

rallen090 reviewed Oct 5, 2021

Comment thread src/dbnode/storage/entry.go Outdated

rallen090 approved these changes Oct 5, 2021

linasm and others added 2 commits October 6, 2021 08:21

Merge branch 'master' into linasm/optimize-queryWithSpan-block-scan

0a65bd3

Address PR feedback

4c5005f

linasm enabled auto-merge (squash) October 6, 2021 05:29

linasm merged commit e0a3682 into master Oct 6, 2021

linasm deleted the linasm/optimize-queryWithSpan-block-scan branch October 6, 2021 05:45

nbroyles reviewed Oct 21, 2021

linasm mentioned this pull request Oct 29, 2021

sudden increase in CPU across all nodes in a cluster causing query failure #3878

Closed

linasm merged 13 commits into master from linasm/optimize-queryWithSpan-block-scan

linasm commented Oct 5, 2021 (edited)
codecov Bot commented Oct 5, 2021 (edited)
ryanhall07 left a comment
ryanhall07 Oct 5, 2021
linasm Oct 6, 2021
ryanhall07 commented Oct 5, 2021
nbroyles Oct 21, 2021
linasm Oct 21, 2021
nbroyles Oct 21, 2021 (edited)
linasm Oct 21, 2021
vpranckaitis Oct 22, 2021 (edited)
linasm Oct 22, 2021

5 participants
