[dbnode] Optimize filesetFiles function#3900
Conversation
Codecov Report
```
@@ Coverage Diff @@
##           master   #3900   +/-   ##
========================================
- Coverage    57.0%   56.6%   -0.5%
========================================
  Files         553     553
  Lines       63639   63275   -364
========================================
- Hits        36297   35827   -470
- Misses      24136   24238   +102
- Partials     3206    3210     +4
```
vpranckaitis
left a comment
LGTM 👍 Though would be good to get another review by someone else before merging
```go
result := make([]filesetFile, len(matched))
for i, file := range matched {
	blockStart, volume, err := fn(file)
	if err != nil {
		return nil, err
	}

	result[i] = filesetFile{
		fileName:    file,
		blockStart:  blockStart,
		volumeIndex: volume,
	}
}
```
nit:

```go
result := make([]filesetFile, len(matched))
for i, file := range matched {
	blockStart, volume, err := fn(file)
	if err != nil {
		return nil, err
	}
	result[i] = filesetFile{
		fileName:    file,
		blockStart:  blockStart,
		volumeIndex: volume,
	}
}
```

Suggested change:

```go
result := make([]filesetFile, 0, len(matched))
for _, file := range matched {
	blockStart, volume, err := fn(file)
	if err != nil {
		return nil, err
	}
	result = append(result, filesetFile{
		fileName:    file,
		blockStart:  blockStart,
		volumeIndex: volume,
	})
}
```
linasm
left a comment
Would you care to share the benchmark code (would be enough to put it in PR description)?
I am surprised that the improvement is only 30%. What was the sample size for benchmarking?
Initially I benchmarked with only 100 files, so the performance gain was quite low. Tested with more files, and most of the time it is a ~2x improvement.
This is benchmarking creation of the files on disk. You need to call …

🤦🏻 Thanks for spotting it. Updated results: still a ~2x improvement.
linasm
left a comment
LGTM with some comments.
```go
if len(matched) == 0 {
	return nil, nil
}
```
I think this `if` is redundant.
I didn't want to change the semantics of what this function returns. Previously `nil` would be returned if `filepath.Glob` returned `nil`. Without this `if` check we would be returning an empty slice, which is not the same thing as before.
`findSortedFilesetFiles` is a new function, so I guess there can be no existing semantics for it.
And for `filesetFiles`, there is still an `if` for this purpose:
https://github.com/m3db/m3/pull/3900/files#diff-78d9cf687193bca4cdc4ae73e54059f6a3b3ab4360a627c4bf26c070b8a0f909R1340
`findSortedFilesetFiles` replaced `findFiles` (which is still present), so its semantics were based on it, so that both functions could be used interchangeably if needed. In Go, returning nil slices is quite common, and I don't see a problem with this approach here.
But `findFiles` does not have such a check; it returns whatever is returned by `filepath.Glob`...
Exactly, and `filepath.Glob` returns `nil` if it finds nothing. So `findSortedFilesetFiles` should also return `nil` when it finds nothing. I could have probably checked `if matched == nil`, but since `filepath.Glob` never returns an empty non-nil slice, there is no difference here.
What this PR does / why we need it:
For namespaces with long retentions, huge CPU spikes could emerge during node bootstrap. Looking at the CPU profile, we can see that

the `filesetFiles` function is not very efficient.

This PR optimizes the `filesetFiles` function by reducing the amount of work it needs to do. It now collects the `blockStart` and `volumeIndex` fields and later reuses them during sorting and subsequent iterations, without needing to parse them again.

I've written a small benchmark to measure the new implementation (the new implementation is ~30% faster):
Special notes for your reviewer:
Does this PR introduce a user-facing and/or backwards incompatible change?:
Does this PR require updating code package or user-facing documentation?: