Parallelize readdir() reading and stat(). #19
Open
james-antill wants to merge 3 commits into
This is a port of the parallel node.Visit() code from my fork. It spawns up to 32 goroutines to do a node.Visit(), and once it reaches that limit each goroutine falls back to processing the nodes directly. The code should work identically if the global semWait is set to 0.
It does work; however, there are currently four test failures that only happen when semWait is set, all due to the same root problem: the nodes are added to the directory in the semi-random order in which they are received from the channel (see line 513). The way each test "sorts" is a no-op, so that arrival order is kept.
In theory you can "fix" some of this with:
Part of this patch, which sorts by name/mtime/etc. as well as by directory when we do directory sorts: james-antill@d27032a
This patch, which sorts by name as well as size/ctime/mtime when we sort by those: james-antill@ff168c2
And, for a similar situation, this patch, which sorts directory sizes by printed size: james-antill@76d027f