-
-
Notifications
You must be signed in to change notification settings - Fork 492
Faster org-roam-list-files (based on directory-files-and-attributes)
#2558
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
3238434 to
1083827
Compare
|
Haven't checked the code, but regarding the hashing - I have had eliminated this much before from my system This hasn't caused me any issues whatsoever. Unless something constantly corrupts these attributes for some users! |
1083827 to
a380869
Compare
|
It looks great. One suggestion: instead of using dolist and cl-loop use mapcar. This turns the code from mutable to immutable and makes it easier to read (in my opinion) especially if you extract the mapcar larger lambdas (like in org-roam--list-recursive-subdirs) mapcan does mutate the original list to the result list. I think that is desirable in this case. Otherwise it should be replaced with mapcar. the advantage of mapcan is that it does not create a new list. See the documentation for each. The code below uses mapcan For example: |
|
Thank you for your kind words. I don't think I agree on easier to read, and you'll have to explain what you mean by turning it from mutable to immutable. AFAIK these are identical in effect: (cl-loop for x in '(1 2 3 4)
collect (+ x 10))(let (result)
(dolist (x '(1 2 3 4))
(push (+ x 10) result))
(nreverse result))(mapcar (lambda (x)
(+ x 10))
'(1 2 3 4))I generally prefer cl-loop as my go-to tool, since it also has destructuring (the ;; return directories in current dir
(cl-loop for (file . attr) in (directory-files-and-attributes ".")
if (eq t (file-attribute-type attr))
collect file)and many other things, but destructuring in particular is really annoying to lack when using mapcar and friends. Anyway, just a tangent :) |
|
Btw, the design of Not that the built-ins are well-known for ease of reading... but that's why it's how it is. |
|
Unrelated but I too prefer cl-loop. I don't get why some library creators insist on excluding cl - for example the first thing that is written in I think a reader will find a cl-loop much easy to reason about than something built from first principles. |
I am of the opinion that mutating variables increases the probability of bugs (as one has to keep track of how the variable is mutated). That is why I think it is not very good idea to define bindings equal to nil and then assign them. That means that until they are assigned, they cannot be used, which is what I mean as increasing the mental load on whoever is reading the code. That is why I prefer do-loop and mapcar. But do-loop is a macro, so we might need to see if the macro creates garbage. mapcar is low level and very fast. In this case, I suspect that mapcan will be slightly faster. But I am happy with cl-loop. Much better than using do-list. cl-loop is basically for python people ;) as it mimics a comprehension. |
|
This might be relevant to speeding up org-roam |
As yantar92 pointed out on the Reddit discussion, that article does one thing wrong: But yes, some of the other tricks are good. I said last month (#2547 (comment)) that it'd make sense to rework |
8a2fe7e to
e9677d4
Compare
|
Finally, the checks pass. The tests did good. |
3753313 to
4f42e97
Compare
|
Added debatable fifth commit that sets Since it is my impression these alternative code paths will be unnecessary -- they only ever existed because the elisp version was thought to be slow, right? The bottleneck all along seems to have been |
4f42e97 to
b2b3b5a
Compare
9304536 to
4d6cd0d
Compare
0654094 to
715a2f3
Compare
|
Tests pass. |
ca7f88e to
bed80a6
Compare
d56fd8b to
86067a8
Compare
During one round of testing, there were ".#foo.org" ".#with-aliases.org" symlinks during the round, created by earlier tests in the same round...
86067a8 to
28620a5
Compare
|
Two things I can do after this is merged:
|
Motivation for this change
This is an alternative implementation of: #2546
It's probably better.
Some differences:
(org-roam-list-files)now takes me 0.01-0.05s!org-roam-list-files-commandstonil, of course. Probably there is no need to keep that user option any more (Fasterorg-roam-list-files(based on directory-files-and-attributes) #2558 (comment)).(org-roam-db-sync)now takes me 0.10s (when there is nothing to update)org-roam-file-pdirectly to filter the output oforg-roam-list-files, because it is not necessary -- several facts, such as being a descendant oforg-roam-directory, can be safely assumed! Theorg-roam-file-ppredicate is more useful for checking any wild file name in other programming contexts.file-truenamenorfile-relative-name, sofile-name-handler-alistwould not affect the runtime much (the earlier PR discussed overriding that setting for perf gains).org-roam-file-exclude-regexpinto 4 separate options, which are both more friendly to modification by users, and easier to optimize with.org-roam-db-syncfrom comparing content hashes is eliminated, because it now just checks filesystem mtime. I do not know if that is sane (feedback please), but I can't see why not.Content hashes
Note that the content hash is still written into the DB, because I haven't changed
org-roam-db-update-file.Likewise, it still won't update if it sees that the content hash has not changed.
So the difference is only at
org-roam-db-synctime, where if hypothetically there exists some file whose contents have changed, but its mtime was held unchanged, then it would not update the DB for that file.I imagine some rsync command that deliberately doesn't update mtime, but I think that sort of thing has to be the responsibility of the user...
Data points: