-
Notifications
You must be signed in to change notification settings - Fork 2.8k
Autoloading helper file system: allow either autoloading or proper errors in more file operations #19198
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
samansmink
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I would say expected behaviour should just be:
VirtualFileSystem::FindFileSystem(const string &path):
- if path matches existing filesystem -> return that
- try autoloading relevant extension and retry, if matches -> return that
- else return localfilesystem
|
Note to self: this should guarantee (given on failure to load this throws) that each query either a LOAD is attempted and failed (but that is notified) or an attempt worked, that reduces the space of available prefixes. One question I am not sure is whether there are ever problems with INSTALL / LOAD in parallel, that this will stress test more. Should there be a lock preventing multiple threads to autoload side-by-side? |
|
Regression failure is expected, to be solved changing link in #19206 |
Mytherin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! Looks great - minor comments
effb100 to
e7978ec
Compare
|
On my side this is good to go |
|
Thanks! |
Autoloading helper file system: allow either autoloading or proper errors in more file operations (duckdb/duckdb#19198) Bump: delta (duckdb/duckdb#19220) Fixup templated version of TryGetSecretKeyOrSetting (duckdb/duckdb#19218) build spatial extension for mingw (duckdb/duckdb#19207)
Autoloading helper file system: allow either autoloading or proper errors in more file operations (duckdb/duckdb#19198) Bump: delta (duckdb/duckdb#19220) Fixup templated version of TryGetSecretKeyOrSetting (duckdb/duckdb#19218) build spatial extension for mingw (duckdb/duckdb#19207) Co-authored-by: krlmlr <krlmlr@users.noreply.github.com>
Currently, if
httpfsextension has not been loaded yet:either fails like:
where
s3://my_bucket/my_file.parquetis treated as a local file system and can't find the base directorys3://my_bucketthat is necessary to actually save the file.will rightly fail like:
that, given I choose a random bucket name that I have no permissions for is appropriate.
Goal of this PR is making autoloading (or detection that loading an extension is needed) work in more broader contest, in particular:
SELECT name, id FROM ICEBERG_SCAN('https://raw.githubusercontent.com/duckdb/duckdb-iceberg/main/data/persistent/equality_deletes/warehouse/mydb/mytable', VERSION='1', allow_moved_paths = true);will now correctly trigger autoloading of httpfs)This infrastructure is also easy to generalize, it's a matter of calling
VirtualFileSystem::FindFileSystem(path)overload that takes either a ClientContext or a FileOpener:VirtualFileSystem::FindFileSystem(path, context_or_file_opener).After this PR, same example as above would either autoload and throw HTTP Error:
or throw a Missing Extension Error:
Fixes #19180
Fixes autoloading of FileExist, example: