[Python Arrow] Fix issue related to TIMESTAMP_TZ and filter pushdown into PyArrow #8856
Looks like the exception type is wrong in the test, could you have another look?
On Mac with On a Linux docker container: |
…dataset().IsLoaded()'
Think I found the culprit: the failing test was underspecified, which I have now fixed. The test, however, monkeypatches the pyarrow dataset module so that it is no longer present in sys.modules. I don't see why that's a huge deal, though; in both cases an exception is thrown, and the one we throw is just a little more descriptive. Nonetheless, I changed it and hopefully fixed it now :) This doesn't fix it either, unless we actually make an environment where If we
I am tempted to remove the failing test. If the test is run standalone, it passes; when I run it together with other tests, it doesn't raise an exception (because pyarrow is loaded before it was monkeypatched out).
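To illustrate the ordering problem described above, here is a minimal sketch. It assumes the test hides the module by placing a `None` entry in `sys.modules`; the standard-library `json` module stands in for pyarrow, and the helper names are hypothetical. A reference obtained before the patch keeps working, while only a fresh import fails.

```python
import json as cached_json  # stand-in for a module that was imported before the patch
import sys
from unittest import mock


def fresh_import_fails():
    """Return True if a new `import json` raises while the module is patched out."""
    with mock.patch.dict(sys.modules, {"json": None}):
        try:
            import json  # noqa: F401  (a None entry in sys.modules halts the import)
        except ImportError:
            return True
        return False


def cached_reference_still_works():
    """A reference held from before the patch is unaffected by the patch."""
    with mock.patch.dict(sys.modules, {"json": None}):
        return cached_json.dumps({"ok": True})


print(fresh_import_fails())
print(cached_reference_still_works())
```

If an import cache holds on to the module object in a similar way, patching `sys.modules` afterwards would not make the module look unloaded, which would match the behaviour seen when the test runs after other tests.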
  bool PythonImportCacheItem::IsLoaded() const {
      auto type = (*this)();
-     return type.ptr() != nullptr;
+     bool loaded = type.ptr() != nullptr;
Not sure what the point of this change is?
None really. I had initially meant to check ModuleIsLoaded if the ptr was non-null, to double-check.
But even with monkeypatching, this didn't seem to make IsLoaded return false.
So this is leftover code that I might want to undo.
Thanks!
This PR fixes #8522.
Arrow supports TIMESTAMP_TZ with any time unit (s, ms, us, ns).
We only support the us time unit for TIMESTAMP_TZ, so we convert to us.
When a filter compares a column against a constant, the constant is cast to the type of the column,
so the constant gets cast to TIMESTAMP_TZ (us).
When this filter gets pushed down into Arrow and the Arrow column uses a different unit (ms, for example), the result is an attempt to compare TIMESTAMP_TZ(ms) against TIMESTAMP_TZ(us), which is not supported.
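The unit mismatch can be modelled without Arrow at all. The following is only an illustrative sketch, not the code in this PR: timestamps are treated as plain integer tick counts, and the names `TICKS_PER_SECOND`, `finer_unit`, `rescale_up`, and `pushdown_greater_than` are hypothetical. It shows why a constant expressed in us cannot be compared directly against raw ms values, and how bringing both sides to a common unit gives the expected result.

```python
# Hypothetical model: a timestamp is an integer count of ticks since the epoch.
TICKS_PER_SECOND = {"s": 1, "ms": 10**3, "us": 10**6, "ns": 10**9}


def finer_unit(a, b):
    """Return whichever of the two units has the higher resolution."""
    return a if TICKS_PER_SECOND[a] >= TICKS_PER_SECOND[b] else b


def rescale_up(value, unit, target):
    """Convert to an equal or finer unit; multiplication keeps this exact."""
    factor = TICKS_PER_SECOND[target] // TICKS_PER_SECOND[unit]
    return value * factor


def pushdown_greater_than(column, column_unit, constant, constant_unit):
    """Keep raw column values that are later than the constant."""
    common = finer_unit(column_unit, constant_unit)
    limit = rescale_up(constant, constant_unit, common)
    return [v for v in column if rescale_up(v, column_unit, common) > limit]


column_ms = [1_000, 2_000, 3_000]  # 1 s, 2 s, 3 s stored in milliseconds
constant_us = 1_500_000            # 1.5 s after the cast to microseconds

# A naive comparison of the raw integers matches nothing, because 3_000 < 1_500_000.
print([v for v in column_ms if v > constant_us])                       # []
print(pushdown_greater_than(column_ms, "ms", constant_us, "us"))       # [2000, 3000]
```

Rescaling towards the finer unit is chosen here because it is lossless; converting the constant down to the column's coarser unit would require care with rounding for operators such as `>=` and `=`.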