-
Notifications
You must be signed in to change notification settings - Fork 478
MNT Narwhals in utils._input_validations
#1533
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| # TODO: sklearn.utils.validation.check_array does not support it | ||
| # def pandas_nullable_constructor(obj) -> IntoDataFrame: | ||
| # return pd.DataFrame(obj).convert_dtypes(dtype_backend="numpy_nullable") # type: ignore[no-any-return] | ||
|
|
||
| # TODO: sklearn.utils.validation.check_array does not support it | ||
| # def pandas_pyarrow_constructor(obj) -> IntoDataFrame: | ||
| # return pd.DataFrame(obj).convert_dtypes(dtype_backend="pyarrow") # type: ignore[no-any-return] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an interesing one.
To reproduce:
import pandas as pd
from sklearn.utils.validation import check_array
df = pd.DataFrame({"alpha": ["a", "a", "b"], "beta": [1, 2, 1]})
df = df.convert_dtypes(dtype_backend="pyarrow")
# or
# df = df.convert_dtypes(dtype_backend="numpy_nullable")
check_array(df, dtype=None)Did you encounter this already @FBruzzesi? Maybe in scikit-lego?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe scikit-lego was very early days of Narwhals, and we are not testing with different pandas dtype_backend 🙈
|
Hi @EdAbati, just acknowledging today that I saw this PR! I think that the frequency of my engagement here is not enough to keep everything up to date on the spin-off branch fully unfortunately, I was expecting a different situation these weeks but here we are 😄. We have a maintainer meeting tomorrow, I might discuss to move the merges directly into |
|
No problem at all @TamaraAtanasoska :) I changed the base branch to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@EdAbati I found some time to skim through this again! Left a couple of comments actually one relevant comment and it does look 🔥 😁
|
Apologies for the delay, but I finally found time to address the comments! 😄 Thank you for the review everyone, please let me know if there is anything else that doesn't look right |
| # TODO: remove this once we supports all narwhals inputs in this module | ||
| default_backend=nw.Implementation.PANDAS, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had to add this in places were narwhals is not yet supported. Since we import polars in the tests, it becomes the default backend.
| if get_polars(): | ||
| return Implementation.POLARS | ||
| if get_pandas(): | ||
| return Implementation.PANDAS | ||
| if get_pyarrow(): | ||
| return Implementation.PYARROW | ||
| raise ImportError( | ||
| "No supported dataframe backend found. Please install either polars, pandas or pyarrow." | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is everyone happy about this priority?
Description
Towards #1522
(thanks @StefanieSenger for your refactoring PR, made this work way easier :) )
This PR makes the
utils _input_validations.pynarwhals compatible. Since all the rest doesn't expectnarwhalsyet, I convert everything to native before the return. This should be updated in the future :)FYI @TamaraAtanasoska I made the PR to
narwhals-integrationthat is behindmain(but branched ofmain): that's why you see some unrelated commits. We should updatenarwhals-integrationto see a cleaner diff.Tests
Documentation
Screenshots
Tagging the narwhalifying experts @FBruzzesi @MarcoGorelli for suggestions on how to improve this