-
Notifications
You must be signed in to change notification settings - Fork 109
Separate XGBoost JLL package into CPU and GPU versions #214
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Also the underlying XGBoost library has been updated to v2.1.4 |
|
Thanks! A few minor concerns, see above. |
|
@ExpandingMan I can't see any comments above? :) |
src/Lib.jl
Outdated
| # only enable GPU support if there is a valid binary compatible with this system | ||
| # should we place a warning here? | ||
| if XGBoost_GPU_jll.is_available() | ||
| lib_xgboost = XGBoost_GPU_jll.libxgboost |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why change the symbol to lib_xgboost? Is the generator now generating that instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't sure if there would be any confusion if I used the same symbol as the libraries exported by the JLL's. I can change it back if you like though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's even exported here, but if it were one could just use import to avoid masking.
Please change this back, it's more verbose and should be unnecessary.
test/runtests.jl
Outdated
|
|
||
| has_cuda() && @testset "cuda" begin | ||
| # only test GPU cababilities if an appropriate device + artifact is available | ||
| has_cuda() && XGBoost_GPU_jll.is_available() && @testset "cuda" begin |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any scenario in which we would expect has_cuda() to be true but the xgboost cuda binaries not to be available? I would think that we'd want this to fail anyway if XGBoost_GPU_jll.is_available() fails here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the case where the user has an incompatible CUDA version with XGBoost_GPU_jll we won't want it to fail I think? Although that's more to stop it failing if e.g. a user has CUDA 13 which isn't supported yet by XGBoost.jl, rather than specifically for the test suite. I can condition this bit just on the artifact being available though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't looked into the PR yet, just trying to see if I can help with the issue.
XGBoost has a build_info function that exposes build-time information like the CUDA version used for compiling XGBoost. If it's a CPU-only build, it should be reflected in the build_info as well. You can access this field without touching CTK13-specific functions (hence no runtime error).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel pretty comfortable saying that if has_cuda() is true but the library isn't available the tests should fail anyway. Pkg is supposed to handle ensuring it's installed and if it's missing something would have gone wrong.
I would say let's make this has_cuda() only.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't looked into the PR yet, just trying to see if I can help with the issue.
XGBoost has a build_info function that exposes build-time information like the CUDA version used for compiling XGBoost. If it's a CPU-only build, it should be reflected in the
build_infoas well. You can access this field without touching CTK13-specific functions (hence no runtime error).
I can't quite see where this function is defined in here?
|
Man, this github review feature feels so broken. Ok can you see it now? |
Lol yep, thanks :) |
|
Alright, I think this is fine, but I will allow a chance for more comments before merging. |
|
Alright, let's do this. Thanks @BenCurran98 . |
This is to fix the issue mentioned here where XGBoost wasn't working when users had CUDA v13 present. Now the behaviour of the package is to attempt to load GPU-compatible artifacts for XGBoost via XGBoost_GPU_jll, but if it can't find a compatible artifact it falls back to the standard CPU build in XGBoost_jll