Skip to content

Conversation

@jyliuu
Copy link
Collaborator

@jyliuu jyliuu commented Jun 5, 2025

This pull request introduces a new version (0.5.1) of the glex package, focusing on improving functionality, addressing input handling, and enhancing the algorithm's robustness. Key changes include support for data frame inputs, stricter input validation, and a more robust path-dependent algorithm. Additionally, new tests ensure the correctness of these updates.

Version Update and Documentation:

  • Updated DESCRIPTION to reflect the new version (0.5.1).
  • Added release notes in NEWS.md, highlighting the fixes to the path-dependent algorithm and support for data frame inputs in glex().

Input Handling Improvements:

  • Enhanced glex.xgb.Booster and glex.ranger functions in R/glex.R to accept data frames as input, converting them to matrices when necessary. Added validation to ensure no factor columns are present, with clear error messages. [1] [2]

Algorithm Enhancements:

  • Modified the path-dependent algorithm in src/path_dependent_explain_tree.cpp to compute node covers manually, replacing reliance on precomputed "Cover" values. This ensures correctness and adaptability to different data scenarios. [1] [2]
  • Updated the recursive function recursePathDependent to use the newly computed node covers for more accurate weight calculations.

Testing:

  • Added new tests in tests/testthat/test-glex-dataframe.R to verify that glex() works correctly with both matrix and data frame inputs for XGBoost and Ranger models.

Miscellaneous:

  • Updated README.md to include references to an additional paper on partial dependence functions.

@jyliuu jyliuu requested review from Copilot and jemus42 June 5, 2025 12:04
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR updates glex to version 0.5.1 by adding data frame input support, improving the path-dependent weighting algorithm, and expanding tests and documentation.

  • Add support for data frame inputs in glex.xgb.Booster and glex.ranger, with factor-column checks and matrix conversion
  • Compute node covers dynamically in C++ (calculate_data_driven_covers) and pass them to recursePathDependent
  • Add tests for data frame inputs and update documentation, NEWS, and README

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
tests/testthat/test-glex-dataframe.R New tests verifying that glex() yields identical results on matrix vs. data frame inputs
src/path_dependent_explain_tree.cpp Implemented calculate_data_driven_covers, updated recursePathDependent to use covers
R/glex.R Added input checks for factor columns and conversion of data frames to matrices
README.md Added reference to new paper on fast PD estimation
NEWS.md Bumped version to 0.5.1 and added release notes
DESCRIPTION Updated package version from 0.5.0 to 0.5.1
Comments suppressed due to low confidence (1)

src/path_dependent_explain_tree.cpp:95

  • The function is still invoked as a template (recursePathDependent<...>) but its definition no longer has a template<typename Comparison> declaration. Re-add the template header before the function signature or remove the template arguments in calls to ensure it compiles.
NumericMatrix recursePathDependent(

@jyliuu jyliuu force-pushed the fix-path-dependent branch from e342afb to 4952eea Compare June 5, 2025 12:12
@jyliuu jyliuu force-pushed the fix-path-dependent branch from 4952eea to 1fb8e9b Compare June 5, 2025 12:13
Copy link
Collaborator

@jemus42 jemus42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍🏻 looks good aside from the README.Rmd thing

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't modify README.md directly - it is generated from README.Rmd. Usually there also should be a pre-commit hook to make it impossible to only ever commit one of the two files but I guess that's not installed automatically.

jyliuu added 2 commits June 6, 2025 15:39
…umns in data frames and update example usage in documentation.
@jyliuu jyliuu force-pushed the fix-path-dependent branch from 94ff343 to 6003c42 Compare June 6, 2025 13:39
@jyliuu jyliuu merged commit 21b1a5d into master Jun 6, 2025
5 of 6 checks passed
@jyliuu jyliuu deleted the fix-path-dependent branch June 6, 2025 13:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants