-
Notifications
You must be signed in to change notification settings - Fork 115
implement basic extension system for pkgdatas
#962
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
baa11ac to
854879d
Compare
We are currently implementing an incremental analysis system for JET and JETLS by integrating them with Revise. We considered other approaches such as implementing their own Revise-like system on the JET[LS] side (e.g. #938), but concluded that implementing incremental analysis by making Revise itself extensible would be better in terms of implementation reliability and future maintenance costs. This commit implements the first step towards this goal by extending Revise's most important data structure `pkgdatas` to hold custom data and providing utilities for interacting with it. Specifically, this commit extends the leaf nodes of the `pkgdatas` data structure, `MethodInfoKey(method_table => signature)`, to a new data structure `SigInfo(method_table, signature, ext_data::ExtendedData)`. `SigInfo` is defined as follows: ```julia struct SigInfo mt::Union{Nothing,MethodTable} sig::Type ext::ExtendedData end struct ExtendedData owner::Symbol data::Any next::ExtendedData ExtendedData(owner::Symbol, data, next::ExtendedData) = new(owner, data, next) ExtendedData(owner::Symbol, data) = new(owner, data) ExtendedData() = new(:Revise, nothing) end ``` As seen from the type definition, `SigInfo` can hold `ExtendedData` in addition to the information that the conventional `MethodInfoKey` holds. `ExtendedData` is implemented as a linked list, enabling safe data management per owner through the `owner` field (similar to `Core.CodeInstance` design). We also implement the following utility functions for getting or replacing `ext_data`: - `get_extended_data(ext::ExtendedData, owner::Symbol) -> ext::Union{ExtendedData,Nothing}` - `get_extended_data(siginfo::SigInfo, owner::Symbol) -> ext::Union{ExtendedData,Nothing}` - `replace_extended_data(ext::ExtendedData, owner::Symbol, data) -> new_ext::ExtendedData` - `replace_extended_data(siginfo::SigInfo, owner::Symbol, data) -> new_siginfo::SigInfo` This commit implements these simple extension setup only. To actually use this extension system, external packages need to decide when to inspect `Revise.pkgdatas` by themselves and use functions like `replace_extended_data` to hold custom data. While packages using this extension system and Revise's revision pipeline are not tightly synchronized, stale `SigInfo` objects are automatically removed, enabling minimal external incremental systems. Specifically, using this commit, we can replace `JET.report_package` with a Revise-based implementation and make it incremental (aviatesk/JET.jl#763). In the future, we may need to set up some callbacks in Revise's revision pipeline to allow external packages to customize in sync with Revise's revision pipeline. However, since the necessary requirements are still unclear at the current stage of extending `JET.report_package`, this commit does not implement them.
854879d to
ec0edec
Compare
This commit fundamentally reimplements `report_package` to leverage Revise.jl's signature extraction capabilities, bringing two major improvements: 1. Robustness: The previous implementation relied on JET's custom code loading mechanism via `virtual_process`, which proved fragile when handling the various patterns of top-level code used in real-world packages. By delegating to Revise's battle-tested infrastructure for tracking package definitions, we gain much more reliable analysis availability. 2. Incremental analysis: The new implementation can cache analysis results for each method signature in Revise's tracking data (via `SigAnalysisResult` stored in extended data). On subsequent analysis runs, unchanged methods reuse their cached results, dramatically reducing analysis time for iterative workflows. This is particularly valuable during package development when only small portions of code change between analysis runs. The implementation works by: - Using `Revise.pkgdatas` to extract all method signatures within target package - Analyzing each signature individually via `analyze_method_signature!` - Caching analysis results into `Revise.pkgdatas` via its newly implemented extension mechanism (Requires timholy/Revise.jl#962.) - Checking world age validity to determine when cached results can be safely reused The previous `analyze_from_definitions` and `concretization_patterns` configurations are no longer needed, as the signature-based approach provides more direct and reliable analysis of package definitions. In the future, rather than loading the entire package of Revise, which automatically enables file watching and REPL hooks, we should refactor the core diff-and-patch functionality for code from Revise into a new ReviseCore.jl module, and have JET use only that functionality. However, this commit does not implement that approach. Breaking changes: - `report_package` now requires the package to be loaded as a `Module` rather than accepting a package name string - The function signature changes from `report_package(pkg::Union{AbstractString,Module})` to `report_package(pkg::Module)` - Revise.jl is now a required dependency (not a weak dependency)
This commit fundamentally reimplements `report_package` to leverage Revise.jl's signature extraction capabilities, bringing two major improvements: 1. Robustness: The previous implementation relied on JET's custom code loading mechanism via `virtual_process`, which proved fragile when handling the various patterns of top-level code used in real-world packages. By delegating to Revise's battle-tested infrastructure for tracking package definitions, we gain much more reliable analysis availability. 2. Incremental analysis: The new implementation can cache analysis results for each method signature in Revise's tracking data (via `SigAnalysisResult` stored in extended data). On subsequent analysis runs, unchanged methods reuse their cached results, dramatically reducing analysis time for iterative workflows. This is particularly valuable during package development when only small portions of code change between analysis runs. The implementation works by: - Using `Revise.pkgdatas` to extract all method signatures within target package - Analyzing each signature individually via `analyze_method_signature!` - Caching analysis results into `Revise.pkgdatas` via its newly implemented extension mechanism (Requires timholy/Revise.jl#962.) - Checking world age validity to determine when cached results can be safely reused The previous `analyze_from_definitions` and `concretization_patterns` configurations are no longer needed, as the signature-based approach provides more direct and reliable analysis of package definitions. In the future, rather than loading the entire package of Revise, which automatically enables file watching and REPL hooks, we should refactor the core diff-and-patch functionality for code from Revise into a new ReviseCore.jl module, and have JET use only that functionality. However, this commit does not implement that approach. Breaking changes: - `report_package` now requires the package to be loaded as a `Module` rather than accepting a package name string - The function signature changes from `report_package(pkg::Union{AbstractString,Module})` to `report_package(pkg::Module)` - Revise.jl is now a required dependency (not a weak dependency)
|
Alright, I've completed all the necessary changes on the aviatesk/JET.jl#763 side, which is the first user of this feature. On the other hand, as mentioned in aviatesk/JET.jl#763, moving forward with the integration of JET and JETLS with Revise will require fairly significant refactoring, and considering these factors, I also have a desire to speed up the merge cycle for this PR and future ones I will submit to this repository. So, while I think this refactor should ideally wait for your opinion, if there's no response by tomorrow, I would like to take responsibility and merge this change myself. |
This commit fundamentally reimplements `report_package` to leverage Revise.jl's signature extraction capabilities, bringing two major improvements: 1. Robustness: The previous implementation relied on JET's custom code loading mechanism via `virtual_process`, which proved fragile when handling the various patterns of top-level code used in real-world packages. By delegating to Revise's battle-tested infrastructure for tracking package definitions, we gain much more reliable analysis availability. 2. Incremental analysis: The new implementation can cache analysis results for each method signature in Revise's tracking data (via `SigAnalysisResult` stored in extended data). On subsequent analysis runs, unchanged methods reuse their cached results, dramatically reducing analysis time for iterative workflows. This is particularly valuable during package development when only small portions of code change between analysis runs. The implementation works by: - Using `Revise.pkgdatas` to extract all method signatures within target package - Analyzing each signature individually via `analyze_method_signature!` - Caching analysis results into `Revise.pkgdatas` via its newly implemented extension mechanism (Requires timholy/Revise.jl#962.) - Checking world age validity to determine when cached results can be safely reused The previous `analyze_from_definitions` and `concretization_patterns` configurations are no longer needed, as the signature-based approach provides more direct and reliable analysis of package definitions. In the future, rather than loading the entire package of Revise, which automatically enables file watching and REPL hooks, we should refactor the core diff-and-patch functionality for code from Revise into a new ReviseCore.jl module, and have JET use only that functionality. However, this commit does not implement that approach. Breaking changes: - `report_package` now requires the package to be loaded as a `Module` rather than accepting a package name string - The function signature changes from `report_package(pkg::Union{AbstractString,Module})` to `report_package(pkg::Module)` - Revise.jl is now a required dependency (not a weak dependency)
We are currently implementing an incremental analysis system for JET and JETLS by integrating them with Revise. We considered other approaches such as implementing their own Revise-like system on the JET[LS] side (e.g. #938), but concluded that implementing incremental analysis by making Revise itself extensible would be better in terms of implementation reliability and future maintenance costs.
This commit implements the first step towards this goal by extending Revise's most important data structure
pkgdatasto hold custom data and providing utilities for interacting with it. Specifically, this commit extends the leaf nodes of thepkgdatasdata structure,MethodInfoKey(method_table => signature), to a new data structureSigInfo(method_table, signature, ext_data::ExtendedData).SigInfois defined as follows:As seen from the type definition,
SigInfocan holdExtendedDatain addition to the information that the conventionalMethodInfoKeyholds.ExtendedDatais implemented as a linked list, enabling safe data management per owner through theownerfield (similar toCore.CodeInstancedesign).We also implement the following utility functions for getting or replacing
ext_data:get_extended_data(ext::ExtendedData, owner::Symbol) -> ext::Union{ExtendedData,Nothing}get_extended_data(siginfo::SigInfo, owner::Symbol) -> ext::Union{ExtendedData,Nothing}replace_extended_data(ext::ExtendedData, owner::Symbol, data) -> new_ext::ExtendedDatareplace_extended_data(siginfo::SigInfo, owner::Symbol, data) -> new_siginfo::SigInfoThis commit implements these simple extension setup only. To actually use this extension system, external packages need to decide when to inspect
Revise.pkgdatasby themselves and use functions likereplace_extended_datato hold custom data. While packages using this extension system and Revise's revision pipeline are not tightly synchronized, staleSigInfoobjects are automatically removed, enabling minimal external incremental systems. Specifically, using this commit, we can replaceJET.report_packagewith a Revise-based implementation and make it incremental (aviatesk/JET.jl#763).In the future, we may need to set up some callbacks in Revise's revision pipeline to allow external packages to customize in sync with Revise's revision pipeline. However, since the necessary requirements are still unclear at the current stage of extending
JET.report_package, this commit does not implement them.