Skip to content

Conversation

@aviatesk
Copy link
Collaborator

We are currently implementing an incremental analysis system for JET and JETLS by integrating them with Revise. We considered other approaches such as implementing their own Revise-like system on the JET[LS] side (e.g. #938), but concluded that implementing incremental analysis by making Revise itself extensible would be better in terms of implementation reliability and future maintenance costs.

This commit implements the first step towards this goal by extending Revise's most important data structure pkgdatas to hold custom data and providing utilities for interacting with it. Specifically, this commit extends the leaf nodes of the pkgdatas data structure, MethodInfoKey(method_table => signature), to a new data structure SigInfo(method_table, signature, ext_data::ExtendedData).

SigInfo is defined as follows:

struct SigInfo
    mt::Union{Nothing,MethodTable}
    sig::Type
    ext::ExtendedData
end

struct ExtendedData
    owner::Symbol
    data::Any
    next::ExtendedData
    ExtendedData(owner::Symbol, data, next::ExtendedData) = new(owner, data, next)
    ExtendedData(owner::Symbol, data) = new(owner, data)
    ExtendedData() = new(:Revise, nothing)
end

As seen from the type definition, SigInfo can hold ExtendedData in addition to the information that the conventional MethodInfoKey holds. ExtendedData is implemented as a linked list, enabling safe data management per owner through the owner field (similar to Core.CodeInstance design).

We also implement the following utility functions for getting or replacing ext_data:

  • get_extended_data(ext::ExtendedData, owner::Symbol) -> ext::Union{ExtendedData,Nothing}
  • get_extended_data(siginfo::SigInfo, owner::Symbol) -> ext::Union{ExtendedData,Nothing}
  • replace_extended_data(ext::ExtendedData, owner::Symbol, data) -> new_ext::ExtendedData
  • replace_extended_data(siginfo::SigInfo, owner::Symbol, data) -> new_siginfo::SigInfo

This commit implements these simple extension setup only. To actually use this extension system, external packages need to decide when to inspect Revise.pkgdatas by themselves and use functions like replace_extended_data to hold custom data. While packages using this extension system and Revise's revision pipeline are not tightly synchronized, stale SigInfo objects are automatically removed, enabling minimal external incremental systems. Specifically, using this commit, we can replace JET.report_package with a Revise-based implementation and make it incremental (aviatesk/JET.jl#763).

In the future, we may need to set up some callbacks in Revise's revision pipeline to allow external packages to customize in sync with Revise's revision pipeline. However, since the necessary requirements are still unclear at the current stage of extending JET.report_package, this commit does not implement them.

@aviatesk aviatesk force-pushed the avi/extension branch 2 times, most recently from baa11ac to 854879d Compare October 21, 2025 07:40
We are currently implementing an incremental analysis system for JET and
JETLS by integrating them with Revise. We considered other approaches
such as implementing their own Revise-like system on the JET[LS] side
(e.g. #938), but concluded that implementing
incremental analysis by making Revise itself extensible would be better
in terms of implementation reliability and future maintenance costs.

This commit implements the first step towards this goal by extending
Revise's most important data structure `pkgdatas` to hold custom data
and providing utilities for interacting with it. Specifically, this
commit extends the leaf nodes of the `pkgdatas` data structure,
`MethodInfoKey(method_table => signature)`, to a new data structure
`SigInfo(method_table, signature, ext_data::ExtendedData)`.

`SigInfo` is defined as follows:
```julia
struct SigInfo
    mt::Union{Nothing,MethodTable}
    sig::Type
    ext::ExtendedData
end

struct ExtendedData
    owner::Symbol
    data::Any
    next::ExtendedData
    ExtendedData(owner::Symbol, data, next::ExtendedData) = new(owner, data, next)
    ExtendedData(owner::Symbol, data) = new(owner, data)
    ExtendedData() = new(:Revise, nothing)
end
```

As seen from the type definition, `SigInfo` can hold `ExtendedData` in
addition to the information that the conventional `MethodInfoKey` holds.
`ExtendedData` is implemented as a linked list, enabling safe data
management per owner through the `owner` field (similar to
`Core.CodeInstance` design).

We also implement the following utility functions for getting or
replacing `ext_data`:
- `get_extended_data(ext::ExtendedData, owner::Symbol) -> ext::Union{ExtendedData,Nothing}`
- `get_extended_data(siginfo::SigInfo, owner::Symbol) -> ext::Union{ExtendedData,Nothing}`
- `replace_extended_data(ext::ExtendedData, owner::Symbol, data) -> new_ext::ExtendedData`
- `replace_extended_data(siginfo::SigInfo, owner::Symbol, data) -> new_siginfo::SigInfo`

This commit implements these simple extension setup only. To actually
use this extension system, external packages need to decide when to
inspect `Revise.pkgdatas` by themselves and use functions like
`replace_extended_data` to hold custom data. While packages using this
extension system and Revise's revision pipeline are not tightly
synchronized, stale `SigInfo` objects are automatically removed,
enabling minimal external incremental systems. Specifically, using this
commit, we can replace `JET.report_package` with a Revise-based
implementation and make it incremental (aviatesk/JET.jl#763).

In the future, we may need to set up some callbacks in Revise's revision
pipeline to allow external packages to customize in sync with Revise's
revision pipeline. However, since the necessary requirements are still
unclear at the current stage of extending `JET.report_package`, this
commit does not implement them.
aviatesk added a commit to aviatesk/JET.jl that referenced this pull request Oct 21, 2025
This commit fundamentally reimplements `report_package` to leverage
Revise.jl's signature extraction capabilities, bringing two major
improvements:

1. Robustness: The previous implementation relied on JET's custom
   code loading mechanism via `virtual_process`, which proved fragile
   when handling the various patterns of top-level code used in
   real-world packages. By delegating to Revise's battle-tested
   infrastructure for tracking package definitions, we gain much more
   reliable analysis availability.

2. Incremental analysis: The new implementation can cache analysis
   results for each method signature in Revise's tracking data
   (via `SigAnalysisResult` stored in extended data). On subsequent
   analysis runs, unchanged methods reuse their cached results,
   dramatically reducing analysis time for iterative workflows.
   This is particularly valuable during package development when only
   small portions of code change between analysis runs.

The implementation works by:
- Using `Revise.pkgdatas` to extract all method signatures within
  target package
- Analyzing each signature individually via `analyze_method_signature!`
- Caching analysis results into `Revise.pkgdatas` via its newly
  implemented extension mechanism (Requires timholy/Revise.jl#962.)
- Checking world age validity to determine when cached results can be
  safely reused

The previous `analyze_from_definitions` and `concretization_patterns`
configurations are no longer needed, as the signature-based approach
provides more direct and reliable analysis of package definitions.

In the future, rather than loading the entire package of Revise, which
automatically enables file watching and REPL hooks, we should refactor
the core diff-and-patch functionality for code from Revise into a new
ReviseCore.jl module, and have JET use only that functionality.
However, this commit does not implement that approach.

Breaking changes:
- `report_package` now requires the package to be loaded as a `Module`
  rather than accepting a package name string
- The function signature changes from
  `report_package(pkg::Union{AbstractString,Module})` to `report_package(pkg::Module)`
- Revise.jl is now a required dependency (not a weak dependency)
aviatesk added a commit to aviatesk/JET.jl that referenced this pull request Oct 21, 2025
This commit fundamentally reimplements `report_package` to leverage
Revise.jl's signature extraction capabilities, bringing two major
improvements:

1. Robustness: The previous implementation relied on JET's custom
   code loading mechanism via `virtual_process`, which proved fragile
   when handling the various patterns of top-level code used in
   real-world packages. By delegating to Revise's battle-tested
   infrastructure for tracking package definitions, we gain much more
   reliable analysis availability.

2. Incremental analysis: The new implementation can cache analysis
   results for each method signature in Revise's tracking data
   (via `SigAnalysisResult` stored in extended data). On subsequent
   analysis runs, unchanged methods reuse their cached results,
   dramatically reducing analysis time for iterative workflows.
   This is particularly valuable during package development when only
   small portions of code change between analysis runs.

The implementation works by:
- Using `Revise.pkgdatas` to extract all method signatures within
  target package
- Analyzing each signature individually via `analyze_method_signature!`
- Caching analysis results into `Revise.pkgdatas` via its newly
  implemented extension mechanism (Requires timholy/Revise.jl#962.)
- Checking world age validity to determine when cached results can be
  safely reused

The previous `analyze_from_definitions` and `concretization_patterns`
configurations are no longer needed, as the signature-based approach
provides more direct and reliable analysis of package definitions.

In the future, rather than loading the entire package of Revise, which
automatically enables file watching and REPL hooks, we should refactor
the core diff-and-patch functionality for code from Revise into a new
ReviseCore.jl module, and have JET use only that functionality.
However, this commit does not implement that approach.

Breaking changes:
- `report_package` now requires the package to be loaded as a `Module`
  rather than accepting a package name string
- The function signature changes from
  `report_package(pkg::Union{AbstractString,Module})` to `report_package(pkg::Module)`
- Revise.jl is now a required dependency (not a weak dependency)
@aviatesk
Copy link
Collaborator Author

Alright, I've completed all the necessary changes on the aviatesk/JET.jl#763 side, which is the first user of this feature.
@timholy I'm afraid for the long description in this PR, but if you have time, I would appreciate your comments.

On the other hand, as mentioned in aviatesk/JET.jl#763, moving forward with the integration of JET and JETLS with Revise will require fairly significant refactoring, and considering these factors, I also have a desire to speed up the merge cycle for this PR and future ones I will submit to this repository.

So, while I think this refactor should ideally wait for your opinion, if there's no response by tomorrow, I would like to take responsibility and merge this change myself.
I will also rebase teh/revise_struct branch accordingly after merge.
At least for this PR, all test cases are passing, and given the nature of the code, I don't believe it will cause any performance degradation.

aviatesk added a commit to aviatesk/JET.jl that referenced this pull request Oct 21, 2025
This commit fundamentally reimplements `report_package` to leverage
Revise.jl's signature extraction capabilities, bringing two major
improvements:

1. Robustness: The previous implementation relied on JET's custom
   code loading mechanism via `virtual_process`, which proved fragile
   when handling the various patterns of top-level code used in
   real-world packages. By delegating to Revise's battle-tested
   infrastructure for tracking package definitions, we gain much more
   reliable analysis availability.

2. Incremental analysis: The new implementation can cache analysis
   results for each method signature in Revise's tracking data
   (via `SigAnalysisResult` stored in extended data). On subsequent
   analysis runs, unchanged methods reuse their cached results,
   dramatically reducing analysis time for iterative workflows.
   This is particularly valuable during package development when only
   small portions of code change between analysis runs.

The implementation works by:
- Using `Revise.pkgdatas` to extract all method signatures within
  target package
- Analyzing each signature individually via `analyze_method_signature!`
- Caching analysis results into `Revise.pkgdatas` via its newly
  implemented extension mechanism (Requires timholy/Revise.jl#962.)
- Checking world age validity to determine when cached results can be
  safely reused

The previous `analyze_from_definitions` and `concretization_patterns`
configurations are no longer needed, as the signature-based approach
provides more direct and reliable analysis of package definitions.

In the future, rather than loading the entire package of Revise, which
automatically enables file watching and REPL hooks, we should refactor
the core diff-and-patch functionality for code from Revise into a new
ReviseCore.jl module, and have JET use only that functionality.
However, this commit does not implement that approach.

Breaking changes:
- `report_package` now requires the package to be loaded as a `Module`
  rather than accepting a package name string
- The function signature changes from
  `report_package(pkg::Union{AbstractString,Module})` to `report_package(pkg::Module)`
- Revise.jl is now a required dependency (not a weak dependency)
@aviatesk aviatesk merged commit ebdf0a2 into master Oct 22, 2025
12 of 14 checks passed
@aviatesk aviatesk deleted the avi/extension branch October 22, 2025 11:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants