[SOT] Add C++ compiled guard lookup by SigureMo · Pull Request #79036 · PaddlePaddle/Paddle

SigureMo · 2026-05-17T11:44:56Z

PR Category

Performance Optimization

PR Types

Performance

Description

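This PR adds a C++ compiled guard and a cross-guard trie lookup to SOT guards, to reduce guard overhead on the warm-start cache-hit path.

Main changes:

Compile Python guard specs into a C++ CompiledGuard, covering SOT guard scenarios such as type, id, value, length, tensor meta, and layer hook.
Add CompiledGuardLookup, which builds a trie keyed by guard op key, reuses common prefixes when there are multiple cache entries, and returns the cache index directly.
Wire the compiled lookup into the executor cache. In strict guard mode the original guard/mirror guard still runs and the index is checked for consistency; no fallback-to-Python guard logic is provided.
Add test/sot/test_compiled_guard.py, covering correctness cases such as hit/miss, layer hook, constructor error, and a lookup with 3 misses followed by a hit on the 4th entry.
Add test/sot/benchmark_compiled_guard.py to make it easy to compare Paddle and Torch guard performance locally and in nightly runs.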
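
To illustrate the prefix-sharing idea behind the trie lookup, here is a minimal Python sketch. It is not the PR's C++ implementation: the names `GuardTrie`, `TrieNode`, `make_entry`, and the string op keys are hypothetical, and each guard is assumed to be an ordered list of (op key, predicate) pairs. Entries that share a prefix of op keys share trie nodes, so a shared check is evaluated once per lookup instead of once per cache entry.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical types for illustration; not the PR's C++ API.
Check = Callable[[Any], bool]
GuardOp = Tuple[str, Check]  # (guard op key, predicate over the guarded state)


class TrieNode:
    def __init__(self) -> None:
        # op key -> (predicate, child node reached when the predicate passes)
        self.children: Dict[str, Tuple[Check, "TrieNode"]] = {}
        self.cache_index: Optional[int] = None


class GuardTrie:
    def __init__(self) -> None:
        self.root = TrieNode()
        self.checks_run = 0  # counts predicate evaluations, for illustration only

    def insert(self, ops: List[GuardOp], cache_index: int) -> None:
        node = self.root
        for key, check in ops:
            # Reuse the existing node when another entry already has this op key.
            if key not in node.children:
                node.children[key] = (check, TrieNode())
            node = node.children[key][1]
        if node.cache_index is None:
            node.cache_index = cache_index

    def lookup(self, state: Any) -> Optional[int]:
        return self._walk(self.root, state)

    def _walk(self, node: TrieNode, state: Any) -> Optional[int]:
        if node.cache_index is not None:
            return node.cache_index  # every op on this path passed
        for check, child in node.children.values():
            self.checks_run += 1
            if check(state):
                hit = self._walk(child, state)
                if hit is not None:
                    return hit
        return None  # miss: no cache entry matches


def make_entry(shape: Tuple[int, ...]) -> List[GuardOp]:
    # Three ops per entry; the first two are identical across entries.
    return [
        ("type:dict", lambda s: isinstance(s, dict)),
        ("dtype:float32", lambda s: s["dtype"] == "float32"),
        (f"shape:{shape}", lambda s, shape=shape: s["shape"] == shape),
    ]


trie = GuardTrie()
for index, shape in enumerate([(1, 3), (2, 3), (4, 3), (8, 3)]):
    trie.insert(make_entry(shape), index)

# Three entries miss on shape, the fourth hits.
hit = trie.lookup({"dtype": "float32", "shape": (8, 3)})
checks_for_hit = trie.checks_run
miss = trie.lookup({"dtype": "float32", "shape": (16, 3)})
```

In this toy setup the hit on the fourth entry evaluates 6 predicates (2 shared plus 4 shape checks), whereas checking the four guards one after another would evaluate 11 (three failed guards at 3 checks each, stopping at the shape mismatch, plus... in the worst case 12). The exact savings in the real implementation depend on how guard ops are ordered and keyed, which this sketch does not model.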

Local verification:

prek
ninja -C build_312 python/paddle/base/libpaddle.so
python -m py_compile ...
python test/sot/test_compiled_guard.py
SOT_ENABLE_STRICT_GUARD_CHECK=True python test/sot/test_sot_resnet.py
python -m unittest test.sot.test_guard_tree test.sot.test_sot_cache
python test/sot/test_guard_fastpath_strategy.py
python test/sot/benchmark_compiled_guard.py --case resnet18 --resnet-image-size 64 --iterations 10000 --hot-iterations 20 --rounds 7 --compare-torch --multi-cache-count 4 --max-torch-guard-ratio 1.1 --max-torch-multi-lookup-ratio 1.1

ResNet guard benchmark result:

Paddle compiled guard only: 2.516 us/check
Torch Dynamo guard only: 2.448 us/check
Paddle/Torch single guard ratio: 1.03x
Paddle compiled trie lookup, 4-cache hit after 3 misses: 2.944 us/lookup
Torch Dynamo 4-guard lookup: 4.150 us/lookup
Paddle/Torch multi-guard ratio: 0.71x
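
As a quick arithmetic check, the reported ratios can be recomputed from the timings above. The sketch below assumes each ratio is Paddle time divided by Torch time and that the 1.1 values passed to --max-torch-guard-ratio and --max-torch-multi-lookup-ratio act as upper bounds; how the benchmark script actually enforces them is not shown here, and the variable names are illustrative.

```python
# Timings copied from the benchmark result above (microseconds).
paddle_single_us = 2.516
torch_single_us = 2.448
paddle_multi_us = 2.944
torch_multi_us = 4.150

# Assumed thresholds, taken from the command-line flags (both 1.1).
max_single_ratio = 1.1
max_multi_ratio = 1.1

single_ratio = paddle_single_us / torch_single_us
multi_ratio = paddle_multi_us / torch_multi_us

print(f"single guard ratio: {single_ratio:.2f}x")
print(f"multi-guard ratio:  {multi_ratio:.2f}x")

single_ok = single_ratio <= max_single_ratio
multi_ok = multi_ratio <= max_multi_ratio
```

Both ratios fall under 1.1 with these numbers: the single guard is about 3% slower than Torch Dynamo, while the 4-entry lookup takes about 71% of Torch's time.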

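Does this change affect precision?

No. This change only affects the SOT guard/cache hit-check path and does not change operator computation or model numerical logic; under strict guard check, compiled guard results are verified to match those of the original guard.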
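
The strict-mode mirror check can be pictured roughly as follows. This is a sketch under assumptions, not the PR's code: `checked_lookup` and `mirror_lookup` are hypothetical names, the original guards are modeled as plain predicates tried in order, and raising on a mismatch is an assumed behavior (the description only states that index consistency is validated and that there is no fallback to the Python guard).

```python
from typing import Any, Callable, List, Optional

Guard = Callable[[Any], bool]
Lookup = Callable[[Any], Optional[int]]


def mirror_lookup(guards: List[Guard], state: Any) -> Optional[int]:
    # Reference path: try the original guards in order, return the first hit.
    for index, guard in enumerate(guards):
        if guard(state):
            return index
    return None


def checked_lookup(
    compiled_lookup: Lookup, original_guards: List[Guard], state: Any, strict: bool
) -> Optional[int]:
    index = compiled_lookup(state)
    if strict:
        expected = mirror_lookup(original_guards, state)
        if expected != index:
            # Assumed behavior: surface the disagreement instead of falling back.
            raise RuntimeError(
                f"compiled lookup returned {index}, original guards returned {expected}"
            )
    return index


guards = [lambda s: s == 1, lambda s: s == 2]
good_lookup = {1: 0, 2: 1}.get  # agrees with the original guards
bad_lookup = lambda s: 0        # deliberately wrong for s == 2

agreed = checked_lookup(good_lookup, guards, 2, strict=True)
unchecked = checked_lookup(bad_lookup, guards, 2, strict=False)
try:
    checked_lookup(bad_lookup, guards, 2, strict=True)
    mismatch_detected = False
except RuntimeError:
    mismatch_detected = True
```

With strict checking off, the compiled result is used as-is; with it on, a disagreement between the two paths is reported rather than silently resolved.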

Add a C++ compiled guard implementation and trie-based lookup for SOT cache entries. Wire it into the executor cache without Python guard fallback, and keep strict guard checking as a mirror path that validates compiled hits against the original guard semantics. Add focused correctness tests plus a ResNet guard benchmark comparing Paddle compiled guard lookup with Torch Dynamo guard behavior.

Co-authored-by: Codex <noreply@openai.com>

paddle-bot · 2026-05-17T11:45:03Z

Your PR has been submitted. Thanks for your contribution!
Please wait for the CI results first. See the Paddle CI Manual for details.

Co-authored-by: Codex <noreply@openai.com>

SigureMo and others added 5 commits May 17, 2026 22:35

[SOT] Support compiled guard for dist tensor meta

0966d1d

Co-authored-by: Codex <noreply@openai.com>

[SOT] Avoid skip decorator in compiled guard test

9b9d3a0

Co-authored-by: Codex <noreply@openai.com>

[SOT] Avoid Windows macro conflict in compiled guard

e3e0f51

Co-authored-by: Codex <noreply@openai.com>

[SOT] Retire legacy guard paths

c56c22b

Co-authored-by: Codex <noreply@openai.com>

[SOT] Type compiled guard specs

7b02901

Co-authored-by: Codex <noreply@openai.com>

SigureMo wants to merge 6 commits into PaddlePaddle:develop from cattidea:sot/cpp-guard-lookup
