[None][fix] pool-qualify KV cache transfer pending keys#15272
[None][fix] pool-qualify KV cache transfer pending keys#15272chienchunhung wants to merge 2 commits into
Conversation
Signed-off-by: Chien-Chun Hung <2679986+chienchunhung@users.noreply.github.com>
f8f1a27 to
1a84767
Compare
|
/bot run --disable-fail-fast |
|
PR_Github #53720 [ run ] triggered by Bot. Commit: |
Signed-off-by: Chien-Chun Hung <2679986+chienchunhung@users.noreply.github.com>
1a84767 to
d88df99
Compare
|
/bot run --disable-fail-fast |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (3)
📝 WalkthroughWalkthrough
ChangesPool-Qualified Pending Transfer Tracking
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
PR_Github #53767 [ run ] triggered by Bot. Commit: |
|
PR_Github #53720 [ run ] completed with state |
|
/bot run --disable-fail-fast |
|
PR_Github #53792 [ run ] triggered by Bot. Commit: |
Summary
This draft fixes pending KV cache transfer tracking so primary and secondary pools no longer collide when they use the same dense memory-pool slot index.
KVCacheIndex::get()strips the secondary-pool bit, but the transfer manager used that dense value as the key for pending read/write CUDA events. That aliases physical locations such as primary slot 0 and secondary slot 0. The change adds a pool-qualified pending-transfer key and routes onboard/offload read/write tracking through it.Test Coverage
Added a deterministic regression test in
kvCacheManagerTestthat schedules:primary slot 0 -> secondary slot 1secondary slot 0 -> primary slot 1The test verifies the transfer manager keeps two pending reads and two pending writes, one per physical
(pool, slot)location. This would collapse to one read/write entry with dense-only keys.Validation
git diff --checkclang-formatpassingNot yet run: GPU C++ gtest. Suggested command:
./cpp/build/tests/unit_tests/batch_manager/kvCacheManagerTest --gtest_filter=KVCacheManagerTest.KVCacheTransferManager*Pre-commit note: skipped
waive list check,validate-test-lists, andtype-checkon the successful commit becausescripts/check_test_list.pyfailed in this local hook environment withTypeError: unsupported operand type(s) for |: 'type' and 'NoneType', unrelated to this C++ change.Summary by CodeRabbit
Refactor
Tests