Add simple warp specialized example to direct bindings. #5683
base: main
* Add `warp_specialize` and `inline_at` schedule operations
!test
Review updated until commit 647ceb8

Description

| Relevant files |
|---|
| Enhancement |
| Tests |
PR Reviewer Guide
Here are some key observations to aid the review process:
🧪 PR contains tests
⚡ Recommended focus areas for review
Missing Parameter Validation
The `inline_at` and `warp_specialize` functions lack input validation. Consider adding checks for null pointers, valid position indices, positive stage counts, and reasonable prefetch distances to prevent runtime errors and improve the user experience.
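The checks suggested above could look roughly like the following plain-Python sketch. The helper names `validate_inline_at` and `validate_warp_specialize` are hypothetical and not part of this PR. The accepted ranges are assumptions: negative positions wrap over `ndim + 1` slots, and the prefetch distance must be smaller than the stage count. The real bindings may enforce different limits.

```python
def validate_inline_at(ndim: int, pos: int) -> int:
    """Return a normalized, non-negative inline position.

    Assumption: valid positions are -(ndim + 1) .. ndim, with negative
    values counted from the end, Python style.
    """
    if isinstance(pos, bool) or not isinstance(pos, int):
        raise TypeError(f"pos must be an int, got {type(pos).__name__}")
    if not -(ndim + 1) <= pos <= ndim:
        raise ValueError(f"pos {pos} is out of range for a {ndim}-D tensor")
    return pos + ndim + 1 if pos < 0 else pos


def validate_warp_specialize(stages: int, prefetch: int) -> None:
    """Reject stage counts and prefetch distances that cannot work.

    Assumption: at least one stage is required and 0 <= prefetch < stages.
    """
    if stages < 1:
        raise ValueError(f"stages must be positive, got {stages}")
    if not 0 <= prefetch < stages:
        raise ValueError(
            f"prefetch must be in [0, {stages - 1}], got {prefetch}"
        )
```

Raising `ValueError` early in the binding layer would turn a scheduling failure deep inside the compiler into a message that names the offending argument.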
Greptile Overview
Greptile Summary
This PR adds Python bindings for warp specialization circular buffering to the direct bindings API:
The implementation follows existing patterns in the codebase and correctly handles the optional
Confidence Score: 5/5
Important Files Changed
File Analysis
Sequence Diagram
sequenceDiagram
participant Test as Test Code
participant FD as FusionDefinition
participant Sched as Schedule Module
participant TV as TensorView
participant CB as CircularBuffer
Test->>FD: Create fusion definition
FD->>TV: from_pytorch(t0, t1)
FD->>TV: ops.add(tv0, tv1)
FD->>FD: add_output(tv2)
Note over Test,CB: Schedule Phase
Test->>TV: cache_after(LoadStoreOpType.tma)
Test->>TV: set_memory_type(MemoryType.shared)
Test->>TV: split(-1, bulk_inner_dim)
Test->>Sched: transform_like(reference)
Test->>TV: parallelize(ParallelType.grid_x)
Test->>Sched: inline_at(reference, pos=2)
Test->>TV: parallelize(ParallelType.tma)
Test->>Sched: warp_specialize(tv, stages, prefetch, ParallelType.block_y)
Sched->>CB: WarpSpecialized(parallel_type)
CB->>TV: circularBuffer(stages, prefetch, type)
Note over Test,CB: Execution Phase
Test->>FD: manual_execute(inputs)
FD-->>Test: outputs
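To make the `stages` and `prefetch` arguments in the diagram concrete, here is a conceptual pure-Python model of circular buffering applied to a pointwise add. It is only a sketch of the general technique, not nvFuser's implementation: the function name `circular_buffered_add` and the event log are made up for illustration, and no GPU, TMA, or warp roles are involved.

```python
def circular_buffered_add(a_tiles, b_tiles, stages, prefetch):
    """Add tiles pairwise while loading `prefetch` tiles ahead of compute.

    `stages` is the number of buffer slots; tile i lives in slot i % stages.
    Assumption: 0 <= prefetch < stages, so a load never overwrites a slot
    that still holds an unconsumed tile.
    """
    if not 0 <= prefetch < stages:
        raise ValueError("prefetch must satisfy 0 <= prefetch < stages")
    n = len(a_tiles)
    slots = [None] * stages
    log = []  # (event, tile index, slot index)

    def load(i):
        slots[i % stages] = (a_tiles[i], b_tiles[i])
        log.append(("load", i, i % stages))

    # Prologue: fill the pipeline before any compute starts.
    for i in range(min(prefetch, n)):
        load(i)

    out = []
    for i in range(n):
        # Issue the load that runs `prefetch` iterations ahead.
        if i + prefetch < n:
            load(i + prefetch)
        a, b = slots[i % stages]
        out.append([x + y for x, y in zip(a, b)])
        log.append(("compute", i, i % stages))
    return out, log
```

In a warp-specialized schedule the load and compute steps would be carried out by different warps and synchronized in hardware; the sequential loop here only shows the order in which slots are filled and consumed.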
2 files reviewed, 1 comment
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
!test
2 files reviewed, no comments
* `warp_specialize` and `inline_at` schedule operations
* `test_register_sharing_circular_buffering_pointwise` example.