@j2kun j2kun commented Sep 9, 2025

The tensor.insert kernel is broken into four cases, depending on whether:

  • the insertion indices are static or dynamic
  • the scalar v to insert is secret or cleartext

In all cases, this PR has the kernel generate one mask for the ciphertext-semantic scalar and one for the ciphertext-semantic dest tensor. The scalar's mask holds 1's when v is secret and v's when v is cleartext. The dest's mask holds 1's everywhere except 0's at the targets of the insertion.
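To make the mask arithmetic concrete, here is a minimal plaintext simulation of one packed vector of slots. It is only a sketch: `insert_with_masks` is a hypothetical name, the secret scalar is assumed to arrive replicated across all slots, and plain Python lists stand in for ciphertexts. None of this is the kernel's actual code.

```python
def insert_with_masks(dest_slots, v, target_slots, v_is_secret):
    """Simulate a masked insert on one packed vector (hypothetical helper).

    dest_slots:   slot values of the ciphertext-semantic dest tensor
    v:            the scalar to insert
    target_slots: set of slot indices that the insertion must overwrite
    v_is_secret:  True if v is encrypted, False if v is cleartext
    """
    n = len(dest_slots)

    # Dest mask: 1 everywhere, 0 at the targets of the insertion.
    dest_mask = [0 if s in target_slots else 1 for s in range(n)]

    if v_is_secret:
        # Assumption: the secret scalar is replicated in every slot, so a
        # 0/1 plaintext mask selects the target slots.
        scalar_ct = [v] * n
        scalar_mask = [1 if s in target_slots else 0 for s in range(n)]
        scalar_term = [c * m for c, m in zip(scalar_ct, scalar_mask)]
    else:
        # Cleartext scalar: the mask carries v itself at the targets.
        scalar_term = [v if s in target_slots else 0 for s in range(n)]

    # Zero out the targets in dest, then add the masked scalar.
    return [d * m + t for d, m, t in zip(dest_slots, dest_mask, scalar_term)]


print(insert_with_masks([1, 2, 3, 4], 9, {2}, v_is_secret=True))   # [1, 2, 9, 4]
print(insert_with_masks([1, 2, 3, 4], 9, {2}, v_is_secret=False))  # [1, 2, 9, 4]
```

Both cases produce the same slot values. They differ only in whether v enters through a ciphertext-plaintext multiply or directly through the plaintext mask.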

If the indices are static, then the insertions can be done directly: take the layout, add constraints for the static indices on the domain, then simplify and (using ISL) enumerate the points of the range. Each point becomes one insertion. A new utility, getRangePoints, was added to support this; it can be generalized as needed.
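The static case can be illustrated with a brute-force stand-in for the ISL-based enumeration. In this sketch the layout is modeled as a Python function from a tensor index to its (ciphertext, slot) points. `get_range_points` below only mirrors the name of the new utility, and `row_major_replicated` is a hypothetical layout chosen for the example. Neither reflects the real implementation or signature.

```python
from itertools import product


def get_range_points(layout, domain_shape, fixed):
    """Enumerate the range points reachable from the constrained domain.

    layout:       function mapping a domain index tuple to (ct, slot) pairs
    domain_shape: shape of the tensor (the layout's domain)
    fixed:        dict {dimension: static index value} constraining the domain
    """
    points = set()
    for idx in product(*(range(d) for d in domain_shape)):
        # Keep only domain points that satisfy the static-index constraints.
        if all(idx[dim] == val for dim, val in fixed.items()):
            points.update(layout(idx))
    return sorted(points)


def row_major_replicated(idx, cols=4, num_slots=16, tensor_size=8):
    """Hypothetical layout: a 2x4 tensor packed row-major into one 16-slot
    ciphertext and repeated cyclically to fill the slots."""
    i, j = idx
    base = i * cols + j
    return [(0, slot) for slot in range(base, num_slots, tensor_size)]


# Inserting at static index [1, 2] touches two slots of ciphertext 0.
print(get_range_points(row_major_replicated, (2, 4), {0: 1, 1: 2}))
# [(0, 6), (0, 14)]
```

Each returned point corresponds to one insertion, i.e. one slot where the dest mask gets a 0 and the scalar mask gets its 1 or v.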

If the indices are dynamic, the range points are not known at compile time, so we have to use the codegen and emit a loop whose body does the relevant mask construction.

This improves over the previous insert kernel in that:

This one feels like another win for the new layout system, since it is so straightforward to express the assembly of a special plaintext mask.

@j2kun j2kun requested a review from asraa September 9, 2025 19:58

j2kun commented Sep 9, 2025

OK, I realized I missed a big part of the "dynamic mask generation" step: I need to add an if statement that checks whether the dynamic index values are actually hit during the iteration.
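A rough plaintext sketch of what such a guarded loop body might look like follows. The function name, the `(index, ct, slot)` iteration format, and the list-of-lists masks are all assumptions made for illustration, not the generated code.

```python
def dynamic_insert_masks(layout_points, runtime_index, num_cts, num_slots):
    """Build dest and scalar masks when the insertion index is only known
    at runtime (hypothetical sketch).

    layout_points: iterable of (domain_index, ct, slot) triples, in the order
                   a generated loop nest over the layout would visit them
    runtime_index: tuple holding the dynamic insertion indices
    """
    dest_mask = [[1] * num_slots for _ in range(num_cts)]
    scalar_mask = [[0] * num_slots for _ in range(num_cts)]

    for idx, ct, slot in layout_points:
        # The guard: only touch the masks when this iteration's tensor index
        # matches the dynamic insertion index.
        if idx == runtime_index:
            dest_mask[ct][slot] = 0
            scalar_mask[ct][slot] = 1  # would hold v instead for a cleartext scalar

    return dest_mask, scalar_mask


# Assumed layout: a size-4 tensor repeated cyclically in one 8-slot ciphertext.
points = [((s % 4,), 0, s) for s in range(8)]
dest_mask, scalar_mask = dynamic_insert_masks(points, (1,), num_cts=1, num_slots=8)
print(dest_mask[0])    # [1, 0, 1, 1, 1, 0, 1, 1]
print(scalar_mask[0])  # [0, 1, 0, 0, 0, 1, 0, 0]
```

Without the guard, every iteration would write to the masks and the insert would clobber the whole dest tensor rather than only the targeted slots.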

@j2kun j2kun force-pushed the insert-extract-kernel branch 2 times, most recently from 73fea6f to 43dba22 Compare September 10, 2025 18:54
@j2kun j2kun force-pushed the insert-extract-kernel branch from 4008a29 to f06d254 Compare September 10, 2025 19:01
@j2kun j2kun added the pull_ready label (Indicates whether a PR is ready to pull. The copybara worker will import for internal testing) Sep 10, 2025
@copybara-service copybara-service bot merged commit f7aab50 into google:main Sep 10, 2025
8 checks passed