Add a tensor.insert kernel for new layouts #2204
Merged
The tensor.insert kernel is broken into four cases, depending on whether:
- the indices are static/dynamic
- the scalar `v` to insert is secret/cleartext

In all cases, this PR has the kernel generate a mask for the ciphertext-semantic scalar and dest tensors. The mask for the scalar has 1's or `v`'s for the secret/cleartext cases, respectively, and the mask for the dest has 1's everywhere except 0's in the target of the insertion.

If the indices are static, then the insertions can be done directly: take the layout, add constraints for the static indices on the domain, then simplify and (using ISL) enumerate the points of the range. Each point becomes one insertion. A new utility, `getRangePoints`, was added to support this, which can be generalized as needed.

If the indices are dynamic, then we have to use the codegen, where the body of the loop does the relevant mask construction.
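For readers who want to see the mask arithmetic in the static-index case, here is a minimal Python sketch. It is not the kernel from this PR: the cyclic-repetition layout, the helper names (`repeated_layout`, `build_masks`, `insert_secret`, `insert_cleartext`), and the assumption that a secret scalar is replicated across all slots are all hypothetical, chosen only to illustrate how a "1's or `v`'s" scalar mask and a "0's at the target" dest mask could combine.

```python
from typing import Callable, List, Optional, Tuple


def repeated_layout(n: int, num_slots: int) -> Callable[[int], List[int]]:
    """Hypothetical layout: data index i lives in every slot s with s % n == i."""
    def slots_for(index: int) -> List[int]:
        # Stand-in for enumerating the range points of the layout
        # after constraining the domain to one static index.
        return [s for s in range(num_slots) if s % n == index]
    return slots_for


def build_masks(target_slots: List[int], num_slots: int,
                v: Optional[int] = None) -> Tuple[List[int], List[int]]:
    """Return (scalar_mask, dest_mask).

    v is None  -> secret scalar: the scalar mask holds 1's at the targets.
    v is given -> cleartext scalar: the scalar mask holds v at the targets.
    The dest mask is 1 everywhere except 0 at the targets.
    """
    fill = 1 if v is None else v
    scalar_mask = [0] * num_slots
    dest_mask = [1] * num_slots
    for s in target_slots:
        scalar_mask[s] = fill
        dest_mask[s] = 0
    return scalar_mask, dest_mask


def insert_secret(dest: List[int], scalar_slots: List[int],
                  target_slots: List[int]) -> List[int]:
    """Secret case (assumed): dest * dest_mask + scalar * scalar_mask."""
    scalar_mask, dest_mask = build_masks(target_slots, len(dest))
    return [d * dm + x * sm
            for d, dm, x, sm in zip(dest, dest_mask, scalar_slots, scalar_mask)]


def insert_cleartext(dest: List[int], v: int,
                     target_slots: List[int]) -> List[int]:
    """Cleartext case (assumed): dest * dest_mask + mask that already holds v."""
    scalar_mask, dest_mask = build_masks(target_slots, len(dest), v)
    return [d * dm + sm for d, dm, sm in zip(dest, dest_mask, scalar_mask)]


# A length-4 tensor repeated across 8 slots; insert 99 at static index 2.
layout = repeated_layout(n=4, num_slots=8)
targets = layout(2)
dest = [10, 20, 30, 40, 10, 20, 30, 40]
print(insert_cleartext(dest, 99, targets))
print(insert_secret(dest, [99] * 8, targets))
```

Both calls overwrite every slot that the layout assigns to index 2 and leave the other slots untouched. In the secret case the mask stays a plaintext of 0's and 1's, so the only operations on encrypted data are two plaintext multiplications and one addition.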
This improves over the previous insert kernel in that:
This one feels like another win for the new layout system, since it is so straightforward to express the assembly of a special plaintext mask.