Optimisation: Extend the Lut generation to LUT4 elements #2458
base: main
Lut4 by default sounds great, but please make the lut width configurable on the yosys optimizer sub-pipeline via pass option. Maybe it makes sense to just extend the [...] IIUC if we can limit the yosys optimizer pipeline to generate luts of a given size by configuration, then the rest of the pipeline doesn't need any special configuration and can just lower whatever luts it sees/supports at that stage. I.e., you don't need to touch the jaxite emitter and just let it fail on lut4, and secret-to-cggi can have both lut3 and lut4 patterns added without conflict. I'm not too worried about having stuff in HEIR that [...]
I'd recommend grepping the entire repo (including the .github/workflows) for instances of techmap.v to ensure that this new techmap file is included in all places.
For example, it's included in the release workflows like https://github.com/google/heir/blob/nightly/.github/workflows/nightly.yml
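A minimal sketch of that repo-wide search, written in Python so that hidden directories such as `.github/workflows` are covered. The helper name `find_references` and its default search string are illustrative assumptions, not part of HEIR; pass the new techmap file's actual name as `needle` to check where it is referenced.

```python
import os


def find_references(root, needle="techmap.v"):
    """Return sorted (path, line_number) pairs for lines containing `needle`.

    Hypothetical helper for illustration; `needle` is a plain substring match.
    """
    hits = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip VCS internals, but keep other hidden dirs such as .github.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # errors="ignore" lets binary files pass through harmlessly.
                with open(path, encoding="utf-8", errors="ignore") as f:
                    for lineno, line in enumerate(f, start=1):
                        if needle in line:
                            hits.append((path, lineno))
            except OSError:
                # Unreadable files (broken symlinks, permissions) are skipped.
                continue
    return sorted(hits)
```

Calling `find_references(".")` from the repository root lists every file and line that mentions `techmap.v`, which is roughly what a recursive grep would report.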
I've refactored the whole repo. Now, by default the LUT3 optimisation will be chosen.
b3aadc6 to f8d3a8a
Working 4lut impl
Updated tests
Update Emitter
Update Emitter
Working mode Lut3, Lut4 selector
Correct generation of LWE Types
Ooh, very cool to see the past research! I just thought about the ability to run on FPGAs; it would be a 'nice to have' as an appendix somewhere to write about the FPGA performance of the current CGGI pipelines (mostly to show how much improvement we can make in the future).
We can make the LUT CGGI pipeline more efficient by using LUT4 elements instead of LUT3 elements. Since each LUT produces only a single output bit, and the whole program consists only of 4-to-1 LUTs, we cannot overflow a 4-bit ciphertext.
When TFHE-rs is used as the backend, this leads to more efficient use of PBS (the impact on the Jaxite backend still needs to be checked).
For now, would we like to support both lut3 and lut4 elements, for the Jaxite and TFHE-rs backends respectively?
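The overflow argument can be illustrated in plaintext. The sketch below only models the bit arithmetic, with no encryption involved: four single-bit inputs are packed into a 4-bit index, and a 16-entry truth table maps that index back to one bit, so the value fed to the next LUT is again a single bit. The function names and the bit ordering (`a` as the least significant bit) are assumptions made for illustration, not HEIR's or TFHE-rs's conventions.

```python
from itertools import product


def lut4_index(a, b, c, d):
    """Pack four bits into a 4-bit index (assumed order: a is the LSB)."""
    return a | (b << 1) | (c << 2) | (d << 3)


def eval_lut4(truth_table, a, b, c, d):
    """Evaluate a 4-input LUT stored as a 16-bit truth-table integer."""
    return (truth_table >> lut4_index(a, b, c, d)) & 1


def build_truth_table(fn):
    """Build the 16-bit truth table of a 4-input boolean function."""
    table = 0
    for d, c, b, a in product((0, 1), repeat=4):
        if fn(a, b, c, d):
            table |= 1 << lut4_index(a, b, c, d)
    return table


# Example LUT: 4-input parity (XOR of all inputs).
PARITY = build_truth_table(lambda a, b, c, d: a ^ b ^ c ^ d)

# Every possible input combination stays within the 4-bit range [0, 15],
# and every LUT output is a single bit that can feed the next LUT.
ALL_INDICES = sorted(lut4_index(*bits) for bits in product((0, 1), repeat=4))
ALL_OUTPUTS = {eval_lut4(PARITY, *bits) for bits in product((0, 1), repeat=4)}
```

Because the packed index never exceeds 15 and each output is 0 or 1, chaining any number of such LUTs never needs more than 4 bits of message space per lookup.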