Faster `mont_sqr` making use of symmetry #19

GMP performs [basecase squaring](https://gmplib.org/manual/Basecase-Multiplication) roughly 1.5x faster than a general multiplication `a*b`. It does this by noticing that the cross products above and below the diagonal are equal, so each duplicated product is computed once and doubled instead of being multiplied out twice.
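To make the idea concrete, here is a minimal host-side C++ sketch of schoolbook squaring that uses the symmetry. It is not GMP's or CGBN's code: the names `limb`, `mul_basecase`, and `sqr_basecase` are made up for illustration, and 32-bit limbs stored least significant first are an assumption. The squaring routine computes each off-diagonal product `a[i]*a[j]` (with `i < j`) once, doubles the partial sum with a one-bit shift, and then adds the diagonal squares `a[i]*a[i]`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names for illustration; limbs are 32-bit, least significant first.
using limb = std::uint32_t;

// Reference schoolbook multiply: a.size() * b.size() limb products.
std::vector<limb> mul_basecase(const std::vector<limb>& a,
                               const std::vector<limb>& b) {
    std::vector<limb> r(a.size() + b.size(), 0);
    for (std::size_t i = 0; i < a.size(); ++i) {
        std::uint64_t carry = 0;
        for (std::size_t j = 0; j < b.size(); ++j) {
            std::uint64_t t = static_cast<std::uint64_t>(a[i]) * b[j] + r[i + j] + carry;
            r[i + j] = static_cast<limb>(t);
            carry = t >> 32;
        }
        r[i + b.size()] = static_cast<limb>(carry);
    }
    return r;
}

// Squaring with symmetry: n(n-1)/2 off-diagonal products plus n diagonal squares.
std::vector<limb> sqr_basecase(const std::vector<limb>& a) {
    const std::size_t n = a.size();
    std::vector<limb> r(2 * n, 0);

    // Step 1: accumulate each cross product a[i]*a[j], i < j, exactly once.
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t carry = 0;
        for (std::size_t j = i + 1; j < n; ++j) {
            std::uint64_t t = static_cast<std::uint64_t>(a[i]) * a[j] + r[i + j] + carry;
            r[i + j] = static_cast<limb>(t);
            carry = t >> 32;
        }
        r[i + n] = static_cast<limb>(carry);
    }

    // Step 2: double the cross-product sum (shift left by one bit).
    limb shifted_in = 0;
    for (std::size_t k = 0; k < 2 * n; ++k) {
        limb shifted_out = r[k] >> 31;
        r[k] = (r[k] << 1) | shifted_in;
        shifted_in = shifted_out;
    }

    // Step 3: add the diagonal squares a[i]^2 at limb position 2*i.
    std::uint64_t carry = 0;
    for (std::size_t i = 0; i < n; ++i) {
        std::uint64_t sq = static_cast<std::uint64_t>(a[i]) * a[i];
        std::uint64_t lo = static_cast<std::uint64_t>(r[2 * i]) + static_cast<limb>(sq) + carry;
        r[2 * i] = static_cast<limb>(lo);
        std::uint64_t hi = static_cast<std::uint64_t>(r[2 * i + 1]) + (sq >> 32) + (lo >> 32);
        r[2 * i + 1] = static_cast<limb>(hi);
        carry = hi >> 32;
    }
    return r;
}
```

For `n` limbs this performs `n(n+1)/2` limb multiplications instead of `n*n`, which is where the saving comes from. The doubling and diagonal passes add linear work, which is consistent with a practical speedup below 2x. Calling `sqr_basecase(a)` should give the same limbs as `mul_basecase(a, a)`.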

I've read over the [core_mont.cu](https://github.com/NVlabs/CGBN/blob/master/include/cgbn/core/core_mont.cu) code. I believe `core_mont.cu` is used when TPI = limbs, and in that case it's not clear to me how to do fast exp.

Have you thought about this possibility?
Have you seen anyone take advantage of this on a GPU?
