Search results

2025/796 (PDF) Last updated: 2025-05-04

Unified MEDS Accelerator

Sanjay Deshpande, Yongseok Lee, Mamuri Nawan, Kashif Nawaz, Ruben Niederhagen, Yunheung Paek, Jakub Szefer

Implementation

The Matrix Equivalence Digital Signature (MEDS) scheme a code-based candidate in the first round of NIST’s Post-Quantum Cryptography (PQC) standardization process, offers competitively small signature sizes but incurs high computational costs for signing and verification. This work explores how a high-performance FPGA-based hardware implementation can enhance MEDS performance by leveraging the inherent parallelism of its computations, while examining the trade-offs between performance gains...

2025/621 (PDF) Last updated: 2025-04-05

SPHINCSLET: An Area-Efficient Accelerator for the Full SPHINCS+ Digital Signature Algorithm

Sanjay Deshpande, Yongseok Lee, Cansu Karakuzu, Jakub Szefer, Yunheung Paek

Implementation

This work presents SPHINCSLET, the first fully standard-compliant and area-efficient hardware implementation of the SLH-DSA algorithm, formerly known as SPHINCS+, a post-quantum digital signature scheme. SPHINCSLET is designed to be parameterizable across different security levels and hash functions, offering a balanced trade-off between area efficiency and performance. Existing hardware implementations either feature a large area footprint to achieve fast signing and verification or adopt a...

2025/620 (PDF) Last updated: 2025-04-04

Need for zkSpeed: Accelerating HyperPlonk for Zero-Knowledge Proofs

Alhad Daftardar, Jianqiao Mo, Joey Ah-kiow, Benedikt Bünz, Ramesh Karri, Siddharth Garg, Brandon Reagen

Implementation

(Preprint) Zero-Knowledge Proofs (ZKPs) are rapidly gaining importance in privacy-preserving and verifiable computing. ZKPs enable a proving party to prove the truth of a statement to a verifying party without revealing anything else. ZKPs have applications in blockchain technologies, verifiable machine learning, and electronic voting, but have yet to see widespread adoption due to the computational complexity of the proving process.Recent works have accelerated the key primitives of...

2025/601 (PDF) Last updated: 2025-04-02

PHOENIX: Crypto-Agile Hardware Sharing for ML-KEM and HQC

Antonio Ras, Antoine Loiseau, Mikaël Carmona, Simon Pontié, Guénaël Renault, Benjamin Smith, Emanuele Valea

Implementation

The transition to quantum-safe public-key cryptography has begun: for key agreement, NIST has standardized ML-KEM and selected HQC for future standardization. The relative immaturity of these schemes encourages crypto-agile implementations, to facilitate easy transitions between them. Intelligent crypto-agility requires efficient sharing strategies to compute operations from different cryptosystems using the same resources. This is particularly challenging for cryptosystems with distinct...

2025/559 (PDF) Last updated: 2025-03-26

Is Your Bluetooth Chip Leaking Secrets via RF Signals?

Yanning Ji, Elena Dubrova, Ruize Wang

Attacks and cryptanalysis

In this paper, we present a side-channel attack on the hardware AES accelerator of a Bluetooth chip used in millions of devices worldwide, ranging from wearables and smart home products to industrial IoT. The attack leverages information about AES computations unintentionally transmitted by the chip together with RF signals to recover the encryption key. Unlike traditional side-channel attacks that rely on power or near-field electromagnetic emissions as sources of information, RF-based...

2025/520 (PDF) Last updated: 2025-03-19

Masking-Friendly Post-Quantum Signatures in the Threshold-Computation-in-the-Head Framework

Thibauld Feneuil, Matthieu Rivain, Auguste Warmé-Janville

Cryptographic protocols

Side-channel attacks pose significant threats to cryptographic implementations, which require the inclusion of countermeasures to mitigate these attacks. In this work, we study the masking of state-of-the-art post-quantum signatures based on the MPC-in-the-head paradigm. More precisely, we focus on the recent threshold-computation-in-the-head (TCitH) framework that applies to some NIST candidates of the post-quantum standardization process. We first provide an analysis of side-channel attack...

2025/137 (PDF) Last updated: 2025-04-19

FINAL bootstrap acceleration on FPGA using DSP-free constant-multiplier NTTs

Jonas Bertels, Hilder V. L. Pereira, Ingrid Verbauwhede

Implementation

This work showcases Quatorze-bis, a state-of-the-art Number Theoretic Transform circuit for TFHE-like cryptosystems on FPGAs. It contains a novel modular multiplication design for modular multiplication with a constant for a constant modulus. This modular multiplication design does not require any DSP units or any dedicated multiplier unit, nor does it require extra logic when compared to the state-of-the-art modular multipliers. Furthermore, we present an implementation of a constant...

2025/121 (PDF) Last updated: 2025-01-26

On symbolic computations over arbitrary commutative rings and cryptography with the temporal Jordan-Gauss graphs.

Vasyl Ustimenko

Foundations

The paper is dedicated to Multivariate Cryptography over general commutative ring K and protocols of symbolic computations for safe delivery of multivariate maps. We consider itera-tive algorithm of generation of multivariate maps of prescribed degree or density with the trapdoor accelerator, i.e. piece of information which allows to compute the reimage of the map in polynomial time. The concept of Jordan-Gauss temporal graphs is used for the obfus-cation of known graph based public keys ...

2024/1918 (PDF) Last updated: 2025-04-20

Accelerating Hash-Based Polynomial Commitment Schemes with Linear Prover Time

Florian Hirner, Florian Krieger, Constantin Piber, Sujoy Sinha Roy

Implementation

Zero-knowledge proofs (ZKPs) are cryptographic protocols that enable one party to prove the validity of a statement without revealing any information beyond its truth. A central building block in many ZKPs are polynomial commitment schemes (PCS) where constructions with \textit{linear-time provers} are especially attractive. Two such examples are Brakedown and its extension Orion which enable linear-time and quantum-resistant proving by leveraging linear-time encodable Spielman codes....

2024/1827 (PDF) Last updated: 2024-11-07

OPTIMSM: FPGA hardware accelerator for Zero-Knowledge MSM

Xander Pottier, Thomas de Ruijter, Jonas Bertels, Wouter Legiest, Michiel Van Beirendonck, Ingrid Verbauwhede

Implementation

The Multi-Scalar Multiplication (MSM) is the main barrier to accelerating Zero-Knowledge applications. In recent years, hardware acceleration of this algorithm on both FPGA and GPU has become a popular research topic and the subject of a multi-million dollar prize competition (ZPrize). This work presents OPTIMSM: Optimized Processing Through Iterative Multi-Scalar Multiplication. This novel accelerator focuses on the acceleration of the MSM algorithm for any Elliptic Curve (EC) by improving...

2024/1740 (PDF) Last updated: 2024-11-13

OpenNTT: An Automated Toolchain for Compiling High-Performance NTT Accelerators in FHE

Florian Krieger, Florian Hirner, Ahmet Can Mert, Sujoy Sinha Roy

Implementation

Modern cryptographic techniques such as fully homomorphic encryption (FHE) have recently gained broad attention. Most of these cryptosystems rely on lattice problems wherein polynomial multiplication forms the computational bottleneck. A popular method to accelerate these polynomial multiplications is the Number-Theoretic Transformation (NTT). Recent works aim to improve the practical deployability of NTT and propose toolchains supporting the NTT hardware accelerator design processes....

2024/1480 (PDF) Last updated: 2024-09-21

On Schubert cells of Projective Geometry and quadratic public keys of Multivariate Cryptography

Vasyl Ustimenko

Public-key cryptography

Jordan-Gauss graphs are bipartite graphs given by special quadratic equations over the commutative ring K with unity with partition sets K^n and K^m , n ≥m such that the neighbour of each vertex is defined by the system of linear equation given in its row-echelon form. We use families of this graphs for the construction of new quadratic and cubic surjective multivariate maps F of K^n onto K^m (or K^n onto K^n) with the trapdoor accelerators T , i. e. pieces of information which...

2024/1253 (PDF) Last updated: 2024-08-08

FELIX (XGCD for FALCON): FPGA-based Scalable and Lightweight Accelerator for Large Integer Extended GCD

Sam Coulon, Tianyou Bao, Jiafeng Xie

Implementation

The Extended Greatest Common Divisor (XGCD) computation is a critical component in various cryptographic applications and algorithms, including both pre- and post-quantum cryptosystems. In addition to computing the greatest common divisor (GCD) of two integers, the XGCD also produces Bezout coefficients $b_a$ and $b_b$ which satisfy $\mathrm{GCD}(a,b) = a\times b_a + b\times b_b$. In particular, computing the XGCD for large integers is of significant interest. Most recently, XGCD computation...

2024/1246 (PDF) Last updated: 2024-08-06

MSMAC: Accelerating Multi-Scalar Multiplication for Zero-Knowledge Proof

Pengcheng Qiu, Guiming Wu, Tingqiang Chu, Changzheng Wei, Runzhou Luo, Ying Yan, Wei Wang, Hui Zhang

Implementation

Multi-scalar multiplication (MSM) is the most computation-intensive part in proof generation of Zero-knowledge proof (ZKP). In this paper, we propose MSMAC, an FPGA accelerator for large-scale MSM. MSMAC adopts a specially designed Instruction Set Architecture (ISA) for MSM and optimizes pipelined Point Addition Unit (PAU) with hybrid Karatsuba multiplier. Moreover, a runtime system is proposed to split MSM tasks with the optimal sub-task size and orchestrate execution of Processing Elements...

2024/1192 (PDF) Last updated: 2024-07-24

Towards ML-KEM & ML-DSA on OpenTitan

Amin Abdulrahman, Felix Oberhansl, Hoang Nguyen Hien Pham, Jade Philipoom, Peter Schwabe, Tobias Stelzer, Andreas Zankl

Implementation

This paper presents extensions to the OpenTitan hardware root of trust that aim at enabling high-performance lattice-based cryptography. We start by carefully optimizing ML-KEM and ML-DSA - the two primary algorithms selected by NIST for standardization - in software targeting the OTBN accelerator. Based on profiling results of these implementations, we propose tightly integrated extensions to OTBN, specifically an interface from OTBN to OpenTitan's Keccak accelerator (KMAC core) and...

2024/744 (PDF) Last updated: 2025-04-16

An NVMe-based Secure Computing Platform with FPGA-based TFHE Accelerator

Yoshihiro Ohba, Tomoya Sanuki, Claude Gravel, Kentaro Mihara, Asuka Wakasugi, Kenta Adachi

Implementation

In this study, we introduce a new approach to secure computing by implementing a platform that utilizes a non-volatile memory express (NVMe)-based system with an FPGA-based Torus fully homomorphic encryption (TFHE) accelerator, solid state drive (SSD), and middleware on the host-side. Our platform is the first to offer completely secure computing capabilities for TFHE by using an FPGA-based accelerator. We defined secure computing instructions to evaluate 14-bit to 14-bit functions using...

2024/633 (PDF) Last updated: 2024-06-27

Vision Mark-32: ZK-Friendly Hash Function Over Binary Tower Fields

Tomer Ashur, Mohammad Mahzoun, Jim Posen, Danilo Šijačić

Implementation

Zero-knowledge proof systems are widely used in different applications on the Internet. Among zero-knowledge proof systems, SNARKs are a popular choice because of their fast verification time and small proof size. The efficiency of zero-knowledge systems is crucial for usability, resulting in the development of so-called arithmetization-oriented ciphers. In this work, we introduce Vision Mark-32, a modified instance of Vision defined over binary tower fields, with an optimized number of...

2024/319 (PDF) Last updated: 2024-02-24

On the cryptosystems based on two Eulerian transfor-mations defined over the commutative rings $Z_{2^s}, s>1$.

Vasyl Ustimenko

Cryptographic protocols

We suggest the family of ciphers s^E^n, n=2,3,.... with the space of plaintexts (Z*_{2^s})^n, s >1 such that the encryption map is the composition of kind G=G_1A_1G_2A_2 where A_i are the affine transformations from AGL_n(Z_{2^s}) preserving the variety (Z*_{2^s)}^n , Eulerian endomorphism G_i , i=1,2 of K[x_1, x_2,...., x_n] moves x_i to monomial term ϻ(x_1)^{d(1)}(x_2)^{d(2)}...(x_n)^{d(n)} , ϻϵ Z*_{2^s} and act on (Z*_{2^s})^n as bijective transformations. The cipher is...

2024/314 (PDF) Last updated: 2024-11-07

Exploring the Advantages and Challenges of Fermat NTT in FHE Acceleration

Andrey Kim, Ahmet Can Mert, Anisha Mukherjee, Aikata Aikata, Maxim Deryabin, Sunmin Kwon, HyungChul Kang, Sujoy Sinha Roy

Implementation

Recognizing the importance of a fast and resource-efficient polynomial multiplication in homomorphic encryption, in this paper, we design a multiplier-less number theoretic transform using a Fermat number as an auxiliary modulus. To make this algorithm scalable with the degree of polynomial, we apply a univariate to multivariate polynomial ring transformation. We develop an accelerator architecture for fully homomorphic encryption using these algorithmic techniques for efficient...

2024/217 (PDF) Last updated: 2024-02-12

Hardware Acceleration of the Prime-Factor and Rader NTT for BGV Fully Homomorphic Encryption

David Du Pont, Jonas Bertels, Furkan Turan, Michiel Van Beirendonck, Ingrid Verbauwhede

Implementation

Fully Homomorphic Encryption (FHE) enables computation on encrypted data, holding immense potential for enhancing data privacy and security in various applications. Presently, FHE adoption is hindered by slow computation times, caused by data being encrypted into large polynomials. Optimized FHE libraries and hardware acceleration are emerging to tackle this performance bottleneck. Often, these libraries implement the Number Theoretic Transform (NTT) algorithm for efficient polynomial...

2023/1736 (PDF) Last updated: 2024-02-28

Aloha-HE: A Low-Area Hardware Accelerator for Client-Side Operations in Homomorphic Encryption

Florian Krieger, Florian Hirner, Ahmet Can Mert, Sujoy Sinha Roy

Implementation

Homomorphic encryption (HE) has gained broad attention in recent years as it allows computations on encrypted data enabling secure cloud computing. Deploying HE presents a notable challenge since it introduces a performance overhead by orders of magnitude. Hence, most works target accelerating server-side operations on hardware platforms, while little attention has been given to client-side operations. In this paper, we present a novel design methodology to implement and accelerate the...

2023/1396 (PDF) Last updated: 2025-04-01

Accelerating Isogeny Walks for VDF Evaluation

David Jacquemin, Anisha Mukherjee, Ahmet Can Mert, Sujoy Sinha Roy

Implementation

VDFs are characterized by sequential function evaluation but an immediate output verification. In order to ensure secure use of VDFs in real-world applications, it is important to determine the fastest implementation. Considering the point of view of an attacker (say with unbounded resources), this paper aims to accelerate the isogeny-based VDF proposed by De Feo-Mason-Petit-Sanso in 2019. It is the first work that implements a hardware accelerator for the evaluation step of an isogeny VDF....

2023/1267 (PDF) Last updated: 2024-08-16

Whipping the MAYO Signature Scheme using Hardware Platforms

Florian Hirner, Michael Streibl, Florian Krieger, Ahmet Can Mert, Sujoy Sinha Roy

Implementation

NIST issued a new call in 2023 to diversify the portfolio of quantum-resistant digital signature schemes since the current portfolio relies on lattice problems. The MAYO scheme, which builds on the Unbalanced Oil and Vinegar (UOV) problem, is a promising candidate for this new call. MAYO introduces emulsifier maps and a novel 'whipping' technique to significantly reduce the key sizes compared to previous UOV schemes. This paper provides a comprehensive analysis of the implementation...

2023/1190 (PDF) Last updated: 2025-01-17

REED: Chiplet-Based Accelerator for Fully Homomorphic Encryption

Aikata Aikata, Ahmet Can Mert, Sunmin Kwon, Maxim Deryabin, Sujoy Sinha Roy

Implementation

Fully Homomorphic Encryption (FHE) enables privacy-preserving computation and has many applications. However, its practical implementation faces massive computation and memory overheads. To address this bottleneck, several Application-Specific Integrated Circuit (ASIC) FHE accelerators have been proposed. All these prior works put every component needed for FHE onto one chip (monolithic), hence offering high performance. However, they encounter common challenges associated with large-scale...

2023/716 (PDF) Last updated: 2023-05-18

Towards High-speed ASIC Implementations of Post-Quantum Cryptography

Malik Imran, Aikata Aikata, Sujoy Sinha Roy, Samuel pagliarini

Implementation

In this brief, we realize different architectural techniques towards improving the performance of post-quantum cryptography (PQC) algorithms when implemented as hardware accelerators on an application-specific integrated circuit (ASIC) platform. Having SABER as a case study, we designed a 256-bit wide architecture geared for high-speed cryptographic applications that incorporates smaller and distributed SRAM memory blocks. Moreover, we have adapted the building blocks of SABER to process...

2023/686 (PDF) Last updated: 2024-08-13

Efficient Accelerator for NTT-based Polynomial Multiplication

Raziyeh Salarifard, Hadi Soleimany

Implementation

The Number Theoretic Transform (NTT) is used to efficiently execute polynomial multiplication. It has become an important part of lattice-based post-quantum methods and the subsequent generation of standard cryptographic systems. However, implementing post-quantum schemes is challenging since they rely on intricate structures. This paper demonstrates how to develop a high-speed NTT multiplier highly optimized for FPGAs with few logical resources. We describe a novel architecture for NTT...

2023/678 (PDF) Last updated: 2023-05-17

A 334µW 0.158mm2 ASIC for Post-Quantum Key-Encapsulation Mechanism Saber with Low-latency Striding Toom-Cook Multiplication Extended Version

Archisman Ghosh, Jose Maria Bermudo Mera, Angshuman Karmakar, Debayan Das, Santosh Ghosh, Ingrid Verbauwhede, Shreyas Sen

Public-key cryptography

The hard mathematical problems that assure the security of our current public-key cryptography (RSA, ECC) are broken if and when a quantum computer appears rendering them ineffective for use in the quantum era. Lattice based cryptography is a novel approach to public key cryptography, of which the mathematical investigation (so far) resists attacks from quantum computers. By choosing a module learning with errors (MLWE) algorithm as the next standard, National Institute of Standard \&...

2023/618 (PDF) Last updated: 2023-04-30

Hardware Acceleration of FHEW

Jonas Bertels, Michiel Van Beirendonck, Furkan Turan, Ingrid Verbauwhede

Implementation

The magic of Fully Homomorphic Encryption (FHE) is that it allows operations on encrypted data without decryption. Unfortunately, the slow computation time limits their adoption. The slow computation time results from the vast memory requirements (64Kbits per ciphertext), a bootstrapping key of 1.3 GB, and sizeable computational overhead (10240 NTTs, each NTT requiring 5120 32-bit multiplications). We accelerate the FHEW bootstrapping in hardware on a high-end U280 FPGA. To reduce the...

2023/521 (PDF) Last updated: 2023-04-18

TREBUCHET: Fully Homomorphic Encryption Accelerator for Deep Computation

David Bruce Cousins, Yuriy Polyakov, Ahmad Al Badawi, Matthew French, Andrew Schmidt, Ajey Jacob, Benedict Reynwar, Kellie Canida, Akhilesh Jaiswal, Clynn Mathew, Homer Gamil, Negar Neda, Deepraj Soni, Michail Maniatakos, Brandon Reagen, Naifeng Zhang, Franz Franchetti, Patrick Brinich, Jeremy Johnson, Patrick Broderick, Mike Franusich, Bo Zhang, Zeming Cheng, Massoud Pedram

Implementation

Secure computation is of critical importance to not only the DoD, but across financial institutions, healthcare, and anywhere personally identifiable information (PII) is accessed. Traditional security techniques require data to be decrypted before performing any computation. When processed on untrusted systems the decrypted data is vulnerable to attacks to extract the sensitive information. To address these vulnerabilities Fully Homomorphic Encryption (FHE) keeps the data encrypted...

2023/465 (PDF) Last updated: 2023-03-30

RPU: The Ring Processing Unit

Deepraj Soni, Negar Neda, Naifeng Zhang, Benedict Reynwar, Homer Gamil, Benjamin Heyman, Mohammed Nabeel Thari Moopan, Ahmad Al Badawi, Yuriy Polyakov, Kellie Canida, Massoud Pedram, Michail Maniatakos, David Bruce Cousins, Franz Franchetti, Matthew French, Andrew Schmidt, Brandon Reagen

Applications

Ring-Learning-with-Errors (RLWE) has emerged as the foundation of many important techniques for improving security and privacy, including homomorphic encryption and post-quantum cryptography. While promising, these techniques have received limited use due to their extreme overheads of running on general-purpose machines. In this paper, we present a novel vector Instruction Set Architecture (ISA) and microarchitecture for accelerating the ring-based computations of RLWE. The ISA, named B512,...

2023/368 (PDF) Last updated: 2023-03-14

AI Attacks AI: Recovering Neural Network architecture from NVDLA using AI-assisted Side Channel Attack

Naina Gupta, Arpan Jati, Anupam Chattopadhyay

Attacks and cryptanalysis

During the last decade, there has been a stunning progress in the domain of AI with adoption in both safety-critical and security-critical applications. A key requirement for this is highly trained Machine Learning (ML) models, which are valuable Intellectual Property (IP) of the respective organizations. Naturally, these models have become targets for model recovery attacks through side-channel leakage. However, majority of the attacks reported in literature are either on simple embedded...

2023/090 (PDF) Last updated: 2023-01-24

Unlimited Results: Breaking Firmware Encryption of ESP32-V3

Karim M. Abdellatif, Olivier Hériveaux, Adrian Thillard

Attacks and cryptanalysis

Because of the rapid growth of Internet of Things (IoT), embedded systems have become an interesting target for experienced attackers. ESP32~\cite{tech-ref-man} is a low-cost and low-power system on chip (SoC) series created by Espressif Systems. The firmware extraction of such embedded systems is a real threat to the manufacturer as it breaks its intellectual property and raises the risk of creating equivalent systems with less effort and resources. In 2019,...

2022/1635 (PDF) Last updated: 2023-10-18

FPT: a Fixed-Point Accelerator for Torus Fully Homomorphic Encryption

Michiel Van Beirendonck, Jan-Pieter D'Anvers, Furkan Turan, Ingrid Verbauwhede

Implementation

Fully Homomorphic Encryption (FHE) is a technique that allows computation on encrypted data. It has the potential to drastically change privacy considerations in the cloud, but high computational and memory overheads are preventing its broad adoption. TFHE is a promising Torus-based FHE scheme that heavily relies on bootstrapping, the noise-removal tool invoked after each encrypted logical/arithmetical operation. We present FPT, a Fixed-Point FPGA accelerator for TFHE bootstrapping. FPT...

2022/1537 (PDF) Last updated: 2022-11-06

On Extremal Algebraic Graphs and Multivariate Cryptosystems

Vasyl Ustimenko

Public-key cryptography

Multivariate rule x_i -> f_i, i = 1, 2, ..., n, f_i from K[x_1, x_2, ..., x_n] over commutative ring K defines endomorphism σ_n of K[x_1, x_2, ..., x_n] into itself given by its values on variables x_i. Degree of σ_n can be defined as maximum of degrees of polynomials f_i. We say that family σ_n, n = 2, 3, .... has trapdoor accelerator ^nT if the knowledge of the piece of information ^nT allows to compute reimage x of y = σ_n(x) in time O(n^2). We use extremal algebraic graphs for the...

2022/1093 (PDF) Last updated: 2023-07-25

HPKA: A High-Performance CRYSTALS-Kyber Accelerator Exploring Efficient Pipelining

Ziying Ni, Ayesha Khalid, Dur-e-Shahwar Kundi, Máire O’Neill, Weiqiang Liu

Implementation

CRYSTALS-Kyber (Kyber) was recently chosen as the first quantum resistant Key Encapsulation Mechanism (KEM) scheme for standardisation, after three rounds of the National Institute of Standards and Technology (NIST) initiated PQC competition which begin in 2016 and search of the best quantum resistant KEMs and digital signatures. Kyber is based on the Module-Learning with Errors (M-LWE) class of Lattice-based Cryptography, that is known to manifest efficiently on FPGAs. This work explores...

2022/881 (PDF) Last updated: 2022-08-16

A Novel High-performance Implementation of CRYSTALS-Kyber with AI Accelerator

Lipeng Wan, Fangyu Zheng, Guang Fan, Rong Wei, Lili Gao, Jiankuo Dong, Jingqiang Lin, Yuewu Wang

Implementation

Public-key cryptography, including conventional cryptosystems and post-quantum cryptography, involves computation-intensive workloads. With noticing the extraordinary computing power of AI accelerators, in this paper, we further explore the feasibility to introduce AI accelerators into high-performance cryptographic computing. Since AI accelerators are dedicated to machine learning or neural networks, the biggest challenge is how to transform cryptographic workloads into their operations,...

2022/657 (PDF) Last updated: 2023-09-06

BASALISC: Programmable Hardware Accelerator for BGV Fully Homomorphic Encryption

Robin Geelen, Michiel Van Beirendonck, Hilder V. L. Pereira, Brian Huffman, Tynan McAuley, Ben Selfridge, Daniel Wagner, Georgios Dimou, Ingrid Verbauwhede, Frederik Vercauteren, David W. Archer

Implementation

Fully Homomorphic Encryption (FHE) allows for secure computation on encrypted data. Unfortunately, huge memory size, computational cost and bandwidth requirements limit its practicality. We present BASALISC, an architecture family of hardware accelerators that aims to substantially accelerate FHE computations in the cloud. BASALISC is the first to implement the BGV scheme with fully-packed bootstrapping – the noise removal capability necessary for arbitrary-depth computation. It supports a...

2022/628 (PDF) Last updated: 2022-05-23

High-Performance Polynomial Multiplication Hardware Accelerators for KEM Saber and NTRU

Elizabeth Carter, Pengzhou He, Jiafeng Xie

Implementation

Along the rapid development in building large-scale quantum computers, post-quantum cryptography (PQC) has drawn significant attention from research community recently as it is proven that the existing public-key cryptosystems are vulnerable to the quantum attacks. Following this direction, this paper presents a novel implementation of high-performance polynomial multiplication hardware accelerators for key encapsulation mechanism (KEM) Saber and NTRU, two PQC algorithms that are currently...

2022/530 (PDF) Last updated: 2022-05-10

High-speed SABER Key Encapsulation Mechanism in 65nm CMOS

Malik Imran, Felipe Almeida, Andrea Basso, Sujoy Sinha Roy, Samuel Pagliarini

Public-key cryptography

Quantum computers will break cryptographic primitives that are based on integer factorization and discrete logarithm problems. SABER is a key agreement scheme based on the Learning With Rounding problem that is quantum-safe, i.e., resistant to quantum computer attacks. This article presents a high-speed silicon implementation of SABER in a 65nm technology as an Application Specific Integrated Circuit. The chip measures 1$mm^2$ in size and can operate at a maximum frequency of 715$MHz$ at a...

2022/496 (PDF) Last updated: 2022-04-28

Lightweight Hardware Accelerator for Post-Quantum Digital Signature CRYSTALS-Dilithium

Naina Gupta, Arpan Jati, Anupam Chattopadhyay, Gautam Jha

Implementation

The looming threat of an adversary with Quantum computing capability led to a worldwide research effort towards identifying and standardizing novel post-quantum cryptographic primitives. Post-standardization, all existing security protocols will need to support efficient implementation of these primitives. In this work, we contribute to these efforts by reporting the smallest implementation of CRYSTALS-Dilithium, a finalist candidate for post-quantum digital signature. By invoking multiple...

2022/480 (PDF) Last updated: 2022-10-12

Medha: Microcoded Hardware Accelerator for computing on Encrypted Data

Ahmet Can Mert, Aikata, Sunmin Kwon, Youngsam Shin, Donghoon Yoo, Yongwoo Lee, Sujoy Sinha Roy

Implementation

Homomorphic encryption enables computation on encrypted data, and hence it has a great potential in privacy-preserving outsourcing of computations to the cloud. Hardware acceleration of homomorphic encryption is crucial as software implementations are very slow. In this paper, we present design methodologies for building a programmable hardware accelerator for speeding up the cloud-side homomorphic evaluations on encrypted data. First, we propose a divide-and-conquer technique that...

2021/1555 (PDF) Last updated: 2022-02-18

Accelerator for Computing on Encrypted Data

Sujoy Sinha Roy, Ahmet Can Mert, Aikata, Sunmin Kwon, Youngsam Shin, Donghoon Yoo

Implementation

Fully homomorphic encryption enables computation on encrypted data, and hence it has a great potential in privacy-preserving outsourcing of computations. In this paper, we present a complete instruction-set processor architecture ‘Medha’ for accelerating the cloud-side operations of an RNS variant of the HEAAN homomorphic encryption scheme. Medha has been designed following a modular hardware design approach to attain a fast computation time for computationally expensive homomorphic...

2021/1527 (PDF) Last updated: 2021-11-22

CoHA-NTT: A Configurable Hardware Accelerator for NTT-based Polynomial Multiplication

Kemal Derya, Ahmet Can Mert, Erdinç Öztürk, Erkay Savaş

Public-key cryptography

In this paper, we introduce a configurable hardware architecture that can be used to generate unified and parametric NTT-based polynomial multipliers that support a wide range of parameters of lattice-based cryptographic schemes proposed for post-quantum cryptography. Both NTT and inverse NTT operations can be performed using the unified butterfly unit of our architecture, which constitutes the core building block in NTT operations. The multitude of this unit plays an essential role in...

2021/1292 (PDF) Last updated: 2022-09-16

A Fast Large-Integer Extended GCD Algorithm and Hardware Design for Verifiable Delay Functions and Modular Inversion

Kavya Sreedhar, Mark Horowitz, Christopher Torng

Implementation

The extended GCD (XGCD) calculation, which computes Bézout coefficients b_a, b_b such that b_a ∗ a_0 + b_b ∗ b_0 = GCD(a_0, b_0), is a critical operation in many cryptographic applications. In particular, large-integer XGCD is computationally dominant for two applications of increasing interest: verifiable delay functions that square binary quadratic forms within a class group and constant-time modular inversion for elliptic curve cryptography. Most prior work has focused on fast software...

2021/1202 (PDF) Last updated: 2021-09-17

Design Space Exploration of SABER in 65nm ASIC

Malik Imran, Felipe Almeida, Jaan Raik, Andrea Basso, Sujoy Sinha Roy, Samuel Pagliarini

Public-key cryptography

This paper presents a design space exploration for SABER, one of the finalists in NIST’s quantum-resistant public-key cryptographic standardization effort. Our design space exploration targets a 65nmASIC platform and has resulted in the evaluation of 6 different architectures. Our exploration is initiated by setting a baseline architecture which is ported from FPGA. In order to improve the clock frequency (the primary goal in our exploration), we have employed several optimizations: (i) use...

2021/1189 (PDF) Last updated: 2021-09-17

A Configurable Crystals-Kyber Hardware Implementation with Side-Channel Protection

Arpan Jati, Naina Gupta, Anupam Chattopadhyay, Somitra Kumar Sanadhya

Implementation

In this work, we present a configurable and side channel resistant implementation of the post-quantum key-exchange algorithm Crystals-Kyber. The implemented design can be configured for different performance and area requirements leading to different trade-offs for different applications. A low area implementation can be achieved in 5269 LUTs and 2422 FFs, whereas a high performance implementation required 7151 LUTs and 3730 FFs. Due to a deeply pipelined architecture, a high operating speed...

2021/973 (PDF) Last updated: 2021-07-22

A Multiplatform Parallel Approach for Lattice Sieving Algorithms

Michał Andrzejczak, Kris Gaj

Implementation

Lattice sieving is currently the leading class of algorithms for solving the shortest vector problem over lattices. The computational difficulty of this problem is the basis for constructing secure post-quantum public-key cryptosystems based on lattices. In this paper, we present a novel massively parallel approach for solving the shortest vector problem using lattice sieving and hardware acceleration. We combine previously reported algorithms with a proper caching strategy and develop...

2021/714 (PDF) Last updated: 2021-05-31

CARiMoL: A Configurable Hardware Accelerator for Ringand Module Lattice-Based Post-Quantum Cryptography

Afifa Ishtiaq, Dr. Muhammad Shafique, Dr. Osman Hassan

Implementation

Abstract—CARiMoL is a novel run-time Configurable Hardware Accelerator for Ring and Module Lattice-based postquantum cryptography. It’s flexible design can be configured to key-pair generation, encapsulation, and decapsulation for NewHope and CRYSTALS-Kyber schemes using same hardware. CARiMoL offers run-time configurability for multiple security levels of NewHope and CRYSTALS-Kyber schemes, supporting both Chosen-Plaintext Attack (CPA) and Chosen-Ciphertext Attack (CCA) secure...

2021/597 (PDF) Last updated: 2021-05-10

Accelerated RISC-V for Post-Quantum SIKE

Rami Elkhatib, Reza Azarderakhsh, Mehran Mozaffari-Kermani

Public-key cryptography

Software implementations of cryptographic algorithms are slow but highly flexible and relatively easy to implement. On the other hand, hardware implementations are usually faster but provide little flexibility and require a lot of time to implement efficiently. In this paper, we develop a hybrid software-hardware implementation of the third round of Supersingular Isogeny Key Encapsulation (SIKE), a post-quantum cryptography algorithm candidate for NIST. We implement an isogeny field...

2021/563 (PDF) Last updated: 2021-05-03

High-Speed NTT-based Polynomial Multiplication Accelerator for CRYSTALS-Kyber Post-Quantum Cryptography

Mojtaba Bisheh-Niasar, Reza Azarderakhsh, Mehran Mozaffari-Kermani

Implementation

This paper demonstrates an architecture for accelerating the polynomial multiplication using number theoretic transform (NTT). Kyber is one of the finalists in the third round of the NIST post-quantum cryptography standardization process. Simultaneously, the performance of NTT execution is its main challenge, requiring large memory and complex memory access pattern. In this paper, an efficient NTT architecture is presented to improve the respective computation time. We propose several...

2021/561 (PDF) Last updated: 2021-05-03

Kyber on ARM64: Compact Implementations of Kyber on 64-bit ARM Cortex-A Processors

Pakize Sanal, Emrah Karagoz, Hwajeong Seo, Reza Azarderakhsh, Mehran Mozaffari-Kermani

Implementation

Public-key cryptography based on the lattice problem is efficient and believed to be secure in a post-quantum era. In this paper, we introduce carefully optimized implementations of Kyber encryption schemes for 64-bit ARM Cortex-A processors. Our research contribution includes several optimizations for Number Theoretic Transform (NTT), noise sampling, and AES accelerator based symmetric function implementations. The proposed Kyber512 implementation on ARM64 improved previous works by 1.72×,...

2021/485 (PDF) Last updated: 2021-04-16

A Hardware Accelerator for Polynomial Multiplication Operation of CRYSTALS-KYBER PQC Scheme

Ferhat Yaman, Ahmet Can Mert, Erdinç Öztürk, Erkay Savaş

Cryptographic protocols

Polynomial multiplication is one of the most time-consuming operations utilized in lattice-based post-quantum cryptography (PQC) schemes. CRYSTALS-KYBER is a lattice-based key encapsulation mechanism (KEM) and it was recently announced as one of the four finalists at round three in NIST's PQC Standardization. Therefore, efficient implementations of polynomial multiplication operation are crucial for high-performance CRYSTALS-KYBER applications. In this paper, we propose three different...

2021/352 (PDF) Last updated: 2021-03-18

A Configurable Hardware Implementation of XMSS

Jan Philipp Thoma, Tim Güneysu

Implementation

Quantum computers are about to herald a new age of cryptography. As a fundamental building block in today’s digitalized world, Digital Signature Schemes (DSS) provide the ability to authenticate messages exchanged over untrusted channels. Unfortunately, virtually all currently used DSS are built upon mathematical problems that can efficiently be solved using quantum computers, thus rendering schemes such as RSA and ECC insecure. Due to its conservative security properties, the eXtended...

2021/124 (PDF) Last updated: 2021-02-05

Efficient Number Theoretic Transform Implementation on GPU for Homomorphic Encryption

Ozgun Ozerk, Can Elgezen, Ahmet Can Mert, Erdinc Ozturk, Erkay Savas

Implementation

Lattice-based cryptography forms the mathematical basis for homomorphic encryption, which allows computation directly on encrypted data. Homomorphic encryption enables privacy-preserving applications such as secure cloud computing; yet, its practical applications suffer from the high computational complexity of homomorphic operations. Fast implementations of the homomorphic encryption schemes heavily depend on efficient polynomial arithmetic; multiplication of very large degree polynomials...

2020/1148 (PDF) Last updated: 2021-03-29

An Area Aware Accelerator for Elliptic Curve Point Multiplication

Malik Imran, Samuel Pagliarini, Muhammad Rashid

Public-key cryptography

This work presents a hardware accelerator, for the optimization of latency and area at the same time, to improve the performance of point multiplication process in Elliptic Curve Cryptography. In order to reduce the overall computation time in the proposed 2-stage pipelined architecture, a rescheduling of point addition and point doubling instructions is performed along with an efficient use of required memory locations. Furthermore, a 41-bit multiplier is also proposed. Consequently, the...

2020/1083 (PDF) Last updated: 2020-10-02

A Fast and Compact RISC-V Accelerator for Ascon and Friends

Stefan Steinegger, Robert Primas

Implementation

Ascon-p is the core building block of Ascon, the winner in the lightweight category of the CAESAR competition. With ISAP, another Ascon-p-based AEAD scheme is currently competing in the 2nd round of the NIST lightweight cryptography standardization project. In contrast to Ascon, ISAP focuses on providing hardening/protection against a large class of implementation attacks, such as DPA, DFA, SFA, and SIFA, entirely on mode-level. Consequently, Ascon-p can be used to realize a wide range of...

2020/805 (PDF) Last updated: 2020-06-30

Proxy Re-Encryption for Accelerator Confidentiality in FPGA-Accelerated Cloud

Furkan Turan, Ingrid Verbauwhede

Cryptographic protocols

FPGAs offer many-fold acceleration to various application domains, and have become a part of cloud-based computation. However, their cloud-use introduce Cloud Service Provider (CSP) as trusted parties, who can access the hardware designs in plaintext. Therefore, the intellectual property of hardware designers is not protected against a dishonest cloud. In this paper, we propose a scheme for the confidentiality of accelerators on cloud, without limiting CSP to maintain their resources freely....

2020/276 (PDF) Last updated: 2020-03-15

CryptoPIM: In-memory Acceleration for Lattice-based Cryptographic Hardware

Hamid Nejatollahi, Saransh Gupta, Mohsen Imani, Tajana Simunic Rosing, Rosario Cammarota, Nikil Dutt

Implementation

Quantum computers promise to solve hard mathematical problems such as integer factorization and discrete logarithms in polynomial time, making standardized public-key cryptography (such as digital signature and key agreement) insecure. Lattice-Based Cryptography (LBC) is a promising post-quantum public-key cryptographic protocol that could replace standardized public-key cryptography, thanks to the inherent post-quantum resistant properties, efficiency, and versatility. A key mathematical...

2020/178 (PDF) Last updated: 2020-02-14

A >100 Gbps Inline AES-GCM Hardware Engine and Protected DMA Transfers between SGX Enclave and FPGA Accelerator Device

Santosh Ghosh, Luis S Kida, Soham Jayesh Desai, Reshma Lal

Implementation

This paper proposes a method to protect DMA data transfer that can be used to offload computation to an accelerator. The proposal minimizes changes in the hardware platform and to the application and SW stack. The paper de-scribes the end-to-end scheme to protect communication between an appli-cation running inside a SGX enclave and a FPGA accelerator optimized for bandwidth and latency and details the implementation of AES-GCM hard-ware engines with high bandwidth and low latency.

2019/1448 (PDF) Last updated: 2020-04-15

Investigating Profiled Side-Channel Attacks Against the DES Key Schedule

Johann Heyszl, Katja Miller, Florian Unterstein, Marc Schink, Alexander Wagner, Horst Gieser, Sven Freud, Tobias Damm, Dominik Klein, Dennis Kügler

Implementation

Recent publications describe profiled side-channel attacks (SCAs) against the DES key-schedule of a “commercially available security controller”. They report a significant reduction of the average remaining entropy of cryptographic keys after the attack, with large, key-dependent variations and results as low as a few bits using only a single attack trace. Unfortunately, they leave important questions unanswered: Is the reported wide distribution of results plausible? Are the results...

2019/1403 Last updated: 2019-12-14

No RISC, no Fun: Comparison of Hardware Accelerated Hash Functions for XMSS

Ingo Braun, Fabio Campos, Steffen Reith, Marc Stöttinger

Implementation

We investigate multiple implementations of a hash-based digital signature scheme in software and hardware for a RISC-V processor. For this, different instantiations of XMSS by leveraging SHA-256 and SHA-3 are considered. Moreover, we propose various optimisations for accelerating the signature scheme on resource-constrained FPGAs. Compared to the pure software version, the implemented hardware accelerators for SHA-256 and SHA-3 achieve a significant speedup of 25x and 87x respectively for...

2019/1394 (PDF) Last updated: 2021-08-27

Voltage-based Covert Channels using FPGAs

Dennis R. E. Gnad, Cong Dang Khoa Nguyen, Syed Hashim Gillani, Mehdi B. Tahoori

Implementation

FPGAs are increasingly used in cloud applications and being integrated into Systems-on-Chip (SoCs). For these systems, various side-channel attacks on cryptographic implementations have been reported, motivating to apply proper countermeasures. Beyond cryptographic implementations, maliciously introduced covert channel receivers and transmitters can allow to exfiltrate other secret information from the FPGA. In this paper, we present a fast covert channel on FPGAs, which exploits the on-chip...

2019/1040 (PDF) Last updated: 2019-09-18

Hardware-Software Co-Design Based Obfuscation of Hardware Accelerators

Abhishek Chakraborty, Ankur Srivastava

Implementation

Existing logic obfuscation approaches aim to protect hardware design IPs from SAT attack by increasing query count and output corruptibility of a locked netlist. In this paper, we demonstrate the ineffectiveness of such techniques to obfuscate hardware accelerator platforms. Subsequently, we propose a Hardware/software co-design based Accelerator Obfuscation (HSCAO) scheme to provably safeguard the IP of such designs against SAT as well as removal/bypass type of attacks while still...

2019/765 (PDF) Last updated: 2019-07-02

SPQCop: Side-channel protected Post-Quantum Cryptoprocessor

Arpan Jati, Naina Gupta, Anupam Chattopadhyay, Somitra Kumar Sanadhya

Implementation

The past few decades have seen significant progress in practically realizable quantum technologies. It is well known since the work of Peter Shor that large scale quantum computers will threaten the security of most of the currently used public key cryptographic algorithms. This has spurred the cryptography community to design algorithms which will remain safe even with the emergence of large scale quantum computing systems. An effort in this direction is the currently ongoing post-quantum...

2019/711 (PDF) Last updated: 2020-04-11

SIKE'd Up: Fast and Secure Hardware Architectures for Supersingular Isogeny Key Encapsulation

Brian Koziel, A-Bon Ackie, Rami El Khatib, Reza Azarderakhsh, Mehran Mozaffari-Kermani

Implementation

In this work, we present a fast parallel architecture to perform supersingular isogeny key encapsulation (SIKE). We propose and implement a fast isogeny accelerator architecture that uses fast and parallelized isogeny formulas. On top of our isogeny accelerator, we build a novel architecture for the SIKE primitive, which provides both quantum and IND-CCA security. Since SIKE can support static keys, we propose and implement additional differential power analysis countermeasures. We...

2019/160 (PDF) Last updated: 2019-02-20

FPGA-based High-Performance Parallel Architecture for Homomorphic Computing on Encrypted Data

Sujoy Sinha Roy, Furkan Turan, Kimmo Jarvinen, Frederik Vercauteren, Ingrid Verbauwhede

Implementation

Homomorphic encryption is a tool that enables computation on encrypted data and thus has applications in privacy-preserving cloud computing. Though conceptually amazing, implementation of homomorphic encryption is very challenging and typically software implementations on general purpose computers are extremely slow. In this paper we present our year long effort to design a domain specific architecture in a heterogeneous Arm+FPGA platform to accelerate homomorphic computing on encrypted...

2019/121 (PDF) Last updated: 2019-02-13

Anonymous Attestation for IoT

Santosh Ghosh, Andrew H. Reinders, Rafael Misoczki, Manoj R. Sastry

Implementation

Internet of Things (IoT) have seen tremendous growth and are being deployed pervasively in areas such as home, surveillance, health-care and transportation. These devices collect and process sensitive data with respect to user's privacy. Protecting the privacy of the user is an essential aspect of security, and anonymous attestation of IoT devices are critical to enable privacy-preserving mechanisms. Enhanced Privacy ID (EPID) is an industry-standard cryptographic scheme that offers...

2019/109 (PDF) Last updated: 2019-02-05

Design and Implementation of a Fast and Scalable NTT-Based Polynomial Multiplier Architecture

Ahmet Can Mert, Erdinc Ozturk, Erkay Savas

Implementation

In this paper, we present an optimized FPGA implementation of a novel, fast and highly parallelized NTT-based polynomial multiplier architecture, which proves to be effective as an accelerator for lattice-based homomorphic cryptographic schemes. As input-output (I/O) operations are as time-consuming as NTT operations during homomorphic computations in a host processor/accelerator setting, instead of achieving the fastest NTT implementation possible on the target FPGA, we focus on a balanced...

2018/1225 (PDF) Last updated: 2020-03-08

XMSS and Embedded Systems - XMSS Hardware Accelerators for RISC-V

Wen Wang, Bernhard Jungk, Julian Wälde, Shuwen Deng, Naina Gupta, Jakub Szefer, Ruben Niederhagen

Implementation

We describe a software-hardware co-design for the hash-based post-quantum signature scheme XMSS on a RISC-V embedded processor. We provide software optimizations for the XMSS reference implementation for SHA-256 parameter sets and several hardware accelerators that allow to balance area usage and performance based on individual needs. By integrating our hardware accelerators into the RISC-V processor, the version with the best time-area product generates a key pair (that can be used to...

2018/881 (PDF) Last updated: 2018-09-23

Remote Inter-Chip Power Analysis Side-Channel Attacks at Board-Level

Falk Schellenberg, Dennis R. E. Gnad, Amir Moradi, Mehdi B. Tahoori

Implementation

The current practice in board-level integration is to incorporate chips and components from numerous vendors. A fully trusted supply chain for all used components and chipsets is an important, yet extremely difficult to achieve, prerequisite to validate a complete board-level system for safe and secure operation. An increasing risk is that most chips nowadays run software or firmware, typically updated throughout the system lifetime, making it practically impossible to validate the full...

2017/913 (PDF) Last updated: 2017-09-24

Thunderella: Blockchains with Optimistic Instant Confirmation

Rafael Pass, Elaine Shi

Cryptographic protocols

State machine replication, or “consensus”, is a central abstraction for distributed systems where a set of nodes seek to agree on an ever-growing, linearly-ordered log. In this paper, we propose a practical new paradigm called Thunderella for achieving state machine replication by combining a fast, asynchronous path with a (slow) synchronous “fall-back” path (which only gets executed if something goes wrong); as a consequence, we get simple state machine replications that essentially are as...

2015/937 (PDF) Last updated: 2015-11-11

End-to-end Design of a PUF-based Privacy Preserving Authentication Protocol

Aydin Aysu, Ege Gulcan, Daisuke Moriyama, Patrick Schaumont, Moti Yung

Implementation

We demonstrate a prototype implementation of a provably secure protocol that supports privacy-preserving mutual authentication between a server and a constrained device. Our proposed protocol is based on a physically unclonable function (PUF) and it is optimized for resource-constrained platforms. The reported results include a full protocol analysis, the design of its building blocks, their integration into a constrained device, and finally its performance evaluation. We show how to obtain...

2015/818 (PDF) Last updated: 2015-08-18

cuHE: A Homomorphic Encryption Accelerator Library

Wei Dai, Berk Sunar

Implementation

We introduce a CUDA GPU library to accelerate evaluations with homomorphic schemes defined over polynomial rings enabled with a number of optimizations including algebraic techniques for efficient evaluation, memory minimization techniques, memory and thread scheduling and low level CUDA hand-tuned assembly optimizations to take full advantage of the mass parallelism and high memory bandwidth GPUs offer. The arithmetic functions constructed to handle very large polynomial operands using...

2015/529 (PDF) Last updated: 2016-04-11

Power Analysis Attacks against IEEE 802.15.4 Nodes

Colin O'Flynn, Zhizhang Chen

Implementation

IEEE 802.15.4 is a wireless standard used by a variety of higher-level protocols, including many used in the Internet of Things (IoT). A number of system on a chip (SoC) devices that combine a radio transceiver with a microcontroller are available for use in IEEE 802.15.4 networks. IEEE 802.15.4 supports the use of AES-CCM* for encryption and authentication of messages, and a SoC normally includes an AES accelerator for this purpose. This work measures the leakage characteristics of the AES...

2015/421 (PDF) Last updated: 2015-05-05

VLSI Implementation of Double-Base Scalar Multiplication on a Twisted Edwards Curve with an Efficiently Computable Endomorphism

Zhe Liu, Husen Wang, Johann Großschädl, Zhi Hu, Ingrid Verbauwhede

Implementation

The verification of an ECDSA signature requires a double-base scalar multiplication, an operation of the form $k \cdot G + l \cdot Q$ where $G$ is a generator of a large elliptic curve group of prime order $n$, $Q$ is an arbitrary element of said group, and $k$, $l$ are two integers in the range of $[1, n-1]$. We introduce in this paper an area-optimized VLSI design of a Prime-Field Arithmetic Unit (PFAU) that can serve as a loosely-coupled or tightly-coupled hardware accelerator in a...

2015/294 (PDF) Last updated: 2015-04-01

Accelerating Somewhat Homomorphic Evaluation using FPGAs

Erdi̇̀nç Öztürk, Yarkın Doröz, Berk Sunar, Erkay Savaş

Implementation

After being introduced in 2009, the first fully homomorphic encryption (FHE) scheme has created significant excitement in academia and industry. Despite rapid advances in the last 6 years, FHE schemes are still not ready for deployment due to an efficiency bottleneck. Here we introduce a custom hardware accelerator optimized for a class of reconfigurable logic to bring LTV based somewhat homomorphic encryption (SWHE) schemes one step closer to deployment in real-life applications. The...

2015/199 (PDF) Last updated: 2015-03-04

Side-Channel Security Analysis of Ultra-Low-Power FRAM-based MCUs

Amir Moradi, Gesine Hinterwälder

Implementation

By shrinking the technology and reducing the energy requirements of integrated circuits, producing ultra-low-power devices has practically become possible. Texas Instruments as a pioneer in developing FRAM-based products announced a couple of different microcontroller (MCU) families based on the low-power and fast Ferroelectric RAM technology. Such MCUs come with embedded cryptographic module(s) as well as the assertion that - due to the underlying ultra-low-power technology - mounting...

2014/800 (PDF) Last updated: 2015-09-23

Efficient Pairings and ECC for Embedded Systems

Thomas Unterluggauer, Erich Wenger

The research on pairing-based cryptography brought forth a wide range of protocols interesting for future embedded applications. One significant obstacle for the widespread deployment of pairing-based cryptography are its tremendous hardware and software requirements. In this paper we present three side-channel protected hardware/software designs for pairing-based cryptography yet small and practically fast: our plain ARM Cortex-M0+-based design computes a pairing in less than one second....

2013/282 (PDF) Last updated: 2015-05-25

Three Snakes in One Hole: The First Systematic Hardware Accelerator Design for SOSEMANUK with Optional Serpent and SNOW 2.0 Modes

Goutam Paul, Anupam Chattopadhyay

With increasing usage of hardware accelerators in modern heterogeneous System-on-Chips (SoCs), the distinction between hardware and software is no longer rigid. The domain of cryptography is no exception and efficient hardware design of so-called software ciphers are becoming increasingly popular. In this paper, for the first time we propose an efficient hardware accelerator design for SOSEMANUK, one of the finalists of the eSTREAM stream cipher competition in the software category. Since...

2012/343 (PDF) Last updated: 2012-09-07

High-Throughput Hardware Architecture for the SWIFFT / SWIFFTX Hash Functions

Tamas Gyorfi, Octavian Cret, Guillaume Hanrot, Nicolas Brisebarre

Introduced in 1996 and greatly developed over the last few years, Lattice-based cryptography oers a whole set of primitives with nice features, including provable security and asymptotic efficiency. Going from \asymptotic" to \real-world" efficiency seems important as the set of available primitives increases in size and functionality. In this present paper, we explore the improvements that can be obtained through the use of an FPGA architecture for implementing an ideal-lattice based...

2012/048 (PDF) Last updated: 2012-02-01

Designing Integrated Accelerator for Stream Ciphers with Structural Similarities

Sourav Sen Gupta, Anupam Chattopadhyay, Ayesha Khalid

Implementation

Till date, the basic idea for implementing stream ciphers has been confined to individual standalone designs. In this paper, we introduce the notion of integrated implementation of multiple stream ciphers within a single architecture, where the goal is to achieve area and throughput efficiency by exploiting the structural similarities of the ciphers at an algorithmic level. We present two case studies to support our idea. First, we propose the merger of SNOW 3G and ZUC stream ciphers, which...

2010/559 (PDF) Last updated: 2011-11-23

Optimal Eta Pairing on Supersingular Genus-2 Binary Hyperelliptic Curves

Diego F. Aranha, Jean-Luc Beuchat, Jérémie Detrey, Nicolas Estibals

Public-key cryptography

This article presents a novel pairing algorithm over supersingular genus-$2$ binary hyperelliptic curves. Starting from Vercauteren's work on optimal pairings, we describe how to exploit the action of the $2^{3m}$-th power Verschiebung in order to reduce the loop length of Miller's algorithm even further than the genus-$2$ $\eta_T$ approach. As a proof of concept, we detail an optimized software implementation and an FPGA accelerator for computing the proposed optimal Eta pairing on a...

2010/371 (PDF) Last updated: 2010-09-13

Compact hardware for computing the Tate pairing over 128-bit-security supersingular curves

Nicolas Estibals

Implementation

This paper presents a novel method for designing compact yet efficient hardware implementations of the Tate pairing over supersingular curves in small characteristic. Since such curves are usually restricted to lower levels of security because of their bounded embedding degree, aiming for the recommended security of 128 bits implies considering them over very large finite fields. We however manage to mitigate this effect by considering curves over field extensions of moderately-composite...

2010/276 (PDF) Last updated: 2010-06-17

Garbled Circuits for Leakage-Resilience: Hardware Implementation and Evaluation of One-Time Programs

Kimmo Järvinen, Vladimir Kolesnikov, Ahmad-Reza Sadeghi, Thomas Schneider

The power of side-channel leakage attacks on cryptographic implementations is evident. Today's practical defenses are typically attack-specific countermeasures against certain classes of side-channel attacks. The demand for a more general solution has given rise to the recent theoretical research that aims to build provably leakage-resilient cryptography. This direction is, however, very new and still largely lacks practitioners' evaluation with regard to both efficiency and practical...

2010/116 (PDF) Last updated: 2010-03-05

Practical Improvements of Profiled Side-Channel Attacks on a Hardware Crypto-Accelerator

M. Abdelaziz Elaabid, Sylvain Guilley

This article investigates the relevance of the theoretical framework on profiled side-channel attacks presented by F.-X. Standaert et al. at Eurocrypt 2009. The analyses consist in a case-study based on sidechannel measurements acquired experimentally from a hardwired cryptographic accelerator. Therefore, with respect to previous formal analyses carried out on software measurements or on simulated data, the investigations we describe are more complex, due to the underlying chip’s...

2009/398 (PDF) Last updated: 2009-08-19

Fast Architectures for the $\eta_T$ Pairing over Small-Characteristic Supersingular Elliptic Curves

Jean-Luc Beuchat, Jérémie Detrey, Nicolas Estibals, Eiji Okamoto, Francisco Rodríguez-Henríquez

Implementation

This paper is devoted to the design of fast parallel accelerators for the cryptographic $\eta_T$ pairing on supersingular elliptic curves over finite fields of characteristics two and three. We propose here a novel hardware implementation of Miller's algorithm based on a parallel pipelined Karatsuba multiplier. After a short description of the strategies we considered to design our multiplier, we point out the intrinsic parallelism of Miller's loop and outline the architecture of...

2009/122 (PDF) Last updated: 2009-08-04

Hardware Accelerator for the Tate Pairing in Characteristic Three Based on Karatsuba-Ofman Multipliers

Jean-Luc Beuchat, Jérémie Detrey, Nicolas Estibals, Eiji Okamoto, Francisco Rodríguez-Henríquez

Implementation

This paper is devoted to the design of fast parallel accelerators for the cryptographic Tate pairing in characteristic three over supersingular elliptic curves. We propose here a novel hardware implementation of Miller's loop based on a pipelined Karatsuba-Ofman multiplier. Thanks to a careful selection of algorithms for computing the tower field arithmetic associated to the Tate pairing, we manage to keep the pipeline busy. We also describe the strategies we considered to design our...

2008/280 (PDF) Last updated: 2009-06-17

FPGA and ASIC Implementations of the $\eta_T$ Pairing in Characteristic Three

Jean-Luc Beuchat, Hiroshi Doi, Kaoru Fujita, Atsuo Inomata, Piseth Ith, Akira Kanaoka, Masayoshi Katouno, Masahiro Mambo, Eiji Okamoto, Takeshi Okamoto, Takaaki Shiga, Masaaki Shirase, Ryuji Soga, Tsuyoshi Takagi, Ananda Vithanage, Hiroyasu Yamamoto

Implementation

Since their introduction in constructive cryptographic applications, pairings over (hyper)elliptic curves are at the heart of an ever increasing number of protocols. As they rely critically on efficient algorithms and implementations of pairing primitives, the study of hardware accelerators became an active research area. In this paper, we propose two coprocessors for the reduced $\eta_T$ pairing introduced by Barreto {\it et al.} as an alternative means of computing the Tate pairing on...

2008/115 (PDF) Last updated: 2008-03-17

A Comparison Between Hardware Accelerators for the Modified Tate Pairing over $\mathbb{F}_{2^m}$ and $\mathbb{F}_{3^m}$

Jean-Luc Beuchat, Nicolas Brisebarre, Jérémie Detrey, Eiji Okamoto, Francisco Rodríguez-Henríquez

Implementation

In this article we propose a study of the modified Tate pairing in characteristics two and three. Starting from the $\eta_T$ pairing introduced by Barreto {\em et al.} (Des Codes Crypt, 2007), we detail various algorithmic improvements in the case of characteristic two. As far as characteristic three is concerned, we refer to the survey by Beuchat {\em et al.} (ePrint 2007-417). We then show how to get back to the modified Tate pairing at almost no extra cost. Finally, we explore the...

2007/417 (PDF) Last updated: 2008-09-10

Algorithms and Arithmetic Operators for Computing the $\eta_T$ Pairing in Characteristic Three

Jean-Luc Beuchat, Nicolas Brisebarre, Jérémie Detrey, Eiji Okamoto, Masaaki Shirase, Tsuyoshi Takagi

Implementation

Since their introduction in constructive cryptographic applications, pairings over (hyper)elliptic curves are at the heart of an ever increasing number of protocols. Software implementations being rather slow, the study of hardware architectures became an active research area. In this paper, we discuss several algorithms to compute the $\eta_T$ pairing in characteristic three and suggest further improvements. These algorithms involve addition, multiplication, cubing, inversion, and...

2007/187 (PDF) Last updated: 2007-05-20

Executing Modular Exponentiation on a Graphics Accelerator

Andrew Moss, Dan Page, Nigel Smart

Implementation

Demand in the consumer market for graphics hardware that accelerates rendering of 3D images has resulted in commodity devices capable of astonishing levels of performance. These results were achieved by specifically tailoring the hardware for the target domain. As graphics accelerators become increasingly programmable this performance makes them an attractive target for other domains. Specifically, they have motivated the transformation of costly algorithms from a general purpose...

2007/091 (PDF) Last updated: 2007-06-03

Arithmetic Operators for Pairing-Based Cryptography

Jean-Luc Beuchat, Nicolas Brisebarre, Jérémie Detrey, Eiji Okamoto

Implementation

Since their introduction in constructive cryptographic applications, pairings over (hyper)elliptic curves are at the heart of an ever increasing number of protocols. Software implementations being rather slow, the study of hardware architectures became an active research area. In this paper, we first study an accelerator for the $\eta_T$ pairing over $\mathbb{F}_3[x]/(x^{97}+x^{12}+2)$. Our architecture is based on a unified arithmetic operator which performs addition, multiplication, and...

2007/045 (PDF) Last updated: 2007-02-14

A Coprocessor for the Final Exponentiation of the $\eta_T$ Pairing in Characteristic Three

Jean-Luc Beuchat, Nicolas Brisebarre, Masaaki Shirase, Tsuyoshi Takagi, Eiji Okamoto

Implementation

Since the introduction of pairings over (hyper)elliptic curves in constructive cryptographic applications, an ever increasing number of protocols based on pairings have appeared in the literature. Software implementations being rather slow, the study of hardware architectures became an active research area. Beuchat et al. proposed for instance a coprocessor which computes the characteristic three $\eta_T$ pairing, from which the Tate pairing can easily be derived, in $33$\,$\mu$s on a...

2006/327 (PDF) Last updated: 2007-03-23

An Algorithm for the $\eta_T$ Pairing Calculation in Characteristic Three and its Hardware Implementation

Jean-Luc Beuchat, Masaaki Shirase, Tsuyoshi Takagi, Eiji Okamoto

Implementation

In this paper, we propose a modified $\eta_T$ pairing algorithm in characteristic three which does not need any cube root extraction. We also discuss its implementation on a low cost platform which hosts an Altera Cyclone~II FPGA device. Our pairing accelerator is ten times faster than previous known FPGA implementations in characteristic three.

2006/316 (PDF) Last updated: 2006-09-22

A Parallelization of ECDSA Resistant to Simple Power Analysis Attacks

Sarang Aravamuthan, Viswanatha Rao Thumparthy

Implementation

The Elliptic Curve Digital Signature Algorithm admits a natural parallelization wherein the point multiplication step can be split in two parts and executed in parallel. Further parallelism is achieved by executing a portion of the multiprecision arithmetic operations in parallel with point multiplication. This results in a saving in timing as well as gate count when the two paths are implemented in hardware and software. This article attempts to exploit this parallelism in a typical system...

Search Help

95 results sorted by ID