ZKHD — Zero-Knowledge HyperDimensional

A personal research project exploring whether biometric authentication is possible without storing any biometric data — not even a derived template.

What is ZKHD?

ZKHD combines three ideas:

SimHash encoding: ArcFace face embeddings (512-dimensional float vectors) are projected into 32,000-bit binary hypervectors via sign random projections. The encoding preserves angular distance: same-person pairs land close together in Hamming space, while different-person pairs land near a normalized Hamming distance of 0.5.

Fuzzy extraction: a cryptographic primitive that derives a stable key from a noisy biometric using stored helper data. It is currently mocked with an oracle; the real construction requires an error-correcting code (SC-LDPC) that this project has not yet implemented.

Zero-knowledge proof: the user should prove "my biometric is close enough" without revealing anything else. It is currently mocked with an Ed25519 signature.
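
The SimHash step can be illustrated with a minimal pure-Python sketch. It is not the code in hv.py: the function names, the seeds, and the toy sizes (32 dimensions, 2,048 bits instead of 512 and 32,000) are assumptions chosen so the demo runs quickly. It shows the property the encoding relies on, namely that the normalized Hamming distance between two codes approximates the angle between the input vectors divided by pi.

```python
import math
import random

def make_projection(n_bits, dim, seed=0):
    """Gaussian random projection matrix, one row per output bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def simhash(vec, projection):
    """Sign random projection: bit i is 1 when row i has a positive dot product with vec."""
    return [1 if sum(r * v for r, v in zip(row, vec)) > 0 else 0 for row in projection]

def hamming(a, b):
    """Normalized Hamming distance in [0, 1]."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def angle(u, v):
    """Angle between two vectors in radians."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / (norm_u * norm_v))))

# Toy sizes (assumption); the project itself uses 512 -> 32,000.
DIM, N_BITS = 32, 2048
rng = random.Random(1)
proj = make_projection(N_BITS, DIM)

anchor = [rng.gauss(0, 1) for _ in range(DIM)]
same = [a + 0.2 * rng.gauss(0, 1) for a in anchor]  # small perturbation: stand-in for "same person"
other = [rng.gauss(0, 1) for _ in range(DIM)]       # independent vector: stand-in for "different person"

d_same = hamming(simhash(anchor, proj), simhash(same, proj))
d_other = hamming(simhash(anchor, proj), simhash(other, proj))

print(f"same:  measured {d_same:.3f}, angle/pi {angle(anchor, same) / math.pi:.3f}")
print(f"other: measured {d_other:.3f}, angle/pi {angle(anchor, other) / math.pi:.3f}")
```

The random vectors here only stand in for embeddings; real ArcFace outputs would come from the face recognition backend.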

Current state

The representation layer (ArcFace + SimHash) is implemented and empirically validated. The cryptographic layer (fuzzy extractor, ZKP) is scaffolded with placeholders.
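
To make concrete what the placeholder fuzzy extractor stands in for, here is a toy sketch of the textbook code-offset construction. It is neither the project's oracle mock nor the SC-LDPC construction: it swaps in a 5x repetition code, uses SHA-256 as a stand-in for a proper randomness extractor, and all names (`gen`, `rep`, `REP`) are hypothetical. It is not secure and only shows how helper data lets a noisy reading reproduce the enrolled key.

```python
import hashlib
import secrets

REP = 5  # repetition factor (assumption): each secret bit is spread over 5 biometric bits

def encode(bits):
    """Repetition code: repeat every bit REP times."""
    return [b for b in bits for _ in range(REP)]

def decode(codeword):
    """Majority vote within each block of REP bits."""
    return [1 if sum(codeword[i:i + REP]) > REP // 2 else 0
            for i in range(0, len(codeword), REP)]

def gen(w):
    """Enrollment: pick a random secret, mask its codeword with the biometric bits w."""
    secret = [secrets.randbits(1) for _ in range(len(w) // REP)]
    helper = [c ^ wi for c, wi in zip(encode(secret), w)]
    key = hashlib.sha256(bytes(secret)).digest()
    return key, helper

def rep(w_noisy, helper):
    """Reproduction: unmask with the noisy reading, decode, and re-derive the key."""
    secret = decode([h ^ wi for h, wi in zip(helper, w_noisy)])
    return hashlib.sha256(bytes(secret)).digest()

# Demo with a 40-bit toy "biometric"; its length must be a multiple of REP.
w = [secrets.randbits(1) for _ in range(40)]
key, helper = gen(w)

noisy = list(w)
for i in (0, 1, 7, 23, 38):   # at most 2 flips per block of 5, so majority vote recovers
    noisy[i] ^= 1
flipped = [b ^ 1 for b in w]  # every bit wrong: decoding yields the complemented secret

print(rep(noisy, helper) == key)    # close reading reproduces the key
print(rep(flipped, helper) == key)  # distant reading does not
```

A repetition code has far too low a rate for real use, which is one reason the real construction calls for a stronger code such as SC-LDPC.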

The whitepaper reports a negative result: under standard fuzzy extractor bounds, the error rate required for 2D face recognition in the wild forces the helper string to leak more min-entropy than a 2D face embedding is estimated to contain. Secure 256-bit key extraction from a 2D camera is not achievable under these assumptions.
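
The shape of this argument can be sketched with the entropy accounting for a code-offset construction. With an [n, k] code, the usual guarantee is that at most n - k bits of min-entropy are lost to the helper string, and reliably correcting independent bit flips at rate p requires k <= n(1 - h2(p)), so n - k >= n * h2(p). The sketch below uses n = 32,000 from the encoding above. The bit error rate (borrowed from the LFW threshold of 0.41) and the source entropy figure are placeholder assumptions, not numbers taken from the whitepaper.

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def min_entropy_loss_bits(n_bits, bit_error_rate):
    """Smallest n - k compatible with correcting i.i.d. flips at this rate (Shannon limit)."""
    return n_bits * h2(bit_error_rate)

def guaranteed_residual_bits(source_entropy_bits, n_bits, bit_error_rate):
    """Residual min-entropy that the standard bound still guarantees, floored at zero."""
    return max(0.0, source_entropy_bits - min_entropy_loss_bits(n_bits, bit_error_rate))

N_BITS = 32_000                      # hypervector length from the SimHash encoding
ASSUMED_BIT_ERROR_RATE = 0.41        # assumption: fraction of bits the code must correct
ASSUMED_SOURCE_ENTROPY_BITS = 1_000  # hypothetical placeholder, not a measured value

loss = min_entropy_loss_bits(N_BITS, ASSUMED_BIT_ERROR_RATE)
residual = guaranteed_residual_bits(ASSUMED_SOURCE_ENTROPY_BITS, N_BITS, ASSUMED_BIT_ERROR_RATE)

print(f"entropy loss bound: {loss:.0f} bits")
print(f"guaranteed residual: {residual:.0f} bits (target: 256)")
```

Under these placeholder inputs the guaranteed residual entropy is zero, well short of a 256-bit key. The whitepaper's own estimates should be used for any real conclusion.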

Results

Dataset	Identities	Best tau	FRR	FAR
Five Faces	5	0.44	0.15%	0.09%
LFW (>= 5 images per identity)	420	0.41	1.23%	0.08%
VGGFace2 subset	50	0.43	2.63%	0.07%
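
For readers unfamiliar with the columns, the sketch below shows how FRR and FAR are typically computed from distance lists at a threshold tau. It assumes a comparison is accepted when its normalized Hamming distance is at most tau, and that the best tau minimizes FRR + FAR. Both are assumptions, since the selection rule used in simulate.py is not stated here. The distances are made-up toy values, not project data.

```python
def frr_far(genuine, impostor, tau):
    """FRR: share of same-person pairs rejected; FAR: share of different-person pairs accepted."""
    frr = sum(d > tau for d in genuine) / len(genuine)
    far = sum(d <= tau for d in impostor) / len(impostor)
    return frr, far

def best_tau(genuine, impostor, candidates):
    """Pick the candidate threshold with the smallest FRR + FAR (assumed criterion)."""
    return min(candidates, key=lambda t: sum(frr_far(genuine, impostor, t)))

# Toy distances for illustration only.
genuine = [0.30, 0.35, 0.38, 0.45]          # same-person pairs
impostor = [0.42, 0.47, 0.49, 0.50, 0.51]   # different-person pairs

frr, far = frr_far(genuine, impostor, tau=0.41)
chosen = best_tau(genuine, impostor, candidates=[0.36, 0.41, 0.46])
print(f"tau=0.41: FRR={frr:.2%}, FAR={far:.2%}; best of candidates: {chosen}")
```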

Structure

ZKHD/
├── Code/
│   ├── hv.py                 SimHash encoder, projection matrix, integrity guard
│   ├── simulate.py           Full pipeline: extract, encode, enroll, evaluate
│   ├── fuzzy_extractor.py    Fuzzy extractor (oracle mock)
│   ├── zk.py                 ZKP (Ed25519 mock)
│   └── requirements.txt
└── Dataset/                  Not tracked - place your datasets here

Dataset layout: either one folder per identity (Dataset/<name>/<user>/*.jpg or *.jpeg) or a flat folder with the identity as a filename prefix (Dataset/<name>/<identity>_*.jpg or *.jpeg).
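
As an illustration of the two layouts, the sketch below groups image files by identity. The helper `index_dataset` is hypothetical and is not taken from simulate.py. For the flat layout it assumes the identity is everything before the last underscore in the filename, which suits LFW-style names but may not match how the project splits them.

```python
import tempfile
from collections import defaultdict
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg"}

def index_dataset(root):
    """Map identity -> list of image paths found under Dataset/<name>."""
    root = Path(root)
    by_identity = defaultdict(list)
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        if path.parent == root:
            # Flat layout: <identity>_<suffix>.jpg, split on the last underscore (assumption).
            identity, sep, _ = path.stem.rpartition("_")
            if not sep:
                continue  # no underscore, so no identity prefix to read
        else:
            # Folder layout: the enclosing folder names the identity.
            identity = path.parent.name
        by_identity[identity].append(path)
    return dict(by_identity)

# Demo on a throwaway directory that mixes both layouts.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "alice").mkdir()
    for name in ("alice/1.jpg", "alice/2.jpeg", "bob_smith_0001.jpg", "notes.txt"):
        (root / name).touch()
    demo = {ident: [p.name for p in paths] for ident, paths in index_dataset(root).items()}

print(demo)
```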

Running

cd Code

pip3 install -r requirements.txt

python3 simulate.py --enroll 3

The program prompts for backend (buffalo_l / antelopev2), dataset, and core count.
