A fully open-source hardware AV1 intra encoder in SystemVerilog, verified bit-exact against two independent decoders (ffmpeg and dav1d). Implements 9 intra prediction modes with SAD-based mode decision, 4x4 DCT/quantization, entropy coding, and local reconstruction feedback. Synthesized to GDS-II on SkyWater 130nm — DRC and LVS clean.
| Metric | V1 (DC-only) | V2 (Directional) |
|---|---|---|
| Logic Cells | 129,436 | 143,396 |
| Flip-Flops | 3,531 | 4,819 |
| Core Utilization | 50.3% | 58.5% |
| Max Frequency | 64.1 MHz | 28.6 MHz |
| Power | 775 mW | 97.4 mW |
| Die Size | 1.5 x 1.5 mm | 1.5 x 1.5 mm |
| DRC | 0 violations | 0 violations |
| LVS | Clean | Clean |
Both versions target the SkyWater SKY130 process using OpenLane 2.3.1. The V2 critical path runs through the directional predictor's 16-pixel interpolation stage and 4-level SAD adder tree (~20.7 ns at the nominal corner). Synthesis is fully reproducible from the committed config.

    Pixels ──► Mode Decision ──► Directional Predict ──► Residual ──► DCT ──► Quantize ──► Entropy ──► Bitstream
               │  (9 modes)       (16-pixel interp)                                │
               │  SAD compare                                                      │
               ▼                                                                   ▼
           best_mode                                                         Inv Quantize
                                                                                   │
    Recon pixels ◄──── Clamp ◄──── Add Prediction ◄──── Inverse DCT ◄──────────────┘

Prediction: 9 AV1 intra modes — DC, V, H, D45, D135, D113, D157, D203, D67. Mode decision evaluates all 9 via SAD against the original block and selects the best. Directional predictor computes sub-pixel interpolation using the AV1-specified dx/dy tables.
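The mode decision described above can be sketched in Python. This is a minimal illustration, not the RTL or the repository's golden model: it covers only DC, V, and H (the directional modes need the dx/dy tables), and the function names and the tie-break rule are assumptions made for this example.

```python
"""Sketch of SAD-based intra mode decision for one 4x4 block (DC, V, H only)."""

def predict_dc(top, left):
    # Rounded average of the 4 top and 4 left neighbor pixels.
    dc = (sum(top) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]

def predict_v(top, left):
    # Vertical: every row copies the row of pixels above the block.
    return [list(top) for _ in range(4)]

def predict_h(top, left):
    # Horizontal: every row is filled with the pixel to its left.
    return [[left[r]] * 4 for r in range(4)]

def sad_4x4(a, b):
    # Sum of absolute differences over all 16 pixel pairs.
    return sum(abs(a[r][c] - b[r][c]) for r in range(4) for c in range(4))

PREDICTORS = {"DC": predict_dc, "V": predict_v, "H": predict_h}

def decide_mode(block, top, left):
    # Evaluate every candidate and keep the one with the lowest SAD.
    # Ties go to the first mode in PREDICTORS order (an assumption).
    best = None
    for name, fn in PREDICTORS.items():
        pred = fn(top, left)
        cost = sad_4x4(block, pred)
        if best is None or cost < best[1]:
            best = (name, cost, pred)
    return best

top = [10, 50, 90, 130]
left = [12, 12, 12, 12]
block = [[10, 50, 90, 130]] * 4   # vertical stripes matching the top row
mode, cost, pred = decide_mode(block, top, left)
print(mode, cost)                 # V 0
```

For this striped block the vertical predictor reproduces the input exactly, so its SAD is 0 and it wins over DC (SAD 712) and H (SAD 944).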
Transform + Quantization: Row-column 4x4 DCT with HEVC-convention shift values (SHIFT_ROW=1, SHIFT_COL=8). Quantizer uses multiply-by-reciprocal with 33-bit precision to avoid division. AV1-standard quantization tables (256 entries for DC/AC).
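A rough Python sketch of the two steps follows. Several details are assumptions: the matrix is the HEVC 4x4 core transform (the text names only the shift values), the quantizer truncates toward zero with no rounding offset, and the step value 32 is an arbitrary example rather than an entry from the AV1 tables.

```python
"""Sketch: 4x4 row-column integer DCT, then reciprocal-multiply quantization."""

# HEVC 4x4 core transform matrix (assumed; the text gives only the shifts).
C = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]
SHIFT_ROW = 1    # from the text
SHIFT_COL = 8    # from the text
RECIP_BITS = 33  # "33-bit precision" in the text; its exact use here is assumed

def round_shift(x, shift):
    # Add half an LSB, then arithmetic right shift (Python's >> floors negatives).
    return (x + (1 << (shift - 1))) >> shift

def dct4x4(block):
    # Row pass: transform each row of the residual block.
    rows = [[round_shift(sum(C[k][n] * row[n] for n in range(4)), SHIFT_ROW)
             for k in range(4)] for row in block]
    # Column pass: transform each column of the row-pass result.
    return [[round_shift(sum(C[k][n] * rows[n][c] for n in range(4)), SHIFT_COL)
             for c in range(4)] for k in range(4)]

def reciprocal(step):
    # ceil(2^33 / step); in hardware this would be a precomputed table entry.
    return -(-(1 << RECIP_BITS) // step)

def quantize(coeff, step):
    # Divide the magnitude by `step` with a multiply and a shift, then restore the sign.
    q = (abs(coeff) * reciprocal(step)) >> RECIP_BITS
    return -q if coeff < 0 else q

flat = [[10] * 4 for _ in range(4)]
coeffs = dct4x4(flat)
print(coeffs[0][0], quantize(coeffs[0][0], 32))   # 1280 40
```

A flat block puts all of its energy into the DC coefficient, which makes the output easy to check by hand. The multiply-and-shift matches integer division exactly as long as the coefficient magnitude times the step stays below 2^33.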
Entropy Coding: Bool encoder with a 1024-bit accumulation register (no streaming carry propagation). Coefficient encoder maps quantized coefficients to binary symbols (has_more, is_nz, sign, unary magnitude).
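As an illustration of the coefficient-to-symbol mapping, here is one possible binarization using the four symbol names from the text. The symbol order, the end-of-block convention, and both function names are assumptions for this sketch; the actual RTL may order or terminate the symbols differently.

```python
"""Sketch: one possible binarization of quantized coefficients into bool symbols."""

def binarize(coeffs):
    # Returns a list of (symbol_name, bit) pairs in coding order.
    last = max((i for i, c in enumerate(coeffs) if c != 0), default=-1)
    symbols = []
    for i in range(last + 1):
        c = coeffs[i]
        symbols.append(("has_more", 1))            # another coefficient follows
        symbols.append(("is_nz", int(c != 0)))
        if c != 0:
            symbols.append(("sign", int(c < 0)))
            # Unary magnitude: (|c| - 1) ones terminated by a zero.
            symbols.extend(("mag", 1) for _ in range(abs(c) - 1))
            symbols.append(("mag", 0))
    symbols.append(("has_more", 0))                # end of block
    return symbols

def debinarize(symbols, n=16):
    # Inverse of binarize(); pads with zeros up to n coefficients.
    bits = iter(bit for _, bit in symbols)
    out = []
    while next(bits):                              # has_more
        if not next(bits):                         # is_nz
            out.append(0)
            continue
        negative = next(bits)                      # sign
        mag = 1
        while next(bits):                          # count unary ones
            mag += 1
        out.append(-mag if negative else mag)
    return out + [0] * (n - len(out))

coeffs = [3, 0, -1] + [0] * 13
symbols = binarize(coeffs)
print(len(symbols), debinarize(symbols) == coeffs)   # 13 True
```

Each bit in the list would then be passed to the bool encoder together with a probability for its symbol type.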
Reconstruction Feedback: Inverse quantize, inverse DCT, add prediction, and clamp to [0, 255]. Reconstructed pixels feed back as reference for subsequent blocks.
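The last two steps of that loop, adding the prediction and clamping, can be sketched as below. The residual values stand in for the inverse DCT output and are made up for the example. The helper that extracts neighbor pixels is a simplification: it returns only the bottom row and right column.

```python
"""Sketch: add prediction, clamp to 8 bits, and expose pixels for later blocks."""

def clamp8(x):
    # Saturate to the 8-bit pixel range [0, 255].
    return max(0, min(255, x))

def reconstruct(pred, residual):
    # Add the decoded residual to the prediction, pixel by pixel, then clamp.
    return [[clamp8(p + r) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(pred, residual)]

def next_block_neighbors(recon):
    # Pixels later blocks would read from this 4x4 block:
    # its bottom row (top neighbors of the block below) and
    # its right column (left neighbors of the block to the right).
    bottom_row = list(recon[-1])
    right_col = [row[-1] for row in recon]
    return bottom_row, right_col

pred = [[250, 128, 128, 3]] * 4
residual = [[10, -5, 5, -8]] * 4      # hypothetical inverse-DCT output
recon = reconstruct(pred, residual)
print(recon[0])                       # [255, 123, 133, 0]
print(next_block_neighbors(recon))    # ([255, 123, 133, 0], [0, 0, 0, 0])
```

Predicting from reconstructed pixels rather than the original input keeps the encoder's reference identical to what a decoder will see, which is a precondition for bit-exact decoding.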
Bitstream Output: Full AV1 compliance for the supported feature set — IVF container, OBU framing (temporal delimiter + sequence header + frame OBU), tile encoding with Daala EC entropy coding and CDF-based context modeling.
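The container and framing layers are simple enough to sketch directly. The example below writes the 32-byte IVF file header, a 12-byte IVF frame header, and a sized OBU. Only the temporal delimiter is built, since it has an empty payload; the sequence header and frame OBU payloads are out of scope here. The function names and the 30 fps timebase are illustrative choices, not taken from the repository.

```python
"""Sketch: IVF container headers and AV1 OBU framing with a size field."""
import struct

OBU_SEQUENCE_HEADER = 1
OBU_TEMPORAL_DELIMITER = 2
OBU_FRAME = 6

def leb128(value):
    # Unsigned little-endian base-128 encoding used for OBU sizes.
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def obu(obu_type, payload=b""):
    # One-byte header: obu_type in bits 6..3, has_size_field (bit 1) set,
    # no extension; then the payload size as LEB128, then the payload.
    header = (obu_type << 3) | 0x02
    return bytes([header]) + leb128(len(payload)) + payload

def ivf_file_header(width, height, timebase_den=30, timebase_num=1, frame_count=1):
    # 32-byte little-endian IVF header with the AV1 fourcc.
    return struct.pack("<4sHH4sHHIIII", b"DKIF", 0, 32, b"AV01",
                       width, height, timebase_den, timebase_num, frame_count, 0)

def ivf_frame(payload, pts=0):
    # 12-byte frame header: payload size (u32) and timestamp (u64).
    return struct.pack("<IQ", len(payload), pts) + payload

temporal_unit = obu(OBU_TEMPORAL_DELIMITER)   # real streams append more OBUs here
data = ivf_file_header(16, 16) + ivf_frame(temporal_unit)
print(temporal_unit.hex(), len(data))         # 1200 46
```

A complete temporal unit would concatenate the temporal delimiter, sequence header, and frame OBU before wrapping them in a single IVF frame.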
- Python 3.9+
- Cocotb 2.0+ (`pip install cocotb`)
- Icarus Verilog 12.x (`sudo apt install iverilog`)
- ffmpeg (optional, for bitstream decode verification)

Clone the repository and install the Python dependencies:

    git clone https://github.com/Tgcohce/openAV1.git
    cd openAV1
    pip install -r requirements.txt

Run the RTL tests:

    # V1 pipeline tests (57 tests across 8 suites)
    cd tb
    for mf in Makefile Makefile.quant Makefile.pipe Makefile.bool \
              Makefile.coeff_enc Makefile.inv Makefile.pred Makefile.intra; do
        make -f $mf SIM=icarus
    done

    # V2 directional prediction tests (34 tests across 4 suites)
    for mf in Makefile.sad Makefile.dir_pred Makefile.mode_dec Makefile.intra_v2; do
        make -f $mf SIM=icarus
    done

Run the golden model tests:

    python -m golden.encode_frame   # Full encoder + ffmpeg decode test
    python -m golden.av1_tile       # Tile encoder (53 tests)
    python -m golden.bool_coder     # Bool encoder/decoder (994 tests)
    python -m golden.coeff_encode   # Coefficient coding (1024 tests)

Encode an image and decode it with ffmpeg:

    python scripts/encode_image.py --width 16 --height 16 --pattern gray output.ivf
    ffmpeg -i output.ivf -f rawvideo -pix_fmt gray decoded.raw

Synthesis requires Docker with 12+ GB of memory allocated:


    docker run --rm \
      -v "$(pwd):/work" \
      -v openlane-pdk:/home/openlane/.volare \
      -w /work/synth \
      ghcr.io/efabless/openlane2:2.3.1 \
      python3 -m openlane /work/synth/config.json

| Module | Latency | Input | Output |
|---|---|---|---|
| `dct4x4` | 2 cycles | 16-bit signed pixels | 24-bit signed coefficients |
| `quantizer` | 2 cycles | 24-bit coefficients + qindex | 16-bit quantized coefficients |
| `bool_encoder` | 1 sym/cycle | bit + probability | byte stream |
| `directional_predictor` | combinational | 8-bit pixels + mode | 8-bit predicted block |
| `sad_4x4` | combinational | two 4x4 blocks | 16-bit SAD value |
| `mode_decision` | ~18 cycles | pixels + neighbor context | best mode + predicted block |
| `intra_encode_v2_top` | ~26 cycles | 8-bit pixels + qindex | encoded bytes + recon pixels |

Bitstreams validated against ffmpeg (libdav1d) and dav1d (VideoLAN) — both produce identical decoded output across all configurations.
| Test Suite | Tests | Status |
|---|---|---|
| Multi-decoder agreement (deterministic) | 120 | Pass |
| Multi-decoder agreement (random images) | 1000 | Pass |
| Encoder-decoder round-trip | 1111 | Pass |
| Bitstream structure (ffprobe) | 16 | Pass |
| Compression sanity | 17 | Pass |
| Golden model integration | 27 | Pass |
| Adversarial inputs | 40 | Pass |
| RTL — V1 pipeline (Cocotb) | 57 | Pass |
| RTL — SAD (Cocotb) | 5 | Pass |
| RTL — Directional predictor (Cocotb) | 14 | Pass |
| RTL — Mode decision (Cocotb) | 7 | Pass |
| RTL — V2 pipeline (Cocotb) | 8 | Pass |
| Total | 2422 | All pass |
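Measured on a 16x16 random image (monochrome, DC + directional prediction). Compression is the raw size (256 B for 16x16 8-bit monochrome) divided by the encoded size, so values below 1x mean the output is larger than the raw pixels: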
| qindex | Size | Compression | PSNR |
|---|---|---|---|
| 10 | 585 B | 0.44x | 69.2 dB |
| 50 | 524 B | 0.49x | 52.2 dB |
| 100 | 450 B | 0.57x | 46.7 dB |
| 150 | 299 B | 0.86x | 39.3 dB |
| 200 | 177 B | 1.45x | 31.6 dB |
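The two derived columns can be recomputed with a few lines of Python. This is a sketch using the standard definitions (raw bytes over encoded bytes, and PSNR with a peak of 255); the function names are illustrative and not part of the repository's scripts.

```python
"""Sketch: compression ratio and PSNR for an 8-bit monochrome image."""
import math

def compression_ratio(width, height, encoded_bytes, bits_per_pixel=8):
    # Raw size divided by encoded size; below 1.0 means expansion.
    raw_bytes = width * height * bits_per_pixel // 8
    return raw_bytes / encoded_bytes

def psnr(original, decoded, peak=255):
    # Peak signal-to-noise ratio in dB over two equal-length pixel sequences.
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf               # identical images
    return 10 * math.log10(peak * peak / mse)

for size in (585, 524, 450, 299, 177):
    print(size, f"{compression_ratio(16, 16, size):.2f}x")
```

Running the loop reproduces the Compression column from the Size column: 0.44x, 0.49x, 0.57x, 0.86x, and 1.45x.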

    golden/    Python golden models (DCT, quantizer, entropy, prediction, bitstream)
    rtl/       SystemVerilog RTL (15 modules + 9 synthesis variants)
    tb/        Cocotb testbenches (12 test files + 12 Makefiles)
    synth/     OpenLane 2 config + synthesis reports
    scripts/   Encoding tools, GDS renderer, test runner
    sim/       Shell scripts for batch simulation

This is a proof-of-concept encoder targeting a specific AV1 subset:
- Monochrome only (no chroma planes)
- Still pictures (intra-only, no inter-frame prediction)
- 4x4 transform blocks (no 8x8, 16x16, or 32x32)
- 9 of 13 intra modes (no SMOOTH, SMOOTH_V, SMOOTH_H, or PAETH)
- Up to 64x64 pixels per frame
- Synthesis-verified (GDS-II generated and DRC/LVS clean, not fabricated)

Milestones, completed and planned:

- 4x4 DCT transform unit
- AV1 quantizer (multiply-by-reciprocal)
- Entropy coding (bool encoder + coefficient encoder)
- DC intra prediction with reconstruction feedback
- AV1 bitstream assembly (OBU framing, tile encoding, IVF container)
- V1 chip layout (SkyWater 130nm, 64 MHz)
- 9-mode directional prediction (golden model + RTL)
- V2 chip layout (SkyWater 130nm, 28.6 MHz)
- Multi-block streaming with neighbor feedback
- Larger transform sizes (8x8, 16x16, 32x32)
- Inter-frame prediction (integer-pel motion estimation)
MIT License. See LICENSE for details.