Skip to content

allenk/SudokuSolver

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

11 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Sudoku Solver - High-Performance Cross-Platform Sudoku Solver

A high-performance, cross-platform Sudoku solver featuring multiple algorithms, OCR image recognition, comprehensive benchmarking tools, and multi-platform CI/CD support.

Preview

🧠 Project Background

I originally built a Sudoku solver as a small father-son coding challenge. It later evolved into a useful benchmark tool to explore algorithmic efficiency and cross-platform CPU performance.

I wrote two articles documenting this journey:


✨ Features

  • Multiple Algorithms

    • Dancing Links (DLX) - Donald Knuth's Algorithm X implementation
    • Backtracking with Constraint Propagation
    • Automatic algorithm comparison
  • Find All Solutions

    • --solve-all option to find multiple solutions
    • Configurable limit with --max-solutions
    • Save all solutions to JSON
  • OCR Support (Windows/Linux/macOS)

    • Extract puzzles from images (PNG, JPG, BMP, TIFF)
    • Supports 9x9 and 16x16 grids
    • Automatic grid detection and perspective correction
  • Built-in Test Puzzles

    • 9x9 classic puzzle for quick tests
    • 16x16 hard puzzle (77 clues) for standard benchmarks
    • 25x25 mega puzzle for stress testing
  • Comprehensive Benchmarking

    • Single-threaded and multi-threaded benchmarks
    • Cross-platform system information detection
    • Performance metrics: throughput, speedup, efficiency
    • Professional benchmark reports
  • Cross-Platform Support

    • Windows (x64)
    • Linux (x64)
    • macOS (Universal Binary - Intel & Apple Silicon)
    • Android (ARM64) - Solver & Benchmark only

πŸš€ Quick Start

Prerequisites

  • CMake 3.20+
  • C++20 compatible compiler (MSVC 2022, GCC 11+, Clang 14+)
  • vcpkg package manager

Build with CMake Presets

# Clone repository
git clone https://github.com/user/sudoku-solver.git
cd sudoku-solver

# Windows
cmake --preset windows-x64-Release
cmake --build --preset windows-x64-Release

# Linux
cmake --preset linux-x64-Release
cmake --build --preset linux-x64-Release

# macOS (Universal Binary)
cmake --preset macos-universal-Release
cmake --build --preset macos-universal-Release

Usage

# Solve a puzzle from JSON
SudokuSolver -i puzzle.json

# Solve from image (OCR) - Windows/Linux/macOS only
SudokuSolver -i sudoku.png

# Use built-in test puzzles
SudokuSolver -t 9                    # 9x9 puzzle
SudokuSolver -t 16                   # 16x16 puzzle  
SudokuSolver -t 25                   # 25x25 puzzle (heavy)

# Compare algorithms
SudokuSolver -i puzzle.json -a compare

# Find all solutions (with default limit of 100)
SudokuSolver -i puzzle.json --solve-all

# Find up to 10 solutions
SudokuSolver -i puzzle.json --solve-all --max-solutions 10

# Find all solutions and save to JSON
SudokuSolver -i puzzle.json --solve-all -o solutions.json

# Run benchmark (1000 iterations, auto-detect threads)
SudokuSolver -t 9 -b 1000 -w 0

# Single-threaded benchmark
SudokuSolver -t 25 -b 100

# Multi-threaded benchmark with built-in puzzle
SudokuSolver -t 16 -b 1000 -w 0 -a compare

# Show system info
SudokuSolver -s

Command Line Options

Option Description
-i, --input <file> Input file (JSON or image)
-t, --test <size> Use built-in test puzzle (9, 16, or 25)
-a, --algorithm <algo> Algorithm: dlx, backtrack, compare
-b, --benchmark <N> Run benchmark with N iterations
-w, --workers <N> Number of parallel workers (0 = auto)
-o, --output <file> Output solution to JSON file
--solve-all Find all solutions (default limit: 100)
--max-solutions <N> Maximum solutions to find (0 = unlimited)
-s, --sysinfo Show system information
-q, --quiet Minimal output

πŸ“Š Benchmark Results

Performance Metrics Explained

                    Throughput (T)
Efficiency (E) = ─────────────────────
                  Single-Thread x Workers

Where:
  - Throughput = Total solves per second (multi-threaded)
  - Speedup = Multi-threaded throughput / Single-threaded throughput  
  - Efficiency = Speedup / Number of workers x 100%

Efficiency Interpretation:

Efficiency Rating Description
90-100% ⭐⭐⭐⭐⭐ Near-perfect scaling
70-90% ⭐⭐⭐⭐ Excellent
50-70% ⭐⭐⭐ Good (typical for CPU-bound tasks)
30-50% ⭐⭐ Moderate (synchronization overhead)
<30% ⭐ Poor (bottlenecked)

πŸ–₯️ Cross-Platform Benchmark (GitHub CI Runners)

Multi-threaded Test: ./SudokuSolver -b 100 -w 0 -a compare -t 25

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    GitHub Actions Runner Performance                        β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  Platform  β”‚       CPU        β”‚  Cores  β”‚   BT/s    β”‚  DLX/s    β”‚ Winner    β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  Windows   β”‚ AMD EPYC 7763    β”‚  2C/4T  β”‚  44.86/s  β”‚  58.57/s  β”‚ DLX +31%  β”‚
β”‚  Linux     β”‚ AMD EPYC 7763    β”‚  2C/4T  β”‚  38.94/s  β”‚  67.49/s  β”‚ DLX +73%  β”‚
β”‚  macOS     β”‚ Apple M1 (VM)    β”‚  3C/3T  β”‚  35.56/s  β”‚  94.92/s  β”‚ DLX+167%  β”‚ ⭐
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

BT = Backtracking, DLX = Dancing Links

Single-threaded Test: ./SudokuSolver -b 100 -t 25

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    Single-Threaded Performance Comparison                   β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  Platform  β”‚       CPU        β”‚  Avg(ms) β”‚ Throughput β”‚        Notes        β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  WSL2      β”‚ AMD 9950X3D      β”‚  17.41   β”‚  57.43/s   β”‚ Fastest             β”‚ ⭐
β”‚  Windows   β”‚ AMD 9950X3D      β”‚  19.27   β”‚  51.90/s   β”‚ MSVC overhead       β”‚
β”‚  Android   β”‚ Snapdragon 8G3   β”‚  25.46   β”‚  39.27/s   β”‚ Mobile SoC!         β”‚
β”‚  macOS VM  β”‚ Apple M1 (VM)    β”‚  31.86   β”‚  31.39/s   β”‚ VM overhead ~30%    β”‚
β”‚  Linux CI  β”‚ AMD EPYC 7763    β”‚  35.98   β”‚  27.79/s   β”‚ CI resource limit   β”‚
β”‚  Win CI    β”‚ AMD EPYC 7763    β”‚  41.51   β”‚  24.09/s   β”‚ CI resource limit   β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Key Observations:

  • WSL2 (GCC) outperforms native Windows (MSVC) by ~10% on DLX algorithm
  • Snapdragon 8 Gen 3 achieves 68% of desktop performance at 1/34 power consumption
  • macOS Runner runs in VM, real M1 estimated ~45/s (based on min latency 22.24ms)

Per-Core Throughput Analysis

Per-Core Efficiency Comparison (DLX, 25x25 Puzzle)
══════════════════════════════════════════════════════════════════════════════

Platform              β”‚ Cores β”‚ Total/s  β”‚ Per-Core/s β”‚ Notes
──────────────────────┼───────┼──────────┼────────────┼───────────────────────
Ryzen 9950X3D (WSL2)  β”‚  32T  β”‚  903.59  β”‚   28.2     β”‚ High core count
Ryzen 9950X3D (Win)   β”‚  32T  β”‚  885.59  β”‚   27.7     β”‚ MSVC overhead
Apple M1 (VM)         β”‚   3T  β”‚   94.92  β”‚   31.6     β”‚ Highest IPC!
EPYC 7763 (Linux)     β”‚   4T  β”‚   67.49  β”‚   16.9     β”‚ CI resource limit
EPYC 7763 (Windows)   β”‚   4T  β”‚   58.57  β”‚   14.6     β”‚ CI resource limit
Snapdragon 8 Gen 3    β”‚   8T  β”‚  105.31  β”‚   13.2     β”‚ big.LITTLE penalty

══════════════════════════════════════════════════════════════════════════════

Key Insight: Apple M1 has the HIGHEST per-core throughput (31.6/s)
             Ryzen 9950X3D wins total throughput purely via core count (32T)

Platform Performance Analysis

Throughput Comparison (GitHub Runners)
═════════════════════════════════════════════════════════════════════════════

  DLX Algorithm:
  macOS (M1)   β”‚β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ”‚ 94.92/s ⭐
  Linux        β”‚β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ                 β”‚ 67.49/s
  Windows      β”‚β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ                      β”‚ 58.57/s

  Backtracking Algorithm:
  Windows      β”‚β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ                             β”‚ 44.86/s
  Linux        β”‚β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ                                β”‚ 38.94/s
  macOS (M1)   β”‚β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ                                  β”‚ 35.56/s

               β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                0        20        40        60        80       100

🏠 Local Machine Benchmark

Test Configuration: ./SudokuSolver -b 100 -w 0 -a compare -t 25

High-End Desktop: AMD Ryzen 9 9950X3D

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚           AMD Ryzen 9 9950X3D (16C/32T) - 192GB DDR5                      β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  Platform   β”‚  Compiler   β”‚  Throughput   β”‚ Speedup  β”‚     Efficiency     β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  Windows    β”‚  MSVC 1944  β”‚               β”‚          β”‚                    β”‚
β”‚  β”œβ”€ BT      β”‚             β”‚    805.40/s   β”‚  15.02x  β”‚       46.93%       β”‚
β”‚  └─ DLX     β”‚             β”‚    885.59/s   β”‚  17.68x  β”‚       55.25%       β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  WSL2       β”‚  GCC 11.4   β”‚               β”‚          β”‚                    β”‚
β”‚  β”œβ”€ BT      β”‚             β”‚    561.83/s   β”‚  15.87x  β”‚       49.60%       β”‚
β”‚  └─ DLX     β”‚             β”‚    903.59/s   β”‚  15.97x  β”‚       49.92%       β”‚ ⭐
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

BT = Backtracking, DLX = Dancing Links

πŸ”¬ WSL2 Performance Analysis

Task Manager CPU View

Key Observation: WSL2 DLX (903.59/s) outperforms native Windows (885.59/s) by ~2%!

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                    WSL2 vs Windows Analysis                     β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                 β”‚
β”‚  Task Manager Color Legend:                                     β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”                                                     β”‚
β”‚  β”‚ Green  β”‚ = Kernel Mode (WSL2 workload)                       β”‚
β”‚  β”‚ Red    β”‚ = User Mode (Windows native)                        β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”˜                                                     β”‚
β”‚                                                                 β”‚
β”‚  Why WSL2 shows as Kernel Mode?                                 β”‚
β”‚  β”œβ”€ WSL2 runs a real Linux kernel inside Hyper-V                β”‚
β”‚  β”œβ”€ From Windows perspective, VM operations = kernel mode       β”‚
β”‚  └─ The Linux userspace code runs inside the VM                 β”‚
β”‚                                                                 β”‚
β”‚  Performance Implications:                                      β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                          β”‚
β”‚  β”‚    Algorithm    β”‚ WSL2 vs Windows β”‚                          β”‚
β”‚  β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€                          β”‚
β”‚  β”‚  Backtracking   β”‚  -30% (slower)  β”‚  MSVC recursion wins     β”‚
β”‚  β”‚  DLX            β”‚  +2%  (faster)  β”‚  GCC pointers win        β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                          β”‚
β”‚                                                                 β”‚
β”‚  Conclusion:                                                    β”‚
β”‚  - WSL2 virtualization overhead is negligible for CPU-bound     β”‚
β”‚  - Compiler optimization matters more than OS overhead          β”‚
β”‚  - DLX benefits from GCC's superior pointer optimization        β”‚
β”‚                                                                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Mobile SoC: Qualcomm Snapdragon 8 Gen 3

./SudokuSolver -b 100 -w 0 -a compare -t 25
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚     Snapdragon 8 Gen 3 (Cortex-X4 + A720 + A520) - 12GB LPDDR5            β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  Platform   β”‚  Compiler   β”‚  Throughput   β”‚ Speedup  β”‚     Efficiency     β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚  Android    β”‚ Clang 21.0  β”‚               β”‚          β”‚                    β”‚
β”‚  β”œβ”€ BT      β”‚             β”‚     95.83/s   β”‚   3.74x  β”‚      46.77%        β”‚
β”‚  └─ DLX     β”‚             β”‚    105.31/s   β”‚   2.74x  β”‚      34.23%        β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸ“ˆ Algorithm Performance Analysis

Compiler Impact on Algorithm Performance

                     Backtracking vs DLX by Compiler
    ═════════════════════════════════════════════════════════════════════════

    MSVC (Windows)
    β”œβ”€ Backtracking β”‚β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ”‚ 805/s
    └─ DLX          β”‚β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ”‚ 886/s
                    Ξ” = +10% (DLX wins)

    GCC (Linux/WSL2)  
    β”œβ”€ Backtracking β”‚β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ”‚ 562/s
    └─ DLX          β”‚β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ”‚ 904/s
                    Ξ” = +61% (DLX dominates)

    ═════════════════════════════════════════════════════════════════════════

Key Findings:

Compiler Backtracking DLX Winner Analysis
MSVC 805/s 886/s DLX +10% MSVC optimizes recursion well
GCC 562/s 904/s DLX +61% GCC excels at pointer/linked-list ops
Clang 96/s 105/s DLX +10% Similar to MSVC pattern

Why the Difference?

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                 Algorithm Characteristics                       β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚       Backtracking          β”‚           DLX                     β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ - Deep recursion            β”‚ - Linked-list traversal           β”‚
β”‚ - Stack-heavy operations    β”‚ - Pointer manipulation            β”‚
β”‚ - Predictable branches      β”‚ - Cache-unfriendly access         β”‚
β”‚ - Cache-friendly (array)    β”‚ - Complex data structures         β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ MSVC: Excellent tail-call   β”‚ GCC: Superior pointer aliasing    β”‚
β”‚ optimization & stack mgmt   β”‚ analysis & link-time optimization β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

🍎 Apple Silicon Deep Dive

Why High Efficiency on M1? (Note: GitHub Runner uses VM)

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚              Apple Silicon Architecture Advantages              β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                 β”‚
β”‚  ⚠️ Note: macOS GitHub Runner runs M1 in VM (~30% overhead)     β”‚
β”‚     Real M1 estimated: ~45/s single-thread, ~114/s multi-thread β”‚
β”‚                                                                 β”‚
β”‚  1. Unified Memory Architecture (UMA)                           β”‚
β”‚     β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”                   β”‚
β”‚     β”‚ P-Core  β”‚    β”‚ E-Core  β”‚    β”‚   GPU   β”‚                   β”‚
β”‚     β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜                   β”‚
β”‚          β”‚              β”‚              β”‚                        β”‚
β”‚          β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                        β”‚
β”‚                         β”‚                                       β”‚
β”‚                  β”Œβ”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”                                β”‚
β”‚                  β”‚ Unified RAM β”‚  << Zero-copy, no NUMA penalty β”‚
β”‚                  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜                                β”‚
β”‚                                                                 β”‚
β”‚  2. Low Core Count = Simple Scheduling                          β”‚
β”‚     - 3 cores = minimal context switch overhead                 β”‚
β”‚     - No cross-CCD latency (unlike AMD Ryzen)                   β”‚
β”‚     - macOS scheduler highly optimized for Apple Silicon        β”‚
β”‚                                                                 β”‚
β”‚  3. High IPC (Instructions Per Clock)                           β”‚
β”‚     - M1 IPC rivals desktop CPUs despite lower clock            β”‚
β”‚     - Wide execution units (8-wide decode)                      β”‚
β”‚                                                                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

UMA: Double-Edged Sword

                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚   Unified Memory Architecture   β”‚
                    β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
                    β”‚   Advantages   β”‚ Disadvantages  β”‚
                    β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
                    β”‚ + No NUMA      β”‚ - Shared       β”‚
                    β”‚   penalties    β”‚   bandwidth    β”‚
                    β”‚                β”‚                β”‚
                    β”‚ + Zero-copy    β”‚ - CPU & GPU    β”‚
                    β”‚   GPU <-> CPU  β”‚   compete      β”‚
                    β”‚                β”‚                β”‚
                    β”‚ + Simplified   β”‚ - Limited by   β”‚
                    β”‚   programming  β”‚   single pool  β”‚
                    β”‚                β”‚                β”‚
                    β”‚ + Lower power  β”‚ - Can't add    β”‚
                    β”‚   consumption  β”‚   more RAM     β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸ”¬ 16x16 and 25x25 Puzzle Benchmarks

Test Configuration: AMD Ryzen 9 9950X3D (32T)

Puzzle Complexity Comparison
════════════════════════════════════════════════════════════════

 Size   β”‚ Cells β”‚ Empty β”‚ Search Space β”‚ Recommended Runs β”‚
────────┼───────┼───────┼──────────────┼───────────────────
  9x9   β”‚   81  β”‚  ~51  β”‚    ~10^20    β”‚    5000-10000    β”‚
 16x16  β”‚  256  β”‚ ~220  β”‚    ~10^80    β”‚    1000-5000     β”‚
 25x25  β”‚  625  β”‚ ~500  β”‚   ~10^200+   β”‚     50-200       β”‚

════════════════════════════════════════════════════════════════

16x16 Performance (320,000 total solves)

Algorithm System Throughput Speedup Efficiency
DLX 9950X3D (32T) 23,075 /s 19.87x 62.10%
DLX 5800X (16T) 7,558 /s 9.16x 57.27%
Backtracking 9950X3D (32T) 9,393 /s 15.83x 49.48%
Backtracking 5800X (16T) 2,731 /s 7.03x 43.96%

25x25 Performance (50 runs x 32 workers)

25x25 Mega Puzzle - Extreme Benchmark
────────────────────────────────────────────────────
Algorithm       β”‚ Throughput β”‚ Speedup β”‚ Efficiency
────────────────┼────────────┼─────────┼───────────
Backtracking    β”‚   794/s    β”‚ 14.95x  β”‚  46.71%
DLX             β”‚   877/s    β”‚ 17.26x  β”‚  53.95%     ⭐
────────────────────────────────────────────────────
Note: DLX is 10% faster on 25x25 puzzles

πŸ“‹ Summary: Algorithm Selection Guide

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                  When to Use Each Algorithm                     β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                                 β”‚
β”‚  Use DLX when:                                                  β”‚
β”‚  β”œβ”€ Puzzle size β‰₯ 16x16                                         β”‚
β”‚  β”œβ”€ Multi-threaded execution                                    β”‚
β”‚  β”œβ”€ Finding ALL solutions                                       β”‚
β”‚  β”œβ”€ Running on Linux/GCC                                        β”‚
β”‚  └─ Consistency across platforms is important                   β”‚
β”‚                                                                 β”‚
β”‚  Use Backtracking when:                                         β”‚
β”‚  β”œβ”€ Puzzle size = 9x9                                           β”‚
β”‚  β”œβ”€ Single-threaded execution                                   β”‚
β”‚  β”œβ”€ Only need ONE solution                                      β”‚
β”‚  β”œβ”€ Running on Windows/MSVC                                     β”‚
β”‚  └─ Memory efficiency is critical                               β”‚
β”‚                                                                 β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

🌐 Platform Support Matrix

Platform Solver Benchmark OCR System Detection
Windows x64 βœ… βœ… βœ… βœ… Full
Linux x64 βœ… βœ… βœ… βœ… Full
macOS (Intel) βœ… βœ… βœ… βœ… Full
macOS (Apple Silicon) βœ… βœ… βœ… βœ… Full (ref clock)
Android ARM64 βœ… βœ… ❌ βœ… Full (SoC detection)

Platform-Specific Notes

  • Windows: Best Backtracking performance due to MSVC optimizations
  • Linux: Best DLX performance due to GCC pointer optimizations
  • macOS: Excellent efficiency due to UMA and optimized scheduler
  • Android: Full SoC detection (Snapdragon, Dimensity, Exynos, Tensor)

πŸ“ JSON Format

Array Format

{
  "puzzle": [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    [6, 0, 0, 1, 9, 5, 0, 0, 0],
    ...
  ]
}

String Format

{
  "puzzle": "530070000600195000098000060800060003400803001700020006060000280000419005000080079"
}

With Metadata (Optional)

{
  "name": "Easy 9x9 Puzzle",
  "difficulty": "easy",
  "size": 9,
  "box_rows": 3,
  "box_cols": 3,
  "grid": [
    [5, 3, 0, 0, 7, 0, 0, 0, 0],
    ...
  ]
}

When metadata is present, it will be displayed:

Input Puzzle:
...
Name: Easy 9x9 Puzzle
Difficulty: easy
Size: 9x9
Empty cells: 51

πŸ—οΈ Project Structure

sudoku-solver/
β”œβ”€β”€ include/
β”‚   β”œβ”€β”€ board.hpp            # Board representation
β”‚   β”œβ”€β”€ solver.hpp           # Solver interface
β”‚   β”œβ”€β”€ solver_dlx.hpp       # DLX implementation
β”‚   β”œβ”€β”€ solver_backtrack.hpp # Backtracking implementation
β”‚   β”œβ”€β”€ ocr_processor.hpp    # OCR processing (conditional)
β”‚   β”œβ”€β”€ benchmark.hpp        # Benchmarking utilities
β”‚   β”œβ”€β”€ system_info.hpp      # Cross-platform system detection
β”‚   β”œβ”€β”€ json_handler.hpp     # JSON I/O
β”‚   └── types.hpp            # Common types
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ main.cpp             # Entry point + built-in puzzles
β”‚   β”œβ”€β”€ board.cpp
β”‚   β”œβ”€β”€ solver_dlx.cpp
β”‚   β”œβ”€β”€ solver_backtrack.cpp
β”‚   β”œβ”€β”€ ocr_processor.cpp    # Conditionally compiled
β”‚   β”œβ”€β”€ benchmark.cpp
β”‚   β”œβ”€β”€ system_info.cpp      # Platform-specific detection
β”‚   └── json_handler.cpp
β”œβ”€β”€ .github/
β”‚   └── workflows/           # CI/CD pipelines
β”œβ”€β”€ CMakeLists.txt
β”œβ”€β”€ CMakePresets.json        # Cross-platform build presets
β”œβ”€β”€ CMakeSettings.json       # Visual Studio settings
β”œβ”€β”€ vcpkg.json               # Dependencies
└── README.md

πŸ“¦ Dependencies

Managed via vcpkg:

Package Purpose Platforms
OpenCV Image processing All
Tesseract OCR engine Windows/Linux/macOS
nlohmann/json JSON parsing All
CLI11 Command line parsing All

πŸ”§ Building

Using CMake Presets (Recommended)

# List available presets
cmake --list-presets

# Configure and build
cmake --preset <preset-name>
cmake --build --preset <preset-name>

Manual Build

cmake -B build -S . \
  -DCMAKE_TOOLCHAIN_FILE=$VCPKG_ROOT/scripts/buildsystems/vcpkg.cmake \
  -DCMAKE_BUILD_TYPE=Release

cmake --build build --config Release

Build Options

Option Default Description
ENABLE_OCR ON Enable Tesseract OCR support
ENABLE_OPENMP OFF Enable OpenMP parallelization

πŸ“œ License

MIT License

πŸ‘₯ Authors

  • AllenK
  • Kwyshell

πŸ™ Acknowledgments

  • Donald Knuth for the Dancing Links algorithm
  • The Tesseract OCR project
  • OpenCV community
  • vcpkg maintainers

Related