GPU Utilities Library (e_jerk_gpu)

A shared Zig library providing GPU detection, capability querying, and intelligent backend selection for GPU-accelerated Unix utilities.

Features

GPU Detection: Query Metal and Vulkan device capabilities at runtime
Performance Scoring: Classify GPUs into tiers (Entry/Mid/High/Ultra)
Auto-Selection: Intelligent backend selection based on workload characteristics
Swappable Backends: Abstract interface for pluggable compute backends

Supported Backends

Backend	Platform	Status
Metal	macOS	Supported
Vulkan	Linux, macOS (via MoltenVK)	Supported
CUDA	Linux, Windows	Planned
OpenCL	Cross-platform	Planned

Installation

Add as a dependency using zig fetch:

zig fetch --save "git+https://github.com/e-jerk/gpu#v0.1.0"

Usage

Import and use:

const gpu = @import("e_jerk_gpu");

// Detect GPU capabilities
const caps = gpu.metal.detectCapabilities() orelse gpu.GpuCapabilities.default;

// Create auto-selector with hardware capabilities
var selector = gpu.AutoSelector.initWithCapabilities(caps);

// Select backend for workload
const result = selector.select(.{
    .data_size = file_size,
    .compute_intensity = 0.7,
});

if (result.backend == .metal) {
    // Use Metal backend
} else if (result.backend == .vulkan) {
    // Use Vulkan backend
} else {
    // Use CPU fallback
}

Architecture

GpuCapabilities (`capabilities.zig`)

Runtime-detected GPU attributes:

pub const GpuCapabilities = struct {
    max_threads_per_group: u32,    // Max threads per threadgroup/workgroup
    max_buffer_size: u64,          // Maximum buffer size in bytes
    recommended_memory: u64,        // Device local memory size
    is_discrete: bool,             // Discrete vs integrated GPU
    device_type: DeviceType,       // discrete/integrated/virtual/cpu/other
    device_name: ?[]const u8,      // Optional device name for logging
};

Performance Scoring:

performanceScore(): Returns 0-120+ based on hardware attributes
Discrete GPU: +50 points
Thread count: +10/+20/+30 for 256/512/1024+ threads
Memory: +5/+10/+20/+30 for 2/4/8/16+ GB
Buffer size: +5/+10 for 1/4+ GB max buffer

Tier Classification:

Entry (0-39): Intel/AMD integrated, base Apple Silicon
Mid (40-69): Apple Silicon Pro, mainstream discrete
High (70-99): Apple Silicon Max, high-end discrete
Ultra (100+): Apple Silicon Ultra, workstation GPUs

AutoSelector (`auto_select.zig`)

Intelligent backend selection based on workload characteristics:

pub const WorkloadInfo = struct {
    data_size: usize,              // Bytes to process
    compute_intensity: f32,         // 0.0 (trivial) to 1.0 (intensive)
    parallelizable: bool,          // Benefits from parallelism?
    memory_bound: bool,            // Memory-bound vs compute-bound
    batch_size: usize,             // Number of independent operations
    custom_bias: i32,              // Manual adjustment (-10 to +10)
};

Selection Algorithm:

Hard Limits: Reject GPU if data < min_gpu_size or > max_gpu_size
Parallelism Score: +3 for parallelizable, -5 for sequential
Data Size Score: +1 for ≥1MB, +2 for ≥4MB
Compute Intensity: +4 for ≥0.8, +2 for ≥0.5, -2 for <0.2
Memory Bound: ±1 based on data size
Batch Size: +1 for ≥10, +3 for ≥100
Hardware Bias: +4 (ultra) to -2 (entry) from GPU tier

Threshold Adjustments by Tier:

Tier	Min GPU Size	GPU Bias
Ultra	32KB	+4
High	64KB	+2
Mid	128KB	0
Entry	256KB	-2

Backend Detection

Metal (metal.zig):

Queries MTLDevice properties on macOS
Detects Apple Silicon vs Intel/AMD
Reports recommendedWorkingSetSize, maxThreadsPerThreadgroup

Vulkan (vulkan.zig):

Queries VkPhysicalDeviceProperties and VkPhysicalDeviceMemoryProperties
Detects device type (discrete, integrated, etc.)
Reports maxComputeWorkGroupInvocations, maxStorageBufferRange

Swappable Backend Interface (`backend.zig`)

Abstract interface for compute backends:

pub const ComputeBackend = struct {
    pub const VTable = struct {
        init: *const fn(allocator: Allocator) anyerror!*anyopaque,
        deinit: *const fn(self: *anyopaque) void,
        createBuffer: *const fn(self: *anyopaque, size: usize) anyerror!BufferHandle,
        execute: *const fn(self: *anyopaque, ...) anyerror!ComputeResult,
        // ...
    };
};

pub const BackendRegistry = struct {
    backends: std.ArrayList(ComputeBackend),

    pub fn register(self: *@This(), backend: ComputeBackend) !void;
    pub fn getBestBackend(self: @This()) ?ComputeBackend;
};

API Reference

Quick Selection

// Simple one-liner for common cases
const backend = gpu.quickSelect(
    file_size,           // Data size in bytes
    0.7,                 // Compute intensity
    detected_caps,       // Optional capabilities
);

Custom Configuration

var config = gpu.AutoSelectConfig{
    .min_gpu_size = 64 * 1024,      // 64KB minimum
    .max_gpu_size = 32 * 1024 * 1024, // 32MB maximum
    .gpu_bias = 2,                   // Prefer GPU
    .preferred_gpu_backend = .metal, // Prefer Metal over Vulkan
};

var selector = gpu.AutoSelector.initWithConfig(config);
const result = selector.select(workload);

Selection Result

pub const SelectionResult = struct {
    backend: Backend,      // .cpu, .metal, .vulkan, etc.
    gpu_score: i32,        // Positive = GPU better, negative = CPU better
    reason: []const u8,    // Human-readable explanation
};

Integration Example

const gpu = @import("e_jerk_gpu");

pub fn processFile(path: []const u8, options: Options) !void {
    const file_size = try getFileSize(path);

    // Detect hardware
    const caps = gpu.metal.detectCapabilities() orelse
                 gpu.vulkan.detectCapabilities() orelse
                 gpu.GpuCapabilities.default;

    // Create selector
    var selector = gpu.AutoSelector.initWithCapabilities(caps);

    // Describe workload
    const workload = gpu.WorkloadInfo{
        .data_size = file_size,
        .compute_intensity = if (options.case_insensitive) 0.8 else 0.5,
        .parallelizable = true,
    };

    // Select backend
    const result = selector.select(workload);

    if (options.verbose) {
        std.debug.print("Backend: {s} (score: {d}, reason: {s})\n",
            .{@tagName(result.backend), result.gpu_score, result.reason});
    }

    // Dispatch to appropriate implementation
    switch (result.backend) {
        .metal => try processWithMetal(path, options),
        .vulkan => try processWithVulkan(path, options),
        .cpu => try processWithCpu(path, options),
        else => try processWithCpu(path, options),
    }
}

License

Source code: Unlicense (public domain) Binaries: GPL-3.0-or-later

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
.github/workflows		.github/workflows
libs/zigtrait		libs/zigtrait
src		src
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
build.zig		build.zig
build.zig.zon		build.zig.zon

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

GPU Utilities Library (e_jerk_gpu)

Features

Supported Backends

Installation

Usage

Architecture

GpuCapabilities (`capabilities.zig`)

AutoSelector (`auto_select.zig`)

Backend Detection

Swappable Backend Interface (`backend.zig`)

API Reference

Quick Selection

Custom Configuration

Selection Result

Integration Example

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

GPU Utilities Library (e_jerk_gpu)

Features

Supported Backends

Installation

Usage

Architecture

GpuCapabilities (capabilities.zig)

AutoSelector (auto_select.zig)

Backend Detection

Swappable Backend Interface (backend.zig)

API Reference

Quick Selection

Custom Configuration

Selection Result

Integration Example

License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

GpuCapabilities (`capabilities.zig`)

AutoSelector (`auto_select.zig`)

Swappable Backend Interface (`backend.zig`)

Packages