GPU Utilities Library (e_jerk_gpu)

A shared Zig library providing GPU detection, capability querying, and intelligent backend selection for GPU-accelerated Unix utilities.

Features

  • GPU Detection: Query Metal and Vulkan device capabilities at runtime
  • Performance Scoring: Classify GPUs into tiers (Entry/Mid/High/Ultra)
  • Auto-Selection: Intelligent backend selection based on workload characteristics
  • Swappable Backends: Abstract interface for pluggable compute backends

Supported Backends

Backend | Platform | Status
Metal | macOS | Supported
Vulkan | Linux, macOS (via MoltenVK) | Supported
CUDA | Linux, Windows | Planned
OpenCL | Cross-platform | Planned

Installation

Add as a dependency using zig fetch:

zig fetch --save "git+https://github.com/e-jerk/gpu#v0.1.0"

Usage

Import and use:

const gpu = @import("e_jerk_gpu");

// Detect GPU capabilities
const caps = gpu.metal.detectCapabilities() orelse gpu.GpuCapabilities.default;

// Create auto-selector with hardware capabilities
var selector = gpu.AutoSelector.initWithCapabilities(caps);

// Select backend for workload
const result = selector.select(.{
    .data_size = file_size,
    .compute_intensity = 0.7,
});

if (result.backend == .metal) {
    // Use Metal backend
} else if (result.backend == .vulkan) {
    // Use Vulkan backend
} else {
    // Use CPU fallback
}

Architecture

GpuCapabilities (capabilities.zig)

Runtime-detected GPU attributes:

pub const GpuCapabilities = struct {
    max_threads_per_group: u32,    // Max threads per threadgroup/workgroup
    max_buffer_size: u64,          // Maximum buffer size in bytes
    recommended_memory: u64,       // Device local memory size
    is_discrete: bool,             // Discrete vs integrated GPU
    device_type: DeviceType,       // discrete/integrated/virtual/cpu/other
    device_name: ?[]const u8,      // Optional device name for logging
};

Performance Scoring:

  • performanceScore(): Returns a score of 0 or more, summed from the hardware bonuses below (which total at most 120)
  • Discrete GPU: +50 points
  • Thread count: +10/+20/+30 for 256/512/1024+ threads
  • Memory: +5/+10/+20/+30 for 2/4/8/16+ GB
  • Buffer size: +5/+10 for 1/4+ GB max buffer
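
The bonuses above can be sketched as a single function. This is a minimal illustration, not the library's implementation: the `Caps` struct is a hypothetical subset of `GpuCapabilities`, thresholds are assumed to be inclusive, and "GB" is assumed to mean GiB.

```zig
const std = @import("std");

const gib: u64 = 1024 * 1024 * 1024;

/// Hypothetical subset of GpuCapabilities: only the fields the score reads.
const Caps = struct {
    max_threads_per_group: u32,
    max_buffer_size: u64,
    recommended_memory: u64,
    is_discrete: bool,
};

/// Sums the bonuses from the list above. Thresholds are assumed inclusive.
fn performanceScore(caps: Caps) u32 {
    var score: u32 = 0;

    if (caps.is_discrete) score += 50;

    const threads = caps.max_threads_per_group;
    if (threads >= 1024) {
        score += 30;
    } else if (threads >= 512) {
        score += 20;
    } else if (threads >= 256) {
        score += 10;
    }

    const mem = caps.recommended_memory;
    if (mem >= 16 * gib) {
        score += 30;
    } else if (mem >= 8 * gib) {
        score += 20;
    } else if (mem >= 4 * gib) {
        score += 10;
    } else if (mem >= 2 * gib) {
        score += 5;
    }

    const buf = caps.max_buffer_size;
    if (buf >= 4 * gib) {
        score += 10;
    } else if (buf >= gib) {
        score += 5;
    }

    return score;
}

test "integrated GPU with 1024 threads, 8 GiB memory, 2 GiB buffers" {
    const caps = Caps{
        .max_threads_per_group = 1024,
        .max_buffer_size = 2 * gib,
        .recommended_memory = 8 * gib,
        .is_discrete = false,
    };
    // 0 (integrated) + 30 (threads) + 20 (memory) + 5 (buffer)
    try std.testing.expectEqual(@as(u32, 55), performanceScore(caps));
}
```

Under these assumptions a discrete GPU with every top bonus reaches 50 + 30 + 30 + 10 = 120.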

Tier Classification:

  • Entry (0-39): Intel/AMD integrated, base Apple Silicon
  • Mid (40-69): Apple Silicon Pro, mainstream discrete
  • High (70-99): Apple Silicon Max, high-end discrete
  • Ultra (100+): Apple Silicon Ultra, workstation GPUs
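
The tier ranges above map directly onto a small lookup. The sketch below assumes a hypothetical `Tier` enum and `tierForScore` helper; the library's own names for these are not shown in this README.

```zig
const std = @import("std");

/// Hypothetical tier enum mirroring the four tiers listed above.
const Tier = enum { entry, mid, high, ultra };

/// Maps a performance score to a tier using the documented ranges.
fn tierForScore(score: u32) Tier {
    if (score >= 100) return .ultra;
    if (score >= 70) return .high;
    if (score >= 40) return .mid;
    return .entry;
}

test "tier boundaries" {
    try std.testing.expectEqual(Tier.entry, tierForScore(39));
    try std.testing.expectEqual(Tier.mid, tierForScore(40));
    try std.testing.expectEqual(Tier.high, tierForScore(70));
    try std.testing.expectEqual(Tier.ultra, tierForScore(100));
}
```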

AutoSelector (auto_select.zig)

Intelligent backend selection based on workload characteristics:

pub const WorkloadInfo = struct {
    data_size: usize,              // Bytes to process
    compute_intensity: f32,        // 0.0 (trivial) to 1.0 (intensive)
    parallelizable: bool,          // Benefits from parallelism?
    memory_bound: bool,            // Memory-bound vs compute-bound
    batch_size: usize,             // Number of independent operations
    custom_bias: i32,              // Manual adjustment (-10 to +10)
};

Selection Algorithm:

  1. Hard Limits: Reject GPU if data < min_gpu_size or > max_gpu_size
  2. Parallelism Score: +3 for parallelizable, -5 for sequential
  3. Data Size Score: +1 for ≥1MB, +2 for ≥4MB
  4. Compute Intensity: +4 for ≥0.8, +2 for ≥0.5, -2 for <0.2
  5. Memory Bound: ±1 based on data size
  6. Batch Size: +1 for ≥10, +3 for ≥100
  7. Hardware Bias: +4 (ultra) to -2 (entry) from GPU tier
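
The steps above can be sketched as one scoring function. This is an illustration under stated assumptions, not the code in auto_select.zig: `Workload`, `Limits`, and `gpuScore` are hypothetical names, the field defaults are invented for the example, the larger bonus in each step is assumed to replace the smaller one, "MB" is assumed to mean MiB, and step 5 is omitted because its exact rule is not given here.

```zig
const std = @import("std");

const mib: usize = 1024 * 1024;

/// Hypothetical, reduced workload description; the defaults are assumptions.
const Workload = struct {
    data_size: usize,
    compute_intensity: f32,
    parallelizable: bool = true,
    batch_size: usize = 1,
};

/// Hypothetical limits, named after the AutoSelectConfig fields.
const Limits = struct {
    min_gpu_size: usize,
    max_gpu_size: usize,
    gpu_bias: i32 = 0,
};

/// Returns null when a hard limit rules the GPU out, otherwise the summed score.
fn gpuScore(w: Workload, limits: Limits) ?i32 {
    // Step 1: hard limits.
    if (w.data_size < limits.min_gpu_size or w.data_size > limits.max_gpu_size) return null;

    var score: i32 = 0;

    // Step 2: parallelism.
    if (w.parallelizable) {
        score += 3;
    } else {
        score -= 5;
    }

    // Step 3: data size.
    if (w.data_size >= 4 * mib) {
        score += 2;
    } else if (w.data_size >= mib) {
        score += 1;
    }

    // Step 4: compute intensity.
    if (w.compute_intensity >= 0.8) {
        score += 4;
    } else if (w.compute_intensity >= 0.5) {
        score += 2;
    } else if (w.compute_intensity < 0.2) {
        score -= 2;
    }

    // Step 5 (memory bound) is left out: its rule is not specified in the README.

    // Step 6: batch size.
    if (w.batch_size >= 100) {
        score += 3;
    } else if (w.batch_size >= 10) {
        score += 1;
    }

    // Step 7: hardware bias from the GPU tier.
    return score + limits.gpu_bias;
}

test "parallel 8 MiB workload at intensity 0.7" {
    const limits = Limits{ .min_gpu_size = 128 * 1024, .max_gpu_size = 32 * mib };
    const w = Workload{ .data_size = 8 * mib, .compute_intensity = 0.7 };
    // 3 (parallel) + 2 (>= 4 MiB) + 2 (intensity >= 0.5)
    try std.testing.expectEqual(@as(?i32, 7), gpuScore(w, limits));
}
```

A positive result favors the GPU and a negative one favors the CPU, matching the `gpu_score` field of `SelectionResult`.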

Threshold Adjustments by Tier:

Tier | Min GPU Size | GPU Bias
Ultra | 32KB | +4
High | 64KB | +2
Mid | 128KB | 0
Entry | 256KB | -2
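
The table translates into a per-tier lookup. The sketch below uses hypothetical `Tier`, `TierThresholds`, and `thresholdsForTier` names and assumes "KB" means KiB; it only restates the table values.

```zig
const std = @import("std");

/// Hypothetical tier enum mirroring the four tiers in the table.
const Tier = enum { entry, mid, high, ultra };

/// Hypothetical pair of per-tier settings taken from the table above.
const TierThresholds = struct {
    min_gpu_size: usize,
    gpu_bias: i32,
};

/// Returns the minimum GPU data size and bias documented for a tier.
fn thresholdsForTier(tier: Tier) TierThresholds {
    return switch (tier) {
        .ultra => .{ .min_gpu_size = 32 * 1024, .gpu_bias = 4 },
        .high => .{ .min_gpu_size = 64 * 1024, .gpu_bias = 2 },
        .mid => .{ .min_gpu_size = 128 * 1024, .gpu_bias = 0 },
        .entry => .{ .min_gpu_size = 256 * 1024, .gpu_bias = -2 },
    };
}

test "entry tier needs more data before the GPU is considered" {
    const entry = thresholdsForTier(.entry);
    const ultra = thresholdsForTier(.ultra);
    try std.testing.expect(entry.min_gpu_size > ultra.min_gpu_size);
    try std.testing.expect(entry.gpu_bias < ultra.gpu_bias);
}
```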

Backend Detection

Metal (metal.zig):

  • Queries MTLDevice properties on macOS
  • Detects Apple Silicon vs Intel/AMD
  • Reports recommendedWorkingSetSize, maxThreadsPerThreadgroup

Vulkan (vulkan.zig):

  • Queries VkPhysicalDeviceProperties and VkPhysicalDeviceMemoryProperties
  • Detects device type (discrete, integrated, etc.)
  • Reports maxComputeWorkGroupInvocations, maxStorageBufferRange
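
As an illustration of the device-type detection step, the sketch below maps the raw `VkPhysicalDeviceType` value from `VkPhysicalDeviceProperties.deviceType` onto the five device types named in `GpuCapabilities`. The numeric values are those defined by the Vulkan specification; the `DeviceType` enum layout and the `deviceTypeFromVulkan` helper are assumptions, not the library's code.

```zig
const std = @import("std");

/// Hypothetical enum with the five device types listed for GpuCapabilities.
const DeviceType = enum { discrete, integrated, virtual, cpu, other };

/// Maps a raw VkPhysicalDeviceType value to a DeviceType.
fn deviceTypeFromVulkan(raw: u32) DeviceType {
    return switch (raw) {
        1 => .integrated, // VK_PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
        2 => .discrete, // VK_PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
        3 => .virtual, // VK_PHYSICAL_DEVICE_TYPE_VIRTUAL_GPU
        4 => .cpu, // VK_PHYSICAL_DEVICE_TYPE_CPU
        else => .other, // VK_PHYSICAL_DEVICE_TYPE_OTHER (0) and unknown values
    };
}

test "discrete GPUs are recognized" {
    try std.testing.expectEqual(DeviceType.discrete, deviceTypeFromVulkan(2));
}
```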

Swappable Backend Interface (backend.zig)

Abstract interface for compute backends:

pub const ComputeBackend = struct {
    pub const VTable = struct {
        init: *const fn(allocator: Allocator) anyerror!*anyopaque,
        deinit: *const fn(self: *anyopaque) void,
        createBuffer: *const fn(self: *anyopaque, size: usize) anyerror!BufferHandle,
        execute: *const fn(self: *anyopaque, ...) anyerror!ComputeResult,
        // ...
    };
};

pub const BackendRegistry = struct {
    backends: std.ArrayList(ComputeBackend),

    pub fn register(self: *@This(), backend: ComputeBackend) !void;
    pub fn getBestBackend(self: @This()) ?ComputeBackend;
};
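
To show how a vtable-based interface like this is typically implemented in Zig, here is a reduced, self-contained sketch of the same pattern. `Backend`, `CpuBackend`, and the `name`/`sum` operations are hypothetical stand-ins chosen for brevity; they are not the library's `ComputeBackend` API. The sketch assumes Zig 0.11 or later for the `@ptrCast(@alignCast(...))` form.

```zig
const std = @import("std");

/// Hypothetical, reduced interface: a type-erased pointer plus a vtable.
const Backend = struct {
    ptr: *anyopaque,
    vtable: *const VTable,

    const VTable = struct {
        name: *const fn (self: *anyopaque) []const u8,
        sum: *const fn (self: *anyopaque, data: []const u32) u64,
    };

    /// Forwards to the concrete backend's name function.
    fn name(self: Backend) []const u8 {
        return self.vtable.name(self.ptr);
    }

    /// Forwards to the concrete backend's sum function.
    fn sum(self: Backend, data: []const u32) u64 {
        return self.vtable.sum(self.ptr, data);
    }
};

/// Hypothetical CPU implementation that also counts how often it was called.
const CpuBackend = struct {
    calls: usize = 0,

    const vtable = Backend.VTable{ .name = nameImpl, .sum = sumImpl };

    /// Wraps this instance in the type-erased interface.
    fn backend(self: *CpuBackend) Backend {
        return .{ .ptr = self, .vtable = &vtable };
    }

    fn nameImpl(_: *anyopaque) []const u8 {
        return "cpu";
    }

    fn sumImpl(ctx: *anyopaque, data: []const u32) u64 {
        // Recover the concrete type from the type-erased pointer.
        const self: *CpuBackend = @ptrCast(@alignCast(ctx));
        self.calls += 1;
        var total: u64 = 0;
        for (data) |x| total += x;
        return total;
    }
};

test "calling the CPU backend through the interface" {
    var cpu = CpuBackend{};
    const b = cpu.backend();
    try std.testing.expectEqualStrings("cpu", b.name());
    try std.testing.expectEqual(@as(u64, 6), b.sum(&[_]u32{ 1, 2, 3 }));
    try std.testing.expectEqual(@as(usize, 1), cpu.calls);
}
```

Callers hold only a `Backend` value, so a registry can store Metal, Vulkan, and CPU implementations side by side without knowing their concrete types.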

API Reference

Quick Selection

// Simple one-liner for common cases
const backend = gpu.quickSelect(
    file_size,           // Data size in bytes
    0.7,                 // Compute intensity
    detected_caps,       // Optional capabilities
);

Custom Configuration

var config = gpu.AutoSelectConfig{
    .min_gpu_size = 64 * 1024,      // 64KB minimum
    .max_gpu_size = 32 * 1024 * 1024, // 32MB maximum
    .gpu_bias = 2,                   // Prefer GPU
    .preferred_gpu_backend = .metal, // Prefer Metal over Vulkan
};

var selector = gpu.AutoSelector.initWithConfig(config);
const result = selector.select(workload);

Selection Result

pub const SelectionResult = struct {
    backend: Backend,      // .cpu, .metal, .vulkan, etc.
    gpu_score: i32,        // Positive = GPU better, negative = CPU better
    reason: []const u8,    // Human-readable explanation
};

Integration Example

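const std = @import("std");
const gpu = @import("e_jerk_gpu");

// Options, getFileSize, and the processWith* functions are application-defined.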

pub fn processFile(path: []const u8, options: Options) !void {
    const file_size = try getFileSize(path);

    // Detect hardware
    const caps = gpu.metal.detectCapabilities() orelse
                 gpu.vulkan.detectCapabilities() orelse
                 gpu.GpuCapabilities.default;

    // Create selector
    var selector = gpu.AutoSelector.initWithCapabilities(caps);

    // Describe workload
    const workload = gpu.WorkloadInfo{
        .data_size = file_size,
        .compute_intensity = if (options.case_insensitive) 0.8 else 0.5,
        .parallelizable = true,
    };

    // Select backend
    const result = selector.select(workload);

    if (options.verbose) {
        std.debug.print("Backend: {s} (score: {d}, reason: {s})\n",
            .{@tagName(result.backend), result.gpu_score, result.reason});
    }

    // Dispatch to appropriate implementation
    switch (result.backend) {
        .metal => try processWithMetal(path, options),
        .vulkan => try processWithVulkan(path, options),
        .cpu => try processWithCpu(path, options),
        else => try processWithCpu(path, options),
    }
}

License

  • Source code: Unlicense (public domain)
  • Binaries: GPL-3.0-or-later
