
Phase A: Vulkan Compute Backend for VSL Matrix/Vector #237

@ulises-jeremias

Description


Motivation

VSL (V Scientific Library) is the low-level linear algebra foundation for VTL. VSL provides vsl.la.Matrix (column-major, f64), BLAS/LAPACK wrappers, and VCL (OpenCL) for data transport. For GPU acceleration to work end-to-end, VSL must provide GPU-accelerated gemm, matmul, and element-wise operations that VTL's la/ module calls.

This issue is the VSL counterpart of VTL issue #58 (Phase 1: Vulkan Compute Foundation).

VSL's Role in the GPU Architecture

VTL's la/la.v converts Tensor[T] -> []f64 -> vsl.la.Matrix, calls VSL, then converts back. When VTL calls GPU-accelerated operations:

  • VTL's compute/gemm_vulkan is used directly for Tensor[T]
  • VSL's la.vulkan must also support vsl.la.Matrix for:
    • VSL's own operations (la.gemm, la.matmul)
    • VTL's la/ module when it uses VSL as the backend

Why Vulkan for VSL

VSL uses Vulkan Compute as the primary GPU backend (same as VTL). Benefits:

  • Cross-vendor: Works on NVIDIA, AMD, Intel, ARM Mali/Adreno
  • No proprietary SDK: unlike CUDA, which runs only on NVIDIA hardware
  • Same code path: VTL and VSL share the same Vulkan compute kernels
  • SPIR-V: Unified GPU IR across all vendors

Reference Repositories

The V ecosystem has existing GPU/compute infrastructure used as reference:

  • antono2/vulkan — Full raw Vulkan 1.0–1.4 bindings (~1.3 MB). Use as the pattern for VSL's own Vulkan bindings: C→V type mapping, struct layout, handle definitions, API function signatures. MIT licensed.
  • antono2/v_vulkan_bindings — Python generator: Khronos XML → V code. Fork to generate a vsl-specific Vulkan bindings subset. MIT licensed.
  • antono2/vulkan_memory_allocator — Pool-based GPU memory allocator. Use as architectural reference for VSL's memory allocator. MIT licensed.
  • vsl.vcl — Mature OpenCL wrapper. Use as pattern for VSL's compute abstraction.

Implementation Pattern: Self-Contained Wrappers

Decision: VSL maintains its own Vulkan bindings within vsl/vulkan/, following the same pattern as BLAS, LAPACK, and VCL. Use antono2/vulkan, antono2/v_vulkan_bindings, and antono2/vulkan_memory_allocator as reference implementations.

vsl/
├── vulkan/            ← Vulkan bindings (self-contained, like vcl/)
│   ├── vk.c.v         ← C function declarations
│   ├── vk.ctypes.v    ← C type definitions
│   ├── vk.device.v    ← Device, physical device, instance
│   ├── vk.buffer.v    ← Buffer creation, memory binding
│   ├── vk.memory.v    ← Memory allocation, map/unmap
│   ├── vk.shader.v    ← ShaderModule from SPIR-V
│   ├── vk.pipeline.v  ← Compute pipeline, pipeline layout
│   ├── vk.descriptor.v ← Descriptor set layout, pool
│   ├── vk.command.v   ← Command buffer, submit, wait
│   └── vk.kernels.v   ← GLSL compute shader sources
└── compute/           ← VSL compute abstraction

This keeps VSL self-contained and gives full control over the Vulkan API surface.

Scope

Files to create

  1. vsl/vulkan/ — VSL's own Vulkan bindings directory
    • vsl/vulkan/vk.c.v — C function declarations (fn C.vkCreateInstance(...), fn C.vkCmdDispatch(...), etc.)
    • vsl/vulkan/vk.ctypes.v — C type definitions (VkInstance, VkDevice, VkBuffer, VkDeviceMemory, etc.)
    • vsl/vulkan/vk.device.v — Device discovery, instance, physical device, logical device, queue
    • vsl/vulkan/vk.buffer.v — Buffer creation, memory binding
    • vsl/vulkan/vk.memory.v — Memory allocation, host<->device mapping
    • vsl/vulkan/vk.shader.v — Shader module creation from SPIR-V
    • vsl/vulkan/vk.pipeline.v — Compute pipeline, pipeline layout
    • vsl/vulkan/vk.descriptor.v — Descriptor set layout, pool, allocation
    • vsl/vulkan/vk.command.v — Command buffer, submit, wait
    • vsl/vulkan/vk.kernels.v — GLSL compute shader sources as V string constants
  2. vsl/compute/ — VSL compute abstraction directory
    • vsl/compute/gemm.v — GPU GEMM dispatcher (routes to Vulkan/CUDA/VCL/BLAS)
    • vsl/compute/elementwise.v — GPU element-wise ops dispatcher
    • vsl/compute/broadcast.v — GPU broadcast ops dispatcher
  3. vsl/la/vulkan.v — Vulkan GEMM for vsl.la.Matrix

Files to modify

  • vsl/la/la.v — Update gemm, matmul to dispatch to GPU via compile-time flags ($if vulkan ? { ... })
  • vsl/vcl/vector.c.v — Add methods to extract vcl.Vector[f64] from vsl.la.Matrix

API Contracts

// VSL Matrix GPU dispatch
pub fn gemm_gpu(a vsl.la.Matrix, b vsl.la.Matrix, alpha f64, beta f64) !vsl.la.Matrix

// Vulkan GEMM for VSL Matrix
pub fn gemm_vulkan(a vsl.la.Matrix, b vsl.la.Matrix) !vsl.la.Matrix

// Element-wise ops on VSL Matrix
pub fn relu_vulkan(x vsl.la.Matrix) !vsl.la.Matrix
pub fn sigmoid_vulkan(x vsl.la.Matrix) !vsl.la.Matrix
pub fn tanh_vulkan(x vsl.la.Matrix) !vsl.la.Matrix

// Broadcast ops
pub fn add_vulkan(a vsl.la.Matrix, b vsl.la.Matrix) !vsl.la.Matrix
pub fn mul_vulkan(a vsl.la.Matrix, b vsl.la.Matrix) !vsl.la.Matrix
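
The gemm_* contracts above must reproduce the standard BLAS semantics C = alpha·A·B + beta·C on column-major f64 storage. As a hedged sketch of that reference behavior (plain C, not VSL code; `gemm_ref` is an illustrative name), the computation a Vulkan kernel has to match is:

```c
#include <assert.h>
#include <stddef.h>

/* Reference GEMM on column-major f64 storage: C = alpha*A*B + beta*C.
 * A is m x k, B is k x n, C is m x n; element (i, j) lives at i + j*lda,
 * with lda equal to the number of rows (column-major leading dimension). */
static void gemm_ref(size_t m, size_t n, size_t k,
                     double alpha, const double *a, const double *b,
                     double beta, double *c) {
    for (size_t j = 0; j < n; j++) {
        for (size_t i = 0; i < m; i++) {
            double acc = 0.0;
            for (size_t p = 0; p < k; p++) {
                acc += a[i + p * m] * b[p + j * k];
            }
            c[i + j * m] = alpha * acc + beta * c[i + j * m];
        }
    }
}
```

The unit tests below compare gemm_vulkan output against exactly this kind of CPU reference.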

Integration with VTL

VTL's la/la.v needs a path that calls VSL GPU compute:

// vtl/la/la.v — updated for GPU
$if vulkan ? {
    return vsl_compute.gemm_vulkan(a, b)!
} $else $if cuda ? {
    return vsl_compute.gemm_cuda(a, b)!
} $else $if vcl ? {
    return vsl_compute.gemm_vcl(a, b)!
} $else {
    return vsl.la.gemm(a, b)!
}

SPIR-V Compilation

Same strategy as VTL Phase 1:

  • Kernels as V string constants in vsl/vulkan/vk.kernels.v
  • Compile to SPIR-V via glslangValidator -V <kernel>.comp -o <kernel>.spv
  • Load into Vulkan via vkCreateShaderModule
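
vkCreateShaderModule requires the code size to be a multiple of 4 bytes, and every valid SPIR-V module starts with the magic word 0x07230203. A cheap pre-flight check before handing the blob to Vulkan can catch build or embedding mistakes early (a hedged sketch; `spirv_looks_valid` is a hypothetical helper, not part of the planned vk/ API):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SPIR-V magic number: the first 32-bit word of every valid module. */
#define SPIRV_MAGIC 0x07230203u

/* Minimal sanity check before vkCreateShaderModule: the blob must be a
 * whole number of 32-bit words and begin with the SPIR-V magic word. */
static int spirv_looks_valid(const uint32_t *code, size_t size_bytes) {
    if (size_bytes < sizeof(uint32_t) || size_bytes % sizeof(uint32_t) != 0) {
        return 0;
    }
    return code[0] == SPIRV_MAGIC;
}
```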

VSL-specific Considerations

  • Matrix layout: VSL vsl.la.Matrix is column-major (Fortran style). Vulkan kernels must account for this:
    • LDA = num_rows (column stride = number of rows)
    • Index: A[i + j * lda] where i is row, j is column
  • Type: VSL operates on f64 only (no generic T). Simplifies kernel variants.
  • No views/slices: Unlike VTL's Tensor[T], VSL's Matrix does not have views — all operations allocate new matrices. GPU storage model is simpler.
  • VCL conflict: VSL already has vsl.vcl module. The Vulkan backend (vk) should coexist with VCL — they are separate backends, not competing.
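
The column-major indexing rule above (A[i + j * lda] with lda = num_rows) is the one every GLSL kernel must bake into its address arithmetic. A hedged C sketch of the mapping (the helper name is illustrative):

```c
#include <assert.h>
#include <stddef.h>

/* Column-major (Fortran-style) linear index for element (i, j) of a
 * matrix whose leading dimension lda equals its number of rows,
 * matching the vsl.la.Matrix storage layout. */
static size_t col_major_index(size_t i, size_t j, size_t lda) {
    return i + j * lda;
}
```

Note that columns are contiguous: in a 3x2 matrix, element (0, 1) immediately follows (2, 0) in memory.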

Testing Plan

  • Unit tests: gemm_vulkan produces same output as la.gemm (within FP64 tolerance)
  • Integration: VTL la.matmul produces same output with Vulkan backend
  • VSL: Matrix.vulkan() / .cpu() round-trip preserves data
  • Benchmark: Vulkan GEMM achieves >= 80% of the device's theoretical FP64 peak throughput
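
The "within FP64 tolerance" check in the unit tests can use a combined absolute/relative criterion so it scales with magnitude. A hedged sketch (the helper name and thresholds are illustrative, not VSL's actual test tolerances):

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Element-wise comparison with combined absolute/relative tolerance,
 * suitable for checking gemm_vulkan output against la.gemm:
 * |got - want| <= atol + rtol * |want| must hold for every element. */
static int matrices_close(const double *got, const double *want, size_t n,
                          double atol, double rtol) {
    for (size_t i = 0; i < n; i++) {
        if (fabs(got[i] - want[i]) > atol + rtol * fabs(want[i])) {
            return 0;
        }
    }
    return 1;
}
```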

Dependencies

  • vsl/vulkan/ module (new — needs to be written first)
  • glslangValidator in PATH for SPIR-V compilation
  • VSL la/ module (existing)

Checklist

  • vsl/vulkan/ directory and Vulkan bindings (vk.c.v, vk.ctypes.v)
  • vsl/vulkan/vk.device.v: device discovery, instance, physical device, queue
  • vsl/vulkan/vk.buffer.v: buffer creation, memory binding
  • vsl/vulkan/vk.memory.v: memory allocation, host<->device mapping
  • vsl/vulkan/vk.shader.v: shader module creation from SPIR-V
  • vsl/vulkan/vk.pipeline.v: compute pipeline, pipeline layout
  • vsl/vulkan/vk.descriptor.v: descriptor set layout, pool, allocation
  • vsl/vulkan/vk.command.v: command buffer, submit, wait
  • vsl/vulkan/vk.kernels.v: GLSL compute shader sources
  • vsl/compute/ directory and dispatcher modules
  • vsl/compute/gemm.v: GPU dispatch for GEMM
  • vsl/compute/elementwise.v: GPU dispatch for element-wise
  • vsl/compute/broadcast.v: GPU dispatch for broadcast
  • vsl/la/vulkan.v: Vulkan GEMM for vsl.la.Matrix
  • vsl/la/la.v: update gemm/matmul to dispatch to Vulkan
  • vsl/vcl/vector.c.v: add methods for vcl.Vector extraction
  • Tests: consistency across Vulkan / CPU
  • Integration test: VTL la.matmul with Vulkan backend
  • Benchmark: Vulkan GEMM vs CPU BLAS

Related: VTL #58 (Phase 1: Vulkan Compute Foundation)
Parent: #236
Labels: enhancement, gpu, phase-a, vulkan
