A safe, efficient array library with ndarray-compatible operations, designed as the foundation for high-performance computing backends in Rust.
- Safe & Ergonomic: Memory-safe array operations with Rust's guarantees
- Type Safe: Runtime shape and data type validation
- Backend Agnostic: NdArray trait enables multiple backends (CPU, GPU, remote)
- Extensible Types: Support for custom data types (BFloat16, quantized types)
- Zero Dependencies: Pure Rust implementation
use numina::{Array, Shape, add, matmul, sum, F32};
// Create arrays
let a = Array::from_slice(&[1.0f32, 2.0, 3.0, 4.0], Shape::from([2, 2]))?;
let b = Array::from_slice(&[5.0f32, 6.0, 7.0, 8.0], Shape::from([2, 2]))?;
// Operations work on any NdArray backend
let c = add(&a, &b)?; // Element-wise addition
let d = matmul(&a, &b)?; // Matrix multiplication
let total = sum(&a, None)?; // Sum all elements
let row_sums = sum(&a, Some(1))?; // Sum along axis 1

- `Array<T>`: Typed N-dimensional arrays for CPU operations
- `CpuBytesArray`: Byte-based N-dimensional arrays for CPU operations
- `NdArray`: Backend-agnostic trait for all array operations
- `Shape`: Multi-dimensional array dimensions
- `DType`: Data types (f32, f64, i8-i64, u8-u64, bool, custom types)
Design Philosophy: Numina provides the low-level backend infrastructure. High-level tensor APIs (like Tensor types) are provided by dependent crates (for example, laminax-types) which build upon Numina's NdArray trait.
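To illustrate how a backend-agnostic trait lets one function serve several storage layouts, here is a minimal, self-contained sketch. It does not use Numina's actual API: the trait `ArrayLike`, the structs `TypedArray` and `BytesArray`, and the function `add_any` are hypothetical names invented for this example, and the real NdArray trait may look quite different.

```rust
/// Hypothetical stand-in for a backend-agnostic array trait (not Numina's NdArray).
trait ArrayLike {
    /// Dimensions of the array.
    fn shape(&self) -> &[usize];
    /// Elements converted to f32, in storage order.
    fn to_f32_vec(&self) -> Vec<f32>;
}

/// Backend 1: elements stored as typed f32 values.
struct TypedArray {
    data: Vec<f32>,
    shape: Vec<usize>,
}

/// Backend 2: elements stored as raw little-endian bytes.
struct BytesArray {
    bytes: Vec<u8>,
    shape: Vec<usize>,
}

impl ArrayLike for TypedArray {
    fn shape(&self) -> &[usize] {
        &self.shape
    }
    fn to_f32_vec(&self) -> Vec<f32> {
        self.data.clone()
    }
}

impl ArrayLike for BytesArray {
    fn shape(&self) -> &[usize] {
        &self.shape
    }
    fn to_f32_vec(&self) -> Vec<f32> {
        // Reassemble each group of 4 bytes into one f32.
        self.bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect()
    }
}

/// Element-wise addition over any two backends; returns an error on shape mismatch.
fn add_any(a: &dyn ArrayLike, b: &dyn ArrayLike) -> Result<TypedArray, String> {
    if a.shape() != b.shape() {
        return Err(format!("shape mismatch: {:?} vs {:?}", a.shape(), b.shape()));
    }
    let data = a
        .to_f32_vec()
        .into_iter()
        .zip(b.to_f32_vec())
        .map(|(x, y)| x + y)
        .collect();
    Ok(TypedArray { data, shape: a.shape().to_vec() })
}
```

Because `add_any` only depends on the trait, the operands can come from different backends in either order. A real implementation would likely avoid the intermediate copies made here.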
Stable dtype IDs are serialized in Lamina IR and the Laminax runtime. IDs are explicit and frozen.
FP8 formats follow the E4M3FN (finite-only, max 448) and E5M2 (Inf/NaN, max 57344) conventions with round-to-nearest-even (RTNE) rounding.
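The maximum values quoted above follow from the bit layouts. As a check, the sketch below decodes raw FP8 bytes into f32, assuming the usual layouts: 1 sign, 4 exponent, and 3 mantissa bits with bias 7 for E4M3FN, and 1 sign, 5 exponent, and 2 mantissa bits with bias 15 for E5M2. The helper names `pow2`, `decode_e4m3fn`, and `decode_e5m2` are made up for illustration and are not part of Numina's API; encoding with RTNE rounding is not shown.

```rust
/// Exact power of two as f32, valid for exponents in -126..=127.
fn pow2(e: i32) -> f32 {
    f32::from_bits(((e + 127) as u32) << 23)
}

/// Decode an E4M3FN byte (assumed layout: 1 sign, 4 exponent, 3 mantissa bits, bias 7).
/// The format has no infinities; only the all-ones pattern S.1111.111 is NaN.
fn decode_e4m3fn(bits: u8) -> f32 {
    let sign = if bits & 0x80 != 0 { -1.0 } else { 1.0 };
    let exp = ((bits >> 3) & 0x0F) as i32;
    let man = (bits & 0x07) as f32;
    if exp == 0x0F && (bits & 0x07) == 0x07 {
        return f32::NAN;
    }
    if exp == 0 {
        // Subnormal: no implicit leading 1, fixed exponent of 1 - bias.
        sign * (man / 8.0) * pow2(-6)
    } else {
        sign * (1.0 + man / 8.0) * pow2(exp - 7)
    }
}

/// Decode an E5M2 byte (assumed layout: 1 sign, 5 exponent, 2 mantissa bits, bias 15).
/// The all-ones exponent is reserved for Inf (mantissa 0) and NaN (mantissa non-zero).
fn decode_e5m2(bits: u8) -> f32 {
    let sign = if bits & 0x80 != 0 { -1.0 } else { 1.0 };
    let exp = ((bits >> 2) & 0x1F) as i32;
    let man_bits = bits & 0x03;
    let man = man_bits as f32;
    if exp == 0x1F {
        return if man_bits == 0 { sign * f32::INFINITY } else { f32::NAN };
    }
    if exp == 0 {
        sign * (man / 4.0) * pow2(-14)
    } else {
        sign * (1.0 + man / 4.0) * pow2(exp - 15)
    }
}
```

Under these assumptions the largest finite E4M3FN pattern is `0x7E` (1.75 × 2^8 = 448) and the largest finite E5M2 pattern is `0x7B` (1.75 × 2^15 = 57344), which matches the limits stated above.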
| Name | DType | ID | Bytes | Storage Bits | Align |
|---|---|---|---|---|---|
| float16 | F16 | 1 | 2 | 16 | 2 |
| float32 | F32 | 2 | 4 | 32 | 4 |
| float64 | F64 | 3 | 8 | 64 | 8 |
| bfloat16 | BF16 | 4 | 2 | 16 | 2 |
| bfloat8 | BF8 | 5 | 1 | 8 | 1 |
| float8_e4m3fn | F8E4M3FN | 6 | 1 | 8 | 1 |
| float8_e5m2 | F8E5M2 | 7 | 1 | 8 | 1 |
| complex32 | Complex32 | 50 | 4 | 32 | 2 |
| complex64 | Complex64 | 51 | 8 | 64 | 4 |
| complex128 | Complex128 | 52 | 16 | 128 | 8 |
| int8 | I8 | 10 | 1 | 8 | 1 |
| int16 | I16 | 11 | 2 | 16 | 2 |
| int32 | I32 | 12 | 4 | 32 | 4 |
| int64 | I64 | 13 | 8 | 64 | 8 |
| uint8 | U8 | 20 | 1 | 8 | 1 |
| uint16 | U16 | 21 | 2 | 16 | 2 |
| uint32 | U32 | 22 | 4 | 32 | 4 |
| uint64 | U64 | 23 | 8 | 64 | 8 |
| bool | Bool | 30 | 1 | 8 | 1 |
| quantized_i4 | QI4 | 40 | 1 | 4 | 1 |
| quantized_u8 | QU8 | 41 | 1 | 8 | 1 |
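One way to keep serialized IDs explicit and frozen in Rust is an enum with hand-written discriminants, so that reordering or adding variants never changes an existing ID. The sketch below mirrors a subset of the table above. The enum `DTypeId`, its methods, and the choice of `u8` as the serialized width are assumptions made for illustration; Numina's actual DType may be defined differently.

```rust
/// Hypothetical mirror of a subset of the dtype table (not Numina's actual DType).
/// Discriminants are written out so the serialized ID never depends on variant order.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)] // assumed serialized width
enum DTypeId {
    F16 = 1,
    F32 = 2,
    F64 = 3,
    BF16 = 4,
    I32 = 12,
    U8 = 20,
    Bool = 30,
    QI4 = 40,
    Complex64 = 51,
}

impl DTypeId {
    /// Decode a serialized ID; unknown IDs are rejected rather than guessed.
    fn from_id(id: u8) -> Option<Self> {
        Some(match id {
            1 => Self::F16,
            2 => Self::F32,
            3 => Self::F64,
            4 => Self::BF16,
            12 => Self::I32,
            20 => Self::U8,
            30 => Self::Bool,
            40 => Self::QI4,
            51 => Self::Complex64,
            _ => return None,
        })
    }

    /// Bytes occupied by one stored element ("Bytes" column).
    fn size_bytes(self) -> usize {
        match self {
            Self::U8 | Self::Bool | Self::QI4 => 1,
            Self::F16 | Self::BF16 => 2,
            Self::F32 | Self::I32 => 4,
            Self::F64 | Self::Complex64 => 8,
        }
    }

    /// Required alignment in bytes ("Align" column).
    fn align(self) -> usize {
        match self {
            Self::U8 | Self::Bool | Self::QI4 => 1,
            Self::F16 | Self::BF16 => 2,
            Self::F32 | Self::I32 | Self::Complex64 => 4,
            Self::F64 => 8,
        }
    }
}
```

Note that alignment is not always equal to size: per the table, complex64 occupies 8 bytes but only needs the 4-byte alignment of its f32 components.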
use numina::{BFloat16, QuantizedU8, QuantizedI4};
// Brain Float 16
let bf16 = BFloat16::from_f32(3.14159);
assert_eq!(bf16.size_bytes(), 2);
// 8-bit quantized
let scale = 0.01;
let q8 = QuantizedU8::quantize(2.5, scale);
assert!((q8.dequantize(scale) - 2.5).abs() < 0.1);
// 4-bit quantized (2 values per byte)
let q4 = QuantizedI4::pack(3, -2);
assert_eq!(q4.size_bytes(), 1); // 87.5% memory savings!

use numina::{Array, CpuBytesArray, Shape, add, F32};
// Different backend implementations
let typed_array = Array::from_slice(&[1.0f32, 2.0], Shape::from([2]))?;
let bytes: Vec<u8> = [1.0f32, 2.0].iter().flat_map(|&x| x.to_le_bytes()).collect();
let byte_array = CpuBytesArray::new(bytes, Shape::from([2]), F32);
// Same operations work on all backends
let sum1 = add(&typed_array, &byte_array)?;
let sum2 = add(&byte_array, &typed_array)?;
// Cross-backend operations are fully supported
assert_eq!(sum1.shape(), sum2.shape());

src/
├── array/ # NdArray trait and CPU implementations
├── dtype/ # Data type system and custom types
├── ops.rs # Mathematical operations
├── reductions.rs # Reduction operations
├── sorting.rs # Sorting and searching
└── lib.rs # Library interface
Implemented:
- Array operations (add, mul, matmul, reductions)
- Multiple backends via NdArray trait (Array, CpuBytesArray)
- Custom data types (BFloat16, QuantizedU8, QuantizedI4)
- Shape manipulation (reshape, transpose)
- Sorting and searching operations
- 49 tests passing
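
For readers curious how the quantized types in the list above can work in principle, here is a self-contained sketch of scale-only 8-bit quantization and of packing two signed 4-bit values into one byte. The functions `quantize_u8`, `dequantize_u8`, `pack_i4`, and `unpack_i4` are hypothetical and written for this example. Numina's QuantizedU8 and QuantizedI4 may use a different scheme (for instance a zero point, or a different nibble order).

```rust
/// Scale-only quantization: round value / scale to the nearest integer in 0..=255.
fn quantize_u8(value: f32, scale: f32) -> u8 {
    (value / scale).round().clamp(0.0, 255.0) as u8
}

/// Inverse of `quantize_u8`, up to rounding error.
fn dequantize_u8(q: u8, scale: f32) -> f32 {
    q as f32 * scale
}

/// Pack two signed 4-bit values (each in -8..=7) into one byte.
/// Assumed layout: `lo` in bits 0-3, `hi` in bits 4-7, two's complement.
fn pack_i4(lo: i8, hi: i8) -> u8 {
    debug_assert!((-8..=7).contains(&lo) && (-8..=7).contains(&hi));
    ((lo as u8) & 0x0F) | (((hi as u8) & 0x0F) << 4)
}

/// Unpack the two signed 4-bit values stored by `pack_i4`.
fn unpack_i4(byte: u8) -> (i8, i8) {
    // Move each nibble to the top of an i8, then arithmetic-shift right to sign-extend.
    let lo = ((byte << 4) as i8) >> 4;
    let hi = (byte as i8) >> 4;
    (lo, hi)
}
```

With this layout one byte holds two values that would take eight bytes as f32, which is where an 87.5% storage reduction comes from.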
Planned:
- Broadcasting, advanced indexing, linear algebra
- File I/O, statistics
- Memory-mapped arrays
- More custom data types (FP8, FP4, NF4)
Numina serves as one of the core libraries for Laminax, enabling high-performance GPU/CPU computing.