fma
Here are 12 public repositories matching this topic...
A collection of highly optimized, SIMD-accelerated (SSE, AVX, FMA, NEON) functions written in C
Updated Oct 19, 2021 · C
x86-64 bilateral instruction tokenizer implemented in C. Supports the following processor extensions: AES, AVX, AVX2, AVX512, FMA, MMX, SSE, SSE2, SSE3, SSE4, x87 (FPU), VMX. To ease testing, it includes a disassembler that transforms tokens back into assembly that NASM can assemble.
Updated Oct 2, 2022 · C
Offline artistic image generator
Updated Apr 29, 2020 · C
Sum of three floating-point values with correct rounding
Updated Feb 25, 2026 · C
Safety-hardened GEMM (general matrix multiply) implementation achieving 169.8 GFLOPS on an Intel i9-14900. Built for embedded systems and safety-critical applications where reliability matters as much as speed. 162× faster than a naive implementation, zero undefined behavior (UB), fully validated.
Updated Nov 21, 2025 · C