Stars
TT-NN operator library, and TT-Metalium low-level kernel programming model.
FP64 equivalent GEMM via Int8 Tensor Cores using the Ozaki scheme
Large collection of number systems providing custom arithmetic for mixed-precision algorithm development and optimization for AI, Machine Learning, Computer Vision, Signal Processing, CAE, EDA, con…
The first analysis framework for CPU microcode
BLIS fork with kernels for Apple M1. (Perhaps) The first open-source BLAS with Apple Matrix Coprocessor support.
xoreaxeaxeax / sandsifter
Forked from Battelle/sandsifter
The x86 processor fuzzer
Public repository for Litefury & Nitefury
The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.
Run compilers interactively from your web browser and interact with the assembly
Real-time face swap for PC streaming or video calls
Reinforcement learning environments for compiler and program optimization tasks
OpenBLAS is an optimized BLAS library based on GotoBLAS2 1.13 BSD version.
Parallel solvers for sparse linear systems featuring multigrid methods.
OpenFoam® motorBike case with adaptive volume & surface mesh refinement based on curl(U) or grad(p)
Limbo is a QEMU-based emulator for Android. It currently supports x86, ARM, PowerPC, and Sparc emulation for Intel x86 and ARM Android devices. See wiki https://virtualmachinery.weebly.com for APK …
Utilities to print information about video encode/decode capabilities of nvidia GPUs