Stars
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Performance-portable, length-agnostic SIMD with runtime dispatch
Public domain cross-platform lock-free thread-caching 16-byte aligned memory allocator implemented in C
facebook / jemalloc
Forked from jemalloc/jemalloc
Meta fork of the OG Jemalloc project
CUDA Templates and Python DSLs for High-Performance Linear Algebra
🌐 Make websites accessible for AI agents. Automate tasks online with ease.
Implementations of SIMD instruction sets for systems which don't natively support them.
An open-source C++ library developed and used at Facebook.
A fast & densely stored hashmap and hashset based on robin-hood backward shift deletion
mimalloc is a compact general purpose allocator with excellent performance.
Example RISC-V Out-of-Order/Superscalar Processor Performance Core and MSS Model
A tool for running small microbenchmarks on recent Intel and AMD x86 CPUs.
Capstone disassembly/disassembler framework for ARM, ARM64 (ARMv8), Alpha, BPF, Ethereum VM, HPPA, LoongArch, M68K, M680X, Mips, MOS65XX, PPC, RISC-V(rv32G/rv64G), SH, Sparc, SystemZ, TMS320C64X, T…
A translator from Intel SSE intrinsics to Arm/Aarch64 NEON implementation
The Compute Library is a set of computer vision and machine learning functions optimised for both Arm CPUs and GPUs using SIMD technologies.
VVenC, the Fraunhofer Versatile Video Encoder
Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
ARMv8 performance monitor from userspace