Stars
Reverse engineering notes. Personal reference only. Everything here is a best-guess reconstruction.
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
Two conversational AI agents switching from English to a sound-level protocol after confirming they are both AI agents
Intel staging area for llvm.org contribution. Home for Intel LLVM-based projects.
SwarmUI (formerly StableSwarmUI), a modular Stable Diffusion web user interface with an emphasis on easily accessible power tools, high performance, and extensibility.
C++ library implementing recent double-word (aka double-double) arithmetics.
Worked example of the process from Python source to CUDA kernel execution with Numba
Samples of good AI generated CUDA kernels
Source code that accompanies The CUDA Handbook.
Doing non-Cartesian MR Imaging has never been so easy.
A Python framework for GPU-accelerated simulation, robotics, and machine learning.
Automatically exported from code.google.com/p/smhasher
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
Hooked CUDA-related dynamic libraries by using automated code generation tools.
Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction
Polygon Clipping, Offsetting & Triangulation in C++, C# and Delphi
oneAPI DPC++ Library (oneDPL) https://software.intel.com/content/www/us/en/develop/tools/oneapi/components/dpc-library.html