Stars
A simple, unified training engine for multimodal models. Lean, flexible, and built for hacking at scale.
A dynamic binary instrumentation tool for tracing and analyzing CUDA kernel instructions.
PTX ISA 9.1 documentation converted to searchable markdown. Includes Claude Code skill for CUDA development.
AI agents running research on single-GPU nanochat training automatically
A Curated List of Awesome Video World Models with AR Diffusion: Covering Algorithms, Applications, and Infrastructure, Aimed at Serving as a Comprehensive Resource for Researchers, Practitioners, a…
A Streaming-Native Serving Engine for TTS/STS Models
Enable Claude Code to learn in real time, update its knowledge, and grow with you, using supermemory.
Code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"
This repository contains the official implementation of "FastVLM: Efficient Vision Encoding for Vision Language Models" - CVPR 2025
Arm64 inline hooking for iOS, Android, OSX, and Linux.
Code execution utilities for Open WebUI & Ollama
Zotero is a free, easy-to-use tool to help you collect, organize, annotate, cite, and share your research sources.
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
Vulkan-based Gaussian Splatting viewer, with Python bindings
An innovative library for efficient LLM inference via low-bit quantization
Universal local privilege escalation Proof-of-Concept exploit for CVE-2024-1086, working on most Linux kernels between v5.14 and v6.6, including Debian, Ubuntu, and KernelCTF. The success rate is 9…
Production-grade 3D Gaussian Splatting with CPU/GPU support for Windows, Mac and Linux 🚀
3D Gaussian Splatting Renderer implemented in WebGPU (WGPU) and Rust
A 3D Gaussian Splatting renderer in C++ and OpenGL
A compute shader implementation of the OneSweep sorting algorithm.
GPU Radix Sort implemented in Vulkan and GLSL.