Python 588 60 Updated Jul 11, 2024

Transformer related optimization, including BERT, GPT

C++ 6,430 935 Updated Mar 27, 2024

A modern model graph visualizer and debugger

JavaScript 1,510 159 Updated Jun 22, 2026

Source code examples from the Parallel Forall Blog

HTML 1,331 642 Updated Sep 23, 2025

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 100,970 28,097 Updated Jun 23, 2026

CUDA Library Samples

Cuda 2,439 462 Updated Jun 10, 2026

This is a series of GPU optimization topics. Here we will introduce how to optimize the CUDA kernel in detail. I will introduce several basic kernel optimizations, including: elementwise, reduce, s…

Cuda 1,317 181 Updated Jul 29, 2023

AI agents running research on single-GPU nanochat training automatically

Python 88,217 12,766 Updated Mar 26, 2026

A lightweight alternative to OpenClaw that runs in containers for security. Connects to WhatsApp, Telegram, Slack, Discord, Gmail and other messaging apps, has memory, scheduled jobs, and runs dir…

TypeScript 29,953 12,897 Updated Jun 22, 2026

Beautiful, Modern & Opinionated Linux

Shell 23,716 2,389 Updated Jun 23, 2026

Very low latency speech to text, intent recognition, and text to speech, for building voice agents and interfaces

C++ 8,526 462 Updated Jun 17, 2026

AliSQL is a MySQL branch originating from Alibaba Group. Documentation is available from the Release Notes at the bottom.

C++ 5,826 893 Updated May 12, 2026

A lightweight, lightning-fast, in-process vector database

C++ 12,237 726 Updated Jun 23, 2026

A simple C++11 Thread Pool implementation

C++ 8,757 2,341 Updated Jul 20, 2024

Algorithm powering the For You feed on X

Rust 26,257 4,504 Updated May 15, 2026

A vector indexing library to bring fast, fresh and filtered search to your database

Rust 1,857 427 Updated Jun 23, 2026

Minimal Claude Code alternative. Single Python file, zero dependencies, ~250 lines.

Python 2,437 237 Updated Jan 14, 2026

vsag is a vector indexing library used for similarity search.

C++ 6 Updated Jun 22, 2026

A curated list of awesome Claude Skills, resources, and tools for customizing Claude AI workflows

Python 65,592 7,292 Updated May 22, 2026

Animation engine for explanatory math videos

Python 87,820 7,328 Updated Apr 18, 2026

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 4,444 709 Updated May 17, 2026

🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning

Python 25,203 4,874 Updated Jun 23, 2026

Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.

Cuda 419 52 Updated Jan 2, 2025

🧩 Hands-on SIMD Programming with C++

C++ 160 17 Updated Oct 12, 2025

Awesome Generative Recommendation papers primarily focused on industry-level applications.

223 13 Updated Jun 1, 2026

High Performance KV Cache Store for LLM

C 56 8 Updated May 20, 2026

Flash Attention from Scratch on CUDA Ampere

Assembly 182 29 Updated Sep 1, 2025

The best ChatGPT that $100 can buy.

Python 55,342 7,597 Updated May 5, 2026

Awesome resources for GPUs

629 60 Updated Mar 10, 2026

Codebase for Cuda Learning

Cuda 35 3 Updated Jul 13, 2024