
Helpful kernel tutorials and examples for tile-based GPU programming

Python 630 44 Updated Feb 7, 2026

[CVPR 2025] 🎉 Official repository of "ManipTrans: Efficient Dexterous Bimanual Manipulation Transfer via Residual Learning"

Python 277 21 Updated Oct 10, 2025

Open-source deep-learning framework for building, training, and fine-tuning deep learning models using state-of-the-art Physics-ML methods

Python 2,401 569 Updated Feb 6, 2026

Sampling profiler for Python programs

Rust 14,918 497 Updated Feb 5, 2026

Machine Learning Engineering Open Book

Python 16,589 1,034 Updated Jan 23, 2026

The Art of Debugging Open Book

Python 1,278 66 Updated Jan 17, 2026

Jupyter Notebook 1,121 174 Updated Feb 5, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 69,741 13,274 Updated Feb 7, 2026

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

C++ 379 77 Updated Feb 6, 2026

Making large AI models cheaper, faster and more accessible

Python 41,340 4,537 Updated Jan 19, 2026

Running large language models on a single GPU for throughput-oriented scenarios.

Python 9,382 591 Updated Oct 28, 2024

Awesome resources for GPUs

609 57 Updated Jul 1, 2023

Solve puzzles. Improve your pytorch.

Jupyter Notebook 3,912 352 Updated Jul 15, 2024

Pyjion - A JIT for Python based upon CoreCLR

C++ 1,429 59 Updated Dec 25, 2024

A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support

Python 16,594 590 Updated Feb 7, 2026

Ocolos is the first open-sourced online code layout optimization system for unmodified applications written in unmanaged languages.

C++ 53 16 Updated Jan 9, 2026

An optimizing compiler for decision tree ensemble inference.

C++ 18 5 Updated Jul 11, 2025

Reinforcement learning environments for compiler and program optimization tasks

Python 994 136 Updated Feb 6, 2026

A speculative mechanism to accelerate long-latency off-chip load requests by removing on-chip cache access latency from their critical path, as described in the MICRO 2022 paper by Bera et al. (https:/…

C++ 76 13 Updated Sep 10, 2025

Tensors and Dynamic neural networks in Python with strong GPU acceleration

C++ 26 7 Updated Apr 20, 2023

Ceras is yet another tiny deep learning engine, in pure c++ and header only.

C++ 127 12 Updated Oct 26, 2025

Compile Time Regular Expression in C++

C++ 3,758 204 Updated Sep 12, 2025

Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.

Rust 5,045 224 Updated Jan 15, 2026

The C++ Core Guidelines are a set of tried-and-true guidelines, rules, and best practices about coding in C++

CSS 44,752 5,545 Updated Feb 4, 2026

Papers on Graph Analytics, Mining, and Learning

128 18 Updated Aug 15, 2022

Fluid simulation engine for computer graphics applications

C++ 2,068 282 Updated Dec 24, 2023

Study Group of Deep Learning Compiler

167 19 Updated Jan 15, 2023

AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python 4,703 382 Updated Jan 12, 2026

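⚓ A collection of reading notes from my career as a game programmer. You can think of it as an enhanced blog. It covers computer graphics, real-time rendering, programming practice, GPU programming, design patterns, software engineering, and more. Keep Reading, Keep Writing, Keep Coding.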

9,852 1,753 Updated Oct 16, 2021