Helpful kernel tutorials and examples for tile-based GPU programming

Python 683 55 Updated Mar 24, 2026

[CVPR 2025] 🎉 Official repository of "ManipTrans: Efficient Dexterous Bimanual Manipulation Transfer via Residual Learning"

Python 298 24 Updated Oct 10, 2025

Open-source deep-learning framework for building, training, and fine-tuning deep learning models using state-of-the-art Physics-ML methods

Python 2,589 617 Updated Mar 24, 2026

Sampling profiler for Python programs

Rust 15,057 505 Updated Mar 5, 2026

Machine Learning Engineering Open Book

Python 17,528 1,111 Updated Mar 16, 2026

The Art of Debugging Open Book

Python 1,330 67 Updated Mar 16, 2026
Jupyter Notebook 1,149 182 Updated Mar 17, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 74,274 14,754 Updated Mar 25, 2026

A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")

C++ 385 80 Updated Mar 25, 2026

Making large AI models cheaper, faster and more accessible

Python 41,376 4,523 Updated Mar 16, 2026

Running large language models on a single GPU for throughput-oriented scenarios.

Python 9,382 592 Updated Oct 28, 2024

Awesome resources for GPUs

612 59 Updated Mar 10, 2026

Solve puzzles. Improve your pytorch.

Jupyter Notebook 3,993 361 Updated Jul 15, 2024

Pyjion - A JIT for Python based upon CoreCLR

C++ 1,426 59 Updated Dec 25, 2024

A high-performance, zero-overhead, extensible Python compiler with built-in NumPy support

Python 16,683 597 Updated Mar 24, 2026

Ocolos is the first open-sourced online code layout optimization system for unmodified applications written in unmanaged languages.

C++ 53 16 Updated Jan 9, 2026

An optimizing compiler for decision tree ensemble inference.

C++ 18 5 Updated Jul 11, 2025

Reinforcement learning environments for compiler and program optimization tasks

Python 1,001 136 Updated Mar 20, 2026

A speculative mechanism that accelerates long-latency off-chip load requests by removing on-chip cache access latency from their critical path, as described in the MICRO 2022 paper by Bera et al. (https:/…

C++ 77 13 Updated Feb 21, 2026

Tensors and Dynamic neural networks in Python with strong GPU acceleration

C++ 26 7 Updated Apr 20, 2023

Ceras is yet another tiny deep learning engine, in pure C++ and header-only.

C++ 127 12 Updated Oct 26, 2025

Compile Time Regular Expression in C++

C++ 3,771 206 Updated Sep 12, 2025

Ecosystem of libraries and tools for writing and executing fast GPU code fully in Rust.

Rust 5,117 227 Updated Mar 24, 2026

The C++ Core Guidelines are a set of tried-and-true guidelines, rules, and best practices about coding in C++

CSS 44,863 5,546 Updated Mar 12, 2026

Papers on Graph Analytics, Mining, and Learning

128 18 Updated Aug 15, 2022

Fluid simulation engine for computer graphics applications

C++ 2,078 279 Updated Dec 24, 2023

Study Group of Deep Learning Compiler

169 19 Updated Jan 15, 2023

AITemplate is a Python framework which renders neural network into high performance CUDA/HIP C++ code. Specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.

Python 4,711 382 Updated Mar 16, 2026

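⚓ A collection of reading notes from my career as a game programmer. You can think of it as an enhanced blog. It covers graphics, real-time rendering, programming practice, GPU programming, design patterns, software engineering, and more. Keep Reading, Keep Writing, Keep Coding.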

9,882 1,754 Updated Oct 16, 2021