Leo Fang leofang

Python CUDA tech lead @NVIDIA. Open source contributor in my spare time.

270 followers · 61 following

Stars

409 results for source starred repositories

NVIDIA / cutile-python

cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

Python · 1,612 stars · 80 forks · Updated Dec 17, 2025

NVIDIA / nsight-python

Nsight Python is a Python kernel profiling interface based on NVIDIA Nsight Tools

Python · 75 stars · 6 forks · Updated Dec 16, 2025

NVIDIA / nvshmem

NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process communication and coordination overheads by allowing programmer…

C++ · 417 stars · 48 forks · Updated Nov 13, 2025

keleshev / schema

Schema validation just got Pythonic

Python · 2,939 stars · 215 forks · Updated Oct 26, 2025

scikit-hep / vector

Vector classes and utilities

Python · 94 stars · 35 forks · Updated Dec 15, 2025

jameslamb / pydistcheck

Linter that finds portability issues in Python package distributions (wheels, sdists, conda packages).

Python · 44 stars · 4 forks · Updated Dec 8, 2025

cupy / cupy

NumPy & SciPy for GPU

Python · 10,673 stars · 979 forks · Updated Dec 18, 2025

scikit-hep / ragged

Manipulating ragged arrays in an Array API compliant way.

Python · 45 stars · 8 forks · Updated Dec 14, 2025

NVIDIA / jitify

A single-header C++ library for simplifying the use of CUDA Runtime Compilation (NVRTC).

C++ · 567 stars · 73 forks · Updated Sep 15, 2025

rapidsai / shared-workflows

Reusable GitHub Actions workflows for RAPIDS CI

Shell · 7 stars · 25 forks · Updated Dec 15, 2025

richards199999 / Thinking-Claude

Let your Claude able to think

TypeScript · 16,614 stars · 1,964 forks · Updated Nov 4, 2025

NVIDIA / TensorRT-Incubator

Experimental projects related to TensorRT

MLIR · 117 stars · 22 forks · Updated Dec 18, 2025

simdjson / simdjson

Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks

C++ · 22,967 stars · 1,191 forks · Updated Dec 18, 2025

mitsuba-renderer / drjit

Dr.Jit — A Just-In-Time-Compiler for Differentiable Rendering

C++ · 730 stars · 55 forks · Updated Dec 17, 2025

GridTools / gt4py

Python library for generating high-performance implementations of stencil kernels for weather and climate modeling from a domain-specific language (DSL).

Python · 136 stars · 54 forks · Updated Dec 17, 2025