jeromeku

104 followers · 263 following

Starred repositories

yao-jz / intra-kernel-profiler

Region-level profiling for CUDA kernels with trace, NVBit, CUPTI, NSys, and an interactive Explorer.

Python 118 11 Updated Apr 17, 2026

facebookresearch / tensor-layouts

A pure-Python implementation of the Nvidia CuTe layout algebra intended to be approachable and easy to learn.

Python 183 12 Updated May 15, 2026

IST-DASLab / CloverLM

🍀 Codebase for CloverLM

Python 7 Updated Apr 26, 2026

gvlassis / ant

🐜 Research-friendly Deep Learning framework

Python 8 1 Updated May 4, 2026

SandAI-org / MagiCompiler

A plug-and-play compiler that delivers free-lunch optimizations for both inference and training.

Python 314 23 Updated May 31, 2026

stotko / stdgpu

stdgpu: Efficient STL-like Data Structures on the GPU

C++ 1,264 99 Updated Jun 8, 2026

gautam1858 / tiny-gpu-compiler

An MLIR-based compiler that takes GPU kernels and compiles them to real hardware instructions. Interactive web visualizer included.

TypeScript 131 16 Updated Mar 21, 2026

flagos-ai / awesome-LLM-driven-kernel-generation

Review automated kernel generation in the era of LLMs

233 18 Updated May 26, 2026

Aleph-Alpha / Alpha-MoE

Cuda 61 12 Updated Dec 10, 2025

banach-space / cpp-tutor

Code examples for tutoring modern C++

C++ 100 9 Updated Jul 21, 2025

ashvardanian / ScalingElections

GPU-accelerated Schulze voting method in Python, Numba, CUDA, and Mojo 🔥, using ideas from Algebraic Graph Theory

Mojo 19 1 Updated Oct 28, 2025

kuterd / nv_isa_solver

Nvidia Instruction Set Specification Generator

Python 338 23 Updated Jul 9, 2024

AnHaechan / ai-compilers-study-material

A collection of study materials for AI compilers and systems.

58 2 Updated Nov 14, 2025

FedericoBruzzone / papers-on-compiler-optimizations

A chronologically sorted list of influential papers on compiler optimization, from the seminal works of 1952 through the advanced techniques of 1994

TeX 79 7 Updated May 26, 2026

rafasumi / mlir-tutorial

SBLP 2025 MLIR Tutorial

C++ 75 4 Updated Mar 25, 2026

lambdaclass / mlir-workshop

A MLIR Rust workshop

Rust 8 1 Updated Dec 11, 2024

amindWalker / Rust-Layout-and-Types

A concise explanation of Rust types and Memory Layout.

137 15 Updated Jul 9, 2025

joerick / pyinstrument

🚴 Call stack profiler for Python. Shows you why your code is slow!

Python 7,929 289 Updated Jun 9, 2026

tanishqkumar / beyond-nanogpt

Minimal and annotated implementations of key ideas from modern deep learning research.

Python 1,320 109 Updated Jan 29, 2026

ByteDance-Seed / Triton-distributed

Distributed Compiler based on Triton for Parallel Systems

Python 1,458 151 Updated Apr 22, 2026

CalvinXKY / mfu_calculation

A simple calculation for LLM MFU.

Jupyter Notebook 78 4 Updated Sep 10, 2025

reed-lau / cute-gemm

C++ 182 45 Updated May 11, 2026

mohitmishra786 / amILearningEnough

Low-Level Programming Roadmap and Resources

1,333 89 Updated Mar 26, 2026

adamtiger / tinyGPUlang

Tutorial on building a gpu compiler backend in LLVM

C++ 58 11 Updated Jan 11, 2025

nano-R1 / resources

Compiling useful links, papers, benchmarks, ideas, etc.

46 1 Updated Mar 16, 2025

MathFoundationRL / Book-Mathematical-Foundation-of-Reinforcement-Learning

This is the homepage of a new book entitled "Mathematical Foundations of Reinforcement Learning."

MATLAB 16,476 1,557 Updated May 26, 2026

mbzuai-oryx / Awesome-LLM-Post-training

Awesome Reasoning LLM Tutorial/Survey/Guide

Python 2,441 164 Updated Apr 6, 2026

dendibakh / perf-ninja

This is an online course where you can learn and master the skill of low-level performance analysis and tuning.

C++ 3,738 381 Updated Jun 4, 2026

deepseek-ai / EPLB

Expert Parallelism Load Balancer

Python 1,388 203 Updated Mar 24, 2025

deepseek-ai / profile-data

Analyze computation-communication overlap in V3/R1.

1,160 147 Updated Mar 21, 2025

Starred topics

Awesome Lists