Horace He Chillee

PyTorch intern 2019 · Compilers intern 2018 @google · Maintainer of @VSCodeVim · Cornell CS/Math 2020

1.2k followers · 12 following

Stars

pytorch / helion

A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.

Python 692 89 Updated Dec 21, 2025

NVIDIA / nvshmem

NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process communication and coordination overheads by allowing programmer…

C++ 419 48 Updated Dec 20, 2025

stackav-oss / dltype

Deep Learning Type Library

Python 34 3 Updated Dec 13, 2025

meta-pytorch / monarch

PyTorch Single Controller

Rust 929 120 Updated Dec 20, 2025

genmoai / mochi

The best OSS video generation models, created by Genmo

Python 3,538 468 Updated Nov 14, 2025

meta-pytorch / LeanRL

LeanRL is a fork of CleanRL in which selected PyTorch scripts are optimized for performance using compile and cudagraphs.

Python 663 28 Updated Aug 22, 2025

meta-pytorch / attention-gym

Helpful tools and examples for working with flex-attention

Python 1,092 67 Updated Dec 18, 2025

test-time-training / ttt-lm-pytorch

Official PyTorch implementation of Learning to (Learn at Test Time): RNNs with Expressive Hidden States

Python 1,293 85 Updated Jul 14, 2024

Edward-Sun / gpt-accelera

Simple and efficient pytorch-native transformer training and inference (batched)

Python 79 6 Updated Apr 2, 2024

google / gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Python 5,588 567 Updated May 30, 2025

NVIDIA / cutlass

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 8,995 1,588 Updated Dec 19, 2025

meta-pytorch / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.

Python 6,167 568 Updated Aug 22, 2025

meta-pytorch / segment-anything-fast

A batched offline inference oriented version of segment-anything

Python 1,261 76 Updated Aug 22, 2025

jrfonseca / gprof2dot

Converts profiling output to a dot graph.

Python 3,412 395 Updated Apr 15, 2025

tintn / torch-graph-force

A PyTorch-based library for embedding large graphs to low-dimensional space using force-directed layouts with GPU acceleration.

Python 7 1 Updated Dec 21, 2022

replit / ReplitLM

Inference code and configs for the ReplitLM model family

Python 1,017 117 Updated Oct 9, 2023

nnaisense / evotorch

Advanced evolutionary computation library built directly on top of PyTorch, created at NNAISENSE.

Python 1,110 76 Updated Dec 8, 2025

metaopt / optree

OpTree: Optimized PyTree Utilities

Python 202 12 Updated Dec 21, 2025

ezyang / SMT-LIB-benchmarks-pytorch-shapes

SMT-LIB benchmarks for shape computations from deep learning models in PyTorch

SMT 18 Updated Dec 21, 2022

facebookresearch / shumai

Fast Differentiable Tensor Library in JavaScript and TypeScript with Bun + Flashlight

TypeScript 1,164 27 Updated Jul 23, 2024

yushangdi / fx_module_dumps

Graph dump of torchbench models, huggingface models, and TIMM models.

Python 5 Updated Jul 22, 2022

facebookresearch / torchdim

Named tensors with first-class dimensions for PyTorch

Jupyter Notebook 332 11 Updated Jun 14, 2023

facebookresearch / metaseq

Repo for external large-scale work

Python 6,547 723 Updated Apr 27, 2024

metaopt / torchopt

TorchOpt is an efficient library for differentiable optimization built upon PyTorch.

Python 622 41 Updated Dec 1, 2025

jansel / pytorch-jit-paritybench

Python 41 20 Updated Dec 10, 2024

mosaicml / composer

Supercharge Your Model Training

Python 5,449 458 Updated Nov 12, 2025

unixpickle / sk2torch

Convert scikit-learn models to PyTorch modules

Python 169 8 Updated May 15, 2024

hpcaitech / FastFold

Optimizing AlphaFold Training and Inference on GPU Clusters

Python 610 89 Updated Jul 16, 2024

GPU implementation of a fast generalized ANS (asymmetric numeral system) entropy encoder and decoder, with extensions for lossless compression of numerical and other data types in HPC/ML applications.

Cuda 366 32 Updated Dec 21, 2025

llvm / torch-mlir

The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.

C++ 1,700 629 Updated Dec 17, 2025
