anneouyang

Organizations

@redpwn @MIT-Video-Game-Orchestra @ScalingIntelligence


cuTile is a programming model for writing parallel kernels for NVIDIA GPUs

Python 1,605 80 Updated Dec 17, 2025

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 8,979 1,586 Updated Dec 16, 2025
Shell 1 Updated Apr 13, 2025
Cuda 127 16 Updated Oct 22, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda 4,270 600 Updated Dec 17, 2025
C++ 2 Updated Feb 21, 2025

Educational framework exploring ergonomic, lightweight multi-agent orchestration. Managed by OpenAI Solution team.

Python 20,704 2,213 Updated Mar 11, 2025

KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs)

Jupyter Notebook 712 102 Updated Dec 16, 2025

A collection of LLM with RL papers

278 10 Updated Apr 24, 2024
JavaScript 3,811 1,658 Updated Jun 21, 2024

Tile primitives for speedy kernels

Cuda 3,002 216 Updated Dec 9, 2025

A list of awesome compiler projects and papers for tensor computation and deep learning.

2,700 323 Updated Oct 19, 2024

Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)

Cuda 918 335 Updated Aug 19, 2024

Causal depthwise conv1d in CUDA, with a PyTorch interface

Cuda 675 147 Updated Oct 20, 2025

Mamba SSM architecture

Python 16,745 1,539 Updated Nov 11, 2025

[ICML 2024] Break the Sequential Dependency of LLM Inference Using Lookahead Decoding

Python 1,309 78 Updated Mar 6, 2025

Awesome-LLM: a curated list of Large Language Models

25,802 2,216 Updated Jul 31, 2025

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python 12,416 1,963 Updated Dec 17, 2025

Resources repo for Ajay CXX twitch stream: https://twitch.tv/ajaycxx

4 Updated Jan 21, 2024

MIT unofficial thesis template from overleaf, updated for 2023

TeX 18 10 Updated May 14, 2023

Source code for Twitter's Recommendation Algorithm

Python 10,424 2,237 Updated Jul 10, 2024

A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.

HTML 904 215 Updated Dec 17, 2025

A timeline of the latest AI models for audio generation, starting in 2023!

1,911 71 Updated Jan 4, 2024
Jupyter Notebook 4 Updated Jan 19, 2023

🦜🔗 The platform for reliable agents.

Python 122,107 20,138 Updated Dec 17, 2025

A Toolkit for Programming Parallel Algorithms on Shared-Memory Multicore Machines

C++ 398 75 Updated Nov 16, 2025

Set of React components for PDF annotation

TypeScript 1,354 503 Updated Nov 22, 2024

Prompt programming with FMs.

Python 444 45 Updated Jul 22, 2024