StuartSul

Stuart Sul StuartSul

cs @ stanford | ml @ cursor

43 followers · 3 following

Achievements

x3 x2

Stars

dddrrreee / cs340lx-25aut

all class materials for 340lx

C 3 1 Updated Oct 8, 2025

NVIDIA / nvshmem

NVIDIA NVSHMEM is a parallel programming interface for NVIDIA GPUs based on OpenSHMEM. NVSHMEM can significantly reduce multi-process communication and coordination overheads by allowing programmer…

C++ 327 26 Updated Oct 8, 2025

StuartSul / gpu-experiments

A collection of GPU tests and benchmarks for my own research.

Cuda 3 1 Updated Oct 5, 2025

NVIDIA / TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper, Ada and Blackwell GPUs, to provide better performance with lower memory…

Python 2,766 514 Updated Oct 9, 2025

NVIDIA / cutlass

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 8,554 1,474 Updated Sep 25, 2025

yaof20 / Flash-RL

Implementation of FP8/INT8 rollout for RL training without a performance drop.

Python 250 18 Updated Sep 29, 2025

IST-DASLab / Quartet

Jupyter Notebook 100 10 Updated Aug 24, 2025

linux-rdma / perftest

Infiniband Verbs Performance Tests

C 830 353 Updated Oct 5, 2025

Mellanox / gpu_direct_rdma_access

example code for using DC QP for providing RDMA READ and WRITE operations to remote GPU memory

C 145 36 Updated Jul 30, 2024

bytedance / flux

A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 1,138 83 Updated Aug 28, 2025

pytorch / torchtitan

A PyTorch native platform for training generative AI models

Python 4,511 555 Updated Oct 9, 2025

pytorch / extension-cpp

C++ extensions in PyTorch

Python 1,149 245 Updated Jul 8, 2025

dddrrreee / cs240lx-25spr

cs240lx stanford 2025 spring

C 15 2 Updated Jun 11, 2025

haoliuhl / ringattention

Large Context Attention

Python 742 54 Updated Jan 24, 2025

Tweoss / waveform

JavaScript 1 Updated May 18, 2024

HazyResearch / ThunderKittens

Tile primitives for speedy kernels

Cuda 2,794 182 Updated Sep 21, 2025

alienator88 / Pearcleaner

A free, source-available and fair-code licensed mac app cleaner

Swift 8,978 211 Updated Oct 8, 2025

NVIDIA / nccl

Optimized primitives for collective multi-GPU communication

C++ 4,124 1,033 Updated Sep 24, 2025

StuartSul / co-chuck

Co-Chuck: WebChucK IDE with Multi-User Collaboration and Synchronized ChucK Shreds

TypeScript 3 Updated Dec 4, 2024

rodyager / RWTS-PDFwriter

An OSX print to pdf-file printer driver

Swift 1,032 87 Updated Sep 9, 2025

marmelab / react-admin

A frontend Framework for single-page applications on top of REST/GraphQL APIs, using TypeScript, React and Material Design

TypeScript 26,273 5,405 Updated Oct 9, 2025

maddevsio / aws-eks-base

This boilerplate contains terraform configurations for the rapid deployment of a Kubernetes cluster, supporting services, and the underlying infrastructure in AWS.

HCL 633 111 Updated Sep 2, 2025

spotify / annoy

Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk

C++ 13,982 1,209 Updated Jul 29, 2024

bentoml / BentoML

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Python 8,117 877 Updated Oct 8, 2025

tensorflow / recommenders

TensorFlow Recommenders is a library for building recommender system models using TensorFlow.

Python 1,979 294 Updated Sep 27, 2025

Kyubyong / dc_tts

A TensorFlow Implementation of DC-TTS: yet another text-to-speech model

Python 1,160 364 Updated Apr 14, 2023
