ryantd

Organizations

@kubeflow

Starred repositories

An open-source long-horizon SuperAgent harness that researches, codes, and creates. With the help of sandboxes, memories, tools, skills, subagents and message gateway, it handles different levels of…

Python 73,295 9,907 Updated Jun 23, 2026

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,709 1,063 Updated Apr 30, 2026

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 5,631 871 Updated Jun 22, 2026

Meta Lingua: a lean, efficient, and easy-to-hack codebase to research LLMs.

Python 4,760 273 Updated Jul 18, 2025

A validation and profiling tool for AI infrastructure

Python 377 88 Updated Jun 22, 2026

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 2,638 321 Updated Jun 22, 2026

A Flexible Framework for Experiencing Heterogeneous LLM Inference/Fine-tune Optimizations

Python 17,313 1,321 Updated Jun 22, 2026

A plug-in replacement for DataLoader to load ImageNet disk-sequentially in PyTorch.

Python 239 20 Updated Aug 18, 2021

Universal Python binding for the LMDB 'Lightning' Database

C 741 122 Updated Jun 9, 2026

Vector Search Engine based on BRPC + FAISS

C++ 152 52 Updated Oct 21, 2019

LLM training in simple, raw C/CUDA

Cuda 30,302 3,657 Updated Jun 26, 2025

Grok open release

Python 51,692 8,472 Updated Aug 30, 2024

The official PyTorch implementation of Google's Gemma models

Python 5,697 598 Updated May 30, 2025

A simple program to calculate and visualize the FLOPs and parameters of PyTorch models, with a handy CLI and an easy-to-use Python API.

Python 131 10 Updated Nov 23, 2024

Minimal, clean code for the Byte Pair Encoding (BPE) algorithm commonly used in LLM tokenization.

Python 10,583 1,071 Updated Jul 1, 2024

Official PyTorch Implementation of "Scalable Diffusion Models with Transformers"

Python 8,636 796 Updated May 31, 2024

Make huge neural nets fit in memory

Python 2,840 279 Updated Apr 26, 2020

Provide Python access to the NVML library for GPU diagnostics

Python 270 34 Updated Sep 5, 2025

Building blocks for foundation models.

627 27 Updated Jan 3, 2024

Code repository for the paper - "Matryoshka Representation Learning"

Jupyter Notebook 641 43 Updated Feb 19, 2024

pytorch-profiler

Python 49 8 Updated Jun 1, 2023

FP16xINT4 LLM inference kernel that can achieve near-ideal ~4x speedups up to medium batch sizes of 16-32 tokens.

Python 1,092 88 Updated Sep 4, 2024

Efficient AI Inference & Serving

Python 480 31 Updated Jan 8, 2024

Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

Python 346 30 Updated Feb 23, 2025

A curated list of reinforcement learning with human feedback resources (continually updated)

4,397 254 Updated May 20, 2026

depyf is a tool to help you understand and adapt to the PyTorch compiler torch.compile.

Python 810 29 Updated Oct 13, 2025
Python 8,680 519 Updated Oct 9, 2024

High-speed Large Language Model Serving for Local Deployment

C++ 9,577 582 Updated May 11, 2026