Shanghai Jiao Tong University
- Shanghai
Stars
GoLongRL: Capability-Oriented Long Context Reinforcement Learning with Multitask Alignment
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
Personal deep learning study notes and tutorial-style notebooks
🍎 One kernel a day keeps high latency away. A hands-on CUDA learning path featuring a rich collection of kernels, from the basics to peak performance, seamlessly integrated as PyTorch C++ extensions.
Hundreds of agent skills for medical research, including protocol design, data analysis, evidence insights, and academic writing.
Vortex: Programmable Sparse Attention for Agents as Algorithm Designers
A straightforward method for training your LLM, from downloading data to generating text.
UniRL is a Framework for Unified Multimodal Model Reinforcement Learning
Frontier: A Discrete-Event Simulator for Modern LLM Serving
A unified framework for building, running, and training general agents at scale.
AI agent skill that researches any topic across Reddit, X, YouTube, HN, Polymarket, and the web - then synthesizes a grounded summary
Tutorials about polyhedral compilation.
Large DNNs training framework for consumer GPUs
Go sidecar proxy that eliminates Head-of-Line Blocking in LLM inference via ML-driven SJF scheduling — zero backend modification. Paper in preparation
Virtual whiteboard for sketching hand-drawn like diagrams
Foundry materializes CUDA graphs along with their execution context to disk to support fast cold start of serving engines.
Virtual Decoupled Cores: Composable Programming Framework and Runtime for Async GPUs
BlitzScale Router - Distributed LLM Inference Router (Rust)
SwiftRDMA -- Exposing RDMA NIC Resources for Software-Defined RDMA Scheduling
Modern RL Post-training Infrastructure: Optimized for NVIDIA/AMD GPUs with a focus on vLLM and DeepSpeed integration, CUDA/ROCm/Triton kernels, and transparent hardware-aware scaling.