- Princeton University
- Princeton, NJ
- https://yinwei-dai.com
- @dai_yinwei
Stars
A Cluster-Wide Model Manager to Accelerate DNN Training via Automated Training Warmup
Aequitas enables RPC-level QoS in datacenter networks.
Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs
An extremely fast Python package and project manager, written in Rust.
SGLang is a high-performance serving framework for large language models and multimodal models.
Large Language Model (LLM) Systems Paper List
Mirage Persistent Kernel: Compiling LLMs into a MegaKernel
Artifact for "Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving" [SOSP '24]
Measure and optimize the energy consumption of your AI applications!
Infiniswap enables unmodified applications to efficiently use disaggregated memory.
Tiresias is a GPU cluster manager for distributed deep learning training.
Hydra adds resilience and high availability to remote memory solutions.
FedScale is a scalable and extensible open-source federated learning (FL) platform.
Justitia provides RDMA isolation between applications with diverse requirements.
An open-source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression, and hyperparameter tuning.
Lecture notes for Chris Peikert's graduate-level Theory of Cryptography course
EECS 489: Computer Networks @ the University of Michigan