- Berkeley, CA
- https://zongheng.me/
- @zongheng_yang
Stars
High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale
A collection of reproducible inference engine benchmarks
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, 20+ clouds, or on-prem).
Releasing the spot availability traces used in the "Can't Be Late" paper.
Multi-LoRA inference server that scales to 1000s of fine-tuned LLMs
⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud or AI HW.
A high-throughput and memory-efficient inference and serving engine for LLMs
UI tool for fine-tuning and testing your own LoRA models based on LLaMA, GPT-J and more. One-click run on Google Colab. + A Gradio ChatGPT-like Chat UI to demonstrate your language models.
An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.
Examples and instructions on using LLMs (especially ChatGPT) during a PhD
Distribute and run AI workloads on Kubernetes magically in Python, like PyTorch for ML infra.
Tutorial to get started with SkyPilot!
Training and serving large-scale neural networks with auto parallelization.
🔥 Blazing fast bulk data transfers between any cloud 🔥
A Domain-Agnostic Benchmark for Self-Supervised Learning
Run-time data-access policy enforcement for web applications.
Aqueduct is no longer being maintained. Aqueduct allows you to run LLM and ML workloads on any cloud infrastructure.
Move fast from data science prototype to pipeline. Capture, analyze, and transform messy notebooks into data pipelines with just two lines of code.
Balsa is a learned SQL query optimizer. It tailors the optimization of your SQL queries to find the best execution plans for your hardware and engine.
Source code and datasets for Ekya, a system for continuous learning on the edge.
A library that translates Python and NumPy to optimized distributed systems code.
State-of-the-art neural cardinality estimators for join queries