- San Francisco, CA
- https://www.linkedin.com/in/parano/
- @chaoyu_
- eyes_of_sf
Starred repositories
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
A fast communication-overlapping library for tensor/expert parallelism on GPUs.
A domain-specific language designed to streamline the development of high-performance GPU, CPU, and accelerator kernels
dstack is an open-source control plane for running development, training, and inference jobs on GPUs—across hyperscalers, neoclouds, or on-prem.
A Python-embedded DSL that makes it easy to write fast, scalable ML kernels with minimal boilerplate.
Automatically Discovering Fast Parallelization Strategies for Distributed Deep Neural Network Training
Development repository for the Triton language and compiler
Approximate Nearest Neighbors in C++/Python optimized for memory usage and loading/saving to disk
💫 Toolkit to help you get started with Spec-Driven Development
Benchmark and optimize LLM inference across frameworks with ease
Original Apollo 11 Guidance Computer (AGC) source code for the command and lunar modules.
The simplest, fastest repository for training/finetuning small-sized VLMs.
[DEPRECATED] Moved to ROCm/rocm-libraries repo. NOTE: develop branch is maintained as a read-only mirror
Open Source framework for voice and multimodal conversational AI
[ICLR2025 Spotlight] SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models
PyTorch native quantization and sparsity for training and inference
A better build tool for Java, Scala and Kotlin: Simpler than Maven, easier than Gradle, with 3-7x faster dev workflows than other JVM build tools
A nano Claude Code–like agent, built from 0 to 1