Stars
Official implementation of UniSHARP: Universal Sharp Monocular View Synthesis
Model export recipes, Python primitives, and Swift runtime utilities for on-device AI
Bridges PyTorch and Core AI. Convert existing models to Core AI IR, or author new ones from PyTorch via composite ops, custom op lowerings, and inline Metal GPU kernels.
A library for PyTorch model compression and optimizations for deployment via Core AI on Apple silicon.
Code for the manuscript "A self-supervised multi-layer network of Rectified Spectral Units (ReSUs)" submitted to NeurIPS
Instant neural graphics primitives: lightning fast NeRF and more
[NeurIPS 2025] Instant4D: 4D Gaussian Splatting in Minutes
Lightning fast C++/CUDA neural network framework
WaveFormer: Frequency-Time Decoupled Vision Modeling with Wave Equation
CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation
KernelBench: Can LLMs Write GPU Kernels? - Benchmark + Toolkit with Torch -> CUDA (+ more DSLs)
Training neural networks on Apple Neural Engine via reverse-engineered private APIs
Sharp Monocular View Synthesis in Less Than a Second
PyTorch native quantization and sparsity for training and inference
Ultra-low bitrate neural audio codec (0.31~1.40 kbps) with better semantics in the latent space.
Open source implementation of Apple's SwiftUI.
JAX codebase for Evolutionary Strategies at the Hyperscale
On-device AI across mobile, embedded and edge for PyTorch
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second.
DiffusionLight-Turbo reimplementation using Diffusers' callback
Fast light estimation in under 30 seconds with DiffusionLight-Turbo
NVIDIA Math Libraries for the Python Ecosystem