Stars
Exploring how optimizations for GEMMs work
This module defines a type system for distributed training code, based on JAX's sharding-in-types but adapted for the PyTorch ecosystem.
Optimized primitives for collective multi-GPU communication