- https://github.com/ROCmSoftwarePlatform/MIOpen
- San Diego, CA
- @_TejashShah_
Stars
Fast and memory-efficient exact attention
PjRt plugin and Python APIs for MPMD workflows in Jax
Experimental projects related to TensorRT
PyTorch compiler that accelerates training and inference. Get built-in optimizations for performance, memory, parallelism, and easily write your own.
A Fusion Code Generator for NVIDIA GPUs (commonly known as "nvFuser")
[DEPRECATED] Moved to ROCm/rocm-libraries repo