Stars
cuTile is a programming model for writing parallel kernels for NVIDIA GPUs
Helpful kernel tutorials, examples and SKILLs for tile-based GPU programming
Accompanying code for "Flipping Coins to Estimate Pseudocounts for Exploration in Reinforcement Learning", ICML 2023
Inference server benchmarking tool
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models across text, vision, audio, and multimodal tasks, for both inference and training.
Large Language Model Text Generation Inference
One-stop shop for running AI/ML on AWS.
Amazon SageMaker operator for Kubernetes