- Denver, CO
- https://www.aaronbatilo.dev
- @aaronbatilo
- https://sliceofexperiments.com
Highlights
- Pro
Starred repositories
NVIDIA AITune is an inference toolkit designed for tuning and deploying Deep Learning models with a focus on NVIDIA GPUs.
Google Workspace CLI — one command-line tool for Drive, Gmail, Calendar, Sheets, Docs, Chat, Admin, and more. Dynamically built from Google Discovery Service. Includes AI agent skills.
A CLI to estimate inference memory requirements for Hugging Face models, written in Python.
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
GLM-ASR-Nano: A robust, open-source speech recognition model with 1.5B parameters
NVSentinel is a cross-platform fault remediation service designed to rapidly remediate runtime node-level issues in GPU-accelerated computing environments
Checkpoint-engine is a simple middleware to update model weights in LLM inference engines
Nsight Python is a Python kernel profiling interface based on NVIDIA Nsight Tools
Autonomous GPU Kernel Generation & Optimization via Deep Agents
Minimalistic 4D-parallelism distributed training framework for educational purposes
An Efficient and User-Friendly Scaling Library for Reinforcement Learning with Large Language Models
Distribute and run AI workloads on Kubernetes magically in Python, like PyTorch for ML infra.
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
agent-sandbox enables easy management of isolated, stateful, singleton workloads, ideal for use cases like AI agent runtimes.
Beads - A memory upgrade for your coding agent
An agentic skills framework & software development methodology that works.
Open Source Continuous Inference Benchmarking Qwen3.5, DeepSeek, GPTOSS - GB200 NVL72 vs MI355X vs B200 vs GB300 NVL72 vs H100 & soon™ TPUv6e/v7/Trainium2/3
A scalable asynchronous reinforcement learning implementation with in-flight weight updates.
GenAI inference performance benchmarking tool
A storage solution for PyTorch tensors with distributed tensor support.