Stars
A library to analyze PyTorch traces.
A high-throughput and memory-efficient inference and serving engine for LLMs
CUPTI-based GPU profiling library exposing USDT hooks
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
The official repository for ERNIE 4.5 and ERNIEKit – its industrial-grade development toolkit based on PaddlePaddle.
SGLang is a high-performance serving framework for large language models and multimodal models.
A CPU+GPU profiling library that provides access to timeline traces and hardware performance counters.
eBPF-based always-on CPU/GPU profiler that auto-discovers targets in Kubernetes and systemd; zero code changes or restarts needed!
Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system, such as the Linux kernel, CPU, disks, Intel PT, and GPUs. Dynolog also …
eBPF Observability - Distributed Tracing and Profiling
Distributed tracing without code changes. 🚀 Instantly monitor any application using OpenTelemetry and eBPF
🔥 horizontally-scalable, highly-available, multi-tenant continuous profiling aggregation system
Continuous Profiling Platform. Debug performance issues down to a single line of code
Continuous profiling for analysis of CPU and memory usage, down to the line number and throughout time. Saving infrastructure cost, improving performance, and increasing reliability.
ebpf-go is a pure-Go library to read, modify and load eBPF programs and attach them to various hooks in the Linux kernel.
The production-scale datacenter profiler (C/C++, Go, Rust, Python, Java, NodeJS, .NET, PHP, Ruby, Perl, ...)
Hooks CUDA-related dynamic libraries using automated code generation tools.
NVIDIA Resiliency Extension is a Python package for framework developers and users to implement fault-tolerant features. It improves the effective training time by minimizing the downtime due to fa…
Trace your Python process line by line with eBPF!
The best way to write secure and reliable applications. Write nothing; deploy nowhere.
Tensors and dynamic neural networks in Python with strong GPU acceleration
DLRover: An Automatic Distributed Deep Learning System