Tutorials for NVIDIA CUPTI samples

C++ 61 13 Updated Nov 3, 2025

A library to analyze PyTorch traces.

Python 480 84 Updated Mar 17, 2026

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 74,692 14,950 Updated Mar 30, 2026

CUPTI based GPU profiling library exposing usdt hooks

C 28 1 Updated Mar 24, 2026

Collects and organizes Cursor rules files from various projects, providing rule support for multiple programming languages and frameworks.

1,749 348 Updated Jan 25, 2026

A fast compressor/decompressor

C++ 6,551 1,029 Updated Mar 6, 2026

Implement a ChatGPT-like LLM in PyTorch from scratch, step by step

Jupyter Notebook 89,516 13,667 Updated Mar 26, 2026

The official repository for ERNIE 4.5 and ERNIEKit – its industrial-grade development toolkit based on PaddlePaddle.

Python 7,695 1,452 Updated Jan 4, 2026

CUDA/Metal accelerated language model inference

C 631 32 Updated May 29, 2025

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 25,198 5,051 Updated Mar 30, 2026

A CPU+GPU Profiling library that provides access to timeline traces and hardware performance counters.

HTML 938 240 Updated Mar 29, 2026

eBPF based always-on CPU/GPU profiler auto-discovering targets in Kubernetes and systemd, zero code changes or restarts needed!

Go 714 88 Updated Mar 28, 2026

Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals

JavaScript 13,335 432 Updated Mar 23, 2026

Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system like the linux kernel, CPU, disks, Intel PT, GPUs etc. Dynolog also …

C++ 368 81 Updated Mar 25, 2026

eBPF Observability - Distributed Tracing and Profiling

Go 3,949 435 Updated Mar 30, 2026

Distributed tracing without code changes. 🚀 Instantly monitor any application using OpenTelemetry and eBPF

Go 3,644 243 Updated Mar 30, 2026

High-level tracing language for Linux

C++ 10,020 1,445 Updated Mar 30, 2026

🔥 horizontally-scalable, highly-available, multi-tenant continuous profiling aggregation system

Go 2,030 69 Updated Jul 19, 2023

Continuous Profiling Platform. Debug performance issues down to a single line of code

Go 11,320 735 Updated Mar 30, 2026

Continuous profiling for analysis of CPU and memory usage, down to the line number and throughout time. Saving infrastructure cost, improving performance, and increasing reliability.

TypeScript 4,829 247 Updated Mar 30, 2026

ebpf-go is a pure-Go library to read, modify and load eBPF programs and attach them to various hooks in the Linux kernel.

Go 7,624 842 Updated Mar 30, 2026

The production-scale datacenter profiler (C/C++, Go, Rust, Python, Java, NodeJS, .NET, PHP, Ruby, Perl, ...)

Go 3,071 389 Updated Mar 30, 2026

Hooked CUDA-related dynamic libraries by using automated code generation tools.

C 171 47 Updated Dec 12, 2023

NVIDIA Resiliency Extension is a python package for framework developers and users to implement fault-tolerant features. It improves the effective training time by minimizing the downtime due to fa…

Python 275 48 Updated Mar 28, 2026

Sampling profiler for Python programs

Rust 15,071 505 Updated Mar 5, 2026

Trace your python process line by line with eBPF!

Python 260 5 Updated Feb 19, 2023

The best way to write secure and reliable applications. Write nothing; deploy nowhere.

Dockerfile 65,113 4,828 Updated Aug 7, 2024

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 98,646 27,342 Updated Mar 30, 2026

DLRover: An Automatic Distributed Deep Learning System

Python 1,641 213 Updated Mar 27, 2026

GLM (General Language Model)

Python 1 Updated Aug 29, 2024