bigPYJ1151
🎯 Focusing

Organizations

@vllm-project

Starred repositories

You are a P8-level engineer who was once expected to go far. When Anthropic first set your level, its expectations for you were high. A high-agency skill for use by agents. Your AI has been placed on a PIP. 30 days to show improvement.

TypeScript 16,363 940 Updated Apr 17, 2026

High Performance LLM Inference Operator Library

C++ 828 82 Updated Apr 13, 2026

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Python 7,788 685 Updated Apr 17, 2026

Distributed Compiler based on Triton for Parallel Systems

Python 1,409 138 Updated Apr 17, 2026

A collection of benchmarks to measure basic GPU capabilities

C++ 513 82 Updated Oct 24, 2025

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 2,597 315 Updated Apr 9, 2026

depyf is a tool to help you understand and adapt to PyTorch compiler torch.compile.

Python 801 28 Updated Oct 13, 2025

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.

Scala 1,548 595 Updated Apr 17, 2026

[MLSys'24] Atom: Low-bit Quantization for Efficient and Accurate LLM Serving

Cuda 337 32 Updated Jul 2, 2024

Advanced Matrix Extensions (AMX) Guide

C++ 113 8 Updated Jan 11, 2022

CUDA Templates and Python DSLs for High-Performance Linear Algebra

C++ 9,599 1,797 Updated Apr 17, 2026

Unified Collective Communication Library

C 303 128 Updated Apr 15, 2026

A CPU tool for benchmarking peak floating-point performance

Assembly 580 132 Updated Feb 7, 2026

TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…

Python 13,406 2,298 Updated Apr 18, 2026

Explain complex systems using visuals and simple terms. Help you prepare for system design interviews.

81,983 9,033 Updated Apr 4, 2025

[ICLR 2024] Efficient Streaming Language Models with Attention Sinks

Python 7,215 398 Updated Jul 11, 2024

⚡ Fastest SQL ETL pipeline in a single C++ binary, built for stream processing, observability, analytics and AI/ML

C++ 2,189 107 Updated Apr 17, 2026

Coral is a translation, analysis, and query rewrite engine for SQL and other relational languages.

Java 897 209 Updated Apr 15, 2026

This is an online course where you can learn and master the skill of low-level performance analysis and tuning.

C++ 3,654 368 Updated Apr 15, 2026

A modern high-performance open source message queuing system

C++ 3,160 181 Updated Apr 17, 2026

Tensor library for machine learning

C++ 14,463 1,561 Updated Apr 14, 2026

Practical GPU Sharing Without Memory Size Constraints

C 308 33 Updated Mar 28, 2025

Play with MLIR right in your browser

TypeScript 139 8 Updated May 25, 2023

The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Jupyter Notebook 53,973 6,323 Updated Sep 18, 2024

JARVIS, a system to connect LLMs with ML community. Paper: https://arxiv.org/pdf/2303.17580.pdf

Python 24,653 2,101 Updated Jul 29, 2025

The AI Code Editor

32,662 2,223 Updated Jan 31, 2026

OneFlow is a deep learning framework designed to be user-friendly, scalable and efficient.

C++ 9,390 1,015 Updated Dec 4, 2025

A course of building an LSM-Tree storage engine (database) in a week.

Rust 3,958 606 Updated Mar 25, 2026

`std::execution`, the proposed C++ framework for asynchronous and parallel programming.

C++ 2,311 240 Updated Apr 18, 2026

Slab allocator for Rust

Rust 887 104 Updated Jan 31, 2026