zrl4836

Starred repositories

Graphs that teach > graphs that impress. Turn any code, or knowledge base (Karpathy LLM wiki), into an interactive knowledge graph you can explore, search, and ask questions about. Works with Claud…

TypeScript 9,540 806 Updated Apr 26, 2026

Real-time visualization of Claude Code agent orchestration — see your agents think, branch, and coordinate as they work.

TypeScript 861 90 Updated Apr 25, 2026

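Many container images are hosted overseas, for example on gcr. Downloads from within China are slow and need acceleration. This project aims to provide a stable, reliable, and secure container image service that connects to the whole world.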

Shell 13,958 1,508 Updated Apr 27, 2026

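Sharing AI Infra knowledge and coding exercises: getting started with the PyTorch/vLLM/SGLang frameworks ⚡️, performance acceleration 🚀, LLM fundamentals 🧠, AI software and hardware 🔧, and more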

Jupyter Notebook 2,034 162 Updated Apr 30, 2026

Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞

TypeScript 366,669 75,282 Updated Apr 30, 2026

CUDA Agent: Large-Scale Agentic RL for High-Performance CUDA Kernel Generation

Python 940 75 Updated Mar 4, 2026

Public repository for Agent Skills

Python 126,463 14,831 Updated Apr 23, 2026

Give your agents the power of the Hugging Face ecosystem

Python 10,362 653 Updated Apr 30, 2026

KV cache store for distributed LLM inference

C++ 412 37 Updated Nov 13, 2025

Qwen3-TTS is an open-source series of TTS models developed by the Qwen team at Alibaba Cloud, supporting stable, expressive, and streaming speech generation, free-form voice design, and vivid voice…

Python 11,065 1,447 Updated Mar 17, 2026

A generative speech model for daily dialogue.

Python 39,175 4,247 Updated Apr 10, 2026

Custom comfyui nodes for vllm-omni

Python 3 2 Updated Jan 5, 2026

Build compute kernels and load them from the Hub.

Python 619 79 Updated Apr 30, 2026

A framework for efficient model inference with omni-modality models

Python 4,571 861 Updated Apr 30, 2026

Heterogeneous GPU Sharing on Kubernetes

Go 3,385 545 Updated Apr 30, 2026

Light Image Video Generation Inference Framework

Python 2,228 192 Updated Apr 30, 2026

xDiT: A Scalable Inference Engine for Diffusion Transformers (DiTs) with Massive Parallelism

Python 2,610 317 Updated Apr 27, 2026

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

C++ 28,331 8,869 Updated Apr 30, 2026

A repo for all spark examples using Rapids Accelerator including ETL, ML/DL, etc.

Jupyter Notebook 168 62 Updated Apr 29, 2026
Python 140 16 Updated Mar 5, 2026

Nano vLLM

Python 13,196 2,019 Updated Apr 26, 2026

Tongyi Deep Research, the Leading Open-source Deep Research Agent

Python 18,778 1,447 Updated Feb 27, 2026

LLM notes, including model inference, transformer model structure, and llm framework code analysis notes.

Python 879 87 Updated Apr 16, 2026

Wan: Open and Advanced Large-Scale Video Generative Models

Python 15,524 1,909 Updated Mar 17, 2026

Open-Source Frontier Voice AI

Python 45,987 5,076 Updated Apr 24, 2026

A PyTorch-native inference engine with cache, parallelism, quantization for Diffusion Transformers.

Python 1,156 70 Updated Apr 29, 2026

Accelerating CosyVoice2 inference with vLLM

Jupyter Notebook 493 65 Updated Apr 26, 2025

Serving multiple LoRA finetuned LLM as one

Python 1,156 62 Updated May 8, 2024

Utilities intended for use with Llama models.

Python 7,582 1,368 Updated Feb 11, 2026

Context engineering is the new vibe coding - it's the way to actually make AI coding assistants work. Claude Code is the best for this so that's what this repo is centered around, but you can apply…

Python 13,241 2,707 Updated Mar 16, 2026