225 results for source starred repositories

A collection of Chinese text-to-image Stable Diffusion models

401 23 Updated Jan 14, 2026

A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.

Python 3,304 410 Updated Jan 19, 2026

rdma_demo

Python 1 Updated Apr 9, 2025

Astron-xmod-shim — Lightweight, declarative middleware for reliably converging AI service workloads.

Go 98 15 Updated Nov 3, 2025

Cross-platform AI workflow DSL converter supporting iFlytek Spark, Dify, and Coze platforms with unified intermediate representation and bidirectional transformation capabilities.

Go 20 3 Updated Nov 21, 2025
Go 73 3 Updated Sep 15, 2025

A workload for deploying LLM inference services on Kubernetes

Go 168 42 Updated Jan 30, 2026

Open Model Engine (OME) — Kubernetes operator for LLM serving, GPU scheduling, and model lifecycle management. Works with SGLang, vLLM, TensorRT-LLM, and Triton

Go 365 59 Updated Jan 31, 2026

This is a simple implementation of an MCP server using iFlytek. It enables calling iFlytek workflows through MCP tools.

Python 27 7 Updated Mar 28, 2025

easy version of pyverbs

Python 6 2 Updated Apr 16, 2023

A lightweight data processing framework built on DuckDB and 3FS.

Python 4,921 442 Updated Mar 5, 2025

Expert Parallelism Load Balancer

Python 1,343 200 Updated Mar 24, 2025

Analyze computation-communication overlap in V3/R1.

1,140 145 Updated Mar 21, 2025

A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.

Python 2,918 313 Updated Jan 14, 2026

My learning notes for ML SYS.

Python 5,267 342 Updated Jan 30, 2026

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda 6,154 810 Updated Feb 3, 2026

DeepEP: an efficient expert-parallel communication library

Cuda 8,959 1,084 Updated Feb 3, 2026

FlashMLA: Efficient Multi-head Latent Attention Kernels

C++ 12,445 977 Updated Jan 20, 2026

Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation

7,962 288 Updated May 15, 2025

SGLang is a high-performance serving framework for large language models and multimodal models.

Python 23,241 4,320 Updated Feb 4, 2026

🪄 Turns your machine learning code into microservices with web API, interactive GUI, and more.

Python 3,134 164 Updated Jan 7, 2026

A collection of community maintained NRI plugins

Go 100 31 Updated Feb 4, 2026

SciLifeLab Serve is a platform offering machine learning model serving, data science app hosting (Shiny, Gradio, Streamlit, Dash, etc.), and other tools to life science researchers affiliated with …

Python 14 3 Updated Feb 4, 2026

Examples of models deployable with Truss

Python 214 57 Updated Feb 4, 2026

BERT classification model for processing texts longer than 512 tokens. Text is first divided into smaller chunks and after feeding them to BERT, intermediate results are pooled. The implementation …

Python 146 32 Updated Jun 19, 2024

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 69,482 13,186 Updated Feb 4, 2026

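Llama Chinese community: a real-time roundup of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama large models. Fully open source and available for commercial use.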

Python 14,748 1,306 Updated Apr 6, 2025

Free ChatGPT & DeepSeek API key. Free access to the DeepSeek API and GPT4 API, with support for gpt | deepseek | claude | gemini | grok and other top-ranked, widely used large models.

Python 35,927 2,531 Updated Jan 10, 2026

Slim(toolkit): Don't change anything in your container image and minify it by up to 30x (and for compiled languages even more) making it secure too! (free and open source)

Go 22,802 804 Updated Jan 23, 2026

Transforms complex documents like PDFs into LLM-ready markdown/JSON for your Agentic workflows.

Python 53,799 4,464 Updated Feb 2, 2026