wangyikewxgm
  • alibabacloud
  • Beijing


A fast communication-overlapping library for tensor/expert parallelism on GPUs.

C++ 1,208 85 Updated Aug 28, 2025

Nano vLLM

Python 9,847 1,238 Updated Nov 3, 2025

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

C++ 4,455 475 Updated Dec 20, 2025

A collection of compiler learning resources.

Python 2,618 362 Updated Mar 19, 2025

My learning notes for ML SYS.

Python 4,713 299 Updated Dec 19, 2025

Deep Reinforcement Learning

4,365 652 Updated Dec 10, 2022

verl: Volcano Engine Reinforcement Learning for LLMs

Python 17,649 2,859 Updated Dec 20, 2025

Train transformer language models with reinforcement learning.

Python 16,720 2,370 Updated Dec 20, 2025

Byted PyTorch Distributed for Hyperscale Training of LLMs and RLs

Python 910 53 Updated Nov 27, 2025

A throughput-oriented high-performance serving framework for LLMs

Jupyter Notebook 924 45 Updated Oct 29, 2025

NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference on NVIDIA GPUs. This repository contains the open source components of TensorRT.

C++ 12,488 2,293 Updated Dec 11, 2025

Get up and running with OpenAI gpt-oss, DeepSeek-R1, Gemma 3 and other models.

Go 157,971 13,971 Updated Dec 19, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 21,826 3,814 Updated Dec 20, 2025

Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"

Python 13,086 869 Updated Dec 17, 2024

A modular graph-based Retrieval-Augmented Generation (RAG) system

Python 29,870 3,153 Updated Dec 20, 2025

🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.

Python 154,087 31,495 Updated Dec 20, 2025

🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.

Python 20,309 2,130 Updated Dec 18, 2025

A high-throughput and memory-efficient inference and serving engine for LLMs

Python 65,831 12,091 Updated Dec 20, 2025

Fast and memory-efficient exact attention

Python 21,199 2,232 Updated Dec 20, 2025

Kubernetes community content

Jupyter Notebook 12,686 5,330 Updated Dec 19, 2025

Standardized Distributed Generative and Predictive AI Inference Platform for Scalable, Multi-Framework Deployment on Kubernetes

Shell 4,924 1,323 Updated Dec 19, 2025

AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.

Python 180,399 46,188 Updated Dec 20, 2025

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Python 51,257 8,584 Updated Nov 12, 2025

🦜🔗 The platform for reliable agents.

Python 122,315 20,162 Updated Dec 20, 2025

The Modern Application Platform.

Go 7,641 961 Updated Dec 16, 2025

Go client libraries for OpenAI

Go 450 34 Updated Nov 29, 2023

🤖 AI Gateway | AI Native API Gateway

Go 7,102 927 Updated Dec 20, 2025

Crane is a FinOps Platform for Cloud Resource Analytics and Economics in Kubernetes clusters. The goal is not only to help users to manage cloud cost easier but also ensure the quality of applicati…

Go 2,017 401 Updated Dec 20, 2024

An app development platform using cloud native stacks

Go 135 13 Updated Jul 15, 2022