Stars
Early-stage Rust drop-in alternative frontend for vLLM
FlashInfer: Kernel Library for LLM Serving
Notes on AI infrastructure, inference systems, and engineering trade-offs.
An Asynchronous Reinforcement Learning Engine for Omni-Modal Post-Training at Scale
Bridge local AI coding agents (Claude Code, Cursor, Gemini CLI, Codex) to messaging platforms (Feishu/Lark, DingTalk, Slack, Telegram, Discord, LINE, WeChat Work). Chat with your AI dev assistant f…
Agent2Agent (A2A) is an open protocol enabling communication and interoperability between opaque agentic applications.
Synchronizing Claude Code conversations across machines
A smarter cd command. Supports all major shells.
Ming-omni-tts: Simple and Efficient Unified Generation of Speech, Music, and Sound with Precise Control
vLLM Model plugin for the encoder-decoder BART model
MOSS-TTS Family is an open-source speech and sound generation model family from MOSI.AI and the OpenMOSS team. It is designed for high-fidelity, high-expressiveness, and complex real-world scenario…
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
DFlash: Block Diffusion for Flash Speculative Decoding
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
A debugging and profiling tool that can trace and visualize python code execution
Community maintained hardware plugin for vLLM on Apple Silicon
USP: Unified (a.k.a. Hybrid, 2D) Sequence Parallel Attention for Long Context Transformers Model Training and Inference
Tile-Based Runtime for Ultra-Low-Latency LLM Inference
A high-performance, lightweight router for large-scale vLLM deployments
Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
A PyTorch-native inference engine with caching, parallelism, and quantization for Diffusion Transformers.
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
A framework for efficient inference with omni-modal models
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.
An early research stage expert-parallel load balancer for MoE models based on linear programming.