🎯 Focusing
PhD student, UCSD CSE; BS, ZJU CS
Highlights
- Pro
TensorRT-LLM Public
Forked from WeiHaocheng/TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficie…
C++ · Apache License 2.0 · Updated Sep 4, 2025
vllm Public
Forked from vllm-project/vllm
A high-throughput and memory-efficient inference and serving engine for LLMs
sglang-rebase Public
Forked from thu-wyz/sglang
SGLang is a fast serving framework for large language models and vision language models.
Python · Apache License 2.0 · Updated Sep 8, 2024
flash-attention-lookahead Public
fairseq Public
Forked from facebookresearch/fairseq
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Python · MIT License · Updated May 6, 2023
PatrickStar Public
Forked from Tencent/PatrickStar
PatrickStar enables Larger, Faster, Greener Pretrained Models for NLP and democratizes AI for everyone.
Python · BSD 3-Clause "New" or "Revised" License · Updated May 6, 2023