Tags · gty111/gLLM

v0.0.6

Kernel refactor: use sgl_kernel (#181)

* Use sgl-kernel and flashinfer

* Simplify build

* Update requirements

* Add health check

* Fix shape

* Fix attention

* Fix

* Fix

* Fix

* Fix conv3d

* Support endpoint

* Fix stop string

* Abstract conv3d module

* Fix moe and add conv file

* Fix weight for conv3d

* Fix

* Fix fused moe

* Fix

* Fix for moe and model max len

* Fix padding block

* Bump up to v0.0.6

* Clean up

---------

Co-authored-by: instinctguo <instinctguo@tencent.com>

May 15, 2026
707ddcd

v0.0.5

Bump up to version 0.0.5 (#147)

Dec 18, 2025
a17b1b0

v0.0.4

Add torchvision in requirements.txt (#120)

Sep 15, 2025
db34cb6

v0.0.3

Bump up to version 0.0.3 (#81)

Jun 22, 2025
d037af0

v0.0.2

Support TP 🎉 (#72)

* Initial support for TP

* Use random initialization

* Fix PP forward

* Downgrade to torch 2.6.0

* Fix env setting for MAX_JOBS

* Downgrade to torch 2.5.1

* Fix TP group init

* Fix annotation

* Make llama compatible for tp

* Make chatglm compatible for TP

* Make Qwen3 compatible for TP

* Remove weight_loader in fused_moe

* Make fused_moe compatible for TP; Abstract weight load function

* Make qwen_moe compatible for tp

* Make mixtral compatible for TP

* Update readme

* Abstract module attention; Clean up code for TP attention; Clean up code for model weights loading for glm

* Add MoE tuning config for A100 PCIE 40GB

* Refactor scheduler.py and AllocatorID

* Refactor IDAllocator

* Refactor worker scheduler

* Update readme

* Make embed_tokens and lm_head compatible for TP

* Fix multi-node zmq_comm

* Bump version to 0.1.0

Jun 15, 2025
577104b

v0.0.1

Add pyproject.toml (#62)

May 31, 2025
2487e89
