Releases: thu-pacman/chitu

v0.5.0

12 Dec 11:58

Multiple improvements to cluster deployment performance:

  • Better support for hybrid DP+TP+EP parallelism.
  • A load-balancing strategy for MoE.
  • Optimizations to pre-processing and post-processing.
  • Multiple bug fixes.

Official Docker images:

  • NVIDIA: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-nvidia:v0.5.0
  • Muxi: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-muxi:v0.5.0
  • Ascend A2: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-ascend:v0.5.0
  • Ascend A3: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-ascend-a3:v0.5.0
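The image references above share one registry prefix and differ only in the repository suffix and the tag, so they can be assembled programmatically. The following is a minimal sketch under that assumption; the helper name `image_ref` and the short platform keys are hypothetical and not part of Chitu, and the mapping covers only the images listed for this release.

```python
# Registry prefix and repository names are copied from the release notes.
# The helper and the platform keys are illustrative assumptions.
REGISTRY = "qingcheng-ai-cn-beijing.cr.volces.com/public"

# Hypothetical short platform keys -> repository names listed for v0.5.0.
REPOSITORIES = {
    "nvidia": "chitu-nvidia",
    "muxi": "chitu-muxi",
    "ascend-a2": "chitu-ascend",
    "ascend-a3": "chitu-ascend-a3",
}


def image_ref(platform: str, tag: str = "v0.5.0") -> str:
    """Return the full image reference for a platform key and a tag."""
    try:
        repo = REPOSITORIES[platform]
    except KeyError:
        raise ValueError(f"unknown platform: {platform!r}") from None
    return f"{REGISTRY}/{repo}:{tag}"


if __name__ == "__main__":
    # Print one reference per listed platform.
    for key in REPOSITORIES:
        print(image_ref(key))
```

Each printed string can be passed to `docker pull`. Older releases use different tags and do not list an Ascend A3 image, so the tag should be taken from the notes of the release being installed.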

v0.4.3

18 Sep 17:56

Fixed some performance issues.


Official Docker images:

  • NVIDIA: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-nvidia:v0.4.3
  • Muxi: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-muxi:v0.4.3
  • Ascend: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-ascend:v0.4.3

v0.4.2

28 Aug 16:39

  • Added support for some new models.

  • Performance optimizations:

    • Support for Chunked Prefill.
    • Support for using DeepEP to optimize EP communication:
      • Requires a separate installation of nvshmem (see the official installation guide).
      • CUDA Graph can be enabled when using DeepEP.
  • Fixed some bugs.

Official Docker images:

  • NVIDIA: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-nvidia:v0.4.2
  • Muxi: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-muxi:v0.4.2
  • Ascend: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-ascend:v0.4.2

v0.4.1

14 Aug 12:47

  • Added support for Expert Parallelism (EP). Enable it by setting infer.ep_size, which currently must equal infer.tp_size, so that the attention part is parallelized with TP at the same degree of parallelism.
  • Added support for PD-disaggregated inference. This requires additional dependencies; for now, please build it manually based on the Dockerfile, following the mooncake configuration guideline.
  • Added support for hardware fp4 computation on NVIDIA Blackwell GPUs. This requires additional dependencies, which are available when building from blackwell.Dockerfile.
  • Added support for some new models. See docs/en/SUPPORTED_MODELS.md (public-main branch of thu-pacman/chitu) for details.
  • Fixed multiple bugs.
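
The EP constraint in this release (infer.ep_size must equal infer.tp_size) can be checked before launch. The sketch below is illustrative only: the helper name `parallel_overrides` is hypothetical, and the `key=value` form for the infer settings is an assumption modeled on the `models=...` command-line arguments shown in the v0.3.9 notes.

```python
def parallel_overrides(tp_size: int, ep_size: int) -> list[str]:
    """Build infer.tp_size / infer.ep_size settings after checking the v0.4.1 rule.

    Hypothetical helper, not part of Chitu. The key=value output format is an
    assumption; only the key names come from the release notes.
    """
    if tp_size < 1 or ep_size < 1:
        raise ValueError("tp_size and ep_size must be positive integers")
    if ep_size != tp_size:
        # v0.4.1 states that ep_size currently should equal tp_size.
        raise ValueError(
            f"infer.ep_size ({ep_size}) must equal infer.tp_size ({tp_size}) in v0.4.1"
        )
    return [f"infer.tp_size={tp_size}", f"infer.ep_size={ep_size}"]


if __name__ == "__main__":
    print(parallel_overrides(8, 8))
```

The check reflects the wording of v0.4.1 only; later releases mention improved hybrid DP+TP+EP support, so the equality rule may not apply to them.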

Official Docker images:

  • NVIDIA: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-nvidia:v0.4.1
  • Muxi: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-muxi:v0.4.1
  • Ascend: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-ascend:v0.4.1

v0.4.0

01 Aug 03:38

v0.4.0 marks a significant improvement over v0.3.x in performance and availability. We recommend that all medium-sized deployments (about 1-4 servers) upgrade to this version.

Highlighted changes:

  • Optimizations for platforms including NVIDIA GPUs, Ascend NPUs, MetaX GPUs, and Hygon DCUs.
  • Optimizations for models including DeepSeek-R1, Qwen3-32B, Kimi K2, GLM-4.5.

Official Docker images:

  • NVIDIA: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-nvidia:v0.4.0
  • Muxi: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-muxi:v0.4.0
  • Ascend: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-ascend:v0.4.0

v0.3.9

28 Jul 13:49

Newly supported models on multiple platforms, including NVIDIA GPUs and Ascend NPUs:

  • GLM-4.5: To use it, append the command-line arguments models=GLM-4.5 and models.ckpt_dir=/your/local/model/path when starting Chitu.
  • GLM-4.5-Air: To use it, append the command-line arguments models=GLM-4.5-Air and models.ckpt_dir=/your/local/model/path when starting Chitu.
  • Kimi-K2-Instruct: To use it, append the command-line arguments models=Kimi-K2-Instruct and models.ckpt_dir=/your/local/model/path when starting Chitu.
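
The three entries above follow the same pattern: a `models=<name>` argument plus a `models.ckpt_dir=<path>` argument. A minimal sketch of assembling that pair is shown below; the helper name `model_overrides` is hypothetical, and the launch command these arguments are appended to is deliberately left out because the release notes do not give it.

```python
def model_overrides(model: str, ckpt_dir: str) -> list[str]:
    """Return the two command-line arguments described in the v0.3.9 notes.

    Hypothetical helper, not part of Chitu. It only formats the arguments;
    it does not check that the model name is supported or that the path exists.
    """
    if not model or not ckpt_dir:
        raise ValueError("both a model name and a checkpoint directory are required")
    return [f"models={model}", f"models.ckpt_dir={ckpt_dir}"]


if __name__ == "__main__":
    # Placeholder path taken from the release notes; replace it with a real one.
    print(" ".join(model_overrides("GLM-4.5", "/your/local/model/path")))
```

The resulting strings are meant to be appended to whatever command is used to start Chitu in a given deployment.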

Official Docker images:

  • NVIDIA: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-nvidia:latest
  • Muxi: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-muxi:latest
  • Ascend: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-ascend:latest

v0.3.8

10 Jul 15:04

What's new:

  • Performance has been further optimized.
  • Mixed-precision quantized DeepSeek models (DeepSeek-R1-mix) are now released.
  • FP4 models are now compatible with Ascend 910B2.

Official Docker images:

  • NVIDIA: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-nvidia:latest
  • Muxi: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-muxi:latest
  • Ascend: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-ascend:latest

v0.3.7

26 Jun 13:23

What's new:

  • Initial support for Hygon DCU.
  • Optimized post-processing performance.
  • Added launch argument validation.

Official Docker images:

  • NVIDIA: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-nvidia:latest
  • Muxi: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-muxi:latest
  • Ascend: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-ascend:latest

v0.3.6

19 Jun 13:07

What's new:

  • Support for recent GLM models.
  • Automatic settings for the KV Cache manager and the task scheduler.
  • Fixed some known issues.

Official Docker images:

  • NVIDIA: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-nvidia:v0.3.6
  • Muxi: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-muxi:v0.3.6
  • Ascend: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-ascend:v0.3.6

v0.3.5

12 Jun 12:35

What's new:

  • Support for aclgraph on Ascend NPUs for higher performance.
  • Fixed some known issues.
  • We provide official Docker images:
    • NVIDIA: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-nvidia:0.3.5
    • Muxi: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-muxi:0.3.5
    • Ascend: qingcheng-ai-cn-beijing.cr.volces.com/public/chitu-ascend:0.3.5