Releases: xorbitsai/inference

v1.15.0

13 Dec 15:30
b2adcee

What's new in 1.15.0 (2025-12-13)

These are the changes in inference v1.15.0.

New features

Enhancements

Bug fixes

Documentation

  • DOC: add new models and v1.14.0 release notes by @qinxuye in #4305

Others

New Contributors

Full Changelog: v1.14.0...v1.15.0

v1.14.0

30 Nov 02:16
94c1c27

What's new in 1.14.0 (2025-11-30)

These are the changes in inference v1.14.0.

New features

Enhancements

Bug fixes

Documentation

Others

Full Changelog: v1.13.0...v1.14.0

v1.13.0

15 Nov 03:11
5451ed1

What's new in 1.13.0 (2025-11-15)

These are the changes in inference v1.13.0.

New features

Enhancements

Bug fixes

Documentation

Others

  • chore: sync models JSON [audio, embedding, image, llm, rerank, video] by @XprobeBot in #4214
  • chore: sync models JSON [audio, embedding, image, llm, rerank, video] by @XprobeBot in #4226
  • chore: sync models JSON [audio] by @XprobeBot in #4243

Full Changelog: v1.12.0...v1.13.0

v1.12.0

02 Nov 13:25
117ba29

What's new in 1.12.0 (2025-11-02)

These are the changes in inference v1.12.0.

New features

Enhancements

Bug fixes

Documentation

  • DOC: add release notes doc by @qinxuye in #4157
  • DOC: Add PyPI mirror configuration guide for audio package installation by @qiulang in #4177

Others

Full Changelog: v1.11.0...v1.12.0

v1.11.0.post1

20 Oct 12:02
378b991

What's new in 1.11.0.post1 (2025-10-20)

These are the changes in inference v1.11.0.post1.

Bug fixes

Others

  • BLD: fix transformers version in cu128 dockerfile by @zwt-1234 in #4152

Full Changelog: v1.11.0...v1.11.0.post1

v1.11.0

19 Oct 12:54
baaa40b

What's new in 1.11.0 (2025-10-19)

These are the changes in inference v1.11.0.

New features

Enhancements

Bug fixes

Documentation

Others

New Contributors

Full Changelog: v1.10.1...v1.11.0

v1.10.1

01 Oct 00:56
71313a4

What's new in 1.10.1 (2025-10-01)

These are the changes in inference v1.10.1.

New features

Enhancements

Bug fixes

  • BUG: Optimize rerank model lookup logic and add support for video model type by @amumu96 in #4063
  • BUG: Fix seed-oss required VLLM_VERSION by @Jun-Howie in #4071
  • BUG: fix register_model when model name is duplicated by @llyycchhee in #4076
  • BUG: [UI] fix the custom model drawer component failing to open by @yiboyasss in #4089
  • BUG: Fix the issue where registered models cannot use tools by @amumu96 in #4100
  • BUG: fix finish_reason field handling logic by @amumu96 in #4105
  • BUG: vllm structured output compatibility by @OliverBryant in #4111

Documentation

New Contributors

Full Changelog: v1.10.0...v1.10.1

v1.10.0

13 Sep 12:21
b018733

What's new in 1.10.0 (2025-09-13)

These are the changes in inference v1.10.0.

New features

Enhancements

Bug fixes

New Contributors

Full Changelog: v1.9.1...v1.10.0

v1.9.1

30 Aug 12:07
b2d793d

What's new in 1.9.1 (2025-08-30)

These are the changes in inference v1.9.1.

New features

Enhancements

  • ENH: add zero-shot and voice cloning ability for audio models by @qianduoduo0904 in #3968
  • ENH: Add Template for Qwen3 Reranker when model_engine = vllm by @zhcn000000 in #3983
  • ENH: Update the environment dependencies for cosyvoice2 by @Gmgge in #4015
  • ENH: Compat with xllamacpp 0.2.0 by @codingl2k1 in #4004
  • ENH: support chat_template_kwargs for llama.cpp by @qinxuye in #3988
  • BLD: Clean up Docker's last legacy cache and images before executing each step by @zwt-1234 in #3963
  • BLD: fix CI failures by @qinxuye in #4002

Bug fixes

  • BUG: disable flash_attention when GPU compute capability < 8.0 by @amumu96 in #3973
  • BUG: fix rerank model creation by @qinxuye in #3977
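
For readers wondering what the flash_attention fix (#3973) implies in practice, here is a minimal sketch of a capability gate. It is not the code from that PR: the names `MIN_FLASH_ATTN_CAPABILITY`, `supports_flash_attention`, and `pick_attn_implementation` are hypothetical, and the fallback backend name is an assumption chosen for illustration.

```python
from typing import Tuple

# Assumption: the gate mirrors the changelog entry, i.e. flash attention is
# only enabled for GPUs with compute capability 8.0 or higher.
MIN_FLASH_ATTN_CAPABILITY: Tuple[int, int] = (8, 0)


def supports_flash_attention(capability: Tuple[int, int]) -> bool:
    """Return True when a (major, minor) compute capability meets the minimum."""
    # Tuples compare element by element, so (8, 6) >= (8, 0) and (7, 5) < (8, 0).
    return tuple(capability) >= MIN_FLASH_ATTN_CAPABILITY


def pick_attn_implementation(capability: Tuple[int, int]) -> str:
    """Hypothetical helper: choose an attention backend name for a device."""
    # "sdpa" as the fallback is an illustrative choice, not taken from the PR.
    return "flash_attention_2" if supports_flash_attention(capability) else "sdpa"
```

With PyTorch installed and a CUDA device present, the `(major, minor)` tuple could come from `torch.cuda.get_device_capability()`; the sketch takes it as a plain argument so the check stays testable without a GPU.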

Documentation

Others

New Contributors

Full Changelog: v1.9.0...v1.9.1

v1.9.0

16 Aug 15:41
6e129a8

What's new in 1.9.0 (2025-08-16)

These are the changes in inference v1.9.0.

New features

Enhancements

Bug fixes

Documentation

Others

  • Replace @torch.no_grad() with @torch.inference_mode() in Qwen3-Reranker by @yasu-oh in #3911

Full Changelog: v1.8.1...v1.9.0