Releases: xorbitsai/inference

v0.12.2

21 Jun 09:14
5cef7c3

What's new in 0.12.2 (2024-06-21)

These are the changes in inference v0.12.2.

New features

  • FEAT: Add tools support for Qwen series MoE models by @zhanghx0905 in #1642
  • FEAT: [UI] Modify the deletion function for custom models by @yiboyasss in #1656
  • FEAT: [UI] Present custom model JSON data and allow editing it by @yiboyasss in #1670
  • FEAT: Add rerank model token input/output usage by @wxiwnd in #1657

Enhancements

  • ENH: Continuous batching now supports all models with the transformers backend by @ChengjieLi28 in #1659 (see the sketch below)
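
A minimal sketch of what this enables, assuming the server was started with XINFERENCE_TRANSFORMERS_ENABLE_BATCHING=1 (treated here as the opt-in switch), that an LLM is already served through the transformers engine under the placeholder UID "my-llm", and that chat responses follow the OpenAI-compatible shape:

    from concurrent.futures import ThreadPoolExecutor
    from xinference.client import Client

    client = Client("http://127.0.0.1:9997")   # assumed local endpoint
    model = client.get_model("my-llm")         # placeholder model UID

    prompts = ["Hello!", "Explain continuous batching in one line.", "Write a haiku."]
    with ThreadPoolExecutor(max_workers=len(prompts)) as pool:
        # Concurrent requests against the same transformers-backed model can now
        # be batched on the server instead of being handled strictly one by one.
        replies = list(pool.map(model.chat, prompts))
    for reply in replies:
        print(reply["choices"][0]["message"]["content"])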

Bug fixes

  • BUG: Show an error when a user launches a quantized model without a supported device by @Minamiyama in #1645
  • BUG: Fix default rerank type by @codingl2k1 in #1649
  • BUG: Fix chat_completion not responding after errors occur more than 100 times by @liuzhenghua in #1663

Tests

Others

Full Changelog: v0.12.1...v0.12.2

v0.12.1

14 Jun 09:31
34a57df

What's new in 0.12.1 (2024-06-14)

These are the changes in inference v0.12.1.

New features

Enhancements

Bug fixes

Others

New Contributors

Full Changelog: v0.12.0...v0.12.1

v0.12.0

07 Jun 07:27
55c5636

What's new in 0.12.0 (2024-06-07)

These are the changes in inference v0.12.0.

New features

Enhancements

  • ENH: Make CogVLM2 support streaming output by @Minamiyama in #1572
  • BLD: Clean up all Docker images after building the image on the self-hosted machine by @ChengjieLi28 in #1595
  • BLD: Fix pip resolving multiple versions of some packages during installation by @ChengjieLi28 in #1603

Bug fixes

Documentation

New Contributors

Full Changelog: v0.11.3...v0.12.0

v0.11.3

31 May 09:28
69c09cd

What's new in 0.11.3 (2024-05-31)

These are the changes in inference v0.11.3.

New features

Enhancements

Bug fixes

  • BUG: Fix model launch error when using torch 2.3.0 by @amumu96 in #1543
  • BUG: Fix image path error for VL models by @amumu96 in #1559
  • BUG: Fix validation errors when defining a custom baichuan-chat LLM model by @buptzyf in #1557

Documentation

  • DOC: Update README and fix the description of model engine by @qinxuye in #1566

Others

New Contributors

Full Changelog: v0.11.2...v0.11.3

v0.11.2.post1

24 May 11:52
ac8f334

What's new in 0.11.2.post1 (2024-05-24)

These are the changes in inference v0.11.2.post1, a hotfix version of v0.11.2.

Bug fixes

  • BUG: Fix model launch error when using torch 2.3.0 by @amumu96 in #1543

Full Changelog: v0.11.2...v0.11.2.post1

v0.11.2

24 May 09:10
77e79f8

What's new in 0.11.2 (2024-05-24)

These are the changes in inference v0.11.2.

New features

Enhancements

Bug fixes

  • BUG: Fix worker startup failure due to a None device name by @codingl2k1 in #1539
  • BUG: Fix gpu_idx allocation error when replica > 1 by @amumu96 in #1528

Others

Full Changelog: v0.11.1...v0.11.2

v0.11.1

17 May 07:17
55a0200

What's new in 0.11.1 (2024-05-17)

These are the changes in inference v0.11.1.

New features

  • FEAT: Support Yi-1.5 series by @qinxuye in #1489
  • FEAT: [UI] Support specifying GPU or CPU for embedding and rerank models by @yiboyasss in #1491

Enhancements

Bug fixes

Documentation

New Contributors

Full Changelog: v0.11.0...v0.11.1

v0.11.0

11 May 09:41
21be5ab

What's new in 0.11.0 (2024-05-11)

These are the changes in inference v0.11.0.

Breaking Changes

v0.11.0 introduces a breaking change when launching models: model_engine must now be specified. Refer to Model Engine for more information.
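
For example, a minimal sketch of the new launch flow with the Python client; the endpoint, model name, and size below are placeholders, and the exact engine strings accepted depend on your installation:

    from xinference.client import Client

    client = Client("http://127.0.0.1:9997")   # assumed local endpoint
    model_uid = client.launch_model(
        model_name="qwen1.5-chat",             # placeholder built-in model
        model_engine="transformers",           # now required, e.g. transformers or vllm
        model_size_in_billions=4,              # placeholder size
    )
    print(model_uid)

On the command line, the same requirement is expressed through the --model-engine option of xinference launch.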

New features

Enhancements

Bug fixes

Tests

  • TST: Pin huggingface-hub to pass CI since it has some breaking changes by @ChengjieLi28 in #1427

Documentation

Others

  • BUG: Fix metrics being empty when calling /v1/chat/completions by @amumu96 in #1406

New Contributors

Full Changelog: v0.10.3...v0.11.0

v0.10.3

24 Apr 02:57
2ba72b0

What's new in 0.10.3 (2024-04-24)

These are the changes in inference v0.10.3.

New features

Enhancements

Bug fixes

  • BUG: Fix launching embedding or rerank models from the command line failing due to PEFT by @hainaweiben in #1343
  • BUG: Fix extra parameters issue when auto-recovering models by @ChengjieLi28 in #1348
  • BUG: Fix issue with old rerank models using the rerank flag by @codingl2k1 in #1350

Documentation

New Contributors

Full Changelog: v0.10.2.post1...v0.10.3

v0.10.2.post1

19 Apr 06:48
5001715

What's new in 0.10.2.post1 (2024-04-19)

These are the changes in inference v0.10.2.post1.

Bug fixes

Full Changelog: v0.10.2...v0.10.2.post1