Releases: xorbitsai/inference
v0.7.2
What's new in 0.7.2 (2023-12-15)
These are the changes in inference v0.7.2.
New features
- FEAT: Supports `qwen-chat` 1.8B by @ChengjieLi28 in #757
- FEAT: Support gorilla openfunctions v1 by @codingl2k1 in #760
- FEAT: qwen function call by @codingl2k1 in #763
Enhancements
- ENH: Handle tool call failed by @codingl2k1 in #767
Bug fixes
- BUG: [UI] Fix model size selection crash issue by @ChengjieLi28 in #764
Documentation
- DOC: Fix `model_uri` missing in Custom Models by @ChengjieLi28 in #759
Full Changelog: v0.7.1...v0.7.2
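The qwen function-call support added in #763 is surfaced through Xinference's OpenAI-compatible chat API. As a rough sketch (the endpoint path, default port 9997, and the `get_weather` tool are assumptions for illustration; the tool schema follows the OpenAI function-calling convention), a request payload might be assembled like this:

```python
import json

# Hypothetical tool definition in the OpenAI function-calling schema;
# "get_weather" and its parameters are made up for illustration.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
]

payload = {
    "model": "qwen-chat",  # model UID of a launched qwen-chat model
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": tools,
}

# This body would be POSTed as JSON to the (assumed) OpenAI-compatible
# endpoint, e.g. http://localhost:9997/v1/chat/completions.
body = json.dumps(payload)
```

If the model decides to call a tool, the response would carry the function name and JSON-encoded arguments instead of plain text, which the caller executes and feeds back as a follow-up message.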
v0.7.1
What's new in 0.7.1 (2023-12-12)
These are the changes in inference v0.7.1.
Enhancements
- ENH: [UI] Supports `model_uid` input when launching models by @ChengjieLi28 in #746
- ENH: Add more vllm supported models by @aresnow1 in #756
Bug fixes
- BUG: Fix `cached` tag on UI by @ChengjieLi28 in #748
- BUG: Fix stream arg for vllm backend by @aresnow1 in #758
Others
- Bugs: Fix emote encoding in streaming chat, fix missing `pad_token` for PyTorch tokenizers, and allow a system message as the latest message in chat by @AndiMajore in #747
New Contributors
- @AndiMajore made their first contribution in #747
Full Changelog: v0.7.0...v0.7.1
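The `model_uid` input added in #746 lets the caller pick the UID a model is launched under instead of receiving an auto-generated one. A minimal sketch of what a launch request might look like (the field names and endpoint are assumptions based on the REST convention; in the Python client this roughly corresponds to `launch_model(..., model_uid=...)`):

```python
import json

# Hypothetical launch request body; "my-llama" is a caller-chosen UID
# used to address the model afterwards, instead of a generated one.
payload = {
    "model_uid": "my-llama",
    "model_name": "llama-2-chat",
    "model_size_in_billions": 7,
    "quantization": "q4_0",
}
body = json.dumps(payload)
```

A stable, caller-chosen UID is useful when other services hard-code the model address and should survive relaunches.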
v0.7.0
What's new in 0.7.0 (2023-12-08)
These are the changes in inference v0.7.0.
Enhancements
- ENH: upgrade insecure requests when necessary by @waltcow in #712
- ENH: [UI] Using tab in running models by @ChengjieLi28 in #714
- ENH: [UI] supports launching rerank models by @ChengjieLi28 in #711
- ENH: [UI] Error can be shown on web UI directly via Snackbar by @ChengjieLi28 in #721
- ENH: [UI] Supports `n_gpu` config when launching LLM models on web UI by @ChengjieLi28 in #730
- ENH: [UI] `n_gpu` default value `auto` by @ChengjieLi28 in #738
- ENH: [UI] Support unregistering custom model on web UI by @ChengjieLi28 in #735
- ENH: Auto recover model actor by @codingl2k1 in #694
- ENH: allow rerank models run with LLM models on same device by @aresnow1 in #741
Bug fixes
- BUG: Auto patch trust remote code for embedding model by @codingl2k1 in #710
- BUG: Fix vLLM backend by @codingl2k1 in #728
Others
- Update builtin model list by @onesuper in #709
- Revert "ENH: upgrade insecure requests when necessary" by @qinxuye in #716
- CHORE: Format js file and check js code style by @ChengjieLi28 in #727
Full Changelog: v0.6.5...v0.7.0
v0.6.5
What's new in 0.6.5 (2023-12-01)
These are the changes in inference v0.6.5.
New features
- FEAT: Support jina embedding models by @aresnow1 in #704
- FEAT: Support Yi-chat by @aresnow1 in #700
- FEAT: Support qwen 72b by @aresnow1 in #705
- FEAT: ChatGLM3 tool calls by @codingl2k1 in #701
Enhancements
- ENH: Specify actor pool port for distributed deployment by @ChengjieLi28 in #688
- ENH: Remove `xorbits` dependency by @ChengjieLi28 in #699
- ENH: User can just specify a string for prompt style when registering custom LLM models by @ChengjieLi28 in #682
- ENH: Add more models supported by vllm by @aresnow1 in #706
Bug fixes
- BUG: Fix xinference startup failure when an invalid custom model is found by @codingl2k1 in #690
Documentation
- DOC: Fix some incorrect links in documentation by @aresnow1 in #684
- DOC: Update README by @aresnow1 in #687
- DOC: documentation for docker and k8s by @lynnleelhl in #661
New Contributors
- @lynnleelhl made their first contribution in #661
Full Changelog: v0.6.4...v0.6.5
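The jina embedding models added in #704 serve vectors that are typically compared with cosine similarity on the client side. A small sketch, assuming an OpenAI-style embeddings request (the model name is one of the jina model IDs; the endpoint path and response shape are assumptions):

```python
import json
import math

# Hypothetical request body for an (assumed) OpenAI-compatible
# /v1/embeddings endpoint.
payload = {
    "model": "jina-embeddings-v2-base-en",
    "input": ["Xinference serves models locally.", "Local model serving."],
}
body = json.dumps(payload)

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# With two response vectors in hand, similarity is a one-liner
# (toy 3-dimensional vectors here, real embeddings are much longer):
similarity = cosine([1.0, 0.0, 1.0], [0.5, 0.5, 1.0])
```

Identical vectors score 1.0; unrelated ones drift toward 0, which makes cosine similarity a convenient ranking key for semantic search over the returned embeddings.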
v0.6.4
What's new in 0.6.4 (2023-11-24)
These are the changes in inference v0.6.4.
New features
- FEAT: Support registering custom embedding model by @ChengjieLi28 in #667
- FEAT: Supports `qwen.cpp` for `qwen-chat` with `ggml` format by @ChengjieLi28 in #675
- FEAT: Xverse by @fengsxy in #678
- FEAT: Support rerank models by @aresnow1 in #672
Enhancements
- ENH: Add `generate` interface for `chatglm` with `ggml` format by @ChengjieLi28 in #671
Bug fixes
- BUG: Fix custom model missing config json by @codingl2k1 in #674
- BUG: Fix http error is not raised by @codingl2k1 in #657
- BUG: Fix pip install xinference[all] by @codingl2k1 in #679
Documentation
- DOC: update pot files by @UranusSeven in #638
- DOC: Add a more detailed beginner's guide covering first-time usage for new users by @onesuper in #651
- DOC: documentation for using xinference by @fengsxy in #677
- DOC: Register custom embedding model by @ChengjieLi28 in #683
Others
- Add a "Why Xinference" section to the README comparing pivotal features with other projects by @onesuper in #652
- Fix README.md by @aresnow1 in #669
Full Changelog: v0.6.3...v0.6.4
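Rerank models (#672) take a query plus candidate documents and return a relevance score per document. A rough payload sketch (the model name, endpoint fields, and response shape follow common rerank-API conventions and are assumptions here, not the confirmed Xinference API):

```python
import json

# Hypothetical rerank request body; "bge-reranker-base" stands in for
# the UID of a launched rerank model.
payload = {
    "model": "bge-reranker-base",
    "query": "What is Xinference?",
    "documents": [
        "Xinference is a model serving framework.",
        "Bananas are rich in potassium.",
    ],
}
body = json.dumps(payload)

# A rerank response typically pairs each document index with a relevance
# score; sorting by score descending recovers the ranking. Scores below
# are fabricated for illustration.
fake_scores = [(0, 0.92), (1, 0.03)]
ranking = [idx for idx, _ in sorted(fake_scores, key=lambda p: p[1], reverse=True)]
```

A common pattern is to retrieve a broad candidate set with an embedding model and then rerank the top handful with a cross-encoder rerank model for precision.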
v0.6.3
What's new in 0.6.3 (2023-11-16)
These are the changes in inference v0.6.3.
New features
- FEAT: qwen-chat-14b by @UranusSeven in #494
- FEAT: Support gptq quantization by @codingl2k1 in #645
Bug fixes
- BUG: Fix slow RESTful API serialization by @codingl2k1 in #648
Tests
- TST: disable test_is_self_hosted by @UranusSeven in #641
Documentation
- DOC: About Logging in Xinference by @ChengjieLi28 in #631
- DOC: Init for Chinese doc by @ChengjieLi28 in #565
Full Changelog: v0.6.2...v0.6.3
v0.6.2
What's new in 0.6.2 (2023-11-09)
These are the changes in inference v0.6.2.
New features
- FEAT: Support Yi Model by @ChengjieLi28 in #629
Enhancements
- ENH: cache status by @UranusSeven in #616
- ENH: Supports request limits for the model by @ChengjieLi28 in #596
- ENH: running model location & accelerators by @UranusSeven in #626
- ENH: Create completion restful api compatibility by @codingl2k1 in #622
Bug fixes
- BUG: Compatible with openai 1.1 by @codingl2k1 in #619
- BUG: fix spec decoding by @UranusSeven in #628
- BUG: Fix `No slot available` error for embedding and LLM model on one card by @ChengjieLi28 in #611
- BUG: Rotating log does not create a new one when recreating the xinference cluster by @ChengjieLi28 in #618
Full Changelog: v0.6.1...v0.6.2
v0.6.1
What's new in 0.6.1 (2023-11-06)
These are the changes in inference v0.6.1.
Enhancements
- ENH: add command xinference-local by @UranusSeven in #610
- ENH: Don't check dead nodes by @aresnow1 in #614
Full Changelog: v0.6.0...v0.6.1
v0.6.0
What's new in 0.6.0 (2023-11-03)
These are the changes in inference v0.6.0.
New features
- FEAT: Zephyr by @UranusSeven in #597
- FEAT: stable diffusion with controlnet by @codingl2k1 in #575
Enhancements
- ENH: increase heartbeat interval by @UranusSeven in #604
- ENH: Support more models downloading from modelscope by @aresnow1 in #595
- ENH: Supports rotating file log by @ChengjieLi28 in #590
- ENH: stateless supervisor and worker by @UranusSeven in #546
Bug fixes
- BUG: Fix chat system messages by @codingl2k1 in #594
- BUG: fix transformers compatibility by @UranusSeven in #600
Tests
- TST: Compatible with `llama-cpp-python` 0.2.12 by @ChengjieLi28 in #603
Documentation
- DOC: Download model from ModelScope by @ChengjieLi28 in #553
- DOC: Stable Diffusion with ControlNet example by @codingl2k1 in #605
Full Changelog: v0.5.6...v0.6.0
v0.5.6
What's new in 0.5.6 (2023-10-30)
These are the changes in inference v0.5.6.
New features
- FEAT: launch embedding models by @Minamiyama in #582
- FEAT: chatglm3 by @UranusSeven in #587
Documentation
- DOC: update hot topics and fix docs by @UranusSeven in #584
Others
- CHORE: install setuptools in release actions by @aresnow1 in #588
- CHORE: Use python3.10 to build and release by @aresnow1 in #589
Full Changelog: v0.5.5...v0.5.6