Releases: xorbitsai/inference
v0.6.2
What's new in 0.6.2 (2023-11-09)
These are the changes in inference v0.6.2.
New features
- FEAT: Support Yi Model by @ChengjieLi28 in #629
Enhancements
- ENH: cache status by @UranusSeven in #616
- ENH: Supports request limits for the model by @ChengjieLi28 in #596
- ENH: running model location & accelerators by @UranusSeven in #626
- ENH: Create completion restful api compatibility by @codingl2k1 in #622
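The per-model request limit added in #596 is essentially an admission-control gate on in-flight requests. A minimal sketch of the idea, using a non-blocking semaphore (illustrative only; the class name and shape are assumptions, not Xinference's actual implementation):

```python
import threading

class RequestLimiter:
    """Toy per-model admission gate: at most `limit` in-flight requests."""

    def __init__(self, limit: int):
        self._sem = threading.BoundedSemaphore(limit)

    def try_acquire(self) -> bool:
        # Non-blocking: returns False (reject) once the limit is reached.
        return self._sem.acquire(blocking=False)

    def release(self) -> None:
        # Call when a request finishes to free a slot.
        self._sem.release()

limiter = RequestLimiter(limit=2)
results = [limiter.try_acquire() for _ in range(3)]
# Two requests are admitted; the third is rejected.
```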
Bug fixes
- BUG: Compatible with openai 1.1 by @codingl2k1 in #619
- BUG: fix spec decoding by @UranusSeven in #628
- BUG: `No slot available` error for embedding and LLM model on one card by @ChengjieLi28 in #611
- BUG: Rotating log does not create a new one when recreating the xinference cluster by @ChengjieLi28 in #618
Full Changelog: v0.6.1...v0.6.2
v0.6.1
What's new in 0.6.1 (2023-11-06)
These are the changes in inference v0.6.1.
Enhancements
- ENH: add command xinference-local by @UranusSeven in #610
- ENH: Don't check dead nodes by @aresnow1 in #614
Full Changelog: v0.6.0...v0.6.1
v0.6.0
What's new in 0.6.0 (2023-11-03)
These are the changes in inference v0.6.0.
New features
- FEAT: Zephyr by @UranusSeven in #597
- FEAT: stable diffusion with controlnet by @codingl2k1 in #575
Enhancements
- ENH: increase heartbeat interval by @UranusSeven in #604
- ENH: Support more models downloading from modelscope by @aresnow1 in #595
- ENH: Supports rotating file log by @ChengjieLi28 in #590
- ENH: stateless supervisor and worker by @UranusSeven in #546
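The rotating file log added in #590 can be approximated with Python's standard `logging.handlers.RotatingFileHandler`, which caps the log file size and keeps a fixed number of backups. The file name, size limit, and backup count below are arbitrary illustration values, not Xinference's configuration:

```python
import logging
import logging.handlers
import os
import tempfile

# Illustrative values only: a tiny maxBytes so rotation is easy to observe.
log_dir = tempfile.mkdtemp()
log_path = os.path.join(log_dir, "xinference.log")

handler = logging.handlers.RotatingFileHandler(
    log_path, maxBytes=100, backupCount=2  # keep at most 2 rotated backups
)
logger = logging.getLogger("rotation-demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)

for i in range(20):
    logger.info("message %d", i)

# The directory now holds xinference.log plus rotated .1/.2 backups.
files = sorted(os.listdir(log_dir))
```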
Bug fixes
- BUG: Fix chat system messages by @codingl2k1 in #594
- BUG: fix transformers compatibility by @UranusSeven in #600
Tests
- TST: Compatible with `llama-cpp-python` 0.2.12 by @ChengjieLi28 in #603
Documentation
- DOC: Download model from ModelScope by @ChengjieLi28 in #553
- DOC: Stable Diffusion with ControlNet example by @codingl2k1 in #605
Full Changelog: v0.5.6...v0.6.0
v0.5.6
What's new in 0.5.6 (2023-10-30)
These are the changes in inference v0.5.6.
New features
- FEAT: launch embedding models by @Minamiyama in #582
- FEAT: chatglm3 by @UranusSeven in #587
Documentation
- DOC: update hot topics and fix docs by @UranusSeven in #584
Others
- CHORE: install setuptools in release actions by @aresnow1 in #588
- CHORE: Use python3.10 to build and release by @aresnow1 in #589
Full Changelog: v0.5.5...v0.5.6
v0.5.5
What's new in 0.5.5 (2023-10-26)
These are the changes in inference v0.5.5.
Enhancements
- ENH: display language tags by @Minamiyama in #558
- ENH: filter models by type by @Minamiyama in #559
- ENH: disable create embeddings using LLMs by @UranusSeven in #570
- ENH: benchmark latency by @UranusSeven in #576
- ENH: configurable `XINFERENCE_HOME` env by @ChengjieLi28 in #566
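The latency benchmarking added in #576 boils down to timing repeated calls and reporting summary statistics. A self-contained sketch of that pattern, where `fake_infer` is a stand-in for a real model call (names and numbers are illustrative, not the benchmark shipped in #576):

```python
import statistics
import time

def fake_infer(prompt: str) -> str:
    # Stand-in for a real model call.
    time.sleep(0.001)
    return prompt[::-1]

def benchmark(fn, prompt: str, n: int = 20) -> dict:
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        fn(prompt)
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    return {
        "mean_s": statistics.mean(latencies),
        "p95_s": latencies[int(0.95 * len(latencies)) - 1],
    }

stats = benchmark(fake_infer, "hello")
```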
Bug fixes
- BUG: Fix `bge-base-zh` and `bge-large-zh` from ModelScope by @ChengjieLi28 in #571
- BUG: When changing the model revision, xinference still uses the previous model by @ChengjieLi28 in #573
- BUG: incorrect vLLM config by @UranusSeven in #579
- BUG: fix llama-2 stop words by @UranusSeven in #580
Documentation
- DOC: Incompatibility Between NVIDIA Driver and PyTorch Version by @onesuper in #551
- DOC: Examples and resources page by @onesuper in #561
Full Changelog: v0.5.4...v0.5.5
v0.5.4
What's new in 0.5.4 (2023-10-20)
These are the changes in inference v0.5.4.
New features
- FEAT: wizardcoder python by @UranusSeven in #539
- FEAT: Support grammar-based sampling for ggml models by @aresnow1 in #525
- FEAT: speculative decoding by @UranusSeven in #509
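Speculative decoding (#509) has a cheap draft model propose several tokens ahead, which the larger target model then verifies in one pass, keeping the longest agreed prefix. A toy sketch of one round of that scheme, with trivial deterministic functions standing in for the two models (purely illustrative; not the Xinference implementation):

```python
def speculative_step(draft_next, target_next, prefix, k=4):
    """One round: the draft model proposes k tokens, the target model keeps
    the longest prefix it agrees with, then adds one token of its own."""
    # Draft phase: k cheap proposals.
    proposal, ctx = [], list(prefix)
    for _ in range(k):
        tok = draft_next(ctx)
        proposal.append(tok)
        ctx.append(tok)
    # Verification phase: accept while the target model agrees.
    accepted, ctx = [], list(prefix)
    for tok in proposal:
        if target_next(ctx) != tok:
            break
        accepted.append(tok)
        ctx.append(tok)
    accepted.append(target_next(ctx))  # target always contributes one token
    return accepted

# Tiny deterministic stand-ins for the draft and target models.
draft = lambda ctx: len(ctx) % 3
target = lambda ctx: len(ctx) % 2
out = speculative_step(draft, target, prefix=[0], k=4)
```

Each round emits at least one token (the target's own), so the scheme never falls below plain decoding speed in tokens per model pass.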
Enhancements
- ENH: Download embedding models from ModelScope by @ChengjieLi28 in #532
- ENH: lock transformers version by @UranusSeven in #549
- ENH: Support downloading code-llama family models from ModelScope by @ChengjieLi28 in #557
- ENH: Add gguf format of codellama-instruct by @aresnow1 in #567
Bug fixes
- BUG: Fix stream not compatible with openai by @codingl2k1 in #524
- BUG: set trust_remote_code to true by default by @richzw in #555
- BUG: add quantization to valid file name by @richzw in #562
- BUG: remove "generate" ability from Baichuan-2-chat json config by @Minamiyama in #556
Documentation
- DOC: update pot files by @UranusSeven in #538
- DOC: Add Client API reference by @codingl2k1 in #543
- DOC: Add client doc to the user guide by @codingl2k1 in #547
New Contributors
- @richzw made their first contribution in #555
- @Minamiyama made their first contribution in #556
Full Changelog: v0.5.3...v0.5.4
v0.5.3
What's new in 0.5.3 (2023-10-13)
These are the changes in inference v0.5.3.
New features
- FEAT: Add BAAI/BGE v1.5 family models by @ChengjieLi28 in #522
- FEAT: Support Mistral & Mistral-Instruct by @Bojun-Feng in #510
- FEAT: Add --model-uid to launch sub command by @codingl2k1 in #529
- FEAT: Support stable diffusion by @codingl2k1 in #484
Enhancements
- REF: Use restful client as default client by @aresnow1 in #470
- REF: refactor client codes for xinference-client by @ChengjieLi28 in #528
Tests
- TST: fix tiny llama by @UranusSeven in #513
Documentation
- DOC: hardware specific installations by @UranusSeven in #517
- DOC: update installation by @UranusSeven in #527
Full Changelog: v0.5.2...v0.5.3
v0.5.2
What's new in 0.5.2 (2023-09-27)
These are the changes in inference v0.5.2.
Enhancements
- ENH: validate model URI on register by @UranusSeven in #476
- ENH: Skip download for embedding models by @aresnow1 in #499
- ENH: set `trust_remote_code` to true by @UranusSeven in #500
Full Changelog: v0.5.1...v0.5.2
v0.5.1
What's new in 0.5.1 (2023-09-26)
These are the changes in inference v0.5.1.
Enhancements
- ENH: Safe iterate stream of ggml model by @codingl2k1 in #449
- ENH: Skip download if model exists by @aresnow1 in #495
Documentation
- DOC: vLLM by @UranusSeven in #491
Full Changelog: v0.5.0...v0.5.1
v0.5.0
What's new in 0.5.0 (2023-09-22)
These are the changes in inference v0.5.0.
New features
- FEAT: incorporate vLLM by @UranusSeven in #445
- FEAT: add register model page for dashboard by @Bojun-Feng in #420
- FEAT: internlm 20b by @UranusSeven in #486
- FEAT: support glaive coder by @UranusSeven in #490
- FEAT: Support download models from modelscope by @aresnow1 in #475
Enhancements
- ENH: shorten OpenBuddy's desc by @UranusSeven in #471
- ENH: enable vLLM on Linux with cuda by @UranusSeven in #472
- ENH: vLLM engine supports more models by @UranusSeven in #477
- ENH: remove subpool on failure by @UranusSeven in #478
- ENH: support trust_remote_code when launching a model by @UranusSeven in #479
- ENH: vLLM auto tensor parallel by @UranusSeven in #480
Bug fixes
- BUG: llama-cpp version mismatch by @Bojun-Feng in #473
- BUG: incorrect endpoint on host 0.0.0.0 by @UranusSeven in #474
- BUG: prompt style not set as expected on web UI by @UranusSeven in #489
Full Changelog: v0.4.4...v0.5.0