Changelog

Beta releases

LM Studio 0.4.16

Build 2

Lm Link no longer requires waitlisting
Updated default context length to 8k tokens

Build 1

Introducing Locally, LM Studio's mobile app. Available on iPhone and iPad.
- Use LM Link in Locally to take your largest LM Studio models on the go
Security hardening
[GGUF] Fix multi-GPU selection bugs affecting GPU ON/OFF and Priority Order on some CUDA 12, ROCm, and Vulkan setups

Jun 8, 2026

LM Studio 0.4.16

Build 1

Introducing Locally, LM Studio's mobile app. Available on iPhone and iPad.
- Use LM Link in Locally to take your largest LM Studio models on the go
Security hardening
[GGUF] Fix multi-GPU selection bugs affecting GPU ON/OFF and Priority Order on some CUDA 12, ROCm, and Vulkan setups

Jun 4, 2026

LM Studio 0.4.15

Build 2

[CUDA] Added tensor parallelism support for multi-GPU model loading
[llama.cpp] Added a Physical Batch Size advanced load option
Fixed REST API requests hanging when HTTP/2 clients sent upgrade headers
Fixed image attachments appearing in reverse order after sending a chat message
Fixed a bug where double clicking model trigger would re-open it
Fixed tool type error bug while using codex --oss
Fixed invalid role error while using Claude Code. /v1/messages API now supports system messages in the messages array

Build 1

LM Studio Engine Protocol beta 2
- New architecture to enable us to ship more frequent engine updates
- Turn it on in Settings > Developer > Enable LM Studio Engine Protocol
Fixed a bug which dropped prompt cache on every message when using Claude Code, improving performance significantly
Fixed a bug when using theme picker overlay over other modals
Fixed a z-index bug in the Models Table scroll
Security hardening

May 29, 2026

LM Studio 0.4.14

Build 4

Stable release of MTP Speculative Decoding!
- Speeds up generation with models that include built-in multi-token prediction heads
- To try it out, download an MTP-capable model
Fixed an issue with non-MTP speculative decoding error while MTP was enabled
Fixed a bug where lms get gemma4 would not show any results
lms chat now shows which LM Link device each remote model is on

Build 3

Fixed a chat UI bug that could remove whitespace when using MTP

Build 2

Beta release of MTP Speculative Decoding
Fixed token exchange failure for few MCPs in OAuth flow

Build 1

Beta build of LM Studio Engine Protocol

May 22, 2026

LM Studio 0.4.13

Build 1

[MLX] mlx-engine v1.8.1 significantly improves performance and adds parallel predictions for vision-capable models such as Qwen 3.5/3.6 and Gemma 4
Fixed a bug where newlines were compacted in the chat input on paste
Bug fixes and security hardening. This update is recommended for all users.

May 13, 2026

LM Studio 0.4.12

Build 1

Support for Qwen 3.6
Improved style in chat PDF exports
Fixed bug where MCP servers with OAuth would not work on some Windows environments
Improved Qwen 3.5 performance with OpenAI-compatible /v1/chat/completions, /v1/responses and Anthropic-compatible /v1/messages

Apr 17, 2026

LM Studio 0.4.11

Build 1

Support for updated Gemma 4 chat template

Apr 10, 2026

LM Studio 0.4.10

Build 1

Improve Gemma 4 tool call reliability
Add OAuth support for MCP servers

Apr 9, 2026

LM Studio 0.4.9

Build 1

Improve Gemma 4 tool call reliability
Add support for Anthropic-compatible v1/messages output_config.effort (low, medium, high, max)
Fixed a bug where deleting a chat folder would sometimes freeze the UI
Fixed a bug where markdown Link popovers would appear at the top of the window

Apr 2, 2026

LM Studio 0.4.8

Build 1

Add support for reasoning_effort and reasoning_tokens in OpenAI-compatible v1/chat/completions
Adds a reasoning field to the /api/v1/models API response, indicating each model's supported reasoning capabilities/REST configuration options learn more
Fixed a bug where Insert in chat input would sometimes not work after toggling assistant and user mode
Fixed a bug where surrounding spaces in tool call parameters would be stripped for models that uses XML/XML-like tool call formats
[CUDA] Fixed issue where some VRAM would not be deallocated under certain conditions
Fixes a bug where setting reasoning to low when using Nemotron 3 Super via the /api/v1/chat or OpenAI-compatible /v1/responses API would error out

Mar 26, 2026

Changelog 👾

LM Studio 0.4.16

LM Studio 0.4.16

LM Studio 0.4.15

LM Studio 0.4.14

LM Studio 0.4.13

LM Studio 0.4.12

LM Studio 0.4.11

LM Studio 0.4.10

LM Studio 0.4.9

LM Studio 0.4.8

Changelog