LM Studio 0.4.16 LM Studio 0.4.16 Build 2
Lm Link no longer requires waitlisting
Updated default context length to 8k tokens
Build 1
Introducing Locally, LM Studio's mobile app. Available on iPhone and iPad.
Use LM Link in Locally to take your largest LM Studio models on the go
Security hardening
[GGUF] Fix multi-GPU selection bugs affecting GPU ON/OFF and Priority Order on some CUDA 12, ROCm, and Vulkan setups
Jun 8, 2026
LM Studio 0.4.16 LM Studio 0.4.16 Build 1
Introducing Locally, LM Studio's mobile app. Available on iPhone and iPad.
Use LM Link in Locally to take your largest LM Studio models on the go
Security hardening
[GGUF] Fix multi-GPU selection bugs affecting GPU ON/OFF and Priority Order on some CUDA 12, ROCm, and Vulkan setups
Jun 4, 2026
LM Studio 0.4.15 LM Studio 0.4.15 Build 2
[CUDA] Added tensor parallelism support for multi-GPU model loading
[llama.cpp] Added a Physical Batch Size advanced load option
Fixed REST API requests hanging when HTTP/2 clients sent upgrade headers
Fixed image attachments appearing in reverse order after sending a chat message
Fixed a bug where double clicking model trigger would re-open it
Fixed tool type error bug while using codex --oss
Fixed invalid role error while using Claude Code. /v1/messages API now supports system messages in the messages array
Build 1
LM Studio Engine Protocol beta 2
New architecture to enable us to ship more frequent engine updates
Turn it on in Settings > Developer > Enable LM Studio Engine Protocol
Fixed a bug which dropped prompt cache on every message when using Claude Code, improving performance significantly
Fixed a bug when using theme picker overlay over other modals
Fixed a z-index bug in the Models Table scroll
Security hardening
May 29, 2026
LM Studio 0.4.14 LM Studio 0.4.14 Build 4
Stable release of MTP Speculative Decoding!
Speeds up generation with models that include built-in multi-token prediction heads
To try it out, download an MTP-capable model
Fixed an issue with non-MTP speculative decoding error while MTP was enabled
Fixed a bug where lms get gemma4 would not show any results
lms chat now shows which LM Link device each remote model is on
Build 3
Fixed a chat UI bug that could remove whitespace when using MTP
Build 2
Beta release of MTP Speculative Decoding
Fixed token exchange failure for few MCPs in OAuth flow
Build 1
Beta build of LM Studio Engine Protocol
May 22, 2026
LM Studio 0.4.13 LM Studio 0.4.13 Build 1
[MLX] mlx-engine v1.8.1 significantly improves performance and adds parallel predictions for vision-capable models such as Qwen 3.5/3.6 and Gemma 4
Fixed a bug where newlines were compacted in the chat input on paste
Bug fixes and security hardening. This update is recommended for all users.
May 13, 2026
LM Studio 0.4.12 LM Studio 0.4.12 Build 1
Support for Qwen 3.6
Improved style in chat PDF exports
Fixed bug where MCP servers with OAuth would not work on some Windows environments
Improved Qwen 3.5 performance with OpenAI-compatible /v1/chat/completions, /v1/responses and Anthropic-compatible /v1/messages
Apr 17, 2026
LM Studio 0.4.11 LM Studio 0.4.11 Build 1
Support for updated Gemma 4 chat template
Apr 10, 2026
LM Studio 0.4.10 LM Studio 0.4.10 Build 1
Improve Gemma 4 tool call reliability
Add OAuth support for MCP servers
Apr 9, 2026
LM Studio 0.4.9 LM Studio 0.4.9 Build 1
Improve Gemma 4 tool call reliability
Add support for Anthropic-compatible v1/messages output_config.effort (low, medium, high, max)
Fixed a bug where deleting a chat folder would sometimes freeze the UI
Fixed a bug where markdown Link popovers would appear at the top of the window
Apr 2, 2026
LM Studio 0.4.8 LM Studio 0.4.8 Build 1
Add support for reasoning_effort and reasoning_tokens in OpenAI-compatible v1/chat/completions
Adds a reasoning field to the /api/v1/models API response, indicating each model's supported reasoning capabilities/REST configuration options learn more
Fixed a bug where Insert in chat input would sometimes not work after toggling assistant and user mode
Fixed a bug where surrounding spaces in tool call parameters would be stripped for models that uses XML/XML-like tool call formats
[CUDA] Fixed issue where some VRAM would not be deallocated under certain conditions
Fixes a bug where setting reasoning to low when using Nemotron 3 Super via the /api/v1/chat or OpenAI-compatible /v1/responses API would error out
Mar 26, 2026
Load more
Element Labs, Inc. © 2026