Introducing LM Studio's iPhone app.Get the app

Changelog

Beta releases
LM Studio 0.4.16

LM Studio 0.4.16

Build 2

  • Lm Link no longer requires waitlisting
  • Updated default context length to 8k tokens

Build 1

  • Introducing Locally, LM Studio's mobile app. Available on iPhone and iPad.
    • Use LM Link in Locally to take your largest LM Studio models on the go
  • Security hardening
  • [GGUF] Fix multi-GPU selection bugs affecting GPU ON/OFF and Priority Order on some CUDA 12, ROCm, and Vulkan setups
Jun 8, 2026
LM Studio 0.4.16

LM Studio 0.4.16

Build 1

  • Introducing Locally, LM Studio's mobile app. Available on iPhone and iPad.
    • Use LM Link in Locally to take your largest LM Studio models on the go
  • Security hardening
  • [GGUF] Fix multi-GPU selection bugs affecting GPU ON/OFF and Priority Order on some CUDA 12, ROCm, and Vulkan setups
Jun 4, 2026
LM Studio 0.4.15

LM Studio 0.4.15

Build 2

  • [CUDA] Added tensor parallelism support for multi-GPU model loading
  • [llama.cpp] Added a Physical Batch Size advanced load option
  • Fixed REST API requests hanging when HTTP/2 clients sent upgrade headers
  • Fixed image attachments appearing in reverse order after sending a chat message
  • Fixed a bug where double clicking model trigger would re-open it
  • Fixed tool type error bug while using codex --oss
  • Fixed invalid role error while using Claude Code. /v1/messages API now supports system messages in the messages array

Build 1

  • LM Studio Engine Protocol beta 2
    • New architecture to enable us to ship more frequent engine updates
    • Turn it on in Settings > Developer > Enable LM Studio Engine Protocol
  • Fixed a bug which dropped prompt cache on every message when using Claude Code, improving performance significantly
  • Fixed a bug when using theme picker overlay over other modals
  • Fixed a z-index bug in the Models Table scroll
  • Security hardening
May 29, 2026
LM Studio 0.4.14

LM Studio 0.4.14

Build 4

  • Stable release of MTP Speculative Decoding!
    • Speeds up generation with models that include built-in multi-token prediction heads
    • To try it out, download an MTP-capable model
  • Fixed an issue with non-MTP speculative decoding error while MTP was enabled
  • Fixed a bug where lms get gemma4 would not show any results
  • lms chat now shows which LM Link device each remote model is on

Build 3

  • Fixed a chat UI bug that could remove whitespace when using MTP

Build 2

  • Beta release of MTP Speculative Decoding
  • Fixed token exchange failure for few MCPs in OAuth flow

Build 1

  • Beta build of LM Studio Engine Protocol
May 22, 2026
LM Studio 0.4.13

LM Studio 0.4.13

Build 1

  • [MLX] mlx-engine v1.8.1 significantly improves performance and adds parallel predictions for vision-capable models such as Qwen 3.5/3.6 and Gemma 4
  • Fixed a bug where newlines were compacted in the chat input on paste
  • Bug fixes and security hardening. This update is recommended for all users.
May 13, 2026
LM Studio 0.4.12

LM Studio 0.4.12

Build 1

  • Support for Qwen 3.6
  • Improved style in chat PDF exports
  • Fixed bug where MCP servers with OAuth would not work on some Windows environments
  • Improved Qwen 3.5 performance with OpenAI-compatible /v1/chat/completions, /v1/responses and Anthropic-compatible /v1/messages
Apr 17, 2026
LM Studio 0.4.11

LM Studio 0.4.11

Build 1

  • Support for updated Gemma 4 chat template
Apr 10, 2026
LM Studio 0.4.10

LM Studio 0.4.10

Build 1

  • Improve Gemma 4 tool call reliability
  • Add OAuth support for MCP servers
Apr 9, 2026
LM Studio 0.4.9

LM Studio 0.4.9

Build 1

  • Improve Gemma 4 tool call reliability
  • Add support for Anthropic-compatible v1/messages output_config.effort (low, medium, high, max)
  • Fixed a bug where deleting a chat folder would sometimes freeze the UI
  • Fixed a bug where markdown Link popovers would appear at the top of the window
Apr 2, 2026
LM Studio 0.4.8

LM Studio 0.4.8

Build 1

  • Add support for reasoning_effort and reasoning_tokens in OpenAI-compatible v1/chat/completions
  • Adds a reasoning field to the /api/v1/models API response, indicating each model's supported reasoning capabilities/REST configuration options learn more
  • Fixed a bug where Insert in chat input would sometimes not work after toggling assistant and user mode
  • Fixed a bug where surrounding spaces in tool call parameters would be stripped for models that uses XML/XML-like tool call formats
  • [CUDA] Fixed issue where some VRAM would not be deallocated under certain conditions
  • Fixes a bug where setting reasoning to low when using Nemotron 3 Super via the /api/v1/chat or OpenAI-compatible /v1/responses API would error out
Mar 26, 2026