Releases: ollama/ollama
v0.24.0
What's Changed
- mlx: add memory trace logging by @dhiltgen in #16131
- launch: codex app integration by @ParthSareen in #16120
Full Changelog: v0.23.4...v0.24.0
v0.30.0
This version of Ollama changes the architecture to support llama.cpp directly instead of building on top of GGML, while remaining compatible with the GGUF file format. MLX is used to accelerate model inference on Apple Silicon.
While this is in pre-release, we'd love feedback on:
- Performance improvements or degradation
- Errors or crashes that did not previously occur
- Memory utilization improvements or degradation
Known issues:
- `laguna-xs.2` is not supported yet on this pre-release
- `llama3.2-vision` is not supported yet on this pre-release
Installing:
Mac/Linux
curl -fsSL https://ollama.com/install.sh | OLLAMA_VERSION=0.30.0-rc15 sh
Windows
$env:OLLAMA_VERSION="0.30.0-rc15"; irm https://ollama.com/install.ps1 | iex
v0.23.4
What's Changed
- `ollama launch opencode` now supports vision models with image inputs
- Fixed formatting of Claude tool results when using local image paths
Full Changelog: v0.23.3...v0.23.4
v0.23.3
What's Changed
- mlx: refined model push behavior by @dhiltgen in #15431
- test: integration test hardening by @dhiltgen in #13532
- app: harden update flows by @dhiltgen in #16100
- mlx: update the imagegen runner for mlx thread affinity by @pdevine in #16096
- mlx: avoid status timeout during inference by @dhiltgen in #16086
- mlx: fix macOS 26 target leakage in v3 metallib by @dhiltgen in #16053
Full Changelog: v0.23.2...v0.23.3
v0.23.2
What's Changed
- `ollama launch` no longer includes Claude Desktop due to the third-party integration being limited to Anthropic models. Use `ollama launch claude-desktop --restore` to restore Claude Desktop to its normal state.
- `/api/show` responses are now cached, improving median latency by ~6.7x, which will speed up loading for integrations like VS Code.
- Improved backup workflow when managing launch integrations
- Cleaner image generation layout in the MLX runner
Full Changelog: v0.23.1...v0.23.2
v0.23.1
Gemma 4 MTP (Multi-Token Prediction) for the MLX runner
Gemma 4 MTP speculative decoding is now supported on Macs. This can yield a more than 2x speed increase for the Gemma 4 31B model on coding tasks.
ollama run gemma4:31b-coding-mtp-bf16
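The speedup comes from the draft/verify structure of speculative decoding: a cheap draft predictor proposes several tokens, and the full model accepts the longest matching prefix in a single verification pass. The sketch below is a greedy, toy illustration of that loop, not Ollama's MLX implementation; the predictors are stand-in callables.

```python
# Minimal sketch of greedy speculative decoding (the mechanism behind
# MTP-style speedups). `target` and `draft` are illustrative callables
# that map a token context to the next token.

def speculative_decode(target, draft, prompt, k=4, max_tokens=12):
    """The draft proposes k tokens; the target accepts the longest
    matching prefix, then always emits one corrected token itself,
    so each round advances by 1 to k+1 tokens."""
    out = list(prompt)
    while len(out) - len(prompt) < max_tokens:
        # Draft proposes k tokens cheaply.
        proposal, ctx = [], list(out)
        for _ in range(k):
            t = draft(ctx)
            proposal.append(t)
            ctx.append(t)
        # Target verifies the proposals (here: token by token).
        ctx = list(out)
        for t in proposal:
            if target(ctx) != t:
                break
            ctx.append(t)
            out.append(t)
        # Always advance by at least one target token.
        out.append(target(out))
    return out[len(prompt):][:max_tokens]
```

With a perfect draft the loop emits k+1 tokens per target pass, which is where the claimed coding-task speedup comes from; with a bad draft it degrades to ordinary one-token-at-a-time decoding.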
What's Changed
- Update MLX and MLX-C with threading fixes by @dhiltgen in #15845
- go: bump to 1.26 by @ParthSareen in #15904
- Add Gemma 4 MTP speculative decoding by @pdevine in #15980
Full Changelog: v0.23.0...v0.23.1
v0.23.0
Claude Desktop
Claude Desktop is now supported with Ollama Launch.
Claude Cowork and Claude Code are supported within the Claude Desktop App.
ollama launch claude-desktop
Claude Cowork
Claude Code
Claude Code on the terminal can still be accessed through the CLI with:
ollama launch claude
Not supported yet
- Web Search (coming soon)
- Extensions
What's Changed
- Launch Claude Desktop with `ollama launch claude-desktop`
- The Ollama app now surfaces featured models from server-driven recommendations
- Fixed OpenClaw gateway timeout on Windows by enforcing IPv4 loopback (thanks @UniquePratham)
- Hardened Metal initialization to gracefully handle ggml kernel compilation failures
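The loopback fix works because the name "localhost" can resolve to the IPv6 address `::1` first, while a gateway that only listens on IPv4 then never answers; dialing the IPv4 loopback address explicitly sidesteps the lookup. A minimal illustration (the port is Ollama's default; the helper name is ours):

```python
import socket

OLLAMA_PORT = 11434  # Ollama's default listen port

def gateway_addr():
    """Return an explicit IPv4 loopback address rather than the name
    "localhost", which getaddrinfo may resolve to ::1 first; connecting
    to ::1 when the gateway listens only on IPv4 hangs until timeout
    (the failure mode fixed in this release)."""
    # For comparison: the address families "localhost" resolves to.
    families = {info[0] for info in socket.getaddrinfo(
        "localhost", OLLAMA_PORT, proto=socket.IPPROTO_TCP)}
    return ("127.0.0.1", OLLAMA_PORT), families
```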
New Contributors
- @UniquePratham made their first contribution in #15726
Full Changelog: v0.22.1...v0.23.0
v0.22.1
What's Changed
- Updated the Gemma 4 renderer for thinking and tool calling improvements
- Model recommendations are now updated without updating Ollama
- Aligned the desktop app's launch page with `ollama launch` integrations
- Fixed the Poolside integration title in `ollama launch`
Full Changelog: v0.22.0...v0.22.1
v0.22.0
New models
- NVIDIA's Nemotron 3 Omni
- Poolside's first open-weight coding model, Laguna XS.2
Full Changelog: v0.21.2...v0.22.0
v0.21.3
What's Changed
- api: accept "max" as a think value by @ParthSareen in #15787
- openai: map responses reasoning effort to think by @ParthSareen in #15789
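The two API changes above can be sketched together as a request-building helper. The `/api/chat` field names follow Ollama's API; the effort-to-think mapping and accepted value set are assumptions based on the release notes, not the exact implementation.

```python
import json

def think_from_reasoning_effort(effort):
    """Hedged sketch of the OpenAI-compatible mapping: a Responses-style
    reasoning effort is passed through as Ollama's `think` level
    (value set assumed from the release notes)."""
    allowed = {"low", "medium", "high", "max"}
    if effort not in allowed:
        raise ValueError(f"unsupported reasoning effort: {effort!r}")
    return effort

def chat_payload(model, prompt, think="max"):
    """Build an /api/chat request body; "max" is newly accepted as a
    `think` value in this release (model name passed in is up to you)."""
    return json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "think": think,
        "stream": False,
    })
```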
Full Changelog: v0.21.2...v0.21.3