Tags: shivamtiwari3/ollama
Tags
server: remove experimental aliases support (ollama#14810)
ci: fix missing windows zip file (ollama#14807) Use 7z (better compression ratio) if it is found in PATH. That alone isn't sufficient to get the zip under 2G, so MLX is now split out as a discrete download. Fix CI so it fails if artifacts fail to upload.
mlx: perf improvements (ollama#14768) * mlx: perf improvements: Fix nn.go to call mlx_fast_layer_norm instead of implementing layer norm manually in 6 ops (mean, subtract, variance, rsqrt, multiply, add). Fix llama.go and gemma3.go to remove RepeatKV, which tiled the K/V tensors to match the Q head count; this is unnecessary since scaled_dot_product_attention natively handles GQA (it only requires n_q_heads % n_kv_heads == 0). * review comments
ci: Fix windows build (ollama#14754) Instead of relying on sh for wildcard expansion, do it in Go for better Windows compatibility.
MLX: add header vendoring and remove go build tag (ollama#14642) * prefer rocm v6 on windows: Avoid building with v7, as more changes are needed. * MLX: add header vendoring and remove go build tag: This switches to a vendoring approach for the mlx-c headers so that Go can build without requiring a cmake run first. This enables building the new MLX based code by default. Every time cmake runs, the headers are refreshed, so we can easily keep them in sync when we bump mlx versions. Basic Windows and Linux support are verified. * ci: harden for flaky choco repo servers: CI sometimes fails due to choco not actually installing cache. Since it just speeds up the build, we can proceed without. * review comments
cmd: override stale entries for context window pi (ollama#14655)
cmd/config: fix cloud model limit lookups in integrations (ollama#14650)
cmd: add qwen3.5 context length for launch (ollama#14626)