Tags: ArielleTolome/llmfit

v0.9.1

fix: query local providers in `recommend` CLI command

Populates `fit.installed` in CLI `recommend` output (text + JSON) by
probing Ollama, MLX, llama.cpp, Docker Model Runner, and LM Studio —
same behavior as the TUI. Honors DOCKER_MODEL_RUNNER_HOST so the DMR
backend receives requests from non-interactive CLI usage too.
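As a rough illustration of honoring `DOCKER_MODEL_RUNNER_HOST`, the sketch below resolves the endpoint from the environment and falls back to a default. The helper names and the fallback URL are assumptions made for this example, not taken from the llmfit source.

```rust
use std::env;

// Assumed fallback endpoint, used only for illustration.
const ASSUMED_DEFAULT_HOST: &str = "http://localhost:12434";

/// Pick the Docker Model Runner base URL (hypothetical helper).
/// A non-empty override wins; otherwise the assumed default is used.
fn resolve_host(env_value: Option<&str>) -> String {
    match env_value.map(str::trim) {
        // Strip trailing slashes so later path joins stay predictable.
        Some(v) if !v.is_empty() => v.trim_end_matches('/').to_string(),
        _ => ASSUMED_DEFAULT_HOST.to_string(),
    }
}

/// Read the override from the process environment.
fn dmr_host_from_env() -> String {
    resolve_host(env::var("DOCKER_MODEL_RUNNER_HOST").ok().as_deref())
}
```

Keeping the resolution logic separate from the environment lookup lets the same code path serve both the TUI and non-interactive CLI runs, and makes it easy to test without mutating process state.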

Fixes docker/model-runner#747 feedback.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

v0.9.0

chore: cargo format

Signed-off-by: Alex Jones <alexsimonjones@gmail.com>

v0.8.9

fix: prefer discrete GPU over integrated on Windows (AlexsJones#303)

On Windows systems with both an integrated (e.g. Intel UHD) and a
discrete GPU (e.g. NVIDIA), the WMI AdapterRAM 32-bit cap could cause
the integrated GPU to report higher VRAM and win the sort, becoming
the primary GPU incorrectly.

Added `prefer_discrete_gpus` filtering that drops integrated GPUs when
at least one discrete GPU is present. On iGPU-only systems the
integrated GPU is kept as before. Integrated GPUs are identified by
name patterns (Intel UHD/HD/Iris, AMD Radeon Graphics without a
discrete model identifier).
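The filtering described above could look roughly like the following sketch. The `Gpu` struct, the exact name patterns, and the "no digits means no model identifier" heuristic are assumptions for illustration; the real `prefer_discrete_gpus` may differ.

```rust
/// Hypothetical GPU record; field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
struct Gpu {
    name: String,
    vram_gb: f64,
}

/// Name-based guess at whether a GPU is integrated (assumed patterns).
fn is_integrated(name: &str) -> bool {
    let n = name.to_lowercase();
    let intel_igpu = n.contains("intel")
        && (n.contains("uhd") || n.contains("hd graphics") || n.contains("iris"));
    // Treat "AMD Radeon Graphics" with no digits as lacking a discrete model id.
    let amd_igpu = n.contains("radeon")
        && n.contains("graphics")
        && !n.chars().any(|c| c.is_ascii_digit());
    intel_igpu || amd_igpu
}

/// Drop integrated GPUs only when at least one discrete GPU is present.
fn prefer_discrete_gpus(gpus: Vec<Gpu>) -> Vec<Gpu> {
    let has_discrete = gpus.iter().any(|g| !is_integrated(&g.name));
    if !has_discrete {
        return gpus; // iGPU-only system: keep the list unchanged
    }
    gpus.into_iter().filter(|g| !is_integrated(&g.name)).collect()
}
```

Filtering before the VRAM sort means a misreported integrated adapter cannot win the comparison, regardless of what value WMI returns for it.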

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

v0.8.8

feat: add Google Gemma 4 models and fix Gemma 3 capabilities

Cherry-picked from AlexsJones#310 (credit: @shaal). Adds Gemma 4 models
(E2B-it, E4B-it, 31B-it, 26B-A4B-it), fixes MoE detection for
top_k_experts, adds any-to-any vision pipeline tag, and enables
tool_use + vision for Gemma 3/4 instruction-tuned models.

Version bump to 0.8.8.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

v0.8.7

chore: fixed gguf filter regression

Signed-off-by: Alex Jones <alexsimonjones@gmail.com>

v0.8.6

chore: version bump

Signed-off-by: Alex Jones <alexsimonjones@gmail.com>

v0.8.5

chore: updated

Signed-off-by: Alex Jones <alexsimonjones@gmail.com>

v0.8.4

chore: updated version

Signed-off-by: Alex Jones <alexsimonjones@gmail.com>

v0.0.2

chore: updated fmt

Signed-off-by: Alex Jones <alexsimonjones@gmail.com>

v0.8.2

fix: iGPU inflating GPU count and force-runtime being ignored (AlexsJones#271)

rocm-smi reports all GPU agents including iGPUs on APUs like the Ryzen
9800X3D, inflating the discrete GPU count and total VRAM. Filter out
VRAM entries below 2 GB so only discrete GPUs are counted.
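A minimal sketch of the VRAM threshold filter, assuming the per-agent VRAM sizes have already been parsed from rocm-smi into bytes. The function and constant names are hypothetical, and treating "2 GB" as 2 GiB is also an assumption.

```rust
// Assumed threshold: 2 GiB, expressed in bytes.
const MIN_DISCRETE_VRAM_BYTES: u64 = 2 * 1024 * 1024 * 1024;

/// Return (gpu_count, total_vram_bytes), counting only entries
/// at or above the threshold so small iGPU carve-outs are ignored.
fn count_discrete(vram_bytes: &[u64]) -> (usize, u64) {
    let discrete: Vec<u64> = vram_bytes
        .iter()
        .copied()
        .filter(|&bytes| bytes >= MIN_DISCRETE_VRAM_BYTES)
        .collect();
    (discrete.len(), discrete.iter().sum())
}
```

A size threshold is a heuristic: it excludes typical APU carve-outs but would also exclude a genuinely small discrete card, and an iGPU configured with a large carve-out would pass.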

Also fix --force-runtime being silently ignored for pre-quantized
(AWQ/GPTQ) models by checking the override before the prequantized
default.
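The ordering fix can be pictured as below: the user's override is consulted first, and the pre-quantized default applies only when no override is given. The `Runtime` variants and the choice of defaults are hypothetical stand-ins, not llmfit's actual types.

```rust
/// Hypothetical set of inference runtimes.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Runtime {
    LlamaCpp,
    Vllm,
    Mlx,
}

/// Choose a runtime, checking the forced override before any default.
fn select_runtime(forced: Option<Runtime>, is_prequantized: bool) -> Runtime {
    // The override must come first, or the branch below would shadow it.
    if let Some(runtime) = forced {
        return runtime;
    }
    // Assumed default for pre-quantized (AWQ/GPTQ) models.
    if is_prequantized {
        return Runtime::Vllm;
    }
    // Assumed general default.
    Runtime::LlamaCpp
}
```

With the checks in the opposite order, a pre-quantized model would return its default before the override was ever read, which matches the "silently ignored" symptom described above.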

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>