Tags: ArielleTolome/llmfit
fix: query local providers in `recommend` CLI command Populates `fit.installed` in CLI `recommend` output (text + JSON) by probing Ollama, MLX, llama.cpp, Docker Model Runner, and LM Studio — same behavior as the TUI. Honors DOCKER_MODEL_RUNNER_HOST so the DMR backend receives requests from non-interactive CLI usage too. Fixes docker/model-runner#747 feedback. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
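The host override described in this commit can be sketched as follows. This is a minimal illustration, not the project's code: the function name, the fallback URL, and the trailing-slash handling are all assumptions; only the `DOCKER_MODEL_RUNNER_HOST` variable name comes from the commit message.

```python
import os

# Assumed fallback address; the commit message does not state the default.
DEFAULT_DMR_HOST = "http://localhost:12434"


def docker_model_runner_host(env=None):
    """Return the Docker Model Runner base URL, honoring the env override.

    `env` is injectable so the lookup can be exercised without touching the
    real process environment (a hypothetical convenience for this sketch).
    """
    env = os.environ if env is None else env
    host = env.get("DOCKER_MODEL_RUNNER_HOST", "").strip()
    # An unset or blank variable falls back to the assumed default.
    return host.rstrip("/") if host else DEFAULT_DMR_HOST
```

Reading the variable in one shared helper is one way for the TUI and the non-interactive CLI path to resolve the same endpoint.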
fix: prefer discrete GPU over integrated on Windows (AlexsJones#303) On Windows systems with both an integrated (e.g. Intel UHD) and a discrete GPU (e.g. NVIDIA), the WMI AdapterRAM 32-bit cap could cause the integrated GPU to report higher VRAM and win the sort, becoming the primary GPU incorrectly. Added `prefer_discrete_gpus` filtering that drops integrated GPUs when at least one discrete GPU is present. On iGPU-only systems the integrated GPU is kept as before. Integrated GPUs are identified by name patterns (Intel UHD/HD/Iris, AMD Radeon Graphics without a discrete model identifier). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
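The filtering rule in this commit might look roughly like the sketch below. Only the name `prefer_discrete_gpus` and the listed name families (Intel UHD/HD/Iris, AMD Radeon Graphics without a model identifier) come from the commit message; the regular expressions, the dictionary shape of a GPU entry, and the helper `is_integrated` are assumptions made for illustration.

```python
import re

# Hypothetical patterns approximating the name families named in the commit.
_INTEGRATED_PATTERNS = [
    # Intel UHD / HD / Iris families.
    re.compile(r"intel.*\b(uhd|hd|iris)\b", re.IGNORECASE),
    # "AMD Radeon Graphics" with nothing after it (no discrete model id).
    re.compile(r"amd radeon(\(tm\))? graphics\s*$", re.IGNORECASE),
]


def is_integrated(name):
    """Guess from the adapter name whether a GPU is integrated."""
    return any(p.search(name.strip()) for p in _INTEGRATED_PATTERNS)


def prefer_discrete_gpus(gpus):
    """Drop integrated GPUs when at least one discrete GPU is present.

    On iGPU-only systems the list is returned unchanged, so the integrated
    GPU is still available as the primary device.
    """
    discrete = [g for g in gpus if not is_integrated(g["name"])]
    return discrete if discrete else list(gpus)
```

Filtering before the VRAM sort means a misreported `AdapterRAM` value on the integrated adapter can no longer decide which GPU becomes primary.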
feat: add Google Gemma 4 models and fix Gemma 3 capabilities Cherry-picked from AlexsJones#310 (credit: @shaal). Adds Gemma 4 models (E2B-it, E4B-it, 31B-it, 26B-A4B-it), fixes MoE detection for top_k_experts, adds any-to-any vision pipeline tag, and enables tool_use + vision for Gemma 3/4 instruction-tuned models. Version bump to 0.8.8. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
fix: iGPU inflating GPU count and force-runtime being ignored (AlexsJones#271) rocm-smi reports all GPU agents including iGPUs on APUs like the Ryzen 9800X3D, inflating the discrete GPU count and total VRAM. Filter out VRAM entries below 2 GB so only discrete GPUs are counted. Also fix --force-runtime being silently ignored for pre-quantized (AWQ/GPTQ) models by checking the override before the prequantized default. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
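Both fixes in this commit reduce to small ordering and threshold rules, sketched below under stated assumptions. The 2 GB cutoff and the AWQ/GPTQ formats come from the commit message; the function names, the runtime labels (`"vllm"`, `"llamacpp"`), and treating 2 GB as 2 GiB are hypothetical.

```python
# 2 GB threshold from the commit message; interpreting it as GiB is an assumption.
MIN_DISCRETE_VRAM_BYTES = 2 * 1024**3


def discrete_vram_entries(vram_bytes_per_agent):
    """Keep only VRAM entries large enough to be treated as discrete GPUs."""
    return [v for v in vram_bytes_per_agent if v >= MIN_DISCRETE_VRAM_BYTES]


def select_runtime(force_runtime, quant_format, default_runtime="llamacpp"):
    """Pick a runtime, checking the user override before any default.

    Runtime names here are placeholders, not the project's identifiers.
    """
    # The override is checked first so it also applies to pre-quantized models.
    if force_runtime is not None:
        return force_runtime
    # Pre-quantized formats fall back to their own default runtime.
    if quant_format in ("AWQ", "GPTQ"):
        return "vllm"
    return default_runtime
```

With the entries filtered first, both the GPU count (`len(...)`) and the total VRAM (`sum(...)`) are computed over discrete devices only.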