GitHub issue draft (copy into leon-ai/leon)
Title: Support Ollama as OpenAI-compatible backend (LEON_SGLANG_BASE_URL + model tags)
Labels (suggested): enhancement, llm
Summary
Many self-hosters run Leon with Ollama using its OpenAI-compatible API (/v1/chat/completions) instead of a native SGLang server. Today this combination fails or misbehaves in several places because Leon treats sglang/<name> as a local filesystem model and enables thinking/reasoning that Ollama rejects.
We would like first-class support (or documented configuration) for:
LEON_LLM=sglang/llama3.2
LEON_SGLANG_BASE_URL=http://127.0.0.1:11434/v1
Problems observed (Leon develop, 2026-05)
1. Wrong model sent to the API
- resolveConfiguredLLMTarget('sglang/llama3.2') resolves to a path under $LEON_HOME/models/ (see llm-routing.ts, LOCAL_PROVIDERS).
- SGLangLLMProvider passes that path as the OpenAI-compatible model field.
- Ollama expects the tag llama3.2 instead, so the request fails with 400 invalid model name (or similar).
2. Thinking / reasoning not supported by Ollama
- ReAct phases default to reasoningMode: 'on' (react-llm-duty/phase-policy.ts).
- OpenAI-compatible provider options can set reasoningEffort: 'high' (ai-sdk-remote-llm-provider.ts).
- Ollama returns: "llama3.2" does not support thinking.
3. Docker / LEON_HOME mismatches (deployment)
- Build trains NLP assets under ~/.leon; runtime compose often sets LEON_HOME=/data/leon.
- pnpm start expects managed Node at $LEON_HOME/bin/node/bin/node, which the install step skips when GITHUB_ACTIONS=true.
- leon-skill-list.nlp is missing under the runtime LEON_HOME unless it is copied or seeded.
These problems are exacerbated in container stacks, but (1) and (2) are core provider issues.
Proposed direction
- Remote tag mode: If sglang/<tag> is not an on-disk model under LLM_DIR_PATH, treat <tag> as the remote API model name when LEON_SGLANG_BASE_URL is configured (Ollama, LiteLLM, etc.).
- Capability flag: For providers/backends without thinking support, force disableThinking / reasoningMode: 'off' for ReAct and retry on "does not support thinking" (generalize existing tool_choice + thinking retry).
- Docs: Short “Ollama + Docker” section in configuration docs (env vars, no thinking, model pull).
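The first two bullets could look roughly like the sketch below. It is illustrative only: resolveModelField, withThinkingFallback, isThinkingUnsupported, and the input shape are hypothetical names, not Leon's current API, and the on-disk check is injected as a callback so the logic stays independent of the filesystem.

```typescript
// Hypothetical sketch: none of these names exist in Leon today.

interface ModelResolutionInput {
  target: string // e.g. 'sglang/llama3.2'
  llmDirPath: string // stands in for LLM_DIR_PATH
  baseUrl?: string // stands in for LEON_SGLANG_BASE_URL
  existsOnDisk: (candidatePath: string) => boolean // injected filesystem check
}

// Returns the value to send as the OpenAI-compatible `model` field.
function resolveModelField(input: ModelResolutionInput): string {
  const name = input.target.replace(/^sglang\//, '')
  const localPath = `${input.llmDirPath.replace(/\/+$/, '')}/${name}`

  // An on-disk model wins, which preserves the native SGLang behavior.
  if (input.existsOnDisk(localPath)) {
    return localPath
  }
  // No local model, but a base URL is configured: treat the name as a remote tag.
  if (input.baseUrl) {
    return name
  }
  // Neither applies: keep the local path, as happens today.
  return localPath
}

type ReasoningMode = 'on' | 'off'

// Matches the Ollama error text quoted in this issue.
function isThinkingUnsupported(error: unknown): boolean {
  const message = error instanceof Error ? error.message : String(error)
  return /does not support thinking/i.test(message)
}

// Calls with reasoning on, then retries once with it off if the backend rejects thinking.
// Backends flagged as lacking thinking support skip the first attempt entirely.
async function withThinkingFallback<T>(
  call: (mode: ReasoningMode) => Promise<T>,
  supportsThinking = true
): Promise<T> {
  if (!supportsThinking) {
    return call('off')
  }
  try {
    return await call('on')
  } catch (error) {
    if (isThinkingUnsupported(error)) {
      return call('off')
    }
    throw error
  }
}
```

Checking the disk first keeps existing local SGLang setups unchanged; the remote tag path only applies when no matching model directory exists and a base URL is set.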
Reference implementation (downstream)
We maintain a working stack and patch set here (interim, not ideal long-term):
- Repo: https://github.com/githubphadnis/leon-ei
- Patches applied at image build: docker/leon/apply-leon-ei-patches.mjs
- Upstream package in that repo: upstream/leon-ollama/ (this issue text + PR proposal)
Happy to contribute a proper PR against develop if the approach above aligns with maintainers’ intent for SGLang vs generic OpenAI-compatible endpoints.
Environment
- Leon: develop (2.0 dev preview)
- Ollama: ollama/ollama:latest, model llama3.2
- Direct API test: POST /v1/chat/completions with "model":"llama3.2" succeeds in ~5s on modest CPU hardware
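For anyone reproducing the direct API test, the request can be built roughly as follows. This is a minimal sketch: buildChatRequest is a hypothetical helper, the prompt is arbitrary, and the base URL is the one from the configuration at the top of this issue.

```typescript
// Hypothetical helper for reproducing the direct test; not part of Leon.

interface ChatRequest {
  url: string
  init: { method: string; headers: Record<string, string>; body: string }
}

// Builds an OpenAI-compatible chat completion request for a given model tag.
function buildChatRequest(baseUrl: string, model: string, prompt: string): ChatRequest {
  return {
    url: `${baseUrl.replace(/\/+$/, '')}/chat/completions`,
    init: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model, // the plain Ollama tag, not a filesystem path
        messages: [{ role: 'user', content: prompt }]
      })
    }
  }
}

// Usage, assuming a runtime with a global fetch and Ollama listening locally:
//   const { url, init } = buildChatRequest('http://127.0.0.1:11434/v1', 'llama3.2', 'Say hello.')
//   const response = await fetch(url, init)
//   console.log(response.status, await response.text())
```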