VLM Context size for local models (e.g. Ollama) #2784
Replies: 2 comments
Hi @98h398hrpohpoai! I'm Dosu and I'm helping the docling team.

Yes, you can adjust the context size for local VLM models like Ollama in Docling, but it isn't exposed as a dedicated option in the UI or CLI. Instead, you need to set the `max_tokens` parameter manually in the `params` dictionary when creating your `ApiVlmOptions`. For example:

```python
from docling.datamodel.pipeline_options import ApiVlmOptions, ResponseFormat

options = ApiVlmOptions(
    url="http://localhost:11434/v1/chat/completions",
    params={
        "model": "your-ollama-model",
        "max_tokens": 2048,  # set your desired context size here
    },
    prompt="Your prompt here",
    timeout=90,
    response_format=ResponseFormat.MARKDOWN,
)
```

This `params` dictionary is passed directly to the Ollama API, so you can include any supported Ollama parameters there. The default context size of 4096 only applies to inline models unless you override it for Ollama this way. Just make sure the model you're using with Ollama supports the context size you specify. Ollama's API sometimes has quirks with OpenAI compatibility, but those don't affect context size configuration itself.

To reply, just mention @dosu.
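For reference, here is a sketch of how such `ApiVlmOptions` typically plug into a converter, modeled on docling's VLM pipeline example; exact import paths and class names can differ between docling versions, so verify against the release you have installed:

```python
# Sketch: wiring the ApiVlmOptions defined above into a DocumentConverter.
# Names follow docling's VLM pipeline example and may vary by version.
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import VlmPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption
from docling.pipeline.vlm_pipeline import VlmPipeline

pipeline_options = VlmPipelineOptions(
    enable_remote_services=True,  # needed for API-backed (non-inline) models
)
pipeline_options.vlm_options = options  # the ApiVlmOptions from the snippet above

converter = DocumentConverter(
    format_options={
        InputFormat.PDF: PdfFormatOption(
            pipeline_cls=VlmPipeline,
            pipeline_options=pipeline_options,
        )
    }
)
result = converter.convert("document.pdf")  # hypothetical input file
print(result.document.export_to_markdown())
```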
Hi, and thanks! I've tried that; however, the context remains at the default 4096 (confirmed with `ollama ps`).

Edit: The workaround for now is to set the Ollama context size globally (e.g. `Environment="OLLAMA_CONTEXT_LENGTH=98304"` in the Ollama systemd unit), but this isn't a great long-term solution, and it doesn't scale well across multiple tools/agents/etc.
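A likely explanation for why `max_tokens` has no effect: in OpenAI-style APIs it caps the *response* length, while Ollama's context window is controlled by its `num_ctx` option. `num_ctx` is accepted per request by Ollama's native `/api/chat` endpoint, though the OpenAI-compatible `/v1/chat/completions` endpoint that docling calls may silently ignore it. A minimal sketch to check whether your Ollama instance honors a per-request context size (the model name is a placeholder):

```python
# Hedged sketch: ask Ollama's *native* chat API for a larger context
# window via the documented "options.num_ctx" field.
import requests

resp = requests.post(
    "http://localhost:11434/api/chat",  # native endpoint, not /v1
    json={
        "model": "your-ollama-model",  # placeholder: substitute your model
        "messages": [{"role": "user", "content": "Hello"}],
        "options": {"num_ctx": 98304},  # per-request context window
        "stream": False,
    },
    timeout=90,
)
resp.raise_for_status()
print(resp.json()["message"]["content"])
# Afterwards, `ollama ps` should show the model loaded with the larger context.
```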
Is it possible to set the context size when using a VLM with local options like Ollama?
The default is 4096, which seems low, and I haven't seen a way to adjust that specifically in the docs.
I tried the options for the remote models, but those didn't apply.