llm_id2llm_type ignores the factory part of "name@factory", causing wrong model_type for models that exist in multiple factories
Description
TenantLLMService.llm_id2llm_type (api/db/services/tenant_llm_service.py) can return the wrong model_type for a chat model, causing LookupError: Model(<name>@<factory>) not authorized in dialog_service.async_chat, even though the model is correctly configured in tenant_llm with model_type='chat'.
Root Cause
def llm_id2llm_type(llm_id: str) -> str | None:
    from api.db.services.llm_service import LLMService
    llm_id, *_ = TenantLLMService.split_model_name_and_factory(llm_id)
    llm_factories = settings.FACTORY_LLM_INFOS
    for llm_factory in llm_factories:
        for llm in llm_factory["llm"]:
            if llm_id == llm["llm_name"]:
                return llm["model_type"].split(",")[-1]
    ...
split_model_name_and_factory correctly splits llm_id (format "<name>@<factory>") into (name, factory), but the factory part is discarded by the "llm_id, *_ = ..." unpacking. The subsequent loop then iterates over all factories in settings.FACTORY_LLM_INFOS and returns the model_type of the first llm_name match, regardless of which factory it belongs to.
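The first-match behavior described above can be reproduced outside RAGFlow with a small sketch. Everything here is a stand-in: FACTORY_LLM_INFOS is a toy list rather than the real settings object, "SomeEarlierFactory" is a hypothetical factory name, and split_name_and_factory is a simplified assumption about what split_model_name_and_factory does (it just splits on the last "@").

```python
# Toy stand-in for settings.FACTORY_LLM_INFOS. "SomeEarlierFactory" is a
# hypothetical name; only the ordering matters for the demonstration.
FACTORY_LLM_INFOS = [
    {"name": "SomeEarlierFactory",
     "llm": [{"llm_name": "gemini-2.5-flash", "model_type": "image2text"}]},
    {"name": "Google Cloud",
     "llm": [{"llm_name": "gemini-2.5-flash", "model_type": "chat"}]},
]


def split_name_and_factory(llm_id):
    """Simplified stand-in: split "<name>@<factory>" on the last "@"."""
    name, sep, factory = llm_id.rpartition("@")
    return (name, factory) if sep else (llm_id, None)


def name_only_lookup(llm_id):
    """Mirrors the reported behavior: the factory is dropped, first name match wins."""
    name, *_ = split_name_and_factory(llm_id)
    for factory in FACTORY_LLM_INFOS:
        for llm in factory["llm"]:
            if name == llm["llm_name"]:
                return llm["model_type"].split(",")[-1]
    return None


result = name_only_lookup("gemini-2.5-flash@Google Cloud")
print(result)  # image2text, taken from the earlier factory entry
```

Swapping the order of the two toy entries makes the same call return "chat", which shows that the result depends on JSON ordering rather than on the requested factory.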
Concrete Example
gemini-2.5-flash exists in conf/llm_factories.json under multiple factories, e.g.:
- An earlier factory entry (appears first in the JSON array) with "model_type": "image2text"
- The "Google Cloud" (Vertex AI) factory entry with "model_type": "chat"
For llm_id = "gemini-2.5-flash@Google Cloud":
- split_model_name_and_factory returns ("gemini-2.5-flash", "Google Cloud"), but only "gemini-2.5-flash" is kept.
- The loop finds the first llm_name == "gemini-2.5-flash" match, which belongs to a different factory, and returns its model_type, "image2text".
- Back in dialog_service.async_chat:
llm_type = TenantLLMService.llm_id2llm_type(dialog.llm_id)
if llm_type == "image2text":
    llm_model_config = TenantLLMService.get_model_config(dialog.tenant_id, LLMType.IMAGE2TEXT, dialog.llm_id)
else:
    llm_model_config = TenantLLMService.get_model_config(dialog.tenant_id, LLMType.CHAT, dialog.llm_id)
Since llm_type == "image2text", get_model_config looks for a tenant_llm row with model_type='image2text' for this tenant. No such row exists (the tenant only configured model_type='chat' for gemini-2.5-flash@Google Cloud), so the call raises:
LookupError: Model(gemini-2.5-flash@Google Cloud) not authorized
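To illustrate how the wrong llm_type turns into this error, here is a minimal sketch of the branch in async_chat. The TENANT_LLM dict, the get_model_config stand-in, and the config_for_chat helper are all hypothetical simplifications and do not reflect the real service code, which queries the database.

```python
# Hypothetical in-memory stand-in for the tenant_llm table:
# (tenant_id, llm_factory, llm_name) -> model_type
TENANT_LLM = {("tenant-1", "Google Cloud", "gemini-2.5-flash"): "chat"}


def get_model_config(tenant_id, llm_type, llm_id):
    """Toy lookup: succeeds only if the tenant row has the requested model_type."""
    name, _, factory = llm_id.rpartition("@")
    if TENANT_LLM.get((tenant_id, factory, name)) != llm_type:
        raise LookupError(f"Model({llm_id}) not authorized")
    return {"llm_name": name, "llm_factory": factory, "model_type": llm_type}


def config_for_chat(tenant_id, llm_id, llm_type):
    """Same branching as the async_chat snippet: image2text or chat."""
    wanted = "image2text" if llm_type == "image2text" else "chat"
    return get_model_config(tenant_id, wanted, llm_id)


llm_id = "gemini-2.5-flash@Google Cloud"
try:
    # llm_type as (incorrectly) reported by the name-only lookup
    config_for_chat("tenant-1", llm_id, "image2text")
    error = None
except LookupError as exc:
    error = str(exc)
print(error)
```

With llm_type "chat" the same call succeeds, so the tenant configuration itself is not the problem.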
Why a DB workaround is not possible
Adding a second tenant_llm row with the same tenant_id/llm_factory/llm_name but model_type='image2text' is blocked by:
UNIQUE KEY `uk_tenant_llm` (`tenant_id`,`llm_factory`,`llm_name`)
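The effect of that key can be sketched with an in-memory SQLite table. This is only an illustration: the issue's key is written in MySQL syntax, the table below contains just the columns needed for the demonstration, and the tenant ID is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Reduced, hypothetical version of tenant_llm with the same three-column unique key.
conn.execute(
    "CREATE TABLE tenant_llm ("
    "tenant_id TEXT, llm_factory TEXT, llm_name TEXT, model_type TEXT, "
    "UNIQUE (tenant_id, llm_factory, llm_name))"
)

key = ("tenant-1", "Google Cloud", "gemini-2.5-flash")
conn.execute("INSERT INTO tenant_llm VALUES (?, ?, ?, ?)", (*key, "chat"))

try:
    # Same tenant/factory/name, different model_type: the unique key rejects it.
    conn.execute("INSERT INTO tenant_llm VALUES (?, ?, ?, ?)", (*key, "image2text"))
    blocked = False
except sqlite3.IntegrityError as exc:
    blocked = True
    print("second row rejected:", exc)

row_count = conn.execute("SELECT COUNT(*) FROM tenant_llm").fetchone()[0]
```

Because model_type is not part of the key, a tenant can hold only one model_type per (llm_factory, llm_name) pair.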
Suggested Fix
llm_id2llm_type should use the factory information from split_model_name_and_factory to only match llm_name within the specified factory (falling back to the current name-only behavior if no factory suffix is present):
def llm_id2llm_type(llm_id: str) -> str | None:
    from api.db.services.llm_service import LLMService
    llm_name, llm_factory_name = TenantLLMService.split_model_name_and_factory(llm_id)
    llm_factories = settings.FACTORY_LLM_INFOS
    for llm_factory in llm_factories:
        if llm_factory_name and llm_factory["name"] != llm_factory_name:
            continue
        for llm in llm_factory["llm"]:
            if llm_name == llm["llm_name"]:
                return llm["model_type"].split(",")[-1]
    for llm in LLMService.query(llm_name=llm_name):
        return llm.model_type
    llm = TenantLLMService.get_or_none(llm_name=llm_name)
    if llm:
        return llm.model_type
    for llm in TenantLLMService.query(llm_name=llm_name):
        return llm.model_type
    return None
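A regression check for the factory-filtered loop might look like the sketch below. It is an assumption-laden toy: it covers only the FACTORY_LLM_INFOS part of the fix (the LLMService and TenantLLMService fallbacks are omitted), uses a hypothetical "SomeEarlierFactory" entry, and splits on the last "@" instead of calling split_model_name_and_factory.

```python
# Toy factory list; the first entry's name is hypothetical.
FACTORY_LLM_INFOS = [
    {"name": "SomeEarlierFactory",
     "llm": [{"llm_name": "gemini-2.5-flash", "model_type": "image2text"}]},
    {"name": "Google Cloud",
     "llm": [{"llm_name": "gemini-2.5-flash", "model_type": "chat"}]},
]


def factory_aware_lookup(llm_id, known_factories=FACTORY_LLM_INFOS):
    """Match llm_name only inside the requested factory, if one is given."""
    name, sep, factory = llm_id.rpartition("@")
    if not sep:  # no "@factory" suffix: keep the name-only behavior
        name, factory = llm_id, None
    for entry in known_factories:
        if factory and entry["name"] != factory:
            continue
        for llm in entry["llm"]:
            if llm["llm_name"] == name:
                return llm["model_type"].split(",")[-1]
    return None


# The factory suffix now selects the right entry.
assert factory_aware_lookup("gemini-2.5-flash@Google Cloud") == "chat"
assert factory_aware_lookup("gemini-2.5-flash@SomeEarlierFactory") == "image2text"
# Without a suffix, the first name match still wins.
assert factory_aware_lookup("gemini-2.5-flash") == "image2text"
```

Note that in the suggested fix the database fallbacks still match on llm_name alone, so a model that is missing from the requested factory's JSON entry could still resolve through a row belonging to another factory.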
Environment
- RAGFlow version: v0.25.6
- LLM factory: Google Cloud (Vertex AI), model: gemini-2.5-flash
- Deployment: Docker, infiniflow/ragflow:v0.25.6 image