Bugfix: Check for pending model loads before re-attempting by TheConverseEngineer · Pull Request #53 · commaai/miniray

TheConverseEngineer · 2026-06-10T22:01:14Z

This fixes a group of bugs that exhibited the following behavior:

Long model load times would cause multiple load commands to be queued before the first load finished, resulting in multiple loads (which stalls inference).
Tasks are uninterruptible while in a load request, causing miniray to time out and fail while the task is waiting to complete redundant loads.
Even if miniray does successfully kill the task, Triton server still honors the load requests and wastes time and CUDA memory trying to complete them, causing subsequent tasks to fail.

This change prevents load requests from being issued when a model of that name is already being loaded. Furthermore, retry attempts now poll instead of blocking at the endpoint, which makes them cancelable by miniray.
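A minimal sketch of the "skip the load if one is already pending" guard described above, not the code from this PR. `FakeTritonClient`, `is_model_loading`, and `request_load_once` are hypothetical names for illustration. The sketch assumes the repository index is a list of dicts with `name` and `state` keys and that a pending load is reported as `"LOADING"`. The fake `load_model` returns immediately, whereas a real load request can block until the load finishes.

```python
from typing import Dict, List


class FakeTritonClient:
    """Hypothetical in-memory stand-in for a Triton client (illustration only)."""

    def __init__(self) -> None:
        self.states: Dict[str, str] = {}   # model name -> reported state
        self.load_calls: List[str] = []    # every load request that was issued

    def get_model_repository_index(self) -> List[dict]:
        # Assumed shape: one dict per model with "name" and "state" keys.
        return [{"name": n, "state": s} for n, s in self.states.items()]

    def load_model(self, model_name: str) -> None:
        # Record the request and mark the model as still loading.
        self.load_calls.append(model_name)
        self.states[model_name] = "LOADING"


def is_model_loading(client, model: str) -> bool:
    """Return True if the index reports a pending load for `model`."""
    return any(
        entry.get("name") == model and entry.get("state") == "LOADING"
        for entry in client.get_model_repository_index()
    )


def request_load_once(client, model: str) -> bool:
    """Issue a load request only if none is pending; return True if one was sent."""
    if is_model_loading(client, model):
        return False
    client.load_model(model)
    return True
```

With this guard, a second call for the same model while the first load is pending returns `False` and issues no further request, so redundant loads never reach the server.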

haraschax · 2026-06-10T22:23:53Z

This just fixes the blocking nature of the follow-up requests, not the original load request?

haraschax · 2026-06-10T22:25:06Z

-def load_triton_model(client: InferenceServerClient, model: str, config: ModelConfig):
+def load_triton_model(client: InferenceServerClient, model: str, config: ModelConfig, load_timeout = 60):
+  if _is_model_loading(client, model):
+    # If model is loading, wait at most load_timeout for it to finish


this comment doesn't really add anything for me, code seems self-explanatory.

haraschax · 2026-06-10T22:25:21Z

+def load_triton_model(client: InferenceServerClient, model: str, config: ModelConfig, load_timeout = 60):
+  if _is_model_loading(client, model):
+    # If model is loading, wait at most load_timeout for it to finish
+    deadline = time.time() + load_timeout


time.time isn't monotonic. time.perf_counter is

TheConverseEngineer · 2026-06-10T22:25:35Z

Correct. However, the original load request can now be made with a very short timeout, which avoids the 30-second threshold that miniray uses to determine whether a canceled job has actually been canceled.

haraschax · 2026-06-10T22:27:40Z

+    # If model is loading, wait at most load_timeout for it to finish
+    deadline = time.time() + load_timeout
+    while time.time() < deadline and _is_model_loading(client, model):
+      time.sleep(min(5, load_timeout / 5))


This seems like a very random choice of sleep time and difficult to read. Why not just do time.sleep(5)?
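For reference, a sketch of how the polling loop could look with the two review points applied: a monotonic clock for the deadline and a fixed poll interval. This is an illustration under assumptions, not the merged code. `wait_until_loaded` and its parameters are hypothetical, `is_loading` stands in for `_is_model_loading(client, model)`, and `time.monotonic` is used here where the review suggests `time.perf_counter` (both are monotonic). The sleep is also clamped to the time remaining, which is an addition of this sketch.

```python
import time
from typing import Callable


def wait_until_loaded(
    is_loading: Callable[[], bool],
    load_timeout: float = 60.0,
    poll_interval: float = 5.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until `is_loading()` is False; return False if the timeout expires first."""
    # A monotonic clock is unaffected by system clock adjustments.
    deadline = clock() + load_timeout
    while is_loading():
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Short sleeps keep the caller interruptible between polls.
        sleep(min(poll_interval, remaining))
    return True
```

`clock` and `sleep` are injectable only so the loop can be exercised without real waiting; a caller would normally leave them at their defaults.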

TheConverseEngineer added 2 commits June 10, 2026 14:51

added model load check

6665c3e

removed trailing whitespace

2acafbd

TheConverseEngineer requested a review from haraschax June 10, 2026 22:06

haraschax reviewed Jun 10, 2026

Comment thread lib/triton_helpers.py Outdated

haraschax reviewed Jun 10, 2026

refactor

a74d079

TheConverseEngineer force-pushed the check-model-loads branch from 8915266 to a74d079 Compare June 10, 2026 22:31

pr feedback

0c333ce

TheConverseEngineer merged commit 4d183eb into master Jun 10, 2026
1 check passed

TheConverseEngineer deleted the check-model-loads branch June 10, 2026 22:37

TheConverseEngineer merged 4 commits into master from check-model-loads
