Feature/gj 66 roundrobin#97
Conversation
The queue had a max size (as that was required by eckit::Queue), but it did not in anyway control the resource usage (producers are already holding all their state in memory before reaching the queue). So, in practice, all this size did was increase complexity
f95d445 to
5895093
Compare
There was a problem hiding this comment.
Pull request overview
This PR replaces the global FIFO WorkQueue with a per-TaskGroup round-robin scheduler to prevent head-of-line blocking by large requests, and removes the old bounded queue-size configuration that didn’t meaningfully cap resource usage.
Changes:
- Reworked
WorkQueueto maintain per-group FIFOs and dispatch tasks in round-robin order across activeTaskGroups. - Removed the legacy queue-size setting (
GRIBJUMP_QUEUESIZE/ConfigOptions::queueSize()), making the queue unbounded. - Added deterministic unit tests (single worker thread) to verify round-robin fairness and non-starvation behavior.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_workqueue.cc | Adds unit tests asserting round-robin dispatch order and lack of starvation. |
| tests/CMakeLists.txt | Registers the new workqueue test and pins it to GRIBJUMP_THREADS=1 for determinism. |
| src/gribjump/Task.cc | Updates task enqueueing to pass the owning TaskGroup* into the new WorkQueue::push API. |
| src/gribjump/remote/WorkQueue.h | Replaces the old single FIFO queue interface with round-robin, per-group queue data structures. |
| src/gribjump/remote/WorkQueue.cc | Implements the round-robin scheduler, shutdown behavior, and worker loop refactor. |
| src/gribjump/Config.h | Removes the queueSize() configuration option declaration. |
| src/gribjump/Config.cc | Removes the queueSize() configuration option implementation. |
| docs/round-robin-scheduling.md | Documents the rationale, design, and behavioral differences of round-robin scheduling vs FIFO. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #97 +/- ##
===========================================
+ Coverage 74.89% 75.61% +0.71%
===========================================
Files 98 99 +1
Lines 5047 5196 +149
Branches 445 460 +15
===========================================
+ Hits 3780 3929 +149
Misses 1267 1267 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Description
Replace fifo queue with round robin scheudling.
Also, the fifo queue had a max size (as that was required by eckit::Queue), but it did not in anyway control the resource usage (producers are already holding all their state in memory before reaching the queue). So this has been uncapped.
(renamed #96 for jira)
Contributor Declaration
By opening this pull request, I affirm the following: