-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Implement pagination for ListWorkers. #8849
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Implements cursor-based pagination for the ListWorkers API in the worker registry. This allows clients to efficiently retrieve large sets of workers in manageable chunks. Changes API Changes: - Added ListWorkersParams struct to encapsulate query parameters (Query, PageSize, NextPageToken) - Added ListWorkersResponse struct with Workers slice and NextPageToken - Updated Registry.ListWorkers interface signature to use the new structs Implementation (registry_impl.go): - Added paginateWorkers function that handles cursor-based pagination - Workers are sorted by WorkerInstanceKey for deterministic ordering when pagination is requested Sorting overhead is acceptable as we expect the number of workers to be in O(1000s). Even for 100K workers, the sort time < 100 milliseconds. Handler (handler.go): - Updated ListWorkers handler to extract pagination params from request and pass to registry
424b485 to
b4cefcc
Compare
|
|
||
| // ListWorkersResponse contains the result of listing workers. | ||
| ListWorkersResponse struct { | ||
| Workers []*workerpb.WorkerHeartbeat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little odd to me that ListWorkers and DescribeWorker return WorkerHeartbeat and not Worker.
To me it seems more intuitive and futureproof if we return a Worker message that has a last_heartbeat field?
Are we certain that later no info other than the last heartbeat is going to be returned in these APIs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The matchingservice actually returns a WorkerInfo (heartbeat is wrapped inside it). See handler.go in this PR. The registry is an internal API and that is why it is returning the heartbeat info. We can always evolve this if needed. There are 3 options:
- Current impl.
- Wrap the heartbeat inside a WorkerInfo struct. Like you suggested.
- Repurpose the matchingservice.ListWorkersResponse here.
We can go either way since this is all internal. Let me know what you think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I confused this with service APIs. all good.
| func (m *registryImpl) ListWorkers(nsID namespace.ID, query string, _ []byte) ([]*workerpb.WorkerHeartbeat, error) { | ||
| if query == "" { | ||
| return m.filterWorkers(nsID, func(_ *workerpb.WorkerHeartbeat) bool { return true }), nil | ||
| func (m *registryImpl) ListWorkers(nsID namespace.ID, params ListWorkersParams) (ListWorkersResponse, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wouldn't all these filtering and pagination logic be removed once we move to CHASM? is there a need to really implement it rn?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. The reason for adding this functionality now is to wrap up the current implementation, and make it usable from the UI (UI team is building this out). I wanted to decouple it from the chasm work as that will land only by end of Q1. This is the only remaining item.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm gonna approve this part without reviewing as it's only temporary and for experiment.
|
|
||
| // ListWorkersResponse contains the result of listing workers. | ||
| ListWorkersResponse struct { | ||
| Workers []*workerpb.WorkerHeartbeat |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I confused this with service APIs. all good.
| func (m *registryImpl) ListWorkers(nsID namespace.ID, query string, _ []byte) ([]*workerpb.WorkerHeartbeat, error) { | ||
| if query == "" { | ||
| return m.filterWorkers(nsID, func(_ *workerpb.WorkerHeartbeat) bool { return true }), nil | ||
| func (m *registryImpl) ListWorkers(nsID namespace.ID, params ListWorkersParams) (ListWorkersResponse, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm gonna approve this part without reviewing as it's only temporary and for experiment.
What changed?
Implements cursor-based pagination for the ListWorkers API in the worker registry. This allows clients to efficiently retrieve large sets of workers in manageable chunks.
Changes
API Changes:
Implementation (registry_impl.go):
Handler (handler.go):
Why?
This allows clients to efficiently retrieve large sets of workers in manageable chunks.
How did you test it?
Potential risks
Minimal. Existing calls that don't specify pagination params will not exercise the new path. We do an early return without applying pagination logic.