Skip to content

Conversation

@rkannan82
Copy link
Contributor

What changed?

Implements cursor-based pagination for the ListWorkers API in the worker registry. This allows clients to efficiently retrieve large sets of workers in manageable chunks.

Changes
API Changes:

  • Added ListWorkersParams struct to encapsulate query parameters (Query, PageSize, NextPageToken)
  • Added ListWorkersResponse struct with Workers slice and NextPageToken
  • Updated Registry.ListWorkers interface signature to use the new structs

Implementation (registry_impl.go):

  • Added paginateWorkers function that handles cursor-based pagination
  • Workers are sorted by WorkerInstanceKey for deterministic ordering when pagination is requested Sorting overhead is acceptable as we expect the number of workers to be in O(1000s). Even for 100K workers, the sort time < 100 milliseconds.

Handler (handler.go):

  • Updated ListWorkers handler to extract pagination params from request and pass to registry

Why?

This allows clients to efficiently retrieve large sets of workers in manageable chunks.

How did you test it?

  • built
  • run locally and tested manually
  • covered by existing tests
  • added new unit test(s)
  • added new functional test(s)

Potential risks

Minimal. Existing calls that don't specify pagination params will not exercise the new path. We do an early return without applying pagination logic.

Implements cursor-based pagination for the ListWorkers API in the worker registry. This allows clients to efficiently retrieve large sets of workers in manageable chunks.

Changes
API Changes:
- Added ListWorkersParams struct to encapsulate query parameters (Query, PageSize, NextPageToken)
- Added ListWorkersResponse struct with Workers slice and NextPageToken
- Updated Registry.ListWorkers interface signature to use the new structs

Implementation (registry_impl.go):
- Added paginateWorkers function that handles cursor-based pagination
- Workers are sorted by WorkerInstanceKey for deterministic ordering when pagination is requested
Sorting overhead is acceptable as we expect the number of workers to be in O(1000s). Even for 100K workers, the sort time < 100 milliseconds.

Handler (handler.go):
- Updated ListWorkers handler to extract pagination params from request and pass to registry
@rkannan82 rkannan82 requested review from a team as code owners December 17, 2025 19:02
@rkannan82 rkannan82 requested a review from ShahabT December 17, 2025 20:26
@rkannan82 rkannan82 force-pushed the kannanr/paginate_list_workers_response branch from 424b485 to b4cefcc Compare December 17, 2025 22:48

// ListWorkersResponse contains the result of listing workers.
ListWorkersResponse struct {
Workers []*workerpb.WorkerHeartbeat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a little odd to me that ListWorkers and DescribeWorker return WorkerHeartbeat and not Worker.

To me it seems more intuitive and futureproof if we return a Worker message that has a last_heartbeat field?

Are we certain that later no info other than the last heartbeat is going to be returned in these APIs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The matchingservice actually returns a WorkerInfo (heartbeat is wrapped inside it). See handler.go in this PR. The registry is an internal API and that is why it is returning the heartbeat info. We can always evolve this if needed. There are 3 options:

  1. Current impl.
  2. Wrap the heartbeat inside a WorkerInfo struct. Like you suggested.
  3. Repurpose the matchingservice.ListWorkersResponse here.

We can go either way since this is all internal. Let me know what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I confused this with service APIs. all good.

func (m *registryImpl) ListWorkers(nsID namespace.ID, query string, _ []byte) ([]*workerpb.WorkerHeartbeat, error) {
if query == "" {
return m.filterWorkers(nsID, func(_ *workerpb.WorkerHeartbeat) bool { return true }), nil
func (m *registryImpl) ListWorkers(nsID namespace.ID, params ListWorkersParams) (ListWorkersResponse, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't all these filtering and pagination logic be removed once we move to CHASM? is there a need to really implement it rn?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right. The reason for adding this functionality now is to wrap up the current implementation, and make it usable from the UI (UI team is building this out). I wanted to decouple it from the chasm work as that will land only by end of Q1. This is the only remaining item.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna approve this part without reviewing as it's only temporary and for experiment.

@rkannan82 rkannan82 requested a review from ShahabT December 18, 2025 01:18

// ListWorkersResponse contains the result of listing workers.
ListWorkersResponse struct {
Workers []*workerpb.WorkerHeartbeat
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I confused this with service APIs. all good.

func (m *registryImpl) ListWorkers(nsID namespace.ID, query string, _ []byte) ([]*workerpb.WorkerHeartbeat, error) {
if query == "" {
return m.filterWorkers(nsID, func(_ *workerpb.WorkerHeartbeat) bool { return true }), nil
func (m *registryImpl) ListWorkers(nsID namespace.ID, params ListWorkersParams) (ListWorkersResponse, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna approve this part without reviewing as it's only temporary and for experiment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants