Add Kubernetes watch-list selectors#1311
Conversation
Signed-off-by: Alex Streed <alex.s@prefect.io>
570d8a2 to
f4eff0c
Compare
There was a problem hiding this comment.
Thanks for this PR. Please give me some time to digest this proposed change in the DSL/API. I am very careful and cautious with the API changes, since it is impossible to change the API later (after the release). I usually play with the code samples with different syntactic approaches — this is what takes time — until something settles based on the gut feeling.
I've put a few remarks for myself below for specific lines. One more thing: at some point in time, I also considered mapping the existing labels=/field= filters into the API filters, but only those that have specific values or presence/absence markers (not callables), and only the common subset of all handlers. At that point in time, it was too difficult to implement (read: I was too lazy), so I abandoned this idea. Maybe it is time to reconsider this. But the split into filters & settings can be appropriate too.
| settings.watching.server_side_selectors[("", "v1", "pods")] = ( | ||
| kopf.WatchListSelector( | ||
| label_selector="prefect.io/flow-run-id", | ||
| field_selector="status.phase!=Succeeded,status.phase!=Failed", | ||
| ) | ||
| ) | ||
| settings.watching.server_side_selectors[("batch", "v1", "jobs")] = ( | ||
| kopf.WatchListSelector(label_selector="prefect.io/flow-run-id") | ||
| ) |
There was a problem hiding this comment.
Let me first think on the DSL/API part. I would design it differently, to be used this way:
settings.watching.label_selectors[…] = …
settings.watching.field_selectors[…] = …This automatically eliminates a different class and a data structure, and removes None values from it. — For simplicity.
Besides, I would borrow the logic from KMock on ResourceArray class (associative array in this case) and the kmock.resources[¬] attribute, as documented in https://kmock.readthedocs.io/kubernetes/discovery/ — mainly for string keys instead of 3-item tuples — for readability and flexibility. This would require the parser of those strings from KMock, which is slightly more flexible than Kopf's one (but serves the same function).
I also need to check if there is any syntax in those selectors, to see if it can be mapped to Python dicts — to level it with the labels= filters on the syntactic level. If it is anything more advanced than key=val, then leave it as pass-through strings in the K8s notation.
Just remarks for myself for later.
| @@ -258,6 +259,7 @@ async def spawn_missing_watchers( | |||
| settings=settings, | |||
| resource=resource, | |||
| namespace=namespace, | |||
| server_side_selector=server_side_selector, | |||
There was a problem hiding this comment.
Can you please clarify why you decided to pass this argument (server_side_selector) through the whole call stack?
I see that both settings and resource are already passed down, so this can be resolved in place in the listing/watching calls. Do I miss something after a quick glance?
There was a problem hiding this comment.
I did this mostly due to unfamiliarity with the codebase, but I see now that since settings and resource are already available in the client calls, resolving the selector there is much simpler. So no, I don't think you're missing anything!
Hi! I work on Prefect, where we use Kopf in our Kubernetes worker to watch resources associated with flow runs.
One place this has become important for us is watching pods and jobs. Our handlers are already scoped with Kopf filters, but the initial Kubernetes LIST still returns every object in scope before those filters run inside the process. In larger or multi-tenant clusters, that can mean the worker has to receive and hold many more objects than it actually needs.
This PR adds an explicit, opt-in way to pass Kubernetes
labelSelectorandfieldSelectorparameters to LIST/WATCH requests for specific resources. That gives operators a way to ask Kubernetes to do the coarse filtering before objects are sent back to Kopf.The setting is intentionally explicit. Kopf does not try to translate handler filters into Kubernetes selectors, since those are different contracts and Kubernetes field-selector support varies by resource. Existing handler filters such as
labels=,annotations=,field=,value=, andwhen=continue to work exactly as they do today.When no selectors are configured, LIST/WATCH behavior stays unchanged.
This came up while investigating Prefect Kubernetes worker observer OOMs: PrefectHQ/prefect#22208