
Support kubernetes metadata in events #212

Merged
haesbaert merged 1 commit into main from gokube
Aug 20, 2025

Conversation

Collaborator

@haesbaert haesbaert commented Aug 7, 2025

Support kubernetes metadata in events
With this change, we can now correlate process events with container and pod information:

```
->105511 (FORK+EXEC)
  COMM  comm=agnhost
  CMDL  cmdline=[ /agnhost, netexec, --http-port=8080 ]
  PROC  ppid=105489
  PROC  uid=0 gid=0 suid=0 sgid=0 euid=0 egid=0 pgid=105511 sid=105511
  PROC  cap_inheritable=0x0 cap_permitted=0xa80425fb cap_effective=0xa80425fb
  PROC  cap_bset=0x0 cap_ambient=0x0
  PROC  time_boot=1755681539339596940 tty_major=0 tty_minor=0
  PROC  uts_inonum=4026534147 ipc_inonum=4026533581
  PROC  mnt_inonum=4026534146 net_inonum=4026533741
  PROC  entity_id=J5wBAIw0DNsOb10Y, entry_leader_type=UNKNOWN entry_leader=0
  CWD   cwd=/
  FNAM  filename=/agnhost
  CGRP  cgroup=/system.slice/docker-4a995ac3f123fd2d0c16578af72b1132d9938459ba3aecea0ca08a5718146ff9.scope/kubepods.slice/kubepods-besteffort.slice/kubepods-besteffort-podcf9c2244_53ad_417e_9509_88fdbac14a59.slice/docker-d32ce750bd4d47749c4ac7bae31672fbbafb1c647219324b8e1e2784e7996609.scope
**POD   name=hello-node-c74958b5d-hfg57 namespace=default
**POD   uid=cf9c2244-53ad-417e-9509-88fdbac14a59 phase=Running
**POD   labels=[ app=hello-node, pod-template-hash=c74958b5d ]
**CONT  name=agnhost image=registry.k8s.io/e2e-test-images/agnhost:2.39
**CONT  container_id=docker://d32ce750bd4d47749c4ac7bae31672fbbafb1c647219324b8e1e2784e7996609
```

Currently only docker containers under kubernetes are supported; adding others
should be relatively easy as long as all the information is in the cgroup.
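To make the cgroup dependency concrete, here is a minimal sketch of pulling a docker container id out of a cgroup path like the one in the event above. It is not the parser in quark.c: the function name `cgroup_to_container_id` and the rule "last path component must be `docker-<64 hex digits>.scope`" are assumptions made for illustration. In the sample event, the path has two `docker-` scopes and it is the last one that matches the reported `container_id`.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define CONTAINER_ID_LEN 64	/* hex digits in a docker container id */

/*
 * Hypothetical helper, not quark's actual parser: copy the id found in the
 * last component of a cgroup path into `id`.
 * Returns 0 on success, -1 if that component is not "docker-<id>.scope".
 */
int
cgroup_to_container_id(const char *cgroup, char id[CONTAINER_ID_LEN + 1])
{
	static const char prefix[] = "docker-";
	static const char suffix[] = ".scope";
	const char *last, *p;
	size_t i;

	assert(cgroup != NULL && id != NULL);

	/* Only the last component names the container itself. */
	last = strrchr(cgroup, '/');
	last = (last == NULL) ? cgroup : last + 1;

	if (strncmp(last, prefix, sizeof(prefix) - 1) != 0)
		return (-1);
	p = last + sizeof(prefix) - 1;

	/* isxdigit() rejects the terminating NUL, so this cannot overrun. */
	for (i = 0; i < CONTAINER_ID_LEN; i++) {
		if (!isxdigit((unsigned char)p[i]))
			return (-1);
		id[i] = p[i];
	}
	id[CONTAINER_ID_LEN] = '\0';

	if (strcmp(p + CONTAINER_ID_LEN, suffix) != 0)
		return (-1);

	return (0);
}
```

Supporting another runtime would, under this assumption, mostly mean recognizing a different prefix in the last component.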

The way it works is we have a companion binary, quark-kube-talker, written in
Go, that talks to the kubernetes controller and writes the information back on a
pipe. For now, it's the user's responsibility to manage the pipe file
descriptor, but we can improve it in the future.
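As a rough illustration of what "managing the pipe" could look like on the caller's side, the sketch below starts a child process with one end of a pipe and keeps the read end. The PR text does not say how quark-kube-talker learns which descriptor to write to, so the use of the child's stdout here is an assumption, and `spawn_with_pipe` is a hypothetical helper, not part of quark's API.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: run argv[0] with its stdout connected to a pipe.
 * Returns the read end of the pipe, or -1 on error. The child's pid is
 * stored in *pid so the caller can wait for it later.
 * ASSUMPTION: the child reports on stdout; the PR does not specify this.
 */
int
spawn_with_pipe(char *const argv[], pid_t *pid)
{
	int fds[2];

	assert(argv != NULL && argv[0] != NULL && pid != NULL);

	if (pipe(fds) == -1)
		return (-1);

	switch (*pid = fork()) {
	case -1:
		close(fds[0]);
		close(fds[1]);
		return (-1);
	case 0:
		/* Child: stdout becomes the write end of the pipe. */
		close(fds[0]);
		if (dup2(fds[1], STDOUT_FILENO) == -1)
			_exit(127);
		if (fds[1] != STDOUT_FILENO)
			close(fds[1]);
		execvp(argv[0], argv);
		_exit(127);	/* only reached if exec failed */
	default:
		/* Parent: keep only the read end, so EOF shows up on child exit. */
		close(fds[1]);
		return (fds[0]);
	}
}
```

The caller owns the returned descriptor and the child: it has to close the former and reap the latter, which is the bookkeeping the paragraph above leaves to the user for now.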

We read from the pipe in quark_queue_get_events(), which populates kubernetes
metadata. Then, at every event, we try to correlate processes with container
metadata, so a user can access kubernetes info via the process pointer.

Now, a quark_process points to a quark_container, which points to a quark_pod.
A quark_pod is indexed by the kubernetes uid and has a list of containers it owns.
A quark_container is indexed by container_id inside quark_queue, and also
indexed inside the pod by container_id.
A quark_container also holds a list of every process that it owns, and the
process has a back pointer.
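The relationships described above can be sketched roughly as follows. These are not the definitions from quark.h: the `sketch_` types, their fields, and the plain linked lists are stand-ins chosen to keep the example short, and the real code may well use different containers for its indexes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sketch_container;
struct sketch_process;

struct sketch_pod {
	const char		*uid;		/* kubernetes uid, the pod's key */
	struct sketch_container	*containers;	/* containers owned by the pod */
};

struct sketch_container {
	const char		*container_id;	/* key, in the queue and in the pod */
	struct sketch_pod	*pod;		/* owning pod */
	struct sketch_container	*next_in_pod;
	struct sketch_process	*processes;	/* processes owned by the container */
};

struct sketch_process {
	int			 pid;
	struct sketch_container	*container;	/* back pointer, NULL if none */
	struct sketch_process	*next_in_container;
};

/* Link a container into its pod and set the container's pod pointer. */
void
pod_add_container(struct sketch_pod *pod, struct sketch_container *c)
{
	assert(pod != NULL && c != NULL);
	c->pod = pod;
	c->next_in_pod = pod->containers;
	pod->containers = c;
}

/* Look a container up inside a pod by container_id; NULL if absent. */
struct sketch_container *
pod_find_container(const struct sketch_pod *pod, const char *container_id)
{
	struct sketch_container *c;

	for (c = pod->containers; c != NULL; c = c->next_in_pod)
		if (strcmp(c->container_id, container_id) == 0)
			return (c);
	return (NULL);
}

/* Link a process into its container and set the back pointer. */
void
container_add_process(struct sketch_container *c, struct sketch_process *p)
{
	assert(c != NULL && p != NULL);
	p->container = c;
	p->next_in_container = c->processes;
	c->processes = p;
}

/* Walk process -> container -> pod; NULL if the process is not enriched. */
const char *
process_pod_uid(const struct sketch_process *p)
{
	if (p->container == NULL || p->container->pod == NULL)
		return (NULL);
	return (p->container->pod->uid);
}
```

The point of the back pointers is that a consumer holding only the process can reach pod metadata in two hops, without any lookup.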

If quark is configured with kubernetes, quark_queue_open() now blocks for 2
seconds while populating the kubernetes database. This is done before scraping
/proc and before opening the ringbuffer; otherwise we might actually lose events
while blocked for those 2 seconds. This is a compromise: 2 seconds "looks
reasonable", and we want all existing processes to be enriched by the time
quark_queue_open() returns the first time.
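A bounded initial read like this can be sketched with poll() and a monotonic deadline. This is only an illustration of the idea, not the code in quark.c: `drain_for` is a hypothetical name, and the sketch just counts bytes where a real consumer would parse the pod events.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <time.h>
#include <unistd.h>

/* Current monotonic time in milliseconds. */
static long long
now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ((long long)ts.tv_sec * 1000 + ts.tv_nsec / 1000000);
}

/*
 * Hypothetical helper: read from fd for at most budget_ms milliseconds.
 * Returns the number of bytes consumed, or -1 on error. Stops early on EOF.
 */
long
drain_for(int fd, int budget_ms)
{
	char buf[4096];
	long total = 0;
	long long deadline;

	assert(fd >= 0 && budget_ms >= 0);
	deadline = now_ms() + budget_ms;

	for (;;) {
		struct pollfd pfd = { .fd = fd, .events = POLLIN };
		long long left = deadline - now_ms();
		ssize_t n;
		int r;

		if (left <= 0)
			break;		/* budget exhausted */
		r = poll(&pfd, 1, (int)left);
		if (r == -1) {
			if (errno == EINTR)
				continue;
			return (-1);
		}
		if (r == 0)
			break;		/* nothing arrived before the deadline */
		n = read(fd, buf, sizeof(buf));
		if (n == -1) {
			if (errno == EINTR || errno == EAGAIN)
				continue;
			return (-1);
		}
		if (n == 0)
			break;		/* EOF: the writer closed the pipe */
		total += n;		/* a real consumer would parse buf here */
	}
	return (total);
}
```

Recomputing the remaining time on every iteration keeps the total wait bounded by the budget even when data trickles in over many small reads.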

Changes to the build system were needed to build the go binary.

The reason for writing this in Go is that the kubernetes-c-api pulls in a ton of
dependencies and is not packaged by major distributions. We might consider
rewriting it in C in the future, so we don't have to ship a 65MB Go binary.

To test this I have minikube running on my machine and I create/delete/modify
pods and see if the resulting events are properly enriched. Adding proper tests
should come next, but it involves firing up minikube and other things.

Many thanks for the in-depth review by Mr. Berlin, he found one serious bug and
a bunch of minor things.

Co-authored-by: Nicholas Berlin <56366649+nicholasberlin@users.noreply.github.com>

@nicholasberlin nicholasberlin requested a review from Copilot August 7, 2025 13:16

Copilot AI left a comment


Pull Request Overview

This PR adds Kubernetes metadata support to the quark system, enabling correlation between processes and their corresponding Kubernetes pods and containers. The implementation includes a Go-based Kubernetes event listener and C structures to manage pod/container metadata.

  • Adds data structures and APIs for tracking Kubernetes pods and containers
  • Implements a Go-based Kubernetes API client (quark-kube-talker) that streams pod events
  • Integrates container ID lookup from process cgroup information to link system events with Kubernetes metadata

Reviewed Changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| quark.h | Defines new data structures for pods, containers, and Kubernetes integration |
| quark.c | Implements pod/container management, JSON parsing, and process-to-container mapping |
| quark-mon.c | Adds initialization for the Kubernetes event processing pipeline |
| quark-kube-talker.go | Go application that watches Kubernetes API and streams pod events |
| go.mod | Go module dependencies for Kubernetes client libraries |
| Makefile | Build configuration updates to include the Go component |

Contributor

@nicholasberlin nicholasberlin left a comment


Half done

@haesbaert haesbaert changed the title from "BIG DRAFT kubernetes metadata" to "Support kubernetes metadata in events" Aug 20, 2025
@haesbaert haesbaert marked this pull request as ready for review August 20, 2025 09:37
@haesbaert haesbaert requested a review from a team as a code owner August 20, 2025 09:37
Contributor

@nicholasberlin nicholasberlin left a comment


Changes LGTM

@haesbaert haesbaert merged commit 9565e81 into main Aug 20, 2025
2 checks passed