@dylandreimerink dylandreimerink commented Nov 21, 2025

This PR introduces a deduplication mechanism for BTF types.

Most of the time BTF types originate from a single spec where a compiler or other external tool has already ensured that types are unique. In such cases we can simply rely on pointer equality to determine if two types are the same.

When dealing with manually generated BTF or merging multiple BTF specs, duplicate types are common, meaning we end up with multiple distinct Go objects which represent the same underlying BTF type.

It is useful to be able to deduplicate these types, both to reduce the size of the resulting BTF and to allow name-based lookups after combining multiple specs with duplicate types.

The deduplication algorithm is loosely based on the one used in libbpf. For example, this version does not do FWD type resolution, since that is only needed when combining BTF from multiple compilation units, something typically not seen in eBPF use-cases (only pahole). In the libbpf implementation the first step is string deduplication; however, we do this step during marshaling, so we do not deduplicate strings in the Go representation.

When a type is deduplicated, we try to deduplicate not just that root type, but the full subtree of types reachable from that type. We start by traversing all types in post-order, and any time there is an edge we try to replace that child with an equal type we have already seen.

Comparing every type with those seen before would be very expensive. Instead, we compute a hash of each type. The hash is an approximation of all properties of the type, including recursively hashed child types. Using this hash as a key in a map gives us a set of candidate types which might be equal to the type we are currently deduplicating. We still need a full equality check to be sure two types are equal, both to rule out hash collisions and to compare properties which are not included in the hash (due to the recursion limit). In practically every case the hash narrows down to 0 or 1 candidate types.

Once we have narrowed down the candidate types, we do a full equality check in which we walk the two types together in a depth-first manner and bail out as soon as we find a difference. Since types can form cycles via pointers, we keep track of types already visited in the current equality check and assume they are equal.

This deduplication mechanism can be used via a standalone function, but it is also integrated into the BTF spec builder via a new method to add and deduplicate types.

The implementation as it stands performs reasonably for its intended application:

goos: linux
goarch: amd64
pkg: github.com/cilium/ebpf/btf
cpu: 13th Gen Intel(R) Core(TM) i7-13800H
                      │   sec/op    │
DeduplicateSKBuff-20    9.253m ± 2%
DeduplicateVMLinux-20   103.5m ± 9%

                      │     B/op      │
DeduplicateSKBuff-20    2.744Mi ±  1%
DeduplicateVMLinux-20   30.02Mi ± 16%

                      │  allocs/op   │
DeduplicateSKBuff-20     318.0 ±  1%
DeduplicateVMLinux-20   8.065k ± 18%

So ~9ms for struct sk_buff, which has ~5800 types, and ~103ms for vmlinux, which has ~93000 types. For smaller types, such as map key and value types, the cost is obviously much lower.

Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com>
@dylandreimerink dylandreimerink marked this pull request as ready for review November 24, 2025 13:28
@dylandreimerink dylandreimerink requested review from lmb and ti-mo November 24, 2025 13:28

@lmb lmb left a comment


Neat feature, looking forward to this!

"slices"
)

func Deduplicate(roots []Type) ([]Type, error) {
Collaborator

Please add a doc comment.

Collaborator

Is this even used at the moment? Maybe just remove it outright if the use case is via the builder.

Member Author

I think we can get rid of it to minimize the API surface. It was useful while testing, but we can just call the unexported function instead.


// maphash.Hash by default uses a random seed per Hash instance.
// To ensure consistent hashing across multiple calls, we use a fixed seed for the lifetime of the program.
var seed = maphash.MakeSeed()
Collaborator

I think we should store maphash.Hash and Seed in deduper instead. That avoids temporarily allocating Hash (which is big-ish) and gets rid of global state, which is always a bit of a smell.

depthBudget int
}

// typeHash computes a hash for `t`. The produced hash is the same for types which are similar.
Collaborator

What does "similar" mean here? equivalent?

Collaborator

P.S. It took me a long time to figure out what this really does: produce a hash of the first couple of levels of indirection, as an approximation of equivalence. This is a really neat idea, I wonder whether we can simplify it even more.

Right now depth is changed dynamically based on what kind of type we are looking at. Sometimes we use depth 1, sometimes 2. This trades off how much work we do. At the same time it avoids infinite recursion for self referential types. I think instead of having some dynamic level we could always use a fixed budget. Basically make this the hash of the first N levels of indirection. If a type is self referential we just end up hashing the same data up to N times, which isn't a problem as long as N is something reasonably small.

This has the nice side effect that hash(T) now doesn't change anymore, and the cache can become map[Type]uint64. We might also be able to move the cache into the caller. Plus the code is a bit easier to understand.

Member Author

@dylandreimerink dylandreimerink Nov 27, 2025

Yes, as you might imagine, this is where I started: a static budget. The issue is that in select cases, specifically double pointers and function prototypes, you often end up with hash collisions, since the first levels of these are often similar, causing a lot of work in equality checks. Increasing the budget overall fixes that, but it also makes the hash much more expensive, canceling out the benefit. Which is why I settled on the current state, adding depth where it reduces collisions, going from ~100 collisions (in double pointers and func protos) to <10 collisions across all types in vmlinux.

So we can simplify at the cost of some performance; I can run the numbers so we can quantify.

Member Author

Here are the numbers when we swap out the dynamic depth limit for a static limit of 1, 2 or 3:

goos: linux
goarch: amd64
pkg: github.com/cilium/ebpf/btf
cpu: 13th Gen Intel(R) Core(TM) i7-13800H
                      │ dynamic-depth.txt │             depth-1.txt              │             depth-2.txt              │             depth-3.txt              │
                      │      sec/op       │    sec/op     vs base                │    sec/op     vs base                │    sec/op     vs base                │
DeduplicateSKBuff-20          8.290m ± 4%   10.035m ± 5%  +21.04% (p=0.000 n=10)   9.624m ± 10%  +16.08% (p=0.000 n=10)   12.523m ± 5%  +51.05% (p=0.000 n=10)
DeduplicateVMLinux-20         95.49m ± 4%   123.51m ± 3%  +29.35% (p=0.000 n=10)   96.37m ± 10%        ~ (p=0.315 n=10)    98.56m ± 2%   +3.22% (p=0.004 n=10)
geomean                       28.14m         35.21m       +25.12%                  30.45m         +8.24%                   35.13m       +24.86%

                      │ dynamic-depth.txt │             depth-1.txt              │              depth-2.txt              │              depth-3.txt              │
                      │       B/op        │     B/op       vs base               │     B/op       vs base                │     B/op       vs base                │
DeduplicateSKBuff-20        2.505Mi ±  4%   2.559Mi ± 17%  +2.17% (p=0.014 n=10)   2.804Mi ± 19%  +11.92% (p=0.000 n=10)   3.770Mi ± 17%  +50.48% (p=0.000 n=10)
DeduplicateVMLinux-20       29.06Mi ± 10%   29.74Mi ±  0%       ~ (p=0.089 n=10)   30.99Mi ±  7%   +6.63% (p=0.014 n=10)   31.99Mi ±  3%  +10.09% (p=0.007 n=10)
geomean                     8.533Mi         8.724Mi        +2.24%                  9.321Mi         +9.24%                  10.98Mi        +28.71%

                      │ dynamic-depth.txt │             depth-1.txt              │             depth-2.txt              │             depth-3.txt             │
                      │     allocs/op     │  allocs/op    vs base                │  allocs/op    vs base                │  allocs/op   vs base                │
DeduplicateSKBuff-20          298.0 ±  3%     320.0 ± 3%   +7.38% (p=0.000 n=10)    325.0 ± 12%   +9.06% (p=0.000 n=10)    387.0 ± 8%  +29.87% (p=0.000 n=10)
DeduplicateVMLinux-20        7.081k ± 10%   12.128k ± 0%  +71.28% (p=0.000 n=10)   8.315k ±  7%  +17.43% (p=0.000 n=10)   7.968k ± 4%  +12.53% (p=0.001 n=10)
geomean                      1.453k          1.970k       +35.62%                  1.644k        +13.17%                  1.756k       +20.89%

It seems a static depth of 2 is the best alternative at ~10% slower.

return cached
}
defer func() {
cache[key] = sum
Collaborator

Nit: neat but brittle due to variable shadowing. I'd probably just do this explicitly at the bottom of the method, since there is only a single return statement.

h := &hash
h.SetSeed(seed)

switch t := t.(type) {
Collaborator

One problem with this approach is that it is very easy to forget to implement this for new types. Could we instead do something like the following?

type Type interface {
	// ...

	// hash writes all comparable fields of the type to h, excluding any [Type]s.
	hash(*maphash.Hash)
}

func hashType(h *maphash.Hash, t Type, budget int) {
	if budget < 0 {
		return
	}

	// Write all non-Type fields to h. Since we don't descend into Type here,
	// there is no unbounded recursion.
	t.hash(h)

	for _, c := range children(t) {
		hashType(h, *c, budget-1)
	}
}

func typeHash(h *maphash.Hash, s maphash.Seed, t Type) uint64 {
	h.SetSeed(s)
	hashType(h, t, N)
	return h.Sum64()
}

(Assumes that caching of typeHash happens in the caller.)

Another upside is that this could let us add a test in testType or whatever that is called which asserts some behaviour of hash.

}

// Check the cache to avoid recomputing the hash for this type and depth budget.
key := hashCacheKey{t, depthBudget}
Collaborator

Does it really pay off to put all types into the cache? Intuitively I'd expected we'd only cache composite types.

Member Author

Yes, I believe it does. I recall testing just composites and this was better. I will double check and post the numbers.

Member Author

Turns out just caching composites is better, so will do that instead:

goos: linux
goarch: amd64
pkg: github.com/cilium/ebpf/btf
cpu: 13th Gen Intel(R) Core(TM) i7-13800H
                      │ cache-all.txt │        cache-composites.txt         │
                      │    sec/op     │   sec/op     vs base                │
DeduplicateSKBuff-20      8.322m ± 1%   8.045m ± 0%   -3.33% (p=0.000 n=30)
DeduplicateVMLinux-20     98.95m ± 2%   85.39m ± 1%  -13.71% (p=0.000 n=30)
geomean                   28.70m        26.21m        -8.67%

                      │ cache-all.txt │         cache-composites.txt         │
                      │     B/op      │     B/op      vs base                │
DeduplicateSKBuff-20     2.673Mi ± 1%   1.725Mi ± 1%  -35.46% (p=0.000 n=30)
DeduplicateVMLinux-20    30.98Mi ± 6%   22.27Mi ± 3%  -28.12% (p=0.000 n=30)
geomean                  9.100Mi        6.198Mi       -31.89%

                      │ cache-all.txt │        cache-composites.txt         │
                      │   allocs/op   │  allocs/op   vs base                │
DeduplicateSKBuff-20       311.0 ± 0%    260.0 ± 1%  -16.40% (p=0.000 n=30)
DeduplicateVMLinux-20     8.314k ± 6%   8.097k ± 3%        ~ (p=0.059 n=30)
geomean                   1.608k        1.451k        -9.77%

return dm.deduplicateSingle(t)
}

func (dm *deduper) deduplicateSingle(t Type) Type {
Collaborator

Nit: I think you can merge Deduplicate and deduplicateSingle?

return true
}

switch a := aTyp.(type) {
Collaborator

Similar feelings about this switch as the one above: it's very easy to forget to implement things. Not sure how to fix this though. The core code looks veeery similar; I'll look at that and maybe I can find a common abstraction.


// Detect cycles by tracking visited types. Assume types are equal if we have already
// visited this type in the current equality check.
if slices.Contains(visited, aTyp) {
Collaborator

This is going to get quite expensive pretty fast, no?

Member Author

It really depends. It turns out CPUs are quite fast at checking this when it's in contiguous memory, thanks to the predictable access pattern, cache lines, etc. So for small-ish slices it's faster than a map, both for lookups and for writes. Doing this only for pointers and not all types really helps. I can also double check and re-measure.

return a.Value == b.Value &&
typesEqual(a.Type, b.Type, visited, cache)
default:
panic(fmt.Sprintf("unsupported type for equality: %T", a))
Collaborator

Not sure how I feel about this panic as well.
