Skip to content

jamra/cuckoo

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

2 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Cuckoo Filter

Go Reference Go Report Card License: MIT

A high-performance, generic Cuckoo filter implementation in Go. Cuckoo filters are space-efficient probabilistic data structures used to test whether an element is a member of a set, with the added benefit of supporting deletions unlike traditional Bloom filters.

Features

  • Generic Implementation: Works with any hashable type (string, int, []byte, etc.)
  • Deletion Support: Unlike Bloom filters, supports item removal
  • High Performance: O(1) expected time for all operations
  • Space Efficient: Better space utilization than Bloom filters for low false positive rates
  • Bounded False Positives: Guaranteed maximum false positive rate
  • Cache Friendly: Bucket-based design for better memory locality

Installation

go get github.com/jamra/cuckoo

Quick Start

package main

import (
    "fmt"
    "github.com/jamra/cuckoo"
)

func main() {
    // Create a new Cuckoo filter for strings with capacity of 1000
    cf := cuckoo.New[string](1000)
    
    // Insert items
    cf.Insert("apple")
    cf.Insert("banana")
    cf.Insert("cherry")
    
    // Lookup items
    fmt.Println(cf.Lookup("apple"))   // true
    fmt.Println(cf.Lookup("banana"))  // true
    fmt.Println(cf.Lookup("grape"))   // false (probably)
    
    // Delete items (unique feature compared to Bloom filters!)
    cf.Delete("apple")
    fmt.Println(cf.Lookup("apple"))   // false
    
    // Get statistics
    stats := cf.GetStats()
    fmt.Printf("Size: %d, Load Factor: %.2f%%\n", 
        stats.Size, stats.LoadFactor*100)
}

API Reference

Creating a Filter

// Create a filter for strings
cf := cuckoo.New[string](1000)

// Create a filter for integers
cfInt := cuckoo.New[int](1000)

// Create a filter for byte slices
cfBytes := cuckoo.New[[]byte](1000)

Basic Operations

// Insert an item
success := cf.Insert("item")

// Check if item exists (may have false positives)
exists := cf.Lookup("item")

// Delete an item
deleted := cf.Delete("item")

// Get current size
size := cf.Size()

// Get load factor
loadFactor := cf.LoadFactor()

Statistics

stats := cf.GetStats()
fmt.Printf("Size: %d\n", stats.Size)
fmt.Printf("Capacity: %d\n", stats.Capacity)
fmt.Printf("Load Factor: %.2f\n", stats.LoadFactor)
fmt.Printf("Memory Usage: %d bytes\n", stats.MemoryBytes)

Supported Types

The filter supports any type that satisfies the Hashable constraint:

  • string
  • int, int32, int64
  • uint, uint32, uint64
  • []byte

How It Works

Cuckoo Hashing

Cuckoo filters use cuckoo hashing with two hash functions. Each item can be stored in one of two possible buckets:

  1. Primary bucket: Determined by hash of the item
  2. Alternative bucket: Determined by XOR of primary bucket and hash of fingerprint

Fingerprints

Instead of storing full items, Cuckoo filters store compact fingerprints (16-bit in this implementation):

  • Reduces memory usage
  • Enables efficient comparisons
  • Allows reconstruction of alternative bucket index

Cuckoo Eviction

When both possible buckets are full:

  1. Randomly select one of the two buckets
  2. Randomly evict an existing fingerprint
  3. Insert the new fingerprint
  4. Try to relocate the evicted fingerprint to its alternative bucket
  5. Repeat until successful or maximum kicks reached

Performance

Time Complexity

  • Insert: O(1) expected, O(log n) worst case
  • Lookup: O(1)
  • Delete: O(1)

Space Complexity

  • Memory usage: ~2-3 bytes per item (depending on load factor)
  • Load factor: Recommended 50-90% for optimal performance

Cuckoo Filter vs Bloom Filter

Feature Cuckoo Filter Bloom Filter
Deletions ✅ Supported ❌ Not supported
False Positives ✅ Bounded rate ❌ Unbounded growth
Space Efficiency ✅ Better for low FPR ❌ Worse for low FPR
Lookup Speed ✅ O(1) ✅ O(1)
Insert Speed ⚠️ O(1) expected* ✅ Always O(1)
Implementation ⚠️ More complex ✅ Simple

*May fail when filter approaches capacity

Use Cases

When to Use Cuckoo Filters

  • Caching Systems: When you need to evict items from cache
  • Databases: Negative query optimization with deletion support
  • Network Systems: Duplicate detection with item expiration
  • Security: Blocklist/allowlist management with updates
  • Analytics: Unique visitor tracking with GDPR deletion requirements

Testing

Run the test suite:

go test -v

Run benchmarks:

go test -bench=. -benchmem

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

References

About

A cuckoo filter implementation

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages