1 unstable release
| 0.1.0 | Jul 23, 2025 |
|---|
#499 in Concurrency
145KB
280 lines
Blewm: A Bloom Filter that Bloo(m) My Mind
Blewm is a fast, concurrent, lock-free Bloom filter.
The core idea of the fast implementation (deriving multiple bit indexes/positions from a single hash) was heavily insipred by fastbloom. Therefore, it retains its excellent theoretical guarantees and practical properties.
However, Blewm further enhances the implementation in a number of ways:
- It is based on atomics and uses relaxed loads and stores, therefore it can be used from multiple threads, which should have little to no performance penalty on modern platforms such as x64 and aarch64.
- The capacity is always a power of two, so the higher-order bits of the hash can
be used directly for indexing into an atomic
usizeslot, while the lower-order bits can be used for addressing individual bits within that slot, speeding up accesses even further. - The underlying buffer can be accessed by mutable reference, as well as by-value.
- Equality is conditioned on the equality of the hashers, so it's not possible to
accidentally get a
truewhen equating two Blewm filters using different hashers. It would not make sense to compare those as equal, since different hashers produce potentially different bit patterns, which is very typical of randomized hashers of high statistical (such as the std SipHash) or downright cryptographic (e.g. AHash) quality. - It marks many relevant functions as
const - It gets rid of the annoying builder and typestate pattern, and provides a small number of convenience constructors instead.
Example
use blewm::Filter;
fn main() {
// creates a concurrent Bloom filter using the default std hasher,
// the specified capacity, and the desirable maximum false positive rate.
let filter = Filter::new(100_000, 0.00001);
// the filter is still empty
assert_eq!(filter.contains("foo"), false);
// insert some elements
filter.insert("foo");
filter.insert("bar");
// check that they are in fact inserted
assert!(filter.contains("bar"));
assert!(filter.contains("foo"));
// check that an unrelated element is not inserted.
// NOTE: this may fail with the probability specified above, i.e., 0.00001.
assert_eq!(filter.contains("qux"), false);
}
Benchmarks
Run Criterion benchmarks using cargo bench. The results of benchmarking the library
on my own computer can be accessed here.
The main take-away is that Blewm is easily competitive with fastbloom.