Fix TurboQuant heap memory under-reporting #9099

Merged
generall merged 1 commit into dev from fix/turboquant-heap-size-reporting on May 20, 2026
@generall

Member

Problem

EncodedVectorsTQ was the only quantizer that did not override EncodedVectors::heap_size_bytes(), so it fell back to the trait default of 0.

Impact:

  • For the RAM-backed variants (TQRam/TQRamMulti), the underlying storage backend computes the resident quantized-vector size correctly — but EncodedVectorsTQ never delegated to it. The entire RAM-resident quantized dataset was reported as 0 bytes and misclassified as fully on-disk by MemoryReporter for QuantizedVectors.
  • The always-resident quantizer tables (Hadamard rotation + TQ+ error-correction shift/scale/d_prime_sq_i16 vectors) and the per-instance encoding buffer were uncounted for every variant, including mmap.
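
The failure mode described above can be reproduced in miniature. The following is a minimal sketch, not the qdrant code: `EncodedVectorsSketch`, `CountedQuantizer`, and `ForgottenQuantizer` are hypothetical names used only to show how a defaulted accounting method lets one implementor silently report 0.

```rust
use std::mem::size_of;

/// Hypothetical stand-in for the real trait; only the defaulted method matters here.
trait EncodedVectorsSketch {
    /// Defaulted accounting method: an implementor that forgets to override it reports 0.
    fn heap_size_bytes(&self) -> usize {
        0
    }
}

/// Quantizer that overrides the default and reports its buffer.
struct CountedQuantizer {
    data: Vec<u8>,
}

impl EncodedVectorsSketch for CountedQuantizer {
    fn heap_size_bytes(&self) -> usize {
        self.data.capacity() * size_of::<u8>()
    }
}

/// Quantizer that owns the same kind of buffer but never overrides the method.
struct ForgottenQuantizer {
    data: Vec<u8>,
}

// Empty impl: compiles without complaint and inherits the `0` default.
impl EncodedVectorsSketch for ForgottenQuantizer {}

fn main() {
    let counted = CountedQuantizer { data: vec![0u8; 1024] };
    let forgotten = ForgottenQuantizer { data: vec![0u8; 1024] };
    println!("counted reports   {} bytes", counted.heap_size_bytes());
    println!(
        "forgotten reports {} bytes while holding {} bytes",
        forgotten.heap_size_bytes(),
        forgotten.data.len()
    );
}
```

Both structs hold the same amount of heap data, but only the one with an explicit override shows up in the total.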

Fix

  • Make heap_size_bytes() a required trait method (remove the { 0 } default) so every quantizer must account for its own heap explicitly. U8/PQ/Binary already implemented it — only TurboQuant broke, surfacing the gap as a compile error.
  • Add the missing EncodedVectorsTQ impl: storage backend + quantizer tables + encoding buffer.
  • Add TurboQuantizer::heap_size_bytes() (rotation tables + TQ+ error-correction vectors) and HadamardRotation::heap_size_bytes() (chunk metadata; permutations are inline).
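
As a rough illustration of the composition listed above (storage backend + quantizer tables + encoding buffer), here is a sketch under stated assumptions. The names `HeapSized`, `Storage`, `QuantizerTables`, and `Encoded` are hypothetical, and treating the mmap variant's storage as 0 heap bytes is an assumption of this sketch rather than something taken from the diff.

```rust
use std::mem::size_of;

/// Hypothetical trait with no default body: every implementor must provide the method.
trait HeapSized {
    fn heap_size_bytes(&self) -> usize;
}

/// Hypothetical storage backend: RAM-resident bytes or a memory-mapped file.
enum Storage {
    Ram(Vec<u8>),
    Mmap,
}

impl HeapSized for Storage {
    fn heap_size_bytes(&self) -> usize {
        match self {
            Storage::Ram(bytes) => bytes.capacity(),
            // Assumption for this sketch: mapped pages are not counted as heap.
            Storage::Mmap => 0,
        }
    }
}

/// Hypothetical always-resident tables, loosely modeled on the shift/scale vectors.
struct QuantizerTables {
    shift: Vec<f32>,
    scale: Vec<f32>,
}

impl HeapSized for QuantizerTables {
    fn heap_size_bytes(&self) -> usize {
        (self.shift.capacity() + self.scale.capacity()) * size_of::<f32>()
    }
}

struct Encoded {
    storage: Storage,
    tables: QuantizerTables,
    encoding_buffer: Vec<u8>,
}

impl HeapSized for Encoded {
    fn heap_size_bytes(&self) -> usize {
        // Delegate to the backend, then add what this struct owns itself.
        self.storage.heap_size_bytes()
            + self.tables.heap_size_bytes()
            + self.encoding_buffer.capacity()
    }
}

fn main() {
    let tables = || QuantizerTables { shift: vec![0.0; 4], scale: vec![1.0; 4] };
    let ram = Encoded {
        storage: Storage::Ram(vec![0u8; 256]),
        tables: tables(),
        encoding_buffer: Vec::with_capacity(16),
    };
    let mmap = Encoded {
        storage: Storage::Mmap,
        tables: tables(),
        encoding_buffer: Vec::with_capacity(16),
    };
    println!("ram-backed:  {} bytes", ram.heap_size_bytes());
    println!("mmap-backed: {} bytes", mmap.heap_size_bytes());
}
```

Even in the mmap-backed case the total is non-zero, because the tables and the encoding buffer stay resident regardless of where the vectors live.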

All four impls destructure Self so future field additions force a compile-time decision about whether they count toward heap.
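
A small sketch of the destructuring pattern, using a hypothetical `Tables` struct rather than any of the four real impls: because the pattern names every field and has no `..`, adding a field to the struct turns the omission into a compile error (E0027) until the new field is either counted or explicitly ignored.

```rust
use std::mem::size_of;

/// Hypothetical struct; field names are illustrative only.
struct Tables {
    dim: usize,
    shift: Vec<f32>,
    scale: Vec<f32>,
}

impl Tables {
    fn heap_size_bytes(&self) -> usize {
        // Exhaustive pattern, deliberately without `..`.
        // `dim` is a plain integer stored inline, so it is ignored on purpose.
        let Self { dim: _, shift, scale } = self;
        shift.capacity() * size_of::<f32>() + scale.capacity() * size_of::<f32>()
    }
}

fn main() {
    let tables = Tables {
        dim: 8,
        shift: vec![0.0; 8],
        scale: vec![1.0; 8],
    };
    println!("dim = {}, heap = {} bytes", tables.dim, tables.heap_size_bytes());
}
```

Writing `self.shift.capacity()` directly would compute the same number, but a later field addition would then compile silently and go uncounted.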

Verification

cargo check -p quantization, cargo check -p segment, and cargo clippy -p quantization all pass clean.

🤖 Generated with Claude Code

EncodedVectorsTQ was the only quantizer that did not override the
EncodedVectors::heap_size_bytes() trait method, so it fell back to the
default of 0. For the RAM-backed variants (TQRam/TQRamMulti) this meant
the entire resident quantized dataset was reported as 0 bytes and
misclassified as fully on-disk by the MemoryReporter; the always-resident
quantizer tables (rotation + TQ+ error-correction vectors) and encoding
buffer were also uncounted for every variant.

Make heap_size_bytes() a required trait method (remove the default impl)
so every quantizer must account for its own heap explicitly, then add the
missing TurboQuant implementation: storage backend + quantizer tables +
encoding buffer.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@generall generall requested review from IvanPleshkov, JojiiOfficial and Copilot and removed request for Copilot May 19, 2026 23:45
@generall generall marked this pull request as ready for review May 19, 2026 23:46

@coderabbitai coderabbitai Bot left a comment

Contributor


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@lib/quantization/src/turboquant/rotation.rs`:
- Around line 48-58: heap_size_bytes calls size_of::<T>() without an explicit
std::mem::size_of import. Since Rust 1.80, size_of is part of the standard
prelude, so the call resolves on current toolchains (the PR description reports
cargo check passing). Add use std::mem::size_of; near the top of the file with
the other use statements (alongside crate::turboquant::permutation::Permutation
and crate::turboquant::simd) only if the crate must build on an older toolchain.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 8de765dd-3e32-45bb-860d-a70dcbffcb43

📥 Commits

Reviewing files that changed from the base of the PR and between faf2714 and 8ddf535.

📒 Files selected for processing (4)
  • lib/quantization/src/encoded_vectors.rs
  • lib/quantization/src/turboquant/mod.rs
  • lib/quantization/src/turboquant/quantization.rs
  • lib/quantization/src/turboquant/rotation.rs

Comment on lines +48 to +58
    /// Heap memory owned by the rotation tables. The permutations are stored
    /// inline (no heap), so only the chunk metadata vectors are counted.
    pub(super) fn heap_size_bytes(&self) -> usize {
        let Self {
            permutations: _,
            dim: _,
            chunk_sizes,
            chunk_norms,
        } = self;
        chunk_sizes.capacity() * size_of::<usize>() + chunk_norms.capacity() * size_of::<f64>()
    }
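
To make the inline-versus-heap distinction in the doc comment concrete, here is a minimal sketch. `RotationSketch` is hypothetical, and modeling the permutations as a fixed-size array is an assumption chosen only to illustrate inline storage. It also shows why the method multiplies by `capacity()` rather than `len()`: the allocation is sized by capacity.

```rust
use std::mem::size_of;

/// Hypothetical struct: a fixed-size array lives inline, a `Vec` owns a heap allocation.
struct RotationSketch {
    permutations: [u32; 8],
    chunk_sizes: Vec<usize>,
}

impl RotationSketch {
    fn heap_size_bytes(&self) -> usize {
        // The array is part of the struct itself, so it adds nothing to the heap total.
        let Self { permutations: _, chunk_sizes } = self;
        chunk_sizes.capacity() * size_of::<usize>()
    }
}

fn main() {
    // Reserve room for 16 entries but only fill 3 of them.
    let mut chunk_sizes = Vec::with_capacity(16);
    chunk_sizes.extend([128usize, 64, 32]);
    let rotation = RotationSketch { permutations: [0; 8], chunk_sizes };

    println!("inline size of the struct: {} bytes", size_of::<RotationSketch>());
    println!("first permutation entry:   {}", rotation.permutations[0]);
    println!(
        "len = {}, capacity = {}, heap = {} bytes",
        rotation.chunk_sizes.len(),
        rotation.chunk_sizes.capacity(),
        rotation.heap_size_bytes()
    );
}
```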

Contributor


⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check for size_of import and verify compilation

# Search for size_of imports in this file
echo "=== Checking for size_of import in rotation.rs ==="
rg -n 'use.*size_of' lib/quantization/src/turboquant/rotation.rs

# Check for glob imports that might bring size_of into scope
echo ""
echo "=== Checking for glob imports ==="
rg -n 'use.*::.*\*' lib/quantization/src/turboquant/rotation.rs

# Verify the code compiles
echo ""
echo "=== Verifying compilation ==="
cargo check -p quantization --message-format=short 2>&1 | grep -i 'size_of\|rotation\.rs'

Repository: qdrant/qdrant

Length of output: 196


🏁 Script executed:

#!/bin/bash
# Check the imports and module context in rotation.rs
echo "=== Top 30 lines of rotation.rs (imports section) ==="
head -n 30 lib/quantization/src/turboquant/rotation.rs

echo ""
echo "=== Full imports in rotation.rs ==="
rg -n '^use ' lib/quantization/src/turboquant/rotation.rs | head -20

echo ""
echo "=== Check parent module (turboquant mod.rs or lib.rs) for exports ==="
find lib/quantization/src/turboquant -name 'mod.rs' -o -name 'lib.rs' | xargs ls -la 2>/dev/null

echo ""
echo "=== Check if size_of is re-exported in parent ==="
rg -n 'size_of' lib/quantization/src/turboquant/mod.rs 2>/dev/null || echo "(mod.rs not found or no size_of)"

echo ""
echo "=== Check Rust std prelude for size_of availability ==="
rg -n 'std::mem' lib/quantization/src/turboquant/rotation.rs || echo "(No explicit std::mem import)"

Repository: qdrant/qdrant

Length of output: 2101


🏁 Script executed:

#!/bin/bash
echo "=== First 50 lines of turboquant/mod.rs (imports section) ==="
head -n 50 lib/quantization/src/turboquant/mod.rs

echo ""
echo "=== Search for size_of import in mod.rs ==="
rg -n 'use.*size_of|use.*std::mem' lib/quantization/src/turboquant/mod.rs | head -10

echo ""
echo "=== Check if size_of is a re-export (pub use) in mod.rs ==="
rg -n 'pub.*size_of|pub.*use.*mem' lib/quantization/src/turboquant/mod.rs

Repository: qdrant/qdrant

Length of output: 1747


🏁 Script executed:

#!/bin/bash
echo "=== Context around line 491 in mod.rs where size_of is used ==="
sed -n '480,500p' lib/quantization/src/turboquant/mod.rs

echo ""
echo "=== All size_of usage in mod.rs with line numbers ==="
rg -n 'size_of' lib/quantization/src/turboquant/mod.rs

echo ""
echo "=== Look for imports within impl blocks or local scopes ==="
rg -B5 'size_of' lib/quantization/src/turboquant/mod.rs | head -30

Repository: qdrant/qdrant

Length of output: 1557


🏁 Script executed:

#!/bin/bash
echo "=== All imports in mod.rs ==="
rg -n '^use ' lib/quantization/src/turboquant/mod.rs

echo ""
echo "=== Check if std::mem is imported anywhere in mod.rs ==="
rg -n 'std::mem' lib/quantization/src/turboquant/mod.rs

echo ""
echo "=== Full cargo check output to see compilation errors ==="
cd lib/quantization && cargo check 2>&1 | grep -A5 'size_of\|rotation\.rs\|error\[' | head -40

Repository: qdrant/qdrant

Length of output: 1034


🏁 Script executed:

#!/bin/bash
echo "=== Running full cargo check on quantization package ==="
cd lib/quantization && cargo check 2>&1 | head -100

Repository: qdrant/qdrant

Length of output: 361


🏁 Script executed:

#!/bin/bash
echo "=== Read lines 1-160 of rotation.rs to see all imports and context ==="
head -n 160 lib/quantization/src/turboquant/rotation.rs | tail -n +140

Repository: qdrant/qdrant

Length of output: 627


🏁 Script executed:

#!/bin/bash
echo "=== Read lines 40-80 of rotation.rs showing heap_size_bytes method context ==="
sed -n '40,80p' lib/quantization/src/turboquant/rotation.rs

Repository: qdrant/qdrant

Length of output: 1368


🏁 Script executed:

#!/bin/bash
echo "=== Check if size_of is imported via prelude or macro in rotation.rs ==="
rg -n 'use ' lib/quantization/src/turboquant/rotation.rs

echo ""
echo "=== Check if size_of could be from a macro or attribute ==="
rg -n 'macro|#\[' lib/quantization/src/turboquant/rotation.rs | head -20

echo ""
echo "=== Verify the complete structure - check line numbers for imports vs impl ==="
wc -l lib/quantization/src/turboquant/rotation.rs

echo ""
echo "=== Read complete rotation.rs file to trace all imports ==="
cat lib/quantization/src/turboquant/rotation.rs | head -n 160

Repository: qdrant/qdrant

Length of output: 5587


🏁 Script executed:

#!/bin/bash
echo "=== Search for any use of std::mem in entire quantization crate ==="
rg 'use std::mem' lib/quantization/src/

echo ""
echo "=== Search for size_of usage throughout the quantization crate ==="
rg 'size_of' lib/quantization/src/ | grep -v test | grep -v '//'

echo ""
echo "=== Specifically check rotation.rs imports context around impl ==="
sed -n '1,60p' lib/quantization/src/turboquant/rotation.rs

Repository: qdrant/qdrant

Length of output: 5831


🏁 Script executed:

#!/bin/bash
echo "=== Check encoding.rs imports (which uses unqualified size_of) ==="
head -n 30 lib/quantization/src/turboquant/encoding.rs

echo ""
echo "=== Check quantization.rs imports ==="
head -n 30 lib/quantization/src/turboquant/quantization.rs

Repository: qdrant/qdrant

Length of output: 2498


🏁 Script executed:

#!/bin/bash
echo "=== Check if there's a common prelude macro in the crate ==="
rg -n 'macro_use|prelude' lib/quantization/src/lib.rs

echo ""
echo "=== Check the crate root ==="
head -n 50 lib/quantization/src/lib.rs

echo ""
echo "=== Verify if size_of without import works by checking compilation artifacts ==="
find . -name "*.rlib" -o -name "Cargo.lock" | head -5

Repository: qdrant/qdrant

Length of output: 1700


🏁 Script executed:

#!/bin/bash
echo "=== Search for 'use std::mem' in parent directories ==="
find lib/quantization -name '*.rs' -exec grep -l 'use std::mem' {} \;

echo ""
echo "=== Check if there's a global prelude in encoding context ==="
rg -n 'std::mem::size_of' lib/quantization/src/turboquant/

echo ""
echo "=== Most importantly: does rotation.rs line 58 match the review snippet exactly? ==="
sed -n '48,58p' lib/quantization/src/turboquant/rotation.rs

Repository: qdrant/qdrant

Length of output: 704


🏁 Script executed:

#!/bin/bash
echo "=== Check if rotation.rs has been recently modified ==="
git log --oneline -5 lib/quantization/src/turboquant/rotation.rs 2>/dev/null | head -5

echo ""
echo "=== Check the exact state: is size_of used elsewhere in working code? ==="
rg -B2 'size_of::<' lib/quantization/src/turboquant/encoding.rs | head -15

echo ""
echo "=== Final verification: search for implicit prelude or wildcard imports that might bring size_of ==="
rg 'prelude|^use \*' lib/quantization/src/

Repository: qdrant/qdrant

Length of output: 2638


🏁 Script executed:

#!/bin/bash
echo "=== Check the recent commit that fixed heap memory reporting ==="
git show 8ddf5351:lib/quantization/src/turboquant/rotation.rs | head -n 70

Repository: qdrant/qdrant

Length of output: 2459


🏁 Script executed:

#!/bin/bash
echo "=== Check the full diff of the heap memory fix commit ==="
git show 8ddf5351 --stat

echo ""
echo "=== Check if size_of is now imported via use std::mem::* somewhere ==="
git show 8ddf5351 lib/quantization/src/turboquant/rotation.rs | grep -E '^\+.*use|size_of' | head -20

Repository: qdrant/qdrant

Length of output: 50370


🏁 Script executed:

#!/bin/bash
echo "=== Search for how size_of is actually used in working code files ==="
rg 'size_of' lib/quantization/src/turboquant/rotation.rs

echo ""
echo "=== Check if there's a prelude macro or import I missed ==="
cat lib/quantization/src/turboquant/rotation.rs | grep -E 'use|prelude|macro' | head -20

Repository: qdrant/qdrant

Length of output: 499


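Optional: add an explicit std::mem::size_of import.

The method calls size_of::<T>() without importing it. Since Rust 1.80, size_of is in the standard prelude, so this compiles on current toolchains; adding use std::mem::size_of; at the top of the file is only needed if the crate must build on an older toolchain.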

use crate::turboquant::permutation::Permutation;
use crate::turboquant::simd;
use std::mem::size_of;
@generall generall merged commit 82176fd into dev May 20, 2026
19 checks passed
@generall generall deleted the fix/turboquant-heap-size-reporting branch May 20, 2026 07:48
@qdrant qdrant deleted a comment from coderabbitai Bot May 20, 2026
generall added a commit that referenced this pull request May 22, 2026
@coderabbitai coderabbitai Bot mentioned this pull request May 28, 2026