
Conversation

Contributor

@nopcoder nopcoder commented Dec 13, 2025

Summary

This PR fixes and improves the pyramid package, which handles the local caching layer for range and meta-range files.

Key Changes

Resource Management:

  • Add Close() method to FS interface and TierFS implementation to properly clean up resources
  • Add Close() method to Eviction interface and ristretto eviction implementation
  • Properly close rangeFS and metaRangeFS in catalog initialization on error and during shutdown
  • Fix file handle leak in store() by ensuring files are closed after adapter.Put()

Goroutine Leak Fix:

  • Replace unbounded goroutine spawning for directory deletion with a bounded worker pattern (a sketch follows this list)
  • Add dirDeleteWorker background goroutine with proper shutdown signaling via done channel and sync.WaitGroup
  • Add buffered channel (dirDeleteChSize=1000) to queue directory deletion requests
  • Add metric errorsTotal{type="DirDeleteQueueFull"} to track when directory deletions are skipped due to full queue
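
A minimal sketch of the bounded-worker shape described above; the type and function names are illustrative, not the actual TierFS code (the real worker calls deleteDirRecIfEmpty, and the DirDeleteQueueFull counter is incremented where enqueue returns false):

package pyramid

import (
    "os"
    "sync"
)

type dirCleaner struct {
    dirDeleteCh chan string   // buffered queue of directories to check
    done        chan struct{} // closed on shutdown
    wg          sync.WaitGroup
}

func newDirCleaner(queueSize int) *dirCleaner {
    c := &dirCleaner{
        dirDeleteCh: make(chan string, queueSize),
        done:        make(chan struct{}),
    }
    c.wg.Add(1)
    go c.worker()
    return c
}

func (c *dirCleaner) worker() {
    defer c.wg.Done()
    for {
        select {
        case <-c.done:
            return
        case dir := <-c.dirDeleteCh:
            // Stand-in for deleteDirRecIfEmpty: os.Remove succeeds only when the
            // directory is already empty; other outcomes are ignored (best effort).
            _ = os.Remove(dir)
        }
    }
}

// enqueue never blocks; when the queue is full the deletion is skipped.
func (c *dirCleaner) enqueue(dir string) bool {
    select {
    case c.dirDeleteCh <- dir:
        return true
    default:
        return false
    }
}

func (c *dirCleaner) Close() {
    close(c.done)
    c.wg.Wait()
}

Close() signals the done channel and waits for the worker, which is what lets TierFS shut down without leaking the goroutine.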

Context Cancellation Fix:

  • Use context.Background() in openWithLock when fetching from block storage inside Compute()
  • The ChanOnlyOne.Compute pattern shares results across multiple callers; using a caller's context could cause one caller's cancellation to fail the fetch for all waiting callers
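
The same idea expressed with golang.org/x/sync/singleflight, which is analogous to (but not the same helper as) ChanOnlyOne; fetchFromStore is a hypothetical stand-in for the block-adapter read:

package pyramid

import (
    "context"

    "golang.org/x/sync/singleflight"
)

var fetchGroup singleflight.Group

// fetchShared shows why the shared download runs with context.Background():
// the result is shared by every caller waiting on the same key, so one caller's
// cancellation (e.g. a timed-out merge) must not abort the fetch for the rest.
func fetchShared(key string, fetchFromStore func(context.Context, string) ([]byte, error)) ([]byte, error) {
    v, err, _ := fetchGroup.Do(key, func() (any, error) {
        // Deliberately not the caller's context.
        return fetchFromStore(context.Background(), key)
    })
    if err != nil {
        return nil, err
    }
    return v.([]byte), nil
}

The trade-off is that the download may outlive the callers that requested it; the result still lands in the local cache for future readers.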

Bug Fixes:

  • Fix missing .Inc() calls on error metric counters (errorsTotal.WithLabelValues(...).Inc())
  • Fix incorrect path handling in storeLocalFile(): it was using the relative path instead of the full path for file removal (see the sketch after this list)
  • Use filepath.Join instead of path.Join for OS-specific path handling throughout the codebase
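
For illustration, the metric and path fixes combined look roughly like this; fsName, baseDir, and relPath are placeholders rather than the actual TierFS fields, and the metric/label names are assumptions for the sketch:

package pyramid

import (
    "os"
    "path/filepath"

    "github.com/prometheus/client_golang/prometheus"
)

// errorsTotal mirrors the counter quoted in the review snippets below.
var errorsTotal = prometheus.NewCounterVec(
    prometheus.CounterOpts{Name: "pyramid_errors_total", Help: "pyramid errors by type"},
    []string{"fs_name", "type"},
)

// removeLocalFile illustrates both fixes: build the full, OS-specific path with
// filepath.Join (not path.Join), and actually call .Inc() on the counter.
func removeLocalFile(fsName, baseDir, relPath string) {
    p := filepath.Join(baseDir, relPath)
    if err := os.Remove(p); err != nil {
        errorsTotal.WithLabelValues(fsName, "FileRemoval").Inc()
    }
}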

Code Quality:

  • Simplify numCounters calculation in eviction using max() builtin
  • Replace interface{} with any type alias
  • Remove unnecessary else after return statements
  • Add documentation comments to localFileRef struct fields
  • Update mock implementations to include new Close() method

Fix https://github.com/treeverse/lakeFS-Enterprise/issues/1426
Fix #9824

@nopcoder nopcoder self-assigned this Dec 13, 2025
@github-actions github-actions bot added area/cataloger Improvements or additions to the cataloger area/testing Improvements or additions to tests labels Dec 13, 2025
Contributor

@arielshaqed arielshaqed left a comment

reviewing before it's time - but these comments might be relevant no matter what we do.

if err := os.Remove(p); err != nil {
tfs.logger.WithError(err).WithField("path", p).Error("Removing file failed")
errorsTotal.WithLabelValues(tfs.fsName, "FileRemoval")
errorsTotal.WithLabelValues(tfs.fsName, "FileRemoval").Inc()
Contributor

😱

Comment on lines +47 to +48
// dirDeleteChSize is the buffer size for the directory deletion channel
dirDeleteChSize = 1000
Contributor

I am of two minds about this: deletion can take a long time, so I'd understand a really short queue. OTOH, the longer we delay deletion, the more we risk going over the allowed disk space. 1000 directories to delete is a lot.

At the minimum, perhaps monitor the length of the queue?
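
One possible way to monitor it (a sketch, not something this PR implements) is a GaugeFunc over the channel length:

package pyramid

import "github.com/prometheus/client_golang/prometheus"

// registerQueueGauge is a hypothetical helper that exposes the current length of
// the directory-deletion queue, so a filling queue is visible before deletions
// start being skipped.
func registerQueueGauge(reg prometheus.Registerer, dirDeleteCh chan string) {
    reg.MustRegister(prometheus.NewGaugeFunc(prometheus.GaugeOpts{
        Name: "pyramid_dir_delete_queue_length", // illustrative name
        Help: "Directories currently queued for empty-directory cleanup",
    }, func() float64 {
        return float64(len(dirDeleteCh))
    }))
}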

Contributor Author

@nopcoder nopcoder Dec 13, 2025

Currently we do not delete asynchronously; I learned that when I looked at what we actually do inside the goroutine.

The new implementation does a best-effort cleanup: a background worker checks for and removes empty folders.

Now I just wonder whether a periodic check that scans the local directory structure and cleans up empty directories as needed would be a better option (a rough sketch follows below).

PS: we log when we try to push work to the queue and the queue is full; I'll add metrics too for easy monitoring.
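
A rough sketch of that periodic option, assuming a cache root directory and a fixed interval (names are illustrative, not anything in this PR):

package pyramid

import (
    "os"
    "path/filepath"
    "time"
)

// periodicEmptyDirCleanup walks the cache root on a timer and removes directories
// that happen to be empty. Directories that only become empty after a child is
// removed get picked up on a later pass; everything here is best effort.
func periodicEmptyDirCleanup(root string, interval time.Duration, done <-chan struct{}) {
    ticker := time.NewTicker(interval)
    defer ticker.Stop()
    for {
        select {
        case <-done:
            return
        case <-ticker.C:
            _ = filepath.Walk(root, func(p string, info os.FileInfo, err error) error {
                if err != nil || !info.IsDir() || p == root {
                    return nil // best effort: skip files, the root, and unreadable entries
                }
                // os.Remove succeeds only when the directory is already empty.
                _ = os.Remove(p)
                return nil
            })
        }
    }
}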

tfs.logger.WithError(err).Error("Failed deleting empty dir")
errorsTotal.WithLabelValues(tfs.fsName, "DirRemoval")
// Queue directory deletion (non-blocking)
dirPath := filepath.Dir(string(rPath))
Contributor

I don't understand why we would ever want to delete an entire directory tree at a time. It is not faster to delete multiple files than it is to delete a single file. AFAICT the worker could delete each file separately; it would be no slower.

Contributor Author

I didn't modify the functionality. The link is to the current code that runs in the goroutine; it calls deleteDirRecIfEmpty, which only ensures that the directory is removed if it is empty.

go func() {
    if err := tfs.syncDir.deleteDirRecIfEmpty(path.Dir(string(rPath))); err != nil {
        tfs.logger.WithError(err).Error("Failed deleting empty dir")
        errorsTotal.WithLabelValues(tfs.fsName, "DirRemoval")
    }
}()

@nopcoder nopcoder requested a review from arielshaqed December 13, 2025 13:31
@nopcoder nopcoder changed the title from "Fix and improve the pyramid package" to "pyramid: fix file handle and goroutine leaks, add proper cleanup" Dec 13, 2025
@nopcoder nopcoder added the include-changelog PR description should be included in next release changelog label Dec 14, 2025
…ing (#9789)

* improve pyramid delete file tracking

* apply code review changes
@nopcoder nopcoder requested a review from a team December 16, 2025 06:36
@nopcoder
Contributor Author

Merged the approved smaller PR into this one: d3558b8

@AliRamberg
Contributor

How does this PR solve the mergeIntoBranch errors? This seems to solve other issues we have in pyramid.

@nopcoder
Contributor Author

How does this PR solve the mergeIntoBranch errors? This seems to solve other issues we have in pyramid.

The final PR solves multiple issues. The error in mergeIntoBranch occurs when we request metadata during a merge: in pyramid, when we fetch data from S3 during openWithLock, we make sure that only one caller (the first) performs the download. When that caller's request is cancelled (in our case, timed out), the download stops and a cancellation error is returned to all waiting callers.
This can occur a lot when there are multiple merges into the same branch at the same time.
In order not to fail the "single-flight" request, this PR updates the fetch to use the background context.

Contributor

@AliRamberg AliRamberg left a comment

I'm pleased, and it does look very good to me :content:

defer tfs.wg.Done()
for {
    select {
    case <-tfs.done:
Contributor

Is this intentional? Doesn't it mean we could Close TierFS while dirDeleteCh is still not empty?

Contributor Author

Yes. When we shut down (close the tier FS) we stop managing the cache, and when we start we read/track everything back.

We keep the metrics to monitor issues related to how we clean up empty directories.

@nopcoder nopcoder merged commit 5c93f72 into master Dec 16, 2025
67 of 70 checks passed
@nopcoder nopcoder deleted the fix/pyramid branch December 16, 2025 20:23