DMAN

dman is a simple, POSIX-compliant, policy-based directory pruner written in C99, designed for high-performance cleanup of volatile cache and transient storage structures.

dman functions by performing a targeted, stream-oriented traversal of the filesystem. Instead of indexing an entire directory structure into a memory-heavy tree or list, dman utilises the POSIX nftw interface to process files one by one as it encounters them. This callback-based architecture ensures that the memory footprint remains constant regardless of how many thousands of files are contained within your target directories.
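
The callback pattern described above can be sketched as follows. This is a minimal illustration, not dman's actual source: the names visit, summarise_tree, file_count and byte_total, the descriptor limit of 16, and the _XOPEN_SOURCE value are all assumptions made for the example. The point it shows is that only two counters survive between entries, however many files are visited.

```c
/* Standalone demo build (assumed): cc -std=c99 -DDMAN_SKETCH_MAIN sketch.c */
#define _XOPEN_SOURCE 700   /* assumption: exposes nftw() under -std=c99 */
#include <assert.h>
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>

/* The only state kept between entries: two counters (hypothetical names). */
static unsigned long file_count;
static unsigned long long byte_total;

/* nftw() calls this once per entry; nothing about the entry is stored. */
static int visit(const char *path, const struct stat *sb, int type,
                 struct FTW *ftwbuf)
{
    (void)path;
    (void)ftwbuf;
    if (type == FTW_F) {                 /* regular file */
        file_count++;
        byte_total += (unsigned long long)sb->st_size;
    }
    return 0;                            /* non-zero would abort the walk */
}

/* Hypothetical helper: walks root, returns 0 on success and -1 on error. */
static int summarise_tree(const char *root)
{
    assert(root != NULL);
    file_count = 0;
    byte_total = 0;
    /* 16 = maximum directory descriptors nftw() may hold open at once */
    return nftw(root, visit, 16, FTW_PHYS);
}

#ifdef DMAN_SKETCH_MAIN
int main(int argc, char **argv)
{
    const char *root = argc > 1 ? argv[1] : ".";
    if (summarise_tree(root) != 0) {
        perror("nftw");
        return 1;
    }
    printf("%lu files, %llu bytes\n", file_count, byte_total);
    return 0;
}
#endif
```

Because nftw takes no user-data pointer, the sketch keeps its counters in file-scope variables; a real tool would likely hold its policy settings the same way.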

FEATURES

  • [POLICY-DRIVEN TRAVERSAL]  —  Control over file removal based on maximum-age (in days) and minimum-size (bytes)

  • [HARD RECURSION GUARD]  —  A strict maximum depth limit is implemented to prevent runaway deletion. This acts as a fail-safe that prevents accidental deletion of broad root-level directory structures should the user point dman to a misconfigured path

  • [ZERO-DEPENDENCY CORE]  —  Compiles with standard C99. No GNU extensions or external libraries are used, only the standard library and the POSIX ftw and unistd headers

  • [DRY-RUN]  —  Provides a --dry-run mode that allows you to audit before making any permanent disk mutation

  • [MINIMAL MEMORY FOOTPRINT]  —  dman uses a stream-based traversal pattern instead of building recursive node arrays. This allows dman to maintain a low resident memory footprint even on large cache structures

  • [SYMLINK AVOIDANCE]  —  dman never follows symbolic links, because it passes the FTW_PHYS flag to nftw. This ensures that the utility does not accidentally trigger deletions outside of the intended target directory

  • [FILESYSTEM BOUNDARY]  —  dman stays within the same filesystem as the starting directory, because it passes the FTW_MOUNT flag to nftw. This prevents the utility from accidentally traversing into other disks, mounted network shares, or pseudo-filesystems such as /proc and /sys
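
How these features could fit together in a single nftw callback is sketched below. It is an illustration under stated assumptions, not dman's implementation: struct policy, policy_matches, prune_entry, prune_tree and g_matched are hypothetical names, age is assumed to be measured from st_mtime, a file is assumed to qualify only when it is both older than the age limit and at least the minimum size, and the depth guard is assumed to ignore deeper entries rather than abort the walk.

```c
/* Standalone demo build (assumed): cc -std=c99 -DDMAN_SKETCH_MAIN sketch.c */
#define _XOPEN_SOURCE 700   /* assumption: exposes nftw() under -std=c99 */
#include <assert.h>
#include <ftw.h>
#include <stdio.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

struct policy {              /* hypothetical settings holder */
    long  max_age_days;      /* candidates must be older than this */
    off_t min_size_bytes;    /* candidates must be at least this large */
    int   max_depth;         /* entries deeper than this are left alone */
    int   dry_run;           /* non-zero: report only, delete nothing */
};

/* nftw() callbacks receive no user pointer, so settings live at file scope. */
static struct policy g_policy;
static time_t g_now;
static unsigned long g_matched;

/* Pure predicate: old enough AND large enough (assumed combination rule). */
static int policy_matches(const struct stat *sb, time_t now,
                          const struct policy *p)
{
    double age_seconds = difftime(now, sb->st_mtime);
    return age_seconds > (double)p->max_age_days * 86400.0
        && sb->st_size >= p->min_size_bytes;
}

static int prune_entry(const char *path, const struct stat *sb, int type,
                       struct FTW *ftwbuf)
{
    if (ftwbuf->level > g_policy.max_depth)
        return 0;            /* depth guard: nftw still descends, we ignore */
    if (type != FTW_F)
        return 0;            /* with FTW_PHYS, symlinks arrive as FTW_SL */
    if (!policy_matches(sb, g_now, &g_policy))
        return 0;
    g_matched++;
    if (g_policy.dry_run)
        printf("would remove %s\n", path);
    else if (unlink(path) != 0)
        perror(path);
    return 0;                /* keep walking even if one unlink fails */
}

/* Walks root once; returns 0 on success and -1 if nftw() fails. */
static int prune_tree(const char *root, const struct policy *p)
{
    assert(root != NULL && p != NULL);
    g_policy = *p;
    g_now = time(NULL);
    g_matched = 0;
    /* FTW_PHYS: do not follow symlinks. FTW_MOUNT: stay on one filesystem. */
    return nftw(root, prune_entry, 16, FTW_PHYS | FTW_MOUNT);
}

#ifdef DMAN_SKETCH_MAIN
int main(int argc, char **argv)
{
    /* Illustrative values only: 30 days, 1 MiB, depth 4, dry run. */
    struct policy p = { 30, 1024 * 1024, 4, 1 };
    if (prune_tree(argc > 1 ? argv[1] : ".", &p) != 0) {
        perror("nftw");
        return 1;
    }
    printf("%lu file(s) matched\n", g_matched);
    return 0;
}
#endif
```

Keeping the decision in a pure predicate separates the policy from the side effect, so the same check serves both the dry run and the real deletion. Standard POSIX nftw has no portable way to skip a subtree, which is why this sketch ignores entries past the limit instead of pruning the descent.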

BUILD INSTRUCTIONS

Obtain a local copy of this repository with git clone and enter it.

% git clone https://github.com/vs-123/dman.git
% cd dman

CMAKE METHOD

Create a build directory and build the project with cmake inside it.

% mkdir build && cd build
% cmake ..
% cmake --build .
% ls -F
dman*  compile_commands.json  Makefile  CMakeCache.txt

MANUAL METHOD

Create a build directory to keep things tidy, and then use your favourite compiler to compile the source code.

% mkdir build && cd build
% cc -o dman ../src/main.c
% ls -F
dman*

You may now use the generated dman binary:

% ./dman --help

BENCHMARK

The following benchmark compares the memory efficiency of dman against the standard system find utility using two dummy cache directories. This was run on FreeBSD.

The script merely executes both programs with /usr/bin/time -l to display metrics.

% sh ../scripts/gen-test.sh
[INFO] DIRECTORY dummy-cache-A GENERATED AND POPULATED
[INFO] DIRECTORY dummy-cache-B GENERATED AND POPULATED
[DONE] EXITING...

% ls -l .test.log
drwxr-xr-x 253 vs  vs   7.9K  9 Mar 04:54 dummy-cache-A/
drwxr-xr-x 253 vs  vs   7.9K  9 Mar 04:54 dummy-cache-B/

% sh ../scripts/run-bench.sh
[INFO] BENCHMARKING DMAN...
[INFO] BENCHMARKING FIND...
[RESULTS]
===  DMAN  ===
        0.01 real         0.00 user         0.01 sys
             1474560  maximum resident set size
           153256463  instructions retired
            41028226  cycles elapsed
             1130808  peak memory footprint
===  FIND  ===
        0.01 real         0.00 user         0.01 sys
             1507328  maximum resident set size
           169091683  instructions retired
            43661846  cycles elapsed
             1179864  peak memory footprint

OBSERVATIONS

  • RESIDENT SET SIZE  —  As the output shows, dman reports a slightly lower maximum resident set size (1474560 vs. 1507328, about 2% lower) and peak memory footprint (1130808 vs. 1179864, about 4% lower) than find. Since dman performs all evaluation (size, age, depth) within the nftw callback, it avoids secondary stat calls and additional memory allocation per file, which keeps the per-file work small.

  • INSTRUCTION EFFICIENCY  —  dman completes the scan by retiring fewer instructions (~153 million vs. ~169 million, roughly 9% fewer). This suggests the implementation does slightly less work for the specific task of directory pruning based on age and size.

  • REAL-TIME THROUGHPUT  —  Wall-clock time is identical at the resolution shown (0.01 real for both). The slightly lower cycles elapsed (~41.0 million vs. ~43.7 million, roughly 6% fewer) indicates that dman uses marginally less CPU for this specific workload.

REPRODUCIBILITY

This repository provides two scripts in the scripts/ directory, gen-test.sh and run-bench.sh, which allow you to verify the benchmark results independently.

  • gen-test.sh  —  Generates two identical directory structures dummy-cache-A and dummy-cache-B which contain a mix of 1MB and 5MB files produced with dd and /dev/urandom. This ensures the test environment is consistent across multiple runs and accurately simulates typical cache behaviour.

  • run-bench.sh  —  Utilises /usr/bin/time -l to capture detailed resource statistics during the execution of both dman and find. These metrics are piped into individual text files to provide the comparative data presented above.

DISCLAIMER

dman was originally developed as a specialised, single-purpose tool to address my personal requirements for lightweight, policy-based filesystem maintenance. It is not designed to replace mature, general-purpose utilities like find and rm, or professional-grade storage management systems.

Although the provided benchmarks demonstrate efficiency in specific scenarios, it is important to note that this project makes no claims of superiority or universal applicability.

It is recommended to use it for its intended purpose, with the understanding that it is a specialised utility and not a general-purpose replacement.

LICENSE

This project is licensed under the GNU Affero General Public License version 3.0 or later.

NO WARRANTY PROVIDED

For more information, see the LICENSE file or visit https://www.gnu.org/licenses/agpl-3.0.en.html.
