feat(cli): Add an rg-equivalent CLI interface for fff#561

Open
markovejnovic wants to merge 1 commit into
dmtrKovalenko:mainfrom
markovejnovic:feat/cli

@markovejnovic markovejnovic commented Jun 2, 2026

This PR adds a CLI interface to fff. After talking with @dmtrKovalenko, we decided that the best architecture is to have an fff daemon run in the background to keep the index hot. The command that the user interacts with is fff-rg, which talks to this daemon.

Before we proceed, it's worth looking at the architecture diagram here:

flowchart LR
  subgraph fff-rg
    rg-searcher
    fffd-searcher
  end

  subgraph fff-daemon
    query-service
    subgraph session-pool
      session-1
      session-2
    end

    query-service --> session-1
    query-service --> session-2
  end

  rg-bin[[rg]]

  fffd-searcher <-->|unix sock|query-service
  rg-searcher <-->|subproc| rg-bin

Let's walk over each component:

fff-rg

fff-rg can search either by talking to the fff-daemon or by shelling out to rg. It picks the fff-daemon if it believes it is working within a git repo. Otherwise, it falls back to rg, and if the user doesn't have rg installed, it aborts.

The core reasoning is that keeping an fff-daemon alive for directories that don't need to be indexed makes no sense: you would hold a very large amount of RAM for what is effectively a one-off search.
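A minimal sketch of that selection rule, using only the standard library. The names `Backend`, `find_git_root`, and `choose_backend` are hypothetical and not taken from this PR, and the `.git` ancestor walk is a simplification of whatever repository detection the real client performs:

```rust
use std::path::{Path, PathBuf};

/// Which search backend the client would use (hypothetical names).
#[derive(Debug, PartialEq, Eq)]
pub enum Backend {
    /// Talk to the daemon; carries the repository root that was found.
    Daemon(PathBuf),
    /// Shell out to the `rg` binary.
    Ripgrep,
}

/// Walk from `start` up through its ancestors and return the first
/// directory that contains a `.git` entry (a directory, or a file in the
/// case of worktrees and submodules).
pub fn find_git_root(start: &Path) -> Option<PathBuf> {
    start
        .ancestors()
        .find(|dir| dir.join(".git").exists())
        .map(Path::to_path_buf)
}

/// Prefer the daemon inside a git repo, fall back to `rg` elsewhere, and
/// report an error when neither option is available.
pub fn choose_backend(dir: &Path, rg_available: bool) -> Result<Backend, String> {
    match find_git_root(dir) {
        Some(root) => Ok(Backend::Daemon(root)),
        None if rg_available => Ok(Backend::Ripgrep),
        None => Err(format!(
            "{} is not inside a git repository and rg is not installed",
            dir.display()
        )),
    }
}
```

Passing `rg_available` in as a plain flag keeps the decision itself free of process spawning, so it can be tested without `rg` on the machine.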

fff-daemon

The fff-daemon holds within it a query-service. This query-service is responsible for accepting connections from new clients and then passing the request down to the session-pool.

The session-pool is a pool of "active" sessions, i.e. active git repositories for which we have an fff index in memory. A couple of notable facts:

  • The session pool automatically reaps old sessions. If a search hasn't been performed for a while on a directory, chances are the user no longer cares for it, so we should evict it.
  • The session pool has an LRU policy: if we run out of available pool slots, we evict the least recently used session to make room for a new one.
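The two policies can be sketched roughly as follows. This is an illustration under assumptions, not the pool from this PR: `Session`, `SessionPool`, `touch`, and `reap_idle` are hypothetical names, the session holds no real index, and the caller supplies the current time so the behavior is deterministic.

```rust
use std::collections::HashMap;
use std::path::{Path, PathBuf};
use std::time::{Duration, Instant};

/// Stand-in for the in-memory index of one repository (hypothetical).
pub struct Session {
    last_used: Instant,
}

/// A fixed-capacity pool keyed by repository root.
pub struct SessionPool {
    capacity: usize,
    idle_timeout: Duration,
    sessions: HashMap<PathBuf, Session>,
}

impl SessionPool {
    pub fn new(capacity: usize, idle_timeout: Duration) -> Self {
        assert!(capacity > 0, "pool needs at least one slot");
        SessionPool { capacity, idle_timeout, sessions: HashMap::new() }
    }

    /// Mark `root` as used at `now`, creating a session if needed. When the
    /// pool is full, the least recently used session is evicted first and
    /// its root is returned.
    pub fn touch(&mut self, root: &Path, now: Instant) -> Option<PathBuf> {
        if let Some(session) = self.sessions.get_mut(root) {
            session.last_used = now;
            return None;
        }
        let evicted = if self.sessions.len() >= self.capacity {
            let oldest = self
                .sessions
                .iter()
                .min_by_key(|(_, s)| s.last_used)
                .map(|(path, _)| path.clone());
            if let Some(path) = &oldest {
                self.sessions.remove(path);
            }
            oldest
        } else {
            None
        };
        self.sessions.insert(root.to_path_buf(), Session { last_used: now });
        evicted
    }

    /// Drop every session that has been idle for longer than the timeout
    /// and return how many were removed.
    pub fn reap_idle(&mut self, now: Instant) -> usize {
        let before = self.sessions.len();
        let timeout = self.idle_timeout;
        self.sessions
            .retain(|_, s| now.saturating_duration_since(s.last_used) <= timeout);
        before - self.sessions.len()
    }

    pub fn contains(&self, root: &Path) -> bool {
        self.sessions.contains_key(root)
    }
}
```

The linear scan in `touch` is fine for a handful of slots. A pool with many slots would likely want an ordered structure instead.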

IPC protocol

I spent a good bit of time thinking through how best to handle the IPC protocol, and the epiphany I had is that we don't actually need to shuttle results between the daemon and the client. There may be hundreds, if not thousands, of strings we'd need to copy, so if we can avoid it, that would be awesome.

On UNIX, you can pass file descriptors by using SCM_RIGHTS between processes. This enables us to take the stdout fd of the client, pass it to the daemon, and have the daemon directly write to the client's stdout. Neat!

The downside of this approach is that SCM_RIGHTS requires a unix domain socket, which means that passing the fd needs to happen over that. I evaluated:

  • Adding a sidecar iceoryx2 for all the other message passing, but ultimately I couldn't justify the maintenance burden. There's a ton of added complexity in figuring out how to synchronize the req-rep flow of the mmapped iceoryx2 channels with the req-rep of the fd transfer, so it made no sense.
  • Just having the req-rep flow between the client and the daemon live on the UDS (unix domain socket), which is ultimately what I landed on.

This lives in the crates/cli/fff-ipc-domain crate.
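For illustration, a request-reply exchange over a Unix domain socket can be as small as the sketch below. Everything in it is an assumption rather than the wire format of this PR: the length-prefixed framing, the status codes, and the function names are hypothetical. It also leaves out the SCM_RIGHTS fd transfer, which needs `sendmsg` with ancillary data and is not available in the stable standard library.

```rust
use std::io::{self, Read, Write};
use std::os::unix::net::UnixStream;

/// Hypothetical one-byte reply codes.
pub const STATUS_OK: u8 = 0;
pub const STATUS_ERROR: u8 = 1;

/// Client side: send one length-prefixed request, then block until the
/// daemon answers with a single status byte.
pub fn send_request(stream: &mut UnixStream, payload: &[u8]) -> io::Result<u8> {
    if payload.len() > u32::MAX as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "request too large"));
    }
    let len = payload.len() as u32;
    stream.write_all(&len.to_le_bytes())?;
    stream.write_all(payload)?;
    let mut status = [0u8; 1];
    stream.read_exact(&mut status)?;
    Ok(status[0])
}

/// Daemon side: read one length-prefixed request.
pub fn read_request(stream: &mut UnixStream) -> io::Result<Vec<u8>> {
    let mut len_buf = [0u8; 4];
    stream.read_exact(&mut len_buf)?;
    let mut payload = vec![0u8; u32::from_le_bytes(len_buf) as usize];
    stream.read_exact(&mut payload)?;
    Ok(payload)
}

/// Daemon side: acknowledge the request with a status byte.
pub fn send_status(stream: &mut UnixStream, status: u8) -> io::Result<()> {
    stream.write_all(&[status])
}
```

Because the results go straight to the client's stdout through the passed fd, the reply on the socket only has to say whether the search succeeded, which is why a single byte is enough in this sketch.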

Holes in the implementation

  • Tests are quite lackluster. I added some testing, but it's quite weak.
  • There are a couple open questions I've left as comments on this PR.
  • The LRU logic could be separated out to a separate container which is responsible for automatic eviction.

Comment on lines +29 to +30
/// Case sensitivity strategy for grep searches. Mirrors `fff::CaseMode` but
/// with rkyv derives — fff-core doesn't depend on rkyv.

Author

this is up for debate whether we want to keep it this way or not.

I wanted to avoid adding another dependency to fff that is only used in the ipc code-path, but with optimizers being what they are, maybe that's not so bad.

either way, i chose this path of having two CaseModes (one in fff and one in fff-ipc-domain), but open to discussion

Owner

we can have optional dependencies - this is absolutely fine

Comment thread crates/cli/fff-ipc-domain/src/lib.rs Outdated
#[rkyv(derive(Debug))]
pub struct SearchRequest {
    /// Root directory to search in (must be an absolute path).
    pub directory: String,

Author

need to stop passing paths as strings. a much better bet is to pass PathBufs as deep as possible until the fff barrier.

@markovejnovic markovejnovic marked this pull request as ready for review June 10, 2026 03:21
@markovejnovic markovejnovic force-pushed the feat/cli branch 2 times, most recently from 3a207ad to 9969bba June 10, 2026 18:57
Bootstrap the CLI layer for FFF:

- fff-ipc-domain: wire types and IPC protocol (Unix socket, bincode)
- fff-daemon: background search daemon with session pooling,
  rg-compatible output formatting, and ANSI color matching
- fff-rg: ripgrep-compatible CLI frontend with daemon/fallback
  searcher backends

Includes 120 e2e tests:
- 95 comparison tests (fff-rg vs rg side-by-side) across inline,
  heading, vimgrep, context, color, quiet, count, regex, unicode,
  and edge-case modes using test-case crate for parametrization
- 25 synthetic repo scale tests (50/200/500 files) verifying match
  counts, line numbers, output formats, concurrency, and per-needle
  findability without rg comparison

@dmtrKovalenko dmtrKovalenko left a comment

Owner

some nits on the code; I need to go through the way it actually works one more time

shared_picker.wait_for_scan(Duration::from_secs(120));
let file_count = {
    let guard = shared_picker.read().expect("read lock");
    guard.as_ref().expect("picker present").get_files().len()

Owner

this would require a major version bump

//! status byte. Spawns the daemon on first use if it isn't already running.

use std::io::{Read, Write};
use std::os::unix::io::AsRawFd;

Owner

ideally we should get all the unix-specific code into a separate file because I would love this to work on windows at some point

    fn into_core(self) -> Self::Core;
}

impl IntoCoreExt for fff_ipc_domain::CaseMode {

Owner

use From trait
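A sketch of what that suggestion could look like, assuming the two enums have matching variants. The modules `ipc` and `core_types` below are stand-ins for `fff_ipc_domain` and `fff`, and the variant names are hypothetical since neither definition appears in this thread:

```rust
/// Stand-in for the wire type in `fff-ipc-domain`; variants are assumed.
pub mod ipc {
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum CaseMode {
        Sensitive,
        Insensitive,
        Smart,
    }
}

/// Stand-in for the core type in `fff`; variants are assumed.
pub mod core_types {
    #[derive(Debug, Clone, Copy, PartialEq, Eq)]
    pub enum CaseMode {
        Sensitive,
        Insensitive,
        Smart,
    }
}

/// Implementing `From` also provides `Into` for free, so call sites can
/// write `mode.into()` instead of calling a custom extension trait.
impl From<ipc::CaseMode> for core_types::CaseMode {
    fn from(mode: ipc::CaseMode) -> Self {
        match mode {
            ipc::CaseMode::Sensitive => core_types::CaseMode::Sensitive,
            ipc::CaseMode::Insensitive => core_types::CaseMode::Insensitive,
            ipc::CaseMode::Smart => core_types::CaseMode::Smart,
        }
    }
}
```

One caveat: Rust's orphan rule requires this `impl` to live in the crate that defines one of the two types. A third crate that merely depends on both cannot provide it, which is the usual reason an extension trait shows up instead.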

fn main() {
    let args = Args::parse();

    tracing_subscriber::fmt()

Owner

use our log crate

/// Default max file size for grep when the client doesn't specify one (4 MiB).
const DEFAULT_MAX_FILE_SIZE: u64 = 4 * 1024 * 1024;

use fff::{

Owner

do not intermix imports and constants

)]
/// Mirrors a subset of `rg` flags so `fff-rg` is a drop-in replacement.
#[allow(clippy::struct_excessive_bools)]
pub struct Args {

Owner

i think we should restructure the crates to be something like this

cli/
  daemon
  ffd
  frg
  ipc

this will make it much easier to keep backward compatibility with those tools

use crate::types::cli::Args;

#[global_allocator]
static GLOBAL: mimalloc::MiMalloc = mimalloc::MiMalloc;

Owner

i don't think we need mimalloc here - can drop a dependency

@@ -0,0 +1,147 @@
use crate::util::Dir;

const EXTENSIONS: &[&str] = &["rs", "ts", "json", "md", "txt", "toml", "yaml"];

Owner

for tests I would prefer tests in the big repo using proptest on various flags + queries

bytesize = "2"
mimalloc = { workspace = true }
git2 = { workspace = true }
which = "8.0.3"

Owner

i'm not sure we need it - if it's on the PATH, Command will resolve it
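A small sketch of that point: `std::process::Command` looks up a bare program name through `PATH`, and a missing binary surfaces as `ErrorKind::NotFound` when spawning. The helper name `is_spawnable` is hypothetical, and probing with `--version` is an assumption about how one might check for `rg` ahead of time:

```rust
use std::io::{self, ErrorKind};
use std::process::{Command, Stdio};

/// Returns `Ok(true)` if `program` could be started, `Ok(false)` if the OS
/// could not find it, and any other spawn error unchanged.
pub fn is_spawnable(program: &str) -> io::Result<bool> {
    let result = Command::new(program)
        .arg("--version")
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status();
    match result {
        // The process started; its exit code does not matter here.
        Ok(_) => Ok(true),
        // No such program on PATH (or at the given path).
        Err(err) if err.kind() == ErrorKind::NotFound => Ok(false),
        Err(err) => Err(err),
    }
}
```

In practice the separate probe may not even be needed: spawning the real `rg` invocation and handling `NotFound` at that point gives the same information with one fewer process.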

Comment thread crates/cli/rustfmt.toml
@@ -0,0 +1,6 @@
edition = "2024"

Owner

pls do not override
