ygrep

A fast, local, indexed code search tool optimized for AI coding assistants. Written in Rust using Tantivy for full-text indexing.

Features

  • Literal text matching - Queries match as literal text by default (like grep -F), so special characters need no regex escaping ($variable, {% block, ->get(, @decorator)
  • Regex support - Use -r flag for regex patterns (fn\s+main, TODO|FIXME)
  • Code-aware tokenizer - Preserves $, @, # as part of tokens (essential for PHP, Shell, Python, etc.)
  • Fast indexed search - Tantivy-powered BM25 ranking; queries read the index instead of rescanning files
  • File watching - Incremental index updates on file changes
  • Optional semantic search - HNSW vector index with a local embedding model (all-MiniLM-L6-v2)
  • Symlink handling - Follows symlinks with cycle detection
  • AI-optimized output - Clean, minimal output with file paths and line numbers

Installation

Homebrew (macOS/Linux)

brew install yetidevworks/ygrep/ygrep

From Source

# Using cargo
cargo install --path crates/ygrep-cli

# Or build release
cargo build --release
cp target/release/ygrep ~/.cargo/bin/

Quick Start

1. Install for your AI tool

ygrep install claude-code    # Claude Code
ygrep install opencode       # OpenCode
ygrep install codex          # Codex
ygrep install droid          # Factory Droid

2. Index your project

ygrep index                    # Fast text-only index
ygrep index --semantic         # With semantic search (better natural language queries)

3. Search

ygrep "search query"         # Shorthand
ygrep search "search query"  # Explicit

That's it! The AI tool will now use ygrep for code searches.

Usage

Searching

# Basic search (literal text matching by default)
ygrep '$variable'                  # PHP/Shell variables (single quotes stop the shell from expanding $variable)
ygrep "{% block content"           # Twig templates
ygrep "->get("                     # Method calls
ygrep "@decorator"                 # Python decorators

# Regex search (use -r or --regex)
ygrep search "fn\s+\w+" -r         # Function definitions
ygrep search "TODO|FIXME" -r       # Multiple patterns
ygrep search "^import" -r          # Line anchors

# With options
ygrep search "error" -n 20         # Limit results
ygrep search "config" -e rs -e toml # Filter by extension
ygrep search "api" -p src/         # Filter by path

# Output formats (AI format is default)
ygrep search "query"               # AI-optimized (default)
ygrep search "query" --json        # JSON output
ygrep search "query" --pretty      # Human-readable
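When driving ygrep from a script, passing the arguments as a list avoids shell quoting problems with queries such as $variable. The following is a minimal sketch that only assembles the flags documented above (-r, -n, -e, -p, --json); the function name and parameters are hypothetical, not part of ygrep.

```python
from typing import List, Optional, Sequence


def build_search_command(
    query: str,
    regex: bool = False,
    limit: Optional[int] = None,
    extensions: Sequence[str] = (),
    path: Optional[str] = None,
    json_output: bool = False,
) -> List[str]:
    """Assemble an argv list for `ygrep search` (hypothetical helper)."""
    cmd = ["ygrep", "search", query]
    if regex:
        cmd.append("-r")            # regex mode instead of literal matching
    if limit is not None:
        cmd += ["-n", str(limit)]   # limit the number of results
    for ext in extensions:
        cmd += ["-e", ext]          # one -e per extension filter
    if path is not None:
        cmd += ["-p", path]         # restrict results to a path
    if json_output:
        cmd.append("--json")        # JSON output instead of the AI format
    return cmd


command = build_search_command(
    "$variable", extensions=["php"], limit=20, json_output=True
)
print(command)
```

With ygrep on the PATH, the list can be handed to subprocess.run(command, capture_output=True, text=True); no shell is involved, so the query reaches ygrep unchanged.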

Indexing

ygrep index                        # Index current directory (honors stored mode)
ygrep index --rebuild              # Force rebuild (required after ygrep updates)
ygrep index --semantic             # Build semantic index (sticky - remembered)
ygrep index --text                 # Build text-only index (sticky - remembered)
ygrep index /path/to/project       # Index specific directory

The --semantic and --text flags are sticky: once one is set, later ygrep index commands without a mode flag reuse that mode. The same applies to ygrep watch.
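The sticky behavior can be modeled as a small rule: an explicit flag wins and is remembered, otherwise the remembered mode is reused. This is an illustrative sketch, not ygrep's implementation; the function name is hypothetical, and the text-only fallback is an assumption based on the Quick Start describing a plain ygrep index as text-only.

```python
from typing import List, Optional


def resolve_index_mode(flag: Optional[str], stored: Optional[str]) -> str:
    """Pick the index mode from an explicit flag or the remembered mode.

    `flag` is "semantic", "text", or None when no mode flag was passed.
    `stored` is the mode remembered from a previous run, or None.
    """
    if flag is not None:
        return flag      # explicit flag wins and becomes the new stored mode
    if stored is not None:
        return stored    # no flag: reuse the remembered mode
    return "text"        # assumption: first run without flags is text-only


# Simulate: index, index --semantic, index, index --text, index
stored: Optional[str] = None
history: List[str] = []
for flag in [None, "semantic", None, "text", None]:
    stored = resolve_index_mode(flag, stored)
    history.append(stored)
print(history)
```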

File Watching

ygrep watch                        # Watch current directory (honors stored mode)
ygrep watch /path/to/project       # Watch specific directory

File watching automatically uses the same mode (text or semantic) as the original index.

Status

ygrep status                       # Show index status
ygrep status --detailed            # Detailed statistics

Index Management

ygrep indexes list                 # List all indexes with sizes and type
ygrep indexes clean                # Remove orphaned indexes (frees disk space)
ygrep indexes remove <hash>        # Remove specific index by hash
ygrep indexes remove /path/to/dir  # Remove index by workspace path

Example output:

# 2 indexes (24.0 MB)

1bb65a32a7aa44ba  319.4 KB  [text]
  /path/to/project

c4f2ba4712ed98e7  23.7 MB  [semantic]
  /path/to/another-project
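If a script needs this listing as data, the example output above can be parsed line by line. This sketch assumes the layout is exactly as shown (hash, size, and bracketed type on one line, the workspace path indented on the next); the real output may differ between versions, and the helper name is hypothetical.

```python
import re
from typing import Dict, List

# hash, size with unit, and [type], as in the example output
ENTRY = re.compile(r"^(?P<hash>[0-9a-f]+)\s+(?P<size>\S+ \S+)\s+\[(?P<mode>\w+)\]$")

SAMPLE = """# 2 indexes (24.0 MB)

1bb65a32a7aa44ba  319.4 KB  [text]
  /path/to/project

c4f2ba4712ed98e7  23.7 MB  [semantic]
  /path/to/another-project
"""


def parse_index_list(output: str) -> List[Dict[str, str]]:
    """Turn the listing into dicts with hash, size, mode, and path."""
    entries: List[Dict[str, str]] = []
    for line in output.splitlines():
        match = ENTRY.match(line)
        if match:
            entries.append(match.groupdict())
        elif line.startswith("  ") and entries and "path" not in entries[-1]:
            # the indented line after an entry holds the workspace path
            entries[-1]["path"] = line.strip()
    return entries


indexes = parse_index_list(SAMPLE)
for entry in indexes:
    print(entry["hash"], entry["mode"], entry["path"])
```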

Semantic Search (Optional)

Enable semantic search for better results on natural language queries:

# Build semantic index (one-time, slower - mode is remembered)
ygrep index --semantic

# Search automatically uses hybrid mode when semantic index exists
ygrep "authentication flow"        # Uses BM25 + semantic search

# Force text-only search (single query, doesn't change index mode)
ygrep search "auth" --text-only

# Future index/watch commands remember the mode
ygrep index                        # Still semantic
ygrep watch                        # Watches with semantic indexing

# Convert back to text-only index
ygrep index --text

Semantic search uses the all-MiniLM-L6-v2 model (~25MB, downloaded on first use).

Note: Semantic search requires ONNX Runtime and is only available on certain platforms:

  • ✅ macOS ARM64 (Apple Silicon)
  • ✅ Linux x86_64
  • ❌ Linux ARM64/ARMv7/musl (text search only)

On unsupported platforms, ygrep works normally with BM25 text search - the --semantic flag will print a warning.
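A script that decides whether to pass --semantic could check the platform first. The sketch below encodes only the platforms listed above and returns None for anything the list does not mention. The machine strings are the values Python's platform module commonly reports and are an assumption here; musl builds cannot be detected this way.

```python
import platform
from typing import Optional

# Taken from the support list above; machine names as Python reports them.
SUPPORTED = {("Darwin", "arm64"), ("Linux", "x86_64")}
UNSUPPORTED = {("Linux", "aarch64"), ("Linux", "armv7l")}


def semantic_support(system: str, machine: str) -> Optional[bool]:
    """True or False for listed platforms, None when the list is silent."""
    key = (system, machine)
    if key in SUPPORTED:
        return True
    if key in UNSUPPORTED:
        return False
    return None


print(semantic_support(platform.system(), platform.machine()))
```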

AI Tool Integration

ygrep integrates with popular AI coding assistants:

Claude Code

ygrep install claude-code          # Install plugin
ygrep uninstall claude-code        # Uninstall plugin

After installation, restart Claude Code. The plugin:

  • Runs ygrep index on session start
  • Provides a skill that teaches Claude to use ygrep for searches

To use it, invoke /ygrep and then ask Claude to search.

OpenCode

ygrep install opencode             # Install tool
ygrep uninstall opencode           # Uninstall tool

Codex

ygrep install codex                # Install skill
ygrep uninstall codex              # Uninstall skill

Factory Droid

ygrep install droid                # Install hooks and skill
ygrep uninstall droid              # Uninstall

Example Output

AI Format (Default)

Optimized for AI assistants - a single-line header per result with score and match type, followed by the matching line:

# 5 results (3 text + 2 semantic)

src/config.rs:45 (85%) +
  pub struct Config {

src/main.rs:12 (72%) ~
  fn main() -> Result<()> {

src/lib.rs:100 (65%)
  let workspace = Workspace::open(&config)?;

Format: path:line (score%) [match_indicator]

  • + = Hybrid match (both text AND semantic)
  • ~ = Semantic only (no exact text match)
  • No indicator = Text only
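The format above is regular enough to parse with one pattern. This is a minimal sketch that assumes every result is a header line followed by one indented snippet line, as in the example; the function name and the dictionary keys are hypothetical.

```python
import re
from typing import Dict, List

# path:line (score%) followed by an optional + or ~ indicator
HEADER = re.compile(
    r"^(?P<path>.+):(?P<line>\d+) \((?P<score>\d+)%\)(?: (?P<mark>[+~]))?$"
)
MATCH_TYPES = {"+": "hybrid", "~": "semantic", None: "text"}

SAMPLE = """# 5 results (3 text + 2 semantic)

src/config.rs:45 (85%) +
  pub struct Config {

src/main.rs:12 (72%) ~
  fn main() -> Result<()> {

src/lib.rs:100 (65%)
  let workspace = Workspace::open(&config)?;
"""


def parse_ai_output(output: str) -> List[Dict[str, object]]:
    """Parse AI-format results into dicts; the summary line is skipped."""
    hits: List[Dict[str, object]] = []
    for raw in output.splitlines():
        if raw.startswith(" "):
            # indented lines are snippets belonging to the previous header
            if hits and not hits[-1]["snippet"]:
                hits[-1]["snippet"] = raw.strip()
            continue
        match = HEADER.match(raw)
        if match:
            hits.append({
                "path": match.group("path"),
                "line": int(match.group("line")),
                "score": int(match.group("score")),
                "match_type": MATCH_TYPES[match.group("mark")],
                "snippet": "",
            })
    return hits


hits = parse_ai_output(SAMPLE)
for hit in hits:
    print(hit["path"], hit["line"], hit["match_type"])
```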

JSON Format

Full metadata with --json:

{
  "hits": [...],
  "total": 5,
  "query_time_ms": 42,
  "text_hits": 3,
  "semantic_hits": 2
}

Each hit includes match_type: "Text", "Semantic", or "Hybrid".
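For scripted use, the JSON output can be loaded with a standard parser. The sketch below relies only on the fields shown above (hits, total, query_time_ms, and match_type on each hit). The sample hits are invented for illustration and carry no other fields, since the full hit schema is not shown here. How Hybrid hits count toward text_hits and semantic_hits is not stated, so the sample avoids them.

```python
import json
from collections import Counter
from typing import Dict

# Invented sample: only fields documented above are used.
SAMPLE = """
{
  "hits": [
    {"match_type": "Text"},
    {"match_type": "Text"},
    {"match_type": "Text"},
    {"match_type": "Semantic"},
    {"match_type": "Semantic"}
  ],
  "total": 5,
  "query_time_ms": 42,
  "text_hits": 3,
  "semantic_hits": 2
}
"""


def summarize(raw: str) -> Dict[str, object]:
    """Count hits per match_type and carry over the top-level totals."""
    data = json.loads(raw)
    by_type = Counter(hit["match_type"] for hit in data["hits"])
    return {
        "total": data["total"],
        "query_time_ms": data["query_time_ms"],
        "by_type": dict(by_type),
    }


summary = summarize(SAMPLE)
print(summary)
```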

Pretty Format

Human-readable with --pretty:

# 5 results (3 text + 2 semantic)

src/config.rs:45-67
  45: pub struct Config {
  46:     pub data_dir: PathBuf,
  47:     pub max_file_size: u64,

src/main.rs:12-28
  12: fn main() -> Result<()> {
  13:     let config = Config::load()?;
  14:     let workspace = Workspace::open(&config)?;

How It Works

  1. Indexing: Walks directory tree, indexes text files with Tantivy using a code-aware tokenizer
  2. Tokenizer: Custom tokenizer preserves code characters ($, @, #, -, _) as part of tokens
  3. Search: BM25-ranked literal search (default) or regex matching with -r flag, plus optional semantic search
  4. Results: Returns matching files with line numbers and context
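To see why preserving code characters matters, compare a plain alphanumeric tokenizer with one that keeps $, @, #, - and _ inside tokens. This is an illustration of the idea in Python, not ygrep's actual tokenizer (which is written in Rust on top of Tantivy); the patterns and function names are assumptions.

```python
import re
from typing import List

# Keeps the code characters named above inside a token.
CODE_TOKEN = re.compile(r"[A-Za-z0-9_$@#-]+")
# A plain tokenizer that splits on everything except letters and digits.
PLAIN_TOKEN = re.compile(r"[A-Za-z0-9]+")


def code_tokens(text: str) -> List[str]:
    """Tokenize while preserving $, @, #, - and _ as token characters."""
    return CODE_TOKEN.findall(text)


def plain_tokens(text: str) -> List[str]:
    """Tokenize on letters and digits only, dropping code characters."""
    return PLAIN_TOKEN.findall(text)


line = "$user_name = @cache get-item"
print(code_tokens(line))   # ['$user_name', '@cache', 'get-item']
print(plain_tokens(line))  # ['user', 'name', 'cache', 'get', 'item']
```

With the plain tokenizer, a query for $user_name can no longer be told apart from the separate words user and name; keeping the symbols in the token preserves that distinction in the index.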

Configuration

Index data is stored in:

  • macOS: ~/Library/Application Support/ygrep/indexes/
  • Linux: ~/.local/share/ygrep/indexes/
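A script that inspects or backs up index data can derive the directory from the platform. This sketch hard-codes the two locations listed above; the function name is hypothetical, and whether ygrep honors overrides such as XDG_DATA_HOME on Linux is not stated here, so none are handled.

```python
import sys
from pathlib import Path
from typing import Optional


def index_root(platform_name: Optional[str] = None, home: Optional[Path] = None) -> Path:
    """Return the documented index directory for macOS or Linux."""
    platform_name = sys.platform if platform_name is None else platform_name
    home = Path.home() if home is None else home
    if platform_name == "darwin":
        return home / "Library" / "Application Support" / "ygrep" / "indexes"
    if platform_name.startswith("linux"):
        return home / ".local" / "share" / "ygrep" / "indexes"
    # Other platforms are not covered by the list above.
    raise ValueError(f"no documented index location for {platform_name!r}")


print(index_root("linux", Path("/home/me")))
```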

Upgrading

# Via Homebrew
brew upgrade ygrep

# Then rebuild indexes so they use the latest tokenizer
ygrep index --rebuild

License

MIT
