lambdamechanic/groma

Groma

A semantic code search tool for Git repositories that uses vector embeddings to find relevant files based on natural language queries.

Two Versions Available

1. groma - Cloud-based with Qdrant

  • Uses Qdrant vector database (requires Docker)
  • Uses OpenAI API for embeddings (requires API key, costs money)
  • Higher-quality embeddings than the local model
  • Needs internet connection

2. groma-lancedb - Fully Local & Free

  • Uses LanceDB (embedded, no server needed)
  • Uses local fastembed model (AllMiniLML6V2)
  • 100% offline, no API calls
  • Completely free
  • Your code never leaves your machine

Installation

# Clone the repository
git clone https://github.com/lambdamechanic/groma.git
cd groma

# Build both versions
cargo build --release --features qdrant --bin groma
cargo build --release --features lancedb --bin groma-lancedb

# Install to a directory on your PATH (here ~/.local/bin)
mkdir -p ~/.local/bin
cp target/release/groma ~/.local/bin/
cp target/release/groma-lancedb ~/.local/bin/

Usage

Both versions use the same command-line interface:

# Basic usage - pipe your query through stdin
echo "authentication logic" | groma /path/to/repo --cutoff 0.3

# Or use the LanceDB version (no setup needed!)
echo "authentication logic" | groma-lancedb /path/to/repo --cutoff 0.3

Options

  • --cutoff - Similarity threshold (0.0-1.0, default: 0.7)
  • --suppress-updates - Skip indexing, query existing data only
  • --debug - Enable debug logging
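
For scripted use, the CLI can be driven from another program. The sketch below, in Python, builds an argument list from the options above and pipes the query through stdin. The helper names `build_command` and `run_query` are hypothetical, and the sketch assumes the chosen binary is on your PATH.

```python
import subprocess


def build_command(binary, repo, cutoff=None, suppress_updates=False, debug=False):
    """Assemble an argv list using only the flags documented above."""
    cmd = [binary, repo]
    if cutoff is not None:
        cmd += ["--cutoff", str(cutoff)]
    if suppress_updates:
        cmd.append("--suppress-updates")
    if debug:
        cmd.append("--debug")
    return cmd


def run_query(query, repo, binary="groma-lancedb", **options):
    """Pipe the query through stdin and return whatever the tool prints to stdout."""
    result = subprocess.run(
        build_command(binary, repo, **options),
        input=query,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout


# Example call (requires an installed binary and a real repository path):
# print(run_query("authentication logic", "/path/to/repo", cutoff=0.3))
```

Omitting `cutoff` leaves the flag off the command line, so the tool's own default applies.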

Setup Requirements

For groma (Qdrant version)

  1. Start Qdrant Docker container:
     docker run -p 6334:6334 -v ~/.qdrant_data:/qdrant/storage qdrant/qdrant
  2. Set OpenAI API key:
     export OPENAI_API_KEY='your-api-key-here'
  3. Optional - Set custom Qdrant URL:
     export QDRANT_URL='http://your-qdrant-host:6334'

For groma-lancedb (Local version)

No setup required! Just run it. The first run will download the embedding model (~80MB) automatically.

MCP Server Mode (LanceDB only)

groma-lancedb can run as an MCP (Model Context Protocol) server:

# Run as MCP server
groma-lancedb mcp

# With debug logging (logs to /tmp/groma.log)
groma-lancedb mcp --debug

MCP Configuration

Add to your MCP client config (e.g., Claude Desktop):

{
  "mcpServers": {
    "groma": {
      "command": "/path/to/groma-lancedb",
      "args": ["mcp"]
    }
  }
}

The MCP server provides a query tool for semantic code search:

  • query: Search query string
  • folder: Repository path to search
  • cutoff: Similarity threshold (0.0-1.0, default 0.3)
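
To illustrate what an MCP client sends when it invokes this tool, here is a sketch that builds a JSON-RPC `tools/call` request in Python. The `tools/call` method and the `name`/`arguments` fields follow the MCP specification; the helper name `make_query_request` and the example values are hypothetical. A real client also performs the MCP initialization handshake first, which is omitted here.

```python
import json


def make_query_request(query, folder, cutoff=0.3, request_id=1):
    """Build a JSON-RPC 2.0 request that calls the `query` tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "query",
            # Argument names match the tool parameters listed above.
            "arguments": {"query": query, "folder": folder, "cutoff": cutoff},
        },
    }


# Serialized form of one request, as it would be written to the server.
payload = json.dumps(make_query_request("authentication logic", "/path/to/repo"))
```

In practice an MCP client such as Claude Desktop constructs these messages for you; the sketch only shows how the three parameters map onto the request.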

How It Works

  1. Indexing: On first run, Groma scans your Git repository and creates embeddings for all tracked files
  2. Incremental Updates: Subsequent runs only process changed files
  3. Semantic Search: Your query is embedded and compared against the indexed files
  4. Results: Returns relevant file paths and content snippets in JSON format
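
As an illustration of step 3, the sketch below ranks files by cosine similarity between a query vector and per-file vectors, then applies a cutoff. This is an assumption about the comparison, not a description of Groma's internals: this README does not state which similarity metric is used, and the three-dimensional vectors are toy values standing in for real embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vec, index, cutoff):
    """Return (path, score) pairs scoring at least `cutoff`, best match first."""
    scored = [(path, cosine_similarity(query_vec, vec)) for path, vec in index.items()]
    hits = [(path, score) for path, score in scored if score >= cutoff]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Toy index: hypothetical paths mapped to made-up vectors.
toy_index = {
    "src/auth.rs": [1.0, 0.0, 0.0],
    "src/db.rs": [0.0, 1.0, 0.0],
    "src/session.rs": [1.0, 1.0, 0.0],
}
results = search([1.0, 0.0, 0.0], toy_index, cutoff=0.3)
```

Under this model, raising the cutoff returns fewer, closer matches, and lowering it returns more, looser ones.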

File Filtering

Both versions apply the same filtering rules:

  • .gitignore - Files ignored by Git are not indexed
  • .gromaignore - Additional patterns to exclude from indexing
  • Only Git-tracked files are processed
  • Binary files are automatically skipped
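
A rough sketch of how such filtering could be composed: list tracked files with `git ls-files`, drop paths that match extra ignore patterns, and skip files that look binary. Everything beyond `git ls-files` is an assumption for illustration. This README does not specify the `.gromaignore` pattern syntax (shell-style globs via `fnmatch` are used here, which differ from full gitignore semantics) or how binary files are detected (a NUL-byte check is used here).

```python
import fnmatch
import subprocess


def tracked_files(repo):
    """Paths tracked by Git, relative to the repository root."""
    out = subprocess.run(
        ["git", "-C", repo, "ls-files"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return out.splitlines()


def is_ignored(path, patterns):
    """True if the path matches any glob pattern (assumed pattern syntax)."""
    return any(fnmatch.fnmatchcase(path, pattern) for pattern in patterns)


def looks_binary(data):
    """Heuristic: treat content containing a NUL byte as binary."""
    return b"\x00" in data


def select_paths(paths, patterns):
    """Keep only the paths that no ignore pattern matches."""
    return [path for path in paths if not is_ignored(path, patterns)]
```

For example, `select_paths(tracked_files("."), ["*.lock", "vendor/*"])` would keep tracked files that are neither lock files nor under `vendor/`.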

Output Format

Results are returned as JSON for easy integration with other tools:

{
  "path": "src/auth.rs",
  "score": 0.82,
  "content": "impl Authentication {\n    pub fn verify_token..."
}
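
The example above shows one result object; how multiple results are framed (a single object, a stream of objects, or an array) is not spelled out here, so a consumer can be written to accept all three. The Python sketch below is one way to do that with the standard library. `parse_results` is a hypothetical helper, the field names are taken from the example above, and the second path in the sample is made up.

```python
import json


def parse_results(text):
    """Parse one JSON object, several concatenated objects, or a JSON array."""
    decoder = json.JSONDecoder()
    results = []
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        value, pos = decoder.raw_decode(text, pos)
        if isinstance(value, list):
            results.extend(value)
        else:
            results.append(value)
    return results


sample = (
    '{"path": "src/auth.rs", "score": 0.82, "content": "impl Authentication {"}\n'
    '{"path": "src/login.rs", "score": 0.41, "content": "fn login("}\n'
)
paths = [result["path"] for result in parse_results(sample)]
```

The resulting `paths` list can then be handed to another tool, for example as arguments to an editor or an AI coding assistant.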

Integration with Aider

Groma works great with aider for AI-assisted coding:

# Use with aider's --read flag
aider --read $(echo "authentication" | groma . --cutoff 0.3 | jq -r '.path')

# Or use the helper script
aider --read $(groma-files "authentication logic" .)

Performance Comparison

Feature            groma (Qdrant)       groma-lancedb (Local)
Setup Required     Docker + API Key     None
Internet Required  Yes                  No
Cost               OpenAI API fees      Free
Privacy            API calls            100% local
Embedding Quality  Higher               Good
Speed              Fast after indexing  Fast after indexing
Storage            External (Qdrant)    Local (.groma_lancedb)

Why Groma?

The name comes from the groma, a surveying instrument used in the Roman Empire. Just as the ancient groma helped surveyors find structure in the physical landscape, this tool helps you find relevant files within your codebase.

License

MIT
