AstBasedContext-rs
A Rust implementation of Ast based Context — builds a code graph from AST/CST analysis of your source code and exposes it to LLMs via an MCP server.
Supports 13 languages: Python, Rust, TypeScript, JavaScript, Go, Java, C, C++, C#, Ruby, PHP, Swift, Dart. Kotlin support is still a TODO because of upstream parser dependencies.
What it does
- Walks your project directory (respecting .gitignore)
- Parses every source file using tree-sitter CSTs
- Extracts functions, classes, structs, traits, interfaces, enums, variables, imports, and call relationships
- Builds a directed graph linking everything together
- Exposes the graph via a CLI or an MCP server so LLMs can query it
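To make the extract-then-link idea concrete, here is a minimal sketch of the same pipeline on a single Python snippet. It is not how ast_context works internally: it swaps tree-sitter for Python's standard `ast` module, handles only top-level functions and plain-name calls, and `build_call_graph` is a hypothetical helper name.

```python
import ast

def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function name to the names it calls directly."""
    tree = ast.parse(source)
    graph: dict[str, set[str]] = {}
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            callees = set()
            for inner in ast.walk(node):
                # Only plain-name calls such as foo(); method calls are skipped.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    callees.add(inner.func.id)
            graph[node.name] = callees
    return graph

SAMPLE = '''
def load(path):
    return open(path).read()

def parse(path):
    return load(path).split()

def main():
    parse("input.txt")
'''

call_graph = build_call_graph(SAMPLE)
for caller, callees in sorted(call_graph.items()):
    print(caller, "->", sorted(callees))
```

Each dictionary entry plays the role of a set of CALLS edges leaving one Function node. A real graph would also carry files, classes, imports, and the other node types listed later in this document.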
Installation
cargo install ast_context
This installs the ast_context binary, which handles both CLI code analysis and the MCP server.
Build from source
If you prefer to build from source instead:
git clone https://github.com/agene0001/AstBasedContext-rs.git
cd AstBasedContext-rs
cargo install --path .
This will compile the project and install the ast_context binary to your Cargo bin directory.
Once installed (either from crates.io or from source), run the setup command to automatically configure your editors with the MCP server:
ast_context setup
CLI Usage
Index a project
ast_context index <path> [--format stats|json|jsonl] [--save graph.json] [--annotate] [--exclude <pattern>...]
# Print summary stats
ast_context index ./my-project
# Save the graph to a file for later querying
ast_context index ./my-project --save graph.json
# Index with source annotations (enables similarity/redundancy detection)
ast_context index ./my-project --save graph.json --annotate
# Skip test files for a smaller, faster graph focused on production code
ast_context index ./my-project --skip-tests
# Exclude directories/files (repeatable, gitignore glob syntax)
ast_context index ./my-project --exclude "vendor/**" --exclude "*.generated.go"
# Set a custom file size limit in MB (default: 50MB — skips huge auto-generated files)
ast_context index ./my-project --annotate --save graph.json --max-file-size 20
# Export as JSON or JSONL
ast_context index ./my-project --format json --output graph.json
ast_context index ./my-project --format jsonl --output output_dir/
.astcontextignore and .astcontextignore.local
Place an .astcontextignore file in your project root (or any subdirectory) to permanently exclude paths. You can also use .astcontextignore.local for per-user exclusions that you don't want to commit to git. Both use the same syntax as .gitignore:
# Skip vendored code
vendor/
third_party/
# Skip generated files
*.generated.go
*.pb.go
*_generated.ts
This is read automatically — no CLI flags needed. You can combine it with --exclude for one-off exclusions.
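For intuition about how such a file is interpreted, the sketch below parses the example above and applies a deliberately simplified matcher. It assumes only two pattern shapes (directory names ending in `/` and file-name globs) and ignores the rest of the gitignore syntax, such as negation, anchoring, and `**`. `load_patterns` and `is_ignored` are hypothetical names, not part of ast_context.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

def load_patterns(text: str) -> list[str]:
    """Keep non-empty lines that are not comments."""
    lines = (line.strip() for line in text.splitlines())
    return [line for line in lines if line and not line.startswith("#")]

def is_ignored(rel_path: str, patterns: list[str]) -> bool:
    """Simplified matcher: 'dir/' patterns match directories, others match file names."""
    path = PurePosixPath(rel_path)
    for pattern in patterns:
        if pattern.endswith("/"):
            # 'vendor/' ignores anything located under a directory named 'vendor'.
            if pattern.rstrip("/") in path.parts[:-1]:
                return True
        elif fnmatch(path.name, pattern):
            return True
    return False

IGNORE_FILE = """
# Skip vendored code
vendor/
third_party/
# Skip generated files
*.generated.go
*.pb.go
*_generated.ts
"""

patterns = load_patterns(IGNORE_FILE)
for candidate in ["vendor/lib/a.go", "src/api.pb.go", "src/main.go"]:
    print(candidate, "ignored" if is_ignored(candidate, patterns) else "kept")
```

Patterns from .astcontextignore and .astcontextignore.local could be concatenated before matching, since both files share the same syntax.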
Query a saved graph
# Search by name (all types)
ast_context search --graph graph.json "parse"
# Search for functions only
ast_context search --graph graph.json "parse" --kind Function
# Analyze relationships
ast_context analyze --graph graph.json "my_function" --relationship callers
ast_context analyze --graph graph.json "my_function" --relationship callees
ast_context analyze --graph graph.json "MyClass" --relationship inheritance
ast_context analyze --graph graph.json "my_fn" --relationship call_chain --depth 5
ast_context analyze --graph graph.json "MyTrait" --relationship implementors
ast_context analyze --graph graph.json "MyModule" --relationship children
# Find dead code (functions never called)
ast_context dead-code --graph graph.json --limit 50
# Find most complex functions (by cyclomatic complexity)
ast_context complexity --graph graph.json --limit 20
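The dead-code query can be thought of as "Function nodes with no incoming CALLS edge". The sketch below shows that idea on a toy edge list. The `entry_points` parameter is an assumption added for illustration (something has to exempt `main`), and the actual command may apply different rules.

```python
def find_dead_functions(functions, call_edges, entry_points=frozenset()):
    """Return functions that never appear as the target of a call."""
    called = {callee for _caller, callee in call_edges}
    return sorted(f for f in functions if f not in called and f not in entry_points)

functions = ["main", "parse", "load", "legacy_export"]
call_edges = [("main", "parse"), ("parse", "load")]

dead = find_dead_functions(functions, call_edges, entry_points={"main"})
print(dead)  # ['legacy_export']
```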
Find similar/redundant code
Requires --annotate during indexing. Finds groups of structurally similar nodes based on token overlap and line count similarity.
# Find similar functions (great for finding consolidation opportunities)
ast_context similar --graph graph.json --kind Function --min-lines 8
# Find similar structs/classes
ast_context similar --graph graph.json --kind Struct
# Find all similar nodes across all types
ast_context similar --graph graph.json
This is designed for AI-assisted code review: the source snippets give an LLM enough context to identify genuinely redundant code even when names differ completely. Use cases:
- Redundancy detection: Find functions/classes that do the same thing
- Consolidation: Identify modules/packages that could be merged
- Refactoring: Help split large codebases into better modules based on what each node actually does
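As a rough illustration of scoring by token overlap and line-count similarity, the sketch below multiplies a Jaccard index over token sets by the ratio of line counts. The tokenizer, the way the two signals are combined, and the function names are assumptions made for this example, not the tool's actual formula.

```python
import re

def tokens(source: str) -> set[str]:
    """Split source into identifiers, numbers, and single punctuation characters."""
    return set(re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", source))

def similarity(a: str, b: str) -> float:
    """Jaccard token overlap scaled by how close the line counts are."""
    ta, tb = tokens(a), tokens(b)
    overlap = len(ta & tb) / len(ta | tb) if ta | tb else 1.0
    la, lb = len(a.splitlines()), len(b.splitlines())
    line_ratio = min(la, lb) / max(la, lb) if max(la, lb) else 1.0
    return overlap * line_ratio

TOTAL = """def total(items):
    result = 0
    for item in items:
        result += item.price
    return result
"""

SUM_PRICES = """def sum_prices(products):
    acc = 0
    for p in products:
        acc += p.price
    return acc
"""

CONFIG = """class Config:
    debug = False
"""

print(round(similarity(TOTAL, SUM_PRICES), 2))
print(round(similarity(TOTAL, CONFIG), 2))
```

The two summing functions share no identifiers except `price`, yet they score far higher against each other than against the unrelated class, which is the kind of signal that helps when names differ completely.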
Tiered redundancy analysis
Full redundancy, architecture, anti-pattern, and code quality analysis with confidence tiers (Critical > High > Medium > Low). 102 checks spanning:
- Redundancy: passthrough wrappers, near-duplicates, merge/split candidates, overlapping structs/enums
- Type suggestions: parameter structs, enum dispatch, trait extraction
- Architecture patterns: facade, factory, builder, strategy, template method, observer, decorator, mediator, visitor, iterator, state, composite, repository, prototype, flyweight, event emitter, memento, fluent builder, null object
- Detected patterns: singleton, adapter, proxy, command, chain of responsibility, dependency injection
- Anti-patterns: god class, circular dependencies, feature envy, shotgun surgery, dead code, long parameter list, data clumps, middle man, lazy class, refused bequest, speculative generality, inappropriate intimacy, deep nesting, anemic domain model, magic numbers, mutable global state, empty catch, callback hell, API inconsistency, divergent change, parallel inheritance, primitive obsession, large class, unstable dependency
- Type system suggestions: tagged union → sum type, class hierarchy → enum, boolean blindness, newtype wrapper, sealed type, large product type
- Structural quality: hub module, orphan module, inconsistent naming, circular package dependency
- Metrics: LCOM (lack of cohesion), CBO (coupling between objects), module instability, cognitive complexity
- Composite risk scores: per-function and per-file risk score combining complexity, test coverage, fan-in, TODOs, mutability
- Test coverage gaps: untested public functions, low test ratio per file, integration test smells
- Change blast radius: transitive caller analysis showing how many modules a change would affect
- Semantic clustering: misplaced functions, implicit modules (tightly coupled code spanning files)
- API surface: unstable public APIs (many callers + many params), undocumented public APIs, leaky abstractions
- Cross-language boundaries: FFI boundaries (extern C, ctypes, wasm_bindgen, PyO3, JNI, N-API), subprocess/exec calls, IPC/RPC protocols (gRPC, protobuf, Kafka, WebSocket, REST endpoints)
- Configuration detection: environment variable reads, hardcoded URLs/endpoints, feature flags, config file references
- Data structure suggestions: Vec used as set (→ HashSet), Vec used as map (→ HashMap), linear search in loop, string concatenation in loop, sorted Vec for lookup (→ HashMap), nested loop lookup (→ HashMap), HashMap with sequential integer keys (→ Vec), excessive collect-then-iterate chains
# Show all findings
ast_context redundancy --graph graph.json
# Only critical + high confidence
ast_context redundancy --graph graph.json --tier high
# Only critical
ast_context redundancy --graph graph.json --tier critical
# Tune thresholds, skip checks, cap findings, and include source code
ast_context redundancy --graph graph.json \
--split-complexity 20 --split-lines 80 \
--near-dup-threshold 0.85 \
--structural-threshold 0.55 \
--merge-threshold 0.45 \
--skip-check detect_dead_code,data_structures \
--limit-per-type 5 \
--include-source
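The tier and limit flags behave like a filter followed by a per-check cap. A small sketch of that selection logic follows, assuming a hypothetical finding record with `check`, `tier`, and `symbol` fields. The check names here are illustrative and are not taken from the tool's output.

```python
from enum import IntEnum

class Tier(IntEnum):
    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3

def select(findings, minimum, limit_per_type=None):
    """Keep findings at or above `minimum`, at most `limit_per_type` per check."""
    counts = {}
    kept = []
    # Highest tiers first, so the cap drops the least severe findings.
    for finding in sorted(findings, key=lambda f: f["tier"], reverse=True):
        if finding["tier"] < minimum:
            continue
        seen = counts.get(finding["check"], 0)
        if limit_per_type is not None and seen >= limit_per_type:
            continue
        counts[finding["check"]] = seen + 1
        kept.append(finding)
    return kept

findings = [
    {"check": "near_duplicate", "tier": Tier.HIGH, "symbol": "parse_b"},
    {"check": "near_duplicate", "tier": Tier.CRITICAL, "symbol": "parse_a"},
    {"check": "god_class", "tier": Tier.MEDIUM, "symbol": "Engine"},
    {"check": "magic_numbers", "tier": Tier.LOW, "symbol": "retry"},
]

high_and_up = select(findings, Tier.HIGH)
capped = select(findings, Tier.HIGH, limit_per_type=1)
print([f["symbol"] for f in high_and_up])
print([f["symbol"] for f in capped])
```

This mirrors the documented behavior that `--tier high` includes critical findings as well.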
Watch for changes
ast_context watch ./my-project --debounce 2000
ast_context watch ./my-project --exclude "build/**"
Rebuilds the graph whenever files change. Useful during active development.
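The `--debounce` value is in milliseconds in the example above. One plausible reading, assumed here and not confirmed by the source, is trailing-edge debouncing: a rebuild fires only after the file system has been quiet for the full interval. The sketch simulates that on a list of event timestamps.

```python
def coalesce(event_times_ms, debounce_ms):
    """Return rebuild times: debounce_ms after the last event of each burst."""
    rebuilds = []
    last = None
    for t in sorted(event_times_ms):
        # A gap of at least debounce_ms means the previous burst already fired.
        if last is not None and t - last >= debounce_ms:
            rebuilds.append(last + debounce_ms)
        last = t
    if last is not None:
        rebuilds.append(last + debounce_ms)
    return rebuilds

# Three quick saves, a pause, then two more saves.
rebuild_times = coalesce([0, 300, 900, 5000, 5100], debounce_ms=2000)
print(rebuild_times)  # [2900, 7100]
```

Five file events collapse into two rebuilds, which is why a larger debounce can help on projects where editors or build tools write many files at once.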
Parse a single file
ast_context parse src/main.rs
Prints the raw parse result as JSON.
List supported languages
ast_context languages
MCP Server
The MCP server lets Claude (or any MCP-compatible LLM) query your code graph directly.
Quick setup
After installing, run the setup command once to auto-configure every detected editor:
ast_context setup
This detects and configures:
- Claude Desktop: macOS & Windows
- Claude Code: via claude mcp add
- Zed: ~/.config/zed/settings.json
- Cursor: ~/.cursor/mcp.json
- Windsurf: ~/.codeium/windsurf/mcp_config.json
- VS Code (GitHub Copilot, v1.99+): user-level mcp.json
- JetBrains IDEs (IntelliJ, PyCharm, GoLand, WebStorm, …): all detected installs
# Preview what would be changed without modifying anything
ast_context setup --dry-run
# Override the binary path if auto-detection fails
ast_context setup --mcp-path /custom/path/to/ast_context
Restart your editor after running setup. Then ask your AI assistant to index your project:
Index /path/to/my-project with annotations
Manual configuration
If you prefer to configure manually or use an unsupported editor:
Configure with Claude Desktop / Claude Code
Add to ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
  "mcpServers": {
    "ast-context": {
      "command": "ast_context",
      "args": ["mcp"]
    }
  }
}
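If the config file already lists other servers, the entry has to be merged rather than pasted over the whole file. The sketch below shows one way to do that. `add_mcp_server` is a hypothetical helper, and it writes to a temporary directory here so the demo never touches a real config.

```python
import json
import tempfile
from pathlib import Path

def add_mcp_server(config_path: Path, name: str, command: str, args: list[str]) -> dict:
    """Insert or update one entry under "mcpServers", keeping existing entries."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": args}
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "claude_desktop_config.json"
    # Pretend another server was configured earlier.
    existing = {"mcpServers": {"other": {"command": "other-server", "args": []}}}
    path.write_text(json.dumps(existing))
    merged = add_mcp_server(path, "ast-context", "ast_context", ["mcp"])
    written = json.loads(path.read_text())

print(json.dumps(merged, indent=2))
```

Pointing `path` at the real file location for your platform would apply the same merge there. Backing the file up first is a sensible precaution.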
Configure with Zed
Add to ~/.config/zed/settings.json:
{
  "context_servers": {
    "ast-context": {
      "command": {
        "path": "ast_context",
        "args": ["mcp"]
      }
    }
  }
}
After configuring, ask your AI assistant to index your project:
Index /path/to/my-project with annotations, excluding node_modules and vendor
Available MCP Tools
| Tool | Description |
|---|---|
| index_directory | Index a directory and build its code graph. Auto-caches to .ast_context_cache.json; subsequent calls load from cache instantly if no source files have changed. Pass force_reindex=true to rebuild, or skip_tests=true to exclude tests. |
| find_code | Search for functions/classes/structs by name (partial match, case-insensitive) |
| get_file_summary | List all symbols defined in a specific file; useful for understanding a file before editing it |
| get_source | Retrieve the source snippet for a named symbol (requires annotate=true on index) |
| get_context_for_symbol | All context needed before editing a symbol: source, callers, callees, and similar functions in one call |
| find_references | All usages of a symbol: callers, inheritors, implementors, importers, and test functions |
| get_module_overview | Directory-level summary: files, line counts, public symbols, and cross-file dependencies |
| analyze_relationships | Callers, callees, inheritance, call chains, implementors, children |
| find_dead_code | Find uncalled functions |
| find_complex_functions | Rank functions by cyclomatic complexity |
| get_stats | Node/edge counts by type |
| list_repositories | Show all indexed repositories |
| find_similar | Find groups of redundant/similar code (requires annotate=true on index) |
| analyze_redundancy | Tiered redundancy + architecture + anti-pattern + type system + risk + boundary analysis (102 checks across 4 tiers, requires annotate=true) |
| save_graph | Save the in-memory graph to a file for manual archiving or sharing |
| load_graph | Load a previously saved graph into the session |
All query tools accept an optional repository parameter to target a specific indexed directory when multiple repos are loaded.
Session persistence
The first time you index a project the graph is saved to {project}/.ast_context_cache.json (automatically added to .gitignore). In subsequent sessions, calling index_directory on the same path will:
- Load from cache instantly if no source files have changed and the configuration (like annotate, exclude, or skip_tests) is identical
- Automatically re-index if any source file is newer than the cache, or if the indexing configuration fingerprint has changed
- Rebuild unconditionally if force_reindex=true is passed
This means you can safely call index_directory at the start of every session without worrying about performance.
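The three cache rules can be read as a small decision function. The sketch below is an assumption-laden illustration: the cache record fields (`fingerprint`, `created_at`), the SHA-256 fingerprint, and both function names are hypothetical, and plain numbers stand in for file modification times.

```python
import hashlib
import json

def config_fingerprint(annotate: bool, exclude: list[str], skip_tests: bool) -> str:
    """Hash the indexing options so any change invalidates the cache."""
    payload = json.dumps(
        {"annotate": annotate, "exclude": sorted(exclude), "skip_tests": skip_tests},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def cache_decision(cache, source_mtimes, fingerprint, force_reindex=False) -> str:
    """Return "load" when the cache can be reused, otherwise "rebuild"."""
    if force_reindex or cache is None:
        return "rebuild"
    if cache["fingerprint"] != fingerprint:
        return "rebuild"
    # Any source file modified after the cache was written makes it stale.
    if any(mtime > cache["created_at"] for mtime in source_mtimes):
        return "rebuild"
    return "load"

fp = config_fingerprint(annotate=True, exclude=["vendor/**"], skip_tests=False)
cache = {"fingerprint": fp, "created_at": 1000.0}

print(cache_decision(cache, [900.0, 950.0], fp))   # load
print(cache_decision(cache, [900.0, 1200.0], fp))  # rebuild
```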
MCP Protocol
The server implements JSON-RPC 2.0 over stdin/stdout, following the Model Context Protocol spec.
Example session:
// Client sends:
{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}
{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"index_directory","arguments":{"path":"/my/project"}}}
{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"find_code","arguments":{"query":"parse","kind":"Function"}}}
// Server responds:
{"jsonrpc":"2.0","id":1,"result":{"protocolVersion":"2024-11-05","serverInfo":{"name":"ast-context-mcp","version":"0.1.0"},...}}
{"jsonrpc":"2.0","id":2,"result":{"content":[{"type":"text","text":"Successfully indexed /my/project.\nGraph: 1317 nodes, 1904 edges."}]}}
{"jsonrpc":"2.0","id":3,"result":{"content":[{"type":"text","text":"Found 18 results for 'parse':\n..."}]}}
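The messages in the session above are plain JSON objects, so a client can build and decode them with any JSON library. The sketch below constructs the three requests and extracts the text from a tool response. The helper names are hypothetical, and actually piping the lines to the `ast_context mcp` process is left out.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC ids only need to be unique per session.

def request(method: str, params: dict) -> str:
    """Serialize one JSON-RPC 2.0 request as a single line."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}
    return json.dumps(message, separators=(",", ":"))

def tool_call(name: str, arguments: dict) -> str:
    """Wrap a tool invocation in the tools/call method."""
    return request("tools/call", {"name": name, "arguments": arguments})

def result_text(raw: str) -> str:
    """Join the text parts of a tool response, raising on a JSON-RPC error."""
    message = json.loads(raw)
    if "error" in message:
        raise RuntimeError(message["error"].get("message", "unknown error"))
    parts = message["result"]["content"]
    return "\n".join(p["text"] for p in parts if p.get("type") == "text")

init = request("initialize", {})
index = tool_call("index_directory", {"path": "/my/project"})
search = tool_call("find_code", {"query": "parse", "kind": "Function"})

# Response copied from the example session above.
response = r'{"jsonrpc":"2.0","id":2,"result":{"content":[{"type":"text","text":"Successfully indexed /my/project.\nGraph: 1317 nodes, 1904 edges."}]}}'
print(index)
print(result_text(response))
```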
Graph Node Types
| Type | Description |
|---|---|
| Repository | Root of the graph |
| Directory | Subdirectory |
| File | Source file. Carries public_count, private_count, comment_line_count, total_lines, is_test_file |
| Function | Function or method. Carries cyclomatic_complexity, arg_types, return_type, visibility, is_static, is_abstract, is_async, todo_comments, raises, has_error_handling |
| Class | Class. Carries bases (parent classes) and fields (typed field declarations) |
| Struct | Struct. Carries fields (typed field declarations) |
| Trait | Rust trait or similar |
| Interface | Go/Java/TypeScript interface |
| Enum | Enum (includes variant names) |
| Variable | Module-level or top-level variable |
| Module | External module/package |
Edge Types
| Type | Description |
|---|---|
| CONTAINS | Parent → child containment |
| CALLS | Function → function call (with line number and args) |
| IMPORTS | File → module dependency |
| INHERITS | Class → parent class |
| IMPLEMENTS | Class → interface/trait |
| HAS_PARAMETER | Function → parameter variable |
| TESTS | Test function → the production function it tests |
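With typed edges, a relationship query such as a depth-limited call chain reduces to a breadth-first walk that follows only CALLS edges. The sketch below uses a toy edge list of `(type, source, target)` tuples. That tuple layout and the `call_chain` helper are assumptions for illustration and do not reflect the saved graph's actual schema.

```python
from collections import deque

EDGES = [
    ("CONTAINS", "src/app.py", "main"),
    ("CALLS", "main", "parse"),
    ("CALLS", "parse", "tokenize"),
    ("CALLS", "tokenize", "read_char"),
    ("TESTS", "test_parse", "parse"),
]

def call_chain(edges, start: str, depth: int) -> dict[str, int]:
    """Breadth-first walk over CALLS edges; returns {function: hops from start}."""
    outgoing: dict[str, list[str]] = {}
    for kind, source, target in edges:
        if kind == "CALLS":
            outgoing.setdefault(source, []).append(target)
    hops = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if hops[current] == depth:
            continue  # Do not expand beyond the requested depth.
        for callee in outgoing.get(current, []):
            if callee not in hops:
                hops[callee] = hops[current] + 1
                queue.append(callee)
    del hops[start]
    return hops

print(call_chain(EDGES, "main", depth=2))  # {'parse': 1, 'tokenize': 2}
```

Reversing the edge direction in `outgoing` would give the transitive callers instead, which is the basis of a blast-radius style query.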
Workspace Structure
AstBasedContext-rs/
├── src/
│ ├── parser/ # Language parsers (one file per language)
│ ├── graph/ # Graph data structure, builder, queries
│ ├── types/ # Node/edge types, language enum
│ ├── mcp/ # MCP server implementation
│ ├── redundancy/ # Redundancy analysis and tiered checks
│ ├── walker.rs # Directory walker
│ ├── watcher.rs # File watcher
│ ├── serialize.rs # JSON/JSONL export
│ └── main.rs # Unified binary entry point (CLI + MCP)
└── Cargo.toml
Running Tests
cargo test
The suite currently has 29 tests covering the Python parser and graph builder. More language-specific tests are a good contribution target.
Future Work
Opt-in LSP Integration (--analyze --lsp)
The data structure checks (92-99) currently use source pattern matching (e.g., detecting .push() + .contains() to suggest HashSet). An opt-in LSP integration would confirm variable types before making suggestions, reducing false positives.
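To see why pattern matching alone produces false positives, consider this small sketch of a `.push()` plus `.contains()` detector. The regular expressions and the `vec_used_as_set` name are illustrative assumptions, not the tool's actual check.

```python
import re

PUSH = re.compile(r"\b(\w+)\.push\(")
CONTAINS = re.compile(r"\b(\w+)\.contains\(")

def vec_used_as_set(source: str) -> list[str]:
    """Names that are both pushed to and membership-tested in the same source."""
    pushed = set(PUSH.findall(source))
    checked = set(CONTAINS.findall(source))
    return sorted(pushed & checked)

# A Vec used for deduplication: a reasonable HashSet candidate.
DEDUP = """
let mut seen = Vec::new();
for id in ids {
    if !seen.contains(&id) {
        seen.push(id);
    }
}
"""

# A String also has push and contains, so it matches the same pattern.
STRING_CASE = """
let mut s = String::new();
s.push('a');
let hit = s.contains("a");
"""

print(vec_used_as_set(DEDUP))        # ['seen']
print(vec_used_as_set(STRING_CASE))  # ['s']
```

The second result is a false positive: `s` is a `String`, not a `Vec`. Knowing the variable's type is exactly the information the text-based check lacks.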
Benefits:
- Type-confirmed data structure suggestions (e.g., verify a variable is actually a Vec before suggesting HashSet)
- Unnecessary .clone() detection
- Parameter type suggestions
- Redundant type conversion detection
- More accurate unused import detection
Approach:
- Start with rust-analyzer only (best LSP support, project is Rust-focused)
- Query textDocument/hover for type-at-position to enrich existing findings
- Degrade gracefully with timeouts if the LSP is slow or unavailable
- Expand to other language servers one at a time, since each has different startup/protocol quirks
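For reference, a hover query in the Language Server Protocol is a JSON-RPC request framed with a `Content-Length` header, and positions are zero-based. The sketch below builds one such framed message. The file URI and position are made-up values, and starting rust-analyzer, the initialize handshake, and timeout handling are omitted.

```python
import json

def hover_request(request_id: int, uri: str, line: int, character: int) -> dict:
    """Build a textDocument/hover request (line and character are zero-based)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "textDocument/hover",
        "params": {
            "textDocument": {"uri": uri},
            "position": {"line": line, "character": character},
        },
    }

def frame(message: dict) -> bytes:
    """Prefix the JSON body with the LSP base-protocol Content-Length header."""
    body = json.dumps(message, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body

framed = frame(hover_request(1, "file:///project/src/main.rs", line=10, character=8))
print(framed.decode("utf-8"))
```

The hover response would carry the type at that position, which an analysis pass could then compare against the type it guessed from the source pattern.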