1 unstable release

0.1.0	Mar 27, 2026

MIT license

codemap

Scan a codebase and generate an interactive HTML architecture map. Point it at any repo, get a clickable visual graph of files, imports, functions, and call relationships.

The Problem

Understanding a new codebase is hard. You open a repo, see 50+ files, and have no idea how they connect. Sourcetrail solved this well, but it was discontinued in 2021, and no modern CLI tool has taken its place. codemap fills that gap: one command, one HTML file, full architecture visibility.

How It Works

Scan the directory for source files (Python, JS/TS, Rust, Go)
Parse each file with tree-sitter to extract imports, function definitions, class/struct definitions, and function calls
Build a directed graph: files link to files (imports), files contain functions/classes, functions call functions
Render an interactive HTML file with a D3.js force-directed graph

The output is a single self-contained HTML file. No server needed. Open it in any browser.
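The graph described in step 3 can be pictured as a small typed node and edge model. The following is a minimal sketch using only the standard library, not codemap's actual internals. The names `Graph`, `NodeKind`, and `EdgeKind`, and the `file::function` id scheme, are assumptions made for illustration.

```rust
use std::collections::BTreeSet;

/// Node kinds taken from the README: files, functions, classes, external deps.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum NodeKind { File, Function, Class, External }

/// Edge kinds taken from the README: imports, contains, calls.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum EdgeKind { Imports, Contains, Calls }

/// Hypothetical in-memory graph: nodes are addressed by index,
/// edges are (from, to, kind) triples stored in a set to avoid duplicates.
#[derive(Debug, Default)]
struct Graph {
    nodes: Vec<(String, NodeKind)>,
    edges: BTreeSet<(usize, usize, EdgeKind)>,
}

impl Graph {
    /// Return the index of the node with this id, inserting it if absent.
    fn node(&mut self, id: &str, kind: NodeKind) -> usize {
        if let Some(i) = self.nodes.iter().position(|(n, _)| n == id) {
            return i;
        }
        self.nodes.push((id.to_string(), kind));
        self.nodes.len() - 1
    }

    /// Add a directed edge; repeated inserts of the same edge are no-ops.
    fn edge(&mut self, from: usize, to: usize, kind: EdgeKind) {
        self.edges.insert((from, to, kind));
    }

    /// Targets of all edges of one kind leaving `from`.
    fn out_edges(&self, from: usize, kind: EdgeKind) -> Vec<usize> {
        self.edges
            .iter()
            .filter(|(f, _, k)| *f == from && *k == kind)
            .map(|(_, t, _)| *t)
            .collect()
    }
}
```

Keeping the edge kind on each edge is what would let a renderer toggle imports, contains, and calls independently.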

Install

cargo install codemap

Or build from source:

git clone https://github.com/jtsilverman/codemap.git
cd codemap
cargo build --release

Usage

# Scan current directory, open in browser
codemap

# Scan a specific project
codemap ~/projects/my-app

# JSON output for tooling
codemap --json ~/projects/my-app -o graph.json

# Skip call graph (faster, cleaner for large repos)
codemap --no-calls ~/projects/my-app

# Force a specific language
codemap --lang python ~/projects/my-app

Options

codemap [OPTIONS] [PATH]

Arguments:
  [PATH]  Directory to scan [default: .]

Options:
      --json               Output as JSON instead of HTML
      --lang <LANG>        Force language (python|js|ts|rust|go)
      --depth <N>          Max directory depth [default: 10]
      --exclude <PAT>      Exclude patterns, comma-separated [default: node_modules,target,.git,__pycache__,.venv,vendor]
      --no-calls           Skip function call extraction
      --no-open            Don't auto-open the HTML
  -o, --output <FILE>      Output file path
  -v, --verbose            Show per-file parse stats
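As a rough illustration of how `--exclude` and `--depth` might interact during a scan, here is a sketch of a skip rule. The matching semantics are an assumption: an exclude entry is compared against whole path components, and depth is counted as the number of directories between the scan root and the file. The real tool may match patterns differently.

```rust
use std::path::Path;

/// Default exclude list, copied from the option help text.
const DEFAULT_EXCLUDES: &str = "node_modules,target,.git,__pycache__,.venv,vendor";

/// Hypothetical helper: decide whether a path (relative to the scan root)
/// should be skipped. Assumes exact component matching and directory-count depth.
fn should_skip(rel_path: &str, excludes: &str, max_depth: usize) -> bool {
    let parts: Vec<String> = Path::new(rel_path)
        .components()
        .map(|c| c.as_os_str().to_string_lossy().into_owned())
        .collect();

    // Number of directories above the file itself.
    let dir_depth = parts.len().saturating_sub(1);
    if dir_depth > max_depth {
        return true;
    }

    // Skip if any component equals one of the comma-separated entries.
    parts.iter().any(|part| {
        excludes
            .split(',')
            .map(str::trim)
            .filter(|p| !p.is_empty())
            .any(|p| p == part.as_str())
    })
}
```

Under these assumptions, `should_skip("node_modules/lib/index.js", DEFAULT_EXCLUDES, 10)` is true, while `should_skip("src/main.rs", DEFAULT_EXCLUDES, 10)` is false.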

Supported Languages

Language	Imports	Functions	Classes/Structs	Calls
Python	`import`, `from...import`	`def`	`class`	function calls
JavaScript	`import`, `require()`	`function`, arrow functions	`class`	function calls
TypeScript	`import`	`function`, arrow functions	`class`	function calls
Rust	`use`	`fn`	`struct`, `trait`	function calls
Go	`import`	`func`	`type struct`, `type interface`	function calls
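Before a file can be parsed, the scanner has to decide which grammar applies. A plausible approach is a lookup by file extension, sketched below. The extension list (including `jsx`, `mjs`, `cjs`, and `tsx`) is an assumption and is not taken from the README. The `--lang` values are the ones shown in the options above.

```rust
use std::path::Path;

/// The five languages listed in the table above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Lang { Python, JavaScript, TypeScript, Rust, Go }

/// Parse a `--lang` value (python|js|ts|rust|go).
fn parse_lang_flag(value: &str) -> Option<Lang> {
    match value {
        "python" => Some(Lang::Python),
        "js" => Some(Lang::JavaScript),
        "ts" => Some(Lang::TypeScript),
        "rust" => Some(Lang::Rust),
        "go" => Some(Lang::Go),
        _ => None,
    }
}

/// Hypothetical extension-based detection; returns None for unsupported files.
fn detect_lang(path: &str) -> Option<Lang> {
    match Path::new(path).extension()?.to_str()? {
        "py" => Some(Lang::Python),
        "js" | "jsx" | "mjs" | "cjs" => Some(Lang::JavaScript),
        "ts" | "tsx" => Some(Lang::TypeScript),
        "rs" => Some(Lang::Rust),
        "go" => Some(Lang::Go),
        _ => None,
    }
}
```

Files with no extension or an unknown one return `None`, which a scanner could simply skip.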

Interactive Features

Click a node to highlight its connections and see details in the sidebar
Search by file name or function name
Toggle edge types: imports, contains, calls
Toggle node types: files, functions, classes, external deps
Zoom and pan with mouse/trackpad
Directory clustering: files in the same directory are grouped visually
Auto-collapse: graphs with 500+ nodes start in a file-level-only view to prevent hairballs

Tech Stack

Rust with tree-sitter for multi-language AST parsing
D3.js v7 for force-directed graph visualization
Single binary distribution via cargo install
Tree-sitter grammars: tree-sitter-python, tree-sitter-javascript, tree-sitter-typescript, tree-sitter-rust, tree-sitter-go

The Hard Part

The hard part is a graph layout that doesn't produce a hairball. Force-directed graphs with hundreds of nodes and thousands of edges tend to collapse into an unreadable tangle. codemap combines four techniques:

Directory clustering groups nodes by parent directory using a custom D3 force
Progressive detail starts file-level only for large repos, with click-to-expand
Edge filtering shows only import edges by default; call edges toggle on demand
Force tuning uses high charge repulsion (-300) with variable link distances (30px same-directory, 150px cross-directory)

This keeps the graph readable up to ~500 nodes. Beyond that, auto-collapse kicks in.
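The layout rules above reduce to a few constants and two small decisions. In codemap these presumably live in the D3.js code inside the generated HTML. The sketch below only restates the rules in Rust for clarity. The function names are hypothetical, and treating "500+" as `>= 500` is an assumption.

```rust
/// Values quoted in the text above.
const CHARGE_STRENGTH: f64 = -300.0; // negative strength means repulsion in D3
const SAME_DIR_LINK_PX: f64 = 30.0;
const CROSS_DIR_LINK_PX: f64 = 150.0;
const AUTO_COLLAPSE_NODES: usize = 500;

/// Parent directory of a forward-slash path; "" for files at the root.
fn parent_dir(path: &str) -> &str {
    match path.rfind('/') {
        Some(i) => &path[..i],
        None => "",
    }
}

/// Short links inside a directory pull siblings together;
/// long links across directories keep clusters apart.
fn link_distance(source: &str, target: &str) -> f64 {
    if parent_dir(source) == parent_dir(target) {
        SAME_DIR_LINK_PX
    } else {
        CROSS_DIR_LINK_PX
    }
}

/// Whether the graph should open in the file-level-only view.
fn starts_collapsed(node_count: usize) -> bool {
    node_count >= AUTO_COLLAPSE_NODES
}
```

The 5x gap between the two link distances is what makes directories show up as visible clusters even before the custom clustering force is applied.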

Limitations

Import resolution is best-effort (matches module names to local file paths, marks unresolved as "external")
No cross-language call graph (e.g., Python calling Rust via FFI)
No type inference or control flow analysis
Binary is ~15MB due to embedded tree-sitter grammars for 5 languages
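To make the first limitation concrete, here is a sketch of what best-effort import resolution can look like for Python-style module names. It is an illustration under assumptions, not codemap's resolver. The candidate paths (`module.py`, `module/__init__.py`) and the `Resolved` type are hypothetical.

```rust
use std::collections::HashSet;

/// Outcome of resolving one import: a file in the scanned repo, or "external".
#[derive(Debug, PartialEq, Eq)]
enum Resolved {
    Local(String),
    External(String),
}

/// Map a dotted module name to a local file if one exists in `files`
/// (paths relative to the scan root); otherwise mark it external.
fn resolve_python_import(module: &str, files: &HashSet<String>) -> Resolved {
    let base = module.replace('.', "/");
    // Assumed candidates: a plain module file, then a package directory.
    let candidates = [format!("{base}.py"), format!("{base}/__init__.py")];
    for candidate in candidates {
        if files.contains(&candidate) {
            return Resolved::Local(candidate);
        }
    }
    Resolved::External(module.to_string())
}
```

A resolver like this cannot see relative imports, path hacks, or installed packages that shadow local names, which is why unresolved names fall through to "external" instead of failing.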

License

MIT
