mark-guard

A CLI tool that keeps your documentation in sync with your Go code.

You change code. You forget to update docs. mark-guard parses the AST of your old code (from git) and your new code (on disk), extracts a semantic diff of exported symbols, and produces a structured summary of what changed in the public API. It then feeds that diff plus your current markdown docs to an LLM and writes the updated docs back to disk.

Text diffs are noisy and miss the point. AST-level diffing tells you exactly what changed in the public API, which is exactly what documentation cares about.

Status

The end-to-end pipeline works. I might rework the prompt section to use more detailed XML and more precise instructions. Other parts may change as I test against more codebases.

Phase	Description	Status
1-2	Skeleton + Git Integration	Done
3	Go Symbol Extraction	Done
4	Symbol Diffing	Done
5	Doc Scanning	Done
6	LLM Integration	Done
7	End-to-End Wiring	Done

What works today:

Detects changed .go files via git
Parses old and new Go source, extracts exported symbols
Diffs symbol sets: added, removed, modified (down to parameters, fields, methods)
Produces compact diff summaries (reduced token usage for LLM)
Scans and selects relevant markdown docs via config-based mapping
Loads config from .markguard.yaml with sensible defaults
Validates LLM output before writing (content-loss guard)
Dry-run by default, --write to apply, --force to bypass safety checks
Sends diff + docs to LLM (Gemini/OpenAI compatible) and writes updates back
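
To make the extraction and diffing items above concrete, here is a minimal sketch using only the standard library. It is not mark-guard's actual code: the names exportedFuncs and diffSymbols are hypothetical, it only covers top-level functions (no types, methods, consts, or vars), and comparing printed signatures as strings is a simplification assumed for the example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/printer"
	"go/token"
	"sort"
	"strings"
)

// exportedFuncs parses src and maps each exported top-level function name to
// its printed signature. Methods and other declaration kinds are skipped to
// keep the sketch short (an assumption, not how mark-guard scopes symbols).
func exportedFuncs(src string) (map[string]string, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return nil, err
	}
	out := make(map[string]string)
	for _, decl := range file.Decls {
		fn, ok := decl.(*ast.FuncDecl)
		if !ok || fn.Recv != nil || !fn.Name.IsExported() {
			continue
		}
		var sig strings.Builder
		if err := printer.Fprint(&sig, fset, fn.Type); err != nil {
			return nil, err
		}
		out[fn.Name.Name] = sig.String()
	}
	return out, nil
}

// diffSymbols compares two name->signature maps and returns sorted lists of
// added, removed, and modified names.
func diffSymbols(oldSyms, newSyms map[string]string) (added, removed, modified []string) {
	for name, sig := range newSyms {
		oldSig, ok := oldSyms[name]
		switch {
		case !ok:
			added = append(added, name)
		case oldSig != sig:
			modified = append(modified, name)
		}
	}
	for name := range oldSyms {
		if _, ok := newSyms[name]; !ok {
			removed = append(removed, name)
		}
	}
	sort.Strings(added)
	sort.Strings(removed)
	sort.Strings(modified)
	return added, removed, modified
}

func main() {
	oldSrc := "package demo\n\nfunc Greet(name string) string { return name }\nfunc Old() {}\nfunc helper() {}\n"
	newSrc := "package demo\n\nfunc Greet(name string, loud bool) string { return name }\nfunc New() {}\nfunc helper() {}\n"

	oldSyms, err := exportedFuncs(oldSrc)
	if err != nil {
		panic(err)
	}
	newSyms, err := exportedFuncs(newSrc)
	if err != nil {
		panic(err)
	}
	added, removed, modified := diffSymbols(oldSyms, newSyms)
	fmt.Println("added:", added)
	fmt.Println("removed:", removed)
	fmt.Println("modified:", modified)
}
```

Keying both sides by symbol name keeps the comparison independent of declaration order, and unexported names such as helper never enter the maps, so they cannot show up in the summary.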

How It Works

Detect changed .go files via git diff --name-only + git ls-files --others
Read old version from git show HEAD:<file>, new version from disk
Parse both with go/parser.ParseFile, extract exported symbols (functions, types, structs, interfaces, consts, vars)
Diff the two symbol sets: what was added, removed, or modified (down to individual parameters, fields, methods)
Scan configured doc paths, select relevant markdown files via config-based mapping
Build prompt with diff summary + doc content, send to LLM
Validate LLM output (reject empty results, block >50% content loss)
Write updated docs back to disk

Usage

format

Updates existing docs based on changed symbols in the current git diff.

# dry run -- see what would change (default)
make run

# apply changes to doc files
make run ARGS="--write"

# see the full diff summary, prompt, and raw LLM response
make run ARGS="--debug"

# bypass content-loss safety checks
make run ARGS="--write --force"

# compare against a specific git ref
make run ARGS="--base HEAD~3"

# use a custom config file
make run ARGS="--config path/to/.markguard.yaml"

# abort if token estimate exceeds a limit
make run ARGS="--max-tokens 30000"

format flags

Flag	Default	Description
`--base`	`HEAD`	Git ref to compare against
`--config`	`.markguard.yaml`	Path to config file
`--debug`	`false`	Print diff summary, prompt, and raw LLM response
`--force`	`false`	Bypass content-loss safety checks
`--max-tokens`	`50000`	Abort if estimated tokens exceed this limit
`--write`	`false`	Apply changes to doc files (dry-run by default)

generate

Bootstraps docs from scratch by parsing all exported Go symbols and sending them to the LLM. Use this when no docs exist yet. Use format for ongoing updates.

# dry run -- preview what would be generated
make generate

# append all packages to README.md
make generate-write ARGS="--output README.md"

# write one file per package into docs/
make generate-write ARGS="--output docs/"

# target a subdirectory of your repo
make generate-write ARGS="./internal/llm --output docs/"

# overwrite existing files in directory mode
make generate-write ARGS="--output docs/ --force"

# preview with full LLM prompt visible
make generate ARGS="--debug"

Output routing:

--output README.md (any .md file): all packages are appended to that single file, sorted alphabetically and separated by horizontal rules.
--output docs/ (directory): one <pkgname>.md file is created per package.

If --output is not passed, the value from generate.output in .markguard.yaml is used, then docs.paths[0], then docs/ as a final fallback.
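
The fallback order above is easy to get wrong, so here is a small sketch of it. The function names resolveOutput and isSingleFile are hypothetical, and treating any path ending in .md as single-file mode is a reading of the routing rules above rather than confirmed behavior.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveOutput applies the documented precedence: the --output flag, then
// generate.output from config, then docs.paths[0], then "docs/".
func resolveOutput(flagValue, generateOutput string, docPaths []string) string {
	switch {
	case flagValue != "":
		return flagValue
	case generateOutput != "":
		return generateOutput
	case len(docPaths) > 0:
		return docPaths[0]
	default:
		return "docs/"
	}
}

// isSingleFile reports whether dest names a .md file (all packages appended
// to one file) rather than a directory (one file per package).
func isSingleFile(dest string) bool {
	return strings.HasSuffix(dest, ".md")
}

func main() {
	dest := resolveOutput("", "", []string{"docs/", "README.md"})
	fmt.Println(dest, isSingleFile(dest))

	dest = resolveOutput("README.md", "docs/", nil)
	fmt.Println(dest, isSingleFile(dest))
}
```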

generate flags

Flag	Default	Description
`--output`	from config	Directory or `.md` file destination
`--config`	`.markguard.yaml`	Path to config file
`--max-tokens`	`50000`	Abort if estimated tokens exceed this limit
`--write`	`false`	Apply changes (dry-run by default)
`--force`	`false`	Overwrite existing files in directory mode
`--debug`	`false`	Print symbol list, prompt, and raw LLM response

Docker

Run mark-guard without installing Go:

docker pull ghcr.io/elshadhu/mark-guard:latest

Run it against your repo:

# dry run - see what would change
docker run --rm \
  -v "$(pwd):/repo" \
  -w /repo \
  -e GEMINI_API_KEY="$GEMINI_API_KEY" \
  ghcr.io/elshadhu/mark-guard:latest format

# apply changes
docker run --rm \
  -v "$(pwd):/repo" \
  -w /repo \
  -e GEMINI_API_KEY="$GEMINI_API_KEY" \
  ghcr.io/elshadhu/mark-guard:latest format --write

# check version
docker run --rm ghcr.io/elshadhu/mark-guard:latest version

-v "$(pwd):/repo" mounts your repo so mark-guard can see your code, docs, and .git history.

You can pin to a specific version instead of latest:

docker pull ghcr.io/elshadhu/mark-guard:1.2.3

Key Design Decisions

Decision	Choice	Why
Diff strategy	AST-level symbol diff, not text diff	Text diffs include noise (whitespace, imports, comments). AST diff gives semantic changes: "parameter added", "field type changed". That is what docs care about.
Parser	`go/parser` only, no `go/types`	We parse raw strings from `git show`. `go/types` needs the full module graph. We need signatures, not resolved types.
Git integration	`os/exec` shelling out to `git`	`go-git` pulls 30+ dependencies. System `git` is faster for simple operations.
Doc-to-code mapping	Config-based mapping + send-all fallback	Small repos: send all docs (zero config). Large repos: user adds mappings for precision. No false-positive symbol scanning.
CLI framework	Cobra without Viper	Cobra gives subcommands, flags, help text. Viper pulls 20 transitive deps for reading one YAML file. We use `yaml.v3` directly.
Config	`.markguard.yaml` with env var references	API key stored as env var name, not the key itself. Config is optional, defaults work out of the box.
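
As an illustration of the "shell out to git" choice in the table above, reading the old version of a file takes only a few lines of os/exec. This is a sketch, not the project's code: gitShowCmd and readOldVersion are hypothetical helper names, and the error formatting is an assumption.

```go
package main

import (
	"bytes"
	"fmt"
	"os/exec"
)

// gitShowCmd builds (but does not run) `git show <ref>:<path>` inside repoDir.
// The path is interpreted relative to the repository root.
func gitShowCmd(repoDir, ref, path string) *exec.Cmd {
	cmd := exec.Command("git", "show", ref+":"+path)
	cmd.Dir = repoDir
	return cmd
}

// readOldVersion returns the content of path as of ref, including git's
// stderr output in the error when the command fails.
func readOldVersion(repoDir, ref, path string) ([]byte, error) {
	cmd := gitShowCmd(repoDir, ref, path)
	var stderr bytes.Buffer
	cmd.Stderr = &stderr
	out, err := cmd.Output()
	if err != nil {
		return nil, fmt.Errorf("git show %s:%s: %w: %s", ref, path, err, bytes.TrimSpace(stderr.Bytes()))
	}
	return out, nil
}

func main() {
	// Run from inside any git repository that has a README.md at HEAD.
	content, err := readOldVersion(".", "HEAD", "README.md")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("read %d bytes from HEAD:README.md\n", len(content))
}
```

The bytes returned here can be handed straight to go/parser.ParseFile as source, which is why the parser never needs a checked-out copy of the old tree.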

What It Does Not Do

Support languages other than Go. Each language needs its own parser. Go-only for now.
Auto-commit. You review the changes first.

Dependencies

github.com/spf13/cobra       # CLI framework
gopkg.in/yaml.v3              # YAML config parsing

Two external deps. Everything else is Go stdlib (go/parser, go/ast, go/token, os/exec, encoding/json).

Config

Create .markguard.yaml at your repo root (optional, defaults work without it):

llm:
  base_url: "https://generativelanguage.googleapis.com/v1beta/openai"
  api_key_env: "GEMINI_API_KEY"
  model: "gemini-2.5-flash"
docs:
  paths:
    - "docs/"
    - "README.md"
  exclude:
    - "docs/roadmap.md"
  mappings:
    - docs: ["docs/api.md"]
      code: ["internal/git/", "internal/config/"]
    - docs: ["README.md"]
      code: ["cmd/", "internal/cli/"]
generate:
  # a .md file appends all packages; a directory creates one file per package
  output: "README.md"

Without .markguard.yaml, defaults are:

Provider: Gemini (gemini-2.5-flash)
API key env: GEMINI_API_KEY
Doc paths: docs/, README.md
Mappings: None (sends all docs, fine for small repos)

Development

make build     # build binary to bin/mark-guard
make test      # go test ./... -v -race
make lint      # golangci-lint run ./...
make run       # go run ./cmd/mark-guard format
make generate  # dry-run generate (preview only)
make generate-write  # generate and append to configured output

References

Project	What I used it for
`golang.org/x/exp/apidiff`	Reference for map-keyed symbol comparison and API change detection between package versions.
`go/doc`	Grouping methods, consts, and vars under parent types.
`go/parser` + `go/ast`	AST parsing without type-checking (works on raw strings from `git show`).
Cobra (`spf13/cobra`)	Subcommand routing and flag parsing.
`golangci-lint`	Reference for shelling out to `git` instead of pulling in a Go git library.
Gemini OpenAI compatibility	ai.google.dev/gemini-api/docs/openai

What's Next

Support for other languages (Python, TypeScript, Rust); each needs its own parser
Per-edit validation before applying (currently per-file only)
Configurable content-loss thresholds via .markguard.yaml
