🌐 Language: English | 日本語
Your agent can spend money, leak data, and call tools before you notice. Shingan catches dangerous workflow structures before runtime.
Status: Beta. Shingan is under active development; v1.0 is targeted for late 2026 once baseline / ignore / severity-policy / PR-bot land. Not yet recommended for production-critical CI gating; informational use (continue-on-error: true) is the recommended integration mode today.
A Go-based static analyzer for AI agent workflows. It catches dangerous structures — infinite loops, unreachable nodes, missing error handlers, runaway cost paths, prompt-injection sinks, PII leak paths, code-execution from LLM output — before the workflow ever runs.
LLM orchestration is mainstream now, but the "design-time bug" detection layer is missing. Runtime observability (LangSmith / Langfuse) only tells you a thing happened after it cost you money or leaked data. n8n-only linters (FlowLint) miss everything else. Shingan inspects the workflow graph before execution, across LangGraph, CrewAI, ADK-Go, n8n, and custom JSON DSLs.
AI agents are unforgiving once they fan out: external API calls, browser automation, and code execution all leave irreversible side effects. Catching infinite loops, unreachable nodes, missing error handlers, PII leak paths, and prompt-injection sinks before deploy prevents the majority of cost blowups and incidents.
Every workflow framework reduces to the same primitive: a directed graph of nodes and edges. Shingan keeps that intermediate representation (IR) at the center of an Onion Architecture and runs 20+ rules that are framework-agnostic.
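To make the "directed graph of nodes and edges" idea concrete, here is a minimal Go sketch of such an IR, assuming the JSON field names used by the sample workflow later in this README. The type names (`Node`, `Edge`, `WorkflowGraph`), the helper functions, and the `"tool"` node type value are illustrative assumptions, not Shingan's actual domain types. The reachability walk only shows why a graph IR makes a check like unreachable-node detection framework-agnostic.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Node and Edge mirror the fields of the sample workflow JSON in this README.
// The Go type names are hypothetical, not Shingan's real domain types.
type Node struct {
	ID   string `json:"id"`
	Type string `json:"type"`
	Name string `json:"name"`
}

type Edge struct {
	From string `json:"from"`
	To   string `json:"to"`
}

type WorkflowGraph struct {
	EntryNodeID string `json:"entry_node_id"`
	Nodes       []Node `json:"nodes"`
	Edges       []Edge `json:"edges"`
}

// ParseGraph decodes the JSON form into the IR.
func ParseGraph(data []byte) (WorkflowGraph, error) {
	var g WorkflowGraph
	err := json.Unmarshal(data, &g)
	return g, err
}

// Unreachable returns the IDs of nodes that a breadth-first walk
// from the entry node never visits.
func Unreachable(g WorkflowGraph) []string {
	adj := make(map[string][]string)
	for _, e := range g.Edges {
		adj[e.From] = append(adj[e.From], e.To)
	}
	seen := map[string]bool{g.EntryNodeID: true}
	queue := []string{g.EntryNodeID}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for _, next := range adj[cur] {
			if !seen[next] {
				seen[next] = true
				queue = append(queue, next)
			}
		}
	}
	var out []string
	for _, n := range g.Nodes {
		if !seen[n.ID] {
			out = append(out, n.ID)
		}
	}
	return out
}

// sample is a made-up workflow; the "tool" type value is an assumption.
const sample = `{
  "entry_node_id": "start",
  "nodes": [
    {"id": "start", "type": "llm", "name": "Planner"},
    {"id": "step", "type": "llm", "name": "Worker"},
    {"id": "orphan", "type": "tool", "name": "Orphan"}
  ],
  "edges": [{"from": "start", "to": "step"}]
}`

func main() {
	g, err := ParseGraph([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Println(Unreachable(g)) // prints [orphan]
}
```

Because the check only touches `Nodes` and `Edges`, any parser that can produce this shape gets the rule for free.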
A static analyzer wins or loses on operational ergonomics (how disruptive it is to your CI), not just rule count. Honest current state:
| Operational dimension | Shingan v0.8.7 | What you'd need before flipping CI to fail-on-finding |
|---|---|---|
| Multi-framework (LangGraph / CrewAI / n8n / ADK-Go / JSON / Samurai) | ✅ | — |
| AST-based fallback (factory / instance-method / `@CrewBase` / Flow) | ✅ | — |
| GitHub Action + SARIF + Code Scanning integration | ✅ | — |
| MCP + LSP (Cursor / Claude Code / Neovim / VS Code / LangGraph Studio) | ✅ | — |
| Severity × Confidence two-axis model | ✅ | — |
| Diff mode (`--since main`) + `--baseline` JSON | ✅ | — |
| `// shingan:ignore` line / file comments | ✅ | — |
| Severity-policy-as-code (`.shingan.yaml`, per-rule / per-path) | ✅ | — |
| PR bot (inline comments on changed nodes) | ⏳ v0.10 | required for "informational → blocking" promotion |
| Org dashboard (cost / PII / cycle metrics over time) | ⏳ v0.10+ | required for AppSec / Platform team adoption |
| Public false-positive rate (measured against ≥100 OSS workflows) | ⏳ v0.9 | required for procurement / vendor evaluation |
| OWASP Agentic Top 10 — full mapping | ⏳ v0.9 | required for SOC 2 / ISO 42001 / enterprise auditors |
| Plugin SDK (community rules) | ✅ v0.9 (`experimental:`) | stability promise lands at v1.0 (ADR-010) |
So: today's recommended use is continue-on-error: true informational CI plus IDE feedback via the LSP. v0.9–v0.10 is closing the operational gap.
The OWASP Agentic AI Top 10 (2025) lists ten failure modes specific to agentic LLM systems. Static analysis can only catch the structural class of these — runtime observability tools (LangSmith, Langfuse) cover the rest. Today's coverage:
| OWASP Agentic Top 10 (2025) | Class | Shingan rule(s) | Status |
|---|---|---|---|
| AAI01 — Memory poisoning | runtime | (out of static scope) | ❌ runtime-only |
| AAI02 — Tool misuse | structural | `eval_missing`, `unbounded_tool_arg`, `secret_in_prompt_template` | ✅ partial |
| AAI03 — Privilege compromise | structural | `circular_dep_agents`, `dynamic_node_construction` | ✅ partial |
| AAI04 — Resource overload | structural | `loop_guard`, `retry_storm`, `cost_estimation`, `redundant_llm_call` | ✅ |
| AAI05 — Cascading hallucination amplification | runtime | (out of static scope) | ❌ runtime-only |
| AAI06 — Intent breaking & goal manipulation | structural | `prompt_injection_sink`, `temperature_misuse` | ✅ partial |
| AAI07 — Misaligned & deceptive behaviors | runtime | (out of static scope, evaluation-only) | ❌ runtime-only |
| AAI08 — Repudiation & untraceability | structural | `error_handler_checker`, `missing_eval_dataset` | ✅ partial |
| AAI09 — Identity spoofing & impersonation | runtime / config | `model_card_mismatch`, `deprecated_model` | 🟡 partial |
| AAI10 — Overwhelming human in the loop | structural | `cycle_detection`, `unreachable_node` | ✅ partial |
Roadmap to full structural coverage (everything but AAI01 / AAI05 / AAI07, which are runtime-class): v0.9 — see the v0.9 plan in shingan-adr.md.
Onion Architecture — dependencies flow inward only.
```
┌─────────────────────────────────────────────┐
│  cmd/  DI wiring + entry points             │
│  ┌───────────────────────────────────────┐  │
│  │  infrastructure/  concrete adapters   │  │
│  │  ┌─────────────────────────────────┐  │  │
│  │  │  application/  use cases        │  │  │
│  │  │  ┌───────────────────────────┐  │  │  │
│  │  │  │  domain/ zero external dep│  │  │  │
│  │  │  └───────────────────────────┘  │  │  │
│  │  └─────────────────────────────────┘  │  │
│  └───────────────────────────────────────┘  │
└─────────────────────────────────────────────┘
```
| Layer | Responsibility | Allowed dependencies |
|---|---|---|
| domain/ | WorkflowGraph, rules, entity definitions | Standard library only |
| application/ | AnalysisOrchestrator, interface definitions | domain/ |
| infrastructure/ | Parsers, reporters, factory implementations | application/, domain/ |
| cmd/ | CLIs, DI wiring | infrastructure/ |
Three Factory points
- `AnalyzerFactory`: registers and creates analysis rules (`domain.AnalysisRule`)
- `ParserFactory`: switches parsers by input format (`application.WorkflowParser`)
- `ReporterFactory`: switches reporters by output format (`application.ReportFormatter`)
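As a rough illustration of the factory idea, the sketch below shows a registry that maps rule IDs to constructors, so the wiring layer can create rules without the orchestrator importing concrete rule types. The interface method set, the `Finding` struct, and the `empty_graph` rule are hypothetical stand-ins; the real `domain.AnalysisRule` interface may look different.

```go
package main

import (
	"fmt"
	"sort"
)

// Finding and AnalysisRule are simplified stand-ins (assumptions);
// the real domain.AnalysisRule may have a different method set.
type Finding struct {
	RuleID  string
	Message string
}

type AnalysisRule interface {
	ID() string
	Analyze(nodeIDs []string) []Finding
}

// AnalyzerFactory maps rule IDs to constructors.
type AnalyzerFactory struct {
	ctors map[string]func() AnalysisRule
}

func NewAnalyzerFactory() *AnalyzerFactory {
	return &AnalyzerFactory{ctors: make(map[string]func() AnalysisRule)}
}

// Register adds (or replaces) the constructor for a rule ID.
func (f *AnalyzerFactory) Register(id string, ctor func() AnalysisRule) {
	f.ctors[id] = ctor
}

// CreateAll instantiates every registered rule in sorted ID order,
// so runs are deterministic despite Go's random map iteration.
func (f *AnalyzerFactory) CreateAll() []AnalysisRule {
	ids := make([]string, 0, len(f.ctors))
	for id := range f.ctors {
		ids = append(ids, id)
	}
	sort.Strings(ids)
	rules := make([]AnalysisRule, 0, len(ids))
	for _, id := range ids {
		rules = append(rules, f.ctors[id]())
	}
	return rules
}

// emptyGraphRule is a toy rule (hypothetical ID) that flags graphs with no nodes.
type emptyGraphRule struct{}

func (emptyGraphRule) ID() string { return "empty_graph" }

func (emptyGraphRule) Analyze(nodeIDs []string) []Finding {
	if len(nodeIDs) == 0 {
		return []Finding{{RuleID: "empty_graph", Message: "graph has no nodes"}}
	}
	return nil
}

func main() {
	factory := NewAnalyzerFactory()
	factory.Register("empty_graph", func() AnalysisRule { return emptyGraphRule{} })
	for _, rule := range factory.CreateAll() {
		fmt.Println(rule.ID(), len(rule.Analyze(nil)))
	}
}
```

The same pattern would apply to parsers keyed by input format and reporters keyed by output format.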
The fastest way to see Shingan working — runs a bundled sample workflow through the full pipeline and prints a real findings report. No input file needed.
```bash
npx --yes shingan-lint demo
# → exit 2, surfaces loop_guard (Critical) + unreachable_node (Warning) + more
```

Then run it on your own workflow JSON:

```bash
# A minimal but failing workflow, written to disk and analyzed in one shot.
cat > workflow.json <<'EOF'
{
  "entry_node_id": "start",
  "nodes": [
    {"id": "start", "type": "llm", "name": "Planner"},
    {"id": "loop", "type": "loop", "name": "Retry Loop"},
    {"id": "step", "type": "llm", "name": "Worker"}
  ],
  "edges": [
    {"from": "start", "to": "loop"},
    {"from": "loop", "to": "step"},
    {"from": "step", "to": "loop"}
  ]
}
EOF
npx --yes shingan-lint analyze --input workflow.json --output markdown
```

```bash
# one-shot run (no install)
npx --yes shingan-lint demo
# project-pinned
pnpm add -D shingan-lint
pnpm exec shingan demo
# global
npm install -g shingan-lint
shingan demo
```

shingan-lint is a thin Node wrapper. Its postinstall step downloads the platform-specific Go binary from GitHub Releases, verifies the SHA-256 checksum, and caches it under ~/.cache/shingan-lint/v<ver>/. Linux, macOS, and Windows on amd64 / arm64 are all supported.

```bash
# via go install
go install github.com/hatyibei/shingan/cmd/shingan@latest
shingan demo
```

```bash
# from source
git clone https://github.com/hatyibei/shingan.git
cd shingan
go build -o shingan ./cmd/shingan
./shingan demo
```

```bash
# via Docker
docker pull ghcr.io/hatyibei/shingan:latest
docker run --rm ghcr.io/hatyibei/shingan demo
```

JSON input (default format):

```bash
shingan analyze --input workflow.json --output markdown
```

ADK-Go input (directory of .go files):

```bash
shingan analyze --format adk-go --input ./agents/ --output markdown
```

LangGraph (Python) input (requires the analyzed project's deps to be importable):

```bash
# prerequisite: pip install langgraph (plus whatever your project imports)
shingan analyze --format langgraph --input agent.py --output markdown
shingan analyze --format langgraph --input ./agents/ --output sarif --output-file findings.sarif
```

Exit codes: 0 = info-only or clean, 1 = warnings present, 2 = at least one critical.
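For scripts that wrap the CLI, the exit-code contract can be read as "the highest severity wins". The Go sketch below restates that rule under the assumption of a simple three-level `Severity` type; the type and function names are illustrative and are not taken from Shingan's code.

```go
package main

import "fmt"

// Severity is a hypothetical three-level scale matching the README's wording.
type Severity int

const (
	Info Severity = iota
	Warning
	Critical
)

// ExitCode follows the documented contract:
// 0 = info-only or clean, 1 = warnings present, 2 = at least one critical.
func ExitCode(findings []Severity) int {
	code := 0
	for _, s := range findings {
		switch s {
		case Critical:
			return 2 // nothing can outrank a critical finding
		case Warning:
			code = 1
		}
	}
	return code
}

func main() {
	fmt.Println(
		ExitCode(nil),
		ExitCode([]Severity{Info, Warning}),
		ExitCode([]Severity{Warning, Critical}),
	) // prints 0 1 2
}
```

A CI wrapper that wants to block only on critical findings can therefore test for exit code 2 and treat 1 as informational.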
CI integration (GitHub Actions):

```yaml
- name: Shingan check
  run: shingan analyze --format adk-go --input ./agents/
```

Shingan is run against production LangGraph / CrewAI / n8n repos before each release. Zero Critical false positives across 12+ swept OSS repos at the latest release. Every Critical FP we ever surfaced has a regression fixture pinned in testdata/ and a "dogfood-driven" CHANGELOG entry.
| Repo | Framework | Findings | Critical FP | Notes |
|---|---|---|---|---|
| gpt-researcher (24K★) | LangGraph | 1 cycle_detection | 0 | Real bug → Issue #1766 |
| open_deep_research (7K★) | LangGraph | 9 → 1 | 0 | Real bug → Issue #269 |
| executive-ai-assistant (1K★) | LangGraph | 14 → 3 | 0 | v0.8.6 sentinel/router-Literal fix |
| company-researcher | LangGraph | 1 Critical FP → 0 | 0 | Triggered tools_condition builtin handling |
| DATAGEN (1.7K★) | LangGraph | 2 unreachable FP → 0 | 0 | Triggered v0.8.7 for-loop unrolling |
| Devyan | CrewAI | 3 unreachable FP → 0 | 0 | Triggered v0.8.7 agents-only fallback |
| swe-agent (630★) | LangGraph | 4 cycle_detection | 0 | Real bug → Issue #6 |
| SRAgent, open-multi-agent-canvas | LangGraph | 0 | 0 | Clean repos |
→ Full track record + reproducible accuracy benchmark: docs/benchmarks.md.
Idioms surfaced and supported during the v0.8.5+ dogfood loop:
- `Command(goto=...)` and `def fn() -> Command[Literal["a","b"]]` typed dispatch
- `add_conditional_edges("src", router_fn)` with omitted path_map; the router's `-> Literal[...]` annotation is read instead
- `add_conditional_edges` `path_map` as list (`["a","b"]`) and dict (`{"k":"a"}`)
- `add_edge([a, b], c)` fan-in form
- Multi-graph composition (`builder.add_node("section", section_builder.compile())`)
- LangGraph builtin routers (`tools_condition`, with hooks for adding more)
- `Literal[END, ...]` sentinel exit recognition (cycle severity downgrade Critical → Warning)
- Generic-exception fallback to AST when modules side-effect at import (`OpenAIError`, missing API keys, etc.)
Running the same scan on your fork:
```bash
shingan analyze --format langgraph --input path/to/graph.py --output markdown
```

examples/real/ ships three samples written against google.golang.org/adk v1.1.0. Shingan detects the following findings on each:
| Sample | Rule | Severity | What it catches |
|---|---|---|---|
| examples/real/infinite_loop.go | cycle_detection | Critical | loopagent.New without MaxIterations — unbounded loop |
| examples/real/unreachable.go | unreachable_node | Warning | orphan_analyzer not wired into the orchestrator's SubAgents |
| examples/real/missing_handler.go | error_handler_checker | Warning | planner calls browser_search but has no conditional branch for failure |
Run them:
```bash
shingan analyze --format adk-go --input examples/real/infinite_loop.go --output markdown
# exit code 2 (Critical)
shingan analyze --format adk-go --input examples/real/unreachable.go --output markdown
# exit code 1 (Warning)
shingan analyze --format adk-go --input examples/real/missing_handler.go --output markdown
# exit code 2 (Critical: loop_guard + Warning: error_handler_checker)
```

Notes on official ADK-Go SDK coverage:
- Supports the `loopagent.New(loopagent.Config{AgentConfig: agent.Config{SubAgents: ...}})` shape (v1.1.0)
- Detects `LlmAgent` / `SequentialAgent` / `LoopAgent` `New()` constructor patterns via AST
- Resolves tool nodes registered through `functiontool.New(Config{Name: "..."}, handler)` by following `Config.Name` and ident references
- Uses a `go/types` second pass to read `functiontool.New[TArgs, TResults](...)` generic arguments and infer Tool category from the `TArgs` struct fields (v0.2.0+, via `ParseFile`). This is how `missing_handler.go`'s `browser_search` tool is correctly detected.
- `error_handler_checker` also fires when an LLM node carries a Tool edge but has no conditional branch (LLM→Tool pattern in ADK-Go)
- ADK-Go SDK v1.1.0 requires `go 1.25.0+`; reflected in `go.mod`'s minimum version
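The AST-based detection described in these notes can be approximated with the standard `go/parser` and `go/ast` packages. The sketch below flags `loopagent.New(...)` calls whose inline config literal has no `MaxIterations` key. It is a deliberately narrow illustration: the embedded source snippet and the `UnguardedLoops` helper are made up for this example, and it handles only configs written as a literal directly in the call, which is far less than the real parser resolves.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// src is a made-up ADK-Go-style snippet. It is only parsed, never compiled,
// so the unresolved loopagent / agent identifiers are fine.
const src = `package agents

func build() {
	_, _ = loopagent.New(loopagent.Config{AgentConfig: agent.Config{Name: "unbounded"}})
	_, _ = loopagent.New(loopagent.Config{MaxIterations: 3, AgentConfig: agent.Config{Name: "bounded"}})
}
`

// UnguardedLoops returns the source line of every loopagent.New call whose
// first argument is a composite literal without a MaxIterations key.
func UnguardedLoops(source string) ([]int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "agents.go", source, 0)
	if err != nil {
		return nil, err
	}
	var lines []int
	ast.Inspect(file, func(n ast.Node) bool {
		call, ok := n.(*ast.CallExpr)
		if !ok || len(call.Args) == 0 {
			return true
		}
		sel, ok := call.Fun.(*ast.SelectorExpr)
		if !ok || sel.Sel.Name != "New" {
			return true
		}
		pkg, ok := sel.X.(*ast.Ident)
		if !ok || pkg.Name != "loopagent" {
			return true
		}
		lit, ok := call.Args[0].(*ast.CompositeLit)
		if !ok {
			return true // config built elsewhere: out of scope for this sketch
		}
		for _, elt := range lit.Elts {
			kv, ok := elt.(*ast.KeyValueExpr)
			if !ok {
				continue
			}
			if key, ok := kv.Key.(*ast.Ident); ok && key.Name == "MaxIterations" {
				return true // guard present, nothing to report
			}
		}
		lines = append(lines, fset.Position(call.Pos()).Line)
		return true
	})
	return lines, nil
}

func main() {
	lines, err := UnguardedLoops(src)
	if err != nil {
		panic(err)
	}
	fmt.Println(lines) // prints [4]
}
```

A purely syntactic pass like this cannot see configs assembled across statements, which is presumably why the notes above mention a `go/types` second pass for the harder cases.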
```bash
# E2E auto-verification under the demo build tag
go test -tags=demo -v -run TestDemo_ .
```

| Rule ID | Detects | Max Severity | Confidence |
|---|---|---|---|
| cycle_detection | Cycles among non-Loop nodes; cycles inside `LoopAgent` scope | Critical | 1.0 (deterministic) |
| loop_guard | LoopAgent (Loop type) without `MaxIterations` set | Critical | 1.0 (deterministic) |
| unreachable_node | LLM/Tool nodes unreachable from the entry node | Warning | 1.0 (deterministic) |
| error_handler_checker | Missing error handling after external-I/O nodes | Critical | 0.8 (heuristic) |
| cost_estimation | Expensive LLM models inside loops; expensive models on trivial tasks | Warning | 0.7 (price drifts) |
| redundant_llm_call | Duplicate calls with the same `(prompt_template, model)` | Warning | 0.9 (exact match) |
| pii_leak_scanner | Path from RAG/PII source to external sink with no human gate | Warning | 0.6 (RAG) / 0.3 (name hint) |
| secret_exposure_scanner | Hardcoded API keys / secrets in `Node.Config` | Critical | 0.95 (Critical/Warning) / 0.5 (Info) |
| max_parallel_branches | A single node's fan-out (outgoing edge count) exceeds the threshold | Critical | 1.0 (Critical) / 0.9 (Warning) / 0.7 (Info) |
| deprecated_model | Shutdown or soon-to-be-deprecated LLM model names (OpenAI / Anthropic / Google) | Critical | 1.0 (shutdown) / 0.9 (deprecated soon) |
| temperature_misuse | LLM with `temperature > 0` paired with a deterministic task signature | Warning | 0.9 / 0.7 / 0.5 |
| model_card_mismatch | LLM whose declared model disagrees with `provider` / `base_url` | Critical | 1.0 (known prefix) / 0.4 (unknown) |
| prompt_injection_sink | user_input reaches an LLM system-prompt template (substitution → Critical / no substitution → Warning / non-system → Info) | Critical | 0.9 / 0.7 / 0.5 |
| eval_missing | LLM output reaches a code-execution tool (no validation → Critical / Condition gate → Warning / Human gate → skip) | Critical | 0.9 / 0.6 |
| dynamic_node_construction | `eval(` / `exec(` / `Function(` / etc. inside `Node.Config` (body/fn/handler/...) | Critical | 0.95 / 0.85 / 0.6 |
| retry_storm | Tool retry × parallelism = blast radius (≥100 → Critical, ≥30 → Warning, ≥10 → Info) | Critical | 0.9 / 0.7 / 0.5 |
| circular_dep_agents | Multi-agent A→B→A delegation cycle | Warning | 0.85 / 0.75 / 0.6 |
| unbounded_tool_arg | Tool argument schema fields without `maxLength` / `maxItems` / `maximum` | Warning | 0.7 / 0.5 / 0.4 |
| secret_in_prompt_template | Hardcoded credentials inside LLM prompt templates | Critical | 0.95 (exact) / 0.7 (JWT) |
| missing_eval_dataset | Production-flagged graph without an `eval_dataset` reference | Warning | 0.7 |
| human_gate_missing | Production-flagged graph that performs sensitive external actions (API / code-exec / send / payment …) with no Human approval node anywhere | Warning | 0.6 (heuristic) |
| tool_description_missing | Tool node lacking a usable description (LLM picks tools from description text; missing → wrong-tool selection) | Info | 0.6 (heuristic) |
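To give a feel for what a deterministic graph rule involves, here is a minimal Go sketch of cycle detection by depth-first search with three-color marking, run on the edges of the quick-start sample workflow (start → loop → step → loop). The `FindCycle` function and the adjacency-map representation are assumptions made for this example. Unlike the `cycle_detection` rule in the table, the sketch does not distinguish Loop-typed nodes and only reports cycles reachable from the entry node.

```go
package main

import "fmt"

// FindCycle returns one cycle as a list of node IDs (first == last),
// or nil if no cycle is reachable from entry.
func FindCycle(entry string, adj map[string][]string) []string {
	const (
		white = iota // not visited yet (the map's zero value)
		gray         // on the current DFS path
		black        // fully explored
	)
	color := map[string]int{}
	var path []string
	var visit func(id string) []string
	visit = func(id string) []string {
		color[id] = gray
		path = append(path, id)
		for _, next := range adj[id] {
			switch color[next] {
			case gray:
				// Back edge: the cycle is the path suffix starting at next.
				for i, p := range path {
					if p == next {
						cycle := append([]string{}, path[i:]...)
						return append(cycle, next)
					}
				}
			case white:
				if c := visit(next); c != nil {
					return c
				}
			}
		}
		color[id] = black
		path = path[:len(path)-1]
		return nil
	}
	return visit(entry)
}

func main() {
	// Edges of the quick-start sample workflow.
	adj := map[string][]string{
		"start": {"loop"},
		"loop":  {"step"},
		"step":  {"loop"},
	}
	fmt.Println(FindCycle("start", adj)) // prints [loop step loop]
}
```

Because the result depends only on the graph structure, a check of this kind can justify a confidence of 1.0, in contrast to the heuristic rules listed above.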
| Format | Status | Notes |
|---|---|---|
| langgraph | Phase 1 primary (ADR-011) | Extracts Python langgraph.graph.StateGraph via long-lived Python subprocess + JSON-RPC. Requires pip install langgraph (details) |
| adk-go | GA / maintained | AST analysis of Google ADK-Go (google.golang.org/adk) |
| json | GA | Shingan's native WorkflowGraph JSON |
| samurai | Alpha | Generic JSON-schema adapter for GUI workflow editors (extension example) |
| n8n | Beta | n8n workflow JSON export, pure Go (no Python / Node bridge) (details) |
| crewai | Beta | CrewAI Crew/Agent/Task definitions via Python long-lived subprocess + JSON-RPC. Requires pip install "crewai>=0.50.0" (details) |
| langgraph-js | Experimental (PoC, v0.9) | LangGraph.js (@langchain/langgraph), TypeScript/JS, via the TypeScript Compiler API in a Node shim. Needs node; installs typescript on first use |
| pydantic-graph | Experimental (PoC, v0.9) | pydantic-ai pydantic_graph (BaseNode.run() return types), AST-only Python shim |
| llamaindex | Experimental (PoC, v0.9) | LlamaIndex Workflows (event-driven @step methods), AST-only Python shim |
| autogen | Experimental (PoC, v0.9) | Microsoft AutoGen GraphFlow (DiGraphBuilder), AST-only Python shim |
| mastra | Experimental (PoC, v0.9) | Mastra (mastra.ai) TS workflows (createWorkflow().then()…), TypeScript Compiler API in a Node shim. Needs node |
| openai-agents | Experimental (PoC, v0.9) | OpenAI Agents SDK (@openai/agents) multi-agent definitions: new Agent({ handoffs }) / Agent.create(...) handoffs (and agent.asTool()) extracted as directed edges, TypeScript Compiler API in a Node shim. Needs node |
The six v0.9 parsers are PoC-level: each handles idiomatic workflows of its framework. Coverage limits (dynamic graphs, closure branch conditions, split-builder chains, annotation-only return types) are documented per format.
| Integration | Status | Notes |
|---|---|---|
| CLI (`shingan analyze`) | GA | Core experience, `--since` / `--baseline` supported |
| GitHub Action | GA | action.yml, emits SARIF for GitHub Code Scanning |
| MCP server (`shingan-mcp`) | GA | Callable from Claude Desktop / Cursor / LangGraph Studio |
| LSP server (`shingan-lsp`) | Beta | VS Code / Cursor / Neovim / Helix / Zed / JetBrains. SHA-256 LRU diff cache + degraded mode (ADR-009). See docs/lsp.md |
| VS Code extension (`vscode-shingan`) | Beta | extensions/vscode-shingan/, spawns shingan-lsp |
| Format | Content type | Use |
|---|---|---|
| json | application/json | API response, program-to-program |
| markdown | text/markdown | CLI, human-readable reports |
| sarif | application/sarif+json | GitHub Code Scanning integration |
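For readers unfamiliar with SARIF, the sketch below builds the smallest useful SARIF 2.1.0 log in Go: a `version`, one `run` with a `tool.driver.name`, and a `results` array carrying `ruleId`, `level`, and `message.text`. The Go struct names, the `Finding` type, and the severity-to-level mapping (Critical → `error`, Warning → `warning`, Info → `note`) are assumptions for illustration; Shingan's actual SARIF output may map and enrich these differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Minimal subset of the SARIF 2.1.0 object model.
type sarifLog struct {
	Version string     `json:"version"`
	Runs    []sarifRun `json:"runs"`
}

type sarifRun struct {
	Tool    sarifTool     `json:"tool"`
	Results []sarifResult `json:"results"`
}

type sarifTool struct {
	Driver sarifDriver `json:"driver"`
}

type sarifDriver struct {
	Name string `json:"name"`
}

type sarifResult struct {
	RuleID  string       `json:"ruleId"`
	Level   string       `json:"level"`
	Message sarifMessage `json:"message"`
}

type sarifMessage struct {
	Text string `json:"text"`
}

// Finding is a hypothetical analyzer result.
type Finding struct {
	RuleID, Severity, Message string
}

// levelFor maps a severity name to a SARIF level (assumed mapping).
func levelFor(severity string) string {
	switch severity {
	case "Critical":
		return "error"
	case "Warning":
		return "warning"
	default:
		return "note"
	}
}

// BuildLog wraps the findings in a single-run SARIF log.
func BuildLog(findings []Finding) sarifLog {
	results := make([]sarifResult, 0, len(findings))
	for _, f := range findings {
		results = append(results, sarifResult{
			RuleID:  f.RuleID,
			Level:   levelFor(f.Severity),
			Message: sarifMessage{Text: f.Message},
		})
	}
	return sarifLog{
		Version: "2.1.0",
		Runs: []sarifRun{{
			Tool:    sarifTool{Driver: sarifDriver{Name: "shingan"}},
			Results: results,
		}},
	}
}

func main() {
	log := BuildLog([]Finding{
		{RuleID: "loop_guard", Severity: "Critical", Message: "loop without MaxIterations"},
	})
	out, err := json.MarshalIndent(log, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

A production log would also carry `locations` so that GitHub Code Scanning can annotate the right file and line; that part is omitted here.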
Through v1.0 we commit to no breaking changes in the following public surfaces. The version line below each item is the floor — breaking changes will not happen earlier than the listed major bump.
- Rule names + IDs (`cycle_detection`, `pii_leak_scanner`, …): stable through v1.0. Rules may be added; existing IDs will not be renamed or repurposed.
- `.shingan.yaml` policy schema: semver-pinned. Additive keys only through v1.0.
- SARIF output structure: conforms to SARIF 2.1.0; shingan-specific extensions live in `properties` and are append-only through v1.0.
- CLI flags (`shingan analyze --format / --input / --output / --min-confidence / --baseline / --since`): stable through v1.0. New flags will be added; existing flags will not change semantics.
- Exit codes (0 = clean, 1 = Warning, 2 = Critical): stable through v2.0.

Plugin SDK (when it ships in v0.9+) is gated behind an `experimental:` prefix until v1.0 and explicitly not covered by this commitment.
Static analysis is only useful if developers can trust the output. Shingan tracks Critical false positives as a load-bearing quality metric — we have ended every release sweep at 0 Critical FP since v0.7 (see docs/benchmarks.md for the per-release breakdown).
If you hit a false positive:
- Open an Issue using the false-positive template: paste the repo URL or minimal repro, the `shingan analyze` output, and why you believe it's wrong.
- Critical FP: triaged within 24h on weekdays (best-effort), fix + regression fixture land in the next release with a `dogfood-driven` CHANGELOG entry.
- Warning / Info FP: same template, triaged within 1 week.
- The fix is always a parser/shim precision improvement or a confidence-rule tweak — never silently muting the finding, never moving it to a deny-list.
Every dogfood-driven FP fix shipped since v0.5 is listed in docs/benchmarks.md § Dogfood-driven shim improvements. v0.8.7 alone closed two FPs surfaced by 1.7K-star and ~300-star LangGraph/CrewAI projects.
- v0.1–v0.5 (Apr 2026): JSON / ADK-Go / Samurai parsers, Confidence × Severity 2-axis, SARIF / GitHub Action, 9 rules ✓
- v0.6 (May 2026): ESLint-style visitor + 3-tier split (ADR-006/007), shingan-lsp, shingan-mcp, LangGraph parser, 20 rules, `shingan-lint` npm distribution, tag→release→npm-publish automation ✓
- v0.7 (May 2026): n8n parser (pure Go, JSON DSL), bilingual EN/JA docs ✓
- v0.8 (May 2026): CrewAI parser (Python shim, reuses LangGraph PythonWorker), 6 frameworks total ✓
- v0.9+: Mastra parser (TypeScript bridge), 30+ rules, Plugin SDK preview, official site + demo video
- v1.0: 5+ frameworks × 25+ rules, Plugin SDK GA, Marketplace listing
```bash
go test ./...
go vet ./...
go build -o shingan ./cmd/shingan
make lint   # check-confidence-reason (go/analysis) + go vet
```

When adding a new rule, see docs/rule-authoring.md.
- Architecture
- Rule authoring guide (internal)
- Framework parsers: LangGraph · CrewAI · n8n
- Case studies (real OSS dogfood): crewAI-examples · n8n community workflows · gpt-researcher — index
- LSP server (`shingan-lsp`): VS Code / Neovim / Helix / Zed setup
- MCP server (`shingan-mcp`): Claude Desktop / Cursor / LangGraph Studio setup
- SARIF output + GitHub Code Scanning integration
- diff mode + baseline (`--since` / `--baseline`)
- Confidence scoring
- cycle_detection technical note
- All ADRs (001–013)
Contributors implementing new builtin rules should start with docs/rule-authoring.md. It covers the Local / Path / Global three-tier templates (ADR-007), the ConfidenceReason selection guide (ADR-008), the check-confidence-reason go/analysis linter, TDD patterns, and design notes for every existing rule.

External rule authors who want to ship rules from their own repo (no fork required) should read docs/plugin-sdk.md. The public plugin.Register API shipped in v0.9 with an experimental: prefix requirement; the stability promise on the ABI lands at v1.0. ADR-010 originally deferred all external exposure to v1.0; the v0.9 implementation supersedes that with the prefix-gated early-access path.
MIT