Skip to content

hatyibei/shingan

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

235 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

🌐 Language: English | 日本語

Shingan (心眼)

Your agent can spend money, leak data, and call tools before you notice. Shingan catches dangerous workflow structures before runtime.

Go version License CI npm

Status: Beta. Shingan is under active development; v1.0 is targeted for late 2026 once baseline / ignore / severity-policy / PR-bot land. Not yet recommended for production-critical CI gating — informational use (continue-on-error: true) is the recommended integration mode today.

A Go-based static analyzer for AI agent workflows. It catches dangerous structures — infinite loops, unreachable nodes, missing error handlers, runaway cost paths, prompt-injection sinks, PII leak paths, code-execution from LLM output — before the workflow ever runs.

Why Shingan

LLM orchestration is mainstream now, but the "design-time bug" detection layer is missing. Runtime observability (LangSmith / Langfuse) only tells you a thing happened after it cost you money or leaked data. n8n-only linters (FlowLint) miss everything else. Shingan inspects the workflow graph before execution, across LangGraph, CrewAI, ADK-Go, n8n, and custom JSON DSLs.

AI agents are unforgiving once they fan out: external API calls, browser automation, and code execution all leave irreversible side effects. Catching infinite loops, unreachable nodes, missing error handlers, PII leak paths, and prompt-injection sinks before deploy prevents the majority of cost blowups and incidents.

Every workflow framework reduces to the same primitive: a directed graph of nodes and edges. Shingan keeps that intermediate representation (IR) at the center of an Onion Architecture and runs 20+ rules that are framework-agnostic.

Where Shingan stands today

A static analyzer wins or loses on operational ergonomics (how disruptive it is to your CI), not just rule count. Honest current state:

Operational dimension Shingan v0.8.7 What you'd need before flipping CI to fail-on-finding
Multi-framework (LangGraph / CrewAI / n8n / ADK-Go / JSON / Samurai)
AST-based fallback (factory / instance-method / @CrewBase / Flow)
GitHub Action + SARIF + Code Scanning integration
MCP + LSP (Cursor / Claude Code / Neovim / VS Code / LangGraph Studio)
Severity × Confidence two-axis model
Diff mode (--since main) + --baseline JSON
// shingan:ignore line / file comments
Severity-policy-as-code (.shingan.yaml, per-rule / per-path)
PR bot (inline comments on changed nodes) ⏳ v0.10 required for "informational → blocking" promotion
Org dashboard (cost / PII / cycle metrics over time) ⏳ v0.10+ required for AppSec / Platform team adoption
Public false-positive rate (measured against ≥100 OSS workflows) ⏳ v0.9 required for procurement / vendor evaluation
OWASP Agentic Top 10 — full mapping ⏳ v0.9 required for SOC 2 / ISO 42001 / enterprise auditors
Plugin SDK (community rules) ✅ v0.9 (experimental:) stability promise lands at v1.0 (ADR-010)

So: today's recommended use is continue-on-error: true informational CI plus IDE feedback via the LSP. v0.9–v0.10 is closing the operational gap.

OWASP Agentic AI — Top 10 (2025) coverage

The OWASP Agentic AI Top 10 (2025) lists ten failure modes specific to agentic LLM systems. Static analysis can only catch the structural class of these — runtime observability tools (LangSmith, Langfuse) cover the rest. Today's coverage:

OWASP Agentic Top 10 (2025) Class Shingan rule(s) Status
AAI01 — Memory poisoning runtime (out of static scope) ❌ runtime-only
AAI02 — Tool misuse structural eval_missing, unbounded_tool_arg, secret_in_prompt_template ✅ partial
AAI03 — Privilege compromise structural circular_dep_agents, dynamic_node_construction ✅ partial
AAI04 — Resource overload structural loop_guard, retry_storm, cost_estimation, redundant_llm_call
AAI05 — Cascading hallucination amplification runtime (out of static scope) ❌ runtime-only
AAI06 — Intent breaking & goal manipulation structural prompt_injection_sink, temperature_misuse ✅ partial
AAI07 — Misaligned & deceptive behaviors runtime (out of static scope, evaluation-only) ❌ runtime-only
AAI08 — Repudiation & untraceability structural error_handler_checker, missing_eval_dataset ✅ partial
AAI09 — Identity spoofing & impersonation runtime / config model_card_mismatch, deprecated_model 🟡 partial
AAI10 — Overwhelming human in the loop structural cycle_detection, unreachable_node ✅ partial

Roadmap to full structural coverage (everything but AAI01 / AAI05 / AAI07, which are runtime-class): v0.9 — see the v0.9 plan in shingan-adr.md.

Architecture

Onion Architecture — dependencies flow inward only.

┌─────────────────────────────────────────────┐
│  cmd/          DI wiring + entry points      │
│  ┌───────────────────────────────────────┐  │
│  │  infrastructure/   concrete adapters   │  │
│  │  ┌─────────────────────────────────┐  │  │
│  │  │  application/   use cases        │  │  │
│  │  │  ┌───────────────────────────┐  │  │  │
│  │  │  │  domain/  zero external dep│  │  │  │
│  │  │  └───────────────────────────┘  │  │  │
│  │  └─────────────────────────────────┘  │  │
│  └───────────────────────────────────────┘  │
└─────────────────────────────────────────────┘
Layer Responsibility Allowed dependencies
domain/ WorkflowGraph, rules, entity definitions Standard library only
application/ AnalysisOrchestrator, interface definitions domain/
infrastructure/ Parsers, reporters, factory implementations application/, domain/
cmd/ CLIs, DI wiring infrastructure/

Three Factory points

  • AnalyzerFactory — registers and creates analysis rules (domain.AnalysisRule)
  • ParserFactory — switches parsers by input format (application.WorkflowParser)
  • ReporterFactory — switches reporters by output format (application.ReportFormatter)

Quick start (30 seconds, no setup)

The fastest way to see Shingan working — runs a bundled sample workflow through the full pipeline and prints a real findings report. No input file needed.

npx --yes shingan-lint demo
# → exit 2, surfaces loop_guard (Critical) + unreachable_node (Warning) + more

Then run it on your own workflow JSON:

# A minimal but failing workflow, written to disk and analyzed in one shot.
cat > workflow.json <<'EOF'
{
  "entry_node_id": "start",
  "nodes": [
    {"id": "start", "type": "llm",  "name": "Planner"},
    {"id": "loop",  "type": "loop", "name": "Retry Loop"},
    {"id": "step",  "type": "llm",  "name": "Worker"}
  ],
  "edges": [
    {"from": "start", "to": "loop"},
    {"from": "loop",  "to": "step"},
    {"from": "step",  "to": "loop"}
  ]
}
EOF
npx --yes shingan-lint analyze --input workflow.json --output markdown

Install

npm (recommended, zero setup)

# one-shot run (no install)
npx --yes shingan-lint demo

# project-pinned
pnpm add -D shingan-lint
pnpm exec shingan demo

# global
npm install -g shingan-lint
shingan demo

shingan-lint is a thin Node wrapper. Its postinstall step downloads the platform-specific Go binary from GitHub Releases, verifies the SHA-256 checksum, and caches it under ~/.cache/shingan-lint/v<ver>/. Linux, macOS, and Windows on amd64 / arm64 are all supported.

Go install (Go developers)

go install github.com/hatyibei/shingan/cmd/shingan@latest
shingan demo

Build from source

git clone https://github.com/hatyibei/shingan.git
cd shingan
go build -o shingan ./cmd/shingan
./shingan demo

Docker

docker pull ghcr.io/hatyibei/shingan:latest
docker run --rm ghcr.io/hatyibei/shingan demo

Usage

JSON input (default format):

shingan analyze --input workflow.json --output markdown

ADK-Go input (directory of .go files):

shingan analyze --format adk-go --input ./agents/ --output markdown

LangGraph (Python) input — requires the analyzed project's deps to be importable:

# prerequisite: pip install langgraph   (plus whatever your project imports)
shingan analyze --format langgraph --input agent.py --output markdown
shingan analyze --format langgraph --input ./agents/ --output sarif --output-file findings.sarif

Exit codes: 0 = info-only or clean, 1 = warnings present, 2 = at least one critical.

CI integration (GitHub Actions):

- name: Shingan check
  run: shingan analyze --format adk-go --input ./agents/

Real-world dogfood (v0.5.0 → v0.8.7)

Shingan is run against production LangGraph / CrewAI / n8n repos before each release. Zero Critical false positives across 12+ swept OSS at the latest release. Every Critical FP we ever surfaced has a regression fixture pinned in testdata/ and a "dogfood-driven" CHANGELOG entry.

Repo Framework Findings Critical FP Notes
gpt-researcher (24K★) LangGraph 1 cycle_detection 0 Real bug → Issue #1766
open_deep_research (7K★) LangGraph 9 → 1 0 Real bug → Issue #269
executive-ai-assistant (1K★) LangGraph 14 → 3 0 v0.8.6 sentinel/router-Literal fix
company-researcher LangGraph 1 Critical FP → 0 0 Triggered tools_condition builtin handling
DATAGEN (1.7K★) LangGraph 2 unreachable FP → 0 0 Triggered v0.8.7 for-loop unrolling
Devyan CrewAI 3 unreachable FP → 0 0 Triggered v0.8.7 agents-only fallback
swe-agent (630★) LangGraph 4 cycle_detection 0 Real bug → Issue #6
SRAgent, open-multi-agent-canvas LangGraph 0 0 Clean repos

→ Full track record + reproducible accuracy benchmark: docs/benchmarks.md.

Idioms surfaced and supported during the v0.8.5+ dogfood loop:

  • Command(goto=...) and def fn() -> Command[Literal["a","b"]] typed dispatch
  • add_conditional_edges("src", router_fn) with omitted path_map — router's -> Literal[...] annotation read instead
  • add_conditional_edges path_map as list (["a","b"]) and dict ({"k":"a"})
  • add_edge([a, b], c) fan-in form
  • Multi-graph composition (builder.add_node("section", section_builder.compile()))
  • LangGraph builtin routers (tools_condition, with hooks for adding more)
  • Literal[END, ...] sentinel exit recognition (cycle severity downgrade Critical → Warning)
  • Generic-exception fallback to AST when modules side-effect at import (OpenAIError, missing API keys, etc)

Running the same scan on your fork:

shingan analyze --format langgraph --input path/to/graph.py --output markdown

Demo on real ADK-Go samples

examples/real/ ships three samples written against google.golang.org/adk v1.1.0. Shingan detects the following findings on each:

Sample Rule Severity What it catches
examples/real/infinite_loop.go cycle_detection Critical loopagent.New without MaxIterations — unbounded loop
examples/real/unreachable.go unreachable_node Warning orphan_analyzer not wired into the orchestrator's SubAgents
examples/real/missing_handler.go error_handler_checker Warning planner calls browser_search but has no conditional branch for failure

Run them:

shingan analyze --format adk-go --input examples/real/infinite_loop.go --output markdown
# exit code 2 (Critical)

shingan analyze --format adk-go --input examples/real/unreachable.go --output markdown
# exit code 1 (Warning)

shingan analyze --format adk-go --input examples/real/missing_handler.go --output markdown
# exit code 2 (Critical: loop_guard + Warning: error_handler_checker)

Notes on official ADK-Go SDK coverage:

  • Supports the loopagent.New(loopagent.Config{AgentConfig: agent.Config{SubAgents: ...}}) shape (v1.1.0)
  • Detects LlmAgent / SequentialAgent / LoopAgent New() constructor patterns via AST
  • Resolves tool nodes registered through functiontool.New(Config{Name: "..."}, handler) by following Config.Name and ident references
  • Uses a go/types second pass to read functiontool.New[TArgs, TResults](...) generic arguments and infer Tool category from the TArgs struct fields (v0.2.0+, via ParseFile). This is how missing_handler.go's browser_search tool is correctly detected.
  • error_handler_checker also fires when an LLM node carries a Tool edge but has no conditional branch (LLM→Tool pattern in ADK-Go)
  • ADK-Go SDK v1.1.0 requires go 1.25.0+; reflected in go.mod's minimum version
# E2E auto-verification under the demo build tag
go test -tags=demo -v -run TestDemo_ .

Rules

Rule ID Detects Max Severity Confidence
cycle_detection Cycles among non-Loop nodes; cycles inside LoopAgent scope Critical 1.0 (deterministic)
loop_guard LoopAgent (Loop type) without MaxIterations set Critical 1.0 (deterministic)
unreachable_node LLM/Tool nodes unreachable from the entry node Warning 1.0 (deterministic)
error_handler_checker Missing error handling after external-I/O nodes Critical 0.8 (heuristic)
cost_estimation Expensive LLM models inside loops; expensive models on trivial tasks Warning 0.7 (price drifts)
redundant_llm_call Duplicate calls with the same (prompt_template, model) Warning 0.9 (exact match)
pii_leak_scanner Path from RAG/PII source to external sink with no human gate Warning 0.6 (RAG) / 0.3 (name hint)
secret_exposure_scanner Hardcoded API keys / secrets in Node.Config Critical 0.95 (Critical/Warning) / 0.5 (Info)
max_parallel_branches A single node's fan-out (outgoing edge count) exceeds the threshold Critical 1.0 (Critical) / 0.9 (Warning) / 0.7 (Info)
deprecated_model Shutdown or soon-to-be-deprecated LLM model names (OpenAI / Anthropic / Google) Critical 1.0 (shutdown) / 0.9 (deprecated soon)
temperature_misuse LLM with temperature > 0 paired with a deterministic task signature Warning 0.9 / 0.7 / 0.5
model_card_mismatch LLM whose declared model disagrees with provider / base_url Critical 1.0 (known prefix) / 0.4 (unknown)
prompt_injection_sink user_input reaches an LLM system-prompt template (substitution → Critical / no substitution → Warning / non-system → Info) Critical 0.9 / 0.7 / 0.5
eval_missing LLM output reaches a code-execution tool (no validation → Critical / Condition gate → Warning / Human gate → skip) Critical 0.9 / 0.6
dynamic_node_construction eval(/exec(/Function(/etc. inside Node.Config (body/fn/handler/...) Critical 0.95 / 0.85 / 0.6
retry_storm Tool retry × parallelism = blast radius (≥100 → Critical, ≥30 → Warning, ≥10 → Info) Critical 0.9 / 0.7 / 0.5
circular_dep_agents Multi-agent A→B→A delegation cycle Warning 0.85 / 0.75 / 0.6
unbounded_tool_arg Tool argument schema fields without maxLength / maxItems / maximum Warning 0.7 / 0.5 / 0.4
secret_in_prompt_template Hardcoded credentials inside LLM prompt templates Critical 0.95 (exact) / 0.7 (JWT)
missing_eval_dataset Production-flagged graph without an eval_dataset reference Warning 0.7
human_gate_missing Production-flagged graph that performs sensitive external actions (API / code-exec / send / payment …) with no Human approval node anywhere Warning 0.6 (heuristic)
tool_description_missing Tool node lacking a usable description (LLM picks tools from description text; missing → wrong-tool selection) Info 0.6 (heuristic)

Supported formats

Input

Format Status Notes
langgraph Phase 1 primary (ADR-011) Extracts Python langgraph.graph.StateGraph via long-lived Python subprocess + JSON-RPC. Requires pip install langgraph (details)
adk-go GA / maintained AST analysis of Google ADK-Go (google.golang.org/adk)
json GA Shingan's native WorkflowGraph JSON
samurai Alpha Generic JSON-schema adapter for GUI workflow editors (extension example)
n8n Beta n8n workflow JSON export, pure Go (no Python / Node bridge) (details)
crewai Beta CrewAI Crew/Agent/Task definitions via Python long-lived subprocess + JSON-RPC. Requires pip install "crewai>=0.50.0" (details)
langgraph-js Experimental (PoC, v0.9) LangGraph.js (@langchain/langgraph), TypeScript/JS, via the TypeScript Compiler API in a Node shim. Needs node; installs typescript on first use
pydantic-graph Experimental (PoC, v0.9) pydantic-ai pydantic_graph (BaseNode.run() return types), AST-only Python shim
llamaindex Experimental (PoC, v0.9) LlamaIndex Workflows (event-driven @step methods), AST-only Python shim
autogen Experimental (PoC, v0.9) Microsoft AutoGen GraphFlow (DiGraphBuilder), AST-only Python shim
mastra Experimental (PoC, v0.9) Mastra (mastra.ai) TS workflows (createWorkflow().then()…), TypeScript Compiler API in a Node shim. Needs node
openai-agents Experimental (PoC, v0.9) OpenAI Agents SDK (@openai/agents) multi-agent definitions: new Agent({ handoffs }) / Agent.create(...) handoffs (and agent.asTool()) extracted as directed edges, TypeScript Compiler API in a Node shim. Needs node

The six v0.9 parsers are PoC-level: each handles idiomatic workflows of its framework. Coverage limits (dynamic graphs, closure branch conditions, split-builder chains, annotation-only return types) are documented per format.

IDE / editor integrations

Integration Status Notes
CLI (shingan analyze) GA Core experience, --since / --baseline supported
GitHub Action GA action.yml, emits SARIF for GitHub Code Scanning
MCP server (shingan-mcp) GA Callable from Claude Desktop / Cursor / LangGraph Studio
LSP server (shingan-lsp) Beta VS Code / Cursor / Neovim / Helix / Zed / JetBrains. SHA-256 LRU diff cache + degraded mode (ADR-009). See docs/lsp.md
VS Code extension (vscode-shingan) Beta extensions/vscode-shingan/, spawns shingan-lsp

Output

Format Content type Use
json application/json API response, program-to-program
markdown text/markdown CLI, human-readable reports
sarif application/sarif+json GitHub Code Scanning integration

Stability commitment

Through v1.0 we commit to no breaking changes in the following public surfaces. The version line below each item is the floor — breaking changes will not happen earlier than the listed major bump.

  • Rule names + IDs (cycle_detection, pii_leak_scanner, …) — stable through v1.0. Rules may be added; existing IDs will not be renamed or repurposed.
  • .shingan.yaml policy schema — semver-pinned. Additive keys only through v1.0.
  • SARIF output structure — conforms to SARIF 2.1.0; shingan-specific extensions live in properties and are append-only through v1.0.
  • CLI flags (shingan analyze --format / --input / --output / --min-confidence / --baseline / --since) — stable through v1.0. New flags will be added; existing flags will not change semantics.
  • Exit codes (0 = clean, 1 = Warning, 2 = Critical) — stable through v2.0.

Plugin SDK (when it ships in v0.9+) is gated behind an experimental: prefix until v1.0 and explicitly not covered by this commitment.

Zero-FP guarantee

Static analysis is only useful if developers can trust the output. Shingan tracks Critical false positives as a load-bearing quality metric — we have ended every release sweep at 0 Critical FP since v0.7 (see docs/benchmarks.md for the per-release breakdown).

If you hit a false positive:

  1. Open an Issue using the false-positive template — paste the repo URL or minimal repro, the shingan analyze output, and why you believe it's wrong.
  2. Critical FP: triaged within 24h on weekdays (best-effort), fix + regression fixture land in the next release with a dogfood-driven CHANGELOG entry.
  3. Warning / Info FP: same template, triaged within 1 week.
  4. The fix is always a parser/shim precision improvement or a confidence-rule tweak — never silently muting the finding, never moving it to a deny-list.

Every dogfood-driven FP fix shipped since v0.5 is listed in docs/benchmarks.md § Dogfood-driven shim improvements. v0.8.7 alone closed two FPs surfaced by 1.7K-star and ~300-star LangGraph/CrewAI projects.

Roadmap

  • v0.1〜v0.5 (Apr 2026): JSON / ADK-Go / Samurai parsers, Confidence × Severity 2-axis, SARIF / GitHub Action, 9 rules ✓
  • v0.6 (May 2026): ESLint-style visitor + 3-tier split (ADR-006/007), shingan-lsp, shingan-mcp, LangGraph parser, 20 rules, shingan-lint npm distribution, tag→release→npm-publish automation ✓
  • v0.7 (May 2026): n8n parser (pure Go, JSON DSL), bilingual EN/JA docs ✓
  • v0.8 (May 2026): CrewAI parser (Python shim, reuses LangGraph PythonWorker), 6 frameworks total ✓
  • v0.9+: Mastra parser (TypeScript bridge), 30+ rules, Plugin SDK preview, official site + demo video
  • v1.0: 5+ frameworks × 25+ rules, Plugin SDK GA, Marketplace listing

Development

go test ./...
go vet ./...
go build -o shingan ./cmd/shingan
make lint        # check-confidence-reason (go/analysis) + go vet

When adding a new rule, see docs/rule-authoring.md.

Documentation

Contributing → New rules

Contributors implementing new builtin rules should start with docs/rule-authoring.md. It covers the Local / Path / Global three-tier templates (ADR-007), ConfidenceReason selection guide (ADR-008), the check-confidence-reason go/analysis linter, TDD patterns, and design notes for every existing rule. External rule authors who want to ship rules from their own repo (no fork required) should read docs/plugin-sdk.md — the public plugin.Register API shipped in v0.9 with an experimental: prefix requirement; the stability promise on the ABI lands at v1.0 (ADR-010 originally deferred all external exposure to v1.0; the v0.9 implementation supersedes that with the prefix-gated early-access path).

License

MIT

About

AI Agent Workflow Static Analyzer — detect infinite loops, cost explosions, PII leak paths before execution. Go + Onion Architecture + Factory Pattern + Vertex AI integration.

Topics

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Packages

 
 
 

Contributors