A platform that operationalizes the threat intelligence cycle by integrating external CTI data (STIX 2.1) with internal asset and organizational information. It visualizes and weights attack paths, and delivers actionable outputs to Red, Blue, and IR teams.
This system receives data from the following capabilities and does not replace them: real-time SIEM detection, endpoint protection, and vulnerability scanning automation.
- Multi-source ingestion — OpenCTI (STIX 2.1), AWS Security Hub, GCP Security Command Center, TRACE (web/PDF crawler with PIR-driven validation gate), and analyst manual input via API
- Attack Graph — Models asset connectivity and reachable attack paths. Asset criticality is dynamically adjusted per PIR at ETL time
- Attack Flow — Tracks TTP time-series transitions as weighted `FollowedBy` edges
- PIR cascade — `PIR` is a first-class graph node with `PirPrioritizesActor` (TAP), `PirPrioritizesTTP` (PTTP), and `PirWeightsAsset` edges materializing the Strategic → Operational → Tactical cascade
- Identity targeting — `Identity` SDO and `ActorTargetsIdentity` edges capture credential / org-targeting attribution (paired with TRACE)
- Pluggable database backend — SQLite file (default since 4.0.0; synced via StorageBackend to a local directory or GCS) or Cloud Spanner (`SAGE_DB=spanner`)
- Analysis API — Internal REST API (Cloud Run, VPC-internal, IAP-protected) exposing attack paths, choke points, actor TTPs, and asset exposure queries
- Team outputs — GitHub Enterprise playbook issues, Slack priority alerts, Caldera adversary profiles for red team simulations
- TLP enforcement — TLP Red objects excluded from storage; only `white/green/amber` ingested
- IR feedback loop — Incident records feed back into `FollowedBy` weights over time
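As an illustration of the TLP enforcement item above, the following is a minimal sketch of dropping TLP Red objects from a STIX 2.1 bundle before storage. It is not SAGE's actual implementation: the function name `filter_bundle_by_tlp` is hypothetical, and letting unmarked objects pass through is an assumption. The marking-definition ID is the fixed TLP:RED identifier from the STIX 2.1 specification.

```python
# Fixed TLP:RED marking-definition ID defined by the STIX 2.1 specification.
TLP_RED = "marking-definition--5e57c739-391a-4eb3-b6be-7d15ca92d5ed"


def filter_bundle_by_tlp(bundle: dict) -> dict:
    """Return a copy of a STIX bundle (as a dict) without TLP Red objects.

    Hypothetical helper. Assumption: objects with no marking refs are kept;
    a stricter policy could require an explicit white/green/amber marking.
    """
    kept = []
    for obj in bundle.get("objects", []):
        markings = obj.get("object_marking_refs", [])
        if TLP_RED in markings:
            continue  # excluded from storage
        kept.append(obj)
    # Build a new dict so the caller's bundle is left untouched.
    return {**bundle, "objects": kept}
```

Filtering on the well-known marking ID avoids parsing marking-definition objects at all, at the cost of ignoring custom (non-TLP) markings.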
[OpenCTI]──STIX 2.1───────┐
[Security Hub]─────────────┤
[SCC]──────────────────────┼──→ [GCS: Landing Zone]
[TRACE: validated STIX]────┤ (PIR-driven L2 gate +
[Analyst Input API]─manual─┘ semantic + stix2-validator)
[BEACON: assets.json / pir_output.json /
identity_assets.json / user_accounts.json]
│ (after passing TRACE validation: validate_assets / validate_pir /
│ validate_identity_assets / validate_user_accounts)
▼
[StorageBackend: Local (output/) or GCS]
├── stix/ ← TRACE STIX bundles
├── assets/ ← BEACON assets outputs
├── pir/ ← BEACON PIR outputs
├── plans/ ← collection_plan, sources_candidate
└── db/ ← sage.db (SQLite backend database file)
│
▼
[SAGE: load_assets / load_identity_assets / load_user_accounts /
PIR ingest] (falls back to StorageBackend when --input omitted)
│
▼
[ETL Worker — Cloud Run]
├── Reads ALL bundles from StorageBackend stix/ category
├── STIX parsing + deduplication (including identity SDOs)
├── TLP enforcement
├── PIR cascade build (TAP/PTTP/WeightsAsset)
├── FollowedBy weight recalculation
└── Graph upsert (via sage.db backend dispatch)
│
▼
[Database — selected by SAGE_DB]
├── sqlite (default): sage.db file synced via StorageBackend
│ local: <base_dir>/db/sage.db in place
│ gcs: download on startup → write → upload after ETL
└── spanner (optional): Spanner Graph ThreatIntelGraph
│
▼
[Analysis API — Cloud Run, VPC-internal, read-only DB access]
GET /attack-paths GET /choke-points
GET /actor-ttps GET /asset-exposure
GET /actors GET /similar-incidents
GET /threat-summary
POST /caldera/adversary
POST /api/incidents GET /api/incidents
POST /api/annotate
│
▼
[GHE Issues] [Slack alerts] [Caldera adversary profiles]
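The "download on startup → write → upload after ETL" handling of `sage.db` in the diagram above can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: `InMemoryStorage` is a hypothetical stand-in for a GCS-backed StorageBackend, `synced_sqlite` is an invented name, and skipping the upload when the ETL raises is one possible policy.

```python
import sqlite3
import tempfile
from contextlib import contextmanager
from pathlib import Path


class InMemoryStorage:
    """Hypothetical stand-in for a remote bucket: maps object keys to bytes."""

    def __init__(self) -> None:
        self.blobs: dict[str, bytes] = {}

    def download(self, key: str, dest: Path) -> bool:
        if key not in self.blobs:
            return False  # first run: no database file exists yet
        dest.write_bytes(self.blobs[key])
        return True

    def upload(self, src: Path, key: str) -> None:
        self.blobs[key] = src.read_bytes()


@contextmanager
def synced_sqlite(storage: InMemoryStorage, key: str = "db/sage.db"):
    """Download the DB file, yield a connection, upload the file afterwards."""
    with tempfile.TemporaryDirectory() as tmp:
        local = Path(tmp) / "sage.db"
        storage.download(key, local)
        conn = sqlite3.connect(local)
        try:
            yield conn
            conn.commit()
        finally:
            conn.close()
        # Reached only when the body succeeded; a failed ETL run is not uploaded.
        storage.upload(local, key)
```

A caller would wrap the whole ETL write phase in `with synced_sqlite(storage) as conn:`, so that readers of the stored file only ever see the result of a completed run.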
| Document | Description |
|---|---|
| docs/setup.md | Clone, install, configure, first run, testing |
| docs/deploy.md | Cloud Run deployment and Cloud Scheduler |
| docs/usage.md | CLI commands, workflows, operations, troubleshooting |
| docs/data-model.md | Database schema (SQLite default / Spanner optional), node/edge definitions, PIR formulas |
| docs/ir-feedback-flow.md | IR feedback loop and scoring formulas |
| docs/structure.md | Project directory layout |
| docs/dependencies.md | Dependency rationale and licenses |
| docs/api-stability.md | API stability policy and BC guarantees |
Cross-project:
- BEACON pipeline-guide.md — End-to-end CTI pipeline
- BEACON citations.md — External citations and license inventory
git clone https://github.com/sw33t-b1u/sage.git
cd sage
uv sync --extra dev
cp .env.example .env # defaults run on SQLite with local storage — no GCP values needed
# set SAGE_DB=spanner (+ GCP_PROJECT_ID, SPANNER_*) for the Spanner backend
See docs/setup.md for the full setup procedure.
See docs/structure.md for the full directory layout and design criteria.
make check # lint + test + audit (full quality gate)
make vet # ruff check
make lint # ruff format --check
make format # ruff format + fix
make test # pytest
make audit # pip-audit
SAGE consumes PIR JSON produced by BEACON, validated by TRACE before ingestion. The PIR model follows:
- FIRST CTI-SIG — Priority Intelligence Requirements curriculum
- SANS — Bridging Gaps in CTI: A Practical Guide to Threat-Informed Security PIRs
PIRs cascade into Operational TAP (Threat Actor Prioritization) and Tactical PTTPs (Priority TTPs). This cascade is materialized in the graph as PIR nodes plus PirPrioritizesActor / PirPrioritizesTTP / PirWeightsAsset edges (added in 0.4.1, generalized in 0.5.0).
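To make the cascade concrete, here is a rough sketch of turning one PIR record into the three edge types named above. The input layout (`id`, `actors`, `ttps`, `asset_weights`) and the `Edge` type are hypothetical and chosen only for illustration; the real PIR JSON schema and edge properties are defined by BEACON and docs/data-model.md.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """Hypothetical in-memory edge: kind, source node, target node, weight."""

    kind: str
    src: str
    dst: str
    weight: float = 1.0


def build_pir_cascade(pir: dict) -> list[Edge]:
    """Expand one PIR record into cascade edges (assumed input layout)."""
    pir_id = pir["id"]
    # Operational layer: prioritized threat actors (TAP).
    edges = [Edge("PirPrioritizesActor", pir_id, a) for a in pir.get("actors", [])]
    # Tactical layer: prioritized TTPs (PTTP).
    edges += [Edge("PirPrioritizesTTP", pir_id, t) for t in pir.get("ttps", [])]
    # Asset weighting: each asset carries its own multiplier.
    edges += [
        Edge("PirWeightsAsset", pir_id, asset, weight)
        for asset, weight in pir.get("asset_weights", {}).items()
    ]
    return edges
```

Keeping the PIR as the source of every edge means a single traversal from the PIR node reaches the actors, TTPs, and assets it prioritizes.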
Apache-2.0 — see LICENSE