feat(audit): Site Audit — 47-signal AEO/GEO page scoring with AI fixes #259

Merged: gkhngyk merged 2 commits into main from feat/site-audit on Jun 16, 2026

gkhngyk commented Jun 16, 2026

Summary

A new Site Audit feature (under Content Optimization) that scores any page against a 47-signal AEO/GEO rubric and returns page-specific, AI-written fix recommendations with ready-to-paste drafts.

The rubric is an open standard implemented entirely in-house — no third-party scoring API.

How it works

POST /api/audits {url, brandId} returns a running row immediately; the audit runs in the background and the client polls GET /api/audits/:id until it flips to completed/failed. Per audit:
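
The request/poll flow described above can be sketched from the client side. This is a minimal illustration, not the dashboard's actual code: `runAudit`, the injected `request` function, the polling interval, and the `id`/`status` response fields are all assumptions made for the example.

```javascript
// Hypothetical client helper. `request(method, path, body)` is injected so the
// sketch does not assume any particular HTTP library or auth scheme.
async function runAudit(request, { url, brandId }, { intervalMs = 2000, maxAttempts = 60 } = {}) {
  // POST returns a row in the "running" state right away (assumed shape: { id, status }).
  let audit = await request('POST', '/api/audits', { url, brandId });

  // Poll GET /api/audits/:id until the status leaves "running" or we give up.
  for (let attempt = 0; attempt < maxAttempts && audit.status === 'running'; attempt++) {
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
    audit = await request('GET', `/api/audits/${audit.id}`);
  }

  if (audit.status === 'running') {
    throw new Error(`audit ${audit.id} still running after ${maxAttempts} polls`);
  }
  return audit; // status is "completed" or "failed"
}
```

Capping the number of polls keeps a stuck background job from leaving the client waiting indefinitely; the caller decides how to surface the timeout.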

  1. Fetch the page via Scrape.do (proxy + JS render); robots.txt / llms.txt fetched directly with a Scrape.do fallback only when blocked.
  2. 33 deterministic signals — cheerio DOM parsing (schema/headings/lists/tables/meta/links, Flesch readability, etc.).
  3. 13 LLM-judged signals — one batched structured-output call. The page is evaluated on its own inferred topic — no brand/external context is injected, so auditing an arbitrary URL never imports unrelated subject matter.
  4. Wikidata brand-entity lookup.
  5. AI fix recommendations — a second LLM call over the fixable failing signals, returning prioritized advice + drafted snippets (meta description, H1, BLUF, FAQ JSON-LD, simplified rewrites…).
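
As an illustration of what a deterministic signal in step 2 might look like, here is a sketch of the Flesch Reading Ease formula in plain JavaScript. The formula itself is standard; the syllable counter is a rough heuristic chosen for this example, and the function names are hypothetical rather than taken from server/src/lib/audit/.

```javascript
// Rough syllable estimate: count vowel groups after dropping a trailing "e".
// This is a heuristic for illustration, not a dictionary-accurate count.
function countSyllables(word) {
  const w = word.toLowerCase().replace(/[^a-z]/g, '');
  if (w.length === 0) return 0;
  const groups = w.replace(/e$/, '').match(/[aeiouy]+/g);
  return Math.max(1, groups ? groups.length : 0);
}

// Flesch Reading Ease: 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words).
// Returns null when the text has no words or sentences, i.e. the signal is not evaluable.
function fleschReadingEase(text) {
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  const words = text.split(/\s+/).filter((t) => /[a-zA-Z]/.test(t));
  if (sentences.length === 0 || words.length === 0) return null;
  const syllables = words.reduce((sum, w) => sum + countSyllables(w), 0);
  return 206.835 - 1.015 * (words.length / sentences.length) - 84.6 * (syllables / words.length);
}
```

Higher scores mean easier text. Returning `null` for empty input fits the idea of excluding unevaluated signals from the score instead of counting them as failures.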

Score = weighted category mean, normalized over evaluated signals (coverage shown as X/47).
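
One way to read the scoring rule above, sketched under stated assumptions: each category score is the mean of its evaluated signals, and the overall score is the weighted mean over categories that had at least one evaluated signal. The weights, the `eeat` key, the 0..1 signal scale, and the `scoreAudit` name are hypothetical; the PR lists the five categories but not their weights.

```javascript
// Hypothetical weights for the five categories named in the commit message.
const CATEGORY_WEIGHTS = { structure: 0.25, content: 0.25, authority: 0.2, trust: 0.15, eeat: 0.15 };

// Each signal is { category, score }, with score in [0, 1] or null when it could not be evaluated.
function scoreAudit(signals, weights = CATEGORY_WEIGHTS) {
  const byCategory = {};
  let evaluated = 0;
  for (const { category, score } of signals) {
    if (score === null || score === undefined) continue; // unevaluated signals are left out
    evaluated += 1;
    if (!byCategory[category]) byCategory[category] = [];
    byCategory[category].push(score);
  }

  let weightedSum = 0;
  let weightTotal = 0;
  for (const [category, scores] of Object.entries(byCategory)) {
    const weight = weights[category] ?? 0;
    const mean = scores.reduce((a, b) => a + b, 0) / scores.length;
    weightedSum += weight * mean;
    weightTotal += weight;
  }

  return {
    // Dividing by the weights actually used is the "normalized over evaluated signals" step.
    score: weightTotal > 0 ? Math.round((weightedSum / weightTotal) * 100) : null,
    coverage: `${evaluated}/${signals.length}`,
  };
}
```

With this normalization, a page whose LLM-judged signals could not be evaluated is scored only on what was measured, and the coverage string tells the reader how much of the rubric that was.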

Guardrails

  • Feature-gated behind content_optimization (server-side, not just UI).
  • Monthly per-org quota — Starter 100 / Growth 500; Enterprise & self-hosted unlimited. Enforced at request time; usage recorded only for completed audits (failures don't burn quota). Mirrors the existing brief-usage pattern.
  • LLM model is provider-flexible via resolveModel — env AUDIT_LLM_MODEL (default google/gemini-3-flash-preview); self-hosters can point it at any provider with a key.
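
The quota rule above could look roughly like the following. The limits are the ones quoted in this description; the plan keys, function names, and return shape are assumptions for illustration and do not reflect the actual brief-usage code.

```javascript
// Monthly audit limits from the PR description; null means unlimited.
const MONTHLY_AUDIT_LIMITS = { starter: 100, growth: 500, enterprise: null, self_hosted: null };

// Hypothetical request-time check. `usedThisMonth` counts completed audits only.
function checkAuditQuota(plan, usedThisMonth) {
  if (!Object.prototype.hasOwnProperty.call(MONTHLY_AUDIT_LIMITS, plan)) {
    throw new Error(`unknown plan: ${plan}`);
  }
  const limit = MONTHLY_AUDIT_LIMITS[plan];
  if (limit === null) return { allowed: true, remaining: null };
  const remaining = Math.max(0, limit - usedThisMonth);
  return { allowed: remaining > 0, remaining };
}

// Usage is recorded only when an audit completes, so failed audits leave the counter unchanged.
function usageAfterAudit(usedThisMonth, finalStatus) {
  return finalStatus === 'completed' ? usedThisMonth + 1 : usedThisMonth;
}
```

Checking at request time and recording at completion time means a burst of concurrent audits can briefly exceed the limit; whether the real implementation guards against that is not stated here.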

Web UI

Site Audit page: circular score gauge + grade, pass/warn/fail summary chips, category breakdown bars, signals grouped by category (each expands to finding + why + AI-written fix with copy), an AI Recommendations card, recent-audits list with delete, an issues-only filter, staged loading state, and an empty state.

Schema

  • 00017_site_audit.sql: site_audits + audit_signal_results (+ RLS, org-membership scoped).
  • 00018_site_audit_recommendations.sql: recommendations jsonb column.
  • 00019_site_audit_usage.sql: site_audit_usage quota table (RLS, service-role only).

New env vars (see server/.env.example)

  • SCRAPEDO_API_KEY — page fetching.
  • AUDIT_LLM_MODEL — defaults to google/gemini-3-flash-preview.

Notes

  • Adds cheerio to the server package.
  • Server-only runtime — needs a server redeploy after merge. The three migrations are already applied to production; self-hosters get them via the repo.
  • Cost: ≈10k tokens/audit (~$0.016 at $0.5/$3 per 1M in/out).
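
For anyone re-deriving the cost figure with a different model, the arithmetic is a two-term sum over input and output tokens. The prices below are the ones quoted in the note; the function name and the token splits used afterwards are illustrative assumptions, not measured values.

```javascript
// USD per 1M tokens, as quoted in the cost note.
const PRICE_PER_MILLION = { input: 0.5, output: 3 };

// Cost of one audit given its input and output token counts.
function auditCostUsd(inputTokens, outputTokens) {
  return (inputTokens * PRICE_PER_MILLION.input + outputTokens * PRICE_PER_MILLION.output) / 1e6;
}
```

At these prices, 10k tokens cost between $0.005 (all input) and $0.03 (all output). A split of about 5,600 input and 4,400 output tokens is one combination that lands on $0.016; the actual in/out ratio is not stated in this PR.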

gkhngyk added 2 commits June 16, 2026 15:23
…fixes

Scores any page against a 47-signal AEO/GEO rubric across 5 weighted
categories (structure, content, authority, trust, E-E-A-T) and returns
page-specific, AI-written fix recommendations with ready-to-paste drafts.

Server (server/src/lib/audit/ + routes/audits.js):
- Scrape.do fetcher (proxy + JS render) with smart robots.txt/llms.txt fallback
- 33 deterministic signals (cheerio DOM parsing + readability/links/meta)
- 13 LLM-judged signals in one batched structured-output call; pages are
  evaluated on their own inferred topic (no external/brand context injected)
- Wikidata brand-entity lookup
- AI fix recommendations (own LLM call) for the fixable failing signals,
  with drafted snippets; model is provider-flexible via AUDIT_LLM_MODEL
  (default google/gemini-3-flash-preview)
- Runs async: POST returns a running row immediately, client polls GET /:id
- Feature-gated (content_optimization) + monthly per-org quota
  (Starter 100 / Growth 500; Enterprise & self-hosted unlimited)

Web (dashboard/audit):
- Site Audit page under Content Optimization: score gauge + grade, pass/warn/
  fail summary, category breakdown, signal list with findings + AI fixes,
  recent audits with delete, issues-only filter, staged loading + empty state

Migrations 00017 (tables + RLS), 00018 (recommendations), 00019 (usage quota).
gkhngyk merged commit dc3ccf5 into main Jun 16, 2026
4 checks passed
gkhngyk deleted the feat/site-audit branch June 16, 2026 13:58
gkhngyk mentioned this pull request Jun 21, 2026
12 tasks