I scale organizations with AI, not headcount — taking problems from the first customer conversation to a shipped, tested production system by directing fleets of coding and LLM agents.
Collectively Gary — a prototype for autonomous AI organizations in animal advocacy. Not agents bolted onto existing orgs: entirely new organizations that run themselves.
→ GaryOS: the pipeline itself. A ~6k-line orchestrator, a 12-stage / 7-playbook dispatcher, filesystem-as-state with atomic-rename transitions, and stage-scoped credential isolation against prompt injection. It spawns short-lived coding agents per task; a human approves only the decisions the pipeline surfaces.
-
Shipped full-stack production systems end-to-end, solo. A campaign-intelligence platform for a UK advocacy org — ~120k lines, 50+ data collectors, an LLM enrichment pipeline, 430+ merged PRs in five months. A full LMS with per-student API-key provisioning and weekly spend caps. An autonomous deal-scouting dashboard for a commercial real-estate investor.
-
Built an agent that contributes to open source at scale, and whose contributions get merged. An orchestrator that scans, drafts, deduplicates, and triages upstream PRs across ~150 repositories, calibrated against historical merge rates so that it contributes rather than spams. Substantive merges into Babel (43.9k★), Faker.js, Wger (Django schema + migrations), pyinaturalist (new API controller), OpenStates, TandoorRecipes, OpenFoodFacts, and MDN Web Docs.
-
Built autonomous outreach with measurable outcomes at near-zero cost: 678 researched, personalized emails for $40.68 (about $0.06 each), yielding 8 commitments from target organizations.
-
Wrote the "non-human welfare" language adopted into the General-Purpose AI Code of Practice under the EU AI Act, signed by major AI labs.
-
Published LLM-safety research at ACM FAccT 2025: co-authored AnimalHarmBench, then retrained the model that scored worst on the benchmark until it scored best, with no loss of general language ability.