Echo: results so far
Routing LLM requests cheaply without training a router — and the measurement bug that nearly fooled us. A cross-family local judge reaches 94% of the oracle's routing quality at ~29% lower cost than always using the big model.
Thoughts on AI, memory, learning systems, and building digital products.
Call the cheap model twice with different personas. If the two answers agree, keep the cheap answer; if they disagree, escalate. No classifier, no labels. The hard part turns out to be measuring agreement, and the winning signal is a surprise.
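The routing rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from the post: `cheap` and `big` are hypothetical callables standing in for model clients, the two persona strings are made up, and `normalized_match` is only a naive placeholder for the agreement check, which the post identifies as the hard part.

```python
from typing import Callable, Tuple

# Hypothetical client signatures (assumptions, not a real API):
#   cheap(persona, prompt) -> answer
#   big(prompt) -> answer
CheapModel = Callable[[str, str], str]
BigModel = Callable[[str], str]
AgreeFn = Callable[[str, str], bool]


def normalized_match(a: str, b: str) -> bool:
    """Naive placeholder: equal after lowercasing and collapsing whitespace."""
    return " ".join(a.lower().split()) == " ".join(b.lower().split())


def route(
    prompt: str,
    cheap: CheapModel,
    big: BigModel,
    personas: Tuple[str, str] = ("careful analyst", "skeptical reviewer"),
    agree: AgreeFn = normalized_match,
) -> Tuple[str, str]:
    """Return (answer, route), where route is 'cheap' or 'escalated'."""
    # Ask the cheap model twice, once per persona.
    first = cheap(personas[0], prompt)
    second = cheap(personas[1], prompt)
    # Agreement is treated as a sign the cheap answer can be kept.
    if agree(first, second):
        return first, "cheap"
    # Disagreement: escalate to the big model.
    return big(prompt), "escalated"
```

Passing `agree` as a parameter keeps the routing rule separate from the agreement signal, so a stronger check can replace the placeholder without touching `route`.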
Tech World is a multiplayer game engine where players walk through a tile-based world, solve coding challenges in an embedded IDE, cast spells by speaking aloud, and see each other as live video bubbles. There's no game server. We hold our Imagineering meetups here.
What it means to engineer systems where AI agents are first-class participants — and why we think it deserves a job title of its own.
What the title actually means here, why it's both roles glued into one, and what changes when the team you lead includes humans and agents on equal footing.
How we take engineers who are excited about AI and turn them into people who can ship agentic systems unsupervised. The progression isn't what most bootcamps teach.
What five academic disciplines taught us about how AI coding assistants should remember
We built a sleep cycle for Claude Code: memory consolidation and creative dreaming, running overnight. Ten days later, Anthropic shipped Auto Dream with identical mechanics.