The skill runtime
Open-source runner
Turn a SKILL into a product pipeline.
Write the prompt. The agent plans and runs the steps — calling tools in parallel where it can — and traces every run, step by step.
You are a refund agent.
Stay within policy. Be fair.
## Check the policy
Match the order to the rules.
## Verify the order
Look up the order and charge.
## Assess the refund
Decide: full, partial, or deny.
## Issue the refund
Refund and email the customer.
Every call metered to the cent — a line-item receipt per run.
What is a skill
A skill is one folder.
A prompt, its tools and evals, and a typed input/output — bundled, versioned, and deployed as one.
## triage
## research
## draft
def search(q) → results
def fetch(url) → text
assert cited
score ≥ 0.9
in: question
out: answer
Deploy · call
Deploy once.
Call it anywhere.
One command ships your skill to a versioned endpoint — then call it from Python, TypeScript, or your coding agent over MCP.
Deploy with puras
Run pip install puras, then puras deploy. The CLI zips your skill, uploads it, and activates a version. No servers, no Dockerfile, no CI.
Call it from your app
Import puras and call the skill by its workspace/skillpack/skill path from Python, TypeScript, or MCP. You get a typed result back.
Built in, not bolted on
Reliable and observable by default.
Every pipeline the runtime builds is graded, traced, gated, and resumable, and you wire up none of it. Below are three capabilities you'd otherwise build yourself, shown on a real pipeline.
Reliability · evals & guardrails
Test a skill like code.
Attach an eval suite as data — graded by schema, exact match, your own code, or an LLM judge — and gate every deploy on pass-rate with --threshold, so a regression never ships. Then guardrails enforce policy at runtime: PII redaction, prompt-injection blocks, and schema or tool-call rails that stop a bad output before it leaves the run.
A mismatched invoice total or a leaked PII field can never silently pass.
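The gating idea above can be sketched in a few lines. This is an illustration only, not the puras implementation: `EvalCase`, `exact_match`, `has_keys`, `pass_rate`, `gate`, and `toy_skill` are hypothetical names, and the LLM-judge grader is left out so the example stays self-contained.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class EvalCase:
    """One row of an eval suite: an input plus a grader for the output."""
    name: str
    input: dict
    grade: Callable[[Any], bool]


def exact_match(expected):
    # Grader: the output must equal the expected value exactly.
    return lambda output: output == expected


def has_keys(*keys):
    # Grader: a minimal schema check, the output is a dict with these keys.
    return lambda output: isinstance(output, dict) and all(k in output for k in keys)


def pass_rate(skill, cases):
    # Fraction of cases whose grader accepts the skill's output.
    passed = sum(1 for case in cases if case.grade(skill(case.input)))
    return passed / len(cases)


def gate(rate, threshold):
    # A deploy is allowed only when the pass rate meets the threshold.
    return rate >= threshold


def toy_skill(inp):
    # Stand-in for a real skill (hypothetical): totals an invoice.
    return {"total": sum(inp["line_items"]), "currency": "USD"}


CASES = [
    EvalCase("sums line items", {"line_items": [10, 5]},
             exact_match({"total": 15, "currency": "USD"})),
    EvalCase("has required fields", {"line_items": [1]},
             has_keys("total", "currency")),
    # This case expects the wrong total on purpose, so it fails.
    EvalCase("mismatched total", {"line_items": [2, 2]},
             exact_match({"total": 5, "currency": "USD"})),
]

if __name__ == "__main__":
    rate = pass_rate(toy_skill, CASES)
    print("deploy allowed:", gate(rate, threshold=0.9))
```

With two of three cases passing, a 0.9 threshold blocks the deploy; that is the behavior a pass-rate gate is meant to give you.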
Human-in-the-loop
Sensitive steps wait for a person.
Mark a high-stakes step — releasing a payment, sending a contract for signature — as needing approval in skill.yaml. The run pauses for a human decision, then resumes exactly where it left off.
The guardrail between an agent and an action it shouldn't take alone.
approved · payment released · posted to ledger
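A rough sketch of the pause-and-resume behavior described above, under stated assumptions: `Run` and `advance` are hypothetical names, steps are reduced to a name and an approval flag, and nothing here reflects the actual skill.yaml syntax or runtime internals.

```python
from dataclasses import dataclass, field


@dataclass
class Run:
    steps: list                      # (name, needs_approval) pairs
    cursor: int = 0                  # index of the next step to execute
    log: list = field(default_factory=list)
    status: str = "running"          # running | paused | finished | rejected


def advance(run, decision=None):
    """Run steps until one needs approval; resume when a decision arrives."""
    if run.status in ("finished", "rejected"):
        return run                   # terminal states never advance
    if run.status == "paused":
        if decision is None:
            return run               # still waiting for a human
        name, _ = run.steps[run.cursor]
        if decision != "approve":
            run.log.append(f"{name}: rejected")
            run.status = "rejected"
            return run
        run.log.append(f"{name}: approved")
        run.cursor += 1              # continue right after the gated step
        run.status = "running"
    while run.cursor < len(run.steps):
        name, needs_approval = run.steps[run.cursor]
        if needs_approval:
            run.status = "paused"    # stop before the sensitive action
            return run
        run.log.append(f"{name}: done")
        run.cursor += 1
    run.status = "finished"
    return run


if __name__ == "__main__":
    run = Run(steps=[("match_invoice", False),
                     ("release_payment", True),
                     ("post_to_ledger", False)])
    advance(run)                     # pauses before release_payment
    advance(run, "approve")          # resumes and finishes
    print(run.status, run.log)
```

The point of the design is that the pause lives in the run state, not in the prompt: earlier steps are not repeated, and a rejected run cannot be advanced past the gated step.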
Durable runs
Resume, don't restart.
Long runs checkpoint each step. A vendor rate-limit or a crash near the end resumes from the last good step instead of re-running the whole pipeline — and you are never double-charged for work already done.
You didn't wire retries — the runtime resumed.
resumed from the last good step · validated rows not re-charged
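Step-level checkpointing can be illustrated with a minimal sketch. It assumes an in-memory dict as the checkpoint store and hypothetical step functions (`load`, `validate`, `export`); the real runtime's storage and retry policy are not described in this page.

```python
def run_pipeline(steps, checkpoint):
    """Run (name, fn) steps in order, skipping any already in `checkpoint`.

    `checkpoint` maps step name -> saved result. Because it is filled in
    after every successful step, a later failure loses no earlier work.
    """
    value = None
    for name, fn in steps:
        if name in checkpoint:
            value = checkpoint[name]     # reuse the saved result
            continue
        value = fn(value)                # may raise; checkpoint is untouched
        checkpoint[name] = value
    return value


def demo():
    calls = {"load": 0, "validate": 0, "export": 0}

    def load(_):
        calls["load"] += 1
        return [1, 2, 3]

    def validate(rows):
        calls["validate"] += 1
        return [r for r in rows if r > 0]

    def export(rows):
        calls["export"] += 1
        if calls["export"] == 1:
            # Simulate a vendor rate limit on the first attempt.
            raise RuntimeError("rate limited")
        return len(rows)

    steps = [("load", load), ("validate", validate), ("export", export)]
    checkpoint = {}
    try:
        run_pipeline(steps, checkpoint)  # fails at the last step
    except RuntimeError:
        pass
    result = run_pipeline(steps, checkpoint)  # resumes at export only
    return result, calls


if __name__ == "__main__":
    print(demo())
```

On the second call only `export` runs again; `load` and `validate` are each executed once, which is why work already done is not billed twice.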
Built into the puras runtime
Your runtime learns from every run.
Because puras runs, traces, and grades every job, it can look back across them. Hindsight reads your recent runs — their traces, eval scores, and user feedback — and hands you concrete fixes. It's not a tool you wire up; it's the runtime improving your skill for you.
lead-enrichment · last 10 runs · Hindsight report
Open source
No lock-in.
The whole runner is open source — the same engine, tools, and capabilities run on your machine and in our cloud. The cloud just runs it for you.
Local
open source · MIT
Run it yourself, on your own keys:
- The full agent loop & your skill code
- Your skill.yaml — same format, unchanged
- Tools: media, web & memory
- Evals, traces & versioning
pip install "puras[local]" — your machine, your keys.
Cloud
managed
The same engine, run for you, plus:
- A fresh, isolated machine for every job — secure & sandboxed
- Managed keys — nothing to wire up
- Durable resume & human approvals
- Dashboard, traces & per-job billing
One command: puras deploy.
Same skill format, same engine. Walk away with the open-source runner anytime.
View on GitHub
FAQ
Questions, answered.
One folder. A skill.yaml manifest with a typed input/output contract, plus a SKILL.md prompt — or a plain Python function — and any tools and evals it needs. puras deploy gives it an immutable, versioned endpoint you call like an API.
You don't author a graph. You write the work; the agent plans the steps, runs them in parallel where it can, retries, and traces every one, then hands back a typed response, files, or side effects. You watch it run as a live pipeline; you never wire one.
The agent chooses its own path, so two runs can differ. You get confidence a different way: every run is traced step by step, you gate each deploy on an eval suite in CI, and guardrails enforce policy at runtime — so behavior is pinned by tests and rails, not by freezing a graph.
Every run is traced step by step — each model call and tool call, with timing and cost — in the dashboard and over the API. You see exactly what happened, and get a line-item receipt for what it spent.
Mark a tool as needing confirmation in skill.yaml. When the agent calls it, the runtime pauses the run for a human to approve or reject, then resumes exactly where it left off. It's enforced by the runtime — never just asked for in the prompt.
You pay per job, to the cent, from a prepaid balance — only the model tokens and media a run actually uses, plus a flat 5% platform fee. Every run returns a line-item receipt.
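As a worked example of that arithmetic, here is a small sketch of a per-run receipt. The line items and their amounts are made up for illustration, and rounding the fee half-up to the cent is an assumption; only the 5% rate comes from the text above.

```python
from decimal import Decimal, ROUND_HALF_UP

PLATFORM_FEE_RATE = Decimal("0.05")   # flat 5% platform fee
CENT = Decimal("0.01")


def receipt(line_items):
    """Build a receipt from (label, amount) usage line items.

    Amounts are Decimals in dollars. The rounding rule is an assumption.
    """
    usage = sum((amount for _, amount in line_items), Decimal("0"))
    fee = (usage * PLATFORM_FEE_RATE).quantize(CENT, rounding=ROUND_HALF_UP)
    return {
        "line_items": list(line_items),
        "usage": usage,
        "platform_fee": fee,
        "total": usage + fee,
    }


if __name__ == "__main__":
    # Hypothetical usage for one run.
    r = receipt([
        ("model tokens", Decimal("0.42")),
        ("media", Decimal("0.18")),
    ])
    print(r["usage"], r["platform_fee"], r["total"])
```

For $0.60 of usage, the fee is $0.03 and the run costs $0.63. `Decimal` is used rather than floats so that cent-level totals stay exact.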
Import puras and call the skill by its workspace/skillpack/skill path (a skillpack is a deployed bundle of related skills) from Python, TypeScript, raw HTTP, or your coding agent over MCP (OAuth in the browser, no key to paste). You get a typed result, files, or side effects back.
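To make the three-part path concrete, here is a small sketch that models it as a value type. `SkillRef` is a hypothetical helper, not part of the puras client, and the example path `acme/support/refund` is invented; the actual call signature is not shown on this page, so none is assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SkillRef:
    """A workspace/skillpack/skill address, split into its three parts."""
    workspace: str
    skillpack: str
    skill: str

    @classmethod
    def parse(cls, path):
        parts = path.split("/")
        # Require exactly three non-empty segments.
        if len(parts) != 3 or not all(parts):
            raise ValueError(
                f"expected workspace/skillpack/skill, got {path!r}")
        return cls(*parts)

    def __str__(self):
        return f"{self.workspace}/{self.skillpack}/{self.skill}"


if __name__ == "__main__":
    ref = SkillRef.parse("acme/support/refund")   # hypothetical path
    print(ref.workspace, ref.skillpack, ref.skill)
```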
Pick any supported family per skill — claude/*, gpt/*, or gemini/* — pinned in skill.yaml and overridable per run. For local parity, pip install "puras[local]" runs the open-source runner — the same agent loop, on your own machine and keys.
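The pin-then-override rule can be sketched as follows. This is an assumption-laden illustration: `resolve_model` is a hypothetical function, the model names are placeholders rather than real identifiers, and how skill.yaml actually spells the pin is not shown here.

```python
from fnmatch import fnmatchcase

# Families named in the text above.
SUPPORTED_FAMILIES = ("claude/*", "gpt/*", "gemini/*")


def resolve_model(pinned, override=None):
    """Return the per-run override if given, else the skill's pinned model."""
    model = override or pinned
    if not any(fnmatchcase(model, pattern) for pattern in SUPPORTED_FAMILIES):
        raise ValueError(f"unsupported model family: {model!r}")
    return model


if __name__ == "__main__":
    # Placeholder model names, for illustration only.
    print(resolve_model("claude/example-model"))
    print(resolve_model("claude/example-model", override="gemini/example-model"))
```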
Write the skill.
Skip the plumbing.
You define the work; the runtime builds and runs it as a traced, tested pipeline. Deploy once, call it like an API. $10 free credit, no card, no subscription.
Or browse example skills already deployed on puras.