GPT-5.5 In Deep

GPT-5.5 now powers Amp's deep mode.

It is a better coding agent than GPT-5.4: more steerable, more interactive, and better at staying within constraints.

More Agent-Shaped

GPT-5.5 is better at the actual agent loop: read enough code, make the change, verify it, explain what happened. With GPT-5.4, prompts often had to spell out that process.

With GPT-5.5, we found it's best to clearly describe the outcome and put the rules and repeatable steps into guidance files and tools.

If the task is vague, it can still solve the wrong problem cleanly. Good prompts matter more, not less.

Reasoning Effort

With GPT-5.5 we lowered deep's default effort from high to medium (deep²).

Do not assume higher reasoning is always better: in our eval, GPT-5.5 high cost more than medium and performed worse.

xhigh (deep³) is for cases where maximum quality matters more than cost.

As before, you can cycle through reasoning effort directly in the CLI with Opt+D (Alt+D): low (deep), medium (deep²), and xhigh (deep³).

How To Use It

The most important guideline to follow: tell GPT-5.5 what success looks like.

A few patterns have worked well for us:

  • Give it the outcome and the constraints. Example: “Refactor transcript caching into a separate module. Keep the public API unchanged. Perf logging should only run behind this env var. Cache growth should be capped. Run the focused tests and typecheck.”
  • Give it a way to prove the fix. Example: “This CLI focus bug should be verified in the actual CLI, not just by inspection. Reproduce it interactively, check focus state, then run the focused test.”
  • Use it for planning when the shape of the fix is unclear. Example: “Analyze this protocol deadlock. Is it an infrastructure bug, a protocol bug, or something the client must recover from? Propose 2–3 options with tradeoffs and pseudo-code. Do not implement yet.”

Update Amp to the latest version by running amp update, and you're ready to go.

Model Card

We wrote up the full GPT-5.5 model card with evals, reasoning guidance, prompt changes, and caching/ZDR caveats.