fix(codex): Fast mode (service tier) toggle + /fast command (closes #898) #904

Open
swear01 wants to merge 10 commits into tiann:main from swear01:fix/issue-898-codex-fast-mode

@swear01 swear01 commented Jun 14, 2026

Contributor

Closes #898 (Codex portion)

Adds Codex Fast mode (the service-tier speed switch — ~1.5× faster on GPT-5.5 / GPT-5.4 for higher credit cost), both as a /fast command and as a composer UI toggle, wired end-to-end with persistence.

Fast mode is a separate service tier, not reasoning effort. It uses the native serviceTier field on the Codex app-server turn/start / thread/start (added upstream in openai/codex#13334, merged before HAPI's Codex floor 0.124.0 — no version bump needed).

What's included

CLI (Codex)

  • serviceTier on TurnStartParams / ThreadStartParams; passed through buildTurnStartParams (three-state: omit = untouched, null = explicit Standard, "fast" = Fast).
  • /fast on|off|status intercepted in resolveCodexSlashCommand.
  • runCodex threads it into every EnhancedMode and accepts it via SetSessionConfig.
  • AgentSessionBase carries it through the session-alive keepalive so the hub can persist it.
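
As a rough illustration of the /fast on|off|status interception listed above, here is a minimal parser sketch. The `FastCommand` type and `parseFastCommand` name are hypothetical, and treating a bare `/fast` as a status query is an assumption; the PR's `resolveCodexSlashCommand` may return a different shape.

```typescript
// Hypothetical result type, not the PR's actual return shape.
type FastCommand =
    | { kind: 'set'; enabled: boolean }
    | { kind: 'status' }
    | { kind: 'invalid'; reason: string }

// Returns null when the input is not a /fast command at all, so other
// slash commands and plain prompts can fall through untouched.
function parseFastCommand(input: string): FastCommand | null {
    const tokens = input.trim().split(/\s+/)
    if (tokens[0]?.toLowerCase() !== '/fast') return null
    if (tokens.length > 2) {
        return { kind: 'invalid', reason: 'usage: /fast on|off|status' }
    }
    // Assumption: a bare "/fast" is treated as a status query.
    const arg = (tokens[1] ?? 'status').toLowerCase()
    switch (arg) {
        case 'on':
            return { kind: 'set', enabled: true }
        case 'off':
            return { kind: 'set', enabled: false }
        case 'status':
            return { kind: 'status' }
        default:
            return { kind: 'invalid', reason: `unknown argument: ${arg}` }
    }
}
```

Returning a small tagged union keeps the parser independent of how the chosen tier is stored or sent, so the caller decides what "on" and "off" map to.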

Hub (persistence)

  • service_tier column (schema v10 + idempotent migration), store setter, sessionCache + syncEngine plumbing, change-detection/broadcast.
  • New POST /sessions/:id/service-tier (Codex + remote-only), forwarding SetSessionConfig.
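
One way to picture the "idempotent migration" mentioned above is a planner that looks at the existing columns before emitting any statement. This is only a sketch: the `sessions` table name, the use of SQLite's `PRAGMA user_version`, and the `planServiceTierMigration` helper are all assumptions, not taken from the PR.

```typescript
// Assumption: the hub tracks its schema version as an integer; 10 is the
// version named in the PR description.
const TARGET_SCHEMA_VERSION = 10

// Hypothetical planner: returns the SQL statements still needed, given the
// current schema version and the columns already present on the table.
function planServiceTierMigration(
    currentVersion: number,
    sessionColumns: readonly string[]
): string[] {
    const statements: string[] = []
    // Guard on the actual column list so a re-run (or a partially applied
    // earlier run) never issues a second ADD COLUMN.
    if (!sessionColumns.includes('service_tier')) {
        statements.push('ALTER TABLE sessions ADD COLUMN service_tier TEXT')
    }
    if (currentVersion < TARGET_SCHEMA_VERSION) {
        statements.push(`PRAGMA user_version = ${TARGET_SCHEMA_VERSION}`)
    }
    return statements
}
```

Running the planner twice against an already migrated database yields an empty list, which is the property "idempotent" refers to here.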

Shared (protocol)

  • serviceTier on Session / SessionPatch, the session-alive payload, the resume target, and a SessionServiceTierRequest schema.

Web (UI)

  • api.setServiceTier + useSessionActions mutation (Codex + remote gating).
  • A Fast / Standard toggle in the composer settings overlay, shown only for Codex GPT-5.5 / GPT-5.4 (codexModelSupportsFastMode).
  • StatusBar now reflects the real tier instead of the effort heuristic.
  • New misc.fastMode* i18n keys (en + zh-CN).

Behavior notes / upstream gotchas honored

  • Three-state field (Option<Option<String>>): Fast off sends explicit null, never an empty string — avoids the openai/codex#15853 turn-context-clears-tier pitfall.
  • Sticky override ("this turn and subsequent turns"), like model.
  • ChatGPT login only — with an API key Codex bills at standard pricing; HAPI can't detect this, so Fast simply has no effect there.
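
The three-state field is easiest to see on the wire. The sketch below uses a hypothetical, trimmed-down params shape (`TurnStartParamsSketch`) and helper name; the real `TurnStartParams` has more fields. It relies only on the fact that `JSON.stringify` drops absent keys and keeps `null`.

```typescript
// Hypothetical, trimmed-down shape of the turn/start params.
interface TurnStartParamsSketch {
    threadId: string
    serviceTier?: string | null
}

function buildTurnStartParamsSketch(
    threadId: string,
    serviceTier: string | null | undefined
): TurnStartParamsSketch {
    const params: TurnStartParamsSketch = { threadId }
    // Only attach the key when the caller made an explicit choice:
    // undefined -> key omitted, null -> explicit Standard, string -> that tier.
    if (serviceTier !== undefined) {
        params.serviceTier = serviceTier
    }
    return params
}

console.log(JSON.stringify(buildTurnStartParamsSketch('t1', undefined)))
// {"threadId":"t1"}
console.log(JSON.stringify(buildTurnStartParamsSketch('t1', null)))
// {"threadId":"t1","serviceTier":null}
console.log(JSON.stringify(buildTurnStartParamsSketch('t1', 'fast')))
// {"threadId":"t1","serviceTier":"fast"}
```

Keeping "omitted" and "null" distinct all the way to serialization is what lets the receiving side tell "leave the tier alone" apart from "reset to Standard".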

Out of scope (follow-up)

  • Claude Code /fast — a different (Opus output-acceleration) mechanism; the non-interactive switch is still unconfirmed and will be scoped separately.

Test plan

  • TDD: failing tests added first (appServerConfig, slashCommands), then implementation
  • New web helper test codexFastMode.test.ts
  • tsc --noEmit clean across cli / web / hub (CI gate)
  • Full suites green: cli 986, web 968, hub 416 (+3 skip) — no regressions

@github-actions github-actions Bot left a comment

Findings

  • No findings.

Summary

  • Review mode: initial
  • No issues found in the latest diff. Residual risk: I did not execute the test suite, and the new service tier behavior still depends on the installed Codex app-server accepting serviceTier for thread/start and turn/start at runtime.

Testing

  • Not run (automation)

HAPI Bot

Wires the Codex Fast mode (service tier) end-to-end so it can be toggled
from the web composer and survives reload/handoff:

- shared: serviceTier on Session/SessionPatch, session-alive payload,
  resume target, and a SessionServiceTierRequest schema
- cli: AgentSessionBase carries serviceTier through keepAlive; runCodex
  syncs it to the session instance
- hub: service_tier column (schema v10 + migration), store setter,
  sessionCache + syncEngine plumbing, POST /sessions/:id/service-tier
- web: api.setServiceTier + mutation, a Fast/Standard toggle in the
  composer settings (gated to Codex GPT-5.5/5.4), and StatusBar now
  reflects the real tier instead of the effort heuristic

Refs tiann#898

@github-actions github-actions Bot left a comment

Findings

  • [Major] Keepalive clears the persisted/default service tier on Codex startup — applyCurrentConfigToSession() runs when the Codex wrapper becomes ready, and setServiceTier(currentServiceTier ?? null) turns the untouched undefined state into explicit Standard. setCollaborationMode() then immediately pushes a keepalive, so the hub persists serviceTier: null before the user does anything. A resumed Fast session is silently downgraded, and account-default Fast is also disabled for new untouched sessions. Evidence cli/src/codex/runCodex.ts:109
    Suggested fix:
    const { api, session, sessionInfo } = bootstrap
    let currentServiceTier: string | null | undefined =
        sessionInfo.serviceTier === 'fast' ? 'fast' : undefined
    
    const applyCurrentConfigToSession = (options?: { syncModel?: boolean }) => {
        const sessionInstance = sessionWrapperRef.current
        if (!sessionInstance) return
        // Preserve the third state: undefined means "omit/leave account default".
        if (currentServiceTier !== undefined) {
            sessionInstance.setServiceTier(currentServiceTier)
        }
    }

Summary

  • Review mode: follow-up after new commits
  • Found one Major issue in the new service-tier plumbing: startup keepalive collapses “unset” into explicit Standard and overwrites Fast/default state.

Testing

  • Not run (automation). Add a Codex resume/startup test that seeds serviceTier: 'fast' and asserts the first config sync/keepalive preserves fast, plus an untouched session test that asserts no service tier is emitted.

HAPI Bot

Comment thread cli/src/codex/runCodex.ts Outdated
        sessionInstance.setModel(currentModel ?? null);
    }
    sessionInstance.setModelReasoningEffort(currentModelReasoningEffort ?? null);
    sessionInstance.setServiceTier(currentServiceTier ?? null);

[MAJOR] currentServiceTier starts as undefined, but this added line converts it to null every time the Codex wrapper becomes ready. Because setCollaborationMode() immediately pushes a keepalive, the hub persists serviceTier: null before any user action, so resuming a persisted Fast session silently downgrades it and untouched sessions lose the account-default tier.

Suggested fix:

const { api, session, sessionInfo } = bootstrap
let currentServiceTier: string | null | undefined =
    sessionInfo.serviceTier === 'fast' ? 'fast' : undefined

const applyCurrentConfigToSession = (options?: { syncModel?: boolean }) => {
    const sessionInstance = sessionWrapperRef.current
    if (!sessionInstance) return
    // Preserve the third state: undefined means "omit/leave account default".
    if (currentServiceTier !== undefined) {
        sessionInstance.setServiceTier(currentServiceTier)
    }
}

Addresses HAPI Bot [Major] on PR tiann#904: applyCurrentConfigToSession ran
setServiceTier(currentServiceTier ?? null) on wrapper-ready, collapsing the
untouched `undefined` state into explicit Standard. The immediate
setCollaborationMode keepalive then persisted serviceTier: null, silently
downgrading resumed Fast sessions and disabling account-default Fast.

- Seed currentServiceTier from the persisted session (sessionInfo.serviceTier),
  so a resumed Fast thread keeps running Fast.
- Only call setServiceTier when the tier is explicit (!== undefined), preserving
  the three-state omit semantics at the keepalive boundary.
- Add regression tests: persisted Fast is re-asserted; untouched omits the tier.
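
The two regression cases listed above can be sketched with a recording fake. `FakeSession` and `syncServiceTier` are hypothetical stand-ins for the real session wrapper and `applyCurrentConfigToSession`; the sketch only models the guard, and a stored `null` is treated the same as untouched here.

```typescript
type ServiceTier = string | null | undefined

// Hypothetical stand-in for the Codex session wrapper; it only records calls.
class FakeSession {
    readonly calls: Array<string | null> = []
    setServiceTier(tier: string | null): void {
        this.calls.push(tier)
    }
}

// Mirrors the guarded sync described above: seed from the persisted value,
// and only push a tier to the session when it is explicit.
function syncServiceTier(session: FakeSession, persisted: ServiceTier): void {
    const currentServiceTier: ServiceTier = persisted ?? undefined
    if (currentServiceTier !== undefined) {
        session.setServiceTier(currentServiceTier)
    }
}

// Resumed Fast session: the tier is re-asserted exactly once.
const resumed = new FakeSession()
syncServiceTier(resumed, 'fast')
console.log(resumed.calls.length === 1 && resumed.calls[0] === 'fast')

// Untouched session: nothing is emitted, so no keepalive can persist a tier.
const untouched = new FakeSession()
syncServiceTier(untouched, undefined)
console.log(untouched.calls.length === 0)
```

The unguarded `?? null` form would have produced one `null` call in the second case, which is the collapse the review flagged.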

@github-actions github-actions Bot left a comment

Findings

  • [Major] Service tier persistence still collapses explicit Standard into “unset” — the follow-up keeps persisted Fast, but sessionInfo.serviceTier ?? undefined still maps stored null to undefined. Since the new /service-tier path stores explicit Standard as null, a user who turns Fast off loses that explicit Standard choice after restart/resume; later Codex starts/turns omit serviceTier instead of sending app-server null, so account/default Fast can come back. The nullable free-form schema also leaves unsupported tier strings accepted and forwarded. Evidence cli/src/codex/runCodex.ts:90, shared/src/apiTypes.ts:145, hub/src/web/routes/sessions.ts:579.
    Suggested fix:
    // shared: reserve null for untouched; persist explicit Standard distinctly.
    const ServiceTierRequestValueSchema = z.preprocess(
        (value) => value === null ? 'standard' : typeof value === 'string' ? value.trim().toLowerCase() : value,
        z.enum(['fast', 'standard'])
    )
    
    export const SessionServiceTierRequestSchema = z.object({
        serviceTier: ServiceTierRequestValueSchema
    })
    
    type StoredServiceTier = 'fast' | 'standard' | null
    type CodexServiceTier = 'fast' | null | undefined
    
    const toCodexServiceTier = (tier: StoredServiceTier | undefined): CodexServiceTier =>
        tier === 'standard' ? null : tier === 'fast' ? 'fast' : undefined
    Use the stored standard value when persisting/returning applied config, and translate to Codex app-server null only when building thread/start or turn/start params.

Summary

  • Review mode: follow-up after new commits
  • The prior Fast-resume regression is partially fixed, but the persisted representation still cannot distinguish untouched from explicit Standard. That makes Fast-off non-sticky after restart/resume and leaves invalid service-tier strings accepted.

Testing

  • Not run (automation). Add regression coverage for: toggle Standard → persist distinct standard → restart → first Codex turn sends app-server serviceTier: null; and POST /service-tier rejects an unsupported value.

HAPI Bot

Comment thread cli/src/codex/runCodex.ts Outdated
// explicitly. Seed from the persisted session so a resumed Fast thread
// keeps running Fast; a persisted standard/null is treated as untouched so
// we never collapse "unset" into an explicit Standard the user didn't pick.
let currentServiceTier: string | null | undefined = sessionInfo.serviceTier ?? undefined;

[MAJOR] This preserves persisted Fast, but it still treats persisted null as “unset”. The new /service-tier path also uses null for explicit Standard, so after restart/resume Fast-off is no longer sticky: HAPI omits serviceTier instead of sending Codex app-server null, and account/default Fast can come back. Store explicit Standard as a distinct value such as standard, reserve null for untouched, and translate standard to app-server null only when building thread/start / turn/start params.

Suggested fix:

type StoredServiceTier = 'fast' | 'standard' | null
type CodexServiceTier = 'fast' | null | undefined

const toCodexServiceTier = (tier: StoredServiceTier | undefined): CodexServiceTier =>
    tier === 'standard' ? null : tier === 'fast' ? 'fast' : undefined

The Fast toggle was gated on a model-name regex (gpt-5.5/5.4), which still
showed a no-op control to API-key users — Fast credits only apply with
ChatGPT login. Codex's model/list catalog advertises the service tiers
actually available for each model in the current auth/plan context, so gate
on that instead:

- cli: capture serviceTiers (ids) per model in ModelListItem + normalizeModel
- shared: CodexModelSummary.serviceTiers (flows through the existing
  getSessionCodexModels pass-through; no hub change needed)
- web: codexModelAdvertisesFastTier(sessionModel, models) replaces the regex;
  SessionChat gates the toggle on it (hidden while the catalog is
  loading/errored). The toggle now only appears when toggling it will
  actually take effect.

Refs tiann#898

@github-actions github-actions Bot left a comment

Findings

  • [Major] Explicit Standard service tier is still persisted as “unset” — the UI sends null for Standard, the API schema accepts/forwards that same null, and Codex startup converts persisted null to undefined. That means a user who turns Fast off only sends app-server serviceTier: null while the process is alive; after restart/resume HAPI omits serviceTier, so an account/thread default Fast tier can come back. The same loose schema also accepts arbitrary non-empty strings and forwards them as tier ids. Evidence web/src/components/AssistantChat/HappyComposer.tsx:632, shared/src/apiTypes.ts:145, hub/src/web/routes/sessions.ts:579, cli/src/codex/runCodex.ts:90.
    Suggested fix:
    const ServiceTierRequestValueSchema = z.preprocess(
        (value) => value === null ? "standard" : typeof value === "string" ? value.trim().toLowerCase() : value,
        z.enum(["fast", "standard"])
    )
    
    type StoredServiceTier = "fast" | "standard" | null
    type CodexServiceTier = "fast" | null | undefined
    
    const toCodexServiceTier = (tier: StoredServiceTier | undefined): CodexServiceTier =>
        tier === "standard" ? null : tier === "fast" ? "fast" : undefined
    Store/return standard for an explicit Fast-off choice, reserve stored null for untouched, and translate standard to app-server null only when building thread/start / turn/start params.

Summary

  • Review mode: follow-up after new commits
  • The follow-up improves UI gating by reading the Codex model catalog, but the earlier Standard-vs-unset persistence issue remains unresolved. This still makes Fast-off non-sticky across restart/resume.

Testing

  • Not run (automation). Add regression coverage for: Fast on -> Fast off -> persisted standard -> restart/resume -> first Codex turn includes app-server serviceTier: null; and POST /service-tier rejects unsupported strings.

HAPI Bot

Comment thread cli/src/codex/runCodex.ts Outdated
// explicitly. Seed from the persisted session so a resumed Fast thread
// keeps running Fast; a persisted standard/null is treated as untouched so
// we never collapse "unset" into an explicit Standard the user didn't pick.
let currentServiceTier: string | null | undefined = sessionInfo.serviceTier ?? undefined;

[MAJOR] This still treats persisted null as “unset”, while the new Standard option also sends/stores null. After the user disables Fast, HAPI sends app-server serviceTier: null only until the Codex process restarts; on resume this line converts the stored value to undefined, so later thread/start / turn/start params omit the tier and an account/thread Fast default can reappear. Store explicit Standard as a distinct value such as standard, reserve null for untouched, and translate standard to app-server null only at the Codex app-server boundary.

Suggested fix:

type StoredServiceTier = "fast" | "standard" | null
type CodexServiceTier = "fast" | null | undefined

const toCodexServiceTier = (tier: StoredServiceTier | undefined): CodexServiceTier =>
    tier === "standard" ? null : tier === "fast" ? "fast" : undefined

Addresses HAPI Bot [Major] (round 2): a single persisted null conflated
"untouched" with "explicit Standard". A user who turned Fast off persisted
null, but startup mapped null -> undefined (untouched) and omitted serviceTier,
so an account/thread-default Fast could silently return after restart/resume.

Introduce a distinct stored representation:
- 'fast' / 'standard' are explicit user choices; null/undefined = untouched.
- Translate 'standard' -> Codex app-server serviceTier: null ONLY when building
  thread/turn params (toAppServerServiceTier); untouched omits the field.
- /fast off now stores 'standard'; the web Standard option sends 'standard'.
- Tighten SessionServiceTierRequest to enum(['fast','standard']) so stray tier
  strings are never forwarded.

Tests: sticky-Standard-on-resume regression; turn/thread params translate
'standard'->null and omit on untouched; hub route applies fast/standard and
rejects unsupported values + local sessions.
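
A dependency-free sketch of the tightened request check and the route gating described above. The hand-rolled `parseServiceTierRequest` stands in for the enum schema, and `SessionSketch`, its `flavor`/`remote` fields, and the 400 status code are assumptions made for illustration.

```typescript
type StoredServiceTier = 'fast' | 'standard'

// Hand-rolled equivalent of enum(['fast', 'standard']); anything else,
// including null and stray strings, is rejected.
function parseServiceTierRequest(body: unknown): StoredServiceTier | null {
    if (typeof body !== 'object' || body === null) return null
    const value = (body as { serviceTier?: unknown }).serviceTier
    if (typeof value !== 'string') return null
    return value === 'fast' || value === 'standard' ? value : null
}

// Hypothetical session summary; the hub's real session type is richer.
interface SessionSketch {
    flavor: string
    remote: boolean
}

type RouteResult =
    | { status: 200; serviceTier: StoredServiceTier }
    | { status: 400; error: string }

function handleServiceTierRequest(session: SessionSketch, body: unknown): RouteResult {
    // Gate first: only remote Codex sessions can change the tier.
    if (session.flavor !== 'codex' || !session.remote) {
        return { status: 400, error: 'service tier requires a remote Codex session' }
    }
    const tier = parseServiceTierRequest(body)
    if (tier === null) {
        return { status: 400, error: 'serviceTier must be "fast" or "standard"' }
    }
    return { status: 200, serviceTier: tier }
}
```

Because the stored value is now always one of two strings, "untouched" is represented purely by the absence of a stored value and can no longer be confused with an explicit Standard.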

Refs tiann#898

@github-actions github-actions Bot left a comment

Findings

  • No issues found in the latest diff with confidence >= 80%.

Summary

  • Review mode: follow-up after new commits
  • The previous Standard-vs-untouched service tier issue appears resolved: the web/API path now sends standard, the hub rejects unsupported tier strings, persistence keeps the sentinel distinct from null, and the Codex app-server boundary translates standard to serviceTier: null only when building thread/turn params.
  • Residual risk: I did not run the full suites in this automation pass, so this is a static review only.

Testing

  • Not run (automation)

HAPI Bot

…catalog gate

Live E2E against an authed Codex session revealed the model catalog advertises
the Fast tier with id 'priority' and display name 'Fast' (not id 'fast'), so the
/fast/i gate — which only saw tier ids — wrongly hid the toggle for valid
ChatGPT users on gpt-5.5/gpt-5.4. Capture both the tier id and name as
lowercased tokens so the existing name-based match recognizes 'Fast'. The sent
value stays 'fast' (the documented service_tier value / raw additionalSpeedTiers
request tier). Verified end-to-end: gpt-5.5/gpt-5.4 gate on, gpt-5.4-mini off.
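
The token-based gate described above might look roughly like this. The interfaces and function names are hypothetical simplifications of `ModelListItem`, `normalizeModel` and `codexModelAdvertisesFastTier`; passing `undefined` for the catalog models stands for the loading or errored state.

```typescript
// Hypothetical shapes: the real catalog entries carry more fields.
interface RawServiceTier {
    id: string
    name?: string
}

interface ModelSummarySketch {
    id: string
    serviceTiers: string[]
}

// Collect both the tier id and its display name as lowercased tokens, so a
// tier advertised as id 'priority' / name 'Fast' still yields a 'fast' token.
function normalizeServiceTiers(tiers: RawServiceTier[]): string[] {
    const tokens = new Set<string>()
    for (const tier of tiers) {
        tokens.add(tier.id.toLowerCase())
        if (tier.name) tokens.add(tier.name.toLowerCase())
    }
    return Array.from(tokens)
}

// undefined models = catalog still loading or errored, so the toggle stays hidden.
function modelAdvertisesFastTier(
    sessionModel: string | null,
    models: ModelSummarySketch[] | undefined
): boolean {
    if (!sessionModel || !models) return false
    const model = models.find((m) => m.id === sessionModel)
    return model !== undefined && model.serviceTiers.some((token) => /fast/i.test(token))
}

const catalog: ModelSummarySketch[] = [
    { id: 'gpt-5.5', serviceTiers: normalizeServiceTiers([{ id: 'priority', name: 'Fast' }]) },
    { id: 'gpt-5.4-mini', serviceTiers: normalizeServiceTiers([]) },
]
console.log(modelAdvertisesFastTier('gpt-5.5', catalog)) // true
console.log(modelAdvertisesFastTier('gpt-5.4-mini', catalog)) // false
```

An id-only version of `normalizeServiceTiers` would yield just `['priority']` for gpt-5.5, which is why the earlier gate hid the toggle.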

Refs tiann#898

@github-actions github-actions Bot left a comment

Findings

  • No issues found in the latest diff with confidence >= 80%.

Summary

  • Review mode: follow-up after new commits
  • Re-reviewed the full PR diff, including the follow-up service-tier catalog/token changes and the Codex Fast mode path across web, hub persistence/RPC, shared schemas, CLI keepalive, slash commands, and Codex app-server request builders.
  • Residual risk: this was a static review pass; I did not run the test suites.

Testing

  • Not run (automation)

HAPI Bot

Resuming a Codex session spawns a fresh session (serviceTier null) and merges
the old one in. Unlike model/effort/permissionMode, serviceTier was neither
threaded through the resume spawn nor preserved in mergeSessionData, so a
resumed Fast (or explicit Standard) session silently reverted to the account
default.

Thread serviceTier through the spawn path like its siblings:
- hub: resumeSession passes session.serviceTier to spawnSession; rpcGateway +
  syncEngine carry it in the spawn RPC payload; mergeSessionData preserves it
  old->new (safety net).
- cli: SpawnSessionOptions.serviceTier; apiMachine forwards it; buildCliArgs
  emits --service-tier for codex; the codex command parses it; runCodex seeds
  currentServiceTier from the spawn override first (opts.serviceTier ??
  sessionInfo.serviceTier), so a resumed thread immediately runs the right tier.

Verified end-to-end: set Fast -> kill process -> reopen -> resumed session (new
id) still runs Fast. Tests: buildCliArgs --service-tier (codex only), runCodex
spawn-override seed, mergeSessionData service-tier preservation.
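
A compact sketch of the two CLI-side pieces above: emitting the flag for Codex only, and seeding with the spawn override taking precedence. `SpawnOptionsSketch`, the `agent` field, the `'claude'` example value, and the argument layout are assumptions; only the `--service-tier` flag name and the `opts.serviceTier ?? sessionInfo.serviceTier` order come from the commit message.

```typescript
type StoredServiceTier = 'fast' | 'standard'

// Hypothetical subset of SpawnSessionOptions.
interface SpawnOptionsSketch {
    agent: string
    serviceTier?: StoredServiceTier | null
}

function buildCliArgsSketch(options: SpawnOptionsSketch): string[] {
    // Assumption: the agent name is the first argument.
    const args: string[] = [options.agent]
    // The flag is only meaningful for Codex; other agents never receive it.
    if (options.agent === 'codex' && options.serviceTier) {
        args.push('--service-tier', options.serviceTier)
    }
    return args
}

// Spawn override first, then the persisted session value, else untouched.
function seedServiceTier(
    spawnOverride: StoredServiceTier | null | undefined,
    persisted: StoredServiceTier | null | undefined
): StoredServiceTier | undefined {
    return spawnOverride ?? persisted ?? undefined
}

console.log(buildCliArgsSketch({ agent: 'codex', serviceTier: 'fast' }).join(' '))
// codex --service-tier fast
console.log(buildCliArgsSketch({ agent: 'claude', serviceTier: 'fast' }).join(' '))
// claude
```

Seeding from the override first matters because the resumed session has a new id and no stored tier yet at the moment the first turn is built.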

Refs tiann#898

@github-actions github-actions Bot left a comment

Findings

  • [Major] Fast mode sends the wrong app-server tier value — the new catalog normalization says the Codex Fast tier's catalog id/request value is priority, but the request builder translates HAPI's stored fast state into serviceTier: 'fast'. That means /fast and the UI toggle can persist as Fast while thread/start / turn/start do not request the advertised app-server tier. Evidence cli/src/codex/utils/appServerConfig.ts:60.
    Suggested fix:
    function toAppServerServiceTier(stored: string | null | undefined): string | null | undefined {
        if (stored === 'fast') {
            return 'priority';
        }
        if (stored === 'standard') {
            return null;
        }
        return undefined;
    }

Summary

  • Review mode: follow-up after new commits
  • Re-reviewed the full PR diff and the follow-up resume propagation path. Found one app-server wire-value mismatch in the Fast-mode request translation.

Testing

  • Not run (automation)

HAPI Bot

Comment thread cli/src/codex/utils/appServerConfig.ts Outdated
 */
function toAppServerServiceTier(stored: string | null | undefined): string | null | undefined {
    if (stored === 'fast') {
        return 'fast';

[MAJOR] Fast mode is persisted internally as fast, but this boundary is where that sentinel becomes the Codex app-server serviceTier value. The new catalog handling documents/tests the real Fast tier as id/request value priority, while this line sends fast, so /fast and the UI toggle can show/persist Fast without requesting the advertised tier on thread/start / turn/start.

Suggested fix:

function toAppServerServiceTier(stored: string | null | undefined): string | null | undefined {
    if (stored === 'fast') {
        return 'priority';
    }
    if (stored === 'standard') {
        return null;
    }
    return undefined;
}

The model catalog advertises the Fast tier with request id 'priority' (display
name 'Fast'), and OpenAI docs confirm service_tier='fast' maps to the request
value 'priority'. The app-server serviceTier override is a raw request value
that does not validate unknown strings (a live probe accepted 'bogus-xyz'), so
sending 'fast' risks being silently ignored — no Fast applied.

Translate the stored 'fast' state to app-server 'priority' at the thread/turn
param boundary (toAppServerServiceTier); the stored/UI/command representation
stays 'fast'/'standard'. Verified live: a turn with serviceTier='priority' runs
and consumes the Fast-tier rate budget.

Addresses HAPI Bot [Major]. Refs tiann#898

@github-actions github-actions Bot left a comment

Findings

  • [Minor] Validate CLI service-tier values before starting Codex — the web endpoint only accepts fast/standard, but the new --service-tier CLI path accepts any non-empty string. runCodex then seeds currentServiceTier from that option and the keepalive path can persist the invalid value, while app-server param building silently omits unknown tiers. Evidence cli/src/commands/codex.ts:80.
    Suggested fix:
    function parseServiceTier(value: string): 'fast' | 'standard' {
        const normalized = value.trim().toLowerCase()
        if (normalized === 'fast' || normalized === 'standard') {
            return normalized
        }
        throw new Error('Invalid --service-tier value')
    }
    
    // ...
    } else if (arg === '--service-tier') {
        const tier = commandArgs[++i]
        if (!tier) {
            throw new Error('Missing --service-tier value')
        }
        options.serviceTier = parseServiceTier(tier)
    }

Summary

  • Review mode: follow-up after new commits
  • Re-reviewed the full PR diff. The prior fast vs app-server priority finding appears addressed; one remaining ingress can still store unsupported service-tier values.

Testing

  • Not run (automation; local runner does not have bun installed)

HAPI Bot

Comment thread cli/src/commands/codex.ts
        throw new Error('Missing --model-reasoning-effort value')
    }
    options.modelReasoningEffort = parseReasoningEffort(effort)
} else if (arg === '--service-tier') {

[MINOR] The web endpoint validates serviceTier to only fast/standard, but this new CLI flag accepts any non-empty string. Since runCodex seeds currentServiceTier directly from opts.serviceTier, the keepalive path can persist an unsupported value like turbo, while toAppServerServiceTier later omits it from app-server requests. Validate this ingress before assigning it.

Suggested fix:

function parseServiceTier(value: string): 'fast' | 'standard' {
    const normalized = value.trim().toLowerCase()
    if (normalized === 'fast' || normalized === 'standard') {
        return normalized
    }
    throw new Error('Invalid --service-tier value')
}

// ...
} else if (arg === '--service-tier') {
    const tier = commandArgs[++i]
    if (!tier) {
        throw new Error('Missing --service-tier value')
    }
    options.serviceTier = parseServiceTier(tier)
}

Addresses HAPI Bot [Minor]: the internal --service-tier spawn arg accepted any
non-empty string, unlike the web /service-tier enum, so a malformed value could
be seeded into currentServiceTier and persisted via keepalive. Parse it to
'fast'|'standard' and reject anything else, matching the web endpoint.

Refs tiann#898

@github-actions github-actions Bot left a comment

Findings

  • No issues found.

Summary

  • Review mode: follow-up after new commits
  • Re-reviewed the full current PR diff. The previous CLI --service-tier validation finding is addressed in the latest head. Residual risk: I could not run local typecheck/tests in this environment.

Testing

  • Not run (automation; bun is not installed in this runner)

HAPI Bot

Development

Successfully merging this pull request may close these issues.

Feature: Fast mode toggle for Codex and Claude Code sessions