Skip to content

Ensure each session.commit() call is either handled by lifecycle (framework/worker) or is an exception #584

@MrNaif2018

Description

@MrNaif2018

SQLAlchemy: commit / flush / session lifecycle — a working guide

Distilled from:

Use this as a checklist when auditing your own repo.


The one rule

session.commit() belongs in lifecycle / infrastructure code. Never in business logic.

Business logic (services, repositories, endpoint handlers) just does session.add(...). Commit happens exactly once, at the outermost scope, owned by a framework-level context manager.

Rationale: a Session wraps a SQL transaction. commit() is one-way — there is no "uncommit." If you call it mid-flow and something later fails, you're left with a partially-written DB and no way back.

The official docs state this as two rules:

  1. Keep the session lifecycle separate and external from functions that access data.
  2. Keep transactions short — end them after a logical unit of work, not held open indefinitely.

Session scope vs transaction scope

These are two distinct things and the docs are explicit about it:

  • A Session is a long-lived container for ORM state (identity map, pending changes).
  • A transaction is the SQL-level BEGIN…COMMIT around a logical operation.

A session starts a transaction the first time it talks to the DB. That transaction ends on commit(), rollback(), or close(). After it ends, the session can start a new transaction. So one session can span many transactions sequentially — but only one at a time.

In practice (for web apps and workers) you usually make these scopes identical: one session, one transaction, one commit, then close. That's the shape Polar uses.


What goes where

Call Who calls it Why
session.add(obj) business logic (service/repo) stage a change in the unit-of-work
session.flush() business logic, only when needed send pending SQL to the DB so a later SELECT in the same tx can see your writes — still rolls back on failure
session.commit() framework layer only — one place per request / task finalize the transaction
session.rollback() framework layer, on exception abort cleanly
session.close() / context manager exit framework layer return the connection to the pool

If you think you need flush(), check first: SQLAlchemy autoflush is on by default and fires before every execute(), query, lazy load, and session.merge(). Most explicit flush() calls are redundant. Don't turn autoflush off globally.

Facts worth internalising (straight from the docs):

  • session.commit() unconditionally runs a flush before emitting COMMIT. You never need flush() immediately before commit().
  • session.begin_nested() also flushes unconditionally.
  • If you need to suppress autoflush for a tight block (e.g. constructing a half-built object graph), use the context manager with session.no_autoflush: rather than disabling autoflush on the sessionmaker.

Two idiomatic lifecycle patterns from the docs

(a) Bare session, explicit commit

with Session(engine) as session:
    ThingOne().go(session)
    ThingTwo().go(session)
    session.commit()   # or rollback on error

(b) session.begin() block — auto-commits on success, auto-rolls back on exception

with Session(engine) as session, session.begin():
    ThingOne().go(session)
    ThingTwo().go(session)
# inner `begin()` commits on normal exit; outer `Session(...)` closes

Polar doesn't use (b) directly because it needs the session available as a FastAPI dependency across many handler functions, but the principle is the same: the commit is owned by the context manager, not by the inner calls.

The official anti-pattern (don't do this)

class ThingOne:
    def go(self):
        session = Session()       # ❌ creates its own session
        try:
            session.execute(...)
            session.commit()      # ❌ commits inside business logic
        except:
            session.rollback()
            raise

Two smells in one: the function builds its own session and controls its transaction. Callers lose the ability to compose ThingOne().go() with other work in a single unit.


Where the single commit lives (per process type)

HTTP request (FastAPI / Starlette)

Two-layer setup:

  1. An ASGI middleware opens the session and stores it on the request scope.
  2. A FastAPI dependency yields the session and commits on normal teardown, rolls back on exception.
# middleware — creates the session for the request
class AsyncSessionMiddleware:
    async def __call__(self, scope, receive, send):
        sessionmaker = scope["state"]["async_sessionmaker"]
        async with sessionmaker() as session:
            scope["state"]["async_session"] = session
            await self.app(scope, receive, send)

# dependency — commits / rolls back
async def get_db_session(request) -> AsyncGenerator[AsyncSession]:
    session = request.state.async_session
    try:
        yield session
    except:
        await session.rollback()
        raise
    else:
        await session.commit()

Endpoints and services just take session: AsyncSession = Depends(get_db_session) and never touch commit/rollback themselves.

Background worker (Dramatiq / Celery / etc.)

@contextlib.asynccontextmanager
async def AsyncSessionMaker() -> AsyncIterator[AsyncSession]:
    async with sessionmaker() as session:
        try:
            yield session
        except:
            await session.rollback()
            raise
        else:
            await session.commit()

# each task:
async def my_task(...):
    async with AsyncSessionMaker() as session:
        # ... add / flush / read ... no commit

One-off scripts / CLI tools

Scripts don't run through the middleware or the worker harness, so they own their session lifecycle and must commit themselves. That's fine — they are the framework layer. The docs explicitly sanction "single global session, commit when the task is done" for CLI tools.

Read-only endpoints

The docs note that for web apps the natural place to commit is at the end of a POST/PUT/DELETE. GETs don't need a commit — they don't write.

Polar reflects this with two dependencies:

  • get_db_session → read-write session, commits on teardown.
  • get_db_read_session → read session (routed to the read replica when configured), no commit in its teardown. It just yields the session; the context manager closes it.

Rule of thumb: if your endpoint doesn't mutate, use a read session and don't commit. If you reach for a commit inside a read-only endpoint, something is off.

Consider expire_on_commit=False

The docs call this out specifically for web apps: by default, after commit() SQLAlchemy expires every instance in the session, so the next access to any attribute triggers a fresh SELECT. If you commit before serializing the response (as Polar's AutoCommitAPIRoute does in polar/kit/routing.py), attribute access during serialization will re-query. That's usually wasted round-trips.

sessionmaker(expire_on_commit=False) avoids it. The tradeoff: attribute values no longer guarantee "reflects the latest DB state" after commit — fine for a request that's already handing back a response and closing.


The notification-before-commit race (and how to kill it)

The problem: service code does

repo.create(Order(...))
send_webhook_to_customer(...)   # HTTP POST

The customer receives the webhook, hits your API, and sees nothing — the order hasn't been committed yet.

The fix: don't dispatch side effects directly. Buffer them, and have the framework layer flush the buffer after commit.

Polar does this with a JobQueueManager that lives in a contextvars.ContextVar:

# in business code — looks like an immediate enqueue, but isn't
enqueue_job("webhook_event.send", webhook_event_id=event.id)

# in the framework layer — wraps the request:
async with JobQueueManager.open(broker, redis):
    await self.app(scope, receive, send)   # inside here: commit happens
# ↑ on exit, flushes buffered jobs to Redis — AFTER the commit

Middleware stack ordering matters. The session middleware wraps the flush middleware:

AsyncSessionMiddleware             ← session active
  FlushEnqueuedWorkerJobsMiddleware  ← buffer active
    endpoint → Depends(get_db_session) → commits on teardown
  ← flush buffer to Redis
← close session

Same pattern in the worker: the task decorator opens the JobQueueManager outside AsyncSessionMaker(), so session commits before jobs flush.

The property this gives you: any job that reaches Redis is for data that is already committed. The worker that picks it up, or the customer who receives the resulting webhook, can always read-after-write.


Legitimate exceptions (where an explicit commit is OK)

Keep this list short. Every one of these needs a comment explaining why.

  1. Long-lived connection after a short DB hit. SSE/WebSocket endpoints that do a quick lookup and then stream for minutes. Commit and release the pooled connection before entering the stream.

    checkout = await checkout_service.get_by_client_secret(session, client_secret)
    await session.commit()   # release the session before SSE
    return EventSourceResponse(subscribe(...))
  2. Durability across a retry/generator boundary. A task that may raise a framework-level Retry exception needs to persist audit/attempt rows before the outer context rolls back.

    try:
        response = await client.post(url, ...)
    except HTTPError:
        raise Retry(...)
    finally:
        session.add(delivery_attempt_record)
        await session.commit()   # survive the Retry
  3. Streaming / batched jobs. Long-running generator that yields progress, or a batched UPDATE ... LIMIT N loop. Commit per batch to release row locks and make incremental progress.

  4. Savepoints. await session.begin_nested() returns a SAVEPOINT. nested.commit() commits the savepoint only; the outer transaction is unaffected. This is a different operation from session.commit() — fine to use.

  5. Scripts. Already covered above — they are the framework layer.

Not a legitimate exception: "I want to fire a hook / notification / email after this state change is durable." That's the job of the buffered-job-queue pattern (previous section), not an inline commit.


Smells to flag in code review

  • session.commit() in a service, repository, or endpoint body.
  • session.commit() followed by more DB writes in the same function. (The writes after the commit are in a second, separate transaction — usually a bug.)
  • session.commit() immediately before calling a hook/webhook/email. Use the buffered-job pattern instead.
  • session.commit() inside a loop with no comment. Either it's a legitimate batch (see Update channels to 2.2.0 #3) or it's a foot-gun.
  • Disabling autoflush. Almost never the right answer.
  • session.flush() followed by a SELECT on the flushed rows without a clear reason. Autoflush already does this.
  • Business code that takes a session parameter and calls sessionmaker() to make its own session. Now you have two transactions and no coherent unit of work.

Minimal audit checklist for your repo

  1. rg '\.commit\(' --type py — how many hits? Each one should map to an entry in the "legitimate exceptions" list or the framework layer.
  2. Is there exactly one request-lifecycle commit point (middleware or dependency)?
  3. Is there exactly one worker/task commit point (context manager or decorator)?
  4. Do services and repositories ever touch commit / rollback? They shouldn't.
  5. Where do you dispatch webhooks / emails / cross-service calls? Are those dispatched inline inside services, or buffered and flushed by the framework layer after commit?
  6. Do you have any code paths where a commit happens and then more DB writes follow in the same function? Audit each one.
  7. Is autoflush enabled on your sessionmaker? Any autoflush=False needs justification.
  8. Do business functions ever construct their own session (Session(), sessionmaker()(), etc.)? They shouldn't — session should be passed in.
  9. Read-only endpoints: do they use a read-only session path, or do they still go through the write/commit dependency? Split them.
  10. Have you considered expire_on_commit=False for your web sessionmaker? Check whether your post-commit response serialization is re-querying unnecessarily.

If all ten answers are clean, the commit story in your repo is probably fine.

Metadata

Metadata

Assignees

No one assigned

    Labels

    priority: highBlocks a release or major workflowtype: enhancementImprovement to existing functionality

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions