You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The Polar codebase (polar/postgres.py, polar/worker/_sqlalchemy.py, polar/kit/routing.py, polar/middlewares.py, polar/worker/_enqueue.py)
Use this as a checklist when auditing your own repo.
The one rule
session.commit() belongs in lifecycle / infrastructure code. Never in business logic.
Business logic (services, repositories, endpoint handlers) just does session.add(...). Commit happens exactly once, at the outermost scope, owned by a framework-level context manager.
Rationale: a Session wraps a SQL transaction. commit() is one-way — there is no "uncommit." If you call it mid-flow and something later fails, you're left with a partially-written DB and no way back.
The official docs state this as two rules:
Keep the session lifecycle separate and external from functions that access data.
Keep transactions short — end them after a logical unit of work, not held open indefinitely.
Session scope vs transaction scope
These are two distinct things and the docs are explicit about it:
A Session is a long-lived container for ORM state (identity map, pending changes).
A transaction is the SQL-level BEGIN…COMMIT around a logical operation.
A session starts a transaction the first time it talks to the DB. That transaction ends on commit(), rollback(), or close(). After it ends, the session can start a new transaction. So one session can span many transactions sequentially — but only one at a time.
In practice (for web apps and workers) you usually make these scopes identical: one session, one transaction, one commit, then close. That's the shape Polar uses.
What goes where
Call
Who calls it
Why
session.add(obj)
business logic (service/repo)
stage a change in the unit-of-work
session.flush()
business logic, only when needed
send pending SQL to the DB so a later SELECT in the same tx can see your writes — still rolls back on failure
session.commit()
framework layer only — one place per request / task
finalize the transaction
session.rollback()
framework layer, on exception
abort cleanly
session.close() / context manager exit
framework layer
return the connection to the pool
If you think you need flush(), check first: SQLAlchemy autoflush is on by default and fires before every execute(), query, lazy load, and session.merge(). Most explicit flush() calls are redundant. Don't turn autoflush off globally.
Facts worth internalising (straight from the docs):
session.commit()unconditionally runs a flush before emitting COMMIT. You never need flush() immediately before commit().
session.begin_nested() also flushes unconditionally.
If you need to suppress autoflush for a tight block (e.g. constructing a half-built object graph), use the context manager with session.no_autoflush: rather than disabling autoflush on the sessionmaker.
Two idiomatic lifecycle patterns from the docs
(a) Bare session, explicit commit
withSession(engine) assession:
ThingOne().go(session)
ThingTwo().go(session)
session.commit() # or rollback on error
(b) session.begin() block — auto-commits on success, auto-rolls back on exception
withSession(engine) assession, session.begin():
ThingOne().go(session)
ThingTwo().go(session)
# inner `begin()` commits on normal exit; outer `Session(...)` closes
Polar doesn't use (b) directly because it needs the session available as a FastAPI dependency across many handler functions, but the principle is the same: the commit is owned by the context manager, not by the inner calls.
The official anti-pattern (don't do this)
classThingOne:
defgo(self):
session=Session() # ❌ creates its own sessiontry:
session.execute(...)
session.commit() # ❌ commits inside business logicexcept:
session.rollback()
raise
Two smells in one: the function builds its own session and controls its transaction. Callers lose the ability to compose ThingOne().go() with other work in a single unit.
Where the single commit lives (per process type)
HTTP request (FastAPI / Starlette)
Two-layer setup:
An ASGI middleware opens the session and stores it on the request scope.
A FastAPI dependency yields the session and commits on normal teardown, rolls back on exception.
Scripts don't run through the middleware or the worker harness, so they own their session lifecycle and must commit themselves. That's fine — they are the framework layer. The docs explicitly sanction "single global session, commit when the task is done" for CLI tools.
Read-only endpoints
The docs note that for web apps the natural place to commit is at the end of a POST/PUT/DELETE. GETs don't need a commit — they don't write.
Polar reflects this with two dependencies:
get_db_session → read-write session, commits on teardown.
get_db_read_session → read session (routed to the read replica when configured), no commit in its teardown. It just yields the session; the context manager closes it.
Rule of thumb: if your endpoint doesn't mutate, use a read session and don't commit. If you reach for a commit inside a read-only endpoint, something is off.
Consider expire_on_commit=False
The docs call this out specifically for web apps: by default, after commit() SQLAlchemy expires every instance in the session, so the next access to any attribute triggers a fresh SELECT. If you commit before serializing the response (as Polar's AutoCommitAPIRoute does in polar/kit/routing.py), attribute access during serialization will re-query. That's usually wasted round-trips.
sessionmaker(expire_on_commit=False) avoids it. The tradeoff: attribute values no longer guarantee "reflects the latest DB state" after commit — fine for a request that's already handing back a response and closing.
The notification-before-commit race (and how to kill it)
The problem: service code does
repo.create(Order(...))
send_webhook_to_customer(...) # HTTP POST
The customer receives the webhook, hits your API, and sees nothing — the order hasn't been committed yet.
The fix: don't dispatch side effects directly. Buffer them, and have the framework layer flush the buffer after commit.
Polar does this with a JobQueueManager that lives in a contextvars.ContextVar:
# in business code — looks like an immediate enqueue, but isn'tenqueue_job("webhook_event.send", webhook_event_id=event.id)
# in the framework layer — wraps the request:asyncwithJobQueueManager.open(broker, redis):
awaitself.app(scope, receive, send) # inside here: commit happens# ↑ on exit, flushes buffered jobs to Redis — AFTER the commit
Middleware stack ordering matters. The session middleware wraps the flush middleware:
AsyncSessionMiddleware ← session active
FlushEnqueuedWorkerJobsMiddleware ← buffer active
endpoint → Depends(get_db_session) → commits on teardown
← flush buffer to Redis
← close session
Same pattern in the worker: the task decorator opens the JobQueueManageroutsideAsyncSessionMaker(), so session commits before jobs flush.
The property this gives you: any job that reaches Redis is for data that is already committed. The worker that picks it up, or the customer who receives the resulting webhook, can always read-after-write.
Legitimate exceptions (where an explicit commit is OK)
Keep this list short. Every one of these needs a comment explaining why.
Long-lived connection after a short DB hit. SSE/WebSocket endpoints that do a quick lookup and then stream for minutes. Commit and release the pooled connection before entering the stream.
checkout=awaitcheckout_service.get_by_client_secret(session, client_secret)
awaitsession.commit() # release the session before SSEreturnEventSourceResponse(subscribe(...))
Durability across a retry/generator boundary. A task that may raise a framework-level Retry exception needs to persist audit/attempt rows before the outer context rolls back.
Streaming / batched jobs. Long-running generator that yields progress, or a batched UPDATE ... LIMIT N loop. Commit per batch to release row locks and make incremental progress.
Savepoints.await session.begin_nested() returns a SAVEPOINT. nested.commit() commits the savepoint only; the outer transaction is unaffected. This is a different operation from session.commit() — fine to use.
Scripts. Already covered above — they are the framework layer.
Not a legitimate exception: "I want to fire a hook / notification / email after this state change is durable." That's the job of the buffered-job-queue pattern (previous section), not an inline commit.
Smells to flag in code review
session.commit() in a service, repository, or endpoint body.
session.commit() followed by more DB writes in the same function. (The writes after the commit are in a second, separate transaction — usually a bug.)
session.commit() immediately before calling a hook/webhook/email. Use the buffered-job pattern instead.
session.commit() inside a loop with no comment. Either it's a legitimate batch (see Update channels to 2.2.0 #3) or it's a foot-gun.
Disabling autoflush. Almost never the right answer.
session.flush() followed by a SELECT on the flushed rows without a clear reason. Autoflush already does this.
Business code that takes a session parameter and calls sessionmaker() to make its own session. Now you have two transactions and no coherent unit of work.
Minimal audit checklist for your repo
rg '\.commit\(' --type py — how many hits? Each one should map to an entry in the "legitimate exceptions" list or the framework layer.
Is there exactly one request-lifecycle commit point (middleware or dependency)?
Is there exactly one worker/task commit point (context manager or decorator)?
Do services and repositories ever touch commit / rollback? They shouldn't.
Where do you dispatch webhooks / emails / cross-service calls? Are those dispatched inline inside services, or buffered and flushed by the framework layer after commit?
Do you have any code paths where a commit happens and then more DB writes follow in the same function? Audit each one.
Is autoflush enabled on your sessionmaker? Any autoflush=False needs justification.
Do business functions ever construct their own session (Session(), sessionmaker()(), etc.)? They shouldn't — session should be passed in.
Read-only endpoints: do they use a read-only session path, or do they still go through the write/commit dependency? Split them.
Have you considered expire_on_commit=False for your web sessionmaker? Check whether your post-commit response serialization is re-querying unnecessarily.
If all ten answers are clean, the commit story in your repo is probably fine.
SQLAlchemy: commit / flush / session lifecycle — a working guide
Distilled from:
session.commit()— https://www.fvoron.com/blog/sqlalchemy-stop-calling-session-commit/polar/postgres.py,polar/worker/_sqlalchemy.py,polar/kit/routing.py,polar/middlewares.py,polar/worker/_enqueue.py)Use this as a checklist when auditing your own repo.
The one rule
Business logic (services, repositories, endpoint handlers) just does
session.add(...). Commit happens exactly once, at the outermost scope, owned by a framework-level context manager.Rationale: a
Sessionwraps a SQL transaction.commit()is one-way — there is no "uncommit." If you call it mid-flow and something later fails, you're left with a partially-written DB and no way back.The official docs state this as two rules:
Session scope vs transaction scope
These are two distinct things and the docs are explicit about it:
A session starts a transaction the first time it talks to the DB. That transaction ends on
commit(),rollback(), orclose(). After it ends, the session can start a new transaction. So one session can span many transactions sequentially — but only one at a time.In practice (for web apps and workers) you usually make these scopes identical: one session, one transaction, one commit, then close. That's the shape Polar uses.
What goes where
session.add(obj)session.flush()session.commit()session.rollback()session.close()/ context manager exitIf you think you need
flush(), check first: SQLAlchemy autoflush is on by default and fires before everyexecute(), query, lazy load, andsession.merge(). Most explicitflush()calls are redundant. Don't turn autoflush off globally.Facts worth internalising (straight from the docs):
session.commit()unconditionally runs a flush before emitting COMMIT. You never needflush()immediately beforecommit().session.begin_nested()also flushes unconditionally.with session.no_autoflush:rather than disabling autoflush on the sessionmaker.Two idiomatic lifecycle patterns from the docs
(a) Bare session, explicit commit
(b)
session.begin()block — auto-commits on success, auto-rolls back on exceptionPolar doesn't use (b) directly because it needs the session available as a FastAPI dependency across many handler functions, but the principle is the same: the commit is owned by the context manager, not by the inner calls.
The official anti-pattern (don't do this)
Two smells in one: the function builds its own session and controls its transaction. Callers lose the ability to compose
ThingOne().go()with other work in a single unit.Where the single commit lives (per process type)
HTTP request (FastAPI / Starlette)
Two-layer setup:
Endpoints and services just take
session: AsyncSession = Depends(get_db_session)and never touch commit/rollback themselves.Background worker (Dramatiq / Celery / etc.)
One-off scripts / CLI tools
Scripts don't run through the middleware or the worker harness, so they own their session lifecycle and must commit themselves. That's fine — they are the framework layer. The docs explicitly sanction "single global session, commit when the task is done" for CLI tools.
Read-only endpoints
The docs note that for web apps the natural place to commit is at the end of a POST/PUT/DELETE. GETs don't need a commit — they don't write.
Polar reflects this with two dependencies:
get_db_session→ read-write session, commits on teardown.get_db_read_session→ read session (routed to the read replica when configured), no commit in its teardown. It just yields the session; the context manager closes it.Rule of thumb: if your endpoint doesn't mutate, use a read session and don't commit. If you reach for a commit inside a read-only endpoint, something is off.
Consider
expire_on_commit=FalseThe docs call this out specifically for web apps: by default, after
commit()SQLAlchemy expires every instance in the session, so the next access to any attribute triggers a fresh SELECT. If you commit before serializing the response (as Polar'sAutoCommitAPIRoutedoes inpolar/kit/routing.py), attribute access during serialization will re-query. That's usually wasted round-trips.sessionmaker(expire_on_commit=False)avoids it. The tradeoff: attribute values no longer guarantee "reflects the latest DB state" after commit — fine for a request that's already handing back a response and closing.The notification-before-commit race (and how to kill it)
The problem: service code does
The customer receives the webhook, hits your API, and sees nothing — the order hasn't been committed yet.
The fix: don't dispatch side effects directly. Buffer them, and have the framework layer flush the buffer after commit.
Polar does this with a
JobQueueManagerthat lives in acontextvars.ContextVar:Middleware stack ordering matters. The session middleware wraps the flush middleware:
Same pattern in the worker: the task decorator opens the
JobQueueManageroutsideAsyncSessionMaker(), so session commits before jobs flush.The property this gives you: any job that reaches Redis is for data that is already committed. The worker that picks it up, or the customer who receives the resulting webhook, can always read-after-write.
Legitimate exceptions (where an explicit commit is OK)
Keep this list short. Every one of these needs a comment explaining why.
Long-lived connection after a short DB hit. SSE/WebSocket endpoints that do a quick lookup and then stream for minutes. Commit and release the pooled connection before entering the stream.
Durability across a retry/generator boundary. A task that may raise a framework-level
Retryexception needs to persist audit/attempt rows before the outer context rolls back.Streaming / batched jobs. Long-running generator that yields progress, or a batched
UPDATE ... LIMIT Nloop. Commit per batch to release row locks and make incremental progress.Savepoints.
await session.begin_nested()returns aSAVEPOINT.nested.commit()commits the savepoint only; the outer transaction is unaffected. This is a different operation fromsession.commit()— fine to use.Scripts. Already covered above — they are the framework layer.
Not a legitimate exception: "I want to fire a hook / notification / email after this state change is durable." That's the job of the buffered-job-queue pattern (previous section), not an inline commit.
Smells to flag in code review
session.commit()in a service, repository, or endpoint body.session.commit()followed by more DB writes in the same function. (The writes after the commit are in a second, separate transaction — usually a bug.)session.commit()immediately before calling a hook/webhook/email. Use the buffered-job pattern instead.session.commit()inside a loop with no comment. Either it's a legitimate batch (see Update channels to 2.2.0 #3) or it's a foot-gun.autoflush. Almost never the right answer.session.flush()followed by a SELECT on the flushed rows without a clear reason. Autoflush already does this.sessionparameter and callssessionmaker()to make its own session. Now you have two transactions and no coherent unit of work.Minimal audit checklist for your repo
rg '\.commit\(' --type py— how many hits? Each one should map to an entry in the "legitimate exceptions" list or the framework layer.commit/rollback? They shouldn't.autoflushenabled on your sessionmaker? Anyautoflush=Falseneeds justification.Session(),sessionmaker()(), etc.)? They shouldn't — session should be passed in.expire_on_commit=Falsefor your web sessionmaker? Check whether your post-commit response serialization is re-querying unnecessarily.If all ten answers are clean, the commit story in your repo is probably fine.