Description
Kubernetes relies on the /health endpoint to monitor Pod status. Currently, this endpoint is only available after Keycloak finishes initializing. Depending on the release and database size, initialization may take a significant amount of time; it can include database schema changes, column or index updates, or data migration to a new internal format.
During these long-running initializations, Kubernetes may determine that the Pod is unresponsive and restart it. This can create a restart loop where the migration never completes, leaving the cluster stuck. Users currently work around this by configuring large initialDelaySeconds or startupProbe timeouts, but these values are hard to estimate and vary depending on database size and migration complexity.
We should improve this by decoupling the health endpoint from the full initialization lifecycle, making it available early while slow tasks continue in the background.
Value Proposition
- Prevents restart loops during migrations: The health endpoint becomes available early, preventing Kubernetes from killing Pods before long-running migrations complete.
- Eliminates guesswork around probe timeouts: Users no longer need to configure large initialDelaySeconds or startupProbe timeouts to account for unpredictable migration durations.
Goals
- Expose the health endpoint as early as possible in the startup lifecycle.
- Run KeycloakSessionFactory initialization, Liquibase migrations, and realm imports in the background.
- The health endpoint must clearly distinguish between "initializing" and "ready" states so that users can configure Kubernetes readiness and liveness probes appropriately (e.g., liveness returns 200 during migration, readiness returns 503 until fully initialized).
- Keycloak must not serve authentication or admin requests until background initialization is complete (return 503 until fully initialized).
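The status codes named in the goals above can be sketched as a small state holder. This is a minimal illustration only, not Keycloak's implementation: the class StartupHealth, the Phase enum, and all method names are hypothetical and chosen purely for this example.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch; none of these names are existing Keycloak APIs. */
public class StartupHealth {

    /** The two states the health endpoint has to tell apart. */
    public enum Phase { INITIALIZING, READY }

    private final AtomicReference<Phase> phase =
            new AtomicReference<>(Phase.INITIALIZING);

    /** Called once background initialization has completed. */
    public void markReady() {
        phase.set(Phase.READY);
    }

    public Phase phase() {
        return phase.get();
    }

    /** Liveness: the process is up, so report 200 even while migrating. */
    public int livenessStatus() {
        return 200;
    }

    /** Readiness: 503 until background initialization has finished. */
    public int readinessStatus() {
        return phase.get() == Phase.READY ? 200 : 503;
    }

    /**
     * Gate for authentication and admin requests: pass through the status
     * the request would normally produce only once the server is READY.
     */
    public int gateRequest(int normalStatus) {
        return phase.get() == Phase.READY ? normalStatus : 503;
    }
}
```

With this shape, a liveness probe keeps succeeding during a long migration, while a readiness probe keeps the Pod out of Service endpoints until markReady() has been called.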
Non-Goals
- Change the behavior of the import and export CLI commands.
- Reduce the actual time spent on migration or initialization logic.
- Speed up import/export operations.
Discussion
No response
Notes
Failure handling: If a background migration fails, the Pod is shut down.
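
One way the failure-handling note could look in code is sketched below, using only the JDK's CompletableFuture. The class BackgroundInit, its methods, and the injectable exit hook are assumptions made for illustration; the issue does not specify how the shutdown is triggered or which exit code is used.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.IntConsumer;

/** Hypothetical sketch; not an existing Keycloak class. */
public class BackgroundInit {

    private final AtomicBoolean ready = new AtomicBoolean(false);

    public boolean isReady() {
        return ready.get();
    }

    /**
     * Runs the initialization task off the calling thread.
     * On success the instance is marked ready; on failure the exit hook is
     * invoked with a non-zero code (assumed here to be 1) so the process
     * stops and the Pod is shut down.
     */
    public CompletableFuture<Void> start(Runnable initialization, IntConsumer exitHook) {
        return CompletableFuture.runAsync(initialization)
                .<Void>handle((ignored, error) -> {
                    if (error == null) {
                        ready.set(true);
                    } else {
                        exitHook.accept(1);
                    }
                    return null;
                });
    }
}
```

In a real process the hook could be System::exit, for example new BackgroundInit().start(task, System::exit), where task stands in for the migration work; keeping the hook injectable lets the failure path be exercised without terminating the JVM.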