Thanks for sharing this, @webervin! Overall this problem and solution are intuitive to me and I don't really have much to add. The one part that stuck out to me as surprising was your proposal to represent this in the configuration... is the workflow you're imagining that you'd notice that one of your vendors is having an outage, send the "dormant mode" enable through your pull request process to turn it on, and then go through the pull request process again to turn it off afterwards? I can see how that would work, but in my experience I've typically wanted "incident-related" controls to be something I can do outside of the configuration, in case e.g. the version control system itself is degraded in some way, or if the outage only applies to one environment and so the setting only needs to be twiddled temporarily for that environment. I ask this question mainly because during the discussion about the
Thanks again for sharing this! It's something we've discussed in various forms before, and this is an interesting specific take on it.
Goal
Introduce a native mechanism that enables successful plan and apply operations on a subset of healthy providers within a single root module. This effectively mitigates the impact of a temporary outage or degradation of a single third-party SaaS provider without resorting to complex module separation or error-prone interactive targeting.
Rationale (Addressing Existing Gaps)
The current workarounds for single-provider outages (separating root modules or using --target) introduce significant operational overhead and reduce the benefits of centralized dependency management:
--target Flag: Requires intimate knowledge of resource addresses, is difficult to secure and automate robustly in CI/CD pipelines, and should not be relied upon for generalized deployment.
We can do better by teaching OpenTofu to skip problematic providers whenever possible.
Implementation Details
I propose the introduction of a new, optional setting, dormant_mode = true, configurable within the provider block or via a dedicated environment variable.
Inline configuration: Known long-term maintenance/issue periods; temporary manual override.
Environment variable: CI/CD pipelines, where I can set variables for multiple workspaces at the same time, or even use custom scripting to detect provider availability automatically as part of the pipeline.
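The custom-scripting idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the health-check URLs and the `probe_provider` and `build_dormant_list` helper names are hypothetical, and `OT_DORMANT_MODE_PROVIDERS` is the variable proposed in this post, not something OpenTofu reads today.

```shell
#!/bin/sh
# Sketch only: builds the proposed OT_DORMANT_MODE_PROVIDERS value from
# health probes. Helper names and URLs are hypothetical placeholders.

# Returns 0 when the given health endpoint answers successfully within 5 seconds.
probe_provider() {
  curl --silent --fail --max-time 5 --output /dev/null "$1"
}

# Takes "name=url" pairs and prints a comma-separated list of the
# provider names whose probe failed.
build_dormant_list() {
  dormant=""
  for entry in "$@"; do
    name="${entry%%=*}"   # text before the first "="
    url="${entry#*=}"     # text after the first "="
    if ! probe_provider "$url"; then
      dormant="${dormant:+$dormant,}$name"
    fi
  done
  printf '%s' "$dormant"
}

# Example pipeline usage (URLs are placeholders, not real status endpoints):
#   OT_DORMANT_MODE_PROVIDERS="$(build_dormant_list \
#     "saas_vendor=https://status.saas-vendor.example/health" \
#     "cloudflare=https://status.cloudflare.example/health")"
#   export OT_DORMANT_MODE_PROVIDERS
```

Keeping the probe in its own function means a pipeline could swap in whatever availability signal it trusts (a status page, an internal monitor) without touching the list-building logic.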
export OT_DORMANT_MODE_PROVIDERS="saas_vendor,aws-east,cloudflare"
(The value is a comma-separated list of providers that will be disabled for the current operation.)
OpenTofu Behavior in "Dormant" Mode
When a provider is marked as dormant, OpenTofu modifies the standard graph refresh and execution sequence for that provider only: