-
Notifications
You must be signed in to change notification settings - Fork 291
Run controller managers in separate namespaces #4657
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Run controller managers in separate namespaces #4657
Conversation
Thank you so much for the detailed descriptions. I saw this PR also contains a lot of changes to unmanaged-detector. Could you elaborate a little bit more about those changes? |
tenancy.gke.io/access-level: supervisor | ||
tenancy.gke.io/project: no-project | ||
tenancy.gke.io/tenant: no-tenant |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can I ask more about these 3 labels? I want to know the life cycle of them, e.g. how is it used/modified, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Answered in corporate chat
var b []byte | ||
if managerNamespaceSuffix == "" { | ||
b = []byte(defaultConfigConnector) | ||
} else { | ||
b = []byte(fmt.Sprintf(defaultConfigConnectorNamespaceSeparated, managerNamespaceSuffix)) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest making this a Unstructured level operations, especially considering that a string formatting is used.
u.SetLabels(
map[string]string{
...
}
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
pkg/k8s/constants.go
Outdated
NamespaceEnvVar = "NAMESPACE" | ||
ImmediateReconcileRequestsBufferSize = 10000 | ||
MaxNumResourceWatcherRoutines = 10000 | ||
ManagerNamespaceSuffixLabel = "cnrm.cloud.google.com/manager-namespace-suffix" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest using a new field to CCC spec
instead of labels. Spec is specifically for user configurations, and it is more managable and traceble compared to labels.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you suggesting spec
on CCC (ConfigConnectorContext)? There should be a global flag that indicates:
- If managers need to run in separate namespaces
- How to name the namespaces )what should be the suffix of the manager's namespace)
This suggest that CC (ConfigConnector) should have theManagerNamespaceSuffix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
update Unstructured instead of string substitution
Update: This PR is under the context that a GKE MT cluster creates the custom namespace (different from KCC creates its system ns), and adds the mt specific labels. Per offline discussion, we expect the MT team to share the guide on these additional setup so we can verify the PR in the GKE autopilot cluster. |
cmd/unmanageddetector/main.go
Outdated
flag.CommandLine.AddGoFlagSet(goflag.CommandLine) | ||
flag.BoolVar(&enablePprof, "enable-pprof", false, "Enable the pprof server.") | ||
flag.IntVar(&pprofPort, "pprof-port", 6060, "The port that the pprof server binds to if enabled.") | ||
flag.BoolVar(&unmanageddetector.EnableManagerNamespace, "enable-manager-namespace", false, "Enable running controller managers in separate namespaces.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not blocking
I dislike bools in APIs. It always seems like something which is 2 state eventually becomes 3+ state. Right now the manager "run-mode" is either "cluster-only" or "namespace-only". I can imagine wanting something in the future like "cluster-namespace-hybrid" or even "project". Switching from a bool to an enum seems like future proofing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed from boolean to string everywhere in the code.
operator/cmd/manager/main.go
Outdated
|
||
imagePrefix := os.Getenv("IMAGE_PREFIX") | ||
flag.StringVar(&imagePrefix, "image-prefix", imagePrefix, "Remap container images to pull from the specified registry or mirror.") | ||
flag.BoolVar(&enableManagerNamespace, "enable-manager-namespace", false, "Enable running controller managers in separate namespaces.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not blocking
I dislike bools in APIs. It always seems like something which is 2 state eventually becomes 3+ state. Right now the manager "run-mode" is either "cluster-only" or "namespace-only". I can imagine wanting something in the future like "cluster-namespace-hybrid" or even "project". Switching from a bool to an enum seems like future proofing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed from boolean to string everywhere in the code.
if err := cluster.DeleteNamespaceID(ctx, k8s.OperatorNamespaceIDConfigMapNN, r.client, ccc.Namespace); err != nil { | ||
return err | ||
} | ||
if r.enableManagerNamespace && ccc.Spec.ManagerNamespace != "" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems like it should not happen. Can we add a warning log and/or maybe a status message in the CCC? I understand why your doing it but we certainly should not be taking such a drastic step quietly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated code to fail reconciliation of CCC if global mode and spec.ManagerNamespace
are not compatible.
Can we add a check for not enableManagerNamespace that either prevent the managerNamepsace field or report a status warning? |
Updated code to fail reconciliation of CCC if global mode and spec.ManagerNamespace are not compatible. |
/assign @cheftako |
@dshnayder The PR template should have had a Release Notes and Docs section. Can you put those back in your top notes? |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cheftako The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
827c965
into
GoogleCloudPlatform:master
Updated pull request descriotion with release notes and documentation. |
Change Description
Summary
This change allows Kubernetes Config Connector (KCC) controller managers to be deployed in dedicated, tenant-specific namespaces instead of the default
cnrm-system
namespace. This significantly improves the security of multi-tenant environments by providing better namespace-level isolation and reducing the blast radius of potential security incidents.Motivation
Currently, all KCC controller managers operate within the shared
cnrm-system
namespace, irrespective of the specific ConfigConnectorContext or tenant they serve. This centralized deployment model presents a security concern: it increases the potential blast radius in the event of a security compromise or critical failure affecting one controller manager. A vulnerability exploited withincnrm-system
could potentially impact all tenants managed by that KCC installation due to the shared execution environment.The goal of this refactoring is to mitigate this risk by allowing controller managers to be deployed into distinct, dedicated namespaces. This architectural change facilitates stronger isolation between tenants or configurations, thereby reducing the scope of impact from potential security incidents or operational issues within a single manager instance. This approach aligns with best practices for secure multi-tenant system design in Kubernetes.
Implementation Details
The new functionality is gated by command line argument “manager-namespace-isolation” - “shared” or “dedicated” (default os "shared"). The argument can be passed to the pod “manager” (/configconnector-operator/manager):
When this flag is enabled, KCC will check for the optional
managerNamespace
field in theConfigConnectorContext
resource. If this field is present, KCC will deploy the corresponding controller manager and its associated ServiceAccount into the specified namespace, rather than the defaultcnrm-system
.managerNamespace
When this field is present on a
ConfigConnectorContext
resource specification and it is not empty, the system is instructed not to deploy the corresponding controller manager components into the defaultcnrm-system
namespace. Instead, the namespace specified inmanagerNamespace
is used. ThemanagerNamespace
will be created if it doesn’t exists. The selection of this specification field as the configuration mechanism adheres to standard Kubernetes practices for declarative configuration, allowing behavior modification without altering the core ConfigConnector API specification.Example:
Consider a ConfigConnectorContext resource, located in the namespace t1234-tenant0-provider:
If the ConfigConnectorContext resource includes the specification field
managerNamespace: t1234-tenant0-supervisor
, then the controller manager StatefulSet and its ServiceAccount will be created within thet1234-tenant0-supervisor
namespace.If the new functionality is not activated, then the specification field “managerNamespace” on each ConfigConnectorContext must not be specified. If “managerNamespace” is specified when the feature is globally disabled, then ConfigConnectorContext reconciliation will fail with an error message indicating that “managerNamespace” is not supported.
If the new functionality is activated then the specification field “managerNamespace” on each ConfigConnectorContext must be specified. If “managerNamespace” is not specified when the feature is globally enabled, then ConfigConnectorContext reconciliation will fail with an error message indicating that “managerNamespace” must be populated.
Summary of configuration
--manager-namespace-isolation
dedicated
for themanagerNamespace
field to have any effect.managerNamespace
ConfigConnectorContext
spec fieldScope and Impact
This enhancement is required for multi-tenant GKE deployments, where strict tenant isolation is often a requirement.
Positive Impacts:
Negative Impacts:
--manager-namespace-isolation=dedicated
and a new Custom Resource is created, an additional request fromunmanageddetector
toapiserver
required to get specification ofConfigConnectorContext
resource.Tests
cnrm-system
namespacemake ready-pr
to ensure this PR is ready for review.Does this PR add something which needs to be 'release noted'?
Additional documentation e.g., references, usage docs, etc.: