kueue

Version: 0.17.1 Type: application AppVersion: v0.17.1

Kueue is a set of APIs and controllers for job queueing. It is a job-level manager that decides when a job should be admitted to start (as in pods can be created) and when it should stop (as in active pods should be deleted).

Installation

Quick start instructions for the setup and configuration of kueue using Helm.

Prerequisites

Installing the chart

Install chart using Helm v3.0+

Either clone the kueue repository:

$ git clone git@github.com:kubernetes-sigs/kueue.git
$ cd kueue/charts
$ helm install kueue kueue/ --create-namespace --namespace kueue-system

Or use the charts pushed to oci://registry.k8s.io/kueue/charts/kueue:

helm install kueue oci://registry.k8s.io/kueue/charts/kueue --version="0.17.1" --create-namespace --namespace=kueue-system

For more advanced parametrization of Kueue, we recommend using a local overrides file, passed via the --values flag. For example:

controllerManager:
  featureGates:
    - name: TopologyAwareScheduling
      enabled: true
  replicas: 2
  manager:
    resources:
      limits:
        cpu: "2"
        memory: 2Gi
      requests:
        cpu: "2"
        memory: 2Gi

Save this file as overrides.yaml, then pass it to helm install:

helm install kueue oci://registry.k8s.io/kueue/charts/kueue --version="0.17.1" \
  --create-namespace --namespace=kueue-system \
  --values overrides.yaml

You can also use the --set flag. For example, to enable the TopologyAwareScheduling feature gate:

helm install kueue oci://registry.k8s.io/kueue/charts/kueue --version="0.17.1" \
  --create-namespace --namespace=kueue-system \
  --set "controllerManager.featureGates[0].name=TopologyAwareScheduling" \
  --set "controllerManager.featureGates[0].enabled=true"

Verify that the controller pods are running properly:

$ kubectl get deploy -n kueue-system
NAME                           READY   UP-TO-DATE   AVAILABLE   AGE
kueue-controller-manager       1/1     1            1           7s

Cert Manager

Kueue supports third-party certificates. To enable them, set enableCertManager to true. By default the chart creates a self-signed Issuer. To reuse an existing Issuer or ClusterIssuer, set certManager.issuerRef, for example:

enableCertManager: true
certManager:
  issuerRef:
    group: cert-manager.io
    kind: ClusterIssuer
    name: my-cluster-issuer

If you reference a namespace-scoped Issuer, it must already exist in the same namespace as the Helm release. The referenced issuer must provide the CA data required by Kueue's cert-manager integration, including ca.crt in the generated Secrets and the CA bundle used for webhook and visibility API injection.
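For the namespace-scoped case, a minimal sketch of the values might look like the following. The issuer name is a hypothetical placeholder, and the sketch assumes the release is installed into the same namespace where that Issuer already exists.

```yaml
# Illustrative values fragment; "my-namespaced-issuer" is a placeholder name.
enableCertManager: true
certManager:
  issuerRef:
    group: cert-manager.io
    kind: Issuer                 # namespace-scoped, unlike ClusterIssuer
    name: my-namespaced-issuer   # must already exist in the release namespace
```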

With enableCertManager set to true, cert-manager generates the certificate Secret, injects the CA bundles, and sets up TLS.

Check out the site for more information on installing cert manager with our Helm chart.

Prometheus

Kueue supports Prometheus metrics. Check out the site for more information on installing kueue with metrics using our Helm chart.
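As a rough sketch, metrics can be switched on through the chart values listed in the configuration table. The fragment below simply restates the documented defaults next to enablePrometheus. It assumes the Prometheus Operator is installed, since ServiceMonitor is one of its custom resources, and that Prometheus runs in the monitoring namespace.

```yaml
# Illustrative values fragment; namespace and TLS settings mirror the chart defaults.
enablePrometheus: true
metrics:
  prometheusNamespace: monitoring   # namespace where Prometheus is assumed to run
  serviceMonitor:
    tlsConfig:
      insecureSkipVerify: true      # chart default; tighten for production use
```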

Configuration

The following table lists the configurable parameters of the kueue chart and their default values.

| Key | Type | Default | Description |
|-----|------|---------|-------------|
| certManager.issuerRef | object | {} | Override the default self-signed cert-manager issuer reference. When set, the chart skips creating its own Issuer and uses this reference for webhook, metrics, and visibility certificates. The referenced issuer must provide the CA data required by Kueue's cert-manager integration. |
| controllerManager.featureGates | list | [] | ControllerManager's feature gates |
| controllerManager.imagePullSecrets | list | [] | ControllerManager's imagePullSecrets |
| controllerManager.livenessProbe.failureThreshold | int | 3 | ControllerManager's livenessProbe failureThreshold |
| controllerManager.livenessProbe.initialDelaySeconds | int | 15 | ControllerManager's livenessProbe initialDelaySeconds |
| controllerManager.livenessProbe.periodSeconds | int | 20 | ControllerManager's livenessProbe periodSeconds |
| controllerManager.livenessProbe.successThreshold | int | 1 | ControllerManager's livenessProbe successThreshold |
| controllerManager.livenessProbe.timeoutSeconds | int | 1 | ControllerManager's livenessProbe timeoutSeconds |
| controllerManager.manager.containerSecurityContext | object | {"allowPrivilegeEscalation":false,"capabilities":{"drop":["ALL"]},"readOnlyRootFilesystem":true} | ControllerManager's container securityContext |
| controllerManager.manager.image.pullPolicy | string | "Always" | ControllerManager's image pullPolicy. This should be set to 'IfNotPresent' for released version |
| controllerManager.manager.image.repository | string | "us-central1-docker.pkg.dev/k8s-staging-images/kueue/kueue" | ControllerManager's image repository |
| controllerManager.manager.image.tag | string | "main" | ControllerManager's image tag |
| controllerManager.manager.logLevel | int | 2 | Zap log level. Higher values increase verbosity. |
| controllerManager.manager.podAnnotations | object | {} | ControllerManager's pod annotations |
| controllerManager.manager.podSecurityContext | object | {"runAsNonRoot":true,"seccompProfile":{"type":"RuntimeDefault"}} | ControllerManager's pod securityContext |
| controllerManager.manager.priorityClassName | string | nil | ControllerManager's pod priorityClassName |
| controllerManager.manager.resources | object | {"limits":{"cpu":"2","memory":"512Mi"},"requests":{"cpu":"500m","memory":"512Mi"}} | ControllerManager's pod resources |
| controllerManager.nodeSelector | object | {} | ControllerManager's nodeSelector |
| controllerManager.podDisruptionBudget.enabled | bool | false | Enable PodDisruptionBudget |
| controllerManager.podDisruptionBudget.minAvailable | int | 1 | PodDisruptionBudget's minAvailable |
| controllerManager.readinessProbe.failureThreshold | int | 3 | ControllerManager's readinessProbe failureThreshold |
| controllerManager.readinessProbe.initialDelaySeconds | int | 5 | ControllerManager's readinessProbe initialDelaySeconds |
| controllerManager.readinessProbe.periodSeconds | int | 10 | ControllerManager's readinessProbe periodSeconds |
| controllerManager.readinessProbe.successThreshold | int | 1 | ControllerManager's readinessProbe successThreshold |
| controllerManager.readinessProbe.timeoutSeconds | int | 1 | ControllerManager's readinessProbe timeoutSeconds |
| controllerManager.replicas | int | 1 | ControllerManager's replicas count |
| controllerManager.tolerations | list | [] | ControllerManager's tolerations |
| controllerManager.topologySpreadConstraints | list | [] | ControllerManager's topologySpreadConstraints |
| enableCertManager | bool | false | Enable x509 automated certificate management using cert-manager (cert-manager.io) |
| enableKueueViz | bool | false | Enable KueueViz dashboard |
| enablePrometheus | bool | false | Enable Prometheus |
| enableVisibilityAPF | bool | false | Enable API Priority and Fairness configuration for the visibility API |
| fullnameOverride | string | "" | Override the resource name |
| kubernetesClusterDomain | string | "cluster.local" | Kubernetes cluster's domain |
| kueueViz.backend.auth.mode | string | "Disabled" | Authentication mode: "Disabled" or "TokenReview" (Alpha, disabled by default) |
| kueueViz.backend.auth.tokenReviewConfig | object | {"audiences":"","cacheTTL":"60s","negativeCacheTTL":"5s"} | TokenReview-specific configuration (only used when mode is "TokenReview") |
| kueueViz.backend.auth.tokenReviewConfig.audiences | string | "" | Optional comma-separated list of audiences for TokenReview |
| kueueViz.backend.auth.tokenReviewConfig.cacheTTL | string | "60s" | TTL for successful authentication cache |
| kueueViz.backend.auth.tokenReviewConfig.negativeCacheTTL | string | "5s" | TTL for failed authentication cache (prevents API server abuse) |
| kueueViz.backend.containerSecurityContext | object | {"allowPrivilegeEscalation":false,"capabilities":{"drop":["ALL"]},"readOnlyRootFilesystem":true} | KueueViz backend container securityContext |
| kueueViz.backend.env | list | [{"name":"KUEUEVIZ_ALLOWED_ORIGINS","value":"https://frontend.kueueviz.local"}] | Environment variables for KueueViz backend deployment |
| kueueViz.backend.image.pullPolicy | string | "Always" | KueueViz dashboard backend image pullPolicy. This should be set to 'IfNotPresent' for released version |
| kueueViz.backend.image.repository | string | "us-central1-docker.pkg.dev/k8s-staging-images/kueue/kueueviz-backend" | KueueViz dashboard backend image repository |
| kueueViz.backend.image.tag | string | "main" | KueueViz dashboard backend image tag |
| kueueViz.backend.imagePullSecrets | list | [] | Sets ImagePullSecrets for KueueViz dashboard backend deployments. This is useful when the images are in a private registry. |
| kueueViz.backend.ingress.annotations | object | {"nginx.ingress.kubernetes.io/rewrite-target":"/","nginx.ingress.kubernetes.io/ssl-redirect":"true"} | KueueViz dashboard backend ingress annotations |
| kueueViz.backend.ingress.enabled | bool | true | Enable KueueViz dashboard backend ingress |
| kueueViz.backend.ingress.host | string | "backend.kueueviz.local" | KueueViz dashboard backend ingress host |
| kueueViz.backend.ingress.ingressClassName | string | nil | KueueViz dashboard backend ingress class name |
| kueueViz.backend.ingress.tlsSecretName | string | "kueueviz-backend-tls" | KueueViz dashboard backend ingress tls secret name |
| kueueViz.backend.nodeSelector | object | {} | KueueViz backend nodeSelector |
| kueueViz.backend.podSecurityContext | object | {"runAsNonRoot":true,"seccompProfile":{"type":"RuntimeDefault"}} | KueueViz backend pod securityContext |
| kueueViz.backend.priorityClassName | string | nil | Enable PriorityClass for KueueViz dashboard backend deployments |
| kueueViz.backend.resources | object | {"limits":{"cpu":"500m","memory":"512Mi"},"requests":{"cpu":"500m","memory":"512Mi"}} | KueueViz backend pod resources |
| kueueViz.backend.tolerations | list | [] | KueueViz backend tolerations |
| kueueViz.frontend.containerSecurityContext | object | {} | KueueViz frontend container securityContext |
| kueueViz.frontend.env | list | [] | Environment variables for KueueViz frontend deployment |
| kueueViz.frontend.image.pullPolicy | string | "Always" | KueueViz dashboard frontend image pullPolicy. This should be set to 'IfNotPresent' for released version |
| kueueViz.frontend.image.repository | string | "us-central1-docker.pkg.dev/k8s-staging-images/kueue/kueueviz-frontend" | KueueViz dashboard frontend image repository |
| kueueViz.frontend.image.tag | string | "main" | KueueViz dashboard frontend image tag |
| kueueViz.frontend.imagePullSecrets | list | [] | Sets ImagePullSecrets for KueueViz dashboard frontend deployments. This is useful when the images are in a private registry. |
| kueueViz.frontend.ingress.annotations | object | {"nginx.ingress.kubernetes.io/rewrite-target":"/","nginx.ingress.kubernetes.io/ssl-redirect":"true"} | KueueViz dashboard frontend ingress annotations |
| kueueViz.frontend.ingress.enabled | bool | true | Enable KueueViz dashboard frontend ingress |
| kueueViz.frontend.ingress.host | string | "frontend.kueueviz.local" | KueueViz dashboard frontend ingress host |
| kueueViz.frontend.ingress.ingressClassName | string | nil | KueueViz dashboard frontend ingress class name |
| kueueViz.frontend.ingress.tlsSecretName | string | "kueueviz-frontend-tls" | KueueViz dashboard frontend ingress tls secret name |
| kueueViz.frontend.nodeSelector | object | {} | KueueViz frontend nodeSelector |
| kueueViz.frontend.podSecurityContext | object | {} | KueueViz frontend pod securityContext |
| kueueViz.frontend.priorityClassName | string | nil | Enable PriorityClass for KueueViz dashboard frontend deployments |
| kueueViz.frontend.resources | object | {"limits":{"cpu":"500m","memory":"512Mi"},"requests":{"cpu":"500m","memory":"512Mi"}} | KueueViz frontend pod resources |
| kueueViz.frontend.tolerations | list | [] | KueueViz frontend tolerations |
| managerConfig.controllerManagerConfigYaml | string | controllerManagerConfigYaml | controller_manager_config.yaml. ControllerManager utilizes this yaml via manager-config ConfigMap. |
| metrics.prometheusNamespace | string | "monitoring" | Prometheus namespace |
| metrics.serviceMonitor.tlsConfig | object | {"insecureSkipVerify":true} | ServiceMonitor's tlsConfig |
| metricsService.annotations | object | {} | metricsService's annotations |
| metricsService.labels | object | {} | metricsService's labels |
| metricsService.ports | list | [{"name":"https","port":8443,"protocol":"TCP","targetPort":8443}] | metricsService's ports |
| metricsService.type | string | "ClusterIP" | metricsService's type |
| mutatingWebhook.reinvocationPolicy | string | "Never" | MutatingWebhookConfiguration's reinvocationPolicy |
| nameOverride | string | "" | Override the resource name |
| webhookService.ipDualStack.enabled | bool | false | webhookService's ipDualStack enabled |
| webhookService.ipDualStack.ipFamilies | list | ["IPv6","IPv4"] | webhookService's ipDualStack ipFamilies |
| webhookService.ipDualStack.ipFamilyPolicy | string | "PreferDualStack" | webhookService's ipDualStack ipFamilyPolicy |
| webhookService.ports | list | [{"port":443,"protocol":"TCP","targetPort":9443}] | webhookService's ports |
| webhookService.type | string | "ClusterIP" | webhookService's type |
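
As a sketch of how several of these parameters might combine, the overrides file below runs two controller replicas behind a PodDisruptionBudget and turns on the KueueViz dashboard. Every key comes from the parameter list above, but the host names and the nginx ingress class are hypothetical placeholders rather than chart defaults, so treat this as a starting point and not a recommended configuration.

```yaml
# overrides.yaml (illustrative; adjust to your cluster)
controllerManager:
  replicas: 2
  podDisruptionBudget:
    enabled: true
    minAvailable: 1          # keep at least one manager pod during voluntary disruptions
  manager:
    image:
      # The parameter description suggests IfNotPresent for released versions.
      pullPolicy: IfNotPresent

enableKueueViz: true
kueueViz:
  backend:
    env:
      # The default value points at the default frontend host; this sketch
      # assumes the variable should follow the frontend host chosen below.
      - name: KUEUEVIZ_ALLOWED_ORIGINS
        value: "https://kueueviz.example.com"
    ingress:
      enabled: true
      host: "kueueviz-api.example.com"   # hypothetical host
      ingressClassName: nginx            # assumes an NGINX ingress controller
  frontend:
    ingress:
      enabled: true
      host: "kueueviz.example.com"       # hypothetical host
      ingressClassName: nginx            # assumes an NGINX ingress controller
```

The file can be passed with the --values flag in the same way as the installation examples earlier in this README.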