Trousseau pods die about 20 minutes after they're started
Detailed Description
Even though the Trousseau pod works and secrets are in fact encrypted via the Vault transit key,
after about 20 minutes the health check starts failing with a 403 error when it tries to perform
a Vault operation, which results in the pod being terminated.
This doesn't make sense to me, as the pod uses a token which is valid and can, in fact, do
the actual encryption/decryption while the pod is alive. This has been checked with
kubectl generate secret and etcdctl get /registry/secrets/....
Using Trousseau v1.1.3.
Logs:
{
"level": "Level(-3)",
"timestamp": "2022-08-16T10:46:37.626434563Z",
"caller": "server/health.go:33",
"msg": "Initialize health check\n"
}
{
"level": "info",
"timestamp": "2022-08-16T11:06:34.104487451Z",
"caller": "encrypt/vault.go:202",
"msg": "Failed to send request",
"code": 403,
"error": "Error making API request.\n\nURL: POST https://vault.internal:8200/v1/transit/encrypt/kube-ktest-kms\nCode: 403. Errors:\n\n* permission denied"
}
{
"level": "info",
"timestamp": "2022-08-16T11:06:34.104850154Z",
"caller": "encrypt/vault.go:285",
"msg": "Failed to encrypt locked",
"key": "kube-ktest-kms",
"error": "forbidden error Error making API request.\n\nURL: POST https://vault.internal:8200/v1/transit/encrypt/kube-ktest-kms\nCode: 403. Errors:\n\n* permission denied"
}
{
"level": "info",
"timestamp": "2022-08-16T11:06:34.111772569Z",
"caller": "encrypt/vault.go:202",
"msg": "Failed to send request",
"code": 403,
"error": "Error making API request.\n\nURL: POST https://vault.internal:8200/v1/transit/decrypt/kube-ktest-kms\nCode: 403. Errors:\n\n* permission denied"
}
{
"level": "info",
"timestamp": "2022-08-16T11:06:34.11254637Z",
"caller": "encrypt/vault.go:305",
"msg": "Failed to decrypt locked",
"error": "forbidden error Error making API request.\n\nURL: POST https://vault.internal:8200/v1/transit/decrypt/kube-ktest-kms\nCode: 403. Errors:\n\n* permission denied"
}
{
"level": "error",
"timestamp": "2022-08-16T11:06:34.113574164Z",
"caller": "server/health.go:85",
"msg": "Encryption failed",
"original": "healthcheck",
"decrypted": "",
"error": "failed to properly decrypt encrypted data",
"stacktrace": "github.com/ondat/trousseau/internal/server.(*HealthZ).ServeHTTP\n\t/work/internal/server/health.go:85\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2084\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2462\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2916\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1966"
}
{
"level": "info",
"timestamp": "2022-08-16T11:06:44.102098227Z",
"caller": "encrypt/vault.go:202",
"msg": "Failed to send request",
"code": 403,
"error": "Error making API request.\n\nURL: POST https://vault.internal:8200/v1/transit/encrypt/kube-ktest-kms\nCode: 403. Errors:\n\n* permission denied"
}
{
"level": "info",
"timestamp": "2022-08-16T11:06:44.102356133Z",
"caller": "encrypt/vault.go:285",
"msg": "Failed to encrypt locked",
"key": "kube-ktest-kms",
"error": "forbidden error Error making API request.\n\nURL: POST https://vault.internal:8200/v1/transit/encrypt/kube-ktest-kms\nCode: 403. Errors:\n\n* permission denied"
}
{
"level": "info",
"timestamp": "2022-08-16T11:06:44.110282553Z",
"caller": "encrypt/vault.go:202",
"msg": "Failed to send request",
"code": 403,
"error": "Error making API request.\n\nURL: POST https://vault.internal:8200/v1/transit/decrypt/kube-ktest-kms\nCode: 403. Errors:\n\n* permission denied"
}
{
"level": "info",
"timestamp": "2022-08-16T11:06:44.110369672Z",
"caller": "encrypt/vault.go:305",
"msg": "Failed to decrypt locked",
"error": "forbidden error Error making API request.\n\nURL: POST https://vault.internal:8200/v1/transit/decrypt/kube-ktest-kms\nCode: 403. Errors:\n\n* permission denied"
}
{
"level": "error",
"timestamp": "2022-08-16T11:06:44.110413036Z",
"caller": "server/health.go:85",
"msg": "Encryption failed",
"original": "healthcheck",
"decrypted": "",
"error": "failed to properly decrypt encrypted data",
"stacktrace": "github.com/ondat/trousseau/internal/server.(*HealthZ).ServeHTTP\n\t/work/internal/server/health.go:85\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2084\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2462\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2916\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1966"
}
{
"level": "info",
"timestamp": "2022-08-16T11:06:54.110033047Z",
"caller": "encrypt/vault.go:202",
"msg": "Failed to send request",
"code": 403,
"error": "Error making API request.\n\nURL: POST https://vault.internal:8200/v1/transit/encrypt/kube-ktest-kms\nCode: 403. Errors:\n\n* permission denied"
}
{
"level": "info",
"timestamp": "2022-08-16T11:06:54.110140296Z",
"caller": "encrypt/vault.go:285",
"msg": "Failed to encrypt locked",
"key": "kube-ktest-kms",
"error": "forbidden error Error making API request.\n\nURL: POST https://vault.internal:8200/v1/transit/encrypt/kube-ktest-kms\nCode: 403. Errors:\n\n* permission denied"
}
{
"level": "info",
"timestamp": "2022-08-16T11:06:54.117075111Z",
"caller": "encrypt/vault.go:202",
"msg": "Failed to send request",
"code": 403,
"error": "Error making API request.\n\nURL: POST https://vault.internal:8200/v1/transit/decrypt/kube-ktest-kms\nCode: 403. Errors:\n\n* permission denied"
}
{
"level": "info",
"timestamp": "2022-08-16T11:06:54.117137565Z",
"caller": "encrypt/vault.go:305",
"msg": "Failed to decrypt locked",
"error": "forbidden error Error making API request.\n\nURL: POST https://vault.internal:8200/v1/transit/decrypt/kube-ktest-kms\nCode: 403. Errors:\n\n* permission denied"
}
{
"level": "error",
"timestamp": "2022-08-16T11:06:54.117172855Z",
"caller": "server/health.go:85",
"msg": "Encryption failed",
"original": "healthcheck",
"decrypted": "",
"error": "failed to properly decrypt encrypted data",
"stacktrace": "github.com/ondat/trousseau/internal/server.(*HealthZ).ServeHTTP\n\t/work/internal/server/health.go:85\nnet/http.HandlerFunc.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2084\nnet/http.(*ServeMux).ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2462\nnet/http.serverHandler.ServeHTTP\n\t/usr/local/go/src/net/http/server.go:2916\nnet/http.(*conn).serve\n\t/usr/local/go/src/net/http/server.go:1966"
}
{
"level": "info",
"timestamp": "2022-08-16T11:06:54.142295621Z",
"caller": "kubernetes-kms-vault/main.go:156",
"msg": "Received shutdown signal\n"
}
{
"level": "info",
"timestamp": "2022-08-16T11:06:54.142362222Z",
"caller": "kubernetes-kms-vault/main.go:139",
"msg": "Terminating the server\n"
}
Expected Behavior
The healthcheck should come through and the pod shouldn't crash.
Current Behavior
The pod crashes after 20m apparently due to a failed encryption healthcheck.
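As a sanity check on the 20m figure, the gap between the "Initialize health check" entry and the first 403 in the logs above can be computed directly. This is a minimal Go sketch; the helper name elapsed is made up for illustration and is not part of Trousseau.

```go
package main

import (
	"fmt"
	"time"
)

// elapsed returns the time between two RFC 3339 timestamps,
// in the format used by the log entries above.
// (Hypothetical helper, for illustration only.)
func elapsed(start, end string) (time.Duration, error) {
	s, err := time.Parse(time.RFC3339Nano, start)
	if err != nil {
		return 0, err
	}
	e, err := time.Parse(time.RFC3339Nano, end)
	if err != nil {
		return 0, err
	}
	return e.Sub(s), nil
}

func main() {
	// Timestamps copied from the first two log entries.
	d, err := elapsed(
		"2022-08-16T10:46:37.626434563Z", // "Initialize health check"
		"2022-08-16T11:06:34.104487451Z", // first "Failed to send request" (403)
	)
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println(d) // prints 19m56.478052888s
}
```

The first failure arrives just under 20 minutes after startup, and the health check then fails every 10 seconds until the shutdown signal. That would fit something running on a 20m timer, although what exactly expires is only a guess on my part.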
Steps to Reproduce
- Deploy trousseau
- Watch logs
- Wait 20m
- Optionally, successfully encrypt and decrypt some secrets in the meantime
- The pod crashes with the logs above
Context (Environment)
Purpose-built cluster (kubespray) to test Trousseau (deployed with Terraform, as is the Trousseau-related Vault config)
Possible Solution/Implementation
It seems like Trousseau uses the wrong token for healthchecks (if any).
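One way to narrow this down might be to look up the remaining TTL of the token the pod is configured with, to rule out the token itself expiring. The sketch below calls Vault's token lookup-self endpoint (GET /v1/auth/token/lookup-self). It assumes VAULT_ADDR and VAULT_TOKEN are set in the environment, which is my own convention for this example and not something Trousseau requires; parseTTL and lookupResponse are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"os"
	"time"
)

// lookupResponse mirrors only the fields of the lookup-self
// response that this sketch reads.
type lookupResponse struct {
	Data struct {
		TTL       int64 `json:"ttl"` // remaining lifetime in seconds
		Renewable bool  `json:"renewable"`
	} `json:"data"`
}

// parseTTL extracts the remaining TTL and the renewable flag
// from a lookup-self response body. (Hypothetical helper.)
func parseTTL(body []byte) (time.Duration, bool, error) {
	var r lookupResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return 0, false, err
	}
	return time.Duration(r.Data.TTL) * time.Second, r.Data.Renewable, nil
}

func main() {
	addr, token := os.Getenv("VAULT_ADDR"), os.Getenv("VAULT_TOKEN")
	if addr == "" || token == "" {
		fmt.Println("set VAULT_ADDR and VAULT_TOKEN to run the lookup")
		return
	}
	req, err := http.NewRequest(http.MethodGet, addr+"/v1/auth/token/lookup-self", nil)
	if err != nil {
		fmt.Println("request error:", err)
		return
	}
	req.Header.Set("X-Vault-Token", token)
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		fmt.Println("request error:", err)
		return
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		fmt.Println("read error:", err)
		return
	}
	if resp.StatusCode != http.StatusOK {
		// An expired or revoked token is rejected here as well.
		fmt.Printf("lookup failed: %s\n%s\n", resp.Status, body)
		return
	}
	ttl, renewable, err := parseTTL(body)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("remaining TTL: %s, renewable: %t\n", ttl, renewable)
}
```

If the reported TTL counts down towards zero in step with the pod's uptime, the 403 is more likely an expired token than a wrong one; if the TTL is still long when the health check starts failing, the wrong-token theory looks more plausible.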