Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add checks for available cipher before setting h2 proto #16850

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

pschou
Copy link

@pschou pschou commented Oct 30, 2023

Noting:

"When we implemented TLS 1.3 in Go 1.12, we didn't make TLS 1.3 cipher suites configurable, because they are a disjoint set from the TLS 1.0–1.2" - Golang

So, in effect... With crypto/tls: TLS 1.3 ciphers are not configurable

This setting of enforcing the TLS maximum at the command line is a must-- if the maximum is not set, golang will add all three TLSv1.3 ciphers, effectively making the configuration of which ciphers are allowed not configurable-- it's an all or nothing setup.

This PR addresses the issue when the maximum is set and a subset of the remaining ciphers are set, effectively enabling or disabling support for "h2" proto automatically based on ability.

Without this, etcd crashes on startup.

@jmhbnz jmhbnz added the area/tls label Nov 2, 2023
Copy link
Member

@jmhbnz jmhbnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @pschou - Thanks for raising this!

To save reviewers some time would you mind either sharing the reproduce steps / configuration you are using, or alternatively adding a test?

@serathius
Copy link
Member

Without this, etcd crashes on startup.

If that's a case, please provide a test that confirms that and prevents invalid configuration in future. Best if that would be an e2e test. Please let me know if you need any help with writing it.

@pschou
Copy link
Author

pschou commented Nov 6, 2023

Example:

./bin/etcd --tls-max-version=TLS1.2 --cipher-suites='TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384' --cert-file=/etc/pki/server.pem --key-file=/etc/pki/server.pem --trusted-ca-file=/etc/pki/ca.pem --client-cert-auth=true --advertise-client-urls=https://localhost:32379 --listen-client-urls=https://localhost:32379

Without the update:
...
{"level":"fatal","ts":"2023-11-06T11:34:00.039138-0500","caller":"etcdmain/etcd.go:196","msg":"listener failed","error":"http2: TLSConfig.CipherSuites is missing an HTTP/2-required AES_128_GCM_SHA256 cipher (need at least one of TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 or TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256)","stacktrace":"go.etcd.io/etcd/server/v3/etcdmain.startEtcdOrProxyV2\n\tgo.etcd.io/etcd/server/v3/etcdmain/etcd.go:196\ngo.etcd.io/etcd/server/v3/etcdmain.Main\n\tgo.etcd.io/etcd/server/v3/etcdmain/main.go:40\nmain.main\n\tgo.etcd.io/etcd/server/v3/main.go:31\nruntime.main\n\truntime/proc.go:250"}
...

With the update:
...
{"level":"info","ts":"2023-11-06T13:38:32.33227-0500","caller":"embed/serve.go:251","msg":"serving client traffic securely","traffic":"http","address":"127.0.0.1:32379"}
...

Copy link
Member

@jmhbnz jmhbnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @pschou - Thanks for adding the example. As suggested above, to make sure this behavior doesn't get broken again in future it would be ideal to have an e2e test covering this situation.

Please let us know if you need a hand adding this.

@pschou
Copy link
Author

pschou commented Nov 7, 2023

Hey @pschou - Thanks for adding the example. As suggested above, to make sure this behavior doesn't get broken again in future it would be ideal to have an e2e test covering this situation.

Please let us know if you need a hand adding this.

Thank you for the review and I do think I will need help. I have made an e2e attempt to see if this is a step in the right direction. Any advice or help would be greatly appreciated.

Copy link
Member

@jmhbnz jmhbnz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This aligns with http2 spec[1][2] so LGTM, thanks for refining this @pschou.

Just one suggestion about tracking a future improvement to how we implement this check.

1: https://httpwg.org/specs/rfc9113.html#tls12ciphers
2: https://httpwg.org/specs/rfc9113.html#BadCipherSuites

client/pkg/transport/listener.go Show resolved Hide resolved
@jmhbnz
Copy link
Member

jmhbnz commented Nov 23, 2023

/ok-to-test

serathius
serathius previously approved these changes Nov 24, 2023
@serathius
Copy link
Member

cc @ahrtr

@serathius
Copy link
Member

Please squash commits.

@pschou
Copy link
Author

pschou commented Nov 24, 2023

Sorry about that last-minute thought. I want to ensure that users can see an information line indicating what is happening so debugging can be more graceful.

@serathius serathius dismissed their stale review November 24, 2023 17:50

Squash commit

tests/e2e/etcd_config_test.go Outdated Show resolved Hide resolved
tests/e2e/etcd_config_test.go Outdated Show resolved Hide resolved
@pschou pschou force-pushed the http2-cipher-check branch 2 times, most recently from 6e347a5 to 796ad50 Compare November 24, 2023 19:40
Signed-off-by: schou <pschou@users.noreply.github.com>
@pschou
Copy link
Author

pschou commented Nov 24, 2023

Please squash commits.

done

@pschou
Copy link
Author

pschou commented Nov 24, 2023

/retest

@pschou
Copy link
Author

pschou commented Nov 24, 2023

/retest

cfg.NextProtos = []string{"h2"}
} else {
// Display a warning that none of the h2 ciphers are available and likewise GRBC cannot be started.
info.Logger.Info("CipherSuites is missing an HTTP2 capable cipher (TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256 or TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256). Disabling HTTP2 protocol.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should return an error directly to fail the etcd's startup in such case.

The only concern is we also support httpOnly mode, in which case we shouldn't return error because it doesn't have to necessarily be over http2,

sctx.httpOnly = true

It's OK to have one listener for gRPC and the other for http, but it does't make sense if an etcd member only supports serving http requests. Probably we should add a validation to prevent such extreme scenario.

Comment on lines +206 to +217
// Verify that HTTP2 is available, and if not terminate early
var hasH2 bool
for _, p := range tlscfg.NextProtos {
if p == "h2" {
hasH2 = true
}
}
if !hasH2 {
grpcStartError := errors.New("GRPC cannot be started without HTTP2 protocol. Please add at least one of the HTTP2 capable ciphers or set tls-max-version to TLS1.3.")
sctx.lg.Error("Configure https server failed", zap.Error(grpcStartError))
return grpcStartError
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see https://github.com/etcd-io/etcd/pull/16850/files#r1405109505 .

etcd should have already failed before entering into this method.

Suggested change
// Verify that HTTP2 is available, and if not terminate early
var hasH2 bool
for _, p := range tlscfg.NextProtos {
if p == "h2" {
hasH2 = true
}
}
if !hasH2 {
grpcStartError := errors.New("GRPC cannot be started without HTTP2 protocol. Please add at least one of the HTTP2 capable ciphers or set tls-max-version to TLS1.3.")
sctx.lg.Error("Configure https server failed", zap.Error(grpcStartError))
return grpcStartError
}

Copy link

stale bot commented Mar 17, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed after 21 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Mar 17, 2024
@jmhbnz
Copy link
Member

jmhbnz commented May 23, 2024

Discussed during sig-etcd triage meeting. Would be great to finish this off @pschou do you have time to rebase this and please address the two comments from @ahrtr? Thanks!

@stale stale bot removed the stale label May 23, 2024
@k8s-ci-robot
Copy link

@pschou: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-etcd-verify f67aec5 link true /test pull-etcd-verify

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

5 participants