start: add --mdns flag to enable mDNS on every VM start#22964
start: add --mdns flag to enable mDNS on every VM start#22964carterpewpew wants to merge 1 commit into
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: carterpewpew The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Welcome @carterpewpew! |
|
Hi @carterpewpew. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with Regular contributors should join the org to skip this step. Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
|
Can one of the admins verify this patch? |
|
Thanks @carterpewpew! This is very exciting change that will make my life much nicer. I'll review this later this week. |
|
/ok-to-test |
This comment has been minimized.
This comment has been minimized.
nirs
left a comment
There was a problem hiding this comment.
Thanks for picking this up and putting together such a clean PR! The code is well-organized, the commit message is excellent, and the defensive error handling (warning instead of aborting) is right for an optional feature like this.
I have a few suggestions before we merge:
- The mDNS setting is lost after
minikube stop && minikube start— it needs to move from provisioning to the start path so it runs on every start (see inline comment with reproduction steps). - The unit test should be replaced with an integration test that verifies mDNS actually works, including after a stop/start cycle.
- The flag should be
--mdnsto match the naming convention of other boolean flags. - Use a config constant instead of a string literal, and remove the
RequiresRestartMsgcallback.
Sorry for the long list — the PR is really close, these are just refinements to make it solid.
|
@nirs All four addressed: --mdns naming, moved to startMachine so it survives reboots, removed the misleading restart callback, and added an integration test that validates persistence across stop/start. PTAL, Thanks again for the thorough review and feedback. |
nirs
left a comment
There was a problem hiding this comment.
@carterpewpew Thanks! please the see the comments about using the config value and the integration test.
|
/ok-to-test |
nirs
left a comment
There was a problem hiding this comment.
Added more comments on the new test.
|
thanks for this PR @carterpewpew please rebase |
There was a problem hiding this comment.
Pull request overview
Adds an opt-in --mdns start flag (and corresponding minikube config key) that enables mDNS (.local) resolution on the cluster node by running sudo resolvectl mdns eth0 yes during every startMachine, so it survives stop/start cycles. The flag is persisted into ClusterConfig and is a no-op for non-VM drivers. Addresses issue #22804.
Changes:
- Add
MDNSfield toClusterConfig, themdnsconfig key, and the--mdnsstart flag wiring. - Invoke
resolvectl mdns eth0 yesvia the runner duringstartMachinefor VM drivers when enabled. - Add an integration test (
TestMDNS) verifying mDNS after start and after a stop/start cycle.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| pkg/minikube/config/types.go | Adds MDNS bool to ClusterConfig. |
| pkg/minikube/config/config.go | Adds the MDNS config key constant. |
| cmd/minikube/cmd/config/config.go | Registers MDNS as a settable bool config option. |
| cmd/minikube/cmd/start_flags.go | Declares the --mdns flag and wires it into new/existing cluster configs. |
| pkg/minikube/node/start.go | Runs resolvectl mdns eth0 yes on every start for VM drivers when MDNS is set. |
| test/integration/mdns_test.go | Adds an integration test to verify mDNS is enabled and survives stop/start. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
lets have a tutorial page in our site for it and make sure we unrestand and tell users how it is different than |
This comment has been minimized.
This comment has been minimized.
|
/ok-to-test |
There was a problem hiding this comment.
@carterpewpew Thanks for the updates!
The new tests works nicely and will be easy to debug.
% make integration TEST_ARGS="-minikube-start-args='--driver=vfkit --network=vmnet-shared' -test.run TestMDNS"
go test -ldflags="-X k8s.io/minikube/pkg/version.version=v1.38.1 -X k8s.io/minikube/pkg/version.isoVersion="v1.38.0"-1778539887-22974 -X k8s.io/minikube/pkg/version.gitCommitID="746cbc6fadc5d03d9ebea9bb06a7cb21fc978d72" -X k8s.io/minikube/pkg/version.storageProvisionerVersion=v5" -v -test.timeout=90m ./test/integration --tags="integration libvirt_dlopen" -minikube-start-args='--driver=vfkit --network=vmnet-shared' -test.run TestMDNS 2>&1 | tee "./out/testout_746cbc6fa.txt"
Found 12 cores, limiting parallelism with --test.parallel=6
=== RUN TestMDNS
mdns_test.go:41: (dbg) Run: out/minikube start -p mdns-727000 --mdns --memory=3072 --driver=vfkit --network=vmnet-shared
mdns_test.go:41: (dbg) Done: out/minikube start -p mdns-727000 --mdns --memory=3072 --driver=vfkit --network=vmnet-shared: (16.311147667s)
mdns_test.go:47: (dbg) Run: out/minikube ssh -p mdns-727000 -- sudo resolvectl status eth0
mdns_test.go:52: (dbg) Run: out/minikube stop -p mdns-727000
mdns_test.go:52: (dbg) Done: out/minikube stop -p mdns-727000: (12.203785333s)
mdns_test.go:61: (dbg) Run: out/minikube start -p mdns-727000 --memory=3072 --driver=vfkit --network=vmnet-shared
mdns_test.go:61: (dbg) Done: out/minikube start -p mdns-727000 --memory=3072 --driver=vfkit --network=vmnet-shared: (9.453992375s)
mdns_test.go:67: (dbg) Run: out/minikube ssh -p mdns-727000 -- sudo resolvectl status eth0
helpers_test.go:185: Cleaning up "mdns-727000" profile ...
helpers_test.go:188: (dbg) Run: out/minikube delete -p mdns-727000
--- PASS: TestMDNS (38.20s)
PASS
Tests completed in 38.200416542s (result code 0)
ok k8s.io/minikube/test/integration 39.219s
We still have some issues - minikube is old and complicated and not easy to work with.
| startCmd.Flags().StringP(gpus, "g", "", "Allow pods to use your GPUs. Options include: [all,nvidia,amd] (Docker driver with Docker container-runtime only)") | ||
| startCmd.Flags().Duration(autoPauseInterval, time.Minute*1, "Duration of inactivity before the minikube VM is paused (default 1m0s)") | ||
| startCmd.Flags().String(preloadSrc, "auto", "Which source to download the preload from (valid options: gcs, github, auto). Defaults to auto (try both).") | ||
| startCmd.Flags().Bool(mdns, mdnsDefault(), "If set, enable mDNS (.local address resolution) by configuring systemd-resolved inside the node (VM drivers only).") |
There was a problem hiding this comment.
This provides best user experience - we will see the right default in --help. But this is not consistent with other code, and it read the config twice.
Minikube configures viper to read minikube config, so we can use here viper.GetBool(mdns) to get the default.
The issue is that it does not work now, since minikube call viper.ReadConfig() too late - after the flags were created. This should be fixed to create read config before flags are created.
So we have 2 options - keep the "false" hardcoded value, and replace with viper.GetBool(mdns) when viper integration is fixed, or use now viper.GetBool(mdns) and fix the viper issue properly. I'm fine with both options.
There was a problem hiding this comment.
Kept the hardcoded false default for now. Will swap to viper.GetBool(mdns) once the viper config read ordering is fixed upstream.
| return n | ||
| } | ||
|
|
||
| func mdnsDefault() bool { |
There was a problem hiding this comment.
Adding special helper to read the config for every flag does not scale and not efficient. If we do this for every flag we will read the config many times during start.
There was a problem hiding this comment.
Agreed, removed the per-flag config reader. Using hardcoded false now.
| updateDurationFromFlag(cmd, &cc.AutoPauseInterval, autoPauseInterval) | ||
| updateBoolFromFlag(cmd, &cc.Rosetta, rosetta) | ||
| updateBoolFromFlag(cmd, &cc.VmnetOffloading, vmnetOffloading) | ||
| updateBoolFromFlag(cmd, &cc.MDNS, mdns) |
There was a problem hiding this comment.
This works and allow changing the flag when starting an existing machine:
minikube start # mdns disabled
minikube stop
minikube start --mdns # mdns enabled
minikube stop
minikube start # mdns enabled
minikube stop
minikube start --mdns=false # mdns disable
This is helps the user to fix mistakes 0 they forgot to enable mdns when the cluster was created.
But it is not fully baked - this will not work:
minikube start --mdns # mdns enabled
minikube start --mdns=false # mdns disabled in the running vm
minikube stop
minikube start # mdns disabled
I think the callback you add in the first version to warn that the user needs to restart is the way to handle this case, but this is a very poor user exprience.
The right way to support enable and disabling the flag dynamically is to disable mdns when on a running vm. But for this we need to keep the value in the config as a pointer so it can present:
- nil: mdns was not specified
- true: mdns enabled
- false: mdns disabled
But we don't have the infra to do this - the current code supports only bool values.
So we have 2 options:
- Support partial working flag that does not work running cluster
- Requires adding test coverage for dynamic changes - we need to cover disabling mdns on a cluster created with --mdns and enabling it back - that's 2 more start and stop test steps.
- Disable support for dynamic modifications - make --mdns create time flag. You need to recreate the cluster if you want to change it
- Current test are good enough of this.
I suggest we remove the update here to make the flag create time flag to keep the tests simpler. We can add dynamic enable/disable and support changing running vms later if users will request this feature. I think this is lower priority before we have a request, and we have much more important work.
BTW, rosetta and ventOffloading options are bugs - they do not work every for starting a stopped vm, since these options use ~/.minikube/machines/NAME/config.json and not ~/.minikube/profiles/NAME/config.json - which is the file modified in this function.
There was a problem hiding this comment.
Went with option 2, --mdns is create-time only. No update in updateExistingConfigFromFlags, users recreate to change it.
| } | ||
|
|
||
| if cfg.MDNS && driver.IsVM(hostInfo.Driver.DriverName()) { | ||
| if _, err := runner.RunCmd(exec.Command("sudo", "resolvectl", "mdns", "eth0", "yes")); err != nil { |
There was a problem hiding this comment.
It will help to add a comment about eth0, or add a constant. AI agents complain that this depends the distro. This is right since we use buildroot. The comment can explain why and help to fix this is we switch to another distro later.
A more robust way would be to use a constant from the code assuming buildroot, so it is clear that code depend on the distro. I'm not sure we have a good place of this, so adding a constant here should be good enough for now.
There was a problem hiding this comment.
Extracted eth0 into a primaryInterface constant with a comment noting it's buildroot-specific.
| } | ||
|
|
||
| // 5. Validate mDNS is still configured | ||
| assertMDNSEnabled(restartCtx, t, profile) |
There was a problem hiding this comment.
The test looks correct now, we run the entire flow so we will not have false negative like the other tests using sub tests. We will also get good logs when the test fails since we see the entire flow in test. This is another issue with tests trying to use very small sub tests.
One issue, we don't don't test that starting without --mdns does not enable mdns - if the code is broken and we always enable it the tests will succeed.
If we add a test it may fail if systemd changes the default and enable mdns by default, but in this case we want to detect the issue quickly and simplify the code.
Adding a test for disable --mdns should be skipped if minikube is configured to enable mdns by default.
So we are missing another test that start minikube without the flag and test that mdns is not set. This can be a test that run in parallel with this test. We can create 2 sub tests with MaybeParallel() so they we wait only for the slower test.
This again can be adde to another test that start a cluster, but it does not make sense to split tests like this. We want to have clean tests that are easy to modify and debug, and handle performance issue by running them in parallel.
| defer CleanupWithLogs(t, profile, func() {}) | ||
|
|
||
| // 1. Start with --mdns | ||
| startCtx, startCancel := context.WithTimeout(context.Background(), Minutes(5)) |
There was a problem hiding this comment.
We need to use t.Context() and not context.Background(). This context is canceled when we reach the global test timeout.
| // 4. Start without --mdns (setting is persisted in cluster config) | ||
| restartCtx, restartCancel := context.WithTimeout(context.Background(), Minutes(5)) | ||
| defer restartCancel() | ||
| restartArgs := append([]string{"start", "-p", profile, "--memory=3072"}, StartArgs()...) |
There was a problem hiding this comment.
This should run only "minkube start -p profile" no other argument is needed.
| // 1. Start with --mdns | ||
| startCtx, startCancel := context.WithTimeout(context.Background(), Minutes(5)) | ||
| defer startCancel() | ||
| startArgs := append([]string{"start", "-p", profile, "--mdns", "--memory=3072"}, StartArgs()...) |
There was a problem hiding this comment.
We can cover the config here by checking if minikube is configured to enable mdns by default - in this case we don't need to add "--mdns" to the command and the test should succeed.
|
kvm2 driver with docker runtime DetailsTimes for minikube start: 37.1s 38.1s 37.1s 38.7s 36.9s Times for minikube (PR 22964) ingress: 19.3s 15.2s 18.8s 15.2s 15.2s docker driver with docker runtime DetailsTimes for minikube start: 19.1s 21.6s 21.1s 21.3s 18.8s Times for minikube ingress: 12.7s 12.6s 12.7s 12.7s 12.7s docker driver with containerd runtime DetailsTimes for minikube ingress: 23.1s 25.1s 25.1s 23.7s 25.1s Times for minikube start: 16.0s 19.3s 17.2s 19.9s 19.4s |
|
@carterpewpew Please check #23069 - very similar problem, it would be nice to have configureMDNS() in node/dns.go, and we can test both using the same dns_test.go. Since writing integration tests is the hard part and we are close to the release, we can split the integration test to another pr and merge the fix now. |
|
Thank you @nirs for the inputs and apologies for the delay. I will take a look at this this week. |
@nirs Sounds good, split the integration test out and kept this PR to just the core feature. Moved the resolvectl logic into pkg/minikube/node/dns.go as suggested. PTAL |
Add an opt-in --mdns start flag and minikube config option that enables mDNS (.local address resolution) inside the cluster node by running `resolvectl mdns eth0 yes` on every start. The resolvectl call lives in configureMDNS() in node/dns.go and is invoked from startMachine (not provisioning) so it persists across stop/start cycles, since resolvectl is a runtime setting that does not survive a VM reboot. The flag is create-time only: it is baked into the cluster config at first start and does not need to be repeated on subsequent starts.
|
@carterpewpew: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
nirs
left a comment
There was a problem hiding this comment.
@carterpewpew thanks, deferring the test is the right way. Let's make this more like #23069.
Feel free to review #23069.
| AutoPauseInterval: viper.GetDuration(autoPauseInterval), | ||
| Rosetta: getRosetta(drvName), | ||
| VmnetOffloading: getVmnetOffloading(drvName), | ||
| MDNS: viper.GetBool(mdns), |
There was a problem hiding this comment.
Lets add getMDNS() helper like in #23069, so we filter non vm drivers (!driver.IsVM(), driver.IsSSH()) and warn that the flag will be ignored if the flag changed.
a2544ac
This will allow simplifying the code later, since cfg.MDNS will be set only if we need to use it, and user will get better feedback if they use an option that does not work with the current driver.
|
|
||
| // primaryInterface is the main network interface in the buildroot-based minikube ISO. | ||
| // Update this if the ISO switches to a different distro or naming scheme. | ||
| const primaryInterface = "eth0" |
There was a problem hiding this comment.
Good, I need to adopt this in 781cab1#diff-be4fac8e21b0e76d789b6aad73dceccc3118e0cbc88c1345e8aab848722e5e14R56
Check how we use "eth0" in other code, maybe we can have this constant in a common location, maybe part of the buildroot provisioner.
| func configureMDNS(runner command.Runner) { | ||
| if _, err := runner.RunCmd(exec.Command("sudo", "resolvectl", "mdns", primaryInterface, "yes")); err != nil { | ||
| out.WarningT("Failed to enable mDNS on {{.iface}}: {{.error}}", out.V{"iface": primaryInterface, "error": err}) | ||
| } |
There was a problem hiding this comment.
Lets keep this style to make the code easier to follow and change.
cmd := exec.Command("sudo", "resolvectl", "mdns", primaryInterface, "yes")
if _, err := runner.RunCmd(cmd); err != nil {
out.WarningT("Failed to enable mDNS on {{.iface}}: {{.error}}", out.V{"iface": primaryInterface, "error": err})
}This is easier to read and easier to modify, for example if we want to set cmd fields later.
|
|
||
| if cfg.MDNS && driver.IsVM(hostInfo.Driver.DriverName()) { | ||
| configureMDNS(runner) | ||
| } |
There was a problem hiding this comment.
If we do the check early in start_flags.go we can use here:
configureMDNS(runner, cfg.MDNS)Same as #23069
| // configureMDNS enables mDNS (.local address resolution) on the VM by configuring | ||
| // systemd-resolved. This allows the guest to resolve other machines on the local | ||
| // network that advertise via mDNS. | ||
| func configureMDNS(runner command.Runner) { |
There was a problem hiding this comment.
Lets pass the cfg.MDNS value and skip if it is not enabled. Keeping the logic of when to configure here keeps the caller code clean and make future changes easier. For exmple if we move to three-sate (true, false, nil) we need to change only this function.
|
PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
Add an opt-in
--mdnsstart flag andminikube config set mdns trueoption that enables mDNS (.local address resolution) inside the cluster node by runningresolvectl mdns eth0 yeson every start.The resolvectl call lives in
configureMDNS()inpkg/minikube/node/dns.goand is invoked fromstartMachine(not provisioning) so it persists across stop/start cycles, since resolvectl is a runtime setting that does not survive a VM reboot.false, no behavior change for existing usersdriver.IsVM()guardout.WarningT, start continuesFixes #22804