drivers: Allow parallel start for safe vm drivers#23112
Conversation
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: nirs The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/ok-to-test |
This comment has been minimized.
This comment has been minimized.
|
/retest-required |
1 similar comment
|
/retest-required |
Add envs/provider.yaml creating 6 tiny clusters in parallel for testing minikube parallel clusters creation. Related-to: kubernetes/minikube#23112 Signed-off-by: Nir Soffer <nsoffer@redhat.com>
54ddce7 to
e761875
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Replace exponential backoff retry with a simple 2s polling loop,
consistent with other drivers (vfkit, krunkit, qemu).
Before (exponential backoff, actual timestamps from logs):
0.000s attempt 1
0.219s attempt 2
0.492s attempt 3
0.900s attempt 4
1.284s attempt 5
2.016s attempt 6
2.952s attempt 7
4.025s attempt 8
5.103s attempt 9
6.477s attempt 10
8.066s attempt 11
10.230s attempt 12
12.566s attempt 13
15.990s attempt 14
(retry.go: will retry after 5.3s)
21.285s attempt 15 <-- found IP
After (fixed 2s interval, actual timestamps from logs):
0.000s attempt 1
2.005s attempt 2
4.010s attempt 3
6.015s attempt 4
8.020s attempt 5
10.025s attempt 6
12.031s attempt 7
14.037s attempt 8
16.042s attempt 9
18.046s attempt 10 <-- found IP
The first 7 attempts in the old schedule fire within 2.9 seconds at
sub-second intervals. DHCP negotiation cannot complete this fast, so
every one of these attempts is guaranteed to fail — they only add load
on libvirt.
The last attempts 10-15 waited more than 2 second, increasing the time
to detect the IP, slowing start time by 3.2 seconds. Exponential backoff
is not the right tool for this job.
- Pass the domain object to avoid LookupDomainByName on every iteration
- Use d.PrivateMAC directly instead of parsing domain XML for the MAC
- Log start with timeout and finish with elapsed time
- Log attempt count for easier debugging
- Extract lookupIP and reserveIP to clarify the wait loop flow
The KVM driver's GetIP() re-queries libvirt for the IP address on every
call instead of returning the stored d.IPAddress. This causes a bug
under parallel stress:
Call chain:
1. waitForStaticIP() finds IP, sets d.IPAddress = "192.168.x.x"
2. Start() returns successfully
3. saveHost() calls api.Save(h) — persists driver JSON (with IP set)
4. saveHost() calls h.Driver.GetIP() to store in node config (n.IP)
5. GetIP() ignores d.IPAddress, opens 2 new libvirt connections,
calls ipFromXML() which queries dhcpLease() for the MAC
6. Under stress, dhcpLease() returns nil (lease not yet visible in
the network's DHCP database) — returns ("", nil)
7. saveHost() stores n.IP = ""
8. setupKubeconfig() calls ControlPlaneEndpoint() with cp.IP = ""
9. net.LookupIP("") fails → "failed to lookup ip for \"\""
Fix by returning the stored d.IPAddress when available (fast path),
falling back to querying libvirt only when the IP is not stored and
the domain is running. This is consistent with vfkit and krunkit which
also return the stored IP directly.
The fallback queries the domain's live interface addresses
(ListAllInterfaceAddresses) rather than the network's DHCP lease
database (GetDHCPLeases). This is more reliable because:
- The domain's interfaces reflect the actual assigned IP
- The lease database may be stale when addStaticIP races with
concurrent network updates from other VMs
- No need to look up the MAC from domain XML (d.PrivateMAC is known)
This comment has been minimized.
This comment has been minimized.
Currently acquireMachinesLock uses a single lock per driver, serializing all "minikube start" commands for profiles using the same driver. This was needed for VirtualBox (VBoxManage cannot handle concurrent calls) but is unnecessary for drivers where each profile creates an independent VM or container with no shared global state. In drenv (RamenDR test framework) we create 3 minikube clusters in parallel using kvm2. Due to the shared lock, cluster creation is serialized for ~25 seconds per cluster: [hub] Cluster started in 41.09 seconds [dr2] Cluster started in 64.46 seconds [dr1] Cluster started in 89.48 seconds All three clusters are requested at the same time, but the lock forces them to wait. The ~48 second spread between the first and last cluster is pure lock serialization overhead. The random lock acquisition order causes up to 50 seconds of runtime variance between runs (p95: 6m50s when both large clusters start first vs 7m40s when they start last). Add a Parallel property to the driver registry. Drivers that set Parallel to true use a per-profile lock, allowing concurrent creation of multiple profiles. Drivers that do not set it (default false) keep the current serialized behavior. Enable parallel start for kvm2, qemu2, vfkit, krunkit, docker, and podman. These drivers create fully independent VMs or containers and have no shared state that requires serialization. Other drivers (virtualbox, vmware, hyperv, hyperkit) are left serialized until someone tests them and confirms they are safe.
|
kvm2 driver with docker runtime DetailsTimes for minikube start: 40.8s 36.9s 40.2s 39.5s 39.8s Times for minikube ingress: 15.3s 19.2s 18.8s 18.7s 19.3s docker driver with docker runtime DetailsTimes for minikube ingress: 12.6s 12.6s 12.6s 12.6s 12.6s Times for minikube start: 18.5s 17.9s 21.6s 18.1s 17.5s docker driver with containerd runtime DetailsTimes for minikube start: 16.4s 19.3s 16.5s 19.5s 15.9s Times for minikube ingress: 23.6s 24.1s 24.1s 23.6s 23.6s |
|
/retest |
Add envs/provider.yaml creating 6 tiny clusters in parallel for testing minikube parallel clusters creation. Related-to: kubernetes/minikube#23112 Signed-off-by: Nir Soffer <nsoffer@redhat.com>
|
@nirs: The following test failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
This comment was marked as outdated.
This comment was marked as outdated.
|
/retest-required |
|
@obnoxxx can you review? |
Add envs/provider.yaml creating 6 tiny clusters in parallel for testing minikube parallel clusters creation. Related-to: kubernetes/minikube#23112 Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Add envs/provider.yaml creating 6 tiny clusters in parallel for testing minikube parallel clusters creation. Related-to: kubernetes/minikube#23112 Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Add envs/provider.yaml creating 6 tiny clusters in parallel for testing minikube parallel clusters creation. Related-to: kubernetes/minikube#23112 Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Add envs/provider.yaml creating 6 tiny clusters in parallel for testing minikube parallel clusters creation. Related-to: kubernetes/minikube#23112 Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Add envs/provider.yaml creating 6 tiny clusters in parallel for testing minikube parallel clusters creation. Related-to: kubernetes/minikube#23112 Signed-off-by: Nir Soffer <nsoffer@redhat.com>
Currently acquireMachinesLock uses a single lock per driver, serializing all "minikube start" commands for profiles using the same driver. This was needed for VirtualBox (VBoxManage cannot handle concurrent calls) but is unnecessary for drivers where each profile creates an independent VM or container with no shared global state.
In drenv (RamenDR test framework) we create 3 minikube clusters in parallel using kvm2. Due to the shared lock, cluster creation is serialized for ~25 seconds per cluster:
All three clusters are requested at the same time, but the lock forces them to wait. The ~48 second spread between the first and last cluster is pure lock serialization overhead. The random lock acquisition order causes up to 50 seconds of runtime variance between runs (p95: 6m50s when both large clusters start first vs 7m40s when they start last).
Add a Parallel property to the driver registry. Drivers that set Parallel to true use a per-profile lock, allowing concurrent creation of multiple profiles. Drivers that do not set it (default false) keep the current serialized behavior.
Enable parallel start for kvm2, qemu2, vfkit, krunkit, docker, and podman. These drivers create fully independent VMs or containers and have no shared state that requires serialization. Other drivers (virtualbox, vmware, hyperv, hyperkit) are left serialized until someone tests them and confirms they are safe.
vfkit test results
Tested with drenv, creating 6 small clusters in parallel. Testing shows that start time for 6 clusters decreased from 36.3 seconds to 22.7 seconds (1.61x faster).
Before
After
kvm test results
Tested with drenv, creating 6 small clusters in parallel. Testing shows that start time for 6 clusters decreased from 141.1 seconds to 63.3 seconds (2.2x faster).
Before
After
drenv stress test results
drenv stress test starting 3 minikube cluster using kvm driver and provisioning the cluster for DR testing. Tested with latest minikube release (1.38.1) and this PR.
The test show decreased start time from 426s to 406s (1.05x faster), and decreased variance from 16.9s to 10s (1.69x better).
Notes: all failures are known issues, not related to this change.
Before - drenv addon times
With latest release cluster start order is random and it has big effect on addon deployment times.
After - drenv addons times
With this change all clusters starts in parallel which make addons start time much more consistent and predictable.