Networking/VSOCK communication not working reliably after hibernation (Linux host) #4929

@kpouget

Description

While looking at the hibernation capabilities of the Linux guest, I noticed that host<>guest networking did not work reliably after the guest had hibernated and resumed.

Investigation showed that this network is built on top of VSOCK:

crc daemon <> vsock server <> ... <host|guest> .... vsock client <> gvforwarder

and the problem was reproducible at the vsock communication level: vsock server <> ... <host|guest> .... vsock client.

Prerequisite

  • Linux on the guest and host.
    • my host kernel: 6.14.0-63.fc42.x86_64
    • my guest kernel: 5.14.0-570.25.1.el9_6.x86_64
  • ability to hibernate/wake up the guest
    • see my raw notes in the <details> at the bottom of this ticket to try it with crc 2.53.0.
  • libvirt + qemu
    • my qemu-kvm is 9.2.4
    • libvirt is 11.0.0

Reproducer

python3 vsock-server.py 1025
python3 vsock-client.py 1025 --multiple

==> Run this (server on the host, client in the guest) after two hibernations. After a single hibernation it works well.
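The vsock-server.py script itself is not included in this report. As a minimal sketch of what such a host-side server might look like, assuming it only accepts vsock connections and prints whatever it receives, the following could be used. The names make_vsock_listener, receive_one and serve_forever are hypothetical and not taken from the original script.

```python
import socket


def make_vsock_listener(port):
    """Return a listening AF_VSOCK stream socket (Linux only).

    Hypothetical helper: binds to VMADDR_CID_ANY so that any guest can connect.
    """
    listener = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    listener.bind((socket.VMADDR_CID_ANY, port))
    listener.listen()
    return listener


def receive_one(listener, on_data=print):
    """Accept one connection and pass each received chunk to on_data.

    Returns the total number of bytes received once the peer closes.
    """
    conn, _peer = listener.accept()
    total = 0
    with conn:
        while True:
            data = conn.recv(4096)
            if not data:  # empty read: the peer closed the connection
                break
            total += len(data)
            on_data(data)
    return total


def serve_forever(port):
    """Handle connections one at a time until interrupted."""
    with make_vsock_listener(port) as listener:
        while True:
            receive_one(listener)
```

Under these assumptions, calling serve_forever(1025) on the host would play the role of `python3 vsock-server.py 1025`.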

Results observed

The client script opens a vsock, writes hello on it, and closes it. With --multiple, it repeats this sequence multiple times. After a number of tries (40), this error occurs:

TimeoutError: [Errno 110] Connection timed out

and the vsock link cannot be used anymore: no data is received by the host.

Running this command:

python3 vsock-client.py 1025 --infinite

sends an infinite number of hello messages, always with the same vsock. The link does not break.
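The difference between the two client modes can be sketched as follows. This is not the original vsock-client.py, which is not included in this report; it is a minimal illustration, assuming the client runs in the guest and connects to the host CID (2). The function names are hypothetical, and the connect parameter is any callable that returns a connected socket.

```python
import socket

HOST_CID = 2  # value of VMADDR_CID_HOST, the well-known CID of the host


def vsock_connect(port, cid=HOST_CID):
    """Open an AF_VSOCK stream connection (Linux only)."""
    sock = socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM)
    try:
        sock.connect((cid, port))
    except OSError:
        sock.close()
        raise
    return sock


def send_multiple(connect, count, payload=b"hello"):
    """--multiple style: open a new connection for every message."""
    for _ in range(count):
        with connect() as sock:
            sock.sendall(payload)


def send_on_one_connection(connect, count=None, payload=b"hello"):
    """--infinite style: reuse a single connection for every message.

    With count=None the loop never ends; a finite count is only for testing.
    Returns the number of messages sent.
    """
    sent = 0
    with connect() as sock:
        while count is None or sent < count:
            sock.sendall(payload)
            sent += 1
    return sent
```

For example, send_multiple(lambda: vsock_connect(1025), 100) would exercise the connect/close path that eventually times out, while send_on_one_connection(lambda: vsock_connect(1025)) would keep a single connection open, the case where the link does not break.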

The timeout error is not the same as the one raised when no server is listening. In that case, the error is:

ConnectionResetError: [Errno 104] Connection reset by peer
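When scripting the reproducer, it may help to tell these two failures apart automatically. The helper below is a hypothetical sketch (the name classify_vsock_error and its return labels are assumptions, not part of the original scripts) that maps the two exceptions shown above to the situations described in this report.

```python
import errno


def classify_vsock_error(exc):
    """Map a connection error to the situation it indicates in this report.

    "link-broken": connection timed out (errno 110 on Linux), seen after hibernation.
    "no-listener": connection reset by peer (errno 104 on Linux), seen when
                   no server is listening.
    "other":       anything else.
    """
    code = getattr(exc, "errno", None)
    if isinstance(exc, TimeoutError) or code == errno.ETIMEDOUT:
        return "link-broken"
    if isinstance(exc, ConnectionResetError) or code == errno.ECONNRESET:
        return "no-listener"
    return "other"
```

A caller would wrap the connect and send calls in try/except OSError and pass the caught exception to this helper.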

Mind that if crc daemon + gvforwarder are running, they will use the vsock and the link will break faster than what I describe above. It is better to shut them down.

Misc: Raw notes for enabling hibernation in CRC

On the host:

  • enable the console
EDITOR=vi virsh -c qemu:///system  edit crc
# update the serial and console blocks to s/stdio/pty

<serial type='pty'>
  <target port='0'/>
</serial>
<console type='pty'>
  <target type='serial' port='0'/>
</console>
  • disable virtio-fs
    <filesystem type='mount' accessmode='passthrough'>
      <driver type='virtiofs'/>
      <source dir='/home/kpouget'/>
      <target dir='dir0'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </filesystem>
  • add a disk for the swap
SWAP_DISK=$HOME/.crc/machines/crc/swap.qcow2
qemu-img create -f qcow2 "$SWAP_DISK" 11G
virsh -c qemu:///system  attach-disk crc "$SWAP_DISK" vdswap --subdriver qcow2 --persistent
# virsh -c qemu:///system  detach-disk crc vdswap --persistent
  • copy the kubeconfig to the guest
scp .kube/config crc:.kube/config
  • extend the memory
virsh -c qemu:///system  destroy crc
virsh -c qemu:///system  setmaxmem crc 16G
virsh -c qemu:///system  setmem crc 16G --config
  • restart with the console
virsh -c qemu:///system  start crc;  virsh -c qemu:///system  console crc

On the guest

  • Enable the console
sudo mount -o remount,rw /boot
sudo vi /boot/loader/entries/ostree-1.conf
# add console=ttyS0,115200 to the options
  • set the password of the core user
crc config set developer-password core

or

sudo su
passwd core
  • Allow kubelet to work with swap (requires a node reboot)
oc label machineconfigpool master kubelet-swap=enabled
apiVersion: machineconfiguration.openshift.io/v1
kind: KubeletConfig
metadata:
  name: swap-config
spec:
  machineConfigPoolSelector:
    matchLabels:
      kubelet-swap: enabled
  kubeletConfig:
    failSwapOn: false

On the guest

  • enable the swap
sudo mkswap /dev/vda
sudo swapon /dev/vda
  • enable the reboot-from-swap
sudo mount -o remount,rw /boot
sudo blkid /dev/vda
/dev/vda: UUID="a8491f2c-46b1-41c0-bd95-2cc77d90675b" TYPE="swap"
sudo vi /boot/loader/entries/ostree-1.conf
# add resume=UUID=(see above)
  • hibernate
sudo systemctl hibernate
  • restart from hibernation
virsh -c qemu:///system  start crc;  virsh -c qemu:///system  console crc
