RDMA breaking Passthrough - TASK ERROR: timeout waiting on systemd

Feb 4, 2024
Hello all, for a few days now we have been running on Vitastor, which can leverage RDMA. Since enabling RDMA, we have the problem that VMs with PCI passthrough GPUs only boot right after a server boot; when we then try to restart them, or shut them down and power them on again, the VMs won't start.


Is that a known problem? We enabled RDMA on ConnectX-4 and ConnectX-5 cards.

Jul 15 15:14:18 pve1 systemd[1]: Watchdog running with a hardware timeout of 10min.
Jul 15 15:14:18 pve1 systemd-shutdown[1]: Watchdog running with a hardware timeout of 10min.
Jul 15 16:34:12 pve1 pvedaemon[66214]: timeout waiting on systemd
Jul 15 16:34:12 pve1 pvedaemon[2363]: <root@pam> end task UPID:pve1:000102A6:00071BA4:687666D0:qmstart:188:root@pam: timeout waiting on systemd



proxmox-ve: 8.4.0 (running kernel: 6.8.12-11-pve)
pve-manager: 8.4.1 (running version: 8.4.1/2a5fa54a8503f96d)
proxmox-kernel-helper: 8.1.1
proxmox-kernel-6.8.12-11-pve-signed: 6.8.12-11
proxmox-kernel-6.8: 6.8.12-11
proxmox-kernel-6.8.12-10-pve-signed: 6.8.12-10
proxmox-kernel-6.5.13-6-pve-signed: 6.5.13-6
proxmox-kernel-6.5: 6.5.13-6
proxmox-kernel-6.5.11-8-pve-signed: 6.5.11-8
ceph-fuse: 19.2.1-pve3
corosync: 3.1.9-pve1
criu: 3.17.1-2+deb12u1
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.30-pve2
libproxmox-acme-perl: 1.6.0
libproxmox-backup-qemu0: 1.5.1
libproxmox-rs-perl: 0.3.5
libpve-access-control: 8.2.2
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.1.0
libpve-cluster-perl: 8.1.0
libpve-common-perl: 8.3.1
libpve-guest-common-perl: 5.2.2
libpve-http-server-perl: 5.2.2
libpve-network-perl: 0.11.2
libpve-rs-perl: 0.9.4
libpve-storage-perl: 8.3.6
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.6.0-2
proxmox-backup-client: 3.4.1-1
proxmox-backup-file-restore: 3.4.1-1
proxmox-firewall: 0.7.1
proxmox-kernel-helper: 8.1.1
proxmox-mail-forward: 0.3.2
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.7
proxmox-widget-toolkit: 4.3.11
pve-cluster: 8.1.0
pve-container: 5.2.7
pve-docs: 8.4.0
pve-edk2-firmware: 4.2025.02-3
pve-esxi-import-tools: 0.7.4
pve-firewall: 5.1.2
pve-firmware: 3.15-4
pve-ha-manager: 4.0.7
pve-i18n: 3.4.4
pve-qemu-kvm: 9.2.0-5+vitastor1
pve-xtermjs: 5.5.0-2
qemu-server: 8.3.14
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.7-pve2
 
error writing '1' to '/sys/bus/pci/devices/0000:01:00.0/reset': Inappropriate ioctl for device
failed to reset PCI device '0000:01:00.0', but trying to continue as not all devices need a reset
TASK ERROR: timeout waiting on systemd
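
In case it helps with narrowing this down, here is a rough sketch of a helper that collects what sysfs says about the device from the reset error above. It assumes the standard Linux sysfs layout under /sys/bus/pci/devices and that the kernel exposes the reset_method attribute (as far as I know this exists on recent kernels). The function name pci_reset_info is made up for illustration and it is not part of any Proxmox tooling. It only reads state and never triggers a reset.

```python
import os


def pci_reset_info(addr, sysfs_root="/sys/bus/pci/devices"):
    """Collect read-only reset-related facts about one PCI device from sysfs.

    addr is a full PCI address such as "0000:01:00.0".
    sysfs_root is a parameter only so the function can be pointed at a test tree.
    """
    dev = os.path.join(sysfs_root, addr)

    def read_attr(name):
        # Return the stripped attribute content, or None if missing/unreadable.
        try:
            with open(os.path.join(dev, name)) as f:
                return f.read().strip()
        except OSError:
            return None

    def link_target(name):
        # "driver" and "iommu_group" are symlinks; the target's basename is the value.
        path = os.path.join(dev, name)
        if os.path.islink(path):
            return os.path.basename(os.path.realpath(path))
        return None

    return {
        "exists": os.path.isdir(dev),
        "driver": link_target("driver"),            # e.g. vfio-pci while passed through
        "iommu_group": link_target("iommu_group"),
        "has_reset": os.path.exists(os.path.join(dev, "reset")),
        "reset_method": read_attr("reset_method"),  # assumed attribute, None if absent
    }


if __name__ == "__main__":
    # Address taken from the task log above.
    print(pci_reset_info("0000:01:00.0"))
```

Running it as root once right after a server boot (when the VM starts fine) and once after a failed start should show whether the bound driver or the available reset methods differ between the two states.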