Strange PCIe passthrough behavior

pierloh

Hi there,

I am setting up Proxmox on a new beefy system (dual EPYC 7742, 512 GB of memory, and 4 NVMe drives).

Upon setting up my first VM, I went all in and assigned 128 cores across 2 sockets with NUMA enabled, plus a good chunk of memory. I was able to boot it up and install my OS as normal. At that stage, starting and stopping the VM is practically instantaneous.

However, as soon as I assign any device for PCIe passthrough, such as my NVMe drives, the following happens in order:

- host memory usage ramps up to the maximum guest allowance
- the guest hangs for about 15-20 minutes and prevents me from loading its VNC console, or even SSHing in (the Proxmox task log shows "Error: Failed to run vnc proxy")
- once booted, I can see my passed-through PCIe devices listed via `lspci`, but none of the drives show up in `lsblk` (quick checks sketched below)
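
These are the kind of in-guest checks behind that last point (standard commands; the exact flags are just my choice):

Code:
# Controllers visible on the guest's PCI bus
lspci -nn | grep -i nvme
# Block devices actually exposed to the guest (no partitions)
lsblk -d -o NAME,MODEL,SIZE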

I tried reducing the amount of memory and the number of cores per socket, going back to a single socket, and disabling NUMA, but I get the same behavior and results. Oddly enough, when I lower the total core count to around 32 or fewer, these issues partly resolve:

- host memory usage still ramps up when the VM starts
- no hanging, the VM boots immediately after that
- the NVMe drives all show up via both `lspci` and `lsblk`

I couldn't find any information online about VMs this large, so I'm unsure whether this is common behavior for one reason or another. I should also mention that none of these issues occur when I assign the same NVMe drives to the VM as hard disks instead; on the contrary, it appears to boot just as fast as before adding any PCIe passthrough devices. I would greatly appreciate any advice on how to resolve this.
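
For clarity, this is roughly how I attach a drive as a hard disk instead of PCIe passthrough (the by-id path below is a placeholder rather than my actual device, and scsi1 is just a free slot):

Code:
# Pass the whole NVMe namespace to the VM as a virtual SCSI disk
qm set 112 --scsi1 /dev/disk/by-id/nvme-EXAMPLE_MODEL_SERIAL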

This is my VM config and pve version in case it helps; please let me know what other information I can provide to help understand what is happening.

Code:
# cat /etc/pve/qemu-server/112.conf
agent: 1
balloon: 0
boot: order=scsi0;net0
cores: 128
cpu: host
hostpci0: 0000:81:00.0,pcie=1
hostpci1: 0000:82:00.0,pcie=1
hostpci2: 0000:83:00.0,pcie=1
hostpci3: 0000:84:00.0,pcie=1
machine: q35
memory: 458752
meta: creation-qemu=7.0.0,ctime=1663244103
name: fedora
net0: virtio=62:8E:05:39:8E:8A,bridge=vmbr0,firewall=1
numa: 1
ostype: l26
scsi0: local-zfs:vm-112-disk-0,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=afb7e0d3-b63f-433c-bf77-a6c4870bcfb8
sockets: 2
vmgenid: 265cd566-241a-45bc-9cb3-2a4133db5879

Code:
# pveversion -v
proxmox-ve: 7.2-1 (running kernel: 5.15.53-1-pve)
pve-manager: 7.2-11 (running version: 7.2-11/b76d3178)
pve-kernel-helper: 7.2-12
pve-kernel-5.15: 7.2-10
pve-kernel-5.15.53-1-pve: 5.15.53-1
ceph-fuse: 15.2.14-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve1
libproxmox-acme-perl: 1.4.2
libproxmox-backup-qemu0: 1.3.1-1
libpve-access-control: 7.2-4
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.2-2
libpve-guest-common-perl: 4.1-2
libpve-http-server-perl: 4.1-3
libpve-storage-perl: 7.2-8
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.0-3
lxcfs: 4.0.12-pve1
novnc-pve: 1.3.0-3
proxmox-backup-client: 2.2.6-1
proxmox-backup-file-restore: 2.2.6-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.5.1
pve-cluster: 7.2-2
pve-container: 4.2-2
pve-docs: 7.2-2
pve-edk2-firmware: 3.20220526-1
pve-firewall: 4.2-6
pve-firmware: 3.5-1
pve-ha-manager: 3.4.0
pve-i18n: 2.7-2
pve-qemu-kvm: 7.0.0-3
pve-xtermjs: 4.16.0-1
qemu-server: 7.2-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.7.1~bpo11+1
vncterm: 1.7-1
zfsutils-linux: 2.1.5-pve1
 
I did a test setting up Unraid with a similarly spec'd VM on the same server, passing through the same storage devices, and it booted without trouble. Memory still ramps up on boot as soon as I pass through any PCIe device, but apart from that I was able to boot and detect all the drives with all cores assigned.

My understanding is that Proxmox drives QEMU/KVM directly while Unraid manages it through libvirt, but both ultimately use QEMU as the low-level virtualisation engine. Maybe I am not getting this right, but could there be a clue around there? Or could it be the fact that Unraid uses CPU pinning when assigning cores, which possibly handles hardware initialization differently?
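
To rule pinning out, I am thinking of a hookscript along these lines (a rough sketch of my own; the core range is illustrative and would need to match the host topology from `lscpu`):

Code:
#!/bin/bash
# Sketch of a taskset-based pinning hookscript (an assumption, not a tested script).
# Proxmox invokes hookscripts as: <script> <vmid> <phase>
vmid="$1"
phase="$2"

if [ "$phase" = "post-start" ]; then
    # Pin all QEMU threads of this VM to an example core range (adjust to the host)
    qemu_pid=$(cat "/var/run/qemu-server/${vmid}.pid")
    taskset --all-tasks --cpu-list --pid 64-127 "$qemu_pid"
fi

exit 0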

Would love to stick with Proxmox in this use case, any advice welcome!
 
Just a shot in the dark, but did you try with the BIOS set to OVMF instead of SeaBIOS? Also, a `dmesg` from inside the guest would be interesting.
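
Switching an existing VM over is roughly this from the CLI (using the VM ID and storage from your config above; adjust as needed):

Code:
qm set 112 --bios ovmf
# OVMF needs an EFI vars disk; this allocates a small one on local-zfs
qm set 112 --efidisk0 local-zfs:1,efitype=4m,pre-enrolled-keys=1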
 
I rebuilt my VM using the UEFI (OVMF) BIOS, but I am getting the same resulting behavior. I did reduce its memory a bit so it boots up a little quicker. See the output of `dmesg` with and without PCIe passthrough attached to this response.

This is my new VM config:

Code:
agent: 1
balloon: 0
bios: ovmf
boot: order=scsi0;net0
cores: 128
cpu: host
hostpci0: 0000:81:00.0,pcie=1
hostpci1: 0000:82:00.0,pcie=1
hostpci2: 0000:83:00.0,pcie=1
hostpci3: 0000:84:00.0,pcie=1
efidisk0: local-zfs:vm-112-disk-1,efitype=4m,pre-enrolled-keys=1,size=1M
hookscript: iso:snippets/taskset-hook.sh
hugepages: 1024
machine: q35
memory: 131072
meta: creation-qemu=7.0.0,ctime=1663244103
name: fedora
net0: virtio=62:8E:05:39:8E:8A,bridge=vmbr0,firewall=1
numa: 1
ostype: l26
scsi0: local-zfs:vm-112-disk-0,size=32G
scsihw: virtio-scsi-pci
smbios1: uuid=afb7e0d3-b63f-433c-bf77-a6c4870bcfb8
sockets: 2
vmgenid: 265cd566-241a-45bc-9cb3-2a4133db5879

When I add the NVMe PCIe passthrough devices, the machine boots into emergency mode with the following message:

Code:
You are in emergency mode. After logging in, type "journalctl -xb" to view system logs, "systemctl reboot" to reboot, "systemctl default" or "exit" to boot into default mode.

I've attached the output of `journalctl -xb` as well.
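
From the emergency shell, the usual systemd tooling is what I am looking at first (generic commands, nothing specific to this setup):

Code:
# Units that failed during boot
systemctl --failed
# Error-and-worse messages from the current boot only
journalctl -b -p err --no-pager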
 


Hmm, sadly nothing in there that would indicate a problem. Do the host logs show anything?

Also a few things that you could try (example commands below):
- using a different CPU model
- using a different machine type (e.g. an older q35 version such as 6.2 instead of 7.0)
- using a different host kernel (e.g. the new optional 5.19 kernel)
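
Roughly like this (adjust the VM ID; the CPU model shown is just one option that matches Rome hardware):

Code:
# Pin an older q35 machine version
qm set 112 --machine pc-q35-6.2
# Try a named CPU model instead of "host"
qm set 112 --cpu EPYC-Rome
# Install the opt-in 5.19 kernel on the host, then reboot into it
apt install pve-kernel-5.19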
 
Thanks for your response. I tried switching to q35 6.2 and kernel 5.19, but I'm getting the same behavior. I haven't tried a different CPU model; did you mean one in particular, other than `host`, in the VM config?

This is my host syslog; its `dmesg` is attached as well. There are some VFIO-related error messages, none of which occur when I lower the core count way back.
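
Side note: since the errors mention MSI-X, the number of MSI-X vectors each drive advertises can be read with plain `lspci` (whether that is actually related here is only a guess on my part):

Code:
for dev in 0000:81:00.0 0000:82:00.0 0000:83:00.0 0000:84:00.0; do
    echo "== $dev =="
    # The MSI-X capability line includes the vector count, e.g. "Count=32"
    lspci -vv -s "$dev" | grep -i 'MSI-X'
done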

Code:
Sep 20 16:59:47 mordor pvedaemon[6286]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 16:59:51 mordor pvestatd[6256]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 16:59:51 mordor pvestatd[6256]: status update time (6.117 seconds)
Sep 20 17:00:01 mordor pvestatd[6256]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:00:01 mordor pvestatd[6256]: status update time (6.112 seconds)
Sep 20 17:00:02 mordor QEMU[25696]: kvm: vfio: Error: event_notifier_init failed
Sep 20 17:00:02 mordor kernel: Spurious interrupt (vector 0xef) on CPU#0. Acked
Sep 20 17:00:02 mordor QEMU[25696]: kvm: vfio: failed to enable vectors, -1
Sep 20 17:00:02 mordor QEMU[25696]: kvm: vfio 0000:83:00.0: Failed to set up TRIGGER eventfd signaling for interrupt MSIX-25: VFIO_DEVICE_SET_IRQS failure: Invalid argument
Sep 20 17:00:02 mordor QEMU[25696]: kvm: vfio: Error: event_notifier_init failed
Sep 20 17:00:02 mordor kernel: Spurious interrupt (vector 0xef) on CPU#0. Acked
Sep 20 17:00:02 mordor QEMU[25696]: kvm: vfio: failed to enable vectors, -1
Sep 20 17:00:02 mordor QEMU[25696]: kvm: vfio: Error: event_notifier_init failed
Sep 20 17:00:02 mordor kernel: Spurious interrupt (vector 0xef) on CPU#0. Acked
Sep 20 17:00:02 mordor QEMU[25696]: kvm: vfio: failed to enable vectors, -1
Sep 20 17:00:02 mordor QEMU[25696]: kvm: vfio 0000:84:00.0: Failed to set up TRIGGER eventfd signaling for interrupt MSIX-24: VFIO_DEVICE_SET_IRQS failure: Invalid argument
Sep 20 17:00:02 mordor QEMU[25696]: kvm: vfio 0000:82:00.0: Failed to set up TRIGGER eventfd signaling for interrupt MSIX-25: VFIO_DEVICE_SET_IRQS failure: Invalid argument
Sep 20 17:00:02 mordor QEMU[25696]: kvm: vfio: Error: event_notifier_init failed
Sep 20 17:00:02 mordor kernel: Spurious interrupt (vector 0xef) on CPU#0. Acked
Sep 20 17:00:02 mordor QEMU[25696]: kvm: vfio: failed to enable vectors, -1
Sep 20 17:00:02 mordor QEMU[25696]: kvm: vfio 0000:81:00.0: Failed to set up TRIGGER eventfd signaling for interrupt MSIX-25: VFIO_DEVICE_SET_IRQS failure: Invalid argument
Sep 20 17:00:06 mordor pvedaemon[6286]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:00:11 mordor pvestatd[6256]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:00:11 mordor pvestatd[6256]: status update time (6.122 seconds)
Sep 20 17:00:21 mordor pvestatd[6256]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:00:21 mordor pvestatd[6256]: status update time (6.113 seconds)
Sep 20 17:00:26 mordor pvedaemon[6285]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:00:31 mordor pvestatd[6256]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:00:31 mordor pvestatd[6256]: status update time (6.111 seconds)
Sep 20 17:00:42 mordor pvestatd[6256]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:00:42 mordor pvestatd[6256]: status update time (6.111 seconds)
Sep 20 17:00:45 mordor pvedaemon[6285]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:00:51 mordor pvestatd[6256]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:00:51 mordor pvestatd[6256]: status update time (6.111 seconds)
Sep 20 17:01:01 mordor pvestatd[6256]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:01:01 mordor pvestatd[6256]: status update time (6.112 seconds)
Sep 20 17:01:04 mordor pvedaemon[6287]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:01:11 mordor pvestatd[6256]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:01:11 mordor pvestatd[6256]: status update time (6.114 seconds)
Sep 20 17:01:21 mordor pvestatd[6256]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:01:21 mordor pvestatd[6256]: status update time (6.113 seconds)
Sep 20 17:01:23 mordor pvedaemon[6286]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:01:31 mordor pvestatd[6256]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:01:31 mordor pvestatd[6256]: status update time (6.110 seconds)
Sep 20 17:01:41 mordor pvestatd[6256]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:01:41 mordor pvestatd[6256]: status update time (6.110 seconds)
Sep 20 17:01:42 mordor pvedaemon[6287]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:01:51 mordor pvestatd[6256]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:01:51 mordor pvestatd[6256]: status update time (6.112 seconds)
Sep 20 17:02:01 mordor pvedaemon[6286]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:02:01 mordor pvestatd[6256]: VM 112 qmp command failed - VM 112 qmp command 'query-proxmox-support' failed - unable to connect to VM 112 qmp socket - timeout after 31 retries
Sep 20 17:02:01 mordor pvestatd[6256]: status update time (6.112 seconds)
Sep 20 17:02:05 mordor QEMU[25696]: kvm: vfio: Error: event_notifier_init failed
Sep 20 17:02:05 mordor kernel: Spurious interrupt (vector 0xef) on CPU#0. Acked
Sep 20 17:02:05 mordor QEMU[25696]: kvm: vfio: failed to enable vectors, -1
Sep 20 17:02:05 mordor QEMU[25696]: kvm: vfio 0000:84:00.0: Failed to set up TRIGGER eventfd signaling for interrupt MSIX-25: VFIO_DEVICE_SET_IRQS failure: Invalid argument
Sep 20 17:02:05 mordor QEMU[25696]: kvm: vfio: Error: event_notifier_init failed
Sep 20 17:02:05 mordor kernel: Spurious interrupt (vector 0xef) on CPU#0. Acked
Sep 20 17:02:05 mordor QEMU[25696]: kvm: vfio: failed to enable vectors, -1
Sep 20 17:02:05 mordor QEMU[25696]: kvm: vfio: Error: event_notifier_init failed
Sep 20 17:02:05 mordor kernel: Spurious interrupt (vector 0xef) on CPU#0. Acked
Sep 20 17:02:05 mordor QEMU[25696]: kvm: vfio: failed to enable vectors, -1
Sep 20 17:02:05 mordor QEMU[25696]: kvm: vfio 0000:83:00.0: Failed to set up TRIGGER eventfd signaling for interrupt MSIX-26: VFIO_DEVICE_SET_IRQS failure: Invalid argument
Sep 20 17:02:05 mordor QEMU[25696]: kvm: vfio 0000:82:00.0: Failed to set up TRIGGER eventfd signaling for interrupt MSIX-26: VFIO_DEVICE_SET_IRQS failure: Invalid argument
Sep 20 17:02:05 mordor QEMU[25696]: kvm: vfio: Error: event_notifier_init failed
Sep 20 17:02:05 mordor kernel: Spurious interrupt (vector 0xef) on CPU#0. Acked
Sep 20 17:02:05 mordor QEMU[25696]: kvm: vfio: failed to enable vectors, -1
Sep 20 17:02:05 mordor QEMU[25696]: kvm: vfio 0000:81:00.0: Failed to set up TRIGGER eventfd signaling for interrupt MSIX-26: VFIO_DEVICE_SET_IRQS failure: Invalid argument
 


Thanks for this link. I tried it, but it doesn't seem directly related, at least not to these devices.

Code:
root@mordor:~# /mnt/pve/iso/snippets/check_intx.sh 0000:81:00.0
INTx disable supported and enabled on 0000:81:00.0
root@mordor:~# /mnt/pve/iso/snippets/check_intx.sh 0000:82:00.0
INTx disable supported and enabled on 0000:82:00.0
root@mordor:~# /mnt/pve/iso/snippets/check_intx.sh 0000:83:00.0
INTx disable supported and enabled on 0000:83:00.0
root@mordor:~# /mnt/pve/iso/snippets/check_intx.sh 0000:84:00.0
INTx disable supported and enabled on 0000:84:00.0
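
For reference, the same INTx state can also be read straight from `lspci` (standard pciutils, shown only as an alternative to the script):

Code:
# "DisINTx+" on the Control line means INTx is disabled on the device
lspci -vv -s 0000:81:00.0 | grep -E 'Control:.*DisINTx'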

I should mention, in case it sparks an idea, that these U.2 drives are connected via a Linkreal adapter card (4 onboard PCIe 3.0 U.2 slots) and rely on my motherboard's 4x4 PCIe bifurcation mode to be reachable. I can't see the adapter card itself show up as a device to pass through, which I imagine is fine considering it has no onboard logic other than what is needed for signal integrity.
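
In case it helps narrow things down, each drive's IOMMU group can be listed with a standard sysfs walk (nothing Proxmox-specific); the drives behind the bifurcated slot should each sit in their own group:

Code:
# List every IOMMU group on the host with the devices it contains
for g in /sys/kernel/iommu_groups/*; do
    echo "IOMMU group ${g##*/}:"
    for d in "$g"/devices/*; do
        echo "    $(lspci -nns "${d##*/}")"
    done
done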

I do have an OCuLink retimer card on hand, so I will try passing those drives through via that card instead and see if that works better somehow.
 
