PCIe Passthrough with a ZFS-based VM

xdreez

Member
Nov 7, 2020
Hi everyone,
I recently made an upgrade: I used to keep all my VM storage on LVM, and now I'm using a ZFS pool (2x RAIDZ1 with 4x SSDs each).
While moving some VMs from the LVM storage to the new ZFS storage, I noticed that the VMs with PCIe passthrough no longer start. They fail with:
Code:
TASK ERROR: start failed: command '/usr/bin/kvm -id 112 -name Test -no-shutdown -chardev 'socket,id=qmp,path=/var/run/qemu-server/112.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/112.pid -daemonize -smbios 'type=1,uuid=032f0c3d-558a-48f4-bcfb-29672ce45919' -drive 'if=pflash,unit=0,format=raw,readonly,file=/usr/share/pve-edk2-firmware//OVMF_CODE.fd' -drive 'if=pflash,unit=1,format=raw,id=drive-efidisk0,size=131072,file=/dev/zvol/local-raid/vm-112-disk-1' -smp '12,sockets=2,cores=6,maxcpus=12' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc unix:/var/run/qemu-server/112.vnc,password -cpu host,+kvm_pv_eoi,+kvm_pv_unhalt -m 8192 -readconfig /usr/share/qemu-server/pve-q35-4.0.cfg -device 'vmgenid,guid=40f3b5cb-c3af-49c8-a158-d2777a79cd79' -device 'usb-tablet,id=tablet,bus=ehci.0,port=1' -device 'vfio-pci,host=0000:08:00.0,id=hostpci0.0,bus=ich9-pcie-port-1,addr=0x0.0,multifunction=on' -device 'vfio-pci,host=0000:08:00.1,id=hostpci0.1,bus=ich9-pcie-port-1,addr=0x0.1' -device 'VGA,id=vga,bus=pcie.0,addr=0x1' -chardev 'socket,path=/var/run/qemu-server/112.qga,server,nowait,id=qga0' -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:7c4dd2c78ae' -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=/dev/zvol/local-raid/vm-112-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap112i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown,vhost=on' -device 'virtio-net-pci,mac=FA:CF:70:5E:E8:0C,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=101' -machine 'type=q35+pve0'' failed: got timeout
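
(For anyone hitting the same timeout: one way to see what actually happens during the start attempt is to follow the system journal and the kernel log while starting the VM manually from a shell. This is just a generic sketch; VM ID 112 is taken from the error above.)

Code:
# in one shell: follow the journal while the VM starts
journalctl -f

# in a second shell: start the VM manually, then look for VFIO/IOMMU messages
qm start 112
dmesg | grep -iE 'vfio|iommu'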

After removing the PCIe passthrough, the VM starts again...
I'm fairly sure my IOMMU config works, because it was working with the LVM storage.
What difference does the ZFS storage make that prevents the VM from starting?

Here is some info about my standard VM setup with PCIe passthrough:

Code:
balloon: 2048
bios: ovmf
boot: order=scsi0;net0
cores: 6
cpu: host,hidden=1,flags=+pcid
efidisk0: local-raid:vm-104-disk-1,size=1M
machine: q35
memory: 16384
name: Jellyfin
net0: virtio=***,bridge=vmbr0,firewall=1,tag=***
numa: 0
onboot: 1
ostype: l26
parent: Snapshot_02
scsi0: local-raid:vm-104-disk-0,cache=writethrough,discard=on,size=20G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=***
sockets: 2
startup: order=3
vga: std
vmgenid: ***
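
(Note: the config above doesn't include the hostpci line. Judging from the kvm command in the error, the passthrough entry on the failing VM probably looks roughly like this; the exact options may differ.)

Code:
hostpci0: 08:00,pcie=1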

My hardware is:

HP ProLiant DL380 G7 Rack Server with:

- 24 x Intel(R) Xeon(R) CPU X5660 @ 2.80GHz

- 96 GB ECC RAM

- M.2 SAS SSD WD Green 120 GB (local + local-lvm) --> hypervisor disk

- M.2 NVMe Samsung EVO 512 GB (LVM-Thin) --> VM storage

- LSI Logic SAS2008 (SAS controller) --> passthrough

- 2x Nvidia Quadro P400 --> 2x passthrough

- 4x 840 EVO SSD + 4x 850 EVO SSD in 2x RAIDZ1, like this:

Code:
NAME                                        STATE     READ WRITE CKSUM
        local-raid                          ONLINE       0     0     0
          raidz1-0                          ONLINE       0     0     0
            ata-Samsung_SSD_850_EVO_250GB_  ONLINE       0     0     0
            ata-Samsung_SSD_850_EVO_250GB_  ONLINE       0     0     0
            ata-Samsung_SSD_850_EVO_250GB_  ONLINE       0     0     0
            ata-Samsung_SSD_850_EVO_250GB_  ONLINE       0     0     0
          raidz1-1                          ONLINE       0     0     0
            ata-Samsung_SSD_840_EVO_250GB_  ONLINE       0     0     0
            ata-Samsung_SSD_840_EVO_250GB_  ONLINE       0     0     0
            ata-Samsung_SSD_840_EVO_250GB_  ONLINE       0     0     0
            ata-Samsung_SSD_840_EVO_250GB_  ONLINE       0     0     0

Any idea?
 
But you are not trying to run the 8 ZFS SSDs on that SAS2008 you are trying to pass through, right?

And you are not running out of RAM? By default ZFS will use up to 50% of your host's RAM for the ARC, so 48 GB here. And if you start a VM with PCIe passthrough, all of that VM's RAM will be pinned and will always take up the full amount. I don't know if it is possible that the ARC can't shrink fast enough if the passthrough VM has a lot of RAM assigned.
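
You can check how big the ARC currently is and how much memory the host has left with the standard tools, for example:

Code:
# summary of the ZFS ARC (current size, target size, hit rates)
arc_summary | head -n 40

# overall host memory usage
free -h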

Another thing that has nothing to do with your problem, but isn't optimal, is your "cache=writethrough". The ARC is already doing the caching, so I think all VMs should use "cache=none".
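
That can be changed in the GUI (Hardware -> Hard Disk -> Edit -> Cache) or with qm set, roughly like this for the VM from your config (ID 104 and disk spec taken from above; adjust to your setup):

Code:
# re-specify the disk with cache=none, keeping the same volume
qm set 104 --scsi0 local-raid:vm-104-disk-0,cache=none,discard=on,ssd=1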
 
Hi, no, I'm not trying to pass through the SAS card.
The RAM usage is fine; I limited ZFS to a maximum of 8 GB of RAM in /etc/modprobe.d/zfs.conf:
Code:
options zfs zfs_arc_max=8589934592
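
(For completeness: the value from that file is only picked up when the module options are re-read, e.g. after a reboot. It can be checked, and also changed at runtime, via sysfs, roughly like this.)

Code:
# limit currently in effect (bytes); should show 8589934592 if it was applied
cat /sys/module/zfs/parameters/zfs_arc_max

# the limit can also be lowered at runtime without a reboot
echo 8589934592 > /sys/module/zfs/parameters/zfs_arc_max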
As I said, if I remove the PCIe device in the hardware settings of the VM, it works.
No matter which cache setting I choose, it still doesn't work.
 
