[SOLVED] PCI block start

MajRonin

Member
Jun 16, 2020
Hello

I reinstalled Proxmox 3 days ago because I had an issue with my NIC. To start clean, I decided to do a fresh install.

Before that reinstallation, adding a GPU purely through the web interface (Add -> PCI Device -> selecting my GPU) worked fine.

But with this fresh install, when I try to add it the same way, the machine can't boot and freezes during startup before ending with just:
" TASK ERROR: start failed: command '/usr/bin/kvm -id 116 -name testGPU -chardev 'socket,id=qmp,path=/var/run/qemu-server/116.qmp,server,nowait' -mon 'chardev=qmp,mode=control' -chardev 'socket,id=qmp-event,path=/var/run/qmeventd.sock,reconnect=5' -mon 'chardev=qmp-event,mode=control' -pidfile /var/run/qemu-server/116.pid -daemonize -smbios 'type=1,uuid=a315699b-47c0-4ab9-bcd2-26863545303b' -smp '1,sockets=1,cores=1,maxcpus=1' -nodefaults -boot 'menu=on,strict=on,reboot-timeout=1000,splash=/usr/share/qemu-server/bootsplash.jpg' -vnc unix:/var/run/qemu-server/116.vnc,password -cpu kvm64,enforce,+kvm_pv_eoi,+kvm_pv_unhalt,+lahf_lm,+sep -m 1000 -device 'pci-bridge,id=pci.1,chassis_nr=1,bus=pci.0,addr=0x1e' -device 'pci-bridge,id=pci.2,chassis_nr=2,bus=pci.0,addr=0x1f' -device 'vmgenid,guid=8afed51c-2fc2-474f-ad1a-277e9fe4e5a3' -device 'piix3-usb-uhci,id=uhci,bus=pci.0,addr=0x1.0x2' -device 'usb-tablet,id=tablet,bus=uhci.0,port=1' -device 'vfio-pci,host=0000:0a:00.0,id=hostpci0.0,bus=pci.0,addr=0x10.0,multifunction=on' -device 'vfio-pci,host=0000:0a:00.1,id=hostpci0.1,bus=pci.0,addr=0x10.1' -device 'VGA,id=vga,bus=pci.0,addr=0x2' -chardev 'socket,path=/var/run/qemu-server/116.qga,server,nowait,id=qga0' -device 'virtio-serial,id=qga0,bus=pci.0,addr=0x8' -device 'virtserialport,chardev=qga0,name=org.qemu.guest_agent.0' -device 'virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x3' -iscsi 'initiator-name=iqn.1993-08.org.debian:01:58bac947f2c8' -drive 'file=/var/lib/vz/template/iso/debian-10.4.0-amd64-netinst.iso,if=none,id=drive-ide2,media=cdrom,aio=threads' -device 'ide-cd,bus=ide.1,unit=0,drive=drive-ide2,id=ide2,bootindex=200' -device 'virtio-scsi-pci,id=scsihw0,bus=pci.0,addr=0x5' -drive 'file=/dev/pve/vm-116-disk-0,if=none,id=drive-scsi0,format=raw,cache=none,aio=native,detect-zeroes=on' -device 'scsi-hd,bus=scsihw0.0,channel=0,scsi-id=0,lun=0,drive=drive-scsi0,id=scsi0,bootindex=100' -netdev 'type=tap,id=net0,ifname=tap116i0,script=/var/lib/qemu-server/pve-bridge,downscript=/var/lib/qemu-server/pve-bridgedown' -device 'rtl8139,mac=3A:7F:0F:4B:E7:89,netdev=net0,bus=pci.0,addr=0x12,id=net0,bootindex=300' -machine 'type=pc+pve0'' failed: got timeout "

I did a BIOS update before doing the fresh install: does someone think the problem could come from that?

Thanks a lot in advance
 
BIOS updates can definitely cause IOMMU and passthrough-related issues. Could you post your full VM config (/etc/pve/qemu-server/<vmid>.conf) and potentially the output of the following:

Code:
find /sys/kernel/iommu_groups/ -type l

Also check any logs (dmesg, journalctl -e) to see if anything shows up when starting the VM or on boot.
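
As a side note, here is a minimal sketch for listing every IOMMU group together with the device names, which makes the isolation situation easier to read (it assumes lspci from pciutils is installed; this is the commonly circulated loop, not something Proxmox-specific):

Bash:
#!/bin/bash
# Print each IOMMU group and the devices it contains,
# resolving every PCI address to a human-readable name via lspci.
shopt -s nullglob
for g in /sys/kernel/iommu_groups/*; do
    echo "IOMMU group ${g##*/}:"
    for d in "$g"/devices/*; do
        echo -e "\t$(lspci -nns "${d##*/}")"
    done
done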
 
I resolved an issue in the BIOS, where IOMMU was not enabled.

Here is the result of
Code:
find /sys/kernel/iommu_groups/ -type l

Code:
root@pve:~# find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/17/devices/0000:0b:00.2
/sys/kernel/iommu_groups/7/devices/0000:00:07.0
/sys/kernel/iommu_groups/15/devices/0000:0a:00.0
/sys/kernel/iommu_groups/15/devices/0000:0a:00.1
/sys/kernel/iommu_groups/5/devices/0000:00:03.1
/sys/kernel/iommu_groups/13/devices/0000:01:00.0
/sys/kernel/iommu_groups/3/devices/0000:00:02.0
/sys/kernel/iommu_groups/21/devices/0000:0c:00.3
/sys/kernel/iommu_groups/11/devices/0000:00:14.3
/sys/kernel/iommu_groups/11/devices/0000:00:14.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.1
/sys/kernel/iommu_groups/18/devices/0000:0b:00.3
/sys/kernel/iommu_groups/8/devices/0000:00:07.1
/sys/kernel/iommu_groups/16/devices/0000:0b:00.0
/sys/kernel/iommu_groups/6/devices/0000:00:04.0
/sys/kernel/iommu_groups/14/devices/0000:03:00.0
/sys/kernel/iommu_groups/14/devices/0000:02:00.2
/sys/kernel/iommu_groups/14/devices/0000:02:00.0
/sys/kernel/iommu_groups/14/devices/0000:03:06.0
/sys/kernel/iommu_groups/14/devices/0000:08:00.0
/sys/kernel/iommu_groups/14/devices/0000:03:05.0
/sys/kernel/iommu_groups/14/devices/0000:06:00.0
/sys/kernel/iommu_groups/14/devices/0000:02:00.1
/sys/kernel/iommu_groups/14/devices/0000:03:04.0
/sys/kernel/iommu_groups/14/devices/0000:03:07.0
/sys/kernel/iommu_groups/4/devices/0000:00:03.0
/sys/kernel/iommu_groups/12/devices/0000:00:18.3
/sys/kernel/iommu_groups/12/devices/0000:00:18.1
/sys/kernel/iommu_groups/12/devices/0000:00:18.6
/sys/kernel/iommu_groups/12/devices/0000:00:18.4
/sys/kernel/iommu_groups/12/devices/0000:00:18.2
/sys/kernel/iommu_groups/12/devices/0000:00:18.0
/sys/kernel/iommu_groups/12/devices/0000:00:18.7
/sys/kernel/iommu_groups/12/devices/0000:00:18.5
/sys/kernel/iommu_groups/2/devices/0000:00:01.3
/sys/kernel/iommu_groups/20/devices/0000:0c:00.2
/sys/kernel/iommu_groups/10/devices/0000:00:08.1
/sys/kernel/iommu_groups/0/devices/0000:00:01.0
/sys/kernel/iommu_groups/19/devices/0000:0c:00.0
/sys/kernel/iommu_groups/9/devices/0000:00:08.0

My GPU is 0000:0a:00, so it seems to be available for passthrough. Is that right?
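
A quick way to double-check that nothing else shares the GPU's group (the group number 15 is taken from the listing above):

Code:
# Everything in IOMMU group 15; only the GPU's own functions should appear
ls /sys/kernel/iommu_groups/15/devices/
lspci -nns 0a:00.0
lspci -nns 0a:00.1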

journalctl -e gives me this:
Bash:
Jun 18 13:38:01 pve systemd[1]: Started Proxmox VE replication runner.
Jun 18 13:38:13 pve pvedaemon[10787]: start VM 116: UPID:pve:00002A23:00149AE3:5EEB5225:qmstart:116:root@pam:
Jun 18 13:38:13 pve pvedaemon[7770]: <root@pam> starting task UPID:pve:00002A23:00149AE3:5EEB5225:qmstart:116:root@pam:
Jun 18 13:38:13 pve kernel: vfio-pci 0000:0a:00.0: Refused to change power state, currently in D3
Jun 18 13:38:14 pve kernel: vfio-pci 0000:0a:00.0: timed out waiting for pending transaction; performing function level
Jun 18 13:38:15 pve kernel: vfio-pci 0000:0a:00.0: not ready 1023ms after FLR; waiting
Jun 18 13:38:16 pve kernel: vfio-pci 0000:0a:00.0: not ready 2047ms after FLR; waiting
Jun 18 13:38:18 pve kernel: vfio-pci 0000:0a:00.0: not ready 4095ms after FLR; waiting
Jun 18 13:38:20 pve postfix/qmgr[1368]: 11ECB4C101F: from=<root@pve.home>, size=11598, nrcpt=1 (queue active)
Jun 18 13:38:20 pve postfix/smtp[10795]: connect to gmail-smtp-in.l.google.com[2a00:1450:400c:c07::1a]:25: Network is u
Jun 18 13:38:23 pve kernel: vfio-pci 0000:0a:00.0: not ready 8191ms after FLR; waiting
Jun 18 13:38:25 pve pvedaemon[9880]: <root@pam> successful auth for user 'root@pam'
Jun 18 13:38:31 pve kernel: vfio-pci 0000:0a:00.0: not ready 16383ms after FLR; waiting
Jun 18 13:38:49 pve kernel: vfio-pci 0000:0a:00.0: not ready 32767ms after FLR; waiting

So it seems there is a power issue, but I can't find exactly where.
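
A hedged way to see which power state the kernel currently reports for the card (the slot address comes from the log above; run as root):

Code:
# Dump the Power Management capability of the GPU function;
# the "Status:" line reports the current D-state (D0 = on, D3 = low power)
lspci -vv -s 0a:00.0 | grep -A2 'Power Management'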

And when I execute
Code:
dmesg | grep -e DMAR -e IOMMU -e AMD-Vi

I get no output, so I don't know where to go; note that right after a reboot, running that same command in Proxmox does give me the expected output.
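
One possible explanation (an assumption, not something confirmed in this thread): dmesg reads a fixed-size kernel ring buffer, so early boot messages can get rotated out after enough uptime, while the journal keeps them for the whole boot:

Code:
# Kernel messages from the current boot, searched for IOMMU initialization lines
journalctl -k -b | grep -e DMAR -e IOMMU -e AMD-Vi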
 
I resolved an issue in the BIOS, where IOMMU was not enabled.
Do you mean that your issue is fixed?

journalctl -e gives me this:
Bash:
Jun 18 13:38:13 pve kernel: vfio-pci 0000:0a:00.0: Refused to change power state, currently in D3
Jun 18 13:38:14 pve kernel: vfio-pci 0000:0a:00.0: timed out waiting for pending transaction; performing function level
Jun 18 13:38:15 pve kernel: vfio-pci 0000:0a:00.0: not ready 1023ms after FLR; waiting
[...]
So it seems there is a power issue, but I can't find exactly where.
The issue is with the "FLR" lines: "Function Level Reset" is not supported by some devices, which usually means they're unsuitable for passthrough. You mentioned you already had it working before; was that with the same device? Which GPU model are you trying to pass through?
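
As an aside, one way to check whether a device advertises FLR at all is its DevCap register (run as root; FLReset+ means supported, FLReset- means not):

Code:
# Look for the FLReset flag in the Device Capabilities of the GPU function
lspci -vv -s 0a:00.0 | grep -i flreset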
 
Thank you for helping me once again.

On the previous version of my server, it worked without me doing anything special.
It's exactly the same model: a GTX 1050 (NVIDIA GP107).

As I said previously, on the previous install (I think I was lucky it succeeded), I only had to add the PCI device and reboot the VM to make it work.
 
The 1050 is not known for FLR errors AFAIK, so it might actually be the BIOS update that was funky. Can you maybe try to downgrade?
 
Unfortunately, I have an ASUS motherboard whose BIOS can't be downgraded...

Next time I'll choose another brand!


Edit: after digging much deeper into how to do a downgrade, I found a way, and I'm testing passthrough right now.
 
Edit: after digging much deeper into how to do a downgrade, I found a way, and I'm testing passthrough right now.
Please, can you share the link on how to downgrade an ASUS motherboard BIOS?
 
