Hi,
I'm struggling with an issue on Proxmox. I want to create a VM on my server with three NVIDIA GA102 GPUs. Two of the GPUs work well when I add them, but the third one doesn't work together with the other two: the VM starts, but I only get a black screen. I noticed that the GPU that misbehaves works fine if it is the only GPU attached to the VM.
In both cases, I get the message "No more image in the PCI ROM".
pve-versions:
Bash:
root@Nexus-002:~# pveversion -v
proxmox-ve: 7.1-1 (running kernel: 5.13.19-2-pve)
pve-manager: 7.1-7 (running version: 7.1-7/df5740ad)
pve-kernel-helper: 7.1-6
pve-kernel-5.13: 7.1-5
pve-kernel-5.13.19-2-pve: 5.13.19-4
ceph-fuse: 15.2.15-pve1
corosync: 3.1.5-pve2
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx3
ksm-control-daemon: 1.4-1
libjs-extjs: 7.0.0-1
libknet1: 1.22-pve2
libproxmox-acme-perl: 1.4.0
libproxmox-backup-qemu0: 1.2.0-1
libpve-access-control: 7.1-5
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.0-14
libpve-guest-common-perl: 4.0-3
libpve-http-server-perl: 4.0-4
libpve-storage-perl: 7.0-15
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 4.0.11-1
lxcfs: 4.0.11-pve1
novnc-pve: 1.2.0-3
proxmox-backup-client: 2.1.2-1
proxmox-backup-file-restore: 2.1.2-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.4-4
pve-cluster: 7.1-2
pve-container: 4.1-2
pve-docs: 7.1-2
pve-edk2-firmware: 3.20210831-2
pve-firewall: 4.2-5
pve-firmware: 3.3-3
pve-ha-manager: 3.3-1
pve-i18n: 2.6-2
pve-qemu-kvm: 6.1.0-3
pve-xtermjs: 4.12.0-1
qemu-server: 7.1-4
smartmontools: 7.2-1
spiceterm: 3.2-2
swtpm: 0.7.0~rc1+2
vncterm: 1.7-1
zfsutils-linux: 2.1.1-pve3
vfio logs:
Bash:
root@Nexus-002:~# dmesg | grep vfio
[ 4.633343] vfio-pci 0000:01:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=io+mem
[ 4.656069] vfio-pci 0000:21:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[ 4.676086] vfio-pci 0000:4c:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=io+mem:owns=none
[ 4.696096] vfio_pci: add [10de:2204[ffffffff:ffffffff]] class 0x000000/00000000
[ 4.760172] vfio_pci: add [10de:1aef[ffffffff:ffffffff]] class 0x000000/00000000
[ 6340.522636] vfio-pci 0000:01:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[ 6340.522659] vfio-pci 0000:01:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 6340.522667] vfio-pci 0000:01:00.0: vfio_ecap_init: hiding ecap 0x26@0xc1c
[ 6340.522668] vfio-pci 0000:01:00.0: vfio_ecap_init: hiding ecap 0x27@0xd00
[ 6340.522670] vfio-pci 0000:01:00.0: vfio_ecap_init: hiding ecap 0x25@0xe00
[ 6340.524865] vfio-pci 0000:01:00.0: No more image in the PCI ROM
[ 6340.542570] vfio-pci 0000:01:00.1: vfio_ecap_init: hiding ecap 0x25@0x160
[ 6340.543701] vfio-pci 0000:21:00.0: enabling device (0000 -> 0003)
[ 6340.650624] vfio-pci 0000:21:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[ 6340.650644] vfio-pci 0000:21:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[ 6340.650652] vfio-pci 0000:21:00.0: vfio_ecap_init: hiding ecap 0x26@0xc1c
[ 6340.650653] vfio-pci 0000:21:00.0: vfio_ecap_init: hiding ecap 0x27@0xd00
[ 6340.650655] vfio-pci 0000:21:00.0: vfio_ecap_init: hiding ecap 0x25@0xe00
[ 6340.670450] vfio-pci 0000:21:00.1: enabling device (0000 -> 0002)
[ 6340.670558] vfio-pci 0000:21:00.1: vfio_ecap_init: hiding ecap 0x25@0x160
[ 6345.775986] vfio-pci 0000:01:00.0: No more image in the PCI ROM
[ 6345.776015] vfio-pci 0000:01:00.0: No more image in the PCI ROM
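To make the pattern in the log above easier to see, here is a small sketch that counts the "No more image in the PCI ROM" messages per PCI address. It runs on a few lines copied from the output above. The helper name `count_rom_errors` is only an illustrative choice, not an existing tool, and on the host one would presumably pipe `dmesg` into it instead of the sample text.

```shell
# Hypothetical helper: count "No more image in the PCI ROM" lines per PCI address.
# Reads kernel log lines on stdin and prints "<address> <count>" for each device.
count_rom_errors() {
    grep 'No more image in the PCI ROM' \
        | grep -oE '[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' \
        | sort | uniq -c | awk '{print $2, $1}'
}

# Sample lines copied from the dmesg output above.
sample_log='[ 6340.524865] vfio-pci 0000:01:00.0: No more image in the PCI ROM
[ 6340.543701] vfio-pci 0000:21:00.0: enabling device (0000 -> 0003)
[ 6345.775986] vfio-pci 0000:01:00.0: No more image in the PCI ROM
[ 6345.776015] vfio-pci 0000:01:00.0: No more image in the PCI ROM'

rom_summary=$(printf '%s\n' "$sample_log" | count_rom_errors)
echo "$rom_summary"
```

In the log shown here, only 0000:01:00.0 reports the message, so that is the device to look at first.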
The /etc/default/grub file:
Bash:
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt video=vesafb:off video=efifb:off"
GRUB_CMDLINE_LINUX=""
My VM config:
Code:
root@Nexus-002:~# qm config 102
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 48
efidisk0: local-new:vm-102-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:01:00,pcie=1
hostpci1: 0000:21:00,pcie=1,x-vga=1
ide2: local:iso/ubuntu-20.04.3-desktop-amd64.iso,media=cdrom
machine: q35
memory: 122528
meta: creation-qemu=6.1.0,ctime=1640106321
name: TestML
net0: virtio=02:ED:26:D7:80:ED,bridge=vmbr0,firewall=1
numa: 0
ostype: l26
scsi0: zfs2:vm-102-disk-0,size=1720G
scsi1: zfs2:vm-102-disk-1,size=20G
scsihw: virtio-scsi-pci
smbios1: uuid=0e656a4f-bfa0-4692-bcac-5aa511d03a44
sockets: 1
vga: virtio
vmgenid: 746b0604-6900-4b0a-985a-364fa54ce7cd
The server is a TRX40 Creator.
Does anyone have an idea why I get this error on this specific GPU when I add other GPUs to the VM?
I don't know if I have given all the information needed, but it's a start.
Thank you very much.
greetings