Nvidia Error 43 with Quadro RTX4000

goeste

New Member
Apr 17, 2023
Hi all,
I am new to this forum but have already read a lot about the common Error 43. However, I have run into a situation that I cannot resolve, even with the answers from the existing threads...

But first things first:
Code:
1. System:
Dell R7910
CPU: 2x E5-2690v4
RAM: 128GB DDR4 ECC
GPU: Nvidia Quadro RTX4000 (Slot 4)
Storage: 2x250gb SSD Raid1 ZFS, 6x 500gb ZRaid2 ZFS
OS: PVE 7.4-3

2. KVM config:
bios: ovmf
boot: order=scsi0;net0
cores: 8
cpu: host
efidisk0: data:vm-101-disk-0,efitype=4m,pre-enrolled-keys=1,size=1M
hostpci0: 0000:82:00,pcie=1,romfile=QuadroRTX4000.rom
machine: pc-q35-7.2
memory: 16384
meta: creation-qemu=7.2.0,ctime=1681658462
name: Win10-Felix
net0: virtio=BA:5D:56:88:8D:7C,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
scsi0: data:vm-101-disk-1,cache=writeback,iothread=1,replicate=0,size=250G,ssd=1
scsihw: virtio-scsi-single
smbios1: uuid=70868f26-81c3-4a74-9c94-d09bac496ade
sockets: 1
vga: none
vmgenid: 76df22b0-4e89-491e-8d48-4fa269849ec2

3. /etc/kernel/cmdline
root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction nofb nomodeset video=vesafb:off video=efifb:off

4. uname -rv:
6.1.15-1-pve #1 SMP PREEMPT_DYNAMIC PVE 6.1.15-1 (2023-03-08T08:53Z)

5. /etc/modules:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

6. iommu groups:
IOMMU Group 14: 82:00.0 VGA compatible controller [0300]: NVIDIA Corporation TU104GL [Quadro RTX 4000] [10de:1eb1] (rev a1)
IOMMU Group 15: 82:00.1 Audio device [0403]: NVIDIA Corporation TU104 HD Audio Controller [10de:10f8] (rev a1)
IOMMU Group 16: 82:00.2 USB controller [0c03]: NVIDIA Corporation TU104 USB 3.1 Host Controller [10de:1ad8] (rev a1)
IOMMU Group 17: 82:00.3 Serial bus controller [0c80]: NVIDIA Corporation TU104 USB Type-C UCSI Controller [10de:1ad9] (rev a1)

7.  /etc/modprobe.d/vfio.conf:
options vfio-pci ids=10de:1eb1,10de:10f8,10de:1ad8,10de:1ad9 disable_vga=1

8.  /etc/modprobe.d/blacklist.conf:
blacklist radeon
blacklist nouveau
blacklist nvidia
blacklist nvidiafb

9. /etc/modprobe.d/iommu_unsafe_interrupts.conf:
options vfio_iommu_type1 allow_unsafe_interrupts=1

10.  /etc/modprobe.d/kvm.conf:
options kvm ignore_msrs=1

The RTX4000 shows up in Device Manager, but with "Error 43". I reach the VM via RDP since VNC shows a black screen (because the GPU is not running). I cannot find the correct setting to get the card up and running. I also completely reinstalled everything and updated the kernel from 5.15 to 6.1. I tried different drivers (531.61 and 528.89) as well, and I tried connecting a monitor to the GPU, but whether it is connected or not doesn't matter; the error remains.
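
One sanity check that fits here is whether all four functions of the card are actually bound to vfio-pci on the host. Below is a minimal Python sketch, assuming the standard sysfs layout under /sys/bus/pci/devices. The helper name bound_driver is made up for illustration.

```python
import os

def bound_driver(pci_addr, root="/sys/bus/pci/devices"):
    """Return the name of the kernel driver bound to a PCI function, or None."""
    link = os.path.join(root, pci_addr, "driver")
    if not os.path.islink(link):
        return None  # no driver bound, or the device does not exist
    # 'driver' is a symlink to the driver's directory, e.g. .../drivers/vfio-pci
    return os.path.basename(os.readlink(link))

if __name__ == "__main__":
    # Addresses taken from the IOMMU group listing in this post.
    for fn in range(4):
        addr = f"0000:82:00.{fn}"
        print(addr, bound_driver(addr))
```

If any of the four functions reports something other than vfio-pci, the vfio.conf ids or the blacklist are probably not taking effect.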

Any help or a pointer in the right direction is very much appreciated. Has anyone here got a Quadro RTX4000 running?

Best regards, goeste
 
Is there anything in the journal when starting the VM? The whole dmesg output would also be interesting.

Also, do you really need this:
pcie_acs_override=downstream,multifunction
Normally this should not be necessary on server hardware.
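
To judge whether the override is needed, one could boot without it and look at how the host groups the devices. Below is a rough Python sketch, assuming the usual /sys/kernel/iommu_groups layout. Both function names are made up for illustration.

```python
import os

def iommu_groups(root="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the sorted PCI addresses it contains."""
    groups = {}
    if not os.path.isdir(root):
        return groups  # IOMMU disabled or not exposed by the kernel
    for name in os.listdir(root):
        devdir = os.path.join(root, name, "devices")
        if name.isdigit() and os.path.isdir(devdir):
            groups[int(name)] = sorted(os.listdir(devdir))
    return groups

def foreign_devices(groups, prefix):
    """Devices sharing a group with the card at `prefix` that are not part of it."""
    foreign = []
    for devices in groups.values():
        if any(d.startswith(prefix) for d in devices):
            foreign.extend(d for d in devices if not d.startswith(prefix))
    return sorted(foreign)

if __name__ == "__main__":
    # 0000:82:00 is the GPU slot from the config posted above.
    print(foreign_devices(iommu_groups(), "0000:82:00"))
```

An empty list means the card's functions do not share a group with any other device, so the override should not be needed for this passthrough.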
 
I had to put it on pastebin since it is too long: dmesg output

I also removed the override part, but no luck. I will leave it out for now, though.
 
OK, the log looks normal, but can you also send a log with pcie_acs_override disabled?

Are you sure you need the romfile in the config? Normally it should work without it.
 
I rebooted and started the VM without pcie_acs_override and with the romfile removed. However, the VM doesn't start without the romfile... I get a 0x4 or 0x204 error with RDP; no connection can be established.

dmesg without pcie override
 
HA, a couple of minutes ago I got it working:

I only added intel_iommu=on iommu=pt to /etc/kernel/cmdline and options vfio_iommu_type1 allow_unsafe_interrupts=1 to /etc/modprobe.d/iommu_unsafe_interrupts.conf, plus the blacklist entries. No kernel update (since I reinstalled the whole thing) and bam... it worked out of the box, even without the ROM file :) However, I didn't install the Nvidia drivers.
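
For anyone retracing this, a small check that the two kernel parameters are present in a command line string could look like the sketch below. The function name missing_params is made up for illustration, and the default parameter list is just the two options mentioned in this post.

```python
def missing_params(cmdline, required=("intel_iommu=on", "iommu=pt")):
    """Return the required kernel parameters that are absent from `cmdline`."""
    present = set(cmdline.split())
    return [p for p in required if p not in present]

# Example: pass the contents of /proc/cmdline (the parameters the running
# kernel was booted with) or of /etc/kernel/cmdline as the string.
```

An empty list means both parameters are present. Checking /proc/cmdline rather than /etc/kernel/cmdline shows what the running kernel was actually booted with.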


BTW, 0x4 and 0x204 are RDP error codes from MS. Both mean that no connection can be established to the host.

thanks for your time @dcsapak !!!

//update: also latest NVIDIA drivers don't cause any issues!
 
