[SOLVED] Problem with GPU Passthrough

Hello everyone,
I've been banging my head against the wall for the last week or so trying to get GPU passthrough to work, but I keep getting error Code 43.

I've read and tried some of the workarounds in this and other threads but nothing seems to work.


So, I'm using an HP workstation:

Machine: HP Z2 Mini G4
CPU: Intel Core™ i7
RAM: 16GB
GPU: Nvidia Quadro P1000
SSD: 512 GB SSD for Proxmox

Using Proxmox VE 7.3-3
Guest VM: Windows 10 Pro


/etc/default/grub settings
GRUB_DEFAULT=0
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR=`lsb_release -i -s 2> /dev/null || echo Debian`
#GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
#GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_acs_override=downstream,multifunction nofb nomodeset video=vesafb:off video=efifb:off"
#GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on initcall_blacklist=sysfb_init"
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt pcie_acs_override=downstream pcie_acs_overrid=multifunction nofb nomodeset video=vesafb:off initcall_blacklist=sysfb_init"
GRUB_CMDLINE_LINUX=""
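
A quick way to confirm that the parameters set in /etc/default/grub actually reached the running kernel is to compare them against /proc/cmdline. The sketch below is only an illustration; the helper names has_kernel_param and report_kernel_params are made up for this example. It matches whole space-separated tokens, so a misspelled parameter name is reported as missing rather than silently accepted.

```shell
#!/bin/sh
# has_kernel_param CMDLINE PARAM
# Succeeds if PARAM appears as a whole space-separated token in CMDLINE.
has_kernel_param() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# report_kernel_params CMDLINE PARAM...
# Prints one "present:" or "missing:" line per expected parameter.
report_kernel_params() {
    cmdline=$1
    shift
    for p in "$@"; do
        if has_kernel_param "$cmdline" "$p"; then
            echo "present: $p"
        else
            echo "missing: $p"
        fi
    done
}

# Usage on a live host (read-only):
#   report_kernel_params "$(cat /proc/cmdline)" intel_iommu=on iommu=pt initcall_blacklist=sysfb_init
```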

/etc/modprobe.d/blacklist.conf
blacklist radeon
blacklist nouveau
blacklist nvidia

/etc/modprobe.d/kvm.conf
options kvm ignore_msrs=1

/etc/modprobe.d/vfio.conf
options vfio-pci ids=10de:1cbb,10de:0fb9 disable_vga=1

/etc/modules
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

/etc/pve/qemu-server/106.conf
#cpu: host,hidden=1,flags=+pcid
agent: 1
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 4
cpu: host
efidisk0: local-lvm:vm-106-disk-0,efitype=4m,pre-enrolled-keys=1,size=4M
hostpci0: 0000:01:00,pcie=1,x-vga=1
machine: pc-q35-7.1
memory: 16384
meta: creation-qemu=7.1.0,ctime=1675419718
name: windows10-pro-gpu
net0: e1000=22:FE:72:AB:C2:35,bridge=vmbr0,firewall=1
numa: 0
ostype: win10
scsi0: local-lvm:vm-106-disk-1,cache=writeback,discard=on,iothread=1,size=64G
scsihw: virtio-scsi-single
smbios1: uuid=7bc517c5-3752-4b10-8f3d-da2612419b40
sockets: 1
vga: none
vmgenid: 5fdf2e44-8f15-442f-9b11-eff996da96a4
args: -cpu 'host,+kvm_pv_unhalt,+kvm_pv_eoi,hv_vendor_id=NV43FIX,kvm=off'

dmesg | grep -e DMAR -e IOMMU
[ 0.000000] Warning: PCIe ACS overrides enabled; This may allow non-IOMMU protected peer-to-peer DMA
[ 0.009083] ACPI: DMAR 0x00000000CBC0D000 000070 (v01 INTEL CFL 00000002 01000013)
[ 0.009124] ACPI: Reserving DMAR table memory at [mem 0xcbc0d000-0xcbc0d06f]
[ 0.046897] DMAR: IOMMU enabled
[ 0.134935] DMAR: Host address width 39
[ 0.134936] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[ 0.134940] DMAR: dmar0: reg_base_addr fed91000 ver 1:0 cap d2008c40660462 ecap f050da
[ 0.134943] DMAR: RMRR base: 0x000000cb89e000 end: 0x000000cb8bdfff
[ 0.134945] DMAR-IR: IOAPIC id 2 under DRHD base 0xfed91000 IOMMU 0
[ 0.134946] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[ 0.134947] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.136316] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 0.382435] DMAR: No ATSR found
[ 0.382436] DMAR: No SATC found
[ 0.382437] DMAR: dmar0: Using Queued invalidation
[ 0.382627] DMAR: Intel(R) Virtualization Technology for Directed I/O

lspci -nnv
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GP107GLM [Quadro P1000 Mobile] [10de:1cbb] (rev a1) (prog-if 00 [VGA controller])
Flags: fast devsel, IRQ 16, IOMMU group 8
Memory at e2000000 (32-bit, non-prefetchable) [size=16M]
Memory at d0000000 (64-bit, prefetchable) [size=256M]
Memory at e0000000 (64-bit, prefetchable) [size=32M]
I/O ports at 3000
Expansion ROM at e3080000 [disabled] [size=512K]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Endpoint, MSI 00
Capabilities: [100] Virtual Channel
Capabilities: [250] Latency Tolerance Reporting
Capabilities: [258] L1 PM Substates
Capabilities: [128] Power Budgeting <?>
Capabilities: [420] Advanced Error Reporting
Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
Capabilities: [900] Secondary PCI Express
Kernel driver in use: vfio-pci
Kernel modules: nvidiafb, nouveau

01:00.1 Audio device [0403]: NVIDIA Corporation GP107GL High Definition Audio Controller [10de:0fb9] (rev a1)
Flags: fast devsel, IRQ 17, IOMMU group 8
Memory at e3000000 (32-bit, non-prefetchable) [size=16K]
Capabilities: [60] Power Management version 3
Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
Capabilities: [78] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
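
The "Kernel driver in use" line can also be read straight from sysfs, which is handy in scripts. A minimal sketch, assuming the usual /sys/bus/pci/devices/<address>/driver symlink; the function name bound_driver and the optional second argument (an alternative sysfs root, used only for dry runs) are hypothetical:

```shell
#!/bin/sh
# bound_driver ADDR [SYSFS_ROOT]
# Prints the name of the kernel driver bound to PCI device ADDR
# (for example 0000:01:00.0), or "none" if no driver is bound.
# SYSFS_ROOT defaults to /sys and exists only so the function can be
# tried against a scratch directory.
bound_driver() {
    link="${2:-/sys}/bus/pci/devices/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo none
    fi
}

# Usage on the host, for both functions of the card:
#   bound_driver 0000:01:00.0
#   bound_driver 0000:01:00.1
```

For passthrough, both functions would be expected to report vfio-pci before the VM starts.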



If anyone has any tips on how to make this work, please help me :)
 
Thanks for your kind notes! I encountered a similar issue in Proxmox 7.2.5. Replacing video=efifb:off with initcall_blacklist=sysfb_init in /etc/default/grub indeed resolves the issue.
This indeed resolved my BAR 1 error as well, thank you guys.
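
For anyone who wants to script that substitution, here is a small sketch. The function name swap_efifb_param is made up, and the sample line is illustrative rather than taken from a particular host:

```shell
#!/bin/sh
# swap_efifb_param LINE
# Prints LINE with every "video=efifb:off" token replaced by
# "initcall_blacklist=sysfb_init"; everything else is left unchanged.
swap_efifb_param() {
    printf '%s\n' "$1" | sed 's/video=efifb:off/initcall_blacklist=sysfb_init/g'
}

# Example on a sample line:
swap_efifb_param 'GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on video=efifb:off"'
```

The same sed expression could be applied to /etc/default/grub itself (ideally after making a backup copy); the boot configuration then has to be regenerated and the host rebooted before the change takes effect.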
 
Hey all,

I've been attempting to get integrated GPU passthrough to work for the past couple of days.
Through extensive searching I came across multiple tutorials online and this thread.

Whenever I start the Win11 VM the entire PVE crashes and requires a hard restart of the host machine.

I have attempted to add the remove/rescan script and the mentioned config modifications.

I am hoping someone may have some insight on something I have missed.

Thanks!

Machine: MinisForum UM773 Lite
CPU: AMD Ryzen 7 7735HS
GPU: Integrated Graphics (AMD Radeon™ 680M)
Virtualization is enabled by default (confirmed with Manufacturer)

Proxmox Virtual Environment 7.4-3:
Proxmox Kernel: Linux 5.15.107-1-pve
VM: Win11 Pro 22H2 x64

Based on the syslog, this is what happens when trying to remove the GPU:
Code:
May 04 11:55:01 u773prox CRON[1232]: (root) CMD (/root/fix_gpu_pass.sh)
May 04 11:55:01 u773prox kernel: vfio-pci 0000:34:00.0: vgaarb: changed VGA decodes: olddecodes=none,decodes=io+mem:owns=none
May 04 11:55:01 u773prox kernel: pci 0000:34:00.0: Removing from iommu group 16
May 04 11:55:01 u773prox kernel: pci 0000:34:00.0: [1002:1681] type 00 class 0x030000
May 04 11:55:01 u773prox kernel: pci 0000:34:00.0: reg 0x10: [mem 0x7fe0000000-0x7fefffffff 64bit pref]
May 04 11:55:01 u773prox kernel: pci 0000:34:00.0: reg 0x18: [mem 0x7ff0000000-0x7ff01fffff 64bit pref]
May 04 11:55:01 u773prox kernel: pci 0000:34:00.0: reg 0x20: [io  0xf000-0xf0ff]
May 04 11:55:01 u773prox kernel: pci 0000:34:00.0: reg 0x24: [mem 0xdc700000-0xdc77ffff]
May 04 11:55:01 u773prox kernel: pci 0000:34:00.0: PME# supported from D1 D2 D3hot D3cold
May 04 11:55:01 u773prox kernel: pci 0000:34:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none
May 04 11:55:01 u773prox kernel: pci 0000:34:00.0: Adding to iommu group 16
May 04 11:55:01 u773prox kernel: pci 0000:34:00.0: BAR 0: assigned [mem 0x7fe0000000-0x7fefffffff 64bit pref]
May 04 11:55:01 u773prox kernel: pci 0000:34:00.0: BAR 2: assigned [mem 0x7ff0000000-0x7ff01fffff 64bit pref]
May 04 11:55:01 u773prox kernel: pci 0000:34:00.0: BAR 5: assigned [mem 0xdc700000-0xdc77ffff]
May 04 11:55:01 u773prox kernel: pci 0000:34:00.0: BAR 4: assigned [io  0xf000-0xf0ff]
May 04 11:55:01 u773prox kernel: vfio-pci 0000:34:00.0: vgaarb: changed VGA decodes: olddecodes=io+mem,decodes=none:owns=none
May 04 11:55:01 u773prox CRON[1228]: pam_unix(cron:session): session closed for user root
May 04 11:55:01 u773prox kernel: [drm:amdgpu_init [amdgpu]] *ERROR* VGACON disables amdgpu kernel modesetting.

The Syslog is attached (shortened to what I thought may be important)

Here is all the other info / configs:


lspci devices:
Code:
34:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1681] (rev 0a) (prog-if 00 [VGA controller])
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1681]
        Flags: bus master, fast devsel, latency 0, IRQ 255, IOMMU group 16
        Memory at 7fe0000000 (64-bit, prefetchable) [size=256M]
        Memory at 7ff0000000 (64-bit, prefetchable) [size=2M]
        I/O ports at f000 [disabled] [size=256]
        Memory at dc700000 (32-bit, non-prefetchable) [size=512K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
        Capabilities: [64] Express Legacy Endpoint, MSI 00
        Capabilities: [a0] MSI: Enable- Count=1/4 Maskable- 64bit+
        Capabilities: [c0] MSI-X: Enable- Count=4 Masked-
        Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [270] Secondary PCI Express
        Capabilities: [2a0] Access Control Services
        Capabilities: [2b0] Address Translation Service (ATS)
        Capabilities: [2c0] Page Request Interface (PRI)
        Capabilities: [2d0] Process Address Space ID (PASID)
        Capabilities: [410] Physical Layer 16.0 GT/s <?>
        Capabilities: [450] Lane Margining at the Receiver <?>
        Kernel driver in use: vfio-pci
        Kernel modules: amdgpu

34:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1640]
        Subsystem: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1640]
        Flags: fast devsel, IRQ 255, IOMMU group 17
        Memory at dc7c8000 (32-bit, non-prefetchable) [disabled] [size=16K]
        Capabilities: [48] Vendor Specific Information: Len=08 <?>
        Capabilities: [50] Power Management version 3
        Capabilities: [64] Express Legacy Endpoint, MSI 00
        Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [2a0] Access Control Services
        Kernel driver in use: vfio-pci
        Kernel modules: snd_hda_intel

lspci -n
Code:
34:00.0 0300: 1002:1681 (rev 0a)
34:00.1 0403: 1002:1640
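
The vendor:device pairs in this output are what goes into the ids= option of vfio-pci. A minimal sketch that builds the option line from `lspci -n` style input on stdin; the function name vfio_ids_option is hypothetical:

```shell
#!/bin/sh
# vfio_ids_option
# Reads `lspci -n` style lines on stdin ("34:00.0 0300: 1002:1681 (rev 0a)")
# and prints a matching "options vfio-pci ids=..." line.
vfio_ids_option() {
    awk '{ ids = ids (NR > 1 ? "," : "") $3 }
         END { if (NR) print "options vfio-pci ids=" ids }'
}

# Example using the two lines shown above:
printf '%s\n' '34:00.0 0300: 1002:1681 (rev 0a)' '34:00.1 0403: 1002:1640' | vfio_ids_option

# On a host, only the functions meant for passthrough should be fed in, e.g.:
#   { lspci -n -s 34:00.0; lspci -n -s 34:00.1; } | vfio_ids_option
```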

cat /proc/cmdline
Code:
BOOT_IMAGE=/boot/vmlinuz-5.15.107-1-pve root=/dev/mapper/pve-root ro quiet amd_iommu=on iommu=pt textonly nofb nomodeset video=vesafb:off initcall_blacklist=sysfb_init

IOMMU groups (16 for video, 17 for audio)
Code:
IOMMU group 0 00:01.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14b7] (rev 01)
IOMMU group 10 00:08.3 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14b9] (rev 10)
IOMMU group 11 00:14.0 SMBus [0c05]: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller [1022:790b] (rev 71)
IOMMU group 11 00:14.3 ISA bridge [0601]: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge [1022:790e] (rev 51)
IOMMU group 12 00:18.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1679]
IOMMU group 12 00:18.1 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:167a]
IOMMU group 12 00:18.2 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:167b]
IOMMU group 12 00:18.3 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:167c]
IOMMU group 12 00:18.4 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:167d]
IOMMU group 12 00:18.5 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:167e]
IOMMU group 12 00:18.6 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:167f]
IOMMU group 12 00:18.7 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:1680]
IOMMU group 13 01:00.0 Ethernet controller [0200]: Intel Corporation Ethernet Controller I225-V [8086:15f3] (rev 03)
IOMMU group 14 02:00.0 Network controller [0280]: MEDIATEK Corp. Device [14c3:0608]
IOMMU group 15 03:00.0 Non-Volatile memory controller [0108]: Phison Electronics Corporation E12 NVMe Controller [1987:5012] (rev 01)
IOMMU group 16 34:00.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1681] (rev 0a)
IOMMU group 17 34:00.1 Audio device [0403]: Advanced Micro Devices, Inc. [AMD/ATI] Device [1002:1640]
IOMMU group 18 34:00.2 Encryption controller [1080]: Advanced Micro Devices, Inc. [AMD] VanGogh PSP/CCP [1022:1649]
IOMMU group 19 34:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:161d]
IOMMU group 1 00:02.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14b7] (rev 01)
IOMMU group 20 34:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:161e]
IOMMU group 21 34:00.5 Multimedia controller [0480]: Advanced Micro Devices, Inc. [AMD] Raven/Raven2/FireFlight/Renoir Audio Processor [1022:15e2] (rev 60)
IOMMU group 22 34:00.6 Audio device [0403]: Advanced Micro Devices, Inc. [AMD] Family 17h (Models 10h-1fh) HD Audio Controller [1022:15e3]
IOMMU group 23 35:00.0 SATA controller [0106]: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] [1022:7901] (rev a1)
IOMMU group 24 36:00.0 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:161f]
IOMMU group 25 36:00.3 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:15d6]
IOMMU group 26 36:00.4 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:15d7]
IOMMU group 27 36:00.5 USB controller [0c03]: Advanced Micro Devices, Inc. [AMD] Device [1022:162e]
IOMMU group 2 00:02.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14ba]
IOMMU group 3 00:02.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14ba]
IOMMU group 4 00:02.4 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14ba]
IOMMU group 5 00:03.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14b7] (rev 01)
IOMMU group 5 00:03.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14cd]
IOMMU group 6 00:04.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14b7] (rev 01)
IOMMU group 7 00:08.0 Host bridge [0600]: Advanced Micro Devices, Inc. [AMD] Device [1022:14b7] (rev 01)
IOMMU group 8 00:08.1 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14b9] (rev 10)
IOMMU group 9 00:08.2 PCI bridge [0604]: Advanced Micro Devices, Inc. [AMD] Device [1022:14b9] (rev 10)
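
A listing like this can be generated by walking /sys/kernel/iommu_groups. The sketch below is one way to do it, sorted numerically by group; the function name list_iommu_groups and its optional root argument (for trying it on a scratch directory) are assumptions of this example:

```shell
#!/bin/sh
# list_iommu_groups [ROOT]
# Prints "IOMMU group N ADDR" for every device found under
# ROOT/kernel/iommu_groups (ROOT defaults to /sys), sorted by group number.
list_iommu_groups() {
    root="${1:-/sys}"
    for dev in "$root"/kernel/iommu_groups/*/devices/*; do
        [ -e "$dev" ] || continue      # glob did not match: no groups found
        group=${dev%/devices/*}        # strip the trailing "/devices/ADDR"
        group=${group##*/}             # keep only the group number
        echo "IOMMU group $group ${dev##*/}"
    done | sort -k3,3n
}

# To add the lspci description for each device on a real host:
#   list_iommu_groups | while read -r _ _ g addr; do
#       echo "IOMMU group $g $(lspci -nns "$addr")"
#   done
```

If the function prints nothing, the kernel did not create any IOMMU groups, which usually points at the IOMMU being disabled in firmware or on the kernel command line.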

/etc/default/grub
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet amd_iommu=on iommu=pt textonly nofb nomodeset video=vesafb:off initcall_blacklist=sysfb_init"

/etc/modprobe.d/blacklist.conf
Code:
blacklist radeon
blacklist nouveau
blacklist nvidia

/etc/modprobe.d/kvm.conf
Code:
options kvm ignore_msrs=1

/etc/modprobe.d/vfio.conf
Code:
options vfio-pci ids=1002:1681,1002:1640 disable_vga=1

/etc/modules
Code:
vfio
vfio_iommu_type1
vfio_pci
vfio_virqfd

Win11 VM Conf:
(Boots with default Display option in PVE)
(when adding the PCI device I was adding 0000:34:00,pcie=1,x-vga=1 and setting display to none - system crashes)
Code:
bios: ovmf
boot: order=sata0;net0
cores: 4
cpu: host
efidisk0: local-lvm:vm-104-disk-0,efitype=4m,size=4M
machine: pc-q35-7.2
memory: 4096
meta: creation-qemu=7.2.0,ctime=1683165745
name: win11
net0: e1000=3A:29:67:E3:B2:93,bridge=vmbr0,firewall=1
numa: 0
ostype: win11
sata0: local-lvm:vm-104-disk-1,size=100G
scsihw: virtio-scsi-single
smbios1: uuid=dfd63dcf-1d96-4473-a690-500f6c5057e0
sockets: 1
startup: order=2
tpmstate0: local-lvm:vm-104-disk-2,size=4M,version=v2.0
vmgenid: b4096c69-6ba2-40e2-a6b1-291202f474b5

dmesg | grep -e DMAR -e IOMMU
Code:
[    0.435605] pci 0000:00:00.2: AMD-Vi: IOMMU performance counters supported
[    0.436187] pci 0000:00:00.2: AMD-Vi: Found IOMMU cap 0x40
[    0.436611] perf/amd_iommu: Detected AMD IOMMU #0 (2 banks, 4 counters/bank).
[    4.157199] AMD-Vi: AMD IOMMUv2 loaded and initialized

/root/fix_gpu_pass.sh
Code:
#!/bin/bash
echo 1 > /sys/bus/pci/devices/0000:34:00.0/remove
echo 1 > /sys/bus/pci/rescan

Ran this code
Code:
chmod +x /root/fix_gpu_pass.sh

crontab file contains
Code:
@reboot /root/fix_gpu_pass.sh
 

Attachments

  • reboot_short.txt
Solved!

Using:
echo 1 > /sys/bus/pci/devices/0000\:09\:00.0/remove
echo 1 > /sys/bus/pci/rescan

You can put these commands in a .sh script, make it executable with chmod +x, and add it to cron.

File: /root/fix_gpu_pass.sh

Note: change "0000\:0X\:00.0" to your GPU's PCI address.

#!/bin/bash
echo 1 > /sys/bus/pci/devices/0000\:0X\:00.0/remove
echo 1 > /sys/bus/pci/rescan

Add to cron:

crontab -e

add:

@reboot /root/fix_gpu_pass.sh

(Originally published in another post.)
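
A slightly more defensive variant of the same remove/rescan idea is sketched below. It refuses to act on a malformed address or a device that does not exist. The function name pci_remove_rescan and the optional sysfs-root argument (for dry runs on a scratch directory) are inventions of this example, not part of the original script:

```shell
#!/bin/sh
# pci_remove_rescan ADDR [SYSFS_ROOT]
# Removes PCI device ADDR (full form, e.g. 0000:09:00.0) from the bus and
# triggers a rescan, but only if the address is well formed and the device
# is present. SYSFS_ROOT defaults to /sys.
pci_remove_rescan() {
    addr=$1
    root="${2:-/sys}"
    case "$addr" in
        [0-9a-f][0-9a-f][0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f].[0-7]) ;;
        *) echo "bad PCI address: $addr" >&2; return 2 ;;
    esac
    if [ ! -e "$root/bus/pci/devices/$addr/remove" ]; then
        echo "no such device: $addr" >&2
        return 1
    fi
    echo 1 > "$root/bus/pci/devices/$addr/remove"
    echo 1 > "$root/bus/pci/rescan"
}

# Usage on the host (as root), with your own GPU address:
#   pci_remove_rescan 0000:09:00.0
```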
Thank you! This resolved my issue. I thought I would have to replace the GPU.
 
Hey there,

I've been trying to get my 2 GPUs passed through to one VM in Proxmox.

Weirdly, if they are allocated to separate VMs they work fine (they show up in nvidia-smi in an Ubuntu terminal or in Task Manager in Windows, and I can run a game). Their IDs are 08 and 09.

However, when I try to run them on the same VM (I allocate 08 and 09 to my Windows.P2V) it doesn't even load; I just get a black screen.

They are the same model, Nvidia 2080 OEMs from HP. The reason for the dual GPUs is NVLink and memory pooling.

The following is the dmesg:


Code:
[  677.990044] vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[  677.990074] vfio-pci 0000:08:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[  677.991525] vfio-pci 0000:08:00.0: No more image in the PCI ROM
[  678.158043] vfio-pci 0000:09:00.0: vfio_ecap_init: hiding ecap 0x1e@0x258
[  678.158070] vfio-pci 0000:09:00.0: vfio_ecap_init: hiding ecap 0x19@0x900
[  682.930327] vfio-pci 0000:08:00.0: No more image in the PCI ROM
[  682.930353] vfio-pci 0000:08:00.0: No more image in the PCI ROM

Any clue how to proceed?
 
I've been doing GPU passthrough for years and I've not had any issues in recent years until today.
I've tried everything in this thread and nothing worked.
I've tried everything in the wiki and nothing worked.
I've tried all the combinations of what was in thread and what was in the wiki and it didn't work.

I even did a sanity check and removed the gpu and the vm booted just fine.

Finally, I found the issue: I was allocating too much memory.
I lowered the amount down and now everything works.

I'm on PVE 8.0.4.
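
As far as I understand it, guest memory has to be pinned in host RAM when a PCI device is passed through, so the VM's memory setting needs to fit within what the host physically has, with room left for the host itself. A rough sketch of that comparison follows; the function name vm_memory_headroom is hypothetical, and it only reads the first memory: line of the VM config and MemTotal from /proc/meminfo:

```shell
#!/bin/sh
# vm_memory_headroom CONF [MEMINFO]
# Prints host MemTotal minus the VM's configured memory, in MiB.
# CONF is a Proxmox VM config such as /etc/pve/qemu-server/106.conf;
# MEMINFO defaults to /proc/meminfo.
vm_memory_headroom() {
    vm_mib=$(awk -F': *' '$1 == "memory" { print $2; exit }' "$1")
    host_kib=$(awk '$1 == "MemTotal:" { print $2; exit }' "${2:-/proc/meminfo}")
    if [ -z "$vm_mib" ] || [ -z "$host_kib" ]; then
        echo "could not read memory values" >&2
        return 1
    fi
    echo $(( $host_kib / 1024 - $vm_mib ))
}

# Usage on the host:
#   vm_memory_headroom /etc/pve/qemu-server/106.conf
```

A negative or very small result suggests the VM is asking for more memory than the host can set aside, which would be consistent with the fix described above.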
 
Hi,

On PVE 8.0.4. AMD 5900x. 2 x Nvidia RTX3090 GPU.

In an Ubuntu 22.04.3 LTS VM, when I first run nvidia-smi, I only see 1 GPU. After doing the following in the VM:

Code:
sudo nano /etc/modprobe.d/blacklist.conf
# then add


blacklist nvidiafb 
blacklist nouveau


# Save, then run:
sudo depmod -ae 
sudo update-initramfs -u


# then reboot the VM

Repeating nvidia-smi, I now see both GPUs. However, after a proxmox host shutdown and power up, back in the VM I only see 1 GPU, so have to repeat the above prescribed process of blacklisting. i.e. the changes do not stick in the VM after host reboot.

Would be most grateful for any ideas how to resolve, thank you
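
To rule out the entries simply going missing from the file, the blacklist step can be made repeatable so that running it again never adds duplicates. A minimal sketch; the function name ensure_blacklisted is made up, and it does not address why the setting fails to persist:

```shell
#!/bin/sh
# ensure_blacklisted FILE MODULE
# Appends "blacklist MODULE" to FILE unless such a line already exists
# (trailing whitespace on the existing line is tolerated).
ensure_blacklisted() {
    grep -Eqx "blacklist[[:space:]]+$2[[:space:]]*" "$1" 2>/dev/null \
        || echo "blacklist $2" >> "$1"
}

# Usage inside the VM (as root), followed by the same initramfs update as above:
#   ensure_blacklisted /etc/modprobe.d/blacklist.conf nvidiafb
#   ensure_blacklisted /etc/modprobe.d/blacklist.conf nouveau
#   update-initramfs -u
```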
 
Hello guys. Does anyone use this method (removing GPU from devices) with vendor-reset (AMD RX580), I can't add script to my device, because it's not in the device list?
 
vendor-reset works fine with RX580. I don't understand what you mean by "removing GPU from devices" and "add script to my device", sorry.
Maybe look at a thread about vendor-reset and RX580, like this one: https://forum.proxmox.com/threads/g...-strange-screen-behaviour.126727/#post-553643
There are lots of other threads about vendor-reset and how to get it working: https://forum.proxmox.com/search/6415725/?q=vendor-reset&o=date
 
There is a step in the vendor-reset instructions where I should run something like echo 'device_specific' > /sys/bus/pci/devices/<pci_device_id_here>/reset_method, but I think that because I use the code below I get an error that there is no such device in the list. Without that code my RX580 doesn't work; with it everything is fine, except that when the GPU glitches my whole server dies :(
echo 1 > /sys/bus/pci/devices/0000\:0X\:00.0/remove
echo 1 > /sys/bus/pci/rescan
 
You only need to run echo device_specific > '/sys/bus/pci/devices/0000:0X:00.0/reset_method' once after every reboot of the Proxmox host. Use crontab, /etc/rc.local or a hookscript (or type it yourself on the console or over SSH).
echo 1 > /sys/bus/pci/devices/0000\:0X\:00.0/remove
echo 1 > /sys/bus/pci/rescan
Stop doing that. You don't need to do that. That's why you use vendor-reset: to reset the RX580 properly using RX580-specific code. Don't drop the GPU from the PCIe bus like you would with NVidia GPUs when they are used during boot of the Proxmox host. Instead, just let the amdgpu driver load for the RX580; don't blacklist it!
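
A small guard around the reset_method write can make failures easier to read. The sketch below checks that the module is loaded before writing; it assumes the loaded module shows up as /sys/module/vendor_reset (hyphen turned into underscore), and the function name set_device_specific_reset and its optional sysfs-root argument are hypothetical:

```shell
#!/bin/sh
# set_device_specific_reset ADDR [SYSFS_ROOT]
# Writes "device_specific" to the device's reset_method file, but only after
# checking that the vendor_reset module appears loaded and the file exists.
# SYSFS_ROOT defaults to /sys and is there only for dry runs.
set_device_specific_reset() {
    root="${2:-/sys}"
    method="$root/bus/pci/devices/$1/reset_method"
    if [ ! -d "$root/module/vendor_reset" ]; then
        echo "vendor_reset module is not loaded" >&2
        return 1
    fi
    if [ ! -e "$method" ]; then
        echo "no reset_method file for $1" >&2
        return 1
    fi
    echo device_specific > "$method"
}

# Usage on the host (as root), e.g. from crontab @reboot or a hookscript:
#   set_device_specific_reset 0000:03:00.0
```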
 
Thanks for your help! But when I try echo device_specific > '/sys/bus/pci/devices/0000:03:00.0/reset_method' I get:
-bash: echo: write error: Invalid argument
I've also tried to install vendor-reset, but when I check it with modprobe vendor-reset I get:
Module vendor-reset not found in directory /lib/modules/6.5.11-6-pve
I removed that code (/sys/bus/pci/devices/0000\:0X\:00.0/remove), but I don't think it helped.
 
This is what is available:
apt install pve-headers
pve-headers               pve-headers-6.2           pve-headers-6.2.16-3-pve
pve-headers-6.1           pve-headers-6.2.16-1-pve  pve-headers-6.2.16-4-pve
pve-headers-6.1.10-1-pve  pve-headers-6.2.16-2-pve  pve-headers-6.2.16-5-pve
No 6.5.11-6-pve :(
 
Use proxmox-headers-6.5, which is the new name. (Why so many old and discontinued kernels?)
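
To avoid guessing the version, the package name can be derived from the running kernel. A tiny sketch, assuming the proxmox-headers-<release> and proxmox-headers-<series> naming used in this thread; both function names are made up:

```shell
#!/bin/sh
# headers_package RELEASE
# Prints the header package name for an exact kernel release string.
headers_package() {
    echo "proxmox-headers-$1"
}

# headers_series_package RELEASE
# Prints the package name for the kernel series (first two version fields).
headers_series_package() {
    echo "proxmox-headers-$(echo "$1" | cut -d. -f1,2)"
}

# Usage on the host:
#   apt install "$(headers_package "$(uname -r)")"
```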
 
Use apt install proxmox-headers or apt install proxmox-headers-6.5 or apt install proxmox-headers-6.5.11-6-pve.
I don't know what helped, but I removed the AMD card from vfio, updated the system with apt update, rebooted, and vendor-reset started to work. It didn't help in my case, though, because the whole system still gets stuck when the GPU glitches :( Anyway, I gave it a try... Thank you leesteken for your help!
 
Hi, I set everything up and it was working like a charm (Win 11).
I worked on some other VMs in the meantime. Today was my day off and I wanted to play some games in my Win 11 VM, but I got error 43 and a bad resolution.

I tried out what I found here, but it did not help. My GRUB looks like this:
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
GRUB_CMDLINE_LINUX="textonly video=efifb:off"

I rolled back to a backup and set it up fresh, but no luck so far :-(
 
