AMD Epyc PCIe-Passthrough FLR error.

With a custom-compiled kernel it worked fine as far as I tested it, but I have no way to tell how stable or reliable it is.

After that I did not have time for anything else, but I recently started working on this again.

I did one test based on the information leesteken mentioned, but without success so far.

  1. Update Proxmox to kernel version 5.15.39
  2. Identify the controller I want to pass through
    Code:
    root@pve:# lspci | grep SATA
    84:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
    85:00.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 51)
  3. Check which reset methods are available for the PCI device
    Code:
    root@pve:# cat /sys/bus/pci/devices/0000\:85\:00.0/reset_method
    flr bus
  4. Disable the flr reset method
    Code:
    root@pve:# echo bus > /sys/bus/pci/devices/0000\:85\:00.0/reset_method
  5. Repeat step 3 to check that flr was disabled
    Code:
    root@pve:# cat /sys/bus/pci/devices/0000\:85\:00.0/reset_method
    bus
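
Steps 3 to 5 can be wrapped in a small helper. This is only a sketch: the function names and the `PCI_DEVICES_DIR` variable are made up for illustration, and it assumes a kernel that exposes the `reset_method` file shown above. The setting lives in sysfs, so it is lost on reboot.

```shell
#!/bin/sh
# Sketch: drop "flr" from a PCI device's enabled reset methods.
# PCI_DEVICES_DIR is a hypothetical override so the logic can be tried
# against a scratch directory instead of the real sysfs tree.
PCI_DEVICES_DIR="${PCI_DEVICES_DIR:-/sys/bus/pci/devices}"

# Print the reset methods currently enabled for a device, e.g. "flr bus".
show_reset_methods() {
    cat "$PCI_DEVICES_DIR/$1/reset_method"
}

# Rewrite reset_method without "flr", keeping the other methods in order.
disable_flr() {
    dev="$1"    # full PCI address, e.g. 0000:85:00.0
    current=$(show_reset_methods "$dev") || return 1
    remaining=""
    for m in $current; do
        [ "$m" = "flr" ] || remaining="${remaining:+$remaining }$m"
    done
    # An empty list would disable resets for the device altogether, so refuse.
    if [ -z "$remaining" ]; then
        echo "no reset method left besides flr for $dev" >&2
        return 1
    fi
    printf '%s\n' "$remaining" > "$PCI_DEVICES_DIR/$dev/reset_method"
}
```

Run as root, `disable_flr 0000:85:00.0` followed by `show_reset_methods 0000:85:00.0` should then print `bus` for the device listed above.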
All that was easy to configure, but after creating a VM and passing the second SATA controller (PCI device 85:00.0) to it, everything went wrong.
Upon starting the VM, the first SATA controller (PCI device 84:00.0) lost its disks and I got a bunch of read errors, because my VM disk images are on the first SATA controller.
I have not figured out why the first controller has issues when I pass through the second one.

But the above steps did prevent the system from crashing with the FLR reset timeout that I had before, so this is at least a step in the right direction.

I did remember that the Proxmox wiki page on PCIe passthrough recommends disabling the device so the host won't use it. But both SATA controllers have the same device ID, so I can't blacklist just one of them, only both. There is probably a workaround for this, but I have not found it yet.
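
One possible workaround, untested on this hardware, is to select the device by PCI address instead of by device ID, using the kernel's `driver_override` and `drivers_probe` sysfs files. The sketch below assumes the `vfio-pci` module is already loaded; the function name and the `PCI_SYSFS_ROOT` variable are hypothetical. Note that unbinding a controller from the host detaches every disk on it, and whether this avoids the problem with the first controller is not known.

```shell
#!/bin/sh
# Sketch: hand a single PCI function to vfio-pci by address, so a second
# device with the same vendor:device ID stays on its normal host driver.
# PCI_SYSFS_ROOT is a hypothetical override for trying this in a scratch dir.
PCI_SYSFS_ROOT="${PCI_SYSFS_ROOT:-/sys/bus/pci}"

bind_to_vfio() {
    dev="$1"    # full PCI address, e.g. 0000:85:00.0
    devdir="$PCI_SYSFS_ROOT/devices/$dev"
    if [ ! -d "$devdir" ]; then
        echo "no such PCI device: $dev" >&2
        return 1
    fi
    # From now on only vfio-pci is allowed to claim this one address.
    echo vfio-pci > "$devdir/driver_override" || return 1
    # Release the device from its current driver (ahci here), if one is bound.
    if [ -e "$devdir/driver/unbind" ]; then
        echo "$dev" > "$devdir/driver/unbind" || return 1
    fi
    # Ask the kernel to probe the device again; the override steers the match.
    echo "$dev" > "$PCI_SYSFS_ROOT/drivers_probe"
}
```

As root this would be called as `bind_to_vfio 0000:85:00.0`. Like the reset method change, it does not survive a reboot.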
Thank you from 2024. For two days I could not understand why my virtual machine was not working, with an Adaptec RAID controller passed through to it on the X399 platform.
 
I am glad this helped you.

But I never used it in production, as I was not sure what my changes implied for stability.

But just recently I got IOMMU and pass-through to work a lot better than when I posted this thread. (Keep in mind all the software updates between when this thread was posted and now.)
On my motherboard, PCI AER (PCI Advanced Error Reporting) was disabled in the BIOS. Because it was off, another required IOMMU setting was hidden.

After enabling PCI AER and the other option, I got much better IOMMU group separation, where basically every PCIe device got its own IOMMU group, and the FLR reset bug no longer occurred. I could easily start, stop, and restart the VMs with the SATA controllers without any modification to the OS or software. This was on the same hardware as when I posted this thread.
Unfortunately, the motherboard (Gigabyte MZ01-CE1) died shortly afterwards, before I could make full use of it. After ordering a new motherboard (Supermicro H12SSL-I) and enabling all the correct BIOS options, it also works like the old motherboard did.
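
To compare the IOMMU group separation before and after changing BIOS options, the groups can be listed from `/sys/kernel/iommu_groups`. This is a minimal sketch; the function name and the `IOMMU_ROOT` variable are made up for illustration.

```shell
#!/bin/sh
# Sketch: print one line per PCI device with the IOMMU group it belongs to.
# IOMMU_ROOT is a hypothetical override for trying this in a scratch dir.
IOMMU_ROOT="${IOMMU_ROOT:-/sys/kernel/iommu_groups}"

list_iommu_groups() {
    for devpath in "$IOMMU_ROOT"/*/devices/*; do
        # With no IOMMU groups the glob stays unexpanded; skip that case.
        [ -e "$devpath" ] || continue
        group=${devpath#"$IOMMU_ROOT"/}   # strip the root prefix
        group=${group%%/*}                # keep only the group number
        printf 'group %s: %s\n' "$group" "${devpath##*/}"
    done
}
```

Something like `list_iommu_groups | grep -E '84:00.0|85:00.0'` then shows whether the two SATA controllers share a group or each sit in their own.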

I don't know if this is relevant for your platform, but it might help you find a better solution.
 
My motherboard also has AER and other advanced virtualization features, and I can easily pass through a GPU, but it didn't work with the Adaptec 8405e RAID card. Maybe that is because it's a Gigabyte motherboard; I heard that on some other models of the same platform they broke the IOMMU in all BIOS versions except the first ones.
 
I had the same with my Gigabyte server motherboard. Without PCI AER, GPU pass-through worked fine, but the SATA controller built into the EPYC CPU wouldn't work and gave FLR errors. After enabling PCI AER I got much better IOMMU separation and the FLR errors are also gone.
 
