Proxmox 4.0, PCI passthrough broken in several ways

Just for information: I have been using PCIe passthrough since Proxmox 3.4, and it also works perfectly in PVE 4.1. So the problem is only with PCI passthrough.
 
I just uploaded the patched kernel to pvetest:

ftp://download1.proxmox.com/debian/dists/jessie/pvetest/binary-amd64/pve-kernel-4.2.6-1-pve_4.2.6-29_amd64.deb

Could you please test it?
 
I am quick to admit that I'm fairly new to Proxmox, but fairly experienced with Debian. I installed this package with dpkg -i; here is the output:
Code:
$ sudo dpkg -i pve-kernel-4.2.6-1-pve_4.2.6-29_amd64.deb
[sudo] password for brian:
(Reading database ... 40684 files and directories currently installed.)
Preparing to unpack pve-kernel-4.2.6-1-pve_4.2.6-29_amd64.deb ...
Unpacking pve-kernel-4.2.6-1-pve (4.2.6-29) over (4.2.6-28) ...
Setting up pve-kernel-4.2.6-1-pve (4.2.6-29) ...
Examining /etc/kernel/postinst.d.
run-parts: executing /etc/kernel/postinst.d/apt-auto-removal 4.2.6-1-pve /boot/vmlinuz-4.2.6-1-pve
run-parts: executing /etc/kernel/postinst.d/initramfs-tools 4.2.6-1-pve /boot/vmlinuz-4.2.6-1-pve
update-initramfs: Generating /boot/initrd.img-4.2.6-1-pve
run-parts: executing /etc/kernel/postinst.d/zz-update-grub 4.2.6-1-pve /boot/vmlinuz-4.2.6-1-pve
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.2.6-1-pve
Found initrd image: /boot/initrd.img-4.2.6-1-pve
Found memtest86+ image: /boot/memtest86+.bin
Found memtest86+ multiboot image: /boot/memtest86+_multiboot.bin
done
brian@pve2:~$ sudo vi /etc/default/grub
brian@pve2:~$ sudo update-grub
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-4.2.6-1-pve
Found initrd image: /boot/initrd.img-4.2.6-1-pve
Found memtest86+ image: /boot/memtest86+.bin
Found memtest86+ multiboot image: /boot/memtest86+_multiboot.bin
done

I edited /etc/default/grub to include
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1 rootdelay=10 scsi_mod pcie_acs_override=downstream"

Ran update-grub and then rebooted.
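A quick sanity check after rebooting (generic commands, nothing specific to this board) is to confirm that the options actually made it onto the kernel command line and that the IOMMU came up:

Code:
# verify the boot parameters were applied
cat /proc/cmdline

# confirm the IOMMU was initialised (Intel systems)
dmesg | grep -e DMAR -e IOMMU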

Code:
$ sudo ls /sys/kernel/iommu_groups/1/devices/
0000:00:01.0  <-PCI Bridge x16
0000:00:01.1  <-PCI Bridge x8
0000:01:00.0  <-Nvidia 750Ti Video
0000:01:00.1    <-Nvidia 750Ti Audio
0000:02:00.0 <- LSI2308

/etc/initramfs-tools/modules contains pci_stub_ids=10de:1380,10de:0fbc as well. These match the output of lspci -nn | grep NVIDIA.

Anything I'm missing? Or can this board never pass through the graphics card without also passing through the LSI controller?
 

Devices should be in different IOMMU groups, or it will not work. You can try to move the LSI controller to a different PCI slot.
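For example, to see which group a given device ended up in (the address below is just taken from the listing above), you can check the sysfs symlinks directly:

Code:
# show the IOMMU group of a single device
readlink /sys/bus/pci/devices/0000:02:00.0/iommu_group

# or list every device together with its group
find /sys/kernel/iommu_groups/ -type l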
 
Unfortunately the controller is onboard (Supermicro X10SL7) and it appears that both of the PCI slots are in that group. I guess that makes PCI passthrough not an option with this motherboard, which nixes any hope for this particular virtual machine. Thanks for the help, spirit.
 

Well, I have the exact same motherboard in a setup here, and the funny thing is, PCIe passthrough (IOMMU splitting) actually works with the PVE 3.4 kernel (kernel 3.10) using pcie_acs_override=downstream. So the hardware IS capable of splitting the IOMMU groups.
 
Good news! I was finally able to test the new kernel spirit and dietmar uploaded to the repositories, and it works!

Motherboard: Supermicro X10SL7-F

Code:
# uname -r
4.2.6-1-pve

# dpkg -l |grep pve-kernel-4
ii  pve-kernel-4.2.6-1-pve         4.2.6-33                       amd64        The Proxmox PVE Kernel Image

# find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/0/devices/0000:00:00.0
/sys/kernel/iommu_groups/1/devices/0000:00:01.0
/sys/kernel/iommu_groups/2/devices/0000:00:01.1
/sys/kernel/iommu_groups/3/devices/0000:00:14.0
/sys/kernel/iommu_groups/4/devices/0000:00:1a.0
/sys/kernel/iommu_groups/5/devices/0000:00:1c.0
/sys/kernel/iommu_groups/6/devices/0000:00:1c.2
/sys/kernel/iommu_groups/7/devices/0000:00:1c.3
/sys/kernel/iommu_groups/8/devices/0000:00:1d.0
/sys/kernel/iommu_groups/9/devices/0000:00:1f.0
/sys/kernel/iommu_groups/9/devices/0000:00:1f.2
/sys/kernel/iommu_groups/9/devices/0000:00:1f.3
/sys/kernel/iommu_groups/9/devices/0000:00:1f.6
/sys/kernel/iommu_groups/10/devices/0000:01:00.0 <-GTX 760
/sys/kernel/iommu_groups/10/devices/0000:01:00.1 <- GTX 760 Audio
/sys/kernel/iommu_groups/11/devices/0000:02:00.0 <- LSI SAS Controller
/sys/kernel/iommu_groups/12/devices/0000:03:00.0
/sys/kernel/iommu_groups/12/devices/0000:04:00.0
/sys/kernel/iommu_groups/13/devices/0000:05:00.0
/sys/kernel/iommu_groups/14/devices/0000:06:00.0

Without pcie_acs_override=downstream, the GTX 760 and the LSI SAS controller would always end up in the same IOMMU group (this is because I use an E3-1200 series CPU, which lacks ACS support!). With this patch and pcie_acs_override=downstream, the IOMMU groups are finally split :)
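In case it helps anyone else checking their layout, a small loop like the following (just standard sysfs paths plus lspci, nothing specific to this board) prints each group with a readable device description:

Code:
#!/bin/bash
# print every IOMMU group together with the lspci description of its devices
for dev in /sys/kernel/iommu_groups/*/devices/*; do
    group=$(basename "$(dirname "$(dirname "$dev")")")
    addr=$(basename "$dev")
    echo "IOMMU group ${group}: $(lspci -nn -s "$addr")"
done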

Thank you!
 
I am quick to admit that I'm fairly new to Proxmox, but fairly experienced with Debian.

I edited /etc/default/grub to include
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1 rootdelay=10 scsi_mod pcie_acs_override=downstream"

Ran update-grub and then rebooted.

Code:
$ sudo ls /sys/kernel/iommu_groups/1/devices/
0000:00:01.0  <-PCI Bridge x16
0000:00:01.1  <-PCI Bridge x8
0000:01:00.0  <-Nvidia 750Ti Video
0000:01:00.1    <-Nvidia 750Ti Audio
0000:02:00.0 <- LSI2308

/etc/initramfs-tools/modules contains pci_stub_ids=10de:1380,10de:0fbc as well. These match the output of lspci -nn | grep NVIDIA.

Anything I'm missing? Or can this board never pass through the graphics card without also passing through the LSI controller?

It should work, see my post below.

Also, looking closely at your GRUB_CMDLINE_LINUX_DEFAULT, I noticed that scsi_mod doesn't have any argument, so maybe pcie_acs_override doesn't get parsed correctly. Maybe you have to change scsi_mod to scsi_mod.scan=sync.

Then run update-grub and reboot.
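For clarity, the resulting GRUB_CMDLINE_LINUX_DEFAULT (keeping all of the other options) would look something like this:

Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1 rootdelay=10 scsi_mod.scan=sync pcie_acs_override=downstream"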
 

Well, in my setup the pci_stub_ids in /etc/initramfs-tools/modules are NOT needed, so I think you can safely remove them; maybe they interfere with the rest.
 
I can confirm that with the latest kernel in the non-subscription repository and pcie_acs_override=downstream, everything works again. VM with passthrough boots quickly, and IOMMU groups are properly split.
 
I noticed this patch was also removed from the 3.10 kernel branch:

Code:
pve-kernel-3.10.0 (3.10.0-40) unstable; urgency=low

  * remmove override_for_missing_acs_capabilities.patch

What is the reasoning behind this removal? This change breaks PCIe passthrough for me.
 
My IOMMU grouping issue finally got fixed with the latest kernel, hell yeah, now I'll probably get a Proxmox subscription. :):):)
 
pci-stub is no longer used in the 4.x kernel; you need to use the vfio equivalent:
echo "options vfio-pci ids=10de:1381,10de:0fbc" > /etc/modprobe.d/vfio.conf

I updated the wiki last week:
https://pve.proxmox.com/wiki/Pci_passthrough#GPU_PASSTHROUGH
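As a rough sketch of the remaining steps (the device address below is just an example): if vfio-pci is loaded from the initramfs you also need to refresh it, and after a reboot you can check which driver is bound to the card:

Code:
# rebuild the initramfs so the vfio-pci options are picked up early (if applicable)
update-initramfs -u

# after a reboot, verify the driver in use for the GPU (example address)
lspci -nnk -s 01:00.0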
Not sure of the official way to request an update to the wiki, but on the PCI passthrough page you have "/etc/pve/qemuserver/<vmid>.cfg"; because of the way qemu-server is set up now, I believe it should be "/etc/pve/qemu-server/<vmid>.cfg".
 

I have fixed the wiki, thanks for reporting the typo.
 
Sorry for bumping this thread, but there have been some new insights affecting PCI passthrough on Skylake systems for which it might be worthwhile patching the PVE kernel.

It has been recognized that Skylake PCH root ports actually have an ACS capability, but because Intel implemented a non-standard way of reporting that capability, the current Linux kernel is not able to put devices on PCH root ports into different IOMMU groups, even though proper isolation would actually be available. A patch set has been developed (see above link), but it will be released upstream with kernel v4.7 at the earliest. Would it be possible to patch the present PVE kernel, so we do not have to wait for a release based on v4.7?
 
I had the same slow boot problem when I recently migrated my FreeNAS guest, which had previously been running fine for years in ESXi with two passed-through LSI 9211-8i controllers.

When my first scrub kicked off I started seeing massive amounts of checksum errors, and lost a good number of files (thank god for backups). As soon as I noticed, I shut the VM down, exported the pool, imported it directly on the host, started a scrub, and had no problems at all.

So, something odd is going on with PCI passthrough, and as a result I have stopped using FreeNAS and instead manage my big pool manually from the console on the host. (Granted, this was before I bought my subscription and was running off the kernel on the 4.1 ISO, so it might have been fixed by the patches reported in this thread.)

This has come with a lot of side benefits actually. Bind mounted folders in LXC containers are MUCH faster than using NFS mounts between VM's, and having the host manage the ZFS ARC winds up resulting in MUCH more efficient RAM use.

ZFS itself is relatively simple to manage from the command line. Where I really miss FreeNAS is the management of NFS/SAMBA/AFP shares. Doing it manually has been a royal pain, but I am getting close to being done.

NFS shares come directly off the host and are restricted per client IP for security. It took a while to figure out how the /etc/exports settings (especially the squash, anonuid and anongid options) interact with user permissions and groups, but I finally got that done.
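For anyone wrestling with the same options, an /etc/exports entry along these lines is what I mean (client IP, path and IDs are examples only):

Code:
# /etc/exports - restrict to one client and map all users to a fixed UID/GID
/tank/media 192.168.1.50(rw,sync,no_subtree_check,all_squash,anonuid=1000,anongid=1000)

# apply the changes
exportfs -ra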

I have an Ubuntu container dedicated to a manually compiled netatalk 3.1.8 install, which lets the one Mac in the house do Time Machine backups, and I am now in the process of setting up a separate container just for Samba shares.
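The relevant part of the netatalk 3.x config is short; assuming a default source build it lives in /usr/local/etc/afp.conf and looks roughly like this (share name and path are placeholders):

Code:
; /usr/local/etc/afp.conf
[Global]
  mimic model = TimeCapsule6,106

[Time Machine]
  path = /tank/timemachine
  time machine = yes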

It is honestly a pain to set it all up manually, but once configured that work is over, and the RAM and CPU efficiency from not using FreeNAS in a VM will last for the life of the system :p
 
Sorry for bumping this thread, but there have been some new insights affecting PCI passthrough on Skylake systems for which it might be worthwhile patching the PVE kernel.

It has been recognized that Skylake PCH root ports actually have an ACS capability, but because Intel implemented a non-standard way of reporting that capability, the current Linux kernel is not able to put devices on PCH root ports into different IOMMU groups, even though proper isolation would actually be available. A patch set has been developed (see above link), but it will be released upstream with kernel v4.7 at the earliest. Would it be possible to patch the present PVE kernel, so we do not have to wait for a release based on v4.7?

Could you please file a bug report at bugzilla.proxmox.com for easier tracking? This patch is still very new, so let's see whether it gets backported to 4.4 or whether we have to backport it ourselves..
 
