[TUTORIAL] Compile Proxmox VE with patched intel-iommu driver to remove RMRR check

Whitterquick

Member
Aug 1, 2020
232
7
18
This is fantastic! Kiler129 you truly are the best! :)

I am having a few issues with a USB attached device being a bit fiddly on boot (totally unrelated to this patch I think). I sometimes need to make sure it is switched on before or after everything is booted, which makes reboots fiddly. I have just wiped everything so can’t paste any error examples but does this sound familiar to anyone?
 

kiler129

New Member
Oct 20, 2020
12
12
3
29
Is this specific to the internal B120i controller?
Yes, I actually use it passed through to a DSM install.

This sounds familiar to me in one scenario: USB reconfiguration due to speed. Make sure the physical controller you connect the device to is the same generation (1.1 vs 2.0 vs 3.0) as the virtual one. Many devices are fine with being reconnected between 3.0 and 2.0, but some aren't. Check that first ;)
 

Whitterquick

Member
Aug 1, 2020
232
7
18
Oh thanks for the great tip, where do I find that virtual option?
 

kbftech

New Member
Mar 22, 2021
3
1
3
120
@kiler129 Thanks for your work. After hours of head bashing around obscure forum threads, I finally stumbled upon a post that led me to your GitHub. A reboot later I was able to successfully pass my PCI HBA SAS card (LSI) to a TrueNAS VM in Proxmox.

Now I wonder:
What would it take for a patch like this to be included in the main trunk of Proxmox, removing the whole hassle I (and anyone else using Proxmox on an HP Gen8 server) faced?

Thanks again!
 

FelixCLC

Member
Feb 6, 2020
33
10
13
24
News on an alternative method: it turns out there's a built-in kernel module for HP iLO 2+ support that's rarely, if ever, enabled.

This may mean we can remove the RMRR for certain slots, as proposed previously.
 

kbftech

New Member
Mar 22, 2021
3
1
3
120
That would be great! Care to share the package name?
 

FelixCLC

Member
Feb 6, 2020
33
10
13
24
Not a package; it's a kernel configuration option set when building from source.

I've yet to test this, but came across it while compiling a custom kernel for a work thing.

In your kernel source folder, run
Bash:
scripts/config --enable CONFIG_HP_ILO

to enable the iLO 2+ interface.

Then run nano .config and search for ilo (ctrl-w to search, alt-w to jump between matches).

Busy with some other projects right now, but will compile this when I get a chance
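
To confirm that the option actually landed in the configuration, a small check of .config can help. This is a minimal sketch and not part of the post: `config_state` is a hypothetical helper name, and it relies only on the usual .config convention that an enabled option appears as `CONFIG_NAME=y` (built in) or `CONFIG_NAME=m` (module).

```shell
#!/bin/sh
# config_state <config-file> <OPTION>   (OPTION without the CONFIG_ prefix)
# Prints "y" (built in), "m" (module) or "unset".
# Hypothetical helper for illustration only.
config_state() {
    file=$1
    opt=$2
    # Enabled options look like CONFIG_HP_ILO=y or CONFIG_HP_ILO=m.
    value=$(sed -n "s/^CONFIG_${opt}=\([ym]\)\$/\1/p" "$file")
    if [ -n "$value" ]; then
        printf '%s\n' "$value"
    else
        printf 'unset\n'
    fi
}

# Example, run from the kernel source folder:
#   config_state .config HP_ILO
```

If this prints `m`, the driver would be built as a loadable module rather than baked into the kernel image.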
 

kbftech

New Member
Mar 22, 2021
3
1
3
120
Thanks for the highlight! I'm a bit behind on this aspect. I thought/hoped we could load the module (as opposed to recompiling the kernel to have it baked in).

I haven't fiddled with kernel recompilation since the early 2000s, when I had so much time on my hands that I would bootstrap Gentoo on a weekly basis. I'll have to google around to re-learn how to do that "correctly".

Thanks again!
 

FelixCLC

Member
Feb 6, 2020
33
10
13
24
It may be possible to do so, assuming you mean via something like DKMS.

Until then:

Quite busy ATM, but if you want to test yourself:

Download the source for kernel 5.8+ and unpack it (untar/unzip etc.).

cd linux

make clean

make oldconfig

scripts/config --enable CONFIG_HP_ILO

make prepare

make -j35
[1.5 * number of threads - 1; so for 2 hexacore Xeons, 24 threads * 1.5 = 36, - 1 = 35]

make modules -j35

make modules_install

make install

make clean

You would then use the utility in my post a few pages back to disable RMRR directly for the slot.

Once done, regardless of kernel, the slot should no longer have issues with RMRR.
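
The job count for make can be derived instead of hard-coded. The sketch below applies the "1.5 * threads - 1" rule of thumb quoted in the steps; `make_jobs` is a hypothetical helper name, and the lower bound of 1 is an added assumption so that small machines never end up with `-j0`.

```shell
#!/bin/sh
# make_jobs <threads>: apply the "1.5 * threads - 1" rule of thumb.
# Hypothetical helper for illustration only.
make_jobs() {
    threads=$1
    # Integer arithmetic: multiply before dividing so the 1.5 factor
    # is not truncated too early.
    n=$(( threads * 3 / 2 - 1 ))
    # Assumption: never return less than one job.
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

make_jobs 24   # prints 35

# On a Linux build host this could be used as:
#   make -j"$(make_jobs "$(nproc)")"
```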
 

Whitterquick

Member
Aug 1, 2020
232
7
18
This seems very interesting. When do you think you will have the time to test it?
 

Whitterquick

Member
Aug 1, 2020
232
7
18
few weeks- this is really low priority for me right now

If you want to give it a shot and let me know, could be a good option
Kind of in the same boat. I haven’t had much time lately to even troubleshoot existing issues. Funny because during a pandemic I would expect to have more time than ever!
 

arh

New Member
Apr 1, 2021
3
1
3
43
Tutorial:

Well it took a bit more time but here it is, a tutorial to disable the RMRR check. There are other tutorials but those aren't complete, don't work or are for v4.4 and earlier. If you've been affected by RMRR on HP Proliant G7 or earlier, other solutions will not work (excluding RMRR with conrep, acs_override, etc. etc.).

You have probably read this already as well, otherwise you wouldn't be here. ;-)
https://forum.proxmox.com/threads/gpu-passthrough-tutorial-reference.34303/

You should only use this tutorial if you have old hardware that cannot disable RMRR (Reserved Memory Region Reporting). Red Hat has a good whitepaper on the subject, please read it before continuing: https://access.redhat.com/sites/default/files/attachments/rmrr-wp1.pdf

Anyway, if you're aware of the security risk of accepting RMRR in a virtualized environment, you accept the inherent risk of compiling a Linux kernel with a self-made patch (typos in the patch may kill your system) and still want to use passthrough: here's a how-to for Proxmox VE v5.1, v5.2, v5.3, v5.4, v6.0 and v6.2. I'm pretty new to all this stuff (first time compiling ever) so don't expect this to be the prettiest tutorial ever, and there's probably a better or quicker way to do things. This however did work for me where all other suggestions failed.

If you're trying to passthrough a GPU but can't get it to work and get a message like this in the GUI:
Code:
vfio: failed to set iommu for container: Operation not permitted
RMRR exclusion might be in effect, you can check this like this:
Code:
dmesg | grep -e DMAR -e IOMMU

If that returns
Code:
vfio-pci 0000:0X:YY.Z: Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.
RMRR exclusion is indeed in effect and passthrough for that device has been disabled.
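
When several devices are involved, the affected PCI addresses can be pulled out of the kernel log automatically. This is a minimal sketch, assuming the message wording shown above; `rmrr_blocked_devices` is a hypothetical helper name, and it reads log text from standard input so it can be fed from `dmesg` or from a saved log file.

```shell
#!/bin/sh
# Read kernel log text on stdin and print the PCI address of every device
# rejected by the RMRR check, one per line, without duplicates.
# Hypothetical helper for illustration only.
rmrr_blocked_devices() {
    grep 'Device is ineligible for IOMMU domain attach' |
        grep -oE '[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]' |
        sort -u
}

# Example:
#   dmesg | rmrr_blocked_devices
```

An empty result means no device was rejected with this particular message.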

Still on board? Okay then, we're going to make a patch to disable the RMRR check in the offending intel-iommu driver. Sidenote: this patch does NOT survive a kernel update. You'll need to patch again using this tutorial (works as long as kernel = Zesty, Artful Aardvark, Bionic Beaver, Eoan Ermine).

Also, the required patch has changed since Eoan Ermine because Intel changed their intel-iommu driver. Make sure to use the v2 patch if you are on Eoan.

First, download and install required packages. I'm not sure if you need all of them, but this is what other users used and I've been adding to the list whenever I got an error stating I was missing a package:
Code:
apt-get update
apt-get install git nano screen patch fakeroot build-essential devscripts libncurses5 libncurses5-dev libssl-dev bc flex bison libelf-dev libaudit-dev libgtk2.0-dev libperl-dev asciidoc xmlto gnupg gnupg2 rsync lintian debhelper libdw-dev libnuma-dev libslang2-dev sphinx-common asciidoc-base automake cpio dh-python file gcc kmod libiberty-dev libpve-common-perl libtool perl-modules python-minimal sed tar zlib1g-dev lz4

Then download the pve-kernel git:
Code:
cd /usr/src/
git clone git://git.proxmox.com/git/pve-kernel.git

This creates a directory called pve-kernel in /usr/src/ (the preferred folder for compiling stuff) where all the magic is going to happen.

Next you can either create the patch yourself (option A) or download my variant (option B).

OPTION A:
If you want to build the patch yourself, download the Ubuntu kernel, this is needed to make a patch for the driver:
Code:
cd /usr/src/
git clone git://git.proxmox.com/git/mirror_ubuntu-focal-kernel.git
mv mirror_ubuntu-focal-kernel ubuntu-focal

Then, copy the driver file so we can edit the copy and make a patch out of the differences between the copy and original:
Code:
cp ubuntu-focal/drivers/iommu/intel-iommu.c ubuntu-focal/drivers/iommu/intel-iommu_new.c

And use your preferred editor (mine is Nano) to edit the intel-iommu_new.c driver file:
Code:
nano ubuntu-focal/drivers/iommu/intel-iommu_new.c

Find this section (ctrl-w in Nano):
Code:
        if (domain->type == IOMMU_DOMAIN_UNMANAGED &&
            device_is_rmrr_locked(dev)) {
                dev_warn(dev, "Device is ineligible for IOMMU domain attach due to platform RMRR requirement.  Contact your platform vendor.\n");
                return -EPERM;

       }

Remove the offending "return -EPERM" line, modify the message text for your convenience, then save and close with ctrl-o and ctrl-x. It should look like this:
Code:
        if (domain->type == IOMMU_DOMAIN_UNMANAGED &&
            device_is_rmrr_locked(dev)) {
                dev_warn(dev, "Device was ineligible for IOMMU domain attach due to platform RMRR requirement. Patch is in effect.\n");
        }

Then, make a patch file in the pve-kernel directory out of the original and new driver file:
Code:
cd /usr/src/pve-kernel
diff -u /usr/src/ubuntu-focal/drivers/iommu/intel-iommu.c /usr/src/ubuntu-focal/drivers/iommu/intel-iommu_new.c > remove_rmrr_check_v2.patch

Open your new patch file; you need to edit the first two lines so that they match the file names and directory structure used when compiling.
Code:
nano remove_rmrr_check_v2.patch
The first two lines should look like this (don't mind the dates, just edit the directories and filenames):
Code:
--- a/drivers/iommu/intel-iommu.c   2019-11-24 17:47:46.031359158 +0200
+++ b/drivers/iommu/intel-iommu.c       2019-11-24 17:50:59.904823799 +0200
Close and save with ctrl-o, ctrl-x (or ctrl-x and y)

Find your newly created (or downloaded) patch file in the pve-kernel directory and place it in the patches directory.
Code:
cd /usr/src/pve-kernel
mv remove_rmrr_check_v2.patch ./patches/kernel
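
The manual header edit in Option A can be avoided by letting diff write the a/ and b/ paths itself. This is a sketch only, assuming a diff that supports `--label` (GNU diff does); `make_rmrr_patch` is a hypothetical helper name, and the target path is the one used in the tutorial's patch header.

```shell
#!/bin/sh
# make_rmrr_patch <original.c> <edited.c> <out.patch>
# Writes a unified diff whose header already names the a/ and b/ paths
# that "patch -p1" expects, so the first two lines need no editing.
# Hypothetical helper for illustration only.
make_rmrr_patch() {
    orig=$1
    edited=$2
    out=$3
    target=drivers/iommu/intel-iommu.c
    status=0
    diff -u --label "a/$target" --label "b/$target" \
        "$orig" "$edited" > "$out" || status=$?
    # diff exits 1 when the files differ (the expected case), 0 when they
    # are identical (empty patch) and 2 on errors.
    [ "$status" -eq 1 ]
}

# Example:
#   cd /usr/src/ubuntu-focal/drivers/iommu
#   make_rmrr_patch intel-iommu.c intel-iommu_new.c \
#       /usr/src/pve-kernel/patches/kernel/remove_rmrr_check_v2.patch
```

The function reports failure when the two files are identical, which catches the case where the edit was never saved.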

OPTION B
If you want to use my patch, download the attachment (remove_rmrr_check for Disco and earlier, remove_rmrr_check_v2 for Eoan and later), unzip it and place it in the following directory with WinSCP, FTP or another program of choice:
Code:
/usr/src/pve-kernel/patches/kernel/

PROCESS AFTER BOTH OPTION A AND B
There is a script that checks for strange directory names. It needs to be disabled so that we can use a custom kernel version name, which lets us update systems that are already up to date:
Code:
nano /usr/src/pve-kernel/debian/scripts/find-firmware.pl
Comment out the fourth line from the top with a # so that it looks like this:
Code:
#die "strange directory name" if..
Save with ctrl-x and y.
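
The same edit can be scripted instead of done in nano. This is a rough sketch, assuming only that the line to disable contains the text `die "strange directory name"` as shown above; `comment_out_check` is a hypothetical helper name, and matching on the text rather than on "line four" is a deliberate choice so that the edit still works if the line moves.

```shell
#!/bin/sh
# comment_out_check <file>
# Prefix the 'die "strange directory name"' line with '#'.
# Hypothetical helper for illustration only.
comment_out_check() {
    file=$1
    # Only lines that are not already commented out match, so running
    # the function twice is harmless.
    sed '/^[^#]*die "strange directory name"/ s/^/#/' "$file" > "$file.tmp" &&
        cat "$file.tmp" > "$file" &&   # cat keeps the file's mode bits
        rm -f "$file.tmp"
}

# Example:
#   comment_out_check /usr/src/pve-kernel/debian/scripts/find-firmware.pl
```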

Then edit the control.in file located at /usr/src/pve-kernel so that the generated package is named differently. This prevents errors when updating an up-to-date vanilla install. Add -removermrr to the pve-kernel package name so the line looks like this:
Code:
pve-kernel-@KVNAME@-removermrr

To finish up and give your system a nice identifier, edit the Makefile:
Code:
nano /usr/src/pve-kernel/Makefile
Edit the EXTRAVERSION line near the top of the Makefile and add this to the end:
Code:
-removermrr
Again, save with ctrl-x and y.

Now, you've made a patch to modify the intel-iommu driver and set everything to successfully compile the new kernel. All that's left is to compile! Since the Makefile does all the thinking, you don't need to pass arguments like -j:
Code:
make

Now watch the magic, it will take a while and put quite a load on your system. It took 30 minutes on my HP DL380 G6 with twin X5670's. When it's finished, you should find a couple of new .deb files in your /usr/src/pve-kernel directory. Install them:
Code:
dpkg -i *.deb

The Makefile also updates GRUB and initramfs, so no need to update those manually. When finished, reboot and check again:
Code:
dmesg | grep -e DMAR -e IOMMU

If you still get an error about unsafe interrupts, note that since 6.1 the allow_unsafe_interrupts method has changed: https://forum.proxmox.com/threads/i...nabled-still-error-message.67341/#post-312870

You should see your newly modified line about the RMRR patch being in effect, and passthrough should now work. If not, something else might be wrong, you made a typo somewhere, I forgot something (reply below), or this tutorial is already outdated because Proxmox uses a different kernel than Zesty (updated already to Artful after writing this tutorial, had to rewrite; thanks to and3l12 for providing info on the new makefile). Bionic already. ;-) And on to Disco. And it's Eoan now. Focal.

Happy compiling and don't forget to like this post if it helped you. :cool::D
Help... Sorry... HP ProLiant ML350P Gen8 ProxMox 6.2-1 or 6.3-1. I've tried both A and B on fresh installs of both ProxMox versions. The problem I have is with the make command.

Here is the output.

Code:
root@sky-pve-1:/usr/src/pve-kernel# make
test -f "submodules/ubuntu-hirsute/README" || git submodule update --init submodules/ubuntu-hirsute
test -f "submodules/zfsonlinux/Makefile" || git submodule update --init --recursive submodules/zfsonlinux
rm -rf build/ubuntu-hirsute ubuntu-hirsute.prepared
mkdir -p build
cp -a submodules/ubuntu-hirsute build/ubuntu-hirsute
cat build/ubuntu-hirsute/debian.master/config/config.common.ubuntu build/ubuntu-hirsute/debian.master/config/amd64/config.common.amd64 build/ubuntu-hirsute/debian.master/config/amd64/config.flavour.generic > config-5.11.7.org
cp config-5.11.7.org build/ubuntu-hirsute/.config
sed -i build/ubuntu-hirsute/Makefile -e 's/^EXTRAVERSION.*$/EXTRAVERSION=-1-pve-removermrr/'
rm -rf build/ubuntu-hirsute/debian build/ubuntu-hirsute/debian.master
set -e; cd build/ubuntu-hirsute; for patch in ../../patches/kernel/*.patch; do echo "applying patch '$patch'" && patch -p1 < ${patch}; done
applying patch '../../patches/kernel/0001-Make-mkcompile_h-accept-an-alternate-timestamp-strin.patch'
patching file scripts/mkcompile_h
applying patch '../../patches/kernel/0002-bridge-keep-MAC-of-first-assigned-port.patch'
patching file net/bridge/br_stp_if.c
applying patch '../../patches/kernel/0003-pci-Enable-overrides-for-missing-ACS-capabilities-4..patch'
patching file Documentation/admin-guide/kernel-parameters.txt
patching file drivers/pci/quirks.c
applying patch '../../patches/kernel/0004-kvm-disable-default-dynamic-halt-polling-growth.patch'
patching file virt/kvm/kvm_main.c
applying patch '../../patches/kernel/0005-net-core-downgrade-unregister_netdevice-refcount-lea.patch'
patching file net/core/dev.c
applying patch '../../patches/kernel/remove_rmrr_check_v2.patch'
can't find file to patch at input line 3
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|--- a/drivers/iommu/intel-iommu.c      2021-03-31 20:43:21.127349418 -0500
|+++ b/drivers/iommu/intel-iommu.c      2021-03-31 20:45:25.418800372 -0500
--------------------------
File to patch:

Any and all help is much much appreciated.
 

arh

New Member
Apr 1, 2021
3
1
3
43
Never mind, Kiler129's post #137 did the trick. I would like to learn and understand what my problem was, but it is not urgent as Kiler129's solution worked GREAT. Never purchasing anything HPE ever again!!

Thanks everyone.
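
One plausible reading of the failed build, not confirmed anywhere in the thread: the log shows the ubuntu-hirsute submodule (kernel 5.11.7), and in that tree the Intel IOMMU driver is located at drivers/iommu/intel/iommu.c, while the patch header names drivers/iommu/intel-iommu.c. The sketch below checks which of the two paths exists in a given source tree before a patch is written; `intel_iommu_source` is a hypothetical helper name.

```shell
#!/bin/sh
# intel_iommu_source <kernel-tree>
# Print the Intel IOMMU driver path (relative to the tree) that actually
# exists, so the patch header can name the right file.
# Hypothetical helper for illustration only.
intel_iommu_source() {
    tree=$1
    for rel in drivers/iommu/intel/iommu.c drivers/iommu/intel-iommu.c; do
        if [ -f "$tree/$rel" ]; then
            printf '%s\n' "$rel"
            return 0
        fi
    done
    return 1   # neither layout found
}

# Example:
#   intel_iommu_source /usr/src/pve-kernel/submodules/ubuntu-hirsute
```

Even with the right path, a hunk written against an older driver may not apply cleanly and may need to be regenerated against the newer source.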
 

arh

New Member
Apr 1, 2021
3
1
3
43
I tested the packages thoroughly and prepared a complete rundown of the issue with all the possible fixes & technical reasons.

https://github.com/kiler129/relax-intel-rmrr

Anyone interested can either download precompiled debs or build them from source. After installation, flipping a kernel switch will activate the patch.

Enjoy. Open source FTW :D

Worked like a charm. THANK YOU!!!!! Never purchasing anything HPE ever again!!! I've spent over a month (a few hours here and there) trying to get my GPU to pass through. I even tried the ESXi 6.5 custom HPE image for my ML350P Gen8, but starting the VM would crash/reboot the host. UGH.

Now if only I could get Proxmox to use my (non-passthrough) HP BK835A CN1100R dual-port 10G Ethernet adapter (dual SFP+ PCI card). It works great for PVE, but not for any of the VMs. The VMs can ping the PVE host but not the gateway or any network device. If you have any thoughts I'm all ears.

Thanks again!!!
 

pr0xnmoz

New Member
Aug 4, 2020
3
1
1
93
So I made a new attempt to get passthrough working with my Microserver G8.
I read that XCP-ng had gained support for passthrough via the GUI, but it was only for GPUs.
And as expected it did not like IOMMU, so scratch that.

I have also used an older ESXi version that has the IOMMU patch built in, but it was not stable either.
It would sometimes freeze and reboot the whole server and show critical health on the HPE H240 card, so I will not rely on ESXi.

Moving on to Proxmox, ...again.
I made some progress since last time, when I ended up with a black window on starting the VM; this time I get some actual results from the VM starting up.

I followed the guide on GitHub: https://github.com/kiler129/relax-intel-rmrr
I can boot the VM and I see some initialization from my HPE H240, but it's stuck here, and the main console is flooded with this error message:
Code:
[  341.172029] DMAR: DRHD: handling fault status reg 2
[  341.172668] DMAR: [DMA Write] Request device [07:00.0] PASID ffffffff fault addr f1f8f000 [fault reason 05] PTE Write access is not set
[  341.173988] DMAR: [DMA Write] Request device [07:00.0] PASID ffffffff fault addr f1f8f000 [fault reason 05] PTE Write access is not set


1.PNG

2.PNG

.......

OK, scratch that. I forgot to put the VM in UEFI mode.
The VM is booting now and installing TrueNAS, even though the console is still flooded with the message.
When I shut down the server to insert my hard disks, iLO switched to showing the usual critical health status that appears when trying to pass through; it resets itself to green once you reboot the server through a normal cycle.

3.PNG


Had some issues with grub not booting the correct kernel but solved with this guide
https://meetrix.io/blog/aws/changing-default-ubuntu-kernel.html

Edit grub
Bash:
nano /etc/default/grub

GRUB_DEFAULT="gnulinux-advanced-8176020c-3837-4c38-90c6-c9525f91dc14>gnulinux-5.4.101-1-pve-relaxablermrr-advanced-8176020c-3837-4c38-90c6-c9525f91dc14"

After running update-grub, GRUB will automatically boot the relaxablermrr kernel:

uname -a
Linux pve 5.4.101-1-pve-relaxablermrr #1 SMP PVE 5.4.101-1 (Fri, 26 Feb 2021 13:13:09 +0100) x86_64 GNU/Linux
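
The ids used in GRUB_DEFAULT above can be read out of the generated GRUB configuration rather than typed by hand. This is a minimal sketch, assuming a grub.cfg produced by grub-mkconfig, where each entry carries `$menuentry_id_option '<id>'`; `grub_entry_ids` is a hypothetical helper name.

```shell
#!/bin/sh
# grub_entry_ids <grub.cfg>
# Print every menu entry and submenu id found in the file, in order.
# Hypothetical helper for illustration only.
grub_entry_ids() {
    # Lines of interest look like:
    #   submenu '...' $menuentry_id_option 'gnulinux-advanced-<uuid>' {
    sed -n "s/.*[$]menuentry_id_option '\([^']*\)'.*/\1/p" "$1"
}

# Example:
#   grub_entry_ids /boot/grub/grub.cfg
# GRUB_DEFAULT is then "<submenu id>><kernel entry id>".
```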

I successfully booted into Truenas and behold my HPE H240 Smart Host Bus Adapter is visible and my drives are visible and successfully imported!

4.PNG


The only thing left now is the error message flooding dmesg; the suggested fix seems to have no effect on my system.
https://bugzilla.kernel.org/show_bug.cgi?id=202723

New morning and new opportunity to test log level to see where the log flooding starts.
Using these links as my resource and reference:

https://elinux.org/Debugging_by_printing
https://linuxconfig.org/introduction-to-the-linux-kernel-log-levels


No messages appear when booting the TrueNAS VM with the console log level set to 0 through 3:

Bash:
# echo 0 > /proc/sys/kernel/printk
# echo 1 > /proc/sys/kernel/printk
# echo 2 > /proc/sys/kernel/printk
# echo 3 > /proc/sys/kernel/printk

But the log flooding starts at console log level 4 (KERN_WARNING), which lets messages of level 3 (KERN_ERR) and more severe through:
Bash:
# echo 4 > /proc/sys/kernel/printk

Going back to log level 3 gives me peace in the log:
Bash:
# echo 3 > /proc/sys/kernel/printk
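
The observation fits how the console log level works: the kernel prints a message to the console only when the message's level is numerically lower than the console log level. The sketch below models that rule; `console_prints` is a hypothetical helper name, and the idea that the DMAR fault lines are logged at KERN_ERR (3) is an inference from the behaviour described, not something stated in the post.

```shell
#!/bin/sh
# console_prints <console_loglevel> <message_level>
# Succeeds when a message of the given level would reach the console.
# Hypothetical helper for illustration only.
console_prints() {
    [ "$2" -lt "$1" ]
}

for level in 3 4; do
    if console_prints "$level" 3; then
        echo "console_loglevel=$level: KERN_ERR (3) messages shown"
    else
        echo "console_loglevel=$level: KERN_ERR (3) messages hidden"
    fi
done

# The current console log level is the first number in:
#   cat /proc/sys/kernel/printk
```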

Now for setting log level 3 permanently on boot:
Bash:
nano /etc/default/grub

Add loglevel=3 to:
Bash:
GRUB_CMDLINE_LINUX_DEFAULT="loglevel=3"

Then run update-grub and shutdown -r now.
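
If GRUB_CMDLINE_LINUX_DEFAULT already holds other parameters (for example the switch that activates the RMRR patch), loglevel=3 should be appended rather than replacing them. This is a minimal sketch of such an append; `add_kernel_param` is a hypothetical helper name, and it assumes the variable is on a single line with a double-quoted value and that the parameter contains no `|`, `&` or backslash characters.

```shell
#!/bin/sh
# add_kernel_param <grub-defaults-file> <param>
# Append <param> to GRUB_CMDLINE_LINUX_DEFAULT unless it is already there.
# Hypothetical helper for illustration only.
add_kernel_param() {
    file=$1
    param=$2
    # Give up if the variable is not defined in the file at all.
    grep -q '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" || return 1
    current=$(sed -n 's/^GRUB_CMDLINE_LINUX_DEFAULT="\(.*\)"$/\1/p' "$file")
    # Already present as a whole word? Then there is nothing to do.
    case " $current " in
        *" $param "*) return 0 ;;
    esac
    if [ -n "$current" ]; then
        new="$current $param"
    else
        new=$param
    fi
    sed "s|^GRUB_CMDLINE_LINUX_DEFAULT=.*|GRUB_CMDLINE_LINUX_DEFAULT=\"$new\"|" \
        "$file" > "$file.tmp" &&
        cat "$file.tmp" > "$file" &&
        rm -f "$file.tmp"
}

# Example (as root), followed by update-grub and a reboot:
#   add_kernel_param /etc/default/grub loglevel=3
```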

After the reboot, I started the TrueNAS VM and there is no more kernel log flooding on the screen.
Syslog still shows it in the web UI, but at least I don't have to look at it.
 

gildra

New Member
Apr 15, 2021
10
1
3
29
I am on the edge of my chair. I have been dealing with this s*it for almost 2 years!! This made me hate HPE so much, besides the fact that you cannot even get the downloads for your server without an enterprise license, even for their home-user stuff!!

Finally I found this thread where people are actually talking about the same problem and even proof was posted; I want to cry. Now, before I go ahead and apply the patch (which is really hard to hold off on right now): I am on version 6.3-6. Am I safe to apply the 6.3-4 patch anyway, without creating the next major problem? :D

EDIT:
Holy Sh*t, it works!! Never been happier to hear the Unigine Heaven Benchmark music (which is horrible) lol
Also, yes, you can use the latest release for 6.3-6; in fact you can use it as long as the kernel of the official release is not too far off, as explained in this comment.
 

Assassins88

New Member
Jan 15, 2021
1
0
1
33
Tutorial:

Hi, nice how-to, but I get this question:

1621080531308.png

So what can I do? I used Option B, and with Option A I get the same "error".

Thx 4 your help

Edit:

I used this version of pve:
1621080693010.png
 

ShinigamiLY

Member
Jul 16, 2019
23
1
6
30
Tutorial:


Then edit the control.in file located at /usr/src/pve-kernel so that the generated package is named differently. This prevents errors when updating an up to date vanilla install. Add -removermrr to the pve-kernel article so the line lookes like this:
Hi. I'm not done with your tutorial yet and I really hope it works for me too, because I'm frustrated about the passthrough.
Anyway, I noticed there is a mistake.
The control.in is not located in /usr/src/pve-kernel but in /usr/src/pve-kernel/debian
Just wanted to point it out because I was confused too :)
 
