[TUTORIAL] Compile Proxmox VE with patched intel-iommu driver to remove RMRR check

Whitterquick

Member
Aug 1, 2020
77
4
8
If we have compiled a patched kernel and it's working as expected, is there a way to back up that kernel and reuse it on a fresh install, without having to go through the whole patch process again? Would it be as simple as dropping the kernel file into the correct directory, or does something else need to be done to make it work with a new install?
 

FelixCLC

New Member
Feb 6, 2020
24
7
3
23
You should be able to load it in as a kernel module at startup; you just need the .deb, AFAIK.
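
A minimal sketch of the backup step, assuming the build left its .deb packages in the build directory (/usr/src/pve-kernel in the build log posted in this thread). The function name backup_kernel_debs and the backup path are hypothetical. On the fresh install the copied packages would be installed with dpkg -i rather than dropped into a directory by hand.

```shell
# backup_kernel_debs BUILD_DIR BACKUP_DIR: copy every .deb found directly in
# BUILD_DIR to BACKUP_DIR. Fails if no .deb is found. (Hypothetical helper.)
backup_kernel_debs() {
    build_dir=$1
    backup_dir=$2
    mkdir -p "$backup_dir" || return 1
    found=0
    for deb in "$build_dir"/*.deb; do
        # An unmatched glob stays literal, so skip names that do not exist.
        [ -e "$deb" ] || continue
        cp -a "$deb" "$backup_dir"/ || return 1
        found=1
    done
    [ "$found" -eq 1 ]
}

# On the build host (example paths):
#   backup_kernel_debs /usr/src/pve-kernel /root/patched-kernel
# On the fresh install, after copying the directory over:
#   dpkg -i /root/patched-kernel/*.deb
```

Keeping the packages means the new system only needs dpkg, not the whole build toolchain.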
 

Whitterquick

It's a bit of a shame that whenever someone tries to get to the bottom of these passthrough issues, they end up giving up or finding an alternative solution. Proxmox staff seem to avoid getting involved in these threads for the most part, too. It's something that definitely needs looking at for future versions; even if it isn't resolved, it would be nice to have a clearer understanding of why passthrough fails in any given situation.
 

FelixCLC

Not sure how to take that

Tried different things, including conrep, but it just doesn't seem possible without doing something as drastic as modding the BIOS. For now I'm just out of ideas on this.

As for the Proxmox staff, it's very intentional (and I'd do the same thing if I were a dev on this project). This sort of kernel patching very much increases the attack vectors of the machine in use. As Proxmox is first and foremost designed for the enterprise, where this sort of thing isn't needed, there's no reason to put in time to QA it.

Even if it isn't on by default, or is only accessible via the command line, you're asking for trouble. Not to mention you're technically messing with the memory space used by IPMI, which could introduce further issues.
 

Whitterquick

I myself am thinking of just using a simple distro with QEMU/KVM/libvirt, and I kind of agree with what you say; I just think it's a shame that a real solution was never found.
 

FelixCLC

I've since switched to using Pop!_OS for all of my VM needs, and since it's based on Ubuntu, which itself is based on Debian, you can just directly load the kernel module in and not have to worry about it. For VM management I just use a VNC client that attaches to a virtual display, with virt-manager running on that virtual display.
 

Whitterquick

Can I ask why Pop!_OS is your distro of choice? Any reason why you prefer it to plain Debian or Ubuntu?
 

alexw

New Member
Sep 13, 2020
3
0
1
30
I'm trying to follow this track today in order to patch the godforsaken AMD reset bug workaround, but I've run into a couple of problems:

1) git submodule update --init hung. Or rather, it didn't; it just takes quite a while to clone the Ubuntu upstream. So if you run into this and it looks like your host isn't using much CPU and you're not sure why it has stopped: it's just downloading the Ubuntu source.
2) It looks like zfsonlinux isn't compiling:

Bash:
test -f "submodules/zfsonlinux/Makefile" || git submodule update --init --recursive submodules/zfsonlinux
rm -rf build/modules/pkg-zfs build/modules/tmp pkg-zfs.prepared
mkdir -p build/modules/tmp
cp -a submodules/zfsonlinux/* build/modules/tmp
cd build/modules/tmp; make kernel
make[1]: Entering directory '/usr/src/pve-kernel/build/modules/tmp'
test -f "upstream/README.md" || git submodule update --init
Submodule path '../../../submodules/zfsonlinux': checked out '38e2c8078f63f952e3b24ec2057c7c389543ecb2'
rm -rf zfs-linux_0.8.3 zfs-linux_0.8.3.tmp
cp -a upstream zfs-linux_0.8.3.tmp
cp -a debian zfs-linux_0.8.3.tmp/debian
mv zfs-linux_0.8.3.tmp zfs-linux_0.8.3
tar czf zfs-linux_0.8.3.orig.tar.gz zfs-linux_0.8.3
cd zfs-linux_0.8.3; dpkg-buildpackage -S -uc -us -d
dpkg-buildpackage: info: source package zfs-linux
dpkg-buildpackage: info: source version 0.8.3-pve1
dpkg-buildpackage: info: source distribution pve pmg
dpkg-buildpackage: info: source changed by Proxmox Support Team <support@proxmox.com>
 dpkg-source --before-build .
 debian/rules clean
make[2]: Entering directory '/usr/src/pve-kernel/build/modules/tmp/zfs-linux_0.8.3'
dh clean --with autoreconf,python3,sphinxdoc --parallel
   debian/rules override_dh_auto_clean
make[3]: Entering directory '/usr/src/pve-kernel/build/modules/tmp/zfs-linux_0.8.3'
find . -name .gitignore -delete
rm -rf zfs-0.8.3
dh_auto_clean
make[3]: Leaving directory '/usr/src/pve-kernel/build/modules/tmp/zfs-linux_0.8.3'
   dh_clean
make[2]: Leaving directory '/usr/src/pve-kernel/build/modules/tmp/zfs-linux_0.8.3'
 dpkg-source -b .
dpkg-source: info: using source format '3.0 (quilt)'
dpkg-source: info: building zfs-linux using existing ./zfs-linux_0.8.3.orig.tar.gz
dpkg-source: info: using patch list from debian/patches/series
can't find file to patch at input line 16
Perhaps you used the wrong -p or --strip option?
The text leading up to this was:
--------------------------
|From 0000000000000000000000000000000000000000 Mon Sep 17 00:00:00 2001
|From: Debian ZFS on Linux maintainers
| <pkg-zfsonlinux-devel@alioth-lists.debian.net>
|Date: Wed, 30 Jan 2019 15:12:04 +0100
|Subject: [PATCH] Check-for-META-and-DCH-consistency-in-autoconf
|
|Signed-off-by: Thomas Lamprecht <t.lamprecht@proxmox.com>
|---
| config/zfs-meta.m4 | 34 +++++++++++++++++++++++++++++-----
| 1 file changed, 29 insertions(+), 5 deletions(-)
|
|diff --git a/config/zfs-meta.m4 b/config/zfs-meta.m4
|index b3c1befaa..660d8ccb9 100644
|--- a/config/zfs-meta.m4
|+++ b/config/zfs-meta.m4
--------------------------
No file to patch.  Skipping patch.
3 out of 3 hunks ignored
dpkg-source: info: the patch has fuzz which is not allowed, or is malformed
dpkg-source: info: if patch '0001-Check-for-META-and-DCH-consistency-in-autoconf.patch' is correctly applied by quilt, use 'quilt refresh' to update it
dpkg-source: error: LC_ALL=C patch -t -F 0 -N -p1 -u -V never -E -b -B .pc/0001-Check-for-META-and-DCH-consistency-in-autoconf.patch/ --reject-file=- < zfs-linux_0.8.3.orig.sTihxY/debian/patches/0001-Check-for-META-and-DCH-consistency-in-autoconf.patch subprocess returned exit status 1
dpkg-buildpackage: error: dpkg-source -b . subprocess returned exit status 2
make[1]: *** [Makefile:53: zfs-linux_0.8.3-pve1.dsc] Error 2
make[1]: Leaving directory '/usr/src/pve-kernel/build/modules/tmp'
make: *** [Makefile:99: pkg-zfs.prepared] Error 2
Has anybody run into this recently and figured out a path forward? I tried checking out a few recent commits and didn't get anywhere.
 

FelixCLC

Can I ask why Pop!_OS is your distro of choice? Any reason why you prefer it to plain Debian or Ubuntu?
Using it for now mainly for testing, and mainly because it's a derivative of Debian/Ubuntu. Most of the included packages are things I'd be using anyway, and it's polished pretty damn well. Also, because the patch will be portable to pretty much anything based on Debian buster without needing to recompile, I should be fine switching between pretty much whatever I want.
 

FelixCLC

Do you mean the Navi Reset patch from Level1techs?

If so, I didn't have any issues applying it together with the RMRR patch from this thread; just make sure that the two patches are kept separate. Also, you may have to extend your LVM volume to make sure it all fits; my compile peaked at ~70 GiB, IIRC.
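
A quick way to check the available space before starting the build, sketched as a small shell function. The name has_free_gib is hypothetical, and the 70 GiB figure is only the peak reported above, not a documented requirement.

```shell
# has_free_gib PATH GIB: succeed if the filesystem holding PATH has at
# least GIB gibibytes available. (Hypothetical helper name.)
has_free_gib() {
    path=$1
    need_gib=$2
    # df -P prints one POSIX-format line per filesystem and -k reports
    # 1024-byte blocks; column 4 of that line is the available space.
    avail_kib=$(df -Pk "$path" 2>/dev/null | awk 'NR == 2 { print $4 }')
    # Return 2 if df gave nothing usable (for example, PATH does not exist).
    [ -n "$avail_kib" ] || return 2
    [ "$avail_kib" -ge $((need_gib * 1024 * 1024)) ]
}

# Usage (the build path is the one that appears in the log in this thread):
#   has_free_gib /usr/src 70 || echo "extend the volume before building" >&2
```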
 

alexw

That's the one.

To be clear, the error above doesn't seem to indicate anything wrong with the patch, per se. It's ZFS that's failing. But I'll double-check that I've got enough space on that disk; that could always cause weird problems.
 

FelixCLC

I read it as more of a problem with pulling ZoL down: some of the files aren't where they're expected to be. It almost looks like someone did a merge without double-checking dependencies.

More of an issue with Make than anything.
 

alexxedo

New Member
Sep 16, 2020
9
1
3
37
Hi

Just used the tutorial on my ProLiant Gen8.
The only change I had to make was to check out the 5.3 branch of the kernel, as some changes to vfio handling in 5.4 prevent the VMs from starting; you end up with a different error message (memory listener initialization failed: Region isa-bios: vfio_dma_map).
It seems that this one cannot be worked around for the time being.
 

FelixCLC

For G8 there's an official process for disabling this! Check this link: https://www.jimmdenton.com/proliant-intel-dpdk/
 

alexxedo

Unfortunately that's not working on the MicroServer Gen8. I tried that before recompiling and was unable to pass the device through.
 

FelixCLC

Mind sharing the error you received? I've been trying to get conrep working on G6/G7, which are both officially supported, but haven't had success with it. If the errors align, it may point to conrep not correctly writing data, or to our input parameters not being parsed correctly.
 

alexxedo

conrep is working; it changes the endpoints without any issue.
The patch on 5.4 works too. However, when I start the VM I get this:
Code:
Virtual Environment 6.2-4
Search
Virtual Machine 101 (markab) on node 'pve'
Server View
Logs
()
kvm: -device vfio-pci,host=0000:07:00.0,id=hostpci0,bus=pci.0,addr=0x10: VFIO_MAP_DMA failed: Invalid argument
kvm: -device vfio-pci,host=0000:07:00.0,id=hostpci0,bus=pci.0,addr=0x10: vfio 0000:07:00.0: failed to setup container for group 11: memory listener initialization failed: Region pc.bios: vfio_dma_map(0x7ff9c4467200, 0xe0000, 0x20000, 0x7ff93e220000) = -22 (Invalid argument)
TASK ERROR: start failed: QEMU exited with code 1
My verify.dat from conrep is:
XML:
<?xml version="1.0" encoding="UTF-8"?>
<!--generated by conrep version 5.5.0.0-->
<Conrep version="5.5.0.0" originating_platform="ProLiant MicroServer Gen8" originating_family="J06" originating_romdate="05/21/2018" originating_processor_manufacturer="Intel">
  <Section name="RMRDS_Slot1" helptext=".">Endpoints_Excluded</Section>
  <Section name="RMRDS_Slot2" helptext=".">Endpoints_Included</Section>
  <Section name="RMRDS_Slot3" helptext=".">Endpoints_Included</Section>
  <Section name="RMRDS_Slot4" helptext=".">Endpoints_Included</Section>
  <Section name="RMRDS_Slot5" helptext=".">Endpoints_Included</Section>
  <Section name="RMRDS_Slot6" helptext=".">Endpoints_Included</Section>
  <Section name="RMRDS_Slot7" helptext=".">Endpoints_Included</Section>
  <Section name="RMRDS_Slot8" helptext=".">Endpoints_Included</Section>
  <Section name="RMRDS_Slot9" helptext=".">Endpoints_Included</Section>
  <Section name="RMRDS_Slot10" helptext=".">Endpoints_Included</Section>
  <Section name="RMRDS_Slot11" helptext=".">Endpoints_Included</Section>
  <Section name="RMRDS_Slot12" helptext=".">Endpoints_Included</Section>
  <Section name="RMRDS_Slot13" helptext=".">Endpoints_Included</Section>
  <Section name="RMRDS_Slot14" helptext=".">Endpoints_Included</Section>
  <Section name="RMRDS_Slot15" helptext=".">Endpoints_Included</Section>
  <Section name="RMRDS_Slot16" helptext=".">Endpoints_Included</Section>
</Conrep>
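
For orientation, the numbers in the vfio_dma_map line of the error can be decoded with shell arithmetic. This sketch assumes the arguments are printed in the order (container, IOVA, size, host address), which is how QEMU appears to format this message; dma_region is a hypothetical helper name.

```shell
# dma_region IOVA SIZE: print the address range covered by a mapping and
# its size in KiB. Both arguments may be given in hex (0x...).
dma_region() {
    iova=$1
    size=$2
    end=$(($iova + $size - 1))
    printf '0x%x-0x%x (%d KiB)\n' "$iova" "$end" $(($size / 1024))
}

# Values copied from the error message above.
dma_region 0xe0000 0x20000
# prints: 0xe0000-0xfffff (128 KiB)
```

So the mapping that fails is the 128 KiB range ending just below the 1 MiB mark, which is the range the error labels pc.bios.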
 

FelixCLC

So that's pointing to a memory mapping error of some sort. Are you passing through the entire device? Otherwise, it looks like you're not alone in this, per https://forum.level1techs.com/t/loo...-to-troubleshoot-error-vfio-dma-map-22/153539

From the second line of QEMU's complaints, it looks like there might be errors passing the ROM of the device. Something you could try is dumping the ROM, then passing the ROM as a rom bar argument via the .conf.

I had some issues with a NIC and a GPU that were solved this way.

Also, if you check out the Level1Techs thread further up, you'll find a process for doing it.
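
The ROM dump described here is commonly done through sysfs. A sketch, assuming the device exposes a rom file under /sys/bus/pci/devices/ and that the commands run as root; dump_pci_rom is a hypothetical helper name, and the romfile line in the closing comment is an assumption about how the dump would be referenced from the VM config, not something confirmed in this thread.

```shell
# dump_pci_rom DEVICE_DIR OUTPUT: copy a PCI device's option ROM to OUTPUT.
# DEVICE_DIR is the device's sysfs directory, for example
# /sys/bus/pci/devices/0000:07:00.0 (address taken from the error above).
dump_pci_rom() {
    dev_dir=$1
    out=$2
    [ -e "$dev_dir/rom" ] || { echo "no rom file in $dev_dir" >&2; return 1; }
    echo 1 > "$dev_dir/rom" || return 1    # enable reading the ROM
    status=0
    cat "$dev_dir/rom" > "$out" || status=$?
    echo 0 > "$dev_dir/rom"                # disable it again
    return "$status"
}

# Usage (as root); the output path and file name are examples:
#   dump_pci_rom /sys/bus/pci/devices/0000:07:00.0 /usr/share/kvm/device.rom
# Assumed config line in /etc/pve/qemu-server/101.conf:
#   hostpci0: 07:00.0,romfile=device.rom
```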
 
