I want to pass my Intel iGPU through to a VM via PCIe passthrough for hardware-accelerated media transcoding, but whenever I enable passthrough, processes in the VM hit random segmentation faults. I previously had this set up on an Arch Linux VM and never experienced any issues, but now that I've passed the iGPU through to an Ubuntu VM, the behaviour reliably occurs after some time.
My best guess so far is that the VM's Linux kernel doesn't map virtual addresses correctly and starts handing out memory addresses that are actually addresses on the PCI bus and not RAM, but I don't know why it's happening or what I'm doing wrong. Because it worked (and, as far as I could test, still works) with Arch Linux, there might be a difference in how the two VMs or operating systems are set up that's causing this.
So far, searching the internet and this forum hasn't turned up anyone with the same symptoms, though it's been difficult to search for because there's no single error message that always occurs.
Reproducing it is actually quite simple:
- Stop Ubuntu VM
- Enable GPU passthrough (using the "pcie" option)
- Start VM
- Run hardware transcode (e.g. "sudo /usr/lib/jellyfin-ffmpeg/ffmpeg -hwaccel qsv -hwaccel_output_format qsv -i ./big_buck_bunny_1080p_surround.avi -c:v h264_qsv -b:v 5M -look_ahead 1 output.mp4"). This works just fine and "intel_gpu_top" shows the iGPU being used.
- Run some other, non-GPU workloads
- Eventually something runs into memory issues and usually segfaults
Code:
$ sudo memtester 1G
memtester version 4.6.0 (64-bit)
Copyright (C) 2001-2020 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).
pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 1024MB (1073741824 bytes)
got 1024MB (1073741824 bytes), trying mlock ...locked.
Loop 1:
Stuck Address : testing 0FAILURE: possible bad address line at offset 0x0000000000ffe5e0.
Skipping to next test...
Random Value : FAILURE: 0x0001010100010101 != 0x5f9d0efa6ef7cdca at offset 0x0000000000ffe5e0.
FAILURE: 0x0001010100010101 != 0xfeefdb60e7ef3805 at offset 0x0000000000ffe5e8.
FAILURE: 0x0001010100010101 != 0xbfcf30c6bb6c9a6d at offset 0x0000000000ffe5f0.
FAILURE: 0x0001010100010101 != 0xf7a7eec0bfff884a at offset 0x0000000000ffe5f8.
FAILURE: 0x0001010100010101 != 0x6ffe3f578fff976c at offset 0x0000000001843de0.
FAILURE: 0x0001010100010101 != 0x77ff1e1bb4481854 at offset 0x0000000001843de8.
FAILURE: 0x0001010100010101 != 0x1ef6c4b23fffc4c1 at offset 0x0000000001843df0.
FAILURE: 0x0001010100010101 != 0xbd6ea9bf5f54de91 at offset 0x0000000001843df8.
...
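To see whether the failures cluster in a few pages, the memtester output can be grouped by 4 KiB page. This is only a minimal sketch: the function name summarize_failures is made up for illustration, and it assumes the log has been saved to a file (memtester.log below is a hypothetical name). Note that the offsets are relative to memtester's own buffer, so they don't identify guest-physical addresses.

```shell
# Group memtester failure offsets by 4 KiB page and count hits per page.
# Reads memtester output on stdin, prints "<page offset> <count>" lines.
# summarize_failures is a hypothetical helper name, not part of memtester.
summarize_failures() {
    grep -o 'at offset 0x[0-9a-fA-F]*' \
        | awk '{print $3}' \
        | while read -r off; do
              # Clear the low 12 bits to get the start of the 4 KiB page.
              printf '0x%x\n' "$(( off & ~0xfff ))"
          done \
        | LC_ALL=C sort \
        | uniq -c \
        | awk '{print $2, $1}'
}
```

Usage would be something like "summarize_failures < memtester.log". Run on just the excerpt above, it reports two pages: 0x1843000 with 4 hits and 0xffe000 with 5 hits (the Stuck Address line counts as one hit).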
Some details about the machine:
Code:
$ cat /etc/os-release
PRETTY_NAME="Ubuntu 24.04.3 LTS"
NAME="Ubuntu"
VERSION_ID="24.04"
VERSION="24.04.3 LTS (Noble Numbat)"
VERSION_CODENAME=noble
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=noble
LOGO=ubuntu-logo
Code:
$ uname -a
Linux newhost 6.8.0-85-generic #85-Ubuntu SMP PREEMPT_DYNAMIC Thu Sep 18 15:26:59 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Code:
$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-6.8.0-85-generic root=UUID=d36f7414-beb7-45e0-900e-9ab79cdbcb2d ro console=tty1 console=ttyS0
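Since the Arch Linux VM works and the Ubuntu one doesn't, one cheap comparison is the kernel command line of the two guests. The following is a rough sketch, assuming each guest's /proc/cmdline has been saved to a file first (the file names arch.cmdline and ubuntu.cmdline are hypothetical, as are the helper names). It splits on spaces only, so a quoted parameter containing spaces would be split incorrectly.

```shell
# Print one kernel parameter per line, sorted, from a saved /proc/cmdline.
split_cmdline() {
    tr -s ' ' '\n' < "$1" | grep -v '^$' | LC_ALL=C sort -u
}

# Show parameters that appear in only one of two saved command lines.
# "<" marks parameters only in the first file, ">" only in the second.
diff_cmdline() {
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    split_cmdline "$1" > "$tmp_a"
    split_cmdline "$2" > "$tmp_b"
    LC_ALL=C comm -23 "$tmp_a" "$tmp_b" | sed 's/^/< /'
    LC_ALL=C comm -13 "$tmp_a" "$tmp_b" | sed 's/^/> /'
    rm -f "$tmp_a" "$tmp_b"
}
```

For example, after running "cat /proc/cmdline > ubuntu.cmdline" in the Ubuntu guest and the equivalent in the Arch guest, "diff_cmdline arch.cmdline ubuntu.cmdline" lists the differing parameters. BOOT_IMAGE and root will always differ, so only the remaining lines are interesting.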
I've also attached a journal that contains all logs since a fresh boot.
A few more observations/notes:
- The error isn't 100% reproducible. My guess is that it depends on which addresses Linux allocates for requested memory. Running the VM under load for some time will often eventually lead to an error. After that, it's quite easy to reproduce.
- The LLMs I've used for troubleshooting insist that I need special kernel parameters to block out certain memory regions so that Linux recognises them as belonging to PCIe. They also insist that running Arch Linux stably for multiple years in this setup without a single issue was pure luck. I couldn't find any reasonable search results for the things they suggested, like "pci=nocrs" or "memmap"; neither is documented anywhere for PCIe passthrough, so I don't trust that this is the right answer. And because the problem is so unreliable to reproduce, it can take days to test each change.
- The Ubuntu image is a cloud-init image (https://cloud-images.ubuntu.com/noble/20250805/noble-server-cloudimg-amd64.img to be exact). It might be that something in its default configuration causes this, which would mean my setup is just unusual enough that too few people have encountered it.
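Because each change can take days to test by hand, a small loop that keeps re-running a check until it fails might shorten the cycle. This is a sketch under assumptions: run_until_failure is a made-up helper name, and it relies on the check command exiting non-zero when it detects a problem (which, as far as I know, memtester does).

```shell
# Run a command repeatedly until it fails or the iteration limit is hit.
# Usage: run_until_failure <max iterations> <command> [args...]
# Returns 1 as soon as the command fails, 0 if every iteration passed.
# run_until_failure is a hypothetical helper, not an existing tool.
run_until_failure() {
    limit=$1
    shift
    i=1
    while [ "$i" -le "$limit" ]; do
        # Discard the command's own output; only its exit status matters.
        if ! "$@" > /dev/null 2>&1; then
            echo "failed on iteration $i"
            return 1
        fi
        i=$((i + 1))
    done
    echo "no failure in $limit iterations"
    return 0
}
```

Something like "run_until_failure 100 sudo memtester 1G 1" would then run up to 100 single-loop memtester passes and stop at the first one that reports an error.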