Kernel 5.13.19-4 crashes when launching VM with HBA passthrough

Lefuneste

Hi there

Just bumped into this issue. I have been doing some diagnostics.

With kernel Linux 5.13.19-4-pve, starting a TrueNAS VM with PCIe passthrough of a Dell HBA330 produces the dmesg output below. Other VMs continue to run, but I cannot reboot the host: it hangs indefinitely after closing all KVM processes. Please note that this does NOT occur with kernel 5.13.19-3-pve; rolling back corrects 100% of the symptoms. This is VERY scary. I thought my HBA had died on me... or the Exos X18 I am trying to install... The dmesg shown here is with only the HBA in the system and no disks connected to it. With disks connected I get even scarier messages after the ones presented here, basically reporting that all the disks have failed, with very alarming errors. Obviously, once the kernel is rolled back, none of this occurs.
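In case it helps anyone reproducing this: the kernel messages from the crashed boot can still be pulled out after rolling back, assuming persistent journaling is enabled on the host (a generic sketch, not specific to my setup; the output file name is just an example):

Code:
# read kernel messages (dmesg) from the previous boot; requires a persistent journal
journalctl -k -b -1

# or dump the current boot's kernel log to a file right after the oops appears
dmesg > /root/oops-5.13.19-4.txt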

I have no clue which kernel change brings such violent behavior. Maybe a BIOS stability issue? I am running UEFI BIOS version F14 on a Gigabyte B550M AORUS PRO-P, with an AMD Ryzen 9 5900X and 64 GB of Kingston ECC UDIMM server RAM (2 modules of 32 GB).

I also think it could come from the way the BIOS handles the IOMMU: I had to use the "multifunction" keyword of the pcie_acs_override boot parameter to split the IOMMU groups into granular enough units, otherwise most devices would end up in a single group (IOMMU group 15). The grouping can be checked as shown below.
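For completeness, this is a quick way to list how the IOMMU groups end up being split (a generic sketch; the HBA address 0000:0b:00.0 is taken from my VM config further down):

Code:
# list every PCI device together with its IOMMU group
for g in /sys/kernel/iommu_groups/*; do
    echo "IOMMU group ${g##*/}:"
    for d in "$g"/devices/*; do
        lspci -nns "${d##*/}"
    done
done

# the HBA (0000:0b:00.0 here) should sit alone in its group for clean passthrough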

After rolling back to kernel 5.13.19-3-pve, the VM runs rock solid. It is very strange that a minor kernel update brings such a massive malfunction; I have no memory of this happening in the last 8 years of using Proxmox.
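For anyone else hitting this, the rollback itself is just a matter of booting the older kernel. A minimal sketch; note that the pin subcommand only exists on recent proxmox-boot-tool versions, otherwise pick the older entry manually in the boot menu:

Code:
# list the kernels currently synced to the ESPs
proxmox-boot-tool kernel list

# pin the known-good kernel as the default (recent proxmox-boot-tool only)
proxmox-boot-tool kernel pin 5.13.19-3-pve
reboot

# verify after reboot
uname -r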

According to the https://wiki.ubuntu.com/DebuggingKernelOops troubleshooting guide, the offending call appears to be:

RIP: 0010:blk_mq_cancel_work_sync+0x5/0x60

This leads to several very recent bug reports about a Linux kernel and systemd issue when releasing disks, which is basically what is happening here, as I am moving disks (unmounted, of course) from the running system to a VM with exclusive access to them.
https://bugs.launchpad.net/ubuntu/impish/+source/linux/+bug/1960034
https://www.mail-archive.com/ubuntu-bugs@lists.ubuntu.com/msg5995521.html
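If someone wants to dig further, the RIP line from the oops can be resolved to a source location. A rough sketch, assuming a vmlinux with debug symbols for the exact running kernel is available (e.g., from a matching debug package or a local build of the kernel source):

Code:
# resolve the symbol+offset reported in the oops to a file:line
# (faddr2line ships in the kernel source tree under scripts/)
./scripts/faddr2line vmlinux blk_mq_cancel_work_sync+0x5/0x60

# alternative: let gdb print the surrounding source lines
gdb -batch -ex 'list *(blk_mq_cancel_work_sync+0x5)' vmlinux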


Code:
[  148.023082] BUG: kernel NULL pointer dereference, address: 0000000000000030
[  148.023087] #PF: supervisor read access in kernel mode
[  148.023089] #PF: error_code(0x0000) - not-present page
[  148.023091] PGD 0 P4D 0
[  148.023093] Oops: 0000 [#1] SMP NOPTI
[  148.023095] CPU: 17 PID: 59885 Comm: task UPID:pve:0 Tainted: P           O      5.13.19-4-pve #1
[  148.023098] Hardware name: Gigabyte Technology Co., Ltd. B550M AORUS PRO-P/B550M AORUS PRO-P, BIOS F14 01/04/2022
[  148.023100] RIP: 0010:blk_mq_cancel_work_sync+0x5/0x60
[  148.023105] Code: 4c 89 f7 e8 9d 13 00 00 e9 4b ff ff ff 48 8b 45 d0 49 89 85 e8 00 00 00 31 c0 eb a0 b8 ea ff ff ff eb c1 66 90 0f 1f 44 00 00 <48> 83 7f 30 00 74 45 55 48 89 e5 41 54 49 89 fc 48 8d bf f8 04 00
[  148.023108] RSP: 0018:ffffb28fe97639b8 EFLAGS: 00010246
[  148.023110] RAX: 0000000000000000 RBX: ffff9da9cef44300 RCX: 0000000000000282
[  148.023112] RDX: ffff9da9d41d5440 RSI: ffffffff853a7766 RDI: 0000000000000000
[  148.023113] RBP: ffffb28fe97639d0 R08: 0000000000000282 R09: ffffb28fe9763a00
[  148.023115] R10: 0000000000000000 R11: ffff9da9c85d0700 R12: ffff9da9d41d51a0
[  148.023117] R13: 0000000000000000 R14: 0000000000000000 R15: ffff9da9d0a71198
[  148.023119] FS:  00007f2454006280(0000) GS:ffff9db8cea80000(0000) knlGS:0000000000000000
[  148.023121] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  148.023123] CR2: 0000000000000030 CR3: 000000038ef44000 CR4: 0000000000750ee0
[  148.023125] PKRU: 55555554
[  148.023126] Call Trace:
[  148.023128]  <TASK>
[  148.023129]  ? disk_release+0x24/0xa0
[  148.023133]  device_release+0x3b/0xa0
[  148.023136]  kobject_put+0x94/0x1b0
[  148.023139]  put_device+0x13/0x20
[  148.023141]  put_disk+0x1b/0x20
[  148.023143]  sg_device_destroy+0x54/0x90
[  148.023146]  sg_remove_device+0x128/0x170
[  148.023149]  device_del+0x138/0x3e0
[  148.023151]  ? kobject_put+0xae/0x1b0
[  148.023153]  device_unregister+0x1b/0x60
[  148.023155]  __scsi_remove_device+0x110/0x150
[  148.023158]  scsi_remove_target+0x1c9/0x260
[  148.023161]  ? sas_port_delete+0x150/0x150 [scsi_transport_sas]
[  148.023166]  sas_rphy_remove+0x7b/0x80 [scsi_transport_sas]
[  148.023170]  sas_port_delete+0x2d/0x150 [scsi_transport_sas]
[  148.023174]  ? sas_port_delete+0x150/0x150 [scsi_transport_sas]
[  148.023178]  do_sas_phy_delete+0x3c/0x40 [scsi_transport_sas]
[  148.023182]  device_for_each_child+0x5e/0xa0
[  148.023184]  sas_remove_host+0x28/0x50 [scsi_transport_sas]
[  148.023188]  scsih_remove+0x122/0x3e0 [mpt3sas]
[  148.023197]  pci_device_remove+0x3e/0xb0
[  148.023200]  __device_release_driver+0x181/0x240
[  148.023202]  device_driver_detach+0x41/0xa0
[  148.023205]  unbind_store+0x11e/0x130
[  148.023207]  drv_attr_store+0x24/0x30
[  148.023210]  sysfs_kf_write+0x3f/0x50
[  148.023214]  kernfs_fop_write_iter+0x13b/0x1d0
[  148.023216]  new_sync_write+0x114/0x1a0
[  148.023220]  vfs_write+0x1c5/0x260
[  148.023222]  ksys_write+0x67/0xe0
[  148.023224]  __x64_sys_write+0x1a/0x20
[  148.023227]  do_syscall_64+0x61/0xb0
[  148.023230]  ? syscall_exit_to_user_mode+0x27/0x50
[  148.023232]  ? __x64_sys_write+0x1a/0x20
[  148.023234]  ? do_syscall_64+0x6e/0xb0
[  148.023236]  ? do_syscall_64+0x6e/0xb0
[  148.023238]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  148.023241] RIP: 0033:0x7f2454219fb3
[  148.023243] Code: 75 05 48 83 c4 58 c3 e8 cb 41 ff ff 66 2e 0f 1f 84 00 00 00 00 00 90 64 8b 04 25 18 00 00 00 85 c0 75 14 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 55 c3 0f 1f 40 00 48 83 ec 28 48 89 54 24 18
[  148.023246] RSP: 002b:00007ffda5e78e28 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  148.023249] RAX: ffffffffffffffda RBX: 000055c9662865d0 RCX: 00007f2454219fb3
[  148.023250] RDX: 000000000000000c RSI: 000055c9662865d0 RDI: 000000000000000d
[  148.023252] RBP: 000000000000000c R08: 0000000000000000 R09: 000055c95e8173b0
[  148.023254] R10: 000055c96623ab10 R11: 0000000000000246 R12: 000055c9662845b0
[  148.023255] R13: 000055c95f0862a0 R14: 000000000000000d R15: 000055c9662845b0
[  148.023257]  </TASK>
[  148.023258] Modules linked in: ebtable_filter ebtables ip_set ip6table_raw iptable_raw ip6table_filter ip6_tables iptable_filter bpfilter nf_tables bonding tls softdog nfnetlink_log nfnetlink intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd kvm_amd kvm crct10dif_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd vhost_net vhost rapl vhost_iotlb tap snd_hda_intel ib_iser snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec gigabyte_wmi wmi_bmof pcspkr joydev rdma_cm input_leds efi_pstore snd_hda_core iw_cm snd_hwdep k10temp snd_pcm ccp snd_timer snd soundcore ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi mac_hid vfio_pci vfio_virqfd irqbypass vfio_iommu_type1 vfio drm sunrpc ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zzstd(O) zlua(O) zavl(PO) icp(PO) zcommon(PO) znvpair(PO) spl(O) btrfs blake2b_generic xor zstd_compress raid6_pq libcrc32c hid_generic usbkbd uas usbhid usb_storage hid crc32_pclmul mpt3sas xhci_pci xhci_pci_renesas ahci raid_class
[  148.023303]  r8169 nvme i2c_piix4 scsi_transport_sas realtek libahci xhci_hcd nvme_core wmi gpio_amdpt gpio_generic
[  148.023319] CR2: 0000000000000030
[  148.023321] ---[ end trace 0b196b2a4156e5d1 ]---
[  148.154211] RIP: 0010:blk_mq_cancel_work_sync+0x5/0x60
[  148.154215] Code: 4c 89 f7 e8 9d 13 00 00 e9 4b ff ff ff 48 8b 45 d0 49 89 85 e8 00 00 00 31 c0 eb a0 b8 ea ff ff ff eb c1 66 90 0f 1f 44 00 00 <48> 83 7f 30 00 74 45 55 48 89 e5 41 54 49 89 fc 48 8d bf f8 04 00
[  148.154217] RSP: 0018:ffffb28fe97639b8 EFLAGS: 00010246
[  148.154218] RAX: 0000000000000000 RBX: ffff9da9cef44300 RCX: 0000000000000282
[  148.154220] RDX: ffff9da9d41d5440 RSI: ffffffff853a7766 RDI: 0000000000000000
[  148.154221] RBP: ffffb28fe97639d0 R08: 0000000000000282 R09: ffffb28fe9763a00
[  148.154222] R10: 0000000000000000 R11: ffff9da9c85d0700 R12: ffff9da9d41d51a0
[  148.154223] R13: 0000000000000000 R14: 0000000000000000 R15: ffff9da9d0a71198
[  148.154224] FS:  00007f2454006280(0000) GS:ffff9db8cea80000(0000) knlGS:0000000000000000
[  148.154225] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  148.154226] CR2: 0000000000000030 CR3: 000000038ef44000 CR4: 0000000000750ee0
[  148.154227] PKRU: 55555554

My /etc/kernel/cmdline

root=ZFS=rpool/ROOT/pve-1 boot=zfs pcie_acs_override=downstream,multifunction iommu=pt amd_iommu=on vfio_iommu_type1.allow_unsafe_interrupts=1 kvm_amd.npt=1 kvm_amd.avic=1 video=efifb:off quiet
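For context, on this ZFS-root install a change to /etc/kernel/cmdline still has to be written out to the boot entries and then verified after a reboot. A minimal sketch (the grep pattern is just an example):

Code:
# copy the updated cmdline into the boot loader entries on the ESPs
proxmox-boot-tool refresh

# after a reboot, confirm the parameters actually reached the kernel
cat /proc/cmdline
dmesg | grep -i -e iommu -e amd-vi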



My VM definition

agent: 1
balloon: 0
bios: ovmf
boot: order=scsi0;ide2;net0
cores: 4
cpu: host
hostpci0: 0000:0b:00,pcie=1
ide2: local:iso/TrueNAS-12.0-U8.iso,media=cdrom
machine: q35
memory: 16384
meta: creation-qemu=6.1.1,ctime=1644029739
name: TRUENAS
net0: virtio=6A:9C:F0:8C:D0:10,bridge=vmbr0
numa: 0
ostype: other
scsi0: rpool-vm:vm-300-disk-0,backup=0,size=60G,ssd=1
scsihw: virtio-scsi-pci
smbios1: uuid=eb5cb9fb-521a-4543-bd22-fb769e3b3b1c
sockets: 1
vga: virtio
vmgenid: 8d0bac79-1a96-49d8-8416-bfbfdd5b84f1
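For completeness, the hostpci0 line above is equivalent to setting the passthrough via the CLI, and the vfio-pci binding can be double-checked on the host. A small sketch, with VM ID 300 taken from the scsi0 disk name above:

Code:
# attach all functions of 0000:0b:00 to VM 300 as a PCIe device
qm set 300 --hostpci0 0000:0b:00,pcie=1

# verify on the host which driver currently claims the HBA
lspci -nnk -s 0b:00
# "Kernel driver in use: vfio-pci" is expected while the VM is running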
 
Thanks for your pointers here. FYI, we are currently building a newer kernel release with the issue from the linked report addressed; it should be available in the next few hours on the test repository.
 
FYI, a kernel with the referenced issue fixed is available as of now on pve-no-subscription as pve-kernel-5.13.19-4-pve with package version 5.13.19-9 (i.e., same ABI but a newer package version; you will still need to reboot).
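For anyone following along, pulling the fixed build is a normal package update. A minimal sketch, assuming the pve-no-subscription repository is already configured on the host:

Code:
# fetch the updated package lists and install the rebuilt kernel
apt update
apt install pve-kernel-5.13.19-4-pve

# confirm the package version is 5.13.19-9, then reboot into it
apt policy pve-kernel-5.13.19-4-pve
reboot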
 
I confirm that the new kernel version seems to correct the issue. Amazing work, guys. Your dedication is very much appreciated!
 
