Hi all, I am having an issue I haven't had before. I'm unsure why or exactly when it started, but it has been going on for a while and it is now driving me crazy.
It is randomly rebooting itself and causing all sorts of issues on my setup.
My TrueNAS VM, which was stable from Proxmox 7.x until sometime in 8.x across 3 different hardware changes, is no longer behaving itself. It looks to be related to the passthrough settings, but I can't quite figure it out on my own.
So I am asking for some assistance, please, as I suspect (and hope) I am overlooking something simple.
Thank you!
Hardware
Intel i5-14600K
Asus W680 IPMI board with IOMMU and VT-d passthrough options enabled, in UEFI mode
HBA is an LSI 9500 (SAS3808 controller)
BIOS Settings

The SAS3808 is installed in a slot wired to the CPU PCIe lanes (bottom card in this pic), so I believe it is not going through the W680 chipset.
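
To double-check the slot wiring rather than rely on the board layout, the bridge chain above the HBA can be read from sysfs. This is a minimal sketch, not something taken from the Proxmox docs: the function name `pci_parent_chain` and its optional second argument (an alternate sysfs root, only there so it can be tried against a fake tree) are my own assumptions.

```shell
# Print every PCI address on the path from the root complex down to a device.
# $1 = full PCI address (e.g. 0000:02:00.0)
# $2 = sysfs root, defaults to /sys (hypothetical parameter, for testing only)
pci_parent_chain() {
    dev="$1"
    root="${2:-/sys}"
    # The resolved path looks like .../pci0000:00/<root port>/<device>
    path=$(readlink -f "$root/bus/pci/devices/$dev") || return 1
    # Keep only the path components that are PCI addresses.
    printf '%s\n' "$path" | tr '/' '\n' \
        | grep -E '^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$'
}

# Example call on the host:
#   pci_parent_chain 0000:02:00.0
```

The first address printed is the root port the card sits behind. My understanding is that on this platform the 00:01.x and 00:06.x ports belong to the CPU while 00:1b.x, 00:1c.x and 00:1d.x belong to the chipset, but treat that mapping as an assumption to verify against the board manual.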

Code:
root@blofeld:~# cat /etc/kernel/cmdline
root=ZFS=rpool/ROOT/pve-1 boot=zfs quiet intel_iommu=on iommu=pt
Code:
root@blofeld:~# dmesg | grep -e DMAR -e IOMMU -e AMD-Vi
[ 0.004723] ACPI: DMAR 0x0000000070458000 000088 (v01 INTEL EDK2 00000002 01000013)
[ 0.004750] ACPI: Reserving DMAR table memory at [mem 0x70458000-0x70458087]
[ 0.104599] DMAR: IOMMU enabled
[ 0.240446] DMAR: Host address width 39
[ 0.240447] DMAR: DRHD base: 0x000000fed90000 flags: 0x0
[ 0.240454] DMAR: dmar0: reg_base_addr fed90000 ver 4:0 cap 1c0000c40660462 ecap 29a00f0505e
[ 0.240455] DMAR: DRHD base: 0x000000fed91000 flags: 0x1
[ 0.240458] DMAR: dmar1: reg_base_addr fed91000 ver 5:0 cap d2008c40660462 ecap f050da
[ 0.240459] DMAR: RMRR base: 0x0000007c000000 end: 0x000000807fffff
[ 0.240461] DMAR-IR: IOAPIC id 2 under DRHD base 0xfed91000 IOMMU 1
[ 0.240462] DMAR-IR: HPET id 0 under DRHD base 0xfed91000
[ 0.240462] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
[ 0.242038] DMAR-IR: Enabled IRQ remapping in x2apic mode
[ 0.432849] pci 0000:00:02.0: DMAR: Skip IOMMU disabling for graphics
[ 0.514021] DMAR: No ATSR found
[ 0.514021] DMAR: No SATC found
[ 0.514022] DMAR: IOMMU feature fl1gp_support inconsistent
[ 0.514023] DMAR: IOMMU feature pgsel_inv inconsistent
[ 0.514024] DMAR: IOMMU feature nwfs inconsistent
[ 0.514024] DMAR: IOMMU feature dit inconsistent
[ 0.514025] DMAR: IOMMU feature sc_support inconsistent
[ 0.514026] DMAR: IOMMU feature dev_iotlb_support inconsistent
[ 0.514026] DMAR: dmar0: Using Queued invalidation
[ 0.514029] DMAR: dmar1: Using Queued invalidation
[ 0.514903] DMAR: Intel(R) Virtualization Technology for Directed I/O
Code:
root@blofeld:~# lspci -nnk | grep -A3 "02:00.0"
02:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI Fusion-MPT 12GSAS/PCIe Secure SAS38xx [1000:00e6]
        Subsystem: Broadcom / LSI 9500-8e Tri-Mode HBA [1000:4080]
        Kernel driver in use: vfio-pci
        Kernel modules: mpt3sas
Code:
root@blofeld:~# find /sys/kernel/iommu_groups/ -type l
/sys/kernel/iommu_groups/17/devices/0000:02:00.0
/sys/kernel/iommu_groups/7/devices/0000:00:16.0
/sys/kernel/iommu_groups/7/devices/0000:00:16.3
/sys/kernel/iommu_groups/15/devices/0000:00:1f.0
/sys/kernel/iommu_groups/15/devices/0000:00:1f.5
/sys/kernel/iommu_groups/15/devices/0000:00:1f.4
/sys/kernel/iommu_groups/5/devices/0000:00:14.2
/sys/kernel/iommu_groups/5/devices/0000:00:14.0
/sys/kernel/iommu_groups/13/devices/0000:00:1c.3
/sys/kernel/iommu_groups/3/devices/0000:00:01.1
/sys/kernel/iommu_groups/11/devices/0000:00:1b.4
/sys/kernel/iommu_groups/1/devices/0000:00:00.0
/sys/kernel/iommu_groups/18/devices/0000:03:00.0
/sys/kernel/iommu_groups/8/devices/0000:00:17.0
/sys/kernel/iommu_groups/16/devices/0000:01:00.0
/sys/kernel/iommu_groups/6/devices/0000:00:15.1
/sys/kernel/iommu_groups/6/devices/0000:00:15.2
/sys/kernel/iommu_groups/6/devices/0000:00:15.0
/sys/kernel/iommu_groups/14/devices/0000:00:1d.0
/sys/kernel/iommu_groups/4/devices/0000:00:0a.0
/sys/kernel/iommu_groups/12/devices/0000:00:1c.0
/sys/kernel/iommu_groups/2/devices/0000:00:01.0
/sys/kernel/iommu_groups/20/devices/0000:08:00.0
/sys/kernel/iommu_groups/20/devices/0000:07:00.0
/sys/kernel/iommu_groups/20/devices/0000:08:01.0
/sys/kernel/iommu_groups/10/devices/0000:00:1b.0
/sys/kernel/iommu_groups/0/devices/0000:00:02.0
/sys/kernel/iommu_groups/19/devices/0000:05:00.0
/sys/kernel/iommu_groups/9/devices/0000:00:1a.0
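
The `find` output above comes back in directory order, which makes it hard to see at a glance which devices share a group. A small sketch that prints the same information sorted by group number follows; the function name `list_iommu_groups` and its optional sysfs-root argument are hypothetical helpers of mine, not part of any Proxmox tooling.

```shell
# List IOMMU group membership in numeric group order.
# $1 = sysfs root, defaults to /sys (hypothetical parameter, for testing only)
list_iommu_groups() {
    root="${1:-/sys}"
    for link in "$root"/kernel/iommu_groups/*/devices/*; do
        # Skip the unexpanded glob when no groups exist.
        [ -e "$link" ] || [ -L "$link" ] || continue
        # Strip the prefix, then everything after the group number.
        group=${link#"$root"/kernel/iommu_groups/}
        group=${group%%/*}
        printf '%s %s\n' "$group" "${link##*/}"
    done | sort -k1,1n -k2,2 | while read -r group addr; do
        printf 'group %s: %s\n' "$group" "$addr"
    done
}

# Example call on the host:
#   list_iommu_groups
```

Read this way, the listing above shows the HBA at 0000:02:00.0 as the only member of group 17, so it is not sharing a group with any other device.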
The TrueNAS VM is set up with UEFI as well. Here is its config (freshly reinstalled VM, with the TrueNAS config reimported):
Code:
root@blofeld:~# qm config 150
agent: 1
balloon: 0
bios: ovmf
boot: order=virtio0;ide2
cores: 4
cpu: host
efidisk0: local-zfs:vm-150-disk-0,efitype=4m,size=1M
hostpci0: 0000:02:00,pcie=1,rombar=0
ide2: none,media=cdrom
machine: q35
memory: 32768
meta: creation-qemu=9.0.2,ctime=1741013206
name: storage-general
net0: virtio=BC:24:11:10:21:3D,bridge=vmbr0,firewall=1,tag=67
numa: 0
onboot: 1
ostype: l26
scsihw: virtio-scsi-single
smbios1: uuid=9888ac74-2bec-43cd-aedd-c3d249150199
sockets: 1
tablet: 0
virtio0: local-zfs:vm-150-disk-1,discard=on,iothread=1,size=32G
vmgenid: c37c6124-7bdd-407e-8637-2b2bd3e3879c
pveversion
Code:
root@blofeld:~# pveversion -v
proxmox-ve: 8.3.0 (running kernel: 6.8.12-8-pve)
pve-manager: 8.3.4 (running version: 8.3.4/65224a0f9cd294a3)
proxmox-kernel-helper: 8.1.0
proxmox-kernel-6.8: 6.8.12-8
proxmox-kernel-6.8.12-8-pve-signed: 6.8.12-8
proxmox-kernel-6.8.8-4-pve-signed: 6.8.8-4
proxmox-kernel-6.8.8-3-pve-signed: 6.8.8-3
proxmox-kernel-6.5.13-6-pve-signed: 6.5.13-6
proxmox-kernel-6.5: 6.5.13-6
proxmox-kernel-6.5.11-4-pve-signed: 6.5.11-4
ceph-fuse: 17.2.7-pve3
corosync: 3.1.7-pve3
criu: 3.17.1-2+deb12u1
glusterfs-client: 10.3-5
ifupdown2: 3.2.0-1+pmx11
ksm-control-daemon: 1.5-1
libjs-extjs: 7.0.0-5
libknet1: 1.28-pve1
libproxmox-acme-perl: 1.5.1
libproxmox-backup-qemu0: 1.5.1
libproxmox-rs-perl: 0.3.4
libpve-access-control: 8.2.0
libpve-apiclient-perl: 3.3.2
libpve-cluster-api-perl: 8.0.10
libpve-cluster-perl: 8.0.10
libpve-common-perl: 8.2.9
libpve-guest-common-perl: 5.1.6
libpve-http-server-perl: 5.2.0
libpve-network-perl: 0.10.0
libpve-rs-perl: 0.9.1
libpve-storage-perl: 8.3.3
libspice-server1: 0.15.1-1
lvm2: 2.03.16-2
lxc-pve: 6.0.0-1
lxcfs: 6.0.0-pve2
novnc-pve: 1.5.0-1
proxmox-backup-client: 3.3.3-1
proxmox-backup-file-restore: 3.3.3-1
proxmox-firewall: 0.6.0
proxmox-kernel-helper: 8.1.0
proxmox-mail-forward: 0.3.1
proxmox-mini-journalreader: 1.4.0
proxmox-offline-mirror-helper: 0.6.7
proxmox-widget-toolkit: 4.3.4
pve-cluster: 8.0.10
pve-container: 5.2.4
pve-docs: 8.3.1
pve-edk2-firmware: 4.2023.08-4
pve-esxi-import-tools: 0.7.2
pve-firewall: 5.1.0
pve-firmware: 3.14-3
pve-ha-manager: 4.0.6
pve-i18n: 3.3.3
pve-qemu-kvm: 9.0.2-5
pve-xtermjs: 5.3.0-3
qemu-server: 8.3.8
smartmontools: 7.3-pve1
spiceterm: 3.3.0
swtpm: 0.8.0+pve1
vncterm: 1.8.0
zfsutils-linux: 2.2.7-pve1
The error log from the system log on the Proxmox host is in the next post, as I reached the character limit. Thanks!