Has anyone seen DMA PTE / MegaRAID controller resets after upgrading to recent Proxmox 8 kernels?

tycjan · Jun 18, 2026

We recently upgraded several 2-node Proxmox clusters from kernel 6.8.12-20-pve to 6.8.12-29-pve and started seeing instability on one node in multiple clusters.

Environment:

Proxmox VE 8.x
Dell PowerEdge R350
Dell PERC H345 RAID controller (megaraid_sas)
2-node clusters with Corosync + QDevice + GlusterFS
Intel VT-d enabled

Observed symptoms:

Corosync node drops and rejoins
Unexpected node reboots in some locations
Filesystem recovery required after reboot on a few hosts
MegaRAID controller resets reported by the kernel

Kernel messages included:

DMAR: ERROR: DMA PTE for vPFN already set

followed by traces involving:

intel_iommu_map_pages
iommu_dma_map_sg
scsi_dma_map
megasas_build_and_issue_cmd_fusion

and later:

megaraid_sas: resetting fusion adapter scsi0

Interestingly, the issue was observed across multiple servers with nearly identical hardware after the kernel upgrade.
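
For anyone who wants to check their own hosts for the same pattern, here is a minimal sketch that counts the two message types quoted above in a block of kernel log text. The function name and the pattern labels are made up for illustration, and the regular expressions are only derived from the excerpts in this post, so real log lines may need looser or tighter matching.

```python
import re

# Patterns derived from the messages quoted in this post (an assumption:
# real log lines carry extra fields such as timestamps and addresses,
# which these patterns deliberately ignore).
PATTERNS = {
    "dma_pte": re.compile(r"DMAR: ERROR: DMA PTE for vPFN .*already set"),
    "adapter_reset": re.compile(r"megaraid_sas.*resetting fusion adapter"),
}


def count_events(log_text):
    """Return how many lines of log_text match each pattern in PATTERNS."""
    counts = {name: 0 for name in PATTERNS}
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

The input could be, for example, the saved output of `journalctl -k` from an affected node; a non-zero count for both keys would match the sequence described here.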

As a mitigation we added:

intel_iommu=on iommu=pt

to the kernel command line.

Since applying this change:

No new DMA PTE errors have been observed
No new MegaRAID controller resets have been observed
Clusters have remained stable
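
To confirm that a node actually booted with the mitigation, the running kernel's command line can be inspected. The sketch below is one way to do that; the helper names are hypothetical, and it only looks at the two parameters mentioned in this thread.

```python
def iommu_options(cmdline):
    """Extract the intel_iommu= and iommu= values from a kernel command line."""
    options = {}
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key in ("intel_iommu", "iommu"):
            options[key] = value
    return options


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()
```

On a node with the mitigation applied, `iommu_options(read_cmdline())` should include both `intel_iommu` and `iommu`; an empty result means neither parameter was passed at boot.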

Has anyone experienced similar behavior with:

Dell R350 (or similar 15G Dell servers)
PERC H345 / MegaRAID controllers
Proxmox 8.x kernels in the 6.8 series
Intel IOMMU / VT-d enabled

I'm particularly interested in whether this is a known regression/change in IOMMU behavior, a MegaRAID driver issue, or a firmware interaction with Dell RMRR/DMAR tables.

Any feedback or similar experiences would be appreciated.

fabian · Jun 18, 2026

Please upgrade to the -30 kernel; the -29 one had an IOMMU-related regression that is fixed there.
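
A quick way to check a fleet for this is to compare each node's kernel release against the -30 build. The following is a rough sketch, assuming release strings follow the `6.8.12-29-pve` layout seen in this thread; the function names are illustrative, and the comparison is purely numeric, so it says nothing about which builds other than -29 carry the regression.

```python
import re


def parse_pve_kernel(release):
    """Split a release string such as '6.8.12-29-pve' into four integers."""
    match = re.fullmatch(r"(\d+)\.(\d+)\.(\d+)-(\d+)-pve", release)
    if match is None:
        raise ValueError(f"unrecognized kernel release: {release!r}")
    return tuple(int(part) for part in match.groups())


def at_least_build(release, minimum=(6, 8, 12, 30)):
    """Return True if release is numerically at or above the minimum build."""
    return parse_pve_kernel(release) >= minimum
```

Passing `platform.release()` from the standard library gives the release of the running kernel, so `at_least_build(platform.release())` returning False on a 6.8 node suggests it has not yet been rebooted into the -30 kernel.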

leesteken · Jun 18, 2026

intel_iommu=on is no longer necessary, as it has been on by default since kernel version 6.8: https://pve.proxmox.com/pve-docs-8/pve-admin-guide.html#_configuration_14
iommu=pt might indeed help non-passedthrough hardware that cannot handle IOMMU very well as it tells the IOMMU to use the identity mapping. There have been more threads on this forum about some RAID controllers having issues with the IOMMU on by default. Newer kernel versions might have driver fixes for such issues.
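
To see whether identity mapping is actually in effect, the kernel exposes a `type` file for each IOMMU group under `/sys/kernel/iommu_groups`. Here is a small sketch that collects those values; the function name is made up, and the interpretation is an assumption on my part: with iommu=pt the groups are expected to report `identity`, while translated groups typically report `DMA` or `DMA-FQ`.

```python
from pathlib import Path


def iommu_group_types(base="/sys/kernel/iommu_groups"):
    """Map each IOMMU group number to the contents of its 'type' file."""
    types = {}
    for type_file in Path(base).glob("*/type"):
        types[type_file.parent.name] = type_file.read_text().strip()
    return types
```

An empty result usually means no IOMMU groups exist at all, for example when the IOMMU is disabled in firmware or on the kernel command line.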

tycjan · Jun 18, 2026

fabian said:
Please upgrade to the -30 kernel; the -29 one had an IOMMU-related regression that is fixed there.

We will try to upgrade it once again and have a look.

tycjan · Jun 18, 2026

leesteken said:
intel_iommu=on is no longer necessary, as it has been on by default since kernel version 6.8: https://pve.proxmox.com/pve-docs-8/pve-admin-guide.html#_configuration_14
iommu=pt might indeed help non-passedthrough hardware that cannot handle IOMMU very well as it tells the IOMMU to use the identity mapping. There have been more threads on this forum about some RAID controllers having issues with the IOMMU on by default. Newer kernel versions might have driver fixes for such issues.

Thanks, I will have a look.

iss-integration · Jun 30, 2026

I can confirm that we are seeing a similar issue on the latest Proxmox 9 in our lab. The fix so far was to disable IOMMU since we're not doing any pass-through in that particular cluster. iommu=pt still fails, so we had to set intel_iommu=off

Here is a sampling of the messages:

sd ... [sdb] tag#640 OCR is requested due to IO timeout!!
sd ... [sdb] SCSI host state: 5 SCSI host busy: 5 FW outstanding: 5
megaraid_sas 0000:02:00.0: megasas_disable_intr_fusion ...
megaraid_sas 0000:02:00.0: [ 0]waiting for 5 commands to complete for scsi0
... [ 5] ... [10] ... [15] ... [20] ... [25] ... [30]waiting for 5 commands to complete

pveversion
pve-manager/9.2.3/d0fde103346cf89a (running kernel: 7.0.12-1-pve)

fabian · Jul 6, 2026

iss-integration said:
I can confirm that we are seeing a similar issue on the latest Proxmox 9 in our lab. The fix so far was to disable IOMMU since we're not doing any pass-through in that particular cluster. iommu=pt still fails, so we had to set intel_iommu=off

Here is a sampling of the messages:

sd ... [sdb] tag#640 OCR is requested due to IO timeout!!
sd ... [sdb] SCSI host state: 5 SCSI host busy: 5 FW outstanding: 5
megaraid_sas 0000:02:00.0: megasas_disable_intr_fusion ...
megaraid_sas 0000:02:00.0: [ 0]waiting for 5 commands to complete for scsi0
... [ 5] ... [10] ... [15] ... [20] ... [25] ... [30]waiting for 5 commands to complete

pveversion
pve-manager/9.2.3/d0fde103346cf89a (running kernel: 7.0.12-1-pve)

That is a totally different kernel. Please open a new thread (and try the latest kernel in the 7.0.x series)!

iss-integration · Jul 6, 2026

Got it - https://bugzilla.proxmox.com/show_bug.cgi?id=7790

uzumo · Jul 6, 2026

No. I think what they're saying is that you shouldn't hijack this thread created by tycjan, but rather start a new one, right?
