mdraid headers overwritten during PVE 8 minor upgrade

mszmidt

New Member
May 11, 2026
Hi all,

During a recent minor upgrade of the Proxmox VE (PVE) host I experienced a serious issue with a RAID‑0 array managed by mdadm on an AlmaLinux 9.X guest. The disks were presented to the VM via PCIe passthrough, so the guest owned the controllers directly. The array was built on the whole disks, without any partition table.

After the host rebooted following the upgrade (performed with apt), the array could no longer be assembled. Inspection of the block devices showed that the md superblocks had been overwritten by GPT headers, which destroyed the RAID metadata, including the original disk order.
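For anyone who wants to check their own disks or backup images for the same symptom, here is a minimal sketch of such an inspection. It assumes md metadata version 1.2 (the mdadm default, which places the superblock 4 KiB from the start of the device); I did not record which version my array used, so treat that as an assumption. The helper names `probe_signatures` and `probe_device` are made up for illustration. With 512-byte sectors a default GPT keeps its header at LBA 1 and its 16 KiB entry array at LBAs 2 to 33, which covers offset 4096, so writing a GPT would wipe a v1.2 superblock.

```python
import struct

MD_MAGIC = 0xA92B4EFC        # md superblock magic, stored little-endian
MD_V12_OFFSET = 4096         # v1.2 superblock sits 4 KiB from the device start
GPT_SIGNATURE = b"EFI PART"  # first 8 bytes of the GPT header (LBA 1)


def probe_signatures(head):
    """Report which signatures appear in the first bytes of a device or image."""
    found = {"md_v1.2": False, "gpt_lba_size": None}
    if len(head) >= MD_V12_OFFSET + 4:
        (magic,) = struct.unpack_from("<I", head, MD_V12_OFFSET)
        found["md_v1.2"] = magic == MD_MAGIC
    # LBA 1 starts at byte 512 on 512-byte-sector disks and at 4096 on 4Kn disks.
    for sector_size in (512, 4096):
        if head[sector_size:sector_size + 8] == GPT_SIGNATURE:
            found["gpt_lba_size"] = sector_size
            break
    return found


def probe_device(path):
    """Read only the first 8 KiB of a device or backup image and probe it."""
    with open(path, "rb") as handle:
        return probe_signatures(handle.read(8192))
```

Running `probe_device` against a backup image rather than the live disk keeps the check strictly read-only. A result with `md_v1.2` false and `gpt_lba_size` set matches what I saw.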

I backed up the raw disks before attempting any further manipulation, and then crafted a script (with the help of an AI assistant) that:
  1. created a differential overlay using qemu‑img,
  2. iteratively forced the md header onto the overlay, tried to assemble the array, and attempted a mount,
  3. on any critical mount error rolled back the overlay, swapped the disk order, and repeated the process.
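Since I no longer have the original script, the following is only a rough reconstruction of the idea, not what actually ran. It builds the command lists for each disk-order attempt without executing anything. Everything specific here is an assumption: the function names, the `/dev/nbdN` devices (which need the `nbd` kernel module loaded), the overlay file naming, metadata 1.2, and the 512 KiB chunk size. For a real recovery the chunk size and data offset must match the original array, otherwise no disk order will give a mountable filesystem. Backup image paths should be absolute so the overlay can find its backing file.

```python
from itertools import permutations


def attempt_steps(backups, order, md_dev="/dev/md0", chunk_kib=512,
                  mountpoint="/mnt/recovery"):
    """Commands for one attempt: fresh overlays, re-created array, read-only mount."""
    steps = []
    nbd = ["/dev/nbd%d" % i for i in range(len(backups))]
    for i, image in enumerate(backups):
        overlay = image + ".overlay.qcow2"
        # Re-creating the overlay discards whatever the previous attempt wrote,
        # so the raw backup image itself is never modified.
        steps.append(["qemu-img", "create", "-f", "qcow2", "-F", "raw",
                      "-b", image, overlay])
        steps.append(["qemu-nbd", "--format=qcow2", "--connect=" + nbd[i], overlay])
    members = [nbd[i] for i in order]
    # Writing a new superblock with the candidate member order.
    steps.append(["mdadm", "--create", md_dev, "--run", "--level=0",
                  "--metadata=1.2", "--chunk=%d" % chunk_kib,
                  "--raid-devices=%d" % len(members)] + members)
    steps.append(["mount", "-o", "ro", md_dev, mountpoint])
    return steps


def teardown_steps(n_disks, md_dev="/dev/md0", mountpoint="/mnt/recovery"):
    """Commands that undo one attempt before the next order is tried."""
    steps = [["umount", mountpoint], ["mdadm", "--stop", md_dev]]
    steps += [["qemu-nbd", "--disconnect", "/dev/nbd%d" % i] for i in range(n_disks)]
    return steps


def recovery_plan(backups, **kwargs):
    """Yield (order, steps) for every possible member order."""
    for order in permutations(range(len(backups))):
        yield order, attempt_steps(backups, order, **kwargs)
```

A driver loop would run each step list with `subprocess.run`, stop at the first order whose mount succeeds, and otherwise run `teardown_steps` and continue. The number of attempts grows factorially with the number of disks, so this is only practical for small arrays.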

The script eventually succeeded in re‑creating a usable filesystem, but I have no logs because I was working under time pressure.

My questions

1. Is this a known edge case? Have other users observed mdraid metadata being overwritten by GPT after a PVE minor upgrade, especially when PCI‑e passthrough is involved?
2. What preventive measures could be taken? Are there best‑practice steps?

Any insight, references to existing bug reports, or suggestions for a more robust recovery workflow would be greatly appreciated. Thank you for your help.

At least I have one conclusion: I should have created the RAID on GPT‑partitioned disks instead of on the raw devices.
 
Make sure that the passed through drive controller is not accessible by the Proxmox host (before the VM starts), using early binding to vfio-pci, and anything running on the Proxmox host should not be able to interfere with those drives in any way.
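One way to verify this on the host is to look at which kernel driver the controller is bound to, via the `driver` symlink that sysfs exposes for each PCI device. The sketch below is only an illustration: the function names are made up, and the PCI address in the usage note is a placeholder for the real address of your controller (as shown by `lspci -D`).

```python
import os


def bound_driver(pci_address, sysfs_root="/sys"):
    """Return the name of the driver bound to a PCI device, or None if unbound."""
    link = os.path.join(sysfs_root, "bus", "pci", "devices", pci_address, "driver")
    if not os.path.islink(link):
        return None
    # The symlink points at the driver directory, e.g. .../drivers/vfio-pci
    return os.path.basename(os.readlink(link))


def is_isolated(pci_address, sysfs_root="/sys"):
    """True only if the device is currently claimed by vfio-pci."""
    return bound_driver(pci_address, sysfs_root) == "vfio-pci"
```

For example, `is_isolated("0000:01:00.0")` run on the host right after boot, before the VM starts, should return `True` if early binding works. If it reports a storage driver such as `ahci` or `nvme` instead, the host has had access to the drives.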

I cannot think of any update that might cause the effect you describe, so I have no idea what could be the cause. I would suspect something inside the VM, if you already made sure that the controller and drives are not visible on the host. Or maybe an action of the system BIOS or IPMI that can access the controller/drives before Proxmox is booted?

EDIT: Please consider making full backups of the VM if the data on a RAID0 is important to you.
 