[SOLVED] Simulating ZFS RAID-1 disk failure on Proxmox boot disks

adamitj

New Member
Dec 2, 2024
Hello.

After discovering Proxmox I'm moving away from VMware, but first I'm trying to understand some key concepts of ZFS, since my server's hardware RAID has failed me at least twice, corrupting data on my two SATA SSDs in RAID 1. I'm still using hardware RAID for now, but I plan to study ZFS before switching, to get compression and more resilience, so I won't discuss the pros and cons of my current hardware here.

I have installed a guest Proxmox VE under my Proxmox host server to work out how to swap one of the ZFS RAID-1 disks in the event of a hardware failure, say when one disk "burns out" and becomes unrecognizable, just a piece of garbage, while the server is powered off.

But that scared me to the bone.

If I simply remove one of the disks by detaching it from the guest VM, Proxmox is not able to find GRUB and boot, so I can't do anything. I was expecting it to at least boot.

So I'm coming here for advice on how to proceed if this happens on a production server. I searched the documentation but found nothing for this case, only for detaching a disk from the pool before a crash, while the server is still up.

Could someone point me to documentation, or help me find a way to troubleshoot this kind of problem?

I installed Proxmox VE 8.3.3 using two basic 32G QEMU hard disks with ZFS RAID-1.
 
If I simply remove one of the disks by detaching it from the guest VM, Proxmox is not able to find GRUB and boot, so I can't do anything. I was expecting it to at least boot.
Is it Proxmox or just the motherboard? The ESP for booting Proxmox is duplicated on both (or more) drives (check with proxmox-boot-tool status).
Maybe the motherboard boot selection was set to only the removed drive? Can you try manually selecting the other drive in the motherboard BIOS boot menu?
I installed Proxmox VE 8.3.3 using two basic 32G QEMU hard disks with ZFS RAID-1.
Did you install Proxmox VE inside a VM? Or are you testing a VM with a ZFS mirror? Make sure to check the virtual boot order in that case.
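
For example (a minimal check; the exact output depends on the system, but one synced ESP per boot disk should be listed):

Bash:
# lists the boot mode and the ESP UUIDs registered in /etc/kernel/proxmox-boot-uuids
proxmox-boot-tool status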
 
Is it Proxmox or just the motherboard? The ESP for booting Proxmox is duplicated on both (or more) drives (check with proxmox-boot-tool status).
Maybe the motherboard boot selection was set to only the removed drive? Can you try manually selecting the other drive in the motherboard BIOS boot menu?

Did you install Proxmox VE inside a VM? Or are you testing a VM with a ZFS mirror? Make sure to check the virtual boot order in that case.
I'm sorry if I didn't make myself clear: it is a guest Proxmox VE running inside a bare-metal Proxmox VE host. And indeed, I'm a Proxmox VE noob, so I hadn't paid attention to the boot order. That was a nice catch. After disabling network and IDE CD-ROM boot, it booted without any problems apart from a small delay.
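
For anyone else hitting this: the guest's boot order can also be fixed from the host CLI instead of the GUI. A rough sketch, assuming the nested PVE guest has VMID 100 (hypothetical here) and should boot from the surviving virtual disk scsi1:

Bash:
# on the bare-metal host: boot the guest only from the remaining virtual disk
qm set 100 --boot order=scsi1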

@LnxBil thank you for your notes.

So after booting, I followed this procedure to add another disk (a third one) to the pool, fully restoring the RAID 1. The `zpool replace -f` command did not work, and I didn't dig into why, since detaching the missing disk from the pool and attaching another one solved my problem. Feel free to add your concerns/observations:
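
For context, the replace attempt that failed for me looked roughly like this (the GUID and the device path are the ones shown in the zpool status output below; adjust to your own pool):

Bash:
# one-step alternative: replace the missing mirror member (referenced by GUID) with the new disk's ZFS partition
zpool replace -f rpool 16726577199004995155 /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part3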


List Disks
Bash:
root@pve:~# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda      8:0    0   32G  0 disk
sdb      8:16   0   32G  0 disk
├─sdb1   8:17   0 1007K  0 part
├─sdb2   8:18   0  512M  0 part
└─sdb3   8:19   0 31.5G  0 part
sr0     11:0    1 1024M  0 rom
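
To map these short kernel names to the stable /dev/disk/by-id/ paths that ZFS reports, something like this helps:

Bash:
# show which by-id symlink points to which sdX device
ls -l /dev/disk/by-id/ | grep -E 'sda|sdb'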

List ZFS status
Bash:
root@pve:~# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
        invalid.  Sufficient replicas exist for the pool to continue
        functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
config:

        NAME                                            STATE     READ WRITE CKSUM
        rpool                                           DEGRADED     0     0     0
          mirror-0                                      DEGRADED     0     0     0
            16726577199004995155                        UNAVAIL      0     0     0  was /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part3
            scsi-0QEMU_QEMU_HARDDISK_drive-scsi1-part3  ONLINE       0     0     0

errors: No known data errors

Remove faulty disk from array
Bash:
root@pve:~# zpool detach rpool /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part3
root@pve:~# zpool status
  pool: rpool
 state: ONLINE
config:

        NAME                                          STATE     READ WRITE CKSUM
        rpool                                         ONLINE       0     0     0
          scsi-0QEMU_QEMU_HARDDISK_drive-scsi1-part3  ONLINE       0     0     0

errors: No known data errors
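
Note: since the failed disk is physically gone, it can also be detached using the numeric GUID that zpool status printed for the UNAVAIL member, e.g.:

Bash:
# detach the missing mirror member by its GUID (taken from the status output above)
zpool detach rpool 16726577199004995155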

Copy partition table from the working disk to the new disk
Bash:
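# replicate /dev/sdb's GPT partition table onto /dev/sda (source device first, target after -R)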
root@pve:~# sgdisk /dev/sdb -R /dev/sda
The operation has completed successfully.

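# give the copied table new random disk and partition GUIDs so they don't clash with /dev/sdb's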
root@pve:~# sgdisk -G /dev/sda
The operation has completed successfully.

Show new partition table
Bash:
root@pve:~# lsblk
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
sda      8:0    0   32G  0 disk
├─sda1   8:1    0 1007K  0 part
├─sda2   8:2    0  512M  0 part
└─sda3   8:3    0 31.5G  0 part
sdb      8:16   0   32G  0 disk
├─sdb1   8:17   0 1007K  0 part
├─sdb2   8:18   0  512M  0 part
└─sdb3   8:19   0 31.5G  0 part
sr0     11:0    1 1024M  0 rom

Attach new disk to the mirror
Bash:
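# zpool attach <pool> <existing-device> <new-device>: mirror the new device onto the existing one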
root@pve:~# zpool attach rpool /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi1-part3 /dev/disk/by-id/scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part3

Check resilvering status
Bash:
root@pve:~# zpool status -v
  pool: rpool
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Mon Jan 27 13:06:16 2025
        1.82G / 1.82G scanned, 112M / 1.82G issued at 56.0M/s
        96.6M resilvered, 6.03% done, 00:00:31 to go
config:

        NAME                                            STATE     READ WRITE CKSUM
        rpool                                           ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            scsi-0QEMU_QEMU_HARDDISK_drive-scsi1-part3  ONLINE       0     0     0
            scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part3  ONLINE       0     0     0  (resilvering)

errors: No known data errors
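
To follow the resilver without retyping the command, something simple like this does the job:

Bash:
# refresh the pool status every 5 seconds until the resilver finishes
watch -n 5 zpool status rpool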

When done
Bash:
root@pve:~# zpool status -v
  pool: rpool
 state: ONLINE
  scan: resilvered 1.86G in 00:00:54 with 0 errors on Mon Jan 27 13:07:10 2025
config:

        NAME                                            STATE     READ WRITE CKSUM
        rpool                                           ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            scsi-0QEMU_QEMU_HARDDISK_drive-scsi1-part3  ONLINE       0     0     0
            scsi-0QEMU_QEMU_HARDDISK_drive-scsi0-part3  ONLINE       0     0     0

errors: No known data errors
 
I just missed one step: reinstalling GRUB on the new disk. Without that, if you lose the previously working disk and need to boot from this newly attached disk, booting will fail.

Format new EFI partition (the one with 512M)
Bash:
root@pve:/etc/kernel# proxmox-boot-tool format /dev/sda2
UUID="" SIZE="536870912" FSTYPE="" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sda" MOUNTPOINT=""
Formatting '/dev/sda2' as vfat..
mkfs.fat 4.2 (2021-01-31)
Done.

Install GRUB on new disk
Bash:
root@pve:/etc/kernel# proxmox-boot-tool init /dev/sda2
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
UUID="6B37-DC4E" SIZE="536870912" FSTYPE="vfat" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sda" MOUNTPOINT=""
Mounting '/dev/sda2' on '/var/tmp/espmounts/6B37-DC4E'.
Installing grub i386-pc target..
Installing for i386-pc platform.
Installation finished. No error reported.
Unmounting '/dev/sda2'.
Adding '/dev/sda2' to list of synced ESPs..
Refreshing kernels and initrds..
Running hook script 'proxmox-auto-removal'..
Running hook script 'zz-proxmox-boot'..
WARN: 12396417041760752798 read from /etc/kernel/proxmox-boot-uuids does not look like a VFAT-UUID - skipping
Copying and configuring kernels on /dev/disk/by-uuid/6B37-DC4E
        Copying kernel 6.8.4-2-pve
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-6.8.4-2-pve
Found initrd image: /boot/initrd.img-6.8.4-2-pve
done
Copying and configuring kernels on /dev/disk/by-uuid/DCC8-2CED
        Copying kernel 6.8.4-2-pve
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-6.8.4-2-pve
Found initrd image: /boot/initrd.img-6.8.4-2-pve
done
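
The WARN above about 12396417041760752798 not looking like a VFAT UUID is the stale entry for the dead disk still sitting in /etc/kernel/proxmox-boot-uuids. As far as I understand, it can be cleaned up and the result verified like this (a sketch; check the status output before relying on it):

Bash:
# remove UUIDs of ESPs that no longer exist from /etc/kernel/proxmox-boot-uuids
proxmox-boot-tool clean
# confirm both remaining ESPs are registered and synced
proxmox-boot-tool status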