[SOLVED] ZFS boot disk replacement

FPOlivier

Active Member
Jun 4, 2020
Hello,
Sorry for my poor English.
I have a node with 6 disks:
Code:
zpool status
  pool: data
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:31:53 with 0 errors on Sun Apr 12 00:55:54 2026
config:

        NAME                                            STATE     READ WRITE CKSUM
        data                                            ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            ata-WDC_WD1005FBYZ-01YCBB2_WD-WMC6M0J0LA2K  ONLINE       0     0     0
            ata-WDC_WD1005FBYZ-01YCBB2_WD-WMC6N0K0RCU0  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            ata-WDC_WD1005FBYZ-01YCBB2_WD-WMC6M0J90N6S  ONLINE       0     0     0
            ata-WDC_WD1005FBYZ-01YCBB2_WD-WMC6M0J8S6F8  ONLINE       0     0     0

errors: No known data errors

  pool: rpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-9P
  scan: scrub repaired 0B in 00:01:22 with 0 errors on Sun Apr 12 00:25:24 2026
config:

        NAME                                               STATE     READ WRITE CKSUM
        rpool                                              ONLINE       0     0     0
          mirror-0                                         ONLINE       0     0     0
            ata-WDC_WDS250G2B0A-00SM50_19508W472511-part3  ONLINE       0     0     1
            ata-WDC_WDS250G2B0A-00SM50_19508W478311-part3  ONLINE       0     0     0

errors: No known data errors
I need to replace the rpool disks (which are the boot/system ones).

Questions:
1- As these are boot and system disks, will ZFS be able to rebuild everything (GRUB and anything else), or will I have to do some tasks manually?
2- Why do we see "-part3" at the end of the names of these 2 disks, but not in the other pool?
3- If I do "offline + pull out the disk + insert the new one + online it + zpool replace", will that be OK?

Thanks a lot and have a nice day :)
 
"part3" is the third partition, which holds the bulk of the disk and is the one ZFS uses. The other two partitions are boot and EFI, because those disks are your boot devices. Data pools usually use whole disks, so their member names have no partition suffix.

Mirror = the same data is stored on all disks, so you lose nothing and can rebuild the pool from the last working disk.

For boot devices there is more to do. Look for "Changing a failed bootable device" on https://pve.proxmox.com/pve-docs/pve-admin-guide.html#chapter_zfs
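
To illustrate the naming difference described in this reply, here is a minimal sketch that classifies a pool member name by its "-partN" suffix. The function name vdev_kind is made up for this example, and it only inspects the name string; it does not query ZFS or the disk.

```shell
# Classify a pool member name as a partition or a whole disk,
# based only on the "-partN" suffix used under /dev/disk/by-id/.
# vdev_kind is a hypothetical helper name, not part of any tool.
vdev_kind() {
    case "$1" in
        *-part[0-9]*) echo "partition ${1##*-part}" ;;  # keep what follows "-part"
        *)            echo "whole disk" ;;
    esac
}

vdev_kind "ata-WDC_WDS250G2B0A-00SM50_19508W472511-part3"   # rpool member from the post
vdev_kind "ata-WDC_WD1005FBYZ-01YCBB2_WD-WMC6M0J0LA2K"      # data member from the post
```

With the two names from the first post, this prints "partition 3" for the rpool member and "whole disk" for the data member.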
 
Code:
proxmox-boot-tool status
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
System currently booted with legacy bios
6901-556E is configured with: grub (versions: 6.14.11-5-pve, 6.17.13-1-pve, 6.17.2-2-pve)
6902-1625 is configured with: grub (versions: 6.14.11-5-pve, 6.17.13-1-pve, 6.17.2-2-pve)

Then I'll have to run:
Code:
# sgdisk <healthy bootable device> -R <new device>
# sgdisk -G <new device>
# zpool replace -f <pool> <old zfs partition> <new zfs partition>

Next, will I have to use proxmox-boot-tool or grub-install?
Many thanks for your help :)
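
Since the argument order of the first sgdisk command is easy to get wrong, the three placeholder commands above can be previewed with a small dry-run helper before touching any disk. This is only a sketch: plan_replace is a hypothetical name, the device names in the call are placeholders, and the function prints the commands without running them.

```shell
# Dry-run helper: prints the three commands for the given devices, runs nothing.
# Every argument is supplied by the caller; nothing is auto-detected.
plan_replace() {
    healthy=$1    # healthy bootable disk (source of the partition table)
    new=$2        # new disk (its partition table gets overwritten)
    pool=$3       # pool name
    old_part=$4   # old ZFS partition, as named in `zpool status`
    new_part=$5   # ZFS partition on the new disk
    echo "sgdisk $healthy -R $new"    # replicate the table from healthy onto new
    echo "sgdisk -G $new"             # randomize the GUIDs on the new disk
    echo "zpool replace -f $pool $old_part $new_part"
}

# Placeholder names, for illustration only.
plan_replace /dev/sdX /dev/sdY rpool ata-OLD-part3 ata-NEW-part3
```

Reading the printed lines first is a cheap safeguard: with the two devices swapped, sgdisk -R would overwrite the partition table of the healthy disk.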
 
Well..., I hesitate to answer, because none of my machines has "booted with legacy bios"; all of them report "booted with uefi".
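
One way to approach the proxmox-boot-tool versus grub-install question is to look at the status output posted above. The sketch below assumes that an "is configured with:" line means the ESP is kept in sync by proxmox-boot-tool; that reading, and the function name esp_managed, are assumptions of this example rather than something confirmed in the thread.

```shell
# Read `proxmox-boot-tool status` output on stdin and report whether
# any ESP is listed as configured by the tool.
# Assumption: an "is configured with:" line marks a managed ESP.
esp_managed() {
    if grep -q 'is configured with:'; then
        echo "proxmox-boot-tool"
    else
        echo "not managed by proxmox-boot-tool"
    fi
}

# Sample lines taken from the status output posted above.
printf '%s\n' \
    "System currently booted with legacy bios" \
    "6901-556E is configured with: grub (versions: 6.14.11-5-pve, 6.17.13-1-pve, 6.17.2-2-pve)" \
    | esp_managed
```

For the output in this thread the sketch reports proxmox-boot-tool, even though the machine boots in legacy BIOS mode with GRUB. On a live system the input would come from `proxmox-boot-tool status | esp_managed`.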
 
Thanks to your help, here is what I did, and it seems to work (I rebooted from each drive on its own to validate the boot mirror => it's OK).
I'm posting the full session in case it helps other people:
Code:
me@myhost:~# sgdisk /dev/sdf -R /dev/sdb
The operation has completed successfully.
me@myhost:~# sgdisk -G /dev/sdb
The operation has completed successfully.
me@myhost:~# lsblk -S
NAME HCTL       TYPE VENDOR   MODEL                       REV SERIAL           TRAN
sda  0:0:0:0    disk ATA      WDC WD1005FBYZ-01YCBB2     RR07 WD-WMC6M0J0LA2K  sata
sdb  4:0:0:0    disk ATA      KINGSTON SKC600256G    S4800120 50026B76878539C0 sata
sdc  6:0:0:0    disk ATA      WDC WD1005FBYZ-01YCBB2     RR07 WD-WMC6N0K0RCU0  sata
sdd  7:0:0:0    disk ATA      WDC WD1005FBYZ-01YCBB2     RR07 WD-WMC6M0J90N6S  sata
sde  8:0:0:0    disk ATA      WDC WD1005FBYZ-01YCBB2     RR07 WD-WMC6M0J8S6F8  sata
sdf  5:0:0:0    disk ATA      WDC WDS250G2B0A-00SM50 401020WD 19508W478311     sata
me@myhost:~# ls -als /dev/disk/by-id/ | grep "8539C0"
0 lrwxrwxrwx 1 root root   9 Apr 14 16:44 ata-KINGSTON_SKC600256G_50026B76878539C0 -> ../../sdb
0 lrwxrwxrwx 1 root root  10 Apr 14 17:02 ata-KINGSTON_SKC600256G_50026B76878539C0-part1 -> ../../sdb1
0 lrwxrwxrwx 1 root root  10 Apr 14 17:02 ata-KINGSTON_SKC600256G_50026B76878539C0-part2 -> ../../sdb2
0 lrwxrwxrwx 1 root root  10 Apr 14 17:02 ata-KINGSTON_SKC600256G_50026B76878539C0-part3 -> ../../sdb3
me@myhost:~# proxmox-boot-tool format /dev/disk/by-id/ata-KINGSTON_SKC600256G_50026B76878539C0-part2
UUID="" SIZE="1073741824" FSTYPE="" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sdb" MOUNTPOINT=""
Formatting '/dev/disk/by-id/ata-KINGSTON_SKC600256G_50026B76878539C0-part2' as vfat..
mkfs.fat 4.2 (2021-01-31)
Done.
me@myhost:~# proxmox-boot-tool init /dev/disk/by-id/ata-KINGSTON_SKC600256G_50026B76878539C0-part2
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
UUID="8D2A-BE4D" SIZE="1073741824" FSTYPE="vfat" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sdb" MOUNTPOINT=""
Mounting '/dev/disk/by-id/ata-KINGSTON_SKC600256G_50026B76878539C0-part2' on '/var/tmp/espmounts/8D2A-BE4D'.
Installing grub i386-pc target..
Installing for i386-pc platform.
Installation finished. No error reported.
Unmounting '/dev/disk/by-id/ata-KINGSTON_SKC600256G_50026B76878539C0-part2'.
Adding '/dev/disk/by-id/ata-KINGSTON_SKC600256G_50026B76878539C0-part2' to list of synced ESPs..
Refreshing kernels and initrds..
Running hook script 'proxmox-auto-removal'..
Running hook script 'zz-proxmox-boot'..
WARN: /dev/disk/by-uuid/6901-556E does not exist - clean '/etc/kernel/proxmox-boot-uuids'! - skipping
Copying and configuring kernels on /dev/disk/by-uuid/6902-1625
        Copying kernel 6.14.11-5-pve
        Copying kernel 6.17.13-1-pve
        Copying kernel 6.17.2-2-pve
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-6.17.13-1-pve
Found initrd image: /boot/initrd.img-6.17.13-1-pve
Found linux image: /boot/vmlinuz-6.17.2-2-pve
Found initrd image: /boot/initrd.img-6.17.2-2-pve
Found linux image: /boot/vmlinuz-6.14.11-5-pve
Found initrd image: /boot/initrd.img-6.14.11-5-pve
Adding boot menu entry for UEFI Firmware Settings ...
done
Copying and configuring kernels on /dev/disk/by-uuid/8D2A-BE4D
        Copying kernel 6.14.11-5-pve
        Copying kernel 6.17.13-1-pve
        Copying kernel 6.17.2-2-pve
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-6.17.13-1-pve
Found initrd image: /boot/initrd.img-6.17.13-1-pve
Found linux image: /boot/vmlinuz-6.17.2-2-pve
Found initrd image: /boot/initrd.img-6.17.2-2-pve
Found linux image: /boot/vmlinuz-6.14.11-5-pve
Found initrd image: /boot/initrd.img-6.14.11-5-pve
Adding boot menu entry for UEFI Firmware Settings ...
done
me@myhost:~# proxmox-boot-tool status
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
System currently booted with legacy bios
WARN: /dev/disk/by-uuid/6901-556E does not exist - clean '/etc/kernel/proxmox-boot-uuids'! - skipping
6902-1625 is configured with: grub (versions: 6.14.11-5-pve, 6.17.13-1-pve, 6.17.2-2-pve)
8D2A-BE4D is configured with: grub (versions: 6.14.11-5-pve, 6.17.13-1-pve, 6.17.2-2-pve)
me@myhost:~# proxmox-boot-tool refresh
Running hook script 'proxmox-auto-removal'..
Running hook script 'zz-proxmox-boot'..
Re-executing '/etc/kernel/postinst.d/zz-proxmox-boot' in new private mount namespace..
WARN: /dev/disk/by-uuid/6901-556E does not exist - clean '/etc/kernel/proxmox-boot-uuids'! - skipping
Copying and configuring kernels on /dev/disk/by-uuid/6902-1625
        Copying kernel 6.14.11-5-pve
        Copying kernel 6.17.13-1-pve
        Copying kernel 6.17.2-2-pve
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-6.17.13-1-pve
Found initrd image: /boot/initrd.img-6.17.13-1-pve
Found linux image: /boot/vmlinuz-6.17.2-2-pve
Found initrd image: /boot/initrd.img-6.17.2-2-pve
Found linux image: /boot/vmlinuz-6.14.11-5-pve
Found initrd image: /boot/initrd.img-6.14.11-5-pve
Adding boot menu entry for UEFI Firmware Settings ...
done
Copying and configuring kernels on /dev/disk/by-uuid/8D2A-BE4D
        Copying kernel 6.14.11-5-pve
        Copying kernel 6.17.13-1-pve
        Copying kernel 6.17.2-2-pve
Generating grub configuration file ...
Found linux image: /boot/vmlinuz-6.17.13-1-pve
Found initrd image: /boot/initrd.img-6.17.13-1-pve
Found linux image: /boot/vmlinuz-6.17.2-2-pve
Found initrd image: /boot/initrd.img-6.17.2-2-pve
Found linux image: /boot/vmlinuz-6.14.11-5-pve
Found initrd image: /boot/initrd.img-6.14.11-5-pve
Adding boot menu entry for UEFI Firmware Settings ...
done
me@myhost:~# proxmox-boot-tool clean
Checking whether ESP '6901-556E' exists.. Not found!
Checking whether ESP '6902-1625' exists.. Found!
Checking whether ESP '8D2A-BE4D' exists.. Found!
Sorting and removing duplicate ESPs..
me@myhost:~# zpool status -x
  pool: rpool
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: resilvered 32.0G in 00:01:24 with 0 errors on Tue Apr 14 16:50:18 2026
config:

        NAME                                               STATE     READ WRITE CKSUM
        rpool                                              DEGRADED     0     0     0
          mirror-0                                         DEGRADED     0     0     0
            sdb                                            OFFLINE      0     0     0
            ata-WDC_WDS250G2B0A-00SM50_19508W478311-part3  ONLINE       0     0     0

errors: No known data errors
me@myhost:~# lsblk -S
NAME HCTL       TYPE VENDOR   MODEL                       REV SERIAL           TRAN
sda  0:0:0:0    disk ATA      WDC WD1005FBYZ-01YCBB2     RR07 WD-WMC6M0J0LA2K  sata
sdb  4:0:0:0    disk ATA      KINGSTON SKC600256G    S4800120 50026B76878539C0 sata
sdc  6:0:0:0    disk ATA      WDC WD1005FBYZ-01YCBB2     RR07 WD-WMC6N0K0RCU0  sata
sdd  7:0:0:0    disk ATA      WDC WD1005FBYZ-01YCBB2     RR07 WD-WMC6M0J90N6S  sata
sde  8:0:0:0    disk ATA      WDC WD1005FBYZ-01YCBB2     RR07 WD-WMC6M0J8S6F8  sata
sdf  5:0:0:0    disk ATA      WDC WDS250G2B0A-00SM50 401020WD 19508W478311     sata
me@myhost:~# ls -als /dev/disk/by-id/ | grep "50026B76878539C0"
0 lrwxrwxrwx 1 root root   9 Apr 14 16:44 ata-KINGSTON_SKC600256G_50026B76878539C0 -> ../../sdb
0 lrwxrwxrwx 1 root root  10 Apr 14 17:02 ata-KINGSTON_SKC600256G_50026B76878539C0-part1 -> ../../sdb1
0 lrwxrwxrwx 1 root root  10 Apr 14 17:02 ata-KINGSTON_SKC600256G_50026B76878539C0-part2 -> ../../sdb2
0 lrwxrwxrwx 1 root root  10 Apr 14 17:02 ata-KINGSTON_SKC600256G_50026B76878539C0-part3 -> ../../sdb3
me@myhost:~# zpool replace -f rpool /dev/sdb3 ata-KINGSTON_SKC600256G_50026B76878539C0-part3
cannot replace /dev/sdb3 with ata-KINGSTON_SKC600256G_50026B76878539C0-part3: no such device in pool
me@myhost:~# zpool replace -f rpool /dev/sdb ata-KINGSTON_SKC600256G_50026B76878539C0-part3
me@myhost:~# zpool status
  pool: data
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:31:53 with 0 errors on Sun Apr 12 00:55:54 2026
config:

        NAME                                            STATE     READ WRITE CKSUM
        data                                            ONLINE       0     0     0
          mirror-0                                      ONLINE       0     0     0
            ata-WDC_WD1005FBYZ-01YCBB2_WD-WMC6M0J0LA2K  ONLINE       0     0     0
            ata-WDC_WD1005FBYZ-01YCBB2_WD-WMC6N0K0RCU0  ONLINE       0     0     0
          mirror-1                                      ONLINE       0     0     0
            ata-WDC_WD1005FBYZ-01YCBB2_WD-WMC6M0J90N6S  ONLINE       0     0     0
            ata-WDC_WD1005FBYZ-01YCBB2_WD-WMC6M0J8S6F8  ONLINE       0     0     0

errors: No known data errors

  pool: rpool
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: resilvered 32.0G in 00:01:58 with 0 errors on Tue Apr 14 17:22:45 2026
config:

        NAME                                                STATE     READ WRITE CKSUM
        rpool                                               ONLINE       0     0     0
          mirror-0                                          ONLINE       0     0     0
            ata-KINGSTON_SKC600256G_50026B76878539C0-part3  ONLINE       0     0     0
            ata-WDC_WDS250G2B0A-00SM50_19508W478311-part3   ONLINE       0     0     0

errors: No known data errors
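
In the session above, the first zpool replace was refused with "no such device in pool" when the old device was given as /dev/sdb3, and it went through when given as /dev/sdb, which is the name the degraded pool listed. A minimal sketch for pulling that name out of the status text follows. The function name unhealthy_members is hypothetical, and the filter is a simplification that skips the pool row and grouping rows such as mirror-0.

```shell
# Print the names of pool members whose STATE column is not ONLINE.
# Reads `zpool status` text on stdin and only looks at the config table.
# Simplification: the first row (the pool) and grouping vdevs are skipped.
unhealthy_members() {
    awk '
        $1 == "NAME" && $2 == "STATE" { in_cfg = 1; row = 0; next }
        /^errors:/ || NF == 0         { in_cfg = 0 }
        in_cfg {
            row++
            if (row == 1) next                                       # the pool itself
            if ($1 ~ /^(mirror|raidz|draid|replacing|spare)/) next   # grouping vdevs
            if (NF >= 5 && $2 != "ONLINE") print $1
        }
    '
}

# Config table copied from the degraded rpool shown in the session.
unhealthy_members <<'EOF'
        NAME                                               STATE     READ WRITE CKSUM
        rpool                                              DEGRADED     0     0     0
          mirror-0                                         DEGRADED     0     0     0
            sdb                                            OFFLINE      0     0     0
            ata-WDC_WDS250G2B0A-00SM50_19508W478311-part3  ONLINE       0     0     0
EOF
```

For the table above this prints sdb, the name that the successful zpool replace used as its old-device argument.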

Many, many thanks to all of you! :)
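
For anyone repeating this, the WARN lines in the session point at the ESP of the removed disk, which stayed in the list until proxmox-boot-tool clean was run. Below is a small sketch to list such stale UUIDs from the tool's output. The function name stale_esp_uuids is hypothetical, and the pattern is based only on the WARN wording seen in this thread.

```shell
# Read proxmox-boot-tool output on stdin and print each ESP UUID
# that a WARN line reports as missing, once.
# The pattern matches the WARN wording seen in this thread only.
stale_esp_uuids() {
    sed -n 's|^WARN: /dev/disk/by-uuid/\([^ ]*\) does not exist.*|\1|p' | sort -u
}

# Lines copied from the status output in the session.
stale_esp_uuids <<'EOF'
System currently booted with legacy bios
WARN: /dev/disk/by-uuid/6901-556E does not exist - clean '/etc/kernel/proxmox-boot-uuids'! - skipping
6902-1625 is configured with: grub (versions: 6.14.11-5-pve, 6.17.13-1-pve, 6.17.2-2-pve)
8D2A-BE4D is configured with: grub (versions: 6.14.11-5-pve, 6.17.13-1-pve, 6.17.2-2-pve)
EOF
```

For these lines it prints 6901-556E, the same UUID that proxmox-boot-tool clean reported as "Not found!" in the session.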
 