Replace disk on ZFS with UEFI?

killmasta93

Renowned Member
Aug 13, 2017
Hi,
I was wondering if someone could shed some light on this.
Currently, in my test environment, I'm trying to work out how to replace a disk when the system boots via UEFI.
These are the steps I have so far. We're going to assume that ata-VBOX_HARDDISK_VBd49634a6-155a6124-part3 died.

Code:
root@pve:~# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
    invalid.  Sufficient replicas exist for the pool to continue
    functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
config:

    NAME                                             STATE     READ WRITE CKSUM
    rpool                                            DEGRADED     0     0     0
      mirror-0                                       DEGRADED     0     0     0
        6681090437292701121                          UNAVAIL      0     0     0  was /dev/disk/by-id/ata-VBOX_HARDDISK_VBd49634a6-155a6124-part3
        ata-VBOX_HARDDISK_VB8eb78106-9ccb7531-part3  ONLINE       0     0     0
      mirror-1                                       ONLINE       0     0     0
        ata-VBOX_HARDDISK_VB67720d2e-4bea2bd1-part3  ONLINE       0     0     0
        ata-VBOX_HARDDISK_VB418c327c-62f3b4ce-part3  ONLINE       0     0     0

errors: No known data errors

We turn off the VM and put in the new disk. The healthy mirror disk is

ata-VBOX_HARDDISK_VB8eb78106-9ccb7531

and the new disk is ata-VBOX_HARDDISK_VBd740ad61-d73df170,

so I need to replicate the partition table:

Code:
root@pve:~# sgdisk /dev/disk/by-id/ata-VBOX_HARDDISK_VB8eb78106-9ccb7531 -R /dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170

The operation has completed successfully.


root@pve:~# sgdisk -G /dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170
The operation has completed successfully.
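
As an optional sanity check (not strictly part of the replacement steps), you can print both partition tables and confirm the clone matches the source before touching the pool:

Code:
# print both partition tables; they should match after the -R copy
sgdisk -p /dev/disk/by-id/ata-VBOX_HARDDISK_VB8eb78106-9ccb7531
sgdisk -p /dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170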

Then replace the device in the pool and wait for it to resilver:
Code:
zpool replace -f rpool 6681090437292701121 /dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170-part3
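
The resilver can be watched by simply re-running `zpool status` until the new device shows ONLINE (a generic ZFS check, nothing Proxmox-specific):

Code:
# re-run (or wrap in `watch`) until resilvering finishes
zpool status rpool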

Now the issue: when I go to install the bootloader and run this command, I get this error:

Code:
root@pve:~# pve-efiboot-tool format /dev/sdb2
UUID="BBCC-6520" SIZE="536870912" FSTYPE="vfat" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sdb" MOUNTPOINT=""
E: '/dev/sdb2' contains a filesystem ('vfat') - exiting (use --force to override)


I made sure the VM was using UEFI:

Code:
root@pve:~# efibootmgr -v
BootCurrent: 000A
Timeout: 0 seconds
BootOrder: 000A,0009,0008,0007,0000,0001,0003,0004,0005,0006,0002
Boot0000* UiApp    FvVol(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(462caa21-7614-4503-836e-8ab6f4662331)
Boot0001* UEFI VBOX CD-ROM VB2-01700376     PciRoot(0x0)/Pci(0x1,0x1)/Ata(1,0,0)N.....YM....R,Y.
Boot0002* UEFI VBOX HARDDISK VBd740ad61-d73df170     PciRoot(0x0)/Pci(0xd,0x0)/Sata(0,65535,0)N.....YM....R,Y.
Boot0003* UEFI VBOX HARDDISK VB8eb78106-9ccb7531     PciRoot(0x0)/Pci(0xd,0x0)/Sata(1,65535,0)N.....YM....R,Y.
Boot0004* UEFI VBOX HARDDISK VB67720d2e-4bea2bd1     PciRoot(0x0)/Pci(0xd,0x0)/Sata(2,65535,0)N.....YM....R,Y.
Boot0005* UEFI VBOX HARDDISK VB418c327c-62f3b4ce     PciRoot(0x0)/Pci(0xd,0x0)/Sata(3,65535,0)N.....YM....R,Y.
Boot0006* EFI Internal Shell    FvVol(7cb8bdc9-f8eb-4f34-aaea-3ee4af6516a1)/FvFile(7c04a583-9e3e-4f1c-ad65-e05268d0b4d1)
Boot0007* Linux Boot Manager    HD(2,GPT,fb089919-b25a-4a96-a97a-20a53806839a,0x800,0x100000)/File(\EFI\systemd\systemd-bootx64.efi)
Boot0008* Linux Boot Manager    HD(2,GPT,8771c20b-5ac7-4ef6-badb-7fee992a8408,0x800,0x100000)/File(\EFI\systemd\systemd-bootx64.efi)
Boot0009* Linux Boot Manager    HD(2,GPT,6d73a03d-24dd-4453-a25a-6b2cf5b756e7,0x800,0x100000)/File(\EFI\systemd\systemd-bootx64.efi)
Boot000A* Linux Boot Manager    HD(2,GPT,42b0de79-354e-4193-b147-89db4438aba7,0x800,0x100000)/File(\EFI\systemd\systemd-bootx64.efi)

Thank you
 
UUID="BBCC-6520" SIZE="536870912" FSTYPE="vfat" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sdb" MOUNTPOINT="" E: '/dev/sdb2' contains a filesystem ('vfat') - exiting (use --force to override)
It seems the 'new' disk was used for something beforehand (or the image file of this new disk happens to sit where an older disk's image was), and it already contains a vfat filesystem at the place where you want to create one.

Are you sure that /dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170 is really /dev/sdb?
(`ls -l /dev/disk/by-id` should show this.)
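
For example, something like this will reveal which /dev/sdX node the stable ID resolves to (illustrative, output omitted):

Code:
# the symlink target shows the /dev/sdX device behind the by-id name
ls -l /dev/disk/by-id | grep VBd740ad61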

If you're sure the disk is fresh and you don't need the data on it anymore, you can use the --force parameter.

I hope this helps!
 
Thanks for the reply. Correct, I'm testing in a test environment using VirtualBox with UEFI.

You're right, it was not sdb, so instead I'm using the disk ID so that won't happen again.

But now I'm getting this error. My question is: do I install the bootloader on the disk first, or put it in the pool first? In my case I already put it in the pool.
The new disk is VBOX_HARDDISK VBd740ad61-d73df170 (sda).

Normally the bootloader (here systemd-boot, since the system boots via UEFI) would be installed on part2:

Code:
root@pve:~# lsblk -o name,model,serial
NAME   MODEL         SERIAL
sda    VBOX_HARDDISK VBd740ad61-d73df170
|-sda1               
|-sda2               
`-sda3               
sdb    VBOX_HARDDISK VB8eb78106-9ccb7531
|-sdb1               
|-sdb2               
`-sdb3               
sdc    VBOX_HARDDISK VB67720d2e-4bea2bd1
|-sdc1               
|-sdc2               
`-sdc3               
sdd    VBOX_HARDDISK VB418c327c-62f3b4ce
|-sdd1               
|-sdd2               
`-sdd3
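
For reference: on a standard Proxmox ZFS install, part1 is the small BIOS-boot partition, part2 the 512M ESP, and part3 the ZFS partition (this matches the sizes shown in this thread, but verify on your own system). `lsblk` can show the partition types if you're unsure:

Code:
# show sizes, filesystems and GPT partition types to identify the ESP (part2)
lsblk -o name,size,fstype,parttype /dev/sda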

Code:
root@pve:~# pve-efiboot-tool format /dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170-part2
UUID="3114495972267699916" SIZE="536870912" FSTYPE="zfs_member" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sda" MOUNTPOINT=""
E: '/dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170-part2' contains a filesystem ('zfs_member') - exiting (use --force to override)

Would I just force it?
 
So this is what I tried. I'm guessing it worked, but I'm not sure why I needed to add --force:

Code:
root@pve:~# pve-efiboot-tool format /dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170-part2 --force
UUID="" SIZE="536870912" FSTYPE="" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sda" MOUNTPOINT=""
Formatting '/dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170-part2' as vfat..
mkfs.fat 4.1 (2017-01-24)
Done.

Code:
root@pve:~# pve-efiboot-tool init /dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170-part2
Re-executing '/usr/sbin/pve-efiboot-tool' in new private mount namespace..
UUID="DCE1-9B45" SIZE="536870912" FSTYPE="vfat" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sda" MOUNTPOINT=""
Mounting '/dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170-part2' on '/var/tmp/espmounts/DCE1-9B45'.
Installing systemd-boot..
Created "/var/tmp/espmounts/DCE1-9B45/EFI/systemd".
Created "/var/tmp/espmounts/DCE1-9B45/EFI/BOOT".
Created "/var/tmp/espmounts/DCE1-9B45/loader".
Created "/var/tmp/espmounts/DCE1-9B45/loader/entries".
Copied "/usr/lib/systemd/boot/efi/systemd-bootx64.efi" to "/var/tmp/espmounts/DCE1-9B45/EFI/systemd/systemd-bootx64.efi".
Copied "/usr/lib/systemd/boot/efi/systemd-bootx64.efi" to "/var/tmp/espmounts/DCE1-9B45/EFI/BOOT/BOOTX64.EFI".
Created EFI boot entry "Linux Boot Manager".
Configuring systemd-boot..
Unmounting '/dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170-part2'.
Adding '/dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170-part2' to list of synced ESPs..
Refreshing kernels and initrds..
Running hook script 'proxmox-auto-removal'..
Running hook script 'zz-proxmox-boot'..
WARN: /dev/disk/by-uuid/BBC9-E27D does not exist - clean '/etc/kernel/proxmox-boot-uuids'! - skipping
Copying and configuring kernels on /dev/disk/by-uuid/BBCC-6520
    Copying kernel and creating boot-entry for 5.4.106-1-pve
Copying and configuring kernels on /dev/disk/by-uuid/BBCE-7E54
    Copying kernel and creating boot-entry for 5.4.106-1-pve
Copying and configuring kernels on /dev/disk/by-uuid/BBD1-50C3
    Copying kernel and creating boot-entry for 5.4.106-1-pve
Copying and configuring kernels on /dev/disk/by-uuid/DCE1-9B45
    Copying kernel and creating boot-entry for 5.4.106-1-pve
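
(Side note: the kernel/initrd copy step that ran at the end of init can also be triggered on its own later, using the tool's current name:)

Code:
# re-syncs kernels and initrds to every ESP listed in
# /etc/kernel/proxmox-boot-uuids
proxmox-boot-tool refresh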

Then I verified the boot status and removed the old disk's entry from /etc/kernel/proxmox-boot-uuids with nano:

Code:
root@pve:~# proxmox-boot-tool status
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
WARN: /dev/disk/by-uuid/BBC9-E27D does not exist - clean '/etc/kernel/proxmox-boot-uuids'! - skipping
BBCC-6520 is configured with: uefi
BBCE-7E54 is configured with: uefi
BBD1-50C3 is configured with: uefi
DCE1-9B45 is configured with: uefi


After removing it I got this:

Code:
root@pve:~# proxmox-boot-tool status
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
BBCC-6520 is configured with: uefi
BBCE-7E54 is configured with: uefi
BBD1-50C3 is configured with: uefi
DCE1-9B45 is configured with: uefi

I'm going to assume it worked?

 
Then I verified the boot status and removed the old disk's entry from /etc/kernel/proxmox-boot-uuids with nano
You can also use `proxmox-boot-tool clean` for this :)
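
That avoids the manual edit entirely: it drops any UUID from /etc/kernel/proxmox-boot-uuids whose /dev/disk/by-uuid entry no longer exists:

Code:
# removes stale ESP entries from /etc/kernel/proxmox-boot-uuids
proxmox-boot-tool clean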
Would I just force it?
In this case `--force` is needed because the metadata for the partition is wrong / reported wrongly by `lsblk`.
But as usual: always verify that you don't need the data on the partition (mount it and check with `ls`, check the partition size in `lsblk`).
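
A quick, non-destructive way to see what is actually on the partition before forcing (generic Linux tools, not Proxmox-specific):

Code:
# list filesystem signatures without erasing anything
wipefs --no-act /dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170-part2
blkid /dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170-part2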
BBCC-6520 is configured with: uefi
Usually this should also print the kernel versions copied there. Could you mount one of the partitions and paste the output of find on the mountpoint:

Code:
mkdir /mnt/temp
mount /dev/disk/by-id/ata-VBOX_HARDDISK_VBd740ad61-d73df170-part2 /mnt/temp
find /mnt/temp
umount /mnt/temp

I'm going to assume it worked?
Especially since this is a test environment in VirtualBox, you can always just try it: reboot and see if the system comes back up again.
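
After the reboot you can also double-check which entry actually booted (standard efibootmgr usage):

Code:
# match the BootCurrent number against the entry list from `efibootmgr -v`;
# it should point at one of the 'Linux Boot Manager' entries
efibootmgr | grep BootCurrent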
 
Thanks for the reply. Correct, I rebooted and it works with no problem.

This is the outcome:

Code:
root@pve:~# find /mnt/temp
/mnt/temp
/mnt/temp/EFI
/mnt/temp/EFI/proxmox
/mnt/temp/EFI/proxmox/5.4.106-1-pve
/mnt/temp/EFI/proxmox/5.4.106-1-pve/vmlinuz-5.4.106-1-pve
/mnt/temp/EFI/proxmox/5.4.106-1-pve/initrd.img-5.4.106-1-pve
/mnt/temp/EFI/systemd
/mnt/temp/EFI/systemd/systemd-bootx64.efi
/mnt/temp/EFI/BOOT
/mnt/temp/EFI/BOOT/BOOTX64.EFI
/mnt/temp/loader
/mnt/temp/loader/entries
/mnt/temp/loader/entries/proxmox-5.4.106-1-pve.conf
/mnt/temp/loader/loader.conf
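
For completeness, the tool can also list which kernels it will keep synced to the ESPs (assuming a current proxmox-boot-tool):

Code:
# kernels that will be copied to the ESPs on the next refresh
proxmox-boot-tool kernel list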
 
Thanks, this post helped me test my disk replacement on a healthy rpool. I wish the Proxmox wiki were more detailed and gave an example like this instead of just 3 lines of commands, because I wasn't sure where to start given that my install boots off the rpool.
 
