Replacing bad disk on ZFS rpool uefi?

killmasta93

Hi,
Has anyone replaced a bad disk in a UEFI-booted ZFS rpool, where the failed disk was the one that had GRUB installed?
These are the steps I normally take, but I'm not sure whether they change on UEFI.
In this example the new disk is sda and the healthy mirror disk is sdh.

Code:
sgdisk --replicate=/dev/sda /dev/sdh
sgdisk --randomize-guids /dev/sda
grub-install /dev/sda
zpool replace -f rpool 13843703100769424298 /dev/sda2


Thank you
 
Thanks for the reply.
This is what I did:

Code:
sgdisk /dev/disk/by-id/scsi-35000c5003a81d367 -R /dev/disk/by-id/scsi-35000c5003a7dab73
sgdisk -G /dev/disk/by-id/scsi-35000c5003a7dab73
zpool replace -f rpool 8707294139048458014  /dev/disk/by-id/scsi-35000c5003a7dab73-part3

But when I then run pve-efiboot-tool init on each partition of the new disk, I get these errors:

Code:
root@prometheus:~# pve-efiboot-tool init /dev/sdb2
Re-executing '/usr/sbin/pve-efiboot-tool' in new private mount namespace..
UUID="" SIZE="536870912" FSTYPE="" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sdb" MOUNTPOINT=""
E: '/dev/sdb2' has wrong filesystem (!= vfat).
root@prometheus:~# pve-efiboot-tool init /dev/sdb1
Re-executing '/usr/sbin/pve-efiboot-tool' in new private mount namespace..
UUID="" SIZE="1031168" FSTYPE="" PARTTYPE="21686148-6449-6e6f-744e-656564454649" PKNAME="sdb" MOUNTPOINT=""
E: '/dev/sdb1' is too small (<256M).
root@prometheus:~# pve-efiboot-tool init /dev/sdb3
Re-executing '/usr/sbin/pve-efiboot-tool' in new private mount namespace..
UUID="" SIZE="299462063616" FSTYPE="" PARTTYPE="6a898cc3-1dd2-11b2-99a6-080020736631" PKNAME="sdb" MOUNTPOINT=""
E: '/dev/sdb3' has wrong partition type (!= c12a7328-f81f-11d2-ba4b-00a0c93ec93b).


What's odd is that I checked the partition tables and both disks look correct (sdb is the new disk):

Code:
Disk /dev/sdb: 279.4 GiB, 300000000000 bytes, 585937500 sectors
Disk model: ST9300603SS    
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 59DE43EB-FB58-4782-9B45-D7E8E7F4ED51

Device       Start       End   Sectors   Size Type
/dev/sdb1       34      2047      2014  1007K BIOS boot
/dev/sdb2     2048   1050623   1048576   512M EFI System
/dev/sdb3  1050624 585937466 584886843 278.9G Solaris /usr & Apple ZFS


Disk /dev/sde: 279.4 GiB, 300000000000 bytes, 585937500 sectors
Disk model: ST9300603SS    
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 12866477-26E8-4554-A22F-EAD9583AA7C6

Device       Start       End   Sectors   Size Type
/dev/sde1       34      2047      2014  1007K BIOS boot
/dev/sde2     2048   1050623   1048576   512M EFI System
/dev/sde3  1050624 585937466 584886843 278.9G Solaris /usr & Apple ZFS
 
I also tried installing ntfs-3g and got the output below. Should I ignore the warning at the end, or is it normal?

Code:
root@prometheus:~# apt-get install ntfs-3g
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  libntfs-3g883
The following NEW packages will be installed:
  libntfs-3g883 ntfs-3g
0 upgraded, 2 newly installed, 0 to remove and 15 not upgraded.
Need to get 574 kB of archives.
After this operation, 1,888 kB of additional disk space will be used.
Do you want to continue? [Y/n]
Get:1 http://ftp.debian.org/debian buster/main amd64 libntfs-3g883 amd64 1:2017.3.23AR.3-3 [167 kB]
Get:2 http://ftp.debian.org/debian buster/main amd64 ntfs-3g amd64 1:2017.3.23AR.3-3 [407 kB]
Fetched 574 kB in 2s (326 kB/s) 
Selecting previously unselected package libntfs-3g883.
(Reading database ... 45253 files and directories currently installed.)
Preparing to unpack .../libntfs-3g883_1%3a2017.3.23AR.3-3_amd64.deb ...
Unpacking libntfs-3g883 (1:2017.3.23AR.3-3) ...
Selecting previously unselected package ntfs-3g.
Preparing to unpack .../ntfs-3g_1%3a2017.3.23AR.3-3_amd64.deb ...
Unpacking ntfs-3g (1:2017.3.23AR.3-3) ...
Setting up libntfs-3g883 (1:2017.3.23AR.3-3) ...
Setting up ntfs-3g (1:2017.3.23AR.3-3) ...
Processing triggers for libc-bin (2.28-10) ...
Processing triggers for man-db (2.8.5-2) ...
Processing triggers for initramfs-tools (0.133+deb10u1) ...
update-initramfs: Generating /boot/initrd.img-5.4.34-1-pve
Running hook script 'zz-pve-efiboot'..
Re-executing '/etc/kernel/postinst.d/zz-pve-efiboot' in new private mount namespace..
Copying and configuring kernels on /dev/disk/by-uuid/5BDF-28DB
    Copying kernel and creating boot-entry for 5.4.34-1-pve
WARN: /dev/disk/by-uuid/5BE1-0E5A does not exist - clean '/etc/kernel/pve-efiboot-uuids'! - skipping
 
E: '/dev/sdb2' has wrong filesystem (!= vfat).

I mean, did you format it? Having an EFI partition is one thing, having a valid filesystem on it another:
pve-efiboot-tool format /dev/sdb2
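
The three errors earlier in the thread suggest the order in which a partition gets rejected: size first, then partition type, then filesystem. Here is a minimal sketch of that logic, inferred only from the messages posted here. It is not the tool's actual source, and `classify_esp` is a made-up name. It takes one `KEY="value"` line as printed in the output above and reports which check would fail:

```shell
# Hypothetical helper: mimics the checks implied by the error messages above.
# $1 is one KEY="value" line as printed in the tool's output.
classify_esp() {
    local UUID SIZE FSTYPE PARTTYPE PKNAME MOUNTPOINT
    # The line is treated as trusted input; eval turns it into shell variables.
    eval "$1"
    local esp_type="c12a7328-f81f-11d2-ba4b-00a0c93ec93b"
    local min_size=$((256 * 1024 * 1024))

    if [ "${SIZE:-0}" -lt "$min_size" ]; then
        echo "too small (<256M)"
    elif [ "$PARTTYPE" != "$esp_type" ]; then
        echo "wrong partition type"
    elif [ "$FSTYPE" != "vfat" ]; then
        echo "wrong filesystem (run format first)"
    else
        echo "ok"
    fi
}

# The /dev/sdb2 line from the first error in this thread:
classify_esp 'UUID="" SIZE="536870912" FSTYPE="" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sdb" MOUNTPOINT=""'
```

Read this way, /dev/sdb2 already has the right size and type, and only the empty FSTYPE blocks `init`, which is why formatting comes first.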

WARN: /dev/disk/by-uuid/5BE1-0E5A does not exist - clean '/etc/kernel/pve-efiboot-uuids'! - skipping

Yes, that's your old drive which is now gone. You can drop that mentioned UUID from that file, not sure if we have a command for that (no access to a PVE installation currently to look)
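
If you want to drop the stale entry by hand, one possible approach is to keep only the UUIDs that still have a device link. This is a sketch, not an official procedure: `list_present_uuids` is a hypothetical helper, and it assumes the file holds one UUID per line, as the warning implies.

```shell
# Hypothetical helper: print only the UUIDs that still have a device link.
# $1: UUID list file, $2: directory of by-uuid links (defaults to the real one).
list_present_uuids() {
    uuid_file=$1
    by_uuid_dir=${2:-/dev/disk/by-uuid}
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r uuid || [ -n "$uuid" ]; do
        [ -n "$uuid" ] || continue
        if [ -e "$by_uuid_dir/$uuid" ]; then
            printf '%s\n' "$uuid"
        fi
    done < "$uuid_file"
}

# Review the result before replacing the real file, for example:
#   list_present_uuids /etc/kernel/pve-efiboot-uuids > /tmp/uuids.new
```

Writing to a separate file first leaves the original untouched until you have checked that the remaining UUIDs are the ones you expect.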
 
Thanks for the reply. The disk I added was clean; after that I ran what the wiki says:


sgdisk /dev/disk/by-id/scsi-35000c5003a81d367 -R /dev/disk/by-id/scsi-35000c5003a7dab73
sgdisk -G /dev/disk/by-id/scsi-35000c5003a7dab73
zpool replace -f rpool 8707294139048458014 /dev/disk/by-id/scsi-35000c5003a7dab73-part3
But running the last two commands from the wiki gives the error above, even though the partition is there. Should I run pve-efiboot-tool before running sgdisk?

This is the wiki
https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#_zfs_administration

Changing a failed bootable device when using systemd-boot
# sgdisk <healthy bootable device> -R <new device>
# sgdisk -G <new device>
# zpool replace -f <pool> <old zfs partition> <new zfs partition>
# pve-efiboot-tool format <new disk's ESP>
# pve-efiboot-tool init <new disk's ESP>
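
To double-check the order and the device names before touching a production box, the wiki steps can be printed as a dry run. This is only a sketch: `print_replace_plan` is a hypothetical helper that runs nothing, and it assumes `/dev/disk/by-id/` names with the layout shown in this thread (ESP on `-part2`, ZFS on `-part3`).

```shell
# Hypothetical helper: print (do not run) the wiki's replacement steps.
# $1: healthy disk, $2: new disk, $3: pool name, $4: old vdev name or GUID.
# Assumes by-id style names, ESP on -part2 and ZFS on -part3.
print_replace_plan() {
    healthy=$1
    new=$2
    pool=$3
    old_vdev=$4
    cat <<EOF
sgdisk $healthy -R $new
sgdisk -G $new
zpool replace -f $pool $old_vdev $new-part3
pve-efiboot-tool format $new-part2
pve-efiboot-tool init $new-part2
EOF
}

# Values taken from the commands posted earlier in this thread:
print_replace_plan \
    /dev/disk/by-id/scsi-35000c5003a81d367 \
    /dev/disk/by-id/scsi-35000c5003a7dab73 \
    rpool 8707294139048458014
```

Note that the healthy disk comes first and the new disk follows `-R`; swapping them would copy the blank table onto the good disk.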

Thank you
 
Thanks for the reply. I tried this on a test server before doing it in production, just to make sure it is safe. Because the disks already had the partitions but not the boot loader, I re-ran these commands:

# pve-efiboot-tool format /dev/sd2 --force
# pve-efiboot-tool init /dev/sd2 --force

I then removed the other good disks from the test machine and it did boot, but I want to make sure I won't break the main server.

Thank you
 
I've tried the above but I still get the same error message:

Code:
# pve-efiboot-tool format /dev/sda2 --force
UUID="" SIZE="536870912" FSTYPE="" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sda" MOUNTPOINT=""
Formatting '/dev/sda2' as vfat..
mkfs.fat 4.2 (2021-01-31)
Done.
# pve-efiboot-tool init /dev/sda2 --force
Re-executing '/usr/sbin/pve-efiboot-tool' in new private mount namespace..
UUID="" SIZE="536870912" FSTYPE="" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sda" MOUNTPOINT=""
E: '/dev/sda2' has wrong filesystem (!= vfat).

What else can I do?
 
Maybe you can advise me too? Does anyone have an idea how to get out of this situation?

Partition 2 is correctly formatted and the right EFI type. (re-applied it with fdisk just to be sure...)

run: proxmox-boot-tool format /dev/nvme0n1p2
All ok here.
But... init wont work.

Disk /dev/nvme0n1: 3.64 TiB, 4000787030016 bytes, 7814037168 sectors
Disk model: Corsair MP600 PRO XT
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: C3ABADBC-30C1-4C5A-8461-11CD126BA1CD

Device              Start        End    Sectors   Size Type
/dev/nvme0n1p1         34       2047       2014  1007K BIOS boot
/dev/nvme0n1p2       2048    1050623    1048576   512M EFI System
/dev/nvme0n1p3    1050624 1048576000 1047525377 499.5G Solaris /usr & Apple ZFS
/dev/nvme0n1p4 1048578048 7814037134 6765459087   3.2T Solaris /usr & Apple ZFS

root@ubuntu:/# proxmox-boot-tool init /dev/nvme0n1p2
Re-executing '/usr/sbin/proxmox-boot-tool' in new private mount namespace..
UUID="" SIZE="536870912" FSTYPE="" PARTTYPE="" PKNAME="nvme0n1" MOUNTPOINT=""
E: '/dev/nvme0n1p2' has wrong partition type (!= c12a7328-f81f-11d2-ba4b-00a0c93ec93b).

"E: '/dev/nvme0n1p2' has wrong partition type (!= c12a7328-f81f-11d2-ba4b-00a0c93ec93b)."
I can't find anything about this error, and "c12a7328-f81f-11d2-ba4b-00a0c93ec93b" seems to be the EFI System partition type ID.
 
Try rebooting to see if that fixes it. My problem was almost the same, except that it was FSTYPE that appeared empty when it wasn't; in your output it is PARTTYPE.
Let us know how it goes. My guess is that some cached information is not being updated as it should, and a reboot refreshes it so that the next attempt works.
 
Nope. I also tried changing it manually, but nothing changed.
 
