[SOLVED] how do i replace a hard drive in a healthy ZFS raid?

Read my reply. I did the basic commands in fdisk.
Be sure of the drive's (HDD) designation (i.e. sda, sdb...), then do the basic fdisk commands.
Good luck :)
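
For example (nothing here is specific to your box; these are just standard lsblk fields), you can match the sdX letters to drive models and serial numbers before running fdisk on anything:
Bash:
lsblk -o NAME,SIZE,MODEL,SERIAL
ls -l /dev/disk/by-id/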
I have read it already. This is not working for me because of a different partition layout. As you can see, there are three partitions and ZFS is on part3.

I have to set up the partitions on the new disk the same way as on the old HD and the other HDs in the RAID configuration.
 
In my situation (after the HDD was damaged and I put in the new one), this solved my problem:

[screenshot attachment]
In your case, I don't know what you have. Maybe you should try typing in the console:
Bash:
sudo zpool status

and read "action", what do you have to do.

[screenshot attachment]
 
Hi,
I have a similar situation. I installed Proxmox with ZFS via the installer. Now I have to replace a disk (/dev/sdd). How can I add the new disk with all the partitions safely? Do I have to partition it manually before zpool replace?
The old disk is no longer in the system. Do I still need zpool replace?
Thanks.

[attachment 43028]

Theoretical part
As a general rule, sfdisk is used to manage MBR partition tables and sgdisk is used to manage GPT partition tables. Admittedly though,
UEFI should fall back to using /efi/boot/bootx64.efi (more accurately /efi/boot/boot{machine type short-name}.efi)
when the partition IDs change.

Keep in mind that if the failed disk is part of a pool you want to boot from, you need to update grub or refresh the UEFI boot entries
with proxmox-boot-tool, and copy the partition layout from a healthy disk to the new one.
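
As a quick, hedged way to see which situation you are in (/dev/sda is a placeholder for a healthy pool member), you can check whether the box booted via UEFI and whether the disk label is GPT or MBR:
Bash:
ls /sys/firmware/efi >/dev/null 2>&1 && echo "booted via UEFI" || echo "booted via legacy BIOS"
parted /dev/sda print | grep -i "partition table"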

Changing a failed device
# zpool replace -f <pool> <old device> <new device>
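
A minimal sketch with made-up names (the pool "tank" and the by-id paths are placeholders, not taken from this thread):
# zpool replace -f tank /dev/disk/by-id/ata-OLD_DISK_SERIAL /dev/disk/by-id/ata-NEW_DISK_SERIAL
# zpool status -v tank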

Changing a failed bootable device
Depending on how Proxmox VE was installed it is either using proxmox-boot-tool [1] or plain grub as bootloader.
You can check by running:
# proxmox-boot-tool status

The first steps of copying the partition table, reissuing GUIDs and replacing the ZFS partition are the same.
To make the system bootable from the new disk, different steps are needed which depend on the bootloader in use.
# sgdisk <healthy bootable device> -R <new device in the /dev/disk/by-id/ format>
In theory, the zpool replace command will create the appropriate partitions automatically, but copying the layout as shown above may still be needed (especially for bootable disks).
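
A sketch of the copy step with placeholder by-id names (ata-HEALTHY_SERIAL / ata-NEW_SERIAL are not real IDs), plus a print to confirm the table arrived on the new disk:
# sgdisk /dev/disk/by-id/ata-HEALTHY_SERIAL -R /dev/disk/by-id/ata-NEW_SERIAL
# sgdisk -p /dev/disk/by-id/ata-NEW_SERIAL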

But if you need to create the same partition table as on the healthy drive, do one of the following:

Create an empty GPT partition table on the new HDD with parted:
parted /dev/new-disk
(parted) print
(parted) mklabel gpt
Yes   (confirm if it asks to destroy the existing disk label)
(parted) quit
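
The same thing non-interactively, if you prefer (/dev/new-disk is still a placeholder; -s is parted's script mode):
parted -s /dev/new-disk mklabel gpt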

OR

Copy the partition table from a mirror member to the new one
newDisk='/dev/sda'
healthyDisk='/dev/sdb'
sgdisk -R "$newDisk" "$healthyDisk"
sgdisk -G "$newDisk"
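
A quick sanity check afterwards, comparing the layout of both members (same placeholder variables as above):
lsblk -o NAME,SIZE,TYPE "$healthyDisk" "$newDisk"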



# sgdisk -G <new device in the /dev/disk/by-id/ format>
# zpool replace -f <pool> <old disk partition, probably part3> <new disk partition, probably part3>
Note: Use the zpool status -v command to monitor how far the resilvering process of the new disk has progressed.
Afterwards with proxmox-boot-tool:
# proxmox-boot-tool format <new disk's ESP>
# proxmox-boot-tool init <new disk's ESP>
Note: ESP stands for EFI System Partition, which is set up as partition #2 on bootable disks set up by the Proxmox VE installer since version 5.4.
For details, see Setting up a new partition for use as synced ESP.
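
Put together, the UEFI branch looks roughly like this (ata-NEWDISK is a placeholder; -part2 is the ESP on disks partitioned by the installer):
# proxmox-boot-tool format /dev/disk/by-id/ata-NEWDISK-part2
# proxmox-boot-tool init /dev/disk/by-id/ata-NEWDISK-part2
# proxmox-boot-tool status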

With grub:
# grub-install <new disk>

Practical part
Remove the failed disk and insert a new one (my scenario: ZFS mirror configuration, booted in UEFI mode).
The new disk is CT250MX500SSD1_2028E2B4....

General rule of replacement
# sgdisk <healthy bootable device> -R <new device>
# sgdisk -G <new device>
# zpool replace -f <pool> <old zfs partition> <new zfs partition>
# Use the zpool status -v command to monitor how far the resilvering process of the new disk has progressed.
# proxmox-boot-tool format <new disk's ESP> (ESP stands for EFI System Partition, which is set up as partition #2 on bootable disks
set up by the Proxmox VE installer since version 5.4)
# proxmox-boot-tool init <new disk's ESP>

sgdisk /dev/disk/by-id/ata-CT250MX500SSD1_2028C456.... -R /dev/disk/by-id/ata-CT250MX500SSD1_2028E2B4.... (healthy disk first, then -R and the new disk)

lsblk (check afterwards that both disks have the same number of partitions, e.g. sda/sda1,sda2,sda3 and sdb/sdb1,sdb2,sdb3)

sgdisk -G /dev/disk/by-id/ata-CT250MX500SSD1_2028E2B4.... (-G = randomize the disk and partition GUIDs so they don't clash with the copied ones)

zpool replace -f rpool 872817340134134.... /dev/disk/by-id/ata-CT250MX500SSD1_2028E2B4....-part3

pve-efiboot-tool format /dev/disk/by-id/ata-CT250MX500SSD1_2028E2B4.....-part2 --force (otherwise you get warning messages about being a member of a ZFS filesystem)

pve-efiboot-tool init /dev/disk/by-id/ata-CT250MX500SSD1_2028E2B4....-part2

## ls -l /dev/disk/by-id/* if you want to see the id of the disks
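
Once the resilver has finished, a couple of harmless read-only checks (rpool is assumed from the installer default):
zpool status rpool
proxmox-boot-tool status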

Check and post your results
 
I think you have the arguments to sgdisk in the wrong order; I think it is healthy disk then new disk.

Copy the partition table from a mirror member to the new one
newDisk='/dev/sda'
healthyDisk='/dev/sdb'
sgdisk -R "$newDisk" "$healthyDisk"
sgdisk -G "$newDisk"

You have this earlier, which I think is correct:
To make the system bootable from the new disk, different steps are needed which depend on the bootloader in use.
# sgdisk <healthy bootable device> -R <new device in the /dev/disk/by-id/ format>
 
Thank you ieronymous, I could not find this information anywhere else. Here is an example following your model.

New Proxmox server loaded with 8.0 (now updated to 8.1). Used three 4 TB hard drives in raidz1. OS and Proxmox are on the ZFS drives.

Here is the zpool status before error

  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 02:47:24 with 0 errors on Sun Apr 14 03:11:26 2024
config:

        NAME                                                STATE     READ WRITE CKSUM
        rpool                                               ONLINE       0     0     0
          raidz1-0                                          ONLINE       0     0     0
            ata-WDC_WD40EZAZ-00SF3B0_WD-WXC2D53NE639-part3  ONLINE       0     0     0
            ata-WDC_WD40EZAZ-00SF3B0_WD-WXD2D53E41D6-part3  ONLINE       0     0     0
            ata-WDC_WD40EZAZ-00SF3B0_WD-WXN2D53MR06D-part3  ONLINE       0     0     0

errors: No known data errors


Then the disk ending in R06D failed

Put in a new 4 TB hard drive, but how do I handle the zpool replace with the "-part3" partition when the new drive does not have any partitions on it?

Following the example from ieronymous

proxmox-boot-tool status - confirm the system is really booting via UEFI

fdisk -l - find the good drives still in the pool
In my case, they are /dev/sda and /dev/sdb
My new drive has id ending in S8XE

sgdisk /dev/sda -R /dev/disk/by-id/ata-WDC_WD40EFPX-68C6CN0_WD-WXF2D539S8XE

ls /dev/disk/by-id/
you will see that the new drive has been partitioned as needed.

sgdisk -G /dev/disk/by-id/ata-WDC_WD40EFPX-68C6CN0_WD-WXF2D539S8XE

zpool replace -f rpool ata-WDC_WD40EZAZ-00SF3B0_WD-WXN2D53MR06D-part3 ata-WDC_WD40EFPX-68C6CN0_WD-WXF2D539S8XE-part3

zpool status
monitor the resilver operation; mine took several hours

After resilver done:
proxmox-boot-tool format /dev/disk/by-id/ata-WDC_WD40EFPX-68C6CN0_WD-WXF2D539S8XE-part2
proxmox-boot-tool init /dev/disk/by-id/ata-WDC_WD40EFPX-68C6CN0_WD-WXF2D539S8XE-part2
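
If you want to double-check that the new ESP was picked up (efibootmgr is only meaningful on UEFI systems; both commands are read-only):
proxmox-boot-tool status
efibootmgr -v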
 
