ZFS Disk replacement

qgrasso

Hi All,

We've had a disk fail in our ZFS rpool, and we're looking for the procedure to replace it.

So far we've found a couple of wikis; however, I thought I'd run it by you guys and see if it's still correct.

We are running Proxmox VE 4.4.

1) Replace the physical failed/offline drive, /dev/sdc

Initialize Disk
2) From the WebUI, Servername -> Disks -> Initialize Disk with GPT (/dev/sdc)

Copy the partition table from /dev/sda to /dev/sdc
3) sgdisk --replicate=/dev/sdc /dev/sda

Ensure the GUIDs are randomized
4) sgdisk --randomize-guids /dev/sdc

Install GRUB on the new disk
5) grub-install /dev/sdc

Then replace the disk in the ZFS pool,
6) zpool replace rpool /dev/sdc2

Job Done?

Cheers,
Q
 
I should have found this thread earlier:
I had the same situation (one disk in my rpool was dying) but I replaced the disk without step 2 (initialize disk with GPT from the WebUI).
I directly cloned the partition table with sgdisk, randomized the guids and did a "zpool replace rpool old new". The resilvering is still going.
Now I did a "grub-install /dev/sdb" and I get the following message:
grub-install: Error: unknown filesystem.

Do I need to manually format /dev/sdb1 in this case?

Kind regards

Andreas
 
I just noticed that /boot/ is just a folder on my root zfs volume, so there's no special partition for grub.
Maybe my grub-install is not aware of ZFS?
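One way to narrow that down (just a diagnostic suggestion, not from the wiki; grub-probe ships with the GRUB packages) is to ask GRUB which filesystem it detects behind /boot:

Code:
# Ask GRUB what filesystem it sees under /boot; on ZFS-on-root this should print "zfs"
grub-probe /boot
# An "unknown filesystem" error here points at the same problem grub-install is hitting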
 
I replaced disks in PVE 5.x without step 2) and it always worked.
I also tested rebooting with only the replaced drive in.

I did not get a grub-install "unknown filesystem" error.
I do not see a boot partition on my servers.
Maybe there is something wrong with the partitions or filesystems on your new disk.
Check the partition table and 'zpool list'.

Code:
root@p26:~# fdisk /dev/sda

Welcome to fdisk (util-linux 2.29.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): p
Disk /dev/sda: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 05DB9387-8561-4E6C-84D9-0A2DCDEBDA22

Device          Start        End    Sectors  Size Type
/dev/sda1          34       2047       2014 1007K BIOS boot
/dev/sda2        2048 3907012749 3907010702  1.8T Solaris /usr & Apple ZFS
/dev/sda9  3907012750 3907029134      16385    8M Solaris reserved 1
 
Apart from the disk size, my layout looks exactly like yours:
Code:
# fdisk /dev/sdb

Welcome to fdisk (util-linux 2.25.2).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): p
Disk /dev/sdb: 5,5 TiB, 6001175126016 bytes, 11721045168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 664936B4-76C3-4A69-8D9C-55AB161BA5F2

Device           Start         End     Sectors  Size Type
/dev/sdb1           34        2047        2014 1007K BIOS boot
/dev/sdb2         2048 11721028749 11721026702  5,5T Solaris /usr & Apple ZFS
/dev/sdb9  11721028750 11721045134       16385    8M Solaris reserved 1

I'm using raidz1, so I can't boot from only one disk.
My system boots fine, but I want to make sure that if sda fails, it can still boot, even after I have replaced one drive in my pool.

So to sum this up: you also did a 'grub-install /dev/sdX' on a replaced drive and got a success message, not the "unknown filesystem" error?
 
Remove sda and test booting; RAIDZ1 works fine with one disk missing.
If I had seen an "unknown filesystem" error, I think I would remember it.
Just to be sure, I redid grub-install on sdb just now, just for you. Here is the result:
Code:
root@p26:~# zpool status
  pool: rpool
 state: ONLINE
  scan: resilvered 871M in 0h0m with 0 errors on Mon May 21 15:08:39 2018
config:

   NAME        STATE     READ WRITE CKSUM
   rpool       ONLINE       0     0     0
     mirror-0  ONLINE       0     0     0
       sda2    ONLINE       0     0     0
       sdb2    ONLINE       0     0     0
   logs
     sdc1      ONLINE       0     0     0
     sdd1      ONLINE       0     0     0
   cache
     sdc2      ONLINE       0     0     0
     sdd2      ONLINE       0     0     0

errors: No known data errors
root@p26:~# fdisk -l /dev/sdb
Disk /dev/sdb: 1.8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 93E30F40-2653-4716-9FD2-0FE0D65EB15D

Device          Start        End    Sectors  Size Type
/dev/sdb1          34       2047       2014 1007K BIOS boot
/dev/sdb2        2048 3907012749 3907010702  1.8T Solaris /usr & Apple ZFS
/dev/sdb9  3907012750 3907029134      16385    8M Solaris reserved 1
root@p26:~# grub-install /dev/sdb
Installing for i386-pc platform.
Installation finished. No error reported.
root@p26:~#
 
Thank you very much!
I'm still not sure why grub-install fails for me, but I can live with that since the system is running fine for now.
I'll dig into it when I have a scheduled downtime.
 
On a side note: try not to use /dev/sdX names; use /dev/disk/by-id instead. Trust me, I had a server reboot and the ZFS pool went nuts.
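For example, the stable names can be looked up and used directly for the replace (the by-id name below is only a placeholder, substitute your own):

Code:
# See which /dev/sdX each stable by-id name points to
ls -l /dev/disk/by-id/
# Replace using the by-id path instead of /dev/sdX (example id is made up)
zpool replace rpool /dev/sdc2 /dev/disk/by-id/ata-EXAMPLE_MODEL_SERIAL-part2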
 

Did you ever get grub-install /dev/sdb to work? Or anyone else?

I just went through these steps on 2 separate machines. On the 1st machine every step worked perfectly. On the 2nd machine, at the "grub-install /dev/sdb" step, I get "grub-install: error: cannot find EFI directory." and I don't understand why.

I think it has something to do with the 2nd machine booting in UEFI mode instead of legacy mode, but I don't understand it; maybe I don't even need the "grub-install /dev/sdb" step on that machine. Obviously I just want to make sure it boots regardless of which of the 4 ZFS disks fails in the future.
 


Most likely you boot via EFI and your partition layout has 3 parts (1, 2, 3); number 2 is the EFI partition.

Try:
pve-efiboot-tool format <device disk-by-id-part2>
pve-efiboot-tool init <device disk-by-id-part2>


Or better, try:
pve-efiboot-tool format /dev/sdX2 --force
pve-efiboot-tool init /dev/sdX2

Source: https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysboot
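To find the -part2 path for those commands, something like this works (device names are just examples):

Code:
# The ~512M vfat partition is the EFI system partition
lsblk -o NAME,SIZE,FSTYPE
# Look up the matching stable name for the new disk's second partition
ls -l /dev/disk/by-id/ | grep part2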
 
I was just trying the same thing (removing a disk and adding it again to make sure the system boots from every drive).
My system is using EFI.
I tried pve-efiboot-tool, but it returns an error:
root@test:~# pve-efiboot-tool format /dev/sdb2
UUID="20A2-29D7" SIZE="536870912" FSTYPE="vfat" PARTTYPE="c12a7328-f81f-11d2-ba4b-00a0c93ec93b" PKNAME="sdb" MOUNTPOINT=""
E: '/dev/sdb2' contains a filesystem ('vfat') - exiting (use --force to override)

root@test:~# fdisk -l /dev/sdb
Disk /dev/sdb: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Disk model: WDC WD10EACS-22D
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 274BF179-F089-441F-AD75-365E8315D722

Device        Start        End    Sectors  Size Type
/dev/sdb1        34       2047       2014 1007K BIOS boot
/dev/sdb2      2048    1050623    1048576  512M EFI System
/dev/sdb3   1050624 1953525134 1952474511  931G Solaris /usr & Apple ZFS


Any further ideas?
Thanks!
 
Oh geez, I messed up in my previous comment. What I actually did on my system when replacing the second mirrored ZFS disk was:


pve-efiboot-tool format /dev/sdb2 --force
pve-efiboot-tool init /dev/sdb2

Mind you, this worked fine, but I haven't actually tested it like you are doing, by removing one of the drives... Too much to do ;-)

Anyway, try the --force option. Whatever happens, you will only kill one disk ;-)
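If you want a quick sanity check afterwards without pulling a drive, something like this should do (a sketch; same device name as above):

Code:
# The ESP should now show up as a vfat filesystem
lsblk -o NAME,SIZE,FSTYPE /dev/sdb2
# Mount it briefly and look at what pve-efiboot-tool copied onto it
mount /dev/sdb2 /mnt && ls /mnt && umount /mnt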
 
My full personal manual for reference is as follows:

######### Where sda is the healthy bootable device and sdb is the new device ###########

Steps 1-2 achieve basically the same thing as step 3. You can add steps 1-2 if a previous attempt has failed, just to give it a try.

Step 3 can be done in two ways; the sgdisk man page itself is not very clear. From what I have found on the internet, 3.1 and 3.2 give the same result, but the syntax is easy to mix up between the two, so tread carefully.

Step 4 randomizes the GUIDs; use either notation.

1 sgdisk --backup=table /dev/sda
2 sgdisk --load-backup=table /dev/sdb

(Choose which one to use, both do the same)
3.1 sgdisk -R /dev/sdb /dev/sda    (i.e. sgdisk -R /dev/[destination] /dev/[source])
3.2 sgdisk /dev/sda -R /dev/sdb    (i.e. sgdisk /dev/[source] -R /dev/[destination])

(Choose which one to use, both do the same)
4.1 sgdisk -G /dev/sdb
4.2 sgdisk --randomize-guids /dev/sdb

5 pve-efiboot-tool format /dev/sdb2 --force
6 pve-efiboot-tool init /dev/sdb2
7 zpool attach rpool <sda disk-by-id-part3> <sdb disk-by-id-part3> (any by-id entry for the drive is OK, just make sure it ends in -part3)
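Not part of the manual itself, but as a sanity check afterwards (assuming the new disk is sdb as above):

Code:
# Watch the resilver that the attach kicks off
zpool status rpool
# Confirm the new disk carries the expected partitions (BIOS boot / EFI / ZFS)
lsblk /dev/sdb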
 
Force did the trick. Thank you.
I hope I don't ever have to do that on a production machine....
 
Just a note to avoid confusion for future readers. :)

Looks like MH_MUC revived an old thread.
The original issue was on Proxmox <= 5, where there is no EFI boot with ZFS, and those instructions still hold true.
The latter issue looks like it is from PVE 6, where we can have EFI boot with ZFS, hence the new command-line tool and instructions.
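If future readers are unsure which of the two cases applies to their node, the firmware interface gives it away (a generic Linux check, not Proxmox-specific):

Code:
# /sys/firmware/efi only exists when the system was booted via UEFI
if [ -d /sys/firmware/efi ]; then echo "UEFI boot"; else echo "legacy BIOS boot"; fi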
 
Hi folks, I need help from you.

In step 3 above, my team made a silly mistake.

The step is:
Copy the partition table from /dev/sda to /dev/sdc
3) sgdisk --replicate=/dev/sdc /dev/sda

But they ran it with the arguments reversed, so the old (good) drive became the destination:

Code:
root@pve:~# sgdisk --replicate=/dev/sdd /dev/sde
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot or after you
run partprobe(8) or kpartx(8)
The operation has completed successfully.


The server has not been rebooted yet. Is there a way to revert?

I tried to back up and restore my old drive's partition table as below:

Code:
root@pve:~# sgdisk --backup=/root/sdd_good /dev/sdd
The operation has completed successfully.


root@pve:~# sgdisk --load-backup=/root/sdd_good /dev/sde
The operation has completed successfully.
root@pve:~# sgdisk --load-backup=/root/sdd_good /dev/sdd
Warning: The kernel is still using the old partition table.
The new table will be used at the next reboot or after you
run partprobe(8) or kpartx(8)
The operation has completed successfully.

Now I am not sure how to confirm whether the backup captured the old drive's active partition table, and not the new partition table. How can I check?


Just to let you know,

I did run partprobe on the new drive:

Code:
root@pve:~# partprobe /dev/sde
root@pve:~# ls /dev/sde*
/dev/sde
root@pve:~# ls /dev/sdd*
/dev/sdd  /dev/sdd1  /dev/sdd2  /dev/sdd3

but as you can see above, sde is not showing 3 partitions like sdd does.
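For what it's worth, one way to compare the kernel's in-memory view of sdd (still the old table) with what is currently written on the disk, before rebooting, is sketched below; device names as in your output above:

Code:
# Kernel's in-memory partition boundaries (in sectors) for the still-active old table
cat /sys/block/sdd/sdd1/start /sys/block/sdd/sdd1/size
cat /sys/block/sdd/sdd3/start /sys/block/sdd/sdd3/size
# On-disk GPT as it stands right now (this is what a reboot would pick up)
sgdisk --print /dev/sdd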
 
Yeah... I did that once as well. It can be a problem if you don't have a copy of the partition tables. Fortunately my old disk was still kind of working, so I copied the partition table from it and got my data back, but if your HDD is completely dead, then you may be in trouble.

These links were helpful for me:
https://www.reddit.com/r/sysadmin/comments/3pr6ad/reverse_sgdisk_r_devsdx_devsdy/
https://www.redhat.com/sysadmin/recover-partition-files-testdisk
 
Thanks for the quick reply.

My old HDD /dev/sdd is fine and still running with the old partition table, as I haven't rebooted yet.
 
If you don't mind, can you explain which way you went in the end, please?
 
