HELP!! Recovering ZFS disk header after running sgdisk -Z on one RAIDZ disk

kybi
Dec 9, 2020
Hi everyone,


I have a ZFS RAIDZ (3 disks) setup that was created using whole disks (no partitions).
Unfortunately, I accidentally ran the following command on one of the member disks:

Code:
sgdisk -Z /dev/sdb


This wiped the partition table and GPT headers.
Now ZFS doesn’t recognize the disk as part of the pool anymore.


Here’s the situation:
  • Pool type: RAIDZ (3 disks)
  • Disks used: /dev/sda, /dev/sdb, /dev/sdc
  • Accidentally wiped: /dev/sdb
  • The other two disks are intact and still part of the pool
Code:
zpool import -f -R /mnt/test rpool

cannot import 'rpool': one or more devices is currently unavailable

My questions:
  1. Is it possible to restore the ZFS disk header (or partition table) from the backup GPT?
  2. If ZFS was using the entire raw disk (no partitions), can I just re-add or replace the disk without data loss?
  3. What’s the safest way to check or repair without destroying the existing pool metadata?

Any suggestions or example commands (like zpool import -f, sgdisk --replicate, etc.) would be really appreciated.
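For reference, here is what I was planning to run first, read-only if I understand the man pages correctly (and I realize zdb may find nothing if the labels really were clobbered, or if they sit at a partition offset rather than on the raw device):

Code:
# Read-only: dump the four ZFS labels on the wiped disk. ZFS keeps two
# labels at the start and two at the end of the device, so the trailing
# copies may have survived the GPT wipe.
zdb -l /dev/sdb

# Read-only: scan all devices and show which pools and vdevs are
# visible, without importing anything.
zpool import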


Thanks in advance!
 
I still cannot import the zpool after reboot:

Code:
zpool import -f rpool
cannot import 'rpool': one or more devices is currently unavailable


There is also a problem with:

Code:
sgdisk /dev/sda -R /dev/sdb

Caution: invalid main GPT header, but valid backup; regenerating main header
from backup!

Warning: Invalid CRC on main header data; loaded backup partition table.
Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
on the recovery & transformation menu to examine the two tables.

Warning! Main partition table CRC mismatch! Loaded backup partition table
instead of main partition table!

Warning! One or more CRCs don't match. You should repair the disk!
Main header: ERROR
Backup header: OK
Main partition table: ERROR
Backup partition table: OK

Invalid partition data!
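If I read that output correctly, gdisk wants me to use its recovery & transformation menu. This is what I think it is asking for, but I have not written anything to disk yet, and I am not sure a backup table on /dev/sdb could even have survived, since as far as I can tell sgdisk -Z destroys both GPT copies:

Code:
# Untested sketch of the interactive recovery gdisk suggests:
gdisk /dev/sdb
# inside gdisk:
#   r - open the recovery & transformation menu
#   b - rebuild the main GPT header from the backup header
#   c - load the backup partition table
#   p - print the rebuilt table and sanity-check it
#   w - write only if it looks right; otherwise q to quit without saving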
 
Situations like this are why we tell people that RAID is not a backup.

Do you have any backups of anything on rpool?

https://github.com/kneutron/ansitest/tree/master/proxmox

Look into the bkpcrit script, point it at an external disk / NAS, and run it nightly from cron. After reinstalling, take pictures of the GUI documenting your network / storage / Datacenter backup config.
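Something like this for the cron part (the script path and name here are only an example; adjust to wherever you cloned the repo and whatever the script is actually called):

Code:
# /etc/cron.d/bkpcrit - nightly backup of critical configs at 02:30
# NOTE: example path/filename only; point it at your copy of the script
30 2 * * * root /root/ansitest/proxmox/bkpcrit.sh > /var/log/bkpcrit.log 2>&1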

I would recommend you rebuild your rpool as a standard mirror; raidz1 is not a common configuration for ZFS boot. And separate the OS from your data: put your LXCs/VMs on a separate pool.

If you're not running a business and you don't NEED to install to a ZFS mirror to maintain uptime, reinstall Proxmox to single-disk ext4+LVM and use the other two disks as a separate zpool mirror for VMs and such. Backup/restore of the OS will be much easier.
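Roughly like this for the data pool (device names are placeholders; use stable /dev/disk/by-id paths and double-check which disks you are handing over):

Code:
# Sketch only: ata-DISK1/ata-DISK2 are placeholders for your real by-id names.
zpool create -o ashift=12 vmdata mirror \
  /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2

# Register it with Proxmox as VM/container storage:
pvesm add zfspool vmdata --pool vmdata --content images,rootdir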
 
I have a ZFS RAIDZ (3 disks) setup that was created using whole disks (no partitions).
I assumed it was a RAIDz1 but maybe it is not, since you appear to be unable to import it:
I still cannot import the zpool after reboot:

Code:
zpool import -f rpool
cannot import 'rpool': one or more devices is currently unavailable
What kind of ZFS configuration with three whole drives did you use? If it is not a RAIDz1 then you should disregard my previous suggestions.
Unfortunately, I accidentally ran the following command on one of the member disks:

Code:
sgdisk -Z /dev/sdb
Since you used whole drives, that command overwrote some important ZFS (meta)data. I don't know how to fix that. What kind of redundancy did you configure your ZFS for (as in, what kind of ZFS drive layout did you use)? A three-way mirror would be fine, as it can survive two missing drives. A RAIDz1 can survive one missing drive. A stripe cannot survive any missing drive.
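You can settle that question by reading a label off one of the intact members; zdb is read-only here. If ZFS really was given the whole disk, the label is on the disk itself; on a Proxmox-installed boot disk it would normally be on the third partition instead:

Code:
# Read-only: the vdev_tree section of the label shows whether the pool
# is a mirror, raidz1, or a plain stripe.
zdb -l /dev/sda
# if that finds no labels, try a partition, e.g.:
zdb -l /dev/sda3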