ZFS boot RAIDZ with 3 SSD

Erk

If I build a Proxmox server running RAIDZ with 3 same-size SSDs, is it possible to have GRUB installed on more than one of the drives for recovery, like you could in a RAID 1 mirror by copying the boot loader across? I am not familiar with how RAIDZ distributes the redundant data. I am just trying to come up with a method of keeping the drive count to only 3 if I can, while still allowing for one to fail. I won't be using a ZIL or L2ARC because it's a pure SSD setup.

So in summary, I want to know if I can have a 3 drive RAIDZ setup where I can boot from any of the 3 drives so it's fully redundant.
 
Yes, this is the default when you use the Proxmox installer.
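
If you want to verify it, print the partition table of each pool member; each should show a small grub-boot (BIOS boot, GPT type EF02) partition ahead of the ZFS partition. A quick check, assuming the gdisk package is installed and example device names:

Code:
# Example device names; run for each member of the pool.
sgdisk -p /dev/sda
sgdisk -p /dev/sdb
sgdisk -p /dev/sdc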
 
That's good to know. If I have to replace an SSD because one fails, how does the grub boot loader get written to the new drive? I presume the Proxmox installer did it originally.
 
You need to create a grub-boot partition at the start of the new disk. Then run 'dpkg-reconfigure grub-pc' and additionally select the new grub partition.
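
A rough sketch of those two steps, with /dev/sdX as a placeholder for the new disk (the partition number just has to be a free slot):

Code:
# Create a small grub-boot (BIOS boot, GPT type EF02) partition in the
# free space at the start of the disk:
sgdisk -n2:34:2047 -t2:EF02 /dev/sdX
# Re-run the grub configuration and tick the new disk in the device list:
dpkg-reconfigure grub-pc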
 
So do you create the grub boot partition and run 'dpkg-reconfigure grub-pc' before you run zpool replace?
 
No, after. Just do not touch the existing partitions, and use the free space at the start of the drive.
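
So the order would be roughly (placeholder names; the old device's guid comes from 'zpool status'):

Code:
zpool replace rpool <old-guid> /dev/sdX
zpool status rpool   # wait for the resilver to finish, then do the
                     # grub-boot partition and 'dpkg-reconfigure grub-pc'
                     # steps from above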
 
I haven't had any luck. I trashed one of my test drives with dd and tried a replace, which gave me the following:

Code:
 root@kvm4:/home# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
	Sufficient replicas exist for the pool to continue functioning in a
	degraded state.
action: Online the device using 'zpool online' or replace the device with
	'zpool replace'.
  scan: scrub repaired 0 in 0h6m with 0 errors on Fri Nov 13 14:52:03 2015
config:

	NAME                                          STATE     READ WRITE CKSUM
	rpool                                         DEGRADED     0     0     0
	  raidz1-0                                    DEGRADED     0     0     0
	    2860130782450730660                       OFFLINE      0     0     0  was /dev/sdb2
	    ata-SAMSUNG_HD250HJ_S0URJDRP900752-part2  ONLINE       0     0     0
	    ata-SAMSUNG_HD250HJ_S0URJDRP900756-part2  ONLINE       0     0     0

errors: No known data errors

Then I tried:

Code:
root@kvm4:/home# zpool replace -f rpool 2860130782450730660 /dev/sdc
cannot replace 2860130782450730660 with /dev/sdc: /dev/sdc is busy

/dev/sdc is the new device name of the drive (Linux likes to rename things; it was /dev/sdb). I will use /dev/disk/by-id/ once I have finished testing the procedure.
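The stable names live under /dev/disk/by-id/ and survive port reordering; a quick listing shows which id maps to the current /dev/sdX:

Code:
ls -l /dev/disk/by-id/ | grep -v part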


zpool always says the drive is busy when I try the replace, even though it has already created some partitions on it!

Code:
Disk /dev/sdc: 232.9 GiB, 250059350016 bytes, 488397168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 341E93E3-EC84-614A-ACB5-983952A35B7A

Device         Start       End   Sectors   Size Type
/dev/sdc1       2048 488380415 488378368 232.9G Solaris /usr & Apple ZFS
/dev/sdc9  488380416 488396799     16384     8M Solaris reserved 1


What's the trick?
 
Ok, I worked it out: because I moved the drive from SATA to a USB external device, it lost the plot with the device naming, and I couldn't get it to replace after trying different tricks for hours.

What I had to do was totally zero out the drive with dd if=/dev/zero of=/dev/sdc bs=1M so there was not a trace of the former ZFS data anywhere on it. I first tried zeroing out just a few GB of the drive, but that was not good enough; it must keep multiple copies of the ZFS info on the drive. The full erase did the trick, and after that I just did:

Code:
zpool replace -f rpool 2860130782450730660 /dev/disk/by-id/usb-SAMSUNG_SP2504C_SAMSUNG_SPS09QJ1CP201692-0\:0

Make sure to wait until the resilver is done before rebooting.

Code:
root@kvm4:/home# zpool status
  pool: rpool
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Nov 13 20:09:35 2015
	397M scanned out of 28.9G at 26.5M/s, 0h18m to go
	127M resilvered, 1.34% done
config:

	NAME                                                    STATE     READ WRITE CKSUM
	rpool                                                   DEGRADED     0     0     0
	  raidz1-0                                              DEGRADED     0     0     0
	    replacing-0                                         OFFLINE      0     0     0
	      2860130782450730660                               OFFLINE      0     0     0  was /dev/sdb2
	      usb-SAMSUNG_SP2504C_SAMSUNG_SPS09QJ1CP201692-0:0  ONLINE       0     0     0  (resilvering)
	    ata-SAMSUNG_HD250HJ_S0URJDRP900752-part2            ONLINE       0     0     0
	    ata-SAMSUNG_HD250HJ_S0URJDRP900756-part2            ONLINE       0     0     0

errors: No known data errors
 
Hi,
perhaps it is because of the backup copy of the GPT label? Then wiping the GPT label should be enough.
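
GPT keeps that backup label at the end of the disk, so zeroing only the start leaves it behind; if gdisk is installed, one destructive command wipes both copies (device name as in your output):

Code:
sgdisk --zap-all /dev/sdc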

Udo
 
I think it's the ZFS labels. I tried 'zpool labelclear' but it wouldn't let me, because the drive contained pool info. I don't understand the point of the command in that case.
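
For what it's worth, ZFS writes four copies of its label per member device: two in the first 512 KiB and two in the last 512 KiB, which would explain why zeroing a few GB at the front wasn't enough. A sketch of clearing just those regions instead of the whole drive (destructive; the member here was the auto-created partition /dev/sdc1, and 'zpool labelclear -f' on that partition may also work):

Code:
# Wipe the two front labels and the two back labels of the member device.
# blockdev --getsz reports the size in 512-byte sectors; 1024 of those
# make one 512 KiB block.
dd if=/dev/zero of=/dev/sdc1 bs=512K count=1
dd if=/dev/zero of=/dev/sdc1 bs=512K count=1 \
   seek=$(( $(blockdev --getsz /dev/sdc1) / 1024 - 1 ))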
 
