What do you think about ZFS spare disks?

tcabernoch

Active Member
Apr 27, 2024
I have to get on an airplane in order to visit my servers. Or use remote hands. You know how that can go.
So, I'd rather not _need_ somebody visiting my servers. Or at least as little as possible.
And some of it's old junk. So it dies.

Enter the spare disk conundrum.
It's easy to find ZFS 'experts' who will blithely argue exact opposite perspectives on any given aspect.
Many of them declare that a spare disk is only suitable for large arrays.
I've seen advice that you are better off getting more parity than having a spare.
I've even seen warnings about resilvering, as if that's a bad thing somehow.

This is what I'm talking about.
Code:
zpool add DATA spare /dev/gptid/whateveritisafteryougptit

So you can walk away and leave the server running for a year or two.
And it cleans up after itself.
Why is that bad?
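(To spell out what "cleans up after itself" means in practice - this is just a sketch of my setup, assuming zed is running with its default spare handling; DATA is the pool from above and the device paths are placeholders.)
Code:
# verify the spare is registered and sitting idle
zpool status DATA    # the spare should be listed under "spares" as AVAIL

# if a member drive faults, zed pulls the spare in and resilvers.
# once remote hands finally swap the dead drive:
zpool replace DATA /dev/gptid/DEAD-DISK-PLACEHOLDER /dev/gptid/NEW-DISK-PLACEHOLDER
# when that resilver finishes, the spare detaches itself and goes back to AVAIL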
 
Why is that bad?
You didn't tell us your vdev layout, so I'll invent one myself - for demonstration only, let's say:
  • you have a single mirror
  • you add a spare
  • one drive fails
  • the spare replaces the dead drive
  • resilvering starts - and it has to successfully read all of the data on the remaining single drive
  • if resilver succeeds you are back to normal; if not: you've lost some data
On the other hand you could
  • start with a triple mirror from the beginning
  • one drive fails
  • you still have redundant intact data without the urgent need to resilver
  • and when you replace the failed drive, the resilver process can read two disks to reconstruct the data
In the above example a "spare" is just stupid.
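(If it helps, the triple mirror route is a one-liner as well. Sketch only - pool and device names are invented for the example:)
Code:
# attach a third member to the existing two-way mirror instead of adding a spare;
# "current-disk" is any present member of that mirror, both names are placeholders
zpool attach DATA current-disk third-disk

# after this resilver every block lives on three drives, so a single failure
# still leaves intact redundancy and no urgent resilver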

Another example:
  • you have a pool with two RaidZ1 vdevs with 4 drives each --> 8 drives active, single redundancy per vdev
  • if you can attach only one more drive - and only in that case - you could add a shared spare
  • one drive fails
  • resilvering that vdev starts - and all three surviving drives of that vdev have to be read successfully to avoid data loss
In this case I would really fight hard for a 10th drive to create two RaidZ2 instead of this fragile approach. (Or opt for a different topology.)
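(For comparison, a sketch of both layouts - drive names d1..d10 are made up, and I'm assuming ten equally sized drives:)
Code:
# fragile variant: 8 drives as two RaidZ1 vdevs plus one shared spare (9 drives total)
zpool create DATA raidz1 d1 d2 d3 d4 raidz1 d5 d6 d7 d8 spare d9

# preferred variant: 10 drives as two RaidZ2 vdevs, double redundancy in each vdev
zpool create DATA raidz2 d1 d2 d3 d4 d5 raidz2 d6 d7 d8 d9 d10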

Both are just examples based on my personal understanding.
 
For robustness, use a three-way mirror with at least one HDD and one NVMe or SSD on another controller. Add as many (hot) spares as you want (for each type). Remember to pin the name(s) of the network device(s), because the kernel-assigned names might otherwise change when an NVMe (or other PCI(e) device) fails and disappears from the bus.
Or don't do any redundancy per Proxmox host and just create a cluster of many hosts (4 or more). Then other hardware pieces can also fail without interrupting your remote setup (or putting it in danger as soon as any one thing fails). I feel that your focus on ZFS spares is much too narrow for your redundancy requirements.
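(One way to pin the NIC names - just a sketch, assuming a standard systemd-based Proxmox/Debian host; the MAC address and the name "lan0" are placeholders:)
Code:
# match the NIC by MAC instead of PCI path, so the name survives
# a PCI(e) device disappearing from the bus
cat > /etc/systemd/network/10-lan0.link <<'EOF'
[Match]
MACAddress=aa:bb:cc:dd:ee:ff

[Link]
Name=lan0
EOF
# reference lan0 in /etc/network/interfaces, then reboot so the rename applies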
 
A spare disk starts to make some sense if you have at least 6 disks in a raidz2. Just be aware that the hotspare does not automatically become a permanent replacement (at least, not without intervention). The zed daemon rotates the hotspare in and out, and IIRC the pool will still show as DEGRADED while the hotspare is in use.

Once you replace the bad drive, the hotspare goes back into "standby" mode for the next failure.

https://www.reddit.com/r/zfs/comments/rkym6s/hot_spare_to_become_a_permanent_replacement/

https://www.reddit.com/r/zfs/comments/p114ty/the_fine_manual_is_confusing_zpool_replace_and/
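(Roughly what that intervention looks like - sketch only, pool and device names are placeholders:)
Code:
# option 1: promote the spare permanently by detaching the dead disk
# after the spare has finished resilvering in
zpool detach tank faulted-disk
# the spare becomes a normal vdev member; it is no longer a spare,
# so add a fresh one when you can:
zpool add tank spare new-disk

# option 2: replace the dead disk; once that resilver finishes,
# the spare detaches on its own and goes back to standby
zpool replace tank faulted-disk replacement-disk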
 
These are really fantastic responses, thanks all.

Some sample layouts:
- 4 disk machine - Yeah, there's no room for a spare in this array.
- 6 disks - I agree, this is the very start of the range where a spare makes any sense at all. And maybe not a lot of sense.
- 12 disk array of 4 x 3 disk raidz1 vdevs - Buncha rando disks, each vdev is a different capacity. Ya, I put a spare on that. (Of the largest capacity, of course.) (I inherited this junk stack. Scold me for it, but don't blame me. :] )
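(For that junk stack, the only wrinkle is making sure the spare is at least as big as the largest member it might cover. Quick sketch on a Linux host - pool and device names are placeholders:)
Code:
# check candidate disk sizes before committing
lsblk -d -o NAME,SIZE,MODEL

# dry run: -n prints the resulting layout without touching the pool
zpool add -n DATA spare /dev/disk/by-id/SPARE-PLACEHOLDER
# if the output looks right, run it again without -n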

I still don't see a problem with resilvering. Ya, things suck while that's happening. Known issue.

Where I'm actively considering how to go forward is the oddball machine where I've got extra bays or an unused disk. (Perhaps a bad disk was replaced after the machine was first rebuilt. Sometimes that can take a while with remote hands.)
In these cases, I face a choice of expanding capacity or ... in my eyes ... giving myself the ability to ignore this particular machine for longer.
There's also the additional point that if one disk went out, another may be pending, and a spare might get used quickly.

Again, that was solid feedback. Much appreciated.
I guess my takeaway is I need to carefully consider vdev structure/raid level as I make any decisions about spares.
 
