ZFS 2.1 roadmap

alexskysilk

I've had a rather catastrophic fault (3 drives in a raidz2 vdev) that I've been trying to recover from for the last month or so. I finally got a scrub to complete without disk errors, but there are permanent errors in the pool which I've been unable to clear, so I'm now resigned to redeploying the whole thing.
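For context, the cycle I've been going through looks roughly like this (a sketch only; "tank" stands in for the real pool name):
Bash:
zpool status -v tank    # shows the permanent errors and the affected files
zpool scrub tank        # re-verify the pool after the drive replacements
zpool clear tank        # attempt to clear the error counters afterwards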

I wouldn't normally ask, but in this case: what is the prospective ETA for inclusion of ZFS 2.1? If it's imminent I'll wait to deploy, as dRAID is a killer feature for me; if it's months out I'll just have to live without it.

@t.lamprecht @martin please advise :)
 
Thanks Thomas.

What I'm going to do is remove Proxmox from this box and install a clean Debian. That way I SHOULD be able to reintroduce Proxmox once ZFS 2.1 is incorporated. Not ideal, but OK for this particular deployment.
 
I've had a rather catastrophic fault (3 drives in a raidz2 vdev)
This happens to be a >300TB pool.
300 TB implies dozens of drives. Why raidz2 for such a huge pool? Why not raidz3 to start with?

Correct me if I'm wrong, but it is my understanding that dRAID is for those who want even better redundancy, beyond raidz3... Otherwise, why should somebody choose a 6-HDD dRAID (let's say 4 data, 1 parity + 1 spare) over a 6-HDD raidz2 (4 data + 2 parity)? It is the same number of disks. There is no advantage in electricity consumption or HDD wear (the spare is not dedicated in dRAID, so it doesn't spin down). Theoretically dRAID is marginally faster, yet raidz2 is much more robust: there are 2 redundant disks at all times, instead of 1!
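To make the comparison concrete, here is roughly how the two 6-disk layouts would be created (device names are placeholders; dRAID vdev syntax as documented for OpenZFS 2.1):
Bash:
# 6 disks as classic raidz2: 4 data + 2 parity, no spare
zpool create tank raidz2 sda sdb sdc sdd sde sdf

# 6 disks as single-parity dRAID with one distributed spare: 4 data + 1 parity + 1 spare
zpool create tank draid1:4d:6c:1s sda sdb sdc sdd sde sdf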
 
raidz2 vdevs. With a 36-disk arrangement, your options are 2x 18-disk RAIDZ3 or 3x 12-disk RAIDZ2. I'd rather have more vdevs, or performance would be total shit. Even more vdevs would be great, but the parity overhead would be murder on usable capacity.
 
raidz2 vdevs. With a 36-disk arrangement, your options are 2x 18-disk RAIDZ3 or 3x 12-disk RAIDZ2. I'd rather have more vdevs, or performance would be total shit. Even more vdevs would be great, but the parity overhead would be murder on usable capacity.
1) You have had a terrible month, but it was NOT due to poor pool performance.
2) Your debacle is not a coincidence. For modern installations, raidz2 is not enough. Here is a publication from back in 2010 predicting that cases like yours would happen more and more often by 2019: https://www.zdnet.com/article/why-raid-6-stops-working-in-2019/. We are in 2021 now. The author was 100% correct.
3) I suggest you read https://arstechnica.com/gadgets/202...-1s-new-distributed-raid-topology/?comments=1, especially the ending, about dRAID's real usable capacity and fault tolerance.
 
3) I suggest you read https://arstechnica.com/gadgets/202...-1s-new-distributed-raid-topology/?comments=1, especially the ending, about dRAID's real usable capacity and fault tolerance.
... which has become available in ZFS 2.1, and we've come full circle to the first post ;)

Incidentally, I run about 20 filers in this configuration. None of them has ever come close to a real fault. This particular unit shipped with a whole batch of faulty drives and survived for years (with a lot of care and feeding). While your articles are not wrong, it's also about scope. The sky isn't falling.
 
... which has become available in ZFS 2.1, and we've come full circle to the first post ;)

Incidentally, I run about 20 filers in this configuration. None of them has ever come close to a real fault. This particular unit shipped with a whole batch of faulty drives and survived for years (with a lot of care and feeding). While your articles are not wrong, it's also about scope. The sky isn't falling.
 
I was hinting at the following:
You were unfortunate to get a 3-disk failure within the same vdev (3 out of 12 disks). If those had been 3 disks from different vdevs, you would be fine. It was a very unlikely failure, but it happened.
In the case of dRAID, a failure of ANY 3 disks out of 36 will result in pool loss.
 
Use mirrors ;)

I have never understood the drive to risky raidZ. Usually these setups have huge amounts spent on them, with large numbers of disks in the pool; if money isn't an object, then a 50% redundancy cost shouldn't matter.
 
I have never understood the drive to risky raidZ. Usually these setups have huge amounts spent on them, with large numbers of disks in the pool; if money isn't an object, then a 50% redundancy cost shouldn't matter.
That is a whole bunch of assumptions, based on your admitted lack of understanding. I'd suggest you understand the use case before you begin offering suggestions.

Just to humor you: all ZFS pools are vulnerable to complete failure due to vdev loss. A mirror is a 2-disk vdev. A striped (raidz) pool can sustain the LOSS of two disks per vdev with raidz2, or even 3 with raidz3. If you're banking on your luck that you'd lose two disks in different vdevs, that's your call. It's true that mirrors resilver at a much greater rate than raidz vdevs, which is a case that can be made; but when you start taking physical space, power, and cooling requirements (and, yes, cost) into account, this isn't always a workable approach for all storage requirements.
 
In the case of dRAID, a failure of ANY 3 disks out of 36 will result in pool loss.
Not so. That is the WHOLE POINT of dRAID. You're NOT necessarily changing your stripe arrangement, and it's up to you HOW MANY virtual spares you define. In my case, I WOULD probably make 2x 18-disk stripesets (draid2, the RAIDZ2 equivalent) with 2 distributed spares each. A disk failure will RESILVER very rapidly to a draid spare, which means that even when a disk is out you still have full dual parity. Even a second disk failure still has a spare to rebuild to. Assuming you did not replace either failed disk, you can sustain ANOTHER two disks failing and still be operational, but I'd be real nervous by then; and if you really haven't replaced your failures by that point, you deserve to lose your pool ;)

Also, the above is true for EACH VDEV. In theory, this arrangement will provide usable capacity similar (6 parity disks vs 8 parity+spare disks) to the 3x12 I operate now, but with the ability to sustain up to 8 disk failures (not all at the same time, but you get the point) without data loss.
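For the curious, one way the layout described above could be spelled out in OpenZFS 2.1's dRAID notation (device paths are placeholders, and the data-group width is just one possible choice):
Bash:
# two draid2 vdevs: 18 children each = 14 data + 2 parity + 2 distributed spares
zpool create tank \
    draid2:14d:18c:2s /dev/disk/by-id/ata-DISK{01..18} \
    draid2:14d:18c:2s /dev/disk/by-id/ata-DISK{19..36}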
 
That is a whole bunch of assumptions, based on your admitted lack of understanding. I'd suggest you understand the use case before you begin offering suggestions.

Just to humor you: all ZFS pools are vulnerable to complete failure due to vdev loss. A mirror is a 2-disk vdev. A striped (raidz) pool can sustain the LOSS of two disks per vdev with raidz2, or even 3 with raidz3. If you're banking on your luck that you'd lose two disks in different vdevs, that's your call. It's true that mirrors resilver at a much greater rate than raidz vdevs, which is a case that can be made; but when you start taking physical space, power, and cooling requirements (and, yes, cost) into account, this isn't always a workable approach for all storage requirements.

I am well aware there are advantages of raidz2/raidz3 over mirrors, in that "any" 2/3 disks can fail, whilst if you have multiple mirror vdevs and 2 disks fail in the same vdev then say bye bye to your pool. Personally I think raidz3 is overall safer than mirrored vdevs, but I don't have that same feeling with raidz2, especially on very large pools. My point was more that if you are able to budget for a very large pool, then the cost of redundancy shouldn't be an issue.

I have looked at the dRAID documentation and I do consider it a huge step forward, as the window of risk is significantly reduced; I feel that dRAID makes parity-based setups much more viable now.

I would like to leave it here on an agree-to-disagree note. I probably made wrong assumptions; very large pools perhaps do have budgeting limitations, and my comment wasn't productive for this specific discussion (it wasn't referring to dRAID, which I think is great). I hope you don't reply again saying I don't understand things.
 
We uploaded a bunch of kernels with newer ZFS versions yesterday:
  • Proxmox VE 6.4 (oldstable):
    • pve-kernel-5.4.143-1-pve (5.4.143-1) with ZFS 2.0.6
    • pve-kernel-5.11.22-5-pve (5.11.22-10~bpo10+1) with ZFS 2.0.6
  • Proxmox VE 7.0 (stable):
    • pve-kernel-5.11.22-5-pve (5.11.22-10) with ZFS 2.0.6
    • pve-kernel-5.13.14-1-pve (5.13.14-1) with ZFS 2.1.1
The 5.13-based kernel is still opt-in only; it will be the one we default to in Proxmox VE 7.1 (planned for 2021/Q4).
That means that 5.11 is slowly on its way out, and that's why we did not bother with updating the ZFS module there to the 2.1 series.

To test ZFS 2.1 and the 5.13-based kernel, add the pvetest repository and do:
Bash:
apt update
apt full-upgrade
apt install pve-kernel-5.13
# -> reboot
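To double-check after the reboot that the new kernel and the ZFS 2.1 module are actually in use:
Bash:
uname -r       # should report the 5.13 based kernel
zfs version    # should report zfs-2.1.x and a matching zfs-kmod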
 
