[TUTORIAL] FabU: Can I use ZFS RaidZ for my VMs?

UdoB

Assumption: you use at least four identical devices for that. Mirrors, RaidZ, RaidZ2 are possible - theoretically.

Technically correct answer: yes, it works. But the right answer is: no, do not do that! The recommendation is very clear: use “striped mirrors”. This results in something similar to a classic Raid10.

(1) RaidZ1 (and Z2 too) gives you the IOPS of a single device, regardless of the actual number of physical devices. With the “four devices, mirrored” approach this doubles, because the pool then consists of two vdevs --> giving twice as many operations per second. For a large-file fileserver this may not be so important, but for multiple VMs running on it concurrently, IOPS as high as possible are crucial!
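To put rough numbers on (1), here is a minimal sketch of that rule of thumb: write IOPS scale with the number of top-level vdevs, and a RaidZ vdev counts as one device. The function name `pool_write_iops` and the figure of 200 IOPS per device are made up for illustration; real results depend on hardware and workload.

```shell
#!/bin/sh
# Rule of thumb from the post: pool write IOPS ~ number of vdevs x IOPS of
# one device. DEVICE_IOPS=200 is an assumed, illustrative figure only.
DEVICE_IOPS=200

# pool_write_iops VDEV_COUNT -> estimated write IOPS of the whole pool
pool_write_iops() (
    vdevs=$1
    echo $(( vdevs * DEVICE_IOPS ))
)

echo "4 disks, 1x RaidZ1 vdev : $(pool_write_iops 1) IOPS"
echo "4 disks, 2x mirror vdevs: $(pool_write_iops 2) IOPS"
```

The same four disks end up with twice the estimated write IOPS simply because they form two vdevs instead of one.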

(2) It is a waste of space because of padding blocks. Dunuin has described that problem several times; an extreme example for RaidZ3: https://forum.proxmox.com/threads/zfs-vs-single-disk-configuration-recomendation.138161/post-616199 “A 8 disk raidz3 pool would require that you increase the block size from 8K (75% capacity loss) to 64K (43% capacity loss) or even 256K (38% capacity loss)”
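The quoted percentages can be reproduced with a bit of arithmetic. The following is a sketch, not an official tool: it assumes 4K sectors (ashift=12), one set of parity sectors per stripe row, and that RaidZ rounds every allocation up to a multiple of (parity + 1) sectors. The function name `raidz_loss_pct` is hypothetical.

```shell
#!/bin/sh
# raidz_loss_pct DISKS PARITY DATA_SECTORS
# Prints the rounded percentage of raw space lost to parity and padding
# for one block. Assumption: 4K sectors (ashift=12), so 8K = 2 sectors.
raidz_loss_pct() (
    disks=$1; parity=$2; data=$3
    data_disks=$(( disks - parity ))
    # One set of parity sectors per stripe row (ceiling division).
    rows=$(( (data + data_disks - 1) / data_disks ))
    total=$(( data + parity * rows ))
    # RaidZ pads every allocation to a multiple of (parity + 1) sectors.
    unit=$(( parity + 1 ))
    total=$(( (total + unit - 1) / unit * unit ))
    # Share of the allocated sectors that hold no user data, rounded.
    echo $(( (100 * (total - data) + total / 2) / total ))
)

# 8-disk RaidZ3 with 8K, 64K and 256K blocks.
for kib in 8 64 256; do
    sectors=$(( kib / 4 ))
    echo "${kib}K block: $(raidz_loss_pct 8 3 "$sectors")% lost"
done
```

Under these assumptions the loop prints 75%, 43% and 38%, which matches the figures in the quote: an 8K block needs 2 data sectors plus 3 parity sectors, and that is padded from 5 up to 8 sectors.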


There seem to be some counter arguments against “only mirrors”:

(3) Resiliency: "I will use RaidZ2 with six drives to allow two to fail. Mirrors are less secure, right?"

Yes. In a single RaidZ2-vdev any two devices may fail without data loss. In a normal mirror only one device may fail.

BUT: there are triple mirrors! These are so rarely discussed that I need to mention them here explicitly. Let us compare both layouts with six devices:

(3a) the RaidZ2 will give us the performance of a single drive and the usable capacity of four drives. Two drives may fail.

(3b) two vdevs, each a triple mirror, give us the IOPS of two drives for writing data + sixfold read performance! Any two drives of each vdev may fail! (So up to four drives may die - but only in a specific selection.)

(4) Capacity: the only downside of (3b) is that the usable capacity shrinks down to that of two drives.
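The “only in a specific selection” part of (3b) can be counted by brute force. This sketch walks through all failure patterns of six drives arranged as two triple mirrors and counts those that leave at least one healthy drive in each mirror. The function name `survivable` and the bit layout are illustrative assumptions.

```shell
#!/bin/sh
# Every failure pattern of six drives is a 6-bit mask (1 = drive failed).
# Bits 0-2 are the first triple mirror, bits 3-5 the second one.
# survivable FAILED_COUNT -> number of patterns with that many failed
# drives that still leave one healthy drive in each mirror.
survivable() (
    want=$1; ok=0; mask=0
    while [ "$mask" -lt 64 ]; do
        # Count the failed drives in this pattern.
        bits=0; m=$mask
        while [ "$m" -gt 0 ]; do
            bits=$(( bits + (m & 1) )); m=$(( m >> 1 ))
        done
        # The pool is lost only if all three drives of one mirror are gone.
        if [ "$bits" -eq "$want" ] &&
           [ $(( mask & 7 )) -ne 7 ] && [ $(( mask & 56 )) -ne 56 ]; then
            ok=$(( ok + 1 ))
        fi
        mask=$(( mask + 1 ))
    done
    echo "$ok"
)

for n in 2 3 4 5; do
    echo "$n failed drives: $(survivable "$n") survivable patterns"
done
```

For comparison, the six-drive RaidZ2 from (3a) survives every pattern of two failed drives but none with three or more.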


Recommendation: for VM storage use a mirrored vdev approach. For important data use RaidZ2 or RaidZ3.

In any case note that “Raid” of any flavor and/or having snapshots does not count as a backup. Never!


Beginners often confuse hardware RAID5/6 with BBU (which can cache sync writes) with ZFS RaidZ1/2 (with unfortunate block size alignment on consumer drives) just because both can deal with one/two missing drive(s). The performance behavior is indeed completely different (as well as the supported feature set) and RaidZ, as you already explained, is mostly unsuitable for VMs.
 
It couldn't hurt to add that a single vdev stripe of multiple disks, whether it's a misconfiguration or a misunderstanding of the striped mirror concept, is the worst choice of all. Even worse than using a single disk because it at least doubles the failure rate.
 
Yes, absolutely correct. For the interested reader, let me show you two basic examples:

This is the bad approach. It has zero redundancy, and if one device fails the whole pool is gone:
Code:
# zpool create dummypool /rpool/dummy/disk-a.img /rpool/dummy/disk-b.img /rpool/dummy/disk-c.img /rpool/dummy/disk-d.img 

# zpool status dummypool
  pool: dummypool
 state: ONLINE
config:

        NAME                       STATE     READ WRITE CKSUM
        dummypool                  ONLINE       0     0     0
          /rpool/dummy/disk-a.img  ONLINE       0     0     0
          /rpool/dummy/disk-b.img  ONLINE       0     0     0
          /rpool/dummy/disk-c.img  ONLINE       0     0     0
          /rpool/dummy/disk-d.img  ONLINE       0     0     0

What we recommend instead is to use mirrors:
Code:
# zpool create dummypool  mirror /rpool/dummy/disk-a.img /rpool/dummy/disk-b.img  mirror /rpool/dummy/disk-c.img /rpool/dummy/disk-d.img 

# zpool status dummypool
  pool: dummypool
 state: ONLINE
config:

        NAME                         STATE     READ WRITE CKSUM
        dummypool                    ONLINE       0     0     0
          mirror-0                   ONLINE       0     0     0
            /rpool/dummy/disk-a.img  ONLINE       0     0     0
            /rpool/dummy/disk-b.img  ONLINE       0     0     0
          mirror-1                   ONLINE       0     0     0
            /rpool/dummy/disk-c.img  ONLINE       0     0     0
            /rpool/dummy/disk-d.img  ONLINE       0     0     0

:)
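For the triple mirrors mentioned earlier in the thread, the same `zpool create` syntax applies with three devices per `mirror` group. The helper below is a hypothetical dry-run sketch: it only prints the command and never touches a pool. The name `mirror_pool_cmd` and the files disk-e.img and disk-f.img are assumptions that extend the dummy example above.

```shell
#!/bin/sh
# mirror_pool_cmd POOL WAYS DEVICE...
# Prints (does not run) a zpool create command that groups the devices
# into WAYS-way mirror vdevs, in the order given.
mirror_pool_cmd() (
    pool=$1; ways=$2; shift 2
    cmd="zpool create $pool"
    i=0
    for dev in "$@"; do
        # Start a new mirror vdev every WAYS devices.
        if [ $(( i % ways )) -eq 0 ]; then
            cmd="$cmd mirror"
        fi
        cmd="$cmd $dev"
        i=$(( i + 1 ))
    done
    echo "$cmd"
)

d=/rpool/dummy   # directory of the dummy image files used in this thread
mirror_pool_cmd dummypool 3 \
    "$d/disk-a.img" "$d/disk-b.img" "$d/disk-c.img" \
    "$d/disk-d.img" "$d/disk-e.img" "$d/disk-f.img"
```

Review the printed command before running it by hand. With `WAYS` set to 2 and four devices, it prints the striped-mirror command shown above.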
 
