ZFS newbie question

janvv

From what I have read, I understand that ZFS is a lot faster than EXT4, so I want to give it a try. I do not need RAID redundancy, because it is only a homelab that does not run any critical applications. And in case of trouble, I have Ansible playbooks to set up all my containers and VMs from scratch.

Currently I am using LVM to combine several disks into volumes: two 1 TB NVMe SSDs together form a 2 TB volume for the base Proxmox local and local-lvm storage, two other SSDs together form a 4 TB data volume, and finally a few old spinning hard disks together form a 5 TB logical volume for backups and the like.

Are ZFS pools similar to logical volumes in LVM? So that I can create kind of the same setup with ZFS RAID0?

My system has a Ryzen 9 3900X (12 cores / 24 threads) and 64 GB of RAM.

Would it be worth it to start from scratch and try ZFS?
 
From what I have read, I understand that ZFS is a lot faster than EXT4,
No. ZFS is a complex system: it continuously calculates checksums, etc. Ext4 does not do this, and not doing something is always faster.

ZFS is about reliability. ZFS won't deliver damaged data to you (the application you are running). Ext4 on the other hand has no way to verify data coming from disk...

Are ZFS pools similar to logical volumes in LVM?
Yes. There are two ways to utilize ZFS: a) create a "dataset" and use it like a sub-directory and b) create a "ZVOL" and use it like a block device, e.g. an LV. Both have unique capabilities and parameter sets.

PVE does both in parallel: usually datasets are created for containers and zvols for virtual machines.
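To make that concrete, this is roughly how the two variants look on the command line (just a sketch; the names and size are placeholders, and PVE creates these for you when you add a disk to a guest):

# a) dataset: behaves like a directory/filesystem (what PVE uses for containers)
zfs create rpool/data/demo-dataset

# b) zvol: a block device of fixed size (what PVE uses for VM disks)
zfs create -V 32G rpool/data/demo-zvol

# each type has its own tunable properties
zfs get recordsize rpool/data/demo-dataset
zfs get volblocksize rpool/data/demo-zvol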

So that I can create kind of the same setup with ZFS RAID0?
Yes. But please: do NOT do that! There are several drawbacks to not having any redundancy. ZFS is famous for self-healing, but self-healing requires getting information about the damage from somewhere. This works by storing redundant information...

No redundancy means that ZFS can still recognize damaged data as "wrong", but it cannot repair it.

And: for ext4 (and other filesystems) there are tools that try to recover data from a damaged filesystem. For ZFS there is no such tool, afaik.
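A scrub makes this visible: ZFS can always tell you that data is damaged, but it can only repair it when a redundant copy or parity exists. A minimal sketch (the pool name is a placeholder):

# read and verify every block in the pool against its checksum
zpool scrub tank

# checksum errors show up here; with redundancy they are repaired,
# without redundancy the affected files are only reported as damaged
zpool status -v tank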

Would it be worth it to start from scratch and try ZFS?
Yes.

Disclaimer: you may call me a ZFS fan boy...
 
Okay.... I based my information about performance mainly on this video. Are those benchmarks not true?
But anyway, I am always trying to learn technologies that are new to me, so I think I am indeed going to back up everything and reinstall Proxmox with ZFS.

You recommend not using RAID0 but adding redundancy instead. That means halving my disk space, or buying additional disks :p .
I have to check my motherboard for free NVMe slots. Or I could limit the base system to the two 1 TB NVMe disks in RAID1 and put additional SSDs next to the ones I already have.

Hmm... this is not a project for just a weekend. It needs some upfront planning and buying disks. Maybe I should do it in smaller steps: first reinstall Proxmox onto the NVMe disks, and do the further migration from there.

Some day this year I'll maybe join you as a ZFS fanboy ;)
 
ZFS uses several "tricks" and is one of the most sophisticated filesystems, and of course performance is part of that optimization. There is an integrated read cache in memory (the ARC), plus things like a separate log device (SLOG) and special devices for metadata. Compression brings both more speed and more space. The integrated encryption feature adds another level of security (but it is not used by PVE). Snapshots are ingenious when you are about to make dangerous experiments. A ZFS pool and its datasets/zvols have dozens of parameters to play with.
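A few of those knobs on the command line, as a rough sketch (pool and device names are placeholders; none of this is needed for a first install):

# transparent compression (lz4 is cheap and usually a net win)
zfs set compression=lz4 rpool

# add a separate log device (SLOG) for synchronous writes
zpool add rpool log /dev/disk/by-id/nvme-SLOG-DEVICE-part1

# add a mirrored "special" vdev for metadata
zpool add rpool special mirror /dev/disk/by-id/ssd-A /dev/disk/by-id/ssd-B

# cheap snapshot before a risky experiment, and rollback if it goes wrong
zfs snapshot rpool/data/subvol-100-disk-0@before-experiment
zfs rollback rpool/data/subvol-100-disk-0@before-experiment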
Okay.... I based my information about performance mainly on this video. Are those benchmarks not true?

While I am absolutely happy to see ZFS being twice as fast as the competition, I cannot judge the presentation in the video, as I do not know the specific parameters of the tests. I am fairly sure someone else could produce a similar comparison presenting another filesystem as the winner.

"The best" thing out of alternatives always depends on what aspects you look at, what hardware you have and how you configure the actual software suite.

A simple example: if you only have old mechanical hard disks and write a single 100 GB file, all filesystems will probably show the same "performance": the speed is limited by the disk and does not depend on the filesystem's cleverness.

For me the feature set and the durability/reliability of a ZFS pool with integrated redundancy are the most important aspects. I have RAIDZ2 pools for "normal but long-term" data and mirrored pools for virtual machines. (Virtual machines really need SSD/NVMe for adequate IOPS.) Also, for VMs the SSDs should be "enterprise" grade, as consumer-grade SSDs may fail sooner than expected. (Although the quality really did improve over the last 10 years!)

Oh, did I mention that I am a fan? ;-)
Nevertheless: your mileage may vary. Have fun!
 
I just created a VM with a single disk and installed Proxmox on it using ZFS RAID0. It works great. I mainly wanted to check whether it works; it is my very first attempt at ZFS.
I've read that ZFS is still a good thing, even if you use it on a single disk. However, I might still decide to sacrifice disk space and reinstall Proxmox using a RAID1 with my two NVMe disks. The compression feature of ZFS will give me some disk space back, I guess.
 
The compression feature of ZFS will give me some disk space back, I guess.
Probably not much when using the default values: with an 8K volblocksize on top of 4K sectors, only blocks that compress by more than 50% actually save a sector. Compression would be more useful with datasets (and their larger recordsize), which only LXCs use, not VMs.

Also keep in mind that a ZFS pool shouldn't be filled to more than 80%. So with 2x 1 TB in RAID1 or 1x 1 TB in RAID0, in both cases you get about 800 GB of actually usable storage.
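Both points are easy to check on a running pool (a rough sketch; the pool name is a placeholder):

# how well the existing data actually compresses
zfs get compressratio rpool

# SIZE/ALLOC/FREE and CAP (the percentage you want to keep below ~80%)
zpool list rpool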
 
Okay, thanks. I also had another idea, but I don't know if it makes any sense: why shouldn't I use a smaller HDD as the boot drive for the OS? I still have some old 2.5" laptop disks lying around. Then I can use the fast SSDs entirely for the VM disk images and containers. Or am I talking nonsense now?
 
Okay, thanks. I also had another idea, but I don't know if it makes any sense: why shouldn't I use a smaller HDD as the boot drive for the OS? I still have some old 2.5" laptop disks lying around. Then I can use the fast SSDs entirely for the VM disk images and containers. Or am I talking nonsense now?
Sure, if the additional power consumption and a lost 2.5" slot or SATA port aren't a problem, this should be fine. PVE itself doesn't need SSD performance.
 
SATA ports aren't a problem. With your information I think I have made my plan. I'll use two small 2.5" disks in a ZFS RAID1 config for the Proxmox base system. And I am going to buy a PCIe NVMe expansion card (with 4 M.2 slots) and a bunch of 1 TB disks. When I add 4x 1 TB, then I have 6 of them in total to combine in RAIDZ1. This will result in about 4.8 TB of space for VM images and containers. The 2 TB normal SSD can be a local mirror of my Dropbox space. I can make a dedicated LXC with Samba to serve my Dropbox on the network. Then I no longer need to have Dropbox on the laptops to still have access to those files.
Sounds like the pieces of the puzzle are falling together step by step. :)
 
When I add 4x 1 TB, then I have 6 of them in total to combine in RAIDZ1. This will result in about 4.8 TB of space for VM images and containers.
6x 1 TB in raidz1 with the default ashift=12 and an 8K volblocksize would result in a usable capacity of about 2.4 TB for VMs or 4 TB for LXCs. Don't forget the padding overhead when using raidz1/2/3 with zvols, and the 20% of capacity that should be kept free. To get the padding loss down you would need to increase the volblocksize to at least 32K.
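For what it's worth, in PVE the volblocksize of newly created VM disks comes from the storage's block size setting, so something along these lines should do it (a sketch; the storage name is an example, and it only affects zvols created afterwards, since the volblocksize of an existing zvol cannot be changed):

# /etc/pve/storage.cfg
zfspool: local-zfs
        pool rpool/data
        content images,rootdir
        blocksize 32k
        sparse 1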
 
Hmm... more study needed. I do not understand the difference between VMs and LXCs. A filesystem is a filesystem, regardless of the files that are on it, is my simple thinking. But that's my lack of knowledge in this area...
 
Hmm... more study needed. I do not understand the difference between VMs and LXCs. A filesystem is a filesystem, regardless of the files that are on it, is my simple thinking. But that's my lack of knowledge in this area...
VMs don't use filesystems on the host side: VMs use zvols = block devices. LXCs use datasets = filesystems. With zvols on raidz1/2/3 you get padding overhead when the volblocksize is too low. Datasets use a dynamic recordsize instead. So it does matter what you store where.

There is an explanation of the padding overhead: https://web.archive.org/web/2021030...or-how-i-learned-stop-worrying-and-love-raidz
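You can see the difference directly on a PVE host (a sketch; the guest IDs are examples following PVE's usual naming):

# a container's root disk is a dataset -> has a (dynamic) recordsize
zfs get recordsize rpool/data/subvol-100-disk-0

# a VM's disk is a zvol -> has a fixed volblocksize, set at creation time
zfs get volblocksize rpool/data/vm-101-disk-0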
 
Hey @UdoB , did you say you're a ZFS fan? :p
I have been reading a lot over the past few days, and my gut feeling is that it would be a missed chance if I don't use ZFS when I reconfigure my system.
There's still a lot I do not understand. I am a developer, not a hardware guru.
I am playing around with a virtual Proxmox machine, just to try and see what happens when I set up a new machine with ZFS.
But then... what are the best choices when it comes to the options? What is ashift? I tried to find info, but I got swamped with posts and articles.
I am still unsure about redundancy, because I am reluctant to lose disk space. But a lot of people say that ZFS is still a good choice, even on a single disk.
 
What is ashift? I tried to find info, but I got swamped with posts and articles.
Depends on your hardware. You usually want ashift=12 when using disks with a 4K physical sector size. When only using disks with a 512B physical sector size, ashift=9 might also be an option.
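ashift is the power-of-two sector size that ZFS assumes for a vdev (2^12 = 4096 bytes) and it cannot be changed after the vdev is created. A quick sketch to check both sides (the pool name is a placeholder):

# physical vs. logical sector size reported by the disks
lsblk -o NAME,MODEL,PHY-SEC,LOG-SEC

# ashift actually used by an existing pool
zpool get ashift rpool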

I am still unsure about redundancy, because I am reluctant to lose disk space.
You will lose disk space. A ZFS pool should, for example, not be filled to more than 80% for best performance.
ZFS is all about data integrity and enterprise features. When using ZFS you are willing to sacrifice a fair amount of RAM, CPU performance, SSD lifetime, disk capacity and disk performance to get additional data integrity. But with just a single disk you will only get bit rot detection, not bit rot protection. So you usually want at least a RAID1 (mirror).
 
Yeah, @Dunuin was faster :) and he has a better understanding of block sizes (and additional loss for e.g. RaidZ) than I do.

ZFS is about reliability. If you want to be able to read the same data in the (possibly far) future as you wrote today, you need some verification mechanism and redundancy. Some modern filesystems do that; the majority of the "classic" filesystems lack this feature completely. This aspect alone makes me tend towards ZFS.

Regarding the PVE context: I have a production cluster, a test cluster and a homelab. All are ZFS-only. (My test cluster also has Ceph, but so far just for testing...)

PVE pools (better: vdevs) for storing VMs should be mirrors, so yes, you lose 50% capacity (plus the already mentioned 20% for performance/stability). But compared to RAM prices and the rest of a server (be it professional or a small homelab device), another SSD is perhaps - let's say - 10% of the price of the whole system. For that 10% higher price I can increase the reliability massively. No single disk failure - be it a sector or a complete device - will stop my systems from working.
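Creating such a mirrored VM pool by hand is also about the simplest layout there is (a sketch with placeholder device names; the PVE installer or GUI does the equivalent for you):

# two-way mirror for VM/container storage, 4K sectors assumed
zpool create -o ashift=12 vmpool mirror \
    /dev/disk/by-id/nvme-DISK-A /dev/disk/by-id/nvme-DISK-B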

Just my 2€c - ymmv!
 
This is interesting though.
I added 6x 1 TB disks to the VM, then installed Proxmox with ZFS RAIDZ1.
Now the Proxmox interface shows this:
[Screenshot: Proxmox storage summary]
But df and zpool list show different sizes...
[Screenshot: output of df and zpool list]

Anyway... I'll no longer bother you with all my newbie questions. I'll just keep tinkering with a VM until I have enough confidence to re-install from scratch.
 

When working with zvols on a raidz1/2/3 pool you also have to take the padding overhead into account. Without increasing the volblocksize you will lose the same 60% of raw capacity you would lose with a 6-disk raid10. With an ashift of 12 and 6 disks in raidz1, the volblocksize should be increased to at least 32K to lose only 36% of the raw capacity.

There is a good explanation: https://web.archive.org/web/2020020...or-how-i-learned-stop-worrying-and-love-raidz
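Part of the confusion is also that the two tools report different things: zpool list shows raw pool capacity including the space that parity will consume, while zfs list (and df) show the usable space after parity and reservations. A rough sketch (the pool name is a placeholder):

# raw pool size, including the space that parity will consume
zpool list tank

# usable space as the filesystems see it
zfs list -o name,used,avail tank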
 
Okay, here is my plan...

My current setup is:
2x 1 TB NVMe: one LV for the Proxmox OS and the LXCs and VMs (on the local-lvm storage)
1x 4 TB 2.5" SSD: data directories with movies, databases and other data, mounted into the LXCs running Plex, SQL and an SMB server.
2x 2 TB spinning HDDs: data that does not need to be fast, like ISOs and all kinds of archived data.

I want to buy two additional 1 TB NVMe disks on a PCIe expansion card, and another 4 TB 2.5" SSD.
The planned setup is then:
4x 1 TB NVMe: ZFS RAID10, for the Proxmox OS and the LXCs/VMs
2x 4 TB 2.5" SSDs: ZFS RAID1, for the data
2x 2 TB HDDs: ZFS RAID1, for the ISOs and archive

In the end the total disk space is less, but I am buying stability and protection against data loss.

Does this make sense?
I am still considering swapping the data and the system: the OS and LXCs/VMs on the SATA SSDs, and the data on the faster NVMe disks.
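If I understand the documentation correctly, creating those three pools by hand would look roughly like this (device names are placeholders, and the PVE installer would set up the NVMe pool for me):

# 4x NVMe as striped mirrors ("RAID10") for the OS and the guests
zpool create -o ashift=12 rpool \
    mirror /dev/disk/by-id/nvme-A /dev/disk/by-id/nvme-B \
    mirror /dev/disk/by-id/nvme-C /dev/disk/by-id/nvme-D

# 2x 4 TB SATA SSD mirror for the data
zpool create -o ashift=12 datapool mirror /dev/disk/by-id/ata-SSD-A /dev/disk/by-id/ata-SSD-B

# 2x 2 TB HDD mirror for ISOs and archive
zpool create -o ashift=12 archivepool mirror /dev/disk/by-id/ata-HDD-A /dev/disk/by-id/ata-HDD-B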
 
Does this make sense?
Depends on your workload. You didn't tell us which SSD models are used. Your SSDs will probably wear several times faster when using ZFS because of the additional overhead and sync writes. This is one of the reasons why it is highly recommended to only use enterprise/datacenter-grade SSDs with ZFS, as these are rated for a multiple of the write endurance. For some people consumer SSDs with their poor endurance will last for years; for others they will fail after a few months.
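A quick way to keep an eye on the wear is SMART (a sketch; the exact attribute names differ per vendor, and the PVE GUI shows a wearout percentage for many models too):

# NVMe: look at "Percentage Used" and "Data Units Written"
smartctl -a /dev/nvme0

# SATA SSD: look for a wear/endurance attribute (name varies by vendor)
smartctl -a /dev/sda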
 
