After fighting with ZFS memory hunger, poor performance, and random reboots, I have just replaced it with mdraid (RAID 1), ext4, and simple qcow2 images for the VMs, stored on the ext4 file system. On paper this setup should be the least efficient because of the multiple layers of abstraction (both md and the file system have to be "traversed" to reach the actual VM data), but we have still seen a 5x improvement in VM responsiveness, in database query times inside the VMs, and in backup (vzdump) speed in production. Also, no more reboots, and all of the RAM is now available for the VMs, instead of only half of it.
I have just run some tests in a lab environment.
Hardware: 64 GB RAM, dual Xeon
Disk controller: simple SATA3 controller integrated on the server mainboard
Disk configuration: 2x WD GOLD 2 TB
First test: standard installation of PVE 5.2 (from CD, no online updates), using RAIDZ-1. I uploaded a simple VM backup (3 GB of data on the virtual disk) and performed a vzdump "in place", meaning that I dumped the VM from and to the same physical disks.
Time to dump 3 GB of data (using the default "backup" function of PVE): 12 minutes.
Second test: on the same hardware I installed Debian 9 with md RAID 1 and an ext4 file system on top of the md device, then set up PVE on top of Debian. I uploaded the same VM backup and ran the identical procedure.
Time to dump 3 GB of data (using the default "backup" function of PVE): 2 minutes, 6 seconds.
This is nearly a SIXFOLD IMPROVEMENT (about 5.7x).
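For anyone who wants to check the arithmetic, here is a quick sketch that turns the two timings above into a speedup factor and an average throughput. It assumes the "3 GB" on the virtual disk means 3 GiB (3072 MiB); the variable names are just for illustration.

```python
# Timings reported in the two tests above.
DATA_MIB = 3 * 1024        # 3 GB of VM data, treated as 3 GiB (assumption)
zfs_seconds = 12 * 60      # first test (ZFS): 12 minutes
md_seconds = 2 * 60 + 6    # second test (md + ext4): 2 minutes, 6 seconds

speedup = zfs_seconds / md_seconds
zfs_rate = DATA_MIB / zfs_seconds   # average MiB/s during the dump
md_rate = DATA_MIB / md_seconds

print(f"speedup:   {speedup:.2f}x")
print(f"ZFS:       {zfs_rate:.1f} MiB/s")
print(f"md + ext4: {md_rate:.1f} MiB/s")
```

With these numbers the speedup comes out at about 5.71x, and the averages at roughly 4.3 MiB/s for the ZFS run versus 24.4 MiB/s for the md + ext4 run. Keep in mind that reads and writes hit the same disks in both tests, so these are not raw disk throughput figures.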
CONCLUSIONS: Please, PVE developers, PLEASE, PLEASE, PLEASE, consider offering mdraid as an installation option in the PVE ISO images. It is quite clear to me that ZFS has a LOT of disadvantages: it's slow, memory-hungry, and crash-prone (because of OOM situations).
CAVEATS:
I know that I should use a proper RAID controller and LVM, and ditch ZFS. But why spend a lot of money on a RAID controller when, in some low-end (and mid-range) setups, md RAID works really well?
I know that the first test uses PVE 5.2 and the second uses 5.3 (from the free PVE repos). Still, I don't think that the performance improvement is caused by using 5.3 instead of 5.2.
I know I can tune ZFS to make it stop using up half of the available RAM, but speed will only get worse anyway.
I know I can set up PVE on Debian as I did, and not bother the PVE developers with something I can do by myself. Still, I can't believe it's so hard to support LVM on mdraid as a setup option. What's wrong with mdraid? I use it everywhere, I have used it for 15 years (maybe 20), and I have NEVER HAD ANY ISSUE AT ALL.
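For reference, the ZFS tuning mentioned in the caveats usually means capping the ARC through the zfs_arc_max module parameter, which takes a value in bytes. Below is a small sketch that builds the corresponding modprobe line. The 8 GiB cap is only an example value, not a recommendation, and the helper name arc_max_line is made up for illustration.

```python
def arc_max_line(limit_gib: int) -> str:
    """Return a modprobe options line capping the ZFS ARC at limit_gib GiB."""
    if limit_gib <= 0:
        raise ValueError("ARC limit must be a positive number of GiB")
    # zfs_arc_max is expressed in bytes.
    return f"options zfs zfs_arc_max={limit_gib * 1024**3}"

# Example: cap the ARC at 8 GiB (arbitrary value for illustration).
print(arc_max_line(8))
```

The printed line would typically go into /etc/modprobe.d/zfs.conf. On a system that boots from ZFS, the initramfs normally has to be regenerated and the machine rebooted before the new limit applies.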