RAIDZ1 ZFS Pool Uses a Lot of CPU on Live Migration

marciglesias17

Hello,

I have a cluster of 5 machines, each with many cores and 6x 1TB SSDs in a ZFS RAIDZ1 pool. When I live-migrate VMs from one node to another, the load goes up a lot and migrations are not smooth. Does anyone have an idea what this could be?

ZFS version:
Code:
zfs-0.8.5-pve1
zfs-kmod-0.8.5-pve1

Code:
  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 0 days 02:52:05 with 0 errors on Sun Aug 14 03:16:08 2022
config:

        NAME                                                   STATE     READ WRITE CKSUM
        rpool                                                  ONLINE       0     0     0
          raidz1-0                                             ONLINE       0     0     0
            ata-Samsung_SSD_860_EVO_1TB_S3Z9NB0NA15616V-part3  ONLINE       0     0     0
            ata-Samsung_SSD_860_EVO_1TB_S3Z9NB0NA15611T-part3  ONLINE       0     0     0
            ata-Samsung_SSD_860_EVO_1TB_S3Z9NB0NA15614J-part3  ONLINE       0     0     0
            ata-Samsung_SSD_860_EVO_1TB_S3Z9NB0NA15612F-part3  ONLINE       0     0     0

errors: No known data errors

Thanks,
 
Normal with your setup. You use consumer-grade SSDs for everything, which are terrible with respect to performance. Please google or search the forums for why this is the case. To get a performant system, please just use enterprise-grade SSDs.
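
You can measure this yourself: consumer SSDs without power-loss protection collapse on sync-write-heavy workloads like ZFS. A minimal fio sketch (the file path is just an example, point it at a dataset on your pool):

Code:
# 4k sync writes with an fsync after every write; enterprise SSDs with
# power-loss protection typically score orders of magnitude higher here.
fio --name=synctest --filename=/rpool/fio-test.file --size=1G \
    --rw=write --bs=4k --ioengine=psync --fsync=1 \
    --runtime=60 --time_based --group_reporting
rm /rpool/fio-test.file   # remove the test file afterwards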
 
> Normal with your setup. You use consumer-grade SSDs for everything, which are terrible with respect to performance. Please google or search the forums for why this is the case. To get a performant system, please just use enterprise-grade SSDs.

I have been using this type of disk in several setups with hardware RAID and XFS, and it works perfectly. I don't think this is the problem.
 
> I have been using this type of disk in several setups with hardware RAID and XFS, and it works perfectly. I don't think this is the problem.

ZFS in combination with consumer SSDs is the problem.

Look at the benchmarks here:
https://forum.proxmox.com/threads/proxmox-ve-zfs-benchmark-with-nvme.80744

Edit: RAIDZ is also not great performance-wise (IOPS) in comparison to e.g. a RAID10.
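
For comparison, a striped-mirror ("RAID10"-style) layout scales IOPS with the number of mirror vdevs; a sketch with placeholder pool and disk names:

Code:
# Each mirror vdev contributes its own IOPS, so six disks as three
# mirrors deliver roughly 3x the write IOPS of a single RAIDZ1 vdev.
zpool create tank \
    mirror /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2 \
    mirror /dev/disk/by-id/ata-DISK3 /dev/disk/by-id/ata-DISK4 \
    mirror /dev/disk/by-id/ata-DISK5 /dev/disk/by-id/ata-DISK6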

PS: Your ZFS version (and therefore probably your PVE version) is rather old.
 
I have upgraded to the latest version of Proxmox and the problem still occurs. Any ideas? I have these same disks in another cluster with hardware RAID and I don't have these problems there.
 
> I have upgraded to the latest version of Proxmox and the problem still occurs. Any ideas?
You mean besides the cause that @Neobin and I already pointed out: the consumer SSDs? Swap them out for enterprise-grade SSDs and your ZFS pool will be faster.

> I have these same disks in another cluster with hardware RAID and I don't have these problems there.
Yes, your hardware RAID controller "caches away" the problems you have with ZFS: its (typically battery-backed) write cache acknowledges sync writes instantly, while ZFS talks to the disks directly and exposes their real sync-write performance.
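
If you want to verify this, one common test is to temporarily take sync writes out of the picture (testing only; this sacrifices data safety on power loss, and "rpool" here is just your pool name):

Code:
# Check the current sync setting
zfs get sync rpool
# TEST ONLY: acknowledge sync writes from RAM, much like a write-back cache would.
# If the migration load drops, slow sync writes on the consumer SSDs are the cause.
zfs set sync=disabled rpool
# ...run a test live migration, then revert immediately:
zfs set sync=standard rpool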
 
