z_wr_iss high CPU usage and high CPU load

harmonyp

I have created an NVMe RAIDZ ZFS pool and have noticed it uses a lot more CPU power than a similar setup using RAID-10.

Cloning a 40GB template causes the server's load to skyrocket, and you can really feel the lag when trying to run anything else during this process.

At peaks it's using almost half of my server's CPU power (AMD EPYC 7502P) for short bursts. This feels really excessive.

Is there anything I can do to reduce this load? Also I presume RAID0 would give the best performance?
 
I have created an NVMe RAIDZ ZFS pool and have noticed it uses a lot more CPU power than a similar setup using RAID-10.
That's normal. RAID10 just writes data to multiple disks without any big computations. For raidz it is way more complex: parity data has to be computed for every write, so a raidz will always be more CPU-heavy.
Cloning a 40GB template causes the server's load to skyrocket, and you can really feel the lag when trying to run anything else during this process.

At peaks it's using almost half of my server's CPU power (AMD EPYC 7502P) for short bursts. This feels really excessive.
Did you optimize your volblocksize? If you are just using the default 8K volblocksize you are wasting a lot of capacity and losing performance due to padding overhead.
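You can check what your existing zvols are using, roughly like this (the dataset name is just an example, yours will differ):

Code:
# list all zvols with their current volblocksize
zfs list -t volume -o name,volblocksize
# or check a single virtual disk
zfs get volblocksize rpool/data/vm-100-disk-0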
Is there anything I can do to reduce this load? Also I presume RAID0 would give the best performance?
Yes, but then you have no redundancy and ZFS won't be able to heal itself if data gets corrupted.
 
Did you optimize your volblocksize? If you are just using the default 8K volblocksize you are wasting a lot of capacity and losing performance due to padding overhead.
No, I have not. What do you suggest for a mix of Windows/Linux VMs on SAMSUNG MZQLB1T9HAJR-00007 drives in RAIDZ (3 drives)? And the same for RAID10 (4 drives)?
 
No, I have not. What do you suggest for a mix of Windows/Linux VMs on SAMSUNG MZQLB1T9HAJR-00007 drives in RAIDZ (3 drives)? And the same for RAID10 (4 drives)?
If you use a pool that was created with ashift=12, the best should be an 8K volblocksize for a raid10 of 4 drives or a 16K volblocksize for a raidz1 of 3 disks.
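If I remember right you can change that "Block Size" either in the GUI (Datacenter -> Storage -> your ZFS storage) or on the CLI, roughly like this (the storage name is just an example):

Code:
# set the volblocksize used for newly created virtual disks on this storage
pvesm set local-zfs --blocksize 16k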
 
If you use a pool that was created with ashift=12, the best should be an 8K volblocksize for a raid10 of 4 drives or a 16K volblocksize for a raidz1 of 3 disks.
Ok. I've got a problem: both ZFS storages in my cluster have the same name.

[screenshot of the cluster's ZFS storages showing the same name]


If I change them manually will the cluster storage setting change it back?
 
Also keep in mind that you can't simply change the volblocksize of existing zvols. The volblocksize can only be set at creation, so you will need to destroy and recreate all virtual disks for a change of the "Block Size" of the pool to take effect. The best way to do this is to shut down all VMs, back them up, and overwrite them by restoring them from the backup. If you then start the VMs they should be using the new volblocksize and the pool should be around 33% emptier.
You don't need to replace LXCs because they use the 128K recordsize instead of the volblocksize.
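As a rough sketch of that backup/restore cycle (VMID, storage names and the archive path are just examples, adjust them to your setup):

Code:
# stop the VM and back it up (storage "backup-store" is an example)
qm shutdown 100
vzdump 100 --storage backup-store --mode stop
# restore over the same VMID so the zvol gets recreated with the new volblocksize
# (archive path is an example, use the file vzdump actually produced)
qmrestore /var/lib/vz/dump/vzdump-qemu-100-2021_01_01-00_00_00.vma.zst 100 --force --storage local-zfs
qm start 100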
 
Also keep in mind that you can't simply change the volblocksize of existing zvols. The volblocksize can only be set at creation, so you will need to destroy and recreate all virtual disks for a change of the "Block Size" of the pool to take effect. The best way to do this is to shut down all VMs, back them up, and overwrite them by restoring them from the backup. If you then start the VMs they should be using the new volblocksize and the pool should be around 33% emptier.
You don't need to replace LXCs because they use the 128K recordsize instead of the volblocksize.
Ok, thanks a lot for the info. Can I ask why you suggest 8K for RAID-10 and 16K for RAIDZ1? How are you calculating it?

If I change this now, new virtual machines will use the new value, correct? I can backup/restore the existing ones at a later date.

Also, as in my last message, there are multiple nodes in our cluster with this setting (8K), and there is no way to individually set the Block Size in the GUI for each node. If I manually adjust this, will the GUI setting change it back at a later date?
 
Ok, thanks a lot for the info. Can I ask why you suggest 8K for RAID-10 and 16K for RAIDZ1? How are you calculating it?
For striped pools you want the block size of your pool (so 4K if an ashift of 12 is used) multiplied by the number of data-bearing disks. So if you have two mirrors of 2 SSDs each striped together with an ashift of 12, that is 2 * 4K = 8K.
For raidz it is more complex. Here is a table that shows the parity+padding loss for different volblocksizes for raidz1/2/3 with 3 to 24 disks. And here is a blog post by a leading ZFS engineer explaining how raidz works and how the values in that spreadsheet are calculated.
So a raidz1 of 3 disks with an ashift of 12 and a volblocksize of 8K will lose 50% of the raw capacity (33% to parity + 17% to padding overhead). With a volblocksize of 16K you only lose 33% of the raw capacity because there is no padding overhead.
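To make those raidz1 numbers concrete, here is the rough arithmetic behind them (assuming the usual rule that raidz pads every allocation up to a multiple of parity+1 sectors):

Code:
# raidz1 of 3 disks, ashift=12 -> 4K sectors, allocations padded to a multiple of (parity+1) = 2 sectors
# 8K volblocksize : 2 data + 1 parity = 3 sectors -> padded to 4 sectors (16K on disk) -> 50% of raw capacity lost
# 16K volblocksize: 4 data + 2 parity = 6 sectors -> already a multiple of 2 (24K on disk) -> 33% of raw capacity lost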

If I change this now, new virtual machines will use the new value, correct?
Yes.
Also, as in my last message, there are multiple nodes in our cluster with this setting (8K), and there is no way to individually set the Block Size in the GUI for each node. If I manually adjust this, will the GUI setting change it back at a later date?
I don't know. Maybe the staff can answer this.
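For what it's worth, the storage definition itself is shared cluster-wide in /etc/pve/storage.cfg, and a zfspool entry looks roughly like this (names and values are just an example). If you really need different block sizes per node, separate entries restricted with the "nodes" option might be a workaround:

Code:
zfspool: local-zfs
        pool rpool/data
        content images,rootdir
        blocksize 16k
        sparse 1
        nodes node1,node2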
 
For striped pools you want the block size of your pool (so 4K if an ashift of 12 is used) multiplied by the number of data-bearing disks. So if you have two mirrors of 2 SSDs each striped together with an ashift of 12, that is 2 * 4K = 8K.
For raidz it is more complex. Here is a table that shows the parity+padding loss for different volblocksizes for raidz1/2/3 with 3 to 24 disks. And here is a blog post by a leading ZFS engineer explaining how raidz works and how the values in that spreadsheet are calculated.
So a raidz1 of 3 disks with an ashift of 12 and a volblocksize of 8K will lose 50% of the raw capacity (33% to parity + 17% to padding overhead). With a volblocksize of 16K you only lose 33% of the raw capacity because there is no padding overhead.


Yes.

I don't know. Maybe the staff can answer this.
Ok thanks for your help.

Will this also adjust if I just migrate the machine, rather than doing a backup/restore?
 
Ok thanks for your help.

Will this also adjust if I just migrate the machine, rather than doing a backup/restore?
Not sure, but I guess that would work too. As long as it deletes the old zvol and creates a new one, that should work.
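If you try it, something roughly like this should move the VM together with its local disks to another node and recreate the disk on the target storage (VMID, node and storage names are just examples, and I would double check the resulting volblocksize afterwards):

Code:
qm migrate 100 othernode --online --with-local-disks --targetstorage local-zfs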
 
