ZFS raidz3 high IO w/o scrub running

stratoss

Member
Jan 6, 2020
I'm running Proxmox 6.2-11 with 256GB ECC RAM and the following raidz3 pool:

  pool: rpool
 state: ONLINE
  scan: scrub repaired 0B in 0 days 15:25:14 with 0 errors on Sun Jan 10 15:49:29 2021
config:

        NAME                                         STATE     READ WRITE CKSUM
        rpool                                        ONLINE       0     0     0
          raidz3-0                                   ONLINE       0     0     0
            ata-HGST_HUS726T6TALE6L4_V8JDUHAR-part4  ONLINE       0     0     0
            ata-HGST_HUS726T6TALE6L4_V8JDWWAR-part4  ONLINE       0     0     0
            ata-HGST_HUS726T6TALE6L4_V8JB5AYR-part4  ONLINE       0     0     0
            ata-HGST_HUS726T6TALE6L4_V8JBYYTR-part4  ONLINE       0     0     0
            ata-ST6000NM0115-1YZ110_ZAD9MHXH-part4   ONLINE       0     0     0
            ata-ST6000NM0115-1YZ110_ZAD9M248-part4   ONLINE       0     0     0
            ata-ST6000NM0115-1YZ110_ZAD9MQ9E-part4   ONLINE       0     0     0
            ata-ST6000NM0115-1YZ110_ZAD9MN5A-part4   ONLINE       0     0     0

errors: No known data errors


NAME    SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
rpool  43.6T  9.71T  33.9T        -         -    10%    22%  1.00x  ONLINE  -


For a few weeks now I've been seeing a noticeable increase in IO wait (2-3%) on the Proxmox host, with txg_sync maxing out at 95-99.99% IO in iotop. On one of the guest instances (Debian) I can see jbd2/vda1-8 at 99.99% most of the time as well.

There is plenty of RAM available on both guest and host. What could the issue be?
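For context, txg_sync is the ZFS thread that writes each transaction group (txg) out to disk, so it showing up in iotop is normal; what matters is how long each sync takes. A rough way to inspect that (a sketch, assuming OpenZFS 0.8.x on Linux and a pool named rpool):

```shell
# Sketch: inspect ZFS transaction-group (txg) activity for a pool.
# Assumes OpenZFS 0.8.x on Linux; adjust POOL to match your pool name.
POOL=rpool

# Per-txg history: dirty data written and sync duration for recent txgs.
if [ -r "/proc/spl/kstat/zfs/${POOL}/txgs" ]; then
    tail -n 5 "/proc/spl/kstat/zfs/${POOL}/txgs"
else
    echo "txg kstats not available (zfs module not loaded?)"
fi

# How often ZFS forces a txg to sync even when idle (default: 5 seconds).
if [ -r /sys/module/zfs/parameters/zfs_txg_timeout ]; then
    cat /sys/module/zfs/parameters/zfs_txg_timeout
fi
```

If the per-txg dirty amounts are small but the sync times are long, the bottleneck is usually latency on one of the vdev members rather than raw throughput.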
 
Hi,
what services are running in the guest? Is this Debian running in an LXC container or a VM? jbd2 is the ext4 journaling thread (journaling block device), so it is most likely not the root cause itself; something else is probably generating the IO that it commits.

On a side note: make sure your system stays up to date, PVE 6.3 has already been released.
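To find the actual writer behind jbd2 inside the guest, something along these lines can help (a sketch for a Linux guest; iotop and sysstat may need to be installed first, and the commands need root):

```shell
# Sketch: narrow down which process generates the writes that jbd2 commits.
# Assumes a Linux guest; iotop/pidstat availability is not guaranteed.

# Accumulated per-process IO over ~10 seconds, busiest processes first:
command -v iotop >/dev/null \
    && iotop -aoP -b -n 2 -d 5 | head -n 20 \
    || echo "iotop not installed"

# Alternative from the sysstat package: per-process disk stats, 2 samples:
command -v pidstat >/dev/null \
    && pidstat -d 5 2 \
    || echo "pidstat not installed"

# Dirty-page pressure can also hint at a bursty writer:
DIRTY=$(grep -E '^(Dirty|Writeback):' /proc/meminfo || true)
echo "$DIRTY"
```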
 
The guest wasn't the root cause: I stopped all guests and rebooted the host without any luck, and txg_sync was still using 99.99% of the IO most of the time. After upgrading to Proxmox 6.3 (ZFS 0.8.4 -> 0.8.6) the issue is gone for now.
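Since the fix coincided with the ZFS 0.8.4 -> 0.8.6 jump, it is worth confirming that the running kernel module was actually updated, not just the package (the loaded module can lag behind until a reboot). A quick check, assuming OpenZFS 0.8 or later:

```shell
# Sketch: report the OpenZFS version the kernel is actually running.
# The loaded module can lag behind the installed package until a reboot.
if command -v zfs >/dev/null; then
    # Userland + kernel module versions (subcommand exists in ZFS 0.8+):
    zfs version 2>/dev/null
    ZFS_MOD_VERSION=$(cat /sys/module/zfs/version 2>/dev/null || echo unknown)
else
    ZFS_MOD_VERSION="zfs-not-installed"
fi
echo "module: $ZFS_MOD_VERSION"
```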

Screenshot_2021-01-11 mainframe - Proxmox Virtual Environment.png

Thanks!
 