I am copying my Camera Dump backups from a 2TB SSD to a RAIDZ array of 3 x 2TB HDDs in an external USB3 enclosure. It's 50/50 JPGs and MPGs, so roughly half 1-5MB files and half >100MB files.
I have tested the drives and they are running at full USB3 speeds.
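For reference, the raw sequential read test was along these lines, run against the whole device (/dev/sdX is a placeholder for the actual disk):

dd if=/dev/sdX of=/dev/null bs=1M count=4096 iflag=direct status=progress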
I tried using rsync so I could stop and restart the task, but the transfer speed was really slow and it resulted in 98% IOWait and a load of 18: non-responsive VMs and GUI timeouts. This is running directly on the PVE node as root, no VMs/CTs involved.
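The rsync invocation was along these lines (paths here are just placeholders):

rsync -a --partial --info=progress2 /mnt/ssd/CameraDump/ /tank/CameraDump/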
So I tried tar'ing the source and just writing a single tar file to the RAIDZ. This sped up the transfer quite a bit, but it still results in 88% IOWait and a load of 14.
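Roughly like this (again, placeholder paths):

tar -cf /tank/CameraDump/cameradump.tar -C /mnt/ssd/CameraDump .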
This is behaving exactly like a Linux disk with DMA disabled and PIO in operation, where the CPU is literally waiting on block-by-block events.
"top, htop, atop, dstat" all show that it "should" be normal. IOWait is not "busy" time. SoftInt and HardInt show <1%. It's just means the core have an IO interrupt wait flag set. It is free to do other tasks. I don't understand why is the system bottlenecking with a single IO Task? Also, why is a single write thread producing 90%+ IOWait state on 6 of 12 cores, literally saturating the CPU scheduler?
The actual physical transfer rate is slow, maybe 60% of what it should be to a basic HDD, around 55-60MB/s. The write rate is double that, which I assume is the 'actual' physical writes for the RAIDZ.
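Rough numbers, assuming this is RAIDZ1 across the 3 disks: a full stripe is 2 data + 1 parity, so ~60MB/s of logical writes should show up as about 60 x 1.5 ≈ 90MB/s of physical writes, and metadata, padding and small records push that higher, so seeing roughly 2x isn't completely out of line.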
I must be missing an important setting or something here. ZFS cannot be that flaky, even on HDDs.
Either it's doing synchronous writes without DMA or something else is very wrong.
I have tried re-nicing the tar process to -19, but with no effect. I could try fixing up the file permissions and running the tar job as a non-root user; would that stop it saturating the cores in IOWait?
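For clarity, that was essentially the following, with the PID taken from top (shown here as a placeholder):

renice -n -19 -p <tar_pid>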
Also, why is the scheduler getting hung in IOWait? It should just context-switch to any other process in a runnable state, but it doesn't seem to be doing so. Or rather, the ZFS layer is generating so many IO events and context switches that the scheduler's queue is artificially far, far longer than it should be.
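Something like the following should show whether the run queue is actually backing up (vmstat is from procps, pidstat from sysstat):

vmstat 1         # r = runnable tasks, b = blocked on IO, cs = context switches/sec
pidstat -w 5     # per-task voluntary/involuntary context switches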
Hardware:
Ryzen 5600G, 96GB DDR4-3600. Source disk: Samsung EVO870 2TB. Targets: Seagate 2TB HDDs (Compute).