Copying 1 TB of mixed files from ext4 to RAIDZ causes VERY high load

venquessa

New Member
Aug 25, 2023
I am copying my camera dump backups from a 2 TB SSD to a RAIDZ array of 3 x 2 TB HDDs in an external USB3 enclosure. It's 50/50 JPGs and MPGs, so 50% 1-5 MB files and 50% >100 MB files.

I have tested and the drives are running at full USB3 speeds.

I tried using rsync so I could stop and restart the task, but the transfer was really slow and it resulted in 98% IOWait and a load of 18, with non-responsive VMs and GUI timeouts. The copy is running directly on the PVE node as root. No VMs/CTs are involved in it.

So I tried tar'ing it and just writing a single tar file to the RAIDz. This sped up the transfer quite a bit, but still results in 88% IOWait and a load of 14.

This is behaving exactly like a Linux disk with DMA disabled and PIO in operation, where the CPU is literally waiting on block-by-block events.

top, htop, atop and dstat all show that it "should" be normal. IOWait is not "busy" time, and SoftInt and HardInt show <1%. It just means the core has an IO wait flag set; it is free to do other tasks. I don't understand why the system is bottlenecking with a single IO task. Also, why is a single write thread producing a 90%+ IOWait state on 6 of 12 cores, literally saturating the CPU scheduler?
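For anyone wanting to check the IOWait figure independently of top and friends, here is a minimal sketch that derives it from two samples of the aggregate `cpu` line in `/proc/stat`. The function names are my own for illustration, and it assumes a reasonably modern Linux kernel that exposes at least the first eight counters.

```python
import time

# Order of the first eight counters on the aggregate "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Turn one 'cpu ...' line from /proc/stat into a dict of jiffy counters."""
    parts = line.split()
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def iowait_percent(before, after):
    """Share of elapsed jiffies accounted as iowait between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    return 100.0 * deltas["iowait"] / total if total else 0.0


def sample_iowait(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return iowait %."""
    with open(path) as f:
        before = parse_cpu_line(f.readline())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.readline())
    return iowait_percent(before, after)
```

Calling `sample_iowait()` on the PVE node while the copy runs should give roughly the number the monitoring tools report, since they read the same counters.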

The actual physical transfer rate is slow, around 55-60 MB/s, which is maybe 60% of what it should be to a basic HDD. Write is double that, which I assume is the 'actual' physical writes for the RAIDZ.
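As a rough cross-check on the "write is double" figure, this sketch computes the idealised parity overhead of a RAIDZ vdev. It assumes the pool is single-parity RAIDZ1 (the setup above only says "RAIDZ") and full-width stripes, and it deliberately ignores padding, metadata and any intent-log writes, so treat it as a lower bound rather than a prediction. The function names are hypothetical.

```python
def raidz_write_amplification(disks, parity=1):
    """Physical sectors written per logical data sector, full stripes only."""
    data_disks = disks - parity
    if data_disks < 1:
        raise ValueError("need at least one data disk")
    return disks / data_disks


def physical_write_rate(logical_mb_s, disks, parity=1):
    """Aggregate MB/s hitting the disks for a given logical write rate."""
    return logical_mb_s * raidz_write_amplification(disks, parity)


# 3 disks with 1 parity: 2 data sectors + 1 parity sector per stripe.
print(raidz_write_amplification(3))
print(physical_write_rate(60, 3))
```

Under these assumptions parity alone would account for 1.5x, not 2x, so some of the observed doubling would have to come from something this sketch does not model.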

I must be missing an important setting or something here. ZFS cannot be that flakey, even on HDDs.

Either it's in synchronous write without DMA or something else is very wrong.

I have tried "re-nicing" the tar process to -19, but it had no effect. I could try to fix up the file permissions and run the tar job as a non-root user; would that stop it saturating the cores in IOWait?

Also, why is the scheduler getting hung in IOWait? It should just context-switch in any other process in the "runnable" state, but it doesn't seem to be doing so. Or rather, the ZFS layer is generating so many IO events and context switches that the scheduler's queue is artificially far, far longer than it should be.
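One thing that can be checked here: the Linux load average counts tasks in uninterruptible sleep (state D, typically blocked on IO) as well as runnable ones, so a load of 14-18 does not by itself mean the run queue is that long. A minimal sketch for counting task states from `/proc` follows. The helper names are mine, and it only looks at top-level PIDs (which include kernel threads), not the per-thread entries under `/proc/<pid>/task`.

```python
import os
from collections import Counter


def parse_state(stat_text):
    """Extract the single-letter state from the contents of /proc/<pid>/stat."""
    # The comm field is parenthesised and may contain spaces or ')',
    # so cut after the LAST ')' instead of splitting on whitespace.
    after_comm = stat_text[stat_text.rindex(")") + 1:]
    return after_comm.split()[0]


def count_task_states(proc_root="/proc"):
    """Count processes by state (R = runnable, D = uninterruptible, S = sleeping...)."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                counts[parse_state(f.read())] += 1
        except (OSError, ValueError, IndexError):
            # The process exited between listdir() and open(), or the
            # file was unreadable; skip it.
            continue
    return counts
```

If `count_task_states()["D"]` is close to the load figure while `["R"]` stays small during the copy, the load is mostly tasks waiting on the disks rather than a long queue of runnable work.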

Hardware:
Ryzen 5600G, 96 GB DDR4-3600. Source disk: Samsung 870 EVO 2 TB. Targets: Seagate 2 TB HDDs (Compute)
 