Hi,
I have a scheduled backup job that backs up my system daily.
Sometimes, while the backup is running, the system spawns one arc_reclaim process and many arc_prune processes. These processes eat up CPU time and cause the VMs that are still running to hit timeout errors of various kinds. Even SSH to the host lags!
Source and destination are separate ZFS pools (on separate disks) in the same server, so networking is not a factor. Both pools have mirrored SSDs as log devices, and the source pool also has SSDs as cache devices.
The problem never resolved itself until I rebooted (I only had the patience to wait a few hours).
Code:
root@host:~# pveversion
pve-manager/5.4-7/fc10404a (running kernel: 4.15.18-16-pve)
Here is a screenshot of htop when this happens:
Edit: I just saw that a scheduled ZFS scrub is running on one of the pools today. I think that could be causing the problem. Is there a trick to running a scrub and a backup at the same time (maybe a bwlimit on the backup)?
Edit: I killed the backup job, but arc_reclaim and arc_prune are still at 100% CPU.
Edit: I stopped the scrub with zpool scrub -s, but the processes are still at max CPU :/