This happened on Arch Linux too, and the fix was to increase the timer to 1000.
Got any links or sources for that? I couldn't find anything, and the Arch Linux kernel is configured to tick at 300 Hz.
And yes, we're using 250 Hz in combination with "CONFIG_NO_HZ_IDLE" (disable ticks on idle CPU cores), which is a good trade-off between timer accuracy and wake-ups per second. Wake-ups are costly, both CPU- and power-wise (only the former really matters for PVE, the latter is more of a concern for mobile devices).
We're using dynamic ticks now, so you already get anything between 100 and 1500, depending on need, see:
https://elinux.org/Kernel_Timer_Systems#Dynamic_ticks
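If you want to verify the tick configuration of the kernel you're actually running, it's part of the build config; something like this should work on a stock PVE/Debian kernel (assuming the config file is shipped under /boot, output trimmed to the relevant lines):
Code:
# grep -E 'CONFIG_HZ|CONFIG_NO_HZ' /boot/config-$(uname -r)
CONFIG_NO_HZ_IDLE=y
CONFIG_HZ_250=y
CONFIG_HZ=250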
Mainly it comes from ZFS not giving the disk I/O back to the kernel scheduler, and so it freezes.
That sounds like a bug in ZFS, which you could report to ZFS on Linux. Such a thing would be a general issue and not solved by increasing timer ticks; that may reduce the chance of hitting it, but it's never a real solution for such bugs.
The 120s comes 100% from ZFS...
It can come from ZFS (I never stated that it couldn't) and does in your case and some others, but it can also come from bugs in certain NICs or their driver/firmware (you will find reports of that here as well), or from anything else blocking for longer than 120s, like doing I/O on a dead NFS mount. And the ZFS case itself can stem from different issues (that's what I tried to say in my last reply), some of them already solved...
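For context, the 120s itself is not a ZFS value at all: it's the default threshold of the kernel's hung-task watchdog, which fires for any task stuck in uninterruptible sleep that long. A rough sketch of how to inspect it (the sysctl exists on stock kernels, the value shown is the usual default):
Code:
# sysctl kernel.hung_task_timeout_secs   # the 120 s threshold
kernel.hung_task_timeout_secs = 120
# dmesg | grep -A 20 'blocked for more than'   # the warning plus the stack trace of the stuck task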
I have this issue on all my ZFS Proxmox installations - just do a dd to a VM image and after several tens of GB it stops.
That's the issue, I cannot reproduce this:
Code:
# zfs create -V $[128 * 1<<30] toms-big-pool/foo # creates a 128 GiB zvol
# dd if=/dev/urandom of=/dev/toms-big-pool/foo bs=1M count=$[1<<16] # write 64 GiB of random data (urandom, so it does not block)
65536+0 records in
65536+0 records out
68719476736 bytes (69 GB, 64 GiB) copied, 543.355 s, 252 MB/s
Or do you do something else? Clear step-by-step instructions, so we can make sure there's no difference, would help us most in trying to reproduce this.
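If you can trigger it again, it would also help to watch the pool from a second shell while the dd runs, so we can see whether throughput really drops to zero at the point of the hang; roughly like this (pool name is just an example, adapt it):
Code:
# zpool iostat -v toms-big-pool 5   # per-vdev bandwidth, refreshed every 5 seconds
# dmesg -w                          # in another shell, follow the kernel log for the hung-task traces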
What hardware do you have? The more details the better: disks, RAM, CPU, vendor, HW RAID or not, ...
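Something along these lines would already cover most of what we need (all standard tools on a PVE node, adapt as needed):
Code:
# pveversion -v                   # PVE, kernel and ZFS package versions
# lscpu                           # CPU model and core count
# free -h                         # installed RAM
# lsblk -o NAME,MODEL,SIZE,ROTA   # disks behind the pool (ROTA=1 means spinning disk)
# zpool status                    # pool layout and any reported errors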