One ZFS pool slows down all other ZFS pools

Jannoke

Renowned Member
Jul 13, 2016
This might be more of a ZFS question, but maybe someone has had experience with it.

So I have a few ZFS pools on one machine.
One of the pools runs on simple consumer (DRAM-less) NVMe drives (3-disk raidz1, one disk missing). There is only one virtual machine on that specific pool. While filling the disk from that pool's guest machine using:
Bash:
dd if=/dev/urandom of=filltest.bin bs=1M count=95000 status=progress
It starts at 220 MB/s, and after around 5 minutes and 20 GB it has dropped to around 100-120 MB/s. At the same time, all other guest machines start to report high IO even though they are on different pools. And verifiably, IO is very sluggish on all these virtual machines, to the point that some services start to report being down. If I cancel the dd, everything recovers.

There is plenty of RAM (512 GB) and CPU (dual Xeon 2699v4). The machine is also not heavily loaded on either the CPU or disk side before starting dd.

So how can it be that one virtual machine on a pool unrelated to the other machines can take down the entire server's ZFS storage? What am I missing here? Is it thrashing my ARC?
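One way to test the ARC suspicion would be to watch the ARC size against its ceiling while the dd runs. The following is only a minimal sketch: it assumes the standard ZFS-on-Linux kstat file /proc/spl/kstat/zfs/arcstats with its `size` and `c_max` rows, and the helper name `arc_usage` is made up for illustration.

```shell
# Hypothetical helper: print current ARC size and its maximum in GiB.
# Reads an arcstats-style file (columns: name, type, data).
# The default path assumes ZFS on Linux; pass another file to override.
arc_usage() {
    file="${1:-/proc/spl/kstat/zfs/arcstats}"
    awk '
        $1 == "size"  { size = $3 }   # current ARC size in bytes
        $1 == "c_max" { max  = $3 }   # configured ARC ceiling in bytes
        END {
            printf "ARC size %.1f GiB of max %.1f GiB\n",
                   size / 1073741824, max / 1073741824
        }
    ' "$file"
}

# Example use while the dd is running (not executed here):
#   while sleep 5; do arc_usage; done
```

If the size stays pinned at the ceiling while the other pools stall, that would at least be consistent with ARC pressure. It would not prove it, and per-pool numbers from `zpool iostat -v 1` would be worth capturing alongside.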
Bash:
root@bla:~# pveversion
pve-manager/9.0.9/117b893e0e6a4fee (running kernel: 6.14.11-2-pve)
root@bla:/etc/modprobe.d# cat zfs.conf 
options zfs zfs_arc_min=10737418240
options zfs zfs_arc_max=64424509440
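For readability, the two ARC limits above can be converted from bytes to GiB. This is a small sketch using plain shell integer arithmetic; the function name `bytes_to_gib` is only illustrative.

```shell
# Convert a byte count to whole GiB (1 GiB = 1073741824 bytes).
bytes_to_gib() {
    echo $(( $1 / 1073741824 ))
}

bytes_to_gib 10737418240   # zfs_arc_min -> prints 10
bytes_to_gib 64424509440   # zfs_arc_max -> prints 60
```

So the ARC is allowed to range between 10 GiB and 60 GiB of the 512 GB of RAM.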

It happens with both sync=disabled and sync=standard.
 
> One of pools is running on simple consumer (dramless) nvme's (3 disk raidz1, one disk missing)

Seriously, you're running a consumer-level 3-disk raidz DEGRADED with 1 disk MISSING, and posting about it here?? Fix your pool first.

If you want better speed, rebuild it as a mirror pool, with enterprise-level SSDs or at least high-TBW-rated drives like the Lexar NM790.
 