You have a task hung for more than 120 seconds while writing to a zvol, which indicates that ZFS was having trouble completing I/O to the underlying drives.
> rpool: raidz1 3 x 4TB NVME drives
> storage: raidz1 8 x 12TB SATA hard disks
> The trick is that some partitions of the NVME drives are ZIL/LOGS and L2ARC caches for storage pool
You built it wrong.
Full stop. Re-architect.
RAIDZx is not good for VM workloads, and reusing partitions on the rpool NVMe drives as ZIL/SLOG and L2ARC for the storage pool is probably causing I/O contention and general confusion.
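If you want to see that contention for yourself before tearing things down, watch per-vdev latency while a VM is writing. A quick sketch - pool names taken from your post, adjust to match `zpool list`:

```
# Per-vdev throughput plus average latency (-l), refreshed every 5 seconds.
zpool iostat -v -l rpool storage 5

# The hung-task warnings themselves end up in the kernel log:
dmesg | grep -i "blocked for more than"
```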
Assumption: you use at least four identical devices for that. Mirrors, RaidZ1, and RaidZ2 are all possible - theoretically.
Technically correct answer: yes, it works. But the right answer is: no, do not do that! The recommendation is very clear: use “striped mirrors”. This results in something similar to a classic RAID10.
(1) RaidZ1 (and Z2 too) gives you the IOPS of a single device, completely independent of the actual number of physical devices. With the “four devices, mirrored” approach you get two mirror vdevs, which doubles that - twice as many operations per second. For a large-file...
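For reference, the striped-mirror layout being recommended looks like this. A minimal sketch - the pool name and device paths are placeholders, use your real /dev/disk/by-id/ paths:

```
# Two mirror vdevs striped together = RAID10-style pool.
# Each mirror vdev adds roughly one device's worth of write IOPS.
zpool create -o ashift=12 tank \
  mirror /dev/disk/by-id/nvme-diskA /dev/disk/by-id/nvme-diskB \
  mirror /dev/disk/by-id/nvme-diskC /dev/disk/by-id/nvme-diskD
```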
--What I would recommend:
o Mirror for rpool, and use different make/model SSDs so they don't both wear out around the same time (think EVO and Pro; you want one to wear out faster). Back up the ZFS rpool to the 3rd NVMe drive if you want, or repurpose it - a send/recv sketch follows this list.
o Mirrors for LXC/VM vdisk backing storage, so interactive response is better
o RAIDZ2 for bulk storage / media, where interactive response is not an issue (see the layout sketch after this list). You might have a bad time with "raidz1 8 x 12TB SATA hard disks" when things start failing, especially if they're not NAS-rated disks. Desktop-class hard drives can cause Weird Behavior with ZFS when they start failing; their firmware handles error recovery differently than NAS firmware (no TLER/ERC, so a dying disk retries internally for ages and stalls the pool). The odds of a 2nd disk falling over during the replacement / resilver (especially with drives over ~2-4TB) are not in your favor.
o Separate devices for ZIL / SLOG (if you even need these; generally you don't unless you're doing NFS / lots of sync writes) and L2ARC - see the vdev sketch after this list.
You can try moving the L2ARC to e.g. 64GB PNY USB3 thumbdrives. Inexpensive, disposable, the pool doesn't fall into a black hole if they fail, and they're easily replaced if you have spares (buy a 4-5 pack). L2ARC can survive a reboot (persistent L2ARC, OpenZFS 2.0+), where ARC does not.
o If you have a lot of small files and your scrubs are taking more than ~24 hours, add a mirrored special SSD vdev for metadata (also in the vdev sketch below). Again, different make/model to minimize double-failure odds.
https://forum.level1techs.com/t/zfs-metadata-special-device-z/159954
o Consider adding a hot spare to the pool if you have extra drive bay(s) - with 12TB disks you want at least 1-2 spares lying around if you can afford it (shown in the layout sketch below). Waiting for a replacement drive to show up in the mail is nail-biting time, hoping all the while that the pool doesn't alter the deal and fail any further.
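To make the RAIDZ2 and hot-spare bullets concrete, here's what the rebuild could look like. A sketch only - the pool name and /dev/disk/by-id/ paths are placeholders:

```
# 8-wide RAIDZ2 for bulk/media: any two disks can fail without data loss.
zpool create -o ashift=12 storage raidz2 \
  /dev/disk/by-id/ata-disk1 /dev/disk/by-id/ata-disk2 \
  /dev/disk/by-id/ata-disk3 /dev/disk/by-id/ata-disk4 \
  /dev/disk/by-id/ata-disk5 /dev/disk/by-id/ata-disk6 \
  /dev/disk/by-id/ata-disk7 /dev/disk/by-id/ata-disk8

# Hot spare: ZFS swaps it in automatically when a member disk faults.
zpool add storage spare /dev/disk/by-id/ata-disk9
```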
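And the auxiliary vdevs from the bullets above, same caveats (placeholder device names; only add a SLOG if you actually have sync-write traffic):

```
# Mirrored SLOG on two small dedicated SSDs - only accelerates sync writes.
zpool add storage log mirror /dev/disk/by-id/ssdA /dev/disk/by-id/ssdB

# L2ARC cache device - safe to lose, and safe to remove again later.
zpool add storage cache /dev/disk/by-id/usb-stickA
zpool remove storage /dev/disk/by-id/usb-stickA

# Mirrored special vdev for metadata / small blocks.
# WARNING: unlike log and cache, losing the special vdev loses the pool.
zpool add storage special mirror /dev/disk/by-id/ssdC /dev/disk/by-id/ssdD
```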
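For the rpool backup mentioned in the first bullet, a minimal send/recv sketch, assuming a hypothetical pool named nvbackup created on the spare NVMe:

```
# One-off pool on the third NVMe (placeholder device path).
zpool create -o ashift=12 nvbackup /dev/disk/by-id/nvme-disk3

# Recursive snapshot of rpool, then replicate the whole dataset tree.
zfs snapshot -r rpool@backup1
zfs send -R rpool@backup1 | zfs recv -F nvbackup/rpool-backup
```

Note this copies the datasets, not the bootloader, so it won't make the third drive bootable by itself.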
--When you get back up and running, check the Wearout indicator in Nodes / (nodename) / Disks. If any are above ~50-80%, proactively replace them. With SSD/NVMe, you want a high TBW rating if you're not going with enterprise-grade drives.
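The same wearout numbers are available from the CLI if you'd rather script the check. A sketch using smartmontools:

```
# NVMe: "Percentage Used" is the drive's own wear estimate (0-100+).
smartctl -a /dev/nvme0n1 | grep -i "percentage used"

# SATA SSDs: look for Wear_Leveling_Count / Media_Wearout_Indicator instead.
smartctl -A /dev/sda
```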