What problem with MariaDB did you see? The first issue we saw with MariaDB was that queries suddenly stopped returning. When the node was part of a cluster, the entire cluster then stopped returning queries, I presume because it was waiting forever for a node to commit. My memory is sketchy on this, but restarting the mariadb service got them working OK again. So at that point it was not a broken FS. MariaDB could have been more helpful about why queries were never returning, but there was nothing in the logs at all. This would start after a snapshot had concluded, and it wasn't random: it happened every single time without fail. This was an Ubuntu 22.04 container with MariaDB 10.6.7; we upgraded to 10.8 and that problem stopped. It seemed others had the same problem, so it was not just us.
But then we encountered the next problem, which was intermittent. When you do a snapshot, on completion the entire filesystem disappears: you can log in to a VM or container, but you can't call any binaries or list any directories, it just freezes. The only way you know what is going on is if you have a process inside that reports the problem, or your syslog goes to an external syslog server. I think this is why people have concentrated on the QEMU guest agent: it notices that it can't access the filesystem and then tells the outside that the fs-thaw has timed out, but we know better. The same issue occurs on a VM not running the guest agent, which means it's nothing to do with fs-freeze/thaw IMO.
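For anyone wanting the "process inside that reports the problem" part, here is a minimal sketch of the idea, not what we actually run. It assumes a Python process started before the snapshot, so everything it needs is already loaded in memory, and a remote syslog server reachable over UDP. The names `probe_fs` and `run_watchdog` and the address 192.0.2.10 are made up for illustration. A cached directory listing may still return on a hung filesystem, so a write plus fsync could be a better probe in practice.

```python
import logging
import logging.handlers
import os
import threading
import time


def probe_fs(path="/", timeout=5.0, op=os.listdir):
    """Return True if op(path) finishes within `timeout` seconds.

    The call runs in a daemon thread so that a hung filesystem blocks
    only that thread and not the watchdog itself.
    """
    done = threading.Event()

    def worker():
        try:
            op(path)
        except OSError:
            pass  # an error is still an answer: the filesystem responded
        done.set()

    threading.Thread(target=worker, daemon=True).start()
    return done.wait(timeout)


def run_watchdog(syslog_host="192.0.2.10", syslog_port=514, interval=30):
    """Probe / forever and report hangs to a remote syslog server.

    syslog_host is a placeholder documentation address, replace it.
    """
    log = logging.getLogger("fs-watchdog")
    log.setLevel(logging.INFO)
    # With an (host, port) tuple, SysLogHandler sends over UDP by default,
    # so reporting does not depend on the local filesystem.
    log.addHandler(
        logging.handlers.SysLogHandler(address=(syslog_host, syslog_port))
    )
    while True:
        if not probe_fs("/"):
            log.error("filesystem probe on / timed out, fs may be hung")
        time.sleep(interval)
```

Call `run_watchdog()` from a long-running service inside the guest. Each hung probe leaves a blocked thread behind, so on a guest that stays frozen for a long time you would want to skip new probes while one is still outstanding.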
Dig a bit deeper and you find that it seems to be an issue with QEMU that dates back to a kernel change over 12 months ago.
What puzzles me is this: if it fundamentally doesn't work, why isn't there more noise in the groups? So maybe it's a specific combination of our hardware? Just in case, here it is:
Code:
Dell PowerEdge R450, Intel(R) Xeon(R) Silver 4310 CPU, 32GB of HMA82GR7DJR8N-XN, MegaRAID Tri-Mode SAS3516, 4x MTFDDAK480TDT, ZFS
Kernel : Linux 5.15.53-1-pve #1 SMP PVE 5.15.53-1