Crash with data loss in a 2-node cluster

MH_MUC

Active Member
May 24, 2019
Hi everyone,

I have a problem and I hope that someone knows how to prevent it in the future.

I am running a two-node cluster with ZFS replication and HA. Both nodes use ZFS and are running:
Kernel Linux 5.15.131-2-pve #1 SMP PVE 5.15.131-3
pve-manager/7.4-17

Both servers have local-zfs (rpool) and a larger data-pool.
Last night server1 became unreachable, so server2 sent me the fencing messages. However, server2 didn't bring the VMs up and was itself not reachable through SSH.

I SSHed into server1, which was reachable again but showed no free disk space on /.
I was in a bit of a panic, so I didn't check the actual ZFS status. I just deleted some template files to free up disk space and rebooted the machine.
It was then up and running again, showing several hundred GB of free disk space in both pools.

After a hard reboot of server2 it was reachable again and brought up all VMs (also the ones previously hosted on server1 with no failback option). This would have been the expected behaviour during fencing.

However, major data loss occurred. Everything between the evening of the 9th and last night is gone and is also not included in the Proxmox backups on the PBS.

Last log message in the VMs syslogs:
Code:
Jan  9 21:00:00 hostname qemu-ga: info: guest-ping called
Jan  9 21:00:01 hostname qemu-ga: info: guest-fsfreeze called

I think the freeze of the guest filesystem for the 22:00 local time backup on the 9th somehow never got released. That would have filled the ZFS pool until the server crashed because no free disk space was left, and the freeze was then released automatically after the reboot, with the data loss as a result.
On the other hand, I can see ZFS usage starting to increase 2 hours before yesterday's crash.

Is there a way to confirm this, and how can I make sure it doesn't happen again?
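
As a stopgap, something like the following could run from cron on each node so that a filling root filesystem is noticed before it reaches 100%. This is only a rough sketch: the function names and the 10% threshold are placeholders I made up, and it only uses the generic filesystem view from Python's shutil.disk_usage, not ZFS-specific accounting (snapshots, reservations), so it would not explain why the pool filled up.

```python
import shutil


def free_fraction(path="/"):
    """Return the fraction of free space (0.0 to 1.0) on the filesystem holding `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        # Pseudo-filesystems can report a total of 0; treat that as "no free space".
        return 0.0
    return usage.free / usage.total


def is_low(path="/", threshold=0.10):
    """True if the free fraction on `path` is below `threshold` (placeholder default: 10%)."""
    return free_fraction(path) < threshold


if __name__ == "__main__":
    # Only "/" is checked here; mountpoints of other pools would have to be added by hand.
    for mountpoint in ("/",):
        status = "LOW" if is_low(mountpoint) else "ok"
        print(f"{mountpoint}: {free_fraction(mountpoint):.1%} free [{status}]")
```

The output could be mailed or fed into whatever monitoring is already in place. It would only give an early warning, not fix the underlying cause.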

Thank you for any help!
 
