Right now, it is impossible to use Proxmox VE + Proxmox Backup Server to get a coherent snapshot of a multi-VM system. I'm thinking of a SharePoint farm where the database and the content are on different servers, but the problem applies to many situations.
The problem is that each VM is backed up individually. The system tells the QEMU guest agent to ask the operating system to freeze writes to the disk, then the system takes a snapshot, then it thaws the disks. When that VM is done, it moves on to the next VM, and so on. Because there is no coordination between VMs, the database backup might reference content on another server that does not exist in that server's backup.
The solution appears to be fairly simple - define a backup group. When a backup is started for any VM in the group, send the freeze request to every server in the group, and while they are all frozen, snapshot them all, then thaw them all. The system can then process the snapshots sequentially as usual, and the group of backups will be consistent with each other because they all reflect the state of the group of servers during a single freeze.
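To make the ordering concrete, here is a minimal Python sketch of what I mean. It is not how Proxmox VE or Proxmox Backup Server work today, and `snapshot_group` and its `freeze`, `snapshot` and `thaw` callables are hypothetical names of my own. On a real host they might wrap something like `qm guest cmd <vmid> fsfreeze-freeze`, `qm snapshot <vmid> <snapname>` and `qm guest cmd <vmid> fsfreeze-thaw`, but that is an assumption, and the snapshot step just stands in for whatever point-in-time mechanism the backup actually uses.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical callback type: takes a VM ID and performs one action on that VM.
VmAction = Callable[[int], None]


def snapshot_group(
    vmids: Sequence[int],
    freeze: VmAction,
    snapshot: VmAction,
    thaw: VmAction,
) -> List[int]:
    """Freeze every VM in the group, snapshot them all, then thaw them all.

    Returns the VM IDs that were snapshotted. If a freeze or snapshot raises,
    the exception propagates, but every VM frozen so far is still thawed.
    """
    frozen: List[int] = []
    snapshotted: List[int] = []
    try:
        # Phase 1: freeze the whole group before any snapshot is taken.
        for vmid in vmids:
            freeze(vmid)
            frozen.append(vmid)
        # Phase 2: snapshot each VM while all of them are still frozen.
        for vmid in frozen:
            snapshot(vmid)
            snapshotted.append(vmid)
    finally:
        # Phase 3: thaw in reverse order, even after a failure.
        for vmid in reversed(frozen):
            thaw(vmid)
    return snapshotted


if __name__ == "__main__":
    # Stand-in actions that only record what would happen, in order.
    log: List[Tuple[str, int]] = []
    done = snapshot_group(
        [101, 102, 103],
        freeze=lambda v: log.append(("freeze", v)),
        snapshot=lambda v: log.append(("snapshot", v)),
        thaw=lambda v: log.append(("thaw", v)),
    )
    for step in log:
        print(step)
    print("snapshotted:", done)
```

The thaw sits in a `finally` block so that a failed freeze or snapshot does not leave guests frozen. The freezes are still issued one after another, so the first VM stays frozen longer than the last, and the freeze window grows with the size of the group.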