[SOLVED] understanding pct snapshot - rollback

diviaki

New Member
Sep 11, 2020
Hi!

First, thanks for the Proxmox suite. It just works, and makes my poor man's sysadmin role a lot easier.

There's one thing I'd like to better understand about snapshotting containers.
Do I need to pct stop - snapshot - start my containers to have reliable snapshots?

I'm not into Perl, but I tried to trace the call path through /usr/sbin/pct -> /usr/share/perl5/PVE/CLI/pct.pm -> PVE/API2/LXC/Snapshot.pm -> PVE/LXC/Config.pm -> PVE/API2/LXC/Snapshot.pm -> PVE/Storage.pm -> PVE/Storage/ZFSPoolPlugin.pm -> zfs snapshot, and found no traces of start/stop or suspend/resume, as if the state of the processes running in the container is not saved at all.
But obviously that would not be reliable, would it?

Cheers, Ákos
 
this depends on the underlying storage; for zfs, for example, we use a zfs subvolume, which will be consistent

but in general we freeze the container, sync the filesystems and then make the snapshot

(see also AbstractConfig.pm where most of the snapshot logic is for both vm/ct, and LXC/Config.pm where the actual implementations are)
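
The order described above (freeze, sync, snapshot, then unfreeze) can be sketched roughly like this. It is only an illustration, not the Proxmox code, which is Perl and lives in the modules named above; the four callables are hypothetical stand-ins, and unfreezing in a `finally` block is a choice made for this sketch.

```python
from typing import Callable, List


def snapshot_running_container(
    freeze: Callable[[], None],
    sync: Callable[[], None],
    snapshot: Callable[[], None],
    unfreeze: Callable[[], None],
) -> None:
    """Freeze the container, flush its filesystems, snapshot the storage,
    and unfreeze again even if one of the middle steps raises."""
    freeze()
    try:
        sync()
        snapshot()
    finally:
        unfreeze()


# Record the call order with stand-in steps (no real container is touched).
calls: List[str] = []
snapshot_running_container(
    freeze=lambda: calls.append("freeze"),
    sync=lambda: calls.append("sync"),
    snapshot=lambda: calls.append("snapshot"),
    unfreeze=lambda: calls.append("unfreeze"),
)
print(calls)
```

Note that nothing in this sequence captures process memory; only what has reached the filesystem ends up in the snapshot.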
 
Thanks for pointing to AbstractConfig.pm. Yes, I now see that the container is frozen, the mount points are synced, and finally the container is unfrozen.
I still could not find where the frozen process state is saved to disk: neither the code nor any such files.
 
it is not saved for containers, but if the applications write consistent data to the disk, this is ok (e.g. databases try to always have a consistent state on disk)
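
To make "write consistent data to the disk" a bit more concrete, here is a minimal sketch of one common pattern: write to a temporary file, fsync it, then rename it over the target. This is a generic technique on POSIX-like systems and is not taken from this thread; databases typically use more elaborate mechanisms such as write-ahead logs. The function name `write_atomically` is made up for the example.

```python
import os
import tempfile


def write_atomically(path: str, data: bytes) -> None:
    """Replace `path` so that a reader (or a storage snapshot) sees either
    the complete old content or the complete new content."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the file content to stable storage
        os.replace(tmp_path, path)  # atomic rename within one filesystem
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Also fsync the directory so the rename itself is durable.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

An application that only ever updates its files this way leaves a usable state on disk at every moment, which is what a filesystem-only snapshot relies on.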
 
Thanks Dominik.
Do you know what happens in edge cases where an application could not write all its state to disk?
 
well that data is not included in the snapshot.

if your applications do not keep their data on disk consistent, i would suggest using regular vms instead of containers. with those you can include the memory in the snapshot
(for containers this is rather unfeasible, since that would involve saving many parts (memory/open files/sockets/network connections/etc), which is not easy)
 
OK, I see. So when snapshotting a running container, only the filesystem is snapshotted. As process state is not stored for snapshots, restoring snapshots that were taken on running containers involves the risk of losing any unwritten internal caches of a program.
Going back to my opening question, this means that I do need pct stop - pct snapshot - pct start to have reliable snapshots.
Alternatively, as you suggested, VMs with memory dumps are an option.
(I also tried using pct suspend, which stores the container state, but that, or rather the underlying lxc-checkpoint, fails.)
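
For reference, the stop - snapshot - start sequence could be scripted along these lines. It is only a sketch: the VMID `101` and the snapshot name `before-upgrade` are made-up values, and by default the script prints the commands instead of running them.

```python
import subprocess
from typing import List


def offline_snapshot_commands(vmid: int, snapname: str) -> List[List[str]]:
    """Commands for an offline container snapshot, in execution order."""
    return [
        ["pct", "stop", str(vmid)],
        ["pct", "snapshot", str(vmid), snapname],
        ["pct", "start", str(vmid)],
    ]


def run_offline_snapshot(vmid: int, snapname: str, dry_run: bool = True) -> List[List[str]]:
    """Print the commands (dry run) or execute them one after another."""
    commands = offline_snapshot_commands(vmid, snapname)
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True aborts the sequence on the first failing command
            subprocess.run(cmd, check=True)
    return commands


# Hypothetical VMID and snapshot name, dry run only.
run_offline_snapshot(101, "before-upgrade")
```

If the applications should get a chance to flush their state first, `pct shutdown` (a clean shutdown request) is probably a better fit for the first step than `pct stop`.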
 
My question is what happens to LXC containers running MySQL or MSSQL servers when all containers and VMs run in a Proxmox HA cluster or backup cluster. What happens if one of the nodes dies and corosync starts them on another node? Will the databases (or other data buffered in the RAM of the LXC container or VM) be consistent? How does replication between nodes work for LXC containers and for VMs, and what are the differences? I tested MSSQL Server in an Ubuntu LXC container, and after simulating a crash of one node everything started fine on the others. I understand that inside a full VM qemu-guest-agent can dump memory after receiving a snapshot signal from the host, or we can take a snapshot with the --vmstate option, but how does this work with an LXC container? Is it safe to use containers like that in a Proxmox HA or backup cluster?
 
if a node is fenced/crashes and HA recovers the service on another node, only what has been persisted to disk will be available on the other node (provided you use shared storage of course, otherwise not even that is guaranteed).
 
Yes, I understand this. But are there any differences between LXC containers and VMs snapshotted by corosync? Are VMs safer than LXC containers in a situation like this? Does corosync make periodic snapshots of a VM with the memory state, or only the disk state, before it sends a snapshot to the other node?
 
huh? corosync doesn't send anything related to guest state to another node.. you can (manually) create a snapshot, for VMs that can include the RAM/guest state, and this snapshot can then be rolled back to. this has nothing to do with HA recovery in any way.
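
As an illustration of the manual snapshot mentioned above, a small sketch that builds the `qm` command lines for a VM snapshot with or without RAM and for rolling back to it. The VMID `100` and the snapshot name `pre-change` are made-up values, and the commands are only printed, not executed.

```python
from typing import List


def vm_snapshot_command(vmid: int, snapname: str, include_ram: bool) -> List[str]:
    """Build a `qm snapshot` call; `--vmstate 1` also saves the guest RAM."""
    cmd = ["qm", "snapshot", str(vmid), snapname]
    if include_ram:
        cmd += ["--vmstate", "1"]
    return cmd


def vm_rollback_command(vmid: int, snapname: str) -> List[str]:
    """Build the matching `qm rollback` call."""
    return ["qm", "rollback", str(vmid), snapname]


# Hypothetical VMID and snapshot name.
print(" ".join(vm_snapshot_command(100, "pre-change", include_ram=True)))
print(" ".join(vm_rollback_command(100, "pre-change")))
```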
 
Yes, I know that corosync doesn't make snapshots. However, in the replication module we can set the interval at which data is synchronized between the nodes in the cluster, for example every 5 minutes.
I was asking about this process. When LXC containers and VMs are snapshotted and sent to another node, is only the state of the disks (ZFS) sent, or are the buffers inside the VMs flushed by qemu-guest-agent before this process? And are there any differences between VMs and LXC containers during this replication between nodes?
 
the replication is on the storage layer only, and uses regular snapshots (no RAM snapshots/guest state dump for VMs).
 
OK. So the conclusion is that we have no guarantee that, after one of the nodes fails, VMs or LXC containers will be consistent on another node as of the last replication. A better way is to take snapshots of the VM with the --vmstate 1 option and replicate this snapshot to the other nodes (and, if the VM is not consistent, roll back to the snapshot on the other node). Am I right?
 
if you use replication, the replicated state will always
- be consistent only on the storage level, there is no RAM dump involved
- be (at least slightly) out-of-sync, unless the guest is not running at all
 
