Backup script hook missing events?

fendrich, Dec 13, 2023
Dear all,

I'm running PVE 7.4 in a 6-node Ceph cluster with an active HA manager.
We currently run nightly backups.

My goal is to stop specific services on a Windows guest VM during the fs-freeze command, which is called during a snapshot backup, to prevent database corruption.
As the VM is in an HA cluster, using the backup mode stop is not a workable option.

To do so, I read about the --script argument and implemented a script for my VM that uses qm guest exec to stop and start the service on the guest.

The issue I'm having right now is that there is no hook on the fs-freeze command and no hook after fs-thaw is called.
The only suitable events would be backup-start to stop the service and backup-end to resume it.
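
For illustration, a minimal sketch of how a hook script could map these phases to service actions. It assumes vzdump calls the script with the phase as the first argument and, for per-guest phases, the backup mode and VMID as the second and third. The function name hook_action and the service name MyService are hypothetical, and the qm guest exec command lines are only printed here, not executed.

```shell
#!/bin/bash
# Sketch only: maps vzdump hook phases to a guest service action.
# Assumption: vzdump passes <phase> [<mode> <vmid>] as positional arguments.

# Hypothetical helper: prints the command that would be run for a phase.
hook_action() {
    local phase="$1" mode="$2" vmid="$3"   # mode is unused in this sketch
    local service="MyService"              # placeholder service name
    case "$phase" in
        backup-start)
            echo "qm guest exec $vmid -- net stop $service" ;;
        backup-end|backup-abort)
            # Also restart after an aborted backup so the service is not left stopped.
            echo "qm guest exec $vmid -- net start $service" ;;
        *)
            echo "noop" ;;
    esac
}

# When used as the real hook script, dispatch on the actual arguments:
# hook_action "$@"
```

With this layout the service stays down for the whole transfer, which is exactly the drawback described next.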


As the backup is pretty large, it takes a long time to copy to our network storage, and I do not want to wait until that completes before resuming the service on the guest.

The only thing I can think of right now is to call both commands in the backup-start hook and delay the resume of the service by a few seconds.
But that seems a bit off to me.

Here is some detailed information:

pveversion -v:

Code:
proxmox-ve: 7.4-1 (running kernel: 5.13.19-3-pve)
pve-manager: 7.4-16 (running version: 7.4-16/0f39f621)
pve-kernel-5.15: 7.4-4
pve-kernel-5.13: 7.1-9
pve-kernel-5.15.108-1-pve: 5.15.108-2
pve-kernel-5.13.19-6-pve: 5.13.19-15
pve-kernel-5.13.19-3-pve: 5.13.19-7
ceph: 16.2.13-pve1
ceph-fuse: 16.2.13-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown: residual config
ifupdown2: 3.1.0-1+pmx4
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4.1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-2
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.7
libpve-storage-perl: 7.4-3
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.3-1
proxmox-backup-file-restore: 2.4.3-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-widget-toolkit: 3.7.3
pve-cluster: 7.3-3
pve-container: 4.4-6
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-4~bpo11+1
pve-firewall: 4.3-5
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1


Is there any other way to achieve this? Did I get the hooks wrong, and do I need to use pre-stop and post-restart?

The only information I found about the process timing was this:
https://forum.proxmox.com/threads/e...fsfreeze-thaw-failed-got-timeout.68082/page-2

But maybe the log messages are not in the correct order?
 
AFAICS there are no points at which the hook script runs at those times. For a snapshot-based backup, the pre-stop and post-restart hooks are not run when you'd expect (since a snapshot backup on QEMU does not restart the guest or even make a 'real' snapshot).
Maybe we could add hooks for that, but seeing how the code is laid out, I think it's not really trivial to implement.
You might want to open a feature request on our bugtracker so that other devs can see the request too (no promises if or when we'd implement that though): https://bugzilla.proxmox.com
 
Note that the guest agent itself has hook points that are executed before freezing and after thawing.
 
My goal is to stop specific services on a Windows guest VM during the fs-freeze command, which is called during a snapshot backup, to prevent database corruption.
There is usually a way to do this from inside the VM. For example, the QEMU guest agent provides a "--fsfreeze-hook" option.
 
Having the hook in the guest would be the best solution; unfortunately, I cannot find any information about it.
Also, my qemu-ga.exe does not seem to have this option.

Code:
PS C:\Program Files\Qemu-ga> .\qemu_ga.exe --help
Usage: C:\Program Files\Qemu-ga\qemu_ga.exe [-m <method> -p <path>] [<options>]
QEMU Guest Agent 106.0.1
Copyright (c) 2003-2022 Fabrice Bellard and the QEMU Project developers

  -m, --method      transport method: one of unix-listen, virtio-serial,
                    isa-serial, or vsock-listen (virtio-serial is the default)
  -p, --path        device/socket path (the default for virtio-serial is:
                    \\.\Global\org.qemu.guest_agent.0,
                    the default for isa-serial is:
                    COM1).
                    Socket addresses for vsock-listen are written as
                    <cid>:<port>.
  -l, --logfile     set logfile path, logs to stderr by default
  -f, --pidfile     specify pidfile (default is C:\ProgramData\qemu-ga\qemu-ga.pid)
  -t, --statedir    specify dir to store state information (absolute paths
                    only, default is C:\ProgramData\qemu-ga)
  -v, --verbose     log extra debugging information
  -V, --version     print version information and exit
  -d, --daemonize   become a daemon
  -s, --service     service commands: install, uninstall, vss-install, vss-uninstall
  -b, --block-rpcs  comma-separated list of RPCs to disable (no spaces,
                    use "help" to list available RPCs)
  -D, --dump-conf   dump a qemu-ga config file based on current config
                    options / command-line parameters to stdout
  -r, --retry-path  attempt re-opening path if it's unavailable or closed
                    due to an error which may be recoverable in the future
                    (virtio-serial driver re-install, serial device hot
                    plug/unplug, etc.)
  -h, --help        display this help and exit

See <https://qemu.org/contribute/report-a-bug> for how to report bugs.
More information on the QEMU project at <https://qemu.org>.


I always thought that option was Linux-only?
 
For Windows you need to use the VSS integration.
 
I'm already doing this. The QEMU Guest Agent VSS Provider service is installed, as well as the SQL Server VSS Writer.
We still face the issue that one of our databases got corrupted during the snapshot backup procedure.

So I'd really prefer stopping and restarting the MSSQL Server service.
 
That's not an option unfortunately, unless Windows provides hook points in VSS itself that I am not aware of. If you want 100% consistency like that, you can always use shutdown mode (but the downtime will be longer, and you lose dirty bitmap support for PBS, so the backup itself will likely also take longer).
 
As mentioned in my initial post, the server is in an HA group, so stop mode is not possible, is it?

So back to the start, where I use the host to manage the service...

Maybe using backup-start and constantly checking the output of qm guest cmd <vm> fsfreeze-status.
But I don't really like those workaround solutions...
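
A rough sketch of that polling idea, for anyone who wants to try it. It assumes that qm guest cmd <vm> fsfreeze-status prints either thawed or frozen, possibly wrapped in quotes, so the output is normalized before comparing. The helper names normalize_status, read_status and wait_for_status, as well as the QM_BIN override, are hypothetical and only there so the logic can be tried without a VM.

```shell
#!/bin/bash
# Sketch only. Assumption: the status command prints thawed or frozen,
# possibly quoted and followed by a newline.

QM_BIN="${QM_BIN:-qm}"   # overridable so the sketch can be tried without a VM

# Strip quotes and whitespace from the raw command output.
normalize_status() {
    printf '%s' "$1" | tr -d '"[:space:]'
}

# Query the guest agent once; prints "unknown" if the call fails.
read_status() {
    local vmid="$1" raw
    if raw=$("$QM_BIN" guest cmd "$vmid" fsfreeze-status 2>/dev/null); then
        normalize_status "$raw"
    else
        echo "unknown"
    fi
}

# Poll until the wanted status is seen or the timeout (in seconds) expires.
# Returns 0 if the status was seen, 1 on timeout.
wait_for_status() {
    local vmid="$1" wanted="$2" timeout="$3"
    local deadline=$(( $(date +%s) + timeout ))
    while [ "$(date +%s)" -lt "$deadline" ]; do
        [ "$(read_status "$vmid")" = "$wanted" ] && return 0
        sleep 0.2
    done
    return 1
}
```

A hook could then call wait_for_status "$vmid" frozen 60 followed by wait_for_status "$vmid" thawed 60 from a background job, and start the service again in either case.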
 
Sorry, overlooked that.

Of course there are ways to implement workarounds like that. You could also leave the monitoring to the VM by querying the VSS state inside it; that way a single guest-exec call at backup-start would be enough: stop the DB service, fork a background task that waits for the freeze -> thaw cycle or a timeout, and then start the DB again.
 
Ok.
If anyone is interested in my solution:

1. For maintenance reasons I do not want to keep the scripts on each host, so I put them in /etc/pve/backup-hookscripts.
As you cannot chmod +x files in this location, I have a script called /etc/vzdump.sh which does this:

Code:
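#!/bin/bash
# Refresh the local copy of the hook scripts from the cluster filesystem.
rm -rf /etc/backup-hookscripts/
cp -r /etc/pve/backup-hookscripts /etc/
# Files under /etc/pve cannot be made executable, so set the bit on the copy.
chmod +x /etc/backup-hookscripts/*.sh
/etc/backup-hookscripts/global.sh "$@"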

Yes, this is all executed at each step of the backup process, but for me it's the easiest way to keep the scripts updated and maintain them in one location without a third-party tool syncing them.
Maybe I'll add an md5 check to test whether the copy is necessary, or check the phase and only copy on job-start, but for now it'll do.
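
The md5 check mentioned above could look roughly like this. It is a sketch under the assumption that comparing file names and contents is enough to detect changes. The function names dir_checksum and sync_hookscripts are hypothetical, and the two directories are passed as arguments so the logic can be tried on temporary directories first.

```shell
#!/bin/bash
# Sketch only: copy the hook scripts only when their contents changed.

# Print one md5 over all file names and contents below a directory.
# Prints nothing if the directory does not exist.
dir_checksum() {
    ( cd "$1" 2>/dev/null &&
      find . -type f -print0 | sort -z | xargs -0 -r md5sum | md5sum | cut -d' ' -f1 )
}

# Replace dst with a fresh copy of src if the checksums differ.
# Prints "synced" or "unchanged".
sync_hookscripts() {
    local src="$1" dst="$2"
    if [ "$(dir_checksum "$src")" = "$(dir_checksum "$dst")" ]; then
        echo "unchanged"
        return 0
    fi
    rm -rf "$dst"
    cp -r "$src" "$dst"
    chmod +x "$dst"/*.sh
    echo "synced"
}

# Possible use in /etc/vzdump.sh (paths taken from the post above):
# sync_hookscripts /etc/pve/backup-hookscripts /etc/backup-hookscripts >/dev/null
```

File modes are deliberately not part of the checksum, so the chmod +x on the copy does not trigger a new copy on the next run.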

2. In global.sh I try to call a VM-specific script; here is a snippet:

Code:
[...]
vmscript="/etc/pve/backup-hookscripts/$vmid.sh"
if [ -e "$vmscript" ]; then
        log_and_rotate "calling $vmscript with $phase $backupMode $vmid"
        bash "$vmscript" "$phase" "$backupMode" "$vmid"
fi

3. In the VM script, in this case, I do something like this:

Code:
[...]

# Runs in the background: waits for a frozen -> thawed transition (or the
# timeout) and then starts the service again.
function check_fsfreeze_status_background() {
    log_and_rotate "background thread started"
    local previous_status status start_time elapsed_time should_start=0
    previous_status=$(get_fsfreeze_status)
    start_time=$(date +%s)
    while true; do
        status=$(get_fsfreeze_status)
        if [ "$previous_status" == "frozen" ] && [ "$status" == "thawed" ]; then
            log_and_rotate "VM was frozen, but is now thawed again, we can restart the service!"
            should_start=1
        fi
        previous_status="$status"

        elapsed_time=$(( $(date +%s) - start_time ))
        log_and_rotate "$elapsed_time"
        # Safety net: never leave the service stopped longer than the timeout.
        if [ "$should_start" -eq 0 ] && [ "$elapsed_time" -ge "$timeout_seconds" ]; then
            log_and_rotate "no freeze/thaw cycle seen before the timeout, starting the service!"
            should_start=1
        fi

        if [ "$should_start" -gt 0 ]; then
            start_service
            break
        fi

        sleep 0.1
    done
}

[...]

if [ "$phase" == "backup-start" ]; then
        log_and_rotate "backup start called"
        stop_service
        # Detach the watcher so the hook returns immediately and vzdump can continue.
        (check_fsfreeze_status_background) </dev/null &>/dev/null &
        disown $!
        log_and_rotate "continue"
fi
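
As a side note, the decision made inside the loop above can be isolated into a small pure function, which makes it possible to check the transition logic without a VM. This is only a sketch; decide_restart is a hypothetical name and is not part of the scripts in this post.

```shell
#!/bin/bash
# Sketch only: decide whether the service should be started again.
# Arguments: previous status, current status, elapsed seconds, timeout seconds.
# Prints "start" or "wait".
decide_restart() {
    local previous="$1" current="$2" elapsed="$3" timeout="$4"
    if [ "$previous" = "frozen" ] && [ "$current" = "thawed" ]; then
        echo "start"   # the freeze/thaw cycle has completed
    elif [ "$elapsed" -ge "$timeout" ]; then
        echo "start"   # safety net: do not leave the service stopped
    else
        echo "wait"
    fi
}
```

For example, decide_restart frozen thawed 3 60 prints start, while decide_restart thawed frozen 3 60 prints wait.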


I'd prefer to have the VM handle this, but working with VSS writers seems to be quite difficult.

If anyone sees something that could be improved, feedback is welcome.
 