PBS - Weird behaviour & hanging VM afterwards

iprigger

Renowned Member
Sep 5, 2009
190
43
93
earth!
Hi All,

I found a weird behaviour that makes the VM kind of "hang"... (until you open the console it simply stands still).

Seems to have something to do with the Backup / guest agent...

Log:
INFO: Starting Backup of VM 106 (qemu)
INFO: Backup started at 2020-08-05 20:01:14
INFO: status = running
INFO: VM Name: mariadb-c01
INFO: include disk 'virtio0' 'finavia_a:106/vm-106-disk-0.raw' 20G
INFO: include disk 'virtio1' 'finavia_a:106/vm-106-disk-1.raw' 100G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: skip unused drive 'esnx-a:106/vm-106-disk-1.qcow2' (not included into backup)
INFO: creating Proxmox Backup Server archive 'vm/106/2020-08-05T18:01:14Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
ERROR: VM 106 qmp command 'guest-fsfreeze-thaw' failed - got timeout
ERROR: VM 106 qmp command 'backup' failed - got timeout
ERROR: Backup of VM 106 failed - VM 106 qmp command 'backup' failed - got timeout
INFO: Failed at 2020-08-05 20:02:25

Any idea where this could come from?

I had it on various systems (mostly linux) with the qemu-ga installed.

Tobias
 
Hi All,

I found a weird behaviour that makes the VM kind of "hang"... (until you open the console it simply stands still).

Seems to have something to do with the Backup / guest agent...

Log:
INFO: Starting Backup of VM 106 (qemu)
INFO: Backup started at 2020-08-05 20:01:14
INFO: status = running
INFO: VM Name: mariadb-c01
INFO: include disk 'virtio0' 'finavia_a:106/vm-106-disk-0.raw' 20G
INFO: include disk 'virtio1' 'finavia_a:106/vm-106-disk-1.raw' 100G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: skip unused drive 'esnx-a:106/vm-106-disk-1.qcow2' (not included into backup)
INFO: creating Proxmox Backup Server archive 'vm/106/2020-08-05T18:01:14Z'
INFO: issuing guest-agent 'fs-freeze' command
INFO: issuing guest-agent 'fs-thaw' command
ERROR: VM 106 qmp command 'guest-fsfreeze-thaw' failed - got timeout
ERROR: VM 106 qmp command 'backup' failed - got timeout
ERROR: Backup of VM 106 failed - VM 106 qmp command 'backup' failed - got timeout
INFO: Failed at 2020-08-05 20:02:25

Any idea where this could come from?

I had it on various systems (mostly linux) with the qemu-ga installed.

Tobias

One more Hint: CPU Usage is going to 100% on one core for the VM in this situation... so it's rather easy to find out!

Tobias
 
how is the load on the system at that point in time?
 
Hi,

how is the load on the system at that point in time?

The Host itself is having a load between 2 and 6 (backup time). the VM is a DB Server (mariadb galera, 3 nodes) and more or less idle...

The peak comes at the time the backup kicks in:
Screenshot 2020-08-06 at 10.14.31 .png

And the load graph from the VM Server (Host):
Screenshot 2020-08-06 at 10.15.34 .png
Tobias
 
it would be interesting to try and trace the qemu-guest-agent process inside the VM - it looks like it's busy-waiting on something. e.g., attaching with 'strace -ff -p $(pidof qemu-ga)', attempting the backup, and then posting the resulting traces here