Any logfile that can give a hint?
You have to activate debugging on the qemu-guest-agent to get more out of it. Please see /etc/default/qemu-guest-agent.
Any logfile that can give a hint?
You have to activate debugging on the qemu-guest-agent to get more out of it. Please see /etc/default/qemu-guest-agent.
]# rpm -ql qemu-guest-agent
/etc/qemu-ga
/etc/qemu-ga/fsfreeze-hook
/etc/qemu-ga/fsfreeze-hook.d
/etc/sysconfig/qemu-ga
/usr/bin/qemu-ga
/usr/lib/systemd/system/qemu-guest-agent.service
/usr/lib/udev/rules.d/99-qemu-guest-agent.rules
/usr/share/doc/qemu-guest-agent-2.12.0
/usr/share/doc/qemu-guest-agent-2.12.0/COPYING
/usr/share/doc/qemu-guest-agent-2.12.0/README
/usr/share/man/man8/qemu-ga.8.gz
/usr/share/qemu-kvm/qemu-ga
/usr/share/qemu-kvm/qemu-ga/fsfreeze-hook.d
/usr/share/qemu-kvm/qemu-ga/fsfreeze-hook.d/mysql-flush.sh.sample
/var/log/qemu-ga
[general]
verbose = 1
Now this is the weirdest ... the only change I made was adding
2019-11-19 01:15:02 INFO: Starting Backup of VM 105 (qemu)
2019-11-19 01:15:02 INFO: status = running
2019-11-19 01:15:03 INFO: update VM 105: -lock backup
2019-11-19 01:15:03 INFO: VM Name: 3605-EU
2019-11-19 01:15:03 INFO: include disk 'scsi0' 'local-lvm:vm-105-disk-0' 200G
2019-11-19 01:15:03 INFO: backup mode: snapshot
2019-11-19 01:15:03 INFO: ionice priority: 7
2019-11-19 01:15:03 INFO: creating archive '/var/lib/vz/dump/vzdump-qemu-105-2019_11_19-01_15_02.vma.lzo'
2019-11-19 02:15:03 ERROR: VM 105 qmp command 'guest-fsfreeze-freeze' failed - got timeout
2019-11-19 02:15:13 ERROR: VM 105 qmp command 'guest-fsfreeze-thaw' failed - got timeout
2019-11-19 02:15:13 INFO: started backup task 'd08c500e-f900-43ca-91d1-a99f90af0e69'
root@server36:~# qm agent 105 fsfreeze-status
thawed
root@server36:~# qm agent 105 fsfreeze-freeze
# ... 10 minutes pass, nothing happens, 105 is dead to the world, so I decide to break this and thaw it.
^C
root@server36:~# qm agent 105 fsfreeze-thaw
QEMU guest agent is not running
root@server36:~# qm agent 105 fsfreeze-status
QEMU guest agent is not running
# ... so now the agent is gone AWOL, ok, so let's try to reset it:
root@server36:~# qm stop 105
trying to acquire lock...
can't lock file '/var/lock/qemu-server/lock-105.conf' - got timeout
# ... let's skip the lockfile
root@server36:~# qm stop 105 --skiplock true
trying to acquire lock...
can't lock file '/var/lock/qemu-server/lock-105.conf' - got timeout
# dpkg -l | grep -i qemu
ii pve-qemu-kvm 3.0.1-4 amd64 Full virtualization on x86 hardware
ii qemu-server 5.0-54 amd64 Qemu Server Tools
# rpm -qa | grep -i qemu
qemu-guest-agent-2.12.0-3.el7.x86_64
INFO: starting new backup job: vzdump 105 --compress lzo --mode snapshot
INFO: Starting Backup of VM 105 (qemu)
INFO: Backup started at 2019-12-08 09:19:26
INFO: status = running
INFO: update VM 105: -lock backup
INFO: VM Name: 3605-EU
INFO: include disk 'scsi0' 'local-lvm:vm-105-disk-0' 200G
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating archive '/var/lib/vz/dump/vzdump-qemu-105-2019_12_08-09_19_26.vma.lzo'
INFO: started backup task 'a57a2dab-4ba8-4705-b219-10407a15e711'
INFO: status: 0% (1337982976/214748364800), sparse 0% (884207616), duration 3, read/write 445/151 MB/s
INFO: status: 1% (2233073664/214748364800), sparse 0% (1005961216), duration 8, read/write 179/154 MB/s
INFO: status: 2% (4345233408/214748364800), sparse 0% (1129299968), duration 20, read/write 176/165 MB/s
INFO: status: 3% (6456868864/214748364800), sparse 0% (1167572992), duration 33, read/write 162/159 MB/s
INFO: status: 4% (8734900224/214748364800), sparse 0% (1202794496), duration 46, read/write 175/172 MB/s
INFO: status: 5% (10904141824/214748364800), sparse 0% (1270341632), duration 59, read/write 166/161 MB/s
INFO: status: 6% (13033930752/214748364800), sparse 0% (1279619072), duration 73, read/write 152/151 MB/s
INFO: status: 7% (15062728704/214748364800), sparse 0% (1284661248), duration 85, read/write 169/168 MB/s
INFO: status: 8% (17287348224/214748364800), sparse 0% (1286201344), duration 98, read/write 171/171 MB/s
INFO: status: 9% (19526189056/214748364800), sparse 0% (1483706368), duration 110, read/write 186/170 MB/s
INFO: status: 10% (21613838336/214748364800), sparse 0% (1534369792), duration 122, read/write 173/169 MB/s
INFO: status: 11% (23688118272/214748364800), sparse 0% (1556529152), duration 134, read/write 172/171 MB/s
INFO: status: 12% (25915817984/214748364800), sparse 0% (1593987072), duration 147, read/write 171/168 MB/s
INFO: status: 13% (27976990720/214748364800), sparse 0% (1614819328), duration 160, read/write 158/156 MB/s
INFO: status: 14% (30184439808/214748364800), sparse 0% (1624965120), duration 172, read/write 183/183 MB/s
INFO: status: 15% (32255442944/214748364800), sparse 0% (1643663360), duration 183, read/write 188/186 MB/s
INFO: status: 16% (34422652928/214748364800), sparse 0% (1720737792), duration 196, read/write 166/160 MB/s
INFO: status: 17% (36585472000/214748364800), sparse 0% (1748467712), duration 209, read/write 166/164 MB/s
INFO: status: 18% (38690488320/214748364800), sparse 0% (1933090816), duration 221, read/write 175/160 MB/s
INFO: status: 19% (41015377920/214748364800), sparse 1% (3174694912), duration 228, read/write 332/154 MB/s
INFO: status: 20% (43097849856/214748364800), sparse 1% (3286736896), duration 241, read/write 160/151 MB/s
INFO: status: 21% (45141721088/214748364800), sparse 1% (3290038272), duration 252, read/write 185/185 MB/s
INFO: status: 22% (47431548928/214748364800), sparse 1% (3535233024), duration 265, read/write 176/157 MB/s
INFO: status: 23% (49410801664/214748364800), sparse 1% (3651522560), duration 275, read/write 197/186 MB/s
INFO: status: 24% (51646758912/214748364800), sparse 1% (3714195456), duration 288, read/write 171/167 MB/s
INFO: status: 25% (53721563136/214748364800), sparse 1% (3740983296), duration 300, read/write 172/170 MB/s
INFO: status: 26% (55959945216/214748364800), sparse 1% (4093202432), duration 312, read/write 186/157 MB/s
INFO: status: 27% (58036518912/214748364800), sparse 2% (4879020032), duration 320, read/write 259/161 MB/s
INFO: status: 28% (60243443712/214748364800), sparse 2% (4894195712), duration 333, read/write 169/168 MB/s
INFO: status: 29% (62284627968/214748364800), sparse 2% (4907098112), duration 345, read/write 170/169 MB/s
INFO: status: 30% (64607092736/214748364800), sparse 2% (5151129600), duration 357, read/write 193/173 MB/s
INFO: status: 31% (66671083520/214748364800), sparse 2% (5215911936), duration 369, read/write 171/166 MB/s
INFO: status: 32% (68740186112/214748364800), sparse 2% (5365456896), duration 381, read/write 172/159 MB/s
INFO: status: 33% (70890291200/214748364800), sparse 2% (5465829376), duration 392, read/write 195/186 MB/s
INFO: status: 34% (73123037184/214748364800), sparse 2% (5691367424), duration 403, read/write 202/182 MB/s
INFO: status: 35% (75258527744/214748364800), sparse 2% (5729705984), duration 415, read/write 177/174 MB/s
INFO: status: 36% (77426130944/214748364800), sparse 2% (5730824192), duration 427, read/write 180/180 MB/s
INFO: status: 37% (79606185984/214748364800), sparse 2% (6110089216), duration 438, read/write 198/163 MB/s
INFO: status: 38% (81700913152/214748364800), sparse 2% (6119223296), duration 450, read/write 174/173 MB/s
Dec 8 09:19:26 server36 vzdump[15269]: <root@pam> starting task UPID:server36:00003BA7:0041A611:5DECC01E:vzdump:105:root@pam:
Dec 8 09:19:27 server36 qm[15274]: <root@pam> update VM 105: -lock backup
Dec 8 09:25:26 server36 pvedaemon[1683]: <root@pam> successful auth for user 'root@pam'
Dec 8 09:28:16 server36 kernel: [ 0.000000] microcode: microcode updated early to revision 0x21, date = 2019-02-13
Dec 8 09:28:16 server36 kernel: [ 0.000000] Linux version 4.15.18-19-pve (root@dev5) (gcc version 6.3.0 20170516 (Deb
I'm wondering if this is the same problem causing my VM snapshots. This is in fact a VM with bindmounts.
# qm guest cmd 105 fsfreeze-status
thawed
# qm guest cmd 105 fsfreeze-freeze
^C << hangs, having to break out of the command
# qm guest cmd 105 fsfreeze-status
QEMU guest agent is not running
# qm reset 105 --skiplock
# qm guest cmd 105 fsfreeze-status
thawed
...
2020-05-31 03:15:58 INFO: creating archive '/var/lib/vz/dump/vzdump-qemu-105-2020_05_31-03_15_57.vma.lzo'
2020-05-31 04:15:58 ERROR: VM 105 qmp command 'guest-fsfreeze-freeze' failed - got timeout
2020-05-31 04:16:08 ERROR: VM 105 qmp command 'guest-fsfreeze-thaw' failed - got timeout
2020-05-31 04:16:08 INFO: started backup task '973576ea-d503-4800-b826-874b7b4c874b'
...
Isn't there anything I can do to try to debug this?
Is there somewhere I can open a bug report on this? It is a little frustrating that after so many months of this being a confirmed bug, there still seem to be no action taken to fix it.
root@proxmox ~ > qm guest cmd 104 get-osinfo
{
"id" : "centos",
"kernel-release" : "3.10.0-1127.10.1.el7.x86_64",
"kernel-version" : "#1 SMP Wed Jun 3 14:28:03 UTC 2020",
"machine" : "x86_64",
"name" : "CentOS Linux",
"pretty-name" : "CentOS Linux 7 (Core)",
"version" : "7 (Core)",
"version-id" : "7"
}
root@proxmox ~ > qm guest cmd 104 fsfreeze-status
thawed
root@proxmox ~ > qm guest cmd 104 fsfreeze-freeze
2
root@proxmox ~ > qm guest cmd 104 fsfreeze-status
frozen
root@proxmox ~ > qm guest cmd 104 fsfreeze-thaw
2
root@proxmox ~ > qm guest cmd 104 fsfreeze-status
thawed
root@proxmox ~ > qm guest cmd 104 info | jq ".version"
"2.12.0"
It was my impression from your post on Nov 22 that you were seeing the same problem sometimes, obviously I misunderstood you.
The only "debugging" info I find is that now it shows in the log that the fsfreeze-freeze "fails". There are no further details. I knew that already, I need to find out WHY the freeze fails.