ERROR: VM 110 qmp command 'guest-fsfreeze-freeze' failed - got timeout

Discussion in 'Proxmox VE: Installation and configuration' started by Lenu, Jan 14, 2019.

  1. Lenu

    Lenu New Member
    Proxmox Subscriber

    Joined:
    Jan 14, 2019
    Messages:
    2
    Likes Received:
    0
    Hi,

    We often get kernel hung-task panics in one of our VMs.
    Does anyone have an idea?

    Backup Log:
    Code:
    110: 2019-01-07 00:01:57 INFO: Starting Backup of VM 110 (qemu)
    110: 2019-01-07 00:01:57 INFO: status = running
    110: 2019-01-07 00:01:57 INFO: update VM 110: -lock backup
    110: 2019-01-07 00:01:57 INFO: VM Name: hostname
    110: 2019-01-07 00:01:57 INFO: include disk 'scsi0' 'nvme:110/vm-110-disk-0.qcow2' 60G
    110: 2019-01-07 00:01:57 INFO: backup mode: snapshot
    110: 2019-01-07 00:01:57 INFO: ionice priority: 7
    110: 2019-01-07 00:01:57 INFO: creating archive '/mnt/pve/Backup_Server/dump/vzdump-qemu-110-2019_01_07-00_01_57.vma.lzo'
    110: 2019-01-07 01:01:57 ERROR: VM 110 qmp command 'guest-fsfreeze-freeze' failed - got timeout
    110: 2019-01-07 01:01:58 INFO: started backup task '1bada98d-4884-494d-9a9b-6595ec87f908'
    110: 2019-01-07 01:02:01 INFO: status: 2% (1702887424/64424509440), sparse 1% (993382400), duration 3, read/write 567/236 MB/s
    110: 2019-01-07 01:02:04 INFO: status: 3% (2235039744/64424509440), sparse 1% (995135488), duration 6, read/write 177/176 MB/s
    110: 2019-01-07 01:02:09 INFO: status: 4% (2816606208/64424509440), sparse 1% (1003827200), duration 11, read/write 116/114 MB/s
    110: 2019-01-07 01:02:12 INFO: status: 5% (3637182464/64424509440), sparse 1% (1006587904), duration 14, read/write 273/272 MB/s
    110: 2019-01-07 01:02:15 INFO: status: 6% (4425973760/64424509440), sparse 1% (1009418240), duration 17, read/write 262/261 MB/s
    .....
    Email with subject "[abrt] : Kernel panic - not syncing: hung_task: blocked tasks":
    Code:
    reason:         Kernel panic - not syncing: hung_task: blocked tasks
    component:      kernel
    hostname:       xxx.hostname.com
    count:          1
    analyzer:       vmcore
    architecture:   x86_64
    event_log:   
    kernel:         3.10.0-962.3.2.lve1.5.24.7.el7.x86_64
    kernel_tainted_long: O - Out-of-tree module has been loaded.
    kernel_tainted_short: GO
    last_occurrence: 1546819513
    os_release:     CloudLinux release 7.6 (Vladimir Lyakhov)
    runlevel:       N 3
    time:           Mon 07 Jan 2019 01:05:13 CET
    type:           vmcore
    uid:            0
    username:       root
    uuid:           5b37bc38aeb82309605dff75d34f96fd7306f37e
    
    backtrace:
    :Kernel panic - not syncing: hung_task: blocked tasks
    :CPU: 1 PID: 21 Comm: khungtaskd ve: 0 Kdump: loaded Tainted: G           O   ------------   3.10.0-962.3.2.lve1.5.24.7.el7.x86_64 #1 61.16
    :Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.11.1-0-g0551a4be2c-prebuilt.qemu-project.org 04/01/2014
    :Call Trace:
    : [<ffffffff83f2872d>] dump_stack+0x19/0x1b
    : [<ffffffff83f22f6d>] panic+0xe8/0x21f
    : [<ffffffff8395399e>] watchdog+0x26e/0x2c0
    : [<ffffffff83953730>] ? reset_hung_task_detector+0x20/0x20
    : [<ffffffff838bf701>] kthread+0xd1/0xe0
    : [<ffffffff838bf630>] ? create_kthread+0x60/0x60
    : [<ffffffff83f3a677>] ret_from_fork_nospec_begin+0x21/0x21
    : [<ffffffff838bf630>] ? create_kthread+0x60/0x60
    machineid:
    :systemd=02956c2999504428ac7a94e90f0b6386
    :sosreport_uploader-dmidecode=3ce185c3ba2e23cc941959203608a05db9b77422cdaa5ee3ebc792281f466b78
    not-reportable:
    :A kernel problem occurred, but your kernel has been tainted (flags:GO). Explanation:
    :O - Out-of-tree module has been loaded.
    :Kernel maintainers are unable to diagnose tainted reports.
    This is a newly created VM on Proxmox, and we installed the QEMU guest agent without any modifications.

    CloudLinux support says that "the issue is that FS is being frozen by the qemu agent".
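To see how the backup log and the abrt report relate in time, the timestamps quoted above can be put on one timeline. This is only a sketch: it assumes the Proxmox host logs in the same timezone as the guest (CET, UTC+1), which the post does not state, and it only orders the events without proving what caused the panic.

```python
from datetime import datetime, timedelta, timezone

# Assumption: host and guest both report local time in CET (UTC+1).
CET = timezone(timedelta(hours=1), "CET")


def parse_log_time(stamp):
    """Parse a vzdump log timestamp such as '2019-01-07 00:01:57' as CET."""
    return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").replace(tzinfo=CET)


# Values copied from the backup log and the abrt email in the post.
backup_start = parse_log_time("2019-01-07 00:01:57")  # "creating archive" line
freeze_error = parse_log_time("2019-01-07 01:01:57")  # guest-fsfreeze-freeze timeout
panic_time = datetime.fromtimestamp(1546819513, tz=CET)  # last_occurrence (epoch)

freeze_wait = freeze_error - backup_start  # time spent waiting for the freeze
panic_delay = panic_time - freeze_error    # gap between timeout and panic report

print("waited for freeze:  ", freeze_wait)
print("panic recorded at:  ", panic_time.strftime("%Y-%m-%d %H:%M:%S %Z"))
print("panic after timeout:", panic_delay)
```

Under that assumption, the backup waited exactly one hour for the freeze to return, and the panic was recorded 3 minutes 16 seconds after the timeout was logged.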
     
  2. mailinglists

    mailinglists Active Member

    Joined:
    Mar 14, 2012
    Messages:
    346
    Likes Received:
    32
    My guess is that the host issues an FS freeze to the guest, and the guest carries it out but does not reply in time (110: 2019-01-07 01:01:57 ERROR: VM 110 qmp command 'guest-fsfreeze-freeze' failed - got timeout). The host then assumes the freeze was not successful and never unfreezes the filesystem, leaving the VM in a locked state. You then have to hard reboot the VM or unfreeze the guest filesystem manually.

    In my experience this usually happens when the host is starved of storage I/O.
    You can probably stop the freezes if you disable backup for this guest.
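Checking whether the guest filesystem is still frozen, and thawing it by hand, could look roughly like the Python sketch below. It assumes it runs as root on the Proxmox host, that the guest agent is reachable, and that the `qm guest cmd` subcommand is available (older releases expose the same call as `qm agent`). The helper names are made up for illustration, and the exact output format of the status call is not taken from this thread, so the script only does a substring check.

```python
import shutil
import subprocess


def guest_cmd(vmid, command):
    """Build the argv for a guest-agent call through Proxmox's qm tool."""
    return ["qm", "guest", "cmd", str(vmid), command]


def run_guest_cmd(vmid, command, timeout=30):
    """Run a guest-agent command and return its stripped stdout."""
    result = subprocess.run(
        guest_cmd(vmid, command),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise if qm reports an error
    )
    return result.stdout.strip()


def thaw_if_frozen(vmid):
    """Thaw the guest filesystem if the agent reports it as frozen.

    Only call this when no backup of the VM is currently running.
    """
    status = run_guest_cmd(vmid, "fsfreeze-status")
    if "frozen" in status:
        run_guest_cmd(vmid, "fsfreeze-thaw")
        return True
    return False


# Only talk to the agent when qm exists, i.e. on a Proxmox host.
if shutil.which("qm"):
    print("fsfreeze status of VM 110:", run_guest_cmd(110, "fsfreeze-status"))
```

The thaw is kept in a separate function that is not called automatically, because thawing while a snapshot backup is still starting would defeat the purpose of the freeze.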
     
  3. Lenu

    Thanks! This system is new, with only three VMs running on NVMe-attached storage. The other VMs (also with Plesk and CloudLinux) have no such problems. Only this VM, which is also newly created, has problems. We didn't make any changes, and nothing is different compared to the other VMs. It is just a normal CL + Plesk setup.

    What do you mean by "disabling backup for this guest"? Do you mean disabling the backup for this VM? I need the backup.
    Turning something off is not a solution, in my honest opinion.
     
  4. mailinglists

    To solve your problem, you must find the reason why the FS is not being unfrozen.
    I already told you what my guess is; now it is up to you.
     