guest hangs due to qemu-guest-agent

menelaostrik

Member
Sep 7, 2020
Hi,
I'm facing a very weird problem, and although I found a few threads describing similar problems, I wasn't able to diagnose or fix it.
It stops replication from finishing and also makes the guest unusable.

The guest system hangs almost immediately after issuing fsfreeze-freeze through the guest agent:
Code:
qm guest cmd 500 fsfreeze-freeze
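
For reference, `qm guest cmd` also accepts `fsfreeze-status` and `fsfreeze-thaw`, which can be used from the host to check the freeze state and to try to release it before resorting to a reset. A minimal sketch, assuming the agent still answers; the function names `freeze_status` and `thaw_guest` are made up for illustration, and 500 is the VMID from this thread:

```shell
#!/bin/sh
# Hypothetical helper names; only the `qm guest cmd` calls are real.

# Query the freeze state of a guest through the agent.
# $1 = VMID (500 in this thread).
freeze_status() {
    qm guest cmd "$1" fsfreeze-status
}

# Ask the agent to thaw all frozen filesystems.
# $1 = VMID.
thaw_guest() {
    qm guest cmd "$1" fsfreeze-thaw
}
```

Usage would be `freeze_status 500` followed by `thaw_guest 500` on the Proxmox host. Whether the thaw gets through once the guest is already hanging is not something I can confirm.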

The load average on the guest jumps. Here is an excerpt of top output taken while the guest was still responding:

Code:
top - 15:29:43 up 11 min,  1 user,  load average: 16.95, 8.45, 3.48
Tasks: 220 total,   1 running, 219 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni, 90.0 id, 10.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem : 10.3/9827.3   [|||||||||||                                                                                         ]
MiB Swap:  0.0/20480.0  [                                                                                                    ]

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                                                                            
   3008 root      20   0   52596   4660   3640 R   0.3   0.0   0:01.59 top                                                                                                                
      1 root      20   0   95768  11596   8316 S   0.0   0.1   0:02.97 /usr/lib/systemd/systemd --switched-root --system --deserialize 17                                                 
      2 root      20   0       0      0      0 S   0.0   0.0   0:00.00 [kthreadd]                                                                                                         
      3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 [rcu_gp]                                                                                                           
      4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 [rcu_par_gp]                                                                                                       
      6 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 [kworker/0:0H-kblockd]

You can see that 10% of CPU time is spent in I/O wait (wa), but iotop shows no activity.
After that, the only thing I can do is reset the guest.
If I disable qemu-guest-agent in the VM options the issue disappears, but qemu-guest-agent is necessary for our setup.
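
Since the Linux load average counts tasks in uninterruptible sleep (state D) as well as runnable ones, a high load with an idle CPU and nothing in iotop may point to tasks blocked on the frozen filesystem. A rough sketch for listing them inside the guest while it still responds; the function names are hypothetical, and it assumes the procps `ps` shipped with EL8:

```shell
#!/bin/sh
# Hypothetical helper names, for illustration only.

# Keep the header row plus every row whose STAT column starts with D
# (uninterruptible sleep). Expects "PID STAT WCHAN COMMAND" rows on stdin.
filter_blocked() {
    awk 'NR == 1 || $2 ~ /^D/'
}

# List blocked tasks on the running system, with the kernel function
# each one is waiting in (WCHAN).
list_blocked() {
    ps -eo pid,stat,wchan:32,comm | filter_blocked
}
```

Running `list_blocked` right after the freeze would show which processes are stuck and where in the kernel they are waiting, which could help narrow down the cause.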

The guest is running CloudLinux with kernel 4.18.0-147.8.1.el8.lve.1.x86_64.

Any clues on how I can diagnose the issue?
 