Alpine VM freezes with 100% CPU

Mrt12

Good day,

I have one VM that runs on Alpine

Code:
# cat /etc/alpine-release
3.20.3

and it has worked fine for several years. It has 2 CPU cores and 1 GB of RAM allocated. For the past couple of weeks, however, the VM has been freezing intermittently. When this happens, the CPU usage shown in the Proxmox GUI goes to 100%, and I can no longer log into the machine, either via the console or via SSH. The machine also does not react to STOP, only to RESET.
Because the only way to revive the machine is a RESET, the VM's dmesg is cleared by the reboot, and I don't see anything useful in its syslog. I cannot find anything related to this in the PVE syslog either.
What I do find in the VM's syslog is a flood of messages such as

Code:
 user.info : info: guest-ping called

which I think are related to the guest agent?
Disk usage is only 2%, so the lockup is not due to a full disk. RAM is not fully used either, so the VM is not swapping.
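
For anyone who wants to repeat these two checks from a shell, here is a minimal sketch. The function names `swap_used_kb` and `disk_used_pct` are made up for illustration, and the snippet assumes a Linux guest with `/proc/meminfo` plus the usual `df` and `awk`. Note that a value read at one moment does not prove that nothing was swapping at the time of a freeze.

```shell
# Print swap currently in use (kB) from a meminfo-style file.
# 0 means nothing is swapped out at this moment.
swap_used_kb() {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END { print total - free }' "${1:-/proc/meminfo}"
}

# Print the usage percentage of the filesystem holding a path (default: /).
disk_used_pct() {
    df -P "${1:-/}" | awk 'NR == 2 { print $5 }'
}
```

Calling `disk_used_pct /` and `swap_used_kb` without arguments reports the values for the root filesystem and the running system.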

About one or two months ago, I saw lots of messages in the syslog such as

Code:
Aug 8 07:21:37 kern warn kernel [50706.507735] eth1: bad gso: type: 1, size: 1452
Aug 8 07:21:38 kern warn kernel [50706.829626] eth1: bad gso: type: 4, size: 1440
Aug 8 07:21:38 kern warn kernel [50706.830058] eth1: bad gso: type: 4, size: 1440
Aug 8 07:21:38 kern warn kernel [50707.532034] eth1: bad gso: type: 4, size: 1440
Aug 8 07:21:39 kern warn kernel [50707.912949] eth1: bad gso: type: 1, size: 1452
Aug 8 07:21:43 kern warn kernel [50712.035334] net_ratelimit: 15 callbacks suppressed

and the Ethernet performance of this VM was incredibly bad. I found one suggested fix which said that there is a bug in the virtio Ethernet driver and that one should use Intel E1000E instead. I switched the VM's network adapter to the Intel E1000E model, and network performance is indeed good again, but I have the feeling that the random lockups have been occurring since then. So far, however, I have not been able to reproduce the lockup. I have had htop open for about 10 days now, hoping to see something when the VM locks up, but nothing. Today it locked up again, and I couldn't see anything in htop (which, of course, was frozen as well).
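
To get a rough idea of how often the "bad gso" warnings quoted above occur, the syslog can be summarised per GSO type. This is only a sketch: `count_bad_gso` is a hypothetical helper name, the log path is passed in by the caller, and the line format is assumed to match the excerpt above. Because the kernel rate-limits these messages (see the `net_ratelimit` line), the counts are a lower bound.

```shell
# Count "bad gso" kernel warnings per GSO type in a syslog file.
# Usage: count_bad_gso /path/to/logfile
count_bad_gso() {
    awk '/bad gso: type:/ {
        # Find the field after "type:" and strip its trailing comma.
        for (i = 1; i <= NF; i++) {
            if ($i == "type:") {
                t = $(i + 1)
                sub(/,$/, "", t)
                count[t]++
            }
        }
    }
    END { for (t in count) print "type " t ": " count[t] }' "$1" | sort
}
```

Run against the five warning lines quoted above, this prints `type 1: 2` and `type 4: 3`.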

I have already seen this thread

https://forum.proxmox.com/threads/vms-freeze-with-100-cpu.127459/page-12

but I have not found a solution there so far. How can I debug this sort of problem? I am sure I could see something in the VM's dmesg, but unfortunately it is cleared when the VM resets.
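
One low-effort thing to try after a reset is to look at whatever was written to the persisted syslog right before the most recent boot. The sketch below makes several assumptions: syslog is written to a file that survives the reset, each boot logs the kernel's "Linux version" banner into that file, and the log has not been rotated since. The function name `lines_before_last_boot` is hypothetical. If the guest locks up hard, nothing may have been flushed to disk, so an empty or unremarkable result does not rule out a kernel problem.

```shell
# Print the last N lines logged before the most recent kernel boot banner.
# Usage: lines_before_last_boot [logfile] [N]
lines_before_last_boot() {
    logfile="${1:-/var/log/messages}"
    n="${2:-20}"
    # Line number of the most recent boot banner; empty if none was found.
    boot_line="$(grep -n 'Linux version' "$logfile" | tail -n 1 | cut -d: -f1)"
    [ -n "$boot_line" ] || return 1
    # Everything before the banner is from earlier boots; keep the last N lines.
    head -n "$((boot_line - 1))" "$logfile" | tail -n "$n"
}
```

For example, `lines_before_last_boot /var/log/messages 50` shows the 50 lines that precede the latest boot banner.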
 
I have the same problem as you: the CPU suddenly goes to 100% at random.
I am using version 8.0.3. :eek::eek:
 

Attachments

  • AOS 100.jpg
Case solved after updating to Proxmox version 8.2.2.
Previously we kept having random problems where CPU usage suddenly went to 100%, causing hangs and stalls.

Thank you Proxmox
 
Out of interest: Do you know why this happened?