Linux VMs with old kernel freezing randomly

Sep 22, 2021
Hi everyone,

I have an issue with VMs randomly freezing. It happens multiple times a day at seemingly random times. Not all of my VMs with an old Linux kernel freeze at once, but all of them are affected at one time or another. I have had no issues with more recent Linux distributions.

I think that this problem occurred without any changes on the VM and Proxmox side.

At first it started on an old Proxmox 6.4,
but it continued after migrating the VMs to a brand new Proxmox 7.3.

The frozen VMs show a screen like the ones attached and usually have a CPU utilization above 100% (though not always).

Because of the old Linux kernel the VMs have this hardware configuration:
Machine - i440fx
SCSI controller - LSI 53C895A
Hard disk - scsi0
Network device - e1000
Processors - originally kvm64; also tried host + NUMA with the same results

The only thing the "dying" VMs have in common is that they are running:
Debian 4 (Linux 2.6)
Debian 5 (Linux 2.6)
Debian 6 (Linux 2.6)
Debian 7 (Linux 3.2)
Debian 8 (Linux 3.16) - I think that some Debian 8 based VMs did not have a problem yet - I have to check the kernel versions and installed package versions (installed programs and libraries)
Ubuntu 12.04 (Linux 3.2)

Newer Linux distributions are not affected.
Is this problem related to a security leak?


I went through several similar posts on this forum, but I didn't find a thread where the issues are related to Linux distribution versions.
Does anybody have a similar experience?
Thank you in advance for any recommendations.


PS: Maybe I will have to migrate all the dozens of VMs to a new Debian/Ubuntu
(Mission Impossible :cool: due to the program and library versions required by the running apps).
 

Attachments

  • Screenshot 2023-01-02 12.33.45.png
  • Screenshot 2023-01-02 12.58.09.png
I think that this problem occurred without any changes on the VM and Proxmox side.
That would be very unlikely. Something always changes.
SCSI controller - LSI 53C895A
The first screenshot strongly implies a problem with the SCSI subsystem. I would try changing the SCSI controller. Make a clone of an existing VM and experiment with it.
Is this problem related to a security leak?
I guess that depends on how one defines what a "security leak" is. IMHO it's highly unlikely under most definitions...
PS: Maybe i will have to migrate all the dozens of VMs to a new Debian/Ubuntu
I'd start by tracking down when it first happened and, if you still have the logs, examining whether any packages got auto-updated, both in the VMs and on the hypervisor. The fact that the issue carried over from PVE6 to PVE7 indicates to me that there was a package/firmware change. You can try to find the PVE6 ISO that was installed originally and either migrate the VMs to that vanilla installation or compare its package versions to your current install. Make sure to disable auto-update.
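
For the package comparison, here is a minimal sketch. It assumes you have saved the output of `dpkg-query -W` (one tab-separated package name and version per line) from both installs. The helper names `parse_dpkg_list` and `diff_packages` are made up for illustration, and the sample input is not real package data.

```python
def parse_dpkg_list(text):
    """Parse 'name<TAB>version' lines into a {name: version} dict."""
    packages = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        name, _, version = line.partition("\t")
        packages[name.strip()] = version.strip()
    return packages


def diff_packages(old_text, new_text):
    """Return {name: (old_version, new_version)} for every package that differs.

    A version of None means the package is missing on that side.
    """
    old = parse_dpkg_list(old_text)
    new = parse_dpkg_list(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        before = old.get(name)
        after = new.get(name)
        if before != after:
            changes[name] = (before, after)
    return changes


if __name__ == "__main__":
    # Placeholder data, not real package names or versions.
    vanilla = "pkg-a\t1.0\npkg-b\t2.0\n"
    current = "pkg-a\t1.0\npkg-b\t2.1\npkg-c\t0.5\n"
    for name, (before, after) in diff_packages(vanilla, current).items():
        print(f"{name}: {before} -> {after}")
```

Anything that shows up in the diff around the date of the first freeze would be a candidate for a closer look.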


 
The QEMU machine version (virtual hardware and virtual PCI(e) layout) does change sometimes. For Windows VMs, the machine version is saved (in recent Proxmox versions) when creating the VM because Windows handles such changes poorly. Try setting the Machine Version (with Advanced enabled) to 6.2 (or earlier) instead of Latest, which is the default for Linux VMs.
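
To see which VMs still follow Latest, something like the rough sketch below could be run over the text printed by `qm config <vmid>`. It assumes the usual `key: value` lines and that a pinned machine type carries a version suffix such as `pc-i440fx-6.2`. The function names and the sample config are illustrative only.

```python
import re


def read_machine(config_text):
    """Return the value of the 'machine' key, or None if the key is absent."""
    for line in config_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "machine":
            return value.strip()
    return None


def is_pinned(machine):
    """True if the machine type names an explicit version (e.g. pc-i440fx-6.2)."""
    if machine is None:
        return False  # no machine line: the VM follows the default
    return re.search(r"-\d+\.\d+", machine) is not None


if __name__ == "__main__":
    # Illustrative config text, not taken from a real VM.
    sample = "cores: 2\nmachine: pc-i440fx-6.2\nscsihw: lsi\n"
    machine = read_machine(sample)
    print(machine, "pinned" if is_pinned(machine) else "follows latest")
```

A VM with no `machine` line, or with just `pc` or `q35`, would be reported as following the latest version.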
 
Thank you,
the most likely change is that we started backing up the VMs using the Proxmox Backup Server. This could put some stress on the IO.
And it is probably the SCSI subsystem... I found an old kernel bug, "CPU freezes in KVM guests during high IO load on host", and this could be something similar.

I will change the hardware definition to VirtIO SCSI single and the cache to writeback (or try other values).
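
For anyone following along, here is a minimal sketch of those two changes as `qm set` calls, built as argument lists so they can be reviewed before anything runs. The VM ID 9100 and the volume name `local-lvm:vm-9100-disk-0` are placeholders for a test clone, and the helper names are made up. As far as I know, re-specifying a disk this way replaces its option string, so any other drive options already set would have to be repeated.

```python
import shlex


def scsihw_command(vmid, controller="virtio-scsi-single"):
    """Build the argument list that switches a VM's SCSI controller type."""
    return ["qm", "set", str(vmid), "--scsihw", controller]


def disk_cache_command(vmid, disk, volume, cache="writeback"):
    """Build the argument list that re-attaches a disk with a given cache mode."""
    return ["qm", "set", str(vmid), f"--{disk}", f"{volume},cache={cache}"]


if __name__ == "__main__":
    # Placeholder VM ID and volume; only prints the commands, runs nothing.
    commands = (
        scsihw_command(9100),
        disk_cache_command(9100, "scsi0", "local-lvm:vm-9100-disk-0"),
    )
    for args in commands:
        print(shlex.join(args))
```

Printing first and running by hand (or via `subprocess.run(args, check=True)` on the Proxmox host) keeps the experiment limited to the clone.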

Thank you once again.
 
This problem is reproducible
(tried on 2-socket Xeon E5645, 2-socket E5-2637 v2, 2-socket E5-2620 v3, 2-socket E5-2650 v4, and 2-socket Xeon Silver 4216):

install Proxmox (tried 6.4, 7.2, and 7.3)
create a VM with Debian 8 or Ubuntu 12.04 or older on Ceph or ZFS
(because VirtIO SCSI is not supported by these guests, the only option is LSI 53C895A)
set up a backup with Proxmox Backup Server
(everything with default settings)

-> sooner or later you get the same issues described here

mitigation: NVMe, cache: writeback, etc. (not a 100% fix; the freezes only become less frequent)
the only real option: upgrade to Debian 9 or Ubuntu 14.04 or higher
(and use VirtIO SCSI and a VirtIO network device)
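
To sort a VM inventory by that rule, here is a small sketch. The thresholds (Debian 9, Ubuntu 14.04) come from this thread only, and the function names and the sample inventory are made up.

```python
# Oldest releases reported as unaffected in this thread.
MINIMUM_OK = {"debian": (9,), "ubuntu": (14, 4)}


def version_tuple(version):
    """Turn '12.04' into (12, 4) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def needs_upgrade(distro, version):
    """True if the guest is older than the oldest unaffected release."""
    return version_tuple(version) < MINIMUM_OK[distro.lower()]


if __name__ == "__main__":
    # Hypothetical inventory: (VM name, distro, release).
    inventory = [
        ("web-old", "Debian", "7"),
        ("db-old", "Ubuntu", "12.04"),
        ("app-new", "Debian", "9"),
    ]
    for name, distro, release in inventory:
        status = "upgrade" if needs_upgrade(distro, release) else "ok"
        print(f"{name}: {distro} {release} -> {status}")
```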
 