VM freezes irregularly

I was pointed to this thread from a similar one I had created reporting the same issue. Lots of good info for me as a newb; much appreciated.

I was curious if anyone experiencing this problem has tried something as simple as scheduling a reboot of their VMs (or even the Proxmox host itself) daily in the middle of the night as a stop-gap measure? I'm sure that may not be feasible for some scenarios, but given my limited use-case in a basic home setup (OPNSense and a couple of Ubuntu server VMs), it seems to me that might be preferable to having the VMs hang randomly in the middle of the day after a few days of uptime (I've never seen mine crash after less than 24 hours of uptime, unlike some others here).
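For anyone who wants to try the nightly-reboot stop-gap, here is a minimal sketch, assuming it runs as root on the Proxmox host where the `qm` CLI is available. The function names are made up for illustration, and `qm reboot` needs the guest to respond to a shutdown request, so it probably won't rescue a VM that is already frozen.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of Proxmox.

# Print the VMIDs that `qm list` reports as running.
# The first line of `qm list` is a header; the third column is the status.
running_vmids() {
    qm list | awk 'NR > 1 && $3 == "running" { print $1 }'
}

# Ask every running VM to reboot (shutdown followed by start).
reboot_running_vms() {
    local vmid
    for vmid in $(running_vmids); do
        echo "Rebooting VM $vmid"
        qm reboot "$vmid"
    done
}
```

To use it, add a final line that calls `reboot_running_vms`, save the script somewhere such as /usr/local/bin/reboot-vms.sh (a hypothetical path), and schedule it from root's crontab, for example with `0 4 * * * /usr/local/bin/reboot-vms.sh` for 04:00 every night.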

The VMs can freeze after anywhere from hours to days of uptime. There's no rhyme or reason to when they freeze, so rebooting at night won't change how often it happens. After digging around, I'm pretty sure this issue is related to the kernel, and more specifically to KVM and/or QEMU. Until there's a kernel fix, this issue will continue.

So far, after moving my VMs to VMware ESXi they have been rock solid with no freezes whatsoever. I'll continue to report on whether this is the case. I'd prefer to run Proxmox but it's just not stable on this Intel N5105 CPU at this stage with the current kernels.
 
Yes, my IoT VM froze twice within 3 hours...
On the other hand, detecting a freeze should be fairly simple. I think ping would work, because that's how I first notice whether my VM has frozen.

We need a script that pings the VMs and resets them when they stop responding... But, let's face it, since I got this N5105, I've never had peace of mind. That sword is still hanging over our heads...
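A rough sketch of such a watchdog could look like the following. It assumes it runs as root on the Proxmox host, that each VM has a fixed IP address that answers ping, and that a hard `qm reset` is acceptable for a frozen guest. The function names are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not part of Proxmox.

# Return 0 if the address answers at least one of three pings, 1 otherwise.
vm_responds() {
    local ip="$1"
    ping -c 3 -W 2 "$ip" >/dev/null 2>&1
}

# Reset a VM that Proxmox reports as running but that no longer answers ping.
check_and_reset() {
    local vmid="$1" ip="$2"
    if vm_responds "$ip"; then
        echo "VM $vmid ($ip) is responding"
        return 0
    fi
    # `qm status` prints a line such as "status: running".
    if qm status "$vmid" | grep -q 'status: running'; then
        echo "VM $vmid ($ip) is not responding, resetting"
        qm reset "$vmid"
    else
        echo "VM $vmid is not running, leaving it alone"
    fi
}
```

Calling something like `check_and_reset 100 192.168.1.50` from cron every few minutes would cover one VM (the VMID and address here are placeholders). This only hides the symptom, and a guest that is rebooting or has ICMP blocked would also be treated as frozen.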

However, with the changed settings the VMs are holding up, but I have little hope because gyrex says they freeze after a few days anyway... Still, I went from a few hours to almost 2 days already, so I'm all excited :D


gyrex, keep us informed with ESXi, because if it's THE solution, I'll go there too.

I also prefer Proxmox, but if I have to choose, stability and peace of mind come before personal preference...
 
Well, after 2 days and 4 hours, one of my 3 VMs has frozen...

It's hopeless... @gyrex, are you getting good results under ESXi?
 

Zero issues so far. If you've got some time, could you try @fabian's suggested troubleshooting above? It's a bit hard for me since I've moved my VMs to ESXi.
 
since this seems to be hardware related and we don't have any affected hardware - could any of you try the steps from my last comment in the bug?

https://bugzilla.proxmox.com/show_bug.cgi?id=4188#c14

Do the kernels need to be customised and built from source for Proxmox or can we use standard kernel builds? It looks like @BarTouZ is willing to try some kernel variations - do you have some instructions for him please mate?
 
the main blocker for trying out those pre-built kernels would be usage of ZFS (which is not available in mainline kernels). obviously there are lots of other changes (and also bug fixes) that make up the difference between our kernel and the mainline one, but since we want to find out if any of those are at fault here we want to remove them when testing.

so, long story short - make sure you don't require ZFS, install the image (and, if you need some DKMS module, headers) packages for the version you want to test, reboot into it, verify with uname -a that the right version is booted and report back with results :)
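The two checks mentioned here (no ZFS in use, and the right kernel actually booted) can be sketched as small shell helpers. The function names are invented for illustration, and the version string passed in is whatever kernel is being tested.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical.

# Return 0 if the running kernel release starts with the expected version.
kernel_matches() {
    local expected="$1"
    case "$(uname -r)" in
        "$expected"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Return 0 if at least one ZFS pool is imported on this host.
# `zpool list -H -o name` prints one pool name per line and nothing if there are none.
zfs_in_use() {
    command -v zpool >/dev/null 2>&1 || return 1
    [ -n "$(zpool list -H -o name 2>/dev/null)" ]
}
```

For example, `zfs_in_use && echo "ZFS pools found, do not switch kernels"` before installing, and `kernel_matches 5.15.39 || echo "wrong kernel booted"` after the reboot.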
 
So if @BarTouZ isn't running zfs, he could perform the following?:

Code:
wget https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.39/amd64/linux-image-unsigned-5.15.39-051539-generic_5.15.39-051539.202205120747_amd64.deb && wget https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.39/amd64/linux-headers-5.15.39-051539_5.15.39-051539.202205120747_all.deb
sudo dpkg -i linux-image-unsigned-5.15.39-051539-generic_5.15.39-051539.202205120747_amd64.deb && sudo dpkg -i linux-headers-5.15.39-051539_5.15.39-051539.202205120747_all.deb
sudo reboot
# After reboot
uname -a

Then report back?
 
Hello fabian,
Would it be possible for you to take control of the pc and show me remotely how and what I have to do to get there?
Moreover, I speak only French, between translating and understanding what is said, it is not always easy...
Thanks
 
So if @BarTouZ isn't running zfs, he could perform the following?:

Code:
wget https://kernel.ubuntu.com/~kernel-ppa/mainline/v5.15.39/amd64/linux-headers-5.15.39-051539-generic_5.15.39-051539.202205120747_amd64.deb
dpkg -i linux-headers-5.15.39-051539-generic_5.15.39-051539.202205120747_amd64.deb
sudo reboot
# After reboot
uname -a

Then report back?
OK, I will try it

Sorry, but I'm a noob :D

It's pretty easy to change kernels, and if something doesn't go right, it's very easy to boot the old kernel from the GRUB menu when your PC reboots. I just want him to confirm the instructions above; if you follow those, you'll be running the mainline kernel instead of the Proxmox custom kernel.
 
@fabian We seem to be having some issues installing the mainline kernel:

Code:
dpkg-deb: error: archive 'linux-modules-5.15.39-051539-generic_5.15.39-051539.202205120747_amd64.deb' uses unknown compression for member 'control.tar.zst', giving up
dpkg: error processing archive linux-modules-5.15.39-051539-generic_5.15.39-051539.202205120747_amd64.deb (--install):
 dpkg-deb --control subprocess returned error exit status 2
Errors were encountered while processing:
 linux-modules-5.15.39-051539-generic_5.15.39-051539.202205120747_amd64.deb

I've tried installing the zstd package but dpkg still won't work. Any ideas mate? It appears the dpkg package on the Proxmox repositories doesn't support zst?

Edit: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=892664

What's the alternative?
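One possible workaround, offered as an untested sketch rather than a confirmed fix, is to repack the package so its members are xz-compressed, which the dpkg shipped with Debian-based systems does understand. It assumes the `ar`, `zstd` and `xz` tools are installed and that both `control.tar` and `data.tar` inside the package are zstd-compressed. The function name and output filename are made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch only: repack a .deb with zstd-compressed members into one using xz.
# Usage: repack_deb_as_xz input.deb output.deb
repack_deb_as_xz() {
    local src out workdir status
    src="$(readlink -f "$1")" || return 1
    out="$(readlink -f "$2")" || return 1
    workdir="$(mktemp -d)" || return 1
    rm -f "$out"
    (
        cd "$workdir" || exit 1
        # A .deb is an ar archive: debian-binary, control.tar.*, data.tar.*
        ar x "$src" || exit 1
        zstd -dc control.tar.zst | xz -c > control.tar.xz || exit 1
        zstd -dc data.tar.zst | xz -c > data.tar.xz || exit 1
        # debian-binary must be the first member of the new archive.
        ar rc "$out" debian-binary control.tar.xz data.tar.xz
    )
    status=$?
    rm -rf "$workdir"
    return "$status"
}
```

For the package that failed above, that would be something like `repack_deb_as_xz linux-modules-5.15.39-051539-generic_5.15.39-051539.202205120747_amd64.deb linux-modules-repacked.deb` followed by `dpkg -i linux-modules-repacked.deb`. The image and headers packages would presumably need the same treatment.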
 