VM freezes irregularly

With this combination, my Debian VM has been up for about 2 days now and is still fine.
  • host kernel: Edge 5.19.8-1
  • latest CPU microcode installed on the host OS
  • backports kernel 5.18 installed in the guest Debian VM
 
Thanks Fabian, much appreciated.

Quick question: what is the best practice when coming from the edge kernel? Remove it and then install the official 5.19?
Thanks
The package just adds an entry to the boot menu, so you can switch between the kernels, test the new one, then remove the old package once all is well.

I was more adventurous so I just went ahead to remove the pve-kernel-5.19-edge packages, installed the new one from the main repo and rebooted. It downgraded mine from 5.19-8 to 5.19-7 but it's running fine.
 
I have no experience whatsoever with the pve-edge-kernels (they are third party and neither developed nor supported by us), but if they are working like regular kernel packages on Debian based distros, they should peacefully co-exist with the PVE ones, and the one with the highest version will be booted by default, and the usual ways to override that default selection should apply.
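
To illustrate the "highest version is booted by default" rule, here is a minimal Python sketch. It is a simplification that only compares the numeric fields of each version string; it is not the bootloader's or Debian's actual version comparison algorithm. The helper names `version_key` and `default_kernel` and the installed-kernel list are made up for illustration.

```python
import re
from typing import List, Tuple


def version_key(name: str) -> Tuple[int, ...]:
    """Return the numeric fields of a kernel version string as a tuple of ints.

    Simplified on purpose: suffixes such as "-pve" or "-edge" are ignored.
    """
    return tuple(int(n) for n in re.findall(r"\d+", name))


def default_kernel(installed: List[str]) -> str:
    """Pick the entry with the highest version, mimicking the default choice."""
    return max(installed, key=version_key)


# Hypothetical set of installed kernels, loosely based on versions in this thread.
installed = ["5.15.39-4-pve", "5.19.7-1-pve", "5.19.8-edge"]
print(default_kernel(installed))

# After removing the edge kernel, the next highest version becomes the default.
installed.remove("5.19.8-edge")
print(default_kernel(installed))
```

Under these assumptions, removing the edge package simply makes the next highest installed kernel the default, which matches the 5.19-8 to 5.19-7 change reported above.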
 
This time it froze within 24 hours. When the guest froze, traffic was about 900 KB/s in and out, and CPU usage was below 10%.

Migrating the Docker workloads back to an LXC container again.
 
What kernel are you running on the proxmox host?
 
A brief report of my NUC11ATKC4 running:

[Attachment 41030]

It has now been running the 7 VMs for 3 days under a mild stress-ng load. There was one issue with an Ubuntu VM, which eventually filled up its memory and crashed (a memory leak in the stress test? I am not sure), but the symptoms were totally different from the freezes I had before installing the edge kernel.

Now I am testing one VM at a time, starting with Fedora, under heavier load (for my little NUC).

stress-ng --cpu 0 -l 40 --io 2 --vm 2 --vm-bytes 40% -t 0 --metrics-brief

I will report back in a few days.

Brief update on my NUC11ATKC4: 3 days running the edge kernel with no issues. I have now switched to pve-kernel-5.19.

[Attachment: Capture.PNG]

I am testing Fedora and Ubuntu VMs under mild load (it's only a NUC).

stress-ng --cpu 0 -l 50 --io 2 --vm 2 --vm-bytes 20% -t 0
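
For anyone wanting to reproduce the load, here is a small Python sketch that assembles the two stress-ng command lines used in these posts and notes what each option does. The helper name `stress_ng_command` is hypothetical, and the sketch only builds the command string; it does not run stress-ng.

```python
import shlex


def stress_ng_command(cpu_load: int, vm_bytes: str, metrics_brief: bool = False) -> str:
    """Build a stress-ng command line like the ones posted in this thread."""
    args = [
        "stress-ng",
        "--cpu", "0",            # 0 = one CPU worker per online CPU
        "-l", str(cpu_load),     # target CPU load in percent for the CPU workers
        "--io", "2",             # two I/O workers
        "--vm", "2",             # two virtual memory workers
        "--vm-bytes", vm_bytes,  # memory for the vm workers, e.g. "20%" of available
        "-t", "0",               # timeout 0 = run until stopped
    ]
    if metrics_brief:
        args.append("--metrics-brief")  # print a short metrics summary at the end
    return shlex.join(args)


# Heavier single-VM test and the later mild-load test.
print(stress_ng_command(40, "40%", metrics_brief=True))
print(stress_ng_command(50, "20%"))
```

Since `-t 0` means no timeout, the test keeps running until it is stopped by hand, which is what you want when waiting for a freeze that takes days to show up.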

Update in a few days.
 
Upgraded to 5.19-edge and updated the microcode. The VM worked for a couple of days, but then froze as before. It doesn't seem like the issue has been fixed in 5.19. The CPU is an N5095.
 
Yep, mine froze after a day or so as well, but I reverted to the official 5.19-7. Might try the edge 5.19 again.

Still just the Ubuntu box though. The FreeNAS box is running fine.
 
How much traffic do you have going thru the Ubuntu VM vs the FreeNAS VM?

CPU burn-in didn't faze it. Memtest didn't faze it.

I wonder if we have a NIC driver problem.
 
Not that much, but it should be roughly equal, at least from a bridge perspective. The Ubuntu box has Deluge running on it, but data from it goes to NFS shares on the NAS. The Ubuntu box definitely has more traffic going through the NIC.

The only reason I think it could still be CPU related is the symptom: the CPU is stuck at a flat 50% and not moving.
 
I run two VMs, and the one that produces greater network traffic (5 Mbit/s weekly avg peaking at 200 Mbit/s) tends to freeze more often. However, this VM also creates greater CPU load (~30-40%), so I can't tell for sure if networking is the contributing factor.
 
And what about memory ballooning? Is it advisable to keep it enabled in combination with pci passthrough?
 
I tried the official PVE 5.19 kernel.

Proxmox froze only a few seconds after it finished booting.

Back to the official 5.15. With this one only the VM freezes, not the entire server.
 
Update on this one... I decided to stay on 5.19-7 a bit longer. Last time when it froze, the CPU didn't spike to 50% flat and after the last reset it has been running for 3 days straight with traffic and no freeze. So I'll observe it a bit more before moving to Edge.
 
Been lurking on this thread for some time, running into the same issues as other folks on both FreeBSD and Linux guests. I'm on an N5105 with 4 x 2.5GbE I225-V from HUNSN. Thanks to all who preceded me to get us this far!

I upgraded to:
Linux xxxx 5.19.8-edge #1 SMP PREEMPT_DYNAMIC PVE Edge 5.19.8-1 (2022-09-08) x86_64 GNU/Linux

And it was stable for about 3 days (a record!) running a FreeBSD VM (pfSense), then got the dreaded FreeBSD kernel panic:
Sep 15 14:21:40 kernel Fatal trap 12: page fault while in kernel mode
Sep 15 14:21:40 kernel cpuid = 0; apic id = 00
Sep 15 14:21:40 kernel fault virtual address = 0x20
Sep 15 14:21:40 kernel fault code = supervisor write data, page not present
Sep 15 14:21:40 kernel instruction pointer = 0x20:0xffffffff80bac0e3

But I was not running the latest microcode at the time. I just upgraded to:
[ 0.000000] microcode: microcode updated early to revision 0x24000023, date = 2022-02-19
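
If you want to compare microcode revisions across hosts, a small Python sketch like the following can pull the revision and date out of that kernel log line. It assumes the line has exactly the format shown above; the function name `parse_microcode` is made up for illustration, and other kernels or CPUs may log the update differently.

```python
import re
from typing import Optional, Tuple

# Matches the "updated early" message as it appears in the log line above.
MICROCODE_RE = re.compile(
    r"microcode updated early to revision (0x[0-9a-fA-F]+), date = (\d{4}-\d{2}-\d{2})"
)


def parse_microcode(line: str) -> Optional[Tuple[int, str]]:
    """Return (revision, date) from a kernel log line, or None if it does not match."""
    match = MICROCODE_RE.search(line)
    if match is None:
        return None
    return int(match.group(1), 16), match.group(2)


line = "[ 0.000000] microcode: microcode updated early to revision 0x24000023, date = 2022-02-19"
result = parse_microcode(line)
if result is not None:
    revision, date = result
    print(hex(revision), date)
```

Feeding it the output of `dmesg` line by line would let you check quickly whether a host has actually loaded the newer microcode before concluding anything about stability.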

So for anybody joining this thread late, the edge kernel alone doesn't appear to fix it.

I'll see if the microcode + edge kernel gets me the stability everybody else has been experiencing. It has been 12 hours so far.

FWIW, I had been trying to track this issue down before I found this thread. The box seemed quite stable before I joined it to a cluster: it made it a week just running pfSense in a VM, though perhaps that was unrelated. Since then, I could rarely make it more than 24 hours without a crash. After finding this thread, I'm going to stop chasing that red herring...
 
