VMs shutting down unexpectedly

Lamia

Hello,

I have VMs shutting down unexpectedly. How can I find out why this is happening?

Thanks
 
Check your logs (journalctl -e, dmesg), and please include your pveversion -v output as well as configs from your VMs (qm config <vmid>). This is very little information to go on...
 
Here are attached documents with the output of these commands.

Thanks a lot
 

Attachments

  • journalctl -e.txt
  • dmesg.txt
  • pveversion -v.txt
  • qm config of both vms.txt
Obviously I meant the logs from a time when such a crash occurred. Also check /var/log/syslog and its rotated copies. Filter them for the timestamp at which your issue occurs; please don't just post the whole log.

In your posted output only a single VM stops, at Sep 15 12:21:04. This appears to be a guest-triggered shutdown, so your VM decided to shut itself off, no PVE intervention.
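A minimal Python sketch of the "filter for a timestamp" advice, assuming the classic syslog prefix seen in this thread (e.g. "Sep 15 12:21:04"). The function name `lines_near` is hypothetical, and because that log format does not record the year, the year has to be supplied by the caller:

```python
from datetime import datetime, timedelta

def lines_near(lines, when, minutes=5):
    """Yield log lines whose syslog timestamp lies within `minutes` of `when`.

    Assumes the classic "Mon DD HH:MM:SS" prefix; the year is taken from
    `when` because this log format does not include it.
    """
    window = timedelta(minutes=minutes)
    for line in lines:
        try:
            stamp = datetime.strptime(f"{when.year} {line[:15]}",
                                      "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # continuation lines or a different timestamp format
        if abs(stamp - when) <= window:
            yield line
```

It could be used along the lines of `lines_near(open("/var/log/syslog"), datetime(2020, 9, 15, 12, 21, 4))`, where the year 2020 is only an assumption for illustration. Rotated or compressed copies of the log would need to be opened separately.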
 
Indeed, the shutdown of both VMs is not caused by PVE but is guest-triggered: their licenses have expired.

Thanks a lot,

Lamia
 
Hi Lamia,

I have the same problem with my server. Everything was OK until a few weeks ago. My Windows is activated and I have only 2 VMs: a firewall and a Windows Server 2022.

Proxmox is up to date and I'm using 7.2-4.

I just turned the VM on and it is working now, but after 1-2 days it shuts down automatically.

I would be happy if anyone can help.

Regards.
KBL
 
Seeing the same. I have hundreds of VMs across 6 clusters (78 nodes) and every day we have probably 15 or so VMs turn off with no indication as to why. Event viewer shows an unexpected shutdown, syslogs show no activity either.

Everything is on 7.x-x. I've tried updating a few of them to the latest version. I've also tried switching the machine type from q35 to i440fx, and it doesn't seem to make a difference.
 
Hi Greg

I was on vacation, so here I am again; one of the VMs is off. :(

I checked the VM settings and the Windows settings too and found no problem. This is the only Proxmox host that has the problem; I have several Proxmox hosts which work without any problem.

I hope the Proxmox team or anyone else can find a solution for this problem.
 
If the VM crashes (bug, faulty memory, ...) or is killed by the host kernel (e.g., because the host is running out of memory), that should be visible in the host logs. Please take a look at the timespan surrounding the unexpected shutdown. If the VM is shut down from inside the guest, it is expected that the VM is stopped, and the cause needs to be searched for inside the guest.
 
Hi Fabian,

I just looked at journalctl -e and saw the following error, which is shown in red.

May 30 14:27:46 proxmox QEMU[1049503]: KVM: entry failed, hardware error 0x80000021
May 30 14:27:46 proxmox QEMU[1049503]: If you're running a guest on an Intel machine without unrestricted mode
May 30 14:27:46 proxmox QEMU[1049503]: support, the failure can be most likely due to the guest entering an invalid
May 30 14:27:46 proxmox QEMU[1049503]: state for Intel VT. For example, the guest maybe running in big real mode
May 30 14:27:46 proxmox QEMU[1049503]: which is not supported on less recent Intel processors.
May 30 14:27:46 proxmox kernel: set kvm_intel.dump_invalid_vmcs=1 to dump internal KVM state.
May 30 14:27:46 proxmox QEMU[1049503]: EAX=002d0b18 EBX=d139f180 ECX=00000001 EDX=00000000
May 30 14:27:46 proxmox QEMU[1049503]: ESI=8d49c040 EDI=8f885080 EBP=fff17a40 ESP=fff17840
May 30 14:27:46 proxmox QEMU[1049503]: EIP=00008000 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=1 HLT=0
May 30 14:27:46 proxmox QEMU[1049503]: ES =0000 00000000 ffffffff 00809300
May 30 14:27:46 proxmox QEMU[1049503]: CS =ba00 7ffba000 ffffffff 00809300
May 30 14:27:46 proxmox QEMU[1049503]: SS =0000 00000000 ffffffff 00809300
May 30 14:27:46 proxmox QEMU[1049503]: DS =0000 00000000 ffffffff 00809300
May 30 14:27:46 proxmox QEMU[1049503]: FS =0000 00000000 ffffffff 00809300
May 30 14:27:46 proxmox QEMU[1049503]: GS =0000 00000000 ffffffff 00809300
May 30 14:27:46 proxmox QEMU[1049503]: LDT=0000 00000000 000fffff 00000000
May 30 14:27:46 proxmox QEMU[1049503]: TR =0040 d13ae000 00000067 00008b00
May 30 14:27:46 proxmox QEMU[1049503]: GDT= d13affb0 00000057
May 30 14:27:46 proxmox QEMU[1049503]: IDT= 00000000 00000000
May 30 14:27:46 proxmox QEMU[1049503]: CR0=00050032 CR2=00000030 CR3=7dd21000 CR4=00000000
May 30 14:27:46 proxmox QEMU[1049503]: DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
May 30 14:27:46 proxmox QEMU[1049503]: DR6=00000000ffff0ff0 DR7=0000000000000400
May 30 14:27:46 proxmox QEMU[1049503]: EFER=0000000000000000
May 30 14:27:46 proxmox QEMU[1049503]: Code=kvm: ../hw/core/cpu-sysemu.c:77: cpu_asidx_from_attrs: Assertion `ret < cpu->num_ases && ret>

Can you maybe tell me what the problem could be?

Regards
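For anyone with many hosts to check, the error above can be picked out of saved journal output with a short Python sketch. The regular expression is modelled only on the single line format shown in this post, and the function name `kvm_entry_failures` is hypothetical:

```python
import re

# Pattern modelled on the journal line quoted in this thread; other
# QEMU/kernel versions may format the message differently.
KVM_FAIL = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ [\d:]{8}) (?P<host>\S+) QEMU\[(?P<pid>\d+)\]: "
    r"KVM: entry failed, hardware error (?P<code>0x[0-9a-fA-F]+)"
)

def kvm_entry_failures(lines):
    """Return (timestamp, host, qemu_pid, error_code) per KVM entry failure."""
    hits = []
    for line in lines:
        m = KVM_FAIL.match(line)
        if m:
            hits.append((m["stamp"], m["host"], int(m["pid"]), m["code"]))
    return hits

line = ("May 30 14:27:46 proxmox QEMU[1049503]: "
        "KVM: entry failed, hardware error 0x80000021")
print(kvm_entry_failures([line]))
# [('May 30 14:27:46', 'proxmox', 1049503, '0x80000021')]
```

The timestamp of each hit gives the moment the VM stopped, which is the timespan to look at in the rest of the host log.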
 
I'm sending you an attachment with the latest output from journalctl -e.

The VM is shut down right now.
 

Attachments

  • error.txt
See https://forum.proxmox.com/threads/vm-shutdown-kvm-entry-failed-hardware-error-0x80000021.109410/ - this seems to be an issue affecting some CPU/guest combinations with the most recent kernel updates.
Hi Fabian,

Thank you for your reply.

I had a Windows Server 2022 VM with 10 cores installed. I had to change the VM's hardware settings (Processors: I had 2 sockets with 5 cores each and changed it to 1 socket with 10 cores). It is strange because it was working fine and stopped working right after an update.

Since I changed the settings, the VM is on and working fine; I have to wait until tomorrow to be sure.

Regards
Knight
 
Just a quick update: the VM stayed on for only 24 hours, then the same problem occurred. It seems to be a kernel problem, so I have to do a downgrade :/
 
Hello,

Did you downgrade?
To which version did you downgrade?
I have the same problem: VMs shut down unexpectedly.
 
Did you check the syslog for OOM messages? grep -i oom /var/log/syslog
Most people complaining about stopping VMs overcommit their RAM until the OOM killer needs to kill the KVM process to free up some RAM so the host won't crash.
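The same check can be sketched in Python. The match is case-insensitive because the kernel typically writes these messages in lower or mixed case ("invoked oom-killer", "Out of memory: Killed process ..."), so a search for upper-case "OOM" alone may find nothing. The function name `find_oom_lines` is hypothetical:

```python
import re

# Matches the usual kernel wording for OOM-killer activity, ignoring case.
OOM_PATTERN = re.compile(r"oom-killer|out of memory", re.IGNORECASE)

def find_oom_lines(lines):
    """Return every log line that mentions OOM-killer activity."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]
```

A possible use is `find_oom_lines(open("/var/log/syslog"))`. An empty result for the time of the shutdown suggests the VM was not killed for lack of memory, assuming the log covers that period.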
 
What I observed is that this problem happens on the host where KSM sharing has some value (xx GB); it does not happen on hosts where KSM sharing is 0. So in this kernel there seems to be some issue with KSM sharing on ZFS storage. I hope the Proxmox team will look at this problem and solve it. It is irritating to have to power on the VMs in the early morning.
 
But with default configs that also means the one node with active KSM is the only node that is low on RAM, as KSM will only be enabled when RAM utilization exceeds 80%. So it wouldn't be that surprising if only the node that is lowest on RAM shuts down VMs.
 
Hi Dunuin, I find this very strange with the latest update. I have a three-node cluster; all three are identical servers (Dell R630, 128 GB RAM, each with two 3 TB SSDs configured as a ZFS mirror). The affected node has only 4 Windows Server 2022 VMs (12+12+12+8 = 44 GB RAM). This cluster has been running since 2020. I upgraded to Server 2022 last March 22; before that, Server 2019 was running with an identical configuration and I never faced any problem. The problem started in the past 30 days, after I upgraded Proxmox to the latest version. With 44 GB consumed out of 128 GB, I cannot figure out how a RAM shortfall can happen. KSM sharing is about 15 GB. The other two nodes have only Linux VMs and have not faced any problem. Can you give me some idea how to solve this? Every day in the early morning I have to wake up to start these VMs. Not all 4 VMs shut down; randomly 2-3 of them do. Please help.
 
ZFS by default will use up to 50% of your RAM as ARC for caching. So ZFS will use up to 64 GB of RAM, and that cache is not part of the normal Linux page cache and can't be freed as fast. So with 2 GB for PVE + 64 GB for ZFS + 44 GB for VMs (+X GB virtualization overhead), your RAM usage can go up to 110+X GB. If you use stuff like writeback for your VMs, it can be even higher. You can check your ARC by running arc_summary. If your nodes run out of RAM and the ARC can't be freed fast enough, so that the OOM killer triggers and kills VMs, you could manually limit your ARC size as described here: https://pve.proxmox.com/wiki/ZFS_on_Linux#sysadmin_zfs_limit_memory_usage
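The arithmetic can be checked with a small Python sketch. Adding the figures from the post (2 GB for PVE, 64 GB of ARC, 44 GB for VMs) gives 110 GB before virtualization overhead, which is above the 80% utilization mark on a 128 GB node. The function names and the 8 GiB limit are hypothetical, chosen only for illustration:

```python
GIB = 1024 ** 3

def worst_case_ram_gb(total_gb, vm_gb, host_gb=2, arc_fraction=0.5):
    """Host + maximum ARC + VM allocations, before virtualization overhead."""
    return host_gb + total_gb * arc_fraction + sum(vm_gb)

def arc_limit_option(limit_gib):
    """Build a zfs_arc_max module option line for a limit given in GiB."""
    return f"options zfs zfs_arc_max={limit_gib * GIB}"

budget = worst_case_ram_gb(128, [12, 12, 12, 8])
print(budget)               # 110.0
print(budget / 128 > 0.80)  # True
print(arc_limit_option(8))  # options zfs zfs_arc_max=8589934592
```

The last line has the form that the linked wiki section puts into /etc/modprobe.d/zfs.conf; which limit is sensible depends on the workload, so treat the 8 GiB value as a placeholder.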
 
