VM shutdown, KVM: entry failed, hardware error 0x80000021

Hi,

Experience with the package, which has spent about two weeks on testing and one on no-subscription, has been good so far, so from that point of view it would now be good to go for the enterprise repository.

However, upstream sent out a slightly revised fix that addresses some development feedback, and we'd like to check that out first. The closer we stay to the patch series that actually goes in upstream, the less potential for friction there is in the long run.

Testing that version with our reproducer will take a few hours. If that goes well, we'll move that package to the open repos a bit faster, and as the change is relatively small compared to the last kernel build, it should get into the enterprise repo no later than the start of next week, naturally only if nothing comes up. Depending on upstream feedback and further feedback from the community here, we may be able to shortcut that, but no promises; if we're not very sure about the potential impact, we'll lean towards the slower and safer option. In any case, we'll keep you posted once that new kernel is available in the public repos.

Thanks for your answers... and for the caution you take ;-)
 

t.lamprecht

Proxmox Staff Member
The updated version of the fix is now available on pvetest as package pve-kernel-5.15.39-4-pve in version 5.15.39-4. Our reproducer is still fixed with it, and in addition some slightly dubious kernel log messages, like:

Code:
QEMU[2775]: kvm: Could not update PFLASH: Stale file handle
kernel: kvm: vcpu 1: requested 191999 ns lapic timer period limited to 200000 ns

disappeared with the new version, so it is even a slight improvement.
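For anyone who wants to check whether such messages still show up after booting the new kernel, something along these lines might help. It is only a sketch: the patterns come from the two log lines quoted above plus the error in the thread title, and the function name `count_kvm_messages` and the idea of feeding it saved journal text are illustrative assumptions, not part of any Proxmox tooling.

```python
import re

# Patterns derived from messages quoted in this thread (assumed to be
# representative; adjust them to whatever your own logs contain).
PATTERNS = {
    "pflash": re.compile(r"kvm: Could not update PFLASH"),
    "lapic_timer": re.compile(
        r"kvm: vcpu \d+: requested \d+ ns lapic timer period limited to \d+ ns"
    ),
    "entry_failed": re.compile(r"KVM: entry failed, hardware error 0x80000021"),
}


def count_kvm_messages(log_text):
    """Count how many log lines match each of the patterns above."""
    counts = {name: 0 for name in PATTERNS}
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts
```

The log text could be, for example, the saved output of `journalctl -b` from the current boot; all counts staying at zero over a few days would match what is described above.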
 

samuele

Member
Looks OK at first glance?
you do need to reboot the PVE-node for the new parameter-setting to take effect.
Hi,
I installed the new kernel a week ago and everything works fine; I also re-enabled the tdp_mmu.
I run several VMs with Win2022 and Debian, and there aren't any errors about KVM.
At the moment my hosts have these pve-kernel versions installed:
pve-kernel-5.13.19-6-pve/stable,now 5.13.19-15 amd64 [installed]
pve-kernel-5.15.30-2-pve/stable,now 5.15.30-3 amd64 [installed]
pve-kernel-5.15.39-1-pve/stable,now 5.15.39-1 amd64 [installed,automatic]
pve-kernel-5.15.39-3-pve/stable,now 5.15.39-3 amd64 [installed,automatic]

Is it safe to remove versions 5.13.19-6 and 5.15.30-2?
 

Stoiko Ivanov

Proxmox Staff Member
I installed the new kernel a week ago and everything works fine; I also re-enabled the tdp_mmu.
I run several VMs with Win2022 and Debian, and there aren't any errors about KVM.
If you still have the time: @t.lamprecht uploaded a new kernel, 5.15.39-4, yesterday. It contains the same fix, but in a version that is more likely to get merged upstream, so testing would be much appreciated!

Is it safe to remove versions 5.13.19-6 and 5.15.30-2?
In general I'd say that as long as there is a kernel on the machine that boots and works, you can of course remove old ones (you can always reinstall them later if you really want to). Just make sure that you have a working kernel left :)
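As a rough illustration of that rule of thumb, the sketch below takes `apt list --installed`-style output like the listing posted above and returns the kernel packages that are neither the running kernel nor explicitly kept as a fallback. The helper names `parse_apt_list` and `removable_kernels` are made up for this example, and it assumes the running release string (as printed by `uname -r`) maps to a package named `pve-kernel-<release>`, which matches the package naming seen in this thread but should be verified on your own node.

```python
def parse_apt_list(text):
    """Extract package names from `apt list --installed`-style lines."""
    names = []
    for line in text.splitlines():
        line = line.strip()
        # Expected shape: name/suite,now version arch [installed,...]
        if "/" in line and "[installed" in line:
            names.append(line.split("/", 1)[0])
    return names


def removable_kernels(installed, running_release, keep=()):
    """Return kernel packages that are not running and not explicitly kept.

    Assumption: the running kernel's package is "pve-kernel-" + release.
    """
    running_pkg = "pve-kernel-" + running_release
    return [
        pkg
        for pkg in installed
        if pkg.startswith("pve-kernel-")
        and pkg != running_pkg
        and pkg not in keep
    ]
```

On a node, `running_release` could come from Python's `platform.release()`. Keeping one known-good older kernel in `keep` besides the running one leaves a fallback in case the newest kernel fails to boot.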
 

samuele

Member
Hi,
A few minutes ago I installed the new kernel version 5.15.39-4 and restarted the node.
I also installed the new version of pve-kernel-helper.
After one week I will reply to this post with feedback about it.

Thanks!
 

RedVortex

New Member
Upgraded to 5.15.39-4 on 3 servers that were having the issue, and the Windows VMs have been stable for the last 2 weeks. There does seem to be a slight performance impact with this new kernel, but at least it is stable.

Thanks for the fix!
 

piggie-mickie

New Member
Short update: I upgraded to 5.15.39-4 a week ago and also removed the setting kvm.tdp_mmu=N.
So far all my VMs are running stable and do not crash during backup. My Exchange server in particular had been an issue.

But now I have another issue after this change: almost every day at least one VM is not backed up by PBS, and I get an error similar to: backup write data failed: command error: protocol canceled.

I was then usually able to back up that machine individually in offline mode, but often the next day another VM had a similar issue.

Now I have restored the setting kvm.tdp_mmu=N but kept kernel version 5.15.39-4, and all backups have run normally so far. The PBS server is also a VM running on the same Proxmox host, and the backup target is a Synology NAS.
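Since it is easy to lose track of whether kvm.tdp_mmu=N is currently in effect, here is a small sketch for checking it. It assumes the setting was passed on the kernel command line in the `kvm.tdp_mmu=N` form used above and that the live value is exposed through the usual sysfs module-parameter layout (`/sys/module/kvm/parameters/tdp_mmu`); the helper names are invented for this example.

```python
from pathlib import Path


def cmdline_param(cmdline, name):
    """Return the value of name=value on a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if sep and key == name:
            value = val  # keep the last occurrence if it appears twice
    return value


def read_module_param(module, param, sysfs_root="/sys/module"):
    """Read a live module parameter from sysfs; None if it is not exposed."""
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except OSError:
        return None
```

For example, `cmdline_param(Path("/proc/cmdline").read_text(), "kvm.tdp_mmu")` shows what the node was booted with, while `read_module_param("kvm", "tdp_mmu")` shows what the loaded module reports. If the two disagree with what you configured, the node most likely still needs the reboot mentioned earlier in the thread.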
 

timproxmox

Member
Just wondering whether the fixes have been added upstream yet, or what the estimated time for the merge is?
 

timproxmox

Member
What do you mean by upstream? Our kernel has had the fix for a while, and it is rolled out on all supported repositories.
Apologies, I am not familiar with your processes. I thought when @Stoiko Ivanov mentioned "is more likely to get merged with upstream" that it was still in a test phase. We have applied the fix, and thank you to this community, which helped bring closure to this issue.
 

