Spurious Kernel Hangup of PVE on Nested KVM Host

proctrap

Oct 11, 2020
Hello,

I've got a problem with spurious full PVE freezes where I can't even tell what is happening.

Setup: AMD EPYC 7702 KVM Host, in which my PVE is running (so it's PVE nested inside a KVM host with VMX flag).

After some time (18 days, 30 days...) the whole system becomes unresponsive: one core sits at 100% usage, the rest are completely idle, and there is no network response.

I attached some screenshots of the host's management console where you can see the CPU usage and 0 on everything else. One is from when it happened; the others are from right now, so you can compare normal usage with how it looked while PVE was hanging.

I've got no logs or anything when this happens. When I reset the system everything goes back to normal.

I don't even have a clue where to start debugging this. I've got one live OS snapshot, so maybe I can debug this or provide additional details if you have any ideas how.
 

Attachments

  • Screenshot_20201011_120348.png (31.4 KB)
  • Screenshot_20201011_120404.png (42.9 KB)
  • Screenshot_20201011_120419.png (32 KB)
  • Screenshot_20201011_120434.png (36.8 KB)
  • image.png (42.7 KB)
Even if you're using nested virt, a lot of stuff still needs to happen in the host kernel, so I'd start looking there for a fault.

What system are you running as the bare-metal hypervisor? Which kernel (version)? How is the L1 guest started? Which kernel in there?

Also, maybe the physical console displays some error messages? At least some "CPU stuck" info is usually given under such circumstances.
 
First of all, thanks for the reply.

After reading other threads here, I've now got a kernel warning live monitor running on the TTY of my Proxmox host, so I can hopefully see something if the next crash happens and the TTY freezes on its last state. I also shut down 3 migration-related VMs in PVE that came from a 2015-ish vSphere export, which I hope is the cause of all this (I can't say for sure based on the history of events; this is all a pretty new setup). The bare-metal hypervisor is a KVM system; if you need more info I'll have to ask the hosting company.
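In case it helps anyone: the "warning ticker" is nothing fancy, just a minimal sketch along these lines (exact flags may vary):

Code:
# keep kernel warnings scrolling on a spare TTY so they stay visible even if SSH/GUI die
dmesg --follow --level=emerg,alert,crit,err,warn
# alternatively, raise the console log level so the kernel prints warnings straight to the TTY
dmesg -n 5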

This is nested on purpose in a "root VM" so that the people running this long-term won't have to care about the hardware itself (managed RAID, hardware, etc.). Thus I can't access the physical machine, only the web-accessible management console and the TTY of that host VM.

It's a root VM from netcup.de with a non-official VMX option to allow nested virtualization; they allow this experimentally, and it would fit our association ("Verein") here perfectly, if I can get this to run stably...
 
Ok, I'm trying to get some details about the host now. Regarding "CPU stuck": if you mean output on the PVE TTY: no, it's frozen on the default login screen. Maybe this is different now that I'm running the kernel warning ticker.

My PVE System:
CPU(s) 6 x AMD EPYC 7702P 64-Core Processor (1 Socket)
Kernel Version: Linux 5.4.65-1-pve #1 SMP PVE 5.4.65-1 (Mon, 21 Sep 2020 15:40:22 +0200)
PVE Manager Version: pve-manager/6.2-12/b287dd27
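(In case someone needs the same details from their own node, this is basically just the output of the standard tools; a minimal sketch:)

Code:
pveversion        # PVE manager version and running kernel
uname -a          # full kernel version string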
 
Ok, I've caught SSH output about CPU soft lockups while still being able to access the machine:

Code:
Message from syslogd@machine at Oct 13 01:46:26 ...
 kernel:[215357.251009] watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [pveproxy worker:14988]
And so on..

I could theoretically still continue to control the system.
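While the node is still reachable, the complete trace can be pulled from the journal; roughly like this (a sketch, the output file name is just an example):

Code:
# kernel messages of the current boot, keeping some context around the lockup
journalctl -k -b | grep -B 2 -A 40 "soft lockup" > soft_lockups.txt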


 
Kernel logs of the failure (and the first appearance of it in the attached log file):
Code:
Oct 13 01:55:26 host kernel: watchdog: BUG: soft lockup - CPU#2 stuck for 22s! [pveproxy worker:14988
Oct 13 01:55:26 host kernel: Modules linked in: veth ebtable_filter ebtables ip_set ip6table_filter i
Oct 13 01:55:26 host kernel:  i2c_piix4 uhci_hcd ehci_hcd pata_acpi floppy
Oct 13 01:55:26 host kernel: CPU: 2 PID: 14988 Comm: pveproxy worker Tainted: P           O L    5.4.
Oct 13 01:55:26 host kernel: Hardware name: netcup KVM Server, BIOS RS 4000 G9 08/20/2020
Oct 13 01:55:26 host kernel: RIP: 0010:__nf_conntrack_find_get+0x100/0x2b0 [nf_conntrack]
Oct 13 01:55:26 host kernel: Code: 48 8b 1b 48 8b 45 c8 48 8b 75 c0 48 89 da f6 c3 01 74 ae 48 d1 ea
Oct 13 01:55:26 host kernel: RSP: 0018:ffffa251005f3a30 EFLAGS: 00000287 ORIG_RAX: ffffffffffffff13
Oct 13 01:55:26 host kernel: RAX: 000030f3e0223c68 RBX: 0000000000015029 RCX: ffff915d0f680000
Oct 13 01:55:26 host kernel: RDX: 000000000000a814 RSI: ffffffffaff5f380 RDI: ffffffffb05ddfc0
Oct 13 01:55:26 host kernel: RBP: ffffa251005f3a70 R08: 00000000f196291c R09: ffffffffb05ddfc0
Oct 13 01:55:26 host kernel: R10: ffff915d15ac0d48 R11: ffffa251005f3cf7 R12: ffffa251005f3a98
Oct 13 01:55:26 host kernel: R13: ffffffffc0820144 R14: fffffffffffffff0 R15: ffffffffb05ddfc0
Oct 13 01:55:26 host kernel: FS:  00007fb03283b1c0(0000) GS:ffff915d1fa80000(0000) knlGS:000000000000
Oct 13 01:55:26 host kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 13 01:55:26 host kernel: CR2: 0000555c786bce88 CR3: 0000000816822000 CR4: 0000000000340ee0
Oct 13 01:55:26 host kernel: Call Trace:
Oct 13 01:55:26 host kernel:  ? ttwu_do_wakeup+0x1e/0x150
Oct 13 01:55:26 host kernel:  nf_conntrack_in+0x13d/0x5f0 [nf_conntrack]
Oct 13 01:55:26 host kernel:  ipv4_conntrack_local+0x48/0x70 [nf_conntrack]
Oct 13 01:55:26 host kernel:  nf_hook_slow+0x49/0xd0
Oct 13 01:55:26 host kernel:  __ip_local_out+0xd4/0x140
Oct 13 01:55:26 host kernel:  ? ip_forward_options.cold.8+0x1d/0x1d
Oct 13 01:55:26 host kernel:  ip_local_out+0x1c/0x50
Oct 13 01:55:26 host kernel:  __ip_queue_xmit+0x170/0x430
Oct 13 01:55:26 host kernel:  ip_queue_xmit+0x10/0x20
Oct 13 01:55:26 host kernel:  __tcp_transmit_skb+0x54f/0xb10
Oct 13 01:55:26 host kernel:  tcp_connect+0xb03/0xdc0
Oct 13 01:55:26 host kernel:  ? kvm_clock_get_cycles+0x11/0x20
Oct 13 01:55:26 host kernel:  ? ktime_get_with_offset+0x4c/0xc0
Oct 13 01:55:26 host kernel:  tcp_v4_connect+0x465/0x4e0
Oct 13 01:55:26 host kernel:  __inet_stream_connect+0xd6/0x3a0
Oct 13 01:55:26 host kernel:  ? _cond_resched+0x19/0x30
Oct 13 01:55:26 host kernel:  ? aa_sk_perm+0x43/0x180
Oct 13 01:55:26 host kernel:  inet_stream_connect+0x3b/0x60
Oct 13 01:55:26 host kernel:  __sys_connect+0xed/0x120
Oct 13 01:55:26 host kernel:  ? do_fcntl+0x337/0x560
Oct 13 01:55:26 host kernel:  __x64_sys_connect+0x1a/0x20
Oct 13 01:55:26 host kernel:  do_syscall_64+0x57/0x190
Oct 13 01:55:26 host kernel:  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 

Attachments

  • cpu_hangup_start.txt (4.5 KB)
Yeah, the logs and symptoms pretty much indicate that the host has stopped running certain vCPUs of your L1 guest. Nested virt for AMD Zen chips was only recently brought to a somewhat stable state in the 5.8.x/5.9 kernels (host kernel, L0; what's running in the guests shouldn't matter - famous last words, of course). There are quite a few kernel bug tracker issues about that, so I'd assume your hoster is using an outdated kernel or there's some other misconfiguration. (The bare-metal machine is also where any useful logs/warnings would then appear.)

Oh and VMX is Intel, on AMD it's called SVM. Just FYI :)
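If you can get the hoster to check, a quick sanity check on both sides would be something along these lines (a sketch, assuming the standard kvm_amd module on the bare-metal box):

Code:
# on the bare-metal (L0) host - only the hoster can run this:
uname -r
cat /sys/module/kvm_amd/parameters/nested   # 1 (or Y) means nested SVM is enabled
# inside the PVE guest (L1) - shows which virtualization flag was actually passed through:
grep -m1 -wo 'svm\|vmx' /proc/cpuinfo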
 
Hm ok, then I'll have to try getting a newer kernel version (which obviously isn't that easy to request for one node...). Thank you for taking the time to guide me to the real culprit; now I've got some ideas how/when this could be resolved :)
 

Hello, any news on this? Any info highly appreciated.

FYI: I have Proxmox rebooting after about 8-14 days. VMX enabled on netcup. My CPU(s): 4 x Intel(R) Xeon(R) Gold 6230.
 
This thread was about specific issues with AMD chips, and you mention you have an Intel... please open a new thread if you experience issues on a current version. As a shot in the dark, we released an opt-in 5.11 kernel recently (apt install pve-kernel-5.11), maybe that'll help...
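For completeness, on a PVE node the opt-in kernel is just a package; a minimal sketch:

Code:
apt update
apt install pve-kernel-5.11
# reboot into the new kernel, then verify:
uname -r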
 
Hi Stefan,
thanks for your reply.

I was asking here because I have issues with EPYC as well. I only mentioned Intel to note that it happened on Intel with nested virt too.
I would still love feedback from @proctrap!

@Stefan_R Can you please tell me more about kernel 5.11: why could this help, and where? AMD or Intel? Kernel 5.11 on the main host or inside my nested Proxmox host?

Again: thanks a lot and congrats on 1000 messages :)
 
Again, then please open a new thread. If your issue is not specific to AMD, it probably doesn't fit here. Just because you have hangs in nested VMs doesn't mean it's the same problem. Also, without any detailed information about what does or doesn't work (i.e. logs, error messages, system details, etc.) it is impossible to say what is wrong.

Kernel 5.11 would go on the host; it's always the host's kernel doing hardware-assisted virtualization, even for nested setups. I mention it only because reading through this thread led me to believe that we diagnosed the issue as kernel-related back then, so upgrading that might make sense...
 
I was having the same issue with my AMD EPYC and a VMware ESXi nested install in Proxmox: host CPU, SVM flag, and mods to the VM config per the Proxmox KB instructions. ESXi would boot, and while the nested VMs would boot, they would randomly lock up/reset on me, with soft CPU lockups constantly hitting the nested VMs.

Unfortunately, I had to give up on it... :/ it was just too unstable. Running the latest kernels/updates from Proxmox.
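For anyone reading along, the config mods mentioned above are roughly the following, as I understand the Proxmox nested-virtualization docs (the VM ID placeholder is just an example):

Code:
# on the PVE host: enable nested SVM for the kvm_amd module (reload the module or reboot afterwards)
echo "options kvm-amd nested=1" > /etc/modprobe.d/kvm-amd.conf
# in the ESXi VM's config (/etc/pve/qemu-server/<vmid>.conf), pass the host CPU through:
#   cpu: host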
 
