PVE cPanel VM freeze

bzb-rs

Member
Jun 8, 2022
Hello Gang

Our cluster is doing well thanks to all the support I received here previously. However, I am currently facing an issue that has been going on for about 3 days.

I have a cPanel server that occasionally throws an error and then freezes, with the following error:
Screenshot_72.png
The only way to bring the VM back is to reset it.
I reached out to cPanel and they said it is not their software. I used to run this cPanel instance on Ubuntu and switched it to AlmaLinux 8 today, but I am facing the same freezing issue. The same VM has been running on the same infrastructure for months without any issues, and no other VM shows similar problems.

I run PVE 7.2.5.

Thanks for sharing ideas.
 
This is most likely a storage issue. Based on the screenshot, both top and sshd were unable to write to some device for more than 2 minutes. In our experience, the most likely culprit is hung network-attached block storage.

There is most likely something in the kernel log of the hypervisor that corresponds to the VM event.
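
To make that check concrete, here is a minimal sketch that scans saved kernel log text for hung-task messages. It assumes the guest console showed the standard Linux "blocked for more than N seconds" message; the sample lines, PIDs, and timestamps below are made up for illustration and are not taken from the screenshot.

```python
import re

# Standard Linux hung-task message, e.g.
# "INFO: task sshd:1187 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<comm>.+?):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

def find_hung_tasks(log_text):
    """Return a (command, pid, seconds) tuple for each hung-task line."""
    hits = []
    for line in log_text.splitlines():
        m = HUNG_TASK.search(line)
        if m:
            hits.append((m.group("comm"), int(m.group("pid")), int(m.group("secs"))))
    return hits

# Hypothetical sample input; PIDs and timestamps are invented.
sample = """\
[ 1234.000001] INFO: task top:2201 blocked for more than 120 seconds.
[ 1234.000050] INFO: task sshd:1187 blocked for more than 120 seconds.
[ 1234.000100] some unrelated line
"""

for comm, pid, secs in find_hung_tasks(sample):
    print(f"{comm} (pid {pid}) blocked for more than {secs}s")
```

Running the same function over the guest log and the hypervisor log lets you line up which tasks stalled on each side around the time of the freeze.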


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
The strange thing is that my qemu-guest-agent stopped working, so I ended up removing it from the VM, and since then I have not seen the freezing issue again. However, I got the following error on the console, but it did not freeze the VM.

Screenshot_73.png
@bbgeek17 Neither the VM nor the host node has any network drive/share; all storage is local ZFS, and none of the other VMs have any issues. There is nothing in the host logs; everything is clean.
I also suspect, based on the error above, that the NUMA node may be causing this issue. I ended up disabling the NUMA option and downgrading to a single socket for this particular VM (we have EPYC Rome CPUs).

The VM is on the latest AlmaLinux 8.
I managed to locate the following on the host node:

[1223891.799598] NMI backtrace for cpu 43
[1223891.799599] CPU: 43 PID: 2621553 Comm: z_wr_int_0 Tainted: P O 5.15.35-3-pve #1
[1223891.799601] Hardware name: Supermicro Super Server/H12SSL-i, BIOS 2.0 02/22/2021
[1223891.799601] RIP: 0010:kmem_cache_free+0x7d/0x290
[1223891.799605] Code: e8 0c 48 c1 e0 06 48 01 c8 48 8b 50 08 48 8d 72 ff 83 e2 01 48 0f 45 c6 48 8b 70 08 48 8d 56 ff 83 e6 01 48 0f 44 d0 48 8b 12 <80> e6 02 0f 84 79 01 00 00 4c 8b 70 18 4d 85 f6 0f 84 e9 00 00 00
[1223891.799606] RSP: 0018:ffffb21aaea53c68 EFLAGS: 00000246
[1223891.799607] RAX: fffff2c083fa9100 RBX: ffff9749bea44960 RCX: fffff2c040000000
[1223891.799608] RDX: 0057ffffc0000200 RSI: 0000000000000000 RDI: ffff974898b49540
[1223891.799609] RBP: ffffb21aaea53cb8 R08: 0000000000000002 R09: 00000066a6e7c000
[1223891.799610] R10: 0000000100000000 R11: 000000000000000b R12: ffff97493ea44960
[1223891.799611] R13: ffff97493ea44960 R14: ffff9799dc34a108 R15: 0000000000000000
[1223891.799612] FS: 0000000000000000(0000) GS:ffff97684fdc0000(0000) knlGS:0000000000000000
[1223891.799613] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1223891.799613] CR2: 00007ff7e81d4a0c CR3: 0000004054618000 CR4: 0000000000350ee0
[1223891.799614] Call Trace:
[1223891.799615]
[1223891.799615] spl_kmem_cache_free+0x13d/0x1f0 [spl]
[1223891.799622] zio_remove_child+0x127/0x140 [zfs]
[1223891.799699] zio_done+0x4cb/0x1280 [zfs]
[1223891.799765] ? zio_wait_for_children+0xaf/0x140 [zfs]
[1223891.799829] zio_execute+0x95/0x160 [zfs]
[1223891.799890] taskq_thread+0x29b/0x4c0 [spl]
[1223891.799897] ? wake_up_q+0x90/0x90
[1223891.799899] ? zio_gang_tree_free+0x70/0x70 [zfs]
[1223891.799961] ? taskq_thread_spawn+0x60/0x60 [spl]
[1223891.799967] kthread+0x12a/0x150
[1223891.799969] ? set_kthread_struct+0x50/0x50
[1223891.799971] ret_from_fork+0x22/0x30
[1223891.799974]
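
For anyone reading a trace like this, a rough sketch of how the call-trace frames can be grouped by kernel module is below. It assumes the usual dmesg frame layout (timestamp in seconds since boot, optional "? " marker for frames the unwinder is unsure about, function+offset/size, optional [module]). The helper name parse_frames is made up for this example.

```python
import re
from collections import Counter
from datetime import timedelta

# One call-trace frame, e.g. "[1223891.799622] zio_remove_child+0x127/0x140 [zfs]"
FRAME = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<guess>\? )?(?P<func>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<mod>\w+)\])?\s*$"
)

def parse_frames(trace_text):
    """Parse call-trace lines into dicts; lines that are not frames are skipped."""
    frames = []
    for line in trace_text.splitlines():
        m = FRAME.match(line.strip())
        if m:
            frames.append({
                "uptime": timedelta(seconds=float(m.group("ts"))),
                "func": m.group("func"),
                "module": m.group("mod") or "kernel",  # no [module] tag: core kernel
                "reliable": m.group("guess") is None,  # "? " frames are guesses
            })
    return frames

# Frames copied from the trace above.
trace = """\
[1223891.799614] Call Trace:
[1223891.799615] spl_kmem_cache_free+0x13d/0x1f0 [spl]
[1223891.799622] zio_remove_child+0x127/0x140 [zfs]
[1223891.799699] zio_done+0x4cb/0x1280 [zfs]
[1223891.799765] ? zio_wait_for_children+0xaf/0x140 [zfs]
[1223891.799829] zio_execute+0x95/0x160 [zfs]
[1223891.799890] taskq_thread+0x29b/0x4c0 [spl]
[1223891.799897] ? wake_up_q+0x90/0x90
[1223891.799899] ? zio_gang_tree_free+0x70/0x70 [zfs]
[1223891.799961] ? taskq_thread_spawn+0x60/0x60 [spl]
[1223891.799967] kthread+0x12a/0x150
[1223891.799969] ? set_kthread_struct+0x50/0x50
[1223891.799971] ret_from_fork+0x22/0x30
"""

frames = parse_frames(trace)
by_module = Counter(f["module"] for f in frames if f["reliable"])
print("host uptime at event:", frames[0]["uptime"])
print("reliable frames per module:", dict(by_module))
```

Most of the reliable frames here sit in the zfs and spl modules, which matches the z_wr_int_0 thread name in the header. The uptime value can be subtracted from the current uptime to estimate when the event happened relative to the VM freeze.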
 