[SOLVED] Kernel panic zfs / glusterfs

joujou333

Member
Jan 16, 2022
Hi Proxmox community!

I have a problem with my setup, which I know is not ideal.

PVE0: SSD zfs pool with glusterfs
PVE1: SSD zfs pool with glusterfs, HDD zfs pool shared via NFS
PVE2: Arbiter glusterfs + "compute node"
PVE3: "compute node"

Since I upgraded every node from 6 to 7, I get random kernel panics on PVE1.
When that happens, glusterfs is no longer connected, but from the console I can still read and write on the local zfs mounts (SSD and HDD pool).
Since then I have tried to pinpoint what is going on, but I can't find the exact cause.

On PVE1 I run a VM with Proxmox Backup Server, with its disk on the NFS mount (from PVE1), which always worked. Since the upgrade I can't run any backups.
Also, when PVE1 syncs with the other glusterfs node, it eventually panics and crashes the whole cluster. It also seems to affect pvestatd: after restarting that process, the VMs run fine again. I tested each zfs pool separately to rule out the HBA for the HDDs and the SSD pool, but I got the panics on every pool.

I don't know exactly which kernel it was, but the latest free kernel (not 5.15) also gave me some NFS kernel panics. After upgrading to the 5.15 kernel, that particular panic went away. I also tried reinstalling the PVE1 node, but that didn't solve the problem.

I don't know what other information you need, but I can provide any details.

Attached is as much of the kernel panic as I could capture.

Sorry for my explanation.
 

Attachments

  • Panic.txt
    26.8 KB · Views: 2
  • pveversion.txt
    1.3 KB · Views: 1
PVE1 is not in use at the moment. I ran the backup VM on that node and started a VM backup, and the node kernel panicked immediately. The VM turned itself off, and now the load is stable at 1.00. That is weird, because the node is idle and the load should be around 0.04, as it was on 6.4.
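
A load that sits at about 1.00 on an idle Linux node often points to a single task stuck in uninterruptible sleep (state `D`), because Linux counts those tasks in the load average. The following is only a rough sketch for spotting them, assuming a standard `/proc` layout. The function names `parse_state` and `blocked_tasks` are made up for illustration, and the scan covers main threads only, not every thread under `/proc/<pid>/task`.

```python
import os

def parse_state(stat_line):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so cut at the first '(' and the last ')' instead of splitting on spaces.
    open_paren = stat_line.index("(")
    close_paren = stat_line.rindex(")")
    pid = int(stat_line[:open_paren])
    comm = stat_line[open_paren + 1:close_paren]
    state = stat_line[close_paren + 1:].split()[0]
    return pid, comm, state

def blocked_tasks(proc_root="/proc"):
    """List (pid, comm) for processes currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as 'meminfo'
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_state(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or entry unreadable during the scan
        if state == "D":
            found.append((pid, comm))
    return found
```

Calling `blocked_tasks()` a few times, a few seconds apart, and comparing the result with `os.getloadavg()` shows whether the same task stays blocked. A task that never leaves state `D` would be a hint, not proof, of where the kernel is hanging.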

After around 15 minutes the node became unresponsive and gave me the following kernel panic.
 

Attachments

  • panic2.png
    214.5 KB · Views: 12
Update: I replaced the set of RAM sticks with new ones, and the node has been running fine for 2 days now. I can even make backups. So it seems the memory was the issue.
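
For anyone landing here with similar random panics, one cheap first check is to look through the kernel log for hardware-fault reports before swapping parts. Below is a minimal sketch, assuming `dmesg` is available and readable (usually as root). The names `HW_ERROR`, `hardware_error_lines` and `scan_dmesg` are hypothetical. The patterns are the tags the Linux kernel uses for machine-check (`mce`), EDAC and `[Hardware Error]` messages.

```python
import re
import subprocess

# Tags used by the Linux kernel when it reports hardware faults.
HW_ERROR = re.compile(r"\b(mce|edac)\b|hardware error", re.IGNORECASE)

def hardware_error_lines(lines):
    """Return only the log lines that look like hardware-fault reports."""
    return [line for line in lines if HW_ERROR.search(line)]

def scan_dmesg():
    """Run dmesg and return the lines that match HW_ERROR."""
    out = subprocess.run(["dmesg"], capture_output=True, text=True, check=True)
    return hardware_error_lines(out.stdout.splitlines())
```

An empty result does not clear the memory: faulty non-ECC RAM often produces no such messages at all, so a bootable memory tester such as memtest86+ or, as in this thread, swapping the sticks remains the more reliable test.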
 
