Segmentation fault on fairly new installation

ricky.i

Member
Oct 29, 2020
Hello everyone,

I'm building a small server using Proxmox 8. After the first install everything seemed fine (apart from issues with ACPI not being recognized and warnings about USBs with no ports), but from time to time a kernel panic was thrown that forced me to hard-reset the machine. After a few kernel updates the situation is now this:

- Kernel 6.2.16-18-pve --> the system no longer seems to throw kernel panics, but I see a lot of segmentation faults from lvm (which, however, do not seem to stop the VMs or the OS)
- Kernel 6.2.16-19-pve --> won't even boot due to a segmentation fault and drops into BusyBox, since it seems unable to access the root partition, so for now I stay on the older kernel (see the sketch below)
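A rough way to keep the older kernel as the boot default, assuming proxmox-boot-tool manages the boot entries and your version has the pin subcommand (adjust the version string as needed):

Code:
# list the kernels proxmox-boot-tool knows about
proxmox-boot-tool kernel list
# pin the known-good kernel so it stays the default at boot
proxmox-boot-tool kernel pin 6.2.16-18-pve
# remove the pin again once a newer kernel boots cleanly
proxmox-boot-tool kernel unpin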

The system is an Asus ExpertCenter PN53 (sold with Geekom brand) with the following specs:

- Ryzen 7 7735HS
- 32GB RAM DDR5
- 2x 2TB NVMe SSDs in ZFS RAID for VM storage
- 2TB SATA SSD for the operating system, ISOs, etc.
- 2x VMs, one with Ubuntu Server 22.04 LTS and one with Windows 10 Pro

Could you please help me understand what's happening?
I'm trying to do my best, but I'm no expert.

Thank you
Kind regards
 
Hello everyone,

I've isolated the issue to the ZFS RAID: moving both VMs to the other storage keeps the system flawlessly stable (it ran for a couple of months with no issue whatsoever). I've tried destroying the array and rebuilding it, but as soon as I perform some work on the ZFS volume the issue reappears. Could it be a faulty drive? Shouldn't something show up in the SMART data? ZFS should at least keep me working even with a faulty drive, and from the SMART data the disks seem fine.
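For reference, this is more or less what I looked at; the device names below are only examples and may differ on your system:

Code:
# SMART / health summary for each NVMe drive
smartctl -a /dev/nvme0n1
smartctl -a /dev/nvme1n1
# ZFS's own error counters for the pool
zpool status -v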

I've also checked the RAM with Memtest and no issues were reported.

Any suggestions?

Thank you,
Kind Regards
 
What are the exact error messages in the Proxmox logs (journalctl)? What are the make and model of the drives? What is the final result of a zpool scrub of the pool?
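For example, assuming a pool named tank (substitute your actual pool name), something like:

Code:
# start a scrub of the pool
zpool scrub tank
# check progress and the final result, including per-device error counters
zpool status -v tank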
Attached is the journalctl output from after I tried a backup of a VM running on the ZFS disks.

The drives are Lexar NM620s.

Attached is the status of the pool.
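For reference, a journal dump like the attached one can be produced with something along these lines:

Code:
# everything from the current boot
journalctl -b > journalctl.txt
# or only the last hour, around the time of the backup
journalctl --since "1 hour ago" > journalctl.txt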
 

Attachments

  • pool.png (29.7 KB)
  • journalctl.txt (59.1 KB)
I bet on cheap NVMe drives.
ZFS requires enterprise disks.
Maybe. I've been running another setup with similar drives with no issues for about 2 years, so maybe I'm just lucky with that one. But before replacing the drives I'd like to understand more.
 
I don't see drive errors, just reports from zed that probably indicate that operations take too long. The kernel crash indicates the problem happened in arc_evictable_space_decrement, which is in the ZFS software.
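You can look at those reports directly if you want, e.g.:

Code:
# ZFS's internal event log (delay / I/O error events show up here)
zpool events -v
# messages from the ZFS event daemon in the journal for the current boot
journalctl -b -u zfs-zed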

Kernel version 6.2 has not been updated (no security fixes either!) for some time. Please update your Proxmox to a supported kernel like 6.5 (and preferably PVE 8.1 with the latest ZFS); maybe then you won't have this issue anymore.
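A minimal upgrade sketch, assuming a PVE 8 repository (enterprise or no-subscription) is already configured:

Code:
# pull in the current PVE 8.1 packages and the 6.5 kernel
apt update
apt full-upgrade
# verify the versions afterwards
pveversion
uname -r
# reboot into the new kernel
reboot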
 
Sorry, I didn't update the topic. I'm now on 6.5.11-8 running Proxmox 8.1.4.
 
