EXT4-fs error

LazeX7

New Member
Jul 13, 2024
Hello,

Every few days or hours my Proxmox device crashes. Every time I reboot the system, it works for a while. I never had it connected to a monitor, so I could never see any errors, until today. Today it crashed again and I got the following errors:

(attached screenshot: 1723963791772.jpeg)

The OS is installed on a Lexar NM710 1TB M.2 SSD that I bought brand new a few months ago...
After this error I'm unable to input "reboot", only "exit", and it doesn't do anything after that!

I need to reboot it manually. After a reboot it works fine for a few days or hours again...
 
You won't be happy with cheap, crappy SSDs, and as you can see, your OS is already damaged inside the ext4 filesystem.
Like in Monopoly, go back to start: buy a good drive, reinstall Proxmox, and so on...
 
Proxmox was reinstalled less than three weeks ago because of this issue.
 
FYI, I went with an NVMe Lexar NM790 1TB as it had good reviews and a high TBW rating. It's been running 24/7 since February with 0% wearout and no issues.

Also make sure everything is running on UPS power.

I would RMA the drive and go with something better. Also turn off atime everywhere (including in guests), and turn off the cluster services in Proxmox if this is a single node. You can also go with log2ram and zram to reduce writes to the SSD.
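A minimal sketch of the write-reduction advice above, assuming a single-node Proxmox install with an ext4 root. The UUID is a placeholder, and `pve-ha-lrm`/`pve-ha-crm` are the standard Proxmox HA services; adjust everything to your own setup:

```shell
# Reduce SSD writes on a single-node Proxmox host (sketch; adapt to your system).

# 1) Mount with noatime so reads no longer trigger access-time writes.
#    Example /etc/fstab root line (UUID is a placeholder):
#    UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx / ext4 defaults,noatime,errors=remount-ro 0 1

# 2) On a single node with no cluster configured, the HA services
#    only generate periodic state/log writes; they can be disabled:
systemctl disable --now pve-ha-lrm pve-ha-crm
```

log2ram and zram are separate packages that additionally move `/var/log` and swap into RAM; they are not part of a stock Proxmox install.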
 
The problem has been solved.

By going into: nano /etc/default/grub
and adding the line: GRUB_CMDLINE_LINUX_DEFAULT="quiet nvme_core.default_ps_max_latency_us=0 pcie_aspm=off"
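For anyone following along: editing the file alone isn't enough, the boot config has to be regenerated afterwards. A sketch of the full sequence, assuming a standard Proxmox install that boots via GRUB:

```shell
# Apply the kernel parameters from the fix above (GRUB-based boot assumed;
# ZFS-root installs using systemd-boot edit /etc/kernel/cmdline instead
# and then run: proxmox-boot-tool refresh).

# 1) Set the line in /etc/default/grub:
#    GRUB_CMDLINE_LINUX_DEFAULT="quiet nvme_core.default_ps_max_latency_us=0 pcie_aspm=off"
nano /etc/default/grub

# 2) Regenerate the boot config so the change is actually used:
update-grub

# 3) After a reboot, verify the parameters are active:
cat /proc/cmdline
```

`nvme_core.default_ps_max_latency_us=0` disables NVMe autonomous power-state transitions, and `pcie_aspm=off` disables PCIe link power management; both are common workarounds for NVMe drives that drop off the bus.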
 
I am new to Proxmox and also ran into this EXT4-fs error after I installed it. The GRUB change resolved my issue as well: before the change the system was rebooting 1-2 times a day, and it has now gone two weeks without a problem. I am using a Samsung 990 Pro 4TB NVMe. Thanks!
 
Hello guys, this is my setup with the same problem:
The server crashes from time to time, and the only thing I did was patch Proxmox and all LXC containers and VMs.

(attached screenshot: 1740182768855.png)

root@sky:/# lspci -nnk | grep -A3 RAID
02:00.0 RAID bus controller [0104]: Hewlett-Packard Company Smart Array Gen8 Controllers [103c:323b] (rev 01)
DeviceName: Storage Controller
Subsystem: Hewlett-Packard Company P420i [103c:3354]
Kernel driver in use: hpsa

root@sky:~# modinfo -p hpsa
hpsa_simple_mode:Use 'simple mode' rather than 'performant mode' (int)

root@sky:/# cat /etc/default/grub | grep -i pcie
GRUB_CMDLINE_LINUX_DEFAULT="quiet pcie_aspm=off"


Now I see:

root@sky:/# dmesg | grep -i pcie
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-6.8.12-8-pve root=/dev/mapper/pve-root ro quiet pcie_aspm=off
[ 0.382884] Kernel command line: BOOT_IMAGE=/boot/vmlinuz-6.8.12-8-pve root=/dev/mapper/pve-root ro quiet pcie_aspm=off
[ 0.382992] PCIe ASPM is disabled
[ 1.099750] ACPI FADT declares the system doesn't support PCIe ASPM, so disable it


root@sky:/# lspci -vvv | grep "ASPM .*abled"
LnkCtl: ASPM Disabled; RCB 64 bytes, Disabled- CommClk-


I hope the problem will disappear.
I will be back in touch and let you know.

If you have any advice, please let me know!
 
You need a new PVE OS disk, or even two if it's a RAID; check your SMART values too. The kernel auto-remounted the filesystem read-only because writes weren't succeeding.
 
I cannot see the SMART values directly from the Proxmox GUI, but I ran the command

smartctl -a -d cciss,0 /dev/sda

and everything looks fine; nothing to be worried about.
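Note that `-d cciss,0` only queries the first physical drive behind the Smart Array controller; with a RAID volume, each member disk needs its own check. A sketch, assuming smartmontools is installed and the drive range matches your array:

```shell
# Check the health of each physical drive behind the hpsa/cciss controller.
# The range 0..3 is an assumption; adjust it to the number of drives in the array.
for n in 0 1 2 3; do
    echo "=== physical drive cciss,$n ==="
    smartctl -H -d cciss,$n /dev/sda
done
```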
 
"... dm-... : Remounting filesystem read-only" is worrying enough, as your PVE can no longer write to its disk.
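The read-only remount is easy to confirm from a shell: once ext4 hits an error on a filesystem mounted with `errors=remount-ro`, the mount options flip from rw to ro. A quick check:

```shell
# Show whether / is currently mounted read-write or read-only.
findmnt -no OPTIONS / | cut -d, -f1    # "rw" normally, "ro" after an ext4 error

# The kernel log records the event as well (may need root):
dmesg | grep -iE 'remounting filesystem read-only|EXT4-fs error'
```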
 
Just logged in to say thank you so much; the GRUB change solved my issue. I still see the error, but running pct list && qm list now shows my CTs and VMs running.
 