Problem with NVMe disks: EXT4-fs errors

nmartins

New Member
Apr 10, 2025
First, let me say that I've searched the forum for this error, but I didn't see a solution that applies to my case.
Second, this is consumer hardware, because I just want to try Proxmox and use it in my homelab.

Going to the issue...

I bought some new hardware and installed the latest Proxmox (8.4-1). About 20 minutes after the system starts running, I get:
"EXT4-fs error (device dm-1): ext4_journal_check_start:84:"
"EXT4-fs (dm-1): Remounting filesystems read-only"

The disk in question is a brand new WD Black SN770 NVMe 1TB.

I did the following tests to see if I could get around the error:
  1. Tried the disk in the other M.2 slot - same problem;
  2. Tried adding the following to /etc/default/grub (see the snippet after this list): GRUB_CMDLINE_LINUX_DEFAULT="quiet nvme_core.default_ps_max_latency_us=0 pcie_aspm=off" - same problem;
  3. Tried a used Samsung SSD 840 Pro - with this one the system ran all night without problems, even running a VM;
  4. Tried a used WD Black SN750 NVMe 500GB - same error as the other NVMe;
  5. Went through my BIOS (Gigabyte B550 Aorus Elite V2) and everything looks OK - but to be honest I don't know what to look for in particular.
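
For reference, this is roughly how I applied those kernel parameters - a sketch of the standard Debian/Proxmox GRUB workflow (if your install boots via systemd-boot instead of GRUB, the procedure differs):

    # /etc/default/grub - disable NVMe power saving and PCIe ASPM
    GRUB_CMDLINE_LINUX_DEFAULT="quiet nvme_core.default_ps_max_latency_us=0 pcie_aspm=off"

    # regenerate the GRUB config and reboot
    update-grub
    reboot
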
The exact error varies a bit from what I posted above, but I think it is always related. Two other sequences I saw were (the snippet after this list shows how I pull them from the kernel log):
    1. "EXT4-fs error (device dm-1) in ext4_reserve_inode_write:5792: IO failure"
    2. "EXT4-fs error (device dm-1) in ext4_reserve_inode_write:5792: Journal has aborted"
    3. "EXT4-fs error (device dm-1) ext4_do_writepages: jbd2_start: 10181 pages, ino 6032890; err -5"
    4. "EXT4-fs (dm-1): Remounting filesystems read-only"
    5. "EXT4-fs (dm-1): Remounting filesystems read-only"
    1. "EXT4-fs error (device dm-1) in ext4_reserve_inode_write:5792: Journal has aborted"
    2. "EXT4-fs error (device dm-1): __ext4_find_entry:1683: inode #2500000: comm sh: reading directory Iblock 0"
    3. "EXT4-fs (dm-1): Remounting filesystems read-only"
    4. "EXT4-fs (dm-1): Remounting filesystems read-only"

I saw in other posts here that the general opinion is that "the disk is dying" or that "you should use enterprise hardware".
It's hard to believe that both NVMe disks are about to die... that seems like too much of a coincidence to me. As for enterprise hardware... I get that it must be more stable and all that. But I'm talking about a system that crashes while just idling, without any VMs or containers... the only thing I did was upload an ISO image...

Anyway... do you see anything else I could try, or any indication of what the problem is? In my ignorance, all I can think is that either Proxmox has some issue with NVMe disks or there is some setting on my motherboard for NVMe disks that I don't understand.
 
Hi @nmartins, welcome to the forum.

It is possible there is some sort of power issue and/or PCIe issue on the motherboard. There is also a chance that the disk firmware and motherboard BIOS are not compatible.

You could try running stock Debian or stock Ubuntu for a week to see if there is any improvement.
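
If firmware compatibility is a suspect, it may also be worth noting the drive's firmware revision and comparing it against WD's release notes. A minimal sketch, assuming the nvme-cli package is installed (apt install nvme-cli) and /dev/nvme0 is the controller in question:

    # list NVMe controllers with model, serial and firmware revision
    nvme list

    # firmware revision field only, for a specific controller
    nvme id-ctrl /dev/nvme0 | grep -i '^fr '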

Good luck


Blockbridge : Ultra low latency all-NVME shared storage for Proxmox - https://www.blockbridge.com/proxmox
 
I would run "sudo fsck.ext4 /dev/nvme0n1" (or whatever the device name is) to check for and possibly repair errors. I would also run a SMART test against it, and I would try re-installing Proxmox with ZFS instead.

I run a couple of consumer-grade boxes here with no-name NVMe drives in them (Teamgroup stuff mostly) and they run fine on consumer drives. To be fair, though, I do disable some services which could potentially cause more write amplification (corosync, pve-ha-crm, and pve-ha-lrm). I also don't store much actual data on my Proxmox hosts; all my data resides on a TrueNAS machine, and inside my VMs I tend to use NFS. My actual VM disks are pretty small, usually 16 or 32 GB.
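
A rough sketch of those checks as commands, with the caveat that fsck has to run on an unmounted filesystem (so use the installer's rescue shell or a live USB for the root volume); the device name and the pve-root LV path are the defaults on a stock Proxmox install and may differ on yours:

    # SMART health report and a short self-test (needs smartmontools)
    smartctl -a /dev/nvme0n1
    smartctl -t short /dev/nvme0n1

    # filesystem check on the Proxmox root LV - only while it is NOT mounted
    fsck.ext4 -f /dev/mapper/pve-root

    # optionally stop the cluster/HA services that add background writes
    # (only sensible on a standalone, non-clustered node)
    systemctl disable --now pve-ha-crm pve-ha-lrm corosync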
 

Hi. Did you manage to fix the error?
I have the same problem. The disk is OK.

Active State Power Management is disabled.

nvme_core.default_ps_max_latency_us=0

And the system still runs into the error.
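
One thing worth double-checking is whether both settings actually took effect after the reboot; on a stock kernel the values are visible at the standard procfs/sysfs locations:

    # the boot parameter should appear on the kernel command line
    cat /proc/cmdline

    # the nvme_core module should report 0
    cat /sys/module/nvme_core/parameters/default_ps_max_latency_us

    # current ASPM policy (if the file is present on your kernel)
    cat /sys/module/pcie_aspm/parameters/policy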
 

Attachments: 123.jpg
Unfortunately I didn't.

I even asked for a replacement unit, and it had the exact same issue. The same thing also happened with a plain Ubuntu Server install.
I got it to work, but only with an old 2.5" SSD. I had to put that server build on hold for a while and I'm not sure when I will return to it, but I guess for me the solution will be to use something that isn't NVMe.