EXT4 I/O Error suddenly

stoeger

New Member
Oct 14, 2024
Hi there,

After moving my Proxmox server from my parents' place to my flat, I get this new error that stops the server from booting. The boot drive is an NVMe SSD. When booting in recovery mode, I can log into the system and work with it. I ran a read/write test as well as nvme-cli and smartctl; everything passed without displaying errors. Looking at journalctl -xb, I see no errors related to the storage disks. When I tried to disable the GPU passthrough, I could not update the initramfs because of some read/write errors, but I assumed that was because the system was in recovery mode. I cannot boot in default mode, because then the error shows up.

Does anybody know whether this is fixable, and if so, how? I forgot to create a backup of my containers/VMs :,(
 

Attachments

  • proxmox_boot.jpg
Welcome to the Proxmox forum, stoeger!

Have you already found a solution to your problem? If not, have you also run a long test with smartctl to verify that there is nothing wrong with the drive? If no major errors are detected, you could also boot a Live ISO (e.g. Debian/Ubuntu) and run e2fsck -fccky <device> there. That forces a filesystem check (-f), scans for new bad blocks with a non-destructive read-write test (-cc), keeps the previous bad-blocks table (-k), and answers yes to all repair prompts (-y).
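A minimal sketch of that sequence from a Live ISO, for illustration only. It assumes the controller is /dev/nvme0 and that the root filesystem is the LVM volume pve/root (the default name on an ext4 Proxmox install, not confirmed in this thread); both are overridable. The extended self-test is started with nvme-cli here, since older smartctl versions may not support NVMe self-tests. The `run` helper is hypothetical: it only prints each command unless RUN=1 is set, so nothing touches the disk by accident.

```shell
#!/bin/sh
# Assumed names, adjust to your system (check with lsblk and lvs first).
NVME_CTRL=${NVME_CTRL:-/dev/nvme0}   # NVMe controller device (assumption)
ROOT_LV=${ROOT_LV:-pve/root}         # VG/LV holding the ext4 root (assumption)

# Hypothetical helper: print the command by default, execute only with RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# 1. Start an extended self-test (code 2), then read the result log
#    once the test has finished, plus the general SMART data.
run nvme device-self-test "$NVME_CTRL" --self-test-code=2
run nvme self-test-log "$NVME_CTRL"
run smartctl -a "$NVME_CTRL"

# 2. Activate the root volume and check it while it is NOT mounted.
run lvchange -ay "$ROOT_LV"
run e2fsck -fccky "/dev/$ROOT_LV"
```

Run it once as-is to review the commands, then again with RUN=1 as root. The -cc scan reads and rewrites every block, so it can take a long time, and it is safer to do only after the drive has been imaged.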

I would create a backup of the drive in any case, especially if there is any indication that the drive is failing. The filesystem should still be mountable from a Live ISO, and from there you should be able to create a backup of the drive.
 
No, sadly not. I booted in recovery mode and remounted the filesystem read-write. After that I ran a short self-test using the nvme-cli tool, and the drive passed. When trying to create a backup with vzdump to a USB drive on the Proxmox host (in recovery mode, with the filesystem remounted read-write), I get Error: Backup of VM <ID> failed - no such logical volume /pve/data, followed by the same errors as before: EXT4-fs warning and Buffer I/O error on device dm-1. I had to start pvedaemon and pve-cluster manually.

After trying to create a backup I can no longer send commands in the terminal. For example shutdown returns bash: /usr/bin/bash: Input/output error.

I will try to put Ubuntu on a USB stick and boot from it to check the boot drive with e2fsck -fccky <device>.

I doubt it has anything to do with this error, since the host is currently not connected to the network, but I had to change the IP address in /etc/network/interfaces and modify the /etc/hosts file to get it connected to the LAN.
 
So I tested the drive using the method described here:

Got "input output error" when execute any commands

When running time -p dd if=/dev/sda of=/dev/null bs=4M, where sda is the boot drive (in my case nvme0n1), the command finishes without an error. That suggests the problem is the filesystem and not the drive itself (possibly because the system was not shut down properly before the move).

I made a backup of the nvme0n1 drive using dd if=/dev/nvme0n1 of=/mnt/usb/proxmox-backup.img bs=64k conv=noerror,sync. Just for clarification, the USB drive is mounted at /mnt/usb.

I tried adding fsck.mode=force to grub using:

How to automatically force fsck disks after crash in `systemd`?

But this does not fix anything, and if I boot in default mode, the I/O error comes back.
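For reference, a sketch of how such a kernel parameter is typically made persistent on a Debian-based GRUB setup. The "quiet" entry is the usual Debian default and is assumed here; fsck.repair=yes is an extra parameter that was not part of the attempt described above and asks the boot-time fsck to fix errors without prompting.

```shell
# /etc/default/grub (fragment)
GRUB_CMDLINE_LINUX_DEFAULT="quiet fsck.mode=force fsck.repair=yes"
```

The change only takes effect after update-grub has rewritten the boot configuration, which needs a writable root filesystem and may itself fail while the I/O errors persist. For a one-off test, the same parameters can be appended to the linux line after pressing e in the GRUB menu.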

I will try the route of booting from a USB stick and running e2fsck from there. If this also fails, I think the only solution would be to reinstall Proxmox, with the downside of losing the configs (I could get them from the dd backup, but I have never worked with this before).
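A rough sketch of how the configs might be pulled out of the dd image, assuming a default Proxmox layout: an LVM volume group named pve with a root volume, and the /etc/pve contents stored in the database under /var/lib/pve-cluster. None of that is confirmed for this system. The mount point and output directory are hypothetical names, and `run` is a hypothetical helper that only prints commands unless RUN=1 is set. This is best done on a machine where the original NVMe drive is not attached, because LVM would otherwise see the same volume group twice.

```shell
#!/bin/sh
# Assumed paths, adjust as needed.
IMG=${IMG:-/mnt/usb/proxmox-backup.img}   # image created with dd (from the post)
MNT=${MNT:-/mnt/pve-root}                 # hypothetical mount point
OUT=${OUT:-/mnt/usb/pve-configs}          # hypothetical output directory

# Hypothetical helper: print the command by default, execute only with RUN=1.
run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

# Attach the image read-only and scan its partition table; with RUN=1
# losetup prints the loop device it picked (e.g. /dev/loop0).
run losetup --find --show --partscan --read-only "$IMG"

# Activate only the root volume (assumed name pve/root).
run lvchange -ay pve/root

# Mount read-only without replaying the ext4 journal (noload),
# since the loop device cannot be written to.
run mkdir -p "$MNT" "$OUT"
run mount -o ro,noload /dev/pve/root "$MNT"

# Copy the cluster config database directory and the network files.
run cp -a "$MNT/var/lib/pve-cluster" "$OUT/"
run cp -a "$MNT/etc/network/interfaces" "$MNT/etc/hosts" "$OUT/"
```

Afterwards, unmount the filesystem, deactivate the volume with lvchange -an pve/root, and detach the loop device with losetup -d. If activation fails on the read-only loop device, working on a second copy of the image without --read-only is one alternative.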