Server stuck at "/sbin/fsck.xfs: XFS file system" after a power failure

Minionan

Member
Feb 2, 2021
I had a power failure and lost my server.
Now on boot, my Proxmox machine is stuck at: /sbin/fsck.xfs: XFS file system
I tried to manually scan the disk in recovery mode using
root@pve:~# xfs_repair -n /dev/mapper/pve-root
Unfortunately, I can't unmount the active device, and thus can't perform the repair.
I found the following fix:

"The fix is to manually start services required for LVM, then run fsck (checking that root is mounted read-only), and reboot.
I've written a script to do this currently located on grid02 and attached (inline here since the forum says it is an invalid file)
=========================================
#!/bin/sh
# Karsten M. Self
# Thu Jan 22 23:18:04 GMT 2009
#
# If we fail fsck on boot we need to enable LVM before we can fsck.
#
/etc/init.d/glibc.sh start
/etc/init.d/mountkernfs.sh start
/etc/init.d/udev start
/etc/init.d/mountdevsubfs.sh start
/etc/init.d/libdevmapper1.02 start
/etc/init.d/lvm start
# Hail Mary fsck
mount -o remount,ro / && e2fsck -y /dev/mapper/pve-root
echo "Check for the smell of burning rubber. You should probably reboot." "

I understand that I need to write the above script, but I need a bit of additional info:
1. Where do I write the script? What type of file? Or should I update one of the existing scripts?
2. How do I point to the script so it is run before fsck begins?
3. Can I write the script with vim or nano in recovery mode, or do I have to use a rescue disk?
4. Is there any easier way to fix it? (See the sketch right below.)
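For context: as far as I can tell, /sbin/fsck.xfs on Debian is basically a stub that just prints "XFS file system" and exits, so the real check has to be done with xfs_repair against an unmounted device. The quoted script also calls e2fsck, which targets ext filesystems; for my XFS root the equivalent, run from a recovery shell or a live system where pve-root is not mounted, would be something like this (a rough sketch, not tested):
Code:
# activate the LVM volume group so /dev/mapper/pve-root exists
vgchange -ay pve
# dry run: report problems without writing anything
xfs_repair -n /dev/mapper/pve-root
# actual repair, only if the dry run looks sane
xfs_repair /dev/mapper/pve-root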
 
So I booted a Debian live system and found some interesting stuff.
lvscan gives the expected output:
Code:
user@debian:~$ sudo lvscan
  ACTIVE            '/dev/pve/data' [140.45 GiB] inherit
  ACTIVE            '/dev/pve/swap' [8.00 GiB] inherit
  ACTIVE            '/dev/pve/root' [55.75 GiB] inherit
But xfs_repair doesn't find any issues on the disk:
Code:
user@debian:~$ sudo xfs_repair -n /dev/pve/root
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 3
        - agno = 2
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.
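One caveat I picked up along the way: xfs_repair -n is only a read-only check, and if the log is dirty the usual advice is to mount and unmount the volume once so the journal gets replayed before repairing for real. Something like this from the live system (the mount point is just an example):
Code:
user@debian:~$ sudo mkdir -p /mnt/pve-root
user@debian:~$ sudo mount /dev/pve/root /mnt/pve-root    # mounting replays a dirty XFS log
user@debian:~$ sudo umount /mnt/pve-root
user@debian:~$ sudo xfs_repair /dev/pve/root             # real repair, without -n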
Just to be sure, I checked whether only my root is XFS, and that is correct:
Code:
user@debian:~$ sudo fsck -N /dev/pve/root
fsck from util-linux 2.38.1
[/usr/sbin/fsck.xfs (1) -- /dev/mapper/pve-root] fsck.xfs /dev/mapper/pve-root
user@debian:~$ sudo fsck -N /dev/pve/data
fsck from util-linux 2.38.1
[/usr/sbin/fsck.ext2 (1) -- /dev/mapper/pve-data] fsck.ext2 /dev/mapper/pve-data
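(For completeness, blkid or lsblk -f from the live system should report the same filesystem types; these are just the standard util-linux/blkid tools, nothing Proxmox-specific:)
Code:
user@debian:~$ sudo blkid /dev/pve/root /dev/pve/data
user@debian:~$ lsblk -f /dev/mapper/pve-root /dev/mapper/pve-data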
So according to xfs_repair the root partition is OK, but booting still hangs on:
Code:
...
Found volume group "pve" using metadata type lvm2
3 logical volume(s) in volume group "pve" now active
/sbin/fsck.xfs: XFS file system.
_
When installing the system, I wanted to install Proxmox on a RAID1 of two 240 GB SSDs I had lying around.
I don't remember if that worked, but the system was running, so I didn't care at that point.
lsblk shows those two drives as sdn & sdo:
Code:
user@debian:~$ lsblk
NAME     MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
...
sdn        8:208  0 223.6G  0 disk
├─sdn1     8:209  0  1007K  0 part
├─sdn2     8:210  0   512M  0 part
└─sdn3     8:211  0 223.1G  0 part
  ├─pve-swap
  │      254:0    0     8G  0 lvm
  ├─pve-root
  │      254:1    0  55.8G  0 lvm
  ├─pve-data_tmeta
  │      254:2    0   1.4G  0 lvm
  │ └─pve-data
  │      254:4    0 140.5G  0 lvm
  └─pve-data_tdata
         254:3    0 140.5G  0 lvm
    └─pve-data
         254:4    0 140.5G  0 lvm
sdo        8:224  0 223.6G  0 disk
...
I tried to find where Proxmox stores info about RAID arrays, but I can't find mdadm.conf or anything similar.

Where does Proxmox store RAID info?
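(As far as I know, the Proxmox installer only offers RAID1 through ZFS, not mdadm, so if the install really ended up mirrored I'd expect one of these to show it. These are generic commands run from the live system, nothing Proxmox-specific:)
Code:
user@debian:~$ cat /proc/mdstat       # Linux software RAID (mdadm), if any
user@debian:~$ sudo pvs               # does the pve volume group sit on one disk or two?
user@debian:~$ sudo zpool status      # ZFS mirror, if the ZFS tools are installed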

I can boot Proxmox from the USB install drive: Advanced Options -> Rescue Boot.
All seems to be fine, the server comes online, and the VMs and data are in place.
But I can't start my most important VMs, as they require PCI pass-through:
Error: cannot prepare PCI passtrough, IOMMU not present
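(As an aside, my understanding is that PCI passthrough needs the IOMMU enabled on the kernel command line, and the rescue boot probably doesn't pass those parameters. On the regular install it would be something along these lines in /etc/default/grub for an Intel CPU, followed by update-grub and a reboot:)
Code:
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
# amd_iommu is usually enabled by default on recent kernels for AMD CPUs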

How do I fix the original system drive?
Am I missing something?
 
So I eventually got it fixed.
With the help of a professional, we replaced the whole /sbin folder and played with the files in /etc/pve. Nothing worked; the system kept hanging on:
Code:
/sbin/fsck.xfs: XFS file system
Defeated, I installed a fresh Proxmox on a new drive with all the previous drives connected. The new install created a new GRUB. When trying to boot into my old system in rescue mode, I started it in normal mode by mistake, and it miraculously booted. I can't really wrap my head around how it happened, but after installing the new Proxmox, the old installation sprang back to life. The boot screen still hangs on:
Code:
/sbin/fsck.xfs: XFS file system
But this time the server actually booted, and the system was available via the browser, where I could start all my VMs.
I couldn't communicate with the system via the shell directly, but the shell in the browser worked fine.
I updated the system to v8.0.4 and everything works fine.
I learned my lesson and cloned the Proxmox disk this time.
I didn't really manage to find out what was wrong; I just found a quick and dirty fix.
Hope this info will help anyone with the same problem.
 
