Host Broken after Power Loss

rechena

Member
Apr 27, 2020
Hi, I know this has been discussed a lot and I've tried everything I could find, but I'm at a loss at this stage :(

This morning we had a power loss and the UPS couldn't keep up. TL;DR, my Proxmox host went down hard, and when I try to bring it back up I get the "No Boot" message.
I've tried to recover GRUB, but I keep running into issues and it just won't boot.

This is the process I've followed: https://pve.proxmox.com/wiki/Recover_From_Grub_Failure
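For reference, the rough shape of that procedure (a sketch from memory, not verified against the current wiki; the volume group name pve and the target disk /dev/sdd are assumptions for a default install) is:

Code:
# from a live/rescue system: activate LVM and mount the installed root
vgscan
vgchange -ay
mkdir -p /media/RESCUE
mount /dev/pve/root /media/RESCUE
for d in /dev /proc /sys; do mount --bind $d /media/RESCUE$d; done
chroot /media/RESCUE
grub-install /dev/sdd
update-grub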

But when I try to mount the boot partition:


Code:
# mount /dev/sdd1 /media/RESCUE/boot/
mount: /media/RESCUE/boot: wrong fs type, bad option, bad superblock on /dev/sdd1, missing codepage or helper program, or other error.

This is my fdisk..

Code:
# fdisk -l /dev/sdd
Disk /dev/sdd: 113 GiB, 121332826112 bytes, 236978176 sectors
Disk model: APPLE SSD TS128A
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 822A7CF2-48A5-490D-9093-39071F944796

Device       Start       End   Sectors   Size Type
/dev/sdd1       34      2047      2014  1007K BIOS boot
/dev/sdd2     2048   1050623   1048576   512M EFI System
/dev/sdd3  1050624 236978142 235927519 112.5G Linux LVM

I tried fsck too, but no joy there either..

Code:
# fsck /dev/sdd1
fsck from util-linux 2.33.1
e2fsck 1.44.5 (15-Dec-2018)
ext2fs_open2: Bad magic number in super-block
fsck.ext2: Superblock invalid, trying backup blocks...
fsck.ext2: Bad magic number in super-block while trying to open /dev/sdd1

The superblock could not be read or does not describe a valid ext2/ext3/ext4
filesystem.  If the device is valid and it really contains an ext2/ext3/ext4
filesystem (and not swap or ufs or something else), then the superblock
is corrupt, and you might try running e2fsck with an alternate superblock:
    e2fsck -b 8193 <device>
 or
    e2fsck -b 32768 <device>

On /dev/sdd2 I seem to have a dirty bit which I also can't clear :(

Code:
root@sauron:/# fsck /dev/sdd2
fsck from util-linux 2.33.1
fsck.fat 4.1 (2017-01-24)
0x41: Dirty bit is set. Fs was not properly unmounted and some data may be corrupt.
1) Remove dirty bit
2) No action
?

Any other ideas what I can do? Thanks for the help.

Should I assume at this stage I need to reinstall the system?
 
Eventually I had to get the server up and running so I just did a fresh install and reimported the zpools...
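In case it helps anyone later, re-importing the pools after the fresh install was just a matter of (a sketch; "tank" is a placeholder pool name):

Code:
# list pools that are visible on the disks but not yet imported
zpool import
# import by name; -f because the pool was last used by the old install
zpool import -f tank
zpool status tank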

Question though, any idea why this would corrupt so easily? Is it because it's an SSD?
 
Glad to hear you recovered the system. These are just my thoughts on this subject and maybe a partial answer to your question.

There are other posts about boot and/or root partition issues after an unexpected power loss when using non-battery-backed/consumer SSDs. I have personally had a ZFS pool go bad because of a power interruption on a cheap SSD mirror. Even ZFS on a mirror! It did, however, detect that the metadata was broken.

Please note that the BIOS boot partition does not contain a filesystem, so mounting and fsck will not work on it. It contains binary code for running GRUB.
The ESP is a FAT32 filesystem that contains the actual boot files (for UEFI at least), such as the kernel and initramfs; it still has the dirty bit set because of the hard power-off. However, unless you were actually writing to it (e.g., running apt-get dist-upgrade), I would not expect any errors in that particular filesystem.
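If you want to look at what is on the ESP yourself, mounting it read-only is enough (a rough sketch; the mount point is arbitrary):

Code:
mkdir -p /mnt/esp
mount -o ro -t vfat /dev/sdd2 /mnt/esp
ls -R /mnt/esp/EFI
umount /mnt/esp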

Is your system booting from UEFI or BIOS (or UEFI with CSM enabled)? Please note that the wiki-page is about Proxmox 4 and out of date (i.e., no ESP at that time).
I'm not sure what the "No Boot message" is exactly. Maybe it was a problem with the Proxmox root filesystem on LVM? Were you using thinly-provisioned LVM? There were probably disk writes going on to that filesystem when the power went out.
 
Thanks so much for the extended reply, I'll try to reply just as thoroughly :)

How do I check if I'm booting from UEFI or BIOS? I tried to look for an indication and couldn't..
Fair point on the fsck, I didn't think of that.

Hmm, I'm only using the boot disk for Proxmox with the default settings, nothing else. I do use thin provisioning for the VMs and LXCs, but that's on the zpool; not sure if this is the info you were looking for?

Thanks
 
How do I check if I'm booting from UEFI or BIOS? I tried to look for an indication and couldn't..
Are there files in /sys/firmware/efi/efivars (UEFI with CSM disabled) or does that directory not exist (Legacy/BIOS/UEFI with CSM enabled)?
Hmm, I'm only using the boot disk for Proxmox with the default settings, nothing else. I do use thin provisioning for the VMs and LXCs, but that's on the zpool; not sure if this is the info you were looking for?
I think it is wise to separate the Proxmox host installation from the VM/CT storage (as you have done), because it allows easier reinstallation if it is needed.

From your initial post, I cannot determine what was broken and why it was not booting. But I guess it does not matter anymore (and we cannot check anyway).
 
Are there files in /sys/firmware/efi/efivars (UEFI with CSM disabled) or does that directory not exist (Legacy/BIOS/UEFI with CSM enabled)?

Yep

Code:
root@sauron:~# ls -la /sys/firmware/efi/efivars |wc -l
136

I think it is wise to separate the Proxmox host installation from the VM/CT storage (as you have done), because it allows easier reinstallation if it is needed.
Yeah, that was a good call alright, and I also have a mirror for the VMs, just not for the boot disk.. but I suppose it wouldn't have made a difference in this case.
 
Using a mirror (Proxmox handles keeping multiple ESPs up to date) and ZFS allows you to lose a drive and also to detect data corruption if one of the drives is slowly failing.
However, when you use identical drives that perform the same writes at the same time (without proper battery backup), they can still corrupt data together on a power loss. I've seen the same sectors corrupted on both drives (nice to have checksums) and thus not recoverable; luckily it was just a log file.
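Checking for that kind of corruption is cheap to do periodically (a sketch; "rpool" is a placeholder pool name):

Code:
# read every block and verify it against its checksum
zpool scrub rpool
# later: per-device error counts and any files that could not be repaired
zpool status -v rpool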

In your case, I would make sure that the UPS works and that the system shuts down quickly enough (maybe even kill the VMs, which can be restored from backup if necessary), as this is the battery backup for your drives. Or, if you don't have a working UPS, buy an enterprise SSD with battery backup to mirror or replace your Proxmox drive.
PS: Note that you only need an 8 GB disk for Proxmox itself and it doesn't need to be fast, so get the cheapest drive with (real) power loss protection (PLP).
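As a sketch of the "shut down quickly enough" part, assuming you use NUT (the nut package) for UPS monitoring, the relevant pieces look roughly like this (UPS name, driver and password are placeholders for your hardware):

Code:
# /etc/nut/ups.conf
[myups]
    driver = usbhid-ups
    port = auto

# /etc/nut/upsmon.conf  ("secret" is a placeholder password)
MONITOR myups@localhost 1 upsmon secret master
SHUTDOWNCMD "/sbin/shutdown -h +0"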
 
Dear @avw, I do have a UPS, but unfortunately I didn't set it up to shut down properly, and a power loss event turned my root filesystem read-only.

I rebooted and got it working again without doing anything (I assume fsck ran at boot time and fixed it automatically).

However, there hasn't been any power loss recently and it suddenly turned read-only again; how is that possible?

It's a consumer SSD, a Kingston 120 GB av400; I just checked its health and it's OK, even the wearout is very low at only 4%.

So I would like to ask what could have led to that, and how I can resolve it without rebooting my server.

Appreciate your help.

Thanks

dmesg:
Code:
[496425.146346] perf: interrupt took too long (2503 > 2500), lowering kernel.perf_event_max_sample_rate to 79750
[523651.936639] sd 4:0:0:0: [sdd] tag#15 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
[523651.936642] sd 4:0:0:0: [sdd] tag#15 CDB: Write(10) 2a 00 02 94 95 78 00 00 98 00
[523651.936644] blk_update_request: I/O error, dev sdd, sector 43292024 op 0x1:(WRITE) flags 0x800 phys_seg 19 prio class 0
[523651.936687] sd 4:0:0:0: [sdd] tag#3 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
[523651.936689] sd 4:0:0:0: [sdd] tag#3 CDB: Write(10) 2a 00 01 20 4c f0 00 00 18 00
[523651.936690] blk_update_request: I/O error, dev sdd, sector 18894064 op 0x1:(WRITE) flags 0x0 phys_seg 3 prio class 0
[523651.936695] EXT4-fs warning (device dm-15): ext4_end_bio:315: I/O error 10 writing to inode 526922 (offset 385024 size 12288 starting block 133022)
[523651.936698] Buffer I/O error on device dm-15, logical block 133022
[523651.936708] Buffer I/O error on device dm-15, logical block 133023
[523651.936710] Buffer I/O error on device dm-15, logical block 133024
[523651.936720] JBD2: Detected IO errors while flushing file data on dm-15-8
[523651.936721] sd 4:0:0:0: [sdd] tag#2 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
[523651.936722] sd 4:0:0:0: [sdd] tag#2 CDB: Read(10) 28 00 00 10 08 00 00 01 00 00
[523651.936724] blk_update_request: I/O error, dev sdd, sector 1050624 op 0x0:(READ) flags 0x0 phys_seg 32 prio class 0
[523651.936730] Aborting journal on device dm-15-8.
[523651.936739] sd 4:0:0:0: [sdd] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
[523651.936740] sd 4:0:0:0: [sdd] tag#1 CDB: Read(10) 28 00 00 00 08 00 00 01 00 00
[523651.936741] blk_update_request: I/O error, dev sdd, sector 2048 op 0x0:(READ) flags 0x0 phys_seg 32 prio class 0
[523651.936755] sd 4:0:0:0: [sdd] tag#0 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_TIMEOUT
[523651.936756] sd 4:0:0:0: [sdd] tag#0 CDB: Read(10) 28 00 00 00 00 00 00 01 00 00
[523651.936757] blk_update_request: I/O error, dev sdd, sector 0 op 0x0:(READ) flags 0x0 phys_seg 32 prio class 0
[523651.945590] EXT4-fs error (device dm-15): ext4_journal_check_start:61: Detected aborted journal
[523651.945598] EXT4-fs (dm-15): Remounting filesystem read-only
[523651.946970] EXT4-fs error (device dm-15): ext4_journal_check_start:61: Detected aborted journal
[523651.948327] EXT4-fs error (device dm-15): ext4_journal_check_start:61: Detected aborted journal
[523651.948333] EXT4-fs (dm-15): ext4_writepages: jbd2_start: 2047 pages, ino 527874; err -30
[523651.949730] EXT4-fs error (device dm-15): ext4_journal_check_start:61: Detected aborted journal
[535696.142621] hrtimer: interrupt took 9768 ns
 
If those messages come from a VM and you used thin provisioning then it would be because your drive is actually full but the VM still thinks that the virtual disk has space.
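To rule that out quickly, you can check the actual allocation on the host (a sketch; pool and dataset names are placeholders):

Code:
# LVM thin pools: watch the Data% column
lvs -a
# ZFS: free space on the pool/datasets backing the VM disks
zpool list
zfs list -o name,used,avail,refer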
Otherwise, it looks like your drive has errors while reading and writing. Could be cables, could be a heat problem, could be a hardware fault. SMART does not always notice problems before they occur. Did you do a long self-test? Or maybe the inside of the drive is fine, but errors occur in transport (connectors, for example) or only when it is very busy. Try unplugging it and reconnecting it? Try it in another system? Ask Kingston for advice on how to test for hardware issues or bad flash memory?
I am no expert on LVM/ext4 and I know nothing about your SSD, but I would consider it broken and check the warranty. I don't see how you could fix this without rebooting.
Maybe someone else understands those errors better than me and might be able to help you?
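For the long self-test mentioned above, smartmontools can run it inside the drive while the system stays up (a sketch; adjust the device node to your setup):

Code:
# start the extended (long) self-test; it runs in the drive's firmware
smartctl -t long /dev/sdd
# once it finishes, check the self-test log and overall health
smartctl -l selftest /dev/sdd
smartctl -H /dev/sdd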
 
Apart from that, @avw, what would be a better setup, considering that this power loss situation will happen again and I can't afford an SSD with power loss protection?

Would using another filesystem for root help me in any way? ZFS perhaps?

Thanks in advance
 
Is there any way to send a 'qm stop/shutdown' command to gracefully shut down a VM without writing to /var/log/...?

As the filesystem is read-only I cannot do anything from the command line; for the VMs I can access I'm able to request a shutdown from inside, but I don't know how to stop the others that I don't have access to..
 
Code:
unable to create output file '/var/log/pve/tasks/4/UPID:<just codes>:qmshutdown:504:root@pam:' - Read-only file system
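One thing I might try (untested, and not an official Proxmox method) is asking QEMU for an ACPI power-down directly over its QMP socket, which should bypass the pve task log entirely; this assumes the socket path /var/run/qemu-server/<VMID>.qmp that qm itself uses, and that socat is installed:

Code:
# hypothetical sketch: send system_powerdown straight to QEMU, no qm task log involved
VMID=504   # placeholder VM ID
printf '%s\n' '{"execute":"qmp_capabilities"}' '{"execute":"system_powerdown"}' \
  | socat - UNIX-CONNECT:/var/run/qemu-server/${VMID}.qmp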
 
