[SOLVED] Guest-VM crash, fsck failed, bootloop

msbt

New Member
Jun 30, 2018
Hi there!

Earlier today one of my Ubuntu VMs suddenly stopped working. I didn't change anything on the host (other than renewing the cert a few days ago) or on the guest, and both had been running for a few weeks without any issues. I got a mail from my uptime monitor today saying that some websites weren't reachable anymore, so I tried to jump on the console to see what was going on. I couldn't even log in on the guest and got spammed by fatal errors, about a read-only filesystem I think (sadly I don't have a screenshot of those). I rebooted the guest and it got stuck in a boot loop with the following message, which appeared after entering the password to decrypt the LVM:

proxmox.png

It tries to run an fsck, but at ~5% the message about the panic= boot argument appears and the machine reboots automatically.

VM settings:

proxmox_settings.PNG
Proxmox version is 5.2-8 (now that I'm writing this, I might have run an aptitude safe-upgrade the other day, but that was already a few days ago). The disk size is 1TB with about 60GB in use (although backups were as big as 200GB+). I've "solved" the issue by creating a new machine and restoring a backup, but I would very much like to find out how this happened so I can avoid similar issues in the future. Any pointers on how to access that machine again and maybe pull some logs?

Best regards
 
Does the host syslog show anything in that time period? Maybe a broken disk or a full partition?
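
A rough sketch of how one might filter the host syslog for that kind of hint. It assumes classic syslog timestamps ("Sep 18 16:17:08", which carry no year, so the year has to be supplied) and a keyword list that is only a guess at what a failing disk or a full partition tends to log; neither the function name nor the keywords come from Proxmox itself.

```python
import re
from datetime import datetime

# Hypothetical keyword list: adjust for your controller, filesystem and distro.
PATTERN = re.compile(
    r"smartd|I/O error|medium error|No space left on device|read-only",
    re.IGNORECASE,
)

def suspicious_lines(lines, start, end, year):
    """Yield syslog lines between start and end that match PATTERN.

    Classic syslog timestamps omit the year, so the caller passes it in.
    Lines that do not begin with a parsable timestamp are skipped.
    """
    for line in lines:
        try:
            stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        except ValueError:
            continue
        if start <= stamp <= end and PATTERN.search(line):
            yield line.rstrip("\n")
```

On a Debian-based host this could be fed with an open handle on /var/log/syslog (or an older rotated copy), with start and end set to a day or two around the time the guest went read-only.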
 
Ah yes, of course, the most obvious thing I didn't check :s

Seems to me that one of the disks in the RAID 5 (I imagine this is exactly why one would create a RAID) is acting up:

Sep 18 16:17:08 Proxmox-VE smartd[764]: Device: /dev/bus/0 [megaraid_disk_05] [SAT], 16 Currently unreadable (pending) sectors
Sep 18 16:17:08 Proxmox-VE smartd[764]: Device: /dev/bus/0 [megaraid_disk_05] [SAT], 16 Offline uncorrectable sectors
...
Sep 19 09:47:08 Proxmox-VE smartd[764]: Device: /dev/bus/0 [megaraid_disk_05] [SAT], 8 Currently unreadable (pending) sectors
Sep 19 09:47:08 Proxmox-VE smartd[764]: Device: /dev/bus/0 [megaraid_disk_05] [SAT], 8 Offline uncorrectable sectors
...
Sep 19 17:17:08 Proxmox-VE smartd[764]: Device: /dev/bus/0 [megaraid_disk_05] [SAT], 40 Currently unreadable (pending) sectors (changed +32)
Sep 19 17:17:08 Proxmox-VE smartd[764]: Device: /dev/bus/0 [megaraid_disk_05] [SAT], 40 Offline uncorrectable sectors (changed +32)
...
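
To follow how those counts develop over time, the smartd lines can be pulled apart with a small script. This is only a sketch: the regular expression is written against the message format shown in the excerpt above and may need adjusting for other smartd versions, and the names sector_history and is_growing are made up for illustration.

```python
import re

# Matches smartd sector warnings in the format seen in the excerpt above.
SMARTD = re.compile(
    r"smartd\[\d+\]: Device: (?P<dev>\S+ \[[^\]]+\]).*?, "
    r"(?P<count>\d+) "
    r"(?P<kind>Currently unreadable \(pending\)|Offline uncorrectable) sectors"
)

def sector_history(lines):
    """Return {(device, kind): [count, ...]} in the order the lines appear."""
    history = {}
    for line in lines:
        match = SMARTD.search(line)
        if match:
            key = (match["dev"], match["kind"])
            history.setdefault(key, []).append(int(match["count"]))
    return history

def is_growing(counts):
    """True if the latest reported count is higher than the first one seen."""
    return len(counts) >= 2 and counts[-1] > counts[0]

# The pending-sector lines quoted in this post.
log = [
    "Sep 18 16:17:08 Proxmox-VE smartd[764]: Device: /dev/bus/0 [megaraid_disk_05] [SAT], 16 Currently unreadable (pending) sectors",
    "Sep 19 09:47:08 Proxmox-VE smartd[764]: Device: /dev/bus/0 [megaraid_disk_05] [SAT], 8 Currently unreadable (pending) sectors",
    "Sep 19 17:17:08 Proxmox-VE smartd[764]: Device: /dev/bus/0 [megaraid_disk_05] [SAT], 40 Currently unreadable (pending) sectors (changed +32)",
]

for (dev, kind), counts in sector_history(log).items():
    print(dev, kind, counts, "growing" if is_growing(counts) else "stable")
```

For the lines above the pending-sector series comes out as 16, 8, 40, so the count dipped once and then jumped, which is the pattern that made me suspect the disk.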

I'll replace it and see if something like that happens again, sorry for not checking the syslog first.

Best regards
 
