abrupt power outage.... proxmox will not boot

Chris Rivera

Guest
A few hours ago there was a power outage. All the other nodes came back online without a problem, but one of our nodes is not working.

It's saying:

journal block not found
could not locate journal superblock
could not locate journal inode..
cannot mount pve 3 pve mounts

Target filesystem doesn't have requested /sbin/init

Then it loads BusyBox (Debian 1:1.17.1-8)

/bin/sh: can't access tty; job control turned off

I found a related post in the Proxmox 1.0 section that said to run:

update-initramfs -k all -u

but I get: update-initramfs not found


Any help would be greatly appreciated!
 
I would think you lost a drive or your bootloader was corrupted. I would boot from live media, do a damage assessment and backup, and plan on a reinstall.
 
OK. I can use an Ubuntu 12.04 live CD to mount the system and run checks, but this is a little beyond my understanding.

What would be the proper way to mount the filesystem so that I can run an fsck to check the disk?

sudo mount /dev/sda /mnt?
sudo mount /dev/sda1 /mnt/boot?

I'd really appreciate any help I can get...

I'm hoping to exhaust all efforts to fix the system before backing up the VM data and conf files and reinstalling a new node. Let me know.
 
I have been told that an Ubuntu Server CD should not only be able to start the system but is also capable of automatically mounting the system in a chroot environment.
 
Hi,
I like grml for that (use a 64-bit version because of the chroot that mir mentioned).

But anyway,
sda1 is normally /boot and should be mountable (it may need an fsck).
The LVM volume group pve is on sda2. Have a look with
Code:
pvscan
vgscan
lvscan
pvs
vgs
lvs
Udo
 
Mir,
Yes, you are correct, I can use Ubuntu to mount. Without the live CD I get errors when mounting the filesystem... parameter invalid.

#####

Udo,

I have mounted sda1 successfully and this is the boot partition... I can see the grub config and files.

root@ubuntu:~# pvscan /dev/sda2
PV /dev/sda2 VG pve lvm2 [465.26 GiB / 15.99 GiB free]
Total: 1 [465.26 GiB] / in use: 1 [465.26 GiB] / in no VG: 0 [0 ]





root@ubuntu:~# fdisk -l /dev/sda

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000af768

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     1048575      523264   83  Linux
/dev/sda2         1048576   976773119   487862272   8e  Linux LVM


#####

So far I have tried everything I could find and have not been successful.

I followed "HOWTO: recover lost partition after unexpected shutdown": http://ubuntuforums.org/archive/index.php/t-1682038.html


sudo fdisk -l /dev/sda

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000af768

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *        2048     1048575      523264   83  Linux
/dev/sda2         1048576   976773119   487862272   8e  Linux LVM

sudo debugfs -w /dev/sda1
debugfs 1.41.11 (14-Mar-2010)
debugfs: clri <8>
debugfs: quit

sudo fsck -yv /dev/sda1

It finished with no problems.

Then I rebooted with Alt+PrintScreen+r+e+i+s+u+b.

It reboots... and I still see the same issue.
 
Hi,
your partition isn't lost.
pvscan also shows the VG pve. What happens with vgscan?
And if vgscan works, can you activate the VG with "vgchange -a y pve"?
Perhaps your LVs are active after that and you can run an "fsck /dev/pve/root"?

Udo
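
A minimal sketch of the sequence described above, assuming the live CD has the lvm2 tools installed and the volume group is named pve as in the pvscan output. The helper names inactive_lvs and activate_and_check are made up for illustration. The check uses e2fsck -n, which only reports problems and changes nothing, so the output can be reviewed before any repair.

```shell
#!/bin/sh
# Print the device paths of logical volumes that lvscan reports as inactive.
# Reads lvscan output on stdin, so it can be tried on saved output first.
inactive_lvs() {
    awk -v q="'" '$1 == "inactive" { gsub(q, "", $2); print $2 }'
}

# Hypothetical wrapper (defined here, not called): activate the VG, then run
# a read-only check of the root LV. Run as root from the live system.
activate_and_check() {
    vg=${1:-pve}
    vgchange -a y "$vg" || return 1      # activate every LV in the VG
    lvscan | inactive_lvs                # should now print nothing
    e2fsck -n "/dev/$vg/root"            # -n: report problems, change nothing
}
```

The check runs against the unmounted device node (/dev/pve/root), so the logical volumes do not need to be mounted anywhere first.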
 
root@ubuntu:~# lvscan
inactive '/dev/pve/swap' [15.00 GiB] inherit
inactive '/dev/pve/root' [96.00 GiB] inherit
inactive '/dev/pve/data' [338.27 GiB] inherit

How can I activate them so I can run fsck?

#####

root@ubuntu:~# pvscan /dev/sda
PV /dev/sda2 VG pve lvm2 [465.26 GiB / 15.99 GiB free]
Total: 1 [465.26 GiB] / in use: 1 [465.26 GiB] / in no VG: 0 [0 ]


root@ubuntu:~# vgscan /dev/sda
Too many parameters on command line
Run `vgscan --help' for more information.


root@ubuntu:~# lvscan /dev/sda
No additional command line arguments allowed
Run `lvscan --help' for more information.


root@ubuntu:~# pvs
  PV         VG   Fmt  Attr PSize   PFree
  /dev/sda2  pve  lvm2 a-   465.26g 15.99g


root@ubuntu:~# vgs
  VG   #PV #LV #SN Attr   VSize   VFree
  pve    1   3   0 wz--n- 465.26g 15.99g


root@ubuntu:~# pvscan
PV /dev/sda2 VG pve lvm2 [465.26 GiB / 15.99 GiB free]
Total: 1 [465.26 GiB] / in use: 1 [465.26 GiB] / in no VG: 0 [0 ]


root@ubuntu:~# vgscan
Reading all physical volumes. This may take a while...
Found volume group "pve" using metadata type lvm2


root@ubuntu:~# lvscan
inactive '/dev/pve/swap' [15.00 GiB] inherit
inactive '/dev/pve/root' [96.00 GiB] inherit
inactive '/dev/pve/data' [338.27 GiB] inherit


root@ubuntu:~# pvs
  PV         VG   Fmt  Attr PSize   PFree
  /dev/sda2  pve  lvm2 a-   465.26g 15.99g


root@ubuntu:~# vgs
  VG   #PV #LV #SN Attr   VSize   VFree
  pve    1   3   0 wz--n- 465.26g 15.99g


root@ubuntu:~# lvs
  LV   VG   Attr   LSize   Origin Snap%  Move Log Copy%  Convert
  data pve  -wi--- 338.27g
  root pve  -wi---  96.00g
  swap pve  -wi---  15.00g

#####
 
I was able to activate the logical volumes by using

vgchange -a y pve


Now I can access the data..... At this point, do I create a new system and copy the data over? Or is this system recoverable?
 
"i was able to activate the logical volumes by using vgchange -a y pve"
Yes, like I wrote before...

"now i can access the data..... at this point to i create a new system and copy the data over? Or is this system recoverable?"
If you can mount pve-root and pve-data (and run an fsck before that), the system is recoverable, of course!
The question is why it still doesn't boot after that (or perhaps booting works now?).

Udo
 
How and where should I mount these logical volumes on the system to run fsck?

It's different from mounting partitions.
 
I was able to access the data portion only.

I was able to activate the volume group, find the volume name, then mount it.

mount /dev/pve/data /mnt/data

Mounting swap failed (I don't believe I need this).
Mounting root failed with a bad superblock error.


root@ubuntu:# mount /dev/pve/root /mnt/root
mount: wrong fs type, bad option, bad superblock on /dev/mapper/pve-root,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so

#####

root@ubuntu:/mnt/data# dmesg | tail
[ 5956.100673] kjournald starting. Commit interval 5 seconds
[ 5956.100828] EXT3-fs (dm-2): warning: mounting unchecked fs, running e2fsck is recommended
[ 5956.101045] EXT3-fs (dm-2): using internal journal
[ 5956.101050] EXT3-fs (dm-2): mounted filesystem with ordered data mode
[ 6059.381175] journal_bmap: journal block not found at offset 0 on dm-1
[ 6059.381179] journal_init_inode: Cannot locate journal superblock
[ 6059.381183] EXT3-fs (dm-1): error: could not load journal inode
[ 6151.019979] journal_bmap: journal block not found at offset 0 on dm-1
[ 6151.019985] journal_init_inode: Cannot locate journal superblock
[ 6151.019988] EXT3-fs (dm-1): error: could not load journal inode

#####

At this point I can access the client data but do not have access to the conf files, which I believe are on root.

How do I fix the superblock issue?
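
The dmesg lines above complain about the journal inode rather than the primary superblock. Here is a small sketch for pulling the affected device-mapper names out of saved dmesg output. The function name journal_error_devices is hypothetical, and the repair commands in the trailing comments are one commonly used ext3 approach, offered as an assumption rather than something confirmed in this thread. Take a backup image of the volume before trying any of it.

```shell
#!/bin/sh
# Print each dm-N device for which the kernel logged
# "could not load journal inode". Reads dmesg text on stdin.
journal_error_devices() {
    sed -n 's/.*(\(dm-[0-9][0-9]*\)): error: could not load journal inode.*/\1/p' | sort -u
}

# Usage on the live system (as root):
#   dmesg | journal_error_devices
#   ls -l /dev/mapper                 # shows which LV each dm-N belongs to
#
# Possible repair on the UNMOUNTED volume, assuming it is /dev/pve/root:
#   e2fsck -f /dev/pve/root                    # forced full check
#   tune2fs -f -O ^has_journal /dev/pve/root   # drop the damaged journal
#   e2fsck -f /dev/pve/root                    # re-check without a journal
#   tune2fs -j /dev/pve/root                   # create a fresh ext3 journal
```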
 
Try this: fsck /dev/pve/root

But before you do, it would be a good idea to make a backup: dd if=/dev/pve/root of=/root/root.img
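
One caveat with the dd command as written: on a live CD, /root lives in RAM, so an image written there does not survive a reboot, and a 96 GiB volume is unlikely to fit. Here is a sketch that writes the image to persistent storage instead, assuming pve-data is mounted at /mnt/data and has enough free space (check with df first). The helper name backup_image is hypothetical.

```shell
#!/bin/sh
# Copy a block device (or file) to an image file and verify the copy.
# Refuses to overwrite an existing image.
backup_image() {
    src=$1
    dest=$2
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    dd if="$src" of="$dest" bs=1024k || return 1
    cmp "$src" "$dest"        # exit status 0 only if the copy is identical
}

# Usage on the live system (as root), with pve-data mounted at /mnt/data:
#   df -h /mnt/data
#   backup_image /dev/pve/root /mnt/data/root.img
```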
 
Awesome advice.


Do you have a remote KVM? If so, I can get someone from our DC/NOC staff to help.
All is not lost yet. It sounds like we may just need to do a check against an alternate superblock.

That might help find the journal entries.
 
completed but when rebooted still will not load
Hi,
if you need help, please give more input!

What do you mean by "completed"?
The fsck? Was there any output? Can you mount the root partition after that?

Or the copy with dd? In that case your copy is lost, because the original command copies the root partition to the home directory of root, which exists only in memory on a live CD...

BTW, access to the data partition is usually enough to get the VMs back, but you must build a new config for each VM (I assume it's a single host?).
If this host is part of a cluster, you are in luck: all configs are stored on all hosts.

Udo
 
I am going to assume this is on /dev/sda1 (if not, substitute the appropriate device and run this command):
Code:
dumpe2fs /dev/sda1 | grep superblock

Once you find an alternate superblock in that list, run an fsck against it.

Code:
fsck -b 32768 /dev/sda1
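
A sketch of the same idea pointed at the root LV, which is the device the mount error and dmesg output in this thread name; adjust the device if your failing filesystem really is /dev/sda1. The helper name backup_superblocks is hypothetical and only extracts the block numbers from dumpe2fs output.

```shell
#!/bin/sh
# Print the block numbers of the backup superblocks listed by dumpe2fs.
# Reads dumpe2fs output on stdin.
backup_superblocks() {
    awk '/Backup superblock at/ { sub(/,$/, "", $4); print $4 }'
}

# Usage on the live system (as root), with the volume unmounted:
#   dumpe2fs /dev/pve/root 2>/dev/null | backup_superblocks
#   e2fsck -b 32768 /dev/pve/root   # 32768 is the usual first backup with 4 KiB blocks
```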

My suggestion, however, would be to use RescueCD and start from there.
We have this set up in our PXE booter.

If you have another node online and want a copy of that PXE image (KVM), let me know. It has a few other distros in it.
Anyhow,

make sure that /dev/sdxxxxx is not mounted, of course.
I have a feeling that it may have been mounted when you originally ran the fsck.
(Call it experience from some years ago... cough cough)
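
A quick guard for that, as a sketch: check /proc/mounts before running fsck. The helper name is_mounted and its optional second argument (an alternate mounts file, handy for trying it out) are assumptions for illustration. LVM volumes usually appear in /proc/mounts under their /dev/mapper name, so check that spelling.

```shell
#!/bin/sh
# Succeed if the given device appears as a mount source in the mounts table.
# $1: device path as it appears in the table
# $2: mounts file to read (default /proc/mounts)
is_mounted() {
    awk -v dev="$1" '$1 == dev { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Usage on the live system (as root):
#   if is_mounted /dev/mapper/pve-root; then
#       echo "unmount /dev/pve/root first" >&2
#   else
#       fsck /dev/pve/root
#   fi
```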

In the meantime, we are praying you get back up and running quickly.
 
Update: I was finally able to mount the pve LVM, archive pve-data using tar, and untar the file on node 3. I then copied the config files so Proxmox knows the VMs are on node 3, and booted the VMs.

Now here comes my new network issue.

All VMs with venet have no incoming or outgoing connections. They can only ping the main host node.
All VPSs with veth work just fine...

How do I get Proxmox to rebuild all the venet network routing?
 
VM ifconfig:

[root@VM760 /]# ifconfig
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:16 errors:0 dropped:0 overruns:0 frame:0
TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1468 (1.4 KiB) TX bytes:1468 (1.4 KiB)


venet0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
inet addr:127.0.0.1 P-t-P:127.0.0.1 Bcast:0.0.0.0 Mask:255.255.255.255
UP BROADCAST POINTOPOINT RUNNING NOARP MTU:1500 Metric:1
RX packets:5 errors:0 dropped:0 overruns:0 frame:0
TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:420 (420.0 b) TX bytes:865 (865.0 b)


venet0:0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
inet addr:199.195.213.*** P-t-P:199.195.213.*** Bcast:199.195.213.*** Mask:255.255.255.255
UP BROADCAST POINTOPOINT RUNNING NOARP MTU:1500 Metric:1


venet0:1 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
inet addr:199.195.214.*** P-t-P:199.195.214.*** Bcast:199.195.214.*** Mask:255.255.255.255
UP BROADCAST POINTOPOINT RUNNING NOARP MTU:1500 Metric:1



#############


vm route:

[root@VM760 /]# route

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
169.254.0.0     *               255.255.0.0     U     0      0        0 venet0
default         *               0.0.0.0         U     0      0        0 venet0




#############


hostnode ifconfig:

root@proxmox3a:~# ifconfig | grep 760


#############
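
With venet, the host normally holds one route per container IP pointing at venet0, so grepping the host's ifconfig for the container ID is not expected to show anything (only veth creates a per-container interface on the host). Here is a hedged sketch for checking those routes instead. The helper name has_venet_route is hypothetical, 192.0.2.10 is a placeholder for one of the container's masked addresses, and the idea that restarting the container re-creates the route is an assumption to verify on your node.

```shell
#!/bin/sh
# Succeed if the routing-table text on stdin has an entry for the given IP.
# Intended input: the output of "ip route show dev venet0" on the host.
has_venet_route() {
    awk -v ip="$1" '$1 == ip { found = 1 } END { exit !found }'
}

# Usage on the host node (as root):
#   vzlist -o ctid,ip                    # IPs configured per container
#   ip route show dev venet0 | has_venet_route 192.0.2.10 \
#       || echo "no host route for 192.0.2.10"
#   sysctl net.ipv4.ip_forward           # should be 1 on the host
#   vzctl restart 760                    # may re-create the route
```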
 