abrupt power outage.... proxmox will not boot

Chris Rivera · Feb 7, 2013

A few hours ago there was a power outage, and while all the other nodes came back online with no problem, one of our nodes is not working.

It's saying:

journal block not found
could not locate journal superblock
could not locate journal inode..
cannot mount pve 3 pve mounts

target filesystem doesn't have requested /sbin/init

Then it loads BusyBox (Debian 1:1.17.1-8)

/bin/sh: can't access tty; job control turned off

I found a related post in the Proxmox 1.0 section that said to run:

update-initramfs -k all -u

but I get "update-initramfs not found".

any help would be greatly appreciated!

trey85stang · Feb 7, 2013

I would think you lost a drive or your bootloader was corrupted; I would boot to live media, do a damage assessment/backup, and plan on a reinstall.

Chris Rivera · Feb 7, 2013

Ok.. I can use a live CD from Ubuntu 12.04 to mount the system and run checks, but this is a little beyond my understanding.

What would be the proper way to mount the filesystem so that I can run an fsck to check the disk?

sudo mount /dev/sda /mnt?
sudo mount /dev/sda1 /mnt/boot?

I really appreciate any help I can get...

I'm hoping to exhaust all efforts to fix the system before backing up VM data and conf files and reinstalling a new node. Let me know.

mir · Feb 7, 2013

I have been told that an Ubuntu Server CD should not only be able to start the system but is also capable of mounting the system automatically in a chroot environment.

udo · Feb 7, 2013

Hi,
I like grml for that (use a 64-bit version because of the chroot that mir mentioned).

But anyway,
sda1 is normally /boot and should be mountable (perhaps it needs an fsck).
The LVM volume group pve is on sda2 - look with

Code:

pvscan
vgscan
lvscan
pvs
vgs
lvs

Udo

Chris Rivera · Feb 7, 2013

Mir,
yes, you are correct, I can use Ubuntu to mount. Without the live CD I get errors when mounting the filesystem... parameter invalid.

#####

Udo,

i have mounted sda1 successfully and this is the boot partition... i can see grub config and files.

root@ubuntu:~# pvscan /dev/sda2
PV /dev/sda2 VG pve lvm2 [465.26 GiB / 15.99 GiB free]
Total: 1 [465.26 GiB] / in use: 1 [465.26 GiB] / in no VG: 0 [0 ]

root@ubuntu:~# fdisk -l /dev/sda

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000af768

Device Boot Start End Blocks Id System
/dev/sda1 * 2048 1048575 523264 83 Linux
/dev/sda2 1048576 976773119 487862272 8e Linux LVM

#####

So far I have tried everything I have found and have not been successful.

I followed HOWTO: recover lost partition after unexpected shutdown: http://ubuntuforums.org/archive/index.php/t-1682038.html

sudo fdisk -l /dev/sda

Disk /dev/sda: 500.1 GB, 500107862016 bytes
255 heads, 63 sectors/track, 60801 cylinders, total 976773168 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x000af768

Device Boot Start End Blocks Id System
/dev/sda1 * 2048 1048575 523264 83 Linux
/dev/sda2 1048576 976773119 487862272 8e Linux LVM

sudo debugfs -w /dev/sda1
debugfs 1.41.11 (14-Mar-2010)
debugfs: clri <8>
debugfs: quit

sudo fsck -yv /dev/sda1

finished with no problems

then rebooted with Alt+SysRq (Print Screen)+R+E+I+S+U+B

reboots... then i still see the same issue..

udo · Feb 7, 2013

Hi,
your partition isn't lost.
pvscan also shows the VG pve - what happens with vgscan?
And if vgscan works, can you activate the VG with "vgchange -a y pve"?
Perhaps your LVs are then active and you can do an "fsck /dev/pve/root"?

Udo
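
A minimal sketch of the sequence Udo suggests, assuming a live CD with the LVM tools installed and a root shell. The `run` wrapper and the `DRY_RUN` switch are illustrative helpers, not part of LVM; with the default `DRY_RUN=1` the script only prints what it would do.

```shell
#!/bin/sh
# Sketch only: scan for the volume group, activate it, then check the root LV.
# DRY_RUN=1 (default) prints the commands; set DRY_RUN=0 to actually run them as root.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

check_root_lv() {
    vg="$1"   # volume group name ("pve" in this thread)
    lv="$2"   # logical volume to check ("root" in this thread)
    run vgscan                  # look for volume groups on all disks
    run vgchange -a y "$vg"     # activate every LV in the group
    run lvscan                  # the LVs should now be listed as ACTIVE
    run fsck "/dev/$vg/$lv"     # check the filesystem while it is not mounted
}

check_root_lv pve root
```

Running it as shown prints the four commands in order; rerun with `DRY_RUN=0` only once the printed plan looks right.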

Chris Rivera · Feb 7, 2013

root@ubuntu:~# lvscan
inactive '/dev/pve/swap' [15.00 GiB] inherit
inactive '/dev/pve/root' [96.00 GiB] inherit
inactive '/dev/pve/data' [338.27 GiB] inherit

How can I activate them so I can run fsck?

#####

root@ubuntu:~# pvscan /dev/sda
PV /dev/sda2 VG pve lvm2 [465.26 GiB / 15.99 GiB free]
Total: 1 [465.26 GiB] / in use: 1 [465.26 GiB] / in no VG: 0 [0 ]

root@ubuntu:~# vgscan /dev/sda
Too many parameters on command line
Run `vgscan --help' for more information.

root@ubuntu:~# lvscan /dev/sda
No additional command line arguments allowed
Run `lvscan --help' for more information.

root@ubuntu:~# pvs
PV VG Fmt Attr PSize PFree
/dev/sda2 pve lvm2 a- 465.26g 15.99g

root@ubuntu:~# vgs
VG #PV #LV #SN Attr VSize VFree
pve 1 3 0 wz--n- 465.26g 15.99g

root@ubuntu:~# pvscan
PV /dev/sda2 VG pve lvm2 [465.26 GiB / 15.99 GiB free]
Total: 1 [465.26 GiB] / in use: 1 [465.26 GiB] / in no VG: 0 [0 ]

root@ubuntu:~# vgscan
Reading all physical volumes. This may take a while...
Found volume group "pve" using metadata type lvm2

root@ubuntu:~# lvscan
inactive '/dev/pve/swap' [15.00 GiB] inherit
inactive '/dev/pve/root' [96.00 GiB] inherit
inactive '/dev/pve/data' [338.27 GiB] inherit

root@ubuntu:~# pvs
PV VG Fmt Attr PSize PFree
/dev/sda2 pve lvm2 a- 465.26g 15.99g

root@ubuntu:~# vgs
VG #PV #LV #SN Attr VSize VFree
pve 1 3 0 wz--n- 465.26g 15.99g

root@ubuntu:~# lvs
LV VG Attr LSize Origin Snap% Move Log Copy% Convert
data pve -wi--- 338.27g
root pve -wi--- 96.00g
swap pve -wi--- 15.00g

#####
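
The "inactive" state in the lvscan output above is the reason there is nothing to fsck yet: an inactive LV has no usable device node. As a small sketch, the hypothetical helper `inactive_lvs` below pulls the inactive device paths out of lvscan output; the sample lines are the ones posted above.

```shell
#!/bin/sh
# Print the device path of every LV that lvscan reports as inactive.
# Reads lvscan output on stdin.
inactive_lvs() {
    awk -v q="'" '$1 == "inactive" { gsub(q, "", $2); print $2 }'
}

# Demo with the lvscan output from this thread.
inactive_lvs <<'EOF'
  inactive          '/dev/pve/swap' [15.00 GiB] inherit
  inactive          '/dev/pve/root' [96.00 GiB] inherit
  inactive          '/dev/pve/data' [338.27 GiB] inherit
EOF
# prints:
# /dev/pve/swap
# /dev/pve/root
# /dev/pve/data
```

On the live system the input would come from `lvscan | inactive_lvs`. Activating the whole group with `vgchange -a y pve`, or a single volume with `lvchange -a y /dev/pve/root`, should flip the state to ACTIVE.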

Chris Rivera · Feb 7, 2013

i was able to activate the logical volumes by using

vgchange -a y pve

Now I can access the data..... At this point do I create a new system and copy the data over? Or is this system recoverable?

udo · Feb 7, 2013

Chris Rivera said:
i was able to activate the logical volumes by using

vgchange -a y pve

yes - like I wrote before...

Now I can access the data..... At this point do I create a new system and copy the data over? Or is this system recoverable?

If you can mount pve-root and pve-data (run an fsck first), the system is recoverable, of course!
The question is why it doesn't boot after that (but perhaps booting is OK now?)

Udo

Chris Rivera · Feb 7, 2013

How and where should I mount these logical volumes to run fsck?

It's different from mounting partitions.

Chris Rivera · Feb 7, 2013

I was able to access the data portion only

i was able to activate the volume.... find the volume name then mount it.

mount /dev/pve/data /mnt/data

mounting swap failed --- I don't believe I need this
mounting root failed --- bad superblock error

root@ubuntu:# mount /dev/pve/root /mnt/root
mount: wrong fs type, bad option, bad superblock on /dev/mapper/pve-root,
missing codepage or helper program, or other error
In some cases useful info is found in syslog - try
dmesg | tail or so

#####

root@ubuntu:/mnt/data# dmesg | tail
[ 5956.100673] kjournald starting. Commit interval 5 seconds
[ 5956.100828] EXT3-fs (dm-2): warning: mounting unchecked fs, running e2fsck is recommended
[ 5956.101045] EXT3-fs (dm-2): using internal journal
[ 5956.101050] EXT3-fs (dm-2): mounted filesystem with ordered data mode
[ 6059.381175] journal_bmap: journal block not found at offset 0 on dm-1
[ 6059.381179] journal_init_inode: Cannot locate journal superblock
[ 6059.381183] EXT3-fs (dm-1): error: could not load journal inode
[ 6151.019979] journal_bmap: journal block not found at offset 0 on dm-1
[ 6151.019985] journal_init_inode: Cannot locate journal superblock
[ 6151.019988] EXT3-fs (dm-1): error: could not load journal inode

#####

At this point I can access the client data but do not have access to the conf files, which I believe are on the root volume.

How do I fix the superblock issue?
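
The dmesg lines above name the failing device only as dm-1. Since the data volume mounted cleanly (the dm-2 lines) and the mount of /dev/mapper/pve-root failed, dm-1 is most likely pve-root. A rough sketch for pulling the affected device names out of the kernel log; `journal_error_devices` is a hypothetical helper name, and the pattern only matches the ext3 message shown in this thread.

```shell
#!/bin/sh
# Print each device for which the kernel failed to load an ext3 journal.
# Reads kernel log text (for example from dmesg) on stdin.
journal_error_devices() {
    sed -n 's/.*EXT3-fs (\([^)]*\)): error: could not load journal inode.*/\1/p' | sort -u
}

# Demo with lines from the dmesg output in this thread.
journal_error_devices <<'EOF'
[ 5956.101045] EXT3-fs (dm-2): using internal journal
[ 6059.381183] EXT3-fs (dm-1): error: could not load journal inode
[ 6151.019988] EXT3-fs (dm-1): error: could not load journal inode
EOF
# prints: dm-1
```

On the live system, `dmesg | journal_error_devices` would give the same list, and `ls -l /dev/mapper` usually shows which LV each dm-N name belongs to.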

mir · Feb 7, 2013

Try this: fsck /dev/pve/root

But before you do, it would be a good idea to make a backup: dd if=/dev/pve/root of=/root/root.img
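
A hedged sketch of the backup-then-check idea. It assumes the image is written to the data volume mounted at /mnt/data (as done earlier in the thread) and that the volume has room for the 96 GiB root LV; on a live CD, /root normally lives in RAM. `backup_image` is a hypothetical helper that copies with dd and then verifies the copy with cmp.

```shell
#!/bin/sh
# Copy SRC to DEST with dd, then verify the copy byte for byte with cmp.
# Returns non-zero if either step fails.
backup_image() {
    src="$1"
    dest="$2"
    dd if="$src" of="$dest" bs=4M && cmp -s "$src" "$dest"
}

# Intended use from the live CD (needs root; not executed by this sketch):
#   mount /dev/pve/data /mnt/data
#   backup_image /dev/pve/root /mnt/data/root.img && fsck /dev/pve/root
```

The cmp pass reads the whole volume a second time, so it roughly doubles the run time; it is there so that fsck only starts once the image is known to match.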

Chris Rivera · Feb 8, 2013

completed but when rebooted still will not load

typo3usa · Feb 8, 2013

mir said:
Try this: fsck /dev/pve/root

But before you do, it would be a good idea to make a backup: dd if=/dev/pve/root of=/root/root.img

Awesome advice.

Do you have a remote KVM? If so, I can get someone from our DC/NOC staff to help.
All is not lost yet - it sounds like we may just need to run a check against an alternate superblock.

That might help find the journal entries.

udo · Feb 8, 2013

Chris Rivera said:
completed but when rebooted still will not load

Hi,
if you need help, give more input PLEASE!!

What do you mean by "completed"?
The fsck? Any output? Can you mount the root partition after that?

Or the copy with dd? In that case your copy is lost, because the original command copies the root partition to root's home directory, which exists only in memory on a live CD...

BTW, access to the data partition is mostly enough to get the VMs back - but you must build a new config for each VM (I assume it's a single host?).
If this host is part of a cluster, you are in luck - all configs are stored on all hosts.

Udo

typo3usa · Feb 8, 2013

I am going to assume this is on /dev/sda1 (if not, substitute the appropriate device). Run this command:

Code:

dumpe2fs /dev/sda1 | grep superblock

Once you find an alternate superblock from that list - run a fsck against it.

Code:

fsck -b 32768 /dev/sda1
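
As a sketch of the same idea, the hypothetical helper `backup_superblocks` below extracts the backup superblock numbers from dumpe2fs output. The sample input is illustrative text in the usual dumpe2fs layout, not output from this system. The earlier dmesg output reports the journal errors while mounting /dev/pve/root, so the usage comment takes the device from a variable rather than assuming /dev/sda1.

```shell
#!/bin/sh
# Print the block number of every backup superblock listed by dumpe2fs.
# Reads dumpe2fs output on stdin; the primary superblock line is skipped.
backup_superblocks() {
    awk '/Backup superblock at/ { sub(/,$/, "", $4); print $4 }'
}

# Demo with illustrative dumpe2fs-style lines (assumed, not from this thread).
backup_superblocks <<'EOF'
  Primary superblock at 0, Group descriptors at 1-1
  Backup superblock at 32768, Group descriptors at 32769-32769
  Backup superblock at 98304, Group descriptors at 98305-98305
EOF
# prints:
# 32768
# 98304

# Intended use (needs root and an unmounted filesystem; not executed here):
#   DEV=/dev/pve/root
#   dumpe2fs "$DEV" | backup_superblocks
#   e2fsck -b 32768 "$DEV"    # substitute a block number from the list
```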

My suggestion, however, would be to use RescueCD and start from there.
We have this set up in our PXE booter.

If you have another node online and want a copy of that PXE image (KVM), let me know - it has a few other distros in it.
Anyhow -

Make sure that /dev/sdxxxxx is not mounted, of course.
I have a feeling that this may have been mounted when you originally did the fsck.
(Call it experience from some years ago ... cough cough)

In the meantime - we are praying you get back up and running quickly.

Chris Rivera · Feb 11, 2013

Update: I was finally able to mount the pve LVM volumes, archive pve-data with tar, and untar the file on node 3. I copied the config files so Proxmox knows the VMs are on node 3, then booted the VMs.
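
For anyone repeating this, a rough sketch of a tar-based copy. It streams through a pipe rather than writing an intermediate archive file, which is a variation on what is described above. `copy_tree` is a hypothetical helper; the host name node3 and the target path /var/lib/vz in the comment are assumptions.

```shell
#!/bin/sh
# Copy the contents of directory SRC into directory DST, preserving permissions.
copy_tree() {
    src="$1"
    dst="$2"
    mkdir -p "$dst" &&
        tar -C "$src" -cpf - . | tar -C "$dst" -xpf -
}

# Same idea across hosts (assumed names; needs root and ssh access; not run here):
#   tar -C /mnt/data -cpf - . | ssh root@node3 'tar -C /var/lib/vz -xpf -'
```

When the archive is created from a live CD, adding `--numeric-owner` to both tar commands may be worth considering so that file ownership is not remapped through the live system's user names.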

Now here comes my new network issue.

All VMs with venet have no incoming or outgoing connections. They can only ping the main host node.
All VPSes with veth work just fine...

How do I get Proxmox to rebuild all the venet network routing?

Chris Rivera · Feb 11, 2013

VM ifconfig:

[root@VM760 /]# ifconfig
lo Link encap:Local Loopback
inet addr:127.0.0.1 Mask:255.0.0.0
inet6 addr: ::1/128 Scope:Host
UP LOOPBACK RUNNING MTU:16436 Metric:1
RX packets:16 errors:0 dropped:0 overruns:0 frame:0
TX packets:16 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:1468 (1.4 KiB) TX bytes:1468 (1.4 KiB)

venet0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
inet addr:127.0.0.1 P-t-P:127.0.0.1 Bcast:0.0.0.0 Mask:255.255.255.255
UP BROADCAST POINTOPOINT RUNNING NOARP MTU:1500 Metric:1
RX packets:5 errors:0 dropped:0 overruns:0 frame:0
TX packets:12 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:0
RX bytes:420 (420.0 b) TX bytes:865 (865.0 b)

venet0:0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
inet addr:199.195.213.*** P-t-P:199.195.213.*** Bcast:199.195.213.*** Mask:255.255.255.255
UP BROADCAST POINTOPOINT RUNNING NOARP MTU:1500 Metric:1

venet0:1 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
inet addr:199.195.214.*** P-t-P:199.195.214.*** Bcast:199.195.214.*** Mask:255.255.255.255
UP BROADCAST POINTOPOINT RUNNING NOARP MTU:1500 Metric:1

#############

vm route:

[root@VM760 /]# route

Kernel IP routing table
Destination Gateway Genmask Flags Metric Ref Use Iface
169.254.0.0 * 255.255.0.0 U 0 0 0 venet0
default * 0.0.0.0 U 0 0 0 venet0

#############

hostnode ifconfig:

root@proxmox3a:~# ifconfig | grep 760

#############
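
The empty result from `ifconfig | grep 760` is probably expected: venet containers share the host's single venet0 interface instead of getting a per-container interface the way veth containers do. What the host should have instead is one route per container IP pointing at venet0. A minimal diagnostic sketch follows; `has_venet_route` is a hypothetical helper, and 192.0.2.10 is a placeholder address because the real ones are masked above.

```shell
#!/bin/sh
# Succeed if the routing table text on stdin contains a host route
# for the given IP address via venet0.
has_venet_route() {
    awk -v ip="$1" '$1 == ip && / dev venet0( |$)/ { found = 1 } END { exit !found }'
}

# Intended use on the host node (placeholder address; not executed here):
#   ip route show | has_venet_route 192.0.2.10 || echo "no venet0 route for this container"
#   cat /proc/sys/net/ipv4/ip_forward    # should print 1 for venet traffic to be forwarded
```

If the route turns out to be missing, restarting the container is one way vzctl would normally re-create it. That is an assumption about the cause, which the thread does not confirm.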
