VG restore on drbd cluster

bramb

New Member
Feb 15, 2016
I need help with the following: we had a two-node cluster with drbd and Proxmox 3.4. It worked fine until our second node failed. Because we were already transferring to a new cluster, we took the disks out of the second node and left it offline. But after a while the node became active again, and then the trouble started.

On the first node I ran pmxcfs -l so the second node doesn't disappear. But the problem now is that none of the VMs will boot on the first node anymore.
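(For reference: on Proxmox 3.x a single surviving node normally loses quorum and /etc/pve becomes read-only; a minimal sketch of the usual workaround, assuming the standard pve-cluster init script, is below.)

Code:
# stop the cluster filesystem service, then restart pmxcfs in local mode
# so /etc/pve stays writable without quorum
service pve-cluster stop
pmxcfs -l

# alternatively, lower the expected vote count for the cluster manager
pvecm expected 1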

I get these errors:
No volume groups found
No volume groups found
No volume groups found
Volume group "pve" not found
TASK ERROR: can't activate LV '/dev/pve/vm-122-disk-1': Skipping volume group pve

Then I ran the 'vgs' command and got this:
root@nd1:~# vgs

No volume groups found

When I check the status of drbd it says:
0:r0 WFConnection Primary/Unknown UpToDate/Inconsistent C
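(For reference, the DRBD connection, disk and role state can be queried directly at any time; a short sketch using the standard DRBD 8.x tools:)

Code:
cat /proc/drbd          # full resource state as reported by the kernel module
drbdadm cstate r0       # connection state, e.g. WFConnection
drbdadm dstate r0       # disk state, e.g. UpToDate/Inconsistent
drbdadm role r0         # role, e.g. Primary/Unknown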

My drbd configuration looks like this:

Code:
resource r0 {
    on RCLL001 {
        device    /dev/drbd0;
        disk      /dev/sda7;
        address   10.0.0.21:7788;
        meta-disk internal;
    }

    on RCLL002 {
        device    /dev/drbd0;
        disk      /dev/sda7;
        address   10.0.0.22:7788;
        meta-disk internal;
    }
}
What is the best thing I can do? I need the data of one VM that is currently stopped. Once I have recovered that, the whole server can go offline. Thanks!
 
I need help with the following: we had a two-node cluster with drbd and Proxmox 3.4. It worked fine until our second node failed. Because we were already transferring to a new cluster, we took the disks out of the second node and left it offline. But after a while the node became active again, and then the trouble started.
Hi,
how can a node with its disks pulled become active again??
On the first node I ran pmxcfs -l so the second node doesn't disappear. But the problem now is that none of the VMs will boot on the first node anymore.

I get these errors:
No volume groups found
No volume groups found
No volume groups found
Volume group "pve" not found
TASK ERROR: can't activate LV '/dev/pve/vm-122-disk-1': Skipping volume group pve

Then I ran the 'vgs' command and got this:
root@nd1:~# vgs

No volume groups found
What is the output of
Code:
pvs
pvscan
vgscan
grep filter /etc/lvm/lvm.conf
Udo
 
Why? A stupid dedicated server provider who didn't do their job.

Code:
root@RCLL001:~# pvs
root@RCLL001:~# pvscan
  No matching physical volumes found
root@RCLL001:~# vgscan
  Reading all physical volumes.  This may take a while...
  No volume groups found
root@RCLL001:~# grep filter /etc/lvm/lvm.conf
    # A filter that tells LVM2 to only use a restricted set of devices.
    # The filter consists of an array of regular expressions.  These
    # Don't have more than one filter line active at once: only one gets used.
    #filter = [ "a/.*/" ]
    filter = [ "r|/dev/sda7|", "r|/dev/disk/|", "a/.*/"]
    # filter = [ "r|/dev/cdrom|" ]
    # filter = [ "a/loop/", "r/.*/" ]
    # filter =[ "a|loop|", "r|/dev/hdc|", "a|/dev/ide|", "r|.*|" ]
    # filter = [ "a|^/dev/hda8$|", "r/.*/" ]
    # Since "filter" is often overriden from command line, it is not suitable
    # for system-wide device filtering (udev rules, lvmetad). To hide devices
    # global_filter. The syntax is the same as for normal "filter"
    # above. Devices that fail the global_filter are not even opened by LVM.
    # global_filter = []
    # The results of the filtering are cached on disk to avoid
    # mlock_filter = [ "locale/locale-archive", "gconv/gconv-modules.cache" ]
 
Why? A stupid dedicated server provider who didn't do their job.

Code:
root@RCLL001:~# pvs
root@RCLL001:~# pvscan
  No matching physical volumes found
root@RCLL001:~# vgscan
  Reading all physical volumes.  This may take a while...
  No volume groups found
root@RCLL001:~# grep filter /etc/lvm/lvm.conf
    # A filter that tells LVM2 to only use a restricted set of devices.
    # The filter consists of an array of regular expressions.  These
    # Don't have more than one filter line active at once: only one gets used.
    #filter = [ "a/.*/" ]
    filter = [ "r|/dev/sda7|", "r|/dev/disk/|", "a/.*/"]
    # filter = [ "r|/dev/cdrom|" ]
    # filter = [ "a/loop/", "r/.*/" ]
    # filter =[ "a|loop|", "r|/dev/hdc|", "a|/dev/ide|", "r|.*|" ]
    # filter = [ "a|^/dev/hda8$|", "r/.*/" ]
    # Since "filter" is often overriden from command line, it is not suitable
    # for system-wide device filtering (udev rules, lvmetad). To hide devices
    # global_filter. The syntax is the same as for normal "filter"
    # above. Devices that fail the global_filter are not even opened by LVM.
    # global_filter = []
    # The results of the filtering are cached on disk to avoid
    # mlock_filter = [ "locale/locale-archive", "gconv/gconv-modules.cache" ]
Hi,
remove
"r|/dev/sda7|",
from the filter and do a pvscan again (not sure if you need to reboot first).

I assume you are sure that you don't need drbd on this node and only want to reactivate one LV?!

But I'm wondering why you don't see other VGs like pve?!
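(To make the filter change concrete, a sketch of what the edited line and the rescan could look like, assuming the LV you need really sits on a PV directly on /dev/sda7; with DRBD internal metadata at the end of the device, the LVM label at the start stays readable:)

Code:
# in /etc/lvm/lvm.conf - patterns are evaluated in order, first match wins,
# so drop the "r|/dev/sda7|" entry:
filter = [ "r|/dev/disk/|", "a/.*/" ]

# then rescan and try to activate the VG (name taken from the error above):
pvscan
vgscan
vgchange -ay pve
lvs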


Udo
 
Hi,
remove
"r|/dev/sda7|",
from the filter and do a pvscan again (not sure if you need to reboot first).

I assume you are sure that you don't need drbd on this node and only want to reactivate one LV?!

But I'm wondering why you don't see other VGs like pve?!


Udo
Hi,
I have reread your first post - your volume group pve is on drbd??? Do you know what you are doing?

Udo
 
I'm going to reboot the server in about a week, because some customers are still running on those VMs.

The strange thing is that the disks of the running VMs are still shown in /dev/pve. After shutting down one of the running VMs, it also won't boot up anymore.
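(If the device-mapper nodes of the VM you need are still present under /dev/pve, one cautious option before any reboot is to copy the raw data straight off the still-active mapping; a sketch, with the destination path purely as an example:)

Code:
dmsetup ls                            # device-mapper targets still active in the kernel
dmsetup table pve-vm--122--disk--1    # check the mapping is still there
# copy the raw LV contents somewhere safe before touching LVM or rebooting
dd if=/dev/pve/vm-122-disk-1 of=/mnt/backup/vm-122-disk-1.raw bs=1M conv=noerror,sync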
 
I'm going to reboot the server in about a week, because some customers are still running on those VMs.

The strange thing is that the disks of the running VMs are still shown in /dev/pve. After shutting down one of the running VMs, it also won't boot up anymore.
Hi,
in this case I would not reboot until I know what the issue is!!

Again - pve is on drbd?? What does your installation look like?
If you have a standard installation (in which case pve isn't on drbd), your system disk is dead - this would explain why you see no VGs/PVs...

Any hints in the logfiles?

Udo
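(A quick way to look for signs of a dying system disk, as a sketch with standard Debian tools; smartmontools may need to be installed first:)

Code:
dmesg | grep -iE 'i/o error|ata|sd[a-z]'   # kernel messages about disk/SATA errors
grep -i 'i/o error' /var/log/syslog
smartctl -H /dev/sda                       # overall SMART health (package smartmontools)
smartctl -a /dev/sda                       # full SMART attributes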