can't activate LV '/dev/xxxx/vm-xxx-dsk': Failed to find logical volume

AndyBar

New Member
Apr 6, 2018
Configuration:
4-node Proxmox cluster:
PVE0 : pve-manager/5.1-46/ae8241d4
PVE1 : pve-manager/5.1-46/ae8241d4
PVE2 : pve-manager/5.1-46/ae8241d4
PVE3 : pve-manager/5.0-30/5ab26bc

Networking:
Each HOST has 4 NICs:
2 for the backend to the SAN switch (LACP 802.3ad) - one IP per HOST (no need for multipath)
2 for the frontend to the network switch (LACP 802.3ad) - one IP per HOST
OVS is used as the virtual switch.
cat /etc/network/interfaces (identical on all HOSTs except for the IPs, which change from HOST to HOST):
auto lo
iface lo inet loopback

auto eno1
iface eno1 inet manual

auto eno2
iface eno2 inet manual

auto eno3
iface eno3 inet manual
        mtu 8996

auto eno4
iface eno4 inet manual
        mtu 8996

allow-vmbr1 bond1
iface bond1 inet manual
        ovs_bonds eno1 eno2
        ovs_type OVSBond
        ovs_bridge vmbr1
        ovs_options bond_mode=balance-tcp lacp=active other_config:lacp-time=fast trunks=11,12,13,14,15,16,30,31

allow-vmbr1 mgmt_vlan30
iface mgmt_vlan30 inet static
        address 10.255.30.202
        netmask 255.255.255.0
        gateway 10.255.30.1
        ovs_type OVSIntPort
        ovs_bridge vmbr1
        ovs_options tag=30

auto vmbr1
iface vmbr1 inet manual
        ovs_type OVSBridge
        ovs_ports bond1 mgmt_vlan30

allow-vmbr2 bond2
iface bond2 inet manual
        ovs_bonds eno3 eno4
        ovs_type OVSBond
        ovs_bridge vmbr2
        ovs_options bond_mode=balance-tcp lacp=active other_config:lacp-time=fast
        mtu 8996

allow-vmbr2 iscsi_vlan5
iface iscsi_vlan5 inet static
        address 10.0.5.12
        netmask 255.255.255.0
        ovs_type OVSIntPort
        ovs_bridge vmbr2
        mtu 8996

auto vmbr2
iface vmbr2 inet manual
        ovs_type OVSBridge
        ovs_ports bond2 iscsi_vlan5
        mtu 8996
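
The LACP / bond state on the OVS side can be checked with the standard OVS tools (shown here for bond2 on the storage network - adjust the port name as needed):

Code:
ovs-vsctl show
ovs-appctl bond/show bond2
ovs-appctl lacp/show bond2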

STORAGE:
iSCSI SAN storage (Nimble CS210), shared by all of the HOSTs.

Problem:
After some time, LVs disappear and the affected VMs are obviously not bootable.
lvs shows no sign of the LVs.

Our findings (sometimes):

1. The VM that does not boot cannot find its LV disk.
2. The LV disk appears to exist on another data store (the console output below and the attached GUI screenshot do not match).

root@pve2:/dev/QA# ls -la
total 0
drwxr-xr-x 2 root root 100 Apr 6 10:10 .
drwxr-xr-x 24 root root 4940 Apr 6 10:10 ..
...
lrwxrwxrwx 1 root root 8 Apr 6 10:10 vm-132-disk-1 -> ../dm-20
...
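
For reference, these are the kinds of checks that can be run when this happens (assuming the VG is called QA as in the listing above - adjust the VG name as needed):

Code:
# is the iSCSI session to the SAN still up?
iscsiadm -m session
# does LVM still see the PV / VG / LVs on the shared LUN?
pvs
vgs
lvscan
# try to (re)activate all LVs in the VG
vgchange -ay QA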

Any thoughts ?? Please help
 
Hi,

What version do you use?
Code:
pveversion -v

Why do you use a switch on the storage network and not just a bond?
 
Hello Wolfgang,

Thank you for your reply.

Here is the output of the command:

root@pve2:/etc/lvm/archive# pveversion -v
proxmox-ve: 5.1-41 (running kernel: 4.13.13-6-pve)
pve-manager: 5.1-46 (running version: 5.1-46/ae8241d4)
pve-kernel-4.13.13-6-pve: 4.13.13-41
pve-kernel-4.10.17-2-pve: 4.10.17-20
corosync: 2.4.2-pve3
criu: 2.11.1-1~bpo90
glusterfs-client: 3.8.8-1
ksm-control-daemon: 1.2-2
libjs-extjs: 6.0.1-2
libpve-access-control: 5.0-8
libpve-common-perl: 5.0-28
libpve-guest-common-perl: 2.0-14
libpve-http-server-perl: 2.0-8
libpve-storage-perl: 5.0-17
libqb0: 1.0.1-1
lvm2: 2.02.168-pve6
lxc-pve: 2.1.1-3
lxcfs: 2.0.8-2
novnc-pve: 0.6-4
openvswitch-switch: 2.7.0-2
proxmox-widget-toolkit: 1.0-11
pve-cluster: 5.0-20
pve-container: 2.0-19
pve-docs: 5.1-16
pve-firewall: 3.0-5
pve-firmware: 2.0-3
pve-ha-manager: 2.0-5
pve-i18n: 1.0-4
pve-libspice-server1: 0.12.8-3
pve-qemu-kvm: 2.9.1-9
pve-xtermjs: 1.0-2
qemu-server: 5.0-22
smartmontools: 6.5+svn4324-1
spiceterm: 3.0-5
vncterm: 1.5-3
zfsutils-linux: 0.7.6-pve1~bpo9
==================================================================

We decided to use OVS as the virtual switch solution for both frontend and backend. Is there a problem with that?
 
We decided to use OVS as the virtual switch solution for both frontend and backend. Is there a problem with that?
No, but it brings no benefit,
only the possibility of configuration issues
and more latency because the network stack has one more component.
I would recommend using only the bond.

In newer OVS versions you don't have to subtract the VLAN bytes from the MTU, so you should use 9000 instead of 8996.

Code:
iface eno3 inet manual

iface eno4 inet manual

auto bond0
iface bond0 inet manual
        slaves eno3 eno4
        bond_miimon 100
        bond_mode 802.3ad
        bond_xmit_hash_policy layer2
        mtu 9000

auto bond0.5
iface bond0.5 inet static
        address 10.0.5.12
        netmask 255.255.255.0
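
To verify the bond and the jumbo frame MTU afterwards, something like this can be used (10.0.5.1 is only an example address on the storage VLAN; 8972 = 9000 bytes minus 28 bytes of IP/ICMP headers):

Code:
cat /proc/net/bonding/bond0
ip -d link show bond0.5
ping -M do -s 8972 10.0.5.1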
 
Thank you Wolfgang! I will try to make the change on one of the HOSTs and test your config. However, the initial problem is still not resolved.... The fact that I am using OVS for the backend should not make LVs disappear, unless my OVS config is wrong... Or maybe Proxmox does not like the LACP setup as opposed to multipathing? This only happens on the shared storage; on local storage the VM disks do not disappear.
 