Cluster, iSCSI and LVM

copymaster

Member
Nov 25, 2009
Hi.
I have a test scenario with 2 servers, both with local drives (RAID5, 2 TB).

On Server A I mounted an iSCSI LUN from a network file server. Then I added an LVM volume group using this LUN.

Then I installed a Windows 7 VM on that storage.

After that, I created a cluster with Server A as master and Server B as node, and waited until they were in sync.

Now I wanted to do a live migration test. I selected the running Win7 VM and migrated it (online) to the node. That seemed to work, but then the VM stopped.

So I had a look at the master, and the web interface told me that the VM was on the node. So it seems the VM did get migrated to the node.

So I tried to start the VM on the node, but it told me "you don't have write access".

I am not able to start the VM on the node. Unfortunately, I am not even able to migrate the VM back to the master.

The error is:

/usr/bin/ssh -t -t -n -o BatchMode=yes 192.168.0.72 /usr/sbin/qmigrate --online 192.168.0.71 101
Volume group "pve1" not found
command '/sbin/vgchange -aly pve1' failed with exit code 5
Dec 09 16:29:16 starting migration of VM 101 to host '192.168.0.71'
Dec 09 16:29:16 copying disk images
Volume group "pve1" not found
Dec 09 16:29:17 Failed to sync data - command '/sbin/vgchange -aly pve1' failed with exit code 5
Dec 09 16:29:17 migration aborted
Connection to 192.168.0.72 closed.

Can I use VMs on iSCSI in a cluster config?
 
Hi.
I have a test scenario with 2 servers, both with local drives (RAID5, 2 TB).

On Server A I mounted an iSCSI LUN from a network file server. Then I added an LVM volume group using this LUN.

Then I installed a Windows 7 VM on that storage.

After that, I created a cluster with Server A as master and Server B as node, and waited until they were in sync.

Did you wait until the cluster was in sync?

Now I wanted to do a live migration test. I selected the running Win7 VM and migrated it (online) to the node. That seemed to work, but then the VM stopped.

The console disappears, and the Win7 VM should then be running on the node.

So I had a look at the master, and the web interface told me that the VM was on the node. So it seems the VM did get migrated to the node.

So I tried to start the VM on the node, but it told me "you don't have write access".

You can only manage VMs from the master.

I am not able to start the VM on the node. Unfortunately, I am not even able to migrate the VM back to the master.

The error is:

/usr/bin/ssh -t -t -n -o BatchMode=yes 192.168.0.72 /usr/sbin/qmigrate --online 192.168.0.71 101
Volume group "pve1" not found
command '/sbin/vgchange -aly pve1' failed with exit code 5
Dec 09 16:29:16 starting migration of VM 101 to host '192.168.0.71'
Dec 09 16:29:16 copying disk images
Volume group "pve1" not found
Dec 09 16:29:17 Failed to sync data - command '/sbin/vgchange -aly pve1' failed with exit code 5
Dec 09 16:29:17 migration aborted
Connection to 192.168.0.72 closed.

Can I use VMs on iSCSI in a cluster config?

Yes, you did everything right. Check whether you have the identical storage definition on both nodes (this is synchronized automatically, see /etc/pve/storage.cfg) and, as always, use the latest version on both nodes. Any other error logs?
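For example, to rule out a mismatch (a minimal sketch; 192.168.0.72 is the node's address from the log above):

Code:
# run on the master: compare the synchronized config with the node's copy
ssh 192.168.0.72 cat /etc/pve/storage.cfg | diff /etc/pve/storage.cfg -
# no output means the files are identical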
 
Thanks for the reply.

Yes, I waited until both servers were synced.
The VM console didn't disappear; the VM got stopped during migration.

The /etc/pve/storage.cfg files are identical on master and node.

There's only one other difference between the two machines:

On the master's web interface, the storage screen shows (paraphrased, only the relevant lines):

MASTER:
Name       Enabled  Active  Shared  Used
iscsitest  yes      n/a     n/a     n/a     (I think this line is correct, but it's not relevant)
lvmtest    yes      yes     yes     32,00

NODE:
Name       Enabled  Active  Shared  Used
iscsitest  yes      n/a     n/a     n/a     (I think this line is correct, but it's not relevant)
lvmtest    yes      NO      yes     32,00

Any other ideas?
 
The storage needs to be active on the node too. Check why lvmtest is not active on the node.
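To dig into that by hand, the usual LVM sequence is (a minimal sketch; 'pve1' is the volume group named in the error above):

Code:
# on the node: is the underlying iSCSI disk visible to the kernel at all?
cat /proc/partitions
# rescan for physical volumes and volume groups
pvscan
vgscan
# if the VG is found, activate all of its logical volumes locally
vgchange -aly pve1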
 
Hi.

Can you please tell me how I can set the LVM volume group to active on the node?

As I said, /etc/pve/storage.cfg is identical on master and node. I don't know how to find out why the LVM group is not active on the node...
 
Check syslog and dmesg.
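For example (a minimal sketch):

Code:
# kernel messages about the iSCSI disk and SCSI devices
dmesg | grep -i -e iscsi -e 'sd '
# initiator- and LVM-related syslog entries
grep -i -e iscsi -e lvm /var/log/syslog | tail -n 50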
 
Hi.

The master shows in dmesg:
sd 11:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
sd 11:0:0:0: [sdb] 104872320 512-byte hardware sectors (53695 MB)
sd 11:0:0:0: [sdb] Write Protect is off
sd 11:0:0:0: [sdb] Mode Sense: bd 00 00 08
sd 11:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
sd 11:0:0:0: [sdb] 104872320 512-byte hardware sectors (53695 MB)
sd 11:0:0:0: [sdb] Write Protect is off
sd 11:0:0:0: [sdb] Mode Sense: bd 00 00 08
sd 11:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
sd 11:0:0:0: [sdb] 104872320 512-byte hardware sectors (53695 MB)
sd 11:0:0:0: [sdb] Write Protect is off
sd 11:0:0:0: [sdb] Mode Sense: bd 00 00 08
sd 11:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
sd 11:0:0:0: [sdb] 104872320 512-byte hardware sectors (53695 MB)
sd 11:0:0:0: [sdb] Write Protect is off
sd 11:0:0:0: [sdb] Mode Sense: bd 00 00 08
sd 11:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
sd 11:0:0:0: [sdb] 104872320 512-byte hardware sectors (53695 MB)
sd 11:0:0:0: [sdb] Write Protect is off
sd 11:0:0:0: [sdb] Mode Sense: bd 00 00 08
sd 11:0:0:0: [sdb] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA


This is the iSCSI LUN.

The node shows in dmesg:
audit(1260453808.665:2): dev=eth0 prom=256 old_prom=0 auid=4294967295
igb: eth0: igb_watchdog_task: NIC Link is Up 100 Mbps Full Duplex, Flow Control: None
vmbr0: port 1(eth0) entering learning state
vmbr0: topology change detected, propagating
vmbr0: port 1(eth0) entering forwarding state
Loading iSCSI transport class v2.0-724.
iscsi: registered transport (tcp)
iscsi: registered transport (iser)
NET: Registered protocol family 10
ip_tables: (C) 2000-2006 Netfilter Core Team
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
loaded kvm module (kvm-kmod-2.6.30.1)
eth0: no IPv6 routers present
vmbr0: no IPv6 routers present
3w-9xxx: scsi0: AEN: INFO (0x04:0x000C): Initialize started:unit=0.
scsi5 : iSCSI Initiator over TCP/IP
Loading iSCSI transport class v2.0-724.
iscsi: registered transport (tcp)
iscsi: registered transport (iser)


That's all...
Any ideas?
 
This only shows that you are not running the latest version (but I doubt that this is the cause of your problem).

I am out of ideas; without access to the machine I cannot help further.
 
Well,
I used an iSCSI share from a NetApp. I created this share using the wizard (FilerView).

I created the iSCSI LUN and the LVM group on the first server BEFORE I created the cluster.

Could that be the reason?

Or is it possible that a NetApp iSCSI LUN can only be mounted once?

By the way: in a cluster config, does every node mount that iSCSI LUN?

I couldn't find any mounts on either the master or the node indicating that an iSCSI LUN is mounted.

But on the master, fdisk /dev/sdc makes that LUN visible;
on the node, fdisk only says: no device.
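A way to compare LUN visibility on both hosts without fdisk (a minimal sketch; the SCSI id prefix is an example):

Code:
# is there an active iSCSI session to the target?
iscsiadm -m session
# which block devices does the kernel know about?
cat /proc/partitions
# iSCSI disks also show up under a stable id
ls -l /dev/disk/by-id/ | grep 360a98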

I am in a difficult situation, because I would like to use Proxmox in production next year with iSCSI LUNs and a cluster config. But if this test scenario fails, the whole project fails.

Do you have any other suggestions?
 
I created the iSCSI LUN and the LVM group on the first server BEFORE I created the cluster.

Could that be the reason?

No.

Or is it possible that a NetApp iSCSI LUN can only be mounted once?

Yes. Please check the access settings on the NetApp.
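On the filer that means the igroup/LUN mapping; a hedged sketch for Data ONTAP 7-mode (names in angle brackets are placeholders); every host's initiator IQN must be a member of the igroup the LUN is mapped to:

Code:
netapp> igroup show              # initiator groups and their member IQNs
netapp> lun show -m              # which igroup each LUN is mapped to
netapp> igroup add <igroup> <node-initiator-iqn>   # allow the node as well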

By the way: in a cluster config, does every node mount that iSCSI LUN?

I couldn't find any mounts on either the master or the node indicating that an iSCSI LUN is mounted.

'Mounted' is the wrong word. The LV is activated as soon as it is used by a VM. You can also use the 'pvesm' command to list storage content, for example:

# pvesm list -a

See 'man pvesm' for details. Maybe that command gives further info about the problem?
 
OK, I tested that command and got:

On the master:

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "de_DE.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
iscsilun1:0.0.0.scsi-360a9800068706575566f546134704f4a 0 raw 52436160

On the node:

perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "de_DE.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
Volume group "pve1" not found
command '/sbin/vgchange -aly pve1' failed with exit code 5
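The perl locale warnings are unrelated noise; on a Debian-based host they can usually be silenced like this (a minimal sketch):

Code:
# generate the missing de_DE.UTF-8 locale
dpkg-reconfigure locales
# or just force a known locale for the current session
export LC_ALL=C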
 
Hi,
I will post the outputs of the commands shown above later.
In the meantime I have verified the error.

I installed the two servers from the Proxmox CD and created a cluster.
Then I connected to the NODE, went to storage, and tested whether I could add the iSCSI LUN there. That showed an error, too.

So it doesn't seem to be the NetApp iSCSI LUN; it seems to be a problem with the cluster.

Now I will do the following:

Install everything fresh (this time 3 servers) with Proxmox.

Next I will create a cluster. After sync, I will try to add the iSCSI LUN on the master ("use LUN directly": off) and wait until sync.

After that, I will try to add an LVM volume group using this LUN on the master and again wait until sync.

Then I will log in to the master, create a KVM VM on one of the nodes, and see whether the error persists.

If that fails, I will send the outputs of the commands shown above and hope you can solve this error.
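If it helps to script those steps, the cluster part can also be done from the shell; a sketch from memory, assuming the pveca tool as shipped with Proxmox VE 1.x (the master address is a placeholder):

Code:
# on the designated master: create the cluster
pveca -c
# on each node: join the master
pveca -a -h <master-ip>
# on any member: show cluster state and sync status
pveca -l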

As I said before, this should go into production, and before we start spending money on this software we would like to see some success in this test environment.

We will try it out and simulate failing servers and HDDs, so we can be certain that this software is usable for our needs.
I would greatly appreciate your help during this test phase.
 
We will try it out and simulate failing servers and HDDs, so we can be certain that this software is usable for our needs.
I would greatly appreciate your help during this test phase.

Can you please post your /etc/pve/storage.cfg file?
 
OK, here's /etc/pve/storage.cfg (it's the same on master and the nodes):

Code:
dir: local
    path /var/lib/vz
    content images,iso,vztmpl,rootdir

iscsi: testlun1
    portal 192.168.0.93
    target iqn.1992-08.com.netapp:01.896ebee516
    content none

lvm: lvm1
    vgname vgrp1
    base testlun1:0.0.0.scsi-360a9800068706575566f546258357153
    shared
    content images
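Given this definition, each host must see the base device before vgrp1 can be activated; one way to check on master and nodes alike (a minimal sketch):

Code:
# the device node behind the 'base' volume must exist on every host
ls -l /dev/disk/by-id/scsi-360a9800068706575566f546258357153
# and LVM should list it as the physical volume of vgrp1
pvs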
######################################################################
Output of the command pvdisplay on the MASTER:

Code:
Donald:~# pvdisplay
  --- Physical volume ---
  PV Name               /dev/sdb
  VG Name               vgrp1
  PV Size               50,00 GB / not usable 4,00 MB
  Allocatable           yes 
  PE Size (KByte)       4096
  Total PE              12799
  Free PE               12799
  Allocated PE          0
  PV UUID               cYuvk0-lOCS-V9Jd-1PTv-1Fa9-8XSJ-SynCpN
   
  --- Physical volume ---
  PV Name               /dev/sda2
  VG Name               pve
  PV Size               278,87 GB / not usable 3,34 MB
  Allocatable           yes 
  PE Size (KByte)       4096
  Total PE              71391
  Free PE               1022
  Allocated PE          70369
  PV UUID               gIhcCb-1cAL-DRke-e5A1-0ZN8-FhuO-DDVYaM
And node 1:

Code:
Tick:~# pvdisplay
  --- Physical volume ---
  PV Name               /dev/sda2
  VG Name               pve
  PV Size               1,82 TB / not usable 2,51 MB
  Allocatable           yes 
  PE Size (KByte)       4096
  Total PE              476703
  Free PE               1023
  Allocated PE          475680
  PV UUID               vnKGPu-9VeK-oZV5-z5lh-tFKS-4AQT-34pN6m
And node 2:

Code:
  --- Physical volume ---
  PV Name               /dev/sda2
  VG Name               pve
  PV Size               1,82 TB / not usable 2,51 MB
  Allocatable           yes 
  PE Size (KByte)       4096
  Total PE              476703
  Free PE               1023
  Allocated PE          475680
  PV UUID               64yTaZ-ydBf-EeDu-WaUc-5rmx-xlae-sz8kBX
####################################################################
Output of the command vgdisplay on the MASTER:


Code:
Donald:~# vgdisplay
  --- Volume group ---
  VG Name               vgrp1
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  1
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                0
  Open LV               0
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               50,00 GB
  PE Size               4,00 MB
  Total PE              12799
  Alloc PE / Size       0 / 0   
  Free  PE / Size       12799 / 50,00 GB
  VG UUID               pZMf8d-Mkgv-BeLG-IRd6-4fac-qKNs-WBXOWh

--- Volume group ---
  VG Name               pve
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  4
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                3
  Open LV               3
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               278,87 GB
  PE Size               4,00 MB
  Total PE              71391
  Alloc PE / Size       70369 / 274,88 GB
  Free  PE / Size       1022 / 3,99 GB
  VG UUID               OtkqAN-rpB3-orJI-uKyq-KQbX-76dn-or999K
On node 1:


Code:
Tick:~# vgdisplay
  --- Volume group ---
  VG Name               pve
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  4
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                3
  Open LV               3
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               1,82 TB
  PE Size               4,00 MB
  Total PE              476703
  Alloc PE / Size       475680 / 1,81 TB
  Free  PE / Size       1023 / 4,00 GB
  VG UUID               oU0kul-U0jK-l9SA-3m7K-yiEn-VYzE-NUb817
And node 2:

Code:
Trick:~# vgdisplay
  --- Volume group ---
  VG Name               pve
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  4
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                3
  Open LV               3
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               1,82 TB
  PE Size               4,00 MB
  Total PE              476703
  Alloc PE / Size       475680 / 1,81 TB
  Free  PE / Size       1023 / 4,00 GB
  VG UUID               FJsilw-G3XK-G0gb-SWUo-P7vY-uIDl-ykdWqX
######################################################################
Output of the command lvdisplay on the MASTER:


Code:
Donald:~# lvdisplay
  --- Logical volume ---
  LV Name                /dev/pve/swap
  VG Name                pve
  LV UUID                HnCfCA-8Ttb-47sx-bSN0-0pKD-tnyr-dTnpRL
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                11,00 GB
  Current LE             2816
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           254:0
   
  --- Logical volume ---
  LV Name                /dev/pve/root
  VG Name                pve
  LV UUID                F2ptpN-6qH1-caKR-AzXG-vIcw-dEuI-yYuGyG
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                69,75 GB
  Current LE             17856
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           254:1
   
  --- Logical volume ---
  LV Name                /dev/pve/data
  VG Name                pve
  LV UUID                DEtpuY-sLri-3IUl-gfSK-h6GK-81Sj-1EdRbx
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                194,13 GB
  Current LE             49697
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           254:2
On node 1:

Code:
Tick:~# lvdisplay
  --- Logical volume ---
  LV Name                /dev/pve/swap
  VG Name                pve
  LV UUID                GLDZRf-wwl5-H2s7-hVfS-zizn-Px5W-LEFlIF
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                11,00 GB
  Current LE             2816
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           254:0
   
  --- Logical volume ---
  LV Name                /dev/pve/root
  VG Name                pve
  LV UUID                1WQIXu-xtRu-jGU0-43l1-dTNb-gwXS-SKHWWA
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                96,00 GB
  Current LE             24576
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           254:1
   
  --- Logical volume ---
  LV Name                /dev/pve/data
  VG Name                pve
  LV UUID                RQDiUK-6wR5-PzE4-Xnqr-5I9e-I9Mq-z3wSxO
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                1,71 TB
  Current LE             448288
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           254:2
And node 2:

Code:
Trick:~# lvdisplay
  --- Logical volume ---
  LV Name                /dev/pve/swap
  VG Name                pve
  LV UUID                G2aWnm-cea6-ftcf-Km7V-DCIP-1la6-oKA37N
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                11,00 GB
  Current LE             2816
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           254:0
   
  --- Logical volume ---
  LV Name                /dev/pve/root
  VG Name                pve
  LV UUID                fPJADd-yZSe-cgIz-uUuN-2SVk-mA3Y-KdYoSQ
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                96,00 GB
  Current LE             24576
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           254:1
   
  --- Logical volume ---
  LV Name                /dev/pve/data
  VG Name                pve
  LV UUID                0Uk8Se-JTaO-vyBj-itWb-HnFj-3uG4-KdEpsX
  LV Write Access        read/write
  LV Status              available
  # open                 1
  LV Size                1,71 TB
  Current LE             448288
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           254:2
But when I log on to Donald (the master) and try to create a VM with storage LVM (iSCSI) on Tick or Trick (the nodes), I get:

Code:
/usr/bin/ssh -t -t -n -o BatchMode=yes 192.168.0.71 /usr/sbin/qm create 101 --cdrom cdrom --name test1 --vlan0 'rtl8139=46:C0:48:74:FE:16' --bootdisk ide0 --ostype wvista --ide0 'lvm1:32,format=raw' --memory 1024 --onboot no --sockets 1
  Volume group "vgrp1" not found 
create failed - command '/sbin/vgchange -aly vgrp1' failed with exit code 5 
Connection to 192.168.0.71 closed. 
unable to apply VM settings -
 
It seems that the iSCSI target iqn.1992-08.com.netapp:01.896ebee516 (testlun1:0.0.0.scsi-360a9800068706575566f546258357153) is not available on the node. Please use iscsiadm to check why.
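For example, on the node (a minimal sketch; portal and target are the ones from storage.cfg above):

Code:
# is there a session to the target at all?
iscsiadm -m session
# what does the initiator record for this target (including the startup mode)?
iscsiadm -m node -T iqn.1992-08.com.netapp:01.896ebee516 -p 192.168.0.93
# try to log in by hand
iscsiadm -m node -T iqn.1992-08.com.netapp:01.896ebee516 -p 192.168.0.93 --login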
 
OK. Thank you a lot for your quick replies.
I found the solution myself: somewhere under /etc/iscsi/ there is a config file ('default') in which the startup mode was set to automatic instead of manual (I think it controls the connection to the LUN).

I then restarted, and it seems to work now. Maybe a problem with the sync?

I will begin migrating our 22 servers to Proxmox next week...

SOLVED!
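For the record, the setting in question is most likely node.startup, set globally in /etc/iscsi/iscsid.conf or per target under /etc/iscsi/nodes/ (a hedged sketch of what to look for):

Code:
# global default applied to newly discovered targets
grep node.startup /etc/iscsi/iscsid.conf
# per-target value (the 'default' file mentioned above)
grep -r node.startup /etc/iscsi/nodes/
# node.startup = automatic  -> the initiator logs in at boot
# node.startup = manual     -> login happens only on explicit request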
 