Found duplicate PV ?

m.ardito

This problem looks similar to what some other users have reported here, but it may well have a different cause.

I have 2 identical nodes, ibm x3650m2, with 20GB ram, 2x72 GB local disks (raid 1)
Disk /dev/sda: 72.0 GB, 71999422464 bytes

Both are connected to the same physical NAS:
VM disks from both nodes are on the same LVM-over-iSCSI target,
and backups from both nodes go to the same NFS share on that NAS.

Now one node shows the problem and the other does not. :S

I noticed the problem in the web GUI backup logs of that node: for a while now, all (apparently successful) backups start with something like

Code:
"INFO: Starting Backup of VM 102 (qemu)
INFO: status = running
INFO: update VM 102: -lock backup
  Found duplicate PV dB0Su2lTwsYfbcJPhby21PekoyeN3hHS: using /dev/sdc not /dev/sdb
  Found duplicate PV dB0Su2lTwsYfbcJPhby21PekoyeN3hHS: using /dev/sdc not /dev/sdb
INFO: backup mode: snapshot
INFO: ionice priority: 7
INFO: creating archive '/mnt/pve/pve_ts879/dump/vzdump-qemu-102-2015_01_14-01_00_02.vma.lzo'
INFO: started backup task 'f373a23c-82d3-4cd6-a5af-382858f0ac91'
INFO: status: 0% (36044800/12884901888), sparse 0% (3534848), duration 3, 12/10 MB/s"

while the backup log sent by mail (for the exact same job) has no "Found duplicate" warning, so I never noticed it there...

Code:
"102: Jan 14 01:00:02 INFO: Starting Backup of VM 102 (qemu)
102: Jan 14 01:00:02 INFO: status = running
102: Jan 14 01:00:03 INFO: update VM 102: -lock backup
102: Jan 14 01:00:03 INFO: backup mode: snapshot
102: Jan 14 01:00:03 INFO: ionice priority: 7
102: Jan 14 01:00:03 INFO: creating archive '/mnt/pve/pve_ts879/dump/vzdump-qemu-102-2015_01_14-01_00_02.vma.lzo'
102: Jan 14 01:00:04 INFO: started backup task 'f373a23c-82d3-4cd6-a5af-382858f0ac91'
102: Jan 14 01:00:07 INFO: status: 0% (36044800/12884901888), sparse 0% (3534848), duration 3, 12/10 MB/s"

All backup logs from jobs started on the other node are just fine, no "Found duplicate" whatsoever.

Digging through the first node's logs in the GUI, all related backup logs show this warning, but since the email apparently strips it out, I never noticed.
(Could I find other traces in other system logs, perhaps? Where?)

What can lead to this warning? What happened, and how do I solve it?
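For reference, a couple of places one could grep for the same warning; whether it lands in syslog depends on the lvm.conf log settings, and vzdump normally leaves a per-backup .log next to each archive in the dump directory:

Code:
# LVM tool warnings may end up in syslog (default Debian logging assumed)
grep -i "duplicate PV" /var/log/syslog

# each vzdump run also writes a .log file next to the archive it produced
grep -il "duplicate PV" /mnt/pve/pve_ts879/dump/*.log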

More info: the good node shows

Code:
#ls /dev/sd*
/dev/sda  /dev/sda1  /dev/sda2  /dev/sdb

#fdisk -l | grep "/dev/sd"
Disk /dev/mapper/pve-root doesn't contain a valid partition table
Disk /dev/mapper/pve-swap doesn't contain a valid partition table
Disk /dev/mapper/pve-data doesn't contain a valid partition table
Disk /dev/sdb doesn't contain a valid partition table
Disk /dev/sda: 72.0 GB, 71999422464 bytes
/dev/sda1   *        2048     1048575      523264   83  Linux
/dev/sda2         1048576   140623871    69787648   8e  Linux LVM
Disk /dev/sdb: 1073.7 GB, 1073741824000 bytes

# pvscan
  PV /dev/sdb    VG pve_vm_disks_ts879   lvm2 [1000.00 GiB / 22.81 GiB free]
  PV /dev/sda2   VG pve                  lvm2 [66.55 GiB / 8.37 GiB free]
  Total: 2 [1.04 TiB] / in use: 2 [1.04 TiB] / in no VG: 0 [0   ]

The bad node shows

Code:
#ls /dev/sd*
/dev/sda  /dev/sda1  /dev/sda2  /dev/sdb  /dev/sdc

fdisk -l | grep "/dev/sd"
Disk /dev/mapper/pve-root doesn't contain a valid partition table
Disk /dev/mapper/pve-swap doesn't contain a valid partition table
Disk /dev/mapper/pve-data doesn't contain a valid partition table
Disk /dev/sdb doesn't contain a valid partition table
Disk /dev/sdc doesn't contain a valid partition table
Disk /dev/sda: 72.0 GB, 71999422464 bytes
/dev/sda1   *        2048     1048575      523264   83  Linux
/dev/sda2         1048576   140623871    69787648   8e  Linux LVM
Disk /dev/sdb: 1073.7 GB, 1073741824000 bytes
Disk /dev/sdc: 1073.7 GB, 1073741824000 bytes

# pvscan
  Found duplicate PV dB0Su2lTwsYfbcJPhby21PekoyeN3hHS: using /dev/sdc not /dev/sdb
  PV /dev/sdc    VG pve_vm_disks_ts879   lvm2 [1000.00 GiB / 22.81 GiB free]
  PV /dev/sda2   VG pve                  lvm2 [66.55 GiB / 8.37 GiB free]
  Total: 2 [1.04 TiB] / in use: 2 [1.04 TiB] / in no VG: 0 [0   ]

These servers have not been modified recently (they have been running PVE since 1.5), nothing was intentionally changed on the NAS either, and I see nothing strange...

Currently both PVE nodes run 3.1-24, are connected to the same gigabit switch, and have identical pveversion output:

Code:
proxmox-ve-2.6.32: 3.1-114 (running kernel: 2.6.32-26-pve)
pve-manager: 3.1-24 (running version: 3.1-24/060bd5a6)
pve-kernel-2.6.32-19-pve: 2.6.32-96
pve-kernel-2.6.32-26-pve: 2.6.32-114
pve-kernel-2.6.32-11-pve: 2.6.32-66
lvm2: 2.02.98-pve4
clvm: 2.02.98-pve4
corosync-pve: 1.4.5-1
openais-pve: 1.1.4-3
libqb0: 0.11.1-2
redhat-cluster-pve: 3.2.0-2
resource-agents-pve: 3.9.2-4
fence-agents-pve: 4.0.0-2
pve-cluster: 3.0-8
qemu-server: 3.1-8
pve-firmware: 1.0-23
libpve-common-perl: 3.0-9
libpve-access-control: 3.0-8
libpve-storage-perl: 3.0-18
pve-libspice-server1: 0.12.4-2
vncterm: 1.1-6
vzctl: 4.0-1pve4
vzprocps: 2.0.11-2
vzquota: 3.1-2
pve-qemu-kvm: 1.4-17
ksm-control-daemon: 1.1-1
glusterfs-client: 3.4.1-1

and similar pveperf results:
Code:
pveperf /mnt/pve/pve_ts879/
CPU BOGOMIPS:      72530.88
REGEX/SECOND:      921498
HD SIZE:           11081.12 GB (ts879:/PVE)
FSYNCS/SECOND:     1567.25
DNS EXT:           260.35 ms
DNS INT:           1.33 ms (apiform.to.it)
root@pve2:~# pveperf /mnt/pve/pve_ts879/
CPU BOGOMIPS:      72530.88
REGEX/SECOND:      952254
HD SIZE:           11081.12 GB (ts879:/PVE)
FSYNCS/SECOND:     1601.93
DNS EXT:           199.14 ms
DNS INT:           1.08 ms (apiform.to.it)

Code:
 pveperf /mnt/pve/pve_ts879/
CPU BOGOMIPS:      72531.60
REGEX/SECOND:      771375
HD SIZE:           11081.12 GB (ts879:/PVE)
FSYNCS/SECOND:     1343.58
DNS EXT:           49.35 ms
DNS INT:           1.03 ms (apiform.to.it)
root@pve1:~# pveperf /mnt/pve/pve_ts879/
CPU BOGOMIPS:      72531.60
REGEX/SECOND:      930701
HD SIZE:           11081.12 GB (ts879:/PVE)
FSYNCS/SECOND:     1638.08
DNS EXT:           163.32 ms
DNS INT:           0.95 ms (apiform.to.it)

Thanks,
Marco
 
Hi Marco,
this one is simple: you have made a copy of a VM, so two VMs use PVs with the same UUID.
Normally that is no problem, but you use the whole VM disk without partitioning, so the host sees both PVs (which are LVs on the host) too.

To resolve it, use partitions, or give them different UUIDs.

Udo
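As an aside: when the duplicates really are VM PVs leaking through to the host (Udo's diagnosis), a common mitigation is to restrict what the host's LVM tools scan with a filter in /etc/lvm/lvm.conf. A minimal sketch only, assuming the host should consider nothing but its local /dev/sda2 and the iSCSI LUN; the patterns must be adapted to your own devices (and it would not help against a duplicated iSCSI session, which is what turns up later in this thread):

Code:
# /etc/lvm/lvm.conf -- devices section (sketch, adapt before use)
devices {
    # accept only the local PV partition and the expected iSCSI disk,
    # reject everything else so PVs cloned inside VM disks are ignored
    filter = [ "a|^/dev/sda2$|", "a|^/dev/sdb$|", "r|.*|" ]
}

# afterwards, re-check what LVM sees:
#   pvscan
#   pvs -o pv_name,pv_uuid,vg_name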
 

Thanks for answering, Udo.
I beg your pardon, but I am not sure I understand, though.

I never really "copied" VMs, but I may have cloned them, and I have a few VM templates (for cloning),
and I never set a UUID explicitly... I can't really see what links /dev/sdb or /dev/sdc on the host to the partitioning inside the virtual disks...?
But I'm not very experienced with this; it has never happened to me before.

On this node I have several VMs... I can't tell where the problem is (also because, basically, I didn't understand it), so I don't know how to solve it.
Can you please give me some more hints, or point me to a web resource with more info?

Thanks
Marco
 
In case it helps anyone help me, I dug some more and found this iscsiadm output:

192.168.3.249 is the NAS IP.

on the "good" node I see

Code:
#  iscsiadm -m session
tcp: [1] 192.168.3.249:3260,1 iqn.2004-04.com.qnap:ts-879u-rp:iscsi.pve.d4e6fc

#  iscsiadm -m session -P3
iSCSI Transport Class version 2.0-870
version 2.0-873
Target: iqn.2004-04.com.qnap:ts-879u-rp:iscsi.pve.d4e6fc
        Current Portal: 192.168.3.249:3260,1
        Persistent Portal: 192.168.3.249:3260,1
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.1993-08.org.debian:01:4922aff0e8f7
                Iface IPaddress: 192.168.3.29
                Iface HWaddress: <empty>
                Iface Netdev: <empty>
                SID: 1
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                *********
                Timeouts:
                *********
                Recovery Timeout: 120
                Target Reset Timeout: 30
                LUN Reset Timeout: 30
                Abort Timeout: 15
                *****
                CHAP:
                *****
                username: <empty>
                password: ********
                username_in: <empty>
                password_in: ********
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 262144
                MaxXmitDataSegmentLength: 262144
                FirstBurstLength: 65536
                MaxBurstLength: 262144
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 5  State: running
                scsi5 Channel 00 Id 0 Lun: 0
                        Attached scsi disk sdb          State: running

while on the "bad" node I see this (apparently two "interfaces" for the same "portal")

Code:
# iscsiadm -m session
tcp: [1] 192.168.3.249:3260,1 iqn.2004-04.com.qnap:ts-879u-rp:iscsi.pve.d4e6fc
tcp: [2] 192.168.3.249:3260,1 iqn.2004-04.com.qnap:ts-879u-rp:iscsi.pve.d4e6fc

# iscsiadm -m session -P3
iSCSI Transport Class version 2.0-870
version 2.0-873
Target: iqn.2004-04.com.qnap:ts-879u-rp:iscsi.pve.d4e6fc
        Current Portal: 192.168.3.249:3260,1
        Persistent Portal: 192.168.3.249:3260,1
                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.1993-08.org.debian:01:3019b875d84
                Iface IPaddress: 192.168.3.28
                Iface HWaddress: <empty>
                Iface Netdev: <empty>
                SID: 1
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                *********
                Timeouts:
                *********
                Recovery Timeout: 120
                Target Reset Timeout: 30
                LUN Reset Timeout: 30
                Abort Timeout: 15
                *****
                CHAP:
                *****
                username: <empty>
                password: ********
                username_in: <empty>
                password_in: ********
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 262144
                MaxXmitDataSegmentLength: 262144
                FirstBurstLength: 65536
                MaxBurstLength: 262144
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 5  State: running
                scsi5 Channel 00 Id 0 Lun: 0
                        Attached scsi disk sdb          State: running

                **********
                Interface:
                **********
                Iface Name: default
                Iface Transport: tcp
                Iface Initiatorname: iqn.1993-08.org.debian:01:3019b875d84
                Iface IPaddress: 192.168.3.28
                Iface HWaddress: <empty>
                Iface Netdev: <empty>
                SID: 2
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                *********
                Timeouts:
                *********
                Recovery Timeout: 120
                Target Reset Timeout: 30
                LUN Reset Timeout: 30
                Abort Timeout: 15
                *****
                CHAP:
                *****
                username: <empty>
                password: ********
                username_in: <empty>
                password_in: ********
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 262144
                MaxXmitDataSegmentLength: 262144
                FirstBurstLength: 65536
                MaxBurstLength: 262144
                ImmediateData: Yes
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 6  State: running
                scsi6 Channel 00 Id 0 Lun: 0
                        Attached scsi disk sdc          State: running

So I could probably find a way to terminate one of the two sessions, but what is the safest path for the VM disks?
Or should I stop all the VMs (or migrate them to the other node) first and then try something? What, exactly?

Marco
 
Hi Marco,
copy or clone is the same thing here: you have two VM disks with the same content.
The UUIDs (there are different kinds: filesystem/partition UUIDs, visible with "blkid", and LVM UUIDs, visible with pvdisplay, vgdisplay or lvdisplay) are the same in the two VMs.
Normally that's no problem, because one VM doesn't see the disks of the other VM.

In your case you use the whole disk for LVM inside the VM, so the host sees the PV inside the logical volume (or iSCSI disk), and because of the copy/clone the host sees the PV/VG twice...


Udo
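A quick way to compare the identifiers Udo mentions from inside each VM; these are just the standard blkid/LVM reporting commands, nothing specific to this setup:

Code:
# filesystem / partition signatures
blkid
# LVM physical volume and volume group UUIDs
pvs -o pv_name,pv_uuid,vg_name
vgs -o vg_name,vg_uuid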
 
Hmmm

Code:
# blkid -U "dB0Su2-lTws-Yfbc-JPhb-y21P-ekoy-eN3hHS"
/dev/sdb

?

have you seen
Code:
iscsiadm -m session -P3

output I got, above?

also see
Code:
# blkid -p -u raid /dev/sdc
/dev/sdc: UUID="dB0Su2-lTws-Yfbc-JPhb-y21P-ekoy-eN3hHS" VERSION="LVM2 001" TYPE="LVM2_member" USAGE="raid"

# blkid -p -u raid /dev/sdb
/dev/sdb: UUID="dB0Su2-lTws-Yfbc-JPhb-y21P-ekoy-eN3hHS" VERSION="LVM2 001" TYPE="LVM2_member" USAGE="raid"

and more:

Code:
~# pvdisplay
  Found duplicate PV dB0Su2lTwsYfbcJPhby21PekoyeN3hHS: using /dev/sdc not /dev/sdb
  --- Physical volume ---
  PV Name               /dev/sdc
  VG Name               pve_vm_disks_ts879
  PV Size               1000.00 GiB / not usable 4.00 MiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              255999
  Free PE               5840
  Allocated PE          250159
  PV UUID               dB0Su2-lTws-Yfbc-JPhb-y21P-ekoy-eN3hHS

  --- Physical volume ---
  PV Name               /dev/sda2
  VG Name               pve
  PV Size               66.55 GiB / not usable 4.00 MiB
  Allocatable           yes
  PE Size               4.00 MiB
  Total PE              17037
  Free PE               2143
  Allocated PE          14894
  PV UUID               eNJxmU-NbdN-vI2r-MfCL-lWMr-irNe-4Y3zj3
Marco
 

Hi Marco,
I saw this post only after sending my last one. It looks to me like you are connecting the same iSCSI disk twice?!
Check which one is in use and log off the unused one?

But I'm not an iSCSI expert...

Udo
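One way to see which block device belongs to which iSCSI session before logging anything out, using only standard open-iscsi/udev tooling:

Code:
# map iSCSI sessions/portals to sd devices via udev's by-path links
ls -l /dev/disk/by-path/ | grep iscsi

# show which session (SID) each attached disk belongs to
iscsiadm -m session -P3 | egrep "SID|Attached scsi disk"

# confirm which of the duplicates LVM is actually using right now
pvs -o pv_name,vg_name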
 
!!

I found two VMs with the same LVM UUIDs... (I probably used a backup of the first to create the second? I can't remember.)

first VM

Code:
~$ sudo pvdisplay
  --- Physical volume ---
  PV Name               /dev/vda5
  VG Name               ubuntu-webserver
  PV Size               31.76 GiB / not usable 2.00 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              8130
  Free PE               0
  Allocated PE          8130
  PV UUID               UHqIVU-BKta-3R5x-9HvP-CSNG-r6gH-w87UIi

$ sudo vgdisplay
  --- Volume group ---
  VG Name               ubuntu-webserver
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  3
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               31.76 GiB
  PE Size               4.00 MiB
  Total PE              8130
  Alloc PE / Size       8130 / 31.76 GiB
  Free  PE / Size       0 / 0
  VG UUID               pfg6xD-nhx7-ZkJR-D003-mNEz-vg0o-a2w9Vf

second VM

Code:
$ sudo pvdisplay
  --- Physical volume ---
  PV Name               /dev/sda5
  VG Name               ubuntu-webserver
  PV Size               31.76 GiB / not usable 2.00 MiB
  Allocatable           yes (but full)
  PE Size               4.00 MiB
  Total PE              8130
  Free PE               0
  Allocated PE          8130
  PV UUID               UHqIVU-BKta-3R5x-9HvP-CSNG-r6gH-w87UIi

uwebproxy@ubuntu-webproxy:~$ sudo vgdisplay
  --- Volume group ---
  VG Name               ubuntu-webserver
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  3
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                2
  Open LV               2
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               31.76 GiB
  PE Size               4.00 MiB
  Total PE              8130
  Alloc PE / Size       8130 / 31.76 GiB
  Free  PE / Size       0 / 0
  VG UUID               pfg6xD-nhx7-ZkJR-D003-mNEz-vg0o-a2w9Vf

Now I see that both VMs have the same PV/VG UUIDs and the same VG name (ubuntu-webproxy, the second VM, was probably created by restoring a backup of ubuntu-webserver, the first VM).

Now... based on all this, and on the iscsiadm output... in your opinion, what should I do? Change the UUIDs on one of those VMs? Which one, and how?
Or something else? :confused:

Marco
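If one did want to give the cloned VM its own LVM identity, the usual tool is vgimportclone, which rewrites the PV and VG UUIDs (and the VG name) of a cloned PV. A rough sketch only, under the assumption that it is run from a rescue/live environment inside the second VM so the VG is not active, and only after a backup; the device path is simply the one from the pvdisplay output above:

Code:
# inside the second VM, booted from a rescue/live system so the VG is not in use
vgimportclone --basevgname ubuntu-webproxy /dev/sda5
# this assigns new PV/VG UUIDs and renames the VG to "ubuntu-webproxy";
# update /etc/fstab, grub and the initramfs if they reference the old VG name
# before rebooting the VM normally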
 
Hi Marco,
these two identical PVs do no harm, because they are hidden from the host by the partition (vda5).
If you had them directly on a disk, like vdb, they would cause the same trouble as the iSCSI disk in your posting.

About your iSCSI disk: have you tried removing one session with
Code:
iscsiadm -m node -T iqn.2004-04.com.qnap:ts-879u-rp:iscsi.pve.d4e6fc -p 192.168.3.249 -u
But I don't know whether the remaining session will still be accessible?! Perhaps it's better to first migrate all VMs that use this storage to the other node?

Udo
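Before logging a session out, it may also be worth checking that nothing on the host still holds the corresponding block device; generic checks, nothing specific to this storage:

Code:
# which of the two duplicate devices does LVM actually use right now?
pvs -o pv_name,vg_name

# which device has device-mapper holders stacked on top of it?
ls /sys/block/sdb/holders/ /sys/block/sdc/holders/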
 
I don't know... last night the backup job task log reported the same warning, but today something changed all by itself.

Apparently one of the two active sessions went into a strange "disconnected" state, all VMs had trouble, and they had to be stopped/started.

Then I was able to migrate them all to the other node, and restarted the problematic node.

Now that node cannot connect to the LVM-over-iSCSI storage anymore...

The other node sees the same storage without problems, and the VMs seem to be running fine...

What can I do?
How can I force this node to reset its connection to that storage?

Marco
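A generic sequence for resetting an open-iscsi connection on a node, assuming nothing on this node still uses the storage; the target and portal are the ones from the iscsiadm output earlier in the thread:

Code:
# log out of the target, then log back in
iscsiadm -m node -T iqn.2004-04.com.qnap:ts-879u-rp:iscsi.pve.d4e6fc -p 192.168.3.249 -u
iscsiadm -m node -T iqn.2004-04.com.qnap:ts-879u-rp:iscsi.pve.d4e6fc -p 192.168.3.249 -l

# or restart the initiator entirely (Debian wheezy style)
service open-iscsi restart

# then re-check what LVM sees
pvscan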
 
Greetings.

That happened to me once. It is an iSCSI no-multipathing problem.

Below is a part of a manual for a two-node Proxmox 3.2 [now 3.3] cluster that I wrote for my own reference. It was originally written in Spanish; the translation follows. Sorry in advance. :)

-----------------------------------------

The iSCSI No-Multipathing Problem

One element of the portal and initiator configuration must be kept firmly in mind: the possibility of having several network connections between the SAN/NAS and the hypervisors. If the iSCSI target is configured to listen on all network interfaces dedicated to data transfer (in the NAS4Free example, on interfaces em1 [10.0.0.3] and em2 [10.0.1.3]; in the Debian 7.0 example, on all configured network interfaces) and to allow all initiator subnets to connect (10.0.0.0/29 and 10.0.1.0/29), which is also where the Proxmox hypervisors live (PRX-C0-1 with IP addresses 10.0.0.1 and 10.0.1.1, and PRX-C0-2 with IP addresses 10.0.0.2 and 10.0.1.2), then undesirable effects can appear during the migration of virtual machines between the nodes unless iSCSI multipathing is configured on them. Among these effects: LVM detects duplicate physical volumes on two or more disks (the disks associated with the iSCSI portals discovered when the iSCSI initiator service starts). See image Proxmox - Storage - LVM over iSCSI - PV Error.png.

As can be seen, the system detected the same LVM physical volume identifier on both /dev/sdb and /dev/sdc, so it uses the third disk (/dev/sdc) and not the second (/dev/sdb). This leads to incorrect reads/writes on the virtual machines' hard disks (which are nothing more than logical volumes on the LVM over iSCSI), and the VMs eventually blow up with the corresponding data loss, as shown in image Proxmox - Storage - LVM over iSCSI - VM Explotando.png.

Therefore, if the iSCSI target is configured correctly (initially with a single path, which is not recommendable), these undesirable effects disappear and the migration of virtual machines between the nodes works without problems: image Proxmox - Storage - LVM over iSCSI - OK.png.

When the iSCSI target runs on a Debian server, it is enough to modify the service start options in /etc/default/iscsitarget, i.e.:

ISCSITARGET_OPTIONS="--address=<IP address to listen on>"
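The longer-term alternative to restricting the target to a single address is proper multipathing on the initiator side. A minimal dm-multipath sketch for Debian/Proxmox, offered only as a starting point; blacklist and device details depend entirely on the hardware and should be checked against the multipath-tools documentation:

Code:
# install the multipath tools on each node
apt-get install multipath-tools

# /etc/multipath.conf -- minimal sketch, adapt before use
defaults {
    user_friendly_names yes
    polling_interval    2
}
blacklist {
    devnode "^sda[0-9]*"      # keep the local system disk out of multipath
}

# restart and check that the two paths collapse into one mapped device
service multipath-tools restart
multipath -ll

# point LVM at the resulting /dev/mapper/mpath* device instead of sdb/sdc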
 

Greetings,
I've had the same issue on one of my nodes. It was resolved by editing the iSCSI config on the node.

Check /etc/iscsi/nodes/iqn.your.iscsi:target. In my case it was:

Code:
# ls -la /etc/iscsi/nodes/iqn.2005-10.org.freenas.ctl\:pve/
total 26
drw------- 3 root root 3 May 19 12:18 .
drw------- 3 root root 3 May 19 12:18 ..
drw------- 2 root root 3 May 19 12:18 172.16.12.10,3260,2
drw------- 2 root root 3 May 19 12:18 172.16.12.10,3260,3

After I removed 172.16.12.10,3260,2 and rebooted, the issue was resolved.
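The same cleanup can also be done with iscsiadm itself instead of removing directories by hand; a sketch using the target and portal from the post above, to be adapted to your own setup and only run while nothing on the node uses the storage:

Code:
# log out of the target first
iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:pve -p 172.16.12.10 -u

# remove all recorded node entries for that portal (both stale records here
# share the same IP:port, so this drops them together)
iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:pve -p 172.16.12.10 -o delete

# rediscover the target so only the currently advertised portal is recorded,
# then log back in
iscsiadm -m discovery -t sendtargets -p 172.16.12.10
iscsiadm -m node -T iqn.2005-10.org.freenas.ctl:pve -p 172.16.12.10 -l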
 


Thank you so much, you saved my day!
I recently changed an interface on our SAN storage and our cluster kept detecting multiple duplicate PVs from the storage; the only working solution was yours.
 
