Proxmox VE 2.0 cluster and online migration


Géraud Tillit

Guest
Hello,

I've set up a non-HA cluster of two servers, with two DRBD devices in primary/primary mode.
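For context, each resource is defined in its own file under /etc/drbd.d/, roughly along these lines (a sketch from memory; addresses, ports, minor numbers and the split-brain options are illustrative, not copied from my actual files):

Code:
# /etc/drbd.d/proxmox2-01.res -- sketch of a dual-primary resource
# (the second resource is analogous, on /dev/sda7)
resource proxmox2-01 {
        protocol C;
        startup { become-primary-on both; }
        net {
                allow-two-primaries;
                after-sb-0pri discard-zero-changes;
                after-sb-1pri discard-secondary;
        }
        on proxmox2-01 {
                device /dev/drbd_proxmox2-01 minor 1;
                disk /dev/sda6;
                address 10.0.0.1:7788;
                meta-disk internal;
        }
        on proxmox2-02 {
                device /dev/drbd_proxmox2-01 minor 1;
                disk /dev/sda6;
                address 10.0.0.2:7788;
                meta-disk internal;
        }
}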

From an OS point of view, the DRBD devices are fine:

Code:
root@proxmox2-02:/var/lib/vz/dump# cat /proc/drbd
version: 8.3.10 (api:88/proto:86-96)
GIT-hash: 5c0b0469666682443d4785d90a2c603378f9017b build by phil@fat-tyre, 2011-01-28 12:17:35

 1: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:520496 nr:172484 dw:172788 dr:549816 al:20 bm:180 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
 2: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
    ns:11123304 nr:522864 dw:11125976 dr:1097638 al:3053 bm:165 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0

When I try an online migration from the web interface, the task finishes with an "OK" status:

Code:
Apr 14 20:52:54 starting migration of VM 102 to node 'proxmox2-02' (88.190.39.127)
Apr 14 20:52:54 copying disk images
Apr 14 20:52:55 starting VM 102 on remote node 'proxmox2-02'
Apr 14 20:52:56 starting migration tunnel
Apr 14 20:52:56 starting online/live migration on port 60000
Apr 14 20:53:02 migration status: completed
Apr 14 20:53:02 migration speed: 170.67 MB/s
Apr 14 20:53:05 migration finished successfuly (duration 00:00:11)
TASK OK

But then suddenly the VM behaves very strangely, for example:

Code:
bash: /usr/bin/vi : cannot execute binary file

I issue a stop on the VM via the web interface, and when I try to restart it, GRUB doesn't find the boot disk... I have to migrate back to the first node and restart the VM there (and fsck the disk).

Here is the pveversion output:

Code:
root@proxmox2-01:/var/lib/vz# pveversion -v
pve-manager: 2.0-59 (pve-manager/2.0/18400f07)
running kernel: 2.6.32-11-pve
proxmox-ve-2.6.32: 2.0-66
pve-kernel-2.6.32-10-pve: 2.6.32-63
pve-kernel-2.6.32-11-pve: 2.6.32-66
lvm2: 2.02.88-2pve2
clvm: 2.02.88-2pve2
corosync-pve: 1.4.3-1
openais-pve: 1.1.4-2
libqb: 0.10.1-2
redhat-cluster-pve: 3.1.8-3
resource-agents-pve: 3.9.2-3
fence-agents-pve: 3.1.7-2
pve-cluster: 1.0-26
qemu-server: 2.0-38
pve-firmware: 1.0-15
libpve-common-perl: 1.0-26
libpve-access-control: 1.0-18
libpve-storage-perl: 2.0-17
vncterm: 1.0-2
vzctl: 3.0.30-2pve2
vzprocps: 2.0.11-2
vzquota: 3.0.12-3
pve-qemu-kvm: 1.0-9
ksm-control-daemon: 1.1-1

It was working without any problem on Proxmox 1.9.
 
Yes I'm using LVM on top of DRBD.

Here is the layout:


  • node proxmox2-01
    • VMs 101 and 102
    • DRBD device proxmox2-01
  • node proxmox2-02
    • VMs 103 and 104
    • DRBD device proxmox2-02
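For completeness, each DRBD device was turned into a PV with its own VG, which both nodes then use as shared LVM storage; roughly like this (a sketch, not the exact commands I ran back then):

Code:
# run on one node only -- the LVM metadata is replicated through DRBD
pvcreate /dev/drbd_proxmox2-01
vgcreate drbd-proxmox2-01 /dev/drbd_proxmox2-01

pvcreate /dev/drbd_proxmox2-02
vgcreate drbd-proxmox2-02 /dev/drbd_proxmox2-02

# each VG is then added in the web interface as LVM storage and marked "shared"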

I've tried to migrate VM 102 from proxmox2-01 to proxmox2-02. Here is the lvscan/pvscan output on both nodes before starting the migration:

proxmox2-01:
Code:
root@proxmox2-01:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  inactive          '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  inactive          '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit
root@proxmox2-01:~# pvscan
  PV /dev/sda5               VG pve                lvm2 [351,80 GiB / 0    free]
  PV /dev/drbd_proxmox2-02   VG drbd-proxmox2-02   lvm2 [749,96 GiB / 719,95 GiB free]
  PV /dev/drbd_proxmox2-01   VG drbd-proxmox2-01   lvm2 [749,98 GiB / 617,97 GiB free]
  Total: 3 [1,81 TiB] / in use: 3 [1,81 TiB] / in no VG: 0 [0   ]

proxmox2-02:
Code:
root@proxmox2-02:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  inactive          '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit
root@proxmox2-02:~# pvscan
  PV /dev/sda5               VG pve                lvm2 [351,80 GiB / 0    free]
  PV /dev/drbd_proxmox2-02   VG drbd-proxmox2-02   lvm2 [749,96 GiB / 719,95 GiB free]
  PV /dev/drbd_proxmox2-01   VG drbd-proxmox2-01   lvm2 [749,98 GiB / 617,97 GiB free]
  Total: 3 [1,81 TiB] / in use: 3 [1,81 TiB] / in no VG: 0 [0   ]

After migrating, the same behaviour as yesterday on VM 102:

Code:
gtillit@foobar:~$ vi
bash: /usr/bin/vi : fichier binaire impossible à lancer

(French locale, meaning "cannot execute binary file")

Here is the output of lvscan and pvscan on both nodes after migrating:

proxmox2-01:
Code:
root@proxmox2-01:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  inactive          '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  inactive          '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  inactive          '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit
root@proxmox2-01:~# pvscan
  PV /dev/sda5               VG pve                lvm2 [351,80 GiB / 0    free]
  PV /dev/drbd_proxmox2-02   VG drbd-proxmox2-02   lvm2 [749,96 GiB / 719,95 GiB free]
  PV /dev/drbd_proxmox2-01   VG drbd-proxmox2-01   lvm2 [749,98 GiB / 617,97 GiB free]
  Total: 3 [1,81 TiB] / in use: 3 [1,81 TiB] / in no VG: 0 [0   ]

proxmox2-02:
Code:
root@proxmox2-02:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit
root@proxmox2-02:~# pvscan
  PV /dev/sda5               VG pve                lvm2 [351,80 GiB / 0    free]
  PV /dev/drbd_proxmox2-02   VG drbd-proxmox2-02   lvm2 [749,96 GiB / 719,95 GiB free]
  PV /dev/drbd_proxmox2-01   VG drbd-proxmox2-01   lvm2 [749,98 GiB / 617,97 GiB free]
  Total: 3 [1,81 TiB] / in use: 3 [1,81 TiB] / in no VG: 0 [0   ]

Here is the content of both filters in lvm.conf:

Code:
root@proxmox2-01:~# egrep "^\s*filter" /etc/lvm/lvm.conf
    filter = [ "r|/dev/sda6|", "r|/dev/sda7|", "r|/dev/disk/|", "r|/dev/block/|", "a/.*/" ]
Code:
root@proxmox2-02:~# egrep "^\s*filter" /etc/lvm/lvm.conf
    filter = [ "r|/dev/sda6|", "r|/dev/sda7|", "r|/dev/disk/|", "r|/dev/block/|", "a/.*/" ]
 
Are sda6 and sda7 the storage devices for DRBD?
Assuming yes, your filter looks right: it hides the backing partitions (sda6/sda7) so LVM only sees the PVs through the DRBD devices, and LVM/DRBD seem to be working correctly.
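One extra sanity check you could run on both nodes (just a suggestion; your pvscan output above already shows the PVs only on the /dev/drbd_* devices): make sure LVM never also reads the PV signatures through the backing partitions:

Code:
# run on both nodes: any hit on sda6/sda7, or a "Found duplicate PV" warning,
# would mean LVM is also scanning the PVs through the backing partitions
pvs -a 2>&1 | grep -Ei 'duplicate|sda6|sda7'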

Do both nodes have the same CPU?

The one thing I find odd is that you mention that when you start it on the 2nd node no boot disk is found, yet it is found on the first node.
That tells me the data is possibly different on the two nodes rather than identical.
That is the only thing I can think of so far.

There is a quick way to check this: do an md5sum of the beginning of the logical volume on both nodes.
If the sums are the same then the boot record is identical on both nodes; if they are different then you have found your problem.

The logical volumes are inactive by default so you might need to activate them before you check the md5sum:
Code:
lvchange -ay /dev/drbd-proxmox2-01/vm-102-disk-1

To get the md5sum of the first 1 MB of the volume:
Code:
dd if=/dev/drbd-proxmox2-01/vm-102-disk-1 bs=1M count=1|md5sum -
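If you want to compare more than just the first megabyte, something along these lines checksums a larger chunk from both nodes in one go (just a sketch; it assumes root SSH access from wherever you run it and that the VM is stopped so the data is not changing underneath):

Code:
# checksum the first 100 MB of the LV on both nodes (hostnames as in your cluster)
for h in proxmox2-01 proxmox2-02; do
    echo -n "$h: "
    ssh root@$h "dd if=/dev/drbd-proxmox2-01/vm-102-disk-1 bs=1M count=100 2>/dev/null | md5sum"
done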

Are the md5sums the same on both nodes or different?
 
Are sda6 and sda7 the storage devices for DRBD?

Absolutely:

Code:
root@proxmox2-01:~# egrep '^\s*disk' /etc/drbd.d/proxmox2-01.res
        disk /dev/sda6;
root@proxmox2-01:~# egrep '^\s*disk' /etc/drbd.d/proxmox2-02.res
        disk /dev/sda7;

Do both nodes have the same CPU?

The two servers are the same model:

http://www.online.net/serveur-dedie/offre-dedibox-professionnel-hp.xhtml

The one thing I find odd is you mention that when you start it on the 2nd node no boot disk is found, yet it is found on the first node.

It is found when I migrate back to the first node, but I have to do an fsck on the disk when rebooting.

[...]

Are the md5sums the same on both nodes or different?

Code:
root@proxmox2-01:~# dd if=/dev/drbd-proxmox2-01/vm-102-disk-1 bs=1M count=1|md5sum -
1+0 enregistrements lus
1+0 enregistrements écrits
1048576 octets (1,0 MB) copiés, 0,226895 s, 4,6 MB/s
c39c4f7a8fb5e7432220b585f2acddec  -
Code:
root@proxmox2-02:~# dd if=/dev/drbd-proxmox2-01/vm-102-disk-1 bs=1M count=1|md5sum -
1+0 enregistrements lus
1+0 enregistrements écrits
1048576 octets (1,0 MB) copiés, 0,035502 s, 29,5 MB/s
c39c4f7a8fb5e7432220b585f2acddec  -

Exactly the same checksum... I really don't know what's wrong here
 
OK, I've made some tests with VM 104 (I'm using VM 102 to connect to the other VMs from this location, so it's a bit risky to use it for testing ;-)). As a reminder, VM 104 is on proxmox2-02.

Cold migration

VM 104 was shut down properly before starting the operations.

pvscan/vgscan/lvscan on both nodes before migration:

proxmox2-02 (source node)

Code:
root@proxmox2-02:~# pvscan
  PV /dev/sda5               VG pve                lvm2 [351,80 GiB / 0    free]
  PV /dev/drbd_proxmox2-02   VG drbd-proxmox2-02   lvm2 [749,96 GiB / 719,95 GiB free]
  PV /dev/drbd_proxmox2-01   VG drbd-proxmox2-01   lvm2 [749,98 GiB / 617,97 GiB free]
  Total: 3 [1,81 TiB] / in use: 3 [1,81 TiB] / in no VG: 0 [0   ]
root@proxmox2-02:~# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "pve" using metadata type lvm2
  Found volume group "drbd-proxmox2-02" using metadata type lvm2
  Found volume group "drbd-proxmox2-01" using metadata type lvm2
root@proxmox2-02:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit

proxmox2-01 (target node)

I've activated the LVs corresponding to VMs 103 and 104, as they were inactive on this host:

Code:
root@proxmox2-01:~# pvscan
  PV /dev/sda5               VG pve                lvm2 [351,80 GiB / 0    free]
  PV /dev/drbd_proxmox2-02   VG drbd-proxmox2-02   lvm2 [749,96 GiB / 719,95 GiB free]
  PV /dev/drbd_proxmox2-01   VG drbd-proxmox2-01   lvm2 [749,98 GiB / 617,97 GiB free]
  Total: 3 [1,81 TiB] / in use: 3 [1,81 TiB] / in no VG: 0 [0   ]
root@proxmox2-01:~# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "pve" using metadata type lvm2
  Found volume group "drbd-proxmox2-02" using metadata type lvm2
  Found volume group "drbd-proxmox2-01" using metadata type lvm2
root@proxmox2-01:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  inactive          '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  inactive          '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit
root@proxmox2-01:~# lvchange -ay /dev/drbd-proxmox2-02/vm-103-disk-1
root@proxmox2-01:~# lvchange -ay /dev/drbd-proxmox2-02/vm-104-disk-1
root@proxmox2-01:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit

The cold migration was successful and I was able to restart VM 104 on proxmox2-01

Here is the output of pvscan/vgscan/lvscan on both nodes after the migration, plus the syslog output before and after the migration:

proxmox2-01:

Code:
root@proxmox2-01:~# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "pve" using metadata type lvm2
  Found volume group "drbd-proxmox2-02" using metadata type lvm2
  Found volume group "drbd-proxmox2-01" using metadata type lvm2
root@proxmox2-01:~# pvscan
  PV /dev/sda5               VG pve                lvm2 [351,80 GiB / 0    free]
  PV /dev/drbd_proxmox2-02   VG drbd-proxmox2-02   lvm2 [749,96 GiB / 719,95 GiB free]
  PV /dev/drbd_proxmox2-01   VG drbd-proxmox2-01   lvm2 [749,98 GiB / 617,97 GiB free]
  Total: 3 [1,81 TiB] / in use: 3 [1,81 TiB] / in no VG: 0 [0   ]
root@proxmox2-01:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit

Code:
Apr 16 10:11:57 proxmox2-01 udevd-work[193886]: kernel-provided name 'drbd2' and NAME= 'drbd_proxmox2-02' disagree, please use SYMLINK+= or change the kernel to provide the proper name
Apr 16 10:12:08 proxmox2-01 udevd-work[193886]: kernel-provided name 'drbd2' and NAME= 'drbd_proxmox2-02' disagree, please use SYMLINK+= or change the kernel to provide the proper name
Apr 16 10:12:43 proxmox2-01 rrdcached[1294]: flushing old values
Apr 16 10:12:43 proxmox2-01 rrdcached[1294]: rotating journals
Apr 16 10:12:43 proxmox2-01 rrdcached[1294]: started new journal /var/lib/rrdcached/journal//rrd.journal.1334563963.877644
Apr 16 10:12:43 proxmox2-01 rrdcached[1294]: removing old journal /var/lib/rrdcached/journal//rrd.journal.1334556763.877659
Apr 16 10:12:44 proxmox2-01 pmxcfs[1331]: [dcdb] notice: data verification successful
Apr 16 10:12:48 proxmox2-01 pmxcfs[1331]: [status] notice: received log
Apr 16 10:12:50 proxmox2-01 pmxcfs[1331]: [status] notice: received log
Apr 16 10:14:03 proxmox2-01 pvedaemon[246195]: <root@pam> starting task UPID:proxmox2-01:0003C572:00D0DFC5:4F8BD4CB:qmstart:104:root@pam:
Apr 16 10:14:03 proxmox2-01 pvedaemon[247154]: start VM 104: UPID:proxmox2-01:0003C572:00D0DFC5:4F8BD4CB:qmstart:104:root@pam:
Apr 16 10:14:04 proxmox2-01 kernel: device tap104i0 entered promiscuous mode
Apr 16 10:14:04 proxmox2-01 kernel: vmbr0: port 4(tap104i0) entering forwarding state
Apr 16 10:14:04 proxmox2-01 pvedaemon[246195]: <root@pam> end task UPID:proxmox2-01:0003C572:00D0DFC5:4F8BD4CB:qmstart:104:root@pam: OK
Apr 16 10:14:14 proxmox2-01 kernel: tap104i0: no IPv6 routers present
Apr 16 10:14:27 proxmox2-01 pvedaemon[246005]: <root@pam> successful auth for user 'root@pam'
Apr 16 10:15:12 proxmox2-01 pvedaemon[246005]: <root@pam> successful auth for user 'root@pam'
Apr 16 10:16:47 proxmox2-01 kernel: vmbr0: port 4(tap104i0) entering disabled state
Apr 16 10:16:47 proxmox2-01 kernel: vmbr0: port 4(tap104i0) entering disabled state
Apr 16 10:17:01 proxmox2-01 /USR/SBIN/CRON[248045]: (root) CMD (   cd / && run-parts --report /etc/cron.hourly)

proxmox2-02

Code:
root@proxmox2-02:~# pvscan
  PV /dev/sda5               VG pve                lvm2 [351,80 GiB / 0    free]
  PV /dev/drbd_proxmox2-02   VG drbd-proxmox2-02   lvm2 [749,96 GiB / 719,95 GiB free]
  PV /dev/drbd_proxmox2-01   VG drbd-proxmox2-01   lvm2 [749,98 GiB / 617,97 GiB free]
  Total: 3 [1,81 TiB] / in use: 3 [1,81 TiB] / in no VG: 0 [0   ]
root@proxmox2-02:~# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "pve" using metadata type lvm2
  Found volume group "drbd-proxmox2-02" using metadata type lvm2
  Found volume group "drbd-proxmox2-01" using metadata type lvm2
root@proxmox2-02:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  inactive          '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit

Code:
Apr 16 10:12:44 proxmox2-02 pmxcfs[1374]: [dcdb] notice: data verification successful
Apr 16 10:12:48 proxmox2-02 pvedaemon[273582]: <root@pam> starting task UPID:proxmox2-02:00043252:00D1E883:4F8BD480:qmigrate:104:root@pam:
Apr 16 10:12:50 proxmox2-02 udevd-work[272493]: kernel-provided name 'drbd2' and NAME= 'drbd_proxmox2-02' disagree, please use SYMLINK+= or change the kernel to provide the proper name
Apr 16 10:12:50 proxmox2-02 pvedaemon[273582]: <root@pam> end task UPID:proxmox2-02:00043252:00D1E883:4F8BD480:qmigrate:104:root@pam: OK
Apr 16 10:14:03 proxmox2-02 pmxcfs[1374]: [status] notice: received log
 
I did the online migration test as well and it worked without any problem... I'm a bit puzzled here :confused:

Online migration

Here is the output:
  • lvscan before online migration
  • syslog during migration
  • lvscan after
  • syslog during migration back to original node
  • lvscan after

proxmox2-02 (source node):

Code:
root@proxmox2-02:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit

Apr 16 11:07:05 proxmox2-02 pvedaemon[281782]: <root@pam> starting task UPID:proxmox2-02:00044FD1:00D6E0EE:4F8BE139:qmigrate:104:root@pam:
Apr 16 11:07:07 proxmox2-02 pmxcfs[1374]: [status] notice: received log
Apr 16 11:07:07 proxmox2-02 pmxcfs[1374]: [status] notice: received log
Apr 16 11:07:13 proxmox2-02 kernel: vmbr0: port 3(tap104i0) entering disabled state
Apr 16 11:07:13 proxmox2-02 kernel: vmbr0: port 3(tap104i0) entering disabled state
Apr 16 11:07:15 proxmox2-02 udevd-work[272493]: kernel-provided name 'drbd2' and NAME= 'drbd_proxmox2-02' disagree, please use SYMLINK+= or change the kernel to provide the proper name
Apr 16 11:07:16 proxmox2-02 pvedaemon[281782]: <root@pam> end task UPID:proxmox2-02:00044FD1:00D6E0EE:4F8BE139:qmigrate:104:root@pam: OK

root@proxmox2-02:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  inactive          '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit

Apr 16 11:11:23 proxmox2-02 qm[282755]: <root@pam> starting task UPID:proxmox2-02:00045084:00D74592:4F8BE23B:qmstart:104:root@pam:
Apr 16 11:11:23 proxmox2-02 qm[282756]: start VM 104: UPID:proxmox2-02:00045084:00D74592:4F8BE23B:qmstart:104:root@pam:
Apr 16 11:11:24 proxmox2-02 udevd-work[272493]: kernel-provided name 'drbd2' and NAME= 'drbd_proxmox2-02' disagree, please use SYMLINK+= or change the kernel to provide the proper name
Apr 16 11:11:24 proxmox2-02 kernel: device tap104i0 entered promiscuous mode
Apr 16 11:11:24 proxmox2-02 kernel: vmbr0: port 3(tap104i0) entering forwarding state
Apr 16 11:11:24 proxmox2-02 qm[282755]: <root@pam> end task UPID:proxmox2-02:00045084:00D74592:4F8BE23B:qmstart:104:root@pam: OK
Apr 16 11:11:32 proxmox2-02 pmxcfs[1374]: [status] notice: received log
Apr 16 11:11:35 proxmox2-02 kernel: tap104i0: no IPv6 routers present
Apr 16 11:11:44 proxmox2-02 corosync[2827]:   [TOTEM ] Retransmit List: 44c06 44c07 44c08 44c09 44c0a 44c0b 44c0c

root@proxmox2-02:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit

proxmox2-01

Code:
root@proxmox2-01:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  inactive          '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit

Apr 16 11:07:07 proxmox2-01 qm[256463]: <root@pam> starting task UPID:proxmox2-01:0003E9D2:00D5BB8C:4F8BE13B:qmstart:104:root@pam:
Apr 16 11:07:07 proxmox2-01 qm[256466]: start VM 104: UPID:proxmox2-01:0003E9D2:00D5BB8C:4F8BE13B:qmstart:104:root@pam:
Apr 16 11:07:07 proxmox2-01 udevd-work[193886]: kernel-provided name 'drbd2' and NAME= 'drbd_proxmox2-02' disagree, please use SYMLINK+= or change the kernel to provide the proper name
Apr 16 11:07:07 proxmox2-01 kernel: device tap104i0 entered promiscuous mode
Apr 16 11:07:07 proxmox2-01 kernel: vmbr0: port 4(tap104i0) entering forwarding state
Apr 16 11:07:07 proxmox2-01 qm[256463]: <root@pam> end task UPID:proxmox2-01:0003E9D2:00D5BB8C:4F8BE13B:qmstart:104:root@pam: OK
  
root@proxmox2-01:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit

Apr 16 11:11:22 proxmox2-01 pvedaemon[255796]: <root@pam> starting task UPID:proxmox2-01:0003EECF:00D61F3E:4F8BE23A:qmigrate:104:root@pam:
Apr 16 11:11:23 proxmox2-01 pmxcfs[1331]: [status] notice: received log
Apr 16 11:11:24 proxmox2-01 pmxcfs[1331]: [status] notice: received log
Apr 16 11:11:30 proxmox2-01 kernel: vmbr0: port 4(tap104i0) entering disabled state
Apr 16 11:11:30 proxmox2-01 kernel: vmbr0: port 4(tap104i0) entering disabled state
Apr 16 11:11:32 proxmox2-01 udevd-work[193886]: kernel-provided name 'drbd2' and NAME= 'drbd_proxmox2-02' disagree, please use SYMLINK+= or change the kernel to provide the proper name
Apr 16 11:11:32 proxmox2-01 pvedaemon[255796]: <root@pam> end task UPID:proxmox2-01:0003EECF:00D61F3E:4F8BE23A:qmigrate:104:root@pam: OK
Apr 16 11:11:42 proxmox2-01 corosync[2774]:   [TOTEM ] Retransmit List: 44bfc 44bfd 44bfe 44bff 44c00

root@proxmox2-01:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  inactive          '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit

On both nodes there were a lot of corosync messages...

I'll install another VM this afternoon to handle my SSH connections, and I'll try to migrate VM 102.
 
I've tried a cold migration with VM 102:

Same output as previously, except the syslog after migrating back...

proxmox2-01

Code:
root@proxmox2-01:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  inactive          '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit
  
Apr 16 21:09:23 proxmox2-01 pvedaemon[394905]: <root@pam> starting task UPID:proxmox2-01:000613EA:010CDF59:4F8C6E63:qmigrate:102:root@pam:
Apr 16 21:09:25 proxmox2-01 udevd-work[193886]: kernel-provided name 'drbd1' and NAME= 'drbd_proxmox2-01' disagree, please use SYMLINK+= or change the kernel to provide the proper name
Apr 16 21:09:25 proxmox2-01 pvedaemon[394905]: <root@pam> end task UPID:proxmox2-01:000613EA:010CDF59:4F8C6E63:qmigrate:102:root@pam: OK
  
root@proxmox2-01:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  inactive          '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  inactive          '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit

proxmox2-02

Code:
root@proxmox2-02:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit

Apr 16 21:09:23 proxmox2-02 pmxcfs[1374]: [status] notice: received log
Apr 16 21:09:25 proxmox2-02 pmxcfs[1374]: [status] notice: received log
Apr 16 21:09:55 proxmox2-02 pvedaemon[264688]: worker 286099 finished
Apr 16 21:09:55 proxmox2-02 pvedaemon[264688]: starting 1 worker(s)
Apr 16 21:09:55 proxmox2-02 pvedaemon[264688]: worker 316312 started

root@proxmox2-02:~# lvscan
  ACTIVE            '/dev/pve/data' [351,80 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-103-disk-1' [20,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-02/vm-104-disk-1' [10,01 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-102-disk-1' [32,00 GiB] inherit
  ACTIVE            '/dev/drbd-proxmox2-01/vm-101-disk-1' [100,00 GiB] inherit

And the VM is not booting on the second node:

[screenshot attached: foobar.png]
 
Hi,
I also had one case (after playing around a lot) where the content of a DRBD LV was not the same on both nodes. The sync was not the problem; it has something to do with LVM. Try the following (a rough command-line equivalent is sketched below):
Shut down VM 102 (running on host 01).
Check DRBD with "drbd-overview".
On host 02 do a "vgchange -a n drbd-proxmox2-01" and after that a "vgchange -a y drbd-proxmox2-01".
Try to migrate VM 102 and boot it.
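In commands that would be roughly the following (only a sketch, I normally do the migrate and start from the GUI; adjust the VMID and node names to your setup):

Code:
# on host 01 (where VM 102 currently runs): shut it down cleanly
qm shutdown 102

# on either host: check that DRBD is Connected and UpToDate/UpToDate
drbd-overview

# on host 02: deactivate and reactivate the VG so LVM rereads it from DRBD
vgchange -a n drbd-proxmox2-01
vgchange -a y drbd-proxmox2-01

# on host 01: migrate VM 102 over
qm migrate 102 proxmox2-02
# on host 02: start it
qm start 102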

Udo
 
Code:
root@proxmox2-01:~# drbd-overview
  1:proxmox2-01  Connected Primary/Primary UpToDate/UpToDate C r-----
  2:proxmox2-02  Connected Primary/Primary UpToDate/UpToDate C r-----

Code:
root@proxmox2-02:~# vgchange -a n drbd-proxmox2-01
  0 logical volume(s) in volume group "drbd-proxmox2-01" now active
root@proxmox2-02:~# vgchange -a y drbd-proxmox2-01
  2 logical volume(s) in volume group "drbd-proxmox2-01" now active

And exactly the same result (not booting after migration)

I had the same problem with VM 104, but after some migration tests the disk was totally corrupted and I had to restore a vzdump backup. It now seems to be working like a charm (I used VM 104 in my first tests this morning, and both cold and hot migrations worked).

I'm thinking about doing the same for VM 102, but that doesn't make me feel confident about the DRBD/Proxmox 2.0 combination. And I never had this kind of problem with 1.9 (and I used online migration quite often)...
 
