Unable to migrate virtual machines on drbd-storage

Discussion in 'Proxmox VE: Installation and configuration' started by dmp, Jun 8, 2016.

  1. dmp

    dmp New Member

    Joined:
    Jun 8, 2016
    Messages:
    9
    Likes Received:
    0
    Good morning from Germany,

    we're using Proxmox in a simple three-node cluster (px1, px2, px3), where one node acts only as a quorum and control node. On px1 and px2 we're using DRBD9 as shared storage and an LVM volume group as local storage for non-roaming virtual machines. The VG used for DRBD9 has the same name as the local LVM storage (VG1).

    Our problem is that we're unable to migrate virtual machines stored on the DRBD storage from one host to another unless we either mark the LVM storage as shared or disable it in the storage settings.

    Code:
    Jun 08 08:27:00 starting migration of VM 100 to node 'px2' (172.19.80.102)
    Jun 08 08:27:00 copying disk images
    Jun 08 08:27:01 ERROR: Failed to sync data - can't migrate 'vg1:vm-100-disk-1_00' - storage type 'lvm' not supported
    Jun 08 08:27:01 aborting phase 1 - cleanup resources
    Jun 08 08:27:01 ERROR: migration aborted (duration 00:00:01): Failed to sync data - can't migrate 'vg1:vm-100-disk-1_00' - storage type 'lvm' not supported
    TASK ERROR: migration aborted
    When we either share or disable the LVM storage, it works flawlessly.

    Code:
    Jun 08 08:28:02 starting migration of VM 100 to node 'px2' (172.19.80.102)
    Jun 08 08:28:02 copying disk images
    Jun 08 08:28:03 migration finished successfully (duration 00:00:01)
    TASK OK
    We don't think this is normal behaviour, even with the same volume group name being used twice.

    pveversion -v
    Code:
    proxmox-ve: 4.2-52 (running kernel: 4.4.8-1-pve)
    pve-manager: 4.2-11 (running version: 4.2-11/2c626aa1)
    pve-kernel-4.4.8-1-pve: 4.4.8-52
    lvm2: 2.02.116-pve2
    corosync-pve: 2.3.5-2
    libqb0: 1.0-1
    pve-cluster: 4.0-40
    qemu-server: 4.0-79
    pve-firmware: 1.1-8
    libpve-common-perl: 4.0-67
    libpve-access-control: 4.0-16
    libpve-storage-perl: 4.0-51
    pve-libspice-server1: 0.12.5-2
    vncterm: 1.2-1
    pve-qemu-kvm: 2.5-19
    pve-container: 1.0-67
    pve-firewall: 2.0-29
    pve-ha-manager: 1.0-31
    ksm-control-daemon: 1.2-1
    glusterfs-client: 3.5.2-2+deb8u2
    lxc-pve: 1.1.5-7
    lxcfs: 2.0.0-pve2
    cgmanager: 0.39-pve1
    criu: 1.6.0-1
    drbdmanage: 0.95-1
    /etc/pve/storage.cfg
    Code:
    dir: local
            path /var/lib/vz
            content rootdir,backup,images,vztmpl,iso
            maxfiles 0
    
    drbd: drbd1
            content rootdir,images
            redundancy 2
            nodes px2,px1
    
    lvm: vg1
            vgname VG1
            content images,rootdir
            shared
            nodes px1,px2
    /etc/drbdmanaged.cfg
    Code:
    [GLOBAL]
    storage-plugin = drbdmanage.storage.lvm.Lvm
    
    [LOCAL]
    drbdctrl-vg = VG1
    Code:
    root@px1:/etc# drbdmanage list-volumes
    +--------------------------------------------------+
    | Name          | Vol ID |   Size | Minor |  State |
    |--------------------------------------------------|
    | vm-102-disk-1 |      0 | 409600 |   101 |     ok |
    | vm-103-disk-1 |      0 | 409600 |   102 |     ok |
    | vm-105-disk-1 |      0 |  32768 |   100 |     ok |
    +--------------------------------------------------+
    root@px1:/etc# drbdmanage list-resources
    +-----------------------+
    | Name          | State |
    |-----------------------|
    | vm-102-disk-1 |    ok |
    | vm-103-disk-1 |    ok |
    | vm-105-disk-1 |    ok |
    +-----------------------+
    root@px1:/etc# drbdmanage list-assignments
    +----------------------------------------+
    | Node | Resource      | Vol ID |  State |
    |----------------------------------------|
    | px1  | vm-102-disk-1 |      * |     ok |
    | px1  | vm-103-disk-1 |      * |     ok |
    | px1  | vm-105-disk-1 |      * |     ok |
    | px2  | vm-102-disk-1 |      * |     ok |
    | px2  | vm-103-disk-1 |      * |     ok |
    | px2  | vm-105-disk-1 |      * |     ok |
    +----------------------------------------+
    
     
  2. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,191
    Likes Received:
    493
    The VM 100 is configured to use the LVM storage ("Jun 08 08:27:01 ERROR: Failed to sync data - can't migrate 'vg1:vm-100-disk-1_00' - storage type 'lvm' not supported"), so naturally DRBD is not used for it - as you can see in the drbdmanage output, only the disks belonging to the VMs with IDs 102, 103 and 105 are listed there. When you set the LVM storage to "shared", Proxmox believes you and does not copy anything ("shared" means that the storage is available on all nodes with the same content). BUT the LVM storage itself is not shared (only those volumes on it which are replicated by DRBD), so while the migration seems to work (the shared disks are simply skipped), starting the VM on the target node will not, because the disk is not available there. Migrating the DRBD-managed VMs (102, 103, 105) should work as expected.

    Do you need to use the volume group as plain LVM storage? If not, I would remove the "vg1" storage after moving all the disks to the "drbd1" storage. I would not recommend such a mixed setup in any case.
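    Moving a disk could look roughly like this - a sketch only, the disk slot is just a guess and should be taken from the actual VM config:

    Code:
    # sketch: move the disk of VM 100 from the 'vg1' LVM storage to 'drbd1'
    # (the slot 'ide0' is an assumption - check "qm config 100" first)
    qm move_disk 100 ide0 drbd1
    # the old volume then stays behind as an "unused" disk in the VM config
    # and can be removed afterwards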

    Also please note that you deviate from Proxmox's regular DRBD9 default by using plain LVM instead of LVM-thin. Furthermore, DRBD9 is only a technology preview in Proxmox, so please don't rely on it for production use.
     
  3. dmp

    dmp New Member

    Joined:
    Jun 8, 2016
    Messages:
    9
    Likes Received:
    0
    Thanks for your answer, Fabian.

    I'm so sorry that I pasted the wrong drbdmanage outputs. Disk vm-100-disk-1 is not on LVM; it's on DRBD. This was my fault and I'm going to get the correct drbdmanage output soon.

    But even though the disk is on DRBD (which it is), a migration to host px2 is not possible because of that error, which makes no sense at all.

    I think the problem is that the DRBD pool is backed by the big volume group (VG1) instead of a dedicated VG named drbdpool. We know that this is not a normal setup, but it should still be possible to migrate even with a non-shared LVM storage.
     
  4. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,191
    Likes Received:
    493
    Your migration output that I quoted ("Jun 08 08:27:01 ERROR: Failed to sync data - can't migrate 'vg1:vm-100-disk-1_00' - storage type 'lvm' not supported") explicitly says that the migration does not work because the disk is on LVM storage, which is not yet supported for migration (the patches are currently being discussed on pve-devel). I think you confused something here when setting up the VM, which is also why I asked for the VM config (which will probably have 'vg1:vm-100-disk-1_00' as one of its disks). You need to select the 'drbd1' storage if you want the disks to be created via DRBD/drbdmanage.
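    Allocating a disk on the DRBD storage is just a matter of picking 'drbd1' when creating it, e.g. (sketch only - VM ID, bus/slot and size are placeholders):

    Code:
    # sketch: add a new 32 GB disk allocated on the 'drbd1' storage
    qm set 105 --virtio1 drbd1:32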

    Can you please post the configuration of a DRBD VM ("qm config ID") and the output of the migration for that VM? If you have changed any of the files you posted earlier, please also post the updated versions.
     
  5. dmp

    dmp New Member

    Joined:
    Jun 8, 2016
    Messages:
    9
    Likes Received:
    0
    Let's just use VM 105 for this example. As shown in the output above, the disk of VM 105 is on drbd - right?

    So, when I try to migrate VM 105 from px1 to px2 it gives me this error:
    Code:
    Jun 08 13:27:34 starting migration of VM 105 to node 'px2' (172.19.80.102)
    Jun 08 13:27:34 copying disk images
    Jun 08 13:27:34 ERROR: Failed to sync data - unable to migrate 'testVG:vm-105-disk-1_00' to 'testVG:vm-105-disk-1_00' on host '172.19.80.102' - source type 'lvm' not implemented
    Jun 08 13:27:34 aborting phase 1 - cleanup resources
    Jun 08 13:27:34 ERROR: found stale volume copy 'testVG:vm-105-disk-1_00' on node 'px2'
    Jun 08 13:27:34 ERROR: migration aborted (duration 00:00:00): Failed to sync data - unable to migrate 'testVG:vm-105-disk-1_00' to 'testVG:vm-105-disk-1_00' on host '172.19.80.102' - source type 'lvm' not implemented
    TASK ERROR: migration aborted
    Unfortunately "qm config 105" just gives me an error.
    Code:
    root@px1:/etc# qm config 105
    "my" variable $volid masks earlier declaration in same statement at /usr/share/perl5/PVE/QemuMigrate.pm line 258.
    "my" variable $parent masks earlier declaration in same statement at /usr/share/perl5/PVE/QemuMigrate.pm line 258.
    "my" variable $parent masks earlier declaration in same statement at /usr/share/perl5/PVE/QemuMigrate.pm line 258.
    syntax error at /usr/share/perl5/PVE/QemuMigrate.pm line 256, near ") {"
    Global symbol "$volhash" requires explicit package name at /usr/share/perl5/PVE/QemuMigrate.pm line 262.
    Global symbol "$self" requires explicit package name at /usr/share/perl5/PVE/QemuMigrate.pm line 264.
    Global symbol "$self" requires explicit package name at /usr/share/perl5/PVE/QemuMigrate.pm line 265.
    Global symbol "$self" requires explicit package name at /usr/share/perl5/PVE/QemuMigrate.pm line 265.
    syntax error at /usr/share/perl5/PVE/QemuMigrate.pm line 267, near "}"
    syntax error at /usr/share/perl5/PVE/QemuMigrate.pm line 269, near "}"
    Can't use global @_ in "my" at /usr/share/perl5/PVE/QemuMigrate.pm line 272, near "= @_"
    Global symbol "$self" requires explicit package name at /usr/share/perl5/PVE/QemuMigrate.pm line 274.
    Global symbol "$vmid" requires explicit package name at /usr/share/perl5/PVE/QemuMigrate.pm line 274.
    Global symbol "$self" requires explicit package name at /usr/share/perl5/PVE/QemuMigrate.pm line 274.
    Global symbol "$self" requires explicit package name at /usr/share/perl5/PVE/QemuMigrate.pm line 274.
    syntax error at /usr/share/perl5/PVE/QemuMigrate.pm line 284, near "}"
    /usr/share/perl5/PVE/QemuMigrate.pm has too many errors.
    Compilation failed in require at /usr/share/perl5/PVE/API2/Qemu.pm line 18.
    BEGIN failed--compilation aborted at /usr/share/perl5/PVE/API2/Qemu.pm line 18.
    Compilation failed in require at /usr/share/perl5/PVE/CLI/qm.pm line 20.
    BEGIN failed--compilation aborted at /usr/share/perl5/PVE/CLI/qm.pm line 20.
    Compilation failed in require at /usr/sbin/qm line 6.
    BEGIN failed--compilation aborted at /usr/sbin/qm line 6.
    Here's the actual storage.cfg:
    Code:
    dir: local
            path /var/lib/vz
            maxfiles 0
            content images,rootdir,vztmpl,iso
    
    drbd: drbd1
            nodes px2,px1
            content rootdir,images
            redundancy 2
    
    lvm: vg1
            vgname VG1
            shared
            content rootdir,images
    
    lvm: testVG
            vgname VG1
            content images,rootdir
    testVG was just a test; the disk of VM 105 is only on drbd1 - see attachment.

    Sorry for the confusion.
     

    Attached Files:

  6. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,191
    Likes Received:
    493
    I am not sure how you ended up there - did you manually edit the source code of Proxmox modules, or did you run into upgrade errors that you did not correct? The non-working "qm config" points to a rather broken installation.

    Anyhow: your migration log (again) clearly shows that your VM uses a disk on the storage "testVG", not "drbd1" (this is not something that we set/change anywhere unless explicitly asked for).
     
  7. dmp

    dmp New Member

    Joined:
    Jun 8, 2016
    Messages:
    9
    Likes Received:
    0
    Code:
    root@px1:/etc# qm config 105
    bootdisk: ide0
    cores: 1
    ide0: drbd1:vm-105-disk-1,size=32G
    ide2: none,media=cdrom
    memory: 512
    name: test2
    net0: bridge=vmbr1,e1000=36:30:64:37:35:37
    numa: 0
    ostype: l26
    smbios1: uuid=689e9200-06a2-4347-abe9-9a09e55176a2
    sockets: 1
    Fixed that - it was indeed my fault; I had /usr/share/perl5/PVE/QemuMigrate.pm open in the background, though I'm not sure why that caused this error.

    Anyway: as you can see, vm-105-disk-1 is definitely stored on drbd1, not on testVG, and that's the main problem in this case: Proxmox nevertheless thinks it is stored on testVG, maybe because drbd1 and vg1/testVG are using the same volume group (VG1).
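    If it helps, I could also compare what each storage definition enumerates - since drbd1, vg1 and testVG all sit on top of VG1, I'd expect the LVM storages to list the drbdmanage-created volumes (the ones with the _00 suffix) as well, which would explain the mixup. Just the commands as a sketch, I can post the output if needed:

    Code:
    # what the DRBD storage reports
    pvesm list drbd1
    # what the plain LVM storages defined on the same VG report -
    # if vm-105-disk-1_00 shows up here too, the two definitions overlap
    pvesm list vg1
    pvesm list testVG
    # the underlying logical volumes
    lvs VG1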
     
  8. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,191
    Likes Received:
    493
    I have a suspicion where this originates from, but I will have to do some tests to confirm. Like I said, it is definitely not recommended to share a VG in this way (and the usual caveat about DRBD9 still applies ;)).
     
  9. dmp

    dmp New Member

    Joined:
    Jun 8, 2016
    Messages:
    9
    Likes Received:
    0
    Thanks! Glad you now understand the problem, as it wasn't easy to explain.

    Of course it's not supported or recommended, but this isn't normal behaviour, so it should be fixed, I guess. :)
     
  10. dmp

    dmp New Member

    Joined:
    Jun 8, 2016
    Messages:
    9
    Likes Received:
    0
    Any news on this topic?
     
  11. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,191
    Likes Received:
    493
    I am pretty sure where this originates from, but just to get a complete picture: could you post the output of "lvs", and for each VM ID note whether it is configured to use the DRBD storage or the LVM storage?

    I think the fix for the mixup should be straightforward (but on the other hand I am still not convinced such a setup is a good idea, so if possible I would recommend using separate volume groups as the DRBD backing store and for direct LVM usage).
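    Such a split could look roughly like this - a sketch only: "drbdpool" is just the drbdmanage default name for a dedicated backing VG, it would have to exist on both nodes, and depending on the drbdmanage version the LVM storage plugin may also need to be pointed at that VG.

    /etc/drbdmanaged.cfg
    Code:
    [GLOBAL]
    storage-plugin = drbdmanage.storage.lvm.Lvm

    [LOCAL]
    drbdctrl-vg = drbdpool
    /etc/pve/storage.cfg (relevant part)
    Code:
    drbd: drbd1
            nodes px2,px1
            content rootdir,images
            redundancy 2

    lvm: vg1
            vgname VG1
            content images,rootdir
            nodes px1,px2
    That way the DRBD-managed volumes and the plain LVM volumes can no longer end up being enumerated by the same storage definition.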
     
  12. dmp

    dmp New Member

    Joined:
    Jun 8, 2016
    Messages:
    9
    Likes Received:
    0
    Sure!

    Code:
    root@px1:~# lvs
      LV               VG   Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
      .drbdctrl_0      VG1  -wi-ao----   4.00m
      .drbdctrl_1      VG1  -wi-ao----   4.00m
      SYSTEM           VG1  -wi-ao----  93.13g
      vm-100-disk-1    VG1  -wi-a----- 400.00g
      vm-102-disk-1_00 VG1  -wi-ao---- 400.09g
      vm-103-disk-1_00 VG1  -wi-ao---- 400.09g
    VM 100 is configured to use LVM.
    VM 102 is configured to use DRBD.
    VM 103 is configured to use DRBD.

    Yup, I'm sure it's not the best approach, but for us it's the only one we can work with. Long story... ;)
     
  13. dmp

    dmp New Member

    Joined:
    Jun 8, 2016
    Messages:
    9
    Likes Received:
    0
    News on this one?
     
  14. fabian

    fabian Proxmox Staff Member
    Staff Member

    Joined:
    Jan 7, 2016
    Messages:
    3,191
    Likes Received:
    493
  15. dmp

    dmp New Member

    Joined:
    Jun 8, 2016
    Messages:
    9
    Likes Received:
    0
    Awesome, thank you very much!
     