"ERROR: can't do online migration - VM uses local disks" on shared storage

Discussion in 'Proxmox VE: Installation and configuration' started by tarax, Dec 14, 2012.

  1. tarax

    tarax Member

    Joined:
    Apr 2, 2010
    Messages:
    43
    Likes Received:
    1
    Hi,

    Slowly building my first production PVE HA cluster (2 nodes), I've solved all the troubles I've faced so far (and learned _a_lot_ along the way!), but I'm breaking my teeth on this last one :'(

    Nodes (pve01 and pve02, HP ML350, up-to-date PVE 2.2), network (2xGbEth bond0/vmbr0 for the LAN, 2xGbEth direct links bond1/vmbr1 w/ jumbo frames for storage) and storage (HW RAID, LVM VG on DRBD on LVM LV) are set up, the cluster is created and the nodes are joined, fencing (domain and ipmilan devices) is configured and barely tested. Only the qdisk is left to be configured.

    About the storage configuration
    Code:
    2xIntel 520 240Gb SSDs
      \_HP SA P410 w/ 512M FBWC RAID1 array with accelerator disabled as recommended by HP
        \_ /dev/sda3 configured as LVM physical volume
          \_ vgpve01vms: LVM VG for virtual machines storage
            \_ lvdrbd0vm101: DRBD backing LV for VM 101 (for resizable DRBD devices etc.)
              \_ drbd0: Pri/Pri DRBD device dedicated to VM 101 (for isolated DRBD mgmt of each VM)
                \_ vgdrbd0vm101: "vm101" PVE _shared_ storage (dedicated to VM 101)
                  \_ vm-101-disk-1: VM 101 KVM disk device configured as "virtio0: vm101:vm-101-disk-1"
    
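    For reference, building such a stack boils down to commands roughly like these (just a sketch, assuming DRBD 8.3-era tooling; the resource name "drbd0vm101" and the sizes are only examples, and the DRBD resource file itself still has to be written by hand on both nodes):
    Code:
    # on both nodes: backing LV for the DRBD resource
    lvcreate -L 20G -n lvdrbd0vm101 vgpve01vms
    drbdadm create-md drbd0vm101
    drbdadm up drbd0vm101
    
    # on one node only: initial sync, then promote (the peer is promoted once in sync)
    drbdadm -- --overwrite-data-of-peer primary drbd0vm101
    
    # on one node: the per-VM VG that PVE will use as shared storage
    pvcreate /dev/drbd0
    vgcreate vgdrbd0vm101 /dev/drbd0
    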
    VM 101 is a KVM W7 VM configured as follows:
    Code:
    bootdisk: virtio0
    cores: 2
    cpu: host
    keyboard: fr
    memory: 2048
    name: wseven
    net0: virtio=8E:CA:D1:58:82:A0,bridge=vmbr0
    ostype: win7
    sockets: 1
    virtio0: vm101:vm-101-disk-1
    
    DRBD is working nicely through its dedicated network.
    Code:
     0: cs:Connected ro:Primary/Primary ds:UpToDate/UpToDate C r-----
        ns:21734972 nr:0 dw:13289664 dr:131091039 al:2372 bm:1298 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
    
    Network configuration seems stable and performs well according to iperf.
    PVE storage configuration seems all right and is consistent on both nodes. The "vm101" VG is indeed marked as shared and available to nodes pve01 and pve02 in storage.cfg on both servers:
    Code:
    lvm: vm101
            vgname vgdrbd0vm101
            shared
            content images
            nodes pve01,pve02
    
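    For what it's worth, a quick way to double-check that both nodes see the storage the same way (assuming the standard PVE 2.x CLI tools) is:
    Code:
    # run on each node
    pvesm status
    pvesm list vm101
    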
    The VM starts, runs, stops and is backed up (to RDX) smoothly...

    BUT I can't migrate it because of the following error:
    Code:
    Dec 14 12:02:26 starting migration of VM 101 to node 'pve02' (192.168.100.20)
    Dec 14 12:02:26 copying disk images
    Dec 14 12:02:26 ERROR: Failed to sync data - can't do online migration - VM uses local disks
    Dec 14 12:02:26 aborting phase 1 - cleanup resources
    Dec 14 12:02:26 ERROR: migration aborted (duration 00:00:00): Failed to sync data - can't do online migration - VM uses local disks
    TASK ERROR: migration aborted
    
    I may have made a mistake, but I really can't see which one, as this worked on my PoC setup (which I sadly can't access ATM :'( ). The PoC was built exactly the same way (LVM VG on DRBD on LVM LV on soft RAID) and live migration of (W$, Linux, FreeBSD) VMs worked without a hitch.

    But The Beast keeps telling me I'm using local disks, and I can't figure out why?!
    The only suspect I can think of is CLVM, which I would have expected to be running... but is not. As the few relevant mentions of it (that it replaces PVE's own lock mechanism in v2 on one side, and that it's only needed in v2 in rare circumstances on the other) seem a little contradictory to me, I thought I'd better ask for enlightenment before struggling further.

    Thank you in advance for your answers and comments.
    Time for a comforting 12 year ol' dram...
    Have a nice WE
    Bests
     
  2. udo

    udo Well-Known Member
    Proxmox VE Subscriber

    Joined:
    Apr 22, 2009
    Messages:
    5,691
    Likes Received:
    147
    Hi,
    I also don't see the problem - but to me your config is much too complicated and, above all, too inflexible.
    I prefer:
    Code:
    disks -> Raid-controller -> volume [sdb]
    sdb1 - drbd0 - lvm a_ssd_r0 (lvm-storage for all VMs on node a on drbd resource r0)
    sdb2 - drbd1 - lvm b_ssd_r1 (lvm-storage for all VMs on node b on drbd resource r1)
    
    If one VM needs the full space - why not, but normally you can share the DRBD space among several VMs, and expanding a VM disk is very easy (see the sketch below).
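    A rough sketch of what I mean by "very easy" (assuming my layout above and that 'qm rescan' refreshes the size entry; the guest still has to grow its own filesystem afterwards):
    Code:
    # grow the LV backing the VM disk on the shared VG
    lvextend -L +8G /dev/a_ssd_r0/vm-151-disk-1
    # let PVE pick up the new size in the VM config
    qm rescan -vmid 151
    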

    One curious thing: your VM disk doesn't have its size in the config. Normally with PVE 2.2 you should have that entry, like "virtio0: a_sas_r0:vm-151-disk-1,size=8G".

    Udo
     
  3. dietmar

    dietmar Proxmox Staff Member
    Staff Member

    Joined:
    Apr 28, 2005
    Messages:
    16,211
    Likes Received:
    269
    Code:
    Dec 14 12:02:26 ERROR: Failed to sync data - can't do online migration - VM uses local disks
    
    Most likely there is some stale disk somewhere. Try to run:

    # qm rescan --vmid 101

    After that you should see all disks in the VM config.
     
  4. mmenaz

    mmenaz Member

    Joined:
    Jun 25, 2009
    Messages:
    735
    Likes Received:
    4
    Are you sure your VM is not using an ISO as a CD? I know nothing about clusters and HA, but when I back up a VM with an ISO attached (like the virtio ISO or an installation ISO I forgot to detach) and try to restore it on another Proxmox server, I get an error.
    So go into the hardware config of the VM, or check 101.conf for that.
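    For example, something like this should show any leftover CD-ROM or ISO entry (path as on PVE 2.x):
    Code:
    grep -E 'ide|cdrom|iso' /etc/pve/qemu-server/101.conf
    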
     
  5. tarax

    tarax Member

    Joined:
    Apr 2, 2010
    Messages:
    43
    Likes Received:
    1
    Hello,

    Dietmar, you made my day !!!
    Code:
    # qm rescan -vmid 101
    # git diff pve/nodes/pve01/qemu-server/101.conf
    diff --git a/pve/nodes/pve01/qemu-server/101.conf b/pve/nodes/pve01/qemu-server/101.conf
    index 103a2bd..5b57b8d 100644
    --- a/pve/nodes/pve01/qemu-server/101.conf
    +++ b/pve/nodes/pve01/qemu-server/101.conf
    @@ -1,11 +1,11 @@
     bootdisk: virtio0
     cores: 2
     cpu: host
    -ide2: none,media=cdrom,size=44690K
     keyboard: fr
     memory: 2048
     name: wseven
     net0: virtio=8E:CA:D1:58:82:A0,bridge=vmbr0
     ostype: win7
     sockets: 1
    -virtio0: vm101:vm-101-disk-1
    +unused0: local:101/vm-101-disk-1.raw
    +virtio0: vm101:vm-101-disk-1,size=18G
    
    Then remove the once-deleted, back-from-the-grave ide2 and unused0 devices!
    And after that:
    Code:
    Dec 15 15:07:02 starting migration of VM 101 to node 'pve02' (192.168.100.20)
    Dec 15 15:07:02 copying disk images
    Dec 15 15:07:02 starting VM 101 on remote node 'pve02'
    Dec 15 15:07:03 starting migration tunnel
    Dec 15 15:07:04 starting online/live migration on port 60000
    Dec 15 15:07:16 migration speed: 170.67 MB/s
    Dec 15 15:07:16 migration status: completed
    Dec 15 15:07:19 migration finished successfuly (duration 00:00:17)
    TASK OK
    
    This nevertheless leads me to the following question: where were these ide2 and unused0 devices left stale? Is there some kind of caching somewhere? Is this due to pmxcfs internals, where the FS view can become different from the DB content?

    Anyway, you can be sure I won't forget this one, 'cause it's bitten me hard!
    Thanks again Dietmar for the speed and efficiency of your support.
    Have a _very_ nice WE !

    PS: When switching a cdrom to not use any media, shouldn't the size be set to 0K or simply unset instead of retaining the size of the previous media ?
     
  6. tarax

    tarax Member

    Joined:
    Apr 2, 2010
    Messages:
    43
    Likes Received:
    1
    Hi Udo,

    Why is that inflexible? In which case(s)?

    With this setup, storage space allocation between the VMs of each node is carved in stone. How do you handle the case where you initially allocated 50/50 and later have one VM requiring more storage space than the others, leaving you in need of a 30/70 split?
    By making sdb a physical volume, creating a volume group for VM storage, and using logical volumes as DRBD backing devices, I can allocate space to any VM on any node. Plus, if a VM comes to need more space, an 'lvresize' followed by a 'drbdadm resize' does the job (sketched below).
    Finally, for device-stack creation/deletion, a small shell script with a handful of arguments automates and normalizes these operations.
    I agree this is a layer cake where one can get confused, but a good naming convention (I'm still not really satisfied with mine) and ideally a recent version of util-linux providing the almighty 'lsblk' command make things much clearer.
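    To give an idea, a resize goes roughly like this on my stack (a sketch only - "drbd0vm101" is an example resource name, the sizes are arbitrary, and the filesystem still has to be grown inside the guest):
    Code:
    # on both nodes: grow the backing LV of the DRBD device
    lvresize -L +10G /dev/vgpve01vms/lvdrbd0vm101
    
    # on one node only: DRBD grows to the new backing size
    drbdadm resize drbd0vm101
    
    # let LVM see the extra space, then grow the VM disk itself
    pvresize /dev/drbd0
    lvresize -L +10G /dev/vgdrbd0vm101/vm-101-disk-1
    qm rescan -vmid 101
    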

    Well spotted! Solved by Dietmar's 'qm rescan' advice :)

    Bests
     
  7. dietmar

    dietmar Proxmox Staff Member
    Staff Member

    Joined:
    Apr 28, 2005
    Messages:
    16,211
    Likes Received:
    269
    Seems a remove failed, but I have no idea why. Maybe you manually edited the VM config?

    what?

    yes
     
  8. tarax

    tarax Member

    Joined:
    Apr 2, 2010
    Messages:
    43
    Likes Received:
    1
    Yes I did ! Aren't we supposed to do so ?

    As I understood it, pmxcfs is a FUSE FS on top of some kind of distributed SQLite DB (but chances are I completely misunderstood!)... so I thought there could be situations where the FS view/file contents could be out of sync with the internal/underlying DB content, and 'qm rescan' would be a way to recover the synced state (maybe by reading straight from the DB).
    If I've totally lost myself here, be assured of my gratefulness if you care to shed some light on this... maybe by simply explaining what 'qm rescan' actually does?

    OK. Should I file some kind of ticket/bug report somewhere for this?
     
  9. udo

    udo Well-Known Member
    Proxmox VE Subscriber

    Joined:
    Apr 22, 2009
    Messages:
    5,691
    Likes Received:
    147
    Hi tarax,
    the main point of a virtualisation platform is being able to create VMs quickly. In your case, for each VM you must first define an LV and a DRBD resource, sync them, and so on.
    And if you have trouble with a node, or with the connection between the two, you must resync a lot of DRBD resources...
    Yes, that is a limit which you don't have. But I can live with it without problems (it depends on the resource planning).
    OK - with a script to do the job automatically it's perhaps also useful (but not for colleagues who need the GUI to create a VM).
    Have you made performance tests? If you have 20-30 VMs, some with more than one disk, I think there would be an impact, or not?

    Udo
     
  10. dietmar

    dietmar Proxmox Staff Member
    Staff Member

    Joined:
    Apr 28, 2005
    Messages:
    16,211
    Likes Received:
    269
    If you remove a drive manually from the config, you also need to remove the corresponding file manually.

    You simply removed something from the config without removing the corresponding file. 'qm rescan' detects that and re-adds it to the config.
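    Roughly, the clean way would have been something like this (a sketch, assuming the 'qm set -delete' and 'pvesm free' commands of PVE 2.x; the volume ID is the one from your diff):
    Code:
    # drop the drive entry from the VM config
    qm set 101 -delete ide2
    # and remove the orphaned volume itself, otherwise
    # a later 'qm rescan' will re-add it as unusedX
    pvesm free local:101/vm-101-disk-1.raw
    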


    Yes - please file a bug at bugzilla.proxmox.com
     
  11. tarax

    tarax Member

    Joined:
    Apr 2, 2010
    Messages:
    43
    Likes Received:
    1
    In this case it detects a non-existent CD-ROM with a bogus size left over from the last ISO image used (the VirtIO one, for instance), and a dummy disk image just meant for loading VirtIO drivers on fresh W$ installs. In both cases, once the job is done, I don't want to delete either the CD ISO or the dummy disk image, as I will use them again.
    Anyway, it's still not clear to me where those parameters got stuck, nor where/how 'qm rescan' reveals them.


    Done
     