[SOLVED] VM migration failed

Discussion in 'Proxmox VE: Installation and configuration' started by Claudiu Popescu, Dec 14, 2018.

  1. Claudiu Popescu

    Claudiu Popescu New Member
    Proxmox Subscriber

    Joined:
    Jul 24, 2018
    Messages:
    9
    Likes Received:
    0
    Hi,

    I finished migrating VMs from a Proxmox 4 server to a 5.3 one and it worked fine. Now I have another server from which I want to migrate VMs, but I am facing a problem.

    Code:
    send from @ to rpool/ROOT/subvol-125-disk-1@__migration__ estimated size is 610M
    total estimated size is 610M
    TIME        SENT   SNAPSHOT
    cannot receive new filesystem stream: destination 'rpool/ROOT/subvol-125-disk-1' exists
    must specify -F to overwrite it
    zfs send/receive failed, cleaning up snapshot(s)..
    could not find any snapshots to destroy; check snapshot names.
    could not remove target snapshot: command 'ssh root@IP zfs destroy rpool/ROOT/subvol-125-disk-1@__migration__' failed: exit code 1
    
    Dec 14 13:35:12 ERROR: command 'set -o pipefail && zfs send -Rpv rpool/ROOT/subvol-125-disk-1@__migration__ | ssh root@IP zfs recv rpool/ROOT/subvol-125-disk-1' failed: exit code 1
    Dec 14 13:35:12 aborting phase 1 - cleanup resources
    Dec 14 13:35:12 ERROR: found stale volume copy 'z1local:subvol-125-disk-1' on node 'node5'
    Dec 14 13:35:12 ERROR: found stale volume copy 'z2local:subvol-125-disk-1' on node 'node5'
    Dec 14 13:35:12 start final cleanup
    Dec 14 13:35:12 start container on target node
    Dec 14 13:35:12 # /usr/bin/ssh -o 'BatchMode=yes' root@IP pct start 125
    Dec 14 13:35:13 Configuration file 'nodes/node5/lxc/125.conf' does not exist
    Dec 14 13:35:13 ERROR: command '/usr/bin/ssh -o 'BatchMode=yes' root@IP pct start 125' failed: exit code 255
    Dec 14 13:35:13 ERROR: migration aborted (duration 00:00:31): command 'set -o pipefail && zfs send -Rpv rpool/ROOT/subvol-125-disk-1@__migration__ | ssh root@IP zfs recv rpool/ROOT/subvol-125-disk-1' failed: exit code 1
    migration aborted
    
    I am sure that volume was not present on the destination server.
    Any ideas what could be wrong? Thank you.
     
  2. wolfgang

    wolfgang Proxmox Staff Member
    Staff Member

    Joined:
    Oct 1, 2014
    Messages:
    4,454
    Likes Received:
    285
    Hi,

    The output looks like the disk exists on the destination server.
    Please double-check with:

    Code:
    zfs list -rt all rpool
    
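    If it does show up there, a leftover subvolume from the earlier failed attempt can be removed on the destination before you retry (destructive, so double-check the name; it is taken from your log):
    Code:
    zfs destroy -r rpool/ROOT/subvol-125-disk-1
    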
     
  3. Claudiu Popescu

    Claudiu Popescu New Member
    Proxmox Subscriber

    Joined:
    Jul 24, 2018
    Messages:
    9
    Likes Received:
    0
    I checked that prior to the migration and the disk was not present.
    I deleted the disk and retried the migration: same issue. I also tried to migrate a new container with no disk present on the destination server: same issue.
     
  4. wolfgang

    wolfgang Proxmox Staff Member
    Staff Member

    Joined:
    Oct 1, 2014
    Messages:
    4,454
    Likes Received:
    285
    How do you do the migration?
    If you use the command line please send the full command.
     
  5. Claudiu Popescu

    Claudiu Popescu New Member
    Proxmox Subscriber

    Joined:
    Jul 24, 2018
    Messages:
    9
    Likes Received:
    0
    I tried both the command line and the web UI.
    CLI command: pct migrate 125 node5 --restart

    I am not sure where to specify -F ("must specify -F to overwrite it"); I can't find it in the man pages.

    When the disk is not present on the destination server, the migration takes longer since it transfers the disk, but it eventually fails with that error.
    If I retry the migration with the disk already present on the destination server, it fails instantly without transferring the disk.
     
  6. wolfgang

    wolfgang Proxmox Staff Member
    Staff Member

    Joined:
    Oct 1, 2014
    Messages:
    4,454
    Likes Received:
    285
    -F is a ZFS receive option; it cannot be set for the migration.

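    For reference, -F would go on a manual zfs receive on the target, roughly like the pipe from your log (illustrative only; it forces the receive to overwrite the existing destination dataset, so it is not something the migration will set for you):
    Code:
    zfs send -Rpv rpool/ROOT/subvol-125-disk-1@__migration__ | ssh root@IP zfs recv -F rpool/ROOT/subvol-125-disk-1
    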
    Do you have multiple mount points?
    Please send the output of:
    Code:
    pct config 125
    
     
  7. Claudiu Popescu

    Claudiu Popescu New Member
    Proxmox Subscriber

    Joined:
    Jul 24, 2018
    Messages:
    9
    Likes Received:
    0
    Just one mount point and this is the case for all my containers/VMs.
    Code:
    arch: amd64
    cpulimit: 1
    cpuunits: 1024
    hostname: CT125
    memory: 512
    net0: name=eth0,bridge=vmbr1,hwaddr=66:32:32:62:35:32,type=veth
    onboot: 1
    ostype: ubuntu
    rootfs: z2local:subvol-125-disk-1,size=8G
    swap: 512
     
  8. Claudiu Popescu

    Claudiu Popescu New Member
    Proxmox Subscriber

    Joined:
    Jul 24, 2018
    Messages:
    9
    Likes Received:
    0
    Since I had no way of migrating my workload, I decided to upgrade the older node to the latest Proxmox version available and reboot.
    After this process I still see the same error:

    Code:
    2019-02-14 09:48:18 starting migration of CT 125 to node 'z8' (IP)
    2019-02-14 09:48:19 found local volume 'z1local:subvol-125-disk-1' (via storage)
    2019-02-14 09:48:19 found local volume 'z2local:subvol-125-disk-1' (in current VM config)
    full send of rpool/ROOT/subvol-125-disk-1@__migration__ estimated size is 610M
    total estimated size is 610M
    TIME SENT SNAPSHOT
    09:48:21 52.3M rpool/ROOT/subvol-125-disk-1@__migration__
    09:48:22 61.7M rpool/ROOT/subvol-125-disk-1@__migration__
    09:48:23 61.7M rpool/ROOT/subvol-125-disk-1@__migration__
    09:48:24 78.6M rpool/ROOT/subvol-125-disk-1@__migration__
    09:48:25 129M rpool/ROOT/subvol-125-disk-1@__migration__
    09:48:26 134M rpool/ROOT/subvol-125-disk-1@__migration__
    09:48:27 149M rpool/ROOT/subvol-125-disk-1@__migration__
    09:48:28 210M rpool/ROOT/subvol-125-disk-1@__migration__
    09:48:29 304M rpool/ROOT/subvol-125-disk-1@__migration__
    09:48:30 411M rpool/ROOT/subvol-125-disk-1@__migration__
    09:48:31 507M rpool/ROOT/subvol-125-disk-1@__migration__
    09:48:32 605M rpool/ROOT/subvol-125-disk-1@__migration__
    full send of rpool/ROOT/subvol-125-disk-1@__migration__ estimated size is 610M
    total estimated size is 610M
    TIME SENT SNAPSHOT
    rpool/ROOT/subvol-125-disk-1 name rpool/ROOT/subvol-125-disk-1 -
    volume 'rpool/ROOT/subvol-125-disk-1' already exists
    command 'zfs send -Rpv -- rpool/ROOT/subvol-125-disk-1@__migration__' failed: got signal 13
    send/receive failed, cleaning up snapshot(s)..
    2019-02-14 09:48:34 ERROR: command 'set -o pipefail && pvesm export z1local:subvol-125-disk-1 zfs - -with-snapshots 0 -snapshot __migration__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=z8' root@IP -- pvesm import z1local:subvol-125-disk-1 zfs - -with-snapshots 0 -delete-snapshot __migration__' failed: exit code 255
    2019-02-14 09:48:34 aborting phase 1 - cleanup resources
    2019-02-14 09:48:34 ERROR: found stale volume copy 'z2local:subvol-125-disk-1' on node 'z8'
    2019-02-14 09:48:34 ERROR: found stale volume copy 'z1local:subvol-125-disk-1' on node 'z8'
    2019-02-14 09:48:34 start final cleanup
    2019-02-14 09:48:34 ERROR: migration aborted (duration 00:00:16): command 'set -o pipefail && pvesm export z1local:subvol-125-disk-1 zfs - -with-snapshots 0 -snapshot __migration__ | /usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=z8' root@IP -- pvesm import z1local:subvol-125-disk-1 zfs - -with-snapshots 0 -delete-snapshot __migration__' failed: exit code 255
    TASK ERROR: migration aborted
    
    The volume is created on the destination as soon as the first __migration__ disk transfer starts. The error appears after that transfer completes, because it then tries to create the same disk again.
    So I guess the problem is that it does not create the volume as rpool/ROOT/subvol-125-disk-1@__migration__ but directly as the final destination, as shown by zfs list: rpool/ROOT/subvol-125-disk-1 367M 7.64G 367M /rpool/ROOT/subvol-125-disk-1
    I am really at a loss here and can't figure it out. I need your support to get past this.
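    In case it is useful, the volumes that each of the two storage IDs from the log reports can be listed with the standard pvesm commands:
    Code:
    pvesm list z1local
    pvesm list z2local
    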

    Thank you.
     
  9. wolfgang

    wolfgang Proxmox Staff Member
    Staff Member

    Joined:
    Oct 1, 2014
    Messages:
    4,454
    Likes Received:
    285
    Now I see the problem.
    You have two storage definitions on the same subset of the pool, so PVE tries to migrate the same image twice.
    You should not have two storage definitions pointing at the same subset of a pool.
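    The overlap should be visible in /etc/pve/storage.cfg. A configuration roughly like the following (storage names taken from your logs; the pool lines are what the log output implies) makes PVE find every subvolume twice, once per storage ID:
    Code:
    zfspool: z1local
            pool rpool/ROOT
            content rootdir,images

    zfspool: z2local
            pool rpool/ROOT
            content rootdir,images
    
    Point each storage at its own dataset (for example rpool/ROOT/z1data and rpool/ROOT/z2data, names only illustrative), or remove one of the entries, so each volume is only found once during migration.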
     
  10. Claudiu Popescu

    Claudiu Popescu New Member
    Proxmox Subscriber

    Joined:
    Jul 24, 2018
    Messages:
    9
    Likes Received:
    0
    Ok, this is solved now, thank you.
     