Storage migration fails at 99.86%

mgiammarco

Well-Known Member
Feb 18, 2010
Hello,
I have a three-node Proxmox 5.1-36 licensed cluster with Ceph.
I am using "move disk" to move some VM disks to another Ceph pool.
I started with the first two VMs and their disks migrated fine.
The third VM has two disks. I tried to migrate the first disk several times, but each time it reached around 99.86% and then the operation was cancelled immediately and automatically.
I then tried to move the disk with the VM in a stopped state and this time I got a different error: "exit code 1".
I have tried the same command I see in the GUI log on the CLI, and it says "error reading from... no such file or directory".
The path was the new RBD image in the new pool, so I can understand that it could not find it if the RBD image had not been created beforehand.
Can you help me?
Where is a good log I can read?
Thanks,
Mario
 
Is the pool you are moving the disk to big enough (incl. replication)? You can check the syslog for details. Does the VM disk move if you shut down the VM?
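For example (just a generic check, nothing specific to this setup), per-pool usage and the space still available can be inspected with:
Code:
# per-pool usage and remaining capacity (MAX AVAIL already accounts for replication)
ceph df detail
# replication factor of the target pool (replace <pool> with the pool name)
ceph osd pool get <pool> size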
 
Actually, if I try to move the disk with the VM shut down the result is worse: I get an "exit code 1" error, as I said. If you tell me where these errors are located, I can paste the exact error.
I have enough space for the migration.
 
You will find some output in the cluster log and you may see something in the syslog that might give a hint on what is happening. As a low-level solution, you can use 'rbd copy default/vm-100-disk-1 test1/vm-100-disk-1'.
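As a rough sketch of that low-level route (pool and image names are just the placeholders from the example above; after such a copy the disk reference in the VM config would still have to be updated by hand):
Code:
# copy the image into the target pool at the RBD level
rbd copy default/vm-100-disk-1 test1/vm-100-disk-1
# verify that the copy is present in the target pool
rbd info test1/vm-100-disk-1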
 
I have found the command that Proxmox generates when I try the storage migration with the VM SHUT DOWN.
Here it is with the error (sorry for the bad formatting):

Code:
/usr/bin/qemu-img convert -p -n -f raw -O raw
'rbd:tank/vm-101-disk1:mon_host=10.2.17.1;10.2.17.2;10.2.17.3:auth_supported=cephx:id=admin:keyring=/etc/pve/priv/ceph/new-tank.keyring'
'zeroinit:rbd:tankC/vm-101-disk-1:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/tankC_vm.keyring'
qemu-img: Could not open 'zeroinit:rbd:tankC/vm-101-disk-1:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/tankC_vm.keyring'

: error reading header from vm-101-disk-1: No such file or directory
 
'zeroinit:rbd:tankC/vm-101-disk-1:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/tankC_vm.keyring'
qemu-img: Could not open 'zeroinit:rbd:tankC/vm-101-disk-1:conf=/etc/pve/ceph.conf:id=admin:keyring=/etc/pve/priv/ceph/tankC_vm.keyring'
Here is the culprit: the naming, 'tankC' vs 'tankC_vm'. PVE is not able to connect to the pool, as it expects the keyring to have the same name as the storage.
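A sketch of the workaround, assuming the keyrings on this cluster are all the same admin keyring (as the id=admin in the command above suggests): copy an existing keyring to the name PVE derives from the storage ID.
Code:
# PVE looks for /etc/pve/priv/ceph/<storage-id>.keyring
# copy an existing keyring so the 'tankC_vm' storage finds one under its own name
cp /etc/pve/priv/ceph/new-tank.keyring /etc/pve/priv/ceph/tankC_vm.keyring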
 
Thank you for this precious hint. I will try to copy the keyring and see what happens.
Now I have these considerations:
- Is this a bug? The command was generated automatically by Proxmox. Also, this is the first time I have created a pool telling Proxmox to automatically create the _ct and _vm storage entries.
- Only this VM had the problem. Other VMs migrated without problems. But please note that I have NOT tried to migrate other VMs offline.
 
Ah, sorry, I didn't pick up :confused: that you are on 5.1 with the _ct & _vm storage entries. As you have already migrated other VMs, there is no connectivity issue to the pool. Did you apply any non-standard configuration to the VM?
 
It is a VM (like the others...) that I converted from physical, with two disks. I am trying to find out what happened, but as far as I know I have not done anything strange with it.
 
Can you please post your storage.cfg and your vmid.conf? And does a disk move to local storage and then to the Ceph pool work?
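On the CLI that two-step move could look something like this (using the VM ID, disk and storage names that appear further down in the thread):
Code:
# move the disk to local storage first ...
qm move_disk 101 virtio0 local-lvm
# ... and then on to the new ceph storage
qm move_disk 101 virtio0 tankC_vm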
 
Here is storage.cfg:
Code:
dir: local
        path /var/lib/vz
        content backup,iso,vztmpl

lvmthin: local-lvm
        thinpool data
        vgname pve
        content images,rootdir

nfs: OLD
        export /e/new-bk
        path /mnt/pve/OLD
        server 10.2.17.250
        content images,iso,backup,vztmpl,rootdir
        maxfiles 10
        options vers=3

nfs: backupproxmox
        export /volume1/backupproxmox
        path /mnt/pve/backupproxmox
        server 10.2.17.231
        content rootdir,vztmpl,backup,images
        maxfiles 4
        options vers=3

nfs: backupceph
        export /volume1/backupceph
        path /mnt/pve/backupceph
        server 10.2.17.231
        content images
        maxfiles 1
        options vers=3

nfs: ISO
        export /volume1/iso
        path /mnt/pve/ISO
        server 10.2.17.231
        content iso
        maxfiles 1
        options vers=3

rbd: new-tank
        content images
        krbd 0
        monhost 10.2.17.1,10.2.17.2,10.2.17.3
        pool tank
        username admin

rbd: tankC_vm
        content images
        krbd 0
        pool tankC

rbd: tankC_ct
        content rootdir
        krbd 1
        pool tankC

and vmid.conf:

Code:
agent: 1
boot: cdn
bootdisk: virtio0
cores: 16
ide2: none,media=cdrom
memory: 30000
name: Navision
net0: virtio=46:7C:4B:55:C7:F4,bridge=vmbr0
numa: 0
ostype: win7
scsihw: virtio-scsi-pci
smbios1: uuid=2ec30733-101e-4b9b-9ecb-43a52516fbcc
sockets: 1
virtio0: new-tank:vm-101-disk-1,size=279G
virtio1: new-tank:vm-101-disk-2,size=558G

Unfortunately I do not have free space in local storage. I will try to see what I can do; I understand it is a good idea.
 
Are both new-tank and tankC_vm on the same cluster? And what are the RBD features of the VM disk (rbd info tank/vm-101-disk-1)?
 
Are both new-tank and tankC_vm on the same cluster? And what are the RBD features of the VM disk (rbd info tank/vm-101-disk-1)?
Yes, they are both on the same cluster.
Here is the rbd info:
Code:
rbd image 'vm-101-disk-1':
        size 279 GB in 71518 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.1dc5c3643c9869
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        flags:
        create_timestamp: Thu Aug 17 00:52:04 2017

and the rbd info of another VM:

Code:
rbd info tankC/vm-102-disk-1
rbd image 'vm-102-disk-1':
        size 436 GB in 111616 objects
        order 22 (4096 kB objects)
        block_name_prefix: rbd_data.60dae0643c9869
        format: 2
        features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
        flags:
        create_timestamp: Sat Nov  4 21:25:19 2017
 
Does your Ceph pool have a quota? Are the migrated machines from the same server?
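For reference, an existing quota on the target pool could be checked with something like:
Code:
# show any max_objects / max_bytes quota set on the pool
ceph osd pool get-quota tankC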
 
