[SOLVED] Guest migration via ceph copies disk to local-storage

rh-xZa2W

Active Member
Jun 17, 2020
Hi guys,

I have a single Debian VM with the guest agent installed, running on a 3-node cluster (each node has 2 OSDs forming the Ceph cluster, rpool1).

When I tried to online-migrate the VM, I got the following (you'll notice later that the disk does not live on local storage but on Ceph (pool1), and the local disk mentioned below is not configured on the VM at all, it only happens to have the same name):
Code:
found local disk 'local:100/vm-100-disk-0.qcow2' (via storage)

So it copies the disk to the target node's local storage, as you can see below:
Code:
task started by HA resource agent
2023-06-28 11:04:43 starting migration of VM 100 to node 'host2' (10.1.1.2)
2023-06-28 11:04:43 found local disk 'local:100/vm-100-disk-0.qcow2' (via storage)
2023-06-28 11:04:43 copying local disk images
2023-06-28 11:04:44 file '/var/lib/vz/images/100/vm-100-disk-0.qcow2' already exists - importing with a different name
2023-06-28 11:04:44 Formatting '/var/lib/vz/images/100/vm-100-disk-1.qcow2', fmt=qcow2 cluster_size=65536 extended_l2=off preallocation=metadata compression_type=zlib size=107374182400 lazy_refcounts=off refcount_bits=16
2023-06-28 11:10:53 21320267+0 records in
2023-06-28 11:10:53 21320267+0 records out
2023-06-28 11:10:53 87327813632 bytes (87 GB, 81 GiB) copied, 369.342 s, 236 MB/s
2023-06-28 11:10:53 successfully imported 'local:100/vm-100-disk-1.qcow2'
2023-06-28 11:10:53 679+5327654 records in
2023-06-28 11:10:53 679+5327654 records out
2023-06-28 11:10:53 87327813632 bytes (87 GB, 81 GiB) copied, 366.817 s, 238 MB/s
2023-06-28 11:10:53 volume 'local:100/vm-100-disk-0.qcow2' is 'local:100/vm-100-disk-1.qcow2' on the target
2023-06-28 11:10:53 starting VM 100 on remote node 'host2'
2023-06-28 11:10:54 start remote tunnel
2023-06-28 11:10:55 ssh tunnel ver 1
2023-06-28 11:10:55 starting online/live migration on unix:/run/qemu-server/100.migrate
2023-06-28 11:10:55 set migration capabilities
2023-06-28 11:10:55 migration downtime limit: 100 ms
2023-06-28 11:10:55 migration cachesize: 4.0 GiB
2023-06-28 11:10:55 set migration parameters
2023-06-28 11:10:55 start migrate command to unix:/run/qemu-server/100.migrate
2023-06-28 11:10:56 migration active, transferred 271.8 MiB of 32.0 GiB VM-state, 500.1 MiB/s
2023-06-28 11:10:57 migration active, transferred 499.4 MiB of 32.0 GiB VM-state, 430.6 MiB/s
2023-06-28 11:10:58 migration active, transferred 726.2 MiB of 32.0 GiB VM-state, 226.3 MiB/s
2023-06-28 11:10:59 migration active, transferred 953.5 MiB of 32.0 GiB VM-state, 230.8 MiB/s
2023-06-28 11:11:00 migration active, transferred 1.2 GiB of 32.0 GiB VM-state, 231.0 MiB/s
2023-06-28 11:11:01 migration active, transferred 1.4 GiB of 32.0 GiB VM-state, 229.4 MiB/s
2023-06-28 11:11:02 migration active, transferred 1.6 GiB of 32.0 GiB VM-state, 238.3 MiB/s
2023-06-28 11:11:03 migration active, transferred 1.8 GiB of 32.0 GiB VM-state, 231.4 MiB/s
2023-06-28 11:11:04 migration active, transferred 2.0 GiB of 32.0 GiB VM-state, 231.0 MiB/s
2023-06-28 11:11:05 migration active, transferred 2.3 GiB of 32.0 GiB VM-state, 242.9 MiB/s
2023-06-28 11:11:06 migration active, transferred 2.5 GiB of 32.0 GiB VM-state, 245.4 MiB/s
2023-06-28 11:11:07 migration active, transferred 2.7 GiB of 32.0 GiB VM-state, 238.2 MiB/s
2023-06-28 11:11:08 migration active, transferred 2.9 GiB of 32.0 GiB VM-state, 240.5 MiB/s
2023-06-28 11:11:09 migration active, transferred 3.1 GiB of 32.0 GiB VM-state, 240.5 MiB/s
2023-06-28 11:11:10 migration active, transferred 3.4 GiB of 32.0 GiB VM-state, 252.4 MiB/s
2023-06-28 11:11:11 migration active, transferred 3.6 GiB of 32.0 GiB VM-state, 233.6 MiB/s
2023-06-28 11:11:12 migration active, transferred 3.8 GiB of 32.0 GiB VM-state, 231.6 MiB/s
2023-06-28 11:11:13 migration active, transferred 4.0 GiB of 32.0 GiB VM-state, 240.6 MiB/s
2023-06-28 11:11:14 migration active, transferred 4.3 GiB of 32.0 GiB VM-state, 238.2 MiB/s
2023-06-28 11:11:15 migration active, transferred 4.5 GiB of 32.0 GiB VM-state, 235.9 MiB/s
2023-06-28 11:11:16 migration active, transferred 4.7 GiB of 32.0 GiB VM-state, 252.4 MiB/s
2023-06-28 11:11:17 migration active, transferred 4.9 GiB of 32.0 GiB VM-state, 2.6 GiB/s
2023-06-28 11:11:18 migration active, transferred 5.1 GiB of 32.0 GiB VM-state, 250.1 MiB/s
2023-06-28 11:11:19 migration active, transferred 5.4 GiB of 32.0 GiB VM-state, 3.0 GiB/s
2023-06-28 11:11:20 migration active, transferred 5.6 GiB of 32.0 GiB VM-state, 247.7 MiB/s
2023-06-28 11:11:21 migration active, transferred 5.8 GiB of 32.0 GiB VM-state, 3.3 GiB/s
2023-06-28 11:11:22 migration active, transferred 6.0 GiB of 32.0 GiB VM-state, 222.3 MiB/s
2023-06-28 11:11:23 migration active, transferred 6.2 GiB of 32.0 GiB VM-state, 223.4 MiB/s
2023-06-28 11:11:24 average migration speed: 1.1 GiB/s - downtime 169 ms
2023-06-28 11:11:24 migration status: completed
2023-06-28 11:11:29 migration finished successfully (duration 00:06:46)
TASK OK

Please note that I have an old disk lying around on both the source and the target node's local storage, so the process noticed that and imported the disk under a different name (again, the VM's disk lives on Ceph's rpool1 and not on the local storage "/var/lib/vz/"):
Code:
 file '/var/lib/vz/images/100/vm-100-disk-0.qcow2' already exists - importing with a different name
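
To double-check which images actually sit on the local storage for this VMID, something like the following should list them (just a sketch; the storage name 'local' and VMID 100 are taken from the log above):
Code:
root@host1:~# pvesm list local --vmid 100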

The VM config looks like this:
Code:
root@host1:~# qm config 100
agent: 1
balloon: 0
boot: order=scsi0;ide2;net0
cores: 16
ide2: none,media=cdrom
memory: 32768
meta: creation-qemu=7.0.0,ctime=1668695399
name: hostname
net0: virtio=mac-address,bridge=vmbr1,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: pool1:vm-100-disk-0,cache=writeback,size=100G
scsihw: virtio-scsi-pci
smbios1: uuid=690ce6fa-5769-421f-8599-77f67a9677b1
sockets: 1
vmgenid: 3776fc1f-0ee8-44ea-ae2c-3c9c4fbe9445

The Ceph cluster status looks like this:
Code:
root@host1:~# ceph -s
  cluster:
    id:     3c755622-85c4-43c7-8ce7-8fe5bd704478
    health: HEALTH_OK
 
  services:
    mon: 3 daemons, quorum host1,host2,host3 (age 21h)
    mgr: host1(active, since 21h), standbys: host2, host3
    osd: 6 osds: 6 up (since 21h), 6 in (since 4M)
 
  data:
    pools:   2 pools, 33 pgs
    objects: 25.43k objects, 99 GiB
    usage:   301 GiB used, 32 TiB / 33 TiB avail
    pgs:     33 active+clean
 
  io:
    client:   2.7 KiB/s wr, 0 op/s rd, 0 op/s wr

All nodes run identical package versions:
Code:
root@host1:~# pveversion -v
proxmox-ve: 7.4-1 (running kernel: 5.15.108-1-pve)
pve-manager: 7.4-15 (running version: 7.4-15/a5d2a31e)
pve-kernel-5.15: 7.4-4
pve-kernel-5.15.108-1-pve: 5.15.108-1
pve-kernel-5.15.83-1-pve: 5.15.83-1
ceph: 17.2.6-pve1
ceph-fuse: 17.2.6-pve1
corosync: 3.1.7-pve1
criu: 3.15-1+pve-1
glusterfs-client: 9.2-1
ifupdown2: 3.1.0-1+pmx4
libjs-extjs: 7.0.0-1
libknet1: 1.24-pve2
libproxmox-acme-perl: 1.4.4
libproxmox-backup-qemu0: 1.3.1-1
libproxmox-rs-perl: 0.2.1
libpve-access-control: 7.4.1
libpve-apiclient-perl: 3.2-1
libpve-common-perl: 7.4-2
libpve-guest-common-perl: 4.2-4
libpve-http-server-perl: 4.2-3
libpve-rs-perl: 0.7.7
libpve-storage-perl: 7.4-3
libspice-server1: 0.14.3-2.1
lvm2: 2.03.11-2.1
lxc-pve: 5.0.2-2
lxcfs: 5.0.3-pve1
novnc-pve: 1.4.0-1
proxmox-backup-client: 2.4.2-1
proxmox-backup-file-restore: 2.4.2-1
proxmox-kernel-helper: 7.4-1
proxmox-mail-forward: 0.1.1-1
proxmox-mini-journalreader: 1.3-1
proxmox-offline-mirror-helper: 0.5.2
proxmox-widget-toolkit: 3.7.3
pve-cluster: 7.3-3
pve-container: 4.4-6
pve-docs: 7.4-2
pve-edk2-firmware: 3.20230228-4~bpo11+1
pve-firewall: 4.3-4
pve-firmware: 3.6-5
pve-ha-manager: 3.6.1
pve-i18n: 2.12-1
pve-qemu-kvm: 7.2.0-8
pve-xtermjs: 4.16.0-2
qemu-server: 7.4-4
smartmontools: 7.2-pve3
spiceterm: 3.2-2
swtpm: 0.8.0~bpo11+3
vncterm: 1.7-1
zfsutils-linux: 2.1.11-pve

Is it a problem to have same-named disks lying around on local storage even though they aren't configured on the VM, which should only use the Ceph pool (rpool1)?

Thanks guys, it's probably a no-brainer, but maybe you have a hint.
 
Hi,
yes, in Proxmox VE 7 and earlier, migration scans all local storages for orphaned disks belonging to the VM and picks them up. Since that is rather confusing, the behavior was changed in Proxmox VE 8 (see the QEMU section, point about migration, in the known issues): https://pve.proxmox.com/wiki/Roadmap#8.0-known-issues

If you don't need the orphaned disk, remove it, or rename it to avoid interference with migration.
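
If you do remove it, a rough sketch could look like this (assuming the orphan really is 'local:100/vm-100-disk-0.qcow2' as in the migration log, run on each node that has a copy; note that pvesm free deletes the image permanently):
Code:
# list what sits on the 'local' storage for VMID 100
pvesm list local --vmid 100
# delete the orphaned image (irreversible)
pvesm free local:100/vm-100-disk-0.qcow2
If you'd rather rename, moving the file out of /var/lib/vz/images/100/ should also keep the migration scan from picking it up, since directory storages only associate images under that VMID's folder with the VM.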
 
