Live migration fails since (probably) qemu 2.11

Sralityhe

Well-Known Member
Jul 5, 2017
Hello Forum,

since the upgrade to qemu 2.11 (I think the issue lies there; I can't say for sure, but it came with the last "monthly update" I did) I get a strange error while performing a live migration.

At first, the migration works but still outputs this error:
Code:
all mirroring jobs are ready
2018-04-13 13:48:37 starting online/live migration on unix:/run/qemu-server/127.migrate
2018-04-13 13:48:37 migrate_set_speed: 8589934592
2018-04-13 13:48:37 migrate_set_downtime: 0.1
2018-04-13 13:48:37 set migration_caps
2018-04-13 13:48:37 set cachesize: 67108864
2018-04-13 13:48:37 start migrate command to unix:/run/qemu-server/127.migrate
2018-04-13 13:48:38 migration status: active (transferred 119332508, remaining 198565888), total 554508288)
2018-04-13 13:48:38 migration xbzrle cachesize: 67108864 transferred 0 pages 0 cachemiss 0 overflow 0
2018-04-13 13:48:39 migration status: active (transferred 236949574, remaining 75415552), total 554508288)
2018-04-13 13:48:39 migration xbzrle cachesize: 67108864 transferred 0 pages 0 cachemiss 0 overflow 0
2018-04-13 13:48:40 migration speed: 10.24 MB/s - downtime 24 ms
2018-04-13 13:48:40 migration status: completed
drive-scsi0: transferred: 5368709120 bytes remaining: 0 bytes total: 5368709120 bytes progression: 100.00 % busy: 0 ready: 1
all mirroring jobs are ready
drive-scsi0: Completing block job...
drive-scsi0: Completed successfully.
drive-scsi0 : finished
2018-04-13 13:48:41 ERROR: command '/usr/bin/ssh -e none -o 'BatchMode=yes' -o 'HostKeyAlias=pve03' root@10.14.0.112 qm nbdstop 127' failed: exit code 2
2018-04-13 13:48:45 ERROR: migration finished with problems (duration 00:00:57)
migration problems

The VM is migrated after that and seems to work "perfectly".

However, when I try to migrate back to the same node the VM was running on before, it fails:

Code:
all mirroring jobs are ready
2018-04-13 13:49:03 starting online/live migration on unix:/run/qemu-server/127.migrate
2018-04-13 13:49:03 migrate_set_speed: 8589934592
2018-04-13 13:49:03 migrate_set_downtime: 0.1
2018-04-13 13:49:03 set migration_caps
2018-04-13 13:49:03 set cachesize: 67108864
2018-04-13 13:49:03 start migrate command to unix:/run/qemu-server/127.migrate
2018-04-13 13:49:04 migration status: active (transferred 118150530, remaining 199753728), total 554508288)
2018-04-13 13:49:04 migration xbzrle cachesize: 67108864 transferred 0 pages 0 cachemiss 0 overflow 0
2018-04-13 13:49:05 migration status: active (transferred 235159834, remaining 77381632), total 554508288)
2018-04-13 13:49:05 migration xbzrle cachesize: 67108864 transferred 0 pages 0 cachemiss 0 overflow 0
2018-04-13 13:49:06 migration status error: failed
2018-04-13 13:49:06 ERROR: online migrate failure - aborting
2018-04-13 13:49:06 aborting phase 2 - cleanup resources
2018-04-13 13:49:06 migrate_cancel
drive-scsi0: Cancelling block job
drive-scsi0: Done.
2018-04-13 13:49:09 ERROR: migration finished with problems (duration 00:00:10)
migration problems

I tried to execute the failed command manually on the target node and got:

Code:
root@pve03:~# qm nbdstop 127
Undefined subroutine &PVE::QemuServer::nbd_stop called at /usr/share/perl5/PVE/CLI/qm.pm line 259.

Do you have any idea, or can anyone confirm this?

Kind regards
 
Mhmm, the nbd_stop sub was accidentally deleted; I already sent a patch to re-add it.

In the meantime, you can stop the NBD server on pve03 manually:

1. Go to the VM in the web interface.
2. Click on 'Monitor'.
3. Enter 'nbd_server_stop'.

This achieves the same thing.
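For reference, the same workaround can be done from a root shell on pve03 instead of the web interface; this is a sketch assuming VMID 127 as in the logs above:

```shell
# Open the QEMU human monitor for the VM (qm monitor is part of qemu-server):
qm monitor 127
# At the "qm> " prompt, stop the stale NBD server left over from the migration:
#   qm> nbd_server_stop
#   qm> quit
```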
If the migration back still does not work, please post the output of pveversion -v from both nodes, along with the VM config.
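To gather that information, you can run the following on each node (a sketch; the VMID 127 matches the logs above):

```shell
# Installed package versions (pve-qemu-kvm, qemu-server, etc.):
pveversion -v
# The configuration of the affected VM:
qm config 127
```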
 
Hi Dominik,

Thank you very much, it seems to work!
Also, big thanks for the patch :) It will probably land within the next few weeks or so?

Kind regards