Migrate VM problem: SSH error?

devawpz

Member
Sep 21, 2020
Coming from this thread, for information purposes.

To get this done without downtime, I migrated the VM to another node.

Migrating back, I got this error:


Code:
2020-10-06 19:32:44 ERROR: online migrate failure - aborting
2020-10-06 19:32:44 aborting phase 2 - cleanup resources
2020-10-06 19:32:44 migrate_cancel
drive-scsi0: Cancelling block job
channel 4: open failed: connect failed: open failed
channel 3: open failed: connect failed: open failed
channel 3: open failed: connect failed: open failed
channel 3: open failed: connect failed: open failed
drive-scsi0: Done.
2020-10-06 19:32:49 ERROR: migration finished with problems (duration 00:05:06)
TASK ERROR: migration problems


I've tried setting "AllowTcpForwarding yes" to allow local and remote port forwarding, but without success.
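For anyone who wants to double-check that the setting is actually in effect, here is a minimal Python sketch that reads sshd_config text and reports the global AllowTcpForwarding value. It relies on two documented sshd behaviours: the first value found for a keyword is the one used, and the default is "yes" when the keyword is absent. The function name and the sample text are made up for illustration, and Match blocks are skipped, so treat the result as a hint only.

```python
def effective_allow_tcp_forwarding(config_text: str) -> str:
    """Return the global AllowTcpForwarding value from sshd_config text.

    Simplification: parsing stops at the first Match block, so per-user or
    per-host overrides are not considered.
    """
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = line.split()
        keyword = parts[0].lower()  # sshd keywords are case-insensitive
        if keyword == "match":
            break  # the global section ends here
        if keyword == "allowtcpforwarding" and len(parts) > 1:
            return parts[1].lower()  # first occurrence wins
    return "yes"  # sshd default when the keyword is not set


# Hypothetical excerpt: a later "yes" does not override an earlier "no".
sample = """
# hypothetical sshd_config excerpt
AllowTcpForwarding no
AllowTcpForwarding yes
"""
print(effective_allow_tcp_forwarding(sample))
```

If an earlier line already sets the option, a "yes" appended further down has no effect, and the daemon also has to reload its configuration before a change applies. The check is worth running on both the source and the target node.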

Is my cluster screwed up?

I'd be grateful for any help.
 
Here are some logs in case they help. Part 1/3:


Code:
2020-10-11 01:56:55 starting migration of VM 105 to node 'alpha' (110.225.10.131)
2020-10-11 01:56:55 found local disk 'local-raid:105/vm-105-disk-0.qcow2' (in current VM config)
2020-10-11 01:56:55 copying local disk images
2020-10-11 01:56:55 starting VM 105 on remote node 'alpha'
2020-10-11 01:57:00 start remote tunnel
2020-10-11 01:57:01 ssh tunnel ver 1
2020-10-11 01:57:01 starting storage migration
2020-10-11 01:57:01 scsi0: start migration to nbd:unix:/run/qemu-server/105_nbd.migrate:exportname=drive-scsi0
drive mirror is starting for drive-scsi0
drive-scsi0: transferred: 127926272 bytes remaining: 68591550464 bytes total: 68719476736 bytes progression: 0.19 % busy: 1 ready: 0
drive-scsi0: transferred: 242221056 bytes remaining: 68477255680 bytes total: 68719476736 bytes progression: 0.35 % busy: 1 ready: 0
drive-scsi0: transferred: 357564416 bytes remaining: 68361912320 bytes total: 68719476736 bytes progression: 0.52 % busy: 1 ready: 0
drive-scsi0: transferred: 479199232 bytes remaining: 68240277504 bytes total: 68719476736 bytes progression: 0.70 % busy: 1 ready: 0
drive-scsi0: transferred: 598736896 bytes remaining: 68120739840 bytes total: 68719476736 bytes progression: 0.87 % busy: 1 ready: 0
drive-scsi0: transferred: 719323136 bytes remaining: 68000153600 bytes total: 68719476736 bytes progression: 1.05 % busy: 1 ready: 0
drive-scsi0: transferred: 839909376 bytes remaining: 67879567360 bytes total: 68719476736 bytes progression: 1.22 % busy: 1 ready: 0
drive-scsi0: transferred: 962592768 bytes remaining: 67756883968 bytes total: 68719476736 bytes progression: 1.40 % busy: 1 ready: 0
drive-scsi0: transferred: 1081081856 bytes remaining: 67638394880 bytes total: 68719476736 bytes progression: 1.57 % busy: 1 ready: 0
drive-scsi0: transferred: 1206910976 bytes remaining: 67512565760 bytes total: 68719476736 bytes progression: 1.76 % busy: 1 ready: 0
drive-scsi0: transferred: 1328545792 bytes remaining: 67390930944 bytes total: 68719476736 bytes progression: 1.93 % busy: 1 ready: 0
drive-scsi0: transferred: 1454374912 bytes remaining: 67265101824 bytes total: 68719476736 bytes progression: 2.12 % busy: 1 ready: 0
drive-scsi0: transferred: 2320498688 bytes remaining: 66398978048 bytes total: 68719476736 bytes progression: 3.38 % busy: 1 ready: 0
drive-scsi0: transferred: 2443182080 bytes remaining: 66276294656 bytes total: 68719476736 bytes progression: 3.56 % busy: 1 ready: 0
drive-scsi0: transferred: 2979004416 bytes remaining: 65740472320 bytes total: 68719476736 bytes progression: 4.34 % busy: 1 ready: 0
drive-scsi0: transferred: 3096444928 bytes remaining: 65623031808 bytes total: 68719476736 bytes progression: 4.51 % busy: 1 ready: 0
drive-scsi0: transferred: 3215982592 bytes remaining: 65503494144 bytes total: 68719476736 bytes progression: 4.68 % busy: 1 ready: 0
drive-scsi0: transferred: 3335520256 bytes remaining: 65383956480 bytes total: 68719476736 bytes progression: 4.85 % busy: 1 ready: 0
drive-scsi0: transferred: 3463446528 bytes remaining: 65256030208 bytes total: 68719476736 bytes progression: 5.04 % busy: 1 ready: 0
drive-scsi0: transferred: 3580887040 bytes remaining: 65138589696 bytes total: 68719476736 bytes progression: 5.21 % busy: 1 ready: 0
drive-scsi0: transferred: 3708813312 bytes remaining: 65010663424 bytes total: 68719476736 bytes progression: 5.40 % busy: 1 ready: 0
drive-scsi0: transferred: 3827302400 bytes remaining: 64892174336 bytes total: 68719476736 bytes progression: 5.57 % busy: 1 ready: 0
drive-scsi0: transferred: 3943694336 bytes remaining: 64775782400 bytes total: 68719476736 bytes progression: 5.74 % busy: 1 ready: 0
drive-scsi0: transferred: 4502585344 bytes remaining: 64216891392 bytes total: 68719476736 bytes progression: 6.55 % busy: 1 ready: 0
drive-scsi0: transferred: 4623171584 bytes remaining: 64096305152 bytes total: 68719476736 bytes progression: 6.73 % busy: 1 ready: 0
drive-scsi0: transferred: 6680477696 bytes remaining: 62038999040 bytes total: 68719476736 bytes progression: 9.72 % busy: 1 ready: 0
drive-scsi0: transferred: 6793723904 bytes remaining: 61925752832 bytes total: 68719476736 bytes progression: 9.89 % busy: 1 ready: 0
drive-scsi0: transferred: 10737418240 bytes remaining: 57982058496 bytes total: 68719476736 bytes progression: 15.62 % busy: 1 ready: 0

<cut>
 
Part 2/3:

Code:
<cut>


drive-scsi0: transferred: 23904387072 bytes remaining: 44816859136 bytes total: 68721246208 bytes progression: 34.78 % busy: 1 ready: 0
drive-scsi0: transferred: 24015536128 bytes remaining: 44705710080 bytes total: 68721246208 bytes progression: 34.95 % busy: 1 ready: 0
drive-scsi0: transferred: 24135073792 bytes remaining: 44586172416 bytes total: 68721246208 bytes progression: 35.12 % busy: 1 ready: 0
drive-scsi0: transferred: 24254611456 bytes remaining: 44466634752 bytes total: 68721246208 bytes progression: 35.29 % busy: 1 ready: 0
drive-scsi0: transferred: 24373100544 bytes remaining: 44348145664 bytes total: 68721246208 bytes progression: 35.47 % busy: 1 ready: 0
drive-scsi0: transferred: 24494735360 bytes remaining: 44226510848 bytes total: 68721246208 bytes progression: 35.64 % busy: 1 ready: 0
drive-scsi0: transferred: 24623710208 bytes remaining: 44097536000 bytes total: 68721246208 bytes progression: 35.83 % busy: 1 ready: 0
drive-scsi0: transferred: 24742199296 bytes remaining: 43979046912 bytes total: 68721246208 bytes progression: 36.00 % busy: 1 ready: 0
drive-scsi0: transferred: 24861736960 bytes remaining: 43859509248 bytes total: 68721246208 bytes progression: 36.18 % busy: 1 ready: 0
drive-scsi0: transferred: 24980226048 bytes remaining: 43741020160 bytes total: 68721246208 bytes progression: 36.35 % busy: 1 ready: 0
drive-scsi0: transferred: 25099763712 bytes remaining: 43621482496 bytes total: 68721246208 bytes progression: 36.52 % busy: 1 ready: 0
drive-scsi0: transferred: 25219301376 bytes remaining: 43501944832 bytes total: 68721246208 bytes progression: 36.70 % busy: 1 ready: 0
drive-scsi0: transferred: 26305626112 bytes remaining: 42415620096 bytes total: 68721246208 bytes progression: 38.28 % busy: 1 ready: 0
drive-scsi0: transferred: 51522830336 bytes remaining: 17198415872 bytes total: 68721246208 bytes progression: 74.97 % busy: 1 ready: 0
drive-scsi0: transferred: 68721246208 bytes remaining: 0 bytes total: 68721246208 bytes progression: 100.00 % busy: 0 ready: 1
all mirroring jobs are ready
2020-10-11 01:59:25 volume 'local-raid:105/vm-105-disk-0.qcow2' is 'local-raid:105/vm-105-disk-0.qcow2' on the target
2020-10-11 01:59:25 starting online/live migration on unix:/run/qemu-server/105.migrate
2020-10-11 01:59:25 set migration_caps
2020-10-11 01:59:25 migration speed limit: 8589934592 B/s
2020-10-11 01:59:25 migration downtime limit: 100 ms
2020-10-11 01:59:25 migration cachesize: 2147483648 B
2020-10-11 01:59:25 set migration parameters
2020-10-11 01:59:25 start migrate command to unix:/run/qemu-server/105.migrate
2020-10-11 01:59:26 migration status: active (transferred 103873617, remaining 17052528640), total 17197768704)
2020-10-11 01:59:26 migration xbzrle cachesize: 2147483648 transferred 41270374 pages 62487 cachemiss 372065 overflow 327
2020-10-11 01:59:27 migration status: active (transferred 221913800, remaining 16930635776), total 17197768704)

<cut>
 
Part 3/3:


Code:
<cut>

2020-10-11 02:01:48 migration xbzrle cachesize: 2147483648 transferred 41270374 pages 62487 cachemiss 415180 overflow 327
2020-10-11 02:01:48 migration status: active (transferred 16740636437, remaining 20156416), total 17197768704)
2020-10-11 02:01:48 migration xbzrle cachesize: 2147483648 transferred 41270374 pages 62487 cachemiss 418205 overflow 327
2020-10-11 02:01:48 migration status: active (transferred 16753239864, remaining 28352512), total 17197768704)
2020-10-11 02:01:48 migration xbzrle cachesize: 2147483648 transferred 41270374 pages 62487 cachemiss 421276 overflow 327
2020-10-11 02:01:48 migration status error: failed
2020-10-11 02:01:48 ERROR: online migrate failure - aborting
2020-10-11 02:01:48 aborting phase 2 - cleanup resources
2020-10-11 02:01:48 migrate_cancel
drive-scsi0: Cancelling block job
channel 4: open failed: connect failed: open failed
channel 3: open failed: connect failed: open failed
channel 3: open failed: connect failed: open failed
channel 3: open failed: connect failed: open failed
drive-scsi0: Done.
2020-10-11 02:02:02 ERROR: migration finished with problems (duration 00:05:08)
TASK ERROR: migration problems
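One detail in the last two status lines of Part 3: the "remaining" value rises from 20156416 to 28352512 although more data had been transferred. That is what one would expect when the guest keeps writing to memory during the transfer, but whether it is related to the failure is not established here. The Python sketch below scans log lines for such increases. The regular expression is written against the line format shown in this thread, and the helper names are made up for illustration.

```python
import re

# Matches lines such as:
# "... migration status: active (transferred 1, remaining 2), total 3)"
STATUS_RE = re.compile(
    r"migration status: active \(transferred (\d+), remaining (\d+)\), total (\d+)\)"
)


def parse_status(line):
    """Return the three byte counters of a status line, or None if no match."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    transferred, remaining, total = map(int, m.groups())
    return {"transferred": transferred, "remaining": remaining, "total": total}


def remaining_increases(lines):
    """Yield (previous, current) pairs where 'remaining' went up."""
    prev = None
    for line in lines:
        status = parse_status(line)
        if status is None:
            continue  # not a status line
        if prev is not None and status["remaining"] > prev:
            yield prev, status["remaining"]
        prev = status["remaining"]


# The last two status lines from Part 3 above.
log = [
    "2020-10-11 02:01:48 migration status: active (transferred 16740636437, remaining 20156416), total 17197768704)",
    "2020-10-11 02:01:48 migration status: active (transferred 16753239864, remaining 28352512), total 17197768704)",
]
print(list(remaining_increases(log)))  # [(20156416, 28352512)]
```

Run over the full task log, this shows how often the counter moved backwards before the status changed to "failed".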
 
I'm still looking for some indication of what is going on here. If someone can help, I would really appreciate it, as I'm stuck right now.

In my opinion the most important part is the third, where the migration fails, since this doesn't happen when I migrate other VMs. In Part 3 above, this is the last line before the error:

Code:
2020-10-11 02:01:48 migration xbzrle cachesize: 2147483648 transferred 41270374 pages 62487 cachemiss 421276 overflow 327


On other VMs, when I migrate successfully, it appears like this:


Code:
2020-10-11 08:12:10 migration xbzrle cachesize: 536870912 transferred 0 pages 0 cachemiss 1976 overflow 0

Notice how "transferred", "pages" and "overflow" all show zero, unlike in the failed migration.
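To make the comparison easier to read, the two xbzrle lines quoted above can be parsed and printed side by side. This is only a Python sketch against the line format shown in this thread; the names are made up for illustration and it says nothing about the cause of the failure.

```python
import re

XBZRLE_RE = re.compile(
    r"migration xbzrle cachesize: (\d+) transferred (\d+) pages (\d+) "
    r"cachemiss (\d+) overflow (\d+)"
)
FIELDS = ("cachesize", "transferred", "pages", "cachemiss", "overflow")


def parse_xbzrle(line):
    """Return the xbzrle counters of a log line as a dict, or None."""
    m = XBZRLE_RE.search(line)
    if m is None:
        return None
    return dict(zip(FIELDS, map(int, m.groups())))


# The two lines quoted above: failed migration vs. a successful one.
failed = parse_xbzrle(
    "2020-10-11 02:01:48 migration xbzrle cachesize: 2147483648 "
    "transferred 41270374 pages 62487 cachemiss 421276 overflow 327"
)
ok = parse_xbzrle(
    "2020-10-11 08:12:10 migration xbzrle cachesize: 536870912 "
    "transferred 0 pages 0 cachemiss 1976 overflow 0"
)

for name in FIELDS:
    print(f"{name:<12} failed={failed[name]:<12} ok={ok[name]}")
```

Besides the three counters, the cache size differs as well: 2147483648 bytes (2 GiB) on the failing VM versus 536870912 bytes (512 MiB) on the one that migrates fine.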

The rest of the lines show a successful status (on the other VMs):

Code:
2020-10-11 08:12:10 migration speed: 28.44 MB/s - downtime 73 ms
2020-10-11 08:12:0 migration status: completed

Could this point to the cause of the failed migration, with the connection errors then being a consequence of the migration having already failed?

Would appreciate any help.
 
I have the exact same issue.

I noticed that the VM can now only be migrated offline. Online migration to any node is then broken.
 