Hello!
First post, yay! We are in the process of setting up a cluster using this awesome software.
We've set up a 3-node cluster. As the backend for our VM disks we are currently running GlusterFS (a move to Ceph is underway).
While live-migrating a VM we bump into the following error every now and then:
Code:
2020-11-12 20:32:17 starting migration of VM 101 to node 'XXXXXXX' (x.x.x.x)
2020-11-12 20:32:17 starting VM 101 on remote node 'XXXXXXX'
2020-11-12 20:32:18 start remote tunnel
2020-11-12 20:32:19 ssh tunnel ver 1
2020-11-12 20:32:19 starting online/live migration on unix:/run/qemu-server/101.migrate
2020-11-12 20:32:19 set migration_caps
2020-11-12 20:32:19 migration speed limit: 8589934592 B/s
2020-11-12 20:32:19 migration downtime limit: 100 ms
2020-11-12 20:32:19 migration cachesize: 268435456 B
2020-11-12 20:32:19 set migration parameters
2020-11-12 20:32:19 start migrate command to unix:/run/qemu-server/101.migrate
2020-11-12 20:32:20 migration status: active (transferred 116961080, remaining 2042331136), total 2165121024)
2020-11-12 20:32:20 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:21 migration status: active (transferred 227347565, remaining 1926090752), total 2165121024)
2020-11-12 20:32:21 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:22 migration status: active (transferred 335118928, remaining 1810997248), total 2165121024)
2020-11-12 20:32:22 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:23 migration status: active (transferred 449655259, remaining 1690288128), total 2165121024)
2020-11-12 20:32:23 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:24 migration status: active (transferred 561048878, remaining 1577897984), total 2165121024)
2020-11-12 20:32:24 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:25 migration status: active (transferred 666718984, remaining 1470496768), total 2165121024)
2020-11-12 20:32:25 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:26 migration status: active (transferred 781827768, remaining 1352048640), total 2165121024)
2020-11-12 20:32:26 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:27 migration status: active (transferred 892257326, remaining 1238573056), total 2165121024)
2020-11-12 20:32:27 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:28 migration status: active (transferred 997598483, remaining 1131782144), total 2165121024)
2020-11-12 20:32:28 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:29 migration status: active (transferred 1113164386, remaining 1012162560), total 2165121024)
2020-11-12 20:32:29 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:30 migration status: active (transferred 1223998457, remaining 899096576), total 2165121024)
2020-11-12 20:32:30 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:31 migration status: active (transferred 1329081638, remaining 792301568), total 2165121024)
2020-11-12 20:32:31 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:32 migration status: active (transferred 1442324470, remaining 673230848), total 2165121024)
2020-11-12 20:32:32 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:33 migration status: active (transferred 1552935405, remaining 559210496), total 2165121024)
2020-11-12 20:32:33 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:34 migration status: active (transferred 1659646306, remaining 442191872), total 2165121024)
2020-11-12 20:32:34 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:35 migration status: active (transferred 1775577033, remaining 324268032), total 2165121024)
2020-11-12 20:32:35 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:36 migration status: active (transferred 1888253567, remaining 209465344), total 2165121024)
2020-11-12 20:32:36 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:37 migration status: active (transferred 2002826716, remaining 83193856), total 2165121024)
2020-11-12 20:32:37 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:37 migration status: active (transferred 2014116080, remaining 70406144), total 2165121024)
2020-11-12 20:32:37 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:37 migration status: active (transferred 2025801264, remaining 58253312), total 2165121024)
2020-11-12 20:32:37 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:37 migration status: active (transferred 2037826621, remaining 45969408), total 2165121024)
2020-11-12 20:32:37 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:37 migration status: active (transferred 2049654365, remaining 34164736), total 2165121024)
2020-11-12 20:32:37 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:37 migration status: active (transferred 2061482109, remaining 22360064), total 2165121024)
2020-11-12 20:32:37 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 1816 overflow 0
2020-11-12 20:32:37 migration status: active (transferred 2073500001, remaining 13221888), total 2165121024)
2020-11-12 20:32:37 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 3482 overflow 0
query migrate failed: VM 101 qmp command 'query-migrate' failed - client closed connection
2020-11-12 20:32:38 query migrate failed: VM 101 qmp command 'query-migrate' failed - client closed connection
query migrate failed: VM 101 not running
2020-11-12 20:32:39 query migrate failed: VM 101 not running
query migrate failed: VM 101 not running
2020-11-12 20:32:40 query migrate failed: VM 101 not running
query migrate failed: VM 101 not running
2020-11-12 20:32:41 query migrate failed: VM 101 not running
query migrate failed: VM 101 not running
2020-11-12 20:32:42 query migrate failed: VM 101 not running
query migrate failed: VM 101 not running
2020-11-12 20:32:43 query migrate failed: VM 101 not running
2020-11-12 20:32:43 ERROR: online migrate failure - too many query migrate failures - aborting
2020-11-12 20:32:43 aborting phase 2 - cleanup resources
2020-11-12 20:32:43 migrate_cancel
2020-11-12 20:32:43 migrate_cancel error: VM 101 not running
2020-11-12 20:32:45 ERROR: migration finished with problems (duration 00:00:28)
TASK ERROR: migration problems
The above error also causes the VM to go offline.
The servers are running everything on a single 1GbE NIC at the moment. We are currently waiting for a batch of new 2x10GbE NICs for the servers, and then we will separate all traffic. Could the shared NIC have something to do with it? The error always happens after the transfer has completed, though.
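To put a rough number on the 1GbE question, here is a minimal Python sketch that estimates the average transfer rate from the "migration status: active" lines in the first log above. It is not part of Proxmox: the helper name `average_rate` and the regular expression are my own assumptions, based only on the log format shown in this post, and the sample lines are copied from that log.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2020-11-12 20:32:20 migration status: active (transferred 116961080, remaining 2042331136), total 2165121024)
STATUS_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) migration status: active "
    r"\(transferred (\d+), remaining (\d+)\), total (\d+)\)"
)


def average_rate(log_text):
    """Return the average transfer rate in bytes per second.

    Uses the first and last 'migration status: active' samples found
    in log_text. Hypothetical helper, written for this post only.
    """
    samples = []
    for line in log_text.splitlines():
        match = STATUS_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            samples.append((stamp, int(match.group(2))))
    if len(samples) < 2:
        raise ValueError("need at least two status samples")
    (t0, b0), (t1, b1) = samples[0], samples[-1]
    seconds = (t1 - t0).total_seconds()
    if seconds <= 0:
        raise ValueError("samples must span a positive time interval")
    return (b1 - b0) / seconds


# Three status lines taken verbatim from the failing migration log.
SAMPLE_LOG = """
2020-11-12 20:32:20 migration status: active (transferred 116961080, remaining 2042331136), total 2165121024)
2020-11-12 20:32:28 migration status: active (transferred 997598483, remaining 1131782144), total 2165121024)
2020-11-12 20:32:36 migration status: active (transferred 1888253567, remaining 209465344), total 2165121024)
"""

rate = average_rate(SAMPLE_LOG)
print(f"{rate / 1e6:.1f} MB/s, {rate * 8 / 1e9:.2f} Gbit/s")
```

For these samples the rate comes out at roughly 0.89 Gbit/s, so the migration appears to be using close to the full 1GbE link that is also carrying the GlusterFS and cluster traffic. This only shows that the link is busy; it does not prove that it is the cause of the failure.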
Sometimes it just says something like this (but the VM stays online when this happens):
Code:
2020-11-12 20:14:21 starting migration of VM 101 to node 'XXXXXXXXX' (x.x.x.x)
2020-11-12 20:14:22 starting VM 101 on remote node 'XXXXXXXXXX'
2020-11-12 20:14:23 start remote tunnel
2020-11-12 20:14:24 ssh tunnel ver 1
2020-11-12 20:14:24 starting online/live migration on unix:/run/qemu-server/101.migrate
2020-11-12 20:14:24 set migration_caps
2020-11-12 20:14:24 migration speed limit: 8589934592 B/s
2020-11-12 20:14:24 migration downtime limit: 100 ms
2020-11-12 20:14:24 migration cachesize: 268435456 B
2020-11-12 20:14:24 set migration parameters
2020-11-12 20:14:24 start migrate command to unix:/run/qemu-server/101.migrate
2020-11-12 20:14:25 migration status: active (transferred 112043967, remaining 2047471616), total 2165121024)
2020-11-12 20:14:25 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:26 migration status: active (transferred 217177637, remaining 1936338944), total 2165121024)
2020-11-12 20:14:26 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:27 migration status: active (transferred 329286990, remaining 1816891392), total 2165121024)
2020-11-12 20:14:27 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:28 migration status: active (transferred 442813862, remaining 1697144832), total 2165121024)
2020-11-12 20:14:28 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:29 migration status: active (transferred 547443685, remaining 1591676928), total 2165121024)
2020-11-12 20:14:29 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:30 migration status: active (transferred 663283881, remaining 1473953792), total 2165121024)
2020-11-12 20:14:30 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:31 migration status: active (transferred 775638773, remaining 1358303232), total 2165121024)
2020-11-12 20:14:31 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:32 migration status: active (transferred 885616774, remaining 1245331456), total 2165121024)
2020-11-12 20:14:32 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:33 migration status: active (transferred 994590204, remaining 1134813184), total 2165121024)
2020-11-12 20:14:33 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:34 migration status: active (transferred 1106552588, remaining 1018884096), total 2165121024)
2020-11-12 20:14:34 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:35 migration status: active (transferred 1211361304, remaining 912138240), total 2165121024)
2020-11-12 20:14:35 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:36 migration status: active (transferred 1324329033, remaining 797130752), total 2165121024)
2020-11-12 20:14:36 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:37 migration status: active (transferred 1438043897, remaining 677556224), total 2165121024)
2020-11-12 20:14:37 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:38 migration status: active (transferred 1543380904, remaining 568930304), total 2165121024)
2020-11-12 20:14:38 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:39 migration status: active (transferred 1657495205, remaining 444620800), total 2165121024)
2020-11-12 20:14:39 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:40 migration status: active (transferred 1771698661, remaining 328187904), total 2165121024)
2020-11-12 20:14:40 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:41 migration status: active (transferred 1873987359, remaining 224030720), total 2165121024)
2020-11-12 20:14:41 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:42 migration status: active (transferred 1987290428, remaining 99909632), total 2165121024)
2020-11-12 20:14:42 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:42 migration status: active (transferred 1998258528, remaining 87965696), total 2165121024)
2020-11-12 20:14:42 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:42 migration status: active (transferred 2010311056, remaining 74498048), total 2165121024)
2020-11-12 20:14:42 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:42 migration status: active (transferred 2022132059, remaining 62033920), total 2165121024)
2020-11-12 20:14:42 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:42 migration status: active (transferred 2034128796, remaining 49729536), total 2165121024)
2020-11-12 20:14:42 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:42 migration status: active (transferred 2045931916, remaining 37949440), total 2165121024)
2020-11-12 20:14:42 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:42 migration status: active (transferred 2057759660, remaining 26144768), total 2165121024)
2020-11-12 20:14:42 migration xbzrle cachesize: 268435456 transferred 0 pages 0 cachemiss 0 overflow 0
2020-11-12 20:14:43 migration status error: failed
2020-11-12 20:14:43 ERROR: online migrate failure - aborting
2020-11-12 20:14:43 aborting phase 2 - cleanup resources
2020-11-12 20:14:43 migrate_cancel
2020-11-12 20:14:44 ERROR: migration finished with problems (duration 00:00:23)
TASK ERROR: migration problems
Any pointers?
Tell me if you need anything else. I'm pretty new to Proxmox.
Best regards
Marcus