Hello!
We are running a PVE 7.1-4 cluster with 2 nodes (and a third corosync device) and a dedicated 10 Gbit/s cluster link (192.168.0.0/24).
When we try to migrate a VM with heavy RAM usage (a video surveillance server), the migration fails every time: the VM is suddenly stopped and has to be restarted.
We are using the latest updates from the enterprise repo and also had the problem on 6.4 (we just completed the pve6to7 upgrade successfully).
Any ideas?

Code:
	task started by HA resource agent
2021-11-19 14:31:53 use dedicated network address for sending migration traffic (192.168.0.1)
2021-11-19 14:31:53 starting migration of VM 115 to node 'node1' (192.168.0.1)
2021-11-19 14:31:53 starting VM 115 on remote node 'node1'
2021-11-19 14:31:55 start remote tunnel
2021-11-19 14:31:55 ssh tunnel ver 1
2021-11-19 14:31:55 starting online/live migration on unix:/run/qemu-server/115.migrate
2021-11-19 14:31:55 set migration capabilities
2021-11-19 14:31:55 migration downtime limit: 100 ms
2021-11-19 14:31:55 migration cachesize: 2.0 GiB
2021-11-19 14:31:55 set migration parameters
2021-11-19 14:31:55 start migrate command to unix:/run/qemu-server/115.migrate
2021-11-19 14:31:56 migration active, transferred 178.7 MiB of 16.0 GiB VM-state, 293.3 MiB/s
2021-11-19 14:31:57 migration active, transferred 497.0 MiB of 16.0 GiB VM-state, 307.8 MiB/s
2021-11-19 14:31:58 migration active, transferred 797.6 MiB of 16.0 GiB VM-state, 319.8 MiB/s
2021-11-19 14:31:59 migration active, transferred 1.1 GiB of 16.0 GiB VM-state, 424.8 MiB/s
2021-11-19 14:32:06 migration active, transferred 2.8 GiB of 16.0 GiB VM-state, 315.6 MiB/s
2021-11-19 14:32:07 migration active, transferred 3.1 GiB of 16.0 GiB VM-state, 319.1 MiB/s
  ....
2021-11-19 14:32:20 migration active, transferred 6.6 GiB of 16.0 GiB VM-state, 364.4 MiB/s
2021-11-19 14:32:21 migration active, transferred 6.9 GiB of 16.0 GiB VM-state, 307.2 MiB/s
2021-11-19 14:32:42 migration active, transferred 12.2 GiB of 16.0 GiB VM-state, 274.3 MiB/s
2021-11-19 14:32:44 migration active, transferred 12.6 GiB of 16.0 GiB VM-state, 237.9 MiB/s
2021-11-19 14:32:45 migration active, transferred 12.9 GiB of 16.0 GiB VM-state, 250.1 MiB/s, VM dirties lots of memory: 262.4 MiB/s
2021-11-19 14:32:46 migration active, transferred 13.2 GiB of 16.0 GiB VM-state, 305.9 MiB/s
2021-11-19 14:32:47 migration active, transferred 13.4 GiB of 16.0 GiB VM-state, 356.3 MiB/s
2021-11-19 14:32:47 xbzrle: send updates to 63987 pages in 89.1 MiB encoded memory, cache-miss 93.40%, overflow 10581
2021-11-19 14:32:48 migration active, transferred 13.7 GiB of 16.0 GiB VM-state, 413.2 MiB/s
2021-11-19 14:32:48 xbzrle: send updates to 116032 pages in 225.8 MiB encoded memory, cache-miss 93.40%, overflow 35605
2021-11-19 14:32:59 xbzrle: send updates to 347783 pages in 790.5 MiB encoded memory, cache-miss 65.61%, overflow 121242
2021-11-19 14:33:00 migration active, transferred 16.5 GiB of 16.0 GiB VM-state, 294.0 MiB/s
2021-11-19 14:33:00 xbzrle: send updates to 388479 pages in 894.4 MiB encoded memory, cache-miss 65.61%, overflow 137694
2021-11-19 14:33:01 migration active, transferred 16.7 GiB of 16.0 GiB VM-state, 445.3 MiB/s
2021-11-19 14:33:01 xbzrle: send updates to 428831 pages in 1011.8 MiB encoded memory, cache-miss 65.61%, overflow 157000
2021-11-19 14:33:03 migration active, transferred 17.0 GiB of 16.0 GiB VM-state, 284.6 MiB/s, VM dirties lots of memory: 291.2 MiB/s
2021-11-19 14:33:03 xbzrle: send updates to 460003 pages in 1.1 GiB encoded memory, cache-miss 65.61%, overflow 172434
2021-11-19 14:33:04 migration active, transferred 17.2 GiB of 16.0 GiB VM-state, 327.5 MiB/s
2021-11-19 14:33:04 xbzrle: send updates to 498630 pages in 1.2 GiB encoded memory, cache-miss 65.61%, overflow 192016
... ...
2021-11-19 14:34:42 migration active, transferred 39.0 GiB of 16.0 GiB VM-state, 751.4 MiB/s
2021-11-19 14:34:42 xbzrle: send updates to 3977696 pages in 10.6 GiB encoded memory, cache-miss 45.54%, overflow 1766700
2021-11-19 14:34:43 migration active, transferred 39.2 GiB of 16.0 GiB VM-state, 322.4 MiB/s
2021-11-19 14:34:43 xbzrle: send updates to 4038051 pages in 10.7 GiB encoded memory, cache-miss 29.98%, overflow 1787119
2021-11-19 14:34:45 migration active, transferred 39.4 GiB of 16.0 GiB VM-state, 354.9 MiB/s, VM dirties lots of memory: 438.6 MiB/s
2021-11-19 14:34:45 xbzrle: send updates to 4099325 pages in 10.8 GiB encoded memory, cache-miss 14.73%, overflow 1811303
2021-11-19 14:34:46 migration active, transferred 39.7 GiB of 16.0 GiB VM-state, 543.4 MiB/s
2021-11-19 14:34:46 xbzrle: send updates to 4170823 pages in 10.9 GiB encoded memory, cache-miss 10.64%, overflow 1835116
2021-11-19 14:34:46 auto-increased downtime to continue migration: 800 ms
2021-11-19 14:34:47 migration active, transferred 39.9 GiB of 16.0 GiB VM-state, 337.8 MiB/s, VM dirties lots of memory: 374.8 MiB/s
2021-11-19 14:34:47 xbzrle: send updates to 4244688 pages in 11.1 GiB encoded memory, cache-miss 11.97%, overflow 1856522
2021-11-19 14:34:49 migration active, transferred 40.1 GiB of 16.0 GiB VM-state, 359.7 MiB/s, VM dirties lots of memory: 385.5 MiB/s
2021-11-19 14:34:49 xbzrle: send updates to 4316458 pages in 11.2 GiB encoded memory, cache-miss 16.51%, overflow 1881573
2021-11-19 14:34:49 auto-increased downtime to continue migration: 1600 ms
2021-11-19 14:34:50 migration active, transferred 40.3 GiB of 16.0 GiB VM-state, 1.3 GiB/s
2021-11-19 14:34:50 xbzrle: send updates to 4384104 pages in 11.4 GiB encoded memory, cache-miss 14.18%, overflow 1903430
query migrate failed: VM 115 qmp command 'query-migrate' failed - client closed connection
2021-11-19 14:34:53 query migrate failed: VM 115 qmp command 'query-migrate' failed - client closed connection
query migrate failed: VM 115 not running
2021-11-19 14:34:54 query migrate failed: VM 115 not running
query migrate failed: VM 115 not running
2021-11-19 14:34:55 query migrate failed: VM 115 not running
query migrate failed: VM 115 not running
2021-11-19 14:34:56 query migrate failed: VM 115 not running
query migrate failed: VM 115 not running
2021-11-19 14:34:57 query migrate failed: VM 115 not running
query migrate failed: VM 115 not running
2021-11-19 14:34:58 query migrate failed: VM 115 not running
2021-11-19 14:34:58 ERROR: online migrate failure - too many query migrate failures - aborting
2021-11-19 14:34:58 aborting phase 2 - cleanup resources
2021-11-19 14:34:58 migrate_cancel
2021-11-19 14:34:58 migrate_cancel error: VM 115 not running
2021-11-19 14:35:00 ERROR: migration finished with problems (duration 00:03:07)
TASK ERROR: migration problems

full log attached ...
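To make the numbers in the task log above easier to compare, here is a minimal Python sketch that pulls the transfer rate and the reported dirty rate out of the "migration active" lines. It is only a helper for reading the log, not part of PVE; the names `PROGRESS`, `parse_rates`, and `LINK_MIB_S` are made up for this example, and the regex assumes the exact line format shown in the log.

```python
import re

# Assumed line format, taken from the task log, e.g.:
# "... migration active, transferred 12.9 GiB of 16.0 GiB VM-state,
#  250.1 MiB/s, VM dirties lots of memory: 262.4 MiB/s"
PROGRESS = re.compile(
    r"migration active, transferred [\d.]+ [KMG]iB of [\d.]+ [KMG]iB VM-state, "
    r"(?P<rate>[\d.]+) (?P<unit>[KMG]iB)/s"
    r"(?:, VM dirties lots of memory: (?P<dirty>[\d.]+) (?P<dunit>[KMG]iB)/s)?"
)

# Conversion factors to MiB.
TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}

# Raw capacity of a 10 Gbit/s link in MiB/s (about 1192), ignoring protocol overhead.
LINK_MIB_S = 10e9 / 8 / 2**20


def parse_rates(log_text):
    """Return (rates, pairs), all values in MiB/s.

    rates: transfer rate of every "migration active" line.
    pairs: (transfer, dirty) for lines that also report a dirty rate.
    """
    rates, pairs = [], []
    for line in log_text.splitlines():
        m = PROGRESS.search(line)
        if not m:
            continue
        rate = float(m["rate"]) * TO_MIB[m["unit"]]
        rates.append(rate)
        if m["dirty"]:
            pairs.append((rate, float(m["dirty"]) * TO_MIB[m["dunit"]]))
    return rates, pairs


# A few lines copied from the task log above.
SAMPLE = """\
2021-11-19 14:32:45 migration active, transferred 12.9 GiB of 16.0 GiB VM-state, 250.1 MiB/s, VM dirties lots of memory: 262.4 MiB/s
2021-11-19 14:33:03 migration active, transferred 17.0 GiB of 16.0 GiB VM-state, 284.6 MiB/s, VM dirties lots of memory: 291.2 MiB/s
2021-11-19 14:34:45 migration active, transferred 39.4 GiB of 16.0 GiB VM-state, 354.9 MiB/s, VM dirties lots of memory: 438.6 MiB/s
2021-11-19 14:34:50 migration active, transferred 40.3 GiB of 16.0 GiB VM-state, 1.3 GiB/s
"""

if __name__ == "__main__":
    rates, pairs = parse_rates(SAMPLE)
    for transfer, dirty in pairs:
        print(f"transfer {transfer:7.1f} MiB/s  dirty {dirty:7.1f} MiB/s  "
              f"link use {100 * transfer / LINK_MIB_S:4.1f}%")
```

In the lines that report a dirty rate, the VM dirties memory faster than the migration transfers it, and the transfer rate is mostly far below what the 10 Gbit/s link could carry in theory.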