Hello.
I have a problem with IO delay during live migration. The two nodes are Supermicro 2U servers connected by a 10G link, running Proxmox version 7.1-10.
The first node is currently in a test stage with no VMs. It has 40 x Intel(R) Xeon(R) Silver 4114 CPU @ 2.20GHz (2 sockets) with 187.58 GiB RAM.
Both nodes have the same ZFS storage configured. For testing, the destination machine uses an SSD pool (3x 512GB Intel disks).
When I migrate a VM (2 disks 100G) to the empty node, it takes 1:10:35 with an average speed of 514 MiB/s (secured).
While the migration is in progress, the destination node gets a very high load average (~50) and an IO delay of about 30%. Htop on this machine shows the same load, but CPU/disk usage keeps jumping: normal for a few seconds, then up to ~80% for a few seconds.
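As a rough sanity check on the migration time and speed quoted above, the effective disk throughput can be estimated from the total size and the elapsed time. This is only a sketch: it assumes "2 disks 100G" means two disks of 100 GiB each and that both are copied in full, and the function name is made up for illustration.

```python
# Rough throughput check.
# Assumption: two disks of 100 GiB each, both transferred in full.

def effective_mib_per_s(total_gib: float, duration: str) -> float:
    """Average MiB/s needed to move total_gib within an h:mm:ss duration."""
    hours, minutes, seconds = (int(part) for part in duration.split(":"))
    elapsed_s = hours * 3600 + minutes * 60 + seconds
    return total_gib * 1024 / elapsed_s

rate = effective_mib_per_s(2 * 100, "1:10:35")
print(f"{rate:.1f} MiB/s")
```

If that assumption holds, the whole job averages roughly 48 MiB/s, so the reported 514 MiB/s probably describes only one phase of the migration rather than the full disk copy.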
With an empty node this isn't much of a problem (sometimes a few tasks on the VM need to be restarted), but it gets worse when I start migrating a second VM to this node (same pool).
IO delay and load average reach the same levels, and CPU/disk usage jumps in the same way.
The problem is with the other VMs on the destination node. Some tasks stop working and syslog shows kernel errors. Htop inside the VM doesn't show any problem: no CPU/disk usage and a load average of almost 0. After some time and a few more kernel errors, the VM freezes with the console message "Reboot after 5 secs".
I don't know where to look for the cause. In the configuration I disabled secured migration, which gave me a faster transfer.
Do you have any clue what could be wrong? Thanks in advance.