Big IO and load when migrating a VM on 6.4-15

Bidi

Renowned Member
Feb 12, 2016
Hello Guys,

I have a problem with a node and I don't understand why it behaves like this.

When I migrate a VM to this server, the entire server becomes slow or unresponsive.
The server is in a cluster with another 5 servers; the network is 10Gb.

I tested migrating the same VM to other nodes, and the problem only occurs on this one node; the other nodes have no problem at all.

All the servers have the same specs, configs, disks, etc.

During the migration on the problem node I saw the following, where Read/Write is small but IO is big.

In the Summary the server shows an IO delay of 58.28%; the problem is I don't even know where to start identifying/checking the problem.

CPU: 48 x Intel(R) Xeon(R) CPU E5-2690 v3 @ 2.60GHz (2 Sockets)
Kernel Version: Linux 5.4.157-1-pve #1 SMP PVE 5.4.157-1 (Mon, 29 Nov 2021 12:01:44 +0100)
PVE Manager Version: pve-manager/6.4-15/af7986e6
Storage: RAID10 - SSD Kingston DC500M 1.92TB


Total DISK READ: 888.26 K/s | Total DISK WRITE: 12.53 M/s
Current DISK READ: 881.39 K/s | Current DISK WRITE: 17.89 M/s
TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
47539 be/4 root 0.00 B/s 290.74 K/s 0.00 % 99.99 % dd of=/var/lib/vz/images/245/vm-245-disk-0.raw conv=sparse bs=64k
418 be/4 root 0.00 B/s 0.00 B/s 0.00 % 99.99 % [kswapd1]
417 be/4 root 0.00 B/s 0.00 B/s 0.00 % 99.99 % [kswapd0]
31003 be/4 root 0.00 B/s 0.00 B/s 0.00 % 99.99 % [kworker/u97:0+flush-253:2]
26735 be/4 root 20.60 K/s 0.00 B/s 0.00 % 99.99 % perl -T /usr/bin/pvesr run --mail 1
992 be/3 root 0.00 B/s 0.00 B/s 0.00 % 99.99 % [jbd2/dm-2-8]
648 be/3 root 0.00 B/s 0.00 B/s 0.00 % 35.41 % [jbd2/dm-1-8]
22804 be/3 root 0.00 B/s 13.74 K/s 0.00 % 12.13 % [jbd2/loop1-8]
987 be/3 root 0.00 B/s 36.63 K/s 0.00 % 7.23 % [jbd2/sdb1-8]
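
For reference, a quick way to exercise the same write path outside of a migration would be something like this (just a sketch, it assumes about 1GB free on /var/lib/vz; the test file name is made up, delete it afterwards):

# write ~1GB to the "local" storage in 64k blocks and flush it to disk at the end
dd if=/dev/zero of=/var/lib/vz/ddtest.img bs=64k count=16384 conv=fsync
rm /var/lib/vz/ddtest.img

Running the same on a healthy node shows whether the write speed of this node's "local" storage is really the outlier.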
 
Hi,
do you see any errors in the journal around the time of the migration on that particular node?
Sounds like a storage issue, or maybe bad memory.
What kind of storage is this? Please post the storage configuration (cat /etc/pve/storage.cfg) as well as the migration log and the journal (journalctl --since <datetime> --until <datetime>).
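
For example (the timestamps here are placeholders, use the actual migration window):

journalctl --since "2022-01-10 10:00" --until "2022-01-10 11:00"
cat /etc/pve/storage.cfg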
 
So our disks on the servers are set up like this:

dir: local
path /var/lib/vz
content vztmpl,iso,images,rootdir,backup,snippets
prune-backups keep-all=1
shared 0

dir: local-ssd
path /home3
content snippets,iso,rootdir,images,backup,vztmpl
nodes ...........................
prune-backups keep-all=1
shared 0


local 2 x 480GB SSD RAID1 only for Proxmox as LVM
storage-vms 6 x 1.92 TB Kingston DC500M RAID10 for VM as ext4

After I made the post I noticed that the VMs I migrated were on "local" and were migrated to the problem node onto the same "local" storage.
I migrated more VMs to the problem node, VMs which were on "storage-vms" with the same target storage "storage-vms", and there was no load or big IO at all; 25GB migrated with "migration finished successfully (duration 00:01:30)".

I think something is wrong with those disks, the local 2 x 480GB SSD.

This load and high IO happen only if I migrate something to the "local 2 x 480GB SSD".

:|
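
If something is wrong with those disks, their SMART data and Proxmox's own benchmark should show it. A minimal sketch, assuming the two 480GB SSDs are visible directly as /dev/sda and /dev/sdb (behind a hardware RAID controller smartctl needs the controller-specific -d option instead):

# SMART health, attributes and error log of the suspected disks
smartctl -a /dev/sda
smartctl -a /dev/sdb

# Proxmox benchmark on the "local" path; FSYNCS/SECOND much lower than on the
# other nodes would point at this volume
pveperf /var/lib/vz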