[usable workaround found] PVE 6.1 / Live migration with Local Storage: other VMs crashing at target node

hellfire

Well-Known Member
Aug 17, 2016
Hi,

Summary

When live migrating a VM with a certain disk size (70 GB, for example), the target node is extremely heavily loaded during the first phase of the migration process. The default bandwidth limit is set to a very low value, but it does not seem to be applied in this phase of the migration. Other VMs on the target node crash with a kernel panic.

Diagnostic Information

Code:
pveversion -v

proxmox-ve: 6.1-2 (running kernel: 5.3.18-3-pve)
pve-manager: 6.1-8 (running version: 6.1-8/806edfe1)
pve-kernel-helper: 6.1-7
pve-kernel-5.3: 6.1-6
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.10-1-pve: 5.3.10-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libpve-access-control: 6.0-6
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.0-17
libpve-guest-common-perl: 3.0-5
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-5
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 3.2.1-1
lxcfs: 4.0.1-pve1
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-3
pve-cluster: 6.1-4
pve-container: 3.0-23
pve-docs: 6.1-6
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.0-10
pve-firmware: 3.0-6
pve-ha-manager: 3.0-9
pve-i18n: 2.0-4
pve-qemu-kvm: 4.1.1-4
pve-xtermjs: 4.3.0-1
qemu-server: 6.1-7
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.3-pve1

Storage status at source node:

Code:
# pvesm status
Name              Type     Status           Total            Used       Available        %
iso-images         nfs     active     10952134656      5121724416      5830410240   46.76%
local              dir     active        98559220         2880032        90629640    2.92%
local-lvm      lvmthin     active      1792536576       695504191      1097032384   38.80%

Storage status at target node:
Code:
# pvesm status
Name              Type     Status           Total            Used       Available        %
iso-images         nfs     active     10952133632      5121515520      5830618112   46.76%
local              dir     active        98559220         2848008        90661664    2.89%
local-lvm      lvmthin     active       833396736        27918790       805477945    3.35%

VM-Config:
Code:
# qm config 109
bootdisk: scsi0
cores: 1
ide2: none,media=cdrom
memory: 4096
name: testvm70gb.bla.tld
net0: virtio=8A:36:A7:C2:DB:C2,bridge=vmbr1,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: local-lvm:vm-109-disk-0,cache=unsafe,discard=on,format=raw,size=70G
scsihw: virtio-scsi-pci
smbios1: uuid=d0dcd114-d595-4b11-bc14-d019ee158a71
sockets: 1
vmgenid: ecf15ad7-26a1-4dc7-ba3b-1167d05d5c00

Details

  • datacenter.cfg (a sketch for passing the limit explicitly per migration follows this list):

    Code:
    keyboard: de
    bwlimit: default=10240
  • I'm migrating a VM with 70 GB hard disk size.
  • Storage is lvm-thin on both sides
  • The source disk is a HW RAID 10 (4 disks); the target is a single disk. (The same applies in the reverse direction.)
  • When I initiate the migration, right after the log entry
    Code:
    2020-04-06 13:33:06 scsi0: start migration to nbd:192.168.207.40:60000:exportname=drive-scsi0
    drive mirror is starting for drive-scsi0 with bandwidth limit: 10240 KB/s
    the target logical volume seems to be written to its full capacity by multiple processes at rates of up to 150-200 MiB/s. Here's a snapshot of iotop during this phase:
    Code:
    Total DISK READ: 0.00 B/s | Total DISK WRITE: 71.79 M/s
    Current DISK READ: 0.00 B/s | Current DISK WRITE: 0.00 B/s
    TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND
    20516 be/4 root 0.00 B/s 4.57 M/s 0.00 % 99.99 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
    20523 be/4 root 0.00 B/s 4.57 M/s 0.00 % 99.99 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
    20527 be/4 root 0.00 B/s 4.46 M/s 0.00 % 99.99 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
    20525 be/4 root 0.00 B/s 4.57 M/s 0.00 % 99.99 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
    20515 be/4 root 0.00 B/s 4.57 M/s 0.00 % 99.99 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
    20521 be/4 root 0.00 B/s 4.46 M/s 0.00 % 99.59 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
    20519 be/4 root 0.00 B/s 4.46 M/s 0.00 % 99.58 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
    20520 be/4 root 0.00 B/s 4.57 M/s 0.00 % 99.53 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
    20518 be/4 root 0.00 B/s 4.46 M/s 0.00 % 99.49 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
    20508 be/4 root 0.00 B/s 4.46 M/s 0.00 % 99.49 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
    20513 be/4 root 0.00 B/s 4.46 M/s 0.00 % 99.48 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
    20526 be/4 root 0.00 B/s 4.46 M/s 0.00 % 99.47 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
    20524 be/4 root 0.00 B/s 4.46 M/s 0.00 % 99.44 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
    20514 be/4 root 0.00 B/s 4.46 M/s 0.00 % 97.10 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
    20522 be/4 root 0.00 B/s 4.46 M/s 0.00 % 96.95 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
    20517 be/4 root 0.00 B/s 4.34 M/s 0.00 % 96.56 % kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
  • I'm observing that the "Mapped size" value reported by lvdisplay rises from 0% to 100%.
  • With a VM that has a smaller disk (10 GB), this does not happen; the phase of filling up the logical volume is skipped entirely.
  • During this fill-up phase, other VMs on the target node crash with a kernel panic, presumably due to timeouts while waiting for I/O to complete, because the available bandwidth is eaten up by the migration processes.
  • The load average of the target server reaches 20-30.
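For completeness, this is how I would request the limit explicitly for a single migration instead of relying on the datacenter default. A minimal sketch only, assuming the --bwlimit option of qm migrate (value in KiB/s) and a hypothetical target node name node02:

Code:
# request the same 10 MiB/s limit explicitly for this migration
qm migrate 109 node02 --online --with-local-disks --bwlimit 10240

# double-check the datacenter-wide default that migrations fall back to
grep bwlimit /etc/pve/datacenter.cfg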
 
I tried the following without success:

  • Added bps_wr=2000000 to the -drive option of the migration in QemuServer.pm (line 3484) to limit the I/O bandwidth of the VM. The option was present on the newly started migration process, as I verified via ps, but the speed situation was the same as before. (A hedged sketch for setting the same throttle at runtime follows this list.)
  • Tried to prepend "/usr/bin/ionice -c3" to the run_command call in QemuServer.pm (line 4955) in order to give the I/O of the migration VM a lower priority than everything else. That did not work; I need to add more debugging to get it going.
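Just to experiment without patching QemuServer.pm, the same kind of write throttle can in principle be set at runtime through the QEMU monitor of the incoming VM. This is only a sketch, assuming the HMP command block_set_io_throttle and the drive id drive-scsi0; whether the NBD writes of the incoming migration honour such a throttle at all is exactly what seems doubtful here:

Code:
# open the monitor of the incoming VM on the target node
qm monitor 109
# arguments: device bps bps_rd bps_wr iops iops_rd iops_wr -> limit writes to ~2 MB/s
qm> block_set_io_throttle drive-scsi0 0 0 2000000 0 0 0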
 
The situation is the same with a ZFS-thin based target storage server that has far more I/O capacity (the high write rate probably comes from the zeros being compressed away by lz4):

Code:
Total DISK READ:         3.01 K/s | Total DISK WRITE:       994.56 M/s
Current DISK READ:       0.00 B/s | Current DISK WRITE:    2002.75 M/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
11635 be/4 root        0.00 B/s   33.62 M/s  0.00 % 95.69 % kvm -id 116 -name testvm55GB.domain.tld~nix:/run/qemu-server/116.migrate -S
11638 be/4 root        0.00 B/s   33.62 M/s  0.00 % 93.61 % kvm -id 116 -name testvm55GB.domain.tld~nix:/run/qemu-server/116.migrate -S
11640 be/4 root        0.00 B/s   33.62 M/s  0.00 % 91.01 % kvm -id 116 -name testvm55GB.domain.tld~nix:/run/qemu-server/116.migrate -S
11630 be/4 root        0.00 B/s   33.62 M/s  0.00 % 87.63 % kvm -id 116 -name testvm55GB.domain.tld~nix:/run/qemu-server/116.migrate -S
11627 be/4 root        0.00 B/s   33.62 M/s  0.00 % 84.82 % kvm -id 116 -name testvm55GB.domain.tld~nix:/run/qemu-server/116.migrate -S
11634 be/4 root        0.00 B/s   33.62 M/s  0.00 % 84.22 % kvm -id 116 -name testvm55GB.domain.tld~nix:/run/qemu-server/116.migrate -S
11633 be/4 root        0.00 B/s   31.65 M/s  0.00 % 83.06 % kvm -id 116 -name testvm55GB.domain.tld~nix:/run/qemu-server/116.migrate -S
11639 be/4 root        0.00 B/s   33.62 M/s  0.00 % 80.67 % kvm -id 116 -name testvm55GB.domain.tld~nix:/run/qemu-server/116.migrate -S
11626 be/4 root        0.00 B/s   32.15 M/s  0.00 % 78.31 % kvm -id 116 -name testvm55GB.domain.tld~nix:/run/qemu-server/116.migrate -S
11628 be/4 root        0.00 B/s   28.07 M/s  0.00 % 74.59 % kvm -id 116 -name testvm55GB.domain.tld~nix:/run/qemu-server/116.migrate -S
11637 be/4 root        0.00 B/s   41.45 M/s  0.00 % 71.61 % kvm -id 116 -name testvm55GB.domain.tld~nix:/run/qemu-server/116.migrate -S
11636 be/4 root        0.00 B/s   30.65 M/s  0.00 % 70.74 % kvm -id 116 -name testvm55GB.domain.tld~nix:/run/qemu-server/116.migrate -S
11629 be/4 root        0.00 B/s   38.76 M/s  0.00 % 68.90 % kvm -id 116 -name testvm55GB.domain.tld~nix:/run/qemu-server/116.migrate -S
 
Running ionice -c3 -p <pid> on a kvm ... -incoming process does not help either.

I used this script:

Code:
#!/bin/bash
# Continuously sets the I/O scheduling class of incoming-migration kvm
# processes to "idle", so they yield to all other I/O on the node.

export PATH=/sbin:/usr/sbin:/bin:/usr/bin:/usr/local/sbin:/usr/local/bin
PROGS=(pgrep ionice)

get_migration_pids() {
        # incoming live migrations run as "kvm ... -incoming unix:..."
        pgrep -f "kvm.*-incoming"
}

io_renice_pids() {
        local pid nice_value
        for pid in "$@"; do
                # skip PIDs that have already exited
                if kill -0 "$pid" 2>/dev/null; then
                        nice_value="$(ionice -p "$pid")"
                        if [ "$nice_value" != idle ]; then
                                echo -e "\nsetting io priority class of migration process $pid to idle"
                                ionice -p "$pid" -c 3
                        fi
                fi
        done
}

shell_init() {
        # make sure the required tools are available
        local prog
        for prog in "${PROGS[@]}"; do
                if ! which "$prog" &>/dev/null; then
                        echo "needed prog $prog is missing, aborting..."
                        return 1
                fi
        done
}

main() {
        shell_init || exit 1
        while :; do
                io_renice_pids $(get_migration_pids)
                sleep 1
                echo -n "."
        done
}

main

... and doing offline migration seems to work only if the same storage technology is used on both sides.

So I have to fall back to an export/import style migration instead.
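In my case that boils down to a backup/restore round trip. A minimal sketch, assuming a shared, backup-capable storage named backups (hypothetical) that both nodes can reach:

Code:
# on the source node: full backup of the VM
vzdump 109 --storage backups --mode stop --compress lzo

# on the target node: restore the archive into the local thin pool
# (the original VM must be removed first, or a new VMID used, since VMIDs are cluster-wide)
qmrestore /mnt/pve/backups/dump/vzdump-qemu-109-<timestamp>.vma.lzo 109 --storage local-lvm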
 
The threshold at which this remote storage initialization kicks in seems to lie between 20 GB and 40 GB of disk size. Below it, the network transfer starts immediately. Above it, the target storage is apparently initialized(?) remotely first, eating up all I/O capacity until the target device has been written completely, and only then does the network transfer start.
 
I checked lvm storage instead of lvmthin. The situation is the same.

The only workable solution for me seems to be dir-type storage with the qcow2 image format (to keep thin provisioning and snapshot capabilities). With that setup the server load during live migration stays low and sticks much more closely to the configured bandwidth limit, so the other virtual machines on the host are not affected.
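Roughly what that looks like; a sketch only, assuming a hypothetical dir storage named local-qcow backed by a local path:

Code:
# create a directory-backed storage for VM images
pvesm add dir local-qcow --path /var/lib/vz/qcow-images --content images

# move the test VM's disk onto it as qcow2 and drop the old LVM volume
qm move_disk 109 scsi0 local-qcow --format qcow2 --delete 1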

I wonder if I'm the only one having that problem...
 
