Hi,
Summary
When live migrating a VM with a certain disk size (70 GB, for example), the migration target is extremely heavily loaded during the first phase of the migration process. The default bandwidth limit has been set to a very low value, but it does not seem to be applied in this phase of the migration. Other VMs on the target system crash with a kernel panic.
Diagnostic Information
Code:
pveversion -v
proxmox-ve: 6.1-2 (running kernel: 5.3.18-3-pve)
pve-manager: 6.1-8 (running version: 6.1-8/806edfe1)
pve-kernel-helper: 6.1-7
pve-kernel-5.3: 6.1-6
pve-kernel-5.3.18-3-pve: 5.3.18-3
pve-kernel-5.3.10-1-pve: 5.3.10-1
ceph-fuse: 12.2.11+dfsg1-2.1+b1
corosync: 3.0.3-pve1
criu: 3.11-3
glusterfs-client: 5.5-3
ifupdown: 0.8.35+pve1
ksm-control-daemon: 1.3-1
libjs-extjs: 6.0.1-10
libknet1: 1.15-pve1
libpve-access-control: 6.0-6
libpve-apiclient-perl: 3.0-3
libpve-common-perl: 6.0-17
libpve-guest-common-perl: 3.0-5
libpve-http-server-perl: 3.0-5
libpve-storage-perl: 6.1-5
libqb0: 1.0.5-1
libspice-server1: 0.14.2-4~pve6+1
lvm2: 2.03.02-pve4
lxc-pve: 3.2.1-1
lxcfs: 4.0.1-pve1
novnc-pve: 1.1.0-1
proxmox-mini-journalreader: 1.1-1
proxmox-widget-toolkit: 2.1-3
pve-cluster: 6.1-4
pve-container: 3.0-23
pve-docs: 6.1-6
pve-edk2-firmware: 2.20200229-1
pve-firewall: 4.0-10
pve-firmware: 3.0-6
pve-ha-manager: 3.0-9
pve-i18n: 2.0-4
pve-qemu-kvm: 4.1.1-4
pve-xtermjs: 4.3.0-1
qemu-server: 6.1-7
smartmontools: 7.1-pve2
spiceterm: 3.1-1
vncterm: 1.6-1
zfsutils-linux: 0.8.3-pve1
Storage status at source node:
Code:
# pvesm status
Name           Type      Status          Total           Used      Available        %
iso-images      nfs      active    10952134656     5121724416     5830410240   46.76%
local           dir      active       98559220        2880032       90629640    2.92%
local-lvm   lvmthin      active     1792536576      695504191     1097032384   38.80%
Storage status at target node:
Code:
# pvesm status
Name           Type      Status          Total           Used      Available        %
iso-images      nfs      active    10952133632     5121515520     5830618112   46.76%
local           dir      active       98559220        2848008       90661664    2.89%
local-lvm   lvmthin      active      833396736       27918790      805477945    3.35%
VM-Config:
Code:
# qm config 109
bootdisk: scsi0
cores: 1
ide2: none,media=cdrom
memory: 4096
name: testvm70gb.bla.tld
net0: virtio=8A:36:A7:C2:DB:C2,bridge=vmbr1,firewall=1
numa: 0
onboot: 1
ostype: l26
scsi0: local-lvm:vm-109-disk-0,cache=unsafe,discard=on,format=raw,size=70G
scsihw: virtio-scsi-pci
smbios1: uuid=d0dcd114-d595-4b11-bc14-d019ee158a71
sockets: 1
vmgenid: ecf15ad7-26a1-4dc7-ba3b-1167d05d5c00
Details
- datacenter.cfg:
Code:
keyboard: de
bwlimit: default=10240
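For completeness: as far as I understand the documentation, the bwlimit property in datacenter.cfg can also carry per-operation limits next to default (all values in KiB/s). This is only a sketch of what I would expect to work, not what I actually have configured:
Code:
keyboard: de
bwlimit: default=10240,migration=10240,move=10240,clone=10240,restore=10240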
- I'm migrating a VM with a 70 GB disk.
- Storage is lvm-thin on both sides
- The source storage is a HW RAID 10 (4 disks); the target is a single disk. (The same applies in the reverse direction.)
- When I initiate the migration, the first step shows up in the migration log as:
Code:
2020-04-06 13:33:06 scsi0: start migration to nbd:192.168.207.40:60000:exportname=drive-scsi0
drive mirror is starting for drive-scsi0 with bandwidth limit: 10240 KB/s
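For reference, I start the migration from the GUI; the CLI equivalent should be roughly the following (a sketch from memory, "targetnode" is just a placeholder for the real node name):
Code:
# online migration of VM 109 including its local (lvm-thin) disk
qm migrate 109 targetnode --online --with-local-disks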
- At the same time, iotop on the target node shows:
Code:
Total DISK READ:         0.00 B/s | Total DISK WRITE:      71.79 M/s
Current DISK READ:       0.00 B/s | Current DISK WRITE:     0.00 B/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO>    COMMAND
20516  be/4  root      0.00 B/s    4.57 M/s  0.00 %  99.99 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
20523  be/4  root      0.00 B/s    4.57 M/s  0.00 %  99.99 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
20527  be/4  root      0.00 B/s    4.46 M/s  0.00 %  99.99 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
20525  be/4  root      0.00 B/s    4.57 M/s  0.00 %  99.99 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
20515  be/4  root      0.00 B/s    4.57 M/s  0.00 %  99.99 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
20521  be/4  root      0.00 B/s    4.46 M/s  0.00 %  99.59 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
20519  be/4  root      0.00 B/s    4.46 M/s  0.00 %  99.58 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
20520  be/4  root      0.00 B/s    4.57 M/s  0.00 %  99.53 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
20518  be/4  root      0.00 B/s    4.46 M/s  0.00 %  99.49 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
20508  be/4  root      0.00 B/s    4.46 M/s  0.00 %  99.49 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
20513  be/4  root      0.00 B/s    4.46 M/s  0.00 %  99.48 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
20526  be/4  root      0.00 B/s    4.46 M/s  0.00 %  99.47 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
20524  be/4  root      0.00 B/s    4.46 M/s  0.00 %  99.44 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
20514  be/4  root      0.00 B/s    4.46 M/s  0.00 %  97.10 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
20522  be/4  root      0.00 B/s    4.46 M/s  0.00 %  96.95 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
20517  be/4  root      0.00 B/s    4.34 M/s  0.00 %  96.56 %  kvm -id 109 -name testvm70gb.bla.tld -chardev socket,id=qmp,path=/var/run/qemu-se~00 -machine type=pc-i440fx-4.1+pve0 -incoming unix:/run/qemu-server/109.migrate -S
- I'm observing that the "Mapped size" value reported by lvdisplay for the target volume rises from 0% to 100% (see the command sketch below this list).
- When I do this with a VM with a smaller disk (10 GB), this does not happen; there is no phase where the logical volume is filled up first.
- During this phase, other VMs on the target system crash with a kernel panic. I assume this is caused by timeouts while they wait for their I/O to complete, because the migration eats up all available disk bandwidth.
- The load average of the target server reaches 20-30.
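This is roughly how I watch the "Mapped size" of the target volume during the migration (a sketch; the LV path assumes the default "pve" volume group behind local-lvm):
Code:
# repeatedly check how much of the thin volume is already mapped/allocated
watch -n 5 'lvdisplay /dev/pve/vm-109-disk-0 | grep -i "mapped size"'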